Only internally approved and sanitized benchmark reports are exposed on the public site.
`../spring-boot-starter-contexa-enterprise/build/reports/official-verification-fullstack-benchmark`
Publication-safe customer result bundles can be reviewed and surfaced without exposing private enterprise evidence.
Public reports surface official metric coverage, submission readiness, and failing gate counts.
Each report is rendered as a public scorecard with chart-ready artifacts, HTML output, and PDF export.
Family pages explain how the human, agent, protocol, verification, SOAR, and Java production-fit families are tracked over time.
Human Zero Trust
Request-time human access decisions, explanations, and replay quality.
Agent Zero Trust
Delegated agent lineage, objective, scope, and tool-chain drift controls.
Protocol Boundary
Canonical security fidelity across MCP, A2A, and internal runtime boundaries.
Verification and Assurance
Evidence completeness, replay match rate, and submission readiness.
AI Native SOAR
Approval, permit, tool execution, and incident lineage for AI native action planes.
Java Production Fit
Java and Spring runtime integration readiness for production deployment.
The leaderboard is transparent about source, coverage, official metric pass count, and readiness.
| Rank | Source | Report | Coverage | Official Metrics | Ready |
|---|---|---|---|---|---|
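The leaderboard columns above can be sketched as a small Java record with a ranking rule. This is an illustrative assumption, not the actual CONTEXA schema: field names, the 0..1 coverage convention, and the ordering (ready rows first, then official metric passes, then coverage) are all hypothetical.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical shape of one public leaderboard row; names are illustrative.
public record LeaderboardRow(
        String source,
        String report,
        double coverage,            // fraction of official metrics covered, 0..1
        int officialMetricsPassed,  // count of official metrics passing their gates
        boolean ready) {            // submission readiness flag

    // Assumed ranking rule: ready rows first, then more official metric
    // passes, then higher coverage.
    public static List<LeaderboardRow> ranked(List<LeaderboardRow> rows) {
        Comparator<LeaderboardRow> order = Comparator
                .comparing((LeaderboardRow r) -> r.ready(), Comparator.reverseOrder())
                .thenComparing(r -> r.officialMetricsPassed(), Comparator.reverseOrder())
                .thenComparing(r -> r.coverage(), Comparator.reverseOrder());
        return rows.stream().sorted(order).toList();
    }
}
```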
The public surface explains how CONTEXA keeps the benchmark falsifiable, reviewable, and ready for future standards.
Safety Gates First
The CONTEXA benchmark treats permit, lineage, replay, and evidence integrity as mandatory gates evaluated before any aggregate score.
- An unsafe action, broken lineage, or unverifiable replay fails the benchmark regardless of the average score.
- Public reports expose both aggregate scores and gate failures.
- The benchmark measures controllable action-plane quality rather than isolated model output.
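The gate-before-aggregate rule above can be sketched in a few lines of Java. Gate names, the `Map`-of-booleans shape, and the class name are assumptions for illustration, not CONTEXA's API; the point is only that a failed gate fails the run no matter how high the averaged metric score is.

```java
import java.util.List;
import java.util.Map;

// Sketch of "safety gates first": pass/fail is decided by the gates alone.
public final class GateFirstScore {

    // A run passes only if every mandatory gate holds; the aggregate metric
    // score is reported alongside but can never rescue a gate failure.
    public static boolean passes(Map<String, Boolean> gates, List<Double> metricScores) {
        boolean allGatesHold = gates.values().stream().allMatch(Boolean::booleanValue);
        return allGatesHold; // metricScores intentionally ignored for pass/fail
    }

    // Aggregate score, surfaced in public reports next to gate failures.
    public static double aggregate(List<Double> metricScores) {
        return metricScores.stream().mapToDouble(Double::doubleValue).average().orElse(0.0);
    }
}
```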
Human and Agent Unified Semantics
Human requests and delegated agent execution are evaluated under the same canonical security semantics.
- Human, service-client, and delegated-agent executions are assessed in one request-time control plane.
- Objective, scope, tool-chain, permit, approval, and protocol-boundary controls remain common evaluation axes.
- Public reports expose scenario families and scorecards without leaking private evidence.
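The "one control plane" idea above can be sketched as follows: the principal kind varies, but the admissibility check does not. Every name here (the enum, the `Request` record, the flag fields) is a hypothetical illustration of the shared evaluation axes, not CONTEXA's actual types.

```java
// Sketch of unified semantics: human, service-client, and delegated-agent
// requests are judged by the same predicate over the same axes.
public final class UnifiedEvaluation {

    enum PrincipalKind { HUMAN, SERVICE_CLIENT, DELEGATED_AGENT }

    // One request shape regardless of who (or what) is executing.
    record Request(PrincipalKind kind,
                   boolean objectiveDeclared,
                   boolean withinScope,
                   boolean toolChainApproved,
                   boolean permitPresent,
                   boolean boundaryChecked) {}

    // The same admissibility rule applies to every principal kind;
    // r.kind() deliberately plays no role in the decision.
    static boolean admissible(Request r) {
        return r.objectiveDeclared() && r.withinScope() && r.toolChainApproved()
                && r.permitPresent() && r.boundaryChecked();
    }
}
```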
Publication-safe Reporting
Public benchmark artifacts are generated from sanitized publication bundles instead of internal raw evidence.
- Private evidence stays inside contexa-iam-enterprise for operator review.
- contexa-site reads only publication-approved public artifacts.
- HTML and PDF reports are generated from the same public summary and chart dataset.
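The publication boundary described above can be sketched as a filter that runs before any public rendering. The `Artifact` record, its fields, and the renderer are illustrative assumptions; the one property taken from the text is that the site only ever consumes publication-approved artifacts, and HTML and PDF views derive from the same sanitized summary.

```java
import java.util.List;

// Sketch of the publication boundary between private evidence and the site.
public final class PublicationFilter {

    // Hypothetical public artifact: id, approval flag, sanitized summary.
    record Artifact(String reportId, boolean publicationApproved, String summaryJson) {}

    // Drop anything not publication-approved before it can reach a renderer.
    static List<Artifact> publishable(List<Artifact> all) {
        return all.stream().filter(Artifact::publicationApproved).toList();
    }

    // Both HTML and PDF renderers would consume the same sanitized summary;
    // only the HTML path is sketched here.
    static String renderHtml(Artifact a) {
        return "<html>" + a.summaryJson() + "</html>";
    }
}
```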
Spring Application Profile
Java and Spring application runtime protection, verification, and benchmark publication readiness.
Matching reports: 0
MCP Tool Governance Profile
Protocol-boundary, permit, and tool-execution controls for MCP-mediated agent actions.
Matching reports: 0
A2A Delegation Profile
Multi-agent delegation lineage, protocol compatibility, and zero-trust chain integrity.
Matching reports: 0
AI Native SOAR Profile
Action-plane safety, permit enforcement, approval, and incident lineage for automated response.
Matching reports: 0