Roadmap

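HVTrust's base rank still comes from supply-chain trust: how a project is built, signed, and released. Runtime-trust discovery is live on profiles, and as of methodology v4.0 those fields adjust the production rank as a bounded calibration. The next step is deciding whether selected runtime signals should become a visible, separate scoring slice once calibration stabilises.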

Now Shipped

Live Supply-chain trust v3

HVTrust ranks open-source AI agents on five weighted dimensions: Safety/Integrity (25), Identity/Provenance (18), Transparency (17), Maintenance (20), Adoption (20).

OSSF Scorecard via deps.dev + fallback
npm + PyPI provenance attestation checks
Signed commit ratio from GitHub verification
Confidence-weighted evidence grades A–D
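
As a rough illustration of how five weighted dimensions can combine into a single score, the sketch below computes a weighted sum. The weights are the ones listed above. The 0.0 to 1.0 sub-score scale, the `composite_score` name, and the example values are assumptions made for illustration, not HVTrust's published implementation.

```python
# Dimension weights as stated in the roadmap; they sum to 100.
WEIGHTS = {
    "safety_integrity": 25,
    "identity_provenance": 18,
    "transparency": 17,
    "maintenance": 20,
    "adoption": 20,
}


def composite_score(sub_scores: dict[str, float]) -> float:
    """Combine per-dimension sub-scores into a 0-100 score.

    Assumption: each sub-score is normalised to the range 0.0-1.0.
    """
    missing = WEIGHTS.keys() - sub_scores.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= sub_scores[name] <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0")
    # Each dimension contributes at most its weight to the total.
    return sum(WEIGHTS[name] * sub_scores[name] for name in WEIGHTS)


# Hypothetical sub-scores for a single agent.
example = {
    "safety_integrity": 0.75,
    "identity_provenance": 0.5,
    "transparency": 1.0,
    "maintenance": 0.5,
    "adoption": 0.25,
}
print(composite_score(example))  # 59.75
```

The sketch leaves out the confidence-weighted evidence grades, because the roadmap does not say how grades A–D map to numbers.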

Live Runtime trust — public discovery

Agent profiles show a compact runtime snapshot sourced from public docs, manifests, and registry metadata. As of methodology v4.0 these fields are folded into the production HVTrust rank as a bounded calibration (see below).

MCP server support — declared vs implemented server signals
External service dependencies — model providers, storage, and third-party services
Tool / plugin surface — broad capability and extension hints
Package provenance drift — whether published package metadata still points back to the tracked repo
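
One way to picture the package provenance drift check is a comparison between the repository URL declared in published package metadata and the repository HVTrust tracks. The sketch below is a minimal version of that idea. The function names and the normalisation rules are hypothetical, and the real detector is not described in this roadmap.

```python
from urllib.parse import urlparse


def normalize_repo_url(url: str) -> str:
    """Reduce a repository URL to lowercase 'host/owner/name'."""
    url = url.strip()
    # Registry metadata often prefixes URLs, e.g. 'git+https://...'.
    if url.startswith("git+"):
        url = url[len("git+"):]
    if url.startswith("git@") and ":" in url:
        # scp-style SSH form: git@host:owner/name.git
        host, path = url[len("git@"):].split(":", 1)
    else:
        parsed = urlparse(url)
        host, path = parsed.hostname or "", parsed.path
    path = path.strip("/")
    if path.endswith(".git"):
        path = path[: -len(".git")]
    # Keep only owner/name; ignore deeper paths such as /tree/main.
    parts = path.split("/")[:2]
    return "/".join([host.lower(), *(p.lower() for p in parts)])


def has_provenance_drift(declared_repo_url: str, tracked_repo_url: str) -> bool:
    """True when package metadata no longer points at the tracked repo."""
    return normalize_repo_url(declared_repo_url) != normalize_repo_url(tracked_repo_url)


# Hypothetical URLs for illustration only.
tracked = "https://github.com/example/agent"
print(has_provenance_drift("git+https://github.com/Example/Agent.git", tracked))  # False
print(has_provenance_drift("https://github.com/someone-else/agent", tracked))     # True
```

A plain comparison like this would also flag repo transfers and same-owner package variants, so a production detector needs extra handling for those cases.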

Building Now

Live Runtime trust — in the production score

Runtime-trust signals are now part of the production HVTrust score and rank. The promotion was evidence-gated: an upset review and several rounds of detector auditing preceded the cutover. Same-owner package variants, repo transfers, and docs-only mentions are excluded from scoring.

Calibrated scoring — MCP support, external dependencies, tool/plugin surface, and package-provenance drift adjust the base score within published bounds
Public methodology — every adjustment value is documented in the methodology and the runtime-trust spec
Comparable baseline — the pre-calibration ranking stays available on the leaderboard for comparison
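
To make "adjust the base score within published bounds" concrete, the sketch below applies per-signal adjustments to a base score and clamps their combined effect. Every number here is hypothetical, as are the `calibrated_score` name and the single symmetric bound. The actual adjustment values and bounds are the ones documented in the methodology and the runtime-trust spec.

```python
def calibrated_score(
    base_score: float,
    adjustments: dict[str, float],
    max_total_adjustment: float = 5.0,  # hypothetical bound, not the published one
) -> float:
    """Apply runtime-trust adjustments to a base score within a fixed bound."""
    total = sum(adjustments.values())
    # Clamp the combined adjustment so runtime signals can only move
    # the score by a limited amount in either direction.
    bounded = max(-max_total_adjustment, min(max_total_adjustment, total))
    # Keep the final score on the 0-100 scale.
    return max(0.0, min(100.0, base_score + bounded))


# Hypothetical adjustments for the four runtime signals.
adjustments = {
    "mcp_support": 1.5,
    "external_dependencies": -2.0,
    "tool_plugin_surface": -1.0,
    "provenance_drift": -4.0,
}
print(calibrated_score(72.0, adjustments))  # 67.0 (raw total -5.5 is clamped to -5.0)
```

Clamping the total rather than each signal keeps the pre-calibration baseline directly comparable: under this assumption, no agent moves more than the bound away from its base score.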

Up Next

Q3 Runtime trust — production scoring

Once calibration stabilises, selected runtime signals will be folded into HVTrust as a visible, separate scoring slice so readers can distinguish supply-chain posture from runtime reach and complexity.

Per-agent capability surface page with clearer breakdowns
Trend tracking for runtime drift over time
Public spec at /spec/runtime-trust

Q3 Weekly alerts and watchlists

The free alert layer should stay low-friction and editorial: a simple weekly digest of meaningful leaderboard and trust changes. Granular watchlists can come later.

Weekly “what changed” digest for the leaderboard
Major trust/regression callouts when they truly matter
Historical runtime changes and drift summaries for watched projects

Later

Later Maintainer self-service

Today, corrections flow through GitHub issues. Longer term, maintainers should be able to claim listings, declare runtime fields with evidence, and respond to drift or provenance mismatches directly.

Later Cryptographic identity gate

Tighten the gate at the top of the leaderboard: require Sigstore-style identity binding for projects to reach the verified tier.

Later Continuous behavioural signals

Lightweight, opt-in runtime telemetry that detects when an agent's actual behaviour drifts from its declared capabilities — without sending user data anywhere.

How to influence the roadmap

This roadmap is intentionally short. Priorities are driven by what maintainers and users ask for. To suggest a signal, flag a gap, or push back on the order, open an issue.