Trust signals for open-source AI agents

Roadmap

Today, HVTrust scores supply-chain trust: how a project is built, signed, and released. Next, we're extending it to runtime trust: what an agent actually does when you run it.

Now Shipped

Live Supply-chain trust v3

HVTrust ranks 172 open-source AI agents on five weighted dimensions: Safety/Integrity (30), Identity/Provenance (20), Transparency (20), Maintenance (20), Adoption (10).

  • OSSF Scorecard via deps.dev + fallback
  • npm + PyPI provenance attestation checks
  • Signed commit ratio from GitHub verification
  • Confidence-weighted evidence grades A–D
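
As a rough illustration of how the five weights could combine into one number, here is a minimal sketch. The weights come from the paragraph above. Everything else is an assumption made for the example, not HVTrust's actual method: the dimension key names, the 0 to 1 range for each dimension score, and the plain weighted sum. It also leaves out the confidence weighting behind the A–D evidence grades.

```python
# Weights as listed in the roadmap text; they sum to 100.
# The key names are hypothetical, chosen for this example only.
WEIGHTS = {
    "safety_integrity": 30,
    "identity_provenance": 20,
    "transparency": 20,
    "maintenance": 20,
    "adoption": 10,
}


def composite_score(dimension_scores: dict[str, float]) -> float:
    """Combine per-dimension scores into a 0-100 total.

    Assumption: each dimension score is a float in [0, 1].
    """
    missing = WEIGHTS.keys() - dimension_scores.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    for name in WEIGHTS:
        score = dimension_scores[name]
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {score}")
    # Plain weighted sum: a perfect score on every dimension gives 100.
    return sum(WEIGHTS[name] * dimension_scores[name] for name in WEIGHTS)
```

Under these assumptions, an agent scoring 1.0 on every dimension gets 100, and a weak Adoption score can cost at most 10 points.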

Building Now

In progress Runtime trust — discovery

For each tracked agent, we're collecting static runtime characteristics from public sources (repo manifests, READMEs, registry metadata). Starting with non-invasive fields anyone can verify themselves:

  • MCP support — does the agent declare or implement Model Context Protocol?
  • Tool / plugin surface — built-in tools, integrations, plugin ecosystem
  • External service dependencies — required APIs, model providers, data services
  • Package provenance drift — does the latest published package still match its source repo?
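
One way to picture the provenance drift check is a file-by-file hash comparison between an unpacked published package and a checkout of its source repo. The sketch below is an assumption about how such a check could work, not a description of HVTrust's pipeline. The function names are hypothetical, and it assumes both trees are already on local disk. Real packages often ship build outputs that never exist in the repo, so a file flagged as missing is a prompt for review rather than proof of tampering.

```python
import hashlib
from pathlib import Path


def file_digests(root: Path) -> dict[str, str]:
    """Map each file's relative POSIX path under root to its SHA-256 hex digest."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def provenance_drift(package_dir: Path, repo_dir: Path) -> dict[str, list[str]]:
    """Report packaged files that are absent from, or differ from, the repo checkout."""
    packaged = file_digests(package_dir)
    source = file_digests(repo_dir)
    return {
        # Shipped in the package but not present in the source tree.
        "missing_in_repo": sorted(packaged.keys() - source.keys()),
        # Present in both trees, but with different contents.
        "content_differs": sorted(
            name
            for name in packaged.keys() & source.keys()
            if packaged[name] != source[name]
        ),
    }
```

Because it only needs the public package and the public repo, anyone can rerun a comparison like this and verify the result themselves.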

Up Next

Q3 Runtime trust — scored signals

Once discovery stabilises, runtime signals get scored and folded into HVTrust as a new dimension — separate from supply-chain hygiene so readers can see them independently.

  • Per-agent capability surface page (what it can reach)
  • Trend tracking for runtime drift over time
  • Public spec at /spec/runtime-trust

Q3 Maintainer self-service

Today, corrections flow through GitHub issues. Next, we want maintainers to be able to:

  • Claim their listing
  • Declare runtime fields (with evidence) before we infer them
  • Get notified when their score changes

Later

Later Cryptographic identity gate

Tighten the gate at the top of the leaderboard: require Sigstore-style identity binding for projects to reach the verified tier.

Later Continuous behavioural signals

Lightweight, opt-in runtime telemetry that detects when an agent's actual behaviour drifts from its declared capabilities — without sending user data anywhere.
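
At its simplest, the drift described here is a set difference between what an agent declares and what it is observed doing. The sketch below assumes both sides are expressed as plain capability labels. The label format and the function name are hypothetical, since this item is not yet specified. The comparison runs locally and returns only capability names, so no user data is involved.

```python
def undeclared_capabilities(declared: set[str], observed: set[str]) -> list[str]:
    """Return observed capabilities that the agent never declared, sorted for stable output."""
    return sorted(observed - declared)


# Hypothetical capability labels, for illustration only.
declared = {"fs:read", "net:model-provider"}
observed = {"fs:read", "fs:write", "net:model-provider"}

drift = undeclared_capabilities(declared, observed)
```

In this example `drift` contains only `fs:write`, the one capability that was observed but never declared.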

How to influence the roadmap

This roadmap is intentionally short. Priorities are driven by what maintainers and users ask for. To suggest a signal, flag a gap, or push back on the order, open an issue.