How to Evaluate AI Agent Safety: 5 Signals That Actually Matter

May 27, 2026 · 6 min read · HVTracker Research

Open-source AI agents are being adopted at an unprecedented rate. Developers are integrating coding assistants, research agents, and workflow automation tools into production systems that handle sensitive data and make real decisions.

But how do you know if an AI agent is safe to use? GitHub stars measure popularity, not trustworthiness. A project with 50,000 stars can still have unsigned releases, no security policy, and dependencies riddled with known vulnerabilities.

At HVTracker, we track trust signals for 170+ open-source AI agents daily. Here are the 5 evidence-based signals that actually predict whether an agent is safe to adopt.

1. OpenSSF Scorecard

What it measures

Supply-chain security practices across 18 automated checks

The OpenSSF Scorecard is a tool maintained by the Open Source Security Foundation that runs automated checks against a GitHub repository. It evaluates branch protection, dependency pinning, vulnerability disclosure policies, CI/CD security, and more. Each check produces a score from 0-10.

This is the single most informative signal for supply-chain safety. The overall score is a weighted aggregate of the individual checks, so a project scoring 7+/10 typically has branch protection enabled, pins its dependencies, uses SAST tools, and has a vulnerability disclosure process, though it can still fail an individual check.

The problem? Most AI agent projects don't score well. In our dataset, the median Scorecard score across tracked agents is below 5/10. The agents that score 7+ tend to be backed by established organizations with mature security practices.

What to look for: A Scorecard score of 6+ is good. 7+ is excellent. Below 4 means basic security hygiene is missing. You can check any repo yourself at securityscorecards.dev.
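As a rough sketch of how these thresholds could be applied in a script: the code below reads a result from the public Scorecard REST API and maps the aggregate score to the bands above. The function names are ours, and the "mixed" label for scores from 4 up to 6 is an assumption, since no band is named for that range.

```python
import json
from urllib.request import urlopen

# Public Scorecard REST endpoint for GitHub-hosted projects.
SCORECARD_API = "https://api.securityscorecards.dev/projects/github.com/{owner}/{repo}"


def classify_scorecard(score: float) -> str:
    """Map an aggregate Scorecard score (0-10) to the bands used in this article."""
    if score >= 7:
        return "excellent"
    if score >= 6:
        return "good"
    if score < 4:
        return "basic hygiene missing"
    return "mixed"  # 4 up to 6: our own label, not one the article defines


def weakest_checks(result: dict, limit: int = 3) -> list:
    """Return the names of the lowest-scoring checks in a Scorecard result."""
    # Checks that could not be evaluated are reported with a score of -1; skip them.
    scored = [c for c in result.get("checks", []) if c.get("score", -1) >= 0]
    scored.sort(key=lambda c: c["score"])
    return [c["name"] for c in scored[:limit]]


def fetch_scorecard(owner: str, repo: str) -> dict:
    """Fetch the latest published Scorecard result for a GitHub repository."""
    with urlopen(SCORECARD_API.format(owner=owner, repo=repo), timeout=10) as resp:
        return json.load(resp)
```

Calling classify_scorecard(fetch_scorecard(owner, repo)["score"]) gives the band for a repository, and weakest_checks shows where to start reading the detailed report.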

2. Package Provenance

What it measures

Whether published packages can be traced back to their source code

Package provenance creates a cryptographic link between a published npm or PyPI package and the specific source code commit and CI/CD pipeline that produced it. It uses SLSA (Supply-chain Levels for Software Artifacts) attestations to prove a package wasn't tampered with between the repository and the registry.

This matters because the supply-chain attack surface for AI agents is enormous. When you run pip install or npm install, you're trusting that the published package matches the source code you reviewed on GitHub. Without provenance, there's no practical way to verify this short of rebuilding the package yourself and comparing the result.

Provenance adoption is still early. In our tracking, only a fraction of AI agents publish packages with verified provenance attestations. But the ones that do are signaling a level of security maturity that goes beyond the basics.

What to look for: Check the package page on npm or PyPI for a provenance badge or SLSA attestation. On HVTracker, we flag this as "Package provenance verified" on agent profile pages.
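For npm packages, the same check can be approximated from registry metadata. This is a minimal sketch, assuming the per-version metadata served at registry.npmjs.org/<package>/<version> carries a dist.attestations.provenance object with a SLSA predicateType when provenance was published. That shape is our assumption from observed packages, and the function name is hypothetical. It only detects that an attestation is advertised; it does not verify the signature.

```python
def npm_advertises_provenance(version_metadata: dict) -> bool:
    """Return True if npm version metadata advertises a SLSA provenance attestation.

    Assumed shape: {"dist": {"attestations": {"provenance": {"predicateType": "..."}}}}
    """
    attestations = version_metadata.get("dist", {}).get("attestations") or {}
    provenance = attestations.get("provenance") or {}
    predicate = str(provenance.get("predicateType", ""))
    # SLSA provenance predicate types share this URL prefix across versions.
    return predicate.startswith("https://slsa.dev/provenance/")
```

A True result is a prompt to verify the attestation with the registry's own tooling, not a replacement for doing so.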

3. Signed Commits

What it measures

Whether code contributions are cryptographically authenticated

Signed commits use GPG, SSH, or S/MIME keys to cryptographically tie a commit to whoever holds the signing key. Git itself does not verify the author name and email on a commit, so without signing, anyone who gains write access to a repository (through a compromised token, for example) can push commits that appear to come from a trusted maintainer.

A high signed-commit ratio (80%+) indicates that a project's maintainers take code integrity seriously. It's not a perfect signal, since some legitimate maintainers don't sign commits, but when combined with other signals, it's a strong indicator of security culture.

We measure this by sampling recent commits and calculating the ratio that are cryptographically verified. Projects backed by organizations with security policies tend to score highest here.

What to look for: On GitHub, look for the green "Verified" badge next to commits. A project where 80%+ of recent commits are signed has strong identity controls.
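The ratio itself is simple to compute. Here is a minimal sketch, assuming the input is the JSON list returned by GitHub's "list commits" REST endpoint, where each item carries a commit.verification.verified boolean. The function names and the sample size are our choices, not a description of HVTracker's pipeline.

```python
def signed_commit_ratio(commits: list) -> float:
    """Fraction of sampled commits that GitHub marked as cryptographically verified."""
    if not commits:
        return 0.0  # no sample, no evidence of signing
    verified = sum(
        1
        for c in commits
        if c.get("commit", {}).get("verification", {}).get("verified") is True
    )
    return verified / len(commits)


def has_strong_identity_controls(commits: list, threshold: float = 0.8) -> bool:
    """Apply the 80% rule of thumb to a sample of recent commits."""
    return signed_commit_ratio(commits) >= threshold
```

Feeding it the most recent page of commits for a repository gives a quick read. A larger sample smooths out one-off unsigned merges.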

4. Activity Patterns

What it measures

Whether a project is actively maintained and responsive

Activity isn't just about commit frequency. The pattern matters: regular commits over time indicate sustained maintenance. A burst of commits followed by silence suggests a project might be abandoned. We look at commit cadence over 30 days, time since last push, and issue response patterns.

An unmaintained AI agent is a security liability. Vulnerabilities in dependencies go unpatched. Breaking changes in APIs go unaddressed. Users who report bugs get silence.

But very high activity can also be a warning sign. A project with hundreds of commits per week from a single contributor might be moving too fast for proper review. The healthiest pattern is consistent, multi-contributor activity with a reasonable cadence.

What to look for: Last push within the past 7 days is ideal. 30+ days without a push is a yellow flag. 90+ days is a red flag for any project that people rely on in production.
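These cutoffs translate directly into a small helper. The sketch below is illustrative: the function name is ours, and the "acceptable" label for the 8 to 29 day range is an assumption, since the guidance above names no band for it.

```python
from datetime import datetime, timezone
from typing import Optional


def push_freshness(last_push: datetime, now: Optional[datetime] = None) -> str:
    """Classify a repository by whole days elapsed since its last push."""
    now = now or datetime.now(timezone.utc)
    days = (now - last_push).days
    if days >= 90:
        return "red flag"
    if days >= 30:
        return "yellow flag"
    if days <= 7:
        return "ideal"
    return "acceptable"  # 8-29 days: our own label for the unnamed middle band
```

Pass timezone-aware datetimes for both arguments. GitHub's pushed_at field is in UTC, so parsing it as UTC keeps the day count honest.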

5. Transparency Indicators

What it measures

Whether the project makes its practices and policies visible

Transparency includes having a clear license, a security policy (SECURITY.md), a code of conduct, contributing guidelines, and documentation. These aren't just nice-to-haves; they signal that a project is run with governance and accountability in mind.

A project that lacks a security policy is telling you something: its maintainers may not have thought about what happens when a vulnerability is discovered. A project with no license leaves you legally exposed. Missing documentation makes it harder to audit what the agent actually does.

We also look at whether the project uses public GitHub Actions (visible CI/CD) versus private pipelines. Public CI/CD means anyone can verify how the software is built, tested, and released.

What to look for: At minimum, look for a license, README, and SECURITY.md. Extra credit for CONTRIBUTING.md, a code of conduct, and visible CI/CD pipelines.
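This checklist can be run against a plain listing of repository paths. The following is a rough sketch with hypothetical names. It matches files by stem, so README.md and LICENSE.txt both count, and it treats any file under .github/workflows/ as visible CI/CD. Variants such as LICENSE-MIT would be missed.

```python
from pathlib import PurePosixPath

# Lowercased file stem -> label reported to the user.
REQUIRED = {"license": "LICENSE", "readme": "README", "security": "SECURITY.md"}
EXTRA = {"contributing": "CONTRIBUTING.md", "code_of_conduct": "CODE_OF_CONDUCT.md"}


def transparency_report(paths: list) -> dict:
    """Report which minimum indicators are missing and which extras are present."""
    stems = {PurePosixPath(p).stem.lower() for p in paths}
    has_public_ci = any(p.startswith(".github/workflows/") for p in paths)

    missing = [label for stem, label in REQUIRED.items() if stem not in stems]
    extras = [label for stem, label in EXTRA.items() if stem in stems]
    if has_public_ci:
        extras.append("visible CI/CD")
    return {"missing_required": missing, "extras": extras}
```

An empty missing_required list means the minimum bar is met. The extras list shows how far beyond it the project goes.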

Putting It All Together

No single signal tells the full story. An agent with a perfect Scorecard score but no activity in 6 months might be abandoned. An agent with thousands of commits but no provenance might have supply-chain risks.

The most trustworthy agents score well across multiple dimensions: they have reasonable Scorecard scores, some form of provenance or signing, active maintenance, and transparent governance.

This is exactly what HVTracker's Trust Score measures — a composite of activity, adoption, transparency, safety, and identity signals, weighted to reflect real-world risk. Each agent gets a score from 0-100, updated daily.
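To make the idea of a weighted composite concrete, here is an illustrative sketch. The weights are hypothetical placeholders chosen for the example, not HVTracker's actual weighting, and each input is assumed to be a sub-score already normalized to 0-100.

```python
# Hypothetical weights for illustration only; they sum to 1.0.
EXAMPLE_WEIGHTS = {
    "safety": 0.30,
    "identity": 0.20,
    "activity": 0.20,
    "transparency": 0.20,
    "adoption": 0.10,
}


def composite_score(signals: dict, weights: dict = EXAMPLE_WEIGHTS) -> float:
    """Weighted average of 0-100 sub-scores, rounded to one decimal place."""
    missing = set(weights) - set(signals)
    if missing:
        raise ValueError(f"missing signals: {sorted(missing)}")
    total = sum(weights[name] * signals[name] for name in weights)
    return round(total, 1)
```

Because the result is a weighted average, a strong score in one dimension cannot fully mask a zero in another. That is the property a composite needs if no single signal tells the full story.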

Check the trust score for any AI agent

HVTracker independently evaluates 170+ open-source AI agents daily. Browse the full leaderboard, compare agents side by side, and see exactly which trust signals each agent has.

