HVTracker Methodology Specification
https://hvtracker.net/spec/methodology/v2.0
1. Abstract
This document defines the measurement methodology used by HVTracker to compute health scores and supply chain trust signals for open-source AI agent projects. It specifies the data sources, scoring formula, provenance signal collection procedures, and update cadence.
Implementations conforming to this specification MUST produce scores within 0.1 points of the reference implementation given identical inputs. The scoring formula is deterministic given its inputs; scores may differ between runs only because API responses fetched at different times return different data.
The key words MUST, MUST NOT, SHOULD, SHOULD NOT, and MAY in this document are to be interpreted as described in RFC 2119.
2. Terminology
- Agent
- An open-source software project, tracked by HVTracker, that implements or supports autonomous AI agent behavior. Agents MUST be open-source (see Section 8).
- Health Score
- A scalar value in the range [0, 100] representing the composite activity and adoption health of an agent, computed according to Section 5.
- Provenance Signal
- A binary or scalar value derived from public cryptographic infrastructure indicating the supply chain trustworthiness of an agent's release artifacts (see Section 6).
- Reference Implementation
- The canonical Python implementation at fetch_and_build.py in the HVTracker repository.
- Snapshot
- A complete JSON export of the leaderboard state at a single point in time, archived to output/history/YYYY-MM-DD.json.
- Daily Run
- One execution of the reference implementation, producing a new Snapshot and updating all output files.
3. Scope and Applicability
This specification governs the production HVTracker leaderboard at hvtracker.net and any conforming implementations that wish to replicate or extend it.
This specification does not govern:
- Runtime correctness benchmarks or task-completion evaluations.
- Closed-source or proprietary AI agent products.
- Agents not listed in the reference agents.json registry.
All signals defined in this specification are unilaterally observable from public APIs. No maintainer participation, registration, or opt-in is required or assumed.
4. Data Sources
All data MUST be fetched from the public APIs listed in this section. GitHub API requests SHOULD be authenticated with a personal access token, which raises the rate limit from 60 to 5,000 requests per hour.
4.1 GitHub REST API
Base URL: https://api.github.com
The following endpoints are used per agent:
| Endpoint | Fields consumed | Notes |
|---|---|---|
| GET /repos/{owner}/{repo} | stargazers_count, forks_count, pushed_at, description, language, open_issues_count | Primary metadata fetch |
| GET /repos/{owner}/{repo}/stats/commit_activity | Weekly commit totals, last 52 weeks | Returns HTTP 202 while computing; implementation MUST retry up to 3 times with exponential backoff |
| GET /repos/{owner}/{repo}/commits | commit.verification.verified on each commit | Used for signed commit ratio (Section 6.4); sample of last 100 commits |
Commit activity for the last four weeks is computed as the sum of the total field across the last four elements of the commit activity array. If the Stats API returns an empty array, the implementation MUST fall back to the Commits API with a since parameter of 30 days prior.
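The retry and summation rules above can be sketched as follows. This is a minimal illustration, not the reference implementation: the function names, the injected `fetch` callable (returning a status code and parsed body), and the 1-second base delay are all assumptions made for the example.

```python
import time
from typing import Callable, Optional


def fetch_with_retry(
    fetch: Callable[[], tuple[int, Optional[list]]],
    max_retries: int = 3,
    base_delay: float = 1.0,  # assumption: the spec does not fix a base delay
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[list]:
    """Call `fetch`; while it answers HTTP 202, retry with exponential backoff."""
    for attempt in range(max_retries + 1):
        status, body = fetch()
        if status == 200:
            return body
        if status != 202 or attempt == max_retries:
            return None  # hard failure, or still computing after the last retry
        sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s with the defaults
    return None


def commits_last_four_weeks(weeks: Optional[list]) -> Optional[int]:
    """Sum `total` over the last four weekly entries.

    Returns None for a missing or empty array, which the caller can treat
    as the cue to fall back to the Commits API.
    """
    if not weeks:
        return None
    return sum(week["total"] for week in weeks[-4:])
```

A caller would pass the result of `fetch_with_retry` to `commits_last_four_weeks` and switch to the Commits API fallback when the result is `None`.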
4.2 npm Registry
Download counts: https://api.npmjs.org/downloads/point/last-week/{package}
Provenance: https://registry.npmjs.org/{package}/latest — field dist.attestations
Only agents with a non-empty npm_package field in agents.json are queried. npm API requests MAY be made in parallel; the npm API does not publish a strict rate limit for anonymous reads.
4.3 PyPI
Download counts: https://pypistats.org/api/packages/{package}/recent — field data.last_week
PyPI stats requests MUST be made serially with a minimum 1.2-second delay between requests. On HTTP 429, the implementation MUST fall back to the cached value from the most recent prior Snapshot.
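One way to satisfy the serial-request rule and the HTTP 429 fallback is sketched below. The function name, the injected `fetch` callable, and the `previous` mapping (download counts taken from the most recent prior Snapshot) are hypothetical; returning `None` for other failures is also an assumption, since this section only specifies the 429 case.

```python
import time
from typing import Callable, Optional

MIN_DELAY_SECONDS = 1.2  # minimum spacing between pypistats requests


def fetch_pypi_downloads(
    packages: list[str],
    fetch: Callable[[str], tuple[int, Optional[dict]]],
    previous: dict[str, Optional[int]],
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Optional[int]]:
    """Fetch last-week downloads one package at a time."""
    results: dict[str, Optional[int]] = {}
    for index, package in enumerate(packages):
        if index > 0:
            sleep(MIN_DELAY_SECONDS)  # serial requests, never parallel
        status, body = fetch(package)
        if status == 200 and body is not None:
            results[package] = body["data"]["last_week"]
        elif status == 429:
            # Rate limited: reuse the value from the prior Snapshot.
            results[package] = previous.get(package)
        else:
            results[package] = None  # assumption: other failures yield null
    return results
```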
Provenance: https://pypi.org/simple/{package}/ with Accept: application/vnd.pypi.simple.v1+json — field files[-1].provenance. A non-null value indicates a PEP 740 attestation is present on the latest release file.
4.4 Hacker News
API: Algolia HN Search — https://hn.algolia.com/api/v1/search
Story mentions are counted over the last 30 days using a per-agent hn_search_term configured in agents.json. Agents without an hn_search_term receive a null value. The Algolia API allows 10,000 requests per hour; a 0.3-second sleep SHOULD be applied between requests as a courtesy.
5. Health Score Formula
The health score is a real number in [0, 100], computed as the sum of four components. All components are non-negative and bounded by their respective maxima.
score = stars_score + freshness_score + activity_score + community_score
The score is rounded to one decimal place for display.
5.1 Stars Component
Maximum: 30 points
stars_score = min(30, ln(1 + stars) / ln(1 + 100_000) × 30)
Log-scaled against a fixed anchor of 100,000 stars. A project with 100,000 or more stars earns the full 30 points. Log scaling keeps very large repositories from dominating the leaderboard, as they would under a linear scale.
5.2 Freshness Component
Maximum: 25 points
days_since = (now_utc - pushed_at).days
freshness_score = max(0.0, 25 × (1 − days_since / 180))
Linear decay to zero over 180 days of inactivity. A push today earns the full 25 points. A push 180 or more days ago earns 0 points.
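As a rough illustration, the freshness formula can be evaluated directly from the `pushed_at` string that the GitHub API returns. The function name and the explicit `now_utc` parameter are assumptions made so the example is reproducible.

```python
from datetime import datetime, timezone


def freshness_score(pushed_at: str, now_utc: datetime) -> float:
    """Linear decay from 25 points to 0 over 180 days since the last push."""
    # GitHub timestamps end in "Z"; normalise so fromisoformat accepts them
    # on Python versions older than 3.11.
    pushed = datetime.fromisoformat(pushed_at.replace("Z", "+00:00"))
    days_since = (now_utc - pushed).days  # whole days, as in the formula
    return max(0.0, 25 * (1 - days_since / 180))


now = datetime(2026, 5, 24, 6, 0, tzinfo=timezone.utc)
print(freshness_score("2026-05-24T01:00:00Z", now))  # pushed today
print(freshness_score("2026-02-23T06:00:00Z", now))  # pushed 90 days ago
```

The sketch follows the formula as written and does not clamp the upper bound, so it assumes `pushed_at` is never later than `now_utc`.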
5.3 Activity Component
Maximum: 25 points
activity_score = min(25, ln(1 + commits_4wk) / ln(1 + 100) × 25)
Where commits_4wk is the sum of commits in the last four weeks per Section 4.1. Log-scaled; 100 or more commits in four weeks earn the full 25 points.
5.4 Community Component
Maximum: 20 points
community_score = min(20, ln(1 + forks) / ln(1 + 20_000) × 20)
Fork count serves as a proxy for downstream reuse. Log-scaled against a fixed anchor of 20,000 forks; a project with 20,000 or more forks earns the full 20 points.
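Putting the four components together, a minimal sketch of the composite score might look like the following. The helper name `log_scaled` and the function signature are illustrative assumptions; the constants come from Sections 5.1 to 5.4.

```python
import math


def log_scaled(value: int, anchor: int, maximum: float) -> float:
    """min(maximum, ln(1 + value) / ln(1 + anchor) * maximum)."""
    return min(maximum, math.log1p(value) / math.log1p(anchor) * maximum)


def health_score(stars: int, forks: int, commits_4wk: int, days_since: int) -> float:
    """Sum the four components and round to one decimal place for display."""
    stars_score = log_scaled(stars, 100_000, 30)
    freshness_score = max(0.0, 25 * (1 - days_since / 180))
    activity_score = log_scaled(commits_4wk, 100, 25)
    community_score = log_scaled(forks, 20_000, 20)
    return round(stars_score + freshness_score + activity_score + community_score, 1)


# A project at or above every anchor, pushed today, reaches the ceiling.
print(health_score(stars=100_000, forks=20_000, commits_4wk=100, days_since=0))
```

`math.log1p(x)` computes ln(1 + x), which matches the form used in the formulas above. Note that Python's `round` operates on binary floats, so a conforming implementation in another language may differ in the last displayed digit for values that sit exactly on a rounding boundary.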
6. Supply Chain Trust Signals
Supply chain trust signals are independently observable boolean or scalar values derived from public cryptographic infrastructure. They MUST NOT affect the health score. They MUST be collected on every Daily Run and stored in the Snapshot.
All signals are unilaterally observable. No maintainer action is required for a signal to be collected.
6.1 npm Provenance
Field: npm_provenance (boolean or null)
Source: https://registry.npmjs.org/{package}/latest
Signal: true if the response body contains a non-null dist.attestations field; false if the field is absent or null; null if the agent has no npm_package configured or the request fails.
Interpretation: A true value indicates the latest published version includes an in-toto SLSA provenance attestation signed via Sigstore and logged to the Rekor transparency log.
Limitation: Only the latest version is checked. Historical versions are not evaluated.
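The three-state signal described in this section could be derived from an already-parsed registry response along these lines. The function name and the convention of passing `None` for a failed request are assumptions for the sketch.

```python
from typing import Optional


def npm_provenance(package: Optional[str], response: Optional[dict]) -> Optional[bool]:
    """Return True, False, or None following the Signal rule in Section 6.1."""
    if not package or response is None:
        return None  # no npm_package configured, or the request failed
    dist = response.get("dist") or {}
    # True only when dist.attestations is present and non-null.
    return dist.get("attestations") is not None
```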
6.2 PyPI Provenance (PEP 740)
Field: pypi_provenance (boolean or null)
Source: https://pypi.org/simple/{package}/ with Accept: application/vnd.pypi.simple.v1+json
Signal: true if the last file entry in the files array has a non-null provenance field; false if absent; null if no pypi_package is configured or the request fails.
Interpretation: A true value indicates the latest release was published via a Trusted Publisher and carries a PEP 740 digital attestation, generated by the PyPA GitHub Actions publishing workflow.
Limitation: Packages published via twine with API tokens will not have PEP 740 attestations regardless of build pipeline quality.
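A comparable sketch for the PyPI signal, operating on the parsed JSON from the Simple API, is shown below. The function name is hypothetical, and treating an empty `files` array as `False` is an assumption, since the specification only describes the last file entry.

```python
from typing import Optional


def pypi_provenance(package: Optional[str], index: Optional[dict]) -> Optional[bool]:
    """Return True, False, or None following the Signal rule in Section 6.2."""
    if not package or index is None:
        return None  # no pypi_package configured, or the request failed
    files = index.get("files") or []
    if not files:
        return False  # assumption: no files means no attestation to observe
    # Only the last entry of the files array is inspected.
    return files[-1].get("provenance") is not None
```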
6.3 OSSF Scorecard
Fields: scorecard_score (float 0–10 or null), scorecard_checks (object mapping check name to score)
Source: Scorecard data is generated by running the OSSF Scorecard CLI tool directly against each repository, refreshed weekly. Results are cached in scorecard-cache.json and served from cache during daily builds. If a repository is absent from the cache, the build falls back to https://api.deps.dev/v3/projects/github.com%2F{owner}%2F{repo} and then https://api.securityscorecards.dev/projects/github.com/{owner}/{repo}.
Signal: The overall score (0–10) and individual check scores from the Scorecard CLI output.
Interpretation: The OpenSSF Scorecard evaluates security posture across checks including: Maintained, Code-Review, Branch-Protection, Signed-Releases, Pinned-Dependencies, Vulnerabilities, Token-Permissions, Dangerous-Workflow, and others. A score of 10 is the maximum.
Limitation: The weekly CLI scan runs on GitHub Actions; results are at most 7 days old. Absence of a score does not imply poor security posture.
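The cache-first lookup with ordered fallbacks described under Source can be sketched as below. The function name, the shape of the cache (keyed by "owner/repo"), and the injected fallback callables are assumptions; in practice the two fallbacks would wrap the deps.dev and securityscorecards.dev requests.

```python
from typing import Callable, Optional

Scorecard = dict  # {"scorecard_score": float | None, "scorecard_checks": dict | None}


def resolve_scorecard(
    repo: str,
    cache: dict[str, Scorecard],
    fallbacks: list[Callable[[str], Optional[Scorecard]]],
) -> Scorecard:
    """Serve from the weekly cache, then try each fallback source in order."""
    entry = cache.get(repo)
    if entry is not None:
        return entry
    for fetch in fallbacks:
        result = fetch(repo)  # each callable returns None on failure
        if result is not None:
            return result
    # Assumption: every source missing yields nulls, not a zero score.
    return {"scorecard_score": None, "scorecard_checks": None}
```

Returning nulls when no source has data keeps the signal consistent with the limitation above: a missing score is not evidence of poor security posture.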
6.4 Signed Commit Ratio
Field: signed_commits_ratio (float 0.0–1.0 or null)
Source: GET /repos/{owner}/{repo}/commits?per_page=100 — field commit.verification.verified on each result
Signal: verified_count / total_count across up to the 100 most recent commits on the default branch; null if the API request fails.
Interpretation: The fraction of recent commits carrying a verified GPG, SSH, or S/MIME signature as reported by GitHub's signature verification API.
Limitation: Web-based commits made through GitHub's UI are signed by GitHub's own key and counted as verified, which may inflate the ratio for projects that accept many web-based edits. This signal measures signature presence, not signature quality or key trust level.
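For illustration, the ratio can be computed from the parsed Commits API response as follows. The function name is hypothetical, and returning `None` for an empty commit list is an assumption made to avoid dividing by zero.

```python
from typing import Optional


def signed_commits_ratio(commits: Optional[list]) -> Optional[float]:
    """verified_count / total_count over at most the 100 most recent commits."""
    if commits is None:
        return None  # the API request failed
    sample = commits[:100]
    if not sample:
        return None  # assumption: no commits means no ratio to report
    verified_count = sum(
        1
        for item in sample
        if item.get("commit", {}).get("verification", {}).get("verified") is True
    )
    return verified_count / len(sample)
```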
6.5 Public Action Tracking (Behavioral Signals)
Field: public_actions (object or null)
Source: GitHub Search API — GET /search/commits and GET /search/issues
Signal: For agents with a configured fingerprint, counts the number of public commits or merged PRs created by that agent on GitHub in the trailing 30 days. Fingerprints are one of:
- commit_trailer: a standardized co-author or attribution string appended to commit messages (e.g., Aider's Co-authored-by: aider).
- pr_body: a standardized footer string appended to PR descriptions (e.g., Generated with Gemini CLI).
- bot_account: a GitHub App bot account that authors commits or PRs (e.g., openhands-agent).
Sub-fields:
- actions_30d: total count of detected actions in the trailing 30 days.
- actions_30d_merged: count of merged PRs specifically (null for commit-based fingerprints).
- actions_30d_by_repo: top repos where this agent was active (sampled from first page of search results).
NOT included in health score. Public action counts are displayed on the leaderboard and agent profile pages but do not contribute to the composite health score computed in Section 5. They are an informational signal only.
Limitations:
- Only agents with a confirmed, unique fingerprint pattern are tracked. Agents without a detectable fingerprint report null.
- Private repository usage is entirely invisible to this signal.
- GitHub Search API caps results at 1,000 per query. Counts above 1,000 are lower bounds.
- Fingerprint patterns may produce false positives if the pattern string is not sufficiently unique. Each fingerprint is documented and validated in docs/research/agent-fingerprints.md.
7. Update Process and Cadence
The reference implementation MUST execute at least once per calendar day. The production deployment runs at 06:00 UTC daily via GitHub Actions.
Each Daily Run MUST:
- Fetch fresh data for all agents in agents.json from sources defined in Section 4.
- Compute health scores per Section 5.
- Collect all supply chain trust signals per Section 6.
- Collect behavioral signals per Section 6.5 (for agents with configured fingerprints).
- Write a Snapshot to output/history/YYYY-MM-DD.json. Existing Snapshots MUST NOT be modified or deleted.
- Update data.json, index.html, feed.json, sitemap.xml, and all agent profile pages.
- Generate stable data endpoints under data/ per the Data Schema Specification v0.1.
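The rule that existing Snapshots must not be modified can be enforced at write time. The sketch below is one possible approach, assuming a hypothetical `write_snapshot` helper; it relies on Python's exclusive-creation file mode, which raises `FileExistsError` instead of overwriting.

```python
import json
from datetime import date
from pathlib import Path


def write_snapshot(history_dir: Path, run_date: date, snapshot: dict) -> Path:
    """Write output/history/YYYY-MM-DD.json, refusing to overwrite a prior file."""
    history_dir.mkdir(parents=True, exist_ok=True)
    path = history_dir / f"{run_date.isoformat()}.json"
    # Mode "x" creates the file and fails if it already exists.
    with path.open("x", encoding="utf-8") as handle:
        json.dump(snapshot, handle, indent=2)
    return path
```

How a second run on the same calendar day should be handled is not stated in this section, so the sketch simply lets the error propagate to the caller.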
Rank deltas are computed by comparing the current run's ranks against the most recent prior Snapshot. If no prior Snapshot exists, all rank deltas are marked as "NEW".
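The rank delta rule could be sketched as below. The function name and the mapping of agent identifier to rank are hypothetical. Two points are assumptions not stated above: a positive delta means the agent moved up, and an agent missing from the prior Snapshot is also marked "NEW".

```python
from typing import Optional, Union


def rank_deltas(
    current: dict[str, int],
    previous: Optional[dict[str, int]],
) -> dict[str, Union[int, str]]:
    """Compare current ranks with the most recent prior Snapshot's ranks."""
    deltas: dict[str, Union[int, str]] = {}
    for agent, rank in current.items():
        if previous is None or agent not in previous:
            deltas[agent] = "NEW"
        else:
            # Rank 1 is best, so previous minus current is positive on a climb.
            deltas[agent] = previous[agent] - rank
    return deltas


print(rank_deltas({"a": 1, "b": 2, "c": 3}, {"a": 2, "b": 1}))
```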
8. Agent Eligibility
To be listed in the HVTracker registry, an agent MUST:
- Be open-source with a public GitHub repository.
- Implement or materially support autonomous AI agent behavior.
- Have at least one public release or a non-trivial commit history.
An agent MUST NOT be listed if:
- Its source code is closed or proprietary. (Closed-source agents lack the supply chain signals this methodology depends on.)
- The GitHub repository is private, archived, or deleted.
Agent addition and removal decisions are made by the HVTracker maintainer. The agent registry is defined by agents.json in the reference implementation repository.
9. Versioning
This specification uses semantic versioning of the form vMAJOR.MINOR:
- MAJOR increments when the scoring formula changes in a way that would reorder a substantial fraction of the leaderboard.
- MINOR increments when new signals are added, data sources change, or non-score-affecting methodology changes are made.
All published versions of this specification remain permanently accessible at their versioned URLs. A version MUST NOT be modified after it receives Published status. Corrections MUST be issued as a new version with a Superseded marker on the prior version.
The current specification version is recorded in the methodology_version field of every Snapshot and in the data.json export.
A. Reference Implementation
The reference implementation is maintained at:
https://github.com/YugantM/hvtracker
The primary scoring and data collection logic is in fetch_and_build.py. The agent registry is in agents.json. Historical Snapshots are in output/history/.
The reference implementation is open-source under the MIT License. The dataset (Snapshots, methodology, brand) is proprietary.
B. Changelog
| Version | Date | Summary |
|---|---|---|
| v2.0 | 2026-05-24 | Added Section 6: Supply Chain Trust Signals. Defined npm provenance, PyPI PEP 740 attestations, OSSF Scorecard (CLI-based, weekly cache), and signed commit ratio. Trust signals are collected but do not affect the health score. |
| v1.1 | 2026-05-10 | Added npm, PyPI, and Hacker News data sources. Daily historical snapshots introduced. Rank delta computation defined. |
| v1.0 | 2026-05-01 | Initial specification. GitHub-only signals: stars, freshness (pushed_at), commit activity, forks. |