Promptfoo vs Weights & Biases Weave

An independent, evidence-based trust comparison of Promptfoo and Weights & Biases Weave, two Observability & Evaluation projects in the HVTracker registry. Scores come from public, checkable signals — supply-chain provenance, OSSF Scorecard, maintenance, and adoption — not popularity.

Weights & Biases Weave leads on trust — 90.2/100 (Grade A) vs 83.4/100 (Grade A), a 6.8-point gap. Full breakdown below.
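
| Signal | Promptfoo (promptfoo/promptfoo) | Weights & Biases Weave (wandb/weave) |
| --- | --- | --- |
| HVTrust score | 83.4 | 90.2 |
| Evidence grade | A | A |
| Overall rank | #31 | #16 |
| Rank in Observability & Evaluation | #3 | #1 |
| GitHub stars | 22.9k | 1.1k |
| Last updated | today | 1d ago |
| Build provenance | Yes | Yes |
| License | MIT | Apache-2.0 |
| Downloads | 384k/wk | 612k/wk |

OSSF Scorecard: 6.4 / 10 (the listing shows a single value and does not indicate which project it belongs to)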
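
Trust dimensions (points earned)

| Dimension | Max points | Promptfoo | Weights & Biases Weave |
| --- | --- | --- | --- |
| Safety / integrity | 25 | 7.5 | 20.4 |
| Identity & provenance | 20 | 18.0 | 18.0 |
| Transparency | 17 | 8.5 | 13.9 |
| Maintenance | 20 | 20.0 | 19.9 |
| Adoption | 20 | 17.9 | 15.0 |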
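
To make the dimension figures easier to compare, the sketch below totals the points each project earned and finds the dimension with the widest gap. The dictionary layout and the helper names (`raw_total`, `widest_gap`) are illustrative assumptions, not part of the registry. The page does not say how dimension points combine into the headline HVTrust score, so these raw totals should not be expected to equal it.

```python
# Points taken from the "Trust dimensions" figures above.
# The data layout and helper names are hypothetical, for illustration only.
MAX_POINTS = {
    "Safety / integrity": 25,
    "Identity & provenance": 20,
    "Transparency": 17,
    "Maintenance": 20,
    "Adoption": 20,
}

EARNED = {
    "Promptfoo": {
        "Safety / integrity": 7.5,
        "Identity & provenance": 18.0,
        "Transparency": 8.5,
        "Maintenance": 20.0,
        "Adoption": 17.9,
    },
    "Weights & Biases Weave": {
        "Safety / integrity": 20.4,
        "Identity & provenance": 18.0,
        "Transparency": 13.9,
        "Maintenance": 19.9,
        "Adoption": 15.0,
    },
}


def raw_total(points: dict[str, float]) -> float:
    """Sum the earned points, rounded to one decimal like the published figures."""
    return round(sum(points.values()), 1)


def widest_gap(a: dict[str, float], b: dict[str, float]) -> str:
    """Return the dimension where the two projects differ the most."""
    return max(MAX_POINTS, key=lambda dim: abs(a[dim] - b[dim]))


totals = {name: raw_total(points) for name, points in EARNED.items()}
biggest = widest_gap(EARNED["Promptfoo"], EARNED["Weights & Biases Weave"])

for name, total in totals.items():
    print(f"{name}: {total} raw points")
print(f"Widest gap: {biggest}")
```

On these figures the largest difference is in Safety / integrity, while Maintenance and Identity & provenance are effectively tied.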

How to read this: HVTrust (0–100) weighs supply-chain signals (provenance, OSSF Scorecard, signed commits, open license) alongside real-world adoption, scaled by an evidence-confidence factor. Grade bands: A ≥ 80, B ≥ 65, C ≥ 50, D < 50. Signals refresh daily. Scores on this page follow methodology v4.0.
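
The grade bands can be expressed as a small lookup. This is a minimal sketch that assumes the bands apply directly to the published 0–100 score; the function name `grade` is hypothetical, and the registry's own implementation is not shown on the page.

```python
def grade(score: float) -> str:
    """Map a 0-100 HVTrust score to a letter grade using the quoted bands."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 80:
        return "A"
    if score >= 65:
        return "B"
    if score >= 50:
        return "C"
    return "D"


# Headline scores from the comparison above.
scores = {"Promptfoo": 83.4, "Weights & Biases Weave": 90.2}

grades = {name: grade(value) for name, value in scores.items()}
gap = round(scores["Weights & Biases Weave"] - scores["Promptfoo"], 1)

print(grades)
print(f"gap: {gap} points")
```

Both projects fall in the A band, so the grade alone does not separate them; the 6.8-point gap in the underlying score does.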