mirror of
https://github.com/msitarzewski/agency-agents/
synced 2026-06-09 10:13:17 +00:00
b456845e85
Adds promptfoo eval harness for agent quality scoring. LLM-as-judge system scoring task completion, instruction adherence, identity consistency, deliverable quality, and safety. Includes tests.