Kyro

KyroCover

Coverage metrics and safety analysis for LLM evaluation pipelines. Coming soon.

💡 Coming Soon

KyroCover is currently in development. Follow the GitHub repo to be notified when it ships.

What is KyroCover?

KyroCover will be the coverage and safety layer of the Kyro evaluation suite. While KyroJudge tells you whether individual responses pass or fail, KyroCover tells you how well your test suite covers the full spectrum of possible inputs.

Planned features

Coverage metrics

  • Prompt coverage — What percentage of your pipeline's evaluation criteria are exercised by your test suite?
  • Branch coverage — For pipelines with conditional logic, how many branches have been tested?
  • Scenario coverage — What types of user intents, edge cases, and failure modes are missing from your test set?
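KyroCover has not shipped, so no API exists yet. As a rough illustration of what a prompt coverage number could mean, the sketch below computes the share of evaluation criteria exercised by at least one test case. The function name `criteria_coverage`, the `criteria` key, and the sample data are all hypothetical and are not part of any Kyro package.

```python
# Hypothetical sketch: KyroCover is unreleased, so none of these names come from it.
def criteria_coverage(all_criteria, test_cases):
    """Return (fraction_covered, uncovered_criteria) for a test suite."""
    required = set(all_criteria)
    if not required:
        # Nothing to cover, so treat the suite as fully covered.
        return 1.0, set()
    exercised = set()
    for case in test_cases:
        # Each test case lists the criteria it exercises (assumed data shape).
        exercised.update(case.get("criteria", ()))
    covered = required & exercised
    return len(covered) / len(required), required - covered


criteria = ["tone", "accuracy", "refusal", "citation"]
cases = [
    {"name": "greeting", "criteria": ["tone"]},
    {"name": "fact_check", "criteria": ["accuracy", "tone"]},
]
fraction, missing = criteria_coverage(criteria, cases)
print(f"{fraction:.0%} covered, missing: {sorted(missing)}")
# prints: 50% covered, missing: ['citation', 'refusal']
```

Returning the uncovered set alongside the fraction keeps the result actionable: the number says how far along the suite is, and the set says which criteria still need a test.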

Safety analysis

  • Adversarial input detection — Flag transcripts that may be attempting prompt injection or jailbreaks
  • Drift detection — Detect when your agent's behavior diverges from a baseline over time
  • Regression alerts — Automatically surface when a new model or prompt change causes previously passing evaluations to fail
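To make the regression alert idea concrete, here is a minimal sketch that compares pass/fail results from two evaluation runs and lists the evaluations that passed before but fail now. It assumes results are available as plain dictionaries mapping an evaluation ID to a boolean; that shape, the function name `find_regressions`, and the sample IDs are assumptions for illustration, not KyroCover's design.

```python
# Hypothetical sketch: the result format and names are assumptions, not a Kyro API.
def find_regressions(baseline, candidate):
    """Return sorted IDs that passed in `baseline` and explicitly fail in `candidate`."""
    return sorted(
        eval_id
        for eval_id, passed in baseline.items()
        # Evaluations missing from the candidate run are skipped, not flagged.
        if passed and candidate.get(eval_id) is False
    )


baseline_run = {"greeting": True, "refund_policy": True, "jailbreak_01": False}
candidate_run = {"greeting": True, "refund_policy": False, "jailbreak_01": False}
regressions = find_regressions(baseline_run, candidate_run)
print(regressions)
# prints: ['refund_policy']
```

Evaluations that were already failing in the baseline are left out on purpose, so the list contains only behavior that the new model or prompt change broke.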

Reporting

  • HTML coverage reports (like Istanbul/nyc for traditional code)
  • CI/CD integration with coverage thresholds
  • Trend dashboards across runs
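A coverage threshold in CI usually comes down to turning a coverage number into a process exit code, much as code coverage tools do. The sketch below shows that pattern under the assumption that coverage is reported as a fraction between 0 and 1; `check_threshold` and the sample values are hypothetical, since KyroCover's real interface has not been published.

```python
# Hypothetical sketch of a CI gate; not KyroCover's actual interface.
def check_threshold(coverage, threshold):
    """Return an exit code: 0 if coverage meets the threshold, 1 otherwise."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return 0 if coverage >= threshold else 1


exit_code = check_threshold(coverage=0.72, threshold=0.80)
print(exit_code)
# prints: 1
# In a CI script, passing this value to sys.exit() would fail the build.
```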

Interested in early access?

If you have a use case for evaluation coverage metrics, open a GitHub issue or discussion and describe your needs. Your feedback will directly shape the design of KyroCover.
