This directory tracks benchmark scenarios used to calibrate precision/recall for autonomous findings.
- Deliberately vulnerable API fixtures (BOLA, BFLA, mass assignment)
- Rate limiting and auth differential scenarios
- Multi-step exploit chain scenarios

Metrics tracked:
- Endpoint coverage ratio
- Findings precision and recall
- Deterministic-vs-AI validation ratio
- Exploit-chain reproducibility rate
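
As an illustration of the precision and recall metric, here is a minimal sketch of how one scenario might be scored. It assumes each scenario ships a set of known vulnerability IDs and that the scanner reports findings under the same IDs. The `Score` type, the `score_findings` helper, and the example IDs are hypothetical and not part of this directory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Score:
    precision: float
    recall: float


def score_findings(reported: set[str], expected: set[str]) -> Score:
    """Compare reported finding IDs against a scenario's known vulnerabilities.

    Hypothetical helper: the ID-matching scheme is an assumption.
    """
    true_positives = len(reported & expected)
    # Assumed convention: an empty denominator scores 0.0 instead of raising.
    precision = true_positives / len(reported) if reported else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return Score(precision, recall)


if __name__ == "__main__":
    # Made-up IDs for a fixture with three seeded vulnerabilities.
    expected = {"bola-orders", "bfla-admin", "mass-assign-user"}
    reported = {"bola-orders", "bfla-admin", "rate-limit-login", "open-redirect"}
    print(score_findings(reported, expected))
```

In this example two of the four reported findings are real, so precision is 0.5, and two of the three seeded vulnerabilities are found, so recall is 2/3. Matching on exact IDs is the simplest choice; a real harness may need fuzzier matching on endpoint and vulnerability class.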