AI Test Case Generator + Eval
Active
A tool that generates test cases from product requirements using Claude, GPT-4.1, and Gemini in parallel, then scores each model's output through both human review and an LLM-as-judge. It tracks quality trends across prompt revisions with Langfuse observability, includes CLI scripts for batch experiments, and ships with 138 tests across 12 suites.