Benchmark harness that measures the performance of an AI coding tool together with its workflow, not just model capability. 60 tasks, sigmoid scoring, capability profiles, gap analysis. - xmpuspus/ai-workflow-benchmark
Show HN: AWB – Benchmark that tests your AI coding workflow, not just the model
A new benchmark tool, AWB (AI Workflow Benchmark), evaluates the whole AI coding workflow rather than the model alone, and can reveal potential security gaps in automated code generation. Developers and organizations using AI-assisted coding tools may unknowingly introduce vulnerabilities through a flawed workflow, even when the underlying model performs well.
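The repo description mentions sigmoid scoring but does not give the formula AWB uses. As a rough illustration only, here is a generic logistic scoring function in Python. The function name `sigmoid_score`, the `midpoint` and `steepness` parameters, and the "fraction of tests passed" metric are all assumptions made for this sketch, not taken from AWB.

```python
import math


def sigmoid_score(value: float, midpoint: float, steepness: float = 1.0) -> float:
    """Map a raw metric onto the open interval (0, 1).

    Hypothetical helper, not AWB's API: the score is exactly 0.5 when
    value == midpoint and rises smoothly toward 1 as value increases.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (value - midpoint)))


if __name__ == "__main__":
    # Assumed metric: fraction of a task's tests that passed.
    for passed in (0.0, 0.25, 0.5, 0.75, 1.0):
        score = sigmoid_score(passed, midpoint=0.5, steepness=10.0)
        print(f"passed={passed:.2f} score={score:.3f}")
```

A curve like this rewards improvements near the midpoint more than improvements at the extremes, which is one plausible reason a benchmark might prefer it over a linear pass rate. Whether AWB makes the same choice is not stated here.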