ai-research-11-evaluation-bigcode-evaluation-harness

Evaluates code generation models using multiple benchmarks to assess coding abilities and quality across various programming languages.

Security score: 93/100

The ai-research-11-evaluation-bigcode-evaluation-harness skill was audited on Jun 8, 2026 and we found 3 security issues across 2 threat categories. Review the findings below before installing.

Security Issues

Medium (line 228): Template literal with variable interpolation in command context
Source: SKILL.md, line 228: ```bash

Low (line 401): External URL reference
Source: SKILL.md, line 401: - **BigCode Leaderboard**: https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard

Low (line 402): External URL reference
Source: SKILL.md, line 402: - **HumanEval Dataset**: https://huggingface.co/datasets/openai/openai_humaneval

Scanned on Jun 8, 2026