evaluating-llms-harness
Evaluates LLMs on academic benchmarks such as MMLU and GSM8K to support model quality assessment and comparison.
GitHub Stars: 185.0K
Category: data analytics
Updated: June 10, 2026
Tags: openclaw, api, data-scientist, ml-ai-engineer, researcher, marketing-analyst, product-manager, data analytics, development, education research, marketing, product
NousResearch/hermes-agent