Add llama 3.2 mmlu, math, gpqa evals to meta_eval harness (#801)
Changed files:
- tools/benchmarks/llm_eval_harness/meta_eval/README.md: 18 additions, 11 deletions
- tools/benchmarks/llm_eval_harness/meta_eval/eval_config.yaml: 10 additions, 8 deletions
- tools/benchmarks/llm_eval_harness/meta_eval/meta_template/gpqa/gpqa_0shot.yaml: 29 additions, 0 deletions
- tools/benchmarks/llm_eval_harness/meta_eval/meta_template/gpqa/utils.py: 19 additions, 0 deletions
- tools/benchmarks/llm_eval_harness/meta_eval/meta_template/gpqa_cot/gpqa_0shot_cot.yaml: 1 addition, 1 deletion
- tools/benchmarks/llm_eval_harness/meta_eval/meta_template/math_hard/math_4shot_cot.yaml: 21 additions, 0 deletions
- tools/benchmarks/llm_eval_harness/meta_eval/meta_template/math_hard/math_hard_0shot_cot.yaml: 1 addition, 1 deletion
- tools/benchmarks/llm_eval_harness/meta_eval/meta_template/mmlu/mmlu.yaml: 14 additions, 0 deletions
- tools/benchmarks/llm_eval_harness/meta_eval/meta_template/mmlu/utils.py: 31 additions, 0 deletions
- tools/benchmarks/llm_eval_harness/meta_eval/prepare_meta_eval.py: 78 additions, 24 deletions