Projects with this topic
- https://github.com/langfuse/langfuse: 🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with LlamaIndex, Langchain, OpenAI SDK, L…
- https://github.com/EleutherAI/lm-evaluation-harness: A framework for few-shot evaluation of language models (a minimal usage sketch follows this list).
- https://github.com/modelscope/evalscope: A streamlined and customizable framework for efficient large model evaluation and performance benchmarking.
- https://github.com/google/lmeval: LMEval, a large model evaluation framework.
- https://github.com/om-ai-lab/VL-CheckList: Evaluating Vision & Language Pretraining Models with Objects, Attributes and Relations (EMNLP 2022).
- https://github.com/om-ai-lab/OVDEval: A Comprehensive Evaluation Benchmark for Open-Vocabulary Detection (AAAI 2024).
- https://github.com/AkariAsai/OpenScholar_ExpertEval: The expert evaluation interface and data evaluation scripts for the OpenScholar project.
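
To give a sense of how frameworks like these are typically driven, here is a minimal sketch of a zero-shot run with lm-evaluation-harness. It assumes the `simple_evaluate` entry point shown in the project's README and an installed `lm-eval` package; exact argument names can vary between releases, so treat this as an illustration rather than a definitive invocation.

```python
# Minimal sketch: evaluating a HuggingFace model with lm-evaluation-harness.
# Assumes `pip install lm-eval` and the simple_evaluate entry point from the
# project's README; argument names may differ between releases.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                    # HuggingFace transformers backend
    model_args="pretrained=gpt2",  # any HF model id works here
    tasks=["hellaswag"],           # one of the harness's built-in tasks
    num_fewshot=0,                 # zero-shot; raise for few-shot evaluation
    batch_size=8,
)

# Per-task metrics such as accuracy and normalized accuracy.
print(results["results"]["hellaswag"])
```

The same pattern (pick a model backend, pick named tasks, collect a metrics dictionary) carries over to most of the evaluation frameworks listed above, even where the concrete APIs differ.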