The trusted source for AI benchmarks
AgentEval.org is an open, community-driven initiative dedicated to advancing AI development by sharing domain-specific benchmarks and methodologies.
Why we built AgentEval.org
Mission
To establish a trusted, open resource for sharing AI benchmarks and best practices that drive transparency and continuous improvement in AI evaluation.
Vision
A future where open collaboration and shared data standards accelerate responsible AI innovation, making evaluation methodologies accessible and verifiable for everyone.
Why open benchmarks?
Transparency & trust
Open-sourcing our benchmarks and methodologies allows anyone to inspect, validate, and contribute to our evaluation processes.
Community-driven innovation
An open platform invites contributions from a broad community, leading to more robust and diverse evaluation practices.
Industry adoption
Open-source tools and standards are more likely to be adopted by academic institutions, industry players, and public agencies.
Non-profit and collaborative alignment
Emphasizing open source aligns with our non-profit mission to move the industry forward through shared knowledge and collective effort.