EA Forum
AI evaluations and standards
• Applied to I read every major AI lab’s safety plan so you don’t have to (2d ago)
• Applied to OpenAI's o1 tried to avoid being shut down, and lied about it, in evals (12d ago)
• Applied to OpenAI's CBRN tests seem unclear (1mo ago)
• Applied to College technical AI safety hackathon retrospective - Georgia Tech (1mo ago)
• Applied to Comparing AI Labs and Pharmaceutical Companies (1mo ago)
• Applied to The current state of RSPs (1mo ago)
• Applied to Trendlines in AIxBio evals (2mo ago)
• Applied to Announcing ForecastBench, a new benchmark for AI and human forecasting abilities (3mo ago)
• Applied to Join the $10K AutoHack 2024 Tournament (3mo ago)
• Applied to Model evals for dangerous capabilities (3mo ago)
• Applied to Submit Your Toughest Questions for Humanity's Last Exam (3mo ago)
• Applied to Thinking About Propensity Evaluations (4mo ago)
• Applied to A Taxonomy Of AI System Evaluations (4mo ago)
• Applied to Case studies on social-welfare-based standards in various industries (6mo ago)
• Applied to [Paper] AI Sandbagging: Language Models can Strategically Underperform on Evaluations (6mo ago)
• Applied to Demonstrate and evaluate risks from AI to society at the AI x Democracy research hackathon (8mo ago)
• Applied to LLM Evaluators Recognize and Favor Their Own Generations (8mo ago)
• Applied to OMMC Announces RIP (9mo ago)
• Applied to Join the AI Evaluation Tasks Bounty Hackathon (9mo ago)
• Applied to Introducing METR's Autonomy Evaluation Resources (9mo ago)