METR: Measuring AI Ability to Complete Long Tasks — EA Forum