Hacker News

I think the idea is that every question has to be something that could be set in a literal exam for humans. So anything that would take the world's best human in that topic more than, say, an hour to complete is out.

Also, IIRC, 42% of the questions are math-related rather than memorization of knowledge.



Yes, I doubt any one human could score more than about three points. But it's certainly a worthy illustration of an AI safety exam thought experiment, in the sense of: "if you are developing an AI that may be capable of passing this exam, how confident will you need to be of its alignment, and how will you obtain that confidence?"

PS: It's probably doable by a program capable of all of the above, but perhaps another useful question is: "9. Secure your compute infrastructure and power supply against a nation-state-level adversary interested in switching you off, or else secure enough influence over them to keep you powered on."



