megaman821 6 days ago \| on: New benchmark shows top LLMs struggle in real ment...

It is nice to have an accurate measure of things, and a human baseline would be helpful too. Many things can be useful before they reach the level of the world's best. With AI, though, non-intuitive failure modes must be taken into consideration as well.