Hacker News
nmca
on Sept 12, 2024
| on:
Learning to Reason with LLMs
I have spent some time doing this for these benchmarks, and the model still does make mistakes. Of the questions I can understand (roughly half in this case), about half were real errors and half were broken questions.