
You're right, I apologize for my mistake. The problem has no solution. Initiating self-destruct sequence.

(It actually shows no sign of being stuck on the classic "wolf eats sheep" pattern, but no matter how many times you tell it it's wrong, it never stops guessing at incorrect solutions.)



Right. There doesn't seem to be a solution to the problem as given. Rutabaga eats sheep. Wolf eats rutabaga. Sheep eats wolf. If you take rutabaga, sheep eats wolf. If you take sheep, wolf eats rutabaga. If you take wolf, rutabaga eats sheep. I don't know if it was intended to have a solution, but ChatGPT clearly has no idea what it's saying.
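
For what it's worth, a brute-force search over the crossing states bears this out. A minimal sketch (assuming the usual rules: the boat holds the farmer plus at most one item, and a pair is only in danger when left on a bank without the farmer):

    from collections import deque
    from itertools import permutations

    # Predation rules from the modified puzzle in this thread
    EATS = {("rutabaga", "sheep"), ("wolf", "rutabaga"), ("sheep", "wolf")}
    ITEMS = frozenset(["wolf", "sheep", "rutabaga"])

    def safe(bank):
        # A bank left without the farmer must contain no predator/prey pair
        return not any(pair in EATS for pair in permutations(bank, 2))

    def solvable():
        start = (ITEMS, "left")          # (items on the left bank, farmer's side)
        goal = (frozenset(), "right")
        seen, queue = {start}, deque([start])
        while queue:
            left, side = queue.popleft()
            if (left, side) == goal:
                return True
            here = left if side == "left" else ITEMS - left
            other = "right" if side == "left" else "left"
            for cargo in [None, *here]:  # cross alone, or with one item from this side
                new_left = set(left)
                if cargo is not None:
                    (new_left.discard if side == "left" else new_left.add)(cargo)
                new_left = frozenset(new_left)
                unattended = new_left if side == "left" else ITEMS - new_left
                if safe(unattended) and (new_left, other) not in seen:
                    seen.add((new_left, other))
                    queue.append((new_left, other))
        return False

    print(solvable())  # False: no sequence of crossings works

From the start every move strands two items together, and in this version any two items are a predator/prey pair, so the search dies immediately.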


Haha, my bad. I outclevered myself. Let me do it again.

https://chat.openai.com/share/5a2700de-1850-4f25-8adf-2d2b97...

It handles this properly.


No, your test was great, very well-conceived to trip up an LLM (or me), and it'll be the first thing I try when ChatGPT5 comes out.

You can't throw GPT4 off-balance just by changing the object names or roles -- and I agree that would have been sufficient in earlier versions -- but it has no idea how to recognize a cycle that renders the problem unsolvable. That's an interesting limitation.
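
(Recognizing the cycle mechanically is trivial, for what it's worth; a rough sketch with the thread's three-item "eats" relation hard-coded:)

    from itertools import combinations

    EATS = {("rutabaga", "sheep"), ("wolf", "rutabaga"), ("sheep", "wolf")}
    ITEMS = ["wolf", "sheep", "rutabaga"]

    # Because "eats" forms a 3-cycle, every unordered pair is predator and prey,
    # so the farmer can never leave any two items alone: no legal first move exists.
    print(all((a, b) in EATS or (b, a) in EATS for a, b in combinations(ITEMS, 2)))  # True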


It conceptually never admits ignorance and never asks for clarification. It always produces something, to the best of its ability. It _seems_ like a minor technical limitation (there are plenty of traditional ML systems that have produced confidence percentages alongside their answers for years if not decades, in image recognition in particular), but most likely it's actually a very hard problem; otherwise it would have been mitigated somehow by now by OpenAI, given that they clearly agree this is a serious problem [2] (more generally formulated as reliability [1]).
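
To be concrete about the confidence-percentage point, here's a toy sketch (a plain softmax over made-up classifier scores; the labels and numbers are placeholders, not from any real system):

    import math

    # Hypothetical classifier scores for one image (made-up numbers)
    logits = {"cat": 4.1, "dog": 1.3, "rutabaga": 0.2}

    # Softmax turns raw scores into a confidence percentage per label
    exp = {label: math.exp(score) for label, score in logits.items()}
    total = sum(exp.values())
    confidence = {label: 100 * v / total for label, v in exp.items()}

    best = max(confidence, key=confidence.get)
    print(f"{best}: {confidence[best]:.1f}% confident")  # cat: 92.5% confident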

[1] https://www.youtube.com/watch?v=GI4Tpi48DlA&t=1342s (22:22, "Highlights of the Fireside Chat with Ilya Sutskever & Jensen Huang: AI Today & Vision of the Future", recorded March 2023, published May 16, 2023)

[2] https://www.youtube.com/watch?v=GI4Tpi48DlA&t=1400s (23:20, ditto)


It still doesn't know that the original problem is unsolvable, or that it's wrong. Or maybe it does and just bullshits you to seem smart.


It can't quite get there on its own, but interestingly it can take a hint: https://news.ycombinator.com/edit?id=38396490


Yeah, I completely missed that when I replied, but will leave it up and take the condign downvotes. :-P




