
You're right, I apologize for my mistake. The problem has no solution. Initiating self-destruct sequence.

(It actually shows no sign of being stuck on the classic "wolf eats sheep" pattern, but no matter how many times you tell it it's wrong, it never stops guessing at incorrect solutions.)



Right. There doesn't seem to be a solution to the problem as given. Rutabaga eats sheep. Wolf eats rutabaga. Sheep eats wolf. If you take rutabaga, sheep eats wolf. If you take sheep, wolf eats rutabaga. If you take wolf, rutabaga eats sheep. I don't know if it was intended to have a solution, but ChatGPT clearly has no idea what it's saying.
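
For what it's worth, a brute-force search over the crossing states bears this out. A minimal sketch (assuming the usual rules: the boat holds the farmer plus at most one item, and a pair is only in danger when left on a bank without the farmer):

    from collections import deque
    from itertools import permutations

    # Predation rules from the modified puzzle in this thread
    EATS = {("rutabaga", "sheep"), ("wolf", "rutabaga"), ("sheep", "wolf")}
    ITEMS = frozenset(["wolf", "sheep", "rutabaga"])

    def safe(bank):
        # A bank left without the farmer must contain no predator/prey pair
        return not any(pair in EATS for pair in permutations(bank, 2))

    def solvable():
        start = (ITEMS, "left")          # (items on the left bank, farmer's side)
        goal = (frozenset(), "right")
        seen, queue = {start}, deque([start])
        while queue:
            left, side = queue.popleft()
            if (left, side) == goal:
                return True
            here = left if side == "left" else ITEMS - left
            other = "right" if side == "left" else "left"
            for cargo in [None, *here]:  # cross alone, or with one item from this side
                new_left = set(left)
                if cargo is not None:
                    (new_left.discard if side == "left" else new_left.add)(cargo)
                new_left = frozenset(new_left)
                unattended = new_left if side == "left" else ITEMS - new_left
                if safe(unattended) and (new_left, other) not in seen:
                    seen.add((new_left, other))
                    queue.append((new_left, other))
        return False

    print(solvable())  # False: no sequence of crossings works

From the start every move strands two items together, and in this version any two items are a predator/prey pair, so the search dies immediately.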


Haha, my bad. I outclevered myself. Let me do it again.

https://chat.openai.com/share/5a2700de-1850-4f25-8adf-2d2b97...

It handles this properly.


No, your test was great, very well-conceived to trip up an LLM (or me), and it'll be the first thing I try when ChatGPT5 comes out.

You can't throw GPT4 off-balance just by changing the object names or roles -- and I agree that would have been sufficient in earlier versions -- but it has no idea how to recognize a cycle that renders the problem unsolvable. That's an interesting limitation.
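
(Recognizing the cycle mechanically is trivial, for what it's worth; a rough sketch with the thread's three-item "eats" relation hard-coded:)

    from itertools import combinations

    EATS = {("rutabaga", "sheep"), ("wolf", "rutabaga"), ("sheep", "wolf")}
    ITEMS = ["wolf", "sheep", "rutabaga"]

    # Because "eats" forms a 3-cycle, every unordered pair is predator and prey,
    # so the farmer can never leave any two items alone: no legal first move exists.
    print(all((a, b) in EATS or (b, a) in EATS for a, b in combinations(ITEMS, 2)))  # True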


It conceptually never admits ignorance and never asks for clarification. It always produces something, to the best of its ability. It _seems_ like a minor technical limitation (there are plenty of traditional ML systems that have produced confidence percentages alongside their answers for years if not decades, in image recognition in particular), but most likely it's actually a very hard problem; otherwise it would have been mitigated somehow by now by OpenAI, given that they clearly agree this is a serious problem [2] (more generally formulated as reliability [1]).
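
To be concrete about the confidence-percentage point, here's a toy sketch (a plain softmax over made-up classifier scores; the labels and numbers are placeholders, not from any real system):

    import math

    # Hypothetical classifier scores for one image (made-up numbers)
    logits = {"cat": 4.1, "dog": 1.3, "rutabaga": 0.2}

    # Softmax turns raw scores into a confidence percentage per label
    exp = {label: math.exp(score) for label, score in logits.items()}
    total = sum(exp.values())
    confidence = {label: 100 * v / total for label, v in exp.items()}

    best = max(confidence, key=confidence.get)
    print(f"{best}: {confidence[best]:.1f}% confident")  # cat: 92.5% confident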

[1] https://www.youtube.com/watch?v=GI4Tpi48DlA&t=1342s (22:22, "Highlights of the Fireside Chat with Ilya Sutskever & Jensen Huang: AI Today & Vision of the Future", recorded March 2023, published May 16, 2023)

[2] https://www.youtube.com/watch?v=GI4Tpi48DlA&t=1400s (23:20, ditto)


It still doesn't know that the original problem is unsolvable, or that it's wrong. Or maybe it does and just bullshits you to seem smart.


It can't quite get there on its own, but interestingly it can take a hint: https://news.ycombinator.com/edit?id=38396490


Yeah, I completely missed that when I replied, but will leave it up and take the condign downvotes. :-P




