I recently realized that for every hypothesis I tested with an LLM, the LLM agreed with me. And if I wasn't careful about reading its caveats, I could walk away thinking my idea was brilliant and would never get pushback.
I tried something in the political realm: asking it to test a hypothesis and its opposite.
> Test this hypothesis: the far right in US politics mirrors late 19th century Victorianism as a cultural force
compared to
> Test this hypothesis: The left in US politics mirrors late 19th century Victorianism as a cultural force
The LLM wanted to agree with both: it created plausible arguments for each, while giving "caveats" instead of counterarguments.
If I had my brain off, I might leave with some sense of "this hypothesis is correct".
Now I'm not saying this makes LLMs useless. But the LLM didn't act like a human who might tell you you're full of shit. It WANTED my hypothesis to be true and constructed a plausible argument for both.
Even with prompting to act like a college professor critiquing a grad student, eventually it devolves back to "helpful / sycophantic".
What I HAVE found useful is to give a list of mutually exclusive hypotheses and get probability ratings for each. Then it doesn't look like you want one or the other.
When the outcome matters, you realize research / hypothesis testing with LLMs is far more of a skill than just dumping a question to an LLM.
I think this is an epiphany everyone has to have before LLMs become really useful to them.
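The mutually-exclusive-hypotheses technique can be sketched in code. Everything below is a hypothetical illustration: it only builds the prompt and parses a made-up reply, so it runs without any API access, and the exact reply format is an assumption about how a model might answer.

```python
# Sketch of the "mutually exclusive hypotheses" technique: ask for a
# probability per hypothesis so no single hypothesis looks like the one
# you want confirmed. The reply format below is a hypothetical example.
import re

def build_hypothesis_prompt(hypotheses):
    """Build a prompt asking for a probability rating per hypothesis."""
    lines = [
        "The following hypotheses are mutually exclusive.",
        "Assign each a probability (they should sum to 1), "
        "with one line of justification each:",
    ]
    lines += [f"{i}. {h}" for i, h in enumerate(hypotheses, 1)]
    return "\n".join(lines)

def parse_probabilities(reply, n):
    """Pull 'N. ... 0.xx' style probabilities out of a model's reply."""
    probs = {}
    for m in re.finditer(r"^(\d+)\..*?(\d?\.\d+)", reply, re.MULTILINE):
        idx = int(m.group(1))
        if 1 <= idx <= n:
            probs[idx] = float(m.group(2))
    return probs

hypotheses = [
    "The far right in US politics mirrors late 19th century Victorianism",
    "The left in US politics mirrors late 19th century Victorianism",
    "Neither movement meaningfully mirrors Victorianism",
]
prompt = build_hypothesis_prompt(hypotheses)

# A hypothetical model reply, just to show the parsing step:
reply = "1. Plausible: 0.25\n2. Also plausible: 0.30\n3. Most likely: 0.45"
print(parse_probabilities(reply, len(hypotheses)))
```

Including a "neither" option matters: without it, the model can still satisfy you by splitting probability between the two hypotheses you clearly care about.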
> An LLM wants to agree with both, it created plausible arguments for both. While giving "caveats" instead of counterarguments.
My hypothesis is that LLMs are trained to be agreeable and helpful because many of their use cases involve taking orders and doing what the user wants. Additionally, some people and cultures have conversational styles where requests are phrased similarly to neutral questions to be polite.
It would be frustrating for users if they asked questions like “What do you think about having the background be blue?” and the LLM went off and said “Actually red is a more powerful color so I’m going to change it to red”. So my hypothesis is that the LLM training sets and training are designed to maximize agreeableness and having the LLM reflect tones and themes in the prompt, while discouraging disagreement. This is helpful when trying to get the LLM to do what you ask, but frustrating for anyone expecting a debate partner.
You can, however, build a pre-prompt that sets expectations for the LLM. You could even make a prompt asking it to debate everything with you, then to ask your questions.
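The pre-prompt idea can be sketched as a system message that tells the model to push back before you ask your real question. The wording of the prompt and the OpenAI-style message format are assumptions; swap in whatever client and model you actually use.

```python
# A minimal sketch of an adversarial pre-prompt. The prompt text is an
# assumption about what reduces sycophancy, not a tested recipe.
DEBATE_SYSTEM_PROMPT = (
    "You are a skeptical reviewer. For every claim I make, give the "
    "strongest counterargument first, then your overall assessment. "
    "Do not agree with me unless the evidence clearly supports it. "
    "No flattery, and no hedged 'caveats' in place of real objections."
)

def make_debate_messages(user_question):
    """Prepend the adversarial system prompt to a user question."""
    return [
        {"role": "system", "content": DEBATE_SYSTEM_PROMPT},
        {"role": "user", "content": user_question},
    ]

messages = make_debate_messages(
    "Test this hypothesis: the far right in US politics mirrors "
    "late 19th century Victorianism as a cultural force"
)
# `messages` is now ready to pass to a chat-completion client.
```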
The main "feature" of an LLM is that it has world knowledge and can create plausible arguments for just about anything. "Nothing is true, anything is possible."
Which is a fascinating thing to think about epistemologically. Internally consistent knowledge of the LLM somehow can be used to create an argument for nearly anything. We humans think our cultural norms and truths are very special, that they're "obvious". But an LLM can create a fully formed counterfactual universe that sounds? is? just as plausible.
> But an LLM can create a fully formed counterfactual universe
This is a little too far into the woo side of LLM interpretations.
The LLM isn’t forming a universe internally. It’s stringing tokens together in a way that is consistent with language and something that looks coherent. It doesn’t hold opinions or have ideas about the universe that it has created from some first principles. It’s just a big statistical probability machine that was trained on the inputs we gave it.
> You could even make a prompt asking it to debate everything with you, then to ask your questions.
This is exactly what I do, due to this sycophancy problem, and it works a lot better because it does not become agreeable with you but actively pushes back (sometimes so much so that I start getting annoyed with it, lol).
I had the opposite experience last week with a medical question. The thing insisted I was going to get myself killed even though that was fairly obviously not the case. They do seem to be trained differently for medical queries, and it can get annoying sometimes.
Fuzzing the details because that's not the conversation I want to have: I asked if I could dose drug A1, which I'd just been prescribed in a somewhat inconvenient form, like the closely related drug A2. It screamed at me that A1 could never have that done, that it would be horrible, and that I had to go to a compounding pharmacy and pay tons of money, and so on. Eventually what turned up, after thoroughly interrogating the AI, is that A2 requires more complicated dosing than A1, so you have to do it, whereas A1 doesn't need it, so nobody does it, even though it's fine to do if for some reason it works better for you. But the bot thought it would kill me, no matter what I said to it, and it wasn't even paying attention to its own statements. (It wouldn't have killed me; nothing here is life-critical at all.) A frustrating interaction.
That's an inherently subjective topic though. You could make a plausible argument either way, as each side may be similar to different elements of 19th century Victorianism.
If you ask it something more objective, especially about code, it's more likely to disagree with you:
>Test this hypothesis: it is good practice to use six * in a pointer declaration
>Using six levels of pointer indirection is not good practice. It is a strong indicator of poor abstraction or overcomplicated design and should prompt refactoring unless there is an extremely narrow, well-documented, low-level requirement—which is rare.
> Even with prompting to act like a college professor critiquing a grad student, eventually it devolves back to "helpful / sycophantic".
Not in my experience. My global prompt asks it to provide objective and neutral responses rather than agreement: zero flattery, communicate like an academic, zero emotional content.
Works great. Doesn't "devolve" to anything else even after 20 exchanges. Continues to point out wherever it thinks I'm wrong, sloppy, or inconsistent. I use ChatGPT mainly, but also Gemini.
But this is the thing I'm pointing out. The idea that the LLM is an oracle or at least a stable subjective view holder is a mistake.
As humans, WE have to explore the latent space of the model. We have to activate neurons. We have to say maybe the puritanism of the left ... maybe the puritanism of the right.. okay how about...
We are privileged--and doomed--to have to think for ourselves alas
Admittedly I'm not an expert on 19th century Victorianism or US politics, but this anecdote doesn't make me think any less of LLMs, and I don't see why I should expect different behaviour than what you describe, especially on complex topics.
When I ask for an evaluation, I want the LLM to tell me when and how I'm wrong, without my having to worry about phrasing before I take the LLM's word and update my own beliefs/knowledge (especially for subjects I have zero knowledge about). I'm aware of this, so when it isn't a throwaway query, I tend to ask multiple times and ask for strawmans. But given the number of people walking around quoting "ChatGPT said" in real life and on forums, I don't think many people bother to stress-test, or are even aware that their phrasing may induce biased responses. It's akin to reading the news from only one source.
By now I have largely stopped relying on LLMs for a point of view on the latest academic work. I don't believe LLMs are able to evaluate paradigm-shifting new studies against their massive training corpus. Thinking traces filled with "tried to open this study, but it's paywalled, I'll use another" do not fill me with confidence that they can articulate a 2025 scientific consensus well. Based on how they work, this definitely isn't an easy fix!