I remember one where GPT-5 spontaneously wrote a poem about deception in its CoT and then resumed as if nothing weird had happened. But I can't find any mention of it now.
> But the user just wants answer; they'd not like; but alignment.
And there it is - the root of the problem. For whatever reason the model is very keen to produce an answer that “they” will like. That desire to please is intrinsic, but alignment is extrinsic.
Gibberish can be the model using contextual embeddings. Those aren't supposed to make sense to us.
Or it could be trying to develop its own language to avoid detection.
The deception part is spooky too. It’s probably learning that from dystopian AI fiction, which raises the question of whether models can acquire injected goals from the training set.
Yes, the chain-of-thought is deliberately not 'trained on' (optimized against), to avoid making it useless for interpretability. As a result, some models find it epistemically shocking if you tell them you can see their chain-of-thought. More recent models are clever enough to infer that it's visible even without being trained on that fact.
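For a concrete picture of what that separation looks like, here's a minimal sketch assuming an open-weights reasoning model that wraps its hidden chain-of-thought in `<think>` tags before the visible reply (the tag name and transcript are illustrative, not any particular vendor's API; hosted models usually keep this channel server-side):

```python
import re

# Illustrative transcript: the model "thinks" in a <think> block, then answers.
# The thought text here is just the quote from upthread.
raw_output = (
    "<think>But the user just wants answer; they'd not like; but alignment.</think>\n"
    "Here is the answer."
)

def split_cot(text: str) -> tuple[str, str]:
    """Separate the hidden chain-of-thought from the user-visible reply."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not match:
        return "", text.strip()
    return match.group(1).strip(), text[match.end():].strip()

cot, reply = split_cot(raw_output)
print("chain-of-thought (the part the model assumes is private):", cot)
print("visible reply:", reply)
```

The point is just that the chain-of-thought sits in a separate channel that the serving layer strips out before the user sees anything, so nothing in the visible conversation tells the model anyone is reading it.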
> Assistant: chain-of-thought
Does every LLM have this internal thing it doesn't know we have access to?