More

Rover222 · 2025-12-12T15:31:44 1765553504

Just curious in what ways, since you've actually owned both. I've only had Teslas, but I like the looks of Rivians.

mbesto · 2025-12-15T15:39:41 1765813181

Tesla UX gripes (may be specific to Model S 2020):

- Yolk steering is terrible

- Lack of physical knobs. Haptics are nice but haptics don't work well for cars.

- Tesla menus are getting stuffed more and more with options, making affordance and UI crowding much worse

- The horn is a button press...no one in a emergency situation is looking for a button press..

- Native apps (for example Spotify) are inferior to just using my phone via bluetooth

- Calendar integration / notification is too chatty

Rover222 · 2025-12-12T15:30:41 1765553441

Sounds like you want a Nissan Leaf? They exist.

ActorNightly · 2025-12-12T16:52:40 1765558360

Nissan Leaf is the same price as a Camry. Its not cheap.

Rover222 · 2025-12-12T15:29:57 1765553397

Any company adding fancy hardware (beyond good cameras and inference chips) to achieve self driving is on the wrong track at this point. Software is what will win this game.

Of course, Waymo has achieved good results with A LOT of fancy hardware. But it's hard to see how they stand a chance against Cybercab mass production (probably behind schedule, but eventually).

I think Rivian has the best-looking line of EVs out there. Maybe they will be able to come from behind in self-driving tech. But this big reveal is not that promising, IMO.

horsawlarway · 2025-12-12T15:37:13 1765553833

An alternate take here:

The "fancy" hardware is going to get dirt cheap, and in a game where you're asking your customer to trust you with their lives, reliability is going to win. Combine that with time to market, and Tesla feels like a pretty clear "risky bet" at best... Maybe they make it work, but they have to do it before the other companies make lidar cheap, and prices have fallen dramatically over the past 10 years, for much better hardware.

Rover222 · 2025-12-12T16:42:28 1765557748

Yeah, that is all reasonable. I think the jury is still out on if sensor fusion can really get far enough up the march of nines (will it work in 99.99 percent of scenarios)? Karpathy has given some good interviews about why Tesla ditched the sensor fusion approach and switched to vision-only.

Same can be said for vision-only, of course. Maybe it won't every quite get to 99.99.

Rover222 · 2025-12-11T23:55:12 1765497312

Waymo operates on guardrails, with a lot more human-in-the-loop (remotely) help than most people seem aware of.

Tesla's already have similar capabilities, in a much wider range of roads, in vehicles that cost 80% less to manufacture.

They're both achieving impressive results. But if you read beyond headlines, Tesla is setup for such more more success than Waymo in the next 1-2 years.

JumpCrisscross · 2025-12-12T02:58:30 1765508310

> Tesla is setup for such more more success than Waymo in the next 1-2 years

Iff cameras only works. With threshold for "works" beig set by Waymo, since a Robotaxi that's would have been acceptable per se may not be if it's statistically less safe compared to an existing solution.

Waymo also sets the timeline. If cameras only would work, but Waymo scales before it does, Tesla may be forced by regulators to integrate radars and lidars. This nukes their cost advantage, at least in part, though Tesla maintains its manufacturing lead and vertical integration.)

Tesla has a good hand. But Rivian's play makes sense. If cameras only fails, they win on licensing and a temporary monopoly. If cameras only work, they are a less-threatening partner for other car companies than Waymo.

simondotau · 2025-12-12T08:53:45 1765529625

In the increasingly rare instances where Tesla's solution is making mistakes, it's pretty much never to do with a failure of spatial awareness (sensing) but rather a failure of path planning (decision-making).

The only thing LIDAR can do sense depth, and if it turns out sensing depth using cameras is a solved problem, adding LIDAR doesn't help. It can't read road signs. It can't read road lines. It can't tell if a traffic light is red or green. And it certainly doesn't improve predictions of human drivers.

dzhiurgis · 2025-12-12T10:48:27 1765536507

Which begs me the question why Tesla took so long to get here? It's only since v12 it starting to look bearable for supervised use.

The only answer I see is their goal to create global model that works in every part of the world vs single city which is vastly more difficult. After all most drivers really only know how to drive well in their own town and make a lot of mistakes when driving somewhere else.

Rover222 · 2025-12-12T16:45:59 1765557959

It was only about 2 years ago that they switched from hard coded logic to machine learning (video in, car control out), and this was the beginning of their final path they are committed to now. (building out manufacturing for Cybercab while still finalizing the FSD software is a pretty insane risk that no other company would take)

dzhiurgis · 2025-12-12T20:20:37 1765570837

That’s the switch for controls, the machine vision was nn from the start.

simondotau · 2025-12-12T11:36:04 1765539364

Path planning (decision-making) is by far the most complicated part of self-driving. Waymo vehicles were making plenty of comically stupid mistakes early on, because having sufficient spatial accuracy was never the truly hard part.

KeplerBoy · 2025-12-12T10:49:11 1765536551

Sensing depth is pretty important though. Especially in scenarios where vision fails, radar for example works perfectly fine in the thickest of fog.

simondotau · 2025-12-12T11:43:29 1765539809

In "scenarios where vision fails" the car should not be driving. Period. End of story. It doesn't matter how good radar is in fog, because radar alone is not enough.

KeplerBoy · 2025-12-12T11:55:57 1765540557

Too bad conditions can change instantly. You can't stop the car at an alpine tunnel exit just because there's heavy fog on the other side of the mountain.

simondotau · 2025-12-12T13:06:36 1765544796

If the fog is thick enough that you literally can't see the road, you absolutely can and should stop. Most of the time there's still some visibility through fog, and so your speed should be appropriate to the conditions. As the saying goes, "don't drive faster than your headlights."

ra7 · 2025-12-12T15:14:41 1765552481

> The only thing LIDAR can do sense depth

This is absolutely false. LiDAR is used heavily in object detection. There’s plenty of literature on this. Here’s a few from Waymo:

https://waymo.com/research/streaming-object-detection-for-3-...

https://waymo.com/research/lef-late-to-early-temporal-fusion...

https://waymo.com/research/3d-human-keypoints-estimation-fro...

In fact, LiDAR is a key component for detecting pedestrian keypoints and pose estimation. See https://waymo.com/blog/2022/02/utilizing-key-point-and-pose-...

Here’s an actual example of LiDAR picking up people in the dark well before cameras: https://www.reddit.com/r/waymo/s/U8eq8BEaGA

Not to mention they’re also highly critical for simulation.

> It can't read road signs. It can't read road lines.

Also false. Here’s Waymo’s 5th-gen LiDAR raw point clouds that can even read a logo on a semi truck: https://youtube.com/watch?v=COgEQuqTAug&t=11600s

It seems you’re misinformed about how this sensor is used. The point clouds (plus camera and radar data) are all fed to the models for detection. That makes their detectors much more robust in different lighting and weather conditions than cameras alone.

Rover222 · 2025-12-12T16:43:58 1765557838

I think "sensing depth" and "object detection" are the same things in this debate though

ra7 · 2025-12-12T16:55:30 1765558530

It's just "sensing depth" the same way cameras provide just "pixels". A fused cameras+radars+lidar input provides more robust coverage in a variety of conditions.

simondotau · 2025-12-14T03:44:59 1765683899

You know it would be even more robust under even more conditions? Putting 80 cameras and 20 LIDAR sensors on the car. Also a dozen infrared heat sensors, a spectrophotometer, and a Doppler radar. More is surely always better. Waymo should do that.

ra7 · 2025-12-14T03:55:21 1765684521

Maybe Tesla should reduce their camera count from 8 to 2 and put them on a swivel like human eyes. Less is surely always better.

I can also make “clever” arguments that are useless.

simondotau · 2025-12-14T09:07:15 1765703235

Remarkable. You managed to both misunderstand my point and, in drafting your witty riposte, accidentally understand it and adopt it as your own. More isn't objectively better, less isn't objectively better. There's only different strategies and actual real world outcomes.

ra7 · 2025-12-14T13:46:04 1765719964

> More isn't objectively better, less isn't objectively better.

Great, you finally got there. All it took was one round of correcting misinformation about LiDAR and another round of completely useless back-and-forth about sensor count.

The words you’re looking for are necessary and sufficient. Cameras are necessary, but not sufficient.

> There's only different strategies and actual real world outcomes.

Thanks for making my point. Actual real world outcomes are exactly what matter: 125M+ fully autonomous miles versus 0 fully autonomous miles.

simondotau · 2025-12-14T19:42:06 1765741326

Oh I’m sorry, I didn’t realise you think you’re in a battle of fanboy talking points. Never mind. Not interested.

ra7 · 2025-12-14T19:55:58 1765742158

Highly ironic considering you started this comment chain with a bunch of fanboy talking points and misinformation. Clearly, you’re not interested in being factual. Bye.

ra7 · 2025-12-12T02:33:49 1765506829

Tesla literally has a human in the driver seat for each and every mile. Their robotaxi which operates on geofenced “guardrails” has a human in the driver seat or passenger seat depending on area of its operation, and also has active remote supervision. That’s direct supervision 100% of the time. It is in no way similar in capability to Waymo.

We’ve been hearing Tesla will “surpass Waymo in the next 1-2 years” from the past 8 years, yet they are nowhere close. It’s always future tense with Tesla and never about the current state.

Rover222 · 2025-12-11T15:37:23 1765467443

The lightly-used EV market is where it's at, for now.

rtkwe · 2025-12-11T16:18:09 1765469889

100% let someone else eat the initial depreciation hit from going from new to used and you can usually get one with pretty low mileage too.

bluGill · 2025-12-11T16:29:39 1765470579

So far, but it is hard to find good EVs on the used market. Sometimes you can, but at the moment I'm looking and there just are not many options at all.

Rover222 · 2025-12-11T16:32:26 1765470746

A 1 or 2 yr old Model Y is a great EV...

bluGill · 2025-12-11T16:49:35 1765471775

Is it? Will I be able to get parts for it 15 years from now? Will I be able to repair it myself? (I have rebuilt engines before, and I'm planning to replace the transmission in my truck myself this year, so DIY ability is important to me). There have been headlines recently about how unreliable Tesla is.

rtkwe · 2025-12-11T18:13:41 1765476821

Transmissions on EVs are generally single stage speed reduction so yes in the extremely rare chance you need to repair it you can. The motors are way harder to repair but they're similarly way less likely to need repair. Other part availability will vary by brand just like with ICE cars/trucks there's nothing magically less available if we're looking at new parts, older common parts there's a bit of a difference just from the designs being newer so there's a smaller population of junked cars to pull random EV pumps off of the way you can with some old car parts.

Rover222 · 2025-12-11T23:57:18 1765497438

It's the most popular vehicle in the world by most metrics, so I don't think you need to be concerned about parts. Especially if comparing to any other EV available in the US. Also extremely reliable. Don't trust clickbait headlines, look at consumer reports and satisfaction.

rtkwe · 2025-12-11T16:49:03 1765471743

Might be location dependent I had a lot of options last time I was poking around when my current gas car started making potentially expensive sounding noises earlier this year. (they stopped so I'm fine again /s)

Rover222 · 2025-12-11T15:36:37 1765467397

My brain insists on reading that title as "gas for gays for only $1.69"

Would be an interesting promo, I guess.

ugh123 · 2025-12-11T16:07:27 1765469247

How do you verify it?!

BobaFloutist · 2025-12-12T00:12:22 1765498342

With a classic gas-station restroom tradition!

Rover222 · 2025-12-11T16:32:39 1765470759

that's the big question haha

Rover222 · 2025-12-10T17:49:05 1765388945

Imagine being a blackmarket GPU smuggler. High danger and high reward to get the most advanced AI silicon to a corporation operating under a repressive regime.

Sounds straight out of sci-fi.

gosub100 · 2025-12-10T19:07:24 1765393644

Illegal video gamers playing 3d shooters in virtual world using contraband gpus

KurSix · 2025-12-10T17:52:14 1765389134

The wild part is that it sounds sci-fi, but the reality is probably way less glamorous

RobotToaster · 2025-12-10T18:28:10 1765391290

I don't think a h100 in the prison pocket would be very comfortable.

Rover222 · 2025-12-05T20:56:14 1764968174

I just tried to get Gemini to produce an image of a dog with 5 legs to test this out, and it really struggled with that. It either made a normal dog, or turned the tail into a weird appendage.

Then I asked both Gemini and Grok to count the legs, both kept saying 4.

Gemini just refused to consider it was actually wrong.

Grok seemed to have an existential crisis when I told it it was wrong, becoming convinced that I had given it an elaborate riddle. After thinking for an additional 2.5 minutes, it concluded: "Oh, I see now—upon closer inspection, this is that famous optical illusion photo of a "headless" dog. It's actually a three-legged dog (due to an amputation), with its head turned all the way back to lick its side, which creates the bizarre perspective making it look decapitated at first glance. So, you're right; the dog has 3 legs."

You're right, this is a good test. Right when I'm starting to feel LLMs are intelligent.

theoa · 2025-12-06T08:32:42 1765009962

Draw a millipede as a dog:

Gemini responds:

Conceptualizing the "Millipup"

https://gemini.google.com/share/b6b8c11bd32f

Draw the five legs of a dog as if the body is a pentagon

https://gemini.google.com/share/d74d9f5b4fa4

And animal legs are quite standardized

https://en.wikipedia.org/wiki/List_of_animals_by_number_of_l...

It's all about the prompt. Example:

Can you imagine a dog with five legs?

https://gemini.google.com/share/2dab67661d0e

And generally, the issue sits between the computer and the chair.

;-)

vunderba · 2025-12-06T17:06:08 1765040768

This is basically the "Rhinos are just fat unicorns" approach. Totally fine if you want to go that route but a bit goofy. You can get SOTA models to generate a 5-legged dog simply by being more specific about the placement of the fifth leg.

https://imgur.com/a/jNj98Pc

Asymmetry is as hard for AI models as it is for evolution to "prompt for" but they're getting better at it.

Rover222 · 2025-12-06T16:14:52 1765037692

haha fair point, you can get the expected results with the right prompt, but I think it still reveals a general lack of true reasoning ability (or something)

ithkuil · 2025-12-06T21:23:47 1765056227

Or it just shows that it tries to overcorrect the prompt which is generally a good idea in the most cases where the prompter is not intentionally asking a weird thing.

This happens all the time with humans. Imagine you're at a call center and get all sorts of weird descriptions of problems with a product: every human is expected to not expect the caller is an expert and actually will try to interpolate what they might mean by the weird wording they use

macNchz · 2025-12-05T23:46:43 1764978403

An interesting test in this vein that I read about in a comment on here is generating a 13 hour clock—I tried just about every prompting trick and clever strategy I could come up with across many image models with no success. I think there's so much training data of 12 hour clocks that just clobbers the instructions entirely. It'll make a regular clock that skips from 11 to 13, or a regular clock with a plaque saying "13 hour clock" underneath, but I haven't gotten an actual 13 hour clock yet.

RestartKernel · 2025-12-06T00:25:43 1764980743

Right you are. It can do 26 hours just fine, but appears completely incapable when the layout would be too close to a normal clock.

https://gemini.google.com/share/b3b68deaa6e6

I thought giving it a setting would help, but just skip that first response to see what I mean.

mkl · 2025-12-06T02:03:47 1764986627

That's a 24 hour clock that skips some numbers and puts other numbers out of order.

petters · 2025-12-06T10:06:09 1765015569

"just fine" is not really an accurate description of that 26-hour clock

raw_anon_1111 · 2025-12-06T06:10:04 1765001404

It was ugly. But I got ChatGPT to cheat and do it

https://chatgpt.com/share/6933c848-a254-8010-adb5-8f736bdc70...

This is the SVG it created.

https://imgur.com/a/LLpw8YK

vunderba · 2025-12-05T22:48:44 1764974924

If you want to see something rather amusing - instead of using the LLM aspect of Gemini 3.0 Pro, feed a five-legged dog directly into Nano Banana Pro and give it an editing task that requires an intrinsic understanding of the unusual anatomy.

  Place sneakers on all of its legs.

It'll get this correct a surprising number of times (tested with BFL Flux2 Pro, and NB Pro).

https://imgur.com/a/wXQskhL

tensegrist · 2025-12-06T02:06:56 1764986816

i imagine the real answer is that the edits are local because that's how diffusion works; it's not like it's turning the input into "five-legged dog" and then generating a five-legged dog in shoes from scratch

Lamprey · 2025-12-06T01:56:53 1764986213

Does this still work if you give it a pre-existing many-legged animal image, instead of first prompting it to add an extra leg and then prompting it to put the sneakers on all the legs?

I'm wondering if it may only expect the additional leg because you literally just told it to add said additional leg. It would just need to remember your previous instruction and its previous action, rather than to correctly identify the number of legs directly from the image.

I'll also note that photos of dogs with shoes on is definitely something it has been trained on, albeit presumably more often dog booties than human sneakers.

Can you make it place the sneakers incorrectly-on-purpose? "Place the sneakers on all the dog's knees?"

vunderba · 2025-12-06T02:13:07 1764987187

My example was unclear. Each of those images on Imgur was generated using independent API calls which means there was no "rolling context/memory".

In other words:

1. Took a personal image of my dog Lily

2. Had NB Pro add a fifth leg using the Gemini API

3. Downloaded image

4. Sent image to BFL Flux2 Pro via the BFL API with the prompt "Place sneakers on all the legs of this animal".

5. Sent image to NB Pro via Gemini API with the prompt "Place sneakers on all the legs of this animal".

So not only was there zero "continual context", it was two entirely different models as well to cover my bases.

EDIT: Added images to the Imgur for the following prompts:

- Place red Dixie solo cups on the ends of every foot on the animal

- Draw a red circle around all the feet on the animal

dwringer · 2025-12-05T21:17:45 1764969465

I had no trouble getting it to generate an image of a five-legged dog first try, but I really was surprised at how badly it failed in telling me the number of legs when I asked it in a new context, showing it that image. It wrote a long defense of its reasoning and when pressed, made up demonstrably false excuses of why it might be getting the wrong answer while still maintaining the wrong answer.

Rover222 · 2025-12-05T21:44:22 1764971062

Yeah it gave me the 5-legged dog on the 4th or 5th try.

AIorNot · 2025-12-05T21:19:07 1764969547

Its not that they aren’t intelligent its that they have been RL’d like crazy to not do that

Its rather like as humans we are RL’d like crazy to be grossed out if we view a picture of a handsome man and beautiful woman kissing (after we are told they are brother and sister) -

Ie we all have trained biases - that we are told to follow and trained on - human art is about subverting those expectations

majormajor · 2025-12-05T21:42:11 1764970931

Why should I assume that a failure that looks like a model just doing fairly simple pattern matching "this is dog, dogs don't have 5 legs, anything else is irrelevant" vs more sophisticated feature counting of a concrete instance of an entity is RL vs just a prediction failure due to training data not containing a 5-legged dog and an inability to go outside-of-distribution?

RL has been used extensively in other areas - such as coding - to improve model behavior on out-of-distribution stuff, so I'm somewhat skeptical of handwaving away a critique of a model's sophistication by saying here it's RL's fault that it isn't doing well out-of-distribution.

If we don't start from a position of anthropomorphizing the model into a "reasoning" entity (and instead have our prior be "it is a black box that has been extensively trained to try to mimic logical reasoning") then the result seems to be "here is a case where it can't mimic reasoning well", which seems like a very realistic conclusion.

mlinhares · 2025-12-05T21:48:29 1764971309

I have the same problem, people are trying so badly to come up with reasoning for it when there's just nothing like that there. It was trained on it and it finds stuff it was trained to find, if you go out of the training it gets lost, we expect it to get lost.

didgeoridoo · 2025-12-05T23:01:16 1764975676

I’m inclined to buy the RL story, since the image gen “deep dream” models of ~10 years ago would produce dogs with TRILLIONS of eyes: https://doorofperception.com/2015/10/google-deep-dream-incep...

Lamprey · 2025-12-06T02:05:40 1764986740

That's apples to oranges; your link says they made it exaggerate features on purpose.

"The researchers feed a picture into the artificial neural network, asking it to recognise a feature of it, and modify the picture to emphasise the feature it recognises. That modified picture is then fed back into the network, which is again tasked to recognise features and emphasise them, and so on. Eventually, the feedback loop modifies the picture beyond all recognition."

HardCodedBias · 2025-12-06T02:39:14 1764988754

"There are four lights"

And the AI has been RLed for tens of thousands of years not just a few days.

squigz · 2025-12-06T01:55:29 1764986129

I feel a weird mix of extreme amusement and anger that there's a fleet of absurdly powerful, power-hungry servers sitting somewhere being used to process this problem for 2.5 minutes

Rover222 · 2025-12-06T16:17:52 1765037872

what a world we live in

tarsinge · 2025-12-06T17:03:06 1765040586

I have only a high level understanding of LLMs but to me it doesn’t seem surprising: they are trying to come up with a textual output of your prompt aggregated to their result that scores high (i.e. is consistent) with their training set. There is no thinking, just scoring consistency. And a dog with 5 legs is so rare or nonexistent in their training set and their resulting weights that it scores so bad they can’t produces an output that accepts it. But how the illusion breaks down in this case is quite funny indeed.

irthomasthomas · 2025-12-05T21:32:02 1764970322

Isn't this proof that LLMs still don't really generalize beyond their training data?

adastra22 · 2025-12-05T23:23:19 1764976999

LLMs are very good at generalizing beyond their training (or context) data. Normally when they do this we call it hallucination.

Only now we do A LOT of reinforcement learning afterwards to severely punish this behavior for subjective eternities. Then act surprised when the resulting models are hesitant to venture outside their training data.

runarberg · 2025-12-06T01:38:49 1764985129

Hallucination are not generalization beyond the training data but interpolations gone wrong.

LLMs are in fact good at generalizing beyond their training set, if they wouldn’t generalize at all we would call that over-fitting, and that is not good either. What we are talking about here is simply a bias and I suspect biases like these are simply a limitation of the technology. Some of them we can get rid of, but—like almost all statistical modelling—some biases will always remain.

adastra22 · 2025-12-06T04:17:13 1764994633

What, may I ask, is the difference between "generalization" and "interpolation"? As far as I can tell, the two are exactly the same thing.

In which case the only way I can read your point is that hallucinations are specifically incorrect generalizations. In which case, sure if that's how you want to define it. I don't think it's a very useful definition though, nor one that is universally agreed upon.

I would say a hallucination is any inference that goes beyond the compressed training data represented in the model weights + context. Sometimes these inferences are correct, and yes we don't usually call that hallucination. But from a technical perspective they are the same -- the only difference is the external validity of the inference, which may or may not be knowable.

Biases in the training data are a very important, but unrelated issue.

runarberg · 2025-12-06T05:43:57 1764999837

Interpolation and generalization are two completely different constructs. Interpolation is when you have two data points and make a best guess where a hypothetical third point should fit between them. Generalization is when you have a distribution which describes a particular sample, and you apply it with some transformation (e.g. a margin of error, a confidence interval, p-value, etc.) to a population the sample is representative of.

Interpolation is a much narrower construct then generalization. LLMs are fundamentally much closer to curve fitting (where interpolation is king) then they are to hypothesis testing (where samples are used to describe populations), though they certainly do something akin to the latter to.

The bias I am talking about is not a bias in the training data, but bias in the curve fitting, probably because of mal-adjusted weights, parameters, etc. And since there are billions of them, I am very skeptical they can all be adjusted correctly.

adastra22 · 2025-12-06T06:56:09 1765004169

I assumed you were speaking by analogy, as LLMs do not work by interpolation, or anything resembling that. Diffusion models, maybe you can make that argument. But GPT-derived inference is fundamentally different. It works via model building and next token prediction, which is not interpolative.

As for bias, I don’t see the distinction you are making. Biases in the training data produce biases in the weights. That’s where the biases come from: over-fitting (or sometimes, correct fitting) of the training data. You don’t end up with biases at random.

IsTom · 2025-12-06T09:42:45 1765014165

> It works via model building and next token prediction, which is not interpolative.

I'm not particularly well-versed in LLMs, but isn't there a step in there somewhere (latent space?) where you effectively interpolate in some high-dimensional space?

adastra22 · 2025-12-06T10:19:37 1765016377

Not interpolation, no. It is more like the N-gram autocomplete used to use to make typing and autocorrect suggestions in your phone. Attention js not N-gram, but you can kinda think of it as being a sparsely compressed N-gram where N=256k or whatever the context window size is. It’s not technically accurate, but it will get your intuition closer than thinking of it as interpolation.

The LLM uses attention and some other tricks (attention, it turns out, is not all you need) to build a probabilistic model of what the next token will be, which it then sampled. This is much more powerful than interpolation.

runarberg · 2025-12-06T07:25:06 1765005906

What I meant was that what LLMs are doing is very similar to curve fitting, so I think it is not wrong to call it interpolation (curve fitting is a type of interpolation, but not all interpolation is curve fitting).

As for bias, sampling bias is only one many types of biases. I mean the UNIX program YES(1) has a bias towards outputting the string y despite not sampling any data. You can very easily and deliberately program a bias into everything you like. I am writing a kanji learning program using SSR and I deliberately bias new cards towards the end of the review queue to help users with long review queues empty it quicker. There is no data which causes that bias, just program it in there.

I don‘t know enough about diffusion models to know how biases can arise, but with unsupervised learning (even though sampling bias is indeed very common) you can get a bias because you are using wrong, mal-adjusted, to many parameters, etc. even the way your data interacts during training can cause a bias, heck even by random one of your parameters hits an unfortunate local maxima yielding a mal-adjusted weight, which may cause bias in your output.

adastra22 · 2025-12-06T10:28:52 1765016932

Training is kinda like curve fitting, but inference is not. The inference algorithm is random sampling from a next-token probability distribution.

It’s a subtle distinction, but I think an important one in this case, because if it was interpolation then genuine creativity would not be possible. But the attention mechanism results in model building in latent space, which then affects the next token distribution.

runarberg · 2025-12-06T16:58:29 1765040309

I’ve seen both opinions on this in the philosophy of statistics. Some would say that machine learning inference is something other then curve fitting, but others (and I subscribe to this) believe it is all curve fitting. I actually don‘t think which camp is right is that important but I do like it when philosophers ponder about these tings.

My reasons to subscribing to the latter camp is that when you have a distribution and you fit things according to that distribution (even when the fitting is stochastic; and even when the distribution belongs in billions of dimensions) you are doing curve fitting.

I think the one extreme would be a random walk, which is obviously not curve fitting, but if you draw from any other distribution then the uniform distribution, say the normal distribution, you are fitting that distribution (actually, I take that back, the original random walk is fitting the uniform distribution).

Note I am talking about inference, not training. Training can be done using all sorts of algorithms, some include priors (distributions) and would be curve fitting, but only compute the posteriors (also distributions). I think the popular stochastic linear descent does something like this, so it would be curve-fitting, but the older evolutionary algorithm just random walks it and is not fitting any curve (except the uniform distribution). What matters to me is that the training arrives at a distribution, which is described by a weight matrix, and what inference is doing is fitting to that distribution (i.e. the curve).

adastra22 · 2025-12-06T21:41:28 1765057288

I get the argument that pulling from a distribution is a form of curve fitting. But unless I am misunderstanding, the claim is that it is a curve fitting / interpolation between the training data. The probability distribution generated in inference is not based on the training data though. It is a transform of the context through the trained weights, which is not the same thing. It is the application of a function to context. That function is (initially) constrained to reproduce the training data when presented with a portion of that data as context. But that does not mean that all outputs are mere interpolations between training datapoints.

Except in the most technical sense that any function constrained to meet certain input output values is an interpolation. But that is not the smooth interpolation that seems to be implied here.

CamperBob2 · 2025-12-05T22:17:38 1764973058

They do, but we call it "hallucination" when that happens.

Zambyte · 2025-12-05T22:49:46 1764974986

I wonder how they would behave given a system prompt that asserts "dogs may have more or less than four legs".

irthomasthomas · 2025-12-06T00:04:39 1764979479

That may work but what actual use would it be? You would be plugging one of a million holes. A general solution is needed.

CamperBob2 · 2025-12-06T19:45:08 1765050308

Not necessarily. The problem may be as simple as the fact that LLMs do not see "dog legs" as objects independent of the dogs they're attached to.

The systems already absorb much more complex hierarchical relationships during training, just not that particular hierarchy. The notion that everything is made up of smaller components is among the most primitive in human philosophy, and is certainly generalizable by LLMs. It just may not be sufficiently motivated by the current pretraining and RL regimens.

Rover222 · 2025-12-05T21:43:51 1764971031

Kind of feels that way

visioninmyblood · 2025-12-06T23:01:34 1765062094

I tried this by using an gemini visual agent build with orion from vlm.run. it was able to produce two different images with five leg dog. you need to make it play with itself to improve and correct.

https://chat.vlm.run/c/62394973-a869-4a54-a7f5-5f3bb717df5f

Here is the though process summary(you can see the full thinking the link above):

"I have attempted to generate a dog with 5 legs multiple times, verifying each result. Current image generation models have a strong bias towards standard anatomy (4 legs for dogs), making it difficult to consistently produce a specific number of extra limbs despite explicit prompts."

qnleigh · 2025-12-05T23:40:30 1764978030

It's not obvious to me whether we should count these errors as failures of intelligence or failures of perception. There's at least a loose analogy to optical illusion, which can fool humans quite consistently. Now you might say that a human can usually figure out what's going on and correctly identify the illusion, but we have the luxury of moving our eyes around the image and taking it in over time, while the model's perception is limited to a fixed set of unchanging tokens. Maybe this is relevant.

(Note I'm not saying that you can't find examples of failures of intelligence. I'm just questioning whether this specific test is an example of one).

cyanmagenta · 2025-12-05T23:52:52 1764978772

I am having trouble understanding the distinction you’re trying to make here. The computer has the same pixel information that humans do and can spend its time analyzing it in any way it wants. My four-year-old can count the legs of the dog (and then say “that’s silly!”), whereas LLMs have an existential crisis because five-legged-dogs aren’t sufficiently represented in the training data. I guess you can call that perception if you want, but I’m comfortable saying that my kid is smarter than LLMs when it comes to this specific exercise.

FeepingCreature · 2025-12-06T00:52:07 1764982327

Your kid, it should be noted, has a massively bigger brain than the LLM. I think the surprising thing here maybe isn't that the vision models don't work well in corner cases but that they work at all.

Also my bet would be that video capable models are better at this.

qnleigh · 2025-12-06T11:11:13 1765019473

LLMs can count other objects, so it's not like they're too dumb to count. So a possible model for what's going on is that the circuitry responsible for low-level image recognition has priors baked in that cause it to report unreliable information to parts that are responding for higher-order reason.

So back to the analogy, it could be as if the LLMs experience the equivalent of a very intense optical illusion in these cases, and then completely fall apart trying to make sense of it.

nearbuy · 2025-12-06T06:20:33 1765002033

My guess is the part of its neural network that parses the image into a higher level internal representation really is seeing the dog as having four legs, and intelligence and reasoning in the rest of the network isn't going to undo that. It's like asking people whether "the dress" is blue/black or white/gold: people will just insist on what they see, even if what they're seeing is wrong.

SecretDreams · 2025-12-06T01:36:38 1764984998

LLMs are getting a lot better at understanding our world by standard rules. As it does so, maybe it losses something in the way of interpreting non standard rules, aka creativity.

dostick · 2025-12-07T09:31:43 1765099903

Of course if you put the leg there it’s confusing. Just give it AGIP logo as a test.

varispeed · 2025-12-06T00:28:11 1764980891

Do 7 legged dog. Game over.

criddell · 2025-12-06T12:13:11 1765023191

Is that a dog though?

DANmode · 2025-12-06T16:23:39 1765038219

What is "a dog"?

What is " a dog" to Gemini?

isodev · 2025-12-06T06:59:29 1765004369

> starting to feel LLMs are intelligent

LLMs are fancy “lorem ipsum based on a keyword” text generators. They can never become intelligent … or learn how to count or do math without the help of tools.

It can probably generate a story about a 5 legged dog though.

Rover222 · 2025-12-02T17:10:48 1764695448

`Something one doesn't see` - no pun intended

Rover222 · 2025-12-01T15:02:24 1764601344

+1, Taiwan is a great place