Hacker News | gordon_freeman's comments

Folks who have some extra time this Christmas, do check out the TV show Pluribus on Apple TV. It’s not your typical action-packed sci-fi show but a very slow-burning, philosophical kind of show, which I found very smart.

Turns out my two favorite TV shows of the last couple of years are both on Apple TV: Severance and Pluribus.


thanks for the recommendation

Jumping on to agree with Severance and make a few more.

Silicon Valley - for me the funniest (and sometimes painful to watch) TV show. Schitt’s Creek - heartwarming, wonderfully written, very funny bite-size episodes perfect for cosy Christmas times.


Walmart therapist?


People use LLMs as their therapist because they’re either unwilling to see or unable to afford a human one. Based on anecdotal Reddit comments, some people have even mentioned that an LLM was more “compassionate” than a human therapist.

Due to economics, being able to see a human therapist in person for more than 15 minutes at a time has now become a luxury.

Imo this is dangerous, given the memory features that both Claude and ChatGPT have. Of course, most medical data is already online, but at least there are medical privacy laws in some countries.


This is exactly why the two use cases need to be delineated.


As in cheap.


The thing with reading vs speaking to my emails is, it's much quicker and mentally less exhausting for me to just read and quickly reply or move them to a folder in just a few seconds than to have this kind of long conversation while driving, putting pedestrians, other drivers, and myself at risk.


But let's be real, people need something to do while driving. They're not robots who can sit there tirelessly doing nothing but driving. Listen to music or podcasts or phone a friend. Something. If that something is talk with an AI about their emails, that seems better than having them look at their phone and trying to type into it while driving.


It seems like the progress from GPT-4 to GPT-5 has plateaued: for most prompts, I actually find GPT-4 more understandable than GPT-5 [1].

[1] Read the answers from GPT-4 and 5 for this math question: "Ugh I hate math, integration by parts doesn't make any sense"


Basic prose is a saturated bench. You can't go above 100% so by definition progress will stall on such benchmarks.


You say that, but I can imagine a good maths textbook and a bad one, both technically correct and well written prose, but one is better at taking the student on a journey and understanding where people fall off and common misunderstandings without odiously re-explaining everything


This.

Recently I uploaded a screenshot of movie showtimes at a specific theatre and asked ChatGPT to find the optimal time for me to watch the movie based on my schedule.

It confidently found the perfect time and even accounted for factors such as movies in theatres starting 20 minutes late because of the trailers and ads shown beforehand. The only problem: it read the times from the screenshot completely wrong, which threw off all of its output. I tried and tried to get it to extract the times accurately, but it never did, and after getting frustrated I lost trust in its ability. This keeps happening again and again with LLMs.


And this is actually a great use of Agents because they can go and use the movie theater's website to more reliably figure out when movies start. I don't think they're going to feed screenshots in to the LLM.


Honestly might be more indicative of how far behind vision is than anything.

Despite the fact that CV was the first real deep learning breakthrough VLMs have been really disappointing. I'm guessing it's in part due to basic interleaved web text+image next token prediction being a weak signal to develop good image reasoning.


Is there anyone trying to solve OCR? I often think of that annas-archive blog post about how we basically just have to keep shadow libraries alive long enough for the conversion from PDF to plaintext to be solved.

https://annas-archive.org/blog/critical-window.html

I hope one of these days one of these incredibly rich LLM companies accidentally solves this or something, would be infinitely more beneficial to mankind than the awful LLM products they are trying to make


You may want to have a look at Mistral OCR: https://mistral.ai/news/mistral-ocr


This... what?


Reading this makes me so sad and reminded me of a book I read years ago: Hiroshima by John Hersey, a narrative account of survivors who witnessed the impact of the atomic bomb dropped on Hiroshima that morning.


If you have the opportunity to visit, I recommend Nagasaki over Hiroshima and especially these two places in Nagasaki:

Shiroyama Elementary School

Nagai Takashi Memorial Museum Nagasaki

These felt much more personal than anything I saw in Hiroshima and there were zero (other) tourists to interrupt the experience (very much unlike the museum in Hiroshima).


Like the little boy with his skin melted off walking down the road crying for his mother… horrendous stuff.


These stories always have me instantly sobbing, life can be tragically unfair.


[flagged]


What an insensitive, assumptive, stupid remark. You can't possibly know that the person you replied to behaves as you claim. It's 2025, the firebombing of Tokyo is widely recognized now, maybe not by most normies but certainly by any history-adjacent nerd.


Hey, this is kind of a rude response in an otherwise thoughtful and empathetic thread


Oh, I see you don’t give a shit about Dresden?


That book lives rent-free in my head since I read it about 10 years ago. There's no way to forget some of the scenes in that.


Everyone seems to have a different definition for AGI. Is there some kind of standard there?


No- but the main issue is that all reasonable ones I can conceive lead inevitably to the Singularity technologically, and pretty quickly since we seem determined to throw as much silicon as possible at the problem. Hopefully the final step is intractable.


Same for me. Severance is probably the best show of the last decade. The last time I had such an engrossing experience was while reading 1984.

My other two are:

- Shogun (The depiction of 1600s Japan is so real)

- Resident Alien (Funny and heartwarming to see an Alien getting accustomed to life on Earth dealing with complex human relationships with their flaws)

PS: I am sad to exclude Parks and Recreation, which ran from 2009 to 2015 and so is probably considered outside the last decade.


Counterpart!

(That, by the way, is where Severance seemingly got the inspiration for "The Board")


Counterpart was so good and I was so sad to see it cancelled.


See also The Board in Control (2019).


Exactly! This was my first thought.


You'll find halt and catch fire equally engrossing. Give it a shot!


Totally liked Halt and Catch Fire. One of the best I have seen.


> Resident Alien

Interesting. I thought the premise had potential, but found the writing unbearable. There were major plot holes in the universe they created within the first 10 minutes. It just didn't make sense. The dialogue and acting were bad on top of that. Didn't even finish the first episode. That being said, the series has OK ratings and was renewed several times, so it might be me not giving it a fair chance.


You should give it a try and watch S1 in its entirety. Based on your comment, it seems like you are watching it through a different lens. It is not a drama or thriller where you'd look for holes. It is about the perspective of someone who is new to this world trying to blend in.


For a similar vibe to Parks and Recreation, Veep.


I found Shogun the show to be relatively disappointing, after having read the book before. The book has a lot of nuanced explanations of people's motivations and philosophical/intelligent dialogue that the show just skips over, since they wanted to cover a huge tome in just one season.

This series deserved to be 2X longer to cover those imho.


What is fascinating about this announcement is that if you look into the future, after considerable improvements in the product and the model, we will just be chatting with ChatGPT to book dinner tables and flights, buy groceries, and do all sorts of mundane and hugely boring things we do on the web, just by talking to the agents. I'd definitely love that.


I don't. Chat interface sucks; for most of these things, a more direct interface could be much more ergonomic, and easier to operate and integrate. The only reason we don't have those interfaces is because neither restaurants, nor airlines, nor online stores, nor any other businesses actually want us to have them. To a business, the user interface isn't there to help the user achieve their goals - it's a platform for milking the users as much as possible. To a lesser or greater extent, almost every site actively defeats attempts at interoperability.

Denying interoperability is so culturally ingrained at this point, that it got pretty much baked into entire web stack. The only force currently countering this is accessibility - screen readers are pretty much an interoperability backdoor with legal backing in some situations, so not every company gets to ignore it.

No, we'll have to settle for "chat agents" powered by multimodal LLMs working as general-purpose web scrapers, because those models are the ultimate form of adversarial interoperability, and chat agents are the cheapest, least-effort way to let users operate them.


I think the chat interface is bad, but for certain things it could honestly streamline a lot of mundane things, as the poster you're replying to stated.

For example, McDonald's has heavily shifted away from cashiers taking orders and instead is using the kiosks to have customers order. The downside of this is 1) it's incredibly unsanitary and 2) customers are so goddamn slow at tapping on that god awful screen. An AI agent could actually take orders with surprisingly good accuracy.

Now, whether we want that in the world is a whole different debate.


McDonalds is a good example. In the beginning the Kiosks were a real time-saver, and you could order with a few "klicks".

Today, you need to bypass "do you have the app", "do you want fries with that", "do you want to donate", "are you sure you don't want fries?" and a couple more.

All this is exactly what your parent comment was saying: "To a business, the user interface isn't there to help the user achieve their goals - it's a platform for milking the users as much as possible."

Regarding sanitation, not sure if they are any worse than, say, door handles.


Come to think of it, chat may make things even worse.

What I wrote earlier, about business seeing the interface as a platform for milking users, applies just as much to human interface. After all, "do you want fries with that?" didn't originate with the Kiosks, but with human cashiers. Human staff is, too, being programmed by corporate to upsell you shit. They have explicit instructions for it, and regular compliance checks by "mystery shoppers".

Now, the upsell capabilities of human cashier interface are limited by training, compliance and controls, all of which are both expensive and unreliable processes; additionally, customers are able to skip some of the upsells by refusing the offer quickly and angrily enough - trying to force cashiers to upsell anyway breaks too many social and cultural expectations to be feasible. Meanwhile, programming a Kiosk is free on the margin - you get zero-time training (and retraining) and 100% compliance, and the customer has no control. You can say "stop asking me about fries" to a Kiosk all day, and it won't stop.

It's highly likely a voice chat interface will combine the worst of the characteristics above. It's still software like Kiosk, just programmed by prompts, so still free on the margin, compliant, and retrainable on the spot. At the same time, the voice/conversational aspect makes us perceive the system more like a person, making us more susceptible to upsells, while at the same time, denying us agency and control, because it's still a computer and can easily be made to keep asking you about fries, with no way for you to make it shut up.


> Regarding sanitation, not sure if they are any worse than, say, door handles.

It will depend on the material of the door handles. In my experience, many of the handles are some kind of metal, and bacteria has a harder time surviving on metal surfaces. Compare that to a screen that requires some pretty hard presses in order to get registered inputs from it, and I think you'd find a considerably higher amount of bacteria sitting there.

Additionally, I try to use my sleeve in order to open door handles whenever possible.


McDonald's already tried having AI take orders and stopped when the AI did things like randomly add $250 of McNuggets or mistake ketchup for butter.

Note - because this is something which needs to be pointed out in any discussion of AI now - even though human beings also make mistakes this is still markedly less accurate than the average human employee.


For now


Indeed. I think a GPT-4o class model, properly prompted, would work just fine today. The trick is, unlike a human, the computer is free to just say "no" without consequences. The model could be aggressively prompted to detect and refuse weird orders. Having to escalate to a human supervisor (who conveniently is always busy doing other things and will come to you in a minute or three) should be sufficient at discouraging pranksters and fraudsters, while not annoying enough to deter normal customers.

(I say model, but for this problem I'd consider a pipeline where the powerful model is just parsing orders and formulating replies, while being sanity-checked by a cheaper model and some old-school logic to detect excessive amounts or unusual combinations. I'd also consider using an "open source" model in place of GPT-4o, as open models allow doing "alignment" shenanigans in the latent space, instead of just in the prompts.)
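
A minimal sketch of just the "old-school logic" part, to make the idea concrete. Everything here is an assumption for illustration: the thresholds, the `OrderLine` shape, and the `sanity_check` and `route` names are made up, and the LLM parsing step is left out entirely.

```python
from dataclasses import dataclass

# Hypothetical limits; real values would come from the store's own order data.
MAX_QTY_PER_ITEM = 10
MAX_ORDER_TOTAL = 100.00


@dataclass
class OrderLine:
    item: str
    quantity: int
    unit_price: float


def sanity_check(order: list[OrderLine]) -> list[str]:
    """Return reasons to escalate; an empty list means the order looks normal."""
    problems = []
    for line in order:
        if line.quantity <= 0:
            problems.append(f"non-positive quantity for {line.item}")
        elif line.quantity > MAX_QTY_PER_ITEM:
            problems.append(f"excessive quantity for {line.item}: {line.quantity}")
    total = sum(line.quantity * line.unit_price for line in order)
    if total > MAX_ORDER_TOTAL:
        problems.append(f"order total {total:.2f} exceeds limit")
    return problems


def route(order: list[OrderLine]) -> str:
    """Accept normal orders; send anything suspicious to a human supervisor."""
    return "escalate" if sanity_check(order) else "accept"
```

Something like `route([OrderLine("McNuggets", 50, 5.00)])` would be sent to a human instead of being rung up, whatever the model thought it heard.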


I've never used a McDonalds kiosk for the reason you gave. Actually, I think no matter how much you streamlined it with cutting edge AI assistants it would still be faster and more natural to just say "A big mac and a diet coke please" to the cashier. I don't see any end-user benefit to these assistants, the only ones who benefit are the bean counters and executives who will use them to do more layoffs and keep the money that saves to themselves.


With a true GPT ordering experience, you would just say “a Big Mac and a diet coke please” to a speaker just like you would in a drive thru and it would ring you up. It would replace the cashier


This is how it is in Australia at some Macca's with a kiosk, no cashiers at all. You can still request one, but there aren't people just waiting for you to order.


The guy taking orders does other things rather than just taking an order. Wake me up when chatgimp can prepare my fries and bring the bag with ready food to my car.


If the guy taking orders doesn’t have to take orders in addition to his other work anymore, they can hire less staff overall.


> it's incredibly unsanitary

About as unsanitary as opening the door to get to the kiosks. The kiosks get wiped down more than the door.


That will depend on the materials used for the door handles. If the handles are made of metal, then bacteria generally has a harder time surviving. Additionally, I use my sleeve when opening the door.


The point may have flown over your head. The kiosks are cleaned more than most other items you will be touching up until that point. It is not incredibly unsanitary, but it can be aggravating for those who think a lot about germs.


I quite like the kiosk system for ordering McDonald's. You can see the entire available menu, along with all possible options for adding or removing ingredients, sides, sizes, combo deals, etc. You can always see the current state of your order. A chat-based interface wouldn't be a major improvement on this UX imho.


Yes. Chat is absolutely bad, because it is opaque. It perfectly reproduces what used to be called "hunt the verb" in gaming, for the same reason. The simple truth is you're interacting with a piece of software, with features and subroutines. GUIs are great at surfacing features, affordances, changing with context. A chat interface invites you to guess.

LLMs, if used at all, aren't aware enough to even know what the software can do, and many actual chat UIs are worse than that!

My "favourite" design pattern for chat UIs is to invite you type, freely, whatever you like, then immediately enter a wizard "flow" like it's 1991 and entirely discard everything you typed. Pure hostility.


> it's incredibly unsanitary

I never thought about this. Does McD's PR team have anything to say about it? I assume that a bunch of people have challenged them about it on Twitter or TikTok. Would you feel better if there was a kind of automatic/robotic window washer that sanitised the screen after each use?

The key to me about the kiosks is: (1) initially, replace cashier labour costs with new expensive machines, and (2) medium-to-long term, upgrade the software with more and more "upsell" logic. This could be incredibly effective as a sales tactic. (Notwithstanding the possibility, I fully agree with your final sentence!)

Can you imagine if a celebrity, like Kim Kardashian or David Beckham, lent their likeness for a fee to McD's to create an assistant that would talk with you during your order? (Surely, AI/ML can generate video/anime that looks/moves/sounds just like them.) I can foresee it, and it would be the near-perfect economic exploitation of parasocial relationships in a retail setting.


> I never thought about this. Does McD's PR team have anything to say about it? I assume that a bunch of people have challenged them about it on Twitter or TikTok.

They probably ignore them, as they should - the same problem exists everywhere, from ATMs to door keypads to stores to self-checkout to tapping your card on stuff, etc.

> initially, replace cashier labour costs with new expensive machines,

Labor, like energy, is conserved in the system. It might be easier to counter proliferation of those systems if the narration was focused less about companies replacing labor on their side, and more on the fact that this labor gets transferred to the customers, who are now laboring for free for the company, doing the same things that used to be done better and faster by a dedicated employee.


You opened the door to walk into the McDonalds and never thought twice about it.


At least the outside of the door is bathed in ultraviolet from outer space.


I'm honestly pretty aware of it. When I open the door, I try to use my sleeve. If I'm unable to do that (say I'm wearing a short sleeve shirt), then I'll consider washing my hands if I'm eating in.


McDonald’s makes a lot more money with the kiosks. Slowness is an issue but the upselling is major, and putting a lot of images of tasty looking things in front of a hungry person is very effective. Chat could never do this!


I also do not like the chat interface. What I meant by the above comment was actually talking and having natural conversations with the Operator agent while driving a car or just going for a walk, or whenever and wherever something comes to my mind that requires me to go to a browser and fill out forms etc. That would get us closer to using ChatGPT as a universal AI agent to get those things done. (This is what Siri was supposed to be one day when Steve Jobs introduced it on that stage but unfortunately that day never arrived.)


> This is what Siri was supposed to be one day when Steve Jobs introduced it on that stage but unfortunately that day never arrived.

The irony is, the reason neither Siri nor Alexa nor Google Assistant/Now/${whatever they call it these days} nor Cortana achieved this isn't the voice side of the equation. That one sucks too, when you realize that 20 years ago Microsoft Speech API could do better, fully locally, on cheap consumer hardware, but the real problem is the integration approach. Doing interop by agreements between vendors only ever led to commercial entities exposing minimal, trivial functionality of their services, which were activated by voice commands in the form of "{Brand Wake word}, {verb} {Brand 1} to {verb} {Brand 2}" etc.

This is not an ergonomic user interface, it's merely making people constantly read ads themselves. "Okay Google, play some Taylor Swift on Spotify" is literally three brand ads in eight words you just spoke out loud.

No, all the magical voice experience you describe is enabled[0] by having multimodal LLMs that can be sicced on any website and beat it into submission, whether the website vendor likes it or not. Hopefully they won't screw it up (again[1]) trying to commercialize it by offering third parties control over what LLMs can do. If, in this new reality, I have to utter the word "Spotify" to have my phone start playing music, this is going to be a double regression relative to MS Speech API in the mid 2000s.

--

[0] - Actually, it was possible ever since OpenAI added function calling, which was like over a good year ago - if you exposed stuff you care about as functions on your own. As it is, currently the smartphone voice assistant that's closest to Star Trek experience is actually free and easy to set up - it's Home Assistant with its mobile app (for the phone assistant side) and server-side integrations (mostly, but not limited to, IoT hardware).

[1] - Like OpenAI did with "GPTs". They've tried to package a system prompt and function call configuration into a digital product and build a marketplace around it. This delayed their release of the functionality to the official ChatGPT app/website for about half a year, leading to an absurd situation where, for those 6+ months, anyone with API access could use a much better implementation of "GPTs" via third-party frontends like TypingMind.


Why assume that chat will be the interface? Multimodal and dynamically generated seems more likely.


Voice chat with LLMs is a complete interface, and it's one that already works and can be slotted right into the product. You can prototype a voice-chat-based ordering app via no-code tools today, and without much effort going into it.

Dynamically generated interactive UIs are something people are barely beginning to experiment with; we don't know if current models can do them reliably for realistic problems, and how much effort has to go into setting them up for any particular product. At this point, they're an expensive, conceptual solution that doesn't scale.


Are our attention spans so shot that we consider booking a reservation at a restaurant or buying groceries "hugely boring"? And do we value convenience so much that we're willing to sacrifice a huge breadth of options for whatever sponsor du jour OpenAI wants to serve us just to save less than 10 minutes?

And would this company spend billions of dollars for this infinitesimally small increase in convenience? No, of course not; you are not the real customer here. Consider reading between the lines and thinking about what you are sacrificing just for the sake of minor convenience.


I'm reminded of Kurt Vonnegut's famous story about buying postage stamps: https://www.insidehook.com/wellness/kurt-vonnegut-advice

"I stamp the envelope and mail it in a mailbox in front of the post office, and I go home. And I’ve had a hell of a good time. And I tell you, we are here on Earth to fart around, and don’t let anybody tell you any different...How beautiful it is to get up and go do something."


I love this so much. It really encapsulates what I've been feeling about tech and life generally. Society and especially tech seem so efficiency-minded that I feel like a crazy person for going to do my groceries at the store sometimes.


> Are our attention spans so shot that we consider booking a reservation at a restaurant or buying groceries "hugely boring"?

Don't be limited to these examples.

How about airline booking: try different airlines and go to the confirmation screen. Then the user can check if everything is all right and whether they want to finish the booking on the cheapest one.


Google Flights does that for you, and your browser can already fill in 80% of the form fields. I don't remember spending more than 1 minute booking tickets. Deciding where to go takes 50-100x more time; the booking speed is such a non-issue.

What's the goal of technology ? Automate everything so that we don't have to live anymore ? We might as well build matrix pods at that point


Exactly. 90% of the time I spend booking tickets is in reading every single field a few extra times before clicking "yes, please charge my card $N00". I'm not about to outsource that confirmation step to an LLM, and outsourcing anything else isn't going to save any real time.


Airline booking is a solved problem. Google, Expedia, and many others have their hands on flight pricing and can show you comparisons of those in a single query. Takes 2 minutes. What is the value add of AI here? Making the experience feel like a conversation and at a hyper inflated cost and resource usage with the risk of hallucination? No thanks, solved problem.


It's not a solved problem though, the final cost on the website can be different based on what options you select


The potential of x-Models (x=ll, transformer, tts, etc), which are not AI, to perfect the flooding of social media with bullshit to increase the sales of drop-shipped garbage to hundreds of millions of people is so great that there is a near-infinite stream of money available to be spent on useless shit like this.

Talking to an x-Model (still not AI), just like talking to a human, has never been, is not now, and will never be faster than looking at an information-dense table of data.

x-Models (will never be AI) will eat the world though, long after the dream of talking to a computer to reserve a table has died, because they are so good at flooding social media with bullshit to facilitate the sales of drop-shipped garbage to hundreds of millions of people.

That being said, it is highly likely that there is an extremely large group of people who are so braindead that they need a robot to click through TripAdvisor links for them to create a boring, sterile, assembly-line one-day tour of Rome.

Whether or not those people have enough money to be extracted from them to make running such a service profitable remains to be seen.


These are chores and you are vastly underestimating the time saved. The 5-10 min saved per task, they all stack up. Also eventually these would be open source models that you can host yourself so you wouldn't need to worry about giving control to any corporation.


The fact that you are downvoted despite pointing the obvious tells you about the odds of the tech industry adopting a different path. Fleecing the ignoramy is the name of the game.


I am almost 50 and I have never booked a reservation for a restaurant in my entire life.

The Rome trip is even more absurd. Part of the fun of a trip is figuring out what you want to do.

This seems like a product aimed at the delusional, self important, managerial class.


> I am almost 50 and I have never booked a reservation for a restaurant in my entire life.

Ok but that does not mean others share the same opinion. Try doing a walk in for a fancy restaurant on the weekend, see how that goes?


After many years of dealing with chat bots, I think we can all agree that we don't want chat-based interfaces to order our pizza (clicking buttons and scrolling through lists of options is way way faster). I can't think of many other things I'd like to accomplish by chat that I wouldn't want to do through a website or an app. My eyes bleed watching the AI crawl tediously slow to place a pizza order for me.

But… what if I told you that AI could generate a context-specific user interface on the fly to accomplish the specific task at hand? This way we don't have to deal with the random (and often hostile) user interfaces from random websites but still enjoy the convenience of it. I think this will be the future.


The internet optimized away things like concierge services and travel agents by giving us the power to book reservations and plan trips on our own, without dealing with a cumbersome and expensive middleman.

Now with the power of AI we have added back in that middle man to countless more services!


I just booked a restaurant table; it took me maybe 10s on OpenTable. Booking flights is well under a minute now. Grocery shopping is a 15m stop on my daily walk around the block.

If these are your pain points in life, and they're worth spending $500b to solve, you must live in an insane bubble.


Reserving dinner and booking flights is like .01% of my time. Really just negligible, and they are easy enough. Groceries are more time, but I don't really want groceries delivered, I enjoy going to the store and seeing what is available and what looks good.

Maybe it could read HN for me and tell me if there is anything I'd find interesting. But then how would I slack off?


Not until ChatGPT can do these things as reliably as concierge service, and provide full refund for any situation it messes up.

I am not looking forward to a trip booked for wrong dates with the hotel name confused/hallucinated for a different one.


Yeah the failure states are really an issue. Happy path looks magical but there are so many ways that it can go wrong. And you don’t have the fallback of an actual human you’re talking to to clear it up.


At the moment, I'm looking forward to it.

Let the bot deal with the ads, the cookie banners, the upsells, "newsletters" and all of the other web BS we deal with.

The bot clicks through the front door of the website, just like us. No APIs, no keys, no nothing.

"Hey Siri, grab me a bottle of slow release 500mg Vitamin C from either Amazon or Walmart, whichever has the best deal. Kthx"


I would really love for Apple Knowledge Navigator to be real: https://www.youtube.com/watch?v=umJsITGzXd0

and I'm surprised that people don't bring this visualisation up more often.


This is something we'd like to build. It requires owning both hardware and software - you can not build this in the world of platforms with permissionless apps.


David Lynch was a giant in film directing with his unique vision and surreal style and he gave us so many great movies. But more importantly I feel that he inspired so many modern movie directors such as Ari Aster and Yorgos Lanthimos to make movies like that. I put Lynch in the same category of greatness as Kubrick and Tarkovsky. True genius!


Lynch anecdote (you can find it on YouTube): Kubrick invited some movie people to see his favorite film (no qualifiers), 'Eraserhead'.

