killthebuddha's comments | Hacker News

I don't have an opinion on the matter, but it's pretty popular. According to [1], Phoenix "was used extensively over the past year" by 2.4% of respondents.

[1] https://survey.stackoverflow.co/2025/technology


I really appreciate his iconoclasm right now, but every time I engage with his ideas I come away feeling shortchanged. I’m always like “there is no such thing as outside the training data”. What’s inside and what’s outside the training data is at least as ill-defined as “what is AGI”.


The answer to (2) is, IMO, "to make computing cheaper". It's interesting to me that this is not the obvious, default answer (it may not be the most actionable answer but IMO it should at least be noted as a way to frame discussions). I think we're at the tail end of computing's artisanal, pre-industrial era where researchers and programmers alike have this latent, tacit view of computing as a kind of arcana.


I feel like the Netflix tech blog has officially jumped the shark.



I know I can code it up myself and then offer it as a model to Roo Code, but an "out of the box" API-served model that can use tools like the Claude 3 family would be nice.


I've always felt like the argument is super flimsy because "of course we can _in theory_ do error correction". I've never seen even a semi-rigorous argument that error correction is _theoretically_ impossible. Do you have a link to somewhere where such an argument is made?


In theory transformers are Turing-complete and LLMs can do anything computable. The more down-to-earth argument is that transformer LLMs aren't able to correct errors in a systematic way like LeCun is describing: it's task-specific "whack-a-mole," involving either tailored synthetic data or expensive RLHF.

In particular, if you train an LLM to do Task A and Task B with acceptable accuracy, that does not guarantee it can combine the tasks in a common-sense way. "For each step of A, do B on the intermediate results" is a whole new Task C that likely needs to be fine-tuned. (This one actually does have some theoretical evidence coming from computational complexity, and it was the first thing I noticed in 2023 when testing chain-of-thought prompting. It's not that the LLM can't do Task C, it just takes extra training.)
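
To make that concrete, here's a toy sketch of the kind of probe I mean (the prompts are made up, and `complete` stands in for whichever completion call you use):

    // Sketch of a tiny probe for the A/B/C composition claim. `complete` is
    // whatever LLM call you already use; the prompts are made up for illustration.
    async function compositionProbe(complete: (prompt: string) => Promise<string>) {
      const a = await complete("List the capitals of France, Japan, and Brazil.");
      const b = await complete('Spell "Paris" backwards.');
      const c = await complete(
        "List the capitals of France, Japan, and Brazil, then spell each one backwards.",
      );
      return { a, b, c }; // c is the composed task that tends to need extra training
    }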


As soon as you need to start leaning heavily on error correction, that is an indication that your architecture and solution are not correct. The final solution will need to be elegant and very close to a perfect solution immediately.

You must always keep close to the only known example we have of an intelligence, which is the human brain. As soon as you start to wander away from the way the human brain does it, you are on your own and no longer relying on known examples of intelligence. Certainly that might be possible, but since there's only one known example of intelligence in this universe, it seems ridiculous to do anything but stick close to that example, which is the human brain.


> of course we can _in theory_ do error correction

Oh yeah? This is begging the question.


One thing he said that I think was a profound understatement is that "more reasoning is more unpredictable". I think we should be thinking about reasoning as, in some sense, exactly the same thing as unpredictability. Or, more specifically, useful reasoning is by definition unpredictable. This framing is important when it comes to, e.g., alignment.


Wouldn't it be the reverse? The word unreasonable is often used as a synonym for volatile, unpredictable, even dangerous. That's because "reason" is viewed as highly predictable. Two people who rationally reason from the same set of known facts would be expected to arrive at similar conclusions.

I think what Ilya is trying to get at here is more like: someone very smart can seem "unpredictable" to someone who is not smart, because the latter can't easily reason at the same speed or quality as the former. It's not that reason itself is unpredictable, it's that if you can reason quickly enough you might reach conclusions nobody saw coming in advance, even if they make sense.


Your second paragraph is basically what I'm saying but with the extension that we only actually care about reasoning when we're in these kinds of asymmetric situations. But the asymmetry isn't about the other reasoner, it's about the problem. By definition we only have to reason through something if we can't predict (don't know) the answer.

I think it's important for us to all understand that if we build a machine to do valuable reasoning, we cannot know a priori what it will tell us or what it will do.


They only arrive at the same conclusion if they both have the same goal.

One could be about maximising wealth while respecting other human beings; the other could be about maximising wealth without respecting other human beings.

Both could be presented with the same facts and be 100% logical, yet arrive at different conclusions.


I think what many of the replies to you here are missing is that the word he uses is "unpredictable". It is not "surprising", "unverifiable", or "unreasonable".

"Prediction" associated in this particular talk is about "intuition": what human can do in 0.1 second. And a most powerful reasoning model by its definition will arrive at "unintuitive" answer because if it is intuitive, it will arrive at the same answer much sooner without long chain of "reasoning". (I also want to make distinction "reasoning" here is not the same as "proof" in mathematical sense. In mathematics, an intuitive conclusion can require extrodinary proof.)


To me the chess AI example he used was perhaps not the most apt. Human players may not be able to reason over as long a horizon as the AI and may therefore find some of its moves perplexing, but they can be more or less sure that a chess AI is optimizing for the same goal under the same set of rules as they are. With Reasoners, alignment is not a given. They may be reasoning under an entirely different set of rules and cost functions. On more open-ended questions, when Reasoners produce something that humans don't understand, we can't easily say whether it's a stroke of genius or misaligned thought.


Not necessarily true when you think about e.g. finding vs. verifying a solution (in terms of time complexity).
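
A toy illustration of the asymmetry, using subset-sum (checking a proposed certificate is linear, while finding one naively is exponential):

    // Verifying a proposed subset-sum certificate is linear in its size...
    const verify = (subset: number[], target: number) =>
      subset.reduce((sum, x) => sum + x, 0) === target;

    // ...while finding one by brute force means checking up to 2^n subsets.
    function find(nums: number[], target: number): number[] | null {
      for (let mask = 0; mask < 1 << nums.length; mask++) {
        const subset = nums.filter((_, i) => (mask & (1 << i)) !== 0);
        if (verify(subset, target)) return subset;
      }
      return null;
    }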


IMO verifying a solution is a great example of how reasoning is unpredictable. To say "I need to verify this solution" is to say "I do not know whether the solution is correct or not" or "I cannot predict whether the solution is correct or not without reasoning about it first".


But you will know beforehand some, or even many, of the properties that the solution will satisfy, which is a type of certainty.


It's not clear any of that follows at all.

Just look at inductive reasoning. Each step builds from a previous step using established facts and basic heuristics to reach a conclusion.

Such a mechanistic process allows for a great deal of "predictability" at each step, or for estimating the likelihood that a solution is overall correct.

In fact I'd go further and posit that perfect reasoning is 100% deterministic and systematic, and instead it's creativity that is unpredictable.


Perfect reasoning, with certain assumptions, is perfectly deterministic, but that does not at all imply that it's predictable. In fact we have extremely strong evidence to the contrary (e.g. we have the halting problem).
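
A toy illustration of what I mean by deterministic-but-not-predictable (Collatz-style; whether the loop even terminates for every input is famously open):

    // Collatz-style iteration: fully deterministic at every step, but in
    // general there's no known way to predict the step count (or guarantee
    // termination for every n) short of just running it.
    function collatzSteps(n: bigint): number {
      let steps = 0;
      while (n !== 1n) {
        n = n % 2n === 0n ? n / 2n : 3n * n + 1n;
        steps++;
      }
      return steps;
    }

    console.log(collatzSteps(27n)); // 111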


This sounds confused. Why do you think the halting problem is relevant to predictability? Undecidable problems != Unpredictable problems


Are you sure that's what he was referring to? In other words, couldn't he have meant that getting more reasoning out of models is an unpredictable process, rather than that reasoning itself is unpredictable?


Reasoning by analogy is more predictable because it is by definition more derivative of existing ideas. Reasoning from first principles though can create whole new intellectual worlds by replacing the underpinnings of ideas such that they grow in completely new directions.


I see a good number of comments that seem skeptical or confused about what's going on here or what the value is.

One thing that some people may not realize is that right now there's a MASSIVE amount of effort duplication around developing something that could maybe end up looking like MCP. Everyone building an LLM agent (or pseudo-agent, or whatever) right now is writing a bunch of boilerplate for mapping between message formats, tool specification formats, prompt templating, etc.
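
To make the duplication concrete, here's roughly the kind of mapping code everyone keeps rewriting (a sketch; `ToolSpec` and the helper names are made up, though the two target shapes follow the OpenAI and Anthropic tool formats):

    // A hand-rolled, generic tool spec (names made up for illustration).
    interface ToolSpec {
      name: string;
      description: string;
      schema: Record<string, unknown>; // JSON Schema for the arguments
    }

    // OpenAI-style tool definition.
    const toOpenAITool = (t: ToolSpec) => ({
      type: "function",
      function: { name: t.name, description: t.description, parameters: t.schema },
    });

    // Anthropic-style tool definition.
    const toAnthropicTool = (t: ToolSpec) => ({
      name: t.name,
      description: t.description,
      input_schema: t.schema,
    });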

Now, having said that, I do feel a little bit like there are a few mistakes being made by Anthropic here. The big one to me is that they seem to have set the scope too big. For example, why are they shipping standalone clients and servers rather than client/server libraries for all the existing and wildly popular ways to fetch and serve HTTP? When I've seen similar mistakes made (e.g. by LangChain), I assume they're targeting brand-new developers who don't realize that they just want to make some HTTP calls.

Another thing that I think adds to the confusion is that, while the boilerplate-ish stuff I mentioned above is annoying, what's REALLY annoying and actually hard is generating a series of contexts using variations of similar prompts in response to errors/anomalies/features detected in generated text. IMO this is how I define "prompt engineering" and it's the actual hard problem we have to solve. By naming the protocol the Model Context Protocol, I assumed they were solving prompt engineering problems (maybe by standardizing common prompting techniques like ReAct, CoT, etc).
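
The hard part looks less like transport plumbing and more like a loop along these lines (a rough sketch; `complete` and `detectProblems` stand in for whatever model call and checks you actually run):

    // Hypothetical sketch: regenerate with a repaired prompt when checks fail.
    async function generateWithRepair(
      basePrompt: string,
      complete: (prompt: string) => Promise<string>,
      detectProblems: (text: string) => string[],
      maxAttempts = 3,
    ): Promise<string> {
      let prompt = basePrompt;
      for (let attempt = 0; attempt < maxAttempts; attempt++) {
        const text = await complete(prompt);
        const problems = detectProblems(text);
        if (problems.length === 0) return text;
        // Fold the detected problems back into a variant of the original prompt.
        prompt =
          basePrompt +
          "\n\nYour previous answer had these problems:\n- " +
          problems.join("\n- ") +
          "\nPlease try again and avoid them.";
      }
      throw new Error("Gave up after repeated problems in the generated text");
    }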


Your point about boilerplate is key, and it’s why I think MCP could work well despite some of the concerns raised. Right now, so many of us are writing redundant integrations or reinventing the same abstractions for tool usage and context management. Even if the first iteration of MCP feels broad or clunky, standardizing this layer could massively reduce friction over time.

Regarding the standalone servers, I suspect they’re aiming for usability over elegance in the short term. It’s a classic trade-off: get the protocol in people’s hands to build momentum, then refine the developer experience later.


I don't see why I or any other developer would abandon their homebrew agent implementation for a "standard" which isn't actually a standard yet.

I also don't see any of that implementation as "boilerplate". Yes, there's a lot of similar code being written right now, but that's healthy co-evolution. If you have a look at the codebases for LangChain and other LLM toolkits, you will realize that it's a smarter bet to just roll your own for now.

You've definitely identified the main hurdle facing LLM integration right now and it most definitely isn't a lack of standards. The issue is that the quality of raw LLM responses falls apart in pretty embarrassing ways. It's understood by now that better prompts cannot solve these problems. You need other error-checking systems as part of your pipeline.

The AI companies are interested in solving these problems but they're unable to. Probably because their business model works best if their system is just marginally better than their competitor.


Data security is the reason, I'd imagine, that they're letting others host servers.


The issue isn’t with who’s hosting, it’s that their SDKs don’t clearly integrate with existing HTTP servers regardless of who’s hosting them. I mean integrate at the source level; of course they could integrate via an HTTP call.
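
What I'd have expected, roughly, is something you can mount into an existing server. A hypothetical sketch (the `createMcpRequestHandler` factory is made up, not the actual SDK API):

    import express from "express";

    // Hypothetical stand-in for what an SDK could expose at the source level:
    // a plain request handler you mount into whatever HTTP server you already run.
    const createMcpRequestHandler =
      (opts: { tools: Array<{ name: string; description: string }> }): express.RequestHandler =>
      (_req, res) => {
        res.json({ tools: opts.tools }); // placeholder behavior
      };

    const app = express();
    app.use(express.json());

    // The protocol handler sits alongside whatever routes already exist.
    app.post("/mcp", createMcpRequestHandler({ tools: [] }));
    app.get("/healthz", (_req, res) => res.send("ok"));

    app.listen(3000);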


Location: San Diego, CA, USA

Remote: preferred but not necessary

Willing to relocate: no

Technologies: TypeScript, React, Next.js, Node.js, Postgres, Docker, AWS, GitHub CI, Python, Elixir, Golang, Java

Résumé/CV: https://www.ktb.pub/dev/resume.pdf

Email: achilles@ktb.pub

I'm a full-stack developer with wide-ranging technical experience and strong general problem solving skills. Most recently I co-founded a startup, worked on it for a few years, and then took some time off to recharge, be with my family, and work on hobby projects. I'm most interested in, and in my opinion best suited for, the kind of fast-paced small-team environment you typically find in early-stage startups.


This looks like a hard maybe, thank you!


I believe mediamtx can ingest the RTMP and present it to a browser via HLS or WebRTC.

https://github.com/bluenviron/mediamtx


You’re welcome. Sounds like what you want is RTMP or SRT. There are free solutions, for those, as well.

