AlexCoventry's comments

If it had been about taking out dictators, they were kind of spoiled for choice in that regard. They could have picked an easier one, or at least one which made strategic sense in some way.

https://chatgpt.com/share/695a2613-97e8-800e-b2e4-28fc7707f2...


I like the idea, but this [1]:

    # Check for POSITIVE patterns (new in v3)
    elif echo "$PROMPT" | grep -qiE "perfect!|exactly right|that's exactly|that's what I wanted|great approach|keep doing this|love it|excellent|nailed it"; then
is fanciful.

[1] https://github.com/BayramAnnakov/claude-reflect/blob/main/sc...


Sorry, I will fix that.

Created an issue to track it; will fix tomorrow: https://github.com/BayramAnnakov/claude-reflect/issues/2

No, there's no training going on here, as far as I can tell. E.g., they use GPT-5 as their base model. Also, AFAICT from a quick skim/search, there's no mention of loss functions or derivatives, FWIW.

The derivative being a grad(ient) student sampling scaffolds against evals plus qualitative observations: that's most prompt-based LLM papers.

I think most of the progress comes from training by reinforcement learning on automated assessments of the code produced, so data is not really an issue.

Explosive ignition of a fire.

The PRC has nothing remotely corresponding to the Fourth Amendment, as far as I know.


This is probably a bit different. An LLM outputs one token at a time ("autoregressively") by sampling from a per-position token probability distribution that depends on all of the prior context. While the post doesn't describe OpenRouter's approach, most structured LLM output works by putting a mask over that distribution, so that any token which would break the intended structure gets probability zero and cannot be sampled. For instance, in the broken example from the post,

    {"name": "Alice", "age": 30
a standard LLM would have stopped there because it emitted an end-of-sequence (EOS) token. But since stopping there would leave syntactically invalid JSON, the EOS token gets probability zero, and the model is forced to either extend the number "30", add more entries to the object, or close it with "}".
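
As a concrete toy, here's roughly what that masking looks like at the sampling step. This is my own sketch, not OpenRouter's code (the post doesn't say how they implement it), and the allowed() check below is a hand-rolled stand-in for a real JSON grammar:

    # Toy sketch of constrained decoding (illustration only; the post doesn't
    # say how OpenRouter implements it). A real implementation masks the
    # model's full logits with a JSON grammar; here a hand-rolled allowed()
    # check stands in for that grammar over a handful of candidate tokens.
    import json
    import random

    def allowed(prefix: str, token: str) -> bool:
        if token == "<EOS>":
            # EOS is only legal once the accumulated text parses as complete JSON.
            try:
                json.loads(prefix)
                return True
            except json.JSONDecodeError:
                return False
        return True  # a real grammar would also prune structurally impossible tokens

    def sample(probs: dict[str, float], prefix: str) -> str:
        # Zero out forbidden tokens, renormalize, then sample as usual.
        masked = {t: p for t, p in probs.items() if allowed(prefix, t)}
        r, acc = random.random() * sum(masked.values()), 0.0
        for token, p in masked.items():
            acc += p
            if acc >= r:
                return token
        return token

    # After emitting '{"name": "Alice", "age": 30', the raw model puts most of
    # its probability on stopping; the mask forces it to keep going instead.
    prefix = '{"name": "Alice", "age": 30'
    model_probs = {"<EOS>": 0.90, "}": 0.05, ".5": 0.05}
    print(sample(model_probs, prefix))  # always '}' or '.5', never <EOS>

In a real decoder the same mask is applied to the full logits vector over the whole vocabulary at every step, which is why structurally invalid tokens simply can't be sampled.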

I haven't played much with structured output, but I imagine the biggest risk is that you may force the model to work with contexts outside its training data, leading it to produce garbage, though hopefully syntactically-correct garbage.

I don't understand, though, why the probability of incorrect JSON wouldn't go to zero under this framework (unless you hit the max sequence length before the JSON ended). The post implies that JSON errors still happen, so it's possible they're doing something else.


What do you find interesting about it, and how does it compare to commercial offerings?


It's rare to find a local model that's capable of running tools in a loop well enough to power a coding agent.

I don't think gpt-oss:20b is strong enough, to be honest, but 120b can do an OK job.

Nowhere NEAR as good as the big hosted models though.
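
For reference, the loop itself is not complicated; the hard part is a model that can drive it well. A minimal sketch, assuming a local OpenAI-compatible endpoint (Ollama's default at http://localhost:11434/v1) and a single hypothetical run_shell tool:

    # Minimal "tools in a loop" coding-agent sketch (illustration only, not any
    # particular agent's code), assuming gpt-oss:120b served by Ollama.
    import json
    import subprocess
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")

    TOOLS = [{
        "type": "function",
        "function": {
            "name": "run_shell",  # hypothetical tool; a real agent would sandbox this
            "description": "Run a shell command and return its combined output.",
            "parameters": {
                "type": "object",
                "properties": {"command": {"type": "string"}},
                "required": ["command"],
            },
        },
    }]

    def run_shell(command: str) -> str:
        proc = subprocess.run(command, shell=True, capture_output=True,
                              text=True, timeout=60)
        return (proc.stdout + proc.stderr)[-4000:]  # truncate to keep context small

    messages = [{"role": "user",
                 "content": "Run the test suite and summarize any failures."}]
    for _ in range(20):  # cap the rounds so a weak model can't loop forever
        resp = client.chat.completions.create(model="gpt-oss:120b",
                                              messages=messages, tools=TOOLS)
        msg = resp.choices[0].message
        messages.append(msg)
        if not msg.tool_calls:  # no more tool requests: the model is done
            print(msg.content)
            break
        for call in msg.tool_calls:  # run each requested tool, feed the result back
            result = run_shell(**json.loads(call.function.arguments))
            messages.append({"role": "tool", "tool_call_id": call.id,
                             "content": result})

Everything above is boilerplate; the point of the comment is whether the model can drive that loop sensibly, emitting well-formed tool calls round after round and knowing when to stop.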


Think of it as the early years of UNIX and the PC. Running inference and tools locally and offline opens doors to new industries. We might not even need the client/server paradigm locally; an LLM is just a probabilistic library we can call.


Thanks.


With the massive dependency trees we tolerate these days, the risk of supply-chain attacks has been enormous for years, so I was already in the habit of doing all my development in a VM, except for throwaway scripts with no dependencies. It amazes me that people don't do that.


The fundamental ideas in the paper aren't particularly novel. They will probably work as advertised.

