I love the visual of humans desperately trying to preserve what they consider the natural world, and when they turn their backs evolution does its thing.
It's very true that we only approximate truth with our maps. All abstractions are leaky, but that fact does not imply "catastrophic relativism" (as the grandfather post phrased it). It just implies that we need better, more accurate, maps.
Or, to return to the topic of the post, it just means that our translations need to try a little harder, not that human-quality translation is impossible to do via machine.
I think it's very important to remember that objective truth exists, because some large percentage of society has a political interest in denying that, and we're slipping ever closer to Sagan's "Demon-Haunted World."
It sounds like you are making a distinction between digital (silicon computers) and analog (biological brains).
As far as possible reasons that a computer can’t achieve AGI go, this seems like the best one (assuming computer means digital computer of course).
But in a philosophical sense, a computer obeys the same laws of physics that a brain does, and the transistors are analog devices that are being used to create a digital architecture. So whatever makes your brain have uncountable states would also make a real digital computer have uncountable states. Of course we can claim that only the digital layer on top matters, but why?
> if these tools are close to having super-human intelligence, and they make humans so much more productive, why aren't we seeing improvements at a much faster rate than we are now? Why aren't inherent problems like hallucination already solved, or at least less of an issue? Surely the smartest researchers and engineers money can buy would be dogfooding, no?
Hallucination does seem to be much less of an issue now. I hardly even hear about it - like it just faded away.
As far as I can tell, smart engineers are using AI tools, particularly people doing coding, but even those in non-coding roles.
The criticism feels about three years out of date.
Not at all. The reason it's not talked about as much these days is because the prevailing way to work around it is by using "agents". I.e. by continuously prompting the LLM in a loop until it happens to generate the correct response. This brute force approach is hardly a solution, especially in fields that don't have a quick way of verifying the output. In programming, trying to compile the code can catch many (but definitely not all) issues. In other science and humanities fields this is just not possible, and verifying the output is much more labor intensive.
The other reason is that the primary focus of the last 3 years has been scaling up the data and hardware, with a bunch of (much needed) engineering around it. This has produced better results, but it can't sustain the AGI promises for much longer. The industry can only survive on shiny value-added services and smoke and mirrors for so long.
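To make the "loop until it passes a check" workaround concrete, here's a rough sketch of what I mean, assuming a hypothetical `call_llm` function and using compilation as the only verifier (so it catches only what a compiler can see):

```python
# Rough sketch of the "agent" brute-force loop: re-prompt until some external
# check passes. call_llm is a hypothetical stand-in, not any real API.
import pathlib
import subprocess
import tempfile

def compiles(source: str) -> bool:
    """Syntax-check a candidate C file with gcc; catches some, not all, errors."""
    path = pathlib.Path(tempfile.mkdtemp()) / "candidate.c"
    path.write_text(source)
    result = subprocess.run(["gcc", "-fsyntax-only", str(path)],
                            capture_output=True)
    return result.returncode == 0

def agent_loop(task: str, call_llm, max_attempts: int = 5):
    feedback = ""
    for _ in range(max_attempts):
        candidate = call_llm(task + feedback)
        if compiles(candidate):
            return candidate  # "correct" only in the sense that it parses
        feedback = "\n\nThat didn't compile; fix it."
    return None  # brute force gave up
```

The only thing a loop like this can ever establish is that the output compiles; whether it actually does what was asked, or whether a paper's claims hold up, is exactly the part with no cheap verifier.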
> In other science and humanities fields this is just not possible, and verifying the output is much more labor intensive.
Even just in industry, I think data functions at companies will have a dicey future.
I haven't seen many places where there's scientific peer review - or even software-engineering-level code-review - of findings from data science teams. If the data scientist team says "we should go after this demographic" and it sounds plausible, it usually gets implemented.
So if the ability to validate was already missing even pre-LLM, what hope is there for validation of the LLM-powered replacement? And what hope does the person doing the non-LLM version have of keeping their job (at least until several quarters later, when the strategy either proves itself out or doesn't)?
How many other departments are there where the same lack of rigor already exists? Marketing, sales, HR... yeesh.
> Hallucination does seem to be much less of an issue now. I hardly even hear about it - like it just faded away.
Last week I had Claude and ChatGPT both tell me different non-existent options to migrate a virtual machine from vmware to hyperv.
The week before that, one of them (don't remember which, honestly) gave me non-existent options for fio.
Both of these are things that the first-party documentation or man page has correct, but I was being lazy and trying to save time or be more efficient, like these things are supposed to let us do. Not so much.
The few times I've used Google to search for something (Kagi is amazing!), the Gemini Assistant at the top has fabricated something insanely wrong.
A few days ago, I asked free ChatGPT to tell me the head brewer of a small brewery in Corpus Christi. It told me the brewery didn't exist (it does; we were heading there a few minutes later), and after re-prompting it, it gave me some phone number it found in a business filing. (ChatGPT has been using web search for RAG for some time now.)
The Google AI clippy thing at the top of search has to be one of the most pointless, ill-advised and brand-damaging stunts they could have done. Because compute is expensive at scale (even for them) it's running a small model, so the suggestions are pretty terrible. That leads people who don't understand what's happening to think their AI is just bad in general.
That’s not the case in my experience. Gemini is almost as good as Claude for most of the things I try.
That said, for queries that don't use agentic search or RAG, hallucination is as bad a problem as ever, and it won't improve, because hallucination is all these models do. In Karpathy's phrase, they "dream text". Agentic search, RAG, and similar techniques disguise the issue because they stuff the model's context with real results, so there is less scope for it to go noticeably off the rails. But it's still very visible if you ask for references, links, etc.: many/most/sometimes all of them will be hallucinations, depending on the prompt.
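To be concrete about the context stuffing (here `search` and `call_llm` are stand-ins, not any particular product's API), the technique amounts to something like:

```python
# Sketch of RAG-style context stuffing: paste retrieved snippets into the
# prompt so the model paraphrases real text instead of free-running.
def answer_with_rag(question: str, search, call_llm, k: int = 5) -> str:
    snippets = search(question)[:k]  # real documents from a search index
    context = "\n\n".join(snippets)
    prompt = (
        "Answer using only the sources below and cite which one you used.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
    return call_llm(prompt)  # nothing stops it inventing citations anyway
```

The model is still just dreaming text; the retrieved snippets merely anchor the dream near reality.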
Are you hallucinating?? "AI" is still constantly hallucinating. It still writes pointless code that does nothing towards anything I need it to do, a lot more often than is acceptable.
ChatGPT constantly hallucinates. At least once per conversation I attempt to have with it. We all gave up on bitching about it constantly because we would never talk about anything else, but I have no reason to believe that any LLM has vaguely solved this problem.
How can it not be hallucinating anymore if everything the current crop of generative AI algorithms does IS hallucination? What actually happens is that sometimes the hallucinated output is "right", or more precisely, coherent with the user's mental model.
> Hallucination does seem to be much less of an issue now. I hardly even hear about it - like it just faded away.
Nonsense, there is a TON of discussion around how the standard workflow is "have Cursor-or-whatever check the linter and try to run the tests and keep iterating until it gets it right" that is nothing but "work around hallucinations." Functions that don't exist. Lines that don't do what the code would've required them to do. Etc. And yet I still hit cases weekly-at-least, when trying to use these "agents" to do more complex things, where it talks itself into a circle and can't figure it out.
What are you trying to get these things to do, and how are you validating that there are no hallucinations? You hardly ever "hear about it" but ... do you see it? How deeply are you checking for it?
(It's also just old news - a new hallucination is less newsworthy now, we are all so used to it.)
Of course, the internet is full of people claiming that they are using the same tools I am but with multiple factors higher output. Yet I wonder... if this is the case, where is the acceleration in improvement in quality in any of the open source software I use daily? Or where are the new 10x-AI-agent-produced replacements? (Or the closed-source products, for that matter - but there it's harder to track the actual code.) Or is everyone who's doing less-technical, less-intricate work just getting themselves hyped into a tizzy about getting faster generation of basic boilerplate for languages they hadn't personally mastered before?
I just tried asking ChatGPT on how to "force PhotoSync to not upload images to a B2 bucket that are already uploaded previously", and all it could do is hallucinate options that don't exist and webpages that are irrelevant. This is with the latest model and all the reasoning and researching applied, and across multiple messages in multiple chats. So no, hallucination is still a huge problem.
"hallucinations" CAN'T fade away. They are the one and only thing LLMs can do. If you remove that your output would be absolutely nothing. It is intrinsic to how they work. Anyone claiming they can eliminating has a bridge to sell you.
> If they would expect to achieve AGI soon, their behaviour would be completely different. Why bother developing chatbots or doing sales, when you will be operating AGI in a few short years?
What if chatbots and user interactions ARE the path to AGI? Two reasons they could be:
(1) Reinforcement learning in AI has proven to be very powerful. Humans get to GI through learning too - they aren’t born with much intelligence. Interactions between AI and humans may be the fastest way to get to AGI.
(2) The classic Silicon Valley startup model is to push to customers as soon as possible (MVP). You don’t develop the perfect solution in isolation, and then deploy it once it is polished. You get users to try it and give feedback as soon as you have something they can try.
I don’t have any special insight into AI or AGI, but I don’t think OpenAI selling useful and profitable products is proof that there won’t be AGI.
Of course, you have to interact with the solver through JSON files and provide your own initial mesh. They recently added adaptive refinement, but I don't know how well it works. Given what Ansys charges for an HFSS seat, though, I can't complain!
Relatedly, is there a good open replacement for ADS/Microwave Office? I've seen QUCS/QUCS Studio, but I don't know if there are others out there. Last I looked, the more general SPICE solvers don't support S-param blocks and transmission lines, which are pretty critical.
Depends on what you're doing and how maths-y you are, since you need to tell it the equations you want to solve, but FEniCS is a very good general-purpose FEM code, and you can couple it to Bempp for BEM stuff.
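For a sense of what "tell it the equations" looks like in practice, here's a minimal Poisson solve with legacy FEniCS (the mesh, element order, and boundary condition are just illustrative choices):

```python
# Minimal legacy FEniCS (dolfin) example: -laplace(u) = 1 on the unit square,
# u = 0 on the boundary, linear Lagrange elements.
from fenics import (Constant, DirichletBC, Function, FunctionSpace,
                    TestFunction, TrialFunction, UnitSquareMesh, dot, dx,
                    grad, solve)

mesh = UnitSquareMesh(32, 32)
V = FunctionSpace(mesh, "P", 1)

u, v = TrialFunction(V), TestFunction(V)
bc = DirichletBC(V, Constant(0.0), "on_boundary")

a = dot(grad(u), grad(v)) * dx   # bilinear form from the weak formulation
L = Constant(1.0) * v * dx       # right-hand side

u_h = Function(V)
solve(a == L, u_h, bc)
```

You write the weak form yourself, which is the maths-y part; Bempp hooks in similarly for boundary-element problems.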
Does HFSS visualize the field in real time, or does a user need to set the geometry/parameters, precalculate the field, and only then explore the visualization?
Say, if I wanted to see immediate effects of changing an incidence angle, could I just "scroll" the incidence parameter?
HFSS does a typical FEM matrix solve then displays the results. It is often used for very complex or large problems, so as far as I know it isn’t set up for instant display of results. That would be a neat feature for small problems.
Too many years have passed since they claimed to know how to make iron nitride magnets, and no commercial products have appeared.
The iron nitride magnets require a special crystal structure, which is not the most stable for that material, so it is difficult to find processing methods so that the material will have that crystal structure and even if the structure is produced, it might revert in time to a more stable structure, which does not have the desired magnetic properties.
As is typical for such startups, the company does not say anything about which roadblocks have prevented them from making iron nitride magnets for so many years, but it is pretty certain that their patents are bogus and the methods described there have not been usable for making what they claim to be able to make.