Don't misunderstand, building system models from existing system responses as a way of analyzing those systems is a useful methodology, and it makes some otherwise tedious things not so tedious. Much like "high level" languages removed the tedium of writing assembly code. But for the same reason that a compiler won't emit a new, more powerful CPU instruction from its code generator, LLMs don't generate previously unseen system responses.
Is it possible for Copilot, or say Llama or GPT-4o, to suggest a piece of code, actually run a test that it designs in an IDE, look at the results, and try to fix any issues?
Right now you ask an LLM to write code to do basic web scraping of the HN website for the latest URL and the username of the submitter. Sure, it will give you the code and a test script, but you as the user have to run the script and give manual feedback to the LLM.
If the testing step could be automated, the user would just give an input and a desired output (or a prompt) and choose between the results. That would be good.
Kinda like inpainting and outpainting and other painting stuff, but for code.
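For what it's worth, the loop I'm describing is easy to sketch. Rough Python version, with loud assumptions: llm_complete is a stand-in for whatever model API you'd actually call (it is not a real library function), and the test script is something you supply yourself:

    import subprocess
    import tempfile

    def llm_complete(prompt: str) -> str:
        # Stand-in for a real model API call (OpenAI, llama.cpp, etc.).
        raise NotImplementedError("plug your model API in here")

    def generate_until_tests_pass(task, test_script, max_rounds=5):
        prompt = task
        for _ in range(max_rounds):
            code = llm_complete(prompt)
            with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
                f.write(code)
                candidate = f.name
            # Run the user-supplied test script against the generated file.
            result = subprocess.run(["python", test_script, candidate],
                                    capture_output=True, text=True, timeout=60)
            if result.returncode == 0:
                return code  # tests passed; hand the candidate back to the user
            # Otherwise feed the failure output back to the model and retry.
            prompt = (task + "\n\nYour previous attempt failed with:\n"
                      + result.stdout + result.stderr + "\nPlease fix it.")
        return None  # give up; let the human take over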
Genetic Programming was a thing in the 90s, but it was hampered by a combination of the inefficiency of largely random mutations (plus some crossover, which was still largely undirected) with low odds of doing anything helpful, and a lack of computational speed for testing. A GP framework that used LLMs to apply more or less "reasoned" changes within the same structure of generations of "mutations", tested against each other and the previous generations' best, would be interesting.
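A minimal sketch of what that might look like, assuming a hypothetical llm_mutate call (ask the model for a small directed change to a program) and a user-supplied fitness function that scores candidates against test cases:

    import random

    def evolve(seed_programs, llm_mutate, fitness,
               generations=20, population_size=30):
        # llm_mutate: (program_source) -> program_source, the hypothetical LLM step
        # fitness:    (program_source) -> float, higher is better (e.g. tests passed)
        population = list(seed_programs)
        best = max(population, key=fitness)
        for _ in range(generations):
            # "Mutation": ask the LLM for directed variants of current candidates.
            children = [llm_mutate(random.choice(population))
                        for _ in range(population_size)]
            # Selection: keep the fittest, always including the best seen so far.
            population = sorted(children + [best], key=fitness,
                                reverse=True)[:population_size]
            if fitness(population[0]) > fitness(best):
                best = population[0]
        return best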
The key bit here is that there is no known way (as yet) to encode "reasoning".
I was a big fan of genetic programming; I wrote a lot of code and did lots of research. And unlike LLMs, it could end up with code that had never been written before that accomplished some task, but the random walk through a galactic-sized space with atom-sized (or maybe molecule-sized) solution regions made it computationally infeasible.
If one could somehow encode 'reasoning', one could do the equivalent of gradient descent to converge on a working solution, but without that you are unlikely[1] to find anything in a reasonable amount of time.
[1] The chance is non-zero, but it is very, very near zero.
LLMs can definitely end up with code that has never been written before, even before considering that you can both ask for modifications to very constrained parts of the code and sample more broadly than always picking the most probable tokens.
But they also appear to have a far higher probability of producing changes that move towards something that will run.
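To make the sampling point concrete, here is a toy illustration of temperature sampling versus greedy decoding; the logits are made-up numbers standing in for one decoding step:

    import math
    import random

    def sample(logits, temperature=1.0):
        if temperature == 0:
            return max(range(len(logits)), key=lambda i: logits[i])  # greedy argmax
        scaled = [l / temperature for l in logits]
        m = max(scaled)                                # shift for numeric stability
        weights = [math.exp(s - m) for s in scaled]    # unnormalised softmax
        return random.choices(range(len(logits)), weights=weights)[0]

    logits = [2.0, 1.5, 0.3]    # pretend next-token scores
    print(sample(logits, 0))    # always picks token 0
    print(sample(logits, 1.2))  # tokens 1 and 2 now get real probability mass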
Yes, exactly. Tools perform without "knowing" the purpose. Unintelligent yet effective.
So, "perform a selection over the enumerated combinations in the solutions space" works without the process being further sophisticated. It works as much as it can - as a preparation of data until the stage in which intelligence is required.
We have been doing this for a while: simulated annealing, genetic algorithms... Dumb hammers in a way, encoding an action from an intelligent operator and providing an effective aid when under intelligent control.
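Simulated annealing is a good example of how dumb the hammer can be: the search needs nothing but a score for each candidate, no understanding of why one candidate is better than another. A toy sketch:

    import math
    import random

    def anneal(score, start, steps=10_000):
        x, temp = start, 1.0
        for _ in range(steps):
            candidate = x + random.gauss(0, 0.1)  # dumb random perturbation
            delta = score(candidate) - score(x)
            # Accept improvements always; accept regressions with shrinking odds.
            if delta > 0 or random.random() < math.exp(delta / temp):
                x = candidate
            temp = max(1e-3, temp * 0.999)        # cool down over time
        return x

    # Toy objective: a bumpy function with its global maximum near x = 0.
    best = anneal(lambda x: -x * x + 0.5 * math.cos(5 * x), start=3.0)
    print(best)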
I guess the key aspect of human invention is a stochastic element in the combination of existing patterns, i.e. seeing (or imagining) connections that are not obvious.