Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I've been wondering this for a while:

In the future, code-writing AI could be tasked with generating the most reliable and/or optimized code to pass your unit tests. Human programmers will decide what we want the software to do, make sure that we find all the edge cases and define as many unit tests as possible, and let the AI write significant portions of the product. Not only that, but you could include benchmarks that pit AI against itself to improve runtime or memory performance. Programmers can spend more time thinking about what they want the final product to do, rather than getting mired in mundane details, and be guaranteed that portions of software will perform extremely well.

Is this a naive fantasy on my part, or actually possible?



> Is this a naive fantasy on my part, or actually possible?

Possible, yes, desirable, no.

The issue I have with all these end-to-end models is that they're a massive regression. Practitioners fought tooth and nails to get programmers to acknowledge correctness and security aspects.

Mathematicians and computer scientists developed theorem solvers to tackle the correctness part. Practitioners proposed methodologies like BDD and "Clean Code" to help with stability and reliability (in terms of actually matching requirements now and in the future).

AI systems throw all this out of the window by just throwing a black box onto the wall and scraping up whatever sticks. Unit tests will never be proof for correctness - they can only show the presence of errors, not their absence.

You'd only shift the burden from implementation (i.e. the program) to the tests. What you actually want is a theorem prover that proofs the functional correctness in conjunction with integration tests that demonstrate the runtime behaviour if need be (i.e. profiling) and references that link implementation to requirements.

The danger lies in the fact that we already have a hard time getting security issues and bugs under control with software that we should be able to understand (i.e. fellow humans wrote and designed it). Imagine trying to locate and fix a bug in software that was synthesised by some elaborate black box that emitted inscrutable code in absence of any documentation and without references to requirements.


It seems to me that writing an exhausting set of unit cases is harder than writing the actual code.


Otherwise the AI will just over-fit the unit test case subset.


First you need really good infra to make it easy to test working multiple solutions for AI but I think this will be bleeding edge in 2030.

EDIT: with in-memory DBs I can imagine AI assisted mainframe than can solve 90% of business problems.


And a second AI to generate additional test cases similar to yours (which you accept as also in scope) to avoid the first AI gaming the test.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: