Has the cost of building software dropped 90%?

tangotaylor · 2025-12-09T12:42:50 1765284170

> Engineers need to really lean in to the change in my opinion.

I tried leaning in. I really tried. I'm not a web developer or game developer (more robotics, embedded systems). I tried vibe coding web apps and games. They were pretty boring. I got frustrated that I couldn't change little things. I remember getting frustrated that my game character kept getting stuck on imaginary walls and kept asking Cursor to fix it and it just made more and more of a mess. I remember making a simple front-end + backend with a database app to analyze thousands of pull request comments and it got massively slow and I didn't know why. Cursor wasn't very helpful in fixing it. I felt dumber after the whole process.

The next time I made a web app I just taught myself Flask and some basic JS and I found myself moving way more quickly. Not in the initial development, but later on when I had to tweak things.

The AI helped me a ton with looking things up: documentation, error messages, etc. It's essentially a supercharged Google search and Stack Overflow replacement, but I did not find it useful letting it take the wheel.

r_lee · 2025-12-09T13:08:52 1765285732

These posts like the one OP made is why I'm losing my mind.

Like, is there truly an agentic way to go 10x or is there some catch? At this point while I'm not thrilled about the idea of just "vibe coding" all the time, I'm fine with facing reality.

But I keep having the same experience as you, or rather leaning more on that supercharged Google/SO replacement

or just a "can you quickly make this boring func here that does xyz" "also add this" or for bash scripts etc.

And that's only when I've done most of the plumbing myself.

KallDrexx · 2025-12-09T15:14:21 1765293261

EVERY DX survey that comes out (surveying over 20k developers) says the exact same thing.

Staff engineers get the most time savings out of AI tools, and their weekly time savings is 4.4 hours for heavy AI users. That's a little more than 10% productivity, so not anywhere close to 10x.

What's more telling about the survey results is they are also consistent in their findings between heavy and light users of AI. Staff engineers who are heavy users of AI save 4.4 hours a week while staff engineers who are light users of AI save 3.3 hours a week. To put another way, the DX survey is pretty clear that the time savings between heavy and light AI users is minimal.

Yes surveys are all flawed in different ways but an N of 20k is nothing to sneeze at. Any study with data points shows that code generation is not a significant time savings and zero studies show significant time savings. All the productivity gains DX reports come from debugging and investigation/code base spelunking help.

valbaca · 2025-12-09T16:48:42 1765298922

> That's a little more than 10% productivity, so not anywhere close to 10x.

No Silver Bullet still alive and well in 2026

jonfw · 2025-12-09T17:11:27 1765300287

For those surveys to mean anything (for or against AI), we'd have to have an effective measure of developer productivity

ponector · 2025-12-09T17:36:08 1765301768

In my experience the productivity measured in created merge requests increased massively.

More merge requests because now the same senior developers are creating more bugs, 4x comparing to 2025. Same developers, same codebase but now with Cursor!

Izkata · 2025-12-09T18:44:13 1765305853

Lines of code changed would be a high one as well, though that's already well-known as a bad measure.

y0eswddl · 2025-12-09T16:11:32 1765296692

do you have a link to the latest survey? my google-fu is failing me at the moment

KallDrexx · 2025-12-09T16:45:22 1765298722

Unfortunately I do not.

Past survey results are hidden in some presentations I've seen, and the latest survey I have full access due to my company paying for it. So I'm not sure it's legal for me to reproduce

jatora · 2025-12-09T16:25:36 1765297536

your google-fu isnt failing. there's simply only a couple large studies on this, and of those, zero that have a useful methodology.

maddmann · 2025-12-10T02:52:55 1765335175

I think there is going to be 2-3 year lag in understanding how llms actually impact developer productivity. There are way too many balls in the air, and anyone claiming specific numbers on productivity increase is likely very very wrong.

For example citing staff engineers as an example will have a bias: they have years of traditional training and are obviously not representative of software engineers in general.

KallDrexx · 2025-12-09T17:11:31 1765300291

The latest survey is here apparently: https://getdx.com/report/ai-assisted-engineering-q4-impact-r....

No idea if signing up gives the full survey for free. We get it through our company paying for DX's services

hattmall · 2025-12-10T05:42:20 1765345340

>Like, is there truly an agentic way to go 10x

Yes, absolutely.

>or is there some catch?

Yes, absolutely.

The catch is that to go 10x you have to either do a lot of work of the variety that AI excels at, mainly boilerplate and logical but tedious modifications. There's a lot of code I can write, but I will probably need to check the syntax and implementations for 10 or more functions / methods, but I know what they are and how I want the code to flow. AI never really nails it, but it gets close enough that I can fix it with considerable time savings. The major requirement here is that I, for the most part, already knew almost exactly what I wanted to do. This is the really fancy auto-complete that is actually a pretty reasonable assistant.

The other way is that you have to start from a position of 0.1x (or less) and go to !~1x.

There are a tremendous amount of people employed in tech roles, but outside of actual tech companies that have very very low throughput.

I've recently worked in a very large non-tech firm but one that is part of a major duopoly and is for the most part a household name worldwide. They employ 1000s of software developers whose primary function is to have a vague idea of who they should email about any question or change. The ratio of emails to lines of code is probably 25:1.

The idea that you could simply ask an AI to modify code, and it might do it correctly, in only a day is completely mind blowing to people whose primary development experience is from within one of these organizations.

adriand · 2025-12-09T13:34:27 1765287267

> Like, is there truly an agentic way to go 10x or is there some catch?

Yes. I think it’s practice. I know this sounds ridiculous, but I feel like I have reached a kind of mind meld state with my AI tooling, specifically Claude Code. I am not really consciously aware of having learned anything related to these processes, but I have been all in on this since ChatGPT, and I honestly think my brain has been rewired in a way that I don’t truly perceive except in terms of the rate of software production.

There was a period of several months a while ago where I felt exhausted all the time. I was getting a lot done, but there was something about the experience that was incredibly draining. Now I am past that and I have gone to this new plateau of ridiculous productivity, and a kind of addictive joy in the work. A marvellous pleasure at the orchestration of complex tasks and seeing the results play out. It’s pure magic.

Yes, I know this sounds ridiculous and over-the-top. But I haven’t had this much fun writing software since my 20s.

mrwrong · 2025-12-09T13:43:40 1765287820

> Yes, I know this sounds ridiculous and over-the-top.

in that case you should come with more data. tell us how you measured your productivity improvement. all you've said here is that it makes you feel good

adriand · 2025-12-09T14:08:41 1765289321

Work that would have taken me 1-2 weeks to complete, I can now get done in 2-3 hours. That's not an exaggeration. I have another friend who is as all-in on this as me and he works in a company (I work for myself, as a solo contractor for clients), and he told me that he moved on to Q1 2026 projects because he'd completed all the work slated for 2025, weeks ahead of schedule. Meanwhile his colleagues are still wading through scrum meetings.

I realize that this all sounds kind of religious: you don't know what you're missing until you actually accept Jesus's love, or something along those lines. But you do have to kinda just go all-in to have this experience. I don't know what else to say about it.

no_wizard · 2025-12-09T15:57:30 1765295850

If your work maps exceedingly well to the technology it is true, it goes much faster. Doubly so when you have enough experience and understanding of things to find its errors or suboptimal approaches and adjust it that much faster.

The second you get to a place where the mapping isn’t there though, it goes off rails quickly.

Not everyone programs in such a way that they may ever experience this but I have, as a Staff engineer at a large firm, run into this again and again.

It’s great for greenfield projects that follow CRUD patterns though.

wilsonnb3 · 2025-12-10T03:21:18 1765336878

Assuming 40 hours a week of work time, you’re claiming a ~25x speed up, which is laughably absurd to me.

It will take you 2.5 months to accomplish what would have taken you five years, that is the kind of productivity increase you’re describing.

It doesn’t pass the smell test. I’m not sure that going from assembly to python would even have such a ludicrous productivity enhancement.

mrwrong · 2025-12-09T15:20:08 1765293608

this is just not a very interesting way to talk about technology. I'm glad it feels like a religious experience to you, I don't care about that. I care about reality

coderatlarge · 2025-12-09T15:58:47 1765295927

it seems to me if these things were real and repeatable there would be published traces that show the exact interactions that led to a specific output and the cost in time and money to get there.

do such things exist?

beepbooptheory · 2025-12-09T14:17:10 1765289830

My sympathies go out to the friend's coworkers. They are probably wading through a bunch of stuff right now, but given the context you have given us, its probably not "scrum meetings"..

I don't even care about the llm, I just want the confidence you have to assess that any given thing will take N weeks. You say 1-2 weeks.. thats like a big range! Something that "would" take 1 week takes ~2 hours, something that "would" take 2 weeks also takes ~2 hours. How does that even make sense? I wonder how long something that would of taken three weeks would take?

Do you still charge your clients the same?

adriand · 2025-12-09T14:35:39 1765290939

> They are probably wading through a bunch of stuff right now, but given the context you have given us, its probably not "scrum meetings"..

This made me laugh. Fair enough. ;)

In terms of the time estimations: if your point is that I don't have hard data to back up my assertions, you're absolutely correct. I was always terrible at estimating how long something would take. I'm still terrible at it. But I agree with the OP. I think the labour required is down 90%.

It does feel to me that we're getting into religious believer territory. There are those who have firsthand experience and are all-in (the believers), there are those who have firsthand experience and don't get it (the faithless), and there are those who haven't tried it (the atheists). It's hard to communicate across those divides, and each group's view of the others is essentially, "I don't understand you".

bonesss · 2025-12-09T16:27:34 1765297654

Religions are about faith, faith is belief in the absence of evidence. Engineering output is tangible and measurable, objectively verifiable and readily quantifiable (both locally and in terms of profits). Full evidence, testable assertions, no faith required.

Here we have claims of objective results, but also admissions we’re not even tracking estimations and are terrible at making them when we do. People are notoriously bad at estimating actual time spent versus output, particularly when dealing with unwanted work. We’re missing the fundamental criteria of assessment, and there are known biases unaccounted for.

Output in LOC has never been the issue, copy and paste handles that just fine. TCO and holistic velocity after a few years is a separate matter. Masterful orchestration of agents could include estimation and tracking tasks with minimal overhead. That’s not what we’re seeing though…

Someone who has even a 20% better method for deck construction is gonna show me some timetables, some billed projects, and a very fancy new car. If accepting Mothra as my lord and saviour is a prerequisite to pierce an otherwise impenetrable veil of ontological obfuscation in order to see the unseeable? That deck might not be as cheap as it sounds, one way or the other.

I’m getting a nice learning and productivity bump from LLMs, there are incredible capabilities available. But premature optimization is still premature, and claims of silver bullets are yet to be demonstrated.

adriand · 2025-12-09T17:17:04 1765300624

Here's an example from this morning. At 10:00 am, a colleague created a ticket with an idea for the music plugin I'm working on: wouldn't it be cool if we could use nod detection (head tracking) to trigger recording? That way, musicians who use our app wouldn't need a foot switch (as a musician, you often have your hands occupied).

Yes, that would be cool. An hour later, I shipped a release build with that feature fully functional, including permissions plus a calibration UI that shows if your face is detected and lets you adjust sensitivity, and visually displays when a nod is detected. Most of that work got done while I was in the shower. That is the second feature in this app that got built today.

This morning I also created and deployed a bug fix release for analytics on one platform, and a brand-new report (fairly easy to put together because it followed the pattern of other reports) for a different platform.

I also worked out, argued with random people on HN and walked to work. Not bad for five hours! Do I know how long it would have taken to, for example, integrate face detection and tracking into a C++ audio plugin without assistance from AI? Especially given that I have never done that before? No, I do not. I am bad at estimating. Would it have been longer than 30 minutes? I mean...probably?

jakubmazanec · 2025-12-09T21:27:34 1765315654

> An hour later, I shipped a release build

I would love to see that pull request, and how readable and maintainable the code is. And do you understand the code yourself, since you've never done this before?

beepbooptheory · 2025-12-09T18:04:51 1765303491

Just having a 'count-in' type feature for recording would be much much more useful. Head nodding is something I do all the time anyway as a musician :).

I don't know what your user makeup is like, but shipping a CV feature same day sounds so potentially disastrous.. There are so many things I would think you would at least want to test, or even just consider with the kind of user emapthy we all should practice.

fauigerzigerk · 2025-12-09T17:41:29 1765302089

I think you have to make a distinction between indvidual experience and claims about general truths.

If I know someone as an honest and serious professional, and they tell me that some tool has made them 5x or 10x more productive, then I'm willing to believe that the tool really did make a big difference for them and their specific work. I would be far more sceptical if they told me that a tool has made them 10% more productive.

I might have some questions about how much technical debt was accumulated in the process and how much learning did not happen that might be needed down the road. How much of that productivity gain was borrowed from the future?

But I wouldn't dismiss the immediate claims out of hand. I think this experience is relevant as a starting point for the science that's needed to make more general claims.

Also, let's not forget that almost none of the choices we make as software engineers are based on solid empirical science. I have looked at quite a few studies about productivity and defect rates in software engineering projects. The methodology is almost always dodgy and the conclusions seem anything but robust to me.

WesleyJohnson · 2025-12-09T15:25:19 1765293919

Not to pick apart your analogy, but asserting that atheists haven't tried religion is misinformed.

timeon · 2025-12-09T16:08:48 1765296528

Brain-rot can be associated with heavy LLM usage.

jakebasile · 2025-12-09T16:20:10 1765297210

So, you say that AI has made you "ridiculously faster", but then admit you've always been terrible at estimating how long something would take?

newsoftheday · 2025-12-09T17:13:49 1765300429

> It does feel to me that we're getting into religious believer territory. There are those who have firsthand experience and are all-in (the believers), there are those who have firsthand experience and don't get it (the faithless), and there are those who haven't tried it (the atheists). It's hard to communicate across those divides, and each group's view of the others is essentially, "I don't understand you".

What a total crock. Your prose reminds of of the ridiculously funny Mike Meyers in "The Love Guru".

beepbooptheory · 2025-12-09T15:51:43 1765295503

But then does this not give you pause, that it "feels religious"? Is there not some morsel of critical/rational interrogation on this? Aren't you worried about becoming perhaps too fundamentalist in your belief?

To extend the analogy: why charge clients for your labor anymore, which Claude can supposedly do in a fraction of the time? Why not just ask if they have heard the good word, so to speak?

stocksinsmocks · 2025-12-09T15:44:12 1765295052

Nobody had a robust, empirical metric of programmer productivity. Nobody. Ticket count, function points, LoC, and others tell you nothing about the fitness of the product. It’s all feels.

mrwrong · 2025-12-09T15:55:27 1765295727

ok, but there's a spectrum between fully reproducible empirical evidence and divine revelation. I'm not convinced it's impossible to measure productivity in a meaningful way, even if it isn't perfect. it at least seems better to try than... whatever this is

klank4 · 2025-12-09T14:38:43 1765291123

What's worked best with Gemini such I made a DSL that transpiles to C with CUDA support to train small models in about 3 hours... (all programs must run against an image data set, must only generate embeddings)

Do not; vibe code from top down (ex. Make me a UI with React, with these buttons and these behaviors to each button)

Do not; chat casually with it. (ex. I think it would look better if the button was green)

Do; constrain phrasing to the next data transform goal (ex. You must add a function to change all words that start with lowercase to start with uppercase)

Do; vibe code bottom up (ex. You must generate a file with a function to open a plaintext file and appropriate tests; now you must add a function to count all words that begin with "f")

Do; stick to must/should/may (ex. You must extend the code with this next function)

Do; constrain it to mathematical abstractions (ex. sys prompt: You must not use loops, you must only use recursion and functional paradigms. You must not make up abstractions and stick to mathematical objects and known algorithms)

Do; constrain it to one file per type and function. This makes it quick to review, regenerate only what needs to change.

Using those patterns, Gemini 2.5 and 3 have cranked out banging code with little wandering off in the weeds and hallucinating.

Programming has been mired in made up semantics of the individual coder for the luls, to create mystique and obfuscate the truth to ensure job security; end of the day it's matrix math and state sync between memory and display.

pdimitar · 2025-12-09T18:57:48 1765306668

Awesome comment, thank you. No idea why it was flagged as dead. Vouched for it to not be.

nasmorn · 2025-12-09T18:47:11 1765306031

Just as an aside I also think I am way more productive now but a really convincing datapoint would be someone who does project work and now has 5x the hourly rate they had last year. If there are not plenty of people like this, it cannot be 10x

thfuran · 2025-12-09T19:57:58 1765310278

That's not a very convincing argument. Even if you can do 10x the work, that doesn't necessarily mean you can easily find customers ready to pay 5x the hourly rate.

newsoftheday · 2025-12-09T17:08:21 1765300101

> Yes, I know this sounds ridiculous and over-the-top. But I haven’t had this much fun writing software since my 20s.

But...you're not writing it. The culmination of many sites, many people, Stack Overflow, etc. all wrote it through the filtering mechanism being called AI.

You didn't write a damn thing.

dent9 · 2025-12-10T04:37:24 1765341444

Lol that's like saying that because you found the solution on stack overflow you didn't write the program

News flash buddy: YOU never wrote any code yourself either. Literally every single line of code you've ever committed to a repo was first written by someone else and you just copied it and modified it a little.

r_lee · 2025-12-09T13:41:37 1765287697

That's really interesting.

May I ask what kinds of projects, stack and any kind of markdown magic you use?

And any specific workflow? And are there times when you have to step in manually?

adriand · 2025-12-09T13:53:11 1765288391

Currently three main projects. Two are Rails back-ends and React front-ends, so they are all Ruby, Typescript, Tailwind, etc. The third is more recent, it's an audio plugin built using the JUCE framework, it is all C++. This is the one that has been blowing my mind the most because I am an expert web developer, but the last time I wrote a line of C++ was 20 years ago, and I have zero DSP or math skills. What blows my mind is that it works great, it's thread safe and performant.

In terms of workflow, I have a bunch of custom commands for tasks that I do frequently (e.g. "perform code review"), but I'm very much in the loop all the time. The whole "agent can code for hours at a time" thing is not something I personally believe. It depends on the task how involved I get, however. Sometimes I'm happy to just let it do work and then review afterwards. Other times, I will watch it code and interrupt it if I am unhappy with the direction. So yes, I am constantly stepping in manually. This is what I meant about "mind meld". The agent is not doing the work, I am not doing the work, WE are doing the work.

efields · 2025-12-09T14:16:46 1765289806

I maintain a few rails apps and Claude Code has written 95% of the code for the last 4 months. I deploy regularly.

I make my own PRs then have Copilot review them. Sometimes it finds criticisms, and I copy and paste that chunk of critique into Claude Code, and it fixes it.

Treat the LLMs like junior devs that can lookup answers supernaturally fast. You still need to be mindful of their work. Doubtful even. Test, test, test.

lomase · 2025-12-09T16:08:05 1765296485

Can we see any of this software created by this amazing LLMs?

timeon · 2025-12-09T16:11:29 1765296689

Why do you need to use Tailwind if the code is generated? Can't there be something more efficient?

Deegy · 2025-12-09T16:32:04 1765297924

Extensive tailwind training data in the models. Sure there's something more efficient but it's just safer to let the model leverage what it was trained on.

camdenreslink · 2025-12-09T18:18:06 1765304286

Surely there is an order of magnitude more training data on plain CSS than tailwind, right?

frikk · 2025-12-09T20:55:10 1765313710

In my experience the LLMs work better with frameworks that have more rigid guidance. Something like Tailwind has a body of examples that work together, language to reason about the behavior needed, higher levels of abstraction (potentially), etc. This seems to be helpful.

The LLMs can certainly use raw CSS and it works well, the challenge is when you need consistent framing across many pages with mounting special cases, and the LLMs may make extrapolate small inconsistencies further. If you stick within a rigid framework, the inconsistencies should be less across a larger project (in theory, at least).

itgoon · 2025-12-09T13:57:10 1765288630

Research -> Plan -> Implement

Start by having the agent ask you questions until it has enough information to create a plan.

Use the agent to create the plan.

Follow the plan.

When I started, I had to look at the code pretty frequently. Rather than fix it myself, I spent time thinking about what I could change in my prompts or workflow.

godelski · 2025-12-10T06:02:59 1765346579

  > for bash scripts etc

Everyone keeps telling me that it's good for bash scripts but I've never had real success.

Here's an example from today. I wanted to write a small script to grab my Google scholar citations and I'm terrible with web so I ask the best way to parse the curl output. First off, it suggests I use a python package (seriously? For one line of code? No thanks!) but then it gets the wrong grep. So I pull up the page source, copy paste some to it, and try to parse it myself. I already have a better grep command and for the second time it's telling me to use pearl regex (why does it love -P as much as it loves delve?). Then I'm pasting in my new command showing it my output asking for the awk and sed parts while googling the awk I always forget. It messes up the sed parts while googling, so I fix it, which means editing the awk part slightly but I already had the SO post open that I needed anyways. So I saved maybe one minutes total?

Then I give it a skeleton of a script file adding the variables I wanted and fully expected it to be a simple cleanup. No. It's definitely below average, I mean I've never seen an LLM produce bash functions without being explicitly told (not that the same isn't also true for the average person). But hey, it saved me the while loop for the args so that was nice. So it cost as much time as it gave back.

Don't get me wrong, I find LLMs useful but they're nowhere near game changing like everyone says they are. I'm maybe 10% more productive? But I'm not convinced that's even true. And sure, I might have been able to do less handholding with agents and having it build test cases but for a script that took 15 minutes to write? Feels like serious overkill. And this is my average experience with them.

Is everyone just saying it's so good at bash because no one is taking the time to learn bash? It's a really simple language that every Linux user should know the basics of...

bonzini · 2025-12-09T13:43:53 1765287833

I did find some benefit in lowering the cost of exploratory work, but that's it—certainly worth 20€/month, but not the price of any of the "ultimate" plans.

For example today I had to write a simple state machine (for a parser that I was rewriting so I had all the testcases already). I asked Claude Code to write the state machine for me and stopped it before it tried compiling and testing.

Some of the code (of course including all the boilerplate) worked, some made no sense. It saved a few minutes and overall the code it produced was a decent first approximation, but waiting for it to "reason" through the fixes would have made no sense, at least to me. The time savings mostly came from avoiding the initial "type the boilerplate and make it compile" part.

When completing the refactoring there were a few other steps like where using AI was useful. But overall the LLM did maybe 10% of the work and saved optimistically 20-30 minutes over a morning.

Assuming I have similar savings once a week, which is again very optimistic... That's a 2% reduction or less.

adam_patarino · 2025-12-09T13:18:48 1765286328

I keep using this excel analogy. When excel came out everyone said accountants were dead. Now we have more accountants than ever.

The analogy carries to what you’re saying here. Accountants or folks who know excel deeply can get a lot more from it than folks who are novice to it.

AI coding can be really helpful for an engineer. Keep at it!

jjice · 2025-12-09T15:25:24 1765293924

> or just a "can you quickly make this boring func here that does xyz" "also add this" or for bash scripts etc.

I still write most of the interesting code myself, but when it comes to boring, tedious work (that's usually fairly repetitive, but can't be well abstracted any more), that's when I've found gen AI to be a huge win.

It's not 10x, because a lot of the time, I'm still writing code normally. For very specific, boring things (that also are usually my least favorite parts of code to write), it's fantastic and it really is a 10x. If you amortize that 10x over all the time, it's more like a 1.5x to 3x in my experience, but it saves my sanity.

Things like implementing very boring CRUD endpoints that have enough custom logic that I can't use a good abstraction and writing the associated tests.

I would dread doing work like that because it was just so mind numbing. Now, I've written a bunch of Cursor rules (that was actually pretty fun) so I can now drop in a Linear ticket description and have it get somewhere around 95% done all at once.

Now, if I'm writing something that is interesting, I probably want to work on it myself purely because it's fun, but also because the LLM may suck at it (although they're getting pretty damn good).

vbezhenar · 2025-12-09T15:26:05 1765293965

I tried claude code to write very simple app for me. Basically Golang mock server which will dump request to console. I'd write this kind of app in an hour. I spent around 1.5 hours with claude code and in the end I had code which I liked, almost the same code I'd write myself. It's not vibe coding, I carefully instructed it to write code in a way I prefer, one small step after another.

So for me, it's pretty obvious that with better training, I'd be able to achieve speed ups with the same result in the end. Not 10x, but 2x is possible. The very first attempt to use AI ended up with almost the same time I'd write the same code, and I have a lot to improve.

That said, I have huge problem with this approach. It's not fun to work like that. I started to program 25 years ago, because it was fun for me. It still fun for me today. I love writing all these loops and ifs. I can accept minimal automation like static autocomplete, but that's about it.

digitalsushi · 2025-12-09T15:24:07 1765293847

does anyone remember that episode of star trek tng where the kid is given a little laser engraver that carves a dolphin from a block of wood? and the kid is like "i didn't make this" and the teacher (who abducted him, ew) is like "yeah but it's what you wanted to make, the tool just guided you"

so in 2026 we're going to get in trouble doing code "the old way", the pleasurable way, the way an artist connects with the work. we're not to chefs any longer, we're a plumber now that pours food from a faucet.

we're annoyed because our output can suddenly be measured by the time unit. the jig is up. our secret clubhouse has a lightbulb the landlord controls.

some of us were already doing good work, saving money, making the right decisions. we'll be fine.

some of us don't know how to do those things - or won't do those things - and our options are funneled down. we're trashing at this, like dogs being led to the pound.

there's before, there's during, and there's after; the during is a thing we so seldom experience, and we're in it, and 2024 felt like nothing, 2025 feels like the struggle, and 2026 will be the reconciliation.

change sucks. but it's how we continue. we continue differently or we dont exist.

RScholar · 2025-12-09T22:18:20 1765318700

I sure do. I believe it's the first season episode "When the Bough Breaks," (S01E16). That show tackled so many heavy topics right out of the gate... I respect the hell of of the courage to try, even if it produced some pretty epic whiffs along with the home runs and standing doubles.

frikk · 2025-12-09T20:57:16 1765313836

I like your perspective. What do you think "2026: the reconciliation looks like?

Also I need to track down that Star Trek TNG episode... it sounds poignant.

Lich · 2025-12-09T15:55:15 1765295715

Feeling the same. I’m guessing the folks getting good results are literally writing extremely detailed pseudocode by hand?! Like:

Write a class Person who has members (int) age, (string) first name, (string) last name…

But if you can write that detailed…don’t you know the code you want to write and how you should write it? Writing plain pseudo code feels more verbose.

skapadia · 2025-12-09T17:35:54 1765301754

But the AI coding agent can then ask you follow up questions, consider angles you may not have, and generate other artifacts like documentation, data generation and migration scripts, tests, CRUD APIs, all in context. If you can reliably do all that from plain pseudo code, that's way less verbose than having to write out every different representation of the same underlying concept, by hand.

Sure, some of that, like CRUD APIs, you can generate via templates as well. Heck, you can even have the coding agent generate the templates and the code that will process/compile them, or generate the code that generates the templates given a set of parameters.

GoatInGrey · 2025-12-09T17:12:01 1765300321

It's been my experience that reaching for an LLM is a significant context switch that breaks flow state. Comparable to a monkey entering your office and banging cymbals together for a minute, returning to programming after writing up instructions for an LLM requires a refocusing process to reestablish the immersion you just forfeited. This can be a worthwhile trade with particularly tedious or annoying tasks, but not always.

I suspect that this explains the current bifurcation of LLM usage. Where individuals either use LLMs for everything or use them minimally. With the in-between space shrinking by the day.

JeremyNT · 2025-12-09T14:48:30 1765291710

> Like, is there truly an agentic way to go 10x or is there some catch? At this point while I'm not thrilled about the idea of just "vibe coding" all the time, I'm fine with facing reality.

Below is based on my experience using (currently) mostly GPT-5 with open source code assistants.

For a new project with straightforward functionality? I think you (and "you" being "basically anybody who can code at all") can probably manage to go 10x the pace of a junior engineer of yesteryear.

Things get a lot trickier when you have complex business logic to express and backwards compatibility to maintain in an existing codebase. Writing out these kinds of requirements in natural language is its own skillset (which can be developed), and this process takes time in and of itself.

The more confusing the requirements, the more error prone the process becomes though. The model can do things "correctly" but oops maybe you forgot something in your description, and now the whole thing will be wrong. And the fact that you didn't write the code means that you missed out on your opportunity to fix / think about stuff in the first pass of implementation (i.e. you need to seriously review stuff, which also slow you down).

Sometimes iterating over English instructions will take longer than just writing/expressing things in code from the start. But sometimes it will be a lot faster too.

Basically the easy stuff is way easier but the more complex stuff is still going to require a lot of hand holding and a lot of manual review.

lionkor · 2025-12-09T13:41:31 1765287691

I have a feeling that people who are genuinely impressed by long term vibe coding on a single project are only impressed because they don't know any better.

Take writing a book, or blog post; writing a good blog post, or a chapter of a book, takes lots of skill and practice. The results are very satisfying and usually add value to both the writer's life as well as the reader's. When someone who has done that uses AI and sees the slop it generates, he's not impressed, probably even frustrated.

However, someone who can barely write a couple coherent sentences, would be baffled at how well AIs can put together sentences, paragraphs, and have a somewhat coherent train of thought through the entire text. People who struggled in school with writing an introduction and a conclusion will be amazed at AIs writing. They would maybe even assume that "those paragraphs actually add no meaning and are purely fluff" is a totally normal part of writing and not an AI artifact.

stocksinsmocks · 2025-12-09T16:02:39 1765296159

I’m impressed by getting the output of at least a mediocre developer at less than 1% of the cost. Brute force is an underrated strategy. I’ve been having a great experience.

That developers in the Hacker News comment bin report experiences that align with their personal financial interests doesn’t really dissuade me.

itgoon · 2025-12-09T13:52:38 1765288358

Like another reply says: practice.

How many hours have you spent writing code? Thousands? Tens of thousands? Were you able to achieve good results in the first hundred hours?

Now, compare it to how much time you've spent working with agents. Did you dedicate considerable time to figuring out how to use them? Do you stop using the agent and do things manually when it isn't going right, or do you spend time figuring out how to get the agent to do it?

crowbahr · 2025-12-09T14:00:25 1765288825

You can't really compare those 2. Agents a re non-deterministic. I can tell Clod to go update my unit test coverage and it will choke itself, burn 200k tokens and then loudly proclaim "Great! I've updated unit test coverage".

I'll kill that terminal, open it again and run the exact same command. 30k tokens, actually adds new tests.

It's hard to "learn" when the feedback cycle can take 30 minutes and result in the agent sitting in the corner touching itself and crooning about what a good boy it is. It's hard to _want_ to learn when you can't trust the damn thing with the same prompt twice.

funkydata · 2025-12-09T14:02:38 1765288958

That comment made my day. I actually had an intern like that!

ilidur · 2025-12-09T13:56:07 1765288567

And then all the heuristics you've learnt change under you and you're stuck doing 100-1000 more hours of learning with a drop in quality during that time.

dent9 · 2025-12-10T04:41:46 1765341706

Agentic AI is a pretty huge scam. Every organization worth it's salt has so many IAM protections in place that an AI developer is useless because you can't give it SSO and OIDC credentials to access the company resources. And no IT team will ever let it. So all these folks trying to convince us that AI will ever deploy anything useful are lying their butts off; IT won't even let their own developers deploy anything why would an AI be treated any differently?

coffeefirst · 2025-12-09T15:02:58 1765292578

That's my finding as well. The smaller the chunk, the better, and it saves me 5m here and an hour there. These really add up.

This is cool. It's extra cool on annoying things like "fix my types" or "find the syntax error" or "give me the flags for ffmpeg to do exactly this."

If I ever meet someone who drank the koolaid and wants to show me their process, I'm happy to see it. But I've tried enough to believe my own eyes, and when I see open source contributors I respect demo their methods, they spend enough time and energy either waiting on the machine and trying to keep it on the rails that, yes this is harder, but it does not appear to be faster.

estimator7292 · 2025-12-09T20:02:56 1765310576

It seems to very heavily depend on your exact project and how well it's represented in the training set.

For instance, AI is great at react native bullshit that I can't be bothered with. It absolutely cannot handle embedded development. Particularly if you're not using Arduino framework on an Atmel 328. I'm presently doing bare metal AVR on a new chip and none of the AI agents have a single clue what they're doing. Even when fed with the datasheet and an entire codebase of manually written code for this thing, AI just produces hot wet garbage.

If you're on the 1% happy path AI is great. If you diverge even slightly from the top 10 most common languages and frameworks it's basically useless.

The weird thing is if you go in reverse it works great. I can feed bits of AVR assembly in and the AI can parse it perfectly. Not sure how that works, I suspect it's a fundamentally different type of transformation that these models are really good at

csomar · 2025-12-09T15:26:50 1765294010

I have been building a game (preview here: https://qpingpong.codeinput.com) as a practice to "vibe-coding". There is only one rule: I am not allowed to write a single line of code. But can prompt as much as I want.

So far I am hitting a "hard-block" on getting the AI to make changes once you have a large code base. One "unblocker" was to restructure all the elements as their own components. This makes it easier for the LLM (and you?) to reason about each component (React) in isolation.

Still, even as this "small/simple game" stage, it is not only hard for the LLM to get any change done but very easy for it to break things. The only way I can see my around it, is to structure very through tests (including E2E tests) so that any change by the LLM has to be thoroughly tested for regression.

I've been working on this for a month or so. I could have coded it faster by hand except for the design part.

allochthon · 2025-12-09T22:58:02 1765321082

I have a hobby project on the side involving radio digital signal processing in Rust that I've been pure vibe coding, just out of curiosity to see how far I can get. On more than one occasion the hobby project has gotten bogged down in a bug that is immensely challenging to resolve. And since the project isn't in an area I have experience with, and since I don't have a solid "theory of the program", since it's a gray box because I've been vibe coding it, I've definitely seen CC get stuck and introduce regressions in tricky issues we previously worked through.

The use of Claude Code with my day job has been quite different. In my day job, I understand the code and review it carefully, and CC has been a big help.

groby_b · 2025-12-09T14:36:09 1765290969

You can go faster once you understand the domain reasonably well that you could have written it yourself. This allows you to write better designs, and steer LLMs in the right direction.

"Vibe coding" though is moving an ever growing pile of nonunderstanding and complexity in front of you, until you get stuck. (But it does work until you've amassed a big enough pile, so it's good for smaller tasks - and then suddenly extremely frustrating once you reach that threshold)

Can you go 10x? Depends. I haven't tried any really large project yet, but I can compress fairly large things that would've taken a week or two pre-LLM into a single lazy Sunday.

For larger projects, it's definitely useful for some tasks. ("Ingest the last 10k commits, tell me which ones are most likely to have broken this particular feature") - the trick is finding tasks where the win from the right answer is large, and the loss from the wrong one is small. It's more like running algorithmic trading on a decent edge than it is like coding :)

It definitely struggles to do successfully do fully agentic work on very large code bases. But... I've also not tried too much in that space yet, so take that with a grain of salt.

rbalicki · 2025-12-09T17:19:00 1765300740

A git bisect where you can't easily check whether the feature is broken (because the testing step isn't easily described as code) would be killer

jonfw · 2025-12-09T16:31:28 1765297888

If you have not started working on a new codebase while adopting AI, it may be harder to realize the gains.

I switched jobs somewhat recently. At my previous job, where I was on the codebase for years, I knew where the changes should be and what they should look like. So I tried to jump directly to implementation with the AI because I didn't need much help planning and the AI got confused and did an awful job.

In a new codebase, where I had no idea how things are structured, I started the process by using AI to understand where the relevant code is, the call hierarchies and side effects, etc.

I have found by using the AI to conduct the initial investigation, it was then very easy to get the AI to generate an effective spec, and then it was relatively easy to get the AI to generate the code to that spec. That flow works much better than trying to one shot implementation

hackernewds · 2025-12-09T16:44:57 1765298697

You are discounting his experience with your own, while your example is not remotely relevant to his flask game example.

jonfw · 2025-12-09T21:30:28 1765315828

It sounded like he was trying to one shot things when he mentioned he would ask it to fix problems with no luck. It's an approach I've tried before with similar results, so I was sharing an alternative that worked for me. Apologies if it came across as dismissive

dominotw · 2025-12-09T16:49:11 1765298951

gp clearly said they use ai to lookup/explain stuff. they could do the same with local codebase.

not sure what you are trying to say.

HDThoreaun · 2025-12-09T18:52:51 1765306371

GP said they were doing vibe coding and trying to get the ai to do one shots. Thats the worst way to use these tools. AI coding agents work best when you generally know what you want the output to look like but dont want to waste time writing that output

alemanek · 2025-12-09T15:34:24 1765294464

I don’t vibe code yet but it has sped me up a lot when working with large frameworks that have a lot of magic behind the scenes (Spring Boot). I am doing a very large refactor, major version spring boot upgrade, at the moment.

When given focused questions for parts of the code it it will give me 2-4 different approaches extending/implementing different bean overrides. I go through a cycle of back and forth having it give me sample implementations. I often ask what is considered the more modern or desirable approach. Things like give me a pros and cons list of the different approaches. The one I like the best I then go look up the specific docs to fact check a bit.

For this type of work it easily is a 2-3x. Spring specifically is really tough to search for due to its long history and large changes between major versions. More times than not it lands me on the most modern approach for my Spring Boot version and while the code it produces is not bad it isn’t great either. So, I rewrite it.

Also it does a pretty good job of writing integration tests. I have it give me the boilerplate for the test and then I can modify it for all my different scenarios. Then I run those against the unmodified and refactored code as validation suite that the refactor didn’t introduce issues.

When I am working in GoLang I don’t get this level of speed up but I also don’t need to look up as much. The number of ways to do things is far lower and there is no real magic behind the scenes. This might be one reason experiences may differ so radically.

sheepscreek · 2025-12-09T14:04:01 1765289041

The thing is, using an agent or AI to code for you is a learned skill. It doesn’t come naturally to most people. For you to be successful at it, you’ve got to adopt a mentor / lead mindset - directing vs doing. In other words, you have to be an expert at explaining yourself - communicating clearly to get great results.

Someone who hasn’t got any experience coding, or leading in any capacity, anywhere in life (or mentoring) will have a hard time with agentic development.

I’ll elaborate a bit more - the ideal mindset requires fighting that itch to “do it yourself” and sticking to the prompts for any changes. This habit will force you to get better at communicating effectively to others (including agents).

xtracto · 2025-12-09T21:54:54 1765317294

How are you guys using LLMs? I've done a couple of applications for my own use, including a "Mexican Train Dominoes" online multiplayer using LLMs and it doesn't stop amazing me, Gemini 3 is crazy good at finding bugs at work, And every week there are very interesting advances in Arxiv articles.

I'm 45 years old, have been programming since I was 9, and this is the most amazing time to be building stuff.

xnx · 2025-12-09T21:57:15 1765317435

Don't be put off forever by one bad experience with Cursor. Gemini 3 would likely yield much better results now.

Volundr · 2025-12-09T14:28:54 1765290534

FTA

> I've had Claude Code write an entire unit/integration test suite in a few hours (300+ tests) for a fairly complex internal tool. This would take me, or many developers I know and respect, days to write by hand.

I have no problem believing that Claude generated 300 passing tests. I have a very hard time believing those tests were all well thought out, consise, actually testing the desired behavior while communicating to the next person or agent how the system under test is supposed to work. I'd give very good odds at least some of those tests are subtly testing themselves (ex mocking a function, calling said function, then asserting the mock was called). Many of them are probably also testing implementation details that were never intended to be part of the contract.

I'm not anti-AI, I use it regularly, but all of these articles about how crazy productive it is skip over the crazy amount of supervision it needs. Yes, it can spit out code fast, but unless your prepared to spend a significant chunk of that 'saved" time CAREFULLY (more carefully than with a human) reviewing code, you've accepted a big drop in quality.

rkozik1989 · 2025-12-09T14:52:14 1765291934

The benefit of having a team of QA engineers create tests is their differing perspectives, so with LLMs being trained to act like affirmation engines you have to wonder how that impacts the test cases it creates. Its the problem of LLMs being miserable at critiques manifesting itself in a different way.

However, in saying that, I am by no means an AI hater, but rather I just want models to be better than they currently are. I am tired of the tech demos and benchmark stats that don't really mean much aside from impressing someone who's not in a critical thinking mindset.

klysm · 2025-12-09T16:16:22 1765296982

Very similar experience here. I have not once managed to get an LLM to generate good tests, even for very simple code. It generally writes tautologies that will pass with high confidence.

insane_dreamer · 2025-12-10T05:46:55 1765345615

Experience from 2 days ago:

I had CC write a bunch of tests to make sure some refactoring didn't break anything, and then I ran the app and it crashed out of the gate. Why? Because despite the verbosity of the tests it turns out that it had mocked the most import parts to test, so the _actual_ connections weren't being tested, and while CC was happy to claim victory with all tests green, the app was broken.

kllrnohj · 2025-12-09T14:38:25 1765291105

Anecdotes etc etc but the AI tests I've been sent to review have been absolute shit. Stuff like that just calling a function doesn't crash the program. No assertions other than "end of test method reached"

Yes sometimes those tests are necessary, but it seemed to just do it everywhere because it made the code coverage percentage go up. Even though it was useless.

I have also had great experiences with AI cranking out straightforward boilerplate or asking C++ template metaprogramming questions. It's not all negative. But net-net it feels like it takes more work in total to use AI as you have to learn to recognize when it just won't handle the task, which can happen a lot. And you need to keep up with what it did enough to be able to take over. And reading code is harder than writing it.

piperswe · 2025-12-09T16:00:27 1765296027

I’ve seen agents produce plenty of those tests, but recently I’ve seen them generate some actually decent unit tests that I wouldn’t have thought of myself. It’s a bit of a crapshoot

jf22 · 2025-12-09T15:00:09 1765292409

> you've accepted a big drop in quality.

Right, but you do it in a 10th of the time.

WesleyJohnson · 2025-12-09T15:28:24 1765294104

So you're openly saying you're fine with quantity over quality.... in software engineering? That's fine for a MVP, maybe, but nothing beyond on that IMHO unless they're throw away scripts.

"Houston, we have a problem."

"Yeah, but we did it in a 10th of the time"

bluesnowmonkey · 2025-12-09T20:52:19 1765313539

Of course it's fine for any project.

There is exactly one "best" programmer in the world, and at this moment he/she is working on at most one project. Every other project in the world is accepting less than the "best" possible quality. Yes... in software engineering.

As soon as you sat down at the keyboard this morning, your employer accepted a sacrifice in quality for the sake of quantity. So did mine. Because neither one of us is the best. They could have hired someone better but they hired you and they're fine with that. They'd rather have the code you produce today than not have it.

It's the same for an AI. It could produce some code for you, right now, for nearly free. Would you rather have that code or not have it? It depends on the situation, yeah not always but sometimes it's worth having.

TemptedMuse · 2025-12-09T16:09:32 1765296572

Here is the thing, most software engineers are not designing rockets, they are making basic CRUD apps. If there is a minor defect it can be caught and corrected without much issue. Our jobs are a lot less "critical infrastructure" than a lot of software engineers will allow their egos to accept.

Sure if you are making some medical surgery robot do it right, but if you are making a website the recommends wine pairings who cares if one of the buttons has a weird animation bug that doesn't even get noticed for a couple of years.

dlisboa · 2025-12-09T16:19:48 1765297188

I think I'm "most" engineers and I haven't ever worked on something that was "just" a CRUD app. Having a DB behind your web app doesn't make it "just" a CRUD.

It's really overestimated how many simple apps exist.

jf22 · 2025-12-09T16:47:58 1765298878

What kind of apps do you work on?

dlisboa · 2025-12-09T16:53:28 1765299208

Regular SaaS products of different kinds, cloud software, hosting software, etc. Really representative of most of the Web-enabled software out there.

For every one of them there has been an almost negligible amount of CRUD code, the meat of every one of those apps was very specific business logic. Some were also heavy on the frontend with equal amount of complexity on the backend. As a senior/staff level engineer you also have dive into other things like platform enablement, internal tooling, background jobs and data wrangling, distributed architectures, etc. which are even farther from CRUD.

jf22 · 2025-12-09T17:14:46 1765300486

A fancy CRUD app is still a CRUD app.

dlisboa · 2025-12-09T17:35:39 1765301739

Yes, like a guided missile is a fancy firecracker.

TemptedMuse · 2025-12-09T17:51:49 1765302709

Not to call you out but this is exactly what I meant when I said software engineers have egos that will not let them accept that they are not designing critical stuff.

Comparing your cloud based CRUD app to a missile is a perfect illustration. There is no dishonor in admitting that our stuff isn't going to kill anyone if there is a bug. Don't write bad code, but also sometimes just getting something out the door is much better than perfect quality (bird in the hand and all that).

dlisboa · 2025-12-09T18:00:51 1765303251

Not to call you out either but it seems you have really no idea what a basic CRUD app is. Which is fine, I guess not everyone likes to reads the base definitions of these things. It's clear I replied to the wrong person as we don't have a shared understanding of complexity.

instig007 · 2025-12-09T18:49:51 1765306191

> software engineers have egos that will not let them accept that they are not designing critical stuff

> Don't write bad code, but also sometimes just getting something out the door is much better than perfect quality (bird in the hand and all that).

Your bank account can be represented as a CR app, it's two letters short of CRUD, but it doesn't make it simple or simpler in any sense of the words.

Now the question: how much are you tolerant to bugs in your bank account? How often can they happen before you complain?

TemptedMuse · 2025-12-09T19:02:00 1765306920

Banking software is critical, but guess what, most software engineers are not writing banking software. I never said no software engineers write critical code. Heck I'd argue most at some point in their career will write something that needs to be as bug free as possible... at some point in their careers.

My point is that for most software engineering getting a product out is more important that a super high quality bar that slows everything down.

If you are writing banking software or flight control systems please do it with care, if you are making some React based recipe website or something I don't really care (99% of software engineering falls into this latter category in my opinion).

Software engineers need to get over themselves a bit, AI really exposed how many were just getting by making repetitive junk and thinking they were special.

instig007 · 2025-12-09T19:51:14 1765309874

> most software engineers are not writing banking software

Many software engineers write software for people who won't like the idea that their request/case can be ignored/failed/lost, when expressed openly on the front page of your business offering. Are bookings important enough? Are gifts for significant events important? Maybe you're okay with losing my code commits every once in a while, I don't know. And I'm not sure why you think it's okay to spread this bad management idea of "not valuable or critical enough" among engineers who should know better and who should keep sources of bad ideas at bay when it comes to software quality in general.

TemptedMuse · 2025-12-09T17:44:55 1765302295

That is just CRUD with buzzword soup around it.

jf22 · 2025-12-09T16:47:18 1765298838

> quantity over quality

Yes? Making quality concessions for more code or features is part of the job.

bagacrap · 2025-12-09T16:21:30 1765297290

I mean, just say you view unit testing as nothing more than a checkbox.

jf22 · 2025-12-09T16:48:46 1765298926

I don't know why you'd think that.

200 decent unit tests are better than zero unit tests.

welshwelsh · 2025-12-09T19:54:20 1765310060

The main benefit of writing tests is that is forces the developer to think about what they just wrote and what it is supposed to do. I often will find bugs while writing tests.

I've worked on projects with 2,000+ unit tests that are essentially useless, often fail when nothing is wrong, and rarely detect actual bugs. It is absolutely worse than having 0 tests. This is common when developers write tests to satisfy code coverage metrics, instead of in an effort to make sure their code works properly.

jf22 · 2025-12-09T21:11:17 1765314677

Look, you tell the LLMs what kind of tests you want and judge the quality before committing.

If you're letting the LLM create useless test that's on you.

I think you're reading these comments in bad faith as if I'm letting the LLM add slop to satisfy a metric.

No, I'm using an LLM to write good tests that I will personally approve as usefull, and other people will review too, before merging into master.

a_rana · 2025-12-09T18:54:17 1765306457

Hard disagree! Review is still an important part but more and more, I find myself agreeing with the LLM generated changes ~90% of the time.

stocksinsmocks · 2025-12-09T16:17:35 1765297055

Which is the better value:

Hundreds of tests that were written basically for free in a few minutes even though a lot of them are kind of dumb?

Or hundreds of tests that were written for a five figure sum that took weeks or months, and only some of them are kind of dumb?

If you’re just thinking of code as the end in and of itself, then of course, the handcrafted artisanal product is better. If you think of code like an owner, an incidental expense towards solving a problem that has value, then cheap and disposable wins every time. We can throw our hands up about “quality“ and all that, but that baby was thrown out with the bathwater a very, very long time ago. The modern Web is slower than the older web. Desktop applications are just web browsers. Enterprise software barely works. Windows 11 happened. I don’t think anybody even bothers to scrutinize their dependency chains except for, I don’t know, like maybe missile guidance or something. And I just want to say Claude is not responsible for any of this. You humans are.

welshwelsh · 2025-12-09T20:00:23 1765310423

Neither. Tests should be written by developers only when it saves them time. The cost of writing them should be negative.

Instead of writing hundreds of useless tests so that the code coverage report shows high numbers, it is better to write a couple dozen tests based on business needs and code complexity.

stocksinsmocks · 2025-12-09T22:22:38 1765318958

Having used Bentley software products I can tell you with complete certainty that professional software developers have extremely bad judgment when it comes to the need to test software and verify its functionality. Developers just think they know what they’re doing because there’s typically not a strong feedback mechanism that inflicts serious career damage when they do things that are extremely lazy or stupid or unethical. How many people lost their job or had to change their name and live out the rest of their days in Juarez Mexico over AWS’ incomprehensible configuration causing an internet brown out? Anyone? A teenager serves cold onion rings at a burger joint and he’s on the street. Some lazy dweeb at Amazon blows up the internet and - come on, isn’t it about the friends we made along the way? It’s obscene and the lack of professionalism and accountability is a total disgrace.

simonw · 2025-12-08T22:37:16 1765233436

The cost of writing simple code has dropped 90%.

If you can reduce a problem to a point where it can be solved by simple code you can get the rest of the solution very quickly.

Reducing a problem to a point where it can be solved with simple code takes a lot of skill and experience and is generally still quite a time-consuming process.

loandbehold · 2025-12-08T22:47:31 1765234051

Most of software work is maintaining "legacy" code, that is older systems that have been around for a long time and get a lot of use. I find Claude Code in particular is great at grokking old code bases and making changes to it. I work on one of those old code bases and my productivity increased 10x mostly due to Claude Code's ability to research large code bases, make sense of it, answer questions and making careful surgical changes to it. It also helps with testing and debugging which is huge productivity boost. It's not about its ability to churn out lots of code quickly: it's an extra set of eyes/brain that works much faster that human developer.

Jean-Papoulos · 2025-12-09T12:22:13 1765282933

I have the opposite experience. Claude can't get it all in the context window and make changes that will completely break something on the other side of the program.

Granted that's because the program is incredibly poorly written, but still, context window will stay a huge barrier for quite some time.

jollyllama · 2025-12-09T14:12:39 1765289559

Between yours and GP's comments, I find echoes of my experience:

> Most of software work is maintaining "legacy" code, that is older systems that have been around for a long time and get a lot of use.

> Granted that's because the program is incredibly poorly written

LLMs can't fix big, shitty legacy codebases. That is where most maintenance work (in terms of hours) is, and where it will remain.

I would take it one step further and argue that LLMs and vibe-coding will compound into more big, shitty legacy codebases over time, and therefore, in the long arc, nothing will really change.

Espressosaurus · 2025-12-09T16:01:13 1765296073

Yeah. I've got some EE coworkers that are vibe coding their way through everything and nothing in the codebase is understandable.

We're going to have to go through another quality hangover I suspect.

But since people that have never coded are now coding and think it's the best thing ever the only way out is through.

jollyllama · 2025-12-09T16:27:44 1765297664

It has ever been thus. There are multi-million dollar businesses propped up by .NET applications on a foundation of shunted-around files, and at best, SQL used as APIs/queues. "Working" code is, in the long run, a liability outside the hands of those doing real engineering.

newsoftheday · 2025-12-09T17:27:57 1765301277

I want to voice the same bad experience, tried Claude and several more actually. I could get AI to understand some things but it quickly went of the rails trying to comprehend larger complexities and its suggested changes would have been between worse to detrimental had I allowed them to be committed.

zinodaur · 2025-12-09T13:20:59 1765286459

I don’t put all the source in context either - if you let Claude run the compiler (or even tests) it can avoid these issues

loandbehold · 2025-12-09T19:58:56 1765310336

If you like I can try to help you getting Claude Code to handle your incredibly poorly written program. loandbehold0 at proton.me

bonzini · 2025-12-09T13:45:24 1765287924

Yep. It can refactor very well but that's it. For complex code bases it cannot even build boilerplate that makes sense; at most it saves some typing.

unregistereddev · 2025-12-09T15:15:23 1765293323

> It can refactor very well but that's it.

Can it though? I thought it was most useful for writing new code, but have so far never had it correctly refactor existing code. Its refactoring attempts usually change behavior / logic, and sometimes even leave the code in a state where it's even harder to read.

zmmmmm · 2025-12-09T01:22:46 1765243366

I've found this as well. In some cases we aren't fully authorised to use the AI tools for actual coding but even just asking "how would you make this change" or "where would you look to resolve this bug" or "give me an overview of how this process works" is amazingly helpful.

eru · 2025-12-09T03:02:36 1765249356

> In some cases we aren't fully authorised to use the AI tools for actual coding but even just asking "how would you make this change" [...]

Isn't the logical endpoint of this equivalent to printing out a Stackoverflow answer and manually typing it into your computer instead of copy-and-pasting?

Nitpicks aside, I agree that contemporary AIs can be great for quickly getting up to speed with a code base. Both a new library or language you want to be using, and your own organisation's legacy code.

One of the biggest advantages of using established ecosystem was that stack-overflow had a robust repository of already answered questions (and you could also buy books on it). With AI you can immediately cook up your own Stackoverflow community equivalent that provides answers promptly instead of closing your question as off-topic.

And I pick Stackoverflow deliberately: it's a great resources, but not reliable enough to use blindly. I feel we are in a similar situation with AI at the moment. This will change gradually as the models become better. Just like Stackoverflow required less expertise to use than attending a university course. (And a university course requires less expertise than coming up with QuickSort in the first place.)

ChrisMarshallNY · 2025-12-09T04:22:49 1765254169

> Isn't the logical endpoint of this equivalent to printing out a Stackoverflow answer and manually typing it into your computer instead of copy-and-pasting?

Not in my case (I never used SO like that, anyway). I use it almost exactly like SO, except much more quickly and interactively (and without the inference that I’m “lazy,” or “stupid,” for not already knowing the answer).

I have found that ChatGPT gives me better code than Claude (I write Swift); even learning my coding and documentation style.

I still need to review all the code it gives me, and I have yet to use it verbatim, but it’s getting close.

The most valuable thing, is I get an error, and I can ask it “Here’s the symptoms and the code. What do you think is going on?”. It usually gives me a good starting point.

I could definitely figure it out on my own, but it might take half an hour. ChatGPT will give me a solid lead in about half a minute.

jordanbeiber · 2025-12-09T06:03:30 1765260210

The problem is most likely not writing the actual code, but rather understanding an old, fairly large codebase and how it’s stitched together.

SO is (was?) great when you where thinking about how nice a recursive reduce function could replace the mess you’ve just cobbled together, but language x just didn’t yet flow naturally for you.

bryanrasmussen · 2025-12-09T10:18:53 1765275533

>Isn't the logical endpoint of this equivalent to printing out a Stackoverflow answer and manually typing it into your computer instead of copy-and-pasting?

when AI works well it is superior to Stack Overflow because what it replaces is not "Look up answer on SO, copy paste" but rather, look up several different things on SO that relate to your problem that you are trying to solve but which there is no exact definite solution posted anywhere, and copy those things together into a bit of code that you will probably just refactor a bit with a shorter time than doing all the SO look up yourself. When it works it can turn 2 hours of research into 2 minutes.

The problems are:

AI also sometimes replicates the following process - dev not understanding all parts of solution or requirements copies bits of code together from various answers making something that sort of works but is inefficient and has underlying problems.

Even with the working correctly solution your developer does not get in that 2 minutes what they used to get in the two hours before, an understanding of the problem space and how these parts of the solution hang together. This is the reason why it is more useful for seniors than juniors, because part of the looking through SO for what you want is light education.

coldtea · 2025-12-09T09:57:27 1765274247

>Isn't the logical endpoint of this equivalent to printing out a Stackoverflow answer and manually typing it into your computer instead of copy-and-pasting?

Isn't the answer on SO the result of a human intelligence writing it in the first place, and then voted by several human intelligencies to top place? If an LLM was merely an automated "equivalent" to that, that's already a good thing!

But in general, the LLM answer you appear to dismiss amounts to a lot more:

  Having an close-to-good-human-level programmer 
  understand your existing codebase
  answer questions about your existing codebase 
  answer questions about changes you want to make
  on demand (not confined to copying SO answers)
  interactively 
  and even being able to go in and make the changes

That amounts to "manually typing an SO answer" about as much as a pickup truck amounts to a horse carriage.

Or, to put it another way, isn't "the logical endpoint" of hiring another programmer and asking them to fix X "equivalent to printing out a Stackoverflow answer and manually typing it into their computer"?

>And I pick Stackoverflow deliberately: it's a great resources, but not reliable enough to use blindly. I feel we are in a similar situation with AI at the moment.

Well, we shouldn't be using either blindly anyway. Not even the input of another human programmer (that's way we do PR reviews).

dns_snek · 2025-12-09T13:48:32 1765288112

> Isn't the answer on SO the result of a human intelligence writing it in the first place, and then voted by several human intelligencies to top place? If an LLM was merely an automated "equivalent" to that, that's already a good thing!

The word "merely" is doing all of the heavy lifting here. Having human intelligence in the loop providing and evaluating answers is what made it valuable. Without that intelligence you just have a machine that mimics the process yet produces garbage.

colechristensen · 2025-12-09T05:56:06 1765259766

>not reliable enough to use blindly

I've been building things with Claude while looking at say less than 5% of the code it produces. What I've built are tools I want to use myself and... well they work. So somebody can say that I can't do it, but on the other hand I've wanted to build several kinds of ducks and what I've built look like ducks and quack like ducks so...

I've found it's a lot better at evaluating code than producing it so what you do is tell it to write some code, then tell it to give you the top 10 things wrong with the code, then tell it to fix the five of them that are valid and important. That is a much different flow than going on an expedition to find a SO solution to an obscure problem.

A good quality metric of your code is to ask an LLM to find the ten worst things about it and if all of those are all stupid then your code is pretty good. I did this recently on a codebase and it's number 1 complaint was that the name I had chosen was stupid and confusing (which it was, I'm not explaining the joke to a computer) and that was my sign that it was done finding problems and time to move on.

impjohn · 2025-12-09T12:07:30 1765282050

>then tell it to give you the top 10 things wrong with the code, then tell it to fix the five of them that are valid and important.

I would be cautious of this. I've tried this multiple times and often it produces very subtle bugs. Sometimes the code is not bad enough to have 5 defects with it, but it will comply, and change things that don't need to. You will find out in prod at some point.

colechristensen · 2025-12-09T18:34:17 1765305257

To be clear, I'm instructing it to generate a list of issues for me. I then decide if anything on that list is worth fixing (or is an issue at all, etc.)

nuclearnicer · 2025-12-08T22:59:43 1765234783

This is great. Asking questions of library code is a big pattern of mine too.

Here's an example I saw on twitter. Asking an LLM to document a protocol from the codebase:

https://ampcode.com/threads/T-f02e59f8-e474-493d-9558-11fddf...

Do you think you will be able to capture any of this extra value? I think I'm faster at coding, but the overall corporate project timeline feels about the same. I feel more relaxed and confident that the work can be done. Not sure how to get a raise out of this.

loandbehold · 2025-12-08T23:12:01 1765235521

For me, as a remote developer, it means I'm able to finish my work in 1 hour instead of 8 hours. So I'm able to capture "extra value" in the form of time. In our team everyone uses GitHub Copilot and I use Claude Code. My teammates' productivity increased slightly but my productivity increased a lot. This is because 1. Claude Code is just a better coding agent 2. I invested time to get good at agentic coding. Eventually Copilot will catch up and management will realize that now 1 developer can do what previously would take a whole team.

overfeed · 2025-12-09T00:17:10 1765239430

I'm really curious on what your role is, and which industry are you in? I'm awed by these productivity gains others report, but I feel like AI helps in such a small part of my job (implementing specific changes as I direct).

Agentic workflows for me results in bloated code, which is fine when I'm willing to hand over an subsystem to the agent, such as a frontend on a side project and have it vibe code the entire thing. Trying to get clean code erases all/most of my productivity gains, and doesn't spark joy. I find having a back-end-forth with an agent exhausting, probably because I have to build and discard multiple mental models of the proposed solution, since the approach can vary wildly between prompts. An agent can easily switch between using Newton-Raphson and bisection when asked to refactor unrelated arguments, which a human colleague wouldn't do after a code review.

ryandrake · 2025-12-09T00:51:20 1765241480

I've come to the same conclusion: If you just want a huge volume of code written as fast as possible, and don't care about 1. how big it is, 2. how fast it runs, 3. how buggy it is, 4. how maintainable or understandable it is, or 5. the overall craftsmanship and artistry of it, then you're probably seeing huge productivity gains! And this is fine for a lot of people and for a lot of companies: Quality really doesn't matter. They just care about shitting out mediocre code as fast as possible.

If you do care about these things, it will take you overall longer to write the code with an LLM than it would by hand-crafting it. I started playing around with Claude on my hobby projects, and found it requires an enormous amount of exhausting handholding and post-processing to get the code to the point where I am really happy with it as a consistent, complete, expressive work of art that I would be willing to sign my name to.

matwood · 2025-12-09T12:02:14 1765281734

> Quality really doesn't matter.

It does matter, but it's one requirement among many. Engineers think quality metrics as you listed are the most important requirements, but that's not typically true.

citizenpaul · 2025-12-09T01:10:29 1765242629

>shitting out mediocre code as fast as possible.

This really is what businesses want and always have wanted. I've seen countless broken systems spitting out wrong info that was actively used by the businesses in my career, before AI. They literally did not want it fixed when I brought it up because dealing with errors was part of the process now in pretty much all cases. I don't even try anymore unless I'm specifically brought on to fix a legacy system.

>that I would be willing to sign my name to.

This right here is what mgmt thinks is the big "problem" that AI solves. They have always wanted us to magically know what parts are "good enough" and what parts can slide but for us to bear the burden of blame. The real problem is same as always bad spec. AI won't solve that but it will in their eyes remove a layer in their poor communication. Obviously no SWE is going to build a system that spit out wrong info and just say "hire people to always double check the work" or add it to so-so's job duties to check, but that really is the solution most places seem to go with by lack of decision.

Perhaps there is some sort of failure of SWE's to understand that businesses don't care. Accounting will catch the expensive errors anyway. Then Execs will bull whip middle managers and it will go away.

swatcoder · 2025-12-09T01:23:43 1765243423

The adversarial tension was all that ever made any of it work.

The "Perfectionist Engineer" without a "Pragmatic Executive" to press them into delivering something good enough would of course still been in their workshop, tinkering away, when the market had already closed.

But the "Pragmatic Executive" without the "Perfectionist Engineer" around to temper their naive optimism would just as soon find themselves chased from the market for selling gilded junk.

You're right that there do seem to be some execs, in the naive optimism that defines them, eager to see if this technology finally lets them bring their vision to market without the engineer to balance them.

We'll see how it goes, I guess.

pdimitar · 2025-12-09T09:54:11 1765274051

That's a nice balanced wholesome take, only the problem is that the "Pragmatic Executive" is more like "Career-driven frenzied 'ship it today at all costs' psychopath executive".

You are describing a push-and-pull / tug-of-war balanced relationship. In reality that's absolutely exactly never balanced. The engineer has 1% say, the other 99% go to the executive.

I so wish your take was universally applicable. In my 24 years of career, it was not.

jordwest · 2025-12-09T01:36:21 1765244181

> Perhaps there is some sort of failure of SWE's to understand that businesses don't care

I think it's an engineer's nature to want to improve things and make them better, but then we naively assume that everybody else also wants to improve things.

I know I personally went through a pretty rough disillusionment phase where I realised most of the work I was asked to do wasn't actually to make anything better, but rather to achieve some very specific metrics that actually made everything but that metric worse.

Thanks to the human tendency to fixate on narratives, we can (for a while) trick ourselves into believing a nice story about what we're doing even if it's complete bunk. I think that false narrative is at the core of mission statements and why they intuitively feel fake (mission statement is often more gaslighting than guideline - it's the identity a company wants to present, not the reality it does present).

AI is eager to please and doesn't have to deal with that cognitive dissonance, so it's a metric chaser's dream.

nullbound · 2025-12-09T13:38:14 1765287494

<< They have always wanted us to magically know what parts are "good enough" and what parts can slide but for us to bear the burden of blame.

Well, that part is bound to add a level of tension to the process. Our leadership has AI training, where the user is responsible for checking its output, but the same leadership also outright stated it now sees individual user of AI as having 7 employees under them ( so should be 7x more productive ). Honestly, its maddening. None of it is how it works at all.

re-thc · 2025-12-09T04:29:42 1765254582

> This really is what businesses want and always have wanted.

There's a difference between what they really want and executives knowing what they want. You make it sound like every business makes optimal decisions to get optimal earnings.

> They literally did not want it fixed when I brought it up because

Because they thought they knew what earns them profits. The key here they thought they knew.

The real problem behind the scenes is a lot of management is short term. Of course they don't care. They roll out their shiny features, get their promotions and leave. The issues after that are not theirs. It is THE business' problem.

loandbehold · 2025-12-09T02:05:13 1765245913

Senior Software Engineer. The system is a niche business software software for a specific industry. It doesn't do any fancy math, all straightforward business logic.

> Trying to get clean code erases all/most of my productivity gains, and doesn't spark joy. I find having a back-end-forth with an agent exhausting, probably because I have to build and discard multiple mental models of the proposed solution, since the approach can vary wildly between prompts

You probably work on something that requires very unique and creative solutions. I work on dumb business software. Claude Code is generally good at following existing code patterns. As far as back-and-forth with Claude Code being exhausting, I have few tips how how to minimize number or shots required to get good solution from CC: 1. Start by exploring relevant code by asking CC questions. 2. Then use Plan Mode for anything more than trivial change. Using Plan Mode is essential. You need to make sure you and CC are on the same page BEFORE it starts writing code 3. If you see CC making same mistake over and over, add instructions to your CLAUDE.md to avoid it in the future. This way your CC setup improves over time, like a coworker who learns over time.

overfeed · 2025-12-09T02:27:32 1765247252

Thank you for the actionable ideas. I'll experiment with closer supervision during the planning stage, hopefully finer-grained implementation details will reduce unnecessarily large refactors during review.

robot-wrangler · 2025-12-09T00:56:07 1765241767

Claims about agentic workflows are the new version of "works on my machine" and should be treated with skepticism if they cannot be committed to a repository and used by other people.

Maybe parent is a galaxy-brained genius, or.. maybe they are just leaving work early and creating a huge mess for coworkers who now must stay late. Hard to say. But someone who isn't interested in automating/encoding processes for their idiosyncratic workflows is a bad engineer, right? And someone who isn't interested in sharing productivity gains with coworkers is basically engaged in sabotage.

eru · 2025-12-09T03:23:43 1765250623

> And someone who isn't interested in sharing productivity gains with coworkers is basically engaged in sabotage.

Who says they aren't interested in sharing? To give a less emotionally charged example: I think my specific use pattern of Git makes me (a bit) more productive. And I'm happy to chew anyone's ear off about it who's willing to listen.

But the willingness and ability of my coworkers to engage in git-related lectures, while greater than zero, is very definitely finite.

robot-wrangler · 2025-12-09T03:40:29 1765251629

Something that is advertised as 10x improvement in productivity isn't like your personal preferences for git or a few dinky bash aliases or whatever. It's more like a secret personal project test-suite, or a whole data pipeline you're keeping private while everyone else is laboriously doing things manually.

Assuming 10x is real, then again the question: why would anyone do that? The only answers I can come up with are that they cannot share it (incompetence) or that they don't want to (sabotage). You're saying the third option is.. people just like working 8 hours while this guy works 1? Seems unlikely. Even if that's not sabotaging coworkers it's still sabotaging the business

loandbehold · 2025-12-09T04:17:20 1765253840

The reason is because we are a Microsoft shop and our company doesn't have Claude account. I'm using my personal Claude Max account. My manager does know that I use Claude Code and I requested the person responsible for AI tooling in our company to use Claude Code but he just said that management already decided to go with GitHub copilot. He thinks that using Claude model in Copilot is same as using Claude Code. Another issue is that we are a Microsoft shop and I use Claude Code through WSL but I'm the only person on our team with Linux skills.

gessha · 2025-12-09T12:43:56 1765284236

Every time I hear “we are a Microsoft shop” makes me remember the scene with Jimmy O Yang and Windows auto updating in Space Force

https://youtu.be/xDLvUqhwHZc

cyberpunk · 2025-12-09T05:36:58 1765258618

There are methods of connecting the claude code cli tools to copilot’s api — look at litellm or something along those lines, it’s a pip pkg and translates the calls code makes

blub · 2025-12-09T08:04:15 1765267455

Business and Enterprise plans have a no-training-on-your-data clause.

I’m not sure personal Claude has that. My account has the typical bullshit verbiage with opt-outs where nobody can really know whether they’re enforceable.

Using a personal account is akin to sharing the company code and could get one in serious trouble IMO.

loandbehold · 2025-12-09T09:06:38 1765271198

You can opt-out of having your code being trained on. When Claude Code first came out Anthropic wasn't using CC sessions for training. They started training on it starting from Claude Code 2 that came out with Sonnet 4.5. User is asked on first use whether to opt-in or out of training.

eru · 2025-12-09T08:51:02 1765270262

> You're saying the third option is.. people just like working 8 hours while this guy works 1?

Nope, I don't say that at all.

I am saying that certain accommodations might feel like 10x to the person making them, but that doesn't mean they are portable.

Another personal example: I can claim with a straight face that using a standing desk and a Dvorak keyboard make me 10x more productive than otherwise. But that doesn't necessarily mean that other people will benefit from copying me, even if I'm happy to explain to anyone how to buy a standing desk from Ikea (or how to work company procurement to get one, in case you are working not-from-home).

In any case, the original commenter replied with a better explanation than our speculations here.

overfeed · 2025-12-09T16:05:54 1765296354

> And someone who isn't interested in sharing productivity gains with coworkers is basically engaged in sabotage.

I'll have to vigorously dissent on this notion: we sell our labor to employers - not our souls. Our individual labor, contracts and remuneration are personalized. Our labor. Not some promise to maximize productivity - that's a job for middle and upper management.

Your employer sure as hell won't directly share 8x productivity gains with employees. The best they can offer is a once-off, 3-15% annual bonus (based on your subjective performance, not the aggregate), alternatively, if you have RSU/options, gains on your miniscule ownership fraction.

pdimitar · 2025-12-09T10:01:14 1765274474

It seems to me that the devs that managed to become sergeants of a small platoon of LLM agents to a crushing success deem their setup a competitive advantage and as such will never share it.

But them being humans, they do want to brag about it.

__mharrison__ · 2025-12-09T04:20:34 1765254034

I'm teaching a course in how to do this to one of my clients this week.

Also, I used this same process to address a bug that is many years old in a very popular library this week. Admittedly, the first solution was a little wordy and required some back and forth, but I was able to get to a clean tested solution with little pain.

wreath · 2025-12-09T07:10:37 1765264237

This has been my experience too. At the end of each session, i’m left very exhausted mentally without full understanding of what I just did, so I have to review it again.

Coding this way requires an effort that is equal to both designing, coding, and reviewing except the code i review isnt mine. Strange situation.

raw_anon_1111 · 2025-12-09T04:45:26 1765255526

Well for me, all of my actual implementation work has been green field from “git init” and mostly coding around the AWS SDK in the target language and infrastructure as code since AI coding has gotten decent.

I haven’t had to write a line of code in a year. First ChatGPT and more recently Claude Code.

I don’t do “agentic coding”. I keep my hands on the steering wheel and build my abstractions and modules up step by step. I make sure every line of code looks like something I would write.

I’m a staff consultant (cloud + app dev) and always lead projects, discovery and design and depending on the size of the project, do all of the actual hands on work myself.

I would have had to staff at least one maybe two less senior consultants to do the actual hands on work before. It’s actually easier for me to do the work then having to have really detailed requirements and coordinating work (the whole “Mythical Man Month” thing).

FWIW: before the pearl clutching starts, I started coding in assembly in 1986 on an Apple //e and have been delivering production level code since 1996.

matwood · 2025-12-09T11:34:28 1765280068

I have tech adjacent people on my team vibing out internal tools that are super useful, and take a load off of engineering. Most internal software is rehashing existing software with different/specific requirements.

jakubmazanec · 2025-12-09T09:48:31 1765273711

Exactly what I experience. I don't need AI to generate complex algorithm, I need e.g. a lot of code for a UI library that is clean and maintainable - but it's can't ever generate such code and it can't be prompted, because training data has much less excellent code than good and ok code. Therefore I can't use AI for high-level design task, ony low-level code, which I then have to check and clean line by line, and that isn't an enjoyable work.

I don't need LLMs, I need some kind mind reading device :D

saxenaabhi · 2025-12-09T00:48:59 1765241339

Not the OP but we use LLMs to build a restaurant pos system with reservations, loyalty, webshop etc. Almost at feature parity with bigwigs like lightspeed/toast.

> I find having a back-end-forth with an agent exhausting, probably because I have to build and discard multiple mental models of the proposed solution, since the approach can vary wildly between prompts

Just right now I had it improve QR payments on POS. This is standard stuff, and I have done it multiple time but i'm happy I didn't have to spend the mental energy to implement it and just had to review the code and test it.

```

Perfect! I've successfully implemented comprehensive network recovery strategies for the OnlinePaymentModal.tsx file. Here's a summary of what was added:

  Implemented Network Recovery Strategies

  1. Exponential Backoff for Polling (lines 187-191)
  2. Network Status Detection (lines 223-246, 248-251)
  3. Transaction Timeout Handling (lines 110-119)
  4. Retry Logic for Initial Transaction (lines 44-105)
  5. AbortController for Request Cancellation (lines 134-139, 216-220)
  6. Better Error Messaging (lines 85-102, 193-196)
  7. Circuit Breaker Pattern (lines 126-132)
  All strategies work together to provide a robust, user-friendly payment
  experience that gracefully handles network issues and automatically
  recovers when connectivity is restored.

```

> An agent can easily switch between using Newton-Raphson and bisection when asked to refactor unrelated arguments, which a human colleague wouldn't do after a code review.

Can you share what domain your work is in? Is it deeptech. Maybe coding agents right now work better for transactional/ecommerce systems?

bccdee · 2025-12-09T04:41:37 1765255297

I don't know if that example is real, but if it is, that's exactly the reason I find AI tools irritating. You do not need six different ways to handle the connection being down, and if you do, you should really factor that out into a connection management layer.

One of my big issues with LLM coding assistants is that they make it easy to write lots & lots of code. Meanwhile, code is a liability, and you should want less of it.

saxenaabhi · 2025-12-09T05:34:16 1765258456

These aren't 6 different way.

You are talking about something like network layers in graphql. That's on our roadmap for other reasons(switching api endpoints to digital ocean when our main cloudflare worker is having an outage), however even with that you'll need some custom logic since this is doing at least two api calls in succession, and that's not easy to abstract via a transaction abstraction in a network layer(you'll have handle it durably in the network layer like how temporal does).

Despite the obvious downsides we actually moved it from durable workflow(cf's take of temporal) server side to client since on workflows it had horrible and variable latencies(sometimes 9s v/s consistent < 3s with this approach). It's not ideal, but it makes more sense business wise. I think many a times people miss that completely.

I think it just boils down to what you are aiming. AI is great for shipping bugfixes and features fast. At a company level I think it also shows in product velocity. However I'm sure very soon our competitors will catch up when AI skepticism flatters.

locknitpicker · 2025-12-09T08:37:43 1765269463

> Most of software work is maintaining "legacy" code, that is older systems that have been around for a long time and get a lot of use.

That's not the definition of legacy. Being there for a long time and getting lots of use is not what makes a legacy project "legacy".

Legacy projects are characterized by not being maintained and having little to no test coverage. The term "legacy" means "I'm afraid to touch it because it might break and I doubt I can put it back together". Legacy means resistance to change.

You can and do have legacy projects created a year or two ago. Most vibecoded apps fit the definition of legacy code.

That is why legacy projects are a challenge to agentic coding. Agents already output huge volumes of code changes that developers struggle to review, let alone assert it's correctness. On legacy projects that are in production, this is disastrous.

tarsinge · 2025-12-09T09:00:02 1765270802

What you list are common characteristics encountered in legacy systems, but what makes it legacy is a business decision of declaring it obsolete and in maintenance mode, and so that no money or time to be invested in it. Old systems that continues to evolve are not legacy, like say Linux, and yes like you say a project that is only one year old can be declared legacy. Resistance to change is only an economic variable that drives the decision. Vibecoded apps fits the definition because the developer is unlikely to want to invest more time in them for different reasons.

locknitpicker · 2025-12-09T10:13:04 1765275184

> What you list are common characteristics encountered in legacy systems, but what makes it legacy is a business decision of declaring it obsolete and in maintenance mode, and so that no money or time to be invested in it.

No, not necessarily. Business decisions are one of the many factors in creating legacy code,but they are by no means the single cause. A bigger factor is developers mismanaging a project to the point they became a unmaintainable mess. I personally went through a couple of projects which were legacy code the minute they hit production. One of them was even a proof of concept that a principal engineer, one of those infamous 10x types, decided to push as-is to production and leave two teams of engineers to sort out the mess left in his wake.

I recommend reading "Working Effectively with Legacy Code" by Michael Feathers. It will certainly be eye-opening to some, as it dispels the myth that legacy code is a function of age.

theshrike79 · 2025-12-09T12:02:38 1765281758

So a project that's still using Java 1.6 and has perfect test coverage and some poor developer is paid to maintain it (but NOT upgrade it!) is not "legacy" in your book?

Then we disagree on the definition.

"Legacy" projects to me are those that should've went through at least two generational refactorings but haven't because of some unfathomable reason. These are the ones that eventually end up being rewritten from scratch because it's faster than trying to upgrade the 25 year old turd.

mekael · 2025-12-09T21:00:25 1765314025

I've predominantly worked in two industries, healthcare/public health and insurance where policies terms are measured in decades. The software for both ranges from 20 to 40 years old, and it hasn't been upgraded because to do so poses an existential risk to either the business or, in the case of healthcare, to human life. Upgrades are measured in terms of human generations because of said risk, but I wouldn't call these systems legacy due to not moving beyond java 1.6.

also, the Hyperion Cantos was a great series.

lambdaone · 2025-12-09T12:11:04 1765282264

Claude is insanely good at grunt-work maintenance coding, which is a fairly formulaic exercise that mostly requires RTFM and simple code changes that look a lot likw the surrounding code. Designing new things from scratch based on human specs is something which Claude still struggles with.

mcny · 2025-12-09T12:25:18 1765283118

The problem is that it often doesn't get it right the first time. You have to sort of have a conversation and it eventually gets there but if you have no idea what the destination should be like, you can't guide it there.

dleeftink · 2025-12-08T23:05:43 1765235143

Although many tools exist, there still seems to a large context gap here: we need better tools to orient ourselves and to navigate large (legacy) codebases. While not strictly a a source graph or the like, I do think Enso like interface may prove successful here[0].

locknitpicker · 2025-12-09T07:07:02 1765264022

> It's not about its ability to churn out lots of code quickly: it's an extra set of eyes/brain that works much faster that human developer.

This is the key take right here. LLMs excel at parsing existing content, summarizing it, and use it to explore scenarios and hypotheticals.

Even the best coding agents out there such as Claude Code or Gemini often fail to generate at the first try code that actually compiles, let alone does what it is expected to do.

Apologists come up with excuses such as the legacy software is not architected well enough to be easily parseable by LLMs but that is a cheap excuse. The same reference LLMs often output utter crap in greenfield projects they themselves generated, and do so after a hand full of prompts. The state of a project is not the issue.

The world is coming to the realization that the AI hype is not delivering. The talk of AI bubble is already mainstream. But like IDEs with auto complete, agents might not solve every problem but they are nevertheless useful and are here to stay. They are more a kin to a search engine where a user doesn't need to copy/paste code snippets to apply a code change.

The sooner we realize this, the better.

hello12343214 · 2025-12-09T05:30:50 1765258250

this was really interesting to read.

curiousgal · 2025-12-09T12:21:15 1765282875

I honestly don't know what code bases you guys are working with, for me I tried it with a large quant library (C++ 97) in an effort to modernize it and so far it's been nothing hit a waste of time. Similarly for a medium sized python quant codebase (3.6) trying to port it to 3.12, and it's also been a headache.

loandbehold · 2025-12-09T19:54:28 1765310068

Email me loandbehold0 at proton.me with details of what you are trying to do. I don't have quantitative finance experience but I can give it a shot.

freedomben · 2025-12-08T23:54:08 1765238048

Completely agree. In the past 12 months, I've had five or six use cases that I would not have bothered scripting or automating before, but I've cranked out scripts or even small web services in under an hour that get the job done using AI. It has really revolutionized the super small bite-sized issues

KellyCriterion · 2025-12-09T14:01:12 1765288872

exactly this! You can do things nearby with one prompt which would have taken weeks before to tinker around; esp. ClaudeAI is very good with "give a detailed first prompt and some context sourcefiles and create a working example on first shot".

E.g. I have to deal with lot of reports and those are usually "never fully developed" because "we can add this one row/feature later" because "management want us to ship early". Now I can enhance our reports by whatever metric just by handing over a current XLSX-exportcode and tell the LLM: "now i want additionally XY here...."

trollbridge · 2025-12-09T03:41:11 1765251671

Well said. The cost of building a CRUD has dropped 90%.

The open question is why people needed fancy AI tools like Claude to write CRUDs in the first place. These kind of tasks ought to be have been automated a long time ago.

rsynnott · 2025-12-09T11:17:56 1765279076

> These kind of tasks ought to be have been automated a long time ago

They have been, repeatedly, since the 70s. See dBase, Clipper, Microsoft Access, Hypercard, Ruby on Rails, stretching Wordpress to within an inch of its life, all manner of "no-code" things...

And, honestly, Excel. People do all manner of terrifying things with Excel, and it is unquestionably the most successful, and arguably the _only_ successful, "we can do this thing instead of employing a programmer" tool.

Generally, one of two things has happened. Either (a) the products of such automation become unmaintainable nightmares (common for the more automated approaches like MS Access) or (b) they become complex enough that they tend towards 'normal' programming (common with, say, Rails, where you could get a simple CRUD with basically just DSL, but realistically eventually you're gonna be writing lots of Ruby).

I feel like LLM-produced stuff is probably going to fall into column A.

trollbridge · 2025-12-09T16:54:44 1765299284

Excel and Google Sheets are indeed where most non-programmers frequently come the closest to programming and actually create useful apps for themselves.

So what’s interesting is that Copilot is basically useless for this task, as is Gemini. How is Microsoft messing up this badly?

sothatsit · 2025-12-09T04:37:21 1765255041

> These kind of tasks ought to be have been automated a long time ago.

It’s much easier to write business logic in code. The entire value of CRUD apps is in their business logic. Therefore, it makes sense to write CRUD apps in code and not some app builder.

And coding assistants can finally help with writing that business logic, in a way that frameworks cannot.

raxxorraxor · 2025-12-09T10:00:51 1765274451

CRUD as a concept is flawed. It is more or less any computational system with input -> process -> output. Just as this abstract system can have any complexity, the same is true for any CRUD app.

You don't need Claude to write it. But you cannot generate solid web forms with the same speed. What usually would have taken you a few hours is now solved in much less time.

I doubt software will get cheaper though, requirements will adapt.

trollbridge · 2025-12-09T17:02:32 1765299752

If I wanted to make a crud, I’d whip up my own little framework first (a few hours) and then cranking out forms would be almost trivial.

eloisant · 2025-12-09T11:02:45 1765278165

That went 90% down even before AI, Rails and the other frameworks, libraries, tooling have made a big difference compared to earlier years.

The number of lines of code to be written is much, much lower than in the early 2000's.

IshKebab · 2025-12-09T09:04:04 1765271044

> These kind of tasks ought to be have been automated a long time ago.

People have been trying for literally decades. The problem is that there is just enough uniqueness to every CRUD app that you can't really have "the CRUD app".

I guess it's the sweet spot for AI at the moment because they're 95% all the same but with some fairly simple unique aspects.

almosthere · 2025-12-08T23:07:13 1765235233

Most code is simple, the fact that large complex systems are layers of simple code on top of itself, like garbage heaps at the dump, makes it complex. Sticking with the garbage analogy, the LLM is like upgrading from one shovel to an crew of 10 people with excavators to look for a lost Bitcoin hard drive.

Your project is still going to fail, but it will fail faster with the 10 excavators.

edg5000 · 2025-12-09T07:35:01 1765265701

The analogy I use is going from hand-farming or farm animals to large combines overnight. You still need all the knowledge abouw farming, but it amplifies and multiplies your ability.

shigawire · 2025-12-09T05:26:22 1765257982

Sometimes you just need to move trash around with low accuracy... In that case the excavator swarm is good enough.

ngc248 · 2025-12-09T09:18:57 1765271937

Every use case has necessary complexity. The only things simple are CRUD apps.

tclancy · 2025-12-09T02:00:16 1765245616

And pollute the world. Really good analogy.

Terr_ · 2025-12-08T23:46:51 1765237611

> The cost of writing simple code has dropped 90%.

Plus there's a lot of simple code you shouldn't be writing either way, because it's in a library by now.

By their nature, LLMs will do their best with things that could be plagiarized.

vidarh · 2025-12-09T00:17:19 1765239439

LLMs work great at identifying libraries I'd never have otherwise found and use them, as long as you ask them for solutions instead of micromanage how they should get things done.

matwood · 2025-12-09T12:13:34 1765282414

I've noticed this also. Very useful.

mewpmewp2 · 2025-12-09T00:15:56 1765239356

Aren't we having major issues with there being too many small libraries right now and dependency chain that grows exponentially? I have thought LLMs will actually benefit us a lot here, with not having to use a lib for every little thing (leftpad etc?).

rsynnott · 2025-12-09T11:20:55 1765279255

That's primarily a culture problem, mostly with Javascript (you don't really see the same issue in most language ecosystems). Having lots of tiny libraries is bad, but writing things covered by libraries instead of using _sensible_ libraries is also bad.

(IMO Javascript desperately needs an equivalent to Boost, or at the very least something like Apache Commons.)

bluefirebrand · 2025-12-09T01:30:33 1765243833

Frankly if you were relying on libraries like Leftpad to you probably had no business writing code before AI

And if you rely on AI to write that code now you still have no business writing code

teknopaul · 2025-12-09T02:52:26 1765248746

Bit harsh.

That was probably a node / npm thing, because they had no stdlib it was quite common to have many small libraries.

I consider it an absolute golden rule for coding to not write unnecessary code & don't write collections.

I still see a lot of C that ought not to have been written.

I'm a grey beard, and don't fear for my job. But not relying on AI if it's faster to write, is as silly as refusing a correct autocomplete and typing it by hand. The bytes don't come out better