More

hmottestad · 2025-08-30T14:25:24 1756563924

Been playing with Codex CLI the past week and it really loves to create a fix for a bug by adding a special case for just that bug in the code. It couldn't see the patterns unless I pointed them out and asked it to create new abstractions.

It would just keep adding what it called "heuristics", which were just if statements that tested for a specific condition that arose during the bug. I could write 10 tests for a specific type of bug, and it would happily fix all of them. When I add another one test with the same kind of bug it obviously fails, because the fix that Codex came up with was a bunch of if statements that matched the first 10 tests.

xyzzy123 · 2025-08-30T15:02:55 1756566175

Also they hedge a lot, will try doing things one way, have a catch / error handler and then try a completely different way - only one of them can right but it just doesn't care. Have to lean hard to get it to check which paths are actually used and delete the others.

I am convinced this behaviour and the one you described are due to optimising for swe benchmarks that reward 1-shotting fixes without regard to quality. Writing code like this makes complete sense in that context.

mewpmewp2 · 2025-08-30T15:48:52 1756568932

That's a really good point. I was wondering why some of the LLMs were trained to try to pass things so sloppily constantly. Writing mock data, methods and pretending as if the task is complete and everything is great, good to go. They do seem to be trained just to pass some sort of conditions sadly and it feels somehow to me that it has got worse as of late. It should be relatively easy to reward them for writing robust code even if it takes longer or won't work, but it does seem they are geared towards getting high swe benchmarks.

Buttons840 · 2025-08-30T14:29:15 1756564155

It's clear that these AIs are approaching human level intelligence. (:

Thank you for giving a perfect example of what I was describing.

The thing is, you actually can make the software work this way, you just have to add enough if-statements to handle all cases--or rather, enough cases that the manager is happy.

hmottestad · 2025-08-22T16:45:19 1755881119

People at work still don’t believe me when I tell them that there’s no point using the pods that say they have rinse aid built in…

hmottestad · 2025-08-20T19:08:42 1755716922

Have you tried this one here by any chance?

https://huggingface.co/dslim/bert-base-NER

Just wondering if it’s worth testing and what it would be most useful for.

hmottestad · 2025-08-05T20:33:09 1754425989

I don’t think they trained it for fact retrieval.

Would probably do a lot better if you give it tool access for search and web browsing.

Invictus0 · 2025-08-05T20:41:10 1754426470

What is the point of an offline reasoning model that also doesn't know anything and makes up facts? Why would anyone prefer this to a frontier model?

MuteXR · 2025-08-05T21:32:24 1754429544

Data processing? Reasoning on supplied data?

hmottestad · 2025-08-03T05:47:41 1754200061

In Oslo we seem to have a problem with trucks. Just in the past year, two people have been run over and killed by trucks. One was where the truck driver was reversing and another where the truck driver did an illegal right turn over a pavement.

Recently there has been a case in the courts where a truck driver didn’t yield to a cyclist and killed her. The narrative from the national truck association was basically that the cyclist was at fault. Even the courts were in on it, only when it got to the highest court did it seem that anyone was willing to blame the truck driver.

hmottestad · 2025-08-02T12:11:03 1754136663

«Get notified when we’re launching»

There doesn’t seem to be anything here yet. I signed up, but haven’t heard back yet.

So not much of a «Show HN» when there is nothing to see at the moment.

hmottestad · 2025-07-27T20:59:14 1753649954

Good for you. Cross my fingers that you'll land a good job soon. Or create your own job.

tombert · 2025-07-27T21:04:50 1753650290

I have a few prospects lined up.

I'm actually planning on doing a second masters from a slightly more prestigious university with a more theory-heavy degree [1], but it's nice to at least have an official graduate degree now. Hopefully it helps me find work a bit quicker, and if nothing else it's just kind of fun to pile up degrees.

[1] https://www.open.ac.uk/postgraduate/qualifications/f04

yu3zhou4 · 2025-07-28T11:09:04 1753700944

I was recently looking for a master's degree in math that is:

- 100% remote

- 100% self-paced

- fairly cheap

and it looks like Open University is the best option right now? Did you find any better option?

tombert · 2025-07-28T12:55:01 1753707301

I couldn’t find anything that ticks all those outside of OU.

University of Texas has one that looked pretty ok, but it was kind of expensive for a non-Texas resident.

University of Western Florida has one for “Mathematical Sciences”, which more or less fits, and it’s not even that expensive, but I think that one is synchronous.

yu3zhou4 · 2025-07-28T15:30:08 1753716608

Yeah, I wish there were more options than that. Also, remote phd or master+phd would be even better, but these are even more uncommon and pricey (unless you know about a one that is good and cheap and remote then I’d love to learn more)

tombert · 2025-07-28T16:05:33 1753718733

I was doing the University of York online PhD in computer science (formal methods), and it was actually pretty great, but it was costing me like $17,000 per year, and it was a huge time sink when I was already working full time.

That said, if you feel like you're organized enough to pull it off, I do recommend looking into University of York. It's a very good school.

yu3zhou4 · 2025-07-28T17:12:34 1753722754

Thanks, that's actually useful and I will be happy to consider it once I have more spare time in life!

Why did you complete WGU masters in computer science after already having a PhD in computer science?

tombert · 2025-07-28T17:47:30 1753724850

I don't have a PhD, I dropped it about a year ago; sorry, rereading my comment that was not made clear.

I wanted a graduate degree in CS, and I figured I could get the WGU one quickly.

yu3zhou4 · 2025-07-28T19:47:23 1753732043

Oh I get it now, thanks! Please let me know if you decide to enroll to math master's in OU, maybe we can help each other! I think I'll do the same on the next semester (so early 2026)

tombert · 2025-07-28T19:57:51 1753732671

I actually just registered for it :) Assuming I'm approved I'll be starting in October.

Feel free to email me (address in profile) if you want to talk about it.

yu3zhou4 · 2025-07-29T13:18:59 1753795139

Thanks! Keeping fingers crossed for your success with this degree

hmottestad · 2025-07-17T17:34:19 1752773659

Might be related to EFTA.

hmottestad · 2025-07-09T18:55:38 1752087338

My mum is using my old 12 Pro, which is about the same age as yours. She’s still happy with it, though to me it feels old.

hmottestad · 2025-07-03T22:06:16 1751580376

I’m sure the CPU designers would love it if they didn’t need several different layers of cache. Or no cache at all. Imagine if memory IOPS were as fast as L1 cache, no need for all that dedicated SRAM on the chip or worry about side channel attacks.

atq2119 · 2025-07-04T18:16:00 1751652960

Sure, but we were talking about the perspective of software developers. The hardware designers take on complexity so that the software developer's work can be simpler.