I read this all the time, and yet no one seems able to come up with even a few questions from several months ago that ChatGPT has become “worse” at. You would think that if this were happening it would be very easy to produce such evidence, since the chat history of every conversation is stored by default.
It's probably just subjective bias. Once the novelty wears off, you learn not to rely on it as much, because sometimes it's very difficult to get exactly what you want. In my personal experience I ended up using it less and less to avoid butting heads with it, to the point that I cancelled my subscription altogether. YMMV, of course.
Care to share these examples, in a scientific (n > 30) manner that can’t just be attributed to model nondeterminism? I don’t follow these threads religiously, but in the ones I’ve seen no one has been able to provide any convincing evidence. I’m not some sort of OpenAI apologist, so if there is actual, provable evidence here I will happily change my mind about it.
I don't see how anyone could provide what you are asking for. I can go through my chat history and find a prompt that got a better answer 3 months ago than I get now, but you can always just say it's nondeterminism.
Without access to the old model, I can't collect samples with n > 1.
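The closest thing I can think of is the API rather than the ChatGPT UI: dated snapshots like gpt-4-0314 and gpt-4-0613 were selectable there, at least for a while, so if you have API access and those snapshots are still available to your account, you could get something like an n > 30 comparison. A rough sketch, assuming the openai Python package (v1.x), an OPENAI_API_KEY in the environment, and placeholder model names and prompts:

    # Collect n samples of the same prompts from two dated snapshots, so a
    # difference can't be written off as single-sample nondeterminism.
    # Model names and prompts are placeholders; adjust to what your account offers.
    import json
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    MODELS = ["gpt-4-0314", "gpt-4-0613"]  # older vs. newer snapshot (assumed available)
    PROMPTS = ["Summarise chapter 1 of <some book> in 200 words."]  # your saved prompts
    N = 30  # samples per (model, prompt) pair

    results = []
    for model in MODELS:
        for prompt in PROMPTS:
            for i in range(N):
                resp = client.chat.completions.create(
                    model=model,
                    messages=[{"role": "user", "content": prompt}],
                    temperature=1.0,  # keep the sampling behaviour the product actually uses
                )
                results.append({
                    "model": model,
                    "prompt": prompt,
                    "sample": i,
                    "answer": resp.choices[0].message.content,
                })

    # Dump everything for blind, side-by-side human rating afterwards.
    with open("samples.json", "w") as f:
        json.dump(results, f, indent=2)

Even then, this only tells you about the API snapshots, not whatever is actually deployed behind the ChatGPT UI, which is the thing people are complaining about.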
Here is one.
I ask it to write some code, 4-5 pages long. With some back and forth, it does. Then I ask it to "change lines 50-65 from blue to red", and it does (change #1). I ask it to show me the full code. Then I ask it to "change lines 100-120 from yellow to green". Aaaand it makes change #2 and reverts change #1. Oh, the number of times this has happened... So now, when I ask it to make a change, I do it by 'paragraph' and copy and paste the new paragraph back in. It's annoying, but it still makes things faster.
OpenAI regularly changes the model, and they admit the new models are more restricted, in the sense that they prevent tricky prompts from producing naughty words, etc.
It should be their responsibility to prove that it's just as capable.
Whoever makes the claim bears the burden of proof. Did OpenAI claim that their models didn’t regress while putting these new safeguards into place? If not, it feels like the burden of proof lies with whoever claims that they did.
To be specific, the claim we are talking about here is “ChatGPT gives generally worse answers to the exact same questions than it gave X months ago”. Perhaps for the subset of the knowledge space you’re referring to, the one those updates were pushed to, that is fairly easy to prove, but I’m more interested in the general case.
In other words, you can pretty easily claim that ChatGPT got worse at telling me how to make a weapon than it was 3 months ago. I could easily believe that, and also accept that it was probably intentional. While we can debate whether that was a good idea or not, I’m more interested in whether ChatGPT got worse at summarizing some famous novel, or at helping write a presentation, than it was 3 months ago.
Well, sure, but shouldn’t some pedant have the time to dig up their ChatGPT history from 4 months ago to disprove the claim? It seems like it would be pretty easy to do, and there are plenty of pedants on the internet, but I don’t see the blogosphere awash in side-by-side comparisons showing how much worse it got.
One example: it now refuses to summarise books that it was trained on. Soon after trying GPT-4 I could get it to summarise Evans’ DDD chapter by chapter. Not anymore.
Pointing out a specific bug in functionality is not the same as saying “in general, the quality of GPT answers has decreased over X months”, especially when that bug is in a realm that LLMs have already been provably bad at.