I tested o1-preview on some coding stuff I've been using gpt-4o for. I am not impressed. The new, more intentional chain-of-thought logic is apparently not something it can meaningfully apply to a non-trivial codebase.
Sadly, I think this OpenAI announcement is hot air, and I am now (unfortunately) much less enthusiastic about upcoming OpenAI announcements. This is the first one that has been extremely underwhelming, though in hindsight the big announcement about structured responses (months after nearly identical behavior was already available via JSON Schema) was also hot air.
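On the structured-responses point: schema-constrained output was already achievable through function calling, where the `parameters` field is itself a JSON Schema. Here is a rough sketch of the two routes side by side using the openai Node SDK (the model names and the toy schema are placeholders of mine, not anything from the announcement):

```typescript
import OpenAI from "openai";

const client = new OpenAI();

// A plain JSON Schema describing the shape we want back; both routes use it.
const schema = {
  type: "object",
  properties: { answer: { type: "string" } },
  required: ["answer"],
  additionalProperties: false,
};

async function main() {
  // Older route: function calling. `parameters` is a JSON Schema, and forcing
  // the tool choice makes the model return arguments matching that schema.
  const viaTools = await client.chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: "What is 2 + 2?" }],
    tools: [
      { type: "function", function: { name: "report_answer", parameters: schema } },
    ],
    tool_choice: { type: "function", function: { name: "report_answer" } },
  });
  console.log(viaTools.choices[0].message.tool_calls?.[0].function.arguments);

  // Newer route: the "structured outputs" launch. Same schema, different knob.
  const viaResponseFormat = await client.chat.completions.create({
    model: "gpt-4o-2024-08-06",
    messages: [{ role: "user", content: "What is 2 + 2?" }],
    response_format: {
      type: "json_schema",
      json_schema: { name: "report_answer", strict: true, schema },
    },
  });
  console.log(viaResponseFormat.choices[0].message.content);
}

main();
```

The one genuinely new piece was `strict: true`, which guarantees schema-conformant output rather than merely steering toward it. Hence "nearly identically" rather than "identically".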
I think OpenAI is making the same mistake Google made with the search interface. Rather than treating it as a command line to be mastered, Google optimized for better results for someone who had no idea how to phrase a search.
Similarly, OpenAI is optimizing for someone who doesn't know how to interact with a context-limited LLM. Sure, that helps the low end, but based on my initial testing it is not going to help anyone who has already learned how to write good prompts.
What is needed is the ability for the LLM to create a useful, ongoing meta-context for the conversation so that it doesn't make stupid mistakes and omissions. I was really hoping OpenAI would have something like this ready for use.
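To make that concrete, here is roughly what I mean, as a minimal sketch (my own hypothetical scaffolding, not an existing OpenAI feature; only the openai Node SDK calls are real, and the digest-update prompt is something you'd have to tune):

```typescript
import OpenAI from "openai";

const client = new OpenAI();
const MODEL = "gpt-4o"; // placeholder model

// The "meta-context": a running digest of constraints, decisions, and gotchas
// that must survive even after the raw turns fall out of the context window.
let metaContext = "(nothing yet)";

type Turn = { role: "user" | "assistant"; content: string };

async function chat(history: Turn[], userMessage: string): Promise<string> {
  history.push({ role: "user", content: userMessage });

  const reply = await client.chat.completions.create({
    model: MODEL,
    messages: [
      {
        role: "system",
        content:
          "You are a coding assistant. Durable facts about this conversation:\n" +
          metaContext,
      },
      ...history.slice(-10), // only recent turns travel verbatim
    ],
  });
  const answer = reply.choices[0].message.content ?? "";
  history.push({ role: "assistant", content: answer });

  // After each turn, fold any new durable facts into the digest.
  const update = await client.chat.completions.create({
    model: MODEL,
    messages: [
      {
        role: "system",
        content:
          "Update this digest of durable facts (requirements, file names, " +
          "decisions) with anything new from the exchange. Return only the digest.",
      },
      {
        role: "user",
        content: `Digest:\n${metaContext}\n\nExchange:\nUser: ${userMessage}\nAssistant: ${answer}`,
      },
    ],
  });
  metaContext = update.choices[0].message.content ?? metaContext;

  return answer;
}
```

The point is that mistakes of omission drop when durable constraints are restated on every turn instead of having to survive verbatim in a shrinking window.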
I have tested o1-preview on a couple of coding tasks and I am impressed.
I am looking at a TypeScript project with a fair amount of type gymnastics, and one particular line of code would not validate with tsc no matter what I tried. I copy-pasted the whole context into o1-preview, and it told me the error I was likely seeing (a spot-on, letter-by-letter correct error message, including my variable names), explained the problem, and provided two solutions, both of which worked immediately.
In another test, I pasted a smart contract written in Solidity and naively asked it to identify vulnerabilities. It thought for more than a minute and then produced a detailed report of what could go wrong, much, much deeper than any previous model could do. (It found no vulnerabilities, because my code is perfect, but that's another story.)