I think it's fantastic that now, for very little money, everyone gets to share a narrow but stressful subset of what it feels like to employ other people.
Really, I recommend reading this part of the thread while thinking about the analogy. It's great.
It sounds nice on the outside, but employees are actually all different people, whereas this is one company’s blob of numbers with not much incentive to optimize your costs.
Competition fixes some of this; I hope Anthropic and Mistral are not far behind.
On the contrary. It will be the world's most scrutinized employee. Thousands of people, amongst them important people with big levers, will be screaming in their ear on my behalf constantly, and my — our collective — employee gets better without me having to do anything. It's fantastic!
> Any respectable employer/employee relationship transacts on results rather than time anyway.
No. This may be common in freelance contracts, but is almost never the case in employment contracts, which specify a time-based compensation (usually either per hour or per month).
I believe the parent's point was that if one's management is clueless about how to measure output, and compensation/continued employment is unlinked from it, one is probably working for a bad company.
Also, now we're paying for output tokens that aren't even output, with no good explanation for why these tokens should be hidden from the person who paid for them.
Good catch. That suggests chains of thought really are a straightforward way to make LLMs better at reasoning, if a competitor could replicate the improvement just by seeing the steps.
It also seems very impractical to embed this in a deployed product. How can you possibly hope to control and estimate costs? I guess this is strictly meant for R&D purposes.
With the conventional models you don't get the activations or the logits even though those would be useful.
Ultimately, if the output of the model is worth what you end up paying for it, then great; I don't see why it really matters to you whether OpenAI is lying about token counts or not.
As a single user, it doesn’t really matter, but as a SaaS operator I want tractable, hopefully predictable pricing.
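One way to get at least an upper bound on spend is to cap completion tokens per request and price the worst case up front. This is only a sketch: it assumes a cap on total completion tokens (visible output plus hidden reasoning, both billed at the output rate), and the per-million-token prices below are placeholders, not real OpenAI rates.

```python
def worst_case_cost(prompt_tokens: int,
                    max_completion_tokens: int,
                    input_price_per_1m: float = 15.0,
                    output_price_per_1m: float = 60.0) -> float:
    """Upper bound on the billed cost of one request, in dollars.

    Reasoning tokens are billed as output tokens, so capping total
    completion tokens also caps the hidden-token spend. Prices are
    illustrative placeholders.
    """
    input_cost = prompt_tokens / 1_000_000 * input_price_per_1m
    output_cost = max_completion_tokens / 1_000_000 * output_price_per_1m
    return input_cost + output_cost

# e.g. a 2,000-token prompt with a 10,000-token completion cap
print(round(worst_case_cost(2_000, 10_000), 4))  # 0.63
```

The bound is loose (most requests won't hit the cap), but it turns "we charge what we feel like" into a hard per-request ceiling you can budget against.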
I wouldn’t just implicitly trust a vendor when they say “yeah we’re just going to charge you for what we feel like when we feel like. You can trust us.”
OAI doesn't show the actual CoT, on the grounds that it's potentially unsafe output and also to prevent competitors from training on it. You only see a sanitized summary.
No access to the reasoning output seems totally bonkers. All of the real cost is in inference; assembling an HTTP response to deliver that text seems trivial?
> While reasoning tokens are not visible via the API, they still occupy space in the model's context window and are billed as output tokens.
From here: https://platform.openai.com/docs/guides/reasoning
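You can at least audit how much of the bill is hidden from the usage accounting in each response. A minimal sketch, assuming the usage object exposes a `completion_tokens_details.reasoning_tokens` field as described in OpenAI's docs (the field name and the example numbers below are assumptions for illustration):

```python
def hidden_token_share(usage: dict) -> float:
    """Fraction of billed completion tokens that are never shown.

    Assumes the usage object shape documented for reasoning models:
    reasoning tokens are reported under completion_tokens_details
    and are included in (not added to) completion_tokens.
    """
    completion = usage.get("completion_tokens", 0)
    reasoning = usage.get("completion_tokens_details", {}).get("reasoning_tokens", 0)
    return reasoning / completion if completion else 0.0

# Hypothetical usage block from one response:
usage = {
    "prompt_tokens": 50,
    "completion_tokens": 1200,
    "completion_tokens_details": {"reasoning_tokens": 900},
}
print(hidden_token_share(usage))  # 0.75
```

Logging this per request at least tells you what share of your output-token spend you never got to see.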