The weights likely won't be available for this model, since it's part of the Max series, which has always been closed. The most "open" you get is the API.
The closed nature is one thing, but the opaque billing on reasoning tokens is the real dealbreaker for integration. If you're bootstrapping a service, I don't see how you can model your margins when the API arbitrarily decides how long to think on a prompt and bills you for it. It makes unit economics impossible to predict.
FYI: newer LLM hosting APIs offer control over the amount of "thinking" (as well as reply length) -- some by token count, others by an enum (low, medium, high, etc.).
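
A rough sketch of what the two styles look like in practice, assuming the Anthropic Messages API's "thinking" budget and OpenAI's "reasoning_effort" enum; the model names are just examples and the exact parameters may have changed since:

    import os
    import requests

    prompt = "Summarize the tradeoffs of reasoning-token billing."

    # Style 1: explicit token budget (Anthropic Messages API).
    # budget_tokens caps how many tokens the model may spend thinking,
    # which puts a hard upper bound on what you can be billed for.
    anthropic_resp = requests.post(
        "https://api.anthropic.com/v1/messages",
        headers={
            "x-api-key": os.environ["ANTHROPIC_API_KEY"],
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
        json={
            "model": "claude-sonnet-4-20250514",  # example model name
            "max_tokens": 2048,  # caps reply length; must exceed the budget
            "thinking": {"type": "enabled", "budget_tokens": 1024},
            "messages": [{"role": "user", "content": prompt}],
        },
    )

    # Style 2: coarse-grained enum (OpenAI chat completions).
    # reasoning_effort trades answer quality against thinking cost.
    openai_resp = requests.post(
        "https://api.openai.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={
            "model": "o3-mini",  # example reasoning model
            "reasoning_effort": "low",  # "low" | "medium" | "high"
            "max_completion_tokens": 2048,  # caps reply length
            "messages": [{"role": "user", "content": prompt}],
        },
    )

Either way you can cap the worst case per request, which at least makes the unit economics bounded, even if not perfectly predictable.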