The weights likely won't be available for this model, since it's part of the Max series, which has always been closed. The most "open" you get is the API.
The closed nature is one thing, but the opaque billing on reasoning tokens is the real dealbreaker for integration. If you're bootstrapping a service, I don't see how you can model your margins when the API arbitrarily decides how long to think on a prompt and bills you for it. It makes unit economics impossible to predict.
FYI: newer LLM hosting APIs offer control over the amount of "thinking" (as well as reply length) -- some by token count, others by an enum (low, medium, high, etc.).
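
A rough sketch of what the two styles look like in practice, assuming the Anthropic Messages API's "thinking" budget and OpenAI's "reasoning_effort" enum; the model names are just examples and the exact parameters may have changed since:

    import os
    import requests

    prompt = "Summarize the tradeoffs of reasoning-token billing."

    # Style 1: explicit token budget (Anthropic Messages API).
    # budget_tokens caps how many tokens the model may spend thinking,
    # which puts a hard upper bound on what you can be billed for.
    anthropic_resp = requests.post(
        "https://api.anthropic.com/v1/messages",
        headers={
            "x-api-key": os.environ["ANTHROPIC_API_KEY"],
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
        json={
            "model": "claude-sonnet-4-20250514",  # example model name
            "max_tokens": 2048,  # caps reply length; must exceed the budget
            "thinking": {"type": "enabled", "budget_tokens": 1024},
            "messages": [{"role": "user", "content": prompt}],
        },
    )

    # Style 2: coarse-grained enum (OpenAI chat completions).
    # reasoning_effort trades answer quality against thinking cost.
    openai_resp = requests.post(
        "https://api.openai.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={
            "model": "o3-mini",  # example reasoning model
            "reasoning_effort": "low",  # "low" | "medium" | "high"
            "max_completion_tokens": 2048,  # caps reply length
            "messages": [{"role": "user", "content": prompt}],
        },
    )

Either way you can cap the worst case per request, which at least makes the unit economics bounded, even if not perfectly predictable.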