The generated chain of thought for their example is incredibly long! The style is kind of similar to how a human might reason, but it's also redundant and messy at various points. I hope future models will be able to optimize this further, otherwise it'll lead to exponential increases in cost.