There is always a tradeoff between latency and reasoning. The bigger the model, the more we can get it to do through stronger instruction following, but that capability comes at the cost of higher latency. Smaller open-source models colocated with the application do much better on latency, but their instruction following is weaker, so we often have to tune their prompts far more heavily than we would for bigger models.
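
As a rough illustration of this tradeoff, here is a minimal sketch of routing between a small colocated model and a larger hosted one under a latency budget, and of giving the small model the more explicit prompt scaffolding it tends to need. The model names, latency numbers, and routing heuristic are all hypothetical, not a reference to any specific deployment:

```python
from dataclasses import dataclass

@dataclass
class ModelConfig:
    name: str
    expected_latency_ms: int        # rough p50 latency per request (illustrative)
    follows_instructions_well: bool

# Hypothetical configs: a small colocated model vs. a larger hosted one.
SMALL_LOCAL = ModelConfig("local-7b", expected_latency_ms=150, follows_instructions_well=False)
LARGE_HOSTED = ModelConfig("hosted-large", expected_latency_ms=1200, follows_instructions_well=True)

def build_prompt(task: str, model: ModelConfig) -> str:
    """Larger models can usually be prompted tersely; smaller models
    tend to need the format, constraints, and an example spelled out."""
    if model.follows_instructions_well:
        return f"{task}\nRespond with a JSON object."
    return (
        f"{task}\n"
        "Follow these rules exactly:\n"
        "1. Respond with a single JSON object and nothing else.\n"
        '2. Use the keys "answer" and "confidence".\n'
        'Example: {"answer": "yes", "confidence": 0.9}\n'
    )

def pick_model(latency_budget_ms: int, needs_strict_instructions: bool) -> ModelConfig:
    """Prefer the small colocated model when the latency budget is tight
    and the task does not demand strong instruction following."""
    if needs_strict_instructions and LARGE_HOSTED.expected_latency_ms <= latency_budget_ms:
        return LARGE_HOSTED
    return SMALL_LOCAL

if __name__ == "__main__":
    model = pick_model(latency_budget_ms=300, needs_strict_instructions=False)
    print(model.name)
    print(build_prompt("Classify the sentiment of: 'Great product!'", model))
```

The point of the sketch is that the prompt-tuning cost shows up in code: the branch for the small model is several times longer because we compensate for weaker instruction following with explicit rules and a worked example.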