

lagrange77 6 months ago | on: Running GPT-OSS-120B at 500 tokens per second on N...

Thanks for your answers! Since this seems hard to calculate, maybe someone should just build a database website that tracks specific setups (model, exact variant/quantisation, runner, hardware), where users can report which combinations they got running (or not), along with metrics like tokens/s. Visitors could then specify their runner and hardware and filter for a list of models that would run on that setup.
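To make the idea a bit more concrete, here is a minimal sketch of how such reports could be stored and filtered. Everything in it is an assumption for illustration: the `SetupReport` fields, the `models_for` helper, and the placeholder values like `model-a` or `gpu-1` are hypothetical and not taken from any existing site or benchmark.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SetupReport:
    """One user-submitted report (hypothetical schema)."""
    model: str                      # e.g. a model name
    variant: str                    # exact variant / quantisation
    runner: str                     # inference runtime used
    hardware: str                   # GPU/CPU the user ran it on
    runs: bool                      # did this combination work at all?
    tokens_per_s: Optional[float] = None  # reported speed, if any


def models_for(reports, runner, hardware):
    """Return {(model, variant): best reported tokens/s} for setups
    that were reported as working on the given runner and hardware."""
    best = {}
    for r in reports:
        if r.runs and r.runner == runner and r.hardware == hardware:
            key = (r.model, r.variant)
            speed = r.tokens_per_s or 0.0
            best[key] = max(best.get(key, 0.0), speed)
    return best


# Placeholder reports, made up purely to exercise the filter.
reports = [
    SetupReport("model-a", "q4", "runner-x", "gpu-1", True, 30.0),
    SetupReport("model-a", "q4", "runner-x", "gpu-1", True, 42.0),
    SetupReport("model-b", "q8", "runner-x", "gpu-1", False),
    SetupReport("model-c", "q4", "runner-y", "gpu-1", True, 10.0),
]

result = models_for(reports, "runner-x", "gpu-1")
```

A real site would probably want fuzzier hardware matching (e.g. by available memory) instead of exact string equality, but exact matching keeps the sketch short.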

diggan 6 months ago

Yeah, what you're suggesting sounds like it could be more useful than the "generalized calculators" people are currently publishing and using.
