
I thought ChatGPT was only 20B parameters to begin with?

(Source https://www.forbes.com/sites/forbestechcouncil/2023/02/17/is...)



I haven't seen anything official from OpenAI confirming that ChatGPT has fewer than 175B parameters, although it is a reasonable guess if you read between the lines of their statements.

Given that the author of that article is the CEO of an 'AI Ad Optimization Platform', I think that number is speculative at best.


ChatGPT is a fine-tuned InstructGPT model, which has 1.3B parameters, if I'm not mistaken.

Reference for the former: https://www.technologyreview.com/2023/03/03/1069311/inside-s...


InstructGPT isn't a single model; it's a set of techniques for fine-tuning a foundation model.
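
For context, the recipe in the paper is: supervised fine-tuning (SFT) on human-written demonstrations, then a reward model trained on human preference rankings, then RLHF with PPO. Here's a rough sketch of just the SFT step, assuming Hugging Face transformers, with GPT-2 as a stand-in (the GPT-3 weights aren't public) and "demonstrations.jsonl" as a made-up dataset of prompt/response pairs; this is not OpenAI's actual code:

    # Sketch of the supervised fine-tuning (SFT) stage of the InstructGPT
    # recipe. GPT-2 stands in for a GPT-3-class model; demonstrations.jsonl
    # is a hypothetical file of {"prompt": ..., "response": ...} records.
    import json
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    examples = [json.loads(line) for line in open("demonstrations.jsonl")]
    texts = [ex["prompt"] + "\n" + ex["response"] + tokenizer.eos_token
             for ex in examples]

    batch = tokenizer(texts, return_tensors="pt", padding=True,
                      truncation=True, max_length=512)
    labels = batch["input_ids"].clone()
    labels[batch["attention_mask"] == 0] = -100  # no loss on padding

    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
    model.train()
    loss = model(**batch, labels=labels).loss  # ordinary next-token loss
    loss.backward()
    optimizer.step()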


What does "1.3B parameters" mean in this context?

Does it mean we load the 175B GPT-3 model first, then overwrite 1.3B of its parameters with InstructGPT?

I find this sentence difficult to understand:

> Our labelers prefer outputs from our 1.3B InstructGPT model over outputs from a 175B GPT-3 model

https://openai.com/research/instruction-following

I am a newbie, so please correct me if I am wrong.


They mean that they took a 1.3B-parameter model, applied the InstructGPT fine-tuning process, and found that it worked better for their use case than a 175B-parameter model that had not gone through that process.
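
To make the "overwrite" question above concrete: nothing inside the 175B model is touched. Fine-tuning starts from the smaller 1.3B checkpoint and updates all of its weights with ordinary gradient descent. A quick sanity check (again with GPT-2 as a stand-in, since GPT-3's weights aren't public) shows that full fine-tuning trains every parameter:

    from transformers import AutoModelForCausalLM

    # GPT-2 as a stand-in; GPT-3 checkpoints aren't publicly available.
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    total = sum(p.numel() for p in model.parameters())
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    print(f"{trainable:,} of {total:,} parameters trainable")
    # Both numbers match: full fine-tuning updates every weight of the
    # small model rather than a 1.3B slice of a bigger one.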


Ah I got it now. Thanks.

From the GPT-3 paper it looks like they have many variants, like:

- GPT-3-350M

- GPT-3-1.3B

- GPT-3-2.7B

- GPT-3-6.7B

- GPT-3-13B

- GPT-3-175B

Ada, Babbage, Curie, and Davinci line up closely with 350M, 1.3B, 6.7B, and 175B respectively. The names are pretty suggestive.
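
Those sizes fall out of the hyperparameters in Table 2.1 of the paper. If you want to sanity-check them, a standard back-of-the-envelope count for a decoder-only transformer is roughly 12 * n_layers * d_model^2 for the blocks, plus the embeddings (sketch in Python; the helper name is mine, not from the paper):

    # Rough parameter count for a decoder-only transformer: each block has
    # ~4*d^2 in attention and ~8*d^2 in the MLP, plus token/position
    # embeddings. Layer/width numbers are from Table 2.1 of the GPT-3 paper.
    def approx_params(n_layers, d_model, n_vocab=50257, n_ctx=2048):
        blocks = 12 * n_layers * d_model ** 2
        embeddings = (n_vocab + n_ctx) * d_model
        return blocks + embeddings

    print(f"{approx_params(24, 2048):,}")   # ~1.32e9,  i.e. GPT-3-1.3B
    print(f"{approx_params(96, 12288):,}")  # ~1.75e11, i.e. GPT-3-175B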



