Hacker News

What's the correlation between parameter count and RAM usage? Will LLaMA-13B fit on my MacBook Air with 8 GB of RAM or am I stuck with 7B?


13B uses about 9 GB on my MacBook Air. If you have another machine (x86) with enough RAM, you can use it to convert the original LLaMA weights to GGML and give it a try. But the quantization step must be done on the MacBook itself.
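A rough back-of-envelope for the RAM/parameter-count relationship: the weights dominate, so resident memory is roughly parameter count × bytes per weight, plus runtime overhead and the context cache. This is a sketch, not llama.cpp's exact accounting; the ~4.5 bits/weight figure is an assumption about 4-bit block quantization (the per-block scale factors add overhead beyond the nominal 4 bits).

```python
# Back-of-envelope RAM estimate: memory ≈ n_params × bytes_per_weight.
# The effective bits/weight for 4-bit block quantization (~4.5) is an
# assumption; actual llama.cpp usage adds runtime and context overhead.

def estimate_weight_ram_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate resident size of the model weights in GiB."""
    return n_params * bits_per_weight / 8 / 1024**3

for name, n_params in [("7B", 7e9), ("13B", 13e9)]:
    fp16 = estimate_weight_ram_gb(n_params, 16)
    q4 = estimate_weight_ram_gb(n_params, 4.5)
    print(f"LLaMA-{name}: fp16 ~{fp16:.1f} GiB, 4-bit ~{q4:.1f} GiB")
```

By this estimate, 4-bit 13B weights come to roughly 6.8 GiB, which is consistent with ~9 GB total once context and runtime overhead are added, and with 13B not fitting comfortably in 8 GB.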

It may be more feasible for you to use 7B with a larger context. For some "autocompletion" experiments with Python code I had to extend the context to 2048 tokens (+1-1.5 GB).
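The extra 1-1.5 GB from a longer context is mostly the KV cache, which grows linearly with context length. A sketch of the arithmetic, assuming LLaMA-7B shapes (32 layers, 4096 embedding dim); the element size is parameterized since it depends on whether the cache is stored as f16 or f32:

```python
def kv_cache_bytes(n_ctx: int, n_layers: int = 32, n_embd: int = 4096,
                   bytes_per_elt: int = 2) -> int:
    # Keys and values: 2 tensors per layer, each n_ctx × n_embd elements.
    # Defaults assume LLaMA-7B shapes and an f16 cache (2 bytes/element).
    return 2 * n_layers * n_ctx * n_embd * bytes_per_elt

for n_ctx in (512, 2048):
    gib = kv_cache_bytes(n_ctx) / 1024**3
    print(f"n_ctx={n_ctx}: KV cache ~{gib:.2f} GiB")
```

At 2048 tokens this gives ~1 GiB for an f16 cache (twice that for f32), in line with the 1-1.5 GB figure above.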



