13B uses about 9GB on my MacBook Air. If you have another machine (x86) with enough RAM to convert the original LLaMA weights to GGML, you can give it a try, but the quantization step must be done on the MacBook.
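Roughly, this is the split I have in mind, assuming the usual llama.cpp tooling (`convert-pth-to-ggml.py` and the `quantize` binary); exact arguments and file names may differ between versions:

```sh
# On the x86 machine (needs enough RAM for the full f16 weights):
python3 convert-pth-to-ggml.py models/13B/ 1   # 1 = f16 output

# Copy models/13B/ggml-model-f16.bin* to the MacBook, then quantize there:
./quantize ./models/13B/ggml-model-f16.bin ./models/13B/ggml-model-q4_0.bin 2   # 2 = q4_0
```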
It may be more feasible for you to use 7B with a larger context. For some "autocompletion" experiments with Python code I had to extend the context to 2048 tokens (roughly +1-1.5GB of RAM).
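For reference, a sketch of running with an extended context via the `-c`/`--ctx_size` option of `main` (assuming a recent enough build that exposes it as a flag; the prompt and paths are just examples):

```sh
# Run the quantized 7B model with a 2048-token context window
./main -m ./models/7B/ggml-model-q4_0.bin -c 2048 -n 256 -p "def fibonacci(n):"
```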