Hacker News

I tried that with an RTX 4090 as the primary card and a 3090 as an eGPU over Thunderbolt. It works, but inference is very slow, presumably because it has to pump all that data back and forth between the two (Thunderbolt isn't fast enough to keep up with even the 3090 by itself in games). In fact, even running a 30B model across two GPUs in 8-bit mode like that was slower than running it on one GPU in 4-bit.
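A quick back-of-the-envelope sketch (weights only, ignoring activations, KV cache, and framework overhead) shows why the 4-bit single-GPU setup can win here: at 8 bits, a 30B model's weights alone exceed one 24 GB card, forcing the slow split over Thunderbolt, while at 4 bits they fit on a single card:

```python
# Rough weight-memory estimate; ignores activations, KV cache,
# and framework overhead.
def weight_gb(params_billions: float, bits_per_weight: int) -> float:
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

gb_8bit = weight_gb(30, 8)  # 30.0 GB -> exceeds one 24 GB card, must split
gb_4bit = weight_gb(30, 4)  # 15.0 GB -> fits entirely on one 24 GB card
print(gb_8bit, gb_4bit)
```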

My takeaway is that if you actually want to use multiple GPUs, you need hardware designed to accommodate that, and most consumer-grade stuff, even high-end, is not built with two physically large GPUs in mind.



I've got two RTX 4090s on an EATX motherboard. I've been using them to run the full 13B model unquantized with a good deal of success, getting about 20 tokens/s.


What is your setup for cooling? I don't think I'd want to stick another 4090-size card in mine with just air cooling...


I have AIO liquid cooling for both cards. The radiators are annoying though, I might convert it to a custom loop if I ever add a third card.


Looking at the top-end H100 80GB systems with NVLink from HPC vendors, it occurred to me that we are about to swing back to massive, almost mainframe-like form-factor systems built around a giant bus, like the old expandable Q-bus of the '80s, but this time for GPUs.

What I mean is that they currently offer systems with 8x cards, but given the compute requirements of these huge LLMs, systems with 32+ cards all on a dedicated memory bus (NVLink) are probably what will be needed as weight sizes expand. This is all just for inference, not even training; the same probably holds for training too, which wants the best possible interconnect between these same monster systems.
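As a rough illustration of why 8 cards stops being enough as weights grow (a sketch counting weights only, no KV cache or activations; the 1T parameter count below is a hypothetical, not any specific model):

```python
import math

def gpus_needed(params_billions: float, bits_per_weight: int,
                vram_gb: float = 80) -> int:
    """Minimum GPU count to hold just the model weights."""
    weights_gb = params_billions * 1e9 * bits_per_weight / 8 / 1e9
    return math.ceil(weights_gb / vram_gb)

# A hypothetical 1T-parameter model in fp16 on 80 GB H100s:
print(gpus_needed(1000, 16))  # 25 cards for the weights alone
```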

I'm dreaming that there might eventually be a distributed, eventually-consistent partial-training algorithm that would democratize the creation of these models.

As for smaller-scale individual systems for inference: if one has the resources, is fairly technical, and can actually utilize such technology, then perhaps in 5-10 years the wealthy might buy $50K+ units that get installed in their homes.

Really incredible developments, and very quickly. Apologies for the potentially inappropriately long rant in reply to the previous comment.


The other possibility is that we'll get cards designed very specifically for LLMs, basically ditching everything that isn't strictly necessary for the sake of squeezing in more compute/VRAM, and perhaps optimizing around int4/int8 (the latter is apparently "good enough" for training?).
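For a sense of what int8 buys, here is a minimal NumPy sketch of symmetric per-tensor int8 quantization (an illustration only, not any particular card's or library's scheme): storage drops 4x versus fp32, at the cost of a rounding error bounded by half the quantization step.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor int8 quantization: x ~= q * scale."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

np.random.seed(0)
w = np.random.randn(1024).astype(np.float32)
q, s = quantize_int8(w)
err = np.abs(dequantize(q, s) - w).max()
# int8 storage is 4x smaller than fp32; max error <= scale / 2
```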



