
finetuning is easily within reach for llava-mistral or something like that — just rent an A100 or two for ~$20 and you'll have your finetuned model
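The "$20" figure roughly checks out under common cloud pricing. A back-of-the-envelope sketch — the hourly rate and run length here are assumptions, not quotes from any provider:

```python
# Back-of-the-envelope cost for a short LoRA fine-tune on rented A100s.
# All numbers are assumptions: spot/community-cloud A100 rates are often
# in the $1-2/GPU-hour range, and a small LoRA run can finish in hours.
hourly_rate_usd = 1.50  # per A100-hour (assumed)
num_gpus = 2
hours = 6               # assumed run length
cost = hourly_rate_usd * num_gpus * hours
print(f"~${cost:.0f}")  # ~$18
```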


LLaVA is even less open than PaliGemma: it is trained on CC BY-NC 4.0 data, so it can't be used commercially. I emailed the team about it. At least with PaliGemma the base pt models are available to be used commercially if you fine-tune them yourself


llava-mistral v1.6 is apache, my friend :)


Where did you get this info? I would love to use it, but I went through their materials and they say they use a lot of the same data as 1.5, and the acknowledgement section of their site says: "The dataset is CC BY NC 4.0 (allowing only non-commercial use) and models trained using the dataset should not be used outside of research purposes"

Would love it if I could use LLaVA, but I don't want to spend the money on the ~18 A100s for 24 hrs that they use for training it. A lot of the models trained on CC BY-NC 4.0 datasets, like VILA, aren't available for commercial use unless you train the model yourself. This is the first time a research group or company has at least been open with this info: they specifically say only the pt models can be used commercially, and only if you fine-tune them yourself.


The author has added the weights on Hugging Face under the Apache-2.0 license [0]. Versions prior to 1.6 were not listed under Apache. The repo has no code; it is a weights-only repository.

[0]: https://huggingface.co/liuhaotian/llava-v1.6-mistral-7b


Oh awesome, thanks so much, didn't see this


you can try out https://huggingface.co/datasets/BAAI/SVIT which appears to allow commercial use. I haven't tried it yet, but it seems to be an option.

If you build a smaller model, you should need much less than 18 A100s for 24 hours, though I don't disagree you'll need at least a few.


yes, their code is, but their dataset isn't — which matters if you want to use the model weights without training them yourself


Have you seen any good documentation anywhere on how to do that?


Here's a tutorial https://wandb.ai/byyoung3/ml-news/reports/How-to-Fine-Tune-L...

There's not really a super-easy-to-use software solution yet, but a few different ones have cropped up. For now you'll have to read papers to get the training recipes.

- https://github.com/haotian-liu/LLaVA/blob/main/scripts/finet...

- https://github.com/InternLM/xtuner/tree/main

- https://github.com/TinyLLaVA/TinyLLaVA_Factory

Those are pointers in the right direction, along with:

https://arxiv.org/abs/2304.08485
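If you follow those recipes, a lot of the work is getting your data into LLaVA's conversation format: a JSON list of records, each with an image path and a human/gpt conversation. A minimal sketch of one record (the id, path, and text are made up; check the repo's data docs for the exact fields):

```python
import json

# One record in LLaVA-style instruction-tuning format. The "<image>" token
# marks where the image features are injected into the prompt.
record = {
    "id": "sample-0001",                # hypothetical id
    "image": "images/sample-0001.jpg",  # hypothetical path
    "conversations": [
        {"from": "human", "value": "<image>\nWhat is shown in this picture?"},
        {"from": "gpt", "value": "A short grounded answer would go here."},
    ],
}

# The training file is a JSON list of such records.
with open("train.json", "w") as f:
    json.dump([record], f, indent=2)
```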


Thanks!


the linked resources are good; the llava repo also has pretty good guides on how to train, which you can adapt to SFT
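For reference, the LLaVA repo's fine-tuning entry point is a DeepSpeed launch. A rough sketch of what an invocation looks like — the flag names are recalled from the repo's `scripts/finetune_lora.sh` and may have changed, and all paths here are placeholders, so verify against the current script before running:

```shell
# Hypothetical LoRA fine-tune launch adapted from LLaVA's finetune scripts.
# Check scripts/finetune_lora.sh in the repo for the authoritative flag set.
deepspeed llava/train/train_mem.py \
    --deepspeed ./scripts/zero3.json \
    --lora_enable True \
    --model_name_or_path liuhaotian/llava-v1.6-mistral-7b \
    --data_path ./data/train.json \
    --image_folder ./data/images \
    --vision_tower openai/clip-vit-large-patch14-336 \
    --output_dir ./checkpoints/llava-mistral-lora \
    --num_train_epochs 1 \
    --per_device_train_batch_size 4 \
    --learning_rate 2e-4
```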




