Show HN: Mistral NeMo finetuning fits in Colab
2 by danielhanchen | 0 comments on Hacker News.
Managed to make Mistral NeMo 12B fit in a free Google Colab with a Tesla T4 GPU (16GB) for 4-bit QLoRA finetuning! Also cut VRAM usage by 60% and made it 2x faster. It should work in under 12GB of VRAM too!
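
For a rough sense of why 4-bit matters here, a back-of-the-envelope sketch of the memory needed for the weights alone. This is only an estimate under stated assumptions: it takes a nominal 12 billion parameters, counts 1 GB as 10^9 bytes, and ignores quantization metadata, LoRA adapter weights, optimizer state, activations, and CUDA overhead, so real usage will be higher. The helper name `weight_memory_gb` is made up for illustration and is not part of any library.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: a nominal 12B parameter count, rounded for illustration.
params = 12e9

fp16_gb = weight_memory_gb(params, 16)  # 16-bit weights
int4_gb = weight_memory_gb(params, 4)   # 4-bit quantized weights

print(f"16-bit weights: ~{fp16_gb:.1f} GB")
print(f"4-bit weights:  ~{int4_gb:.1f} GB")
```

Under these assumptions the 16-bit weights alone come to about 24 GB, which would not fit on a 16GB T4, while the 4-bit weights come to about 6 GB, leaving the rest of the card for adapters, activations, and optimizer state.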