Version | Model Parameters | Quantized Format | Model Size (FP16) | Model Size (INT8) | Model Size (INT4) |
Llama 1 | 6.7B | FP16, INT8, INT4 | ~13.4 GB | ~6.7 GB | ~3.35 GB |
Llama 1 | 13B | FP16, INT8, INT4 | ~26 GB | ~13 GB | ~6.5 GB |
Llama 1 | 32.5B | FP16, INT8, INT4 | ~65 GB | ~32.5 GB | ~16.25 GB |
Llama 1 | 65.2B | FP16, INT8, INT4 | ~130.4 GB | ~65.2 GB | ~32.6 GB |
Llama 2 | 6.7B | FP16, INT8, INT4 | ~13.4 GB | ~6.7 GB | ~3.35 GB |
Llama 2 | 13B | FP16, INT8, INT4 | ~26 GB | ~13 GB | ~6.5 GB |
Llama 2 | 69B | FP16, INT8, INT4 | ~138 GB | ~69 GB | ~34.5 GB |
Llama 3.0 | 8B | FP16, INT8, INT4 | ~16 GB | ~8 GB | ~4 GB |
Llama 3.0 | 70B | FP16, INT8, INT4 | ~141.2 GB | ~70.6 GB | ~35.3 GB |
Llama 3.1 | 8B | FP16, INT8, INT4 | ~16 GB | ~8 GB | ~4 GB |
Llama 3.1 | 405B | FP16, INT8, INT4 | ~810 GB | ~405 GB | ~202.5 GB |
Llama 3.2 | 1B | FP16, INT8, INT4 | ~2 GB | ~1 GB | ~0.5 GB |
Llama 3.2 | 3B | FP16, INT8, INT4 | ~6 GB | ~3 GB | ~1.5 GB |
Llama 3.2 | 11B (vision-enabled) | FP16, INT8, INT4 | ~22 GB | ~11 GB | ~5.5 GB |
Llama 3.2 | 90B | FP16, INT8, INT4 | ~180 GB | ~90 GB | ~45 GB |
Llama 3.3 | 70B optimized | FP16, INT8, INT4 | | | |
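
The sizes in the table appear to follow from simple arithmetic: parameter count multiplied by bytes per weight (2 bytes for FP16, 1 for INT8, 0.5 for INT4), expressed in decimal gigabytes. The sketch below reproduces that calculation. The names `BITS_PER_WEIGHT`, `model_size_gb`, and `size_row` are illustrative, not from any library, and the result covers raw weights only; real quantized files are usually somewhat larger because of scale factors, metadata, and layers kept at higher precision.

```python
# Bits used to store one weight in each format listed in the table.
BITS_PER_WEIGHT = {"FP16": 16, "INT8": 8, "INT4": 4}


def model_size_gb(params_billions: float, fmt: str) -> float:
    """Approximate weight storage in decimal GB (1 GB = 1e9 bytes).

    Assumes every parameter is stored at the same precision and ignores
    quantization overhead such as scales and zero points.
    """
    bits = BITS_PER_WEIGHT[fmt]
    # (params_billions * 1e9 weights) * (bits / 8 bytes) / (1e9 bytes per GB)
    return params_billions * bits / 8


def size_row(params_billions: float) -> dict:
    """Return the estimated size for every format, rounded to 2 decimals."""
    return {
        fmt: round(model_size_gb(params_billions, fmt), 2)
        for fmt in BITS_PER_WEIGHT
    }


if __name__ == "__main__":
    # Parameter counts taken from the table above.
    for name, params in [("8B", 8.0), ("13B", 13.0)]:
        print(name, size_row(params))
```

Because the estimate scales linearly with the parameter count, any row can be checked by calling `size_row` with that row's parameter count in billions.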
Tags: Model Quantization
Published: February 14, 2025