Model Quantization

Published: February 14, 2025
| Version | Model Parameters | Quantized Formats | Model Size (FP16) | Model Size (INT8) | Model Size (INT4) |
|---|---|---|---|---|---|
| Llama 1 | 6.7B | FP16, INT8, INT4 | ~13.4 GB | ~6.7 GB | ~3.35 GB |
| | 13B | FP16, INT8, INT4 | ~26 GB | ~13 GB | ~6.5 GB |
| | 32.5B | FP16, INT8, INT4 | ~65 GB | ~32.5 GB | ~16.25 GB |
| | 65.2B | FP16, INT8, INT4 | ~130.4 GB | ~65.2 GB | ~32.6 GB |
| Llama 2 | 6.7B | FP16, INT8, INT4 | ~13.4 GB | ~6.7 GB | ~3.35 GB |
| | 13B | FP16, INT8, INT4 | ~26 GB | ~13 GB | ~6.5 GB |
| | 69B | FP16, INT8, INT4 | ~138 GB | ~69 GB | ~34.5 GB |
| Llama 3.0 | 8B | FP16, INT8, INT4 | ~16 GB | ~8 GB | ~4 GB |
| | 70B | FP16, INT8, INT4 | ~141.2 GB | ~70.6 GB | ~35.3 GB |
| Llama 3.1 | 8B | FP16, INT8, INT4 | ~16 GB | ~8 GB | ~4 GB |
| | 405B | FP16, INT8, INT4 | ~810 GB | ~405 GB | ~202.5 GB |
| Llama 3.2 | 1B | FP16, INT8, INT4 | ~2 GB | ~1 GB | ~0.5 GB |
| | 3B | FP16, INT8, INT4 | ~6 GB | ~3 GB | ~1.5 GB |
| | 11B (vision-enabled) | FP16, INT8, INT4 | ~22 GB | ~11 GB | ~5.5 GB |
| | 90B | FP16, INT8, INT4 | ~180 GB | ~90 GB | ~45 GB |
| Llama 3.3 | 70B (optimized) | FP16, INT8, INT4 | ~141.2 GB | ~70.6 GB | ~35.3 GB |
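The sizes in the table follow a simple rule of thumb: weight storage ≈ parameter count × bytes per weight (2 bytes for FP16, 1 for INT8, 0.5 for INT4). A minimal sketch in Python (the helper name is my own; real quantized checkpoints run slightly larger because of scale/zero-point metadata and layers such as embeddings that are often kept at higher precision):

```python
def quantized_size_gb(params_billions: float, bits: int) -> float:
    """Approximate weight size in GB: params (billions) x bytes per weight.

    Assumes every parameter is stored at `bits` bits, so
    1B params at 8 bits ~= 1 GB. Overhead from quantization
    scales and mixed-precision layers is ignored.
    """
    bytes_per_param = bits / 8
    return params_billions * bytes_per_param

# Reproduce a few rows of the table above.
for name, params in [("Llama 1 6.7B", 6.7), ("Llama 3.0 70B", 70.6)]:
    fp16 = quantized_size_gb(params, 16)
    int8 = quantized_size_gb(params, 8)
    int4 = quantized_size_gb(params, 4)
    print(f"{name}: FP16 ~{fp16:.1f} GB, INT8 ~{int8:.1f} GB, INT4 ~{int4:.2f} GB")
```

Running this reproduces the table's estimates, e.g. ~13.4 GB / ~6.7 GB / ~3.35 GB for the 6.7B model, which is why halving the bit width halves the memory footprint at each step.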