graph LR
A["24 GB"] -->|"×8"| B["192 GB"]
Install
gleam add viva_tensor
Use
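import viva_tensor/nf4

// `big_tensor` is a tensor you already have; run this inside a function body.
let small = nf4.quantize(big_tensor, nf4.default_config())
// `small` needs roughly 8x less memory (7.5x for NF4 in the table under Algorithms)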
Algorithms
flowchart LR
T[Tensor] --> Q{Quantize}
Q -->|4x| I[INT8]
Q -->|8x| N[NF4]
Q -->|8x| A[AWQ]
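
To make the INT8 branch of the diagram concrete, here is a minimal sketch of symmetric absmax quantization over a plain `List(Float)`. It is an illustration only: `quantize_int8` and `dequantize_int8` are hypothetical helpers, not part of the viva_tensor API, and the library's own INT8 path may work differently.

```gleam
import gleam/float
import gleam/int
import gleam/list

/// Hypothetical helper (not from viva_tensor): symmetric absmax quantization.
/// Returns the quantized integers in [-127, 127] and the scale needed to undo it.
pub fn quantize_int8(values: List(Float)) -> #(List(Int), Float) {
  // Largest magnitude in the input; stays 0.0 for an empty list.
  let max_abs =
    list.fold(values, 0.0, fn(acc, v) {
      float.max(acc, float.absolute_value(v))
    })

  // Map [-max_abs, max_abs] onto [-127, 127]; fall back to 1.0 to avoid dividing by zero.
  let scale = case max_abs == 0.0 {
    True -> 1.0
    False -> max_abs /. 127.0
  }

  let quantized = list.map(values, fn(v) { float.round(v /. scale) })
  #(quantized, scale)
}

/// Hypothetical helper: approximate reconstruction of the original values.
pub fn dequantize_int8(quantized: List(Int), scale: Float) -> List(Float) {
  list.map(quantized, fn(q) { int.to_float(q) *. scale })
}
```

Each value is stored in 8 bits instead of 32, plus one shared scale, which is where the 4x figure for INT8 comes from. The reconstruction is lossy: values are rounded to the nearest of 255 evenly spaced levels.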
| Method | Compression | Efficiency |
|---|---|---|
| INT8 | 4x | 40% |
| NF4 | 7.5x | 77% |
| AWQ | 7.7x | 53% |
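
The compression figures are consistent with a simple bits-per-weight estimate. As a rough check, assume an FP32 baseline (32 bits per weight), $b$ bits per quantized weight, and one FP16 scale stored per block of $B$ weights. The block sizes used below (64 for NF4, 128 for AWQ) are assumptions chosen for illustration, not values taken from viva_tensor.

$$
\text{ratio} = \frac{32}{b + 16/B}
$$

$$
\begin{aligned}
\text{INT8 (scale overhead ignored):} \quad & \frac{32}{8} = 4 \\
\text{NF4 } (b = 4,\ B = 64): \quad & \frac{32}{4 + 16/64} = \frac{32}{4.25} \approx 7.53 \\
\text{AWQ } (b = 4,\ B = 128): \quad & \frac{32}{4 + 16/128} = \frac{32}{4.125} \approx 7.76
\end{aligned}
$$

Under these assumptions the estimates land close to the 4x, 7.5x, and 7.7x in the table. The gap between those figures and the nominal 8x for 4-bit formats would then be the per-block scale overhead.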
Build
make test
make bench
Docs
docs/ contains the documentation in PT-BR, EN, and 中文.