release: 0.0.8

New features:
- weight-only quantization,
- integer matmul acceleration on CUDA.

Bug fixes:
- actually use float16 weights,
- avoid float16 overflows,
- correct device placement,
- robust serialization.