This project is mirrored from https://github.com/SJTU-IPADS/PowerInfer.git. Pull mirroring updated Sep 19, 2024.
fix/vram-budget-inaccuracy (diverged from upstream) · 4d80abd3 · wip: disable vram budget hard limit temporarily · Dec 28, 2023
model-mistral · f270a288 · support dense Mistral model · Feb 05, 2024
144-cmake-317-or-higher-is-required-the-repository-asks-for-version-3134 · 764347f2 · Fix CMake requirement in README · Feb 18, 2024
fix-compile-worktree · abf4aa93 · Fix compiling issue under git worktrees · Feb 20, 2024
fix/cuda-warning-options · 8edcb46c · fix: cuda host compiler options at wrong position · Mar 07, 2024
full-gpu-graph · efaf9264 · calc gpu_idx sum at load time · Mar 08, 2024
axpy-cublas · 584155a7 · chore: update default n_batch · Mar 26, 2024
q4_mulmat_sparse · 978b0d37 · quantized version for `mul_mat_sparse` vec · Mar 27, 2024
model-bamboo · 70c1bd3a · Add news in README.md · Mar 28, 2024
quantize-support · 4f126960 · fix syntax error · Mar 31, 2024
moe-dense · 84c33451 · fix computation graph · Apr 01, 2024
fix/axpy-dense · fb91425f · Remove axpy dense op · Apr 02, 2024
fix/offload-ffn-norm · 38ce80a2 · fix: offload ffn norm weights · Apr 07, 2024
feat-moe · 09d79ec0 · minimal impl. of gpu_idx generation for moe model · Apr 08, 2024
news · 1cf13fc2 · fix typo · Jun 11, 2024
readme · 4d45db92 · minor · Jun 11, 2024