12 Commits (23ebe8fe11995cfd99cafcf9871fce9a9a120fd1)

Author SHA1 Message Date
peanut256 a189810df6 Determine max VRAM on macOS using `recommendedMaxWorkingSetSize` (#2354) 2 years ago
Daniel Hiltgen 7427fa1387 Fix up the CPU fallback selection 2 years ago
Daniel Hiltgen 39928a42e8 Always dynamically load the llm server library 2 years ago
Daniel Hiltgen d88c527be3 Build multiple CPU variants and pick the best 2 years ago
Jeffrey Morgan c336693f07 calculate overhead based number of gpu devices (#1875) 2 years ago
Jeffrey Morgan 08f1e18965 Offload layers to GPU based on new model size estimates (#1850) 2 years ago
Jeffrey Morgan c7ea8f237e set `num_gpu` to 1 only by default on darwin arm64 (#1771) 2 years ago
Daniel Hiltgen a2ad952440 Fix windows system memory lookup 2 years ago
Daniel Hiltgen d966b730ac Switch windows build to fully dynamic 2 years ago
Daniel Hiltgen 7555ea44f8 Revamp the dynamic library shim 2 years ago
Daniel Hiltgen 6558f94ed0 Fix darwin intel build 2 years ago
Daniel Hiltgen 35934b2e05 Adapted rocm support to cgo based llama.cpp 2 years ago
Daniel Hiltgen d4cd695759 Add cgo implementation for llama.cpp 2 years ago