ollama

Commit Graph

Author	SHA1	Message	Date
peanut256	a189810df6	Determine max VRAM on macOS using `recommendedMaxWorkingSetSize` (#2354) * read iogpu.wired_limit_mb on macOS Fix for https://github.com/ollama/ollama/issues/1826 * improved determination of available vram on macOS read the recommended maximal vram on macOS via Metal API * Removed macOS-specific logging * Remove logging from gpu_darwin.go * release Core Foundation object fixes a possible memory leak	2 years ago
Daniel Hiltgen	7427fa1387	Fix up the CPU fallback selection The memory changes and multi-variant change had some merge glitches I missed. This fixes them so we actually get the cpu llm lib and best variant for the given system.	2 years ago
Daniel Hiltgen	39928a42e8	Always dynamically load the llm server library This switches darwin to dynamic loading, and refactors the code now that no static linking of the library is used on any platform	2 years ago
Daniel Hiltgen	d88c527be3	Build multiple CPU variants and pick the best This reduces the built-in linux version to not use any vector extensions which enables the resulting builds to run under Rosetta on MacOS in Docker. Then at runtime it checks for the actual CPU vector extensions and loads the best CPU library available	2 years ago
Jeffrey Morgan	c336693f07	calculate overhead based number of gpu devices (#1875)	2 years ago
Jeffrey Morgan	08f1e18965	Offload layers to GPU based on new model size estimates (#1850) * select layers based on estimated model memory usage * always account for scratch vram * dont load +1 layers * better estmation for graph alloc * Update gpu/gpu_darwin.go Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com> * Update llm/llm.go Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com> * Update llm/llm.go * add overhead for cuda memory * Update llm/llm.go Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com> * fix build error on linux * address comments --------- Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>	2 years ago
Jeffrey Morgan	c7ea8f237e	set `num_gpu` to 1 only by default on darwin arm64 (#1771)	2 years ago
Daniel Hiltgen	a2ad952440	Fix windows system memory lookup This refines the gpu package error handling and fixes a bug with the system memory lookup on windows.	2 years ago
Daniel Hiltgen	d966b730ac	Switch windows build to fully dynamic Refactor where we store build outputs, and support a fully dynamic loading model on windows so the base executable has no special dependencies thus doesn't require a special PATH.	2 years ago
Daniel Hiltgen	7555ea44f8	Revamp the dynamic library shim This switches the default llama.cpp to be CPU based, and builds the GPU variants as dynamically loaded libraries which we can select at runtime. This also bumps the ROCm library to version 6 given 5.7 builds don't work on the latest ROCm library that just shipped.	2 years ago
Daniel Hiltgen	6558f94ed0	Fix darwin intel build	2 years ago
Daniel Hiltgen	35934b2e05	Adapted rocm support to cgo based llama.cpp	2 years ago
Daniel Hiltgen	d4cd695759	Add cgo implementation for llama.cpp Run the server.cpp directly inside the Go runtime via cgo while retaining the LLM Go abstractions.	2 years ago

12 Commits (23ebe8fe11995cfd99cafcf9871fce9a9a120fd1)