185 Commits (main)

Author SHA1 Message Date
Jesse Gross a343ae53a4 ggml: Use ordinal IDs for AMD GPUs on Linux when UUID is unavailable 8 months ago
Jesse Gross 79f6376f5b ggml: No-alloc mode 8 months ago
Michael Yang fa7776fd24 gpt-oss (#11672) 8 months ago
Daniel Hiltgen 25911a6e6b mac: disable bf16 on unsupported OS versions (#11585) 8 months ago
Oliver Simons ea85e27bbd Increase performance for Gemma3n models on NVGPUs by enabling CUDA Graph execution (#11525) 8 months ago
Jesse Gross 35fda7b4af ggml: Report ordinal IDs for AMD GPUs on Windows 9 months ago
Michael Yang 73b642e6f3 add new gemma model (#11204) 9 months ago
Daniel Hiltgen 1c6669e64c Re-remove cuda v11 (#10694) 9 months ago
Jeffrey Morgan 6baf1e31e2 Revert "Revert "ggml: Export GPU UUIDs" (#11115)" (#11117) 10 months ago
Jeffrey Morgan ed567ef43b Revert "ggml: Export GPU UUIDs" (#11115) 10 months ago
Jesse Gross aaa7818000 ggml: Export GPU UUIDs 11 months ago
Parth Sareen 884d26093c llama: add minimum memory for grammar (#10820) 10 months ago
Jesse Gross 6db8a3771c ggml: Report graph memory for failed allocations 11 months ago
DarkCaster e6a800ca11 llama: fix incorrect initialization of C.struct_common_sampler_cparams.penalty_present (#10779) 10 months ago
Michael Yang 333e360422 model: handle multiple eos tokens (#10577) 11 months ago
Bruce MacDonald 0aa8b371dd model: add Qwen2.5-VL support (#10385) 11 months ago
Michael Yang 23125648b8 chore: update mllama to use ollama engine (#10637) 11 months ago
Parth Sareen 8cc33f4c2b llama: fix memory leak for grammar (#10696) 11 months ago
Jeffrey Morgan f46df4e5d2 llama: fix defrag patch to defragment when no slots are available (#10695) 11 months ago
Jeffrey Morgan 4b903f088a llama: fix crash on snowflake embedding model (#10690) 11 months ago
Jeffrey Morgan 0cefd46f23 llama: update to commit de4c07f93 (#10655) 11 months ago
frob ecf14a220f llama: allocate grammar buffer based on schema length (#10649) 11 months ago
Jeffrey Morgan fa9973cd7f api: remove unused sampling parameters (#10581) 11 months ago
Daniel Hiltgen 424810450f Move quantization to new backend (#10363) 11 months ago
Jeffrey Morgan 3b2d2c8326 api: remove unused or unsupported api options (#10574) 11 months ago
Jeffrey Morgan 913905028b all: fix cgo compiler warnings on windows (#10563) 11 months ago
Jesse Gross c2f5d6662b ollamarunner: Re-enable worst case graph preallocation. 11 months ago
Jeffrey Morgan 8dd12c873d llama: update to commit e1e8e099 (#10513) 11 months ago
Jeffrey Morgan e9e5f61c45 llama: update to commit 2016f07b (#10352) 11 months ago
Parth Sareen a53d744b01 llama: remove model loading for grammar (#10096) 11 months ago
Jeffrey Morgan dc264be6ff ml: add missing cmake property and remove additional CMakeLists.txt (#10310) 12 months ago
Jeffrey Morgan 943464ccb8 llama: update to commit 71e90e88 (#10192) 12 months ago
Jesse Gross ccb7eb8135 ggml: Free ggml_backend_buffer_t when releasing buffer 12 months ago
Bruce MacDonald 6bd0a983cd model: support for mistral-small in the ollama runner 1 year ago
Bruce MacDonald 66b2539238 runner: clear cache when shift is not possible (#9433) 1 year ago
saman-amd ead27aa9fe Add gfx1200 & gfx1201 support on linux (#9878) 1 year ago
Patrick Devine ef378ad673 gemma3 quantization (#9776) 1 year ago
Michael Yang 9e4642e9b3 ollama debug tensor 1 year ago
Jeffrey Morgan e093db92c4 sample: temporarily use grammars for constrained generation in new engine (#9586) 1 year ago
Jeffrey Morgan 4289c74359 llama: fix kv loading on snowflake-arctic-embed models (#9536) 1 year ago
Michael Yang 05a01fdecb ml/backend/ggml: consolidate system info logging 1 year ago
Michael Yang ba7d31240e fix: own lib/ollama directory 1 year ago
Michael Yang 657685e85d fix: replace deprecated functions 1 year ago
Jeffrey Morgan 98d44fa39d llama: add phi4 mini support (#9403) 1 year ago
Michael Yang a59f665235 ml/backend/ggml: fix debug logging 1 year ago
Jeffrey Morgan d7d7e99662 llama: update llama.cpp vendor code to commit d7cfe1ff (#9356) 1 year ago
Jeffrey Morgan 3ad4bc8afe llama: removed unused 'vendoring' file (#9351) 1 year ago
Jeffrey Morgan 8c13cfa4dd ml/backend/ggml: fix crash on windows paths with wide characters (#9305) 1 year ago
Michael Yang bda4ef6c56 reorder patches 1 year ago
Jeffrey Morgan d2eb226c91 llama: add patch to fix ggml backend reg on Linux with utf-8 characters in the path (#9159) 1 year ago