ollama

Commit Graph

Author	SHA1	Message	Date
Daniel Hiltgen	20c3266e94	Reduce default parallelism to 1 (#11330 ) The current scheduler algorithm of picking the parallelism based on available VRAM complicates the upcoming dynamic layer memory allocation algorithm. This changes the default to 1, with the intent going forward that parallelism is explicit and will no longer be dynamically determined. Removal of the dynamic logic will come in a follow-up.	9 months ago
Daniel Hiltgen	34088dbcfb	API/CLI context enhancements (#11331 ) * API: expose context size of loaded models * CLI: add context UX This adds a column in the ps output to show the model's context size.	9 months ago
Parth Sareen	43107b15b9	add `tool_name` to api.md (#11326 )	9 months ago
Parth Sareen	1f91cb0c8c	template: add tool result compatibility (#11294 )	9 months ago
Daniel Hiltgen	12d8ad0d38	ci: modularization (#11324 ) switch a few constants to variables	9 months ago
Jesse Gross	592d21e7db	Revert "ggml: Temporarily disable reporting UUIDs" The root cause was an unclean upgrade - this code is fine. This reverts commit `45f216a9c7`.	9 months ago
Jeffrey Morgan	5a08b01f5b	readme: update Ollama icon size	9 months ago
Daniel Hiltgen	4f473e224c	int: add performance integration tests (#11173 ) usage example: `go test --tags=integration,perf -count 1 ./integration -v -timeout 1h -run TestModelsPerf 2>&1 | tee int.log`; `cat int.log | grep MODEL_PERF_HEADER | cut -f2- -d: > perf.csv`; `cat int.log | grep MODEL_PERF_DATA | cut -f2- -d: >> perf.csv`	9 months ago
Daniel Hiltgen	9d60bb44cf	doc: add NVIDIA blackwell to supported list (#11307 )	9 months ago
Vincent RAMPAL	f371260e75	Update base image to Ubuntu 24.04 LTS (#9681 )	9 months ago
Daniel Hiltgen	c9e6d7719e	doc: Update link for mac install (#11288 ) Favor the dmg now.	9 months ago
Daniel Hiltgen	2c4ce40334	mimic logs for layers on new engine (#11278 ) This adds some extra logs to make the new engine a bit more consistent with the llama engine.	9 months ago
XuKecheng	5d8c173529	readme: add NativeMind to community integrations (#11242 )	9 months ago
Jeffrey Morgan	44b17d2bfa	tools: fix parsing tool calls with empty arguments, missing required fields (#11233 )	9 months ago
Attogram Project	3b8b692218	readme: add ollama-bash-toolshed to community integrations (#11224 )	9 months ago
Michael Yang	4129af9205	chore: cleanup comments + unused vars (#11225 )	9 months ago
Jesse Gross	45f216a9c7	ggml: Temporarily disable reporting UUIDs This is causing segfaults, so disable it. Currently UUIDs are only used for debugging purposes, although they are planned to be used in additional ways in the future. Bug #11211	9 months ago
Michael Yang	d0b32def60	skip quantizing per_layer_token_embd (#11207 ) this tensor isn't compatible with cuda when quantized to q4_K so skip it	9 months ago
Daniel Hiltgen	11ffc36157	ci: multi-stage release process (#11001 )	9 months ago
Jeffrey Morgan	ba04902670	fs/ggml: add multiplier in graph estimates (#11208 )	9 months ago
Jeffrey Morgan	3944602f51	fs/ggml: add missing architecture to OllamaEngineRequired() (#11206 )	9 months ago
Michael Yang	73b642e6f3	add new gemma model (#11204 ) * update patches * cherry pick metal mean kernel * cherry pick cuda mean kernel * gemma3n	9 months ago
Daniel Hiltgen	ad118d8b13	ci: arm sbsa fixes (#11194 )	9 months ago
Daniel Hiltgen	f08534137b	ci: include dependencies	9 months ago
Daniel Hiltgen	4b4a90f233	ci: pick up arm sbsa cuda libs (#11192 )	9 months ago
Daniel Hiltgen	03274a6b2f	ci: recombine linux amd64 binaries (#11188 ) Glue the rocm and archive builds back together.	9 months ago
Devon Rifkin	cc6463ebca	Merge pull request #10238 from ollama/drifkin/array-head-count-simple ggml: fix crash for array head counts	9 months ago
Daniel Hiltgen	405d2f628f	ci: rocm parallel builds on windows (#11187 ) The preset CMAKE_HIP_FLAGS isn't getting used on Windows. This passes the parallel flag in through the C/CXX flags, along with suppression for some log spew warnings to quiet down the build.	9 months ago
Devon Rifkin	a3f7dd3e98	Merge branch 'main' into drifkin/array-head-count-simple	9 months ago
Daniel Hiltgen	c85c0ebf89	CI: switch windows to vs 2022 (#11184 ) * CI: switch windows to vs 2022 * ci: fix regex match	9 months ago
Daniel Hiltgen	10a8e04a8d	avoid context overflow (#11175 ) For smaller context models, make sure we do not exceed the training size.	9 months ago
Daniel Hiltgen	1c6669e64c	Re-remove cuda v11 (#10694 ) * Re-remove cuda v11 Revert the revert - drop v11 support requiring drivers newer than Feb 23 This reverts commit `c6bcdc4223`. * Simplify layout With only one version of the GPU libraries, we can simplify things down somewhat. (Jetsons still require special handling) * distinct sbsa variant for linux arm64 This avoids accidentally trying to load the sbsa cuda libraries on a jetson system which results in crashes. * temporarily prevent rocm+cuda mixed loading	9 months ago
Devon Rifkin	b2b270ad5d	Merge branch 'main' into drifkin/array-head-count-simple	9 months ago
AJ	2bb69b40c7	readme: add ai-hub to community integrations (#11169 )	9 months ago
Daniel Hiltgen	65bff664cb	build speedups (#11142 ) Enable parallel building of the GPU architectures.	9 months ago
Michael Yang	c088ac0e79	convert: utility for merging tensors (#11069 )	9 months ago
Michael Yang	0a066cfd91	Reapply "feat: incremental gguf parser (#10822 )" (#11114 ) (#11119 ) * Reapply "feat: incremental gguf parser (#10822)" (#11114) This reverts commit `a6e64fbdf2`. * fix older ggufs	9 months ago
Jesse Gross	87b7af6cee	ggml: Check return status for computation. We don't check the return status after computing the graph, which can silently lead to bad outputs if we try to keep going and future computation succeeds. This appears to happen in certain cases on Apple M2 devices. Fixes #11070	9 months ago
Daniel Hiltgen	f2527b08fb	int: add coverage for older models (#11137 ) Verified these fail on 0.9.1 and pass on HEAD.	9 months ago
Jeffrey Morgan	8bcb3125c1	benchmark: remove unused benchmark test (#11120 ) Removes a test under benchmark/ that is unused	10 months ago
Jeffrey Morgan	6baf1e31e2	Revert "Revert "ggml: Export GPU UUIDs" (#11115 )" (#11117 ) Reverts PR #11115. The original change was mistakenly reverted instead of #10822	10 months ago
Jeffrey Morgan	ed567ef43b	Revert "ggml: Export GPU UUIDs" (#11115 ) This reverts commit `aaa7818000`.	10 months ago
Jeffrey Morgan	a6e64fbdf2	Revert "feat: incremental gguf parser (#10822 )" (#11114 ) This reverts commit `6b04cad7e8`.	10 months ago
曹家巧	60cfa2a203	cache: fix comment function name in cache.go (#11110 )	10 months ago
Jeffrey Morgan	55bbf3b4a1	tools: return empty arguments object instead of null (#11113 )	10 months ago
Jeffrey Morgan	6bda1d2479	tools: fix parsing tool calls without any parameters (#11101 ) Fixes issue where tool calls that don't expect any parameters were not being parsed. This also fixes two additional issues: one where 2+ tool calls would not be correctly parsed, and cases where tool calls with invalid parameters would still get parsed	10 months ago
Jeffrey Morgan	9e125d884c	model: treat 'user defined' tokens as special tokens (#11077 )	10 months ago
Michael Yang	a6fbfc880c	gguf: fix write order (#11068 ) * ggml: test write gguf order * ggml: fix write tensor order	10 months ago
NGC13009	502028968d	readme: add ollama-launcher to community integrations (#11080 )	10 months ago
Phil	5a8eb0e151	readme: add GPTranslate to community integrations (#11071 )	10 months ago
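
The usage example in commit `4f473e224c` collects tagged lines from the test log into `perf.csv` with `grep` and `cut -f2- -d:`. A minimal Python sketch of the same filtering follows, for cases where a shell is not available. The function names and the sample log lines are hypothetical; only the `MODEL_PERF_HEADER` and `MODEL_PERF_DATA` markers come from the commit message.

```python
def extract_tagged(log_lines, marker):
    """Approximate `grep MARKER | cut -f2- -d:` on an iterable of lines.

    Lines containing the marker are kept, and everything up to and
    including the first ':' is dropped. Like cut without -s, a matching
    line that has no ':' is passed through unchanged.
    """
    out = []
    for line in log_lines:
        line = line.rstrip("\n")
        if marker not in line:
            continue
        _head, sep, rest = line.partition(":")
        out.append(rest if sep else line)
    return out


def build_perf_csv(log_lines):
    """Header rows first, then data rows, matching the `>` then `>>` order."""
    lines = list(log_lines)
    return (extract_tagged(lines, "MODEL_PERF_HEADER")
            + extract_tagged(lines, "MODEL_PERF_DATA"))


if __name__ == "__main__":
    # Hypothetical log content; the real column layout is not shown in the commit.
    sample = [
        "=== RUN   TestModelsPerf",
        "MODEL_PERF_HEADER:col_a,col_b",
        "unrelated log line",
        "MODEL_PERF_DATA:1,2",
    ]
    print("\n".join(build_perf_csv(sample)))
```

Reading the whole log into a list first keeps the two passes independent, at the cost of holding the log in memory; for very large logs the shell pipeline from the commit message is likely the simpler choice.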


4394 Commits (20c3266e943f62ef7947f00b563de5f6c790ecb7)