mirror of https://gitee.com/namelin2022/ollama
2 Commits (dc5a645434f0ea6364c426c6ba112da1afa40cb2)
| Author | SHA1 | Message | Date |
|---|---|---|---|
| | d7c94e0ca6 | Better support for AMD multi-GPU on linux (#7212). This resolves a number of problems related to AMD multi-GPU setups on linux. The numeric IDs used by ROCm are not the same as the numeric IDs exposed in sysfs, although the ordering is consistent, so we have to count up from zero starting at the first valid gfx (major/minor/patch with non-zero values) we find. There are 3 different env vars for selecting GPUs, and only ROCR_VISIBLE_DEVICES supports UUID-based identification, so we should favor that one and use UUIDs when detected to avoid potential ordering bugs with numeric IDs. ROCR_VISIBLE_DEVICES only works on linux; on windows, use HIP_VISIBLE_DEVICES, which takes numeric IDs only. | 1 year ago |
| | 05cd82ef94 | Rename gpu package discover (#7143). Cleaning up go package naming. | 1 year ago |
| | b732beba6a | lint | 2 years ago |
| | 283948c83b | Adjust windows ROCm discovery. The v5 hip library returns unsupported GPUs which won't enumerate at inference time in the runner, so this makes sure we align discovery. The gfx906 cards are no longer supported, so we shouldn't compile with that GPU type as it won't enumerate at runtime. | 2 years ago |
| | 784bf88b0d | Wire up windows AMD driver reporting. This seems to be the ROCm version, not actually the driver version, but it may be useful for toggling logic for VRAM reporting in the future. | 2 years ago |
| | 8727a9c140 | Record more GPU information. This cleans up the logging for GPU discovery a bit, and can serve as a foundation to report GPU information in a future UX. | 2 years ago |
| | 34b9db5afc | Request and model concurrency. This change adds support for multiple concurrent requests, as well as loading multiple models by spawning multiple runners. The defaults are currently 1 concurrent request per model and only 1 loaded model at a time, but these can be adjusted by setting OLLAMA_NUM_PARALLEL and OLLAMA_MAX_LOADED_MODELS. | 2 years ago |
| | 6c5ccb11f9 | Revamp ROCm support. This refines where we extract the LLM libraries to by adding a new OLLAMA_HOME env var that defaults to `~/.ollama`. The logic was already idempotent, so this should speed up startups after the first time a new release is deployed. It also cleans up after itself. We now build only a single ROCm version (latest major) on both windows and linux. Given the large size of ROCm's tensor files, we split the dependency out. It's bundled into the installer on windows, and a separate download on linux. The linux install script is now smart: it detects the presence of AMD GPUs, checks whether rocm v6 is already present, and if not, downloads our dependency tar file. For Linux discovery, we now use sysfs and check each GPU against what ROCm supports so we can degrade to CPU gracefully instead of having llama.cpp+rocm assert/crash on us. For Windows, we now use Go's windows dynamic library loading logic to access the amdhip64.dll APIs to query the GPU information. | 2 years ago |
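
The ID handling described in the AMD multi-GPU commit (d7c94e0ca6) can be illustrated with a minimal Go sketch. This is not the code from the repository: the `sysfsNode` type, the `rocmIDs` and `selector` functions, and the sample UUID strings are hypothetical names introduced here. It also assumes that a "valid gfx" means at least one of major/minor/patch is non-zero, and that on linux a numeric ID is passed through ROCR_VISIBLE_DEVICES when no UUID was detected.

```go
package main

import (
	"fmt"
	"strconv"
)

// sysfsNode is a hypothetical, simplified view of one node found in sysfs.
type sysfsNode struct {
	SysfsID             int    // numeric ID as exposed in sysfs
	Major, Minor, Patch int    // gfx version; all zero for non-GPU nodes
	UUID                string // empty when no UUID was detected
}

// isValidGfx reports whether the node carries a usable gfx version.
// Assumption: "non-zero values" means at least one component is non-zero.
func isValidGfx(n sysfsNode) bool {
	return n.Major != 0 || n.Minor != 0 || n.Patch != 0
}

// rocmIDs maps sysfs IDs to ROCm-style numeric IDs by counting up from zero
// over the valid gfx nodes, relying on the ordering being consistent.
func rocmIDs(nodes []sysfsNode) map[int]int {
	ids := make(map[int]int)
	next := 0
	for _, n := range nodes {
		if !isValidGfx(n) {
			continue // skipped nodes do not consume a ROCm ID
		}
		ids[n.SysfsID] = next
		next++
	}
	return ids
}

// selector picks the env var and value used to select a single GPU.
// On linux it favors ROCR_VISIBLE_DEVICES and a UUID when one is known;
// on any other OS it falls back to HIP_VISIBLE_DEVICES with a numeric ID.
func selector(goos string, n sysfsNode, rocmID int) (envVar, value string) {
	if goos == "linux" {
		if n.UUID != "" {
			return "ROCR_VISIBLE_DEVICES", n.UUID
		}
		// Assumption: numeric fallback stays on the same variable.
		return "ROCR_VISIBLE_DEVICES", strconv.Itoa(rocmID)
	}
	return "HIP_VISIBLE_DEVICES", strconv.Itoa(rocmID)
}

func main() {
	// Sample data only: node 0 has no gfx version, nodes 1 and 2 are GPUs.
	nodes := []sysfsNode{
		{SysfsID: 0},
		{SysfsID: 1, Major: 10, Minor: 3, Patch: 0, UUID: "GPU-aaaa"},
		{SysfsID: 2, Major: 11, Minor: 0, Patch: 0},
	}
	ids := rocmIDs(nodes)
	for _, n := range nodes {
		id, ok := ids[n.SysfsID]
		if !ok {
			continue
		}
		for _, goos := range []string{"linux", "windows"} {
			k, v := selector(goos, n, id)
			fmt.Printf("sysfs %d on %s: %s=%s\n", n.SysfsID, goos, k, v)
		}
	}
}
```

Keeping the ID mapping separate from the env var choice mirrors the two concerns in the commit message: the first deals with sysfs and ROCm numbering disagreeing, the second with which selection variable is available per platform.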