54 Commits (44bc36d06301bbc23ea3cd4af935e24cfb945f33)

Author SHA1 Message Date
Daniel Hiltgen 20c3266e94 Reduce default parallelism to 1 (#11330) 9 months ago
Patrick Devine aa25aff10d client: add request signing to the client (#10881) 10 months ago
Michael Yang f95a1f2bef feat: add trace log level (#10650) 11 months ago
frob 69ce44b33c envconfig: Remove no longer supported max vram var (#10623) 11 months ago
Devon Rifkin 44b466eeb2 config: update default context length to 4096 11 months ago
Devon Rifkin dd93e1af85 Revert "increase default context length to 4096 (#10364)" 11 months ago
Devon Rifkin 424f648632 increase default context length to 4096 (#10364) 11 months ago
Eries Trisnadi dc13813a03 server: allow vscode-file origins (#9313) 1 year ago
Parth Sareen 314573bfe8 config: allow setting context length through env var (#8938) 1 year ago
Blake Mizerany 68bac1e0a6 server: group routes by category and purpose (#9270) 1 year ago
Jesse Gross ed443a0393 Runner for Ollama engine 1 year ago
Michael Yang dcfb7a105c next build (#8539) 1 year ago
Daniel Hiltgen 4879a234c4 build: Make target improvements (#7499) 1 year ago
Sam 1bdab9fdb1 llm: introduce k/v context quantization (vRAM improvements) (#6279) 1 year ago
Daniel Hiltgen d7c94e0ca6 Better support for AMD multi-GPU on linux (#7212) 1 year ago
Jeffrey Morgan 48708ca0d5 server: allow vscode-webview origin (#7273) 1 year ago
Jeffrey Morgan 96efd9052f Re-introduce the `llama` package (#5034) 1 year ago
Daniel Hiltgen cd5c8f6471 Optimize container images for startup (#6547) 2 years ago
Michael Yang dddb72e084 add *_proxy for debugging 2 years ago
Daniel Hiltgen 6719097649 llm: make load time stall duration configurable via OLLAMA_LOAD_TIMEOUT 2 years ago
Daniel Hiltgen b05c9e83d9 Introduce GPU Overhead env var (#5922) 2 years ago
Daniel Hiltgen 93ea9240ae Move ollama executable out of bin dir (#6535) 2 years ago
Michael Yang 386af6c1a0 passthrough OLLAMA_HOST path to client 2 years ago
Daniel Hiltgen 88bb9e3328 Adjust layout to bin+lib/ollama 2 years ago
Daniel Hiltgen 74d45f0102 Refactor linux packaging 2 years ago
Michael Yang 85d9d73a72 comments 2 years ago
Michael Yang 78140a712c cleanup tests 2 years ago
Michael Yang 0f1910129f int 2 years ago
Michael Yang e2c3f6b3e2 string 2 years ago
Michael Yang 8570c1c0ef keepalive 2 years ago
Michael Yang 55cd3ddcca bool 2 years ago
Michael Yang 66fe77f084 models 2 years ago
Michael Yang d1a5227cad origins 2 years ago
Michael Yang 4f1afd575d host 2 years ago
Michael Yang 35b89b2eab rfc: dynamic environ lookup 2 years ago
Daniel Hiltgen cc269ba094 Remove no longer supported max vram var 2 years ago
Anatoli Babenia 0d16eb310e fix: use `envconfig.ModelsDir` directly (#4821) 2 years ago
Daniel Hiltgen 955f2a4e03 Only set default keep_alive on initial model load 2 years ago
Daniel Hiltgen 173b550438 Remove default auto from help message 2 years ago
Daniel Hiltgen 9929751cc8 Disable concurrency for AMD + Windows 2 years ago
Daniel Hiltgen 17b7186cd7 Enable concurrency by default 2 years ago
Daniel Hiltgen d34d88e417 Revert "Revert "gpu: add env var for detecting Intel oneapi gpus (#5076)"" 2 years ago
Wang,Zhe 755b4e4fc2 Revert "gpu: add env var for detecting Intel oneapi gpus (#5076)" 2 years ago
Jeffrey Morgan 163cd3e77c gpu: add env var for detecting Intel oneapi gpus (#5076) 2 years ago
Daniel Hiltgen 6be309e1bd Centralize GPU configuration vars 2 years ago
Daniel Hiltgen 5e8ff556cb Support forced spreading for multi GPU 2 years ago
Patrick Devine 94618b2365 add OLLAMA_MODELS to envconfig (#5029) 2 years ago
Patrick Devine c69bc19e46 move OLLAMA_HOST to envconfig (#5009) 2 years ago
royjhan 1a29e9a879 API app/browser access (#4879) 2 years ago
Michael Yang c895a7d13f some gocritic 2 years ago