2 Commits (16f4eabe2d409b2b8a6e50fa08c8ce3a2a3b18d1)

Author SHA1 Message Date
Daniel Hiltgen 16f4eabe2d Refine default thread selection for NUMA systems (#7322) 1 year ago
Daniel Hiltgen 05cd82ef94 Rename gpu package discover (#7143) 2 years ago
Daniel Hiltgen 24636dfa87 Discovery CPU details for default thread selection (#6264) 2 years ago
Daniel Hiltgen f3c8b898cd Track GPU discovery failure information (#5820) 2 years ago
Daniel Hiltgen 69be940bf6 gpu: Group GPU Library sets by variant (#6483) 2 years ago
Daniel Hiltgen 4fe3a556fa Add cuda v12 variant and selection logic 2 years ago
Daniel Hiltgen fc3b4cda89 Report GPU variant in log 2 years ago
Daniel Hiltgen d470ebe78b Add Jetson cuda variants for arm 2 years ago
Jeffrey Morgan c4cf8ad559 llm: avoid loading model if system memory is too small (#5637) 2 years ago
Daniel Hiltgen f6f759fc5f Detect CUDA OS Overhead 2 years ago
Daniel Hiltgen 9929751cc8 Disable concurrency for AMD + Windows 2 years ago
Daniel Hiltgen da3bf23354 Workaround gfx900 SDMA bugs 2 years ago
Daniel Hiltgen 6f351bf586 review comments and coverage 2 years ago
Daniel Hiltgen 4e2b7e181d Refactor intel gpu discovery 2 years ago
Daniel Hiltgen 6fd04ca922 Improve multi-gpu handling at the limit 2 years ago
Daniel Hiltgen 43ed358f9a Refine GPU discovery to bootstrap once 2 years ago
Daniel Hiltgen 8727a9c140 Record more GPU information 2 years ago
Daniel Hiltgen 34b9db5afc Request and model concurrency 2 years ago
Michael Yang 7e33a017c0 partial offloading 2 years ago
Michael Yang 91b3e4d282 update memory calcualtions 2 years ago
Daniel Hiltgen 6d84f07505 Detect AMD GPU info via sysfs and block old cards 2 years ago
Daniel Hiltgen 8da7bef05f Support multiple variants for a given llm lib type 2 years ago
Jeffrey Morgan c336693f07 calculate overhead based number of gpu devices (#1875) 2 years ago
Daniel Hiltgen a2ad952440 Fix windows system memory lookup 2 years ago
Daniel Hiltgen d966b730ac Switch windows build to fully dynamic 2 years ago
Daniel Hiltgen 7555ea44f8 Revamp the dynamic library shim 2 years ago
Daniel Hiltgen 35934b2e05 Adapted rocm support to cgo based llama.cpp 2 years ago