mirror of https://gitee.com/namelin2022/ollama
When we later have a large batch running purely on a CPU, this results in the error:
GGML_ASSERT(talloc->buffer_id >= 0)

Disabling this means that we will incrementally reallocate memory as the graph grows.

Fixes #10410

jmorganca/cuda-compression-none
committed by Jesse Gross
1 changed file with 6 additions and 4 deletions