mirror of https://gitee.com/namelin2022/ollama
We already run flash attention on the CPU when a model is only partially offloaded, but we were disabling it when running purely on the CPU, which is unnecessary.
jessegross/memory
committed by Jesse Gross
1 changed file with 2 additions and 1 deletion
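The diff itself is not shown here, so the following is only a minimal sketch of the behavior change the commit message describes, not the actual patch. The function names `flashAttentionOld` and `flashAttentionNew` and the `gpuCount` parameter are hypothetical and do not come from the ollama codebase; the sketch also assumes that the only gate being changed is "is at least one GPU present".

```go
package main

import "fmt"

// flashAttentionOld models the behavior before this commit (hypothetical
// helper, not an ollama function): flash attention is honored only when
// at least one GPU is in use, so a pure-CPU run always disables it.
func flashAttentionOld(requested bool, gpuCount int) bool {
	return requested && gpuCount > 0
}

// flashAttentionNew models the behavior after this commit (hypothetical
// helper): the GPU count no longer matters, because the CPU path already
// runs flash attention during partial offloading.
func flashAttentionNew(requested bool, gpuCount int) bool {
	return requested
}

func main() {
	// Compare a pure-CPU run (0 GPUs) with a run that uses one GPU.
	for _, gpus := range []int{0, 1} {
		fmt.Printf("gpus=%d old=%v new=%v\n",
			gpus, flashAttentionOld(true, gpus), flashAttentionNew(true, gpus))
	}
}
```

The two helpers differ only in the pure-CPU case, which is consistent with a change as small as the one reported here.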