38 Commits (01d155c96943acf7e3dec4a5ace7cc6704a02b27)

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Michael Yang | ad3a7d0e2c | add NumGQA | 3 years ago |
| Michael Yang | 18ffeeec45 | update llama.cpp | 3 years ago |
| Michael Yang | cca61181cb | sample metrics | 3 years ago |
| Michael Yang | c490416189 | lock on llm.lock(); decrease batch size | 3 years ago |
| Michael Yang | f62a882760 | add session expiration | 3 years ago |
| Michael Yang | 3003fc03fc | update predict code | 3 years ago |
| Michael Yang | 35af37a2cb | session id | 3 years ago |
| Michael Yang | 726bc647b2 | enable k quants | 3 years ago |
| Michael Yang | cb55fa9270 | enable accelerate | 3 years ago |
| Michael Yang | b71c67b6ba | allocate a large enough tokens slice | 3 years ago |
| Michael Yang | 8526e1f5f1 | add llama.cpp mpi, opencl files | 3 years ago |
| Michael Yang | a83eaa7a9f | update llama.cpp to e782c9e735f93ab4767ffc37462c523b73a17ddc | 3 years ago |
| Michael Yang | 5156e48c2a | add script to update llama.cpp | 3 years ago |
| Michael Yang | 40c9dc0a31 | fix multibyte responses | 3 years ago |
| Michael Yang | 0142660bd4 | size_t | 3 years ago |
| Michael Yang | 1775647f76 | continue conversation | 3 years ago |
| Michael Yang | 05e08d2310 | return more info in generate response | 3 years ago |
| Michael Yang | e1f0a0dc74 | fix eof error in generate | 3 years ago |
| Jeffrey Morgan | c63f811909 | return error if model fails to load | 3 years ago |
| Jeffrey Morgan | 7c71c10d4f | fix compilation issue in Dockerfile, remove from `README.md` until ready | 3 years ago |
| Michael Yang | c5f7eadd87 | error checking new model | 3 years ago |
| Jeffrey Morgan | e64ef69e34 | look for ggml-metal in the same directory as the binary | 3 years ago |
| Michael Yang | 442dec1c6f | vendor llama.cpp | 3 years ago |
| Michael Yang | fd4792ec56 | call llama.cpp directly from go | 3 years ago |
| Jeffrey Morgan | 268e362fa7 | fix binding build | 3 years ago |
| Jeffrey Morgan | a18e6b3a40 | llama: remove unnecessary std::vector | 3 years ago |
| Jeffrey Morgan | 5fb96255dc | llama: remove unused helper functions | 3 years ago |
| Patrick Devine | 3f1b7177f2 | pass model and predict options | 3 years ago |
| Michael Yang | 5dc9c8ff23 | more free | 3 years ago |
| Bruce MacDonald | da74384a3e | remove prompt cache | 3 years ago |
| Michael Yang | 2c80eddd71 | more free | 3 years ago |
| Jeffrey Morgan | 9fe018675f | use `Makefile` for dependency building instead of `go generate` | 3 years ago |
| Michael Yang | 1b7183c5a1 | enable metal gpu acceleration | 3 years ago |
| Jeffrey Morgan | 0998d4f0a4 | remove debug print statements | 3 years ago |
| Jeffrey Morgan | 79a999e95d | fix crash in bindings | 3 years ago |
| Jeffrey Morgan | fd962a36e5 | client updates | 3 years ago |
| Jeffrey Morgan | 0240165388 | fix llama.cpp build | 3 years ago |
| Jeffrey Morgan | 9164981d72 | move prompt templates out of python bindings | 3 years ago |
| Jeffrey Morgan | 6093a88c1a | add llama.cpp go bindings | 3 years ago |