Releases: li-plus/chatglm.cpp
v0.4.2
31 Jul 06:12
Apply flash attention in the vision encoder for lower first-token latency.
Fix Metal compilation error on Apple Silicon chips.
v0.4.1
25 Jul 07:04
Support GLM4V, the first vision-language model in the GLM series.
Fix NaN/inf logits by rescheduling attention scaling.
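One plausible reading of the attention-scaling fix, sketched with NumPy in fp16: applying the 1/sqrt(d) scale to Q before the QK^T matmul keeps intermediate values within half-precision range, while scaling the scores afterwards lets the unscaled matmul overflow first. The tensor values below are made up purely to force the overflow.

```python
import numpy as np

d = 64
q = np.full((1, d), 40.0, dtype=np.float16)
k = np.full((1, d), 40.0, dtype=np.float16)
scale = np.float16(1.0 / np.sqrt(d))

# Scale applied after the matmul: the dot product (64 * 1600 = 102400)
# exceeds the fp16 maximum (~65504), so the scores are already inf.
late = (q @ k.T) * scale

# Scale applied to Q before the matmul: the dot product is 12800, in range.
early = (q * scale) @ k.T
```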
v0.4.0
21 Jun 03:09
Allocate memory dynamically on demand to fully utilize device memory; no more preset scratch or memory sizes.
Drop Baichuan/InternLM support, since both have been integrated into llama.cpp.
API changes:
CMake CUDA option -DGGML_CUBLAS changed to -DGGML_CUDA.
CMake CUDA architecture option -DCUDA_ARCHITECTURES changed to -DCMAKE_CUDA_ARCHITECTURES.
num_threads in GenerationConfig was removed; optimal thread settings are now selected automatically.
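For reference, a minimal CUDA build with the renamed flags might look like the following; the `80` compute capability is only an example, so substitute the value matching your GPU.

```shell
# v0.4.0+: -DGGML_CUDA replaces -DGGML_CUBLAS, and
# -DCMAKE_CUDA_ARCHITECTURES replaces -DCUDA_ARCHITECTURES.
cmake -B build -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES="80"
cmake --build build -j
```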
v0.3.4
14 Jun 12:52
Fix regex negative lookahead for code input tokenization.
Fix the OpenAI API server by using apply_chat_template to calculate tokens.
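The token-counting fix can be sketched as follows. `ToyTokenizer` is a hypothetical stand-in for a real transformers-style tokenizer so the example runs offline; the point is that the server counts tokens from the fully rendered chat template, not from the raw message text alone.

```python
class ToyTokenizer:
    """Hypothetical stand-in for a transformers-style tokenizer."""

    def apply_chat_template(self, messages, tokenize=True, add_generation_prompt=True):
        # Render the conversation with role markers, then "tokenize" by whitespace.
        text = "".join(f"<|{m['role']}|>{m['content']}" for m in messages)
        if add_generation_prompt:
            text += "<|assistant|>"
        return text.split() if tokenize else text


def count_prompt_tokens(tokenizer, messages):
    # What the server reports as usage.prompt_tokens: the length of the
    # templated prompt, including role markers, not just the message bodies.
    return len(tokenizer.apply_chat_template(messages, tokenize=True,
                                             add_generation_prompt=True))


messages = [{"role": "user", "content": "hello world"}]
n_prompt = count_prompt_tokens(ToyTokenizer(), messages)
```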
v0.3.3
13 Jun 06:36
Support ChatGLM4 conversation mode.
v0.3.2
24 Apr 08:20
Support P-Tuning v2 finetuned models for the ChatGLM family.
Fix convert.py for LoRA models and chatglm3-6b-128k.
Fix RoPE theta config for 32k/128k sequence lengths.
Better CUDA CMake script, respecting the nvcc version.
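To illustrate what the RoPE theta (base) controls, here is a small sketch; the theta values are illustrative, not the exact ones chatglm.cpp uses. Raising the base lowers every non-trivial rotary frequency, which is how long-context (32k/128k) variants stretch their positional encoding.

```python
def rope_inv_freq(dim, theta=10000.0):
    # Rotary inverse frequencies: theta^(-2i/dim) for each dimension pair.
    return [theta ** (-2.0 * i / dim) for i in range(dim // 2)]

base_freqs = rope_inv_freq(8, theta=10000.0)   # common default base
long_freqs = rope_inv_freq(8, theta=500000.0)  # illustrative long-context base
```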
v0.3.1
20 Jan 16:14
Support function calling in the OpenAI API server.
Faster repetition penalty sampling.
Support the max_new_tokens generation option.
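For context, the standard (CTRL-style) repetition penalty that such samplers implement can be sketched as below. The release's actual speedup is not shown; iterating over only the set of already-generated tokens, rather than the full vocabulary, is one common optimization.

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
    # Penalize tokens that already appeared: positive logits shrink
    # (divide), negative logits grow more negative (multiply), so
    # repeated tokens become less likely either way.
    out = list(logits)
    for tok in set(generated_ids):
        out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out

penalized = apply_repetition_penalty([2.0, -1.0, 0.5], generated_ids=[0, 1])
```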
v0.3.0
22 Nov 03:08
Full ChatGLM3 functionality, including system prompts, function calling, and the code interpreter.
Brand-new OpenAI-style chat API.
Add token usage information to the OpenAI API server for compatibility with the LangChain frontend.
Fix conversion error for chatglm3-6b-32k.
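The added usage field follows the OpenAI response schema. A minimal sketch of what the server now reports per response (field names per the OpenAI spec; the helper function itself is hypothetical):

```python
def make_usage(prompt_tokens, completion_tokens):
    # OpenAI-compatible usage block, as LangChain and other clients expect.
    return {
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }

usage = make_usage(12, 34)
```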
v0.2.10
30 Oct 06:35
Support ChatGLM3 in conversation mode.
Coming soon: a new prompt format for system messages and function calls.
v0.2.9
22 Oct 03:03
Support InternLM 7B & 20B model architectures.