Displays an animated progress bar below the chat input field that shows how much of the available context window is filled.
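As a rough illustration of what the bar represents, the fill level is just the number of tokens currently in the context divided by the context length. The sketch below is not the extension's actual code; `context_fill` and `render_bar` are hypothetical helper names, and the text rendering merely stands in for the animated bar.

```python
def context_fill(used_tokens: int, context_length: int) -> float:
    """Return the filled fraction of the context window, clamped to [0, 1]."""
    if context_length <= 0:
        raise ValueError("context_length must be positive")
    return min(max(used_tokens / context_length, 0.0), 1.0)


def render_bar(fraction: float, width: int = 20) -> str:
    """Render a plain-text progress bar for a fraction between 0 and 1."""
    filled = round(fraction * width)
    return "[" + "#" * filled + "-" * (width - filled) + f"] {fraction:.0%}"


# 2048 of 8192 tokens used: a quarter of the window is filled.
print(render_bar(context_fill(2048, 8192)))
```

Clamping keeps the bar well-formed even if the reported token count briefly exceeds the configured context length.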
Right now, only the llama.cpp, llamacpp_HF, ExLlamav2 and ExLlamav2_HF model loaders are supported.
The new llama.cpp loader is only supported on older versions of text-generation-webui (<= v3.4.1), or on newer versions if llama-cpp-binaries <= v0.14.0 is installed*. With this loader, you also need to enable metrics for the llama server. To do this, enter the word metrics in the "extra-flags" field of the "Model" tab. Alternatively, pass the command line flag --extra-flags metrics when starting the web UI, or add that flag to the file text-generation-webui/user_data/CMD_FLAGS.txt.
*If you really want to use this extension with a newer version of llama-cpp-binaries, you can apply the following patch in the llama.cpp folder, which restores support for some models: https://gist.github.com/mamei16/8637e31ec85336a55b7b2d05ce19ca86
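
If you prefer to script the CMD_FLAGS.txt route described above, a minimal sketch could look like the following. The helper name `ensure_metrics_flag` is hypothetical, and the sketch assumes that lines starting with `#` in CMD_FLAGS.txt are comments and that each flag can sit on its own line.

```python
from pathlib import Path

# The flag named in this README; everything else here is illustrative.
FLAG_LINE = "--extra-flags metrics"


def ensure_metrics_flag(flags_file: Path) -> bool:
    """Append the metrics flag unless an active line already contains it.

    Returns True if the file was changed, False if the flag was already set.
    """
    existing = flags_file.read_text(encoding="utf-8") if flags_file.exists() else ""
    # Assumption: lines starting with '#' are comments and do not count.
    active = [ln.strip() for ln in existing.splitlines()
              if not ln.strip().startswith("#")]
    if any(FLAG_LINE in ln for ln in active):
        return False
    if existing and not existing.endswith("\n"):
        existing += "\n"
    flags_file.parent.mkdir(parents=True, exist_ok=True)
    flags_file.write_text(existing + FLAG_LINE + "\n", encoding="utf-8")
    return True
```

Calling `ensure_metrics_flag(Path("text-generation-webui/user_data/CMD_FLAGS.txt"))` from the directory that contains your installation adds the line once and leaves the file untouched on later runs. If the file already sets --extra-flags with other values, merge them by hand instead, since this sketch does not combine existing values.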