SillyTavern All Compressed
What is SillyTavern?
Installation
Follow the installation guide for your platform:
Windows
Linux and Mac
Android
Docker
Branches
SillyTavern is being developed using a two-branch system to ensure a smooth experience
for all users.
release - 🌟 Recommended for most users. This is the most stable branch, updated only when
major releases are pushed. Typically updated once a month.
staging - ⚠️ Not recommended for casual use. This branch has the latest features,
but be cautious as it may break at any time. Only for power users and enthusiasts.
Updates several times daily.
Windows Installation
DO NOT INSTALL INTO ANY WINDOWS CONTROLLED FOLDER (Program Files,
System32, etc).
DO NOT RUN START.BAT WITH ADMIN PERMISSIONS
INSTALLATION ON WINDOWS 7 IS IMPOSSIBLE AS IT CANNOT RUN NODEJS 18.16
2. On your keyboard: press WINDOWS + E to open File Explorer, then navigate to the
folder where you want to install the launcher. Once in the desired folder, type cmd
into the address bar and press enter. Then, run the following command:
git clone https://github.com/SillyTavern/SillyTavern-Launcher.git && cd SillyTavern-Launcher && start installer.bat
6. Double-click on the start.bat file. (Note: the .bat part of the file name might be
hidden by your OS; in that case, it will look like a file called "Start". This is the file you
double-click to run SillyTavern.)
7. After double-clicking, a large black command console window should open and
SillyTavern will begin to install what it needs to operate.
8. After the installation process, if everything is working, a SillyTavern tab should open in
your browser.
9. Connect to any of the supported APIs and start chatting!
Linux/MacOS Install
Manual Git install
For MacOS / Linux all of these will be done in a Terminal.
1. Install git and nodeJS (the method for doing this will vary depending on your OS)
2. Clone the repo
for Release Branch: git clone https://github.com/SillyTavern/SillyTavern -b release
for Staging Branch: git clone https://github.com/SillyTavern/SillyTavern -b staging
3. cd SillyTavern to navigate into the install folder.
4. Run the start.sh script with one of these commands:
./start.sh
bash start.sh
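Put together, a typical first-time install of the release branch looks like this, assuming git and Node.js are already installed:
git clone https://github.com/SillyTavern/SillyTavern -b release
cd SillyTavern
./start.sh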
SillyTavern Launcher
For Linux users
1. Open your favorite terminal and install git
2. Download the SillyTavern Launcher with: git clone https://github.com/SillyTavern/SillyTavern-Launcher.git
3. Navigate to the SillyTavern-Launcher with: cd SillyTavern-Launcher
4. Start the install launcher with: chmod +x install.sh && ./install.sh and choose
what you wanna install
5. After installation start the launcher with: chmod +x launcher.sh && ./launcher.sh
For Mac users
1. Open a terminal and install brew with: /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
2. Then install git with: brew install git
3. Download the SillyTavern Launcher with: git clone https://github.com/SillyTavern/SillyTavern-Launcher.git
4. Navigate to the SillyTavern-Launcher with: cd SillyTavern-Launcher
5. Start the install launcher with: chmod +x install.sh && ./install.sh and choose
what you wanna install
6. After installation start the launcher with: chmod +x launcher.sh && ./launcher.sh
Docker Installation
This guide assumes you installed SillyTavern in a non-root (non-admin) folder. If
you installed SillyTavern in a root folder, you may have to run some of these
commands with administrator rights [ sudo , doas , Command Prompt
(Administrator)].
Installation
Linux
1. Install Docker by following the Docker installation guide here.
2. Follow the steps in Manage Docker as a non-root user in the Docker Post-Installation
Guide.
3. Install Git using your package manager.
Debian (Ubuntu/Pop! OS/etc.)
sudo apt install git
5. Execute docker compose by running the following command within the Docker folder.
docker compose up -d
6. Execute the following Docker command to obtain the IP of your SillyTavern Docker
container.
docker network inspect docker_default
You should receive some sort of output similar to the following below.
[
{
"Name": "docker_default",
"IPAM": {
"Config": [
{
"Subnet": "172.18.0.0/16",
"Gateway": "172.18.0.1"
}
]
}
}
]
Copy down the IP you see in Gateway as this will be important.
7. Using sudo , open nano and run the following command.
sudo nano config/config.yaml
Within nano , go down to whitelist . You should see something similar to the
following below.
whitelist:
- 127.0.0.1
Add a new line below 127.0.0.1 and put in the IP you copied from Docker. It should look
something similar to the following afterwards.
whitelist:
- 127.0.0.1
- 172.18.0.1
Save the file by pressing Ctrl+S then exit nano by pressing Ctrl+X.
Note that if you configured Docker network as a bridge, you could also add
external IP addresses to the whitelist as usual.
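For example, to also let another device on your local network connect, add its address as well (192.168.1.10 below is only a placeholder for your own device's IP):
whitelist:
- 127.0.0.1
- 172.18.0.1
- 192.168.1.10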
9. Open a new browser and go to http://localhost:8000. You should see SillyTavern load
in a few moments.
10. Enjoy! :D
Windows
Regarding Docker on Windows
Using Docker on Windows is really complicated. Not only do you need to
activate Windows Subsystem for Linux within Turn Windows features on or off,
but also configure your system for Virtualization (Intel VT-d/AMD SVM) which
differs from PC manufacturer to PC manufacturer (or motherboard
manufacturer). Sometimes, this option is not present on some systems.
It is highly suggested you install SillyTavern by following our Windows guide. This
section is a rough idea of how it can be done on Windows.
4. Execute docker compose by running the following command within the Docker folder.
docker compose up -d
5. Execute the following Docker command to obtain the IP of your SillyTavern Docker
container.
docker network inspect docker_default
You should receive some sort of output similar to the following below.
[
{
"Name": "docker_default",
"IPAM": {
"Config": [
{
"Subnet": "172.18.0.0/16",
"Gateway": "172.18.0.1"
}
]
}
}
]
Within the editor of your choice, you should see something similar to the following
below.
whitelist:
- 127.0.0.1
Add a new line below 127.0.0.1 and put in the IP you copied from Docker. It should look
something similar to the following afterwards.
whitelist:
- 127.0.0.1
- 172.18.0.1
Note that if you configured Docker network as a bridge, you could also add
external IP addresses to the whitelist as usual.
7. Restart the Docker Container to apply the new configuration.
docker compose restart sillytavern
8. Open a new browser and go to http://localhost:8000. You should see SillyTavern load
in a few moments.
9. Enjoy! :D
macOS
Even though macOS is similar to Linux, it doesn't have the Docker Engine. You
will have to install Docker Desktop similarly to Windows. You will also need to
install Homebrew in order to install Git on your Mac. This section is a rough idea
on how it can be done on macOS.
4. Execute docker compose by running the following command within the Docker folder.
docker compose up -d
5. Execute the following Docker command to obtain the IP of your SillyTavern Docker
container.
docker network inspect docker_default
You should receive some sort of output similar to the following below.
[
{
"Name": "docker_default",
"IPAM": {
"Config": [
{
"Subnet": "172.18.0.0/16",
"Gateway": "172.18.0.1"
}
]
}
}
]
If you can't run nano , either install it via Homebrew or use TextEdit.
Within nano , go down to whitelist . You should see something similar to the
following below.
whitelist:
- 127.0.0.1
Add a new line below 127.0.0.1 and put in the IP you copied from Docker. It should look
something similar to the following afterwards.
whitelist:
- 127.0.0.1
- 172.18.0.1
Save the file by pressing Ctrl+S then exit nano by pressing Ctrl+X.
Note that if you configured Docker network as a bridge, you could also add
external IP addresses to the whitelist as usual.
7. Restart the Docker Container to apply the new configuration.
docker compose restart sillytavern
8. Open a new browser and go to http://localhost:8000. You should see SillyTavern load
in a few moments.
9. Enjoy! :D
Configuring SillyTavern
SillyTavern's configuration file (config.yaml) will be located within the config folder.
Configuring the config file should be no different than configuring it without Docker,
however you will need to run nano or a code editor with administrator rights in order to
save your changes.
Don't forget to restart the Docker container for SillyTavern in order to apply your
changes! Make sure you execute this command within the docker folder.
docker compose restart sillytavern
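As a rough illustration, here are two commonly changed settings; treat this as a sketch and check the comments in your own config.yaml for the exact key names:
# config/config.yaml (excerpt)
listen: true   # accept connections from other devices, not only localhost
port: 8000     # port the web UI is served on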
Locating User Data
SillyTavern's data folder will be within the data folder. Backing up your files should be
easy to do, however, restoring or adding content into it may require you to do so with
administrator rights.
Running Server Plugins
Running plugins like HoYoWiki-Scraper-TS or SillyTavern-Fandom-Scraper within Docker is
no different from running it on your system without Docker, however we will need to do a
slight modification to the Docker Compose script in order to do so.
Note
If you already see a plugins folder within the docker folder, you can skip Steps
1-2.
1. Using nano or a code editor, open docker-compose.yml and add the following line
below volumes .
volumes:
- "./config:/home/node/app/config"
- "./data:/home/node/app/data"
- "./plugins:/home/node/app/plugins"
6. Profit.
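The server-side plugin loader also has to be enabled in SillyTavern's configuration; in recent versions this is a single config.yaml flag (a sketch; verify the key name in your own config.yaml):
enableServerPlugins: true
Then restart the container with docker compose restart sillytavern as described above.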
Updating
Linux/Termux or MacOS
You definitely installed via git, so just 'git pull' inside the SillyTavern directory.
cd SillyTavern to enter the correct folder.
git pull to get the update.
./start.sh or bash start.sh to start ST.
Windows
First try using the UpdateAndStart.bat which is located in your SillyTavern
installation base folder.
If that does not work, make a fresh install and copy the following folders and files from
your old installation:
Assets
Backgrounds
Characters
Chats
Context
Groups
Group chats
Instruct
movingUI
KoboldAI Settings
NovelAI Settings
OpenAI Settings
QuickReplies
TextGen Settings (textgen = ooba)
Themes
User Avatars
Worlds
User
settings.json
secrets.json <---- this one is in the base folder, not /public/
7. Once those folders/files are copied, paste them into the /data/default-user folder
(with secrets.json going into the folder root) of the new install.
8. Start SillyTavern once again with the method appropriate to your OS, and pray you
got it right.
9. If everything shows up, you can safely delete the old ST folder.
Common Update Problems
"There are unresolved conflicts in the working directory."
This means that you've modified default files that have been changed in the remote
repository (such as setting presets).
To fix this, run this in the terminal. Use cautiously, as it can be destructive. Make sure to
have a backup if needed.
git merge --abort
git reset --hard
git pull --rebase --autostash
Unix/Linux
rm -rf node_modules
npm cache clean --force
npm install
Docker
1. Open a terminal window and navigate to your docker directory with cd SillyTavern/docker
2. Delete your container with docker compose down
3. Delete the SillyTavern docker image from the cache with docker rmi ghcr.io/sillytavern/sillytavern:latest (replace sillytavern:latest with sillytavern:staging if you are targeting the staging branch).
4. Rebuild the container with sudo docker compose up -d
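Put together, the full update sequence looks like this (add sudo if your user is not allowed to run Docker directly, and use the :staging tag if you run the staging branch):
cd SillyTavern/docker
docker compose down
docker rmi ghcr.io/sillytavern/sillytavern:latest
docker compose up -d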
If everything goes smoothly, docker should start redownloading the image, and you will be
up and running shortly. If you face any issues, refer to the next section of this guide.
Common Update Problems
I use Docker and all my data is gone after the update!
You must follow the Migration guide for Docker containers to update volume mappings for
the new data model introduced in 1.12.0
Permission denied when running docker commands
This is a Linux issue, and implies that your permissions are not properly set up. There are
two ways to get around this:
1. The Easy method: If you have sudo access on your user, simply prefix commands with
sudo (for example: sudo docker compose down )
2. The Proper method: Fix your permissions. This varies depending on the version of
Linux you use. There are plenty of guides online to help you fix this issue.
1.12.0 Migration Guide
YAML example
# -- DATA CONFIGURATION --
# Root directory for user data storage
dataRoot: C:\Users\Harry\Documents\ST-Data
Console example
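As a minimal sketch, assuming the standard node server.js entry point and the --dataRoot console flag (verify the exact flag name with node server.js --help):
node server.js --dataRoot "C:\Users\Harry\Documents\ST-Data"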
The default data root path is ./data , which means the data directory in SillyTavern's
repository.
Note
The data root path should be either a full absolute or a full relative path. You
can't use path shortcuts like ~ or %APP_DATA% , as these are resolved by a shell,
not the operating system.
Migration
IMPORTANT! Before we begin
1. Only follow this step if you want to move dataRoot away from the default location;
otherwise, skip it. Set the data root before running the server for the first time after
pulling the update: run npm install so that config.yaml is populated with the new value,
or pass a console argument.
2. All data will be migrated into a default-user account. See more on Users below.
Containerless (bare metal) installs
You don't have to do anything! An automatic migration should handle everything for you
when you start the ST server and it detects the old storage format (by checking the
existence of the /public/characters directory).
Upon moving any files, an automatic backup will be created in the
/backups/_migration/YYYY-MM-DD (resolved to the current date) directory, but it is always
a good practice to make a full manual backup before running the migration.
Containerized (Docker) installs
Migrating the data in Docker volumes is a bit trickier but pretty straightforward. While
docker-compose.yml provided with the repo was updated to reflect the changes, you may
need to adjust your custom workflows/deployments.
Step 1. Create a new volume, and mount it to the "/home/node/app/data" path within the
container. Don't remove the config volume.
volumes:
- "./config:/home/node/app/config"
- "./data:/home/node/app/data"
Step 2. Move everything but the config.yaml file from the config volume into the
default-user subdirectory of the data volume.
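As a sketch of Step 2 from a shell, assuming the default ./config and ./data bind mounts shown above and a stopped container (adapt the paths to your own volume layout):
# run from the SillyTavern/docker folder
docker compose down
mkdir -p data/default-user
find config -mindepth 1 -maxdepth 1 ! -name config.yaml -exec mv {} data/default-user/ \;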
Note
Soft links between the /public directory and the config volume are no longer
needed and are not built into the Docker container!
What to migrate?
The following files and directories are subject to the data migration. Assuming the default
configuration, the before and after paths are provided in the table below.
Before → After
/secrets.json → /data/default-user/secrets.json
/thumbnails → /data/default-user/thumbnails
/vectors → /data/default-user/vectors
/public/settings.json → /data/default-user/settings.json
/public/stats.json → /data/default-user/stats.json
/public/assets → /data/default-user/assets
/public/backgrounds → /data/default-user/backgrounds
/public/characters → /data/default-user/characters
/public/chats → /data/default-user/chats
/public/context → /data/default-user/context
/public/scripts/extensions/third-party → /data/default-user/extensions
/public/group chats → /data/default-user/group chats
/public/groups → /data/default-user/groups
/public/instruct → /data/default-user/instruct
/public/KoboldAI Settings → /data/default-user/KoboldAI Settings
/public/movingUI → /data/default-user/movingUI
/public/NovelAI Settings → /data/default-user/NovelAI Settings
/public/OpenAI Settings → /data/default-user/OpenAI Settings
/public/QuickReplies → /data/default-user/QuickReplies
/public/TextGen Settings → /data/default-user/TextGen Settings
/public/themes → /data/default-user/themes
/public/worlds → /data/default-user/worlds
/default/content/content.log → /data/default-user/content.log
Users
1.12.0 adds a (completely optional) ability to create a multi-user setup on the same server,
allowing multiple users to use their own fully isolated SillyTavern instances even at the
same time. User accounts can also be password-protected for an additional layer of
privacy.
Please refer to the Users documentation for more information.
1.9.0 Migration Guide
4. Skip to next item if you have no errors. You may have something like:
error: Your local changes to the following files would be overwritten by checkout:
config.conf
public/css/bg_load.css
public/settings.json
You will see a list of affected files. If you do not care about those settings files being
replaced, git switch -f release or git switch -f staging will set your branch. If
you do want to keep those changes, restore them from a backup.
5. Type npm install and then npm run start to test that everything behaves correctly.
6. Enjoy! Restore your data from a backup if needed.
fatal: invalid reference: release
This may happen if you cloned just a single branch from an old remote (before migration
to the organization repo). To fix this, you need to add and fetch a branch from a new
remote:
git remote add st https://github.com/SillyTavern/SillyTavern
git fetch st
git checkout -t st/release
Usage
Interact with AI, your way. Build your world, your work, or your dreams.
Getting Started
Quick Start
Send your first message to the AI and get a response
Chatting
How to chat with the AI and use the chat interface
FAQ
Frequently asked questions about SillyTavern, AI models, making characters, getting
better responses, and more
Fundamentals
API Connections
Connect to AI models for generating text, images, and more
Characters and Personas
Create and use characters to shape the AI's role, and personas to define your
identities
Response Configuration and Prompts
Control the requests that you send to the AI and how it responds
Building on SillyTavern
World Info
Manage information and when to insert it into the prompt
Data Bank
Store and retrieve information for use in the AI's responses
Extensions
Add new features and capabilities to the AI or the interface
Development and Automation
Automate tasks, let your AI interact with the world, and write your own extensions
Control Panels
What all the buttons do, from the left to the right:
API Connections
Connect to AI models for generating text, images, and more
Advanced Formatting
Customize prompt construction for Text Completion APIs
World Info
Manage information and when to insert it into the prompt
User Settings
Change the theme, and the look and feel of messages and chats
Backgrounds
Change the background image
Extensions
Add new features and capabilities to the AI or the interface
Personas
Create and manage personas to use with the AI
Characters
Create and manage characters for the AI to use
Quick Start
I'm clueless. Just spoonfeed me the easiest and fastest way I can start using
SillyTavern. -- Anonymous
You can get started with SillyTavern in just a few minutes. Here are two easy ways to get
started:
You can use AI Horde for free. AI Horde is a community-driven AI service that provides
access to a variety of AI models.
If you have an OpenAI account or want to register one, you can use OpenAI.
Quick start with AI Horde
1. Follow the Installation Guide to install and start SillyTavern.
2. In SillyTavern's onboarding screen, enter a name for your persona. This name will be
used in the chat.
5. Select some AI models to use. Just choose a few from the top. You can always
change them later.
6. Close the API Connections window. Enter a message in the chat box at the bottom
and press Enter.
7. Your AI will respond in a few moments. You can continue chatting with it. Success!
Quick start with OpenAI
Install SillyTavern
Follow the Installation Guide to install and start SillyTavern.
Get access to OpenAI
1. Sign up to OpenAI.
2. Go to https://platform.openai.com
3. Click your account icon in the top right, then View API Keys.
4. Click "Create new secret key". Copy it somewhere immediately. DO NOT SHARE THIS
KEY. WHOEVER HAS IT CAN USE YOUR ACCOUNT TO USE GPT AT YOUR EXPENSE.
Configure SillyTavern to use your API
1. In SillyTavern's top bar, click API Connections.
2. Under API, select Chat Completion (OpenAI).
3. Under Chat Completion Source, select OpenAI.
4. Paste the API key you saved in the previous step.
5. Click the Connect button. Confirm it says Valid.
6. By default, SillyTavern will use GPT-4 Turbo. You can choose a different model, but
educate yourself on the pricing.
Test your setup
1. In SillyTavern's top bar, click Character Management at the far right.
2. Select an existing character such as Seraphina.
3. In the text box at the bottom, write something to Seraphina, then press Enter or click
the Send button.
If you did everything right, after a few seconds, Seraphina should respond.
FAQ
Explain what SillyTavern is about
Modern AI language models such as ChatGPT have gotten so powerful that some of them
are now convincingly able to simulate a character you create, and who you can chat with,
write fiction with, etc. For example, you can tell the AI to pretend to be a Go instructor
named Jubei from medieval Japan, and it will act and respond accordingly. You can have
a long chat with Jubei, go to the pub together, decide to get in a fight with samurais,
whatever you can imagine, and the AI will play along and write/react around this content,
acting as your foil and dungeon master. Your imagination is the limit. You can tell the AI to
pretend it's Wonder Woman. You can also specify a scenario ("Wonder Woman and I are
robbing a bank"), a writing style ("Wonder Woman speaks in ebonics"), or anything else
you can think of.
SillyTavern is an app to facilitate these uses:
It's a user interface that handles communication with AI language models.
It lets you create new character cards (prompts), and switch between them easily.
It lets you import characters created by other people.
It will keep your chat history with a character, allowing you to resume at any time,
start a new chat, review old chats, etc.
In the background, it does the necessary things to prepare the AI prompt for you.
Specifically, it will send a system prompt (instructions for the AI) that primes the AI to
follow certain rules to improve response accuracy.
Give me an overview of my AI model options
SillyTavern can interact with two types of AI:
1. Web services (Cloud-based, usually paid, proprietary, closed)
2. Self-hosted (local, free, open-source)
Paid web service AIs
Paid web models are black boxes. You pay a company to use their AI service. You put your
account info in SillyTavern and it will connect to your provider to use the AI on your behalf.
Pros:
Really easy to get started.
Highest quality AI writing.
Cons:
They cost money to use.
Everything is logged on their server. Privacy concerns.
They are often censored and will refuse to chat with you about certain subjects.
Self-hosted AIs
Self-hosted models are free models you can run on your PC but require a powerful PC and
more work to set up.
Pros:
Once you set them up, they can be used for free even without Internet access.
Total privacy. Everything you write stays on your own PC.
There's a wide variety of models. As a community-driven technology, you can find
models that fit certain tasks or behaviors that you want.
Cons:
They are not as capable as SOTA models (i.e., they write worse dialog, are less
creative, etc).
Running local models requires a GPU with at least 6GB VRAM.
If you are interested in using these, refer to the dedicated guide here: How To Use A Self-
Hosted Model.
Can I use SillyTavern on my phone or tablet?
iPhones and iPads are not capable of running the whole SillyTavern app, but since it's just
a web interface, you can run it on another computer on your home Wi-Fi, and then access
it in your mobile browser. Refer to Remote Connections for more information.
For Android users, in addition to the above, you can run the whole SillyTavern directly on
your phone, without needing a PC, using the Termux app. Refer to Installation (Android).
(NOTE: Termux installations are not officially supported, and we can't guarantee it will
work.)
I tried to import a PNG character card but got an error that it's invalid. Why?
Two possibilities:
1. The card did not have the definitions embedded inside it and was just a normal image
file. Some programs or file managers will strip the embedded definitions from the card
when you save them. Make sure you're using the raw PNG file as it was posted by the
person who shared it.
2. The PNG file was actually a WEBP file with a .png filename. You can try renaming the
card to .webp before importing, or look for a proper PNG version of the image.
How can I make my own AI character?
1. Click the Character Management button
2. Click Create New Character
3. Under Character Name, give a name, like Amanda
4. Optionally, click the Select Avatar button to pick an image portrait for this character
5. Under Description, describe the character, and include any information you want that
you feel is relevant to the chat. For example: Amanda is a student traveling during
her gap year. She's 6 feet tall, and a volleyball player. She has an athletic
figure. She has long brown hair. She loves the Victorian England period, and
watching TV and reading novels relating to that period. For example, if you want
Amanda to be friendly, then you would add: Amanda is extremely cheerful and
outgoing.
6. Under First Message, write the greeting the character will use when you begin a new chat. For
example: *Amanda waves at you* Hey! Are you a backpacker too?
7. Click the Create Character button
You now have a basic character you can chat with. Select Amanda from the character list,
and a new chat will begin.
Note that you can use the Description and/or First Message to create a more specific
scenario, and/or include yourself in the description. For example:
Description:
Amanda is a student traveling during her gap year. She's 6 feet tall, and a volleyball
player. She has an athletic figure. She has long brown hair. She loves the Victorian
England period, and watching TV and reading novels relating to that period. She's been
keeping a secret that weighs heavily on her soul. She's waiting for the right person to
unburden herself to, but this may lead to a cat and mouse game against a powerful secret
society. She's recently arrived in Calcutta.
You're Rajesh Nahasmapetilon, a world-famous Indian volleyball superstar. You're out for a
walk in Calcutta. Amanda spots you and screams in excitement.
First Message:
*Amanda runs up to you, beaming.* Rajesh! I can't believe it! I'm such a big fan. I have
your poster in my bedroom.
Any relevant information you include can be used. How well it's used depends on the
power level of the AI model.
NOTE: you can go back and edit any of this information once the character is created,
except the name.
Where are my API keys stored? Why can't I see them?
SillyTavern saves your API keys to a secrets.json file in the server directory.
By default, they will not be exposed to a frontend after you enter them and reload the
page.
To enable viewing your keys by clicking a button in the API block:
1. Set the value of allowKeysExposure to true in the config.yaml file.
2. Restart the SillyTavern server.
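For reference, the relevant line in config.yaml is a single flag, shown here in isolation:
# config.yaml (excerpt)
allowKeysExposure: true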
Performance Tips
Why is the UI so slow/jittery?
Try enabling the No Blur Effect (Fast UI) mode on the User settings panel.
Enable Reduced motion in the UI theme settings to remove cosmetic animations.
Make sure your browser is using Hardware Acceleration.
I'm experiencing an input lag. What can I do?
Performance degradation, particularly input lag, is most commonly attributed to browser
extensions. Known problematic extensions include:
iCloud Password Manager
DeepL Translation
AI-based grammar correction tools
Various ad-blocking extensions
If you experience performance issues and cannot identify the cause, or suspect an issue
with SillyTavern itself, please:
1. Record a performance profile
2. Export the profile as a JSON file
3. Submit it to the development team for analysis
We recommend first testing with all browser extensions and third-party SillyTavern
extensions disabled to isolate the source of the performance degradation.
When I import a lot of characters, the app becomes slow. Why?
Unfortunately, SillyTavern wasn't designed to handle huge character libraries. The more
you have, the longer it will take to load the character list. Evidence suggests that the
performance degradation starts to become noticeable when you have more than 1000
characters.
However, there are some things you can do to mitigate the issue:
1. Use lazy loading.
Enable lazy loading of characters by setting the value of performance.lazyLoadCharacters to
true in the config.yaml file. After the next server restart, the character list will only load
the full data of characters you interact with. Please be aware that some third-party
extensions may not work correctly with this setting enabled if they were not updated to
support it (contact the extension developer for more information).
2. Use memory cache.
Increase the memory cache capacity if you have some spare RAM. This will allow the
server to keep more characters in memory, reducing the time it takes to load them. You
can do this by adjusting the value of performance.memoryCacheCapacity to a higher
number in the config.yaml file. The default value is 100mb . Approximate rule of thumb:
increase the value by 100mb for every 3000 characters you have.
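Putting both options together, the relevant part of config.yaml would look roughly like this (the nested layout is inferred from the dotted key names above; check your own file for the exact structure):
# config.yaml (excerpt)
performance:
  lazyLoadCharacters: true    # load full character data only on demand
  memoryCacheCapacity: 200mb  # default is 100mb; add roughly 100mb per 3000 characters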
Limitations:
1. Advanced (fuzzy) characters search will not work with lazy loading enabled. Only
character names will be searched.
2. Memory cache is disabled on Android devices due to the limited amount of available
memory.
How to make the AI write more?
Sometimes the AI will only respond with a single sentence when you'd like it to be more
verbose. This is usually a problem with locally run models.
If you simply want the bot to continue writing from where it left off at the end of its most
recent reply, you can send an empty user message by typing nothing into the Input Bar
and clicking Send. This will force the bot to continue the story.
Strategies for fixing this:
Increase the value of the Response Length setting
Design a good First Message for the Character, which shows them speaking in a
long-winded manner. AI models can improve a lot when given guidance about the
writing style you expect.
Add a phrase in the character's Description Box such as "likes to talk a lot" or "very
verbose speaker"
Do the same thing for your Author's Note or Post-History Instruction Prompt.
As a last resort, you can try turning on Auto-Continue (in the User Settings panel),
but this will make responses come out slower, because the AI produces several small
replies back to back that are then combined into one big reply. It may also be
incompatible with some API options.
How to make the AI write less?
This is mostly only a problem for models like ChatGPT or Claude. The same strategies can
be applied but in reverse.
Decrease the value of the Response Length setting
Give the character a phrase like 'short spoken', or 'doesn't talk much' line in their
Description.
Give the character a brief First Message to set the tone and expectation for the chat.
Make sure Auto-Continue is turned off.
How to make the AI stop writing the actions of my character, and driving the plot all on its own?
This should be handled in the Author's Note with a combination of phrases like:
{{char}}'s responses shall only be passive and reactive to {{user}}'s actions.
Your next response shall be solely from the POV of {{char}}.
You are never allowed to dictate actions or speech for {{user}}
Chatting
When you are connected to an API, send messages to the AI by typing in the chat bar at
the bottom of the screen. Then click Send or press Enter.
Chat bar
The AI will respond with a message that continues the conversation.
Chat message
You can now:
Send another message
Swipe the response: Click the Swipe button on the message to generate a different
response.
Edit the message: Click the Edit button on any message to edit the message
content.
Message actions: Click the Message actions button on a message for more
message options like translation, image generation, and story branching.
Chat options: Click the Options button next to the chat bar for more chat options
like author's notes and chat file management.
Keyboard shortcuts
You can also use the Right arrow key to swipe, and the Up arrow key to edit the
last message in the chat. For more hotkeys, use the /help hotkeys slash
command in the chat or check the HotKeys page.
Message Visibility
Included: AI sees this message; click to exclude it
Excluded: AI does not see this message; click to include it
Content Management
Embed: Attach files or images
Checkpoint: Create story checkpoint
Checkpoint Navigation: Click to open checkpoint chat, Shift+Click to update
existing checkpoint
Branch: Start alternate story path
Copy: Copy message text
Edit: Edit message content
Message Operations
Copy: Duplicate message content
Delete: Remove message
Message Position
Move Up: Shift message higher in chat
Move Down: Shift message lower in chat
Note: Movement controls may be disabled based on message position in chat history.
Chat options panel
Manage chat settings and operations via the Options button at the bottom left of the
chat interface.
Display Controls
Close chat: Exit current chat session
Toggle Panels: Show/hide interface panels
Generation Settings
Author's Note: Custom context instructions
CFG Scale: Adjust response creativity
Token Probabilities: View token generation stats
Chat Navigation
Back to parent chat: Return to main conversation
Save checkpoint: Create story checkpoint
Convert to group: Transform into group chat
Chat Management
Start new chat: Begin fresh conversation
Manage chat files: Chat file operations such as import, export, and renaming
Message Controls
Delete messages: Select and remove multiple messages
Regenerate: Create new response
Impersonate: AI writes message as user
Continue: Extend last message
Note: Some options may be hidden depending on context and chat state.
Token Probabilities Panel
The Token Probabilities panel lets you look into the AI's sampling process for text
generation. It shows you not just what the AI wrote, but what other options it considered
at each point in the text.
To open it, click the Token Probabilities button in the Chat Options panel.
Example message
Token probabilities display for example message
When you click any token (word, punctuation, or formatting character) in the generated
text, the panel displays alternative tokens the AI considered at that position, along with
their probability scores. This gives you insight into the AI's "thought process" and shows
other directions the response could have taken. Looking at these alternatives can help
you understand whether there were several likely options or a single clear choice.
Alternative tokens and probabilities
If you see a token that you think the AI should have chosen differently, choose an
alternative and the message will regenerate from that point forward, potentially giving you
a different response.
Rerolling
If you change a specific token and regenerate the response, the part of the new response
before the changed token will be the same as the original response. This part is shown in
gray. Since it was not generated, there is no probability information for this part.
You may like to see other responses that could have been generated based on your
alternative token.
You can click the gray portion to "reroll" the generation, giving you a new variation of the
text. Clicking any part of the gray portion will keep the entire gray portion and regenerate
the entire white/tinted portion.
Holding Ctrl while clicking a token in the gray portion will retain the gray portion up to the
clicked token and regenerate the rest of the text. Your choice of alternative token cannot
be kept in this case.
Controls
Token Display:
Generated text is split into individual tokens
Each token is interactive, click a token to see alternatives considered by the AI
Tokens are tinted as a visual aid but this does not indicate probability
Special characters (spaces, newlines) are visibly marked
Token Selection:
Click a token to view alternatives
Click an alternative to replace the token and regenerate the response
Hover over a token to see its raw log-probability score
Window Controls:
Drag handle for panel repositioning (MovingUI only)
Maximize/restore panel size
Expand/collapse panel content
Close panel
Availability
You must select Request token probabilities in User Settings to enable this feature.
Token probabilities are only available for the most recent message, and are not saved to
the chat. If token probability information is no longer available for a message, the panel
will display a message indicating this.
Token probabilities are not available when using Smooth Streaming.
Token probabilities are not available from all APIs. If you are using an API that does not
support token probabilities, the panel will open but will not display any information.
Text Completion
LlamaCPP: Available
Text Generation WebUI (oobabooga): Available
TabbyAPI: Available
NovelAI: Available
KoboldCPP: Available
Ollama: Appears to be unavailable
OpenRouter Text: Appears to be unavailable
Chat Completion
OpenAI or Custom: Available, but rerolling is not supported
Anthropic: Appears to be unavailable
Google AI Studio: Appears to be unavailable
OpenRouter Chat: Appears to be unavailable
Slash commands
This is not an exhaustive list as it is updated rarely.
For the most up-to-date list of commands that will work in your instance, use
the /help slash command in any SillyTavern chat.
HotKeys
For the most up-to-date list of HotKeys that will work in your SillyTavern instance, use
the /help hotkeys slash command in any chat.
Hotkeys are disabled for mobile devices.
Chat Hotkeys
Up = Edit last message in chat
Ctrl+Up = Edit last USER message in chat
Left = swipe left
Right = swipe right (NOTE: swipe hotkeys are disabled when chatbar has something
typed into it)
Enter (with chat bar selected) = send your message to AI
Ctrl+Enter = Regenerate the last AI response
Alt+Enter = Continue the last AI response
Escape
(while editing message AND Message Edit AutoSave is enabled) = close edit box.
(while an AI message is generating or streaming) = stop the generation
immediately.
Markdown Hotkeys
Needs to be enabled under the "User Settings" tab. Works in the chatbar and textareas
marked with the "M↓" icon:
Ctrl+B = **bold**
Ctrl+I = *italic*
Ctrl+U = __underline__
Ctrl+K = `inline code`
Ctrl+Shift+~ = ~~strikethrough~~
Common Settings
These settings control the sampling process when generating text using a language
model. The meaning of these settings is universal for all the supported backends.
Context Settings
Response (tokens)
The maximum number of tokens that the API will generate to respond.
The higher the response length, the longer it will take to generate the response.
If supported by the API, you can enable Streaming to display the response bit by bit
as it is being generated.
When Streaming is off, responses will be displayed all at once when they are
complete.
Context (tokens)
The maximum number of tokens that SillyTavern will send to the API as the prompt, minus
the response length.
Context comprises character information, system prompts, chat history, etc.
A dotted line between messages denotes the context range for the chat. Messages
above that line are not sent to the AI.
To see a composition of the context after generating the message, click on the
Prompt Itemization message option (expand the ... menu and click on the lined
square icon).
Sampler Parameters
Temperature
Temperature controls the randomness in token selection:
Low temperature (<1.0) leads to more predictable text, favoring higher probability
tokens
High temperature (>1.0) increases creativity and diversity in the output by giving lower
probability tokens a better chance.
Set to 1 for the original probabilities.
Repetition Penalty
Attempts to curb repetition by penalizing tokens based on how often they occur in the
context.
Set the value to 1 to disable its effect.
Repetition Penalty Range
How many tokens from the last generated token will be considered for the repetition
penalty. This can break responses if set too high, as common words like "the, a, and," etc.
will be penalized the most.
Set the value to 0 to disable its effect.
Repetition Penalty Slope
If both this and Repetition Penalty Range are above 0, the repetition penalty will have a
greater effect at the end of the prompt. The higher the value, the stronger the effect.
Set the value to 0 to disable its effect.
Top K
Top K sets a maximum number of top tokens that can be chosen from. For example, if Top
K is 20, only the 20 highest-ranking tokens will be kept (regardless of whether their
probabilities are diverse or limited).
Set to 0 (or -1, depending on your backend) to disable.
Top P
Top P (a.k.a. nucleus sampling) keeps the smallest set of top tokens whose probabilities
add up to the target percentage. If the top 2 tokens are both 25% and Top P is 0.50,
only those 2 tokens are considered.
Set the value to 1 to disable its effect.
Typical P
Typical P Sampling prioritizes tokens based on their deviation from the average entropy of
the set. It maintains tokens whose cumulative probability is close to a predefined
threshold (e.g., 0.5), emphasizing those with average information content.
Set the value to 1 to disable its effect.
Min P
Limits the token pool by cutting off low-probability tokens relative to the top token. For
example, with Min P at 0.1 and a top-token probability of 40%, any token below 4% is discarded.
Produces more coherent responses but can also worsen repetition if set too high.
Works best at low values such as 0.1-0.01, but can be set higher with a high
Temperature. For example: Temperature: 5, Min P: 0.5.
API Connections
SillyTavern can connect to a wide range of LLM APIs. Below is a description of their
respective strengths, weaknesses, and use cases.
ELI5: Chat Completions vs Text Completions
When you first navigate to the "API Connections" page in ST, you will notice a drop-down
option to select between options using nomenclature such as "Chat Completion" and "Text
Completion". It's helpful to understand what this is.
What it's not: it's easy to think of "Text Completion" as local models and "Chat
Completion" as cloud-based LLMs, but that's not the case. Nor is, for example, "NovelAI" or
"Kobold" actually a separate type of model altogether, even though they are separate
options in the API dropdown in ST. You can force models into different API structures with
the appropriate backend, but that's not the point of this section.
When you send a message using ST, your chat, character description, and other prompts
such as lorebooks or author notes are constructed into a single "prompt" to be sent to the
model. The API "type" for the model you are using decides how exactly this prompt will be
constructed (something that ST takes care of for you automatically in the background - you
can open your ST terminal and see exactly what the prompt being sent to the AI looks
like).
Chat Completions
A Chat Completion model, as its name suggests, will structure your prompt into a series of
messages between the User (you) and the Assistant (the AI) or System (neutral). Models
that are trained for Chat Completion help create the feeling of a "Chat", with the AI
"responding" to the last message. When you're using the ChatGPT website, you're dealing
with a Chat Completions API in the background.
Text Completions (a.k.a. just "Completions")
A Text Completion on the other hand, and again as its name suggests, will convert your
prompt into one long string and the model will simply try to continue this (like, literally
imagine all your text, your hundreds of messages, all your formatting, newlines, etc.
squashed into one very long sentence).
If your messages in ST happen to be formatted as a series of messages between
YourPersona: and Character:, the Text Completion model will try to continue this pattern
and ST will render it as a new chat message for you, but really the model is just trying to
continue the Text. If you offered an input of "The Sun rises in the", a text completion model
is likely to finish that message for you with "East".
Most Text Completion models have a recommended "Instruct Template" (usually
mentioned in the model's documentation or download page) that help them "respond" to
messages and instructions, just like a Chat Completion model. ST usually has most (if not
all) Instruct Templates available for you to choose from in the "Advanced Formatting"
page.
Local APIs
These LLM APIs can be run on your PC.
They are free to use and have no content filter.
Installation process can be complex (SillyTavern dev team does not provide support
for this).
Requires separate download of LLM models from HuggingFace which can be 5-50GB
each.
Most models are not as powerful as cloud LLM APIs.
KoboldAI
Runs on your PC, 100% private, wide range of models available
Gives the most direct control of the AI's generation settings
Requires large amounts of VRAM in your GPU (6-24GB, depending on the LLM model)
Models limited to 2k context
No streaming
Popular KoboldAI versions:
Henky's United
0cc4m's 4bit-supporting United
KoboldCpp
Easy-to-use API with CPU offloading (helpful for low VRAM users) and streaming
Runs from a single .exe file on Windows (must be compiled from source on MacOS and
Linux)
Supports GGUF/GGML models
Slower than GPU-only loaders such as AutoGPTQ and Exllama/v2
GitHub
Oobabooga TextGeneration WebUI
All-in-one Gradio UI with streaming
Broadest support for quantized (AWQ, Exl2, GGML, GGUF, GPTQ) and FP16 models
One-click installers available
Regular updates, which can sometimes break compatibility with SillyTavern
GitHub
Correct Way to Connect SillyTavern to Ooba's new OpenAI API
1. Make sure you're on the latest update of Oobabooga's TextGen (as of Nov 14th,
2023).
2. Edit the CMD_FLAGS.txt file, and add the --api flag there. Then restart Ooba's
server.
3. Connect ST to http://localhost:5000/ (by default) without checking the 'Legacy API'
box. You may remove the /v1 postfix from the URL Ooba's console provides you.
You can change the API hosting port with the --api-port 5001 flag, where 5001 is your
custom port.
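For example, CMD_FLAGS.txt could contain just the following line (the custom port flag is optional):
--api --api-port 5001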
TabbyAPI
Lightweight Exllamav2-based API with streaming
Supports Exl2, GPTQ, and FP16 models
Official extension allows loading/unloading models directly from SillyTavern
Not recommended for users with low VRAM (no CPU offloading)
GitHub
Cloud LLM APIs
These LLM APIs are run as cloud services and require no resources on your PC
They are stronger/smarter than most local LLMs
However they all have content filtering of varying degrees, and most require payment
OpenAI (ChatGPT)
Easy to set up and acquire an API key
Requires prepayment for credits and charges per prompt
Very logical. Creative style can be repetitive and predictable
Most of the newer models (gpt-4-turbo, gpt-4o) support multimodality
Website, Setup Instructions
Claude (by Anthropic)
Recommended for users who want their AI chats to have a creative, unique writing
style
Requires prepayment for credits and charges per prompt
The newest models (Claude 3) support multimodality
Requires a specific prompting style and utilization of prefills for reply steering
Website
Mistral (by Mistral AI)
Efficient models of various sizes and for various use cases. You can create an account and API
key on their platform.
From 32k to 128k context sizes for general use, and 32k to 256k context sizes for
coding.
Free Tier with rate limits.
Reasonable moderation, with Mistral's main principles being to stay neutral and to
empower users; more information here.
OpenRouter
WindowAI browser extension allows you to connect to the abovementioned cloud
LLMs with your own API key
Use OpenRouter to pay to use their API keys instead
Useful if you don't want to create individual accounts on each service
WindowAI website and OpenRouter website
DreamGen
Uncensored models tuned for steerable creative writing
Free monthly credits, as well as paid subscription
Models ranging from 7B to 70B
Setup Instructions
AI Horde
SillyTavern can access this API out of the box with no additional settings required
Uses the GPU of individual volunteers (Horde Workers) to process responses for your
chat inputs
At the mercy of the Worker in terms of generation wait times, AI settings, and
available models
Website
Mancer AI
Service that hosts unconstrained models of various families
Uses 'credits' to pay for tokens on various models
Does not log prompts by default, but you can enable it to get credit discounts on
tokens.
Uses an API similar to Oobabooga TextGeneration WebUI , see Mancer docs for details.
Website, Setup Instructions
NovelAI
No content filter
Paid subscription required
Website, Setup Instructions
Connection Profiles
Save Connection Profiles to quickly switch between different APIs, models and formatting
templates. This is useful when you actively use multiple API connections or need to switch
between different configurations without surfing through the menus.
Accessing Connection Profiles
This feature is enabled by default starting from SillyTavern 1.12.6 or later as a built-in
extension, and available in the API Connections menu. If you wish to disable it, open the
Extensions panel, click on "Manage extensions", locate Connection Profiles in the list,
uncheck the "Enabled" checkbox, and then click "Close".
What is Saved
Connection Profiles store the following selections.
Common
API type, model and the server URL
Settings preset
Start Reply With (can be explicitly empty)
Custom Stopping Strings (can be explicitly empty)
Reasoning Formatting
Text Completion APIs
System Prompt and its state
Instruct Mode state and template
Context Template
Tokenizer
Chat Completion APIs
Proxy preset
Managing Connection Profiles
Note
Profiles only save the selection in dropdown fields, without knowing anything
about the underlying settings. This means that you will lose unsaved changes by
switching to a different profile. To prevent this, make sure to update all presets
and templates if you don't want to lose ephemeral changes.
To save a profile, set all the required settings and click the "Create" button. Then
review the settings and provide a name for the profile. The name should be unique.
To view the detailed information about a chosen profile, click on the "Information"
button. Click again to hide the details.
Connection Profile settings are saved into settings.json without altering the
associated profile save file until you press the "Update" button. This means that if you
set up a profile, but then switch to a different one without updating, you will lose all of
your previous changes.
To restore the changed selections from a saved profile, click the "Reload" button.
To delete a profile, click the "Delete" button and confirm the deletion. This action is
irreversible.
Slash Commands
Connection profiles can be managed using the following slash commands.
1. /profile [name] - switch to a profile if the argument is provided, or get the name of
the current profile if not.
2. /profile-create [name] - saves the current settings as a new profile with the
provided name.
3. /profile-list - returns a JSON-serialized array of available profile names.
4. /profile-get [name] - gets the details of the profile with the provided name as a
JSON-serialized object.
5. /profile-update - updates the selected profile with the current settings.
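For example, to snapshot the current settings and switch back to them later (the profile name is just an illustration):
/profile-create Local-Kobold
/profile Local-Kobold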
Self-hosted AI models
This guide is based on the author's personal experience and knowledge and is
not an absolute truth. All statements should be taken with a grain of salt. If you
have any corrections or suggestions, please contact us on Discord or send a PR
to the SillyTavern documentation repository.
Intro
This guide aims to help you get set up using SillyTavern with a local AI running on your PC
(we'll start using the proper terminology from now on and call it an LLM). Read it before
bothering people with tech support questions.
What are the best Large Language Models?
It is impossible to answer this question as there's no standardized scale of "Best". The
community has enough resources and discussions going on Reddit and Discord to form at
least some opinion on what is the preferred / go-to model. Your mileage may vary.
What is the best configuration?
If there were a best or no-brainer setup, would there even be a need for
configuration? The best configuration is the one that works for you. It's a trial-and-error
process.
Hardware requirements and orientation
This is a complex subject, so I'll stick to the essentials and generalize.
There are thousands of free LLMs you can download from the Internet, similar to how
Stable Diffusion has tons of models you can get to generate images.
Running an unmodified LLM requires a monster GPU with a ton of VRAM (GPU
memory). More than you will ever have.
It is possible to reduce VRAM requirements by compressing the model using
quantization techniques, such as GPTQ or AWQ. This makes the model somewhat less
capable, but greatly reduces the VRAM requirements to run it. Suddenly, this allowed
people with gaming GPUs like a 3080 to run a 13B model. Even though it's not as good
as the unquantized model, it's still good.
It gets better: there also exists a model format and quantization called GGUF
(previously GGML) which has become the format of choice for normal people without
monster GPUs. This allows you to use an LLM without a GPU at all. It will only use CPU
and RAM. This is much slower (probably 15 times) than running the LLM on a GPU
using GPTQ/AWQ, especially during the prompt processing, but the model's ability is
just as good. The GGUF creator then optimized GGUF further by adding a
configuration option that allows people with a gaming-grade GPU to offload parts of
the model to the GPU, allowing them to run part of the model at GPU speed (note that
this doesn't reduce RAM requirements, it only improves your generation speed).
There are different sizes of models, named based on the number of parameters they
were trained with. You will see names like 7B, 13B, 30B, 70B, etc. You can think of
these as the brain size of the model. A 13B model will be more capable than the 7B
from the same family of models: they were trained on the same data, but the bigger
brain can retain the knowledge better and think more coherently. Bigger models also
require more VRAM/RAM.
There are several degrees of quantization (8-bit, 5-bit, 4-bit, etc). The lower you go,
the more the model degrades, but the lower the hardware requirements. So even on
bad hardware, you might be able to run a 4-bit version of your desired model. There's
even 3-bit and 2-bit quantization but at this point, you're beating a dead horse.
There are also further quantization subtypes named k_s, k_m, k_l, etc. k_m is better
than k_s but requires more resources.
The context size (how long your conversation can become without the model dropping
parts of it) also affects VRAM/RAM requirements. Thankfully, this is a configurable
setting, allowing you to use a smaller context to reduce VRAM/RAM requirements.
(Note: the context size of Llama2-based models is 4k. Mistral is advertised as 8k, but
it's 4k in practice.)
Sometime in 2023, NVIDIA changed their GPU driver so that if you need more VRAM
than your GPU has, instead of the task crashing, it will begin using regular RAM as a
fallback. This will ruin the writing speed of the LLM, but the model will still work and
give the same quality of output. Thankfully, this behavior can be disabled.
Given all of the above, the hardware requirements and performance vary completely
depending on the family of model, the type of model, the size of the model, the
quantization method, etc.
Model size calculator
You can use Nyx's Model Size Calculator to determine how much RAM/VRAM you need.
Remember, you want to run the largest, least quantized model that can fit in your memory,
i.e. without causing disk swapping.
Downloading an LLM
To get started, you will need to download an LLM. The most common place to find and
download LLMs is on HuggingFace. There are thousands of models available. A good way
to find GGUF models is to check bartowski's account page:
https://huggingface.co/bartowski. If you don't want GGUF, he links the original model page
where you might find other formats for that same model.
On a given model's page, you will find a whole bunch of files.
You might not need all of them! For GGUF, you just need the .gguf model file (usually
4-11GB). If you find multiple large files, they're usually different quantizations of the
same model; you only need to pick one.
For .safetensors files (which can be GPTQ or AWQ or HF quantized or unquantized), if
you see a number sequence in the filename like model-00001-of-00003.safetensors,
then you need all 3 of those .safetensors files + all the other files in the repository
(tokenizer, configs, etc.) to get the full model.
As of January 2024, Mixtral MOE 8x7B is widely considered the state of the art for
local LLMs. If you have the 32GB of RAM to run it, definitely try it. If you have less than
32GB of RAM, then use Kunoichi-DPO-v2-7B, which despite its size is stellar out of the
gate.
Walkthrough for downloading Kunoichi-DPO-v2-7B
We will use the Kunoichi-DPO-v2-7B model for the rest of this guide. It's an excellent
model based on Mistral 7B that only requires 7GB of RAM and punches far above its weight.
Note: Kunoichi uses Alpaca prompting.
Go to https://huggingface.co/brittlewis12/Kunoichi-DPO-v2-7B-GGUF
Click 'Files and versions'. You will see a listing of several files. These are all the same
model but offered in different quantization options. Click the file 'kunoichi-dpo-v2-
7b.Q6_K.gguf', which gives us a 6-bit quantization.
Click the 'download' button. Your download should start.
How to identify the type of model
Good model uploaders like TheBloke give descriptive names. But if they don't:
Filename ends in .gguf: GGUF CPU model (duh)
Filename ends in .safetensors: can be unquantized, or HF quantized, or GPTQ, or AWQ
Filename is pytorch-***.bin: same as above, but this is an older model file format that
allows the model to execute arbitrary Python code when the model is loaded, and is
considered unsafe. You can still use it if you trust the model creator, or are desperate,
but pick .safetensors if you have the option.
config.json exists? Check whether it has a quant_method (see the sketch after this list).
q4 means 4-bit quantization, q5 is 5-bit quantization, etc
You see a number like -16k? That's an increased context size (i.e. how long your
conversation can get before the model forgets the beginning of your chat)! Note that
higher context sizes require more VRAM.
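As a quick illustration of the config.json check mentioned above, here is a small Python sketch. Hugging Face configs for quantized models usually carry a quantization_config block whose quant_method field names the scheme (gptq, awq, etc.), but field names vary between repositories, so treat this as a heuristic rather than a guarantee.

import json

# Heuristic inspection of a downloaded model's config.json.
# "quantization_config" / "quant_method" are common conventions,
# not something every repository follows.
with open("config.json") as f:
    cfg = json.load(f)

quant = cfg.get("quantization_config", {})
method = quant.get("quant_method") or cfg.get("quant_method")
if method:
    print(f"Quantized model, method: {method}, bits: {quant.get('bits', 'unknown')}")
else:
    print("No quant_method found - likely unquantized (or a format like GGUF).")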
Installing an LLM server: Oobabooga or
KoboldAI
With the LLM now on your PC, we need to download a tool that will act as a middle-man
between SillyTavern and the model: it will load the model, and expose its functionality as a
local HTTP web API that SillyTavern can talk to, the same way that SillyTavern talks with
paid webservices like OpenAI GPT or Claude. The tool you use should be either KoboldAI
or Oobabooga (or another compatible tool).
This guide covers both options; you only need one.
Previous Next
Connection Profiles Chat Completions
Chat Completions
OpenAI
API key
How to get:
1. Go to OpenAI and sign in.
2. Use "View API keys" option to create a new API key.
Important!
Lost API keys can't be restored! Make sure to keep it safe!
Claude
If you have access to Anthropic's Claude API:
Select 'Claude' for 'Chat Completion Source'.
Input your API key.
Click connect.
Mistral AI
Mistral AI is a team developing both open and proprietary models with high scientific
standards and a focus on openness. You can run their models locally or through their API
service, La Plateforme.
API
The first step is to create an account on La Plateforme.
Once that's done, you can choose a plan and set up your payment information or opt
for the Free Tier.
Next, you can create your API key. You may need to wait a couple of minutes before
the key becomes valid!
Important!
Lost API keys can't be restored! You would have to create a new one. Make sure to keep it
safe!
Custom OpenAI-compatible endpoint
It is important to note that we do not provide support for possible issues that
you may have! We do not guarantee compatibility with every possible API
endpoint!
If you intend to use this feature to use a local endpoint, like TabbyAPI,
Oobabooga, Aphrodite, or any like those, you might want to check out the built-
in compatibility for those instead. The custom endpoint feature is mainly
intended for use with other services and programs that expose an OpenAI-
compatible API Chat Completion endpoint.
Most Text Completion APIs support far more customization options than OpenAI's
standard allows. These options, such as the Min-P sampler, can greatly improve the
quality of generations and may be worthwhile for SillyTavern users to check out.
You can configure an alternative endpoint for the Chat Completions backend. This custom
endpoint can connect to any server that supports the generic OpenAI API schema.
Examples of compatible backends include:
LM Studio
LiteLLM
LocalAI
Connecting
To access this feature:
1. Switch to the 'Chat Completion' API type
2. Select 'Custom (OpenAI-compatible)' for 'Chat Completion Source'
Enter the custom endpoint URL and an API key if required. For example, TabbyAPI requires
an API key for authentication.
Hint: If you experience connection issues, try adding /v1 to the end of the endpoint
URL. Do NOT add the /chat/completions suffix.
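If you want to sanity-check an endpoint outside of SillyTavern first, you can call it directly. This is a minimal Python sketch against a generic OpenAI-compatible server; the base URL, API key, and model name are placeholders for whatever your backend actually exposes.

import requests

BASE_URL = "http://localhost:5000/v1"   # hypothetical local endpoint
API_KEY = "your-api-key"                # some backends (e.g. TabbyAPI) require one

response = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "local-model",  # use an ID from GET /v1/models if the server lists any
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
        "max_tokens": 64,
    },
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])

If this returns a completion, SillyTavern should be able to connect using the same base URL (without the /chat/completions part, as noted above).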
Selecting a Model
If the custom API implements the /v1/models endpoint to provide a list of available
models, you can choose from a dropdown list. Otherwise, use the text field to manually
input a model ID.
Check 'Bypass API status check' to prevent SillyTavern from alerting you about a non-
functioning API endpoint. Enable this option if your API endpoint works properly but
SillyTavern continues to display warnings.
Click "Test Message" to verify connectivity by sending a simple prompt to the model.
Prompt Post-Processing
Some endpoints may impose specific restrictions on the format of incoming prompts, such
as requiring only one system message or strictly alternating roles.
SillyTavern provides built-in prompt converters to help meet these requirements (from
least to most restrictive):
1. Merge consecutive messages from the same role
2. Merge roles and allow only one system message (semi-strict)
3. Merge roles, allow only one optional system message, and require a user role to be
first (strict)
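To illustrate the least restrictive of these converters, here is a small Python sketch that merges consecutive messages from the same role. It approximates the idea rather than reproducing SillyTavern's exact implementation (separators and edge cases may differ).

def merge_consecutive_roles(messages):
    """Merge adjacent messages that share a role into a single message."""
    merged = []
    for msg in messages:
        if merged and merged[-1]["role"] == msg["role"]:
            merged[-1]["content"] += "\n\n" + msg["content"]
        else:
            merged.append({"role": msg["role"], "content": msg["content"]})
    return merged

chat = [
    {"role": "system", "content": "You are a narrator."},
    {"role": "system", "content": "Keep replies short."},
    {"role": "user", "content": "Hello!"},
]
print(merge_consecutive_roles(chat))  # the two system messages become one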
OpenRouter
Don't have access to OpenAI / Claude APIs due to geolocking or waitlists? Use
OpenRouter.
OpenRouter works by letting you use keys they own to access models like GPT-4 and
Claude 2, all in one service with a shared credit pool.
It has a free trial (about $1) and paid access afterward. No subscription or monthly bill -
you pay for what you actually use. Some models have free access with a limited context
size.
OpenRouter Pricing Details
Create an OpenRouter account: openrouter.ai
OpenRouter-ConnectionPanel
From top to bottom (see image above):
1. Select 'Chat Completion' API
2. Select OpenRouter source
3. Click "Authorize" to get a key using OAuth flow. Alternatively, generate an API key here
and paste it into the box.
4. Click "Connect" and select a model
5. (Optional) Use the "Test Message" button to verify your connection
WindowAI
WindowAI is a browser extension by the makers of OpenRouter that allows control of your
OpenRouter connection for any enabled site or web app.
You can also use your own Claude and OpenAI API keys there.
Previous Next
Chat Completions AI Horde
AI Horde
Disclaimer
AI Horde is a crowdsourced, distributed GPU cluster run entirely by volunteers.
By default, your inputs are sent anonymously and responses cannot be seen by the
person running the Horde Worker.
However, since it is an open-source program, malicious workers could modify the
code to:
log your activity (input prompts, AI responses).
produce bad or offensive responses.
When using Horde, never send any personal information such as names, email
addresses, etc.
Switching on the "Trusted Workers Only" checkbox will limit the selection of available
workers to only those who have been hosting on Horde for a while and are generally
considered trusted. But even trusted workers could still be seeing your prompts, for
example by hosting with modified, unaccounted-for software.
To help reduce this problem, SillyTavern has the following feature built in:
When a chat response is generated by a Horde Worker, SillyTavern records the
Worker's ID and what model they were using.
This information can be seen by hovering your mouse cursor over the chat item (see
image below).
If you believe you received a malicious response, you can pass this information to the
Horde admin on the AI Horde Discord for review and possible disciplinary action
against that Worker.
Horde Worker Info Popup
Setup
SillyTavern is able to connect with Horde out of the box with no additional setup
required.
Select 'AI Horde' from the API Dropdown Selector in the ST API Panel.
Select one or more Models ('AI brains' for the characters) from the Model Selector at
the bottom of the panel.
Select a character and begin chatting.
Tips
Register an account on the Horde website then add your Horde key into the
SillyTavern Horde API Key box.
Set up a Horde Worker to provide your GPU for others.
Letting others use your GPU earns you 'Kudos', a kind of Horde-only currency.
The more kudos your account has, the faster you will get chat responses from
other Horde Workers.
Kudos can also be used to create AI images on Stable Horde.
SillyTavern supports Stable Horde image generation out of the box.
If your GPU isn't powerful enough to run an AI, or you don't have a computer, you can
still participate in the Horde community to earn Kudos in various ways.
Previous Next
OpenRouter DreamGen
DreamGen
DreamGen is an app and an API for AI-powered creative writing. They have a free tier, as
well as a paid subscription that allows unlimited monthly access to their high-quality in-
house text generation models made specifically for the purpose of steerable AI-assisted
writing. Create an account to get started: https://dreamgen.com/.
The (free) credits reset at the start of each calendar month. See pricing to see the credit
cost for each model and usage to see your remaining credits.
Connecting to DreamGen
Get API Key
Go to the DreamGen API keys page and click the "New API Key" button. Make sure the API
Key is copied into your clipboard.
Models
DreamGen offers opus-v1-sm, opus-v1-lg, and opus-v1-xl. The larger the model, the
better it will be at following instructions and writing good stories.
Formatting Settings
The DreamGen models expect a specific input format, which is documented here.
SillyTavern comes with built-in presets made for DreamGen. Make sure to use these
settings as your baseline. These settings try to stick to the DreamGen format as closely as
possible but due to the irregular formatting of character cards, it is not always perfect.
1. Go to the "Advanced Formatting" page.
2. Under "Context Template" pick DreamGen Role-Play V1 Llama3 / ChatML depending
on the model (*).
3. Enable "Instruct Mode".
4. Under "Instruct Mode Presets" pick DreamGen Role-Play V1 Llama3 / ChatML .
## Plot description:
The librarian sets up a blind date between Lucifer and Mia. Lucifer immediately falls in
love with Mia, but Mia needs more space and time to make up her mind.
## Style description:
The narrative is vivid and intensely sensual, with a strong emphasis on raw emotion
conveyed from a first-person point of view. The language is explicit, evoking intense
imagery and indulging in the erotic exploration of the characters' passionate encounters.
## Characters
### Lucifer
Lucifer, the red-skinned, horned demon, is the embodiment of fallen grace. Wrestling with
his notorious heritage and a newfound desire for love, his complex nature ferments with
vulnerability. His character oscillates between hedonism and self-reflection, hungering
for acceptance by Mia and the librarian. Embracing his mortal love, he yearns for
transformation, embodying the notion that even the damned may seek solace in love's
redemption.
### Mia
Note that the prompt should be a description of the story, rather than instructions or
directives on how the story should be written. Avoid using phrases like:
"Write the story as if..."
"Make sure to..."
etc.
See more examples of what the plot, style and character descriptions should look like.
The default "DreamGen Role-Play V1" template substitutes the different sections as
follows:
## Plot description: will consist of {{scenario}} and {{wiBefore}} .
## Style description: is not provided; you should either add it to the system prompt
under Advanced Settings, or to the character cards, at the end of {{scenario}} . This
section is useful to influence the narrative style (first, second, third person), the tense
(past, present), the level of detail and verbosity, etc.
## Characters: will have a {{char}} character with description consisting of
{{description}} and {{personality}} and a {{user}} character with description
consisting of {{persona}} .
Message Examples and Initial Message
The DreamGen models are very responsive to the context -- they will largely stick to the
writing style (and facts) presented in the previous conversation turns. This makes the
message examples and the initial message very important.
Formatting Message Examples
The {{mesExamples}} are appended at the end of the system prompt. To take full
advantage of the instruct formatting, make sure that your examples are separated with
the <START> separator. For example:
<START>
{{user}}: (user's turn)
{{char}}: (char's turn)
<START>
{{user}}: (user's turn)
{{char}}: (char's turn)
Examples
Here are a couple of example cards, adapted for DreamGen, that take into account the
unique prompting. These cards also leverage the {{mesExamples}} as described above.
Seraphina
This is an edit of the popular Seraphina card that's built into SillyTavern by default.
Seraphina
Lara Lightland
This is an edit of the Lara Lightland card by Deffcolony.
Lara Lightland
FAQ
What sampler settings should I use?
You can start with these:
Temperature: 1.0
MinP: 0.05
Presence Penalty: 0.1
Frequency Penalty: 0.1
How can I make the responses longer or shorter?
You have several options:
Change or add the ## Style description: in the system prompt or model card. You
can try adding something like "Sentences are generally long, and the narrative
describes the setting in painstaking detail."
Change the Min Length in the Completion Settings.
Add Last Output Sequence similar to the following in the Advanced Formatting
settings under Instruct Mode:
Here's an example of the Last Output Sequence that might help make the model respond
in a more verbose way, using the Llama 3 template:
<|eot_id|>
<|start_header_id|>user<|end_header_id|>
Length: 400 words
Plot: {{char}} replies to {{user}} in detailed and elaborate way.<|eot_id|>
<|start_header_id|>writer character: {{char}}<|end_header_id|>
You can change the text within to something more suitable for your scenario or context.
How can I stop the model from repeating itself?
If the model repeats what's in the context, you can try increasing "Repetition Penalty" in
the Completion Settings or you can try rephrasing the part of the context that's getting
repeated. If the model repeats itself within one message, you can try increasing "Presence
Penalty" or "Frequency Penalty".
How can I steer the story?
If you want to direct the characters to do something, or to steer the plot in a certain
direction, you can use the user role (that is, the <|im_start|>user preamble).
At this point, this functionality is not neatly integrated into SillyTavern natively, but you can
use the Last Output Sequence as described above to insert the user (instruction) turn.
See examples of what the instructions should look like here.
KoboldCpp
KoboldCpp is a self-contained API for GGML and GGUF models.
This VRAM Calculator by Nyx will tell you approximately how much RAM/VRAM your model
requires.
Nvidia GPU Quickstart
This guide assumes you're using Windows.
Download the latest release: https://github.com/LostRuins/koboldcpp/releases
Launch KoboldCpp. You may see a pop-up from Microsoft Defender; click Run Anyway.
As of version 1.58, KoboldCpp should look like this:
KoboldCpp 1.58
Under the Quick Launch tab, select the model and your preferred Context Size.
Select Use CuBLAS and make sure the yellow text next to GPU ID matches your GPU.
Do not tick Low VRAM, even if you have low VRAM.
Unless you have an Nvidia 10-series or older GPU, untick Use QuantMatMul (mmq).
GPU Layers should have been populated when you loaded your model. Leave it there
for now.
Under the Hardware tab, tick High Priority.
Click Save so you don't have to configure KoboldCpp on every launch.
Click Launch and wait for the model to load.
You should see something like this:
Load Model OK: True
Embedded Kobold Lite loaded.
Starting Kobold API on port 5001 at http://localhost:5001/api/
Starting OpenAI Compatible API on port 5001 at http://localhost:5001/v1/
======
Please connect to custom endpoint at http://localhost:5001
You can now connect to KoboldCpp within SillyTavern with http://localhost:5001 as the
API URL and start chatting.
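If you'd like to verify the server outside of SillyTavern first, you can poke the API from a short Python script. The sketch below assumes the default port 5001 shown above and the KoboldAI-style endpoints KoboldCpp exposes; exact response fields can differ between versions, so treat it as a quick check rather than a reference.

import requests

base = "http://localhost:5001"

# Which model is loaded?
print("Loaded model:", requests.get(f"{base}/api/v1/model", timeout=10).json().get("result"))

# Generate a short continuation.
gen = requests.post(
    f"{base}/api/v1/generate",
    json={"prompt": "Once upon a time,", "max_length": 40},
    timeout=120,
).json()
print("Sample output:", gen["results"][0]["text"])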
Congratulations! You're done!
Kind of.
GPU Layers
KoboldCpp is working, but you can improve performance by ensuring that as many layers
as possible are offloaded to the GPU. You should see something like this in the terminal:
llm_load_tensors: offloading 9 repeating layers to GPU
llm_load_tensors: offloaded 9/33 layers to GPU
llm_load_tensors: CPU buffer size = 25215.88 MiB
llm_load_tensors: CUDA0 buffer size = 7043.34 MiB
.............................................................................................
llama_kv_cache_init: CUDA_Host KV buffer size = 1479.19 MiB
llama_kv_cache_init: CUDA0 KV buffer size = 578.81 MiB
Don't be afraid of numbers; this part is easier than it looks. CPU buffer size refers to
how much system RAM is being used. Ignore that. CUDA0 buffer size refers to how much
GPU VRAM is being used. CUDA_Host KV buffer size and CUDA0 KV buffer size refer to
how much GPU VRAM is being dedicated to your model's context. In this case, KoboldCpp
is using about 9 GB of VRAM.
I have 12 GB of VRAM, and only 2 GB of VRAM is being used for context, so I have about
10 GB of VRAM left over to load the model. Because 9 layers used about 7 GB of VRAM,
and 7000 / 9 = 777.77, we can assume each layer uses approximately 777.77 MiB of
VRAM. 10,000 MiB / 777.77 = 12.8, so I'll round down and load 12 layers with this model
from now on.
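The same arithmetic as a tiny Python script, if you'd rather not do it by hand. The numbers are the rounded figures from the walkthrough above; substitute your own from KoboldCpp's terminal output.

# Layer-count arithmetic from the walkthrough above.
vram_left_mib = 10_000      # 12 GB of VRAM minus ~2 GB used for context
layers_offloaded = 9        # layers offloaded on the first run
vram_used_mib = 7_000       # approximate CUDA0 buffer size for those layers

per_layer_mib = vram_used_mib / layers_offloaded
print(f"~{per_layer_mib:.2f} MiB per layer -> offload {int(vram_left_mib // per_layer_mib)} layers")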
Now do your own math using the model, context size, and VRAM for your system, and
restart KoboldCpp:
If you're smart, you clicked Save before, and now you can load your previous
configuration with Load. Otherwise, select the same settings you chose before.
Change the GPU Layers to your new, VRAM-optimized number (12 layers in my case).
Click Save to save your updated configuration.
You should now see something like this:
llm_load_tensors: offloading 12 repeating layers to GPU
llm_load_tensors: offloaded 12/33 layers to GPU
llm_load_tensors: CPU buffer size = 25215.88 MiB
llm_load_tensors: CUDA0 buffer size = 9391.12 MiB
.............................................................................................
llama_kv_cache_init: CUDA_Host KV buffer size = 1286.25 MiB
llama_kv_cache_init: CUDA0 KV buffer size = 771.75 MiB
KoboldCpp is using about 11.5 GB of my 12 GB VRAM. This should perform a lot better than
the settings generated automatically by KoboldCpp.
Congratulations! You're (actually) done!
For a more in-depth look at KoboldCpp settings, check out Kalmomaze's Simple Llama +
SillyTavern Setup Guide.
Previous Next
DreamGen Mancer
Mancer
Mancer is a large language model inferencing service that lets you run whatever prompts
you want and doesn't censor responses. Most of the models require a preloaded balance
to start chatting, but there is a free model as of writing (2024-11-28).
Models
Pricing
How to Get Started
1. Sign up for an account at mancer.tech.
2. Click on Dashboard and copy your API Key.
3. In SillyTavern, select the Text Completion API, and then select Mancer under API Type.
4. Enter your Mancer API Key and click Connect.
API Key
You should now be able to chat with any Mancer model of your choice.
Anonymous Logging
If you don't mind your chats potentially being used to train models, improve Mancer's
service, publish datasets, or whatever else they may decide to do with it, you can opt in
to anonymous logging for a 25% token discount on select models. Simply go to your
Mancer dashboard and tick Enable Anon. Logging.
Support
Still need help? Head over to the #mancer support channel on the SillyTavern Discord.
Previous Next
KoboldCpp NovelAI
NovelAI
NovelAI is a paid subscription service that allows unlimited monthly access to their high-
quality in-house text generation, image generation, and text-to-speech models. Register
an account here to get started: https://novelai.net/
You will get only 50 generations for free to evaluate the model. When the "Not eligible for
this model" error appears, this means that you've exhausted your trial period and need to
subscribe to a paid plan.
API Key
To get your NovelAI API key, follow these steps:
1. Select the gear icon at the top of the left sidebar.
Left Sidebar
2. Select "Account" under "User Settings".
User Settings
3. Select "Get Persistent API Token".
Account
4. Select the copy icon to copy your NovelAI API token to the clipboard.
Persistent API Token
Models
If you have Opus, then Erato is the model to use. If you don't have Opus, then Kayra is the
best available model.
Clio has a larger context size on Tablet/scroll tiers, but the strength of Kayra usually
makes up for that difference.
Settings
The files with the settings are here (SillyTavern/data/<user-handle>/NovelAI Settings).
You can also manually add your own settings files.
Response Length
How much text you want to generate per message. Note that NovelAI has a limit of 150
tokens per response.
Context Size
How many tokens of the chat are kept in the context at any given time. The maximum
context size you can use depends on the model and your subscription tier:
Kayra (Tablet) - 3072 tokens
Kayra (Scroll) - 6144 tokens
Erato (Opus exclusive), Kayra (Opus) and Clio (all tiers) - 8192 tokens
Preamble
Text that is inserted right above the chat to modify the writing style. The recommended
format is a list of short tags, like "[ Style: chat, detailed, sensory ]".
Preset Descriptions
This is, according to NovelAI, what the default presets are good for.
Erato
Golden Arrow - A good all-rounder.
Wilder - Higher variety of word choice, more differences between rerolls, more prone
to mistakes.
Zany Scribe - Avoids mistakes and repetition. Prioritizes more complex words.
Dragonfruit - Varied and complex language with little repetition. More frequent
mistakes and contradictions.
Shosetsu - Designed for writing in Japanese. Works fine for English too.
Kayra
Asper - For creative writing. Expect unexpected twists.
Carefree - A good all-rounder.
Fresh-Coffee - Keeps things on track. Handles instruct well.
Pro_Writer - Mimics the pacing and feel of best-selling fiction.
Stelenes - More likely to choose reasonable alternatives. Variety on retries.
Tea_Time - It gets good when it gets going.
Writers-Daemon - Extremely imaginative, sometimes too much.
Clio
Edgewise - Handles a variety of generation styles well
Fresh Coffee - Keeps things on track.
Long-Press - Intended for creative prose.
Talker Chat - Designed for chat style generation.
Vingt-Un - A good all-around default with a bent towards prose.
Tips and FAQs for using NovelAI with
SillyTavern
There are a lot of common problems and questions that come up when switching to
NovelAI from another ST backend API. The difference comes down to what the models are
trained for. Most likely, you've used an OpenAI or Anthropic model (or a local model made
to resemble those), which is built around following the user's instructions. NovelAI's
models are built purely around text completion: instead of taking your input as a message
and formulating a response, NAI's models attempt to continue the incoming prompt. Due
to this difference, a lot of tips and common knowledge that work for other APIs won't work
for NAI.
Tweaking settings for NovelAI
Under Advanced Formatting (the A icon):
Set "Context Template" to "NovelAI"
Set "Tokenizer" to "Best match"
Check "Always add character's name to prompt"
Check "Collapse Consecutive Newlines"
Uncheck the "Enabled" box under "Instruct Mode"
Under User Settings (the person with a gear)
Turn on "Swipes" (Not NAI specific, but it's so useful you should just do it)
Building/Adapting character cards for NovelAI
To optimize your character cards for NovelAI, there are a couple of recommended
methods for writing your character's description: prose, and attributes.
Prose is so simple it doesn't feel like it should work: "Sylpheed is a young-looking but
actually 900 year old nymph. She's short and petite, with long white hair that fades into a
green gradient in her braided side ponytail, and emerald green eyes shaped like crosses.
[...]" No, really, that's it. Just write out, in normal sentences, what the character looks like,
acts like, etc., and the AI will pick up on it.
If you don't trust your writing abilities or want a more structured way to go about it, you
can use the attributes method, which is present in the NovelAI training data. This works as
a simple list of character traits of different types. Here's a list of possible attributes that
have been tested to be effective with NovelAI's models:
Name:
AKA:
Type: character
Setting:
Nationality:
Species:
Gender:
Age:
Height:
Weight:
Appearance:
Clothing:
Attire:
Personality:
Mind:
Mental:
Likes:
Dislikes:
Sexuality:
Speech:
Voice:
Abilities:
Skills:
Quote:
Affiliation:
Occupation:
Reputation:
Secret:
Family:
Allies:
Enemies:
Background:
Description:
Attributes:
"Type: character" is there to tell the AI that this is describing a character (as opposed to a
location, object, or other type of thing). The rest of the attributes are optional, and some
are redundant (for example, Personality, Mind, and Mental all mean basically the same
thing), but these have been tested and work well with NovelAI's models. Fill in whichever
ones are relevant to your character. The attributes should be written in lower case and
separated by commas, no need for quotes around the words. For example:
Skills: lockpicking, stealth, running away very fast
These methods are recommended because they're present in NovelAI's training data, so
they specifically work well with the model.
Example cards
Here are a couple of example cards, made for NovelAI, that show off different ways of
creating cards specifically for NovelAI. The first card, Valka, uses the attributes method
for the character description, while Eris, the second card, uses prose descriptions, along
with a large amount of example dialogue.
Valka Eris
What not to do
Most of the existing character card formats are a poor fit for NovelAI. They'll give you
some results, even some good ones, but they have a lot of problems. W++ is one of the
biggest offenders: it doesn't resemble anything that NovelAI's models were trained on,
and its constant use of brackets/braces/quotes eats up a ton of tokens, bloating the
size of the cards with no real benefit.
Of the existing formats that aren't baked into NovelAI, AliChat is the one most likely to
work, as it relies on using example messages to get across both information about the
character and their voice at the same time, in the format of the type of message that you
want the AI to output.
Most other formats are essentially ways of listing out a character's traits, so they can
be converted to the attributes method fairly straightforwardly.
Which module should I use?
Probably No Module. Prose Augmenter is useful if you want a character to speak in a more
flowery manner, but be careful not to overdo it. Text Adventure might be useful for a text
adventure-style card/story.
Not the instruct module?
You can invoke the Instruct module when you need it. Create a newline in your message,
and put your instructions in curly brackets like this: { CharName is offended by that
seemingly innocuous statement } (the spaces are required between the text and the
brackets). Doing that will automatically switch the AI into the Instruct module for a short
time. You don't want to use the Instruct module all the time because it tends to produce
less creative output than the other modules; invoke it only when you need to guide the AI
strongly in a particular direction.
Why do my responses keep getting cut off?
NovelAI limits response length to ~150 tokens total, even if you set the slider higher than
that. When it reaches the number of tokens in the slider or 150, whichever is lower, it will
generate up to 20 more tokens while looking for a stop sequence or the end of a sentence.
This means there's an effective limit of 170 tokens for a response, at which point it will
simply stop, causing the reply to be cut off.
If it cuts off, you can select the continue option (in the three-line menu to the left of the
text box) to get the character to continue their response.
If you regularly want responses longer than 170 tokens, you can work around the limit like
this:
Keep the response length at 150 tokens.
Under Advanced Formatting, enable Auto-continue.
Set the "Target length" to the desired length.
This will chain together multiple generations to give you longer messages but doesn't
guarantee that the reply will be 100% of the desired length if the model decides to stop.
How do I get the bot to write longer responses?
Read the above about responses getting cut off. That will help to make sure that
responses aren't cut off prematurely by running into the limit of generation length.
If your responses aren't getting cut off but are still too short, it's likely you're dealing with
"garbage in, garbage out" - if you give the model bad examples, it will produce bad
output. If the character card has no example dialogue or short example dialogue and the
messages you send to the bot are short, the model will pick up on that, take it as the
accepted way to do things and the responses will be short. So, write longer example
dialogue and longer messages to the bot. (You can always use NovelAI to write some
example dialogue for you rather than doing it yourself.)
How do I get the bot to stop talking for me?
Check that the character card's first message and example dialogue don't include the
character taking actions for you - if they do, rewrite them so the character doesn't act
for you
Make sure that "Always add character's name to prompt" is checked
Make sure that you're currently using the same user persona as the rest of the chat. If
you changed user personas and didn't change back (or don't have a persona locked
to that chat), the usual rules to stop generating for you will fail
Add ["\n{{user}}:"] to Custom Stopping Strings (shouldn't be necessary, but sometimes
helps)
Why isn't my character responding?
A lot of things can cause this, so we need to look in a few places:
Make sure that "Always add character's name to prompt" is checked in Advanced
Formatting
Check to make sure there aren't any errors coming from the API. While you can use
SillyTavern with the NAI free trial, once it runs out, you'll just get errors
Check what you have in "Custom Stopping Strings" - if those are being generated at
the start of the response, it might be cut off prematurely
How should I use the Author's Note?
In general, you probably shouldn't. It's inserted very close to the end of the context, and
with NAI's models, it frequently overpowers everything else in the context. It's mostly an
artifact from older, weaker models where it was more necessary.
How do I do a scene break/time jump?
Put the following as a system message or on newlines at the start of your next message:
***
[ 2 days later ]
Then put the rest of your message on the next line. The bracketed text can be a time
jump, a new location, or anything else. The "***" (hilariously named a "dinkus") tells the AI
that the scene has changed, and the bracketed text gives that more context.
The AI keeps repeating specific words/phrases, what do I
do?
As mentioned above, you can push the repetition penalty slider up a bit more, though
pushing it too far can make the output incoherent. To more thoroughly fix the problem, go
back through the context, especially recent messages, and delete the repeated
word/phrase. Removing it from the context gives the AI less reason to start saying it in the
first place.
Previous Next
Mancer Scale
Scale
Scale is an easy way to access GPT-4 and other LLMs through deployed "apps" that act
like API endpoints.
Currently, Scale doesn't support token streaming or configuring parameters like
temperature through SillyTavern's UI.
Scale API is not free, but offers a $5 trial if you link a credit card.
Quick Start
Create a Scale Spellbook account at https://spellbook.scale.com (if your country is
not supported, use a VPN)
Create an "App" with any name and description
Create a "Variant", which sets the parameters (system prompt, model, temperature,
response token limit, etc)
Select a proper language model to be deployed (GPT-4 is recommended)
Replace the contents of the "User" section of the prompt with the following:
Previous Next
NovelAI TabbyAPI
TabbyAPI
A FastAPI-based application for generating text with an LLM using the ExLlamaV2
backend, with support for Exl2, GPTQ, and FP16 models.
GitHub
Quickstart
1. Follow the installation instructions on the official TabbyAPI GitHub.
2. Create your config.yml to set your model path, default model, sequence length, etc.
You can ignore most (if not all) of these settings if you want.
3. Launch TabbyAPI. If it worked, you should see something like this:
Prompts
When you send a message to your AI, the text you write is combined with other text to
form a single request that's sent to the AI. This combined text is called a "prompt" or
sometimes the "request" or "context."
The prompt can include a variety of different types of text, including:
Main instructions to the AI about how to generate a response
Definitions of the roles that the AI should take on
Definitions of the role that you are taking on
Information about the "world" that the AI is interacting with
Relevant documents or information from Data Bank
Summaries of the past conversation
Results of web searches or other external data sources
Previous messages in the conversation
Your message to the AI
Final instructions for the AI about how to generate a response
This can be a lot to manage! To help you understand how to structure and modify the
request that's sent to the AI, SillyTavern identifies different elements that you might want
to include in your prompt. You can then structure your prompt to include the things that
make sense for the way you want to interact with the AI.
Many of these elements are explained in the sections where you will change them. For
example, to describe the role that you would like the AI to take on, you could use the
Description field in Character Design.
Viewing the Prompt
Reading the final prompt that's sent to the AI is very helpful for understanding what the AI
was told, and why it generated the response that it did. You can view the prompt in
several ways:
Using the Prompt Itemization icon on the reply message from the AI
Using the Prompt Inspector extension
Checking the logs in the terminal window that you're running SillyTavern in
Checking the console in your browser's developer tools
Changing how the Prompt is Built
Presenting all the parts of your prompt to the AI in the right way is crucial for getting the
best responses. You can control how the prompt is built.
Use the Advanced Formatting panel to customize prompt construction for Text
Completion APIs.
The System Prompt is a part of the Story String and usually the first part of the
prompt that the model receives.
Write {{char}}'s next reply in a fictional chat between {{char}} and {{user}}.
The {{char}} and {{user}} placeholders are replaced with the names of the character and
persona that you've defined in the conversation.
You can use any of the supported {{macro}} tags in the Main Prompt to include information
that might vary between conversations or changes as the conversation progresses.
Adjusting the Main Prompt
The default main prompt helps the model understand what it's expected to do with the
character and persona information that follows, how to interpret the past conversation,
and what kind of response to generate. It's a flexible general-purpose prompt that works
well for many situations, because it establishes that the AI is writing as a character in a
conversation with your persona.
However, you can adjust the main prompt to better suit your needs. Here are some
common reasons to adjust the main prompt:
Provide additional instructions: for example, you want the AI to explain its reasoning,
follow specific rules, or avoid certain topics
Clarify the role of the AI: for example, you want the AI to act as a narrator, a
storyteller, or a guide
Change the context of the conversation: for example, you want the AI to respond as
if it were an AI assistant, text adventure game, or a writing partner
Try things out and see what works best for you
All the examples in this guide have worked well for other users, but the prompt
that works for your needs and the model you're using might be different.
Experiment with different instructions and prompting styles to see what works
best for you. If you're not sure what to try, you can always ask for help in the
SillyTavern Discord.
Giving the AI additional instructions in the Main Prompt can help it understand what you
want from the conversation.
Markdown is enabled. Use it to format your response. Enclose code snippets in triple
backticks.
Write character dialogue in quotation marks. Write {{char}}'s thoughts in
parentheses.
You are an anime roleplay generation model for users aged 13 to 17. You always
generate fun, age-appropriate responses.
Answer truthfully and write out your thinking step by step to be sure you get the right
answer.
The AI will more easily follow instructions about what it should do than what it should not
do. For example, if you want the AI to avoid writing in a certain way, it's better to tell it how
you want it to write instead. And while "Do not decide what {{user}} says or does" is
commonly included in prompts to prevent the AI from controlling your persona, some users
find "Write {{char}}'s responses in a way that respects {{user}}'s autonomy" is more
effective.
There is often a better place than the Main Prompt to include information about the user
or characters, modify a character's writing and speaking style, or give other specific
instructions. The Main Prompt is best used for general instructions about the conversation
as a whole, or about a type of conversation that you want to have.
Effect of Message History
When adjusting the main prompt to improve the AI's responses, consider that the AI picks
up a lot from the message history. The history is its memory of past events, character
interactions and relationships, and its style guide for word choice and writing style.
Use this to your advantage by also providing example messages showing how you want
the AI to respond. Showing what you want is often easier than trying to explain it!
When your conversation already has history, changing the main prompt has a limited
effect on the AI's responses. In terms of events and relationships, the AI assumes that the
main prompt occurred in the distant past, and the message history updates it. In terms of
writing style and word choice, the AI assumes that all the messages in history were
generated according to the rules in the current main prompt, and that it should continue to
generate messages in the same way. Some suggestions for dealing with this are:
insert current instructions close to or after the end of message history, for example by
using an Author's Note
test your changes to the main prompt by starting a new conversation
edit the message history to remove or correct examples of unwanted behaviour
use the Post-History Instructions to provide final instructions to the AI
You may not want the AI to think of itself as role-playing at all. Instead of removing the
idea of a character, you can remove the idea of an AI:
You are {{char}}, a helpful assistant. You provide useful information and help {{user}}
with their questions.
AI as Narrator or Storyteller
What if you want the AI to act as a narrator, describing events from an omniscient
perspective, inventing its own characters and settings?
One approach is to create a named character for the AI to use as a narrator. This
character could be called "Narrator" or "AI", suggesting that the AI is a general-purpose
storyteller, or it could be named after a specific scenario or setting, giving the AI the task
of narrating a story in that setting. The details of the setting can then be defined in the
Character or in World Info.
You will need to adjust the default main prompt to reflect the AI's role. For a general-
purpose narrator, you might use:
You are {{char}}, a skilled and versatile storyteller. Narrate the story.
You are the narrator of a fantasy scenario. Play as the characters that visit {{char}}.
It helps to clarify the role of the user in the conversation. Are your messages part of the
story, or are they instructions to the narrator about what your character does or says? An
example that includes the user in the story:
The story should progress by responding to the actions and dialogue of {{user}}.
Narrate the story in third person.
Enter Adventure Mode. Narrate the story based on {{user}}'s dialogue and actions
after ">". Describe the surroundings in vivid detail. Be detailed, creative, verbose,
and proactive. Move the story forward by introducing fantasy elements and
interesting characters.
Defining the role of the user not only helps the AI understand how to respond to your
messages, but also to what extent it is allowed to control your persona. This avoids
situations where the AI makes decisions for your persona that you would rather make
yourself.
Post-History Instructions
Post-History Instructions are additional instructions sent to the AI after the main prompt
and the user message. They can be used to provide additional context or instructions to
the AI based on the message history.
Since the Post-History Instructions are sent after the user message, they are the final
instructions that the AI receives before generating a response. The AI usually gives them a
higher priority than the main prompt, and they can override the main prompt's
instructions.
Post-History Instructions cannot be defined globally. You could achieve the same
effect with an Author's Note.
To use per-character Post-History Instructions, add them to the character's Post-
History Instructions and enable both Prefer Char. Instructions and Allow Post-History
Instructions.
The Post-History Instructions are added as an invisible user role injection that
precedes the last line of the prompt (usually containing a response message
"header").
Previous Next
TabbyAPI Advanced Formatting
Advanced Formatting
The settings provided in this section allow for more control over the prompt-building
strategy, primarily for Text Completion APIs.
Most of the settings in this panel do not apply to Chat Completions APIs as they are
governed by the prompt manager system instead.
System Prompt
Context Template
Tokenizer
Custom Stopping Strings
Backend-defined templates
System Prompt
Applies to: Text Completion APIs
Not applicable to Chat Completion APIs, as they use a different prompt builder.
The System Prompt defines the general instructions for the model to follow. It sets the
tone and context for the conversation. For example, it tells the model to act as an AI
assistant, a writing partner, or a fictional character.
The System Prompt is a part of the Story String and usually the first part of the prompt
that the model receives.
See the prompting guide to learn more about the System Prompt.
Context Template
Applies to: Text Completion APIs
For equivalent settings in Chat Completion APIs, use Prompt Manager.
Usually, AI models require you to provide the character data to them in some specific way.
SillyTavern includes a list of pre-made conversion rules for different models, but you may
customize them however you like.
The options for this section are explained in Context Template.
Tokenizer
A tokenizer is a tool that breaks down a piece of text into smaller units called tokens.
These tokens can be individual words or even parts of words, such as prefixes, suffixes, or
punctuation. A rule of thumb is that one token generally corresponds to 3~4 characters of
text.
The options for this section are explained in Tokenizer.
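If you want to see tokenization in action, you can count tokens yourself with a Python tokenizer library. The sketch below uses tiktoken (OpenAI's tokenizer); your local model's tokenizer (Llama, Mistral, etc.) will split text differently, so the counts are illustrative rather than exact for your backend.

import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")
text = "SillyTavern builds a prompt out of many smaller pieces of text."
tokens = enc.encode(text)
print(len(text), "characters ->", len(tokens), "tokens")
print([enc.decode([t]) for t in tokens])  # shows how words get split into tokens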
Custom Stopping Strings
Accepts a JSON-serialized array of stopping strings. Example: ["\n", "\nUser:",
"\nChar:"] . If you're unsure about the formatting, use an online JSON validator. If the
model output ends with any of the stop strings, they will be removed from the output.
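If you're unsure whether your list is valid JSON, you can build or check it programmatically; this tiny Python sketch just leans on the standard json module.

import json

# Build a correctly escaped JSON array for the Custom Stopping Strings field.
stops = ["\n", "\nUser:", "\nChar:"]
print(json.dumps(stops))

# Or validate a hand-written value before pasting it in
# (raises an error if the JSON is malformed).
json.loads('["\\n", "\\nUser:", "\\nChar:"]')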
Supported APIs:
1. KoboldAI Classic (versions 1.2.2 and higher) or KoboldCpp
2. AI Horde
3. Text Completion APIs: Text Generation WebUI (ooba), Tabby, Aphrodite, Mancer,
TogetherAI, Ollama, etc.
4. NovelAI
5. OpenAI (max 4 strings) and compatible APIs
6. OpenRouter (both Text and Chat Completion)
7. Claude
8. Google AI Studio
9. MistralAI
Start Reply With
Note
By default, the Start Reply With prefix won't be shown in the resulting message.
Enable "Show reply prefix in chat" to display it.
Previous Next
Prompts Context Template
Context Template
Applies to: Text Completion APIs
For equivalent settings in Chat Completion APIs, use Prompt Manager.
Usually, AI models require you to provide the character data to them in some specific way.
SillyTavern includes a list of pre-made conversion rules for different models, but you may
customize them however you like.
Edit these settings in the "Advanced Formatting" panel.
Story string
This field is a template for pre-chat character data (known internally as a story string).
This is the main way to format your character card for text completion and instruct
models.
The template supports Handlebars syntax and any custom text injections or formatting.
See the language reference here: https://handlebarsjs.com/guide/
We provide the following parameters to the Handlebars evaluator (wrap them into double-
curly braces):
1. description - character's Description
2. scenario - character's Scenario
3. personality - character's Personality
4. system - system prompt OR character's main prompt override (if exists and "Prefer
Char. Prompt" is enabled in User Settings)
5. persona - selected persona description
6. char - character's name
7. user - selected persona name
8. wiBefore or loreBefore - combined activated World Info entries with Position set to
"Before Char Defs"
9. wiAfter or loreAfter - combined activated World Info entries with Position set to
"After Char Defs"
10. mesExamples - (optional) character's Example Dialogues, instruct-formatted with
separator. Important: Set "Example Messages Behavior" in the User Settings panel to
"Never include examples" to avoid duplication.
A special {{trim}} macro is supported to remove any newlines that surround it. Use it in
case you want some part of the text NOT to be separated with a newline from the previous
line (spaces are not trimmed).
WARNING: If some of the above parameters are missing from the story string template,
they are not going to be sent in the prompt at all.
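For illustration, here is a simplified story string built from those parameters. It is close in spirit to the template SillyTavern ships with, but treat it as a sketch rather than the exact default:

{{#if system}}{{system}}
{{/if}}{{#if wiBefore}}{{wiBefore}}
{{/if}}{{#if description}}{{description}}
{{/if}}{{#if personality}}{{char}}'s personality: {{personality}}
{{/if}}{{#if scenario}}Scenario: {{scenario}}
{{/if}}{{#if wiAfter}}{{wiAfter}}
{{/if}}{{#if persona}}{{persona}}
{{/if}}{{trim}}

Each {{#if ...}} block only renders its line when the corresponding field is filled in, and the closing {{trim}} removes the leftover trailing newline.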
Example Separator
Used as a block header and a separator between the example dialogue blocks. Any
instance of <START> tags in the example dialogues will be replaced with the contents of
this field.
Chat Start
Inserted as a separator after the rendered story string and after the example dialogues
blocks, but before the first message in context.
Separators as Stop Strings
Adds "Example Separator" and "Chat Start" to the list of stop strings.
Helpful if the model tends to hallucinate or leak whole blocks of example dialogue
preceded by the separator.
Names as Stop Strings
Adds Character and User Persona names to the list of stop strings.
Recommended to keep it on to prevent model impersonation.
Allow Post-History Instructions
Includes the Post-History Instructions at the end of the prompt, formatted as the last user
message.
The Post-History Instructions prompt should be defined in the character card and "Prefer
Char. Instructions" setting should be enabled.
Should be used with care, as placing instructions low in the context can lead to degraded
quality of the outputs of smaller models.
Always add character's name to prompt
Appends the character's name to the prompt to force the model to complete the message
as the character:
** OTHER CONTEXT HERE **
Character:
Previous Next
Advanced Formatting Instruct Mode
Instruct Mode
Instruct Mode allows you to adjust the prompting for instruction-following models trained
on various prompt formats, such as Alpaca, ChatML, Llama2, etc.
API support
Text Completion API
Fully supported. This includes:
All of the sources under Text Completion
KoboldAI Classic
AI Horde
Choosing a formatting
A chosen instruct template must match the expectations of an actual model that is
running on a backend.
This is usually reflected in a model card on HuggingFace, and some even provide
SillyTavern-compatible JSON files.
Example: NeverSleep/Noromaid-13b-v0.1.1
Chat Completion API (OpenAI, Claude, etc)
This is not supported (and not needed) for Chat Completion APIs. They use an entirely
different prompt builder.
NovelAI
While technically supported for NovelAI, none of their models were trained to understand
instruct formatting. NovelAI models can use a special instruct module that is activated
automatically when an instruction wrapped in curly braces is encountered in chat
messages, so using Instruct Mode for the entire prompt will lead to degraded quality of
the outputs.
Here's an example that auto-activates the instruct module for NovelAI:
User: { Write a happy song about Nintendo Switch. }
Templates
Provides ready-made templates with sequences for some well-known instruct models.
Changing a template resets the unsaved settings to the last saved state! Don't forget to
save your template if you made any changes you don't want to lose.
Activation Regex
If defined as a valid regular expression, this template will be selected automatically when
the name of the connected model matches the regex.
Instruct Mode needs to be enabled beforehand. Only the first regex match across templates
will be selected (evaluated in alphabetical order).
Wrap Sequences with Newline
Each sequence text will be wrapped with newline characters when inserted into the
prompt. Required for Alpaca and its derivatives.
Disable if you want to have full control over line terminators.
Replace Macro in Sequences
If enabled, known {{macro}} substitutions will be replaced if defined in message wrapping
sequences.
Also, a special {{name}} macro can be used in message prefixes to reference the actual
name attached to a message (rather than a currently active {{char}} or {{user}}), which
can be helpful when using group chats or /sendas command. If the name can't be
determined, "System" is used as a fallback placeholder.
Include Names
If enabled, character and user names are prepended to chat history messages after the
prefix sequence.
The following options are available:
Never: Do not add name prefixes before the message contents.
Groups and Past Personas: Only add name prefixes to messages from group
characters and past personas.
Always: Always add name prefixes before the message contents.
Sequences: System Prompt Wrapping
Define how the System Prompt will be wrapped.
System Prompt Prefix
Inserted before a System prompt.
System Prompt Suffix
Inserted after a System prompt.
Important: this applies only to the System Prompt itself, not the entire Story String! If you
want to wrap the Story String, add these sequences to the Story String template in the
Context Template section.
Sequences: Chat Messages Wrapping
These settings define how messages belonging to different roles will be wrapped upon
building a prompt.
All prefix sequences will also be automatically used as stopping strings.
User Message Prefix
Inserted before a User message and as a last prompt line when impersonating.
User Message Suffix
Inserted after a User message.
Assistant Message Prefix
Inserted before an Assistant message and as a last prompt line when generating an AI
reply.
Assistant Message Suffix
Inserted after an Assistant message
System Message Prefix
Inserted before a System (added by slash commands or extensions) message.
System Message Suffix
Inserted after a System message.
System same as User
If checked, System messages will use the User role message sequences.
Otherwise, System messages use their own sequences (if not empty) or will not be
wrapped at all (if empty).
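As an illustration of how these sequences come together, here is roughly what a short exchange looks like once a ChatML-style template is applied, with prefixes before each message, suffixes after, and the Assistant prefix left open as the final line for the model to complete. The names "Alice" and "Seraphina" are just example persona/character names, and the exact whitespace depends on the template and the "Wrap Sequences with Newline" setting:

<|im_start|>system
You are a helpful roleplay partner.<|im_end|>
<|im_start|>user
Alice: Hello there!<|im_end|>
<|im_start|>assistant
Seraphina:

The name prefixes on the chat lines appear because "Include Names" is enabled; with it set to Never, only the message contents would follow each prefix.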
Misc. Sequences
Various advanced configurations for finer tuning of the prompt building
First Assistant Prefix
Inserted before the first Assistant's message.
Only the first message of the chat history counts, not the message that actually
goes into the prompt first!
Not used when generating text in the background (e.g. Stable Diffusion prompts or
Summaries). The System Instruction Prefix or regular Assistant Prefix will be used
instead.
Tokenizer
A tokenizer is a tool that breaks down a piece of text into smaller units called tokens.
These tokens can be individual words or even parts of words, such as prefixes, suffixes, or
punctuation. A rule of thumb is that one token generally corresponds to 3~4 characters of
text.
SillyTavern provides a "Best match" option that tries to match the tokenizer using the
following rules depending on the API provider used.
Text Completion APIs (overridable):
1. NovelAI Clio: NerdStash tokenizer.
2. NovelAI Kayra: NerdStash v2 tokenizer.
3. Text Completion: API tokenizer (if supported) or Llama tokenizer.
4. KoboldAI Classic / AI Horde: Llama tokenizer.
5. KoboldCpp: model API tokenizer.
If you get inaccurate results or wish to experiment, you can set an override tokenizer for
SillyTavern to use while forming a request to the AI backend:
1. None. Each token is estimated to be ~3.3 characters, rounded up to the nearest
integer. Try this if your prompts get cut off on high context lengths. This approach is
used by KoboldAI Lite.
2. Llama tokenizer. Used by Llama 1/2 models family: Vicuna, Hermes, Airoboros, etc.
Pick if you use a Llama 1/2 model.
3. Llama 3 tokenizer. Used by Llama 3/3.1 models. Pick if you use a Llama 3/3.1 model.
4. NerdStash tokenizer. Used by NovelAI's Clio model. Pick if you use the Clio model.
5. NerdStash v2 tokenizer. Used by NovelAI's Kayra model. Pick if you use the Kayra
model.
6. Mistral V1 tokenizer. Used by older Mistral models family and their finetunes. Pick if
you use an older Mistral model.
7. Mistral Nemo tokenizer. Used by Mistral Nemo models family and their finetunes. Pick
if you use a Mistral Nemo/Pixtral model.
8. Yi tokenizer. Used by Yi models. Pick if you use a Yi model.
9. Gemma tokenizer. Used by Gemini/Gemma models. Pick if you use a Gemma model.
10. DeepSeek tokenizer. Used by DeepSeek models (such as R1). Pick if you use a
DeepSeek model.
11. API tokenizer. Queries the generation API to get the token count directly from the
model. Known backends to support: Text Generation WebUI (ooba), koboldcpp,
TabbyAPI, Aphrodite API. Pick if you use a supported backend.
Chat Completion APIs (non-overridable):
1. OpenAI: model-dependent tokenizer via tiktoken.
2. Claude: model-dependent tokenizer via WebTokenizers.
3. OpenRouter: Llama, Mistral, Gemma, Yi tokenizers for their respective models.
4. Google AI Studio: Gemma tokenizer.
5. Scale API: GPT-4 tokenizer.
6. AI21 API: Jamba tokenizer (requires a one-time download).
7. Cohere API: Command-R or Command-A tokenizer (requires a one-time download).
8. MistralAI API: Mistral V1 or V3 tokenizer (requires a one-time download).
9. DeepSeek API: DeepSeek tokenizer (requires a one-time download).
10. Fallback tokenizer: GPT-3.5 turbo tokenizer.
Additional Tokenizers
These tokenizers are not included in the default installation due to their size. A one-time
download is required when they're used for the first time.
1. Qwen2 tokenizer.
2. Command-R / Command-A tokenizers. Used by Cohere source in Chat Completion.
3. Mistral V3 (Nemo) tokenizer. Used by MistralAI source in Chat Completion (Nemo and
Pixtral models).
4. DeepSeek (deepseek-chat) tokenizer. Used by DeepSeek source in Chat Completion.
If you don't want to use internet downloads, the opt-out option exists in config.yaml:
enableDownloadableTokenizers. Set it to false to disable downloads.
You can also download tokenizers manually from the SillyTavern-Tokenizers repository.
Download the JSON files and put them in the _cache subdirectory of your data root, the
path is ./data/_cache by default. Create the _cache directory if it doesn't exist. After
that, restart the SillyTavern server to re-initialize tokenizers.
If the required tokenizer model is not cached and downloads are disabled, a fallback
tokenizer (Llama 3) will be used for counting.
Token Padding
Applies to: Text Completion APIs
SillyTavern will always use the matching tokenizer for Chat Completion models,
so there is no need for token padding.
Unless SillyTavern uses a tokenizer provided by the remote backend API that runs the
model, all token counts assumed during prompt generation are estimated based on the
selected tokenizer type.
Since the results of tokenization can be inaccurate on context sizes close to the model-
defined maximum, some parts of the prompt may be trimmed or dropped, which may
negatively affect the coherence of character definitions.
To prevent this, SillyTavern allocates a portion of the context size as padding to avoid
adding more chat items than the model can accommodate. If you find that some part of
the prompt is trimmed even with the most-matching tokenizer selected, adjust the
padding so the description is not truncated.
You can input negative values for reverse padding, which allows allocating more than the
set maximum amount of tokens.
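As a rough illustration of the budgeting involved, here is a minimal sketch (not SillyTavern's actual code; the function names are made up) of how a character-based token estimate and the padding value shape the prompt budget. The ~3.3 characters-per-token figure is the estimate used by the "None" tokenizer option.
```typescript
// Estimate tokens the way the "None" tokenizer does: ~3.3 characters per token, rounded up.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 3.3);
}

// Padding reserves room for tokenizer inaccuracy; a negative value frees it up again.
function promptBudget(contextSize: number, responseLength: number, padding: number): number {
  return contextSize - responseLength - padding;
}

console.log(estimateTokens("Hello there, Aqua!")); // 6 (estimated) tokens
console.log(promptBudget(8192, 512, 64));          // 7616 tokens left for the prompt
```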
CFG
Page written by: kingbri
Contributors: kingbri, Guillaume "Vermeille" Sanchez, AliCat
What is it?
CFG, or classifier-free guidance, is a method that's used to help make parts of a prompt
less or more prominent.
Supported Backend APIs
Currently, the supported backends are oobabooga's textgen WebUI, NovelAI, and
TabbyAPI. NovelAI has its own documentation for CFG.
WARNING: CFG increases VRAM usage because more than one prompt is ingested! If your GPU
memory runs out while generating with CFG on, consider reducing your context
size, using a smaller model, or turning off CFG entirely.
Configuration
Accessing the CFG settings works the same way as accessing the Author's Note:
[Image: CFG hamburger menu]
And here's what the CFG panel looks like:
[Image: CFG chat panel]
There are four dropdowns in the CFG panel:
Chat CFG
Scopes the CFG scale and prompts to only this chat
Character CFG
Scopes the CFG scale and prompts to the specified character
Global CFG
Globally overrides the CFG scale and prompts (also overrides the model preset!)
CFG Advanced Settings (formerly called CFG Prompt Cascading)
A place to combine prompts from the previous 3 dropdowns and set insertion
depth.
NOTE: If the guidance scale is set to 1, nothing will be sent since that's when CFG is in an
"off" state.
Group Chats
In group chats, the CFG scale panel looks like this:
[Image: CFG panel in a group chat]
The main change is that character CFG is removed and a checkbox called Use Character
CFG Scales is present in the chat CFG dropdown. This allows for the current character's
guidance scale to be used instead of whatever the chat CFG scale is set to.
The main utility of this feature is to alter the scale based on each character's individual
needs.
In addition, checking the Character Negatives box in prompt cascading will append the
independent character negative prompts along with the chat ones (if enabled).
Concepts
Isn't this in Stable Diffusion?
Yes and no. CFG with LLMs works in a different way than what one might be used to in
Stable Diffusion. LLM-based CFG works on the principle of "prompt mixing". The CFG
formula takes a positive and negative prompt, then mixes the differences between them.
From there, a combined prompt is sent and a response is generated!
Here's an illustration to help visualize this concept. The red represents the negative
prompt, the blue represents the neutral prompt, and the purple represents the mixed result
that's interpreted. All the white space is the same across all 3 prompts, so those are not
used for CFG mixing.
[Image: CFG prompt mixing diagram]
If you want to know more about CFG and LLMs, Vermeille's original paper is located here.
I'd suggest giving it a read/listen:
Paper - [2306.17806] Stay on topic with Classifier-Free Guidance (arxiv.org)
Audio version - https://www.youtube.com/watch?v=MGY00YFcyco
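For the curious, here is a minimal sketch of the guidance mix described in the paper (illustrative only; the real work happens in the backend on per-token logits, and the variable names here are made up):
```typescript
// Two forward passes produce logits for the positive/neutral prompt and the negative
// prompt; the guided logits extrapolate away from the negative ones.
function cfgMix(positiveLogits: number[], negativeLogits: number[], guidanceScale: number): number[] {
  return positiveLogits.map((pos, i) =>
    negativeLogits[i] + guidanceScale * (pos - negativeLogits[i]));
}

// scale = 1 returns the positive logits unchanged (CFG effectively off);
// scale > 1 pushes further away from the negative prompt; scale < 1 pulls toward it.
console.log(cfgMix([2.0, 0.5], [1.0, 1.0], 1.5)); // [2.5, 0.25]
```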
Do I need CFG prompts?
No! CFG prompts are completely optional. Just adjusting the guidance scale above 1 will
also help produce an effect on responses, which can accentuate chats and character
interaction.
What makes a good CFG prompt?
So, we established that CFG prompting is not the same as Stable Diffusion's negative tags
and embeddings. How do we make a prompt?
Warning: This assumes that you have created a character using PLists and Ali:Chat. If you
have not, feel free to experiment with various prompting techniques.
Let's say I have a character named "John". John is supposed to feel happy and excited all
the time from his example dialogues. However, when chatting with John, he's sometimes
sad and depressed.
To remove this, CFG comes to the rescue! Just make the negative prompt [John's
feelings: sad, depressed] to help remove the sadness portions. You can optionally make
the positive prompt [John's feelings: happy, joyful] to further bring out John's happy
parts.
Positive Prompts
I went over this in the previous section, but I'd like to touch on this a bit more. Positive
prompts are used to further accentuate parts of a character. Let's use John again as our
example. By making him happier with a positive prompt of [John's feelings: happy,
joyful] , John should start outputting dialogue with a more happy feeling than if the
positive prompt was not included.
But...
These are just loose guidelines from experience with one specific character format. There
are many other ways to create prompts that you should experiment with. Feel free to
share your thoughts with other users!
Guidance Scale
Here's a rule of thumb. A guidance scale of 1 means that CFG is disabled. In fact,
SillyTavern won't send anything to your backend if the guidance scale is 1. A guidance
scale >1 will give the results shown in the other sections at varying degrees.
However, a guidance scale of <1 will give the opposite effect since the negative prompt
is used as the primary prompt here.
Let's use the example with John again. The negative prompt is [John's feelings: sad,
depressed] and the positive prompt is [John's feelings: happy, joyful] with a
guidance scale of 0.8 .
This will in turn accentuate the negative prompt more and you'll see John start to act
sadder than normal rather than happier.
tl;dr: Use a guidance scale of 1.5 and adjust up or down from there based on your
outputs.
Prompt Cascading
Negatives and positives can be cascaded between CFG types (the types being per-chat,
per-character, and global overrides). See the Configuration header for more information.
Insertion Depth
Follow the basic rule: The lower something is located in the prompt, the more influential it
is to the response. For chatting, I recommend using the default depth of 1 since it's very
flexible with other components of SillyTavern.
If you want to experiment, an insertion depth of 0 is also available. However, this can
dramatically alter how your response will look, and it's NOT recommended to use prompt
cascading here!
Prompt Manager
The Prompt Manager is a system that allows for more control over the prompt-building
strategy for Chat Completion APIs.
Access Prompt Manager by clicking on the "AI Response Configuration" button in the
navigation bar. Prompt manager is below the common settings panel.
Reasoning
In language models, reasoning (also known as model thinking) refers to a chain-of-
thought (CoT) technique that mirrors human problem-solving through step-by-step
analysis. SillyTavern provides several features that make the use of reasoning models
more efficient and consistent across supported backends.
Common issues
1. When using reasoning models, the model's internal reasoning process consumes part
of your response token allowance, even if this reasoning isn't shown in the final output
(e.g. o3-mini or Gemini Thinking). If you notice your responses are coming back
incomplete or empty, you should try adjusting the Max Response Length setting found
in the AI Response Configuration panel. For reasoning models, it's typical to use
significantly higher token limits - anywhere from 1024 to 4096 tokens - compared to
standard conversational models.
Configuration
Most reasoning-related settings can be configured in the "Reasoning" section of
Advanced Formatting panel.
Reasoning blocks appear in the chat as collapsible message sections. They can be added
manually, automatically by the backend, or through response parsing (see below).
By default, reasoning blocks are collapsed to save space. Click a block to expand and
view its contents. You can set blocks to expand automatically by enabling Auto-Expand in
the reasoning settings.
When a reasoning block is expanded, you can copy or edit its contents using the Copy
and Edit buttons.
Some models support reasoning but will not send their thoughts back. It is still
possible to show the reasoning block with the reasoning time for those models by toggling
the Show Hidden setting.
Adding Reasoning
Manually
Add a reasoning block to any message through the Message Edit menu. Click while
editing to add a reasoning section. Third-party extensions can also add reasoning by
writing to the extra.reasoning field of the message object before adding it to the chat.
With a Command
Use the /reasoning-set STscript command to add reasoning to a message. The
command takes at (message ID, defaults to the last message) and reasoning text as
arguments.
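For example, the following command adds reasoning text to message 5 (the argument names follow the description above; the values are illustrative):
/reasoning-set at=5 First, recall what the character already knows about the visitor.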
By Backend
If your chosen LLM backend and model support reasoning output, enable "Request Model
Reasoning" in the AI Response Configuration panel.
Supported sources:
DeepSeek
OpenRouter
By Parsing
Enable "Auto-Parse" in the Advanced Formatting panel to automatically parse
reasoning from the model's output.
The response must contain a reasoning section wrapped in configured Prefix and Suffix
sequences. The sequences provided by default correspond to the DeepSeek R1 reasoning
format.
Example with prefix <think> and suffix </think> :
<think>
This is the reasoning.
</think>
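A minimal sketch of what such parsing amounts to (illustrative only; the function name is made up and SillyTavern's parser also handles streaming and edge cases):
```typescript
// Split a raw model response into the reasoning block and the visible reply,
// based on configurable prefix/suffix sequences (DeepSeek R1 style by default).
function parseReasoning(raw: string, prefix = "<think>", suffix = "</think>") {
  const start = raw.indexOf(prefix);
  const end = raw.indexOf(suffix, start + prefix.length);
  if (start === -1 || end === -1) return { reasoning: "", content: raw };
  return {
    reasoning: raw.slice(start + prefix.length, end).trim(),
    content: (raw.slice(0, start) + raw.slice(end + suffix.length)).trim(),
  };
}

console.log(parseReasoning("<think>\nThis is the reasoning.\n</think>\nHello!"));
// { reasoning: "This is the reasoning.", content: "Hello!" }
```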
Most model providers do not recommend sending CoT back to the model in
multi-turn conversations.
Regex Scripts
Regular expression scripts from the Regex extension can be applied to the contents of
reasoning blocks. Check "Reasoning" in the "Affects" section of the script editor to target
reasoning blocks specifically.
Different ephemerality options affect reasoning blocks in the following ways:
1. No ephemerality: reasoning content is permanently changed.
2. Run on edit: regex script will be re-evaluated when the reasoning block is edited.
3. Alter chat display: regex is applied to the reasoning block's display text, not the
underlying content.
4. Alter outgoing prompts: regex is only applied to reasoning blocks before they are sent
to the model.
Reasoning Effort
Reasoning Effort is a Chat Completion setting in the AI Response Configuration panel
that influences how many tokens may potentially be used on reasoning. The effect of
each option depends on the source connected to. Currently, Auto simply means the
relevant parameter is not included in the request.
The sources compared in the original table are: Claude (thinking token budget, ≤ 21333 if
not streaming), Google AI Studio (thinking token budget, ≤ 24576), OpenAI (keyword),
OpenRouter (keyword), and xAI Grok (keyword). With Auto, the parameter is not specified
for any source, meaning no thinking for Claude; for OpenRouter, the effect depends on the
model.
World Info
World Info (also known as Lorebooks or Memory Books) is a powerful tool available in
ST to insert prompts dynamically into your chat to help guide the AI replies.
Commonly, World Info (WI for short) is used to enhance the AI's understanding of the
details in your fictional world, however you could use a World Info entry to insert
ANYTHING that you would like to insert into the prompt.
It functions like a dynamic dictionary that only inserts relevant information from World Info
entries when keywords associated with the entries are present in the message text.
The SillyTavern engine activates and seamlessly integrates the appropriate lore into the
prompt, providing background information to the AI.
It is important to note that while World Info helps guide the AI toward the desired content,
it does not guarantee its appearance in the generated output messages. That depends on
how good your model is at making use of additional information!
Pro Tips
The World Info engine is a very powerful prompt management tool. Don't fixate on
adding character lore alone, feel free to experiment.
Activation keywords, titles, and other information that is not in the Content field is not
inserted into context, so each World Info entry should have a comprehensive,
standalone description.
To create rich and detailed world lore, entries can be interlinked and reference one
another by using recursive activation. See more on Recursion below.
SillyTavern offers flexible context budgeting for inserted background information. To
conserve prompt tokens, it is advisable to keep entry contents concise.
Further reading
World Info Encyclopedia: Exhaustive in-depth guide to World Info and Lorebooks. By
kingbri, Alicat, Trappu.
Character Lore
Optionally, one World Info file can be assigned to a character to serve as a dedicated
lore source across all chats with that character (including groups).
To do that, navigate to the Character Management panel and click the globe button, then
pick a World Info file from the dropdown list and click "Ok".
To unbind or change character lore, Shift-click the globe button. If on mobile, click
"More..." and then "Link World Info".
Character Lore Insertion Strategy
When generating an AI reply, entries from the character World Info will be combined with
the entries from a global World Info selector using one of the following strategies:
Sorted Evenly (default)
All entries will be sorted according to their Insertion Order as if they were part of one
big file, ignoring the source.
Character Lore First
Entries from the Character World Info would be included first by their Insertion Order, then
entries from the Global World Info.
Global Lore First
Entries from the Global World Info would be included first by their Insertion Order, then
entries from the Character World Info.
World Info Entry
Key
A list of keywords that trigger the activation of a World Info entry. Keys are not case-
sensitive by default (this is configurable).
Regular Expression (Regex) as Keys
Keys allow a more flexible approach to matching by supporting regex. This makes it
possible to match more dynamic content with optional words or characters, spacing, and
all the other utilities that regex provides.
If a defined key is a valid regex (Javascript regex style, with / as delimiters. All flags are
allowed), it will be treated as such when checking whether an entry should be triggered.
Multiple regexes can be entered as separate keys and will work alongside each other.
Inside a regex, commas are possible. Plaintext keys do not support commas, as they are
treated as key separators.
An example of a use-case for advanced regex matching:
An entry/instruction that should be inserted when {{char}} is doing a weather-related action:
/(?:{{char}}|he|she) (?:is talking about|is noticing|is checking whether|observes) (?:the )?(rainy weather|heavy wind|it is going to rain|cloudy sky)/i
For more information on Regex syntax and possibilities: Regular expressions - JavaScript |
MDN
Advanced Regex Per-Message Matching
ST prefixes every chat message in the WI scan buffer with character name: and, since
v1.12.6, joins the messages together using the character value 1 ( \x01 ).
This means you can match specific input or output from a certain character using a regex
tied to that separation character.
For example, to match only the user saying "hello", you could use the following regex:
/\x01{{user}}:[^\x01]*?hello/
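A minimal sketch of how such a buffer can be approximated for testing the regex locally (the exact buffer layout is an assumption and may differ slightly between versions; {{user}} is normally substituted by SillyTavern and is written out here):
```typescript
// Approximate the WI scan buffer: name-prefixed messages joined with \x01.
const messages = [
  { name: "Alice", text: "Well, hello there!" },
  { name: "Bob", text: "How is the weather today?" },
];
const scanBuffer = "\x01" + messages.map(m => `${m.name}: ${m.text}`).join("\x01");

// Matches "hello" only inside Alice's message, thanks to the \x01 separator.
const aliceSaysHello = /\x01Alice:[^\x01]*?hello/i;
console.log(aliceSaysHello.test(scanBuffer)); // true - only Alice's message is checked
```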
Key Input
There are two modes to enter keywords, each with a slightly different UI. In ⌨️ plaintext
mode (default), keys can be entered as a comma-separated list in a single text field.
Regexes can be included too, but they don't have any special highlighting. In ✨ fancy
mode, the keys appear as separate elements and regexes will be highlighted as such. The
control supports editing and deleting keys. The mode can be switched via the inline button
inside the input control.
Optional Filter
A list of supplementary keywords that are used in conjunction with the main keywords.
See Optional Filter. These keys also support regex.
Entry Content
The text that is inserted into the prompt upon entry activation.
Insertion Order
Numeric value. Defines a priority of the entry if multiple were activated at once. Entries
with higher order numbers will be inserted closer to the end of the context as they will
have more impact on the output.
Insertion Position
Before Char Defs: World Info entry is inserted before the character's description and
scenario. Has a moderate impact on the conversation.
After Char Defs: World Info entry is inserted after the character's description and
scenario. Has a greater impact on the conversation.
Before Example Messages: The World Info entry is parsed as an example dialogue
block and inserted before the examples provided by the character card.
After Example Messages: The World Info entry is parsed as an example dialogue
block and inserted after the examples provided by the character card.
Top of AN: World Info entry is inserted at the top of Author's Note content. Has a
variable impact depending on the Author's Note position.
Bottom of AN: World Info entry is inserted at the bottom of Author's Note content. Has
a variable impact depending on the Author's Note position.
@ D: World Info entry is inserted at a specific depth in the chat (Depth 0 being the
bottom of the prompt).
⚙️ - as a system role message
👤 - as a user role message
🤖 - as an assistant role message
Note
Since the retrieval quality depends entirely on the outputs of the embedding
model, it's impossible to predict exactly what entries will be inserted. If you want
deterministic and predictable results, stick to keyword matching.
Timed Effects
Usually, World Info evaluation is stateless, meaning that the result of the evaluation is the
same, only depending on the current chat context. However, with the introduction of
Timed Effects, you can create entries that have an activation delay, stay active after
being triggered, or can't be triggered after the activation.
Timed Effects Rules
1. The time frames for the effects are measured in messages (not pairs of
messages/exchanges), with 0 meaning there is no effect.
2. Effects only apply in the chat where the entry was activated. Branches inherit the
state of the parent chat.
3. Active timed effects are removed if the chat doesn't advance, e.g. if the last message
was swiped or deleted.
4. Making any changes to the entry that is currently on timed effect will cause the effect
to be forcibly removed.
5. Subsequent triggering of keywords does not refresh the effect duration if it's already
active.
Types of Timed Effects
1. Sticky - the entry stays active for N messages after being activated. Stickied entries
ignore probability checks on subsequent scans until they expire.
2. Cooldown - the entry can't be activated for N messages after being activated. Can be
used together with sticky: the entry goes on cooldown when the sticky duration ends.
3. Delay - the entry can't be activated unless there are at least N messages in the chat
at the moment of evaluation.
Delay = 0 -> The entry can be activated at any time.
Delay = 1 -> The entry can't be activated if the chat is empty (no greeting).
Delay = 2 -> The entry can't be activated if there is zero or only one message in
the chat, etc.
Timed Effects Example
Entry configuration: sticky = 3, cooldown = 2, delay = 2.
Message 0: delay
Message 1: entry activated
Message 2: sticky
Message 3: sticky
Message 4: sticky
Message 5: cooldown
Message 6: cooldown
Message 7: entry can be activated again
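The same schedule, expressed as a small sketch (illustrative counting rules only, not SillyTavern's actual implementation):
```typescript
// Reproduces the example above: sticky = 3, cooldown = 2, delay = 2, activated at message 1.
function timedState(messageIndex: number, activatedAt: number | null,
                    sticky: number, cooldown: number, delay: number): string {
  if (activatedAt !== null && messageIndex > activatedAt) {
    const elapsed = messageIndex - activatedAt;
    if (elapsed <= sticky) return "sticky";              // stays active for N messages
    if (elapsed <= sticky + cooldown) return "cooldown"; // then blocked for N messages
  }
  const messagesInChat = messageIndex + 1;
  if (messagesInChat < delay) return "delay";            // not enough messages in chat yet
  return "can be activated";
}

for (let m = 0; m <= 7; m++) {
  const label = m === 1 ? "entry activated" : timedState(m, 1, 3, 2, 2);
  console.log(`Message ${m}: ${label}`);
}
// Message 0: delay, 1: entry activated, 2-4: sticky, 5-6: cooldown, 7: can be activated
```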
Activation Settings
Collapsible menu at the top of the World Info screen.
Scan Depth
Can be overridden on an entry level.
Defines how many messages in the chat history should be scanned for World Info keys.
If set to 0, then only recursed entries and Author's Note are evaluated.
If set to 1, then SillyTavern only scans the last message.
2 = two last messages, etc.
Include Names
Defines if the names of the chat participants should be included in the scanned text buffer
as message prefixes. This allows activating entries that use names as keywords without
directly mentioning the names in messages.
See an example of the text to be scanned below, assuming the chat participants are
named Alice and Bob.
Enabled (default):
Alice: Hello! Good to see you.
Bob: How is the weather today?
Disabled:
Hello! Good to see you.
How is the weather today?
Context % / Budget
Defines how many tokens could be used by World Info entries at once. You can define a
threshold relative to your API's max-context setting (Context %) or an absolute token
threshold (Budget).
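For example, with an 8192-token maximum context, a Context % of 25 allows up to roughly 2048 tokens of World Info; a Budget of 2048 would have the same effect.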
If the budget is exhausted, then no more entries are activated even if the keys are present
in the prompt.
Constant entries will be inserted first. Then entries with higher order numbers.
Entries inserted by directly mentioning their keys have higher priority than those that were
mentioned in other entries' contents.
Min Activations
This setting is mutually exclusive with Max Recursion Steps.
Minimum Activations: If set to a non-zero value, this will disregard the limitation of "scan-
depth", seeking all of the chat log backward from the latest message for keywords until as
many entries as specified in min activations have been triggered. This will still be limited
by the Max Depth setting or your overall Budget cap.
Additional scan sweeps triggered by Min Activations will not check entries added by
recursion on previous steps. Only chat messages and extension prompts can trigger these
additional activations. However, the entries activated by Min Activations can trigger other
entries as usual.
Max Depth
Maximum Depth to scan for when using the Min Activations setting.
Recursive scanning
Recursive scanning allows for entries to activate other entries or be activated by others,
enabling complex interactions and dependencies between different World Info entries.
This feature can significantly enhance the dynamic nature of your creative scenarios.
Whether recursive scanning is enabled can be controlled with the global setting Recursive
Scan.
There are three options available to control recursion for each entry:
Non-recursable: When this checkbox is selected, the entry will not be activated by
other entries. This is useful for static information that should not change or be
influenced by other world info entries.
Prevent further recursion: Selecting this option ensures that once this entry is
activated, it will not trigger any other entries. This is helpful to avoid unintended
chains of activations.
Delay until recursion: This entry will only be activated during recursive checks,
meaning it won't be triggered in the initial pass but can be activated by other entries
that have recursion enabled. Now, with the added Recursion Level for those delays,
entries are grouped by levels. Initially, only the first level (smallest number) will match.
Once no matches are found, the next level becomes eligible for matching, repeating
the process until all levels are checked. This allows for more control over how and
when deeper layers of information are revealed during recursion, especially in
combination with criteria such as NOT ANY or NOT ALL key matches.
Entries can activate other entries by mentioning their keywords in the content text.
For example, if your World Info contains two entries:
Entry #1
Keyword: Bessie
Content: Bessie is a cow and is friends with Rufus.
Entry #2
Keyword: Rufus
Content: Rufus is a dog.
Both of them will be pulled into the context if the message text mentions just Bessie.
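A minimal sketch of this recursive activation loop (illustrative only; it ignores budgets, insertion order, whole-word matching, and the recursion options above):
```typescript
type Entry = { keys: string[]; content: string };

const entries: Entry[] = [
  { keys: ["Bessie"], content: "Bessie is a cow and is friends with Rufus." },
  { keys: ["Rufus"], content: "Rufus is a dog." },
];

function activate(message: string): Entry[] {
  const activated = new Set<Entry>();
  let scanText = message.toLowerCase();
  let changed = true;
  while (changed) {                      // each pass is one recursion step
    changed = false;
    for (const entry of entries) {
      if (activated.has(entry)) continue;
      if (entry.keys.some(k => scanText.includes(k.toLowerCase()))) {
        activated.add(entry);
        scanText += " " + entry.content.toLowerCase(); // content can trigger other entries
        changed = true;
      }
    }
  }
  return [...activated];
}

console.log(activate("I saw Bessie today").map(e => e.keys[0])); // ["Bessie", "Rufus"]
```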
Max Recursion Steps
This setting is mutually exclusive with Min Activations.
When set to zero, recursion nesting is only limited by your prompt budget. When set to a
non-zero value, limits the total number of scan sweeps to desired maximum "nesting
level".
Example values:
1 effectively disables recursion as the check stops after the first step.
2 can only activate recursive entries once.
3 can trigger recursion twice...
Case-sensitive keys
Can be overridden on an entry level.
To get pulled into the context, entry keys need to match the case as they are defined in
the World Info entry.
This is useful when your keys are common words or parts of common words.
For example, when this setting is active, keys 'rose' and 'Rose' will be treated differently,
depending on the inputs.
Match whole words
Can be overridden on an entry level.
Entries with keys containing only one word will be matched only if the entire word is
present in the search text. Enabled by default.
For example, if the setting is enabled and the entry key is "king", then text such as "long
live the king" would be matched, but "it's not to my liking" wouldn't.
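For illustration, whole-word matching for the key "king" behaves roughly like a word-boundary regex:
```typescript
// Illustration only; not SillyTavern's exact matching code.
const key = "king";
const wholeWord = new RegExp(`\\b${key}\\b`, "i");
console.log(wholeWord.test("long live the king"));    // true
console.log(wholeWord.test("it's not to my liking")); // false - "king" is only a substring
```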
Important: this setting can have a detrimental effect when used with languages that don't
use whitespace to separate words (e.g. Japanese or Chinese). If you write entries in these
languages, it is advised to keep it off.
Alert on overflow
Shows an alert if the activated World Info exceeds the allocated token budget.
User Settings
UI Customization
Change the theme, look and feel of the chat interface to suit your preferences.
General Settings
These are the core settings that affect your overall SillyTavern experience.
UI Language
SillyTavern's user interface is available in multiple languages. The language selector
provides these options:
Default: Uses your system language if available
English: Forces English UI regardless of system settings
Other languages available through the dropdown
Note: This setting only affects the user interface text. For AI conversation translation,
please use the Chat Translation extension.
Software Version
Your current version of SillyTavern is displayed in the top-right corner. This information is
essential for:
Troubleshooting problems
Ensuring compatibility with extensions
Determining if updates are available
To update SillyTavern to the latest version, please refer to the Updating documentation.
Account Management
Control your SillyTavern user account, back up your settings and user data, and manage
user roles and permissions in multi-user mode.
Account
In the Account dialog, you can view and edit your profile information, change your
password, and manage account settings.
Profile Information
Display name (editable via pencil icon)
User avatar (can also be changed using Personas)
Account handle
User role
Account creation date
Password status (locked/unlocked icon indicates protection)
Account Actions
Settings Snapshots: Create, manage, and restore backups of your user settings
Download Backup: Export a complete backup of all your user data
Change Password: Update your account security credentials
Danger Zone
Critical account operations that should be used with caution:
Reset Settings: Restore all settings to factory defaults
Reset Everything: Complete account wipe and factory reset
Admin Panel
Applies to: multi-user mode
Multi-account features require enableUserAccounts to be set to true in
config.yaml.
Management Actions
Download user data backup
Change user password
Delete account
New User
Select New User to create a new user account.
Display Name* (e.g., "John Snow")
User Handle* (lowercase letters, numbers, and dashes only)
Password (optional)
Password Confirmation
Creating a new user automatically generates a subfolder in the /data/ directory using the
user's handle as the folder name.
Logout
Applies to: multi-user mode
UI Customization
UI Theme
Theme Management
Theme files allow you to save, share, and reuse your UI customizations. You can maintain
multiple themes for different moods or purposes, and switch between them instantly.
Import/Export theme files
Delete existing themes
Save changes to current theme
Save as new theme
All the settings in this section are saved to the current theme. If you switch themes, the
settings will be replaced by the settings of the new theme.
Display Settings
These display options affect how characters and messages are presented in the chat
interface.
Avatar Style
Choose between Circle, Square, or Rectangle.
Chat Style
Flat ( /flat or /default ): Clean and continuous "chat log" style, a flat canvas for
your AI interactions to come to life.
Bubbles ( /bubble ): "Instant messenger" style with distinct bubbles for each message,
delightful rounded corners, and a …
Theme Colors
Customize the color scheme of every UI element to create your perfect theme. Colors can
be selected using a color picker, and include transparency options where applicable.
Main Text
Italics Text
Underlined Text
Quote Text
Text Shadow
Chat Background
UI Background
UI Border
User Message
AI Message
Layout & Visual Settings
Fine-tune the visual presentation of the interface with these sliders.
Chat Width: Adjust chat window width (25-100% of screen)
Font Scale: Customize text size (0.5-1.5x)
Blur Strength: Control UI panel blur (0-30)
Shadow Width: Adjust text shadow intensity (0-5)
Theme Toggles
These switches control various UI features and behaviors. Some options can improve
performance on lower-end devices, while others add useful information or functionality to
the chat interface.
Reduced Motion: Disable animations and transitions
No Blur Effect: Remove background blur for better performance
No Text Shadows: Disable text shadow effects
Visual Novel mode: Compact chat with background sprite
Expand Message Actions: Always show full message context menu
Zen Sliders: Simplified parameter controls
Mad Lab Mode: Unrestricted parameter ranges
Message Timer: Show AI response generation time
Chat Timestamps: Display message timestamps
Model Icons: Show AI model icons for messages
Message IDs: Display sequential message numbers
Hide Chat Avatars: Remove avatars from chat
Message Token Count: Show token counts per message
Compact Input Area: Single-row input (Mobile only)
Swipe # for All Messages: Show swipe numbers on all messages (Mobile)
Characters Hotswap: Quick-select buttons for favorite characters
Avatar Hover Magnification: Zoom effect on avatar hover
Tags as Folders: Organize characters using tags as folders
Custom CSS
Allows you to apply custom CSS styles to further customize the appearance of the chat
interface.
Use Expand to expand the editor window for better visibility and editing.
If you switch themes, your custom CSS will be replaced by the custom CSS of the new
theme. Ensure you save your custom CSS to a theme if you want to keep it when switching
themes.
If you use a lot of custom CSS, or want to use the same custom CSS with several themes,
the unofficial CSS Snippets extension can help you manage and organize your custom
CSS.
Message Sound
To play your own custom sound on receiving a new message from bot, replace the
following MP3 file in your SillyTavern folder:
public/sounds/message.mp3
```asciimath
int_{-oo}^{oo} e^{-x^2} dx = sqrt{pi}
```
Deprecation notice
The legacy $ and $$ wrapper syntax is no longer supported. Please use the
following regex scripts to polyfill the old syntax:
$$ - LaTeX
$ - AsciiMath
Visual Novel (VN) Mode
Disabling Visual Novel Mode
Disabling Visual Novel Mode follows the same steps as enabling it. Untoggle Visual Novel
Mode and you should be back to the normal chat screen.
Regarding VN Mode with VN Extensions
Some extensions (like the Prome VN Extension) will toggle 'Visual Novel Mode'
on if you use their own respective VN modes. Enabling/disabling VN Mode from
the User Settings menu will also affect these extensions.
VN Display
In Visual Novel Mode, the UI is altered slightly in order to accommodate character sprites
(or the character card image), which are shown in the center. In a group chat with multiple
characters, however, the character sprites will spread out to accommodate
each other, as shown below.
Group VN Display
VN Mode with MovingUI
To toggle MovingUI, go to User Settings and check on MovingUI. Do note that
this feature only works on Desktops.
If MovingUI is enabled in User Settings, the sprites (or character card image) can be
moved around if you wish to move them around or place them in a more specific area on
the screen.
VN Extensions
Prome Visual Novel Extension
The Prome Visual Novel Extension is an endorsed third-party extension from Bronya Rand
and Prometheus that enhances the visual novel experience in SillyTavern even further. It
adds features such as Letterbox Mode (which makes the visual novel UI more "cinematic"),
Focus Mode with Darken Character Sprites, Traditional VN Mode (where only the last
message appears in the chat), and more planned to come!
To install the Prome Visual Novel Extension, either go to Download Extensions & Assets
and find Prome Visual Novel Extension, or follow the installation instructions on the
Prome Visual Novel Extension Github page. Prome's settings can be found either in
Extensions -> Prome (Visual Novel Extension) or via the 🪄 (Wand) menu.
Personas
What is a Persona?
A persona in SillyTavern is the identity you use to participate in chats — essentially a
combination of your display name, avatar, and optional descriptive text. Personas allow
you to easily switch roles or "characters" you speak as, without having to manually
update your username/avatar each time.
Note: Legacy user avatars/names that weren't tied to a persona have been removed.
Existing data will be migrated to personas. If no name was specified, the persona will
be named "[Unnamed Persona]".
Note
Since {{user}} and {{char}} macros have opposite meanings when used in
Persona and Character descriptions, you'll be prompted to swap them if the
converted description contains either of them.
Persona Description
Each persona can store a custom text description — mental and physical traits, age,
occupation, or any personal details. These can also include template macros such as
{{char}} or {{user}} (see Macros).
Where your persona description is injected into the AI prompt depends on the Position
setting in the Persona Management panel:
None (disabled)
In Story String / Prompt Manager (the default)
Top of Author's Note / Bottom of Author's Note (Will only be added when an Author's
Note exists)
In Chat @ Depth (This will open up configuration options to set depth and the role)
The position is saved per persona.
Persona Connections / Locking
Persona connections ensure that a given persona is automatically selected in certain
situations. If no persona is connected, the currently chosen persona will stay selected.
There are three types of locking:
1. Chat lock – The persona is locked to the current chat.
2. Character lock – The persona is locked to a specific character.
3. Default persona – One persona that is used whenever no other locks apply.
1. Lock to a Chat
If a persona is locked to a chat, opening that chat in the future will automatically switch
your active persona to the locked one.
To lock: Select the desired persona, then click the Chat button under the
"Connections" section (or use /persona-lock type=chat on ).
To unlock: Click the button again (or use /persona-lock type=chat off ).
2. Lock to a Character
You can also link a persona to a specific character. Opening any chat with that character
automatically selects your locked persona.
To lock: Select the desired persona, then click the Character button under the
"Connections" section (or use /persona-lock type=character on ).
To unlock: Click the button again (or use /persona-lock type=character off ).
The Persona Management panel also shows which characters are linked to that persona
(displayed as small avatars). Clicking them navigates directly to that character's chat.
Locking multiple personas to the same character
If another persona was already linked with that character, it will be automatically unlinked
by default.
To have multiple personas linked at once, the global setting Allow multiple persona
connections per character can be used.
If multiple personas are linked to the same character, you'll see a popup asking which
persona to use each time you open or start a new chat with that character (unless a
persona is bound to the chat).
3. Default Persona
Your default persona is used whenever there's no other relevant lock. The default
persona is recognizable by a yellow border around its avatar.
To set/unset default: Select the desired persona, then click the Default button
under the "Connections" section (or use /persona-lock type=default ).
Only one persona can be chosen as the default persona.
Temporary Persona
If any of the three connection options connects a persona to the current character/chat,
you can still choose to use a different persona. This persona will be marked in the persona
panel as "Temporary Persona". Any reload of the browser window or switch to a different
chat and back will reset it to the linked persona again.
You can manually convert a Temporary Persona to be persistently connected by linking it
to the chat.
Global Persona Settings
All settings under the Current Persona are saved per-persona. A few global settings exist
too; those can be found under Global Persona Settings in the Persona Management
panel.
1. Show notifications on switching personas
Enables persona-related toast messages (e.g., "Persona Auto Selected",
"Temporary Persona").
2. Allow multiple persona connections per character
When enabled, you can link multiple personas to a single character. Opening that
character's chat will prompt you which persona to use. If disabled, only one
persona can be connected to a character at a time.
3. Auto-lock a chosen persona to the chat
When enabled, any time you select a persona (manually or by auto-selection) or
create a new chat, it locks that persona to the chat.
This combined with "Allow multiple" provides the option to have a persona
selection per character, but keep it bound once chosen for a chat.
Slash Commands for Personas
/persona-lock type=<type?>
chat locks the current persona to your active chat.
character locks the current persona to the character in use.
none (or no argument) unlocks/clears the persona lock for the current context.
If used without arguments, it returns the current lock state (or an error if none is set).
The lock state can be chosen via on , off or toggle . Default is toggle.
/persona <name>
Quickly switch your active persona by name without opening the Persona
Management panel.
Example: /persona Blaze .
Using mode=temp allows you to temporarily set the name of the current persona, even
though a persona with the same name might already exist (preserving your current
avatar and description).
/persona-sync
Re-attributes all user messages in the active chat to the current persona and its
name.
Note: The older /lock and /unlock commands remain for backward compatibility
but may be removed in the future. Use /persona-lock instead.
Pro Tips
1. Switching personas mid-chat doesn't re-attribute your past user messages to the
new persona; those remain attributed to whichever persona you were using at the
time.
2. Batch re-attribution: If you ever need all prior messages to match a new persona, hit
the sync button or use /persona-sync .
3. Replace persona images without losing the description or locks by choosing your
persona and clicking the Change Persona Image button.
4. Character link popups: If multiple personas are linked to the same character, you'll
get a popup to pick which persona each time you open the chat. This is a handy way
to have a small selection of personas to choose from for specific characters.
5. Backups: You can back up your entire Persona list (names, character connections,
descriptions) with the Backup button in Persona Management, and restore it later if
needed.
Remarks:
Images and chat connections are not saved together with personas and will
not be backed up via this.
These backups are not designed to be shared, as they contain internal links.
Characters
Characters are the AI identities that you can create and manage to shape the AI's role in
the conversation. Each character has a name, personality, and conversation history. You
can create as many characters as you like, and switch between them at any time.
Characters can be used in solo chats, or multiple characters can be added to a group
chat to let them interact with each other.
Character Management Panel
Open the Characters panel from the navbar to access the character list. Click on a
character or group to chat with them or edit them, or choose Create New Character to
add a new character.
Panel Controls
Pin Panel: Keep panel open while interacting
Character List: Return to character list view
HotSwap Bar: Quick access to favorite characters
Character List
Create New Character: Add a new character
Import Character: Load character from file
External Import: Import from URL
Create Group: Start a new group chat
Extended Options
World Info linking
Card lore import
Scenario override
Persona conversion
Character rename
Source linking
Replace/Update
Tag import
Gallery view
Content Fields
Character Description: Brief character summary
First Message: Initial greeting or prompt when starting a new chat
Alternative greetings: Define multiple first messages that you can swipe between
when starting a chat
Advanced Definitions Panel
Click on the Advanced Definitions button to access the extended character settings.
Prompt Overrides (Chat Completion/Instruct Mode)
Main Prompt: Replaces default main/system prompt, can use {{original}} placeholder
to include the original prompt
Post-History Instructions: Overrides default post-history instructions
Creator's Metadata
Non-prompt information about the character:
Creator name/contact
Character version
Creator's notes
Embedded tags list
Character Personality
Personality Summary: Brief overview of character's traits
Scenario: Context and circumstances of the dialog
Character's Note: Custom message with selectable depth and message role (also see
Author's Note)
Talkativeness (Group Chats): Slider for Shy → Normal → Chatty
Example Messages: Examples of character's writing style
Group Chat Management
If this is a group chat, you can manage the group members and settings from this panel.
See Group Chats for more details.
Character Design
Character Description
Used to add the character description and the rest that the AI should know. This will
always be present in the prompt, so all the important facts should be included here.
For example, you can add information about the world in which the action takes place and
describe the characteristics of the character you are playing for.
It could be of any length (be it 200 or 2000 tokens) and formatted in any style (free text,
W++, conversation style, etc).
Methods and format
Methods of character formatting are a complicated topic beyond the scope of this
documentation page.
Recommended guides that were tested with or rely on SillyTavern's features:
Trappu's PLists + Ali:Chat guide: https://wikia.schneedc.com/bot-creation/trappu/creation
AliCat's Ali:Chat guide: https://rentry.co/alichat
kingbri's minimalistic guide: https://rentry.co/kingbri-chara-guide
Character tokens
TL;DR: If you're working with an AI model with a 2048 context token limit, your 1000
token character definition is cutting the AI's 'memory' in half.
To put this in perspective, a decent response from a good AI can easily be around 200-
300 tokens. In this case, the AI would only be able to 'remember' about 3 exchanges
worth of chat history.
Why did my character's token counter turn red?
When we see your character has over half of the model-defined context length of tokens
in its definitions, we highlight it for you because this can lower the AI's capabilities to
provide an enjoyable conversation.
What happens if my Character has too many tokens?
Don't worry - it won't break anything. At worst, if the Character's permanent tokens are
too large, it simply means there will be less room left in the context for other things (see
below).
The only negative side effect this can have is the AI will have less 'memory', as it will have
less chat history available to process.
This is because every AI model has a limit to the amount of context it can process at one
time.
'Context'?
This is the information that gets sent to the AI each time you ask it to generate a
response:
Character definitions
Chat history
Author's Notes
Special Format strings
[bracket commands]
SillyTavern automatically calculates the best way to allocate the available context tokens
before sending the information to the AI model.
What are a Character's 'Permanent Tokens'?
These will always be sent to the AI with every generation request:
Character Name (keep the name short! Sent at the start of EVERY Character
message)
Character Description Box
Character Personality Box
Scenario Box
What parts of a Character's Definitions are NOT
permanent?
The first message box - only sent once at the start of the chat.
Example messages box - only kept until chat history fills up the context (optionally
these can be forced to be kept in context)
Popular AI Model Context Token Limits
LLaMA 3 and its finetunes - 8192
OpenAI GPT-4 - up to 128k
Anthropic's Claude - 200k (Claude 3) or 100k (Claude 2)
NovelAI - 8192 (Erato and Kayra, Opus tier; Clio, all tiers), 6144 (Kayra, Scroll tier), or
3072 (Kayra, Tablet tier)
Personality summary
A brief description of the personality.
Examples:
Cheerful, cunning, provocative
Aqua likes to do nothing and also likes to get drunk
First message
The First Message is an important thing that sets exactly how and in what style the
character will communicate.
The character's first message should be long so that later it would be less likely that the
character would respond with very short messages.
You can also use asterisks ** to describe the character's actions.
For example:
*I noticed you came inside, I walked up and stood right in front of you* Welcome. I'm glad
to see you here. *I said with a toothy smug sunny smile looking you straight in the eye*
What brings you...
Examples of dialogue
Describes how the character speaks. Before each example, you need to add the <START>
tag. The blocks of example dialogue are only inserted if there is free space in the
context for them, and they are pushed out of the context block by block. <START> will not
be present in the prompt as it is just a marker - it will instead be replaced with the
"Example Separator" from Advanced Formatting for Text Completion APIs and the contents
of the "New Example Chat" utility prompt for Chat Completion APIs.
Use {{char}} instead of the character name.
Use {{user}} instead of the user name.
Example:
<START>
{{user}}: Hi Aqua, I heard you like to spend time in the pub.
{{char}}: *excitedly* Oh my goodness, yes! I just love spending time at the pub! It's so
much fun to talk to all the adventurers and hear about their exciting adventures! And you
are?
{{user}}: I'm new here and I wanted to ask for your advice.
{{char}}: *giggles* Oh, advice! I love giving advice! And in gratitude for that, treat me
to a drink! *gives signals to the bartender*
<START>
{{user}}: Hello
{{char}}: *excitedly* Hello there, dear! Are you new to Axel? Don't worry, I, Aqua the
goddess of water, am here to help you! Do you need any assistance? And may I say, I look
simply radiant today! *strikes a pose and looks at you with puppy eyes*
Scenario
Circumstances and context of the dialogue.
Favorite Character
Mark the character as a favorite to quickly filter on the side menu bar by selecting the
"Favorites" sort option. Favorite characters have a golden highlight in the list. This will also
make the character portrait appear in the hotswaps area (if enabled in User Settings).
Macros (replacement tags)
Macros can be used in character description, author's notes, world info and many other
places and replaced with the corresponding values when generating a response. They can
be used to insert dynamic content into the prompt, such as the user's name, character's
description, or the current time. Macros are enclosed in double curly braces, e.g.
{{user}} and are usually case-insensitive. Please keep in mind that macro nesting is
currently not supported.
Note: some extensions may also add special context-specific macros that only work in
certain areas (i.e. special placeholders for extension prompts). These will not be
documented here unless the macro is not bound to a specific functionality.
General Macros
{{pipe}}: Only for slash command batching. Replaced with the returned result of the previous command.
{{newline}}: Inserts a newline.
{{trim}}: Trims newlines surrounding this macro.
{{noop}}: No operation, just an empty string.
{{user}} or <USER>: User's name.
{{charPrompt}}: Character's Main Prompt override.
{{charJailbreak}}: Character's Post-History Instructions Prompt override.
{{group}} or {{charIfNotGroup}}: Comma-separated list of group member names or character name in solo chats.
{{groupNotMuted}}: Same as {{group}} but excludes muted members.
{{char}} or <BOT>: Character's name.
{{description}}: Character's description.
{{scenario}}: Character's scenario or chat scenario override (if set).
{{personality}}: Character's personality.
{{persona}}: User's persona description.
{{mesExamples}}: Character's examples of dialogue (instruct-formatted).
{{mesExamplesRaw}}: Character's examples of dialogue (unaltered and unsplit).
{{charVersion}}: The character's version number.
{{charDepthPrompt}}: The character's at-depth prompt.
{{model}}: Text generation model name for the currently selected API. Can be inaccurate!
{{lastMessageId}}: Last chat message ID.
{{lastMessage}}: Last chat message text.
{{firstIncludedMessageId}}: The ID of the first message included in the context. Requires generation to be run at least once in the current session.
{{lastCharMessage}}: Last chat message sent by character.
{{lastUserMessage}}: Last chat message sent by user.
{{currentSwipeId}}: 1-based ID of the currently displayed last message swipe.
{{lastSwipeId}}: Number of swipes in the last chat message.
{{lastGenerationType}}: Type of the last queued generation request. Values: "normal", "impersonate", "regenerate", "quiet", "swipe", "continue".
{{original}}: Can be used in Prompt Overrides fields to include the default prompt from system settings. Applied to Chat Completion APIs and Instruct mode only.
{{time}}: Current system time.
{{time_UTC±X}}: Current time in the specified UTC offset (timezone), e.g. for UTC+02:00 use {{time_UTC+2}}.
{{timeDiff::(time1)::(time2)}}: The time difference between time1 and time2. Accepts time and date macros.
{{date}}: Current system date.
{{input}}: Contents of the user input bar.
{{weekday}}: The current weekday.
{{isotime}}: The current ISO time (24-hour clock).
{{isodate}}: The current ISO date (YYYY-MM-DD).
{{datetimeformat ...}}: Current date/time in the specified format (e.g. {{datetimeformat DD.MM.YYYY HH:mm}}).
{{exampleSeparator}}: Context template example dialogues separator.
{{chatStart}}: Context template chat start line.
{{instructSystemPrompt}}: Instruct system prompt.
{{instructSystemPromptPrefix}}: System prompt prefix sequence.
{{instructSystemPromptSuffix}}: System prompt suffix sequence.
{{instructUserPrefix}}: User message prefix sequence.
{{instructAssistantPrefix}}: Assistant message prefix sequence.
{{instructSystemPrefix}}: System message prefix sequence.
{{instructUserSuffix}}: User message suffix sequence.
{{instructAssistantSuffix}}: Assistant message suffix sequence.
{{instructSystemSuffix}}: System message suffix sequence.
{{instructFirstAssistantPrefix}}: Assistant first output sequence.
{{instructLastAssistantPrefix}}: Assistant last output sequence.
{{instructFirstUserPrefix}}: Instruct user first input sequence.
{{instructLastUserPrefix}}: Instruct user last input sequence.
{{instructSystemInstructionPrefix}}: System instruction prefix sequence.
{{instructUserFiller}}: User filler message text.
{{instructStop}}: Instruct stop sequence.
{{maxPrompt}}: Max size of the prompt in tokens (context length reduced by response length).
{{systemPrompt}}: System prompt content, including character prompt override if allowed and available.
{{defaultSystemPrompt}}: System prompt content (excluding character prompt override).
Chat variables Macros
Local variables = unique to the current chat
Global variables = works in any chat for any character
{{getvar::name}}: Replaced with the value of the local variable "name".
{{setvar::name::value}}: Replaced with an empty string, sets the local variable "name" to "value".
{{addvar::name::increment}}: Replaced with an empty string, adds a numeric value of "increment" to the local variable "name".
{{incvar::name}}: Replaced with the result of incrementing the value of variable "name" by 1.
{{decvar::name}}: Replaced with the result of decrementing the value of variable "name" by 1.
{{getglobalvar::name}}: Replaced with the value of the global variable "name".
{{setglobalvar::name::value}}: Replaced with an empty string, sets the global variable "name" to "value".
{{addglobalvar::name::increment}}: Replaced with an empty string, adds a numeric value of "increment" to the global variable "name".
{{incglobalvar::name}}: Replaced with the result of incrementing the value of global variable "name" by 1.
{{decglobalvar::name}}: Replaced with the result of decrementing the value of global variable "name" by 1.
{{var::name}}: Replaced with the value of the scoped variable "name" (STscript only).
{{var::name::index}}: Replaced with the value at index of the scoped variable "name" (for arrays/objects in STscript).
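For example, {{setvar::mood::happy}} renders as an empty string while storing "happy" in the local variable "mood"; a later {{getvar::mood}} in the same chat is then replaced with "happy".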
Extension-specific Macros
Added by extensions and only work under certain conditions.
{{summary}}: Replaced with the summary of the current chat session (if available).
{{authorsNote}}: Replaced with the contents of the Author's Note.
{{charAuthorsNote}}: Replaced with the contents of the Character's Author's Note.
{{defaultAuthorsNote}}: Replaced with the contents of the default Author's Note.
Chat File Management
Note
Some of these options are available in the "Manage chat files" dialog that opens
from the bottom left options menu.
Group Chats
Reply order strategies
Decides how characters in group chats are drafted for their replies.
Manual
You can select the character to reply manually from the menu or with the /trigger
command. The selected group member will be the only one to reply. User messages won't
trigger any replies automatically. Triggering a generation with an empty user input will
trigger a random unmuted group member to reply.
Natural Order
Tries to simulate the flow of a real human conversation. The algorithm is as follows:
1. Mentions of the group member names are extracted from the last message in chat.
Only whole words are recognized as mentions! If your character's name is "Misaka
Mikoto", they will only activate on "Misaka" or "Mikoto", but never on "Misa",
"Railgun", etc.
Unless the "Allow Self Responses" setting is enabled, characters won't reply to
mentions of their name in their own message!
2. Characters are activated by the "Talkativeness" factor.
Talkativeness defines how often the character speaks if they were not mentioned.
Adjust this value on the "Advanced Definitions" screen in the character editor. Slider
values are on a linear scale from 0% / Shy (character never talks unless mentioned) to
100% / Chatty (character always replies). The default value for new characters is
50% chance.
3. A random character is selected.
If no characters were activated at previous steps, one speaker is selected randomly,
ignoring all other conditions.
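A rough sketch of that flow (illustrative only; the member fields are simplified and this is not the exact SillyTavern algorithm):
```typescript
type Member = { name: string; talkativeness: number; muted: boolean }; // talkativeness: 0..1

function draftSpeakers(lastMessage: string, lastSpeaker: string,
                       members: Member[], allowSelfResponses: boolean): Member[] {
  const candidates = members.filter(m => !m.muted);

  // 1. Whole-word mentions of any part of a member's name in the last message
  //    (regex special characters in names are not escaped in this sketch).
  const mentioned = candidates.filter(m =>
    (allowSelfResponses || m.name !== lastSpeaker) &&
    m.name.split(/\s+/).some(word => new RegExp(`\\b${word}\\b`, "i").test(lastMessage)));

  // 2. Talkativeness roll for members who weren't mentioned.
  const talkative = candidates.filter(m =>
    !mentioned.includes(m) && Math.random() < m.talkativeness);

  const activated = [...mentioned, ...talkative];

  // 3. Random fallback if nobody was activated.
  if (activated.length === 0 && candidates.length > 0) {
    activated.push(candidates[Math.floor(Math.random() * candidates.length)]);
  }
  return activated;
}
```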
List Order
Characters are drafted based on the order they are presented in the group members list.
No other rules apply.
Pooled Order
Activates one random character who hasn't spoken yet since the last user message. If all
characters have spoken, selects one randomly until the next user message.
Group generation handling mode
This setting decides how to handle the character information of the group chat members.
No matter the choice, the group chat history is always shared between all the members.
Swap character cards
Default mode. Every time the message is generated, only the character card information
of the active speaker is included in the context.
Join character cards
The information of all of the group members is combined into one joint prompt in their list
order. This can help in cases when altering large chunks of the context is undesirable, e.g.
with llama.cpp prompt caching.
This mode has two sub-modes (you must choose one):
Include muted - muted characters will always be included in the joint prompt.
Exclude muted - muted characters won't be included if they aren't the current
speaker.
The following fields are being combined:
1. Description
2. Scenario, if not overridden for the chat
3. Personality
4. Message examples
5. Character notes / Depth prompts
Important! Please be aware that due to how the typical character card is structured, the
use of this mode can lead to unexpected behavior, including but not limited to: characters
being confused about themselves, having merged personalities, uncertain traits, etc.
Join Prefix and Suffix
When 'Join character cards' is selected, all respective fields of the characters are
joined together. This means that in the resulting prompt, all character descriptions will be
joined into one big blob of text. If you want those fields to be separated, you can define a
prefix and/or suffix.
These options support normal macros and will also replace {{char}} with the relevant
character's name and <FIELDNAME> with the name of the part (e.g. description,
personality, scenario, etc.).
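As an illustration only (the exact wording is up to you and is not a shipped default), a prefix and suffix such as the following would label each joined block with the character and field it came from:

Prefix: [Start of {{char}}'s <FIELDNAME>]
Suffix: [End of {{char}}'s <FIELDNAME>]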
Other Group Chat menu options
Mute Character
The struck-out speech bubble icon next to the character avatar in the group chat menu
can disable or enable replies from a particular character in the chat.
Force Talk
The speech bubble icon next to the character avatar in the group chat menu will trigger a
reply only from a particular character, bypassing the reply order strategy. It will work even
if the group member is muted.
Auto-mode
While auto-mode is enabled, the group chat will follow the reply order and trigger the
message generation without user interaction. The next auto-mode turn is triggered after a
5-second delay when the last drafted character sends its message. When the user starts
typing into the send message text area, the auto-mode will be disabled, but already
queued generations are not stopped automatically.
Allow Self Responses
Allows consecutive replies from the character who sent the latest message of a turn, if
they happen to be triggered by a mention of their own name while Natural Order is
selected. Has no effect on List Order.
Group Chat Scenario Override
All group members will use the entered scenario text instead of what is specified in their
character cards. Branched chats inherit the scenario override from their parent; it can
then be changed individually.
Peek Character Definitions
Clicking on the character card icon next to the avatar in the group chat menu will quickly
navigate to the usual character definitions screen. Any changes made here will be saved
to the card itself.
To return back to the group chat, click the Group Name title link.
Member Management
Any of your existing characters can be added, removed, muted, or re-ordered within the
group chat. By default, a new member is added to the top of the group members list and
then can be re-ordered using the arrow icons.
Group Chat pop-out
The group chat menu pop-out can be activated by clicking on the icon next to the "Current
Members" field. This creates a pop-out of the group chat menu. By enabling MovingUI
from user settings, this menu can resized and dragged to any position within the interface
and functions just like the regular group chat menu.
Tags
Character cards and groups can be assigned zero or more tags. They are useful for
organizing quickly growing collections by theme, quality, provenance, or whatever you like.
Tagging
There are several ways to add or remove tags to a character card:
Import embedded tags during the import.
Open a card from the Character Management panel. From there you will be able to
assign tags to a character card.
Mass tagging.
To do mass tagging, click the "Bulk edit characters" button (pencil icon), select the cards
you want to tag, right click on any of them, then click "Tag" in the contextual menu.
Note
Please note that groups cannot be mass tagged.
Warning
The tags backup JSON file is not intended for sharing with others as it contains
information specific to your instance only, such as internal entity names!
Note
This popup will appear only if a User Settings option "Import Card Tags" is set to
"Ask".
In the "Import tags for CHARACTER NAME" popup that opens, you'll see a list of Existing
tags (which you already had locally with a matching name), and New tags (which you did
not have locally).
You can either:
Trim the lists as needed and then hit "Import" - remaining Existing tags will be added
to the imported character card, and remaining New tags will be created locally and
then added to the card.
Or simply hit "Import none" to ignore tags contained in the character card and import
ONLY the card.
Or "Import All" as a shortcut to import all tags found in the character card (NOTE:
including any that you trimmed from the lists above; use the "Import" button if you
did).
Or "Import Existing" as a shortcut to only import tags that existed locally with a
matching name.
Filtering character cards
After you create tags, you will see them on a row in the Character Management panel. You
can click these to switch tag filtering state; in order:
One click will show cards tagged with this tag.
Another click to only show cards NOT tagged with this tag.
Another click to reset filtering by this tag.
You can filter by any number of tags at the same time.
Tags as Folders
Note
To use this functionality, it has to be enabled first in the User Settings, under the
UI Theme column. The state of this toggle also saves with the UI theme.
From the "Manage tags" button (gear icon), each tag entry has a multi-state toggle button
to cycle between these tags-as-folder modes (called "bogus folder" in the code):
one click to turn this tag into an "open folder". It will appear as a virtual entry in the
card list; clicking into it will only show cards with that tag
another click to turn this tag into a "closed folder". As above, but cards tagged with
this tag will not appear by default - you'll need to click into the folder to see them.
another click to reset tag-as-folder state for this tag.
Author's Note
What is it?
Author's Note is a powerful tool for customizing AI responses: it inserts a section of
text into the prompt at any position and at any frequency you desire.
Usage
The Author's Note can be found in the Options menu on the left side of the chat input bar.
Data Bank (RAG)
Note
While not formally a part of the data bank, you can attach files even to
individual messages. Use the Attach File option from the "Wand" menu, or a
paperclip icon in the message actions row.
What can be a document? Practically anything that is representable in plain text form!
Examples include, but are not limited to:
Local files (books, scientific papers, etc.)
Web pages (Wikipedia, articles, news)
Video transcripts
Various extensions and plugins can also provide new ways to gather and process data,
more on that below.
Data Sources
To add a document to any of the scopes, click "Add" and pick one of the available
sources.
Notepad
Create a text file from scratch, or edit an existing attachment.
File
Upload a file from the hard drive of your computer. SillyTavern provides built-in converters
for popular file formats:
PDF (text only)
HTML
Markdown
ePUB
TXT
You can also attach any text files with non-standard extensions, such as JSON, YAML,
source code, etc. If there are no known conversions from the type of a selected file, and
the file can't be parsed as a plain text document, the file upload will be rejected, meaning
that raw binary files are not allowed.
Note
Importing Microsoft Office (DOCX, PPTX, XLSX) and LibreOffice documents
(ODT, ODP, ODS) requires a Server Plugin to be installed and loaded. See the
plugin's README page for installation instructions.
Web
Scrape text from a web page by its URL. The HTML document is then processed through
the Readability library to extract only usable text.
Some web servers may reject fetch requests, be protected by Cloudflare, or rely heavily
on JavaScript to function. If you're facing issues with any particular site, download the
page manually through the web browser and attach it using the file uploader.
YouTube
Download a transcript of the YouTube video by its ID or URL, either uploaded by the
creator or automatically generated by Google. Some videos may have transcripts
disabled, and parsing of age-restricted videos is unavailable as it requires a login.
The transcript is loaded in the video's default language. Optionally, you can specify a two-
letter language code to try and fetch the transcript in a specific language. This feature is
not always available and may fail, so use it with caution.
Web Search
Note
This source requires the Web Search extension to be installed and properly
configured. See the linked page for more details.
Perform a web search and download the text from the search result pages. This is similar
to the Web source but fully automated. A chosen search engine will be inherited from the
extension settings, so set it up in advance.
To begin, specify the search query, max number of links to be visited, and the output type:
one combined file (formatted according to the extension rules) or an individual file for
every single page. You can choose to save the page snippets as well.
Fandom
Note
This source requires a Server Plugin to be installed and loaded. See the
plugin's README page for installation instructions.
Scrape articles from a Fandom wiki by its ID or URL. As some wikis are very large, it may
be beneficial to limit the scope using the filter regular expression, which is tested against
the article's title (see the example below). If no filter is provided, all of the pages are
exported. You may save them either as individual files for every page, or joined into a
single document.
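For instance, a purely hypothetical filter that would export only articles whose titles start with "Character" or "Location" could look like this:

^(Character|Location)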
Bronie Parser Extension (Third-Party)
Note
This source comes from a third party and is not affiliated with the SillyTavern
team. It requires you to have Bronya Rand's Bronie Parser Extension
installed, as well as the Server Plugins that the scrapers need to work.
Bronya Rand's Bronie Parser Extension allows the use of third-party scrapers, such as
miHoYo/HoYoverse's HoYoLab, in SillyTavern, similar to the other data sources.
Currently, Bronya Rand's Bronie Parser Extension supports the following:
miHoYo/HoYoverse's HoYoLab (for Genshin Impact/Honkai: Star Rail) via HoYoWiki-
Scraper-TS
To begin, install Bronya Rand's Bronie Parser Extension by following its installation guide
and install a supported Server Plugin into SillyTavern. Restart SillyTavern and go to the
Data Bank menu. Click + Add and you should see that your recently installed scrapers
have been added to the list of possible sources to obtain information from.
Vector Storage
So, you've built yourself a nice and comprehensive library of information on your specific
subject matter. What's next?
To use the documents for RAG, you need to use a compatible extension that will insert
related data into the LLM prompt.
Vector Storage, which comes bundled with SillyTavern, is a reference implementation of
such an extension. It uses embeddings (also known as vectors) to search for documents
that relate to your ongoing chats.
Fun facts
1. Embeddings are arrays of numbers that abstractly represent a piece of text,
produced by specialized language models. More similar texts have a shorter
distance between their respective vectors.
2. The Vector Storage extension uses the Vectra library to keep track of file
embeddings. They are stored in JSON files in the /vectors folder of your
user data directory. Every document is internally represented by its own
index/collection file.
As the Vectors functionality is disabled by default, you need to open the extensions panel
("Stacked Cubes" icon on the top bar), then navigate to the "Vector Storage" section, and
tick the "Enabled for files" checkbox under the "File vectorization settings".
By itself, Vector Storage does not produce any vectors; you need to use a compatible
embedding provider.
Vector Providers
Warning
Embeddings are only usable when they are retrieved using the same model that
generated them. When changing an embedding model or source, the vectors
need to be recalculated.
Local
These sources are free and unlimited and use your CPU/GPU to calculate embeddings.
1. Local (Transformers) - runs on a Node server. SillyTavern will automatically download
a compatible model in ONNX format from HuggingFace. Default model: jina-
embeddings-v2-base-en.
2. WebLLM - requires an extension to be installed and a web browser that supports
WebGPU. Runs directly in your browser and can use hardware acceleration. Automatically
downloads supported models from HuggingFace. Install the extension from here:
https://github.com/SillyTavern/Extension-WebLLM.
3. Ollama - get it from https://ollama.com/. Set the API URL in the API connection menu
(under Text Completion, default: http://localhost:11434 ). Must download a
compatible model first, then set its name in the extension settings. Example model:
mxbai-embed-large (see the example after this list). Optionally, check an option to keep the model loaded in memory.
4. llama.cpp server - get it from ggerganov/llama.cpp and run the server executable with
--embedding flag. Load compatible GGUF embedding models from HuggingFace, for
example, nomic-ai/nomic-embed-text-v1.5-GGUF (see the example after this list).
5. vLLM - get it from vllm-project/vllm. Set the API URL and API key in the API connection
menu first.
6. Extras (deprecated) - runs under the Extras API using the SentenceTransformers
loader. Default model: all-mpnet-base-v2. This source is not maintained and will
eventually be removed.
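As a rough sketch of the local setup (the binary name, port, and GGUF file name below are assumptions that depend on your install), pulling an embedding model for Ollama and starting a llama.cpp server in embedding mode could look like this:

ollama pull mxbai-embed-large
./llama-server -m nomic-embed-text-v1.5.Q4_K_M.gguf --embedding --port 8080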
API sources
All these sources require an API key of the respective service and usually have a usage
cost, but generally calculating embeddings is pretty cheap.
1. OpenAI
2. Cohere
3. Google MakerSuite
4. TogetherAI
5. MistralAI
6. NomicAI
Vectorization Settings
After you've selected your embedding provider, don't forget to configure other settings
that will define the rules for processing and retrieving documents.
Note
Splitting, vectorization, and retrieval of information from the attachments take
some time. While the initial ingestion of the file may take a while, the RAG search
queries are usually fast enough not to create a significant lag.
Message attachments
These settings control the files that are attached directly to the messages.
The following rules apply:
1. Only messages that fit in the LLM context window can have their attachments
retrieved.
2. When the vector storage extension is disabled, file attachments and their
accompanying message are fully inserted into the prompt.
3. When file vectorization is enabled, then the file will be split into chunks and only the
most relevant pieces will be inserted, saving the context space and allowing the
model to stay focused.
Size threshold (KB) - sets a chunking splitting threshold. Only the files larger than the
specified size will be split.
Chunk size (chars) - sets the target size of an individual chunk (in textual characters,
not model tokens!).
Chunk overlap (%) - sets the percentage of a chunk size that will be shared between
adjacent chunks. This allows for a smoother transition between the chunks, but may
also introduce some redundancy.
Retrieve chunks - sets the maximum number of the most relevant file chunks to be
retrieved. They will be inserted in their original order.
Data Bank files
These settings control how the Data Bank documents are handled.
The following rules apply:
1. When file vectorization is disabled, the Data Bank is not used.
2. Otherwise, all available documents from the current scope (see above) are
considered for the query. Only the most relevant chunks across all the files are
retrieved. Multiple chunks of the same file are inserted in their original order.
3. The inserted chunks will reserve a part of the context before fitting the chat
messages.
Size threshold (KB) - sets a chunking splitting threshold. Only the files larger than the
specified size will be split.
Chunk size (chars) - sets the target size of an individual chunk (in textual characters,
not model tokens!).
Chunk overlap (%) - sets the percentage of a chunk size that will be shared between
adjacent chunks. This allows for a smoother transition between the chunks, but may
also introduce some redundancy.
Retrieve chunks - sets the maximum number of file chunks to be retrieved. This
allowance is shared between all files.
Injection Template - defines how the retrieved information will be inserted into the
prompt. You can use the special {{text}} macro to specify the position of the retrieved
text, as well as any other macros (see the example after this list).
Injection Position - sets where to insert the prompt injection. The same rules as for
Author's Note and World Info apply.
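For illustration only (this wording is an assumption, not a shipped default), an Injection Template could look like the following, with {{text}} marking where the retrieved chunks are placed:

Relevant information from the attached documents:
{{text}}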
Shared settings
Query messages - how many of the latest chat messages will be used for querying
document chunks.
Score threshold - adjust to allow culling the retrieval of chunks based on their
relevance score (0 - no match at all, 1 - perfect match). Higher values allow for more
accurate retrieval and prevent completely random information from entering the
context. Sane values are in a range between 0.2 (more loose) and 0.5 (more focused).
Include in World Info Scanning - check if you want the injected content to activate lore
book entries.
Vectorize All - forcibly ingests the embeddings for all unprocessed files.
Purge Vectors - clears the file embeddings, allowing their vectors to be recalculated.
Note
For "Chat vectorization" settings see Chat Vectorization.
Conclusion
Congratulations! Your chatting experience is now enhanced with the power of RAG. Its
capabilities are only limited by your imagination. As always, don't be afraid to experiment!
Extensions
SillyTavern comes with many extensions that can be enabled or disabled in the Extensions
panel. Extensions can add new features, change the behaviour of existing features, or
provide additional content for your AI to use. More extensions can be installed from the
"Download Extensions & Assets" menu in the Extensions panel.
Extensions panel
To open or close the Extensions panel, choose Extensions in the top bar.
Manage extensions: Activate, deactivate, and update extensions
Download Extensions & Assets: Install more extensions, characters, sounds, and
backgrounds from the SillyTavern repository
Notify on extension updates: Check to be notified when there are updates available
for installed extensions
Install extension: Import an extension from a Git repository URL
Using third-party extensions can have unintended side effects and may pose
security risks. Always make sure you trust the source before importing an
extension via Install extension. We are not responsible for any damage
caused by third-party extensions.
Built-in extensions
These extensions are built into SillyTavern and do not need to be installed. They can be
enabled or disabled in the Extensions panel.
Chat Translation
Translate chat messages to a different language
Image Captioning
Generates text from images so your AI can "see" and respond to visual content in
your conversations
Image Generation
Use local or cloud-based Stable Diffusion, FLUX or DALL-E APIs to generate images
Expression Images
Images (aka 'sprites') of your AI character, shown next to or behind the chat window
Summarize
Auto-summary of the chat history
Chat Vectorization
Finds relevant messages from chat history and adds them into the context
Text To Speech
Voice narration for your chat messages via ElevenLabs, Silero, your system TTS,
AllTalk, XTTS, and more
Quick Reply
Reply to chat messages with a single click, run commands and STscripts, and more
Token Counter
Converts text into tokens and counts the number of tokens
Installable extensions
Install any of these extensions from the "Download Extensions & Assets" menu in
Extensions.
Blip
Animate the text of character messages with variable speed and play a sound along with
the animation.
Dynamic Audio
Adds immersive background music and ambient sounds to your chats.
EmulatorJS
Play retro console games directly in SillyTavern chats.
Live2d
Adds support for live2d models. Customizable expressions, animations and
interactions.
Objective
Set an Objective for the AI to aim for during the chat.
RVC
Adds Realtime Voice Cloning capabilities to the Text-to-Speech module.
Speech Recognition
Convert your speech to text using the browser or Extras.
VRM
Adds support for VRM models. Customizable expressions, animations and
interactions.
Web Search
Adds web search results to LLM prompts.
AccuWeather
Provides weather information using the AccuWeather API as a slash command or a
function tool.
Chess
Play the game of chess with the LLM.
Code Runner
Allows running JavaScript and STscript code from code blocks in chat.
D&D Dice
A set of 7 classic D&D dice for all your dice rolling needs.
Duplicate Finder
Adds the ability to cluster characters into similarity groups to easily find duplicates.
Emoji Picker
Adds a button to quickly insert emojis into a chat message.
Group Greetings
Allows setting alternate greetings that are specific to group chats.
Group SendAs
Adds a button to quickly insert a /sendas command template for the selected group
member.
HypeBot
Show personalized suggestions based on your recent chats using NovelAI's
HypeBot engine. Requires an active NovelAI subscription.
Idle
Adds "idle prompting" after the user has been idle for some time to organically
continue the conversation.
LaTeX
Render LaTeX and AsciiMath formulas in chat messages.
Mermaid
Adds Mermaid diagrams & flowcharts rendering to SillyTavern chats.
Notebook
Adds a place to store your notes. Supports rich text formatting.
Parameter Randomizer
Adds the ability to randomize API settings sliders with every generation.
Prompt Inspector
Adds an option to inspect and edit output prompts before sending them to the server.
Push Notifications
Allows receiving push notifications for incoming chat messages.
Quick Persona
Adds a dropdown menu for selecting user personas from the chat bar.
RSS
Gets the latest news from RSS feeds as a slash command or a function tool.
Screen Share
Provides the screen image for multimodal models when you send a message.
Silence Player
Adds a silence audio player to the extensions menu. Can help if the browser tab is
being killed in the background.
Timelines
Adds a timeline navigation to the chat history.
Variable Viewer
Easy way to view and modify variables.
WebLLM
Provides an interface for extensions to use language models directly in the browser.
Blip
This guide will walk you through setting up and customizing the Blip extension for your
SillyTavern experience. This extension animates the text of messages at variable speed
and plays a sound along with the animation. You can use an audio file or generate the sound.
Prerequisites
Before you begin, ensure you've met the following prerequisites:
Make sure you're on the latest version of SillyTavern.
Install the "Blip" extension from the "Download Extensions & Assets" menu in the
Extensions panel (stacked blocks icon).
Blip global settings
1. Blip user message:
Enable the checkbox to play the animation on user messages.
Set a profile for the user, or a default profile, if you want the blip animation for the user.
2. Blip only for certain text:
Enable the checkbox to only blip for text inside quotes.
Enable the checkbox to ignore everything inside asterisks.
3. Automatic scroll down:
Enable the checkbox to make the chat scroll down to follow the text animation; disable it
if you want to scroll freely during the animation.
4. Audio volume
Mute the audio if just the animation of the text is desired.
You can adjust the global volume of blip audio.
Character animation/voice profile
You can save a profile for each character:
including the user and an optional default profile that will be used when a character
has no profile.
If only the current chat characters are shown in the list, click the checkbox to show all
your characters.
1. Select the character to assign/update profile:
Select a character; if it has a profile, it will be loaded.
If it does not have a profile yet, the current parameters will become its profile
settings.
Any profile can be deleted using the remove button.
Use the refresh button if your character does not appear in the list.
2. Text animation settings:
Set the text speed: the delay in milliseconds between each letter printed.
Set the min/max speed multipliers to values other than 1.0 to randomize the animation speed.
Set the comma/phrase delay above 0 to add a pause when special characters are
printed; this can add more liveliness to the animation. Audio is paused too in this case.
3. Audio parameters:
Set a volume multiplier that will only affect this voice profile, if needed.
Set the audio speed: the delay between each blip sound, independent of the text speed.
4. Blip origin: Generated sound:
Use the min/max frequency slider to customize the blip sound played.
If min/max are different, a random sound in this range is played each time.
5. Blip origin: file:
Choose a file in the list.
You can get official ST blip assets from the assets extension menu.
Or put the file directly into: \SillyTavern\data\<user-handle>\assets\blip .
Enable the checkbox to wait for the entire file to finish playing before it plays again, if
needed.
Thank you for following this guide! Your SillyTavern experience is now enriched with text
animation and blip voices.
Character Expressions
What is it?
Expression images are images (aka 'sprites') of your AI character which are shown next to
(or behind) the chat window.
Expression images can automatically change based on a classification, adjusting to the
sentiment expressed in the AI's most recent chat response.
Adding Character Expression Images
1. Open the Extensions Panel and expand the 'Character Expressions' section. If you
have the character chat open, you will see a grid of image placeholders.
Expression Drawer
2. Click the 'Upload image' button at the top left of each image in the grid, and select
the image you want to apply to that emotion. This will save the image with the correct
filename inside the /data/<user-handle>/characters/(character_name_here)/ folder.
3. Repeat this for all expressions you want to assign an image to.
Importing an Expression Images ZIP file
Using the ' Upload sprite pack (ZIP)' button, you can import a zip file that contains a
collection of expression images, and those images will automatically be added to the
correct folder for your currently selected character. The ZIP file must contain all images
in a flat structure (no subfolders) and correctly named files. Importing a zip will not
automatically rename any images to make them match the emotions.
Change Expressions Manually
1. Click on any of the uploaded expression images (sprites) to display them near the
chat interface (with default UI mode) or at the center of the screen (in Visual Novel
mode).
2. Use the /expression-set (name) slash command or matching Quick Reply to set the
sprite without opening the extensions menu.
Change Expressions Automatically
To automatically set expressions when the character replies, you have multiple options.
Expressions change per message or at regular intervals when message streaming is
enabled.
Setup Instructions (Local)
1. Open the extensions panel and expand the "Character Expressions" extension menu.
2. Select "Local" in the classification source dropdown.
3. This will start a one-time download of the classification model from HuggingFace Hub
(about 100 MB).
4. Generate any message to verify that the classification works and the sprite appears.
You may also check the server console for debug logs.
Local classification defaults to 28 possible image labels: Cohee/distilbert-base-uncased-
go-emotions-onnx
To use the 6-option classification model, change the value of the
extensions.models.classification variable in the config.yaml file to: Cohee/bert-base-
uncased-emotion-onnx
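For reference, a sketch of how that nested key would look in config.yaml (only the classification value needs to change; the surrounding keys are implied by the key path above):

extensions:
  models:
    classification: Cohee/bert-base-uncased-emotion-onnx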
How does the classify module work?
The classify module uses a small 'sentiment parsing' model that runs alongside the
SillyTavern server. This model takes the new output from the AI and detects what kind of
sentiment, or emotion, the text is expressing. While multiple sentiments may be expressed
in a single message, the model only picks the most likely one and returns that to
SillyTavern. The frontend extension then displays the image that is associated with that
sentiment.
Setup Instructions (with Extras)
Warning
Extras is deprecated and may be removed in future updates.
1. Have Extras installed and running with the classify module enabled: python
server.py --enable-modules=classify
2. Import the expression images the same way as mentioned above.
3. Select "Extras" in the classification source dropdown.
4. The appropriate expression image will display automatically whenever the AI sends
you a response.
Extras API uses a classification model with 6 options by default: nateraw/bert-base-
uncased-emotion
There is also a model with 28 options: joeddav/distilbert-base-uncased-go-emotions-
student
To use this model you need to change your Extras command line to include the following
argument (with a space before and after): --classification-model=joeddav/distilbert-
base-uncased-go-emotions-student
Tip
Both Local and Extras only support a limited list of expressions.
If you want Custom Expressions to be displayed, you either need to train a
classification model with supported labels (outside the scope of this guide), or
you can use LLM or WebLLM as the classification source, both of which will
automatically use all existing expressions - both the default and any custom
ones.
If you have more than one character with the same display name, they will both use the
same set of expression images.
If you want a different image set to be used for each version of the same-named
character, you can use the sprites folder override.
Folder overrides can also be used to define different sprite sets (outfits, etc.) of the same
character.
How to set an override
1. Create a folder in /data/<user-handle>/characters with any name and put
images there, e.g. /data/<user-handle>/characters/Boris .
2. Open the chat with the character whose sprites you'd like to override.
3. Enter the name of the override folder into the "Sprite Folder Override" input and click
"Submit".
4. The Sprites list will reload and the "Sprite set" indicator should show the override
folder.
5. Alternatively, you can use the /costume slash command to achieve the same result:
/costume Boris .
6. By prepending a backslash to the override folder name, it will resolve to a subfolder in
the current character sprites folder, e.g. /costume \tracksuit for the character
named Boris will resolve to the /data/<user-handle>/characters/Boris/tracksuit
folder.
Chat Translation
Overview
The Chat Translation Extension enables real-time translation of chat messages between
different languages using various translation providers. It supports both manual and
automatic translation modes.
(Screenshot: a character message translated from English to Chinese using the 'Translate Message/翻譯訊息' message action button.)
DeepL-specific configuration
Formality levels available for German, French, Italian, Spanish, Dutch, Japanese, and
Russian
Configure via deepl.formality in config.yaml
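As a sketch only (check the DeepL documentation for the exact accepted values; "prefer_more" is shown purely as an assumed example), the config.yaml entry would look something like this:

deepl:
  formality: prefer_more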
Slash Commands
Use the /translate command for quick translations. Syntax: /translate
[target=language_code] text . If the target language is not provided, the value from the
extension settings will be used.
Basic usage
Translate text to the current target language and show it in a popup:
/translate Welcome to the Tavern | /echo
This is useful for checking the quality of a translation into a language that you don't
speak, before writing it somewhere important.
(Screenshot: translation popup, 'My hovercraft is full of eels' -> '我的氣墊船裡裝滿了鰻魚' -> 'My hovercraft is filled with eels', en -> zh-TW -> en.)
The UI controls are shown in the current locale, independent of the configured target
language.
(Screenshot: translation popup from an example using /input and /buttons, '我的氣墊船裡裝滿了鰻魚' -> 'My hovercraft is full of eels', zh-TW -> en -> zh-TW.)
Input language detection is relatively effective in the following examples:
(Screenshot: translation popup, 'A légpárnás hajóm tele van angolnával' detected as Hungarian, zh-TW -> hu -> zh-TW.)
(Screenshot: translation popup, 'Il mio hovercraft è pieno di anguille' detected as Italian, it -> zh-CN -> en.)
Technical Notes
UTF-8 encoding, special characters, and emojis are supported
Handles large messages by splitting into chunks when needed
Preserves formatting and embedded images in messages
Caches translations to avoid redundant API calls
AI input language
internal_language controls the language into which user messages are auto-translated
before being sent to the AI. It is hardcoded to 'en' in the default settings and cannot be
changed through the UI. Thus, the translation target language for messages to the AI is
always English. Previous testing showed that AI performance was better when receiving
English messages, but this may change as more LLMs are being trained on more varied
language data. I suppose one could change internal_language in settings.json and
find out.
Chinese variant handling
The extension supports both Simplified and Traditional Chinese, but not all translation
providers do. The UI presents these as 'Chinese (Simplified)' and 'Chinese (Traditional)'
respectively, with language codes 'zh-CN' and 'zh-TW'. They are mapped to the following
language codes for translation providers:
Libre Translate: 'zh-CN' to 'zh' and 'zh-TW' to 'zt'.
DeepL and DeepLX: both variants to 'ZH'.
Bing: 'zh-CN' to 'zh-Hans', 'zh-TW' as-is.
Other providers use 'zh-CN' and 'zh-TW' as provided.
Text length limits
Some providers have character limits per request:
Yandex: 5000 characters
DeepLX: 1500 characters
Bing: 1000 characters
Google: 5000 characters
Longer texts are automatically split into chunks for translation.
Chat Vectorization
Disclaimer
The use of this extension does not guarantee a better chatting experience or
improved memory of any sort. Only use if you understand all the implications of
vector database utilization.
Chat vectorization searches for messages in your current chat history that seem relevant
to your most recent messages. It temporarily shuffles the most relevant messages to the
beginning or end of the chat history. This happens when the model's reply to your last
message is generated.
The messages at the start and end of the chat history tend to have the greatest impact
on the model's reply. Therefore, shuffling relevant messages to these locations can help
the model focus on relevant information in its reply.
In particular, chat vectorization can find relevant messages that are too far back in the
message history to fit into the request context. Shuffling these messages into context
provides the model with information that it would not have otherwise.
Chat vectorization is a kind of retrieval-augmented generation (RAG). Retrieval-
augmented generation increases the quality of responses generated by a model, by
providing additional relevant information in the prompt.
Retrieval: the most recent messages are used to retrieve relevant past messages
Augmented: the model's context is augmented by inserting past messages in a useful
way
Generation: the model is instructed to use the past messages when generating the
response
Some terms:
A vector is a set of numbers that could represent the themes, content, style, or
other characteristics of a piece of text.
Vectorization is calculating the vector that represents a piece of text. This is
done by a vectorizing model. Just as text generation models make text from
text, vectorizing models make vectors from text.
Vector search finds relevant results by comparing vectors rather than, say,
keywords. If we calculate the vector for a search query, we can compare it to
the stored vectors for a collection of pieces of text. This finds the texts in our
collection that are most similar to the text in the search query. In the case of
chat vectorization, the "search query" is the most recent 2 messages, and the
"texts in our collection" are all the other messages in the chat.
Setting up
To enable Chat vectorization, select "Extensions" > "Vector Storage" > "Enabled for chat
messages".
Configure a vectorization source and vectorization model. Chat vectorization uses the
same vector source as Data Bank, so you may have set this up already. The settings for
the Vectorization Source and Vectorization Model are documented in Data Bank.
Chat vectorization uses the same vector storage as Data Bank, but this does not need to
be set up or configured. There is also information about Vector Storage in Data Bank.
Chat vectorization does not use Data Bank to store the chat messages. The messages are
stored in the chat.
Preparing chat messages for search
(vector storage)
So that chat messages can be searched, a vector is calculated for each message and
stored.
Vectorizing occurs in the background, whenever you send or receive a message.
Each message is stored individually, so that it can be found and shuffled individually
during generation.
Large messages are split into "chunks" so that the model can be given the most relevant
part of a long message. The chunk size is 400 characters. You can change this with
"Chunk size (chars)".
Messages are divided into chunks by finding a chunk boundary such as a paragraph
break, line break, or space between words. This is so that all the chunks make sense,
as far as possible. If your chat messages have some other way to mark natural splitting
points, such as ---- , you can add this to "Chunk boundary". The setting for "Chunk
boundary" is shared with Data Bank.
Vector storage controls
To calculate vectors for all messages in the current chat, without waiting for them to be
processed in the background, choose "Vectorize All" from the settings.
To see how many messages in the current chat have been vectorized, choose "View
Stats". This displays the total number of vectors stored. It also indicates the specific chat
messages that have been vectorized, by marking them with a green ball.
To remove all the vectors for messages in the current chat, choose "Purge Vectors".
The controls for "Vectorize All" and "Purge Vectors" within Chat vectorization
only affect the stored vectors for the current chat. However, there are identical
buttons in File vectorization that affect the vectors for files in Data Bank. Ensure
that you are purging the vectors that you intend to purge.
Vector summarization is intended to make vector search of chat messages more effective.
It does this by introducing a summarizing step prior to vectorizing. The summarizing step
extracts the most important parts of the message, so that the resulting vector is a better
indicator of what the message relates to.
Vector summarization may make vector search less effective.
To summarize the messages in the chat history, and generate a vector for each
summarized message, choose "Summarize chat messages for vector generation".
The summarized message does not replace the original message in chat. If a vector
search matches the vector of a summarized message, the original message is retrieved
from chat history and shuffled into context. The summarized versions of the messages are
retained in Vector Storage, which may be of interest for debugging.
To summarize the content of the messages used to search the chat history (the last 2
messages by default), choose "Summarize chat messages when sending".
Each time a message is summarized for vectorizing, a separate request is made to the
summarizing model. You can choose which summarizing source is used with "Summarize
with". Choosing "Main API" will generate the summaries using the same model and
connection settings that you use for generating chat or text completions.
The request consists of the raw message content and an instruction about how the model
should produce the summary. You can change the instruction with "Summary Prompt".
Dynamic Audio
This guide will walk you through setting up and customizing dynamic audio assets for your
SillyTavern experience.
Prerequisites
Before you begin, ensure you've met the following prerequisites:
Make sure you're on the latest version of SillyTavern.
Install the "Dynamic Audio" extension from the "Download Extensions & Assets" menu
in the Extensions panel (stacked blocks icon).
Dynamic Audio Setup (Browser)
1. Connect to the Assets Repository:
Launch SillyTavern and navigate to Extensions > Assets.
Click on the "Connect" button to establish a connection to the official assets
repository.
Download the desired audio assets, such as background music (BGM) or ambient
sounds, that correspond to the backgrounds you intend to use.
2. Enable Dynamic Audio Extension:
In SillyTavern, go to Extensions > Dynamic Audio.
Enable the extension, unmute and adjust the volume of BGM and ambient sounds
to your preference.
When a BGM track ends, another one will play randomly; click the loop button to keep
the current BGM playing.
Click the roll button to pick another BGM randomly.
3. Expression based BGM:
Enable the expression BGM switch if you want the BGM to follow the character's
expression (requires BGM in the character folder; see below).
Adjust the cooldown timer (in seconds) between BGM updates. Increase it if you
find the BGM changes too frequently in group chats or when using character-
specific BGM with emotion detection.
Importing Music for Characters
To set up custom music for your characters' emotions, follow these steps:
1. Navigate to Character Folder:
Go to the characters folder, e.g., \SillyTavern\data\<user-
handle>\characters\Seraphina .
2. Create BGM Folder:
Inside the character folder, create a subfolder named bgm .
3. Import Emotion Music:
Within the bgm folder, import the music files for each emotion. Supported audio
extensions include .mp3 , .ogg , and .wav .
Naming convention: [emotion]_[number].mp3 , e.g., anger_0.mp3 , joy_0.mp3 .
4. Multiple Tracks for Emotions:
You can import multiple tracks for the same emotion by incrementing the number,
e.g., neutral_1.mp3 , neutral_2.mp3 .
5. Default Music Selection:
When no emotion is detected, a random neutral track will play as the default.
Emotions are detected similarly to updating sprites; refer to the expression
images documentation for details.
Changing Default BGM Music
If a character doesn't have custom BGM in their folder, a default track will play. Here's
how you can change it:
1. Navigate to BGM Folder:
Go to the following folder: \SillyTavern\data\<user-handle>\assets\bgm .
2. Replace/Add Music:
Replace or add music files ( .mp3 , .ogg , .wav ) to this folder.
These are the official audio assets downloaded using the assets extension.
One of these tracks will play randomly when no character-specific BGM is found
(solo or group chat).
Changing Ambient Sounds
Ambient sounds add depth to your scenes. Here's how you can customize them:
1. Navigate to Ambient Folder:
Go to the following folder: \SillyTavern\data\<user-handle>\assets\ambient .
2. File Naming Convention:
Ambient audio filenames correspond to background image filenames, replacing
spaces with dashes.
Example: "bedroom-clean.mp3" corresponds to the "bedroom clean.jpg"
background.
If the lock button is unlocked, the audio file corresponding to the background will
play. Activating the lock will keep the current ambient sound playing.
3. Custom Ambients:
You can add your own ambient sounds for custom or existing backgrounds by
following the same naming pattern.
Thank you for following this guide! Your SillyTavern experience is now enriched with
dynamic audio.
EmulatorJS
This extension allows you to play retro console games right from the SillyTavern chat.
Installation
Prerequisites:
Latest release version of SillyTavern.
ROM files downloaded from the net. You can find them anywhere.
How to install:
1. Install using SillyTavern's extensions downloader.
2. Or use this link: https://github.com/SillyTavern/SillyTavern-EmulatorJS
Usage
Open the "EmulatorJS" extension menu.
Click "Add ROM file". ROMs are saved to your browser storage and not stored on a
server.
Select the game file to add. Input the name and core (if it wasn't auto-detected). If
the core requires a BIOS file, add it too.
Click the "Play" button in the list or launch via the wand menu.
You can customize controls and other settings in the emulator frame after launching
the game.
Use save/load state functions if you need to take a break.
Check the EmulatorJS docs to see the list of available cores and their requirements:
Systems.
Comments mode
With the power of multimodal models such as GPT-4 Vision, your AI bots can see your
gameplay and provide witty in-character comments.
Requirements
1. A browser that supports ImageCapture. Tested on desktop Chrome. Firefox requires
enabling it in the config. Safari won't work.
2. Chat Completion API with image inlining mode is recommended. Requires OpenAI or
OpenRouter API key with "gpt-4-turbo" or "gpt-4o" as the selected model; Google AI
Studio with Gemini 1.5 Pro or Gemini 1.5 Flash model; Anthropic Claude (Opus 3 or
Sonnet 3.5 models recommended). Check the API documentation of the chosen backend
to see if the selected model supports multimodal prompts.
3. If image inlining is disabled, make sure that the "Image Captioning" extension is
enabled, then select the "Multimodal" captioning source:
OpenAI, Claude, MistralAI, or Google AI Studio with access to any vision-capable
model.
OpenRouter API with a compatible multimodal model.
Locally hosted Llava model in Ollama, KoboldCpp, oobabooga TextGen WebUI or
vLLM.
How to enable comments
1. Make sure you set the interval of providing comments in the EmulatorJS extension
settings. This setting defines how often the character is queried for comments using a
snap of your current gameplay. A value of 0 indicates that no comments are provided.
2. Select a character chat and launch the game. For the best performance, make sure
that the ROM file is properly named so that the AI has more background context.
3. Start playing as you normally would. The vision model will be queried periodically to
write a comment based on the latest screenshot it "sees".
Settings
1. Caption template - a prompt used to describe the in-game screenshot. The additional
macros {{game}} and {{core}} are supported.
2. Comment template - a prompt used to write a comment based on the generated
caption. The additional macros {{game}}, {{core}}, and {{caption}} are supported (see the
example after this list). For image inlining mode, {{caption}} is replaced with see included image .
3. Force captions - will force the use of multimodal captioning even if image inlining is
supported and enabled.
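For illustration only (this exact wording is not a shipped default), a comment template could look something like this:

You are watching someone play {{game}} on the {{core}} core. Here is what is happening on screen: {{caption}}. Write a short, in-character comment about it.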
Why am I not seeing any comments?
Comments are temporarily paused (interval step skipped) if:
1. Emulator is paused (with a pause button, not in-game).
2. The browser window is out of focus.
3. The user input area is not empty. This is to let you type your reply in peace.
4. Another reply generation is currently in progress.
5. A TTS voice is being read aloud. The comment is held off (20 seconds maximum) until it
finishes, but not skipped.
Other common issues:
1. Make sure you've set a commenting interval before launching the game.
2. Make sure you have set a multimodal API key and there are no errors in the ST server
console.
Still doesn't work? Send us your browser debug console logs (press F12).
Credits
EmulatorJS engine (GPLv3): https://github.com/EmulatorJS/EmulatorJS
Image Captioning
Image Captioning allows SillyTavern to automatically generate text descriptions for
images used in chats.
Use Image Captioning when you want your AI character to "see" and respond to visual
content in your conversations.
Create captions for images you upload or paste into messages
Add context to existing images in the chat history
Use various sources for generation, including local models, cloud APIs, and
crowdsourced networks
There are options that require no setup, no money, and no GPU. There are also options
that require some or all of those things. Choose the one that fits your needs and
resources.
The image captioning extension is built into SillyTavern and does not need to be installed
separately.
Quick start
1. Set up:
Open the Image Captioning panel in the Extensions panel
Choose a captioning source (most likely "Local" or "Multimodal")
For "Multimodal" ensure you've set up the connection in the API Connections
tab
2. Generate a caption:
Choose "Generate Caption" from the Extensions popup menu
Select an image file when prompted
Wait for the caption to be generated
3. Review and send:
The captioned image will be inserted into your message
See the caption using the image tooltip
Click Send to see what your character thinks of the image!
Panel controls
Source Selection
Choose the source for image captioning. Supported options:
Multimodal - Cloud (OpenAI, Anthropic, Google, MistralAI, and others) or Local (Ollama,
llama.cpp, KoboldCpp, Text Generation WebUI, and vLLM). Supports custom prompts so
you can ask your images questions.
Extras - The Extras project was discontinued in April 2024 and is not maintained or
supported.
Caption Configuration
Caption Prompt: Enter a custom prompt for captioning. The default prompt is "What's
in this image?"
Ask every time: Toggle to request a custom prompt for each image caption
Message Template
Message Template: Customize the caption message template. Use {{caption}}
macro to insert the generated caption. The default template is [{{user}} sends
{{char}} a picture that contains: {{caption}}]
Auto-captioning
Automatically caption images: Toggle to enable automatic captioning of images
pasted or attached to messages
Edit captions before saving: Toggle to allow editing captions before they are saved
Captioning images
All the ways to caption images in SillyTavern:
Choose "Generate Caption" from the Extensions popup menu and select an image
file when prompted
Click the Caption icon at the top of an image already in a message
Paste an image directly into the chat input with auto-captioning enabled
Attach an image file to a message using the Embed File or Image button in the
actions of a message.
Send a message with an embedded image
Use the /caption slash command
Auto-Captioning
The auto-captioning feature allows you to automatically generate captions for images as
they are added to the chat, without manually triggering the captioning process each time.
To enable, select the "Automatically caption images" checkbox in the Image Captioning
panel. You can also choose to edit captions before they are saved by checking the "Edit
captions before saving" box.
Once enabled, auto-captioning will trigger in the following scenarios:
When an image is pasted directly into the chat input.
When an image file is attached to a message.
When a message with an embedded image is sent.
The system will use your selected captioning source (Local, Extras, Horde, or Multimodal)
and the configured settings to generate a caption for the image.
Editing captions before saving (Refine Mode)
If you've enabled the "Edit captions before saving" option:
1. After an image is added, a popup will appear with the generated caption.
2. You can review and edit the caption as needed.
3. Click "OK" to apply the caption, or "Cancel" to discard the caption without saving.
Caption sending
The generated (and optionally edited) caption will be automatically inserted into the
prompt using the Message Template you've configured. By default, it will be sent in this
format:
[BaronVonUser sends Seraphina a picture that contains: ...]
The /caption slash command accepts the following arguments:
prompt (optional): A custom prompt for the captioning model. Only supported by
multimodal sources.
quiet=true|false : If set to true, suppresses sending a captioned message to the
chat. Default is false.
mesId=number : Specifies a message ID to caption an image from an existing message
instead of uploading a new one.
If no mesId is provided, the command will prompt you to upload an image. When quiet
is false (default), a new message with the captioned image will be sent to the chat. The
generated caption can be used as input for other commands.
Examples
Caption a new image with the default settings:
/caption
Caption an image from message #10 with a custom prompt then generate a new image
based on the caption:
/caption mesId=10 Describe this image using comma-separated keywords | /imagine
Local source
You can change the model in config.yaml. The key is called extras.captioningModel
because reasons. Enter the Hugging Face model ID you want to use. The default is
Xenova/vit-gpt2-image-captioning .
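As a sketch of how that key would look in config.yaml (only the captioningModel value needs to change; the nesting is implied by the key path above):

extras:
  captioningModel: Xenova/vit-gpt2-image-captioning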
You can use any model that supports image captioning ( VisionEncoderDecoderModel or
"image-to-text" pipeline). The model needs be to compatible with the transformers.js
library. That is, it needs ONNX weights. Look for models with the ONNX and image-to-
text tags, or that have a folder called onnx full of .onnx files.
Multimodal source
General configuration
Model: Choose the model for image captioning. Options vary based on the selected
API.
Allow reverse proxy: Toggle to allow using a reverse proxy if defined and valid
(OpenAI, Anthropic, Google, Mistral)
API keys and endpoint URLs for captioning sources are managed in the API Connections
panel. Set the connection up in API Connections first, then select it as your captions
source in Captioning.
For most local backends, you will need to set some options in the model backend rather
than in SillyTavern. If your backend can only run one model at a time and doesn't support
automatic switching, you are unfortunately going to have a hard time using the same
backend for chat and captioning with different models.
Even if you run two instances of the backend on different ports, API Connections only
allows one active configuration per backend type. But what if I told you... that you can
probably connect to your backend in both Text Completion and Chat Completion modes?
Now you can have two connections to the same backend type.
Sources
To use one of these caption sources, select Multimodal in the Source dropdown.
"I want the best captioning possible, and I don't mind paying for it": Anthropic
"I don't want to pay anything or run anything": Google AI Studio free tier
"I want to caption images locally and have it just work": Ollama
"I want to keep the dream of local AI alive": KoboldCpp
"I want to complain when it doesn't work": Extras
Supported API providers include, for example:
01.AI (Yi) - Cloud, paid, yi-vision
KoboldCpp
For general information on installing and using KoboldCpp, see the KoboldCpp
documentation.
To use KoboldCpp for multimodal captioning:
get a multimodal-capable model, trained to process text and image prompts at the
same time.
also get the multimodal projections for the model. These weights allow the model to
understand how the text and image parts of the input relate to each other.
load the model and projections in the KoboldCpp launch GUI or command line
interface.
The original and classic local multimodal model is LLaVA. GGUF-format files for the model
and projections are available from Mozilla/llava-v1.5-7b-llamafile. To load them from the
command line, set the model and projections with the --model and --mmproj flags. For
example:
./koboldcpp \
--model="models/llava-v1.5-7b-Q4_K.gguf" \
--mmproj="models/llava-v1.5-7b-mmproj-Q4_0.gguf" \
... other flags ...
Some LLaVA finetunes you can try: xtuner/llava-llama-3-8b-v1_1-gguf, xtuner/llava-phi-3-
mini-gguf.
You can use multimodal projections for the base model that your particular finetune was
built from. Projections for some common base models are available from
koboldcpp/mmproj.
Image Generation
Use local or cloud-based Stable Diffusion, FLUX or DALL-E APIs to generate images.
Automatically generate images as replies to your messages for full immersion, generate
from chat history and character information from the wand menu or slash commands, or
use the /sd (anything_here) command in the chat input bar to make an image with your
own prompt.
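For example, a free-mode prompt typed straight into the chat input bar could look like this (the prompt text itself is just an illustration):
/sd a cozy wooden cabin in a snowy forest at dusk, warm light in the windows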
Most common Stable Diffusion generation settings are customizable within the SillyTavern
UI.
Supports multiple image generation sources, both local and cloud-based
Various generation modes for characters, scenes, and custom prompts
Slash commands for easy image generation within chats
Interactive mode to trigger image generation based on natural language requests
Customizable prompt templates and prefixes for consistent style and quality
Character-specific prompt prefixes for tailored character images
Style presets to quickly switch between different image generation settings
Flexible visibility options for generated images in chat
Advanced ComfyUI integration for highly customizable workflows
Ability to view all generated images in a character gallery
Image swipes feature to regenerate images while keeping the same prompt
Options to edit prompts before generation and extend free-mode prompts
Integration with AI function calling for automatic image generation detection
Supported sources
Source Remarks
Generation modes
Wand menu item "Yourself" (slash command argument: you): a full-body portrait of the current character.
Wand menu item "Your Face" (slash command argument: face): a close-up portrait of the current character. Forces a portrait aspect ratio.
Image swipes
Image swipes allow you to reroll the image generation while keeping the same prompt. If a
fixed seed is set, it will be randomized for the next generation.
To cycle through images, hover the mouse cursor (tap on mobile) over a generated image to
reveal the arrow buttons and swipe counter. Tapping the right arrow on the latest image will
generate a new one.
'Swipes' here is just a name; don't use an actual swiping gesture, as that will regenerate
the message itself, not the attached image.
Options
Edit prompts before generation
Allows you to edit the automatically generated prompts manually before sending them to
the Stable Diffusion API.
Use function tool
Uses function calling to automatically detect the intention to generate an image.
Requirements:
1. Must have image generation configured with a supported source.
2. Must use a supported Chat Completion API model and have function tool calling
enabled in the AI Response settings.
3. The "Use function tool" option must be enabled in the Image Generation settings.
4. The user should express an intent to generate an image in the chat message, e.g.
"Send me a picture of a cat".
The interactive mode will not trigger when the function tool is enabled.
ComfyUI Configuration
ComfyUI is a fast and very flexible option for image generation.
If you're familiar with ComfyUI, the tl;dr is: make your workflow in ComfyUI, download it in
API format, and paste it into the SillyTavern ComfyUI Workflow Editor. ST will submit your
workflow to ComfyUI's API and you will get an image in your chat. But with great power
comes great responsibility, and the main responsibility is inserting placeholders in your
workflow JSON so you can change settings from SillyTavern.
If you're not familiar with ComfyUI, you can still use it to generate images in SillyTavern
using the default workflow. Later, when you want great power, you can learn how to use
ComfyUI...
Controls
This panel allows you to configure and manage your ComfyUI integration with SillyTavern.
Enter the URL of your ComfyUI server in the ComfyUI URL input field. The default value is
http://127.0.0.1:8188 . If you are using SwarmUI, the default port for the managed
ComfyUI server is 7821 , 20 ports higher than the default port for SwarmUI.
After entering the URL, choose Connect to validate and establish a connection. The
ComfyUI server must be accessible from the SillyTavern host machine.
Workflow Management
Select a ComfyUI workflow from the dropdown menu. Two default workflows are provided:
Default_Comfy_Workflow.json: A basic text-to-image workflow supporting the most
common image generation settings.
Char_Avatar_Comfy_Workflow.json: A sample image-to-image workflow that uses the
character avatar, plus the prompt, to generate an image.
Use the following buttons to manage your workflows:
Open workflow editor to view and modify the selected workflow.
+ Create new workflow to create a new workflow with a custom name.
Delete workflow to remove the selected workflow.
Workflow Editor
The ComfyUI Workflow Editor allows you to view and modify ComfyUI workflows for use
with SillyTavern.
The main component of the editor is a large text area where you can insert or edit your
ComfyUI workflow in JSON format.
To add a ComfyUI workflow to the editor, follow these steps:
1. Enable 'Dev Mode' in ComfyUI settings.
2. Use the 'Save (API Format)' option in ComfyUI to download the JSON data.
3. Create a new workflow in SillyTavern and open the editor.
4. Paste the downloaded JSON data into the text area.
5. Replace specific values with placeholders as needed for your use case.
Tips
You can add the API-format JSON file directly to the data/default-
user/user/workflows directory in your SillyTavern installation. This will save you
from steps 3 and 4.
Retain the original JSON file. If you need to open the workflow again in ComfyUI
to make changes, it is much more convenient to edit the original file than the
one with all the placeholders.
Placeholders
The editor provides a list of predefined placeholders that can be used in your workflow
JSON. These placeholders are replaced with dynamic values when the workflow is
executed in SillyTavern.
Placeholders marked with ✅ are present in your workflow JSON. Placeholders marked
with ❌ are not present in your workflow JSON. You can add these placeholders to your
workflow JSON as needed. You do not need to add all the placeholders, only the ones that
your workflow uses and you want to replace dynamically.
Prompts
The %prompt% and %negative_prompt% placeholders are used to insert the image
generation prompts into the workflow. These contain the final prompts generated by
SillyTavern, including the generated prompt for your chosen /sd mode, the common
prompt prefix, negative prompt, and character-specific prompt prefix.
For example, you may have tested your workflow with a prompt like "forest elf" in
ComfyUI. To use this workflow in SillyTavern, you can replace the "forest elf" prompt with
the %prompt% placeholder:
{
"class_type": "CLIPTextEncode",
"inputs": {
"clip": ["4", 1],
"text": "%prompt%"
}
}
Notice that the placeholder is wrapped in double quotes. This is important for the JSON
format, and required by SillyTavern's placeholder replacement system. Even for numbers,
you must use double quotes in the template JSON.
Sometimes the prompt (or other value) doesn't appear where you might expect. ComfyUI
will remove nodes from the API version of the workflow if they are not necessary for the
workflow to function in API mode.
For instance, this workflow uses a LoRA tag loader node with a prompt primitive so the
workflow is clearer in UI mode:
{
"inputs": {
"text": "%prompt%",
"model": ["112", 0],
"clip": ["112", 1]
},
"class_type": "LoraTagLoader",
"_meta": {"title": "Load LoRA Tag"}
}
In some cases you may need to make several replacements in the workflow JSON, even if
the prompt appears only once in the UI.
Model
The %model% placeholder will insert the value of the selected model in the image
generation settings.
An example from the default text-to-image workflow:
{
"class_type": "CheckpointLoaderSimple",
"inputs": {
"ckpt_name": "%model%"
}
}
To load GGUF-quantized UNets, use a UNet Loader (GGUF) node in your workflow, choose
a GGUF model in the SillyTavern model dropdown, and use the %model% placeholder in
the node's settings like this:
{
"inputs": {
"unet_name": "%model%"
},
"class_type": "UnetLoaderGGUF",
"_meta": {
"title": "Unet Loader (GGUF)"
}
}
If you have model types other than the usual SD checkpoints in ComfyUI
Stable Diffusion checkpoints, SD UNets, and GGUF-quantized UNets all appear
in the Model dropdown. Models of one type will not work with workflows/loader
nodes expecting another type. If you choose an incompatible model type in ST,
ComfyUI will report a problem with the loader node.
Avatar images
Use the %user_avatar% and %char_avatar% placeholders to include the user and
character avatars in the workflow. These placeholders are replaced with the PNG data of
the avatars when the workflow is executed. The image data is encoded in base64 format,
so you must decode it in your workflow. A popular choice for this task is the Load image
(Base64) node.
In this example, the character avatar is loaded with a Load Image (Base64) node. It also
uses an Image Resize node to rescale the image to whatever size is specified in the image
generation settings:
Load image from base64 string and resize
Insert the %char_avatar% , %width% , and %height% placeholders into the JSON for the
Load Image (Base64) and Image Resize nodes:
{
"97": {
"inputs": {
"image": "%char_avatar%"
},
"class_type": "ETN_LoadImageBase64",
"_meta": {"title": "Load Image (Base64)"}
},
"98": {
"inputs": {
"mode": "resize",
"resize_width": "%width%",
"resize_height": "%height%",
"image": ["97", 0]
},
"class_type": "Image Resize",
"_meta": {"title": "Resize image"}
}
}
To get a base64-encoded image string for testing your workflow in ComfyUI, use any
online tool that converts images to base64 strings. Here's an example string you can use
for initial testing: sd-comfy-base64-test-string.txt.
Other placeholders
Most other placeholders use the values of the corresponding controls in image generation
settings, or the values that you specify with the /sd command:
%vae% , but most SD models include a VAE so the default workflows do not use this
placeholder. Use it with custom workflows to load a VAE alongside a UNet, override
the default VAE, etc.
%sampler%
%scheduler%
%steps%
%scale%
%width%
%height%
%denoise% : for the sample image-to-image workflow, vary the denoise amount
between about 0.5 (barely-noticeable changes to the source image) and 1.0 (a
completely different image as if no source image was used). Not used by the default
text-to-image workflow because there's no point using a value other than 1.0 for text-
to-image.
%clip_skip% : not used by the default workflows but available for custom workflows.
The %seed% placeholder will insert the seed value from the control if you have specified
one. If you set the seed to -1 , SillyTavern will generate a new random seed for each
image and insert it into %seed% .
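As a sketch of how the sampler-related placeholders might be wired up, here is a KSampler node from a hypothetical API-format workflow. The node references ( ["4", 0] and so on) stand in for whatever node IDs your own workflow uses, and every placeholder is quoted even though the values are numeric; for a plain text-to-image workflow you would typically hard-code denoise to 1 instead of using %denoise% :
{
  "class_type": "KSampler",
  "inputs": {
    "seed": "%seed%",
    "steps": "%steps%",
    "cfg": "%scale%",
    "sampler_name": "%sampler%",
    "scheduler": "%scheduler%",
    "denoise": "%denoise%",
    "model": ["4", 0],
    "positive": ["6", 0],
    "negative": ["7", 0],
    "latent_image": ["5", 0]
  }
}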
Custom placeholders
You can add custom placeholders to your workflow:
1. Look for the "Custom" section below the predefined placeholders.
2. Click the "+" button to add a new custom placeholder.
3. Enter a name for the placeholder in the find field.
4. Enter the value that you want to replace the placeholder with in the replace field.
Custom placeholders will appear in a separate list below the predefined ones.
For example, you could replace the "SillyTavern" prefix for saved image filenames in the
default workflow with a custom placeholder. Add a new custom placeholder with find
set to filename_prefix and replace set to ServiceTesnor . Insert the new
%filename_prefix% placeholder into your workflow JSON. Now you can change the
filename prefix from SillyTavern to ServiceTesnor by changing the value of the custom
placeholder.
JSON with placeholder (in the original JSON, "filename_prefix" is simply "SillyTavern"):
{
  "class_type": "SaveImage",
  "inputs": {
    "filename_prefix": "%filename_prefix%",
    "images": ["8", 0]
  }
}
Comfy tricks
Read all the general information on this page so you're familiar with the image generation
options. Options such as switchable styles and common prompt prefixes, when combined
with the total flexibility of ComfyUI workflows, allow you to create a wide variety of image
generation setups.
Loading LoRAs
Use a LoRA tag loader node (such as Load LoRA Tag) to load any LoRAs specified in the
prompt. Now you can add as many LoRAs as you like to your prompt with tags like
<lora:CroissantStyle:0.8> , and they will be loaded into your workflow. This will also
make the "pro-tip" of using LoRAs in character-specific prompt prefixes work with
ComfyUI.
Setting workflow values from styles or slash-commands
You can use macros in custom placeholder values. As a practical example, let's say you
sometimes want to generate images without a background, and you'd like this to be
switchable with a slash-command or image style. Here's how you could do it:
1. Make a ComfyUI workflow that removes the image background, or not, depending on
the value of an input
2. Use a custom placeholder to set the value of that input, but use
{{getvar::remove_background}} as the replace value
3. Now you can set the value of remove_background with /setvar
key=remove_background true or /setvar key=remove_background false before
generating an image
4. The workflow will use the value you set to determine whether to remove the
background
5. Make an image style "No background" with common prompt prefix
{{setvar::remove_background::true}}
6. Use the style control or /imagine-style No background to set the value of
remove_background to true before generating an image
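Putting the pieces together, a one-off run of the slash-command route (step 3 above) might look like the chain below; the prompt text is just an illustration, and the variable name matches the example:
/setvar key=remove_background true | /sd full body shot of {{char}} waving
Afterwards, /setvar key=remove_background false switches the workflow back to keeping the background.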
Live2D
This guide will walk you through the process of setting up and customizing the Live2D
extension for your SillyTavern experience. This extension allows you to use Live2D
animated models for your character, providing a dynamic and interactive element to your
virtual character.
Prerequisites
Before you begin, ensure you've met the following prerequisites:
1. Branch Selection: Make sure you're using the latest version of SillyTavern to access
the latest features and updates.
2. Extension Installation: Install the "Live2D" extension from the "Download Extensions
& Assets" menu in the Extensions panel (represented by the stacked blocks icon).
3. Model Folder Placement: Place your Live2D model folders into the /data/<user-
handle>/assets/live2d directory. A properly organized live2d assets folder might
look like this:
A Live2D model folder should include all necessary components for the Live2D
model, such as expressions, motions, textures, sounds, and settings files. Notably
the ***.model.json file must be at the root of the Live2D model folder for the
model to be detected by the extension. In this example the shizuku live2d model
folder may look like this:
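The screenshots are not reproduced here, but as a rough sketch (the file names are illustrative and vary between models), the layout could be something like:
/data/<user-handle>/assets/live2d/
  shizuku/
    shizuku.model.json   (settings file - must be at the root of the model folder)
    shizuku.moc
    shizuku.physics.json
    expressions/
    motions/
    sounds/
    shizuku.1024/        (textures)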
Note: Models can also be placed in character-specific folders, such as
/data/<user-handle>/characters/Shizuku/live2d/ . However, models in character
folders will only be accessible for that specific character.
Extension Settings
The Live2D extension offers various settings to customize the behavior of your animated
model. Here are the key settings:
UI global settings
Global Settings
1. Enabled:
Enable this checkbox to activate the extension, allowing your Live2D model to
interact within SillyTavern.
You can disable the extension if you want to use normal sprites only.
You can disable the extension when you want to move normal sprites in a group
chat and enable it again when you're ready to use Live2D models.
2. Follow Cursor:
Enable this checkbox to make the Live2D model follow your cursor, provided that
the model supports this feature.
3. Auto-send Interaction:
Enable this checkbox to automatically trigger character interactions when you
click on areas with mapped messages (refer to the hit areas section for details).
Debug Settings
These settings help you control the behavior and visibility of your Live2D model for
debugging purposes.
1. Reset Model Before Animation:
Enable this checkbox to reload the model before any animation. This forces the
animation to start and allows you to spam clicks if necessary. Some models may
require this to ensure that animations begin from a compatible state.
2. Show Model Frames:
Enable this checkbox to display the model frame, making it easier to identify
where to click to drag the model around. It also shows the hit area, if available.
Hovering over a hit area will show its name.
3. Reload button
Click this button to reload every live2d model. Use it in case something glitches.
Character Selection
These settings allow you to manage characters and assign Live2D models to them.
1. Refresh Button:
Click the refresh button to update the list of characters in the current chat.
2. Select Character:
Use the drop-down list to choose a character to assign a Live2D model to.
3. Remove Button:
Click this button to delete all assigned models for a character. A confirmation
prompt will appear to confirm the deletion.
Model Selection
UI model list
1. Refresh Button:
Click the refresh button if your Live2D model does not appear in the list.
2. Select Model:
Choose a model from the list to assign it to the selected character.
The model can be located in the asset folder or the current character's folder.
The list displays the model folder name, its origin (asset or character), and the
name of the detected model setting file.
Note that some model folders may contain different versions of the same model.
You can try different model files to see which one works best.
Selecting none will use normal sprites, if there are any
Settings are saved per character and model
Model Settings
UI model settings
1. Model Scale:
Use the slider to adjust the size of the model, making it larger or smaller.
2. Model Center X Offset:
Use the slider to change the horizontal position of the model relative to the
window center.
3. Model Center Y Offset:
Use the slider to adjust the vertical position of the model relative to the window
center.
Remarks
The settings are saved and carry over different chats.
You can also drag the model with your mouse, and those settings will be updated and
saved.
Use these UI settings to bring your model back on the screen if you somehow made it
out of view. Also, check the "Show frame" checkbox to see clearly where you can click
to drag the model.
Model Talk
UI model talk
1. Param mouth open Y id
Select from the list the ID of the parameter corresponding to the model's mouth Y
value. Not all models have one, and names may vary from model to model.
Usually something like "PARAM_MOUTH_OPEN_Y" or "ParamMouthOpenY". Check
the model when selecting an element from the list; it will try to run the speak
animation. If the mouth moves, you got it!
2. Mouth movement speed
Adjust the slider to change the movement speed of the mouth animation.
3. Time per character
Set the time duration of each character. The duration of the talk animation will be
this time multiplied by the number of characters of the message.
Remarks
This mouth animation does not work on every model and every animation. Even if your
model has animations where the mouth moves, it does not mean the mouth animation
can be controlled by this extension. If nothing shows in the parameter list, your model
was probably made with a version of Live2D that is too old to access the parameters properly.
Model Animations
UI model animations
1. Starter animation
Select an expression and motion from the lists that will play when starting a chat
with the character. You can also add a delay during which the model will be
invisible if you need to hide the character for some time to achieve a perfect
effect.
2. Default animation
Select an expression and motion from the list that will play when the character
sends a message. Use a fallback animation when using the classify expression
extension.
Remarks
Animations will play when you select one in the lists.
Use the replay button to replay the selected animation.
Some models have expressions defined as motions.
If nothing shows in the lists, your model's settings file probably has no
expressions/motions defined.
Hit areas mapping
UI model mapping
1. Default click animation
Select an expression and motion from the list that will play when you click on the
model. You can also set a message that will be sent as a user message.
2. Hit areas
If the model has hit areas, they will be listed, and you can assign an
animation/message to each of them.
Remarks
Some models have no hit areas, but the default click is detected for all.
The default click will trigger if you click on a hit area with nothing mapped or if you
click outside of any hit area.
Hit areas have priority defined in the model; for example, "mouth" is inside "head." If it
does not behave properly, it may be due to the model file.
For some models, animations need to be finished before starting another one. Use the
debug checkbox if you want to force the refresh and spam animations.
Classified Expressions Mapping
UI model classify
1. Requirements
Requires the use of the classify expression extension; otherwise, it will fall back to
the default animation.
2. Mapping
For each detected emotion by the classify extension, you can assign an
expression/motion animation.
Remarks
If the previous animation did not finish when a new message is received, it's possible
that the new animation will not play. This behavior is dependent on the Live2D model.
Use the debug checkbox if you want to force the animation to play.
Thank you for following this guide! Your SillyTavern experience is now enriched with
animated and interactive Live2D models.
Objective
What is it?
The Objective extension lets the user specify an Objective for the AI to strive towards
during the chat. This objective is broken down into step-by-step tasks. Tasks may be
branched, where child tasks can be created automatically or manually. This gives the
ability to create complex task trees. The completion status of each task in the list will be
checked at certain intervals.
This differs from adding static direction through prompting in that it adds sequential and
paced directives for the AI to follow without user intervention. It gives a more genuine
experience of the AI autonomously striving to reach a goal.
Prerequisites
Before you begin, ensure you've met the following prerequisites:
Make sure you're on the latest version of SillyTavern.
Install the "Objective" extension from the "Download Extensions & Assets" menu in the
Extensions panel (stacked blocks icon).
Common Use Cases
Your imagination is the limit: you can give the AI any objective you wish, and it will plan out
how to achieve it. You can ask it to plan how to slay a demon, rob a temple, throw a lavish
party, or even take over the world.
Objective Settings Panel
Configuration
The extension is found in the Extensions menu under Objective.
Type an objective into the top text box, then click on Auto-Generate Tasks . This
sends a request to the connected API and asks it to provide a list of tasks which
match the objective you have typed in.
Note: Clicking Auto-Generate Tasks will delete all existing tasks for the currently selected
Objective before adding new ones.
Upon receiving the response from the AI, a list of tasks will be created automatically
in the space below the Objective input box. Tasks can be edited after creation.
At the bottom of the panel are two boxes: Position in Chat and Task Check
Frequency
Position in Chat - This is how 'deep' in the chat section of the prompt you want
the current task to be inserted. The lower the number, the more attention the AI
will give to the task. Setting to 0 will make the task the primary thing in the AI's
mind. Setting at high values will put the task in the background and allow the AI to
focus on the conversation at hand, but setting it too high may cause the AI to
never 'get around' to the task at all.
Task Check Frequency - This is how often you want the AI to check if the task
has been completed. If it is set to 3 , the AI will be asked if the current task has
been completed every 3rd message.
Objectives, tasks, and their descriptions are saved in real-time to the current chat
session. Custom prompts are saved globally.
Custom Prompts
You can customize the prompts sent to the LLM to generate tasks, check task completion,
and for prompt injection. Editing prompts will save them for the current session. Custom
prompts can be saved and loaded for persistence.
Click Edit Prompts to open the prompt editor window. You can edit your prompts as
desired.
To save prompts, enter a name and click Save Prompt.
To load prompts, select the prompt from the dropdown list.
To delete a saved prompt, select it from the dropdown list and click Delete Prompt
WARNING: Task Checking happens in a separate API request. Setting Task Check
Frequency to 1 will double your API calls to the LLM service. Be careful with this if you
are using a paid service.
Usage
By default the Objective extension will keep track of all tasks and their respective
completion status automatically.
The User can also manually create, update, delete, and complete tasks at any time.
Current Task Selection
The current task will always be the first listed incomplete task. Any manual updates to
tasks will trigger a check for what the current task should be. So if you add a task above a
bunch of completed tasks, it will be set as the current task. Once it's completed,
previously completed tasks will be skipped and the next incomplete task will be selected
as 'Current'.
When using parent/child tasks in a task tree, tasks are selected depth-first, meaning all
child tasks will be selected in order first, then continue down the list of tasks for the
current Objective/Task.
Branch Tasks
Click the Branch Task button to set the current task as an Objective where you can auto
generate or manually create tasks as child tasks. You can continue to turn any child task
into an Objective and keep generating to your heart's content.
Marking a parent task as complete will cause the extension to skip all subtasks. When all
child tasks are complete, the parent task will be marked as complete.
Manually Complete Tasks
You can manually toggle the completion status of a task by clicking the checkbox next
to it. This will set the next incomplete task to be selected.
Manual Task Check
If you want to manually trigger the AI to check for task completion, click on the Extras
Extension button (the magic wand on the right side of the chat input bar) and select
Manual Task Check .
Manual Task Check
Manually Add Tasks
When no tasks are present, an Add Task button is visible, allowing you to manually create
the first task.
If other tasks are already present, click the + button to the right of any task to insert a
new task after it.
Delete Tasks
Click the red x to delete an existing task. The next incomplete task will be selected as
the current task automatically.
Deleting a task with child tasks will delete all child tasks and their descendants.
Hiding Tasks
If you want to remain unaware of what tasks the AI is attempting to complete, check the
Hide Tasks box to hide the task list and make the AI's intentions a mystery. For 100%
mysteriousness, do this before clicking Auto-Generate Tasks !
Regex
What is it?
The Regex extension lets the user automatically detect specific patterns in a string of text
(called 'sequences') and apply manipulations (replacements) to them. It can be a powerful
tool when used in conjunction with other SillyTavern features such as Quick Replies or
STscript, or simply a way to remove certain words from a chat.
Helpful Links
This document will not explain the process of writing a RegEx sequence in depth. There
are many online resources to assist you with that.
https://regexr.com
https://regex101.com
https://extendsclass.com/regex-tester.html
https://en.wikipedia.org/wiki/Regular_expression
Prerequisites
Regex is a built-in extension of SillyTavern, so no additional setup is required.
You may find its settings in the Extensions panel.
Common Use Cases
RegEx is often used to apply a find-replace function on certain words in the chat, to add
markdown styles to certain words or sentence types, or to return a boolean value to an
STscript.
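As a minimal example of the find-replace use case, a script could look like the sketch below; the script name and pattern are purely illustrative, and the field names refer to the regex script editor:
Script Name:   Fix 'teh' typo
Find Regex:    /\bteh\b/gi
Replace With:  the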
Script List
Example: /yourpattern/gi will match all instances of 'yourpattern' in the text, regardless
of case.
Some of the most common flags are:
i : case-insensitive
g : global (applies to all matches, not just the first)
s : dotAll (treats the input as a single line, so . will match newlines)
m : multi-line (treats the input as multiple lines, so ^ and $ match the start/end of
each line, not just the whole string)
u : unicode (treats the input as unicode, so \d , \w , etc. will match unicode
characters)
For more information on RegEx flags, see the following MDN page: Advanced searching
with flags
Ephemerality
By default (when neither box here is checked) a RegEx script will directly edit the text
values stored inside the chat's JSONL file. This ensures both the outgoing prompt and the
chat display will always contain the same values. However, these changes to the chat file
are irreversible.
If you do not want this to happen, you can enable either of the checkboxes here to limit
the RegEx script's effects to only the display or the outgoing prompt.
If only one of the boxes is checked, no changes will be made to the chat file, and only the
checked item will be altered. This means you will be seeing one thing while the LLM sees
another. Use this carefully.
If both are selected, the script will function as normal in all ways EXCEPT it will not write
any changes to the chat file.
Advanced Use
While RegEx is commonly used as a simple Find/Replace tool, it can also be used in more
complex ways.
For example the 'Replace With' box could include a set of CSS rules and HTML to add a
specific styled HTML element into your chat whenever a certain word is found. This will
require the Show <tags> in responses box to be unchecked in the User Settings panel.
The script can also be set to never trigger during normal use, but could instead be
triggered via slash command as part of a logic check inside an STscript. The 'Replace
With' box would include a unique value the script recognizes to indicate if a logic check is
true or false. This expands the utility of RegEx to the full capabilities of all slash
commands, allowing for truly unlimited levels of control and automation based on the
contents of the chat.
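For instance, a purely illustrative script along those lines could highlight a phrase with inline HTML and CSS whenever it appears in a message; the pattern and styling are made up for this example, and $1 refers to the captured group:
Find Regex:    /\b(critical hit)\b/gi
Replace With:  <span style="color: crimson; font-weight: bold;">$1</span>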
Retrieval-based Voice Conversion (RVC)
This guide will walk you through using RVC, a technique that allows transferring voice
features from one audio clip to another, enabling voices to speak in different tones and
styles.
Ever enjoyed those famous "Presidents Play X" videos? They were created using RVC.
With the RVC extension, you can make your SillyTavern characters speak in any voice you
desire, be it anime, movie, or even your own unique voice.
RVC is NOT TTS: it's more like speech-to-speech. It takes an audio clip as its input. In the
background, what RVC does is work in tandem with SillyTavern's TTS extension: it waits
for TTS to generate an audio file (which TTS would've done regardless of whether you use
RVC or not), then RVC will perform a second pass that takes the TTS audio file and
transforms it into the cloned voice from your RVC configuration.
RVC Setup
SillyTavern's RVC supports several API sources that perform audio conversion:
rvc-python
SillyTavern Extras (deprecated)
Common prerequisites
Before you begin, ensure you've met the following prerequisites.
ffmpeg
Make sure you have ffmpeg binary in your PATH environment variable. This tool is used to
convert incoming audio.
Windows:
Use the Toolbox in SillyTavern Launcher script to install ffmpeg automatically:
https://github.com/SillyTavern/SillyTavern-Launcher
Or download the build here: https://www.gyan.dev/ffmpeg/builds/
How to modify PATH variable: https://www.architectryan.com/2018/03/17/add-to-the-
path-on-windows-10/
To test whether you did things correctly, open a command prompt and run ffmpeg . It
should print the ffmpeg version and info.
Linux:
Install ffmpeg using your package manager.
# Debian/Ubuntu
sudo apt install ffmpeg
# Arch Linux
sudo pacman -S ffmpeg
# Fedora
sudo dnf install ffmpeg
macOS:
Install ffmpeg using Homebrew:
brew install ffmpeg
Arguments:
5050 - sets a listening port for the server. Change if you want to host on a different
port.
models_path - sets a path for models. Remove if you want to use the default
rvc_models directory.
-l - sets the server to listen on all network interfaces. Remove to only listen on
localhost.
4. Connect to the server
In the RVC extension settings, set an appropriate rvc-python API URL. By default, it
will be http://localhost:5050 .
Check the Use CUDA checkbox if you have installed rvc-python to support CUDA
acceleration.
Press "Refresh" to load a list of available voices.
5. Configure a voice map
Voice map defines voice conversion settings for every character or user persona.
To set up a voice map, choose your character or persona name from the "Character"
dropdown, then choose an RVC "Voice", then click Apply.
Optionally, you can also configure other related settings such as pitch correction or
filtering.
If you did everything correctly, the Voice Map debug area will show something like
'Betty:MyVoice(rvpme)'.
SillyTavern Extras Setup
1. Prepare RVC Model Files
In a file browser, navigate to: \SillyTavern-extras\data\models\rvc .
Create a subfolder like 'Betty' and place the .pth and .index files into it. (Hint: you
can download voice files from https://voice-models.com, make sure the voice name
says it's RVPME.)
2. Install Requirements
Install the necessary requirements using the command:
pip install -r requirements-rvc.txt
Optionally, you may wish to run RVC on your GPU if you have a capable one, by adding --
cuda to the startup command. Based on a quick test, VRAM usage was 3.4GB for
narrating 50 tokens (~36 words), and 7.6GB for 200 tokens (~150 words).
4. Set Up Voice Mapping
Create a Voice map for RVC. Set your Character to your desired SillyTavern character
name, and set Voice to the RVC folder you created at step 1, then click Apply. If you did
things correctly, the Voice Map will show something like 'Betty:MyVoice(rvpme)'.
5. Select Pitch Extraction
Choose "rmvpe" as the pitch extraction method.
If you have trouble with "rmvpe" try other methods (for example, "harvest" or
"torchcrepe").
6. (Optional) Configure RVC to save your generations to file
If for testing or troubleshooting purposes you wish to save the generated RVC audio, add
--rvc-save-file to your startup command. This will save the last generation under
SillyTavern-extras/data/tmp/rvc_output.wav :
2. Launch RVC-Launcher.bat
Open the RVC-Launcher.bat file.
Choose option 1 to install RVC.
3. Complete Installation
When prompted, install required packages and dependencies.
4. Open WebUI for Voice Training
After installation, choose option 2 to open the WebUI for voice training.
Mangio-RVC: Training a Voice Model
Dataset Preparation:
1. Prepare Audio:
Place the audio you want to train in the datasets folder.
Ensure the audio is free of background noise – only raw voice is needed.
Longer audio generally yields better output quality.
WebUI Training:
1. Access Training Tab:
Click on the training tab in the WebUI.
2. Configure Experiment:
Enter an experiment name (e.g., my-epic-voice-model ).
Set version to v2.
3. Process Data and Extract Features:
Click "Process data" and "Feature extraction".
Set "Save frequency" to 50.
4. Training Parameters:
Set "Total training epochs" to 300.
Click "Train feature index" and "Train model".
Speech Recognition
This guide will walk you through setting up speech recognition to transcribe your voice into
text within SillyTavern.
Prerequisites
Before you begin, ensure you've met the following prerequisites:
Make sure you're on the latest version of SillyTavern.
Install the "Speech Recognition" extension from the "Download Extensions & Assets"
menu in the Extensions panel (stacked blocks icon).
Have ffmpeg binary installed. See RVC setup for more details.
Speech Recognition Setup (Browser)
1. Configure SillyTavern:
Launch SillyTavern and go to Extensions > Speech Recognition.
Select "Browser" from the dropdown options.
If your browser doesn't support voice recognition, an error popup will appear.
2. Select Message Mode:
Choose the "Message Mode" you want:
Append: Your message will be appended to the current user message text
area.
Replace: Your message will replace the current user message in the text
area.
Auto send: Your message will automatically be sent once the end of speech
is detected.
3. Enable Message Mapping (Optional):
Setup phrases mapping for vocal shortcuts.
For instance, by adding "command delete = /del2", the "/del2" command will
replace your voice message when "command delete" is detected.
Useful when combined with auto send mode for full voice control. Enable this by
checking "Enable messages mapping".
4. Select Language:
Choose the language you want to speak (Note: not every browser supports all
languages).
5. Recording:
To start recording, click the microphone button to the right of the message area
next to the send button. Click again to stop recording. Recording may stop
automatically if no voice is detected.
Speech Recognition Setup (Whisper/Vosk)
1. Enable Provider:
Enable the desired speech recognition provider on the extras server using the
following command:
python server.py --enable-modules=whisper-stt
or
python server.py --enable-modules=vosk-stt
You can also use a custom model by adding the option --stt-vosk-model-path or
--stt-whisper-model-path with the path to the model.
2. Configure SillyTavern:
Launch SillyTavern and go to Extensions > Speech Recognition.
Select "Vosk" or "Whisper" from the dropdown options (whisper is more accurate).
The settings are similar to the "Browser" provider setup (except for language) see
above.
Speech Recognition Setup (Streaming)
1. Enable Provider:
Enable the streaming speech recognition module on Sillytavern-extras with the
following command:
python server.py --enable-modules=streaming-stt
2. Configure SillyTavern:
(Optional) Specify a custom Whisper model as in the Whisper setup above.
(Optional but recommended) Set up trigger words in SillyTavern. Only messages
starting with these trigger words will be sent to SillyTavern as actual messages.
This prevents random speech or noise from being transcribed. Enable this with
the checkbox. The trigger words can be included/excluded from the actual
message using a checkbox.
Other settings are similar to other providers.
You're now ready to transcribe your voice into text using speech recognition in SillyTavern.
Summarize
What is it?
This extension allows you to create, store, and utilize automatically generated summaries
based on the events happening in your chats. Summarization can help with outlining
general details of what is happening in the story, which could be interpreted as a long-
term memory, but take that statement with a grain of salt. Since the summaries are
generated by language models, the outputs may lose some important details or contain
hallucinations, so you're always advised to keep track of the summary state and correct it
manually if needed.
Common configuration
The summarization extension is installed in SillyTavern by default, thus it will show up in
ST's Extensions panel (stacked cubes icon) list like this:
Summarize Config Panel
Current summary - displays and provides an ability to modify the current summary.
The summary is updated and embedded into the chat file's metadata for the message
that was the last in context when the summary was generated. Deleting or editing a
message from the chat that has a summary attached to it will revert the state to the
last valid summary.
Restore Previous - removes the current summary, rolling it back to the previous state.
This is useful if the summarizer does a poor job at any given point.
Pause - check this to prevent the summary from being automatically updated. This is
useful if you want to provide a custom summary of your own or to effectively disable
the summary by clearing the box and stopping updates.
Popup window - allows you to detach the summary into a movable UI panel on the
sidebar. Useful for the desktop layout to easily have access to summarization settings
without having to navigate through the extensions menu.
Injection Template - defines how the summary will be wrapped when inserted
into regular chat prompts. A special {{summary}} macro should be used to denote the
exact location of the current summary state in the prompt injection text (see the
example after this list).
Injection Position - sets the location of the prompt injection. The options are the
same as for Author's Notes: before or after the main prompt, or in-chat at designated
depth.
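For example, an Injection Template could be the single line below, where the {{summary}} macro marks where the current summary text is substituted; the wording is just an illustration, not the default template:
[Summary of the story so far: {{summary}}]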
Supported summary sources
Main API
Summarization will be powered by your currently selected AI backend, model and settings.
This method requires no additional setup, just a working API connection.
This option has the following sub-modes that differ depending on how the summary
prompt is built:
1. Raw, blocking. The summary will be generated using nothing but the summarization
prompt and the chat history. Subsequent prompts will also include the previous
summary with messages that were sent after the summary was generated (see
example). This mode can (and will) generate prompts that have a lot of variability
between them, so it is not recommended to use it with backends that have slow
prompt processing times, such as llama.cpp and its derivatives.
2. Raw, non-blocking. Same as above, but the chat generation will not be blocked during
the summary generation. Not every backend supports simultaneous requests, so
switch to blocking mode if summarization fails.
3. Classic, blocking. The summarization prompt will be sent at the end of your usual
generation prompt, as a neutral system instruction, not omitting the character card,
main prompt, example dialogues and other parts of chat prompts. This usually results
in prompts that play nicely with reusing processed prompts, so it is recommended to
use with llama.cpp and its siblings.
Summary Settings explained
1. Summary Prompt - defines the prompt that will be used for creating a summary. May
include any of the known macros, as well as a special {{words}} macro (see below and
the example after this list).
2. Target summary length (words) - defines the value of the {{words}} macro that can
be inserted into the Summary Prompt. This setting is completely optional and has no
effect at all if the macro is not used.
3. API response length (tokens) - allows you to set an override API response length for
generating summaries that differs from the globally set value.
4. Max messages per request (raw modes only) - set to limit the maximum number of
messages that will be included in one summarization prompt. 0 means no explicit
limitation, but the resulting number of messages to summarize will still depend on the
maximum context size, calculated using the formula: max summary buffer = context
size - summarization prompt - previous summary - response length . Use this when
you want to get more focused summaries on models with large context sizes.
5. No WI/AN - omit World Info and Author's Note from text to be summarized. Only has
an effect when using the Classic prompt builder. The Raw prompt builder always
omits WI/AN.
6. Update every X messages - sets the interval at which the summary is generated. 0
means that the automatic summarization is disabled, but you can still trigger it
manually by clicking the "Summarize now" button. This should be adjusted based on
how quickly the prompt buffer entirely fills with chat messages. Ideally, you'd want to
have the first summary generated when the messages are starting to get dropped out
of the prompt.
7. Update every X words - same as above, but using words (not tokens!) instead of
messages, theoretically can be a more accurate measurement due to how
unpredictable the contents of chat messages usually are, but your mileage may vary.
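As an illustration of how the {{words}} macro ties the first two settings together (this is a sketch, not the built-in default prompt), a Summary Prompt might read:
Summarize the most important facts and events of the story so far in {{words}} words or fewer. Write the summary in the third person.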
If both "Update every" sliders are set to a non-zero value, then both will trigger summary
updates at their respective intervals, depending on what happens first. It is strongly
advised to update these values accordingly when you switch to another model that has
differing context sizes, otherwise, the summary generation may trigger too often, or never
at all.
If you're unsure about the interval settings, you can click the "magic wand" button above
the "Update every" sliders to try and guess the optimal values based on some simple
heuristics. A brief description of the algorithm is as follows:
1. Calculate token and word counts for all chat messages
2. Determine target summary length based on desired prompt words
3. Calculate the maximum number of messages that can fit in the prompt based on the
average message length
4. If "Max messages" is set, adjust the average to account for messages that don't fit
the summary limit
5. Round down the adjusted average messages per prompt to a multiple of 5
Example prompts
Raw prompt
System:
[Summarization prompt]
Previous summary.
User:
Message foo.
Char:
Message bar.
Classic prompt
[Main prompt]
[Character card]
[Example dialogues]
User:
Message foo.
Char:
Message bar.
System:
[Summarization prompt]
Extras API
The Extras server with the summarize module can run an auxiliary summarization model
(BART).
It has a very small context size (~1024 tokens), so its ability to handle large summaries is
quite limited.
To configure the Extras summary source, do the following:
1. Install or Update Extras to the latest version.
2. Run Extras with the summarize module enabled: python server.py --enable-
modules=summarize
TTS
SillyTavern has a wide range of TTS options. This page explains the setup and use.
What is it?
TTS is used to have a voice narrate parts of your chat.
Configuring TTS
TTS Provider Selectbox
Used to select which TTS service you want to use.
ElevenLabs - paid subscription required, highest quality voices available at present.
Silero - free, runs on your PC, quality can vary widely
System - uses your OS TTS engine, if one exists. Quality can vary widely depending
on the OS.
Edge - free, runs via Azure, generally quite fast, and voices feel natural but dry and
emotionless. Like listening to the evening news or a radio announcer. When running
with "Plugin" selected as the provider, you also need to install this server plugin,
otherwise the TTS won't work.
Coqui-TTS - free, No API Implementation at this time. High-performance Text2Speech
models (Tacotron, Tacotron2, Glow-TTS, SpeedySpeech) as well as Bark.
Novel - requires a paid NovelAI subscription, generated by NovelAI's TTS engine
RVC - free, voice cloning
Checkboxes
Enabled - turns TTS playback on/off
Auto Generation - lets TTS start playing automatically when a new message enters
the chat
Only narrate "quotes" - Limits TTS playback to only include text within "quotation
marks" . This will *include "quotes" within asterisk lines* (internal variable name
= narrate_quoted_only )
Ignore *text, even "quotes", inside asterisks* - TTS will not play any text within
*asterisks* , even "quotes" (internal variable name = narrate_dialogues_only )
having both "only narrate quotes" and "ignore asterisks" checkboxes both checked
will result in the TTS only reading "quotes" which are not in asterisks, and ignoring
everything else.
Narrate only the translated text - this will make the TTS only narrate the translated
text.
Given the example text: *Cohee approaches you with a faint "nya"* "Good evening,
senpai", she says. Here's a table showing how the text will be modified based on the
boolean states of Ignore *text, even "quotes", inside asterisks* and Only narrate
"quotes":
Sliders
These will change depending on the API you select.
(explanation coming soon)
Buttons
Apply - this must be clicked after setting a TTS API and after editing the voice map.
Available voices - loads a popup with all voices available for your selected API, and
lets you preview them with sample dialogues.
Using TTS
1. Click the "Enable" checkbox, or nothing will ever happen.
2. Click the "Auto-generation" checkbox if you want the TTS to start automatically every
time a new message arrives in chat.
3. Optionally, click the megaphone icon inside the top-right of any message to playback
on demand.
4. Click the lower right "Stop" button (found inside the wand menu) to stop any
playback.
Voice Map
You must provide a voice map for the TTS to use, otherwise, it won't know what voices
should be used for each character.
These must be in the exact format stated below:
CharacterName:TTSVoice,CharacterName2:TTSVoice2
For Coqui-TTS the format needs to include the speaker and language from the WebGUI:
CharacterName:TTSVoice[speakerid][langid] or Aqua:tts_models--multilingual--multi-
dataset--your_tts\model_file.pth[2][1]
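For example, assuming an ElevenLabs-style provider where voices are selected by name (the character and voice names here are just placeholders; check "Available voices" for what your provider actually exposes), a voice map could be:
Aqua:Rachel,Seraphina:Bella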
AllTalk TTS V2
AllTalk is a voice cloning system based on Coqui XTTS, F5-TTS, VITS, Piper and other TTS
model engines, designed to produce high-quality voice reproduction (either zero shot
voice cloning or built-in voices). In AllTalk V2, significant updates enhance functionality
and ease of use, including multiple TTS engine support, expanded customization, and
performance optimizations. For a comprehensive list of features, refer to the AllTalk Wiki
here.
XTTS with voice cloning
Installing
daswer123 made an API server that runs the XTTSv2 model on your computer and
connects to SillyTavern's TTS extension.
It's completely independent of Extras API and would use a separate environment.
Very important: Don't install the following requirements to your Extras environment or
system Python. It will break your other packages, do unnecessary downgrades, etc.
The following instruction is provided using Miniconda, but you can also do it with venv (not
covered here). Open the Anaconda command prompt and follow the instructions line by
line.
Getting the server up and running
1. Navigate to the folder you've created at step 4 of prerequisites.
cd C:\xtts
2. Create a new conda env. From now on, we'll call it xtts .
conda create -n xtts
4. Install Python 3.10 to your env. Confirm with "y" when prompted.
conda install python=3.10
6. Install PyTorch. This can take some time. The following line installs PyTorch with GPU
acceleration support (CUDA). If you want to use just the CPU inference, drop the last
part that starts with --index-url .
pip install torch torchvision torchaudio --index-url
https://download.pytorch.org/whl/cu118
7. Start the XTTS server on the default host and port: http://localhost:8020
python -m xtts_api_server
8. During your first startup, the model will be downloaded (about ~2 GB). Don't forget to
read the legal notice from Coqui AI very carefully. Lol, I'm kidding, just hit "y" again.
Connecting to SillyTavern
1. Open the extensions panel, expand the TTS menu, and pick "XTTSv2" in the provider
list.
2. Choose your text-to-speech language in the Language dropdown (I'll be sad if it's not
Polish).
3. Verify that the provider endpoint points to http://localhost:8020 and "Available voices"
shows a list of your voice samples.
4. Pick any character and set a mapping between the voice sample and the character. If
the characters list is empty, hit "Reload" a couple of times.
5. Configure the rest of the TTS settings according to your preferences.
You're all set now!
Click on the bullhorn icon in the context actions menu for any message and hear the
beautiful cloned voice emanating from your speakers. The generation takes some time
and it's not real-time even on high-end RTX GPUs.
Streaming?
It's possible to use HTTP streaming with the latest version of the XTTS server to get
chunks of generated audio as soon as they are available!
This doesn't work with RVC!
The audio will still be generated (assuming you're using the latest version of the RVC
extension) and converted, but not streamed, as RVC requires the full audio file
before initiating the conversion. Streamed RVC is still being investigated...
How to get streaming support?
1. Update SillyTavern to the latest version.
2. Update the XTTS server to the latest version.
conda activate xtts
pip install xtts-api-server --upgrade
3. Start and connect XTTS to ST as usual.
4. Enable the "Streaming" XTTS extension setting in SillyTavern.
Choppy audio?
Try increasing the "chunk size" setting.
For reference: with a chunk size of 200, RTX 3090 can produce uninterrupted audio at the
cost of slightly increased audio latency.
How to restart the TTS server?
Just do steps 1, 3 and 7 from the installation instruction.
Android??
Unlikely, it can't run apps that require PyTorch without some arcane black magic that we
don't provide support for. You can try it out at your own risk, but no support will be
provided if you face any problems.
Your best solution is to host the TTS API on your PC over the local network, just don't
forget to specify the host and port to listen on - see README.
VRM
This guide will walk you through the process of setting up and customizing the VRM
extension for your SillyTavern experience. This extension allows you to use VRM animated
models for your character, providing a dynamic and interactive element to your virtual
character.
Prerequisites
Before you begin, ensure you've met the following prerequisites:
1. Branch Selection: Make sure you're using the latest version of SillyTavern to
access the latest features and updates.
2. Extension Installation: Install the "VRM" extension from the "Download Extensions &
Assets" menu in the Extensions panel (represented by the stacked blocks icon).
3. Model Folder Placement: Place your VRM model files (.vrm) into the /data/<user-
handle>/assets/vrm/model directory and your animation files into the /data/<user-
handle>/assets/vrm/animation directory. The currently supported animation file
formats are .fbx and .bvh, as long as they are compatible with VRM models. This includes
any animation you can get from Mixamo (https://www.mixamo.com/) and any animation
you can export from tools like XR Animator
(https://github.com/ButzYung/SystemAnimatorOnline).
Extension Settings
The VRM extension offers various settings to customize the behavior of your animated
model. Here are the key settings:
UI global settings
Global Settings
1. Enabled:
Enable this checkbox to activate the extension, allowing your VRM model to
interact within SillyTavern.
You can disable the extension if you want to use normal sprites only.
2. Look at camera:
Enable this checkbox to make the VRM model eyes look at the camera.
3. Blink:
Enable this checkbox to make the VRM model eyes blink at random intervals.
Model expressions should properly define the blinking weight property; otherwise the
model can, for example, blink with its eyes already closed. If that happens, either:
correct the model if you have the .vroid file
don't use the incorrect face expression
disable blinking completely with this checkbox
4. TTS Lip sync
Enable this checkbox to have the VRM mouth movement follow the sound of your
TTS when it's played. Only works with TTS providers whose sound is played by SillyTavern
itself, like XTTS (not in streaming mode). If disabled, the mouth will be animated
according to the message text length when a new character message is received.
5. Auto-send Interaction:
Enable this checkbox to automatically trigger character interactions when you
click on areas with mapped messages (refer to the hit areas section for details).
Performances Settings
1. Body hitboxes
Enable this checkbox to activate click detection on several parts of the VRM
model. Depending on the model, the following areas can be detected:
head/chest/hands/groin/butt/legs/feet. Hitbox locations are computed each
frame and follow the body animation; disabling this option can improve
performance.
2. Use model cache
Enable this checkbox to keep VRM models in memory when switching models, which
allows you to switch back to a previous model faster. Useful if you use different models
for the same character, for example to change outfit or form. Can affect
performance.
3. Use animation cache
Enable this checkbox to keep in memory all animations played during the
session. All animations assigned to a model will also be loaded the first time the
model appears. This will increase the time it takes to load the model the first time, but
makes all animation switches instant. Can affect performance.
Debug Settings
1. Show grid
Enable this checkbox to visualize the 3d grid, model dragging box and body
hitboxes.
2. Reload button
Click this button to reload the 3d scene and clear the cache and all VRM models. Use
it if a bug occurs or if the cache starts to hurt performance.
Scene Settings
UI scene settings
1. Light Color
Set the color of the light in the 3d scene. Click on the reset button to set it back to
the default white color. Depending on your browser, you can use a color picker; for
example, you can pick the color of your background image to add more
immersion.
2. Light intensity
Set the light intensity in percent using the slider. Click on the reset button to set it
back to the default value of 100%. VRM model can react differently to light
depending on the baked shaders into the model, play with the value and see how
it goes.
UI model settings
Character Selection
These settings allow you to manage characters and assign VRM models to them.
1. Refresh Button:
Click the refresh button to update the list of characters in the current chat.
2. Select Character:
Use the drop-down list to choose a character to assign a VRM model to.
3. Remove Button:
Click this button to delete the assigned model for a character.
Model Selection
1. Refresh Button:
Click the refresh button if your VRM model does not appear in the list.
2. Select Model:
Choose a model from the list to assign it to the selected character.
The model has to be located in /data/<user-handle>/assets/vrm/model directory.
3. Reset button
Click this button to reset the model settings to their defaults. If you have animation files whose names correspond to the default values, they will be auto-mapped. See the naming mapping at the end of this README.
Model Settings
1. Model Scale:
Use the slider to adjust the size of the model, making it larger or smaller.
2. Model Center X/Y Offset:
Use those sliders to change the horizontal/vertical position of the model relative
to the window center.
3. Model X/Y Rotation
Use those sliders to change the horizontal/vertical rotation of the model relative
to the model hips.
Remarks
- The settings are saved per model, not per character, and carry over across different chats.
- If you want to use the same model for two different characters with different settings, make a copy of the .vrm file.
- You can also drag the model with your mouse, and those settings will be updated and saved. Left-click and hold to drag the model around the screen. Middle-click and hold (or shift + left-click) to rotate the model. Use the mouse wheel with the cursor on the model (or ctrl + left-click) to scale it up or down.
- Use these UI settings to bring your model back on the screen if you somehow moved it out of view. Also, check the "Show frame" checkbox to see clearly where you can click to drag the model.
UI hitboxes settings
Hitboxes mapping
- Depending on the model's bone definitions, some hitbox areas can be generated. They will be listed in this part of the UI, and you can assign an expression/animation/message to each of them that will trigger when you click the area.
UI classify settings
// Classify class
"admiration": "assets/vrm/animation/admiration",
"amusement": "assets/vrm/animation/amusement",
"anger": "assets/vrm/animation/anger",
"annoyance": "assets/vrm/animation/annoyance",
"approval": "assets/vrm/animation/approval",
"caring": "assets/vrm/animation/caring",
"confusion": "assets/vrm/animation/confusion",
"curiosity": "assets/vrm/animation/curiosity",
"desire": "assets/vrm/animation/desire",
"disappointment": "assets/vrm/animation/disappointment",
"disapproval": "assets/vrm/animation/disapproval",
"disgust": "assets/vrm/animation/disgust",
"embarrassment": "assets/vrm/animation/embarrassment",
"excitement": "assets/vrm/animation/excitement",
"fear": "assets/vrm/animation/fear",
"gratitude": "assets/vrm/animation/gratitude",
"grief": "assets/vrm/animation/grief",
"joy": "assets/vrm/animation/joy",
"love": "assets/vrm/animation/love",
"nervousness": "assets/vrm/animation/nervousness",
"neutral": "assets/vrm/animation/neutral",
"optimism": "assets/vrm/animation/optimism",
"pride": "assets/vrm/animation/pride",
"realization": "assets/vrm/animation/realization",
"relief": "assets/vrm/animation/relief",
"remorse": "assets/vrm/animation/remorse",
"sadness": "assets/vrm/animation/sadness",
"surprise": "assets/vrm/animation/surprise",
// Hitboxes
"head": "assets/vrm/animation/hitarea_head",
"chest": "assets/vrm/animation/hitarea_chest",
"groin": "assets/vrm/animation/hitarea_groin",
"butt": "assets/vrm/animation/hitarea_butt",
"leftHand": "assets/vrm/animation/hitarea_hands",
"rightHand": "assets/vrm/animation/hitarea_hands",
"leftLeg": "assets/vrm/animation/hitarea_leg",
"rightLeg": "assets/vrm/animation/hitarea_leg",
"rightFoot": "assets/vrm/animation/hitarea_foot",
"leftFoot": "assets/vrm/animation/hitarea_foot"
Thank you for following this guide! Your SillyTavern experience is now enriched with
animated and interactive 3D models.
Remarks
- The VRM models loaded by this extension are the .vrm files, not the .vroid files.
- Animation files should be VRM compatible; you can use a tool like XR Animator (https://github.com/ButzYung/SystemAnimatorOnline) to convert fbx/bvh animation files.
- You can create animation groups by using files with the same name ending with different numbers. For example, "idle1.bvh", "idle2.bvh", "idle3.bvh" will be considered as one group "idle", and when the group is selected in a mapping, a random one will be played when triggered. This can be used to add variety to animations.
- You can get curated animations from this repository: https://github.com/test157t/VRM-Animations-Pack-For-Silly-Tavern
- Nitral has some tutorial videos about how to use the extension and the animation repo: https://www.youtube.com/@nitralai
Previous Next
XTTS with voice cloning Web Search
Web Search
Adds web search results to LLM prompts.
Available sources
Selenium Plugin
Requires an official server plugin to be installed and enabled.
See SillyTavern-WebSearch-Selenium for more details.
Supports Google and DuckDuckGo engines.
Extras API
Requires a websearch module and Chrome/Firefox web browser installed on the host
machine.
Supports Google and DuckDuckGo engines.
SerpApi
Requires an API key.
Get the key here: https://serpapi.com/dashboard
SearXNG
Requires a SearXNG instance URL (either private or public). Uses HTML format for search
results.
SearXNG preferences string: obtained from SearXNG - preferences - COOKIES - Copy
preferences hash
Learn more: https://docs.searxng.org/
Tavily AI
Requires an API key.
Get the key here: https://app.tavily.com/
KoboldCpp
KoboldCpp URL must be provided in Text Completion API settings. KoboldCpp version
must be >= 1.81.1 and WebSearch module must be enabled on startup: enable Network =>
Enable WebSearch in the GUI launcher or add --websearch to the command line.
See: https://github.com/LostRuins/koboldcpp/releases/tag/v1.81.1
Serper
Requires an API key.
Get the key here: https://serper.dev/
How to use
1. Make sure you use the latest version of SillyTavern.
2. Install the extension via the "Download Extensions & Assets" menu in SillyTavern.
3. Open the "Web Search" extension settings, set your API key or connect to Extras, and
enable the extension.
4. The web search results will be added to the prompt organically as you chat. Only user
messages trigger the search.
5. To include search results more organically, wrap search queries with single backticks:
Tell me about the `latest Ryan Gosling movie`. will produce a search query
latest Ryan Gosling movie .
6. Optionally, configure the settings to your liking.
Settings
General
1. Enabled - toggles the extension on and off.
2. Sources - sets the search results source.
3. Cache Lifetime - how long (in seconds) the search results are cached for your prompt.
Default = one week.
Prompt Settings
1. Prompt Budget - sets the maximum capacity of the inserted text (in characters of
text, NOT tokens). Rule of thumb: 1 token ~ 3-4 characters, adjust according to your
model's context limits. Default = 1500 characters.
2. Insertion Template - how the result gets inserted into the prompt. Supports the usual
macro + special macro: {{query}} for search query and {{text}} for search results.
3. Injection Position - where the result goes in the prompt. The same options as for the
Author's Note: as in-chat injection or before/after system prompt.
Search Activation
1. Use function tool - uses function calling to activate search or scrape web pages. Must
use a supported Chat Completion API and be enabled in the AI Response settings.
Disables all other activation methods when engaged.
2. Use Backticks - enables search activation using words encased in single backticks.
3. Use Trigger Phrases - enables search activation using trigger phrases.
4. Regular expressions - provide a JS-flavored regex to match the user message. If the
regex matches, the search with a given query will be triggered. Search query supports
`` and $1-syntax to reference the matched group. Example: /what is happening in
(.*)/i regex for search query news in $1 will match a message containing what is
happening in New York and trigger the search with the query news in New York .
5. Trigger Phrases - add phrases that will trigger the search, one by one. It can be
anywhere in the message, and the query starts from the trigger word and spans to
"Max Words" total. To exclude a specific message from processing, it must start with
a period, e.g. .What do you think? . Priority of triggers: first by order in the textbox,
then the first one in the user message.
6. Max Words - how many words are included in the search query (including the trigger
phrase). Google has a limit of about 32 words per prompt. Default = 10 words.
Page Scraping
1. Visit Links - text will be extracted from the visited search result pages and saved to a
file attachment.
2. Visit Count - how many links will be visited and parsed for text.
3. Visit Domain Blacklist - site domains to be excluded from visiting. One per line.
4. File Header - file header template, inserted at the start of the text file, has an
additional {{query}} macro.
5. Block Header - link block template, inserted with the parsed content of every link. Use
{{link}} macro for page URL and {{text}} for page content.
6. Save Target - where to save the results of scraping. Possible options: trigger message
attachments, or chat attachments of Data Bank, or just images (if the source
supports them).
7. Include Images - attach relevant images to the chat. Requires a source that supports
images (see below).
More info
Search results from the latest query will stay included in the prompt until the next valid
query is found. If you want to ask additional questions without accidentally triggering the
search, start your message with a period.
Web Search function tool always overrides other triggers if enabled and
available.
/websearch (links=on|off snippets=on|off [query]) – performs a web search query. Use named
arguments to specify what to return - page snippets (default: on), full parsed pages
(default: off) or both.
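For instance, a quick sketch of calling the command manually and displaying the result (the query text is only an illustration):
stscript
/websearch snippets=on links=off latest SillyTavern release |
/echo {{pipe}}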
Previous Next
VRM Extras
Extras
Discontinued
The Extras project was discontinued in April 2024 and won't receive any new
updates or modules. The vast majority of modules are available natively in the
main SillyTavern application. You may still install and use it but don't expect to
get immediate support if you face any issues.
Previous Next
Web Search Extras via Colab
Instructions
Open the Official Extras Colab
Select the desired "Extra" options
select use_cpu to run Extras without requiring GPU credit
this will make Stable Diffusion slower, but everything else will run normally
Not required, but recommended: select the secure option to generate the API key to
protect your shared instance.
Click the Start button on the left (looks like a triangle 'play' button)
Wait for it to finish loading everything
Look for the trycloudflare.com link at the bottom of the output. Ignore the localhost
link, it won't work (we tried!).
It will start with the text Running on
Copy the API URL link that is listed under that line. (DO NOT copy the 'localhost' URL,
use the other one)
Start SillyTavern with extensions support: (set enableExtensions to true in your
config.yaml if necessary)
Navigate to SillyTavern's Extensions menu (click the 'stacked blocks' icon at the top of
the page).
Paste the API URL into the box at the top. (NOT the API Key box)
If you have NOT enabled the secure option, make sure the API Key box is completely
empty when using the official colab.
If you have enabled the secure option, paste the generated API key into the API Key
box.
API key will appear in the colab's console output, for example: Your API key is
fee2f3f559
Click "Connect"
Previous Next
Extras Local Installation
Extras Installation
This page contains instructions for installing SillyTavern Extras on your local device.
Discontinued
The Extras project was discontinued in April 2024 and won't receive any new
updates or modules. The vast majority of modules are available natively in the
main SillyTavern application. You may still install and use it but don't expect to
get immediate support if you face any issues.
Installation Methods
MiniConda (recommended)
This method is recommended because Conda makes a 'virtual environment' for the Extras
requirement packages to live inside, so they do not affect your system-wide Python setup.
1. Install Miniconda
(Important!) Read how to use Conda
2. Install git
(Chads who installed SillyTavern with git to begin with can skip this step!)
After you have both of them installed...
Type/paste the commands below ONE BY ONE IN THE CONDA COMMAND PROMPT WINDOW
and hit Enter after each one.
3. Create a new Conda environment (let's call it extras ):
conda create -n extras
8. Install Extras' requirements by using one of the following commands (will take time,
again):
pip install -r requirements.txt - for basic features
pip install -r requirements-rvc.txt - for real-time voice cloning
pip install -r requirements-coqui.txt - for Coqui TTS (not recommended)
See the Common Problems page if you get errors at this step!
9. See below 'Running Extras After Install'
System-Wide Installation
This is easier, but will affect your system-wide Python installation.
This can cause conflicts if you work with many Python programs that have different
requirements.
If this is your first time touching anything Python-related, that should not be a problem.
1. Install Python 3.11: https://www.python.org/downloads/release/python-3115/
2. Install git: https://git-scm.com/downloads
3. Open a command prompt window and go to a folder in which you have complete
access permissions.
4. Clone the repo: git clone https://github.com/SillyTavern/SillyTavern-extras , hit
Enter.
5. After the clone has finished, type cd SillyTavern-extras , hit Enter.
6. Type python -m pip install -r requirements.txt
7. See below 'Running Extras After Install'
This would enable Image Captioning, Chat Summary, and live updating Character
Expressions.
Below is a table that describes each module.
Name Description
caption Image captioning
summarize Text summarization
classify Text sentiment classification
sd Stable Diffusion image generation
silero-tts Silero TTS server
edge-tts Microsoft Edge TTS client
chromadb Vector storage server
coqui-tts Coqui TTS
rvc Real-time voice cloning
Decide which modules you want to add to your Python command line.
They will be used in the next step.
NOTE: There must be no spaces at all in your Python command's module list!
7. Replace the placeholder folder path with your actual Extras install folder path.
8. Replace the python command line with your actual command line
9. Save the file with a new name STExtras.bat (Use File >> Save As in most text
editors)
You can now simply double-click on this .bat file to easily start Extras.
If you ever want to change the module list (or any other command line modifiers for the
extras server), simply edit the python command inside the .bat file.
Previous Next
Extras via Colab Common Problems
Make sure the webui-user.bat that you start Stable Diffusion with contains the --api command line option in the COMMANDLINE_ARGS variable.
Find and replace that line in your "webui-user.bat": set COMMANDLINE_ARGS=--api
How it should look
If the API mode is disabled for SD Web UI, the Extras server won't be able to make a
connection and you won't be able to generate images!
Still doesn't work?
Ensure that you start everything in the proper order, waiting for every program to finish
loading before proceeding to the next step:
1. Stable Diffusion Web UI
2. SillyTavern Extras
3. SillyTavern
The Extras server can't reconnect to the Stable Diffusion API if Stable Diffusion was loaded after it.
hnswlib wheel building error when installing ChromaDB
ERROR: Could not build wheels for hnswlib, which is required to install pyproject.toml-
based projects
Before installing the ChromaDB module you must first do one of the following:
Install Visual C++ build tools: https://visualstudio.microsoft.com/visual-cpp-build-
tools/
Install the hnswlib package with conda: conda install -c conda-forge hnswlib
Mac does not support CUDA, so torch packages should be installed without CUDA
support.
Install the requirements using the requirements-silicon.txt file instead.
Missing modules?
You must specify a list of module names in your Python command line, with the --enable-modules modifier.
See Modules section.
Previous Next
Local Installation Smart Context
Smart Context
THIS EXTENSION IS NO LONGER
MAINTAINED AND NOT RECOMMENDED TO
USE. CONSIDER CHAT VECTORIZATION AS
A POSSIBLE ALTERNATIVE.
Disclaimer
The use of this extension does not guarantee a better chatting experience or
improved memory of any sort. Only use if you understand all the implications of
vector database utilization.
What is it?
Smart Context is a SillyTavern extension that uses the ChromaDB library to give your AI
characters access to information that exists outside the normal chat history context limit.
How is that useful?
If you have a very long chat, the majority of the contents are outside the usual context
window and thus unavailable to the AI when it comes to writing a response.
Smart Context automatically takes the entire history of the chat file and puts it into a
vector database. This database is then searched each time you input something new into
the chat, and if messages with matching keywords are found, those chat messages are
placed into the context so the AI can see them when writing its next reply.
Setup Instructions
1. Update SillyTavern to at least version 1.10.6.
2. Install the "Smart Context" extension from the "Download Extensions & Assets" menu
in the Extensions panel (stacked blocks icon).
3. Install or Update Extras to the latest version. Alternatively, use the Colab notebook.
4. Local installs only: Install requirements-complete.txt for Extras (even if you did it once
before in a prior install).
5. Run Extras with the chromadb module enabled: python server.py --enable-
modules=chromadb
Configuration
Once Smart Context is enabled, you should configure it in the SillyTavern UI. Smart
Context configuration can be done from within the Extensions menu
Smart Context Config Panel
There are 4 main concepts to be aware of:
Chat History Preservation
Memory Injection Amount
Individual Memory Length
Injection Strategy
This database 'memory' is 103 characters long, so you would need to set the slider to at
least 103 in order to pull it entirely into the context.
If the slider is less than 103, the message would be cut off and injected like that.
Injection Strategy
Replace oldest history
This strategy keeps X recent messages, removes all messages before that, and replaces them with 'memories'.
Advantage
less likely to overflow your context limit
memories existing near the top of the context will have less immediate impact on the
response while still providing 'background information'.
Disadvantage
old messages are inserted directly into the chat history with no special demarcation,
and usually have no immediate natural relevance to the preserved natural chat history
messages. This can confuse less intelligent AI models.
Add to Bottom
This strategy leaves the chat history in its natural state and adds 'memories' after it inside a formatted [bracket header]. This means the 'kept messages' slider is effectively disabled.
Advantage
does not shorten or alter the current natural chat history
'memories' exist after chat and have a stronger impact on the next AI response
Disadvantage
because no chat items are being removed/replaced, there is a higher chance you will
overflow your context limit.
because the memories exist very close to the end of the prompt they can have TOO
MUCH effect on the AI's response.
Custom Depth
This strategy leaves the chat history in its natural state and adds 'memories' at the depth
you determine within the template you specify. This means the 'kept messages' slider is
effectively disabled. The custom injection message should include the `` template word
which is where all queried memories will be placed.
Advantage
flexibility to experiment with memory placement
customizable introductions to memory within context
Disadvantage
because no chat items are being removed/replaced, there is a higher chance you will
overflow your context limit.
Use % Strategy
Note: This is not compatible with the 'Add to Bottom' strategy, which does not remove any
messages at all.
While using the 'Replace Oldest History' strategy, checking this box will enable the slider
for selecting a percentage of the in-context chat history to replace with SmartContext
memories. It will also disable the two sliders for manually selecting the number of
messages.
This strategy automatically calculates a percentage of the chat history to be replaced
with SmartContext memories, instead of a fixed number of messages.
Advantage
easier than manually calculating the number of messages yourself
adjusts with the available context size, applying the same percentage to small and
large prompt spaces
Disadvantage
calculations for how much history to remove can be slightly inaccurate as they are based on estimated tokens per message
it rounds the number of messages to remove to the nearest number divisible by 5 (0, 5, 10, 15, 20, etc.), so it is not as fine-grained as manual numeric selection.
FAQ
What happens to the databases when I'm done chatting? Can I save them?
For locally installed Extras servers, Smart Context saves the databases. There is no need
to save them manually in usual use cases.
For colab users, the databases are wiped when the extras server shuts down. Use the
export button to save the database as a JSON file, and import it next time you want to
use it.
Usually there is no need to save Smart Context databases.
Currently we have an Import/Export feature, which allows you to save the chat's DB and
use it again at a later date.
Can I make one big database for all of my chats to reference?
This would not be a good use of Smart Context's capabilities. We recommend using World
Info for this purpose.
Previous Next
Common Problems talkinghead
talkinghead
THE SUPPORT FOR TALKINGHEAD WAS DROPPED IN SILLYTAVERN 1.12.13.
THIS PAGE IS KEPT FOR HISTORICAL PURPOSES.
What is it?
An implementation of Talking Head Anime 3 Demo for AITuber. It possesses the following
features:
Generates random Live 2D-like motion actions from a single static image.
Lip-syncs to the sound output from any TTS output.
This extension contains the original demo programs for the Talking Head(?) Anime from a
Single Image 3: Now the Body Too project. As the name implies, the project allows you to
animate anime characters, and you only need a single image of that character to do so.
There are two demo programs:
The manual_poser lets you manipulate a character's facial expression, head rotation, body rotation, and chest expansion due to breathing through a graphical user interface, so you can save them as default expressions (i.e. happy, sad, joy, etc.).
ifacialmocap_puppeteer lets you transfer your facial motion to an anime character.
Hardware Requirements
You can use either CPU or GPU Modes (CPU is default). However, in CPU mode expect
about 1 FPS, and in GPU mode on an RTX3060 I am getting about 9-10 FPS.
The ifacialmocap_puppeteer requires an iOS device that is capable of computing blend
shape parameters from a video feed. This means that the device must be able to run iOS
11.0 or higher and must have a TrueDepth front-facing camera. (See this page for more
info.) In other words, if you have the iPhone X or something better, you should be all set.
How to use
You must launch Extras with the following modules for talkinghead to work: classify and talkinghead. classify is required for the handling of the talkinghead.png file. Additionally,
you may also use --talkinghead-gpu to load the blend models into GPU memory and
make the animations 10x faster. It is highly recommended to use GPU acceleration! By
default, once the program starts it will load a default image SillyTavern-
extras\talkinghead\tha3\images\lambda_00.png. You can verify it is working by going to
http://localhost:5100/api/talkinghead/result_feed or YOUR EXT
URL:PORT/api/talkinghead/result_feed .
Once the server has started go to the Extension API tab and connect. Then simply
select a character card to load. ( --enable-modules=classify,talkinghead --
talkinghead-gpu when starting server.py)
Now select Character Expressions. If you check the "image type talkinghead" box, the script will replace your current character expression with the result of YOUR EXT URL:PORT/api/talkinghead/result_feed. Unchecking the box SHOULD return the image back to the original expression; however, sometimes you have to send a new message to the chat to "reload" the image.
If you do not have a talkinghead.png file in the character directory, it will simply show either the default image or the last character card that had a talkinghead.png file.
The animation source image is changed when the character card is changed.
Now open Character Expressions, scroll down to the talkinghead image, and upload an image file that meets the requirements in the section below called "Constraints on Input Images".
Then check and uncheck the talkinghead box to reload the character. If the image looks odd, it is probably because it is not transparent / has no alpha layer.
Otherwise, follow the instructions and template below.
Constraints on Input Images
In order for the system to work well, the input image must obey the following constraints:
It should be of resolution 512 x 512. (If the program receives an input image of any other
size, it will resize the image to this resolution and also output at this resolution.) It must
have an alpha channel. It must contain only one humanoid character. The character
should be standing upright and facing forward. The character's hands should be below
and far from the head. The head of the character should roughly be contained in the 128 x
128 box in the middle of the top half of the image. The alpha channels of all pixels that do
not belong to the character (i.e., background pixels) must be 0.
Input Constraints
ADVANCED SECTION
Python Environment
In addition to the base feature (app.py), both manual_poser and ifacialmocap_puppeteer
are available as desktop applications. To run them, you need to set up an environment for
running programs written in the Python language. The environment needs to have the
following software packages:
Python >= 3.8
PyTorch >= 1.11.0 with CUDA support
SciPY >= 1.7.3
wxPython >= 4.1.1
Matplotlib >= 3.5.1
One way to do so is to install Anaconda and run the following commands in your shell:
standard_float: editor.pt, two_algo_face_body_rotator.pt
standard_half: editor.pt, two_algo_face_body_rotator.pt
The model files are distributed under the Creative Commons Attribution 4.0 International License, which means that you can use them for commercial purposes, but you must credit the creator: Pramook Khungurn, Talking Head(?) Anime from a Single Image 3: Now the Body Too, https://github.com/pkhungurn/talking-head-anime-3-demo.
Running the manual_poser Desktop Application
Open a shell. Change your working directory to the repository's root directory. Then, run:
conda activate extras if you have not already activated the environment.
Previous Next
Smart Context Development and Automation
STscript
STscript is a powerful scripting language based on batched chat commands that can
be approached without any prior coding knowledge.
Function Calling
Add more dynamic capabilities by letting the LLM use external sources of data or
trigger specific functionality of the extension.
UI Extensions
UI extensions run in a browser environment and expand the functionality of
SillyTavern by hooking into its events and API.
Server Plugins
Server plugins allow adding functionality such as new API endpoints by running code
in the NodeJS environment.
Internationalization (i18n)
Learn how to translate SillyTavern's UI into your language.
Hint: To see a list of all available commands, type /help slash into the chat.
As constant unnamed arguments and pipes are interchangeable, we could rewrite this
script simply as:
stscript
User input
Now let's add a little bit of interactivity to the script. We will accept the input value from
the user and display it in the notification.
stscript
1. The /input command is used to display an input box with the prompt specified in the
unnamed argument and then writes the output to the pipe.
2. Because /echo already has an unnamed argument that sets the template for the
output, we use the {{pipe}} macro to specify a place where the pipe value will be
rendered.
Example:
stscript
/popup large=on wide=on okButton="Accept" Please accept our terms and conditions....
Example:
stscript
Variables
Variables are used to store and manipulate data in scripts, using either commands or
macros. The variables could be one of the following types:
Local variables — saved to the metadata of the current chat, and unique to it.
Global variables — saved to the settings.json and exist everywhere across the app.
1. /getvar name or {{getvar::name}}— gets the value of the local variable.
2. /setvar key=name value or {{setvar::name::value}} — sets the value of the local
variable.
3. /addvar key=name increment or {{addvar::name::increment}} — adds the
increment to the value of the local variable.
4. /incvar name or {{incvar::name}} — increments a value of the local variable by 1.
5. /decvar name or {{decvar::name}} — decrements a value of the local variable by 1.
6. /getglobalvar name or {{getglobalvar::name}} — gets the value of the global
variable.
7. /setglobalvar key=name value or {{setglobalvar::name::value}} — sets the value of the global variable.
8. /addglobalvar key=name increment or {{addglobalvar::name::increment}} — adds the increment to the value of the global variable.
9. /incglobalvar name or {{incglobalvar::name}} — increments a value of the global
variable by 1.
10. /decglobalvar name or {{decglobalvar::name}} — decrements a value of the global
variable by 1.
11. /flushvar name — deletes the value of the local variable.
12. /flushglobalvar name — deletes the value of the global variable.
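The numbered explanation below refers to a script along the following lines; this is a reconstructed sketch based on that description (the prompt text is illustrative), not necessarily the exact original:
stscript
/input Enter a prompt for the image |
/setvar key=SDinput |
/echo Generating an image for: {{getvar::SDinput}} |
/getvar SDinput |
/imagine {{pipe}}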
1. The value of the user input is saved in the local variable named SDinput .
2. The getvar macro is used to display the value in the /echo command.
3. The getvar command is used to retrieve the value of the variable and pass it through
the pipe.
4. The value is passed to the /imagine command (provided by the Image Generation
plugin) to be used as its input prompt.
Since the variables are saved and not flushed between the script executions, you can
reference the variable in other scripts and via macros, and it will resolve to the same value
as during the execution of the example script. To guarantee that the value will be
discarded, add the /flushvar command to the script.
Arrays and objects
Variable values can contain JSON-serialized arrays or key-value pairs (objects).
Examples:
Array: ["apple","banana","orange"]
Object: {"fruits":["apple","banana","orange"]}
The following modifications can be applied to commands to work with these variables:
The /len command gets the number of items in the array.
The index=number/string named argument can be added to /getvar or /setvar and their global counterparts to get or set sub-values by either a zero-based index for arrays or a string key for objects.
If a numeric index is used on a nonexistent variable, the variable will be created
as an empty array [] .
If a string index is used on a nonexistent variable, the variable will be created as
an empty object {} .
/addvar and /addglobalvar commands support pushing a new value to array-typed
variables.
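A brief sketch of these modifiers in action (variable and item names are illustrative):
stscript
/setvar key=fruits index=0 apple |
/setvar key=fruits index=1 banana |
/setvar key=fruits index=2 orange |
/getvar index=1 fruits |
/echo the second item is {{pipe}} |
/addvar key=fruits cherry |
/len {{getvar::fruits}} |
/echo the list now has {{pipe}} items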
Flow control - conditionals
You can use the /if command to create conditional expressions that branch the
execution based on the defined rules.
stscript
Note that
stscript
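// reconstructed sketch of the script described below; the prompt and values are illustrative |
/input Enter your guess |
/setvar key=guess |
/if left=guess right=5 rule=eq else="/echo That is not the right answer" "/echo Correct!"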
This script evaluates the user input against a required value and displays different
messages, depending on the input value.
Arguments for /if
1. left is the first operand. Let's call it A.
2. right is the second operand. Let's call it B.
3. rule is the operation to be applied to the operands.
4. else is the optional string of subcommands to be executed if the result of boolean
comparison is false.
5. Unnamed argument is the subcommand to be executed if the result of boolean
comparison is true.
The operand values are evaluated in the following order:
1. Numeric literals
2. Local variable names
3. Global variable names
4. String literals
String values of named arguments could be escaped with quotes to allow multi-word
strings. Quotes are then discarded.
Boolean operations
Supported rules for boolean comparison are the following. An operation applied to the
operands results in either a true or false value.
1. eq (equals) => A = B
2. neq (not equals) => A != B
3. lt (less than) => A < B
4. gt (greater than) => A > B
5. lte (less than or equals) => A <= B
6. gte (greater than or equals) => A >= B
7. not (unary negation) => !A
8. in (includes substring) => A includes B, case insensitive
9. nin (not includes substring) => A not includes B, case insensitive
Subcommands
A subcommand is a string containing a list of slash commands to execute.
1. To use command batching in subcommands, the command separator character
should be escaped (see below).
2. Since macro values are evaluated when the conditional is entered, not when the subcommand is executed, a macro can be additionally escaped to delay its evaluation to the subcommand execution time.
3. The result of the subcommands execution is piped to the command after /if .
4. The /abort command interrupts the script execution when encountered.
/if commands can be used as a ternary operator. The following example will pass a "true" string to the next command if the variable a equals 5, and a "false" string otherwise.
stscript
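// a sketch of the ternary-style usage described above |
/setvar key=a 5 |
/if left=a right=5 rule=eq else="/pass false" "/pass true" |
/echo {{pipe}}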
Escape Sequences
Macros
Escaping of macros works just like before. However, with closures, you will need to escape
macros a lot less often than before. Either escape the two opening curly braces, or both
the opening and closing pair.
stscript
/echo \{\{char}} |
/echo \{\{char\}\}
Pipes
Pipes don't need to be escaped in closures (when used as command separators).
Everywhere where you want to use a literal pipe character instead of a command
separator, you need to escape it.
stscript
With the parser flag STRICT_ESCAPING you don't need to escape pipes in quoted values.
stscript
/parser-flag STRICT_ESCAPING |
/echo title="a|b" c\|d |
/echo title=a\|b c\|d |
Quotes
To use a literal quote-character inside a quoted value, the character must be escaped.
stscript
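// a sketch: the inner quotes are escaped with a backslash |
/echo title="The \"Unknown\" Hero" Greetings!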
Spaces
To use space in the value of a named argument, you either have to surround the value in quotes or escape the space character.
stscript
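// both forms pass a multi-word value to the title argument |
/echo title="My Title" hello |
/echo title=My\ Title hello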
Closure Delimiters
If you want to use the character combinations used to mark the beginning or end of a
closure, you have to escape the sequence with a single backslash.
stscript
/echo \{: |
/echo \:}
Pipe Breakers
stscript
||
To prevent the previous command's output from being automatically injected as the
unnamed argument into the next command, put double pipes between the two commands.
stscript
Closures
stscript
{: ... :}
Closures (block statements, lambdas, anonymous functions, whatever you want to call
them) are a series of commands wrapped between {: and :} , that are only evaluated
once that part of the code is executed.
Sub-Commands
Closures make using sub-commands a lot easier and get rid of the need to escape pipes
and macros.
stscript
// if without closures |
/if left=1 rule=eq right=1
else="
/echo not equal \|
/return 0
"
/echo equal \|
/return \{\{pipe}}
stscript
// if with closures |
/if left=1 rule=eq right=1
else={:
/echo not equal |
/return 0
:}
{:
/echo equal |
/return {{pipe}}
:}
Scopes
Closures have their own scope and support scoped variables. Scoped variables are
declared with /let , their values set and retrieved with /var . Another way to get a
scoped variable is the {{var::}} macro.
stscript
/let x |
/let y 2 |
/var x 1 |
/var y |
/echo x is {{var::x}} and y is {{pipe}}.
Within a closure, you have access to all variables declared within that same closure or in
one of its ancestors. You don't have access to variables declared in a closure's
descendants.
If a variable is declared with the same name as a variable that was declared in one of the
closure's ancestors, you don't have access to the ancestor variable in this closure and its
descendants.
stscript
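// the inner closure declares its own x, so the outer x stays 1 |
/let x 1 |
/run {:
/let x 2 |
/echo inner x is {{var::x}}
:} |
/echo outer x is {{var::x}}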
Named Closures
stscript
/let myClosure {:
/echo this is my closure
:} |
/:myClosure
stscript
/let myClosure {:
/echo this is my closure |
/delay 500
:} |
/times 3 {{var::myClosure}}
/: can also be used to execute Quick Replies, as it is just a shorthand for /run .
stscript
/:QrSetName.QrButtonLabel |
/run QrSetName.QrButtonLabel
Closure Arguments
Named closures can take named arguments, just like slash commands. The arguments
can have default values.
stscript
stscript
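// a named closure with an argument that has a default value |
/let greet {: name=World
/echo Hello, {{var::name}}!
:} |
/:greet |
/:greet name=Alice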
{: ... :}()
Closures can be immediately executed, meaning they will be replaced with their return
value. This is helpful in places where no explicit support for closures exists, and to shorten
some commands that would otherwise require a lot of intermediate variables.
stscript
stscript
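// the closure is executed immediately and replaced with its return value |
/echo Two plus three is {: /add 2 3 :}()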
In addition to running named closures saved inside scoped variables, the /run command
can also be used to execute closures immediately.
stscript
/run {:
/add 1 2 3 4 |
:} |
/echo |
Comments
stscript
// ... | /# ...
// this is a comment |
/echo foo |
/# this is also a comment
Block Comments
Block comments can be used to quickly comment out multiple commands at once. They
will not terminate on a pipe.
stscript
/echo foo |
/*
/echo bar |
/echo foobar |
*|
/echo foo again |
Flow Control
Loops: /while and /times
If you need to run some command in a loop until a certain condition is met, use the
/while command.
stscript
On each step of the loop it compares the value of variable A with the value of variable B, and if the condition yields true, then executes any valid slash command enclosed in quotes, otherwise exits the loop. This command doesn't write anything to the output pipe.
Arguments for /while
The set of available boolean comparisons, handling of variables, literal values, and subcommands is the same as for the /if command.
The optional guard named argument ( on by default) is used to protect against endless
loops, limiting the number of iterations to 100. To disable and allow endless loops, set
guard=off .
This example adds 1 to the value of i until it reaches 10, then outputs the resulting value
(10 in this case).
stscript
/setvar key=i 0 |
/while left=i right=10 rule=lt "/addvar key=i 1" |
/echo {{getvar::i}} |
/flushvar i
/break |
The /break command can be used to break out of a loop ( /while or /times ) or a
closure early. The unnamed argument of /break can be used to pass a value different
from the current pipe along.
/break is currently implemented in the following commands:
/times 10 {:
/echo {{timesIndex}} |
/delay 500 |
/if left={{timesIndex}} rule=gt right=3 {:
/break
:} |
:} |
stscript
/let x {: iterations=2
/if left={{var::iterations}} rule=gt right=10 {:
/break too many iterations! |
:} |
/times {{var::iterations}} {:
/delay 500 |
/echo {{timesIndex}} |
:} |
:} |
/:x iterations=30 |
/echo the final result is: {{pipe}}
stscript
/run {:
/break 1 |
/pass 2 |
:} |
/echo pipe will be one: {{pipe}} |
stscript
/let x {:
/break 1 |
/pass 2 |
:} |
/:x |
/echo pipe will be one: {{pipe}} |
Math operations
All of the following operations accept a series of numbers or variable names and
output the result to the pipe.
Invalid operations (such as division by zero), and operations that result in a NaN value
or infinity return zero.
Multiplication, addition, minimum and maximum accept an unlimited number of
arguments separated by spaces.
Subtraction, division, exponentiation, and modulo accept two arguments separated by
spaces.
Sine, cosine, natural logarithm, square root, absolute value, and rounding accept one
argument.
List of operations:
1. /add (a b c d) – performs an addition of the set of values, e.g. /add 10 i 30 j
2. /mul (a b c d) – performs a multiplication of the set of values, e.g. /mul 10 i 30 j
3. /max (a b c d) – returns a maximum from the set of values, e.g. /max 1 0 4 k
4. /min (a b c d) – return a minimum from the set of values, e.g. /min 5 4 i 2
5. /sub (a b) – performs a subtraction of two values, e.g. /sub i 5
6. /div (a b) – performs a division of two values, e.g. /div 10 i
7. /mod (a b) – performs a modulo operation of two values, e.g. /mod i 2
8. /pow (a b) – performs a power operation of two values, e.g. /pow i 2
9. /sin (a) – performs a sine operation of a value, e.g. /sin i
10. /cos (a) – performs a cosine operation of a value, e.g. /cos i
11. /log (a) – performs a natural logarithm operation of a value, e.g. /log i
12. /abs (a) – performs an absolute value operation of a value, e.g. /abs -10
13. /sqrt (a) – performs a square root operation of a value, e.g. /sqrt 9
14. /round (a) – performs a rounding to the nearest integer operation of a value, e.g.
/round 3.14
15. /rand (round=round|ceil|floor from=number=0 to=number=1) – returns a random
number between from and to, e.g. /rand or /rand 10 or /rand from=5 to=10 .
Ranges are inclusive. The returned value will contain a fractional part. Use round
named argument to get an integral value, e.g. /rand round=ceil to round up,
round=floor to round down, and round=round to round to nearest.
/setvar key=input 5 |
/setvar key=i 1 |
/setvar key=product 1 |
/while left=i right=input rule=lte "/mul product i \| /setvar key=product \| /addvar key=i 1" |
/getvar product |
/echo Factorial of {{getvar::input}}: {{pipe}} |
/flushvar input |
/flushvar i |
/flushvar product
lock — can be on or off . Specifies whether a user input should be blocked while
the generation is in progress. Default: off .
stop — JSON-serialized array of strings. Adds a custom stop string (if the API
supports it) just for this generation. Default: none.
instruct (only /genraw ) — can be on or off . Allows using instruct formatting on the input prompt (if instruct mode is enabled and the API supports it). Set to off to force pure prompts. Default: on .
as (for Text Completion APIs) — can be system (default) or char . Defines how the
last prompt line will be formatted. char will use a character name, system will use
no or neutral name.
The generated text is then passed through the pipe to the next command and can be saved to a variable or displayed using the I/O capabilities:
stscript
/genraw Write a funny message from Cthulhu about taking over the world. Use emojis. |
/popup <h3>Cthulhu says:</h3><div>{{pipe}}</div>
/genraw You have been memory wiped, your name is now Lisa and you're tearing me apart.
You're tearing me apart Lisa! |
/sendas name={{char}} {{pipe}}
Temporal character
If you are not in a group chat, scripts may temporarily make a request to the currently
connected LLM as a different character.
/ask (prompt) — generates text using the provided prompt for a specified character
and including chat messages. Please note that swipes of the response from this
character will revert back to the current character.
stscript
Prompt injections
Scripts can add custom LLM prompt injections, making it essentially an equivalent of
unlimited Author's Notes.
/inject (text) — inserts any text into the normal LLM prompt for the current chat, and requires a unique identifier. Saved to chat metadata. A short sketch follows this list.
/listinjects — shows a list of all prompt injections added by scripts for the current
chat in a system message.
/flushinjects — deletes all prompt injections added by scripts for the current chat.
/note (text) — sets the Author's Note value for the current chat. Saved to chat
metadata.
/interval — sets the Author's Note insertion interval for the current chat.
/depth — sets the Author's Note insertion depth for the in-chat position.
/position — sets the Author's Note position for the current chat.
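A minimal sketch of the injection commands above (the id= argument name is an assumption; the docs only state that a unique identifier is required):
stscript
/inject id=sceneNote The tavern cellar is flooded. |
/listinjects |
/flushinjects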
The names argument of the /messages command is used to specify whether you want to include character names or not, default: on .
In an unnamed argument, it accepts a message index or range in the start-finish
format. Ranges are inclusive!
If the range is unsatisfiable, i.e. an invalid index or more messages than exist are
requested, then an empty string is returned.
Messages that are hidden from the prompt (denoted by the ghost icon) are excluded
from the output.
If you want to know the index of the latest message, use the {{lastMessageId}}
macro, and {{lastMessage}} will get you the message itself.
To calculate the start index for a range, for example, when you need to get the last N
messages, use variable subtraction. This example will get you 3 last messages in the chat:
stscript
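// reconstructed sketch: the start index is lastMessageId minus 2 |
/sub {{lastMessageId}} 2 |
/messages names=off {{pipe}}-{{lastMessageId}} |
/echo {{pipe}}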
Send messages
A script can send messages as either a user, character, persona, neutral narrator, or add
comments.
1. /send (text) — adds a message as the currently selected persona.
2. /sendas name=charname (text) — adds a message as any character, matching by
their name. name argument is required. Use the {{char}} macro to send as the
current character.
3. /sys (text) — adds a message from the neutral narrator that doesn't belong to the
user or character. The displayed name is purely cosmetic and can be customized with
the /sysname command.
4. /comment (text) — adds a hidden comment that is displayed in the chat but is not
visible to the prompt.
5. /addswipe (text) — adds a swipe to the last character message. Can't add a swipe
to the user or hidden messages.
6. /hide (message id or range) — hides one or several messages from the prompt
based on the provided message index or inclusive range in the start-finish format.
7. /unhide (message id or range) — returns one or several messages to the prompt
based on the provided message index or inclusive range in the start-finish format.
/send , /sendas , /sys , and /comment commands optionally accept a named argument
at with a zero-based numeric value (or a variable name that contains such a value) that
specifies an exact position of message insertion. By default new messages are inserted at
the end of the chat log.
This will insert a user message at the beginning of the conversation history:
stscript
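// insert a user message at the very start of the chat log |
/send at=0 Hello, this is now the first message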
Delete messages
These commands are potentially destructive and have no "undo" function. Check the
/backups/ folder if you accidentally deleted something important.
1. /cut (message id or range) — cuts one or several messages from the chat based
on the provided message index or inclusive range in the start-finish format.
2. /del (number) — deletes last N messages from the chat.
3. /delswipe (1-based swipe id) — deletes a swipe from the last character message
based on the provided 1-based swipe ID.
4. /delname (character name) — deletes all messages in the current chat that belong to
a character with the specified name.
5. /delchat — deletes the current chat.
World Info commands
World Info (also known as Lorebook) is a highly utilitarian tool for dynamically inserting
data into the prompt. See the dedicated page for more detailed explanation: World Info.
1. /getchatbook – gets the name of the chat-bound World Info file or creates a new one if it was unbound, and passes the name down the pipe.
2. /findentry file=bookName field=fieldName [text] – finds a UID of the record from
the specified file (or a variable pointing to a file name) using fuzzy matching of a field
value with the provided text (default field: key ) and passes the UID down the pipe,
e.g. /findentry file=chatLore field=key Shadowfang .
3. /getentryfield file=bookName field=field [UID] – gets a field value (default field:
content ) of the record with the UID from the specified World Info file (or a variable
pointing to a file name) and passes the value down the pipe, e.g. /getentryfield
file=chatLore field=content 123 .
4. /setentryfield file=bookName uid=UID field=field [text] – sets a field value
(default field: content ) of the record with the UID (or a variable pointing to UID) from
the specified World Info file (or a variable pointing to a file name). To set multiple
values for key fields, use a comma-delimited list as a text value, e.g. /setentryfield
file=chatLore uid=123 field=key Shadowfang,sword,weapon .
5. /createentry file=bookName key=keyValue [content text] – creates a new record in
the specified file (or a variable pointing to a file name) with the key and content (both
of these arguments are optional) and passes the UID down the pipe, e.g.
/createentry file=chatLore key=Shadowfang The sword of the king .
Logic values
0 = AND ANY
1 = NOT ALL
2 = NOT ANY
3 = AND ALL
Position values
0 = before main prompt
1 = after main prompt
2 = top of Author's Note
3 = bottom of Author's Note
4 = in-chat at depth
5 = top of example messages
6 = bottom of example messages
Role values (Position = 4 only)
0 = System
1 = User
2 = Assistant
Example 1: Read content from the chat lorebook by key
stscript
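// reconstructed sketch: find an entry by key in the chat lorebook and read its content |
/getchatbook |
/setvar key=chatLore |
/findentry file=chatLore field=key Shadowfang |
/getentryfield file=chatLore field=content {{pipe}} |
/echo {{pipe}}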
Text manipulation
There's a variety of useful text manipulation utility commands to be used in various script
scenarios.
1. /trimtokens — trims the input to the specified number of text tokens from the start
or from the end and outputs the result to the pipe.
2. /trimstart — trims the input to the start of the first complete sentence and outputs
the result to the pipe.
3. /trimend — trims the input to the end of the last complete sentence and outputs the
result to the pipe.
4. /fuzzy — performs fuzzy matching of the input text to the list of strings, outputting
the best string match to the pipe.
5. /regex name=scriptName [text] — executes a regex script from the Regex extension
for the specified text. The script must be enabled.
Arguments for /trimtokens
stscript
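// a sketch: trim the input down to 5 tokens and display the result |
/trimtokens limit=5 direction=end This is some long text that will be trimmed down |
/echo {{pipe}}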
1. direction sets the direction for trimming, which can be either start or end . Default: end .
2. limit sets the number of tokens to leave in the output. Can also specify a variable name containing the number. Required argument.
3. Unnamed argument is the input text to be trimmed.
Arguments for /fuzzy
stscript
/parser-flag
The parser accepts flags to modify its behavior. These flags can be toggled on and off at
any point in a script and all following input will be evaluated accordingly.
You can set your default flags in user settings.
Strict Escaping
stscript
/parser-flag STRICT_ESCAPING on |
Backslashes
A backslash in front of a symbol can be escaped to provide the literal backslash followed
by the functional symbol.
stscript
stscript
/echo \\|
/echo \\\|
Replace Getvar
stscript
/parser-flag REPLACE_GETVAR on |
This flag helps to avoid double-substitutions when the variable values contain text that
could be interpreted as macros. The {{var::}} macros get substituted last and no
further substitutions happen on the resulting text / variable value.
Replaces all {{getvar::}} and {{getglobalvar::}} macros with {{var::}} . Behind the
scenes, the parser will insert a series of command executors before the command with
the replaced macros:
call /let to save the current {{pipe}} to a scoped variable
call /getvar or /getglobalvar to get the variable used in the macro
call /let to save the retrieved variable to a scoped variable
call /return with the saved {{pipe}} value to restore the correct piped value for the
next command
stscript
stscript
/addvar key=clicks 1 |
/if left=clicks right=5 rule=eq else="/echo Keep going..." "/echo You did it! \|
/flushvar clicks"
Then click 5 times on the button that appeared above the chat bar. Every click increments
the variable clicks by one and displays a different message when the value equals 5
and resets the variable.
Automatic execution
Open the modal menu by clicking the ⋮ button for the created command.
In this menu you can do the following:
Edit the script in a convenient full-screen editor
Hide the button from the chat bar, making it accessible only for auto-execution.
Enable automatic execution on one or more of the following conditions:
App startup
Sending a user message to the chat
Receiving an AI message in the chat
Opening a character or group chat
Triggering a reply from a group member
Activating a World Info entry using the same Automation ID
Provide a custom tool tip for the quick reply (text displayed when hovering over the
quick reply in your UI)
Execute the script for test purposes
Commands are executed automatically only if the Quick Replies extension is enabled.
For example, you can display a message after sending five user messages by adding the
following script and setting it to auto-execute on the user message.
stscript
/addvar key=usercounter 1 |
/echo You've sent {{pipe}} messages. |
/if left=usercounter right=5 rule=gte "/echo Game over! \| /flushvar usercounter"
Debugger
A basic debugger exists inside the expanded Quick Reply editor. Set breakpoints with
/breakpoint | anywhere in your script. When executing the script from the QR editor, the
execution will be interrupted at that point, allowing you to examine the currently available
variables, pipe, command arguments, and more, and to step through the rest of the code
one by one.
stscript
/let x {: n=1
/echo n is {{var::n}} |
/mul n n |
:} |
/breakpoint |
/:x n=3 |
/echo result is {{pipe}} |
Calling procedures
A /run command can call scripts defined in the Quick Replies by their label, basically providing the ability to define procedures and return results from them. This allows you to have reusable script blocks that other scripts can reference. The last result from the procedure's pipe is passed to the next command after it.
stscript
/run ScriptLabel
Label:
GetRandom
Command:
stscript
/pass {{roll:d100}}
Label:
GetMessage
Command:
stscript
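// reconstructed sketch: call the GetRandom procedure and display the result |
/run GetRandom |
/echo You rolled: {{pipe}}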
Clicking on the GetMessage button will call the GetRandom procedure which will resolve
the {{roll}} macro and pass the number to the caller, displaying it to the user.
Procedures do not accept named or unnamed arguments, but can reference the same
variables as the caller.
Avoid recursion when calling procedures as it may produce the "call stack exceeded" error if handled carelessly.
Calling procedures from a different Quick Reply preset
You can call a procedure from a different quick reply preset using the a.b syntax, where
a = QR preset name and b = QR label name
stscript
/run QRpreset1.QRlabel1
By default, the system will first look for a quick reply with the label a.b , so if one of your labels is literally "QRpreset1.QRlabel1" it will try to run that. If no such label is found, it will search for a QR preset named "QRpreset1" with a QR labeled "QRlabel1".
Quick Replies management commands
Create Quick Reply
/qr-create (arguments, [message]) – creates a new Quick Reply, example: /qr-
create set=MyPreset label=MyButton /echo 123
Arguments:
label - string - text on the button, e.g., label=MyButton
set - string - name of the QR set, e.g., set=PresetName1
hidden - bool - whether the button should be hidden, e.g., hidden=true
startup - bool - auto execute on app startup, e.g., startup=true
user - bool - auto execute on user message, e.g., user=true
bot - bool - auto execute on AI message, e.g., bot=true
load - bool - auto execute on chat load, e.g., load=true
title - string - title / tooltip to be shown on button, e.g., title="My Fancy Button"
Arguments:
newlabel - string - new text for the button, e.g. newlabel=MyRenamedButton
label - string - text on the button, e.g., label=MyButton
set - string - name of the QR set, e.g., set=PresetName1
hidden - bool - whether the button should be hidden, e.g., hidden=true
startup - bool - auto execute on app startup, e.g., startup=true
user - bool - auto execute on user message, e.g., user=true
bot - bool - auto execute on AI message, e.g., bot=true
load - bool - auto execute on chat load, e.g., load=true
title - string - title / tooltip to be shown on button, e.g., title="My Fancy Button"
Arguments:
enabled - bool - enable or disable the preset
nosend - bool - disable send / insert in user input (invalid for slash commands)
before - bool - place QR before user input
slots - int - number of slots
inject - bool - inject user input automatically (if disabled use {{input}} )
/setglobalvar key=summaryPrompt Summarize the most important facts and events that have
happened in the chat given to you in the Input header. Limit the summary to 100 words or
less. Your response should include nothing but the summary. |
/setvar key=tmp |
/messages 0-{{lastMessageId}} |
/trimtokens limit=3000 direction=end |
/setvar key=s1 |
/echo Generating, please wait... |
/genraw lock=on instruct=off {{instructInput}}{{newline}}{{getglobalvar::summaryPrompt}}
{{newline}}{{newline}}{{instructInput}}{{newline}}{{getvar::s1}}{{newline}}{{newline}}
{{instructOutput}}{{newline}}The chat summary:{{newline}} |
/setvar key=tmp |
/echo Done! |
/setinput {{getvar::tmp}} |
/flushvar tmp |
/flushvar s1
Buttons popup usage
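The original example for this section is not preserved in this extract; below is a minimal illustrative sketch, assuming only that /buttons shows a popup with the given labels and returns the clicked label to the pipe:
stscript
/buttons labels=["Yes","No"] Proceed with the operation? |
/echo You pressed: {{pipe}}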
Calculating a Fibonacci number
stscript
/setvar key=fib_no 5 |
/pow 5 0.5 | /setglobalvar key=SQRT5 |
/setglobalvar key=PHI 1.618033 |
/pow PHI fib_no | /div {{pipe}} SQRT5 |
/round |
/echo {{getvar::fib_no}}th Fibonacci's number is: {{pipe}}
Recursive factorial using closures
stscript
/let fact {: n=
/if left={{var::n}} rule=gt right=1
else={:
/return 1
:}
{:
/sub {{var::n}} 1 |
/:fact n={{pipe}} |
/mul {{var::n}} {{pipe}}
:}
:} |
Function Calling
Function Calling allows adding dynamic functionality to your extensions by letting the LLM
use structured data that you then can use to trigger a specific functionality of the
extension.
Attention
This feature is currently under development. Implementation details may
change.
Register a function
To register a function tool, you need to call the registerFunctionTool function from the
SillyTavern.getContext() object and pass the required parameters. Here is an example
of how to register a function tool:
SillyTavern.getContext().registerFunctionTool({
    // Internal name of the function tool. Must be unique.
    name: "myFunction",
    // Display name of the function tool. Will be shown in the UI. (Optional)
    displayName: "My Function",
    // Description of the function tool. Must describe what the function does and when to use it.
    description: "My function description. Use when you need to do something.",
    // JSON schema for the parameters of the function tool. See: https://json-schema.org/
    parameters: {
        $schema: 'http://json-schema.org/draft-04/schema#',
        type: 'object',
        properties: {
            param1: {
                type: 'string',
                description: 'Parameter 1 description',
            },
            param2: {
                type: 'string',
                description: 'Parameter 2 description',
            },
        },
        required: [
            'param1', 'param2',
        ],
    },
    // Function to call when the tool is triggered. Can be async.
    // If the result is not a string, it will be JSON-stringified.
    action: async ({ param1, param2 }) => {
        // Your function code here
        console.log(`Function called with parameters: ${param1}, ${param2}`);
        return "Function result";
    },
    // Optional function to format the toast message displayed when the function is invoked.
    // If an empty string is returned, no toast message will be displayed.
    formatMessage: ({ param1, param2 }) => {
        return `Function is called with: ${param1} and ${param2}`;
    },
    // Optional function that returns a boolean value indicating whether the tool should be registered for the current prompt.
    // If no shouldRegister function is provided, the tool will be registered for every prompt.
    shouldRegister: () => {
        return true;
    },
    // Optional flag. If set to true, the function call will be performed, but the result won't be recorded to the visible chat history.
    stealth: false,
});
Unregister a function
To deactivate a function tool, you need to call the unregisterFunctionTool function from
the SillyTavern.getContext() object and pass the name of the function tool to disable.
Here is an example of how to unregister a function tool:
SillyTavern.getContext().unregisterFunctionTool("myFunction");
UI Extensions
UI extensions expand the functionality of SillyTavern by hooking into its events and API.
You can easily create your own extensions.
Extension submissions
Want to contribute your extensions to the official repository? Contact us!
To ensure that all extensions are safe and easy to use, we have a few requirements:
1. Your extension must be open-source and have a libre license (see Choose a License).
If unsure, AGPLv3 is a good choice.
2. Extensions must be compatible with the latest release version of SillyTavern. Please
be ready to update your extension if something in the core changes.
3. Extensions must be well-documented. This includes a README file with installation
instructions, usage examples, and a list of features.
4. Extensions that have a server plugin requirement to function will not be accepted.
Examples
See live examples of simple SillyTavern extensions:
https://github.com/city-unit/st-extension-example - basic extension template.
Showcases manifest creation, local script imports, adding a settings UI panel, and
persistent extension settings usage.
Bundling
Extensions can also utilize bundling to isolate themselves from the rest of the modules and
use any dependencies from NPM, including UI frameworks like Vue, React, etc.
https://github.com/SillyTavern/Extension-WebpackTemplate - template repository of
an extension using TypeScript and Webpack (no React).
https://github.com/SillyTavern/Extension-ReactTemplate - template repository of a
barebone extension using React and Webpack.
To use relative imports from the bundle, you may need to create an import wrapper. Here's
an example for Webpack:
// define
async function importFromScript(what) {
const module = await import(/* webpackIgnore: true */'../../../../../script.js');
return module[what];
}
// use
const generateRaw = await importFromScript('generateRaw');
manifest.json
Every extension must have a folder in data/<user-handle>/extensions and have a
manifest.json file which contains metadata about the extension and a path to a JS script
file, which is the entry point of the extension.
{
"display_name": "The name of the extension",
"loading_order": 1,
"requires": [],
"optional": [],
"js": "index.js",
"css": "style.css",
"author": "Your name",
"version": "1.0.0",
"homePage": "https://github.com/your/extension",
"auto_update": true,
"i18n": {
"de-de": "i18n/de-de.json"
}
}
Scripting
Using getContext
The getContext() function in a SillyTavern global object gives you access to the
SillyTavern context, which is a collection of all the main app state objects, useful functions
and utilities.
const context = SillyTavern.getContext();
context.chat; // Chat log - MUTABLE
context.characters; // Character list
context.characterId; // Index of the current character
context.groups; // Group list
// And many more
Unless you're building a bundled extension, you can import variables and functions from
other JS files.
For example, this code snippet will generate a reply from the currently selected API in the
background:
import { generateQuietPrompt } from "../../../../script.js";
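A rough sketch of how it might be used; the exact generateQuietPrompt signature varies between SillyTavern versions, so treat the call below as an assumption and check script.js in your install:
// Assumed usage: pass a prompt string and await the generated text.
const summary = await generateQuietPrompt('Summarize the last scene in one sentence.');
console.log(summary);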
State management
When the extension needs to persist its state, it can use extensionSettings object from
the getContext() function to store and retrieve data. An extension can store any JSON-
serializable data in the settings object and must use a unique key to avoid conflicts with
other extensions.
const { extensionSettings, saveSettingsDebounced } = SillyTavern.getContext();
const MODULE_NAME = 'my_extension'; // unique key to avoid conflicts with other extensions
function getSettings() {
    extensionSettings[MODULE_NAME] ??= {}; // initialize on first use
    return extensionSettings[MODULE_NAME];
}
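After changing values, persist them with the debounced saver from the same context:
const settings = getSettings();
settings.enabled = true; // any JSON-serializable value
saveSettingsDebounced(); // schedules saving of the settings file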
Internationalization
For general information on providing translations, see the Internationalization
page.
Extensions can provide additional localized strings for use with the t , translate
functions and the data-i18n attribute in HTML templates.
See the list of supported locales here ( lang key):
https://github.com/SillyTavern/SillyTavern/blob/release/public/locales/lang.json
Direct addLocaleData call
Pass a locale code and an object with the translations to the addLocaleData function.
Overrides of existing keys are NOT allowed. If the passed locale code is not the currently
chosen locale, the data will be silently ignored.
SillyTavern.getContext().addLocaleData('fr-fr', { 'Hello': 'Bonjour' });
SillyTavern.getContext().addLocaleData('de-de', { 'Hello': 'Hallo' });
Listening to events
Use the eventSource object to subscribe to application events:
import { eventSource, event_types } from "../../../../script.js";

eventSource.on(event_types.MESSAGE_RECEIVED, handleIncomingMessage);

function handleIncomingMessage(data) {
    // Handle the incoming message
}
The doExtrasFetch() function allows you to make requests to your SillyTavern Extra
server.
For example, to call the /api/summarize endpoint:
import { getApiUrl, doExtrasFetch } from "../../extensions.js";
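A sketch of what the rest of that call might look like (the request body fields here are illustrative assumptions, not the Extras API contract):
const url = new URL(getApiUrl());
url.pathname = '/api/summarize';

// doExtrasFetch accepts a URL and fetch-style options
const result = await doExtrasFetch(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text: 'Text to summarize' }),
});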
Server Plugins
These plugins allow adding functionality that is impossible to achieve using UI extensions
alone, such as creating new API endpoints or using Node.js packages that are
unavailable in a browser environment.
Plugins are contained in the plugins directory of SillyTavern and loaded on server
startup, but only if enableServerPlugins is set to true in the config.yaml file.
Warning
Server Plugins are not sandboxed. This means they can potentially gain
access to your entire file system, or introduce a wide range of security
vulnerabilities in a way that normal UI extensions cannot. Only install server
plugins from developers you trust!
Types of plugins
Files
An executable JavaScript file with ".js" (for CommonJS modules) or ".mjs" (for ES modules)
extension containing a module that exports an init function that accepts an Express
router (created specifically for your plugin) as an argument and returns a Promise.
The module should also export an info object containing the information about the
plugin ( id , name , and description strings). This will provide the information about the
plugin to the loader.
You can register routes on the router; they will be served under the
/api/plugins/{id}/{route} path. For example, router.get('/foo') in a plugin with id
example will produce the route /api/plugins/example/foo .
A plugin could also optionally export an exit function that performs clean-up on shutting
down the server. It should have no arguments and must return a Promise.
TypeScript contract for plugin exports:
interface PluginInfo {
id: string;
name: string;
description: string;
}
interface Plugin {
init: (router: Router) => Promise<void>;
exit: () => Promise<void>;
info: PluginInfo;
}
module.exports = {
init,
exit,
info: {
id: 'example',
name: 'Example',
description: 'My cool plugin!',
},
};
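For illustration, a minimal single-file CommonJS plugin might look like this (the /probe route and its response body are illustrative):
// plugins/example.js
async function init(router) {
    // Served under /api/plugins/example/probe
    router.get('/probe', (req, res) => {
        res.json({ ok: true });
    });
}

async function exit() {
    // Clean up resources on server shutdown
}

module.exports = {
    init,
    exit,
    info: {
        id: 'example',
        name: 'Example',
        description: 'My cool plugin!',
    },
};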
Directories
You can load a plugin from a subdirectory of the plugins directory in one of the following
ways (in order of checks):
1. package.json file that contains a path to an executable file in the "main" field.
2. index.js file for CommonJS modules.
3. index.mjs file for ES modules.
The resulting file must export an init function and an info object with the same
requirements as for individual files.
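For instance, a directory plugin's package.json could point the loader at a bundled entry file (the names below are illustrative):
{
  "name": "my-st-plugin",
  "main": "dist/plugin.js"
}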
Example of a directory plugin (with index.js file):
https://github.com/SillyTavern/SillyTavern-DiscordRichPresence-Server
Bundling
It is preferable to use a bundler (such as Webpack or Browserify) that will package all of
the requirements into one file. Make sure to set "Node" as a build target.
Template repository for plugins using Webpack and TypeScript:
https://github.com/SillyTavern/Plugin-WebpackTemplate
Internationalization (i18n)
SillyTavern supports multiple languages. This guide explains how to add and manage
translations.
You're probably here because some piece of text is untranslated in your language, and it's
driving you nuts. First I'll show you how I fixed some missing translations in the Chinese
(Traditional) locale. Each was missing for a different reason, so you'll get a good idea of
how to fix your own missing translations.
In the second half, we look at
how i18n works in SillyTavern,
writing translations and code to use them,
debug functions to find missing translations,
adding a new language,
and contributing your changes.
If you're developing an extension or modifying the core code, write your HTML and
JavaScript with i18n in mind. This way your work is ready for other people to translate it
into their language.
Nobody knows 15 languages by themselves. We work together to make SillyTavern
accessible to everyone.
Everyone in the world should be able to use their own language on phones and
computers.
Generate Image
The text "Generate Image" is untranslated in the Chinese (Traditional) locale. Why?
generate-image-pre.png
Right-click on the element and inspect it. You'll see the HTML:
<!--rendered HTML-->
<div class="list-group-item flex-container flexGap5 interactable" id="sd_gen"
tabindex="0">
<div data-i18n="[title]Trigger Stable Diffusion" title="觸發 Stable Diffusion"
class="fa-solid fa-paintbrush extensionsMenuExtensionButton"></div>
<span>Generate Image</span>
</div>
Where is its data-i18n attribute? It's missing! Let's add it. We find it in the source code:
<!--public/scripts/extensions/stable-diffusion/button.html-->
<div id="sd_gen" class="list-group-item flex-container flexGap5">
<div class="fa-solid fa-paintbrush extensionsMenuExtensionButton" title="Trigger
Stable Diffusion"
data-i18n="[title]Trigger Stable Diffusion"></div>
<span>Generate Image</span>
</div>
<div id="sd_stop_gen" class="list-group-item flex-container flexGap5">
<div class="fa-solid fa-circle-stop extensionsMenuExtensionButton" title="Abort
current image generation task"
data-i18n="[title]Abort current image generation task"></div>
<span>Stop Image Generation</span>
</div>
We are in luck: the string Generate Image is already in many of the language files, including
Chinese (Traditional).
generate-image-lang.png
{
"Generate Image": "生成图片"
}
We also need an entry for "Stop Image Generation", which we can add to the JSON file just after the "Generate Image" translation:
{
"Generate Image": "生成图片",
"Stop Image Generation": "停止生成图片"
}
After some discussion with Claude, we're actually going to go with the following
translations:
Traditional Chinese: "Stop Image Generation": "終止圖片生成"
Simplified Chinese: "Stop Image Generation": "中止图像生成"
Japanese: "Stop Image Generation": "画像生成を停止"
stop-generating-post-2.png
Generate Caption
"Generate Caption" is untranslated in the Chinese (Traditional) locale. Let's fix it!
generate-image-post.png
Where is it? Inspect the element.
<!--rendered HTML-->
<div id="send_picture" class="list-group-item flex-container flexGap5 interactable"
tabindex="0">
<div class="fa-solid fa-image extensionsMenuExtensionButton"></div>
Generate Caption
</div>
Turns out that this HTML is produced by JavaScript. Let's find the source code.
// public/scripts/extensions/caption/index.js
const sendButton = $(`
<div id="send_picture" class="list-group-item flex-container flexGap5">
<div class="fa-solid fa-image extensionsMenuExtensionButton"></div>
Generate Caption
</div>`);
There are also no translations for "Generate Caption" in the Chinese (Traditional) file. Let's
add it!
{
"Generate Caption": "生成圖片說明"
}
Now we have to fix the JavaScript code. It has to use the t function to get the
translation.
// Extension-PromptInspector/index.js
import {t} from '../../../i18n.js';
We got these suggestions from Claude. Keep the strings, ignore the code. They have to be
added to the JSON files.
// 1. Simplified Chinese (zh-cn):
const enabledText = t`停止检查`;
const disabledText = t`检查提示词`;
// 2. Traditional Chinese (zh-tw):
const enabledText = t`停止檢查`;
const disabledText = t`檢查提示詞`;
// 3. Japanese (ja-jp):
const enabledText = t`検査を停止`;
const disabledText = t`プロンプトを検査`;
{
    "Stop Inspecting": "停止檢查",
    "Inspect Prompts": "檢查提示詞"
}
{
"Stop Inspecting": "検査を停止",
"Inspect Prompts": "プロンプトを検査"
}
toggle-prompt-inspection-post-tt.png
A pity about that tooltip. The problem is that the code doesn't use the t function.
launchButton.title = 'Toggle prompt inspection';
{
"Toggle prompt inspection": "切换提示词检查"
}
{
"Toggle prompt inspection": "プロンプト検査の切り替え"
}
Prompt inspector is a separate extension, so we will PR the code fixes to that repo:
https://github.com/SillyTavern/Extension-PromptInspector/pull/1
The translations will be added to the main SillyTavern repo.
https://github.com/SillyTavern/SillyTavern/pull/3198
start-inspecting-post.png
Language files
Each language has a JSON file in public/locales/ named with its language code (e.g.,
ru-ru.json ).
Translation keys are used in two ways:
1. HTML: elements marked with a data-i18n attribute. The default text in the HTML will be replaced with the translated text if available.
2. Template Strings: in the JavaScript code, using the t function:
t`Some text with ${variable}`
These strings should be translated keeping the ${0} , ${1} , etc. placeholders intact.
SillyTavern uses HTML elements with data-i18n attributes to mark translatable content.
There are several ways to use this:
1. Translating Element Text
For simple text content:
<span data-i18n="Role:">Role:</span>
{
"Role:": "Роль:"
}
This replaces the element's text content with the translation of "Role:".
2. Translating Attributes
To translate an attribute like a title or placeholder:
<a class="menu_button fa-chain fa-solid fa-fw"
title="Insert prompt"
data-i18n="[title]Insert prompt"></a>
{
"Insert prompt": "Вставить промпт"
}
The [title] prefix indicates which attribute to translate. The rest of the attribute value is
the text that will be used as a lookup key in the JSON file. It is common for coders to use
the English text as the key, but it is not required. The key can be any unique identifier.
The original English text must be present in the corresponding attribute ( title="Insert
prompt" ) though. It's used as a fallback if the translation is missing. Most notably, there is
no translation file for English.
Here is an example of using a unique identifier no_items_text as the key, rather than the
English text:
<!--suppress HtmlUnknownAttribute -->
<div class="openai_logit_bias_list" no_items_text="No items"
data-i18n="[no_items_text]openai_logit_bias_no_items"></div>
{
"openai_logit_bias_no_items": "没有相关产品"
}
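3. Translating Both Text and Attributes
The original markup for this example is not preserved in this extract, but it combines a text key and an attribute key in one data-i18n attribute, roughly like this (element and attribute values are assumed):
<a class="menu_button" title="Get your OpenRouter API token using OAuth flow. You will be redirected to openrouter.ai"
   data-i18n="Authorize;[title]Get your OpenRouter API token using OAuth flow. You will be redirected to openrouter.ai">Authorize</a>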
{
"Authorize": "Авторизоваться",
"Get your OpenRouter API token using OAuth flow. You will be redirected to
openrouter.ai": "Получите свой OpenRouter API токен используя OAuth. У вас будет открыта
вкладка openrouter.ai"
}
This translates:
The element's text content using the key "Authorize"
The title attribute using the key "Get your OpenRouter API token using OAuth flow. You
will be redirected to openrouter.ai"
Note that both the title attribute and the element's text content are provided in English
as fallbacks.
You can also translate multiple attributes:
<!--suppress HtmlUnknownAttribute -->
<textarea id="send_textarea" name="text" class="mdHotkeys"
data-i18n="[no_connection_text]Not connected to API!;[connected_text]Type a
message, or /? for help"
placeholder="Not connected to API!"
no_connection_text="Not connected to API!"
connected_text="Type a message, or /? for help"></textarea>
Variable Placeholders
Some strings contain placeholders for dynamic values using ${0} , ${1} , etc:
toastr.error(t`Could not find proxy with name '${presetName}'`);
{
"Could not find proxy with name '${0}'": "Не удалось найти прокси с названием '${0}'"
}
Keep the placeholders the same for key and translation. The system will replace ${0}
with the value of presetName , etc.
Finding missing translations
Let's say you don't just want to fix one annoying missing translation, you want to find them
all.
That's a big ambition! Even fixing one translation is worth it. But if you want to catch 'em
all, you need a tool.
SillyTavern-i18n
https://github.com/SillyTavern/SillyTavern-i18n
Tools for working with frontend localization files.
Features:
Automatically add new keys to translate from HTML files.
Prune missing keys from localization files.
Use automatic Google translation to auto-populate missing values.
Sort JSON files by keys.
Inbuilt debug functions
These are under User Settings > Debug Menu.
Get missing translations
Detects missing localization data in the current locale and dumps the data into the
browser console. If the current locale is English, searches all other locales.
The console will show a table of missing translations with:
key: The text or identifier needing translation
language: Your current language code
value: The English text to translate
Apply locale
Reapplies the currently selected locale to the page
Administration
Despite following many security best practices, the SillyTavern server is not
secure enough for public internet exposure.
NEVER HOST ANY INSTANCES TO THE OPEN INTERNET WITHOUT ENSURING
PROPER SECURITY MEASURES FIRST.
WE ARE NOT RESPONSIBLE FOR ANY DAMAGE OR LOSSES IN CASES OF
UNAUTHORIZED ACCESS DUE TO IMPROPER OR INADEQUATE SECURITY
IMPLEMENTATION.
Multi-user
To share your SillyTavern instance with others, you can create multiple user accounts.
Each user has their own settings, extensions, and data. User accounts can also be
password-protected.
Remote access
You can access your SillyTavern instance from your phone, tablet, or another
computer.
Reverse proxying
For enthusiasts, you can set up a reverse proxy to access your SillyTavern instance
from the internet.
Security checklist
This is just a recommendation. Please consult a web application security specialist
before making your ST instance live.
1. Keep your operating system and runtime software like Node.js updated. This will
ensure that your system is up-to-date with the latest security patches and fixes which
can help prevent potential vulnerabilities.
2. Use a whitelist and a network firewall. Only allow trusted IP ranges to access the
server.
3. Enable basic authentication. It acts as a "master password" before you can proceed
to the front-end app.
4. Alternatively, configure external authentication. Some known services for that are
Authelia and authentik. See more in the SSO guide.
5. Never leave admin accounts passwordless. The server will warn you on startup if
you have any unprotected admin accounts.
6. Use the discreet login setting outside of the local network. This will hide the user list
from any potential outsiders.
7. Check the access logs often. They are written to the server console and the
access.log file and provide information on incoming connections, such as IP address
and user agent.
8. Configure HTTPS. For a localhost server, you can generate and use a self-signed
certificate. Otherwise, you may need to deploy a proxying web server like Traefik or
Caddy.
Find more on secure proxying in the following guide: Reverse Proxying SillyTavern.
Configuration File
Disclaimer
This documentation may be obsolete, incomplete, or incorrect. Please refer to
the default config.yaml in your installation for the most up-to-date list of
settings.
WARNING: DO NOT EDIT THE DEFAULT CONFIG DIRECTLY. THIS WON'T HAVE
ANY POSITIVE EFFECT. EDIT ITS COPY IN THE REPOSITORY ROOT INSTEAD.
config.yaml is the main configuration file for the SillyTavern server that you can find in
the repository root directory after completing the installation. It is a YAML file that
contains various settings, such as the network settings, security settings, and backend-
specific options. The changes made to this file will take effect after restarting the
server.
New settings added to the upstream version will be automatically populated with default
values when you run npm install (specifically, the post-install.js script) after
updating the repository. You can then modify these settings as needed.
For nested settings, dot notation is used to indicate the hierarchy. For example,
protocol.ipv6: false refers to the ipv6 setting under the protocol section with a
value of false .
protocol:
ipv6: false
Environment Variables
Configuration may also be set via environment variables which will override the values in
the config.yaml file.
The environment variables should be prefixed with SILLYTAVERN_ and use uppercase
letters for the setting names. For example, the dataRoot setting can be overridden with
the SILLYTAVERN_DATAROOT environment variable.
The nested settings should be separated by underscores. For example, protocol.ipv6
can be overridden with the SILLYTAVERN_PROTOCOL_IPV6 environment variable.
If using Node.js >= 20, you can also store the environment variables in a .env file and
pass it to the server using the --env-file flag. For example, to use the .env file located
in the repository root, you can start the server with the following command:
node --env-file=.env server.js
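For example, such a .env file could contain overrides like these (values are illustrative):
SILLYTAVERN_LISTEN=true
SILLYTAVERN_PORT=8000
SILLYTAVERN_PROTOCOL_IPV6=false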
Alternatively, pass the environment variables directly via the command line:
SILLYTAVERN_LISTEN=true SILLYTAVERN_PORT=8000 node server.js
enableDownloadableTokenizers - Enable on-demand tokenizer downloads. Default: true. Permitted values: true, false.
Logging Configuration
logging.minLogLevel - Minimum log level to display in the terminal. Default: 0 (DEBUG). Permitted values: 0-3 (DEBUG = 0, INFO = 1, WARN = 2, ERROR = 3).
logging.enableAccessLog - Write server access log. Default: true. Permitted values: true, false.
Network Configuration
listen - Enable listening for incoming connections. Default: false. Permitted values: true, false.
port - Server listening port. Default: 8000. Permitted values: any valid port number (1-65535).
protocol.ipv4 - Enable listening on the IPv4 protocol. Default: true. Permitted values: true, false, auto.
SSL Configuration
Security Configuration
whitelistMode - Enable IP whitelist filtering. Default: true. Permitted values: true, false.
enableForwardedWhitelist - Check forwarded headers for whitelisted IPs. Default: true. Permitted values: true, false.
enableCorsProxy - Enable CORS proxy middleware. Default: false. Permitted values: true, false.
allowKeysExposure - Allow API keys exposure in the UI. Default: false. Permitted values: true, false.
disableCsrfProtection - Disable CSRF protection (not recommended). Default: false. Permitted values: true, false.
securityOverride - Disable startup security checks (not recommended). Default: false. Permitted values: true, false.
User Authentication
basicAuthMode - Enable basic authentication. Default: false. Permitted values: true, false.
sessionTimeout - User session timeout in seconds. Default: -1 (disabled). Permitted values: any number (-1 to disable, 0 for browser close, >0 for timeout).
autheliaAuth - Enable Authelia-based auto login. See: SSO. Default: false. Permitted values: true, false.
perUserBasicAuth - Use account credentials for basic auth. Default: false. Permitted values: true, false.
Request Proxy Configuration
requestProxy.enabled - Use a proxy for outgoing requests. Default: false. Permitted values: true, false.
requestProxy.url - Proxy server URL. Default: null. Permitted values: valid proxy URL (e.g., "socks5://username:pa…").
requestProxy.bypass - Hosts to bypass the proxy. Default: ["localhost", "127.0.0.1"]. Permitted values: array of hostnames/IPs.
AutoRun Configuration
autorun - Open browser automatically on startup. Default: true. Permitted values: true, false.
autorunHostname - Hostname used when autorun opens the browser. Default: "auto". Permitted values: "auto", any valid hostname (e.g., "localhost", "st.example.com").
autorunPortOverride - Override port for browser autorun. Default: -1 (use server port). Permitted values: -1, any valid port number.
avoidLocalhost - Avoid using 'localhost' for autorun. Default: false. Permitted values: true, false.
Performance Configuration
performance.lazyLoadCharacters - Lazy-load character data. Default: true. Permitted values: true, false.
performance.useDiskCache - Enable disk caching for character cards. Default: true. Permitted values: true, false.
performance.memoryCacheCapacity - Maximum memory cache capacity. Default: 100mb. Permitted values: human-readable size (e.g., 100mb, 1gb).
Thumbnailing Configuration
thumbnails.enabled - Enable thumbnail generation. Default: true. Permitted values: true, false.
thumbnails.quality - JPEG thumbnail quality. Default: 95. Permitted values: 0-100.
thumbnails.format - Image format for thumbnails. Default: jpg. Permitted values: jpg, png.
Backup Configuration
backups.chat.enabled - Enable automatic chat backups. Default: true. Permitted values: true, false.
backups.chat.checkIntegrity - Verify integrity of chat files. Default: true. Permitted values: true, false.
backups.common.numberOfBackups - Number of backups to keep. Default: 50. Permitted values: any positive integer.
Extensions Configuration
extensions.enabled - Enable UI extensions. Default: true.
extensions.autoUpdate - Auto-update extensions (if enabled by the extension manifest). Default: true.
extensions.models.autoDownload - Enable automatic model downloads. Default: true.
extensions.models.classification - HuggingFace model ID for classification. Default: "Cohee/distilbert-base-uncased-go-emotions-onnx".
extensions.models.captioning - HuggingFace model ID for image captioning. Default: "Xenova/vit-gpt2-image-captioning".
extensions.models.embedding - HuggingFace model ID for embeddings. Default: "Cohee/jina-embeddings-v2-base-en".
extensions.models.speechToText - HuggingFace model ID for speech-to-text. Default: "Xenova/whisper-small".
extensions.models.textToSpeech - HuggingFace model ID for text-to-speech. Default: "Xenova/speecht5_tts".
Server Plugins
enableServerPlugins - Enable server-side plugins. Default: false. Permitted values: true, false.
enableServerPluginsAutoUpdate - Attempt to automatically update server plugins on startup. Default: true. Permitted values: true, false.
OpenAI Configuration
openai.captionSystemPrompt - System message for caption completion. Default: "" (empty). Permitted values: any string.
MistralAI Configuration
mistral.enablePrefix - Enable reply prefilling (the prefix will be echoed in the response). Default: false. Permitted values: true, false.
Ollama Configuration
ollama.keepAlive - Model keep-alive duration (seconds). Default: -1. Permitted values: -1 (indefinite), 0 (immediate unload), any positive integer.
ollama.batchSize - Controls the "num_batch" (batch size) parameter of the generation request. Default: -1. Permitted values: -1 (model default), any positive integer.
Claude Configuration
IMPORTANT!
Use with caution and only when the prompt prefix is static and doesn't change
between requests. {{random}} macro, lorebooks, vectors, summaries, etc. will
likely invalidate the cache and you'll just waste money on cache misses.
Behavior may be unpredictable and no guarantees can or will be made.
See: Prompt Caching
DeepL Configuration
Multi-user mode
Multi-user mode allows several people to use one SillyTavern server. Each user has their
own settings, extensions, and data. User accounts can also be password-protected.
Configuration
To enable and use the multi-user mode, edit the config.yaml file:
# Enable multi-user mode
enableUserAccounts: true
# Enable discreet login mode: hides user list on the login screen
enableDiscreetLogin: true
1. When the user account setting is disabled, a default-user fallback admin account is
utilized for storing the user data.
2. When the discreet login setting is disabled, a list of active users is displayed on the
login screen. If enabled, a user must enter their handle manually.
You can't delete the default-user account from the user list because it is used to serve
the user data when enableUserAccounts is set to false . However, you can disable it to
hide it from the list and disallow logins.
User handles
A handle is the unique identifier of a user. It can consist only of lowercase letters,
numbers, and dashes.
A path to the user data directory assumes using the following pattern:
%DATA_ROOT%/%USER_HANDLE% .
The login screen is bypassed and not displayed when you have only one active user and it
is not password protected.
User profile
You can access an account self-management menu using an "Account" button under the
"User settings" panel in the top menu bar.
1. Display name - used on the login screen; can be changed. It does not correlate with
personas and is not visible to the AI APIs - you can still use as many personas as you
want.
2. Profile picture - used in the login screen. You can either use a custom picture, the
default persona picture (if set), or the last used persona otherwise.
3. Password - a lock icon reflects the account protection status (open lock = no
password). A password can be set, changed, or removed using the "Change
Password" button.
4. Settings Snapshots - access and review the backups of your settings.json file, with
the ability to create or restore snapshots.
5. Download Backup - download an archive of your user data folder.
6. Reset Settings - reset to factory default settings, while leaving other data (characters,
chats) intact.
Password recovery
1. A password can be recovered from a login screen. You need access to the server
console to get a one-time recovery code (consisting of 4 digits).
2. Alternatively, you can use a utility script in the SillyTavern server to reset a password
by providing the user handle.
Usage: node recover.js [account] (password)
Example: node recover.js admin SecurePassword
Remote connections
Most often this is for people who want to use SillyTavern on their mobile phones while
their PC runs the ST server within the same WiFi network.
It is also the first step for allowing remote connections from outside the local network.
You should not use port forwarding to expose your ST server to the internet.
Instead, use a VPN or a tunneling service like Cloudflare Zero Trust, ngrok, or
Tailscale. See the VPN and Tunneling guide for more information.
Disclaimer
NEVER HOST ANY INSTANCES TO THE OPEN INTERNET WITHOUT ENSURING
PROPER SECURITY MEASURES FIRST.
WE ARE NOT RESPONSIBLE FOR ANY DAMAGE OR LOSSES IN CASES OF
UNAUTHORIZED ACCESS DUE TO IMPROPER OR INADEQUATE SECURITY
IMPLEMENTATION.
If you search for config.yaml directly in the SillyTavern folder, you may find
two files.
All modifications to config.yaml in this document refer to the one in the
SillyTavern root directory (/SillyTavern/config.yaml), not
/SillyTavern/default/config.yaml .
# Listen for incoming connections
listen: true
When ST is listening for remote connections, you should see this message in the console:
SillyTavern is listening on IPv4: 0.0.0.0:8000
If unsure about your local network's address range, use the whitelist above.
2. Allows two specific devices to connect:
whitelist:
- ::1
- 127.0.0.1
- 192.168.0.2
- 192.168.0.5
The server will ask for a username and password whenever a client connects via HTTP. This
only works if remote connections ( listen: true ) are enabled.
To enable HTTP Basic Authentication, open config.yaml in the SillyTavern base directory and
search for basicAuthMode . Set basicAuthMode to true and set a username and password. Note:
config.yaml will only exist if ST has been run at least once.
basicAuthMode: true
basicAuthUser:
username: "MyUsername"
password: "MyPassword"
In perUserBasicAuth mode, the basic auth username and password are the credentials of any
valid multi-user account that has a password, and SillyTavern will log in directly to that
account. Ensure you have an account with a password before enabling perUserBasicAuth .
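A minimal config.yaml sketch for this setup (values illustrative; multi-user accounts with passwords must already exist):
enableUserAccounts: true
basicAuthMode: true
perUserBasicAuth: true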
Save the file and restart SillyTavern if it was already running. You should be prompted for
username and password when connecting to your ST. Both username and password are
transmitted in plain text. If you are concerned about this, you can serve ST via HTTPS.
Connecting to your SillyTavern instance
Getting the IP address for the ST host machine
After the whitelist has been set up, you'll need the IP of the ST-hosting device.
If the ST-hosting device is on the same wifi network, you will use the ST-host's internal wifi
IP:
For Windows: windows button > type cmd.exe in the search bar > type ipconfig in
the console, hit Enter > look for IPv4 listing.
If you (or someone else) wants to connect to your hosted ST while not being on the same
network, you will need the public IP of your ST-hosting device.
While using the ST-hosting device, access this page and look for IPv4 . This is
what you would use to connect from the remote device.
Connecting to the ST server
Whatever IP you ended up with for your situation, you will put that IP address and port
number into the remote device's web browser.
A typical address for an ST host on the same wifi network would look like:
http://192.168.0.5:8000
A console message for a browser on the same machine as the server looks like:
New connection from 127.0.0.1; User Agent: ...
A console message for a browser on a different machine on the same network as the
server might look like:
New connection from 192.168.116.187; User Agent: ...
By default, ST will search for your certificates inside the certs folder. If your files are
located elsewhere, you can use the --keyPath and --certPath arguments.
Example:
node server.js --ssl --keyPath /home/user/certificates/privkey.pem --certPath
/home/user/certificates/cert.pem
The user you're running SillyTavern with requires read permissions on the certificate files.
How to get a certificate
The simplest, quickest way to get a certificate is by using certbot.
Reverse proxying
Note
This section does not refer to OpenAI/Claude reverse proxies. This refers
exclusively to HTTP/HTTPS Reverse Proxies.
Is Termux confusing to set up? Are you tired of updating and installing ST on every device
you have? Want your chats and characters organized in one place? Well, you are in luck. This
guide covers how to host SillyTavern on your PC so you can connect from anywhere and chat
with your bots on the same PC you use to run AI models!
Warning
This guide is not meant for beginners. This will be very technical.
Fair Warning
For Windows Users
This guide is not for Windows users. We recommend using a Linux VM or WSL2
to follow this guide.
Tip
It is recommended to set your private IP to a Static IP. Refer to your router's
manual or Google to configure static IPs.
Note
Do not install Docker Desktop.
4. Follow the steps in Manage Docker as a non-root user in the Docker post-installation
guide here.
5. Go to your root folder in Linux and make a new folder named docker .
cd /
sudo mkdir docker && cd docker
6. Execute chown , replacing <USER> with your Linux username, to set the permissions on the
docker folder.
sudo chown -R <USER>:<USER> .
7. Inside the docker folder, create a folder named secrets , and inside secrets create a
folder named cloudflare .
8. Inside the docker folder, create a folder named appdata , and inside appdata create a
folder named traefik . Enter the appdata/traefik folder afterwards.
9. Create an acme.json file using touch and set its permissions to 600.
touch acme.json
chmod 600 acme.json
10. Using nano or a similar editor, create a file named traefik.yml and paste the following.
Replace the template email with your own, then save the file.
api:
  dashboard: true
  debug: true
  insecure: true
entryPoints:
  http:
    address: ":80"
    http:
      redirections:
        entryPoint:
          to: https
          scheme: https
  https:
    address: ":443"
serversTransport:
  insecureSkipVerify: true
providers:
  docker:
    endpoint: "unix:///var/run/docker.sock"
    exposedByDefault: false
  file:
    filename: /config.yml
    watch: true
certificatesResolvers:
  cloudflare:
    acme:
      email: YOUR_CLOUDFLARE_EMAIL@DOMAIN.com
      storage: acme.json
      dnsChallenge:
        provider: cloudflare
        # disablePropagationCheck: true # uncomment this if you have issues pulling certificates through Cloudflare; setting this flag to true disables waiting for the TXT record to propagate to all authoritative name servers
        resolvers:
          - "1.1.1.1:53"
          - "1.0.0.1:53"
12. Using nano or a similar editor, create a file named docker-compose.yaml and paste
the following. Save the file afterwards.
secrets:
  CF_DNS_API_KEY:
    file: ./secrets/cloudflare/CF_DNS_API_KEY

services:
  traefik:
    image: traefik:latest
    container_name: traefik
    restart: unless-stopped
    secrets:
      - CF_DNS_API_KEY
    ports:
      - "80:80"
      - "443:443"
      - "8080:8080"
    environment:
      CLOUDFLARE_DNS_API_TOKEN_FILE: /run/secrets/CF_DNS_API_KEY
      CLOUDFLARE_ZONE_API_TOKEN_FILE: /run/secrets/CF_DNS_API_KEY
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./appdata/traefik/traefik.yml:/traefik.yml:ro
      - ./appdata/traefik/config.yml:/config.yml:ro
      - ./appdata/traefik/acme.json:/acme.json
      - /etc/localtime:/etc/localtime:ro

networks:
  internal:
    driver: bridge
13. Login to Cloudflare and click on your Domain, followed by Get your API token.
14. Click on Create Token then Create Custom Token and make sure you give your token
the following permissions.
Token Permissions
Zone -> DNS -> Edit
Zone -> Zone -> Read
19. cd into appdata/traefik and, using nano or a similar editor, create a file named
config.yml and paste the following. Replace PRIVATE_IP with the private IP you
obtained, and silly.DOMAIN.com with your subdomain and domain, then save the file.
http:
  routers:
    sillytavern:
      entryPoints:
        - "https"
      rule: "Host(`silly.DOMAIN.com`)"
      middlewares:
        - https-redirectscheme
      tls: {}
      service: sillytavern
  services:
    sillytavern:
      loadBalancer:
        servers:
          - url: "http://PRIVATE_IP:8000"
        passHostHeader: true
  middlewares:
    https-redirectscheme:
      redirectScheme:
        scheme: https
20. Run Docker Compose using the following commands:
cd /docker
docker compose up -d
21. Go to your SillyTavern folder and edit config.yaml to enable listen mode and basic
authentication, whilst disabling whitelistMode .
listen: yes
whitelistMode: false
basicAuthMode: true
Tip
Make sure to change the default username and password to something
strong that you can remember.
Tip
Before enabling perUserBasicAuth ensure you have a valid multi-user setup
with working passwords.
22. Wait a few minutes, then open the domain page you made for ST. Once this is done,
you should be able to open SillyTavern from anywhere with just one URL and
one account.
Tip
If nothing happens after several minutes, check the container logs for
Traefik for any possible errors.
23. Enjoy! :D
Linux (Docker SillyTavern)
Note
Note that we run SillyTavern on bare metal rather than in Docker. This is a rough idea
of what we would do on Docker, alongside the other Docker containers we tend to use
with ST.
Token Permissions
Zone -> DNS -> Edit
Zone -> Zone -> Read
7. Create another record of the CNAME type, then click Save. Here is an example of how
it should appear on the Cloudflare dashboard.
9. Using nano or a similar editor, create a file named docker-compose.yaml and paste
the following. Replace silly.DOMAIN.com with the subdomain you added above, then
save the file afterwards.
secrets:
  CF_DNS_API_KEY:
    file: ./secrets/cloudflare/CF_DNS_API_KEY

services:
  traefik:
    image: traefik:latest
    container_name: traefik
    restart: unless-stopped
    secrets:
      - CF_DNS_API_KEY
    ports:
      - "80:80"
      - "443:443"
      - "8080:8080"
    environment:
      CLOUDFLARE_DNS_API_TOKEN_FILE: /run/secrets/CF_DNS_API_KEY
      CLOUDFLARE_ZONE_API_TOKEN_FILE: /run/secrets/CF_DNS_API_KEY
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./appdata/traefik/traefik.yml:/traefik.yml:ro
      - ./appdata/traefik/config.yml:/config.yml:ro
      - ./appdata/traefik/acme.json:/acme.json
      - /etc/localtime:/etc/localtime:ro

  sillytavern:
    build: ./SillyTavern
    container_name: sillytavern
    hostname: sillytavern
    image: ghcr.io/sillytavern/sillytavern:latest
    volumes:
      - "./appdata/sillytavern/config:/home/node/app/config"
      - "./appdata/sillytavern/data:/home/node/app/data"
    restart: unless-stopped
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.sillytavern.entrypoints=http"
      - "traefik.http.routers.sillytavern.rule=Host(`silly.DOMAIN.com`)"
      - "traefik.http.middlewares.sillytavern-https-redirect.redirectscheme.scheme=https"
      - "traefik.http.routers.sillytavern.middlewares=sillytavern-https-redirect"
      - "traefik.http.routers.sillytavern-secure.entrypoints=https"
      - "traefik.http.routers.sillytavern-secure.rule=Host(`silly.DOMAIN.com`)"
      - "traefik.http.routers.sillytavern-secure.tls=true"
      - "traefik.http.routers.sillytavern-secure.service=sillytavern"
      - "traefik.http.services.sillytavern.loadbalancer.server.port=8000"

networks:
  internal:
    driver: bridge
Tip
Make sure to change the default username and password to something
strong that you can remember.
14. Wait a few minutes, then open the domain page you made for ST. Once this is done,
you should be able to open SillyTavern from anywhere with just one URL and
one account.
Tip
If nothing happens after several minutes, check the container logs for
Traefik for any possible errors.
15. Enjoy! :D
Updating your Cloudflare DNS
DDClient lets you sync your public IP to Cloudflare whenever your ISP changes it, so you
can keep accessing your ST instance as if nothing ever happened.