TokenBench

Cosmos-Tokenizer Code | Technical Report

TokenBench is a comprehensive benchmark that standardizes the evaluation of Cosmos-Tokenizer. It covers a wide variety of domains, including robotic manipulation, driving, egocentric, and web videos. It consists of high-resolution, long-duration videos and is designed to evaluate the performance of video tokenizers. The videos are drawn from existing datasets that are commonly used for various tasks: BDD100K, EgoExo-4D, BridgeData V2, and Panda-70M. This repo provides instructions for downloading and preprocessing the videos for TokenBench.

Installation

Clone the source code

git clone https://github.com/NVlabs/TokenBench.git
cd TokenBench

Install via pip

pip3 install -r requirements.txt
apt-get install -y ffmpeg

Preferably, build a Docker image using the provided Dockerfile:

docker build -t token-bench -f Dockerfile .

# You can run the container as:
docker run --gpus all -it --rm -v /home/${USER}:/home/${USER} \
    --workdir ${PWD} token-bench /bin/bash

Download StyleGAN Checkpoints from Hugging Face

You can use this snippet to download StyleGAN checkpoints from huggingface.co/LanguageBind/Open-Sora-Plan-v1.0.0:

import os

from huggingface_hub import login, snapshot_download

# Authenticate with your Hugging Face access token.
login(token="<YOUR-HF-TOKEN>", add_to_git_credential=True)

model_name = "LanguageBind/Open-Sora-Plan-v1.0.0"
local_dir = os.path.join("pretrained_ckpts", model_name)
os.makedirs(local_dir, exist_ok=True)

print(f"downloading `{model_name}` ...")
snapshot_download(repo_id=model_name, local_dir=local_dir)

Under pretrained_ckpts/LanguageBind/Open-Sora-Plan-v1.0.0 (the local_dir set in the snippet), you can find the StyleGAN checkpoints required for the FVD metrics:

├── opensora/eval/fvd/styleganv/
│   ├── fvd.py
│   ├── i3d_torchscript.pt
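
Before running FVD, it can help to confirm that the download actually produced the checkpoint file. The following is a minimal sketch, not part of TokenBench: `find_i3d_checkpoint` is a hypothetical helper name, and the search root `pretrained_ckpts` is assumed from the download snippet.

```python
from pathlib import Path


def find_i3d_checkpoint(root):
    """Return the first i3d_torchscript.pt found under root, or None.

    Hypothetical helper for illustration; TokenBench does not ship it.
    """
    root = Path(root)
    if not root.is_dir():
        return None
    # Search recursively so the exact nesting of the snapshot does not matter.
    matches = sorted(root.rglob("i3d_torchscript.pt"))
    return matches[0] if matches else None


if __name__ == "__main__":
    # "pretrained_ckpts" is the download root assumed from the snippet above.
    ckpt = find_i3d_checkpoint("pretrained_ckpts")
    print(ckpt if ckpt else "checkpoint not found; re-run the download snippet")
```

Searching recursively rather than hard-coding the full path keeps the check valid even if the snapshot layout differs slightly from the tree shown here.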

Instructions to build TokenBench

Download the datasets from the official websites:

EgoExo4D: https://docs.ego-exo4d-data.org/
BridgeData V2: https://rail-berkeley.github.io/bridgedata/
Panda70M: https://snap-research.github.io/Panda-70M/
BDD100K: http://bdd-data.berkeley.edu/

Pick the videos as specified in the token_bench/video/list.txt file.
Preprocess the videos using the script token_bench/video/preprocessing_script.py.
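
The selection step can be sketched as follows. This is an illustration under stated assumptions, not the repository's own tooling: it assumes each non-empty line of token_bench/video/list.txt names one video file and that entries can be matched to downloaded files by file name. Check the actual format of list.txt before relying on it. `read_video_list` and `select_videos` are hypothetical helper names.

```python
from pathlib import Path


def read_video_list(list_path):
    """Return the non-empty, stripped lines of the list file."""
    with open(list_path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def select_videos(dataset_root, entries):
    """Match list entries to files under dataset_root by file name.

    Assumption: the last path component of each entry is the file name
    of a downloaded video. Returns (found, missing), where found maps
    each entry to its path and missing lists entries with no match.
    """
    by_name = {p.name: p for p in Path(dataset_root).rglob("*") if p.is_file()}
    found, missing = {}, []
    for entry in entries:
        name = Path(entry).name
        if name in by_name:
            found[entry] = by_name[name]
        else:
            missing.append(entry)
    return found, missing
```

Returning the unmatched entries separately makes it easy to spot videos that were not downloaded before preprocessing starts.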

Evaluation on TokenBench

We provide basic scripts to compute common evaluation metrics for video tokenizer reconstruction, including PSNR, SSIM, and LPIPS. To compute a metric between two folders, run:

python3 -m token_bench.metrics_cli --mode=lpips \
        --gtpath <ground truth folder> \
        --targetpath <reconstruction folder>
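
For reference, PSNR follows the standard definition 10 · log10(MAX² / MSE), where MSE is the mean squared error between the ground truth and the reconstruction. The sketch below applies that textbook formula to NumPy arrays. It is not the code in token_bench.metrics_cli, whose exact conventions (value range, per-frame versus per-video averaging) may differ. The function name `psnr` and the default `max_value` of 255 (8-bit frames) are assumptions.

```python
import numpy as np


def psnr(reference, reconstruction, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally shaped arrays."""
    # Cast to float first so uint8 subtraction cannot wrap around.
    ref = np.asarray(reference, dtype=np.float64)
    rec = np.asarray(reconstruction, dtype=np.float64)
    if ref.shape != rec.shape:
        raise ValueError("inputs must have the same shape")
    mse = np.mean((ref - rec) ** 2)
    if mse == 0:
        # Identical inputs: PSNR is unbounded.
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))
```

Higher PSNR means the reconstruction is closer to the ground truth, which is how the PSNR columns in the leaderboards should be read.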

Continuous video tokenizer leaderboard

Tokenizer	Compression Ratio (T × H × W)	Formulation	PSNR	SSIM	rFVD
CogVideoX	4 × 8 × 8	VAE	33.149	0.908	6.970
OmniTokenizer	4 × 8 × 8	VAE	29.705	0.830	35.867
Cosmos-CV	4 × 8 × 8	AE	37.270	0.928	6.849
Cosmos-CV	8 × 8 × 8	AE	36.856	0.917	11.624
Cosmos-CV	8 × 16 × 16	AE	35.158	0.875	43.085

Discrete video tokenizer leaderboard

Tokenizer	Compression Ratio (T × H × W)	Quantization	PSNR	SSIM	rFVD
VideoGPT	4 × 4 × 4	VQ	35.119	0.914	13.855
OmniTokenizer	4 × 8 × 8	VQ	30.152	0.827	53.553
Cosmos-DV	4 × 8 × 8	FSQ	35.137	0.887	19.672
Cosmos-DV	8 × 8 × 8	FSQ	34.746	0.872	43.865
Cosmos-DV	8 × 16 × 16	FSQ	33.718	0.828	113.481
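
When comparing rows, it helps to convert each compression ratio into a single factor. The sketch below assumes the three numbers are independent downsampling factors along time, height, and width, so the overall reduction in the number of spatio-temporal positions is their product. That reading is an interpretation of the column header, and `total_compression` is a hypothetical helper name.

```python
from math import prod


def total_compression(ratio):
    """Parse a ratio such as '4 × 8 × 8' and return the product of its factors."""
    # Accept both the multiplication sign and a plain 'x' as separators.
    factors = [int(part) for part in ratio.replace("x", "×").split("×")]
    return prod(factors)


for ratio in ("4 × 4 × 4", "4 × 8 × 8", "8 × 8 × 8", "8 × 16 × 16"):
    print(ratio, "->", total_compression(ratio))
# 4 × 4 × 4 -> 64
# 4 × 8 × 8 -> 256
# 8 × 8 × 8 -> 512
# 8 × 16 × 16 -> 2048
```

Under this reading, the 8 × 16 × 16 tokenizers compress 8 times more aggressively than the 4 × 8 × 8 ones, which is worth keeping in mind when comparing their PSNR, SSIM, and rFVD.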

Core contributors

Fitsum Reda, Jinwei Gu, Xian Liu, Songwei Ge, Ting-Chun Wang, Haoxiang Wang, Ming-Yu Liu

Citation

If you find TokenBench useful in your work, please acknowledge it by citing:

@article{agarwal2025cosmos,
  title={Cosmos World Foundation Model Platform for Physical AI},
  author={NVIDIA et al.},
  journal={arXiv preprint arXiv:2501.03575},
  year={2025}
}
