Closed
Labels
bug: Something isn't working
Description
Describe the bug
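AutoencoderKLCogVideoX cannot reconstruct a single-frame image. An encode/decode round trip on a 5D input with one frame (shape 1 x 3 x 1 x 512 x 512) fails with a torch.cat() error.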
Reproduction
import torch
from diffusers import AutoencoderKLCogVideoX

device = "cuda"
dtype = torch.bfloat16

autoencoder = AutoencoderKLCogVideoX.from_pretrained(
    "THUDM/CogVideoX-5b",
    subfolder="vae",
    torch_dtype=dtype,
    local_files_only=True,
)
autoencoder.to(device)

# Single-frame input with shape (batch, channels, frames, height, width)
image = torch.randn(1, 3, 1, 512, 512, dtype=dtype, device=device)
with torch.no_grad():
    inputs = image.to(device, dtype)
    print(inputs.shape)
    latent = autoencoder.encode(inputs).latent_dist.mode()
    print(latent.shape)
    rec = autoencoder.decode(latent).sample
    print(rec.shape)
    # Move frames to the front and take the first one: (channels, height, width)
    rec_image = rec[0].permute(1, 0, 2, 3)[0].cpu().float()
Logs
torch.cat(): expected a non-empty list of Tensors
System Info
x86, linux
NVIDIA-SMI 535.54.03 Driver Version: 535.54.03 CUDA Version: 12.2
A800 80G
Who can help?
No response
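A note on the log message: torch.cat() raises "expected a non-empty list of Tensors" when it is handed an empty list. One plausible way to end up there, which this report does not confirm, is a frame-batching loop whose batch count is computed by floor division, so a single frame yields zero batches and nothing is collected for concatenation. The pure-Python sketch below only illustrates that arithmetic. The helper name frame_batches, the at_least_one flag, and the batch size of 8 are all hypothetical and are not taken from the diffusers source.

```python
def frame_batches(num_frames, frame_batch_size, at_least_one=False):
    """Hypothetical helper: return (start, end) frame ranges for batched processing."""
    # Floor division gives 0 batches whenever num_frames < frame_batch_size.
    num_batches = num_frames // frame_batch_size
    if at_least_one:
        # Possible guard (an assumption, not a confirmed fix): always run one batch.
        num_batches = max(num_batches, 1)
    remainder = num_frames % frame_batch_size
    batches = []
    for i in range(num_batches):
        # Leftover frames are folded into the first batch.
        start = frame_batch_size * i + (0 if i == 0 else remainder)
        end = min(frame_batch_size * (i + 1) + remainder, num_frames)
        batches.append((start, end))
    return batches


# Assumed batch size of 8, with the single frame from the reproduction.
single = frame_batches(1, 8)
guarded = frame_batches(1, 8, at_least_one=True)
print(single)   # empty: nothing would be available to concatenate
print(guarded)  # one range covering the single frame
```

If the VAE batches frames along these lines, the per-batch outputs for a one-frame input would be an empty list, which matches the error text above. Checking the frame count handling in the encode and decode paths would confirm or rule this out.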