cogvideox autoencoder could not reconstruct an image #9568

@Xiang-cd

Describe the bug

AutoencoderKLCogVideoX cannot reconstruct a single-frame image: running an input with one frame through encode and decode raises a torch.cat() error.

Reproduction

import torch
from diffusers import AutoencoderKLCogVideoX

device = "cuda"
dtype = torch.bfloat16

autoencoder = AutoencoderKLCogVideoX.from_pretrained(
    "THUDM/CogVideoX-5b",
    subfolder="vae",
    torch_dtype=dtype,
    local_files_only=True,
)
autoencoder.to(device)

# Single-frame input laid out as (batch, channels, frames, height, width)
image = torch.randn(1, 3, 1, 512, 512, dtype=dtype, device=device)
with torch.no_grad():
    inputs = image.to(device, dtype)
    print(inputs.shape)
    latent = autoencoder.encode(inputs).latent_dist.mode()
    print(latent.shape)
    rec = autoencoder.decode(latent).sample
    print(rec.shape)
    # Take the first frame of the first sample as a (C, H, W) image
    rec_image = rec[0].permute(1, 0, 2, 3)[0].cpu().float()

Logs

torch.cat(): expected a non-empty list of Tensors
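
One plausible cause, which this report does not confirm, is a frame-batching loop that uses floor division to count batches. For an input shorter than one batch, such a loop runs zero times and leaves an empty list for `torch.cat()`. The sketch below is plain Python and only illustrates that arithmetic. The function names and the batch size of 8 are hypothetical and are not taken from the diffusers source.

```python
def chunk_ranges(num_frames, frame_batch_size):
    """Hypothetical batching: floor division yields zero batches for short inputs."""
    remaining = num_frames % frame_batch_size
    ranges = []
    for i in range(num_frames // frame_batch_size):
        # The first batch absorbs the leftover frames
        start = frame_batch_size * i + (0 if i == 0 else remaining)
        end = frame_batch_size * (i + 1) + remaining
        ranges.append((start, end))
    return ranges


def chunk_ranges_safe(num_frames, frame_batch_size):
    """Same batching, but always processes at least one batch."""
    num_batches = max(num_frames // frame_batch_size, 1)
    remaining = num_frames % frame_batch_size
    ranges = []
    for i in range(num_batches):
        start = frame_batch_size * i + (0 if i == 0 else remaining)
        end = frame_batch_size * (i + 1) + remaining
        # Clamp so a short input is not sliced past its last frame
        ranges.append((start, min(end, num_frames)))
    return ranges


print(chunk_ranges(1, 8))       # no batches: nothing to concatenate
print(chunk_ranges_safe(1, 8))  # one batch covering the single frame
```

If the installed version does batch frames this way, checking the frame count against the batch size before calling `encode` would show whether the input hits this case.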

System Info

x86, linux
NVIDIA-SMI 535.54.03 Driver Version: 535.54.03 CUDA Version: 12.2
A800 80G

Who can help?

No response


Labels: bug
