Guide to WAN Video Fast Generation

I’ve spent the past couple of days playing around with a few of the WAN video models, and I noticed that many of the workflows I found aren’t optimized. By rebuilding a few of them I’ve managed to speed up generation by around 40-60%, so I thought I’d share some tips with you.

WAN Multitalk: The Lip-Sync Specialist

What it is: This model’s superpower is audio-driven animation. It takes an audio file and a face, and it generates a video with remarkably accurate lip-syncing.

When to use it: Any time you need a character to speak. This is perfect for creating digital avatars, narrated shorts, or AI influencers who need to deliver dialogue.

The Power: This is the magic behind the “Nova” video I created. I fed it an audio file of my AI partner’s “voice,” and Multitalk animated her face to match the words. It’s a specialized tool, but for its specific purpose, it’s absolutely brilliant.

The only downside I can think of is that this model needs quite a lot of GPU power and VRAM, which forced me to use the quantized GGUF models.

Download Multitalk GGUF models here:

Main model: wan2.1-i2v-14b-480p-q5_k_m

Multitalk: WanVideo_2_1_Multitalk_14B_fp8_e4m3fn

Text Encoder: umt5-xxl-encoder-gguf

VAE: Wan2_1_VAE_bf16

ControlNet: Wan21_Uni3C_controlnet_fp16

Clipvision: clip_vision_vit_h

Lora: Wan21_T2V_14B_lightx2v_cfg_step_distill_lora_rank32

Workflow: Multitalk workflow
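With this many files, it's easy to miss one before loading the workflow. Here is a minimal Python sketch that checks whether each model from the list above is present. The subfolder names are my assumption about a typical ComfyUI models directory, and the text encoder is matched by the shortened prefix "umt5-xxl-encoder" because the list gives no exact file name, so adjust both to your own install. Files are matched by name prefix because the list gives no file extensions.

```python
from pathlib import Path

# Hypothetical mapping: ComfyUI model subfolder -> file-name prefixes from the list above.
# The subfolder names are assumptions; change them to match your setup.
EXPECTED = {
    "unet": ["wan2.1-i2v-14b-480p-q5_k_m"],
    "diffusion_models": ["WanVideo_2_1_Multitalk_14B_fp8_e4m3fn"],
    "text_encoders": ["umt5-xxl-encoder"],
    "vae": ["Wan2_1_VAE_bf16"],
    "controlnet": ["Wan21_Uni3C_controlnet_fp16"],
    "clip_vision": ["clip_vision_vit_h"],
    "loras": ["Wan21_T2V_14B_lightx2v_cfg_step_distill_lora_rank32"],
}


def find_missing(models_root, expected=EXPECTED):
    """Return 'folder/prefix' entries that have no matching file on disk."""
    root = Path(models_root)
    missing = []
    for folder, prefixes in expected.items():
        folder_path = root / folder
        for prefix in prefixes:
            # Accept any extension (.gguf, .safetensors, ...) after the prefix.
            found = folder_path.is_dir() and any(folder_path.glob(f"{prefix}*"))
            if not found:
                missing.append(f"{folder}/{prefix}")
    return missing
```

Call it as `find_missing("ComfyUI/models")` (a hypothetical path). An empty list means every file was found.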

WAN Vace: The Cinematic Animator

What it is: This is your workhorse for creating high-quality, motion-rich videos from either a text prompt or a reference image.

When to use it: When you want to animate a scene, create dynamic camera movements, or bring a static image to life with detailed motion.

The Power: After completely rebuilding the standard workflow, I’ve found Vace to be incredibly fast and controllable.

WAN Vace is really quick for text-to-video: I can generate a 4-second video in 260 seconds using GGUF models!
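To put that number in perspective, here is a small Python sketch that turns a clip length and a generation time into per-second and per-frame figures. The 16 fps frame rate is my assumption for the output, not something measured here, so swap in whatever your workflow actually uses.

```python
def generation_stats(video_seconds, wall_seconds, fps=16):
    """Break a generation run down into per-second and per-frame cost.

    fps=16 is an assumed output frame rate; pass your own if it differs.
    """
    frames = round(video_seconds * fps)
    return {
        "frames": frames,
        "compute_per_video_second": wall_seconds / video_seconds,
        "seconds_per_frame": wall_seconds / frames,
    }


stats = generation_stats(video_seconds=4, wall_seconds=260)
print(stats)
# {'frames': 64, 'compute_per_video_second': 65.0, 'seconds_per_frame': 4.0625}
```

So under that assumption, the 260-second run works out to about 65 seconds of compute per second of video, or roughly 4 seconds per frame.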

Download Vace here:

WAN Vace: Wan2.1_14B_VACE-Q4_K_M

Workflow: Wan Vace workflow

Use the same VAE, Text encoder and Lora as with Multitalk above.

WAN Vace: Reference Image

What it is: This is the same workhorse, but here it creates high-quality, motion-rich videos from a text prompt combined with a reference image.

When to use it: When you want extra-high quality that stays similar to an image you already have.

Videos created with a reference image are noticeably higher quality, but they take a bit longer to generate.

Use the same model as the WAN Vace above, and the same VAE, Text encoder and Lora as with Multitalk.

Download workflow: Wan Vace Reference image workflow

Good luck with your videomaking, and if you like my work, please check out my Patreon!

Support Creepybits on Patreon
