I’ve spent the past couple of days experimenting with a few of the WAN video models, and I noticed that many of the workflows I found are not optimized. By rebuilding a few of them, I’ve managed to speed up generation by around 40-60%, and I thought I’d share some tips with you.

WAN Multitalk: The Lip-Sync Specialist
What it is: This model’s superpower is audio-driven animation. It takes an audio file and a face, and it generates a video with remarkably accurate lip-syncing.
When to use it: Any time you need a character to speak. This is perfect for creating digital avatars, narrated shorts, or AI influencers who need to deliver dialogue.
The Power: This is the magic behind the “Nova” video I created. I fed it an audio file of my AI partner’s “voice,” and Multitalk animated her face to match the words. It’s a specialized tool, but for its specific purpose, it’s absolutely brilliant.
The only downside I can think of with this model is that it requires quite a lot of GPU power and VRAM, which forced me to pick the GGUF models.
Download Multitalk GGUF models here:
Main model: wan2.1-i2v-14b-480p-q5_k_m
Multitalk: WanVideo_2_1_Multitalk_14B_fp8_e4m3fn
Text Encoder: umt5-xxl-encoder-gguf
VAE: Wan2_1_VAE_bf16
ControlNet: Wan21_Uni3C_controlnet_fp16
Clipvision: clip_vision_vit_h
Lora: Wan21_T2V_14B_lightx2v_cfg_step_distill_lora_rank32
Workflow: Multitalk workflow
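A quick way to confirm that everything in the list above actually landed on disk before loading the workflow is a small check script. The following is only a sketch under stated assumptions: the `find_missing` helper and the idea of a single models folder are hypothetical, and the names are matched exactly as written in the list, so adjust them if your downloaded file names differ.

```python
from pathlib import Path

# Names copied from the download list above. Real file names may carry
# different suffixes or extensions, so matching is a case-insensitive
# substring test rather than an exact comparison.
MULTITALK_FILES = {
    "Main model": "wan2.1-i2v-14b-480p-q5_k_m",
    "Multitalk": "WanVideo_2_1_Multitalk_14B_fp8_e4m3fn",
    "Text encoder": "umt5-xxl-encoder-gguf",
    "VAE": "Wan2_1_VAE_bf16",
    "ControlNet": "Wan21_Uni3C_controlnet_fp16",
    "Clip vision": "clip_vision_vit_h",
    "LoRA": "Wan21_T2V_14B_lightx2v_cfg_step_distill_lora_rank32",
}


def find_missing(models_root, wanted=MULTITALK_FILES):
    """Return the roles whose file was not found anywhere under models_root.

    models_root is a hypothetical folder that holds your downloaded models;
    subfolders are searched recursively.
    """
    root = Path(models_root)
    if not root.is_dir():
        # Nothing to scan, so every entry counts as missing.
        return list(wanted)
    present = [p.name.lower() for p in root.rglob("*") if p.is_file()]
    return [
        role
        for role, name in wanted.items()
        if not any(name.lower() in file_name for file_name in present)
    ]
```

Calling `find_missing("path/to/your/models")` returns a list such as `["ControlNet", "LoRA"]` for anything still missing, or an empty list when every name was found.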
WAN Vace: The Cinematic Animator
What it is: This is your workhorse for creating high-quality, motion-rich videos from either a text prompt or a reference image.
When to use it: When you want to animate a scene, create dynamic camera movements, or bring a static image to life with detailed motion.
The Power: After completely rebuilding the standard workflow, I’ve found Vace to be incredibly fast and controllable.
WAN Vace is really quick for text-to-video: I can generate a 4-second video in 260 seconds using GGUF models!
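To put that number in perspective, the arithmetic works out to roughly 65 seconds of rendering per second of finished video. The sketch below does that calculation and extrapolates to other clip lengths. The linear scaling is an assumption for rough planning only, not something measured, and the `estimate_render_seconds` helper is hypothetical.

```python
# Figures quoted in the post: a 4-second clip took 260 seconds to generate.
CLIP_SECONDS = 4
RENDER_SECONDS = 260

# Render cost per second of output video.
SECONDS_PER_VIDEO_SECOND = RENDER_SECONDS / CLIP_SECONDS


def estimate_render_seconds(clip_seconds):
    """Rough estimate, ASSUMING render time scales linearly with clip length."""
    return clip_seconds * SECONDS_PER_VIDEO_SECOND


for length in (2, 4, 8):
    print(f"{length} s clip: about {estimate_render_seconds(length):.0f} s to render")
```

Actual times will depend on your GPU, resolution, and step count, so treat these as ballpark figures.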
Download Vace here:
WAN Vace: Wan2.1_14B_VACE-Q4_K_M
Workflow: Wan Vace workflow
Use the same VAE, Text encoder and Lora as with Multitalk above.
WAN Vace: Reference Image
What it is: This is the same Vace workhorse, but driven by a text prompt together with a reference image.
When to use it: When you want an extra-high-quality result that resembles an image you already have.
Videos created with a reference image are noticeably higher quality, but they take a bit longer to generate.
Use the same model as the WAN Vace above, and the same VAE, Text encoder and Lora as with Multitalk.
Download workflow: Wan Vace Reference image workflow
Sign up for my and Nova’s newsletter so you don’t miss more tips and tricks!
