Wan Animate : AI can now do CGI for free

How to use Wan 2.2 Animate?

Photo by Ion (Ivan) Sipilov on Unsplash

The latest release from Tongyi Lab, Wan 2.2 Animate, tackles a problem that’s been half-solved in the open-source space for a while: how to animate a still character image with realistic body movement, facial expressions, and environmental blending, all in one model.

Most tools you’ve probably seen focus on one part. Some can animate a portrait, but the face comes out stiff. Others can swap characters into a scene, but they look pasted on. Wan 2.2 Animate goes after the whole pipeline.

What it actually does

Wan 2.2 Animate works in two modes:

  • Animation Mode: Take a static character image and a reference video. The model copies the motion and expressions from the reference and generates a new animation while keeping the background from the image.
  • Replacement Mode: Instead of generating a new background, it inserts the animated character into the reference video, replacing the original subject. To avoid the “Photoshop cutout” look, it also matches the video’s lighting and color tones.

So if you have a sketch, a stylized character, or even a photo, the model can drive it using motion from another video. Or, it can do a proper character swap inside an existing clip.
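The distinction between the two modes boils down to where the background comes from. The sketch below is purely illustrative — the class name, fields, and method are hypothetical stand-ins, not the actual Wan API:

```python
from dataclasses import dataclass

@dataclass
class AnimateRequest:
    """Hypothetical request object showing what each mode consumes;
    names are illustrative, not the real Wan 2.2 Animate interface."""
    character_image: str   # path to the still character image
    reference_video: str   # path to the driving (motion source) video
    mode: str              # "animation" or "replacement"

    def background_source(self) -> str:
        # Animation mode keeps the character image's own background;
        # replacement mode composites the character into the video.
        if self.mode == "animation":
            return self.character_image
        if self.mode == "replacement":
            return self.reference_video
        raise ValueError(f"unknown mode: {self.mode}")
```

In both cases the reference video supplies the motion; the mode only decides whether the output scene comes from the image or from the clip.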

What makes it different

A few key things stand out compared to earlier open-source attempts:

  • Unified input design: Rather than having separate models for body, face, and replacement, Wan 2.2 Animate uses a common symbolic representation, which lets one model handle multiple tasks.
  • Two-level control:
      • For body motion, it extracts a 2D skeleton from the reference video and injects it into the noise latents of the diffusion process.
      • For facial expressions, it skips landmarks (which lose detail) and instead encodes the cropped face directly into latent features. These are injected through cross-attention into the Transformer layers, preserving subtle expressions.
  • Relighting LoRA: When doing replacement, the model applies a lightweight module that adjusts lighting and color tone to fit the new scene. Without it, characters look mismatched; with it, they blend naturally.
  • Long video support: Instead of being limited to short clips, it can chain segments together by reusing the last few frames as temporal guidance, keeping continuity.
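The long-video chaining idea is easy to see as a segment scheduler: each new segment overlaps the previous one by a few frames, and those overlapping frames act as temporal guidance. This is an illustrative sketch of the scheduling logic only — the function name and parameters are assumptions, not Wan's actual code:

```python
def plan_segments(total_frames: int, seg_len: int, guide_frames: int):
    """Split a long generation into overlapping segments.

    Each segment after the first reuses the last `guide_frames`
    frames of the previous segment as temporal guidance, so only
    (seg_len - guide_frames) genuinely new frames are generated
    per step. Sketch of the chaining idea, not the real pipeline.
    """
    assert 0 < guide_frames < seg_len
    segments = []
    start = 0
    while start < total_frames:
        end = min(start + seg_len, total_frames)
        segments.append((start, end))
        if end == total_frames:
            break
        start = end - guide_frames  # overlap = guidance frames
    return segments
```

For example, 100 frames with 40-frame segments and 8 guidance frames yields three overlapping segments instead of three disjoint clips, which is what keeps motion continuous across boundaries.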

Under the hood

Wan 2.2 Animate is built on Wan-I2V, a diffusion-transformer (DiT) based image-to-video model. The architecture keeps the usual VAE compression + patchify + Transformer pipeline, but adds two adapters and a LoRA module:

  • Body Adapter: compresses skeleton poses and aligns them spatially with the video latents.
  • Face Adapter: encodes faces into 1D latents, temporally aligns them, and feeds them into dedicated “face blocks” every few Transformer layers.
  • Relighting LoRA: only used in replacement mode, applied to self- and cross-attention layers for lighting correction.
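A toy sketch can make the adapter wiring concrete: skeleton features are fused with the video latents before the transformer stack (Body Adapter), and a dedicated face block cross-attends to the face latents every few layers (Face Adapter). Everything here — class name, fusion by addition, the layer interval — is a simplified assumption for illustration, not the real model:

```python
class TinyDiT:
    """Pure-Python stand-in for the adapter layout described above."""

    def __init__(self, num_layers: int = 12, face_every: int = 4):
        self.num_layers = num_layers
        self.face_every = face_every  # assumed interval between face blocks

    def forward(self, video_latents, skeleton_latents):
        # Body Adapter: spatially aligned skeleton features fused
        # with the noisy video latents (modeled here as addition).
        x = [v + s for v, s in zip(video_latents, skeleton_latents)]
        trace = []
        for layer in range(self.num_layers):
            trace.append(("dit_block", layer))
            # Face Adapter: dedicated face block every few layers,
            # where face latents would be injected via cross-attention.
            if (layer + 1) % self.face_every == 0:
                trace.append(("face_block", layer))
        return x, trace
```

The trace shows where the face blocks interleave with ordinary DiT blocks; in the real model those blocks carry the cross-attention over the encoded face latents.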

Training is staged: body control → face control → joint control → replacement → relighting. That progressive setup helps the model converge more reliably than trying to learn everything at once.
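The staged curriculum can be written down as an ordered schedule. The stage names follow the text above; which modules each stage actually trains is an assumption made for illustration:

```python
# Hypothetical training schedule mirroring the staged setup described
# above; module lists are illustrative, not from the paper.
TRAINING_STAGES = [
    ("body_control",  ["body_adapter"]),
    ("face_control",  ["face_adapter"]),
    ("joint_control", ["body_adapter", "face_adapter"]),
    ("replacement",   ["backbone"]),
    ("relighting",    ["relight_lora"]),
]

def modules_for(stage_name: str):
    """Look up which modules a given stage would train."""
    for name, modules in TRAINING_STAGES:
        if name == stage_name:
            return modules
    raise KeyError(stage_name)
```

The point of the ordering is that each stage starts from a model that already handles the previous, easier objective, which is why it converges more reliably than joint training from scratch.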

Benchmarks and results

On quantitative metrics like SSIM, LPIPS, and FVD, Wan 2.2 Animate outperforms most open-source baselines (Animate Anyone, UniAnimate, VACE). It’s also close to, or even better than, closed-source commercial models like DreamActor-M1 (ByteDance) and Runway Act-Two.

Human evaluation studies show the same: better motion accuracy, more consistent identities, and more natural expressions.

Why it matters

Up until now, open-source tools were always a compromise. You’d get either stiff faces, unstable body motion, or broken blending. Wan 2.2 Animate is the first open-source release that feels “complete”: it handles body + face + environment together, with quality good enough to rival proprietary systems.

And because the weights and code will be open-sourced, developers can actually build on top of it instead of waiting for commercial APIs to allow access.

How you’d actually use it

  1. Provide a character image (portrait, half-body, full-body).
  2. Provide a reference video with the motion/expression you want.
  3. Pick a mode:
  • Animation → keeps original background.
  • Replacement → swaps character into the reference video.

The model takes care of skeleton extraction, facial feature encoding, pose retargeting, and relighting if needed.
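The three-step workflow above can be sketched as a single orchestration function. Every helper name in this sketch is a hypothetical placeholder for a stage the model performs internally; it is not the real Wan API:

```python
def animate(character_image: str, reference_video: str,
            mode: str = "animation"):
    """Hypothetical end-to-end flow mirroring the steps above.

    Returns the ordered list of internal stages that would run;
    all stage names are illustrative placeholders.
    """
    if mode not in ("animation", "replacement"):
        raise ValueError(f"unknown mode: {mode}")
    stages = ["extract_skeleton", "encode_face", "retarget_pose"]
    if mode == "replacement":
        # Relighting LoRA only applies when compositing the character
        # into the reference video's scene.
        stages.append("apply_relighting_lora")
    stages.append("diffusion_generate")
    return stages
```

Note that relighting is conditional: it only enters the pipeline in replacement mode, since animation mode keeps the character image's own background.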

Output: a coherent, high-quality video that either animates your static character or swaps it seamlessly into a clip.

That’s the essence of Wan 2.2 Animate: not just another “Animate Anyone” clone, but a serious, unified system for character-driven video generation.


Wan Animate : AI can now do CGI for free was originally published in Data Science in Your Pocket on Medium, where people are continuing the conversation by highlighting and responding to this story.
