  • May 25, 2024
  Free


EMO: Emote Portrait Alive is an audio-driven portrait video generation tool developed by the Institute for Intelligent Computing, Alibaba Group. It generates expressive portrait videos from a single reference image and vocal audio, such as speech or singing. The tool was created by Linrui Tian, Qi Wang, Bang Zhang, and Liefeng Bo.

The development of EMO was driven by the desire to create lifelike avatar videos that express a range of emotions and head poses. The tool works in two stages. In the Frames Encoding stage, a ReferenceNet extracts features from the reference image and motion frames. In the Diffusion Process stage, an audio encoder converts the input audio into an embedding that is used during video generation.
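To make the data flow of the two stages easier to picture, here is a minimal Python sketch. It is not EMO's implementation: every function, class, and parameter name below is hypothetical, and the toy arithmetic only stands in for the real ReferenceNet, audio encoder, and diffusion model. It shows only which inputs feed which stage.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class EncodedFrames:
    """Output of the (toy) Frames Encoding stage."""
    reference_features: List[float]
    motion_features: List[List[float]]


def toy_reference_net(image: Sequence[float]) -> List[float]:
    # Hypothetical stand-in for ReferenceNet: summarizes an image
    # (here just a flat list of pixel values) as three numbers.
    return [sum(image) / len(image), max(image), min(image)]


def toy_frames_encoding(reference_image: Sequence[float],
                        motion_frames: Sequence[Sequence[float]]) -> EncodedFrames:
    # Stage 1: extract features from the reference image and motion frames.
    return EncodedFrames(
        reference_features=toy_reference_net(reference_image),
        motion_features=[toy_reference_net(f) for f in motion_frames],
    )


def toy_audio_encoder(samples: Sequence[float], num_frames: int) -> List[float]:
    # Hypothetical stand-in for the audio encoder: one value per output
    # frame, computed as the mean magnitude of that frame's audio chunk.
    chunk = max(1, len(samples) // num_frames)
    return [
        sum(abs(s) for s in samples[i * chunk:(i + 1) * chunk]) / chunk
        for i in range(num_frames)
    ]


def toy_diffusion_stage(encoded: EncodedFrames,
                        audio_embedding: Sequence[float]) -> List[List[float]]:
    # Stage 2 placeholder: NOT a real denoising loop. It only shows that
    # each output frame is conditioned on both the reference features
    # and the audio embedding for that frame.
    return [[f + a for f in encoded.reference_features] for a in audio_embedding]


def generate(reference_image, motion_frames, audio_samples, num_frames):
    encoded = toy_frames_encoding(reference_image, motion_frames)
    audio_embedding = toy_audio_encoder(audio_samples, num_frames)
    return toy_diffusion_stage(encoded, audio_embedding)


frames = generate(
    reference_image=[0.1, 0.5, 0.9],
    motion_frames=[[0.2, 0.4, 0.6]],
    audio_samples=[0.0, 0.5, -0.5, 1.0],
    num_frames=2,
)
```

In this toy version, the number of output frames is set by `num_frames`, and each frame carries the same reference features shifted by that frame's audio value.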

EMO stands out for its ability to generate videos of any duration, with the output length determined by the length of the input audio. It supports songs in various languages and can animate portraits from different eras, as well as paintings, 3D models, and AI-generated content. The tool can also keep up with fast-paced rhythms, so that even the swiftest lyrics stay synchronized with expressive, dynamic character animation.
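Because the video length follows the audio length, the number of frames to generate can be estimated from the audio duration and a frame rate. The sketch below illustrates that arithmetic only; the function name and the default of 30 frames per second are assumptions for illustration, not values taken from EMO.

```python
import math


def frames_for_audio(audio_seconds: float, fps: int = 30) -> int:
    """Estimate how many video frames cover an audio clip.

    `fps` is an assumed frame rate, not a documented EMO setting.
    """
    if audio_seconds < 0 or fps <= 0:
        raise ValueError("audio_seconds must be >= 0 and fps must be > 0")
    # Round up so the final partial frame interval is still covered.
    return math.ceil(audio_seconds * fps)


clip_frames = frames_for_audio(2.5, fps=30)  # a 2.5-second clip at 30 fps
```

A longer audio clip simply yields a proportionally larger frame count, which is why the output duration is not fixed in advance.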

The tool's applications range from academic research to effect demonstration. For example, it can animate portraits of movie characters delivering monologues or performances in different languages and styles, which broadens how a character can be portrayed in multilingual and multicultural contexts.

Looking ahead, the team behind EMO plans to continue refining the tool and exploring new applications. They are committed to pushing the boundaries of what's possible in the realm of expressive portrait video generation.

