Textoon: Generating Vivid 2D Cartoon Characters from Text Descriptions

Tongyi Lab, Alibaba Group

Abstract

The 2D cartoon style is a prominent art form in digital character creation, particularly popular among younger audiences. While advancements in digital human technology have spurred extensive research into photorealistic digital humans and 3D characters, interactive 2D cartoon characters have received comparatively less attention. Unlike 3D counterparts, which require sophisticated construction and resource-intensive rendering, Live2D, a widely used format for 2D cartoon characters, offers a more efficient alternative: it animates 2D characters in a manner that simulates 3D movement without requiring a complete 3D model. Furthermore, Live2D employs lightweight HTML5 (H5) rendering, improving both accessibility and efficiency. In this technical report, we introduce Textoon, an innovative method for generating diverse 2D cartoon characters in the Live2D format from text descriptions. Textoon leverages cutting-edge language and vision models to comprehend textual intent and generate 2D appearances, and it can create a wide variety of stunning, interactive 2D characters within one minute.

Method


Overview of Textoon. The framework leverages fine-tuned LLMs to accurately extract component descriptions from user input text and uses the corresponding components to control the appearance generation of 2D cartoon characters. It allows users to re-edit details, and it uses the components to extract and assemble the generated images into Live2D model textures. The resulting Live2D models are diverse and compatible with the original animations.

Features

Accurate Text Parsing

Our text parsing model excels at extracting detailed information from complex user descriptions. It accurately identifies features such as back hair, side hair, bangs, eye color, eyebrows, face shape, clothing type, and shoe type. This advanced text parsing capability allows for more flexible user inputs.
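The report does not publish the fine-tuned model or its prompt, but the idea of mapping free-form text to fixed component slots can be sketched with any OpenAI-compatible chat API. In the sketch below, the model name, component keys, and prompt wording are all illustrative assumptions:

```python
import json
from openai import OpenAI  # assumes an OpenAI-compatible endpoint is configured

# Hypothetical component schema mirroring the features listed above.
COMPONENTS = ["back_hair", "side_hair", "bangs", "eye_color",
              "eyebrows", "face_shape", "clothing_type", "shoe_type"]

def parse_description(text: str) -> dict:
    """Ask an LLM to map a free-form description to fixed component slots."""
    client = OpenAI()
    prompt = (
        "Extract the following character components from the description. "
        f"Return JSON with exactly these keys: {', '.join(COMPONENTS)}. "
        "Use null for any component that is not mentioned.\n\n"
        f"Description: {text}"
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; Textoon uses its own fine-tuned LLM
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},
    )
    return json.loads(resp.choices[0].message.content)

print(parse_description("A girl with long pink bangs, green eyes and a sailor uniform"))
```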

Controllable Appearance Generation

After parsing the text, each component is synthesized into a comprehensive character template. The contour boundaries offer precise control over the shape of the generated character, while a text-to-image model generates the interior colors and textures.
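As an illustrative sketch of contour-conditioned generation (the report does not specify the exact backbone; here a Canny-edge ControlNet with Stable Diffusion stands in, and the file names are hypothetical), assuming `diffusers` is installed:

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# Stand-in models: a Canny-edge ControlNet illustrates constraining
# generation to the template's contour boundaries.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# Contour map assembled from the parsed components (hypothetical file).
contour = load_image("character_template_contours.png")

image = pipe(
    "anime girl, long pink bangs, green eyes, sailor uniform",
    image=contour,                      # contours constrain the shape
    num_inference_steps=30,
    controlnet_conditioning_scale=1.0,  # interior color/texture left to the T2I model
).images[0]
image.save("character.png")
```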

Editable

If users are not satisfied with the initially generated result and wish to modify specific details, our framework lets them select specific positions at which to add, remove, or modify elements.
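A minimal sketch of such region-based re-editing, assuming a standard diffusion inpainting pipeline stands in for the unnamed editing model (model identifier and file names are illustrative):

```python
import torch
from diffusers import StableDiffusionInpaintPipeline
from diffusers.utils import load_image

# Stand-in inpainting model; Textoon's editing backbone is unspecified.
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-inpainting", torch_dtype=torch.float16
).to("cuda")

character = load_image("character.png")        # previously generated result
mask = load_image("user_selected_region.png")  # white = region to re-edit

edited = pipe(
    prompt="a red bow hair accessory",         # element to add in the region
    image=character,
    mask_image=mask,
    num_inference_steps=30,
).images[0]
edited.save("character_edited.png")
```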

Animation

The Live2D model's mouth is primarily controlled by two coefficients, MouthOpenY and MouthForm. MouthOpenY controls the vertical opening of the mouth, while MouthForm adjusts its expression, such as upturning or grimacing. Driving speech with these two parameters alone, however, often yields imprecise lip movements. To improve the accuracy of speech animation for cartoon characters, we integrate ARKit's face blend shape capabilities into the Live2D lip-sync functionality, which significantly improves the realism and precision of the animated speech.
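The exact mapping is not published; as a hedged sketch, per-frame ARKit mouth blend shapes (jawOpen, mouthClose, mouthSmileLeft/Right, mouthFrownLeft/Right are real ARKit keys) might be combined into the two standard Live2D parameters roughly as follows, with all combination weights being illustrative assumptions:

```python
def arkit_to_live2d(bs: dict) -> dict:
    """Map ARKit face blend shape weights (0..1) to Live2D mouth parameters.

    The blend shape names are real ARKit keys and the parameter IDs are
    standard Live2D Cubism IDs; the weights below are illustrative
    assumptions, not Textoon's published mapping.
    """
    smile = 0.5 * (bs.get("mouthSmileLeft", 0.0) + bs.get("mouthSmileRight", 0.0))
    frown = 0.5 * (bs.get("mouthFrownLeft", 0.0) + bs.get("mouthFrownRight", 0.0))

    # ParamMouthOpenY in [0, 1]: vertical opening, jaw opening minus lip closure.
    mouth_open_y = max(0.0, min(1.0, bs.get("jawOpen", 0.0) - bs.get("mouthClose", 0.0)))

    # ParamMouthForm in [-1, 1]: +1 upturned (smile), -1 grimace/frown.
    mouth_form = max(-1.0, min(1.0, smile - frown))

    return {"ParamMouthOpenY": mouth_open_y, "ParamMouthForm": mouth_form}

# Example frame of ARKit blend shape weights:
print(arkit_to_live2d({"jawOpen": 0.6, "mouthSmileLeft": 0.4, "mouthSmileRight": 0.5}))
```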

Video


Created Characters & Prompts

Textoon supports both English and Chinese prompts.