Generate Images That Actually Get the Text Right

Most image models fumble dense copy, tight layouts, and multi-object prompts. ERNIE Image is trained for exactly those cases — long-form text on posters, speech bubbles in comics, structured multi-panel compositions, and bilingual Chinese/English scenes.


Why Creators Pick ERNIE Image

Strong where other image models are weak

ERNIE Image is a single-stream Diffusion Transformer trained to handle the cases that usually break generative models: legible text, strict layouts, multi-object prompts, and bilingual instructions. A lightweight Prompt Enhancer expands short inputs into structured descriptions, so you don't need to prompt-engineer to get usable output.

Fast Iteration with Turbo Mode

A distilled 8-step Turbo variant ships alongside the 50-step SFT model. Sketch at draft speed, then render the hero frame at full quality — no tool switch.

Benchmarks, Not Just Demos

ERNIE Image scores 0.8856 on GENEval and 0.9733 on LongTextBench, with OneIG scores of 0.5750 in English and 0.5543 in Chinese. These are benchmark results, not cherry-picked samples.

Write Like You Think

The built-in Prompt Enhancer turns a one-line idea into a detailed, structured prompt. You stay in creative mode; the model handles the prompt-engineering layer.

One Surface for the Whole Pipeline

Generate, edit, composite, upscale, export — your visual workflow sits inside a single tab. No tool-hopping, no stacked subscriptions.

Replace Shoots and Stock Budgets

On-brand posters, product frames, and campaign assets in minutes. Scale creative volume without scaling headcount or licensing spend.

Apache 2.0 — You Own the Output

Weights are open under Apache 2.0 and everything you generate is yours commercially. Ads, merch, print, resale, fine-tuning, self-hosting — all on the table.

Core Capabilities

Built for the cases that break other image models

ERNIE Image is an 8B single-stream DiT paired with a Prompt Enhancer and a Turbo variant. Here's what the architecture is actually good at.

Accurate In-Image Text Rendering

Long-form copy on posters, headlines on infographics, speech bubbles in comics, labels on UI mockups. Characters render cleanly where other diffusion models smear glyphs or hallucinate letters — LongTextBench 0.9733.

Instruction-Faithful Composition

Multiple objects, specific spatial relationships, knowledge-dense prompts. The model tracks what you actually described rather than collapsing to a generic 'pretty picture' — GENEval 0.8856, ahead of Qwen-Image and comparable to FLUX.2.

Structured Layouts and Multi-Panel

Posters, comics, storyboards, UI frames, infographics. ERNIE Image reasons about page layout and panel composition — not just subject and style. Supported resolutions include 1024×1024, 848×1264, 1264×848, 768×1376, and 1376×768.
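If your target format is not one of those sizes, one reasonable approach is to pick the listed resolution with the nearest aspect ratio and crop or resize afterwards. The sketch below does only that; `closest_resolution` is a hypothetical helper written for this page, not part of any ERNIE Image API.

```python
import math

# Resolutions listed above, as (width, height).
SUPPORTED_RESOLUTIONS = [
    (1024, 1024),
    (848, 1264),
    (1264, 848),
    (768, 1376),
    (1376, 768),
]


def closest_resolution(width: int, height: int) -> tuple[int, int]:
    """Return the listed resolution whose aspect ratio is nearest the target.

    Ratios are compared in log space so that "twice as wide" and
    "twice as tall" count as equally far from square.
    """
    target = math.log(width / height)
    return min(
        SUPPORTED_RESOLUTIONS,
        key=lambda wh: abs(math.log(wh[0] / wh[1]) - target),
    )


# A 16:9 banner maps to the widest landscape option.
print(closest_resolution(1920, 1080))  # (1376, 768)
# A 2:3 print poster maps to the milder portrait option.
print(closest_resolution(2, 3))        # (848, 1264)
```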

Bilingual Chinese and English

Prompts in either language return results of comparable quality — OneIG-EN 0.5750 and OneIG-ZH 0.5543. In-image text handles both scripts, so you can ship the same campaign in two markets from one pipeline.

Two Variants: SFT and Turbo

The 50-step SFT model maximizes instruction fidelity for final frames. ERNIE-Image-Turbo — distilled with DMD and reinforcement learning — returns 8-step previews in seconds for fast iteration.
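To get a feel for the draft-then-final workflow, the sketch below counts denoising steps for a session. The step counts (8 and 50) come from the text above; the preset names, the `VariantPreset` class, and `total_steps` are assumptions made for illustration. Step count is only a rough proxy for render time, since cost per step can differ between variants.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantPreset:
    label: str   # human-readable variant name
    steps: int   # denoising steps per image


# Hypothetical preset table; only the step counts are taken from the text.
PRESETS = {
    "draft": VariantPreset("Turbo", 8),
    "final": VariantPreset("SFT", 50),
}


def total_steps(num_drafts: int, num_finals: int) -> int:
    """Total denoising steps for a session of Turbo drafts plus SFT finals."""
    return (
        num_drafts * PRESETS["draft"].steps
        + num_finals * PRESETS["final"].steps
    )


# 30 Turbo previews plus 1 SFT final, versus rendering all 31 on SFT.
print(total_steps(30, 1))  # 290
print(total_steps(0, 31))  # 1550
```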

Open Weights, Consumer-GPU Friendly

The full 8B checkpoint is released under Apache 2.0 and runs on a single 24 GB GPU. Self-host, fine-tune on your brand data, or integrate directly into a production pipeline, with no vendor lock-in.
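A quick back-of-the-envelope check shows why an 8B model can fit on a 24 GB card. The sketch assumes the weights are stored in a 16-bit format (2 bytes per parameter), which is an assumption and not something stated above. It counts the diffusion model's weights only and ignores the text encoder, VAE, and activations, so treat the result as a lower bound.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


# Assumed 16-bit weights: 8e9 params * 2 bytes is about 14.9 GiB,
# which leaves headroom on a 24 GB card.
print(round(weight_memory_gib(8e9, 2), 1))  # 14.9

# The same weights in 32-bit would need about 29.8 GiB, more than 24 GB.
print(round(weight_memory_gib(8e9, 4), 1))  # 29.8
```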

Trusted by Creative Professionals

Real stories from creators who transformed their workflow

Sarah Chen
Digital Artist

I mostly work on comic panels with speech bubbles, which every other AI tool mangled. ERNIE Image is the first one where the text inside the image actually renders — 20+ hours a week back in my pocket.

Marcus Rodriguez
Marketing Director

Campaign posters with real headlines used to come back from our agency in weeks. Now I generate them in-house, in both English and Chinese, in an afternoon.

Emily Watson
Content Creator

Turbo mode changed how I iterate — I preview 30 compositions in the time it used to take to render one. Then I lock in the final frame on the full SFT model.

David Kim
Graphic Designer

I've tested more than fifteen text-to-image tools. ERNIE Image is the only one I trust for layout-heavy work — posters, infographics, anything where spacing and text actually matter.

Lisa Thompson
Social Media Manager

Twelve accounts, two languages, one afternoon per month. The bilingual prompting means I'm not maintaining parallel creative pipelines anymore — engagement has roughly tripled.

James Wilson
Creative Director

We retired the stock-photo line item entirely. Every asset is original, on-brand, and ships with real text baked in — so the design team stops retouching headlines back on in Photoshop.

Anna Martinez
Freelance Illustrator

Client revisions that used to eat days now happen during the call. I walk in with dozens of explored directions — output is up roughly 5x and the client conversations are much better.

Robert Chang
Brand Manager

Holding layout consistency across 50+ SKUs used to be a full-time job. The model learned our style guide and now ships perfectly aligned product assets on demand.

Sophie Laurent
Art Director

Pitch decks used to rely on placeholder visuals because the real ones took weeks. Now every slide ships with custom imagery — and the client assumes we have a full studio behind it.

Michael Brown
Product Designer

UI mockups with real interface text (buttons, labels, microcopy) came out correctly on the first try. I cycle through 50+ concepts a day; the dev loop is easily four times faster.

Rachel Green
Digital Marketer

We grew from 10K to 100K followers almost entirely on AI-generated posters and carousels. The difference is that the text in the image actually reads — that's the whole game for social.

Kevin Park
UX Designer

Every project exposes another capability I hadn't planned to use. Subtle retouching, structured multi-panel layouts, bilingual copy — it handles the kind of complexity a senior designer would take seriously.

Frequently Asked Questions

Everything you need to know about ERNIE Image