Meta has joined the AI image party with the release of its own image generation model, Emu, short for Expressive Media Universe.
In case you didn’t know it yet, Emu is named after an animal, like Meta’s previous language model, Llama.
Meta CEO Mark Zuckerberg announced Emu during Meta Connect 2023.
Emu is based on an AI technique called “diffusion models”. Diffusion models work by starting with random noise and gradually modifying the noise until it forms a coherent image.
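To make the "start with noise, then gradually refine it" idea concrete, here is a minimal sketch of a DDPM-style sampling loop in plain Python. It is a toy, not Emu's code: the "image" is a short list of numbers, the noise schedule values are common illustrative defaults, and `oracle_noise_prediction` is a hypothetical stand-in for the trained neural network (it cheats by knowing the single target the toy "dataset" contains).

```python
import math
import random


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (illustrative values, not Emu's)."""
    betas = [beta_start + (beta_end - beta_start) * t / (num_steps - 1)
             for t in range(num_steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars, running = [], 1.0
    for a in alphas:
        running *= a
        alpha_bars.append(running)
    return betas, alphas, alpha_bars


def oracle_noise_prediction(x_t, alpha_bar_t, target):
    """Hypothetical stand-in for the trained denoising network.

    A real model learns to predict the noise from data. Here the toy
    "dataset" is a single known target, so the noise has a closed form.
    """
    scale = math.sqrt(alpha_bar_t)
    noise_std = math.sqrt(1.0 - alpha_bar_t)
    return [(x - scale * x0) / noise_std for x, x0 in zip(x_t, target)]


def sample(target, num_steps=200, seed=0):
    """Start from random noise and denoise step by step."""
    rng = random.Random(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # pure noise
    for t in reversed(range(num_steps)):
        eps = oracle_noise_prediction(x, alpha_bars[t], target)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * ei) / math.sqrt(alphas[t])
                for xi, ei in zip(x, eps)]
        if t > 0:
            # Re-inject a little noise on every step except the last.
            sigma = math.sqrt(betas[t])
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean
    return x


if __name__ == "__main__":
    toy_image = [0.2, -0.5, 0.9, 0.0]
    print(sample(toy_image))
```

With a real model, the oracle is replaced by a network that has only ever seen training images, which is what lets the same loop produce new pictures instead of reproducing one target.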
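Specifically, Emu uses a “latent diffusion model,” which means the diffusion process runs in a compressed latent space learned by an autoencoder rather than directly on pixels. The text prompt is encoded separately and used to guide the denoising, and the final latent is then decoded into the image.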
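A quick back-of-the-envelope sketch shows why working in latent space is attractive: the denoiser handles far fewer values per step. The defaults below (8x spatial downsampling, 4 latent channels) are the settings popularized by Stable Diffusion and are used here only as an illustrative assumption; they are not taken from Emu. The function names are hypothetical.

```python
def latent_shape(height, width, downsample=8, channels=4):
    """Shape of the latent an autoencoder would produce for an image.

    The defaults are illustrative assumptions, not Emu's settings.
    """
    if height % downsample or width % downsample:
        raise ValueError("image size must be divisible by the downsample factor")
    return (height // downsample, width // downsample, channels)


def compression_factor(height, width, downsample=8, channels=4, pixel_channels=3):
    """How many times fewer values the latent holds than the RGB image."""
    h, w, c = latent_shape(height, width, downsample, channels)
    return (height * width * pixel_channels) / (h * w * c)


if __name__ == "__main__":
    print(latent_shape(512, 512))        # (64, 64, 4)
    print(compression_factor(512, 512))  # 48.0
```

Under these assumptions, a 512x512 RGB image becomes a 64x64x4 latent, so each denoising step touches 48 times fewer numbers than it would in pixel space.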
Emu was first pre-trained on 1.1 billion image-text pairs to learn general knowledge about translating text to images. Then it was fine-tuned on just a few thousand carefully selected, aesthetically pleasing images to improve the visual quality of its outputs.
It only takes “5 seconds” to generate an image with Emu, according to Zuckerberg, although he joked that this was still not fast enough for his children.
Meta compared Emu to the state-of-the-art SDXL v1.0 model and found that Emu was preferred 68.4% of the time for visual appeal on the standard PartiPrompts benchmark and 71.3% of the time on Meta's own Open User Input benchmark.
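If you are wondering what a "preferred X% of the time" number means mechanically, here is a minimal sketch of how such a rate can be computed from side-by-side judgments. It assumes each record is one pairwise vote naming the winning model; the votes are made up for illustration, and Meta's actual evaluation protocol may aggregate annotators differently.

```python
from collections import Counter


def preference_rate(votes, model):
    """Share of pairwise judgments in which `model` was preferred."""
    if not votes:
        raise ValueError("need at least one vote")
    return Counter(votes)[model] / len(votes)


if __name__ == "__main__":
    # Made-up votes for two hypothetical models, not Meta's data.
    toy_votes = ["A", "A", "B", "A", "B", "A", "A", "B", "A", "A"]
    rate = preference_rate(toy_votes, "A")
    print(f"Model A preferred {rate:.1%} of the time")  # Model A preferred 70.0% of the time
```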
Just take a look at this early preview of images generated by Emu. The quality is comparable to Midjourney's.
Prompt: A cool orange cat wearing sunglasses playing a guitar with a group of dancing bananas
Prompt: A traditional tea house in a tranquil garden with blooming cherry blossom trees
Prompt: The oil painting shows a cow standing near a tree with red leaves
Prompt: A bread, an apple, and a knife on a table
There is no dedicated website yet to generate images from text prompts.
But Emu is already being integrated and combined with other Meta AI models to create user-oriented features across the firm’s social networking and messaging applications, like Instagram.
If you want to learn more about Emu, check out the paper Meta released in September 2023, “Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack.”
Overall, I am glad to see Meta finally joining the AI image generation space. The initial images shown look quite impressive, rivaling other leaders in the field like DALL-E and Midjourney.
While details are still limited, Emu’s reported 5-second image generation time would make it one of the fastest models available. I am super pumped to try that out!
It will be interesting to see how Meta plans to launch Emu. Will it be a paid service like DALL-E? Free and open-source, like SDXL? Or something else entirely? I’m eager to learn more about pricing and availability.
One thing is clear: the AI art space is heating up. With tech giants like Meta now competing, we can expect rapid improvements in quality, speed, and access. As both an AI enthusiast and a content creator, I’m thrilled to see what emerges.