Grok Imagine lags behind its rivals in AI video generation

Over the weekend, Elon Musk’s artificial intelligence company xAI released Grok Imagine, a new generative AI tool for generating images and videos. Grok Imagine is available now to paid xAI subscribers in the Grok iOS and Android apps.
Musk has been hyping up the project on X, sharing photos and videos from Grok users. This includes some mildly NSFW content, which the Grok app labels as “Spicy.”
AI video is an exciting — and frankly terrifying — new frontier for the AI industry. To proponents, this technology gives artists a new medium for creativity and could reduce the costs of animation and filmmaking. To critics, AI video poses serious risks for sexual deepfakes and misinformation.
Putting aside that debate for the moment, I wanted to see how well Grok Imagine compares to xAI’s biggest rivals. As I’ve written previously, Google’s Veo 3 AI video model currently leads the field with surprisingly lifelike video. Then there’s Sora, from ChatGPT-maker OpenAI. Additionally, the popular AI image generator Midjourney recently introduced its own generative AI video tool.
So, how does Grok Imagine compare to its competitors? To be blunt, I’m not impressed.
Yes, Grok Imagine is brand new, and Musk recently said on X that it “should get better every day.” However, as of this writing, it seems to lag far behind its rivals.
Let me show my work.
Comparing Grok Imagine AI video to the competition
Mashable recently wrote about a viral AI video trend — security camera footage of animals jumping on trampolines and engaging in similar antics. So, I used a simple prompt to test Grok Imagine, Veo 3, Sora, and Midjourney: “Security camera footage of rabbits jumping on a trampoline at night.” Simple enough, right?
First, I should note that there’s a big difference between Veo 3 and Grok Imagine. Google’s Veo 3 model can generate videos based on a text prompt. Simply describe the video you want, and Veo 3 will do the rest. However, tools like Midjourney and Grok Imagine only offer text-to-image generation. After generating or uploading an image, users can then animate it, transforming it into a short video clip. In this sense, Grok Imagine is already on the back foot compared to OpenAI and Google.
With those caveats, let’s dive into the results, which I’ve also shared on X.
I put my test prompt into Grok, and it returned these disappointing images.

Credit: Screenshot courtesy of Grok / Timothy Beck Werth

Credit: Screenshot courtesy of Grok / Timothy Beck Werth
I selected the least bad of these images and created this short video:
It’s…fine? Kind of mid, or meh, as the kids say.
But it also suffers in comparison to other AI video tools.
As the video shows, Google Veo 3 and Sora did much better with the same prompt:
Finally, Midjourney, which animates still images much as Grok Imagine does, was able to produce better images and videos, though it took two attempts. The image and video it produced have the grainy look of surveillance footage.

AI-generated image.
Credit: Timothy Beck Werth / Midjourney
Audio is another major weakness for Grok Imagine. While Veo 3 can produce sound effects and coherent dialogue in sync with the video, the audio I’ve found on Grok Imagine videos is limited to rough sound effects and gibberish.
Musk compared Grok Imagine to a modern-day Vine app, writing on X, “Grok Imagine is optimized for most fun and shareable content.”
And in my initial tests, Grok Imagine seems optimized for creating two types of images and videos: memes and anime. If you want to animate memes — or create sexually suggestive videos of anime girls — then Grok Imagine will do the trick, I guess. But beyond that, I can’t say I’m impressed.
There is one area where Grok Imagine does shine, and that’s in terms of speed. So far, I’ve found it produces both images and videos significantly faster than its rivals.
Mashable reached out to xAI, and we’ll update this story if we receive a response.
Disclosure: Ziff Davis, Mashable’s parent company, filed a lawsuit in April against OpenAI, alleging it infringed Ziff Davis’ copyrights in training and operating its AI systems.