Global technology major Google has unveiled its own text-to-video AI system to rival Meta’s ‘Make-A-Video’ platform. The new system, Imagen Video, is an AI model that creates short video clips from text prompts.
Imagen Video is the second text-to-video AI to be announced in quick succession: it arrives six months after DALL-E 2, a text-to-image generator from OpenAI, and merely a week after Meta announced ‘Make-A-Video.’
According to the tech major, Imagen Video can produce videos at a resolution of 1,280×768 pixels and 24 frames per second, running no longer than 5.3 seconds. The model first takes a text description and generates a 16-frame video at 3 fps and a resolution of 24×48 pixels. The system then upscales the clip and “predicts” additional frames, producing the final 1,280×768 video at 24 frames per second.
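The figures above are consistent with each other, as a quick back-of-the-envelope check shows. The sketch below uses only the numbers quoted in the article; the variable and function names are illustrative and are not taken from Google's code, and the final frame count is derived here by assuming the clip keeps the same duration after upscaling.

```python
from fractions import Fraction

def clip_duration(frames: int, fps: int) -> Fraction:
    """Return the clip length in seconds as an exact fraction."""
    return Fraction(frames, fps)

# Figures quoted in the article for the initial low-resolution clip.
base_frames, base_fps = 16, 3
base_pixels = 24 * 48            # pixels per frame in the initial clip

# Figures quoted in the article for the final clip.
final_fps = 24
final_pixels = 1280 * 768        # pixels per frame in the final clip

duration = clip_duration(base_frames, base_fps)

# Assumption: the upscaled clip lasts as long as the initial clip,
# so the number of final frames is duration multiplied by final fps.
final_frames = duration * final_fps

# How many times more pixels each final frame has than an initial frame.
pixel_ratio = Fraction(final_pixels, base_pixels)

print(f"duration: {float(duration):.1f} s")
print(f"frames after interpolation: {final_frames}")
print(f"pixels per frame grow by a factor of about {float(pixel_ratio):.0f}")
```

Sixteen frames at 3 fps last about 5.3 seconds, which matches the maximum clip length Google cites, so most of the frames and nearly all of the pixels in the final video are filled in by the upscaling stages, not by the initial generation step.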
“We find Imagen Video not only capable of generating videos of high fidelity, but also having a high degree of controllability and world knowledge, including the ability to generate diverse videos and text animations in various artistic styles and with 3D object understanding,” Google said.
Imagen Video was trained on an “internal dataset” of 14 million videos and 60 million still images, along with another 400 million images from the open LAION-400M dataset.
The Imagen Video team plans to work with the researchers behind Phenaki, another text-to-video AI from Google that can turn detailed text prompts into videos longer than two minutes, though at lower quality.
The demos shared include videos generated from prompts such as “Coffee pouring into a cup,” “Wooden figurine surfing on a surfboard in space,” “Balloon full of water exploding in extreme slow motion,” and more.
Tech Observer Desk at TechObserver.in is a team of technology reporters, led by a senior editor, that brings the latest updates and developments from the world of technology.