AI Video Generation: Sora, Runway & Veo

The Year Video Stopped Requiring a Camera

In early 2024, AI-generated video meant nightmare fuel: melting faces, six-fingered hands, spaghetti-eating celebrities from the uncanny valley. Two years later, AI video generation produces clips with synchronized dialogue, consistent characters, and physics convincing enough that the average viewer scrolling a feed can't reliably tell them from footage shot on a camera. That is a genuinely startling rate of progress, and it's reshaping content creation faster than almost anyone predicted.

The moment that made it undeniable came last fall. When OpenAI launched Sora 2 alongside a TikTok-style social app on September 30, 2025, the invite-only app hit number one on Apple's App Store and passed a million downloads within five days. Suddenly AI video wasn't a research demo or a professional tool; it was a feed full of your friends, inserted into any scene imaginable. As of March 2026, the question isn't whether this technology matters. It's what it does to everyone who makes video for a living, and everyone who watches it.

The State of AI Video Generation: Who's Who in March 2026

Five players define the current landscape, each with a distinct personality.

OpenAI Sora is the consumer phenomenon. Sora 2 generates short clips with synchronized audio and dialogue, but its defining feature is "cameos": record yourself once, and you—face, voice, mannerisms—can star in any generated scene, as can friends who grant permission. The Sora app turned that into a social loop, expanding to Android in late 2025, and the result is a feed that's equal parts creative playground and copyright minefield (more on that shortly).

Runway is the filmmaker's pick. Its Gen-4 model, released in early 2025, made a leap that professionals had been begging for: consistent characters and objects across shots, the difference between a cool clip and an editable scene. Runway's Act-Two performance-capture tool maps a human actor's expressions onto generated characters, and the company has partnerships with Lionsgate and AMC Networks exploring AI in actual studio pipelines.

Google Veo is the technical heavyweight. Veo 3 stunned audiences at Google I/O in May 2025 by generating video with native synchronized audio—dialogue, ambient sound, effects—rather than bolting sound on afterward. Veo 3.1 followed in October with finer editing controls, and Google's Flow app packages it for narrative work, all distributed through the Gemini ecosystem.

Kling, from China's Kuaishou, has quietly become the volume leader among short-form creators worldwide, with aggressive pricing and strong motion quality; its early-2026 release pushed into multi-shot sequences with subject consistency. Pika rounds out the field by leaning into playful effects—swapping, inflating, exploding objects in existing footage—aimed squarely at the meme-and-Reels economy.

Tool             Standout strength                     Native audio  Best suited for
Sora 2 (OpenAI)  Cameos, social app, realism           Yes           Consumer creation, social video
Runway Gen-4     Shot-to-shot consistency, pro tools   Limited       Filmmakers, studios, agencies
Google Veo 3.1   Audio-video quality, prompt fidelity  Yes           Polished commercial work
Kling            Price-performance, motion             Yes           High-volume short-form
Pika             Effects and remixing                  Partial       Memes, social experiments

What These Tools Can and Can't Do

The honest scorecard matters, because hype cuts both ways.

What works now: photorealistic clips up to roughly 10 to 25 seconds, synchronized speech and sound, consistent characters across multiple shots, controllable camera moves, and editing operations like extending a shot or swapping an element. Stitch clips together with conventional editing software—even free consumer tools like CapCut—and small teams can produce work that once required a crew.

What still breaks: long-form coherence, for one. Nobody can prompt a watchable 90-minute film; features made with AI today are assembled from hundreds of short generations, with humans doing the connective storytelling. Physics still glitches under pressure—hands, crowds, liquids, fast object interactions. Fine-grained control remains the deepest gap: a director who wants the actor's eyebrow to lift exactly there, on that line, still gets approximations. And every model inherits biases and blind spots from its training data.

It's also worth naming the workflow reality, because marketing videos hide it. Practitioners describe a hit rate, not a button: you might generate eight or twelve takes of a shot to get one keeper, then color-grade and cut it like any other footage. The models are slot machines with very good odds, and professionals win by pulling the lever systematically—prompt libraries, reference images, seed control—rather than hoping for one perfect generation.
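To put rough numbers on that hit rate, here is a minimal back-of-the-envelope sketch in Python. It assumes each take is an independent draw with the same chance of being a keeper, which is a simplification, and the per-clip price is a made-up placeholder rather than any vendor's actual rate. The function names are ours, purely for illustration.

```python
def expected_takes(hit_rate: float) -> float:
    """Average number of generations needed to land one keeper.

    Treats each take as an independent draw with the same hit rate
    (a geometric distribution), which is a simplifying assumption.
    """
    if not 0 < hit_rate <= 1:
        raise ValueError("hit_rate must be in (0, 1]")
    return 1 / hit_rate


def chance_of_keeper(hit_rate: float, takes: int) -> float:
    """Probability that at least one of `takes` generations is usable."""
    if takes < 0:
        raise ValueError("takes must be non-negative")
    return 1 - (1 - hit_rate) ** takes


def cost_per_keeper(hit_rate: float, cost_per_clip: float) -> float:
    """Expected spend per usable shot, before editing and grading time."""
    return expected_takes(hit_rate) * cost_per_clip


if __name__ == "__main__":
    COST_PER_CLIP = 2.00  # hypothetical price in dollars, not a real quote

    for takes_per_keeper in (8, 12):  # the "eight or twelve takes" range
        p = 1 / takes_per_keeper
        print(
            f"hit rate 1 in {takes_per_keeper}: "
            f"~{expected_takes(p):.0f} takes on average, "
            f"~${cost_per_keeper(p, COST_PER_CLIP):.2f} per keeper, "
            f"{chance_of_keeper(p, takes_per_keeper):.0%} chance of a keeper "
            f"after {takes_per_keeper} takes"
        )
```

One thing the arithmetic makes plain: under these assumptions, running eight takes at a one-in-eight hit rate still leaves roughly a one-in-three chance of ending up with no keeper at all. That is why working systematically, and budgeting for more takes than the average suggests, matters more than the headline hit rate.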

The cost picture is the part that should recalibrate your intuitions. Generating with these tools costs dollars per clip, not thousands—but professional results still demand taste, iteration, and editing. The skill hasn't disappeared; it has moved from operating equipment to directing a system, a shift we've watched play out across image, text, and music throughout the generative AI era.

The Copyright Reckoning Arrived on Schedule

If 2025 proved the technology, it also detonated the legal questions. Within days of Sora's launch, the feed filled with copyrighted characters in absurd situations—and with hyperreal videos of celebrities, living and dead. OpenAI initially treated rightsholders' content as fair game unless they opted out; within about a week, facing an industry revolt, it reversed to an opt-in model with promised revenue sharing. The Motion Picture Association was blunt, insisting that preventing infringement was OpenAI's responsibility, not rightsholders'. After Bryan Cranston's voice and likeness appeared in unauthorized clips, OpenAI tightened its guardrails in coordination with SAG-AFTRA and major talent agencies.

The deeper fight is upstream: what these models were trained on. Disney and Universal sued Midjourney in mid-2025 over image generation, and the same logic looms over video models trained on decades of film and YouTube. Courts haven't settled whether that training is fair use, and the answer—different, probably, in the US, EU, and China—will shape which tools survive in their current form.

Labor is the parallel front. The 2023 Hollywood strikes won the first contractual guardrails on AI, but the September 2025 debut of "Tilly Norwood"—an AI-generated "actress" whose creators shopped her to talent agents—showed how fast the frontier moves, drawing immediate condemnation from SAG-AFTRA. Meanwhile, an AI-assisted animated feature, Critterz, is racing toward a 2026 festival debut on a budget under 30 million dollars and a timeline under a year—numbers that traditional animation simply cannot match. These are exactly the governance questions we explored in AI Ethics and Regulation: Taming the Algorithms, now with payrolls attached.

How AI Video Generation Is Already Changing Each Industry

The disruption isn't arriving evenly. A quick tour of the blast radius as of early 2026:

- Advertising is furthest along. Coca-Cola has now run AI-generated holiday campaigns two years running, and agencies routinely use AI video for storyboards, regional ad variants, and product visualizations. The economics are irresistible: dozens of tailored versions for the cost of one traditional shoot.
- Film and TV are in the experimentation phase. Studios use AI for previsualization, background plates, de-aging, and dubbing, while partnerships like Runway-Lionsgate probe deeper integration. Full AI features remain festival curiosities, for now.
- Social and creator content is being transformed in real time. The marginal cost of a visual gag dropped to near zero, and feeds are filling with synthetic clips. Platforms now require AI-content labeling, with uneven enforcement.
- Corporate and education video (training, explainers, localization) may be the quietest big winner, since "good enough, fast, and multilingual" is precisely what these tools deliver.
- Music videos and indie games are emerging beneficiaries, where surreal visuals are a feature rather than a bug and budgets were always the binding constraint.

Two second-order effects deserve attention. Stock footage marketplaces face an existential squeeze—why license a generic drone shot of a coastline when you can generate a bespoke one in a minute? And the dubbing and localization industry is being rebuilt around AI lip-sync, which lets a performance travel across languages with the actor's own face and voice intact, a capability studios were already piloting on theatrical releases by late 2025.

The flip side is trust. When anyone can fabricate convincing footage of anyone, every real video inherits a credibility tax—the so-called liar's dividend, where authentic evidence gets dismissed as fake. Provenance standards like C2PA content credentials are spreading across cameras and platforms, but the arms race is real, and knowing how to spot AI-generated content is becoming a basic media literacy skill.

Three Futures for Creators

Honest forecasting means holding multiple scenarios. Here are the three we find most plausible, starting with the likeliest.

Scenario one: the augmentation era (most likely). AI video settles in as the most powerful production tool since digital editing. Crews shrink, budgets compress, and output explodes, but human taste, story, and performance stay at the center—much as photography survived the smartphone by becoming ubiquitous. Licensing deals between AI labs and studios mature into a functioning market. Creators who learn to direct these systems prosper; pure technical roles consolidate painfully.

Scenario two: the synthetic flood. Generation gets so cheap that feeds saturate with infinite, personalized, disposable video—and attention, not production, becomes the only scarce resource. Mid-tier creators get squeezed hardest as algorithms favor whoever generates the most variants. Audiences eventually fracture: synthetic feeds for entertainment, with a premium on verified-human content, live events, and personality-driven work that machines can't fake.

Scenario three: the legal correction. Courts rule decisively against training on unlicensed work, or regulators impose strict consent-and-provenance regimes. Models retrain on licensed data, capabilities temporarily regress, costs rise, and the field tilts toward whichever companies—or countries—secured rights early. Disruption slows but doesn't reverse.

Reality will likely blend all three. The common thread: in every scenario, the distribution of skills changes faster than the demand for human judgment does. The creators who experiment now, treating these tools like the image generators that preceded them, as instruments to master rather than threats to ignore, hold the best hand.

The Takeaway: The Camera Is Optional, the Storyteller Isn't

Step back from the model names and the lawsuits, and the shape of the change is clear. For 130 years, video meant capturing photons—an inherently expensive act that gatekept who could tell stories at scale. That constraint is dissolving. What's left is the part that was always hardest: having something worth saying, and the taste to say it well.

There are real losses coming—jobs, certainty about what's real, perhaps some of the craft traditions that made cinema what it is. It would be glib to wave those away, and the next two years of court rulings and union contracts will matter enormously. But the historical pattern of creative technology, from the printing press to the DSLR, is that radically cheaper tools produce more creators, more experimentation, and more strange new forms than they destroy.

The key takeaway: AI video generation has crossed from demo to production tool, but not from tool to author. The leverage now belongs to people who can direct machines with intent—so if you make video, or want to, the smartest move in 2026 is to spend ten hours inside these tools and decide for yourself what they're good for.