🎥 Reviewing Runway's Gen-2 Update: Image to Video

🏔 Not yet out of the uncanny valley

The TLDR

Highlight: We experiment with Runway’s new image → video AI that promises to turn still images into short videos. Take a look to see how it performs across a variety of inputs. Spoiler alert: it’s not perfect.

Musing of the Week: Google, Microsoft, and OpenAI aren’t just in tech; they’re in politics, too. Politico has reported that Google has assembled a lobbying team to convince lawmakers that its ambitions in healthcare AI are both safe and good for the market. Our thoughts below.

🎥 Image → Video with Runway’s Gen 2 AI

With the proliferation of image and text AIs over the past year, it’s notable that we still haven’t seen a significant breakthrough in generative AI video. There are some cool projects with AI talking heads on platforms like Synthesia and D-ID, and some simple stock-footage assemblers like Invideo AI. However, we have yet to see a generative model that can create diverse and novel videos.

Considering that the average person consumes 17 hours of video content every week, companies have a major incentive to get a generative video product to market. When Runway’s image-to-video AI started going viral recently, it definitely piqued our interest: was this the next big step toward that future? Runway advertises its Gen-2 AI’s ability to “Bring your image to life” by generating video from an image prompt, a text prompt, or a combination of the two.

Hands-On Testing

To gauge the efficacy of this tool, we decided to test the model using various images sourced from Midjourney.

  • Nighttime City Long Exposure: Our first experiment was a cityscape with light trails from moving cars, aiming for a timelapse effect. Using the paint tool to focus the AI’s attention on the roads, we tested generation with and without a text prompt. Runway did a decent job in both, though each clip had significant artifacts – gradual darkening in the first and warping/flashing in the second.

    Cityscape road highlighting with no text prompt

    Cityscape road highlighting with text prompt

  • Abstract Paint Swirls: After seeing the warping in the previous clips, we decided to test the model on something abstract and organic. This seemed to be where the model did its best work. If we had to guess, it’s because there are fewer details and fewer specific expectations for how elements are supposed to move.

  • Geometric Abstract Pattern: Runway excelled with organic patterns but not as much with geometric ones. Though it produced movement, it often felt like warping rather than coherent motion.

  • Surrealist(ish) Landscape: This one might actually be our favorite. Admittedly, not everything is doing what it’s supposed to: the clouds are moving in impossible directions, and the water has some strange artifacts when you look closely. Still, it all sort of works with the surrealist(ish) image to produce an interesting effect.

  • Realistic Dog in a Field: The biggest challenge was with living subjects, such as humans and animals. Unfortunately, Runway struggled significantly here, sliding firmly into the uncanny valley right off the bat.

  • Stormy Ocean: While the Gen-2 AI struggled with living things, it didn’t always fall flat with realism. Though this clip veered off stylistically in the second half, the AI did a pretty good job predicting the trajectory of the waves.

Is Runway’s Gen-2 AI the big breakthrough in generative video?

Runway is fun to play with, and its new model is a step in the right direction. However, its limitations currently relegate it to more of a novelty than a revolutionary tool in content creation.

After experimenting for a while, we were convinced that people burned through a lot of credits to generate the viral videos we’d been seeing, which doesn’t make the tool particularly viable for ongoing content creation.

Why don’t we have good generative video yet?

The mixed results from these tests underline the fact that generative video is a very difficult technical problem. A good video-generating AI needs not only to produce excellent images but also to keep those images stylistically and compositionally consistent across a sequence that makes visual sense to an audience. An excellent one would need to go a step beyond that, generating sequences that follow not only visual logic but narrative and structural logic as well.

Final Thoughts

While it's worth trying Runway's new tools, don't expect them to generate your favorite TV show or even a viral clip just yet. Compared to pure image and text AIs, this technology still feels quite early in its development, and we struggle to see clear utility beyond the gimmick in what’s available right now.

🧠 Musing of the Week

The risk-reward tradeoff of AI integration is different in every industry, and companies that deal with life and death need to prioritize responsible development over racing to market. Healthcare is particularly tricky (see United Healthcare’s 90% error rate in a health coverage algorithm), as companies try to use AI to triage, cut costs, and make up for labor shortages. Too much is happening, too fast.

Politico recently reported that Google is continuing to expand its position in personalized healthcare AI and has assembled a lobbying group to frame the narrative and influence lawmakers. In the article, Sen. Mark Warner (D-Va.) is quoted saying “There is great promise in many of these tools to save more lives, [and] they also have the potential to do exactly the opposite — harm patients and their data, reinforce human bias, and add burdens to providers as they navigate a clinical and legal landscape without clear norms.” As we know, tech literacy in government is all over the map. Sen. Warner’s statement implies that, at least for some lawmakers, the concerns about AI are less about the concept and more about trust in the handler’s incentives and goals.

There’s a lot of cash flowing into big tech lobbying. For reference, Apple, Amazon, Google, and Meta collectively spent a record-breaking $55M on lobbying in 2021. Google/Alphabet is covering all of its bases when it comes to AI, participating in lobbying as described above as well as in the Frontier Model Forum, an industry body that includes Microsoft, Alphabet, OpenAI, and Anthropic.

Why is this all so hard? We’re dealing with many unknown variables:

  • Tech literacy in government (which directly affects policy)

  • Big tech’s incentives

  • The velocity and direction of AI research

  • The tumultuous nature of the tech industry, evidenced by the recent Sam Altman saga

As Google navigates these waters, its interactions set precedents that will profoundly impact the industry. They could pave the way for breakthroughs in personalized medicine and healthcare accessibility, while also raising critical questions about data stewardship and the balance of power between tech giants and the public interest.

The state of tech literacy in Congress (pictured: TikTok CEO, Shou Chew)

🙌 If you’re hyped about the generative AI industry specifically, here are some of the coolest roles we’ve seen this week:

🔨 Check out these other AI tools we’ve been looking at this week (creator edition):

  • Blue Willow: Free AI artwork generator (like DALL·E)

  • QuickAds AI: On-brand ad and social post generator

  • Perplexity: Free, GPT-3-based chatbot with web access

That’s all for this week. See you next Tuesday!

Lorel & Reily