Runway faces backlash after report of copying AI video training data from YouTube




Runway, an AI video software and model startup that has raised hundreds of millions of dollars from backers including Google, is in hot water with creators following a report today by 404 Media on a spreadsheet that allegedly shows the company set out to copy data from thousands of YouTube videos.

404 Media reports that a former employee of Runway leaked it a company spreadsheet allegedly showing Runway’s plans to categorize, tag, and train on “YouTube channels of thousands of media and entertainment companies, including The New Yorker, VICE News, Pixar, Disney, Netflix, Sony, and many other,” and that this data informed a product called “Jupiter,” which 404 says is Runway’s Gen-3 AI video creation model.

Individual YouTubers with large followings such as “Casey Neistat, Sam Kolder, Benjamin Hardman, Marques Brownlee” were also included in the spreadsheet.

We’ve reached out to Runway to verify the authenticity of the spreadsheet and will update when we hear back.

Fruit from the poisonous tree behind Gen-3 Alpha?

Runway revealed Gen-3 Alpha, an early version of the software, last month to acclaim for its realism, and began allowing the public to use it a few weeks ago.

404 Media published a redacted Google Sheets copy of the alleged Runway spreadsheet online as a link within its article, showing more than 3,900 individual YouTube channels and a column with hashtags of different content contained therein.

Another tab of the spreadsheet labeled “high_camera_movement” includes more than 177 distinct YouTube accounts.

Rubbing creators and critics the wrong way

404 Media notes in its report that it “couldn’t confirm that every single video included in the spreadsheet was used to train Gen-3—it’s possible that some content was filtered out later or that not every single link on the spreadsheet was scraped.” Still, the existence of the spreadsheet, and the implication that any or all of these YouTube videos may have been copied, downloaded, or otherwise analyzed by Runway engineers or machine learning algorithms to inform its Gen-3 Alpha model (or any other product, for that matter), has rubbed many creators and critics of generative AI the wrong way.

Influential tech reviewer and YouTuber Marques Brownlee, a.k.a. MKBHD, posted “well well well” on X along with a melting smiley face emoji. Brownlee has been critical in the past of others training AI on his videos.

Yet he’s also expressed excitement and enthusiasm for AI video technology such as OpenAI’s Sora in a prior video.

Ed Newton-Rex, founder and CEO of the ethical AI certification startup Fairly Trained, has posted several times on X highlighting the various notable names included in the alleged Runway spreadsheet, among them YouTube channels for musician Taylor Swift and filmmaker Wes Anderson.

YouTuber Omni or “Lay It Omni” called the spreadsheet “INSANE” in an X post and accused Runway of theft.

Even AI filmmakers who have created with Runway’s tools in the past, including Dustin Hollywood, have criticized the company for what they view as theft.

Yet as I pointed out in a reply on X to Hollywood, multiple companies have already been accused of using, or found to have used, copyrighted videos to train their models without express permission, authorization, or payment.

Indeed, just recently, Wired magazine (where my wife works as Editor-in-Chief) published a piece in conjunction with Proof News that found such big names as Apple, Nvidia, and the AI startup Anthropic (maker of Claude 3 Sonnet and the Claude family of models) also trained AI models on YouTube video transcripts without authorization.

My take is that scraping and training, while controversial, are legal and supported by the precedent set by Google in scraping the web and indexing it for search. But we’ll see if this holds up in court, as Runway is already one of many AI companies being sued by creators for training on their data without permission or compensation. And in the court of public opinion, Runway appears to have taken a big hit today.



By stp2y
