Snapshot 3: Wed, Jul 17, 2024 8:54:00 PM GMT, last edited by KateHennig

Report: Apple, Nvidia Trained AI Models on YouTube Captions Without Permission

    Image copyright: Christian Wiediger via Unsplash

    The Facts

    • An investigation has claimed that a dataset used to help train artificial intelligence (AI) models from companies such as Apple, Anthropic, and Nvidia contains subtitles from YouTube videos that were included without the consent of the content creators.

    • YouTube Subtitles, which is part of a large dataset known as The Pile, contains captions from over 173K videos that span 48K channels. Taking data from the platform without prior approval would violate YouTube guidelines.

    • The dataset was first released in 2020. A Google spokesperson said that the company has taken action against "abusive, unauthorized scraping." Channels in the dataset include Harvard, MrBeast, and the BBC.


    The Controversies



