OpenAI, makers of the AI large-language model chatbot ChatGPT, announced on Thursday a deal with social media platform Reddit to give ChatGPT direct access to content from Reddit.
In a joint statement, the two companies said the deal would allow OpenAI to "better understand and showcase Reddit content, especially on recent topics" on their products, as well as help train their large-language model.
While this might be a boon for AI research, it will be a disaster for privacy and the rights of users on Reddit and other sites that have been used for AI training. Large-language models vacuum up massive amounts of data as part of their training, and Reddit users could have their every word added to ChatGPT and potentially exposed to the world. The opaque inner workings of these products also mean that misleading or defamatory information might be spat out as their knowledge base increases, with little recourse for those affected. More guardrails are needed to ensure that training models respects our privacy.
If we want AI tools to continue to grow in sophistication, then we have to accept that the models have to be trained on a large volume of high-quality data. Instead of a "wild-west" situation where designers scrape online data without authorization, a recent turn towards licensing deals will ensure that training data is managed in a conscientious way that is fair to all parties involved.