The thought of Deep Scratch ending up in a training dataset strikes me as a strange loop: it would fold right back into the book's own AI themes. But, alas, it probably would not be picked up.
The Books3 dataset contains 183,000 books downloaded from pirate sources. We know that companies such as Meta (creator of LLaMA), EleutherAI, and Bloomberg have used it to train their language models. OpenAI has not disclosed training details for GPT-3.5 or GPT-4, the models underlying ChatGPT, so we don't know whether it also used Books3. Either way, the class-action lawsuits against OpenAI should uncover more about the datasets it used, which we believe also include books obtained from pirate sources.
https://authorsguild.org/news/you-just-found-out-your-book-was-used-to-train-ai-now-what/