Date: 2023-08-08

Numerous authors were shocked to discover that their books had been uploaded and scanned into a large dataset without their permission. The project, known as Prosecraft, was developed by cloud word processor Shaxpir and collected over 27,000 books. It aimed to compare and analyze the “vividness” of language used in these books.

Authors such as Maureen Johnson and Celeste Ng criticized Prosecraft for using their books to train a model without obtaining consent. In response to the online backlash, the website was eventually taken down.

Benji Smith, the creator of Prosecraft, clarified that it was not a generative AI tool. However, he acknowledged the concerns raised by authors, explaining that he had gathered a quarter billion words from published books by crawling the internet. The website presented two paragraphs from each book, highlighting the “most passive” and “most vivid” ones. The books were then ranked based on their level of vividness, length, and passivity.

Authors argued that the excerpts on Prosecraft included major spoilers, further exacerbating the situation. This incident added to their frustration with the increasing use of AI tools without their consent.

The rise of generative AI and self-publishing technology has created an environment where unethical activities can thrive. Amazon, for example, has been inundated with low-quality, AI-generated travel guides and children’s books. Real authors, meanwhile, find themselves facing plagiarism as their work is used to train AI models.

Author Jane Friedman revealed that someone was selling books on Amazon under her name that appeared to have been written with AI assistance. She managed to have the fake books removed from her Goodreads page, but Amazon requires a trademark before it will remove them from sale.

Although some authors remain unconcerned about AI’s impact on literature, there is a worry that publishers might be convinced to replace marketing and publicity teams with AI-generated promotional content. Overall, these incidents have left authors feeling frustrated and helpless.