
Small publishers are sick and tired of AI models helping themselves to their data like it was a bowl of chips at a party.
What happened: Cloudflare customers can now monitor bots scraping their websites for material to train AI models and choose which ones to block. The firm also said it’s launching a marketplace next year where sites can sell AI companies the rights to mine their content.
Why it matters: While prominent publishers and websites like Time or Reddit have the means to strike lucrative AI partnerships, the same isn’t necessarily true of smaller organizations that are at risk of losing traffic to AI models actively siphoning their content.
- A lack of compensation could push publications to either block these models or go out of business. Either outcome would be bad news for AI companies, which need a wellspring of high-quality data to improve their models, lest they start regressing.
Zoom out: Cloudflare isn’t the only group trying to figure out a compensation solution for smaller publishers. AI licensing companies formed a trade group this summer to try to set industry standards, while startup ProRata is working on an algorithm that reviews AI-generated content, identifies the exact source material, and determines how much each source should be paid.—QH