Creative Commons, the non-profit that pioneered sharing content through permissive licensing, is launching CC Signals, a framework to signal permissions for content use by machines in the age of artificial intelligence. “They are both a technical and legal tool and a social proposition: a call for a new pact between those who share data and those who use it to train AI models,” says Creative Commons CEO Anna Tumadóttir, noting the signals are “based on a set of limited but meaningful options shaped in the public interest.” The framework is designed to bridge the openness of the Internet with AI’s insatiable demand for training data, according to Creative Commons.
The organization developed a system for open licensing that is used globally by content owners who want to retain copyright while still allowing the sharing and reuse of material including photos, music, video and writing.
“Demand is increasing for such a tool, as companies grapple with changing their policies and terms of service to either limit AI training on their data or explain to what extent they’ll utilize users’ data for purposes related to AI,” TechCrunch reports, pointing out that X initially let third parties use its public data to train models, then reversed that decision.
Other platforms have cobbled together solutions, with Reddit using robots.txt to tell bots if they can crawl its site for AI training, while Cloudflare is exploring for-fee scraping, TechCrunch adds. “Open-source developers have also built tools to slow down and waste the resources of AI crawlers that didn’t respect their ‘no crawl’ directives.”
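The robots.txt mechanism Reddit relies on is the long-standing Robots Exclusion Protocol, and crawlers that honor it check the file before fetching pages. A minimal sketch with Python's standard-library parser shows how such a check works; the `GPTBot` rule and the example.com URL here are illustrative, not taken from any particular site's policy:

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: disallow one AI crawler, allow all others.
# The user-agent names and site below are assumptions for this sketch.
rules = [
    "User-agent: GPTBot",
    "Disallow: /",
    "",
    "User-agent: *",
    "Allow: /",
]

parser = RobotFileParser()
parser.parse(rules)

# A well-behaved crawler checks its own user agent before fetching:
print(parser.can_fetch("GPTBot", "https://example.com/post/1"))      # False
print(parser.can_fetch("ArchiveBot", "https://example.com/post/1"))  # True
```

Note that robots.txt is purely advisory: nothing in the protocol prevents a crawler from ignoring the directives, which is why the open-source countermeasures mentioned above exist.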
Creative Commons is stepping in to provide a more consistent and reliable solution, warding off this hodgepodge of approaches and thwarting what it fears may be a knee-jerk reaction to wall off sites, the organization explains in a blog post.
“We need to collectively insist on a new kind of give-and-take,” Creative Commons General Counsel Sarah Hinchliff Pearson said in announcing the framework, available for download on GitHub. “A single preference, uniquely expressed, is inconsequential in the machine age, but together, we can demand a different way.”
“The new tool will offer a limited set of clear options for reuse, with an emphasis on transparency and ethical practice,” writes PetaPixel, which says the project launch “coincides with a growing debate about how AI models are trained and the role that publicly available content should play.”
According to Maginative, the signals are “both human- and machine-readable, with flexible enforceability.” An alpha release is planned for November 2025.