With the proliferation of AI tools in recent months, many fans have voiced concerns regarding data scraping and AI-generated works, and how these developments can affect AO3. We share your concerns. We'd like to share what we've been doing to combat data scraping and what our current policies on the subject of AI are.
Data scraping and AO3 fanworks
We've put in place certain technical measures to hinder large-scale data scraping on AO3, such as rate limiting, and we're constantly monitoring our traffic for signs of abusive data collection. These measures apply to everyone: we do not make exceptions for researchers or those wishing to create datasets. However, we don't have a policy against responsible data collection, such as that done by academic researchers, by fans backing up works to the Wayback Machine, or by Google's search indexing. Putting systems in place that attempt to block all scraping would be difficult or impossible without also blocking legitimate uses of the site.
With that said, it is an unfortunate reality that anything that is publicly available online can be used for reasons other than its initial intended purposes. In many cases, AI data collection traffic relies on the same techniques as the legitimate use cases above.
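To illustrate the rate limiting mentioned above, here is a minimal sketch of one common approach, a per-client token bucket. It is not AO3's actual implementation, which this post does not describe; the class name, parameters, and limits are all hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    """Hypothetical per-client token bucket (illustration only).

    Each client may burst up to `capacity` requests and regains
    `refill_per_second` tokens for every second that passes.
    """

    def __init__(self, capacity: int, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        # client id -> (tokens remaining, time of last update)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, client_id: str) -> bool:
        """Return True if this request may proceed, False if it is throttled."""
        now = self.clock()
        tokens, last = self._buckets.get(client_id, (float(self.capacity), now))
        # Top up for the elapsed time, never above the bucket's capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill_per_second)
        if tokens >= 1:
            self._buckets[client_id] = (tokens - 1, now)
            return True
        self._buckets[client_id] = (tokens, now)
        return False
```

A limiter like this sees only request volume, not intent, which is why it slows a bulk scraper and an over-eager backup script alike while leaving ordinary browsing untouched.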
Once we became aware that data from AO3 was being included in the Common Crawl dataset — which is used to train AI such as ChatGPT — we put code in place in December 2022 requesting Common Crawl not scrape the Archive again.
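A request of this kind is conventionally made through a site's robots.txt file. The sketch below assumes that mechanism and uses CCBot, the user-agent token Common Crawl documents for its crawler; the rule text and the example URL are illustrative and are not copied from AO3's actual configuration. It uses Python's standard-library parser to show how a compliant crawler would interpret such a rule.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rule only, assumed for this example:
# ask Common Crawl's crawler (CCBot) not to fetch any page.
ROBOTS_TXT = """\
User-agent: CCBot
Disallow: /
"""


def can_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://archiveofourown.org/works/1"  # hypothetical example URL
    print(can_fetch("CCBot", url))           # the rule targets this crawler
    print(can_fetch("ExampleBrowser", url))  # hypothetical agent, no rule applies
```

Note that robots.txt is a request rather than an enforcement mechanism: it only affects crawlers that choose to honor it, which is consistent with the post describing this step as "requesting".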
We cannot go back in time to stop data collection that already occurred, or remove AO3's content from existing datasets, as much as we may dislike that it happened. All we can do is attempt to reduce such collection in the future. The Archive's development team will continue to be on the lookout for individual scrapers collecting AO3 data, and to take action as needed.
Likewise, our Legal committee has served and will continue to serve the OTW mission of protecting fanworks from legal challenge and commercial exploitation. This includes their position that users should be allowed to opt out of having their works incorporated into AI training sets, a position that they have presented to the U.S. Copyright Office. They, too, will continue to keep pace with this developing field.
What can I do to avoid data scraping?
You may want to restrict your work to Archive users only. While this will not block every potential scraper, it should provide some protection against large-scale scraping.
AI-generated works and AO3 policies
At the moment, there is nothing in our Terms of Service that prohibits fanworks that are fully or partly generated with AI tools from being posted to AO3, if they otherwise qualify as fanworks.
Our goals as an organization include maximum inclusivity of fanworks. This means not only the best fanworks, or the most popular fanworks, but all the fanworks that we can preserve. If fans are using AI to generate fanworks, then our current position is that this is also a type of work that is within our mandate to preserve.
Depending on the circumstances, AI-generated works could violate our anti-spam policies (e.g. if a creator posts a significant number in a short time). If you're uncertain whether a work violates our Terms of Service, you may always report it to our Policy & Abuse team using the link at the bottom of any page, and they can investigate.
This statement reflects AO3’s policy at the time of writing, as we wanted to be transparent with our users about what our current stance is and what can be done – and is being done – to mitigate scraping for AI datasets. However, these policies are also under discussion internally among AO3 volunteers. If we agree on changes to these in the future, those will be announced publicly; additionally, if there are any proposed changes to the AO3 Terms of Service, they will be made available for public comment as is required of any and all changes to our Terms of Service.
We hope that this helps to make things clearer – this is a complicated situation, and we’re doing our very best to address it in a way that doesn’t compromise AO3’s principles of maximum fanwork inclusivity or legitimate uses of the site. As discussions and approaches evolve, we will keep our users updated.