In a blog post released on November 10, 2025, the Wikimedia Foundation urged AI companies, developers and large-scale users to stop scraping Wikipedia’s web pages en masse and instead access its content through its opt-in paid service, Wikimedia Enterprise.
The foundation explained that scraping the live site at scale “severely taxes Wikipedia’s servers” and runs counter to its mission of sustaining a volunteer-edited, free encyclopedia. By using the paid API, companies can access structured, large-volume data in a way that supports the nonprofit mission.
While Wikipedia did not explicitly threaten legal action, the message is clear: rely on the official channel if you’re extracting large amounts of data for AI or large-scale use, and give proper attribution to the thousands of volunteer contributors.
Why Wikipedia Is Making This Move
Traffic Drop and Bot Scraping
The foundation reported an 8% year-over-year decline in human page views. The drop came to light after it updated its bot-detection systems and discovered that much of the traffic surge in May and June 2025 came from sophisticated AI bots imitating human users.
These bots neither served Wikipedia’s mission nor donated; instead, they consumed server resources, potentially degraded the user experience, and contributed nothing to the community of editors. The foundation fears that “with fewer visits to Wikipedia, fewer volunteers may grow and enrich the content, and fewer individual donors may support this work.”
Cost and Sustainability
Wikipedia is largely supported by donations and volunteer edits. When AI systems divert traffic (or never send traffic) and scrape content without attribution or financial support, the sustainability of the platform comes into question. The Wikimedia Enterprise product allows companies to contractually access its content, providing revenue and ensuring that usage does not degrade the service.
Attribution and Trust
The foundation emphasised that generative-AI systems should properly attribute Wikipedia content when using it as a dataset. “For people to trust information shared on the internet, platforms should make it clear where the information is sourced from and elevate opportunities to visit and participate in those sources,” the blog reads.
What the Paid Product Offers
The Wikimedia Enterprise API is tailored for high-volume access by companies, search engines and AI platforms:
- It supports enterprise-grade service-level agreements (SLAs), structured metadata, and large data extracts and downloads.
- It reduces strain on Wikipedia’s public site by keeping large automated traffic off the live front end.
- It reinforces Wikipedia’s mission by turning usage into revenue that can support the volunteer editor base and infrastructure.
In effect, Wikipedia is asking: if you’re going to build AI models on our content, please pay us and give us credit — don’t treat us as a free data mine.
For AI Companies
Developers building language models, knowledge graphs, or search/assistant tools often rely on Wikipedia as a major dataset. The shift signals that free, unattributed scraping is becoming less viable, both technically (because of bot detection) and ethically.
AI firms must evaluate whether to:
- Transition to the paid API and integrate attribution workflows
- Reduce reliance on scraped Wikipedia content or develop alternate datasets
- Consider the financial and licensing implications of training/deploying models that rely on community-sourced knowledge
For Wikipedia and Open Knowledge Platforms
This is a watershed moment: open-knowledge platforms may start charging for industrial use of content even if the content remains free to human readers. The sustainability model may shift from one based purely on donations to a hybrid of donations and enterprise-access revenue.
It also raises questions about how volunteer-generated content is monetised when large corporate entities benefit from it.
For the Web and Data Rights
The move highlights a broader tension: who pays for the knowledge on which AI models are built? Websites, open-content communities and knowledge platforms are increasingly asking for either credit or compensation when their content is used at scale for AI. This may accelerate legal, regulatory and business-model changes around scraping, licensing and attribution.
Wikipedia’s appeal to AI companies marks a pivotal moment in the age of artificial intelligence and open knowledge. By asking companies to stop scraping webpages and instead use a paid API (while also demanding proper attribution), the Wikimedia Foundation is staking a position: the free-knowledge ecosystem must be respected, financially supported and credited if it is to sustain itself in the AI era.
For the AI industry, the message is unavoidable: free access to community-generated knowledge is no longer a given. How companies respond, whether through licensing, attribution or alternative data sourcing, will shape both the future of model training and the long-term health of open-knowledge platforms.



