# Study Finds Roughly a Third of New Websites Are AI-Generated

_Tuesday, April 28, 2026 at 4:15 AM EDT_

![Study Finds Roughly a Third of New Websites Are AI-Generated — Primary](https://storage.ghost.io/c/0f/76/0f76b548-bc58-4f25-abc3-3f5ebca07da4/content/images/size/w1200/2026/04/alex-shuper-nACMb7M2RHI-unsplash.jpg)

Researchers analyzing Internet Archive data have found that by mid-2025 approximately 35 percent of newly published websites were AI-generated or AI-assisted. The team, which includes researchers from Stanford, Imperial College London, and the Internet Archive, published its findings online in a paper titled "The Impact of AI-Generated Text on the Internet."

The researchers sampled websites archived over a 33-month period between August 2022 and May 2025 and used the AI-detection tool Pangram v3 to identify synthetic content. They found that the share of AI-generated sites rose from zero before ChatGPT's launch to roughly 35 percent by mid-2025. "I find the sheer speed of the AI takeover of the web quite staggering," said Jonáš Doležal, a Stanford AI researcher and co-author of the paper. "After decades of humans shaping it, a significant portion of the internet has become defined by AI in just three years."

The study tested six common concerns about AI-generated text. The researchers examined whether it shrinks viewpoint diversity, spreads disinformation, makes online writing more sanitized and cheerful, fails to cite sources, lowers semantic density, and forces writing into a monoculture where unique voices disappear. To test for disinformation, the team extracted fact-based claims from selected websites and paid human fact-checkers to verify them. To assess source citation, they computed outbound link density in AI-generated text.
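As a rough illustration of the citation measure, outbound link density can be sketched in a few lines of Python. This is not the researchers' code, and the article does not give their exact definition. The sketch assumes that "outbound" means an `http(s)` link whose hostname differs from the page's own, and that density is normalized per 1,000 visible words. The function name `outbound_link_density` and its `per_words` parameter are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class _LinkAndTextCollector(HTMLParser):
    """Collects <a href> targets and counts visible words on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []
        self.words = 0
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Count whitespace-separated tokens, ignoring script/style bodies.
        if self._skip_depth == 0:
            self.words += len(data.split())


def outbound_link_density(html, page_url, per_words=1000):
    """Outbound links per `per_words` visible words (assumed definition)."""
    parser = _LinkAndTextCollector()
    parser.feed(html)
    parser.close()

    page_host = urlparse(page_url).hostname
    outbound = 0
    for href in parser.hrefs:
        # Resolve relative links against the page URL before comparing hosts.
        target = urlparse(urljoin(page_url, href))
        is_web_link = target.scheme in ("http", "https")
        if is_web_link and target.hostname and target.hostname != page_host:
            outbound += 1

    if parser.words == 0:
        return 0.0
    return outbound * per_words / parser.words
```

Under these assumptions, relative links and `mailto:` links are not counted as outbound. Comparing the resulting densities between pages flagged as AI-generated and pages that are not would be one way to approximate the citation test described above.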

Of the six hypotheses, only two appeared to hold true. AI-generated text was less semantically diverse and more positive in tone overall. The researchers did not find an increase in verifiably false statements. "The most surprising result was that our Truth Decay hypothesis wasn't confirmed," Doležal said. "It could still be the case that AI is quietly increasing the volume of unverifiable claims, or it may simply be that the internet wasn't a particularly truth-adhering place to begin with."

The team is now working with the Internet Archive to turn the analysis into a continuous monitoring tool rather than a single snapshot. Maty Bohacek, a student researcher at Stanford and co-author, said they are interested in adding more granularity by looking at which kinds of websites are most affected, broken down by category or language.

## Sources

- [404 Media](https://www.404media.co/study-finds-a-third-of-new-websites-are-ai-generated/)

---
Canonical: https://techandbusiness.org/newswire/nboQQLUk2FOYJmEHiPKd2I
Retrieved: 2026-04-28T11:54:23.625Z
Publisher: Tech & Business (techandbusiness.org)
