# 70% of new software engineering papers on arXiv are LLM related

_Friday, June 26, 2026 at 4:39 PM EDT · science · Latest · Tier 2 — Notable_

An analysis of papers published on arXiv shows that 70 percent of new software engineering papers are related to large language models. The finding is based on data from the cs.SE subcategory, which covers software engineering research.

Preprint servers allow fast publication of research, and arXiv serves as the main open access archive for papers in computing and related fields. The author of the analysis reads the abstracts of new uploads every weekday, looking for relevant work.

A total of 15,899 papers submitted to the subcategory since January 2022 were analyzed. The arxivscraper Python package retrieved the metadata. Regular expressions identified papers whose titles or abstracts contained any of the terms "llm", "large language model", "ai", "artificial intelligence" or "agent".
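The term matching described above can be sketched roughly as follows. This is a minimal illustration, not the patterns used in the original analysis: the word-boundary handling, the optional plural "s", the case-insensitive flag, the helper names `is_llm_related` and `share_matching`, and the sample records are all assumptions made for the example.

```python
import re

# Search terms as listed in the article. How the original analysis built
# its patterns is not stated; word boundaries, an optional plural "s" and
# case-insensitivity are assumptions made for this sketch.
TERMS = ["llm", "large language model", "ai", "artificial intelligence", "agent"]
PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(term) for term in TERMS) + r")s?\b",
    re.IGNORECASE,
)


def is_llm_related(text: str) -> bool:
    """Return True if the text contains any search term as a whole word."""
    return PATTERN.search(text) is not None


def share_matching(papers: list, field: str) -> float:
    """Percentage of papers whose given field ("title" or "abstract") matches."""
    if not papers:
        return 0.0
    hits = sum(1 for paper in papers if is_llm_related(paper[field]))
    return 100.0 * hits / len(papers)


# Hypothetical records standing in for scraped cs.SE metadata.
SAMPLE = [
    {"title": "LLM agents for automated program repair",
     "abstract": "We study large language models for fixing bugs."},
    {"title": "A study of code review latency",
     "abstract": "We maintain that review delays hurt teams."},
    {"title": "Test flakiness in CI pipelines",
     "abstract": "We use AI to triage flaky tests."},
    {"title": "Refactoring legacy COBOL",
     "abstract": "An empirical study of refactoring practice."},
]

if __name__ == "__main__":
    print("title share:", share_matching(SAMPLE, "title"))
    print("abstract share:", share_matching(SAMPLE, "abstract"))
```

The word boundaries matter for a short term like "ai": without them, words such as "maintain" would count as matches and inflate the share.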

The percentage of papers with these terms in titles peaked at the end of 2024. The percentage in abstracts peaked or plateaued near the end of 2025. If the growth rate stays the same, the share could reach 100 percent in around 18 months. The analysis expects the growth to slow before that level is reached.
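The arithmetic behind such a projection can be illustrated with a straight-line extrapolation. The article does not state which growth model or rate the analysis used, so the linear assumption and the function names below are hypothetical. Only the 70 percent share and the 18 month horizon come from the text.

```python
def implied_rate(current_share: float, target_share: float, months: float) -> float:
    """Percentage points per month needed to go from current to target share."""
    if months <= 0:
        raise ValueError("months must be positive")
    return (target_share - current_share) / months


def months_to_reach(current_share: float, target_share: float,
                    points_per_month: float) -> float:
    """Months until the target share is reached at a constant linear rate."""
    if points_per_month <= 0:
        raise ValueError("points_per_month must be positive")
    return max(0.0, (target_share - current_share) / points_per_month)


if __name__ == "__main__":
    # Assumption: linear growth from 70% to 100% over 18 months.
    rate = implied_rate(70.0, 100.0, 18.0)
    print(f"implied rate: {rate:.2f} percentage points per month")
    print(f"months to 100%: {months_to_reach(70.0, 100.0, rate):.1f}")
```

Under these assumptions the implied rate is 30 / 18, or about 1.7 percentage points per month. A share is bounded at 100 percent, so a constant rate cannot hold indefinitely, which is consistent with the expectation that growth slows before that level is reached.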

A replication of the analysis was carried out by Martin Monperrus.

## Sources

- [The Shape of Code](https://shape-of-code.com/2026/03/22/70-of-new-software-engineering-papers-on-arxiv-are-llm-related/)

---
Canonical: https://techandbusiness.org/newswire/dwShKCC5FBZlnWiQ1Ppoew
Retrieved: 2026-06-27T01:01:17.211Z
Publisher: Tech & Business (techandbusiness.org)
