---
title: "The State of Agent Readability on the Web"
description: "We scored the 50,074 most-visited websites for how well an AI agent can discover, parse, and comprehend them. The median scores 52 of 100, not one scored excellent, and roughly three in four haven't shipped the agent-readability layer that halves what an agent spends to use a site."
doc_version: "1.0"
last_updated: "2026-06-25"
canonical: "https://timothyjordan.com/blog/2026/06/25/state-of-agent-readability.html"
---
# The State of Agent Readability on the Web

_Published June 25, 2026_

> We scored the 50,074 most-visited websites for how well an AI agent can discover, parse, and comprehend them. The median scores 52 of 100, not one scored excellent, and roughly three in four haven't shipped the agent-readability layer that halves what an agent spends to use a site.

_[Cross-posted from a14y.dev](https://a14y.dev/research/state-of-agent-readability/)_

## Summary

**On most of the popular web, an AI agent burns roughly twice the tokens it should.**

We scored the 50,074 most-visited websites for how well an agent can discover, parse, and comprehend them. The plumbing search engines asked for is everywhere: 87% expose a robots.txt. The layer agents need is not. Only 27% ship an `llms.txt` and 25% an `AGENTS.md`. The median site scores 52/100, and not one of the 50,074 scored "excellent." Raising a site's a14y score roughly halves the tokens an agent spends to use it. The web is wide open for improvement and [very few](https://a14y.dev/leaderboard/) have moved.

| Metric | Result |
| --- | --- |
| Sites scored | 50,074 |
| Median score | 52 / 100 |
| Scored "excellent" | none (best is 83) |
| llms.txt | 27% |
| AGENTS.md | 25% |

## The question

AI agents are becoming a primary way people reach the web. They read sites, follow links, and synthesize answers on someone's behalf. That raises a concrete question for every site owner: **can an agent use my site efficiently?** The a14y scorecard answers it for one site. We ran it across the web at scale to ask the bigger version: is the _web_ ready for agents?

"Ready" here is mechanical, not subjective: does the site expose what an agent needs to find and parse it? That means a usable `llms.txt`, an `AGENTS.md`, a sitemap, clean robots rules, and semantic HTML, all measured by the v0.2.0 scorecard's checks. The score does not judge the quality of the content itself.

## Methodology

The survey is a single point-in-time scan of the CrUX most-visited list, scored in page mode against the published scorecard. It ran on cloud infrastructure rather than from a residential IP.

| Field | Value |
| --- | --- |
| Source | CrUX top-100k most-visited origins ([zakird/crux-top-lists](https://github.com/zakird/crux-top-lists/blob/main/data/global/current.csv.gz), global) |
| Sample | **50,074** scored: 100,000 rows → 64,746 after dedup to one per registrable domain → 50,074 reachable & not adult/gambling/spam |
| Mode | Page mode: each site's homepage, not a full-site crawl |
| Scorecard | Published `v0.2.0` (38 checks) and the in-flight `v0.3.0-draft` (43 checks), scored 0–100. Headline scores use v0.2.0; the analysis below draws on both, tagged. |
| Tool | `npx a14y` (the same audit anyone can run) |
| Infra | Sharded across 64 Cloud Run tasks; run data archived in object storage |
| Source label | `crux batch-2026-06-15-100k` |
| Date | June 18, 2026 |

Page mode is deliberate. Scoring one homepage per site keeps a survey this size tractable and comparable. It proxies a site's agent readiness; it does not capture readiness that lives deeper in the site.

## Results: the web scores low

Distribution of agent-readability scores across all 50,074 sites. Mean 50.5, median 52.

| Band | Sites | Share |
| --- | --- | --- |
| Excellent (85–100) | 0 | 0% |
| Good (70–84) | 1,678 | 3.4% |
| Fair (50–69) | 26,530 | 53% |
| Poor (0–49) | 21,866 | 43.7% |

The popular web clusters in the middle and below. **43.7%** of sites score under 50, and only **3.4%** clear 70. **Not one site in the top 50,074 scored "excellent" (85 or above)**; the single best managed 83. Whatever the web was optimized for, it wasn't agents.
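
The band shares can be recomputed from the published counts. The sketch below is a minimal illustration, assuming the band floors shown in the table (85, 70, 50, 0); the function names are ours for illustration and are not part of the a14y tool.

```python
# Band floors taken from the table above, highest first.
BANDS = [(85, "Excellent"), (70, "Good"), (50, "Fair"), (0, "Poor")]


def band_for(score: float) -> str:
    """Return the band label for a score on the 0-100 scale."""
    for floor, label in BANDS:
        if score >= floor:
            return label
    raise ValueError(f"score below 0: {score}")


def band_shares(counts: dict[str, int]) -> dict[str, float]:
    """Convert per-band site counts into percentage shares, rounded to 0.1."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Site counts as published in the table above.
published = {"Excellent": 0, "Good": 1678, "Fair": 26530, "Poor": 21866}
shares = band_shares(published)
print(shares)
```

The counts sum to 50,074, and the rounded shares match the table: 0%, 3.4%, 53%, and 43.7%.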

[Explore all 50,074 sites on the web leaderboard →](https://a14y.dev/research/web/)

## Results: the agent-era layer is missing

What the most-visited sites actually expose to agents, against the published v0.2.0 scorecard and the in-flight v0.3.0-draft.

The signals search engines taught the web to ship are nearly everywhere: 87% expose a robots.txt and 63% a sitemap. But only 78% let all of the major AI crawlers in; the other 22%, roughly a fifth of the most-visited sites, block at least one of GPTBot, ClaudeBot, CCBot, or Google-Extended.
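
To see what a robots.txt check of this kind looks like, here is a minimal sketch using Python's standard `urllib.robotparser`. It is an illustration, not the a14y implementation: the bot list is the one named in this survey, while the function name and the choice to test the site root are our assumptions.

```python
from urllib.robotparser import RobotFileParser

# AI crawler user agents named in this survey's robots.txt check.
AI_BOTS = ["GPTBot", "ClaudeBot", "CCBot", "Google-Extended"]


def blocked_ai_bots(robots_txt: str, url: str = "https://example.com/") -> list[str]:
    """Return the AI bots that the given robots.txt text forbids from fetching url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if not parser.can_fetch(bot, url)]


example = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""
print(blocked_ai_bots(example))  # prints ['GPTBot']
```

Under this sketch, a site counts as allowing AI bots only when the returned list is empty.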

The signals agents need are not just rare, they are often a mirage. 29% ship an `llms.txt`, and most are real (97% are non-empty). The other agent files mostly answer with a 200 but not a usable document: 30% appear to have an `AGENTS.md` while only 0.2% carry the expected sections, and 24% answer at `/sitemap.md` while only 0.1% are a real structured sitemap.
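
The gap between "answers with a 200" and "is a usable document" is easy to reproduce. The sketch below is a hedged illustration of how one might tell them apart. The labels, the 512-character window, and the HTML heuristics are all our assumptions, and the scorecard's own checks (such as `agents-md.has-min-sections`) are stricter than this.

```python
def looks_like_html(body: str) -> bool:
    """Heuristic: does the start of the body look like an HTML page?"""
    head = body.lstrip()[:512].lower()
    return (
        head.startswith(("<!doctype html", "<html"))
        or "<head" in head
        or "<body" in head
    )


def classify_agent_file(status: int, content_type: str, body: str) -> str:
    """Classify a response for a path such as /AGENTS.md or /sitemap.md.

    The labels are hypothetical, chosen for this example:
      missing        the server did not answer 200
      empty          200 with no content
      html-fallback  200, but the body is an HTML page (a typical soft 404)
      document       200 with non-HTML text that may be a real file
    """
    if status != 200:
        return "missing"
    if not body.strip():
        return "empty"
    if "text/html" in content_type.lower() or looks_like_html(body):
        return "html-fallback"
    return "document"


print(classify_agent_file(200, "text/html; charset=utf-8", "<!doctype html><html></html>"))
```

A site that routes every unknown path to its single-page app would land in `html-fallback` here, which is one plausible way a file can appear to exist without being usable.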

The markdown mirror tells the same story, and the draft scorecard makes it sharper. 27% advertise one, but under v0.3.0-draft only 1.8% of those mirrors are actually markdown rather than HTML. And the page itself often needs a browser to read: v0.3.0-draft finds 74% serve real homepage content in the initial HTML, which leaves about a quarter as JavaScript shells that agents without a JS engine (Claude, Perplexity, OpenAI's SearchBot) see as blank.

| Signal | Sites | What's behind the number | From |
| --- | --- | --- | --- |
| robots.txt allows AI bots | 78% | 22% block GPTBot, ClaudeBot, CCBot, or Google-Extended | `v0.2.0` |
| llms.txt | 29% | 97% of them are non-empty, a real file | `v0.2.0` |
| AGENTS.md | 30% | only 0.2% carry the expected sections | `v0.2.0` |
| sitemap.md | 24% | only 0.1% are a real structured sitemap | `v0.2.0` |
| Markdown mirror advertised | 27% | 4% serve it on request via content negotiation | `v0.2.0` |
| Mirror is valid markdown | 1.8% | of advertised mirrors are actually markdown, not HTML | `v0.3.0-draft` |
| Homepage server-renders content | 74% | 26% are JavaScript shells, invisible to agents that don't run JS | `v0.3.0-draft` |
| No consent interstitial | 93% | 7% gate content behind a wall an agent can't click | `v0.3.0-draft` |
| JSON-LD structured data | 37% | 6% include a dateModified | `v0.2.0` |
| Glossary link | 0.5% | effectively nobody | `v0.2.0` |
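
The server-rendering row is the easiest to check by hand: fetch the homepage without running JavaScript and see how much readable text is left. A minimal sketch with Python's standard `html.parser` follows. The 200-character threshold and the set of skipped tags are our assumptions for illustration, not the thresholds the `html.ssr-content` check uses.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text that sits outside script, style, noscript and template tags."""

    SKIP = {"script", "style", "noscript", "template"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def visible_text(html: str) -> str:
    """Return the whitespace-normalized text an agent sees without a JS engine."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def looks_server_rendered(html: str, min_chars: int = 200) -> bool:
    """Assumed rule: enough visible text in the initial HTML counts as content."""
    return len(visible_text(html)) >= min_chars


shell = (
    "<!doctype html><html><head><title>App</title></head><body>"
    '<div id="root"></div><script>window.boot()</script>'
    "<noscript>Enable JavaScript</noscript></body></html>"
)
print(visible_text(shell))           # prints App
print(looks_server_rendered(shell))  # prints False
```

The `shell` page above is the pattern the table calls a JavaScript shell: a title, an empty mount point, and a script.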

<details markdown="1">
<summary>Full reference: every check, by theme</summary>

Adoption for all 38 v0.2.0 checks, plus the 5 checks v0.3.0-draft adds. "Of applicable" excludes sites where a check does not apply (the markdown sub-checks, for example, only apply once a mirror exists). Three crawl-dependent checks (`discovery.in-page-link`, `discovery.indexed`, `discovery.no-duplicate-content`) need a multi-page crawl and were not measured by this single-page survey.
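
As a worked example of the two columns, the sketch below computes both percentages from raw counts. The counts are hypothetical, picked for a 1,000-site sample so the arithmetic is easy to follow; they are not the survey's raw data.

```python
def adoption(passed: int, applicable: int, total: int) -> tuple[int, int]:
    """Return (% of all sites, % of applicable sites), rounded to whole percents."""
    of_all = round(100 * passed / total)
    of_applicable = round(100 * passed / applicable) if applicable else 0
    return of_all, of_applicable


# Hypothetical counts: of 1,000 sites, 290 serve an llms.txt,
# and 281 of those files are non-empty.
print(adoption(passed=281, applicable=290, total=1000))  # prints (28, 97)
```

The same check can look rare in the first column and nearly universal in the second, which is why both are reported.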

| Discoverability | Of all | Of applicable |
| --- | --- | --- |
| `sitemap-md.has-structure` | 0% | 0% |
| `llms-txt.md-extensions` | 0% | 0% |
| `agents-md.has-min-sections` | 0% | 0% |
| `llms-txt.content-type` | 5% | 19% |
| `sitemap-md.exists` | 24% | n/a |
| `sitemap-xml.has-lastmod` | 24% | 50% |
| `llms-txt.non-empty` | 28% | 97% |
| `llms-txt.exists` | 29% | n/a |
| `agents-md.exists` | 30% | n/a |
| `sitemap-xml.valid` | 47% | 76% |
| `sitemap-xml.exists` | 63% | n/a |
| `robots-txt.allows-ai-bots` | 78% | n/a |
| `robots-txt.exists` | 87% | n/a |
| `robots-txt.allows-llms-txt` | 91% | n/a |

| Markdown mirror | Of all | Of applicable |
| --- | --- | --- |
| `markdown.frontmatter` | 0% | 0% |
| `markdown.sitemap-section` | 0% | 0% |
| `markdown.alternate-link` | 0% | 0% |
| `markdown.canonical-header` | 0% | 0% |
| `markdown.content-negotiation` | 4% | 4% |
| `markdown.mirror-suffix` | 27% | 27% |

| Structured data | Of all | Of applicable |
| --- | --- | --- |
| `html.json-ld.date-modified` | 6% | 17% |
| `html.json-ld.breadcrumb` | 10% | 27% |
| `html.json-ld` | 37% | 37% |

| Content structure | Of all | Of applicable |
| --- | --- | --- |
| `html.glossary-link` | 0% | 0% |
| `html.text-ratio` | 45% | 45% |
| `html.headings` | 61% | 61% |

| HTML metadata | Of all | Of applicable |
| --- | --- | --- |
| `html.og-description` | 52% | 52% |
| `html.canonical-link` | 55% | 55% |
| `html.og-title` | 55% | 55% |
| `html.meta-description` | 65% | 65% |
| `html.lang-attribute` | 82% | 82% |

| HTTP | Of all | Of applicable |
| --- | --- | --- |
| `http.content-type-html` | 83% | 83% |
| `http.redirect-chain` | 95% | n/a |
| `http.status-200` | 97% | n/a |
| `http.no-noindex-noai` | 100% | n/a |

| Added in v0.3.0-draft | Of all | Of applicable |
| --- | --- | --- |
| `markdown.valid-markdown` | 0% | 2% |
| `markdown.size-reduction` | 5% | 17% |
| `markdown.navigation-stripped` | 14% | 52% |
| `html.ssr-content` | 74% | 74% |
| `http.no-interstitial` | 93% | 93% |

</details>

## And the fix works

The gap matters because closing it pays off. In a controlled A/B ([full case study](https://a14y.dev/research/scorecard-evals/)), serving the same site with the agent-readiness layer, versus without it, sharply cut what an AI agent spent to use it, with no loss in answer quality.

| Metric | Result |
| --- | --- |
| a14y score | 37 → 89 |
| Agent tokens | −49% |
| Tool calls | −52% |
| Wall-clock | −30% |
| Answer quality | tied (84 vs 83) |

Raising one site's a14y score from 37 to 89 cut the agent's token use about 49% and its tool calls about 52%, while an independent judge rated the answers statistically indistinguishable. Put the two findings together. **The agent-readiness layer roughly halves what an agent spends to use a site, and 73% of the most-visited web hasn't shipped it.**

## What it means

### For site owners

Agent readiness is still a near-empty field. Roughly three in four popular sites haven't shipped the layer, and none has it fully dialed in. That makes it a cheap, early-mover advantage: run `npx a14y` on your site, ship the top fixes, and roughly halve what every agent spends to use you.

### For agent & model builders

The signals you'd want to lean on can't be assumed yet. Only about 1 in 4 popular sites expose an `llms.txt` or `AGENTS.md`, so most of the time agents still fall back to the expensive path: fetching and parsing raw HTML. This dataset quantifies exactly where the gaps are.

### For the a14y project

This is the baseline. We'll re-run the survey on each scorecard release and track whether the web's agent readiness moves: a standing measure of how the agent-readable web is (or isn't) being built.

## Caveats

What this survey does not show, and where to read the numbers with care.

- **Page mode.** Each site is scored on its homepage only, a proxy for the site rather than a full-crawl verdict. Sites whose agent readiness lives in deeper sections are understated.
- **Point-in-time.** A single snapshot (June 18, 2026). Sites change; this is a frame, not a film.
- **Popularity bias.** CrUX skews toward large, heavily-trafficked sites; the long tail of the web likely looks different.
- **Reachable sites only.** Origins that didn't answer (DNS/TLS failures, bot or geo blocks) are dropped, so the sample is the reachable popular web.
- **Scorecard-specific.** Scores are against v0.2.0's 38 checks; a different scorecard shifts the absolute numbers (the leaderboard also carries a `v0.3.0-draft` scoring for comparison).
- **Filtered.** Adult, gambling, and spam domains are removed by a heuristic denylist, which is imperfect.

<details markdown="1">
<summary>Reproduce this study</summary>

Score any single site with the same tool the survey uses:

```
npx a14y https://example.com --scorecard 0.2.0
```

The survey itself is the same audit fanned out across the CrUX list. Batch provenance: `crux batch-2026-06-15-100k`, 50,074 sites, generated June 18, 2026.
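
One way to fan the audit out is to give each task a slice of the origin list. The sketch below is an assumption about how such a run could be sharded, not a description of the pipeline used for this survey. It reads the `CLOUD_RUN_TASK_INDEX` and `CLOUD_RUN_TASK_COUNT` variables that Cloud Run jobs set for each task, and builds the same command shown above; the function names are illustrative.

```python
import os


def shard(origins: list[str], task_index: int, task_count: int) -> list[str]:
    """Return the origins assigned to one task, round-robin by list position."""
    return origins[task_index::task_count]


def audit_command(origin: str, scorecard: str = "0.2.0") -> list[str]:
    """Build the CLI invocation shown above for a single origin."""
    return ["npx", "a14y", origin, "--scorecard", scorecard]


def commands_for_this_task(origins: list[str]) -> list[list[str]]:
    """Build the commands for the task this process is running as."""
    # Cloud Run jobs set both variables per task; fall back to a single task locally.
    index = int(os.environ.get("CLOUD_RUN_TASK_INDEX", "0"))
    count = int(os.environ.get("CLOUD_RUN_TASK_COUNT", "1"))
    return [audit_command(origin) for origin in shard(origins, index, count)]


print(audit_command("https://example.com"))
```

Each command list can be passed to `subprocess.run` as is. How the results are collected and archived is left out, since that depends on the tool's output and the storage in use.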

</details>

## Sitemap

See the [full sitemap](/sitemap.md) or the [blog index](/blog/index.md) for every page on the site.