---
title: "Make your documentation readable by AI agents"
description: "How to structure developer documentation so that AI agents such as Claude Code, ChatGPT, Cursor, and Copilot can discover, retrieve, and cite it accurately. Three layers: discovery, retrieval, and tools."
doc_version: "1.0"
last_updated: "2026-03-27"
canonical: "https://timothyjordan.com/blog/2026/03/27/Make-Your-Docs-Agent-Readable.html"
---
# Make your documentation readable by AI agents

_Published March 27, 2026_

> How to structure developer documentation so that AI agents such as Claude Code, ChatGPT, Cursor, and Copilot can discover, retrieve, and cite it accurately. Three layers: discovery, retrieval, and tools.

_[Cross-posted from Vercel KB](https://vercel.com/kb/guide/make-your-documentation-readable-by-ai-agents)_

AI agents (Claude Code, ChatGPT, Cursor, Copilot) are a primary consumer of developer documentation. They don't need navigation chrome, dark mode toggles, or animated code blocks. They need:

*   **Discoverable content**: Where the docs are and what they cover

*   **Clean retrieval**: Markdown, not a DOM tree

*   **Structured metadata**: Version, last updated, and canonical URL

*   **Tool access**: Search and fetch via protocol, not scraping


Most content platforms serve agents the same HTML page they serve humans. The agent then spends tokens stripping tags, guessing at content boundaries, and hoping the important information survives extraction. The result: hallucinated APIs, outdated code examples, and missed context.

## Three layers of agent readiness

Agents interact with content in three layers: they discover what exists, retrieve clean content (and optionally index it at scale), and call tools for precise queries.

### Layer 1: Discovery

Help agents find what exists before they fetch anything.

| Requirement             | Implementation                                                                               | Purpose                                                                                      |
| ----------------------- | -------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------- |
| `/llms.txt`             | Curated markdown index at the site root with section headings and links                      | Entry point for agents and humans pasting docs into an IDE                                   |
| `sitemap.xml`           | Standard sitemap with accurate `lastmod` dates                                               | Freshness tracking for agents that monitor content changes                                   |
| `sitemap.md`            | Semantic sitemap served as markdown with section headings, categories, and page descriptions | High-level orientation for LLM-assisted navigation and contributor onboarding                |
| `robots.txt`            | Documented stance on agent access (which bots can crawl, which pages are off-limits)         | Crawl control and access transparency                                                        |
| JSON-LD structured data | Title, description, canonical URL, and breadcrumbs on every HTML page                        | Agents that parse HTML can understand page type and relationships without traversing the DOM |

Example JSON-LD for a docs page:

```json
{
  "@context": "https://schema.org",
  "@type": "TechArticle",
  "headline": "Vercel Functions",
  "description": "Deploy server-side code on Vercel.",
  "url": "https://vercel.com/docs/functions",
  "breadcrumb": {
    "@type": "BreadcrumbList",
    "itemListElement": [
      {
        "@type": "ListItem",
        "position": 1,
        "name": "Docs",
        "item": "https://vercel.com/docs"
      },
      {
        "@type": "ListItem",
        "position": 2,
        "name": "Functions",
        "item": "https://vercel.com/docs/functions"
      }
    ]
  }
}
```
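
The `/llms.txt` index from the table above can be generated from whatever page inventory a site already keeps. The following is a minimal sketch, not a reference implementation: the `Page` type, the `render_llms_txt` function, and the sample content are all hypothetical, and the layout (H1 title, blockquote summary, H2 sections with link lists) follows the general shape described by the llms.txt specification.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """One documentation page to list in the index (hypothetical type)."""
    title: str
    url: str
    description: str


def render_llms_txt(site_name: str, summary: str,
                    sections: dict[str, list[Page]]) -> str:
    """Render an llms.txt-style index: H1, blockquote summary, H2 link sections."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for page in pages:
            lines.append(f"- [{page.title}]({page.url}): {page.description}")
        lines.append("")
    # Collapse trailing blank lines to a single newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical content; the URL reuses the example path from this article.
index = render_llms_txt(
    "Example Docs",
    "Developer documentation for the Example platform.",
    {
        "Functions": [
            Page("Functions", "https://vercel.com/docs/functions.md",
                 "Deploy server-side code on Vercel."),
        ],
    },
)
print(index)
```

Linking to the `.md` variant of each page, as this sketch does, is a design choice rather than a requirement: it lets an agent that follows a link land directly on markdown.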

### Layer 2: Retrieval

This is the highest-impact layer. When an agent fetches a docs page, it should get markdown, not HTML with embedded scripts.

| Mechanism           | How it works                                                                                                                                                                                                          |
| ------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Content negotiation | Agents that send `Accept: text/markdown` receive markdown with `Content-Type: text/markdown; charset=utf-8` and a `Vary: Accept` header. Claude Code sends this header natively.                                      |
| Agent auto-rewrite  | Detected AI agents receive markdown automatically, even without an `Accept: text/markdown` header. Detection uses user-agent matching, RFC 9421 Signature-Agent headers, and a heuristic fallback for unknown agents. |
| `.md` endpoints     | Append `.md` to any docs URL to get the markdown version directly. Works without custom headers. Useful for pasting docs into a chat or IDE.                                                                          |
| Rich frontmatter    | Every markdown response includes metadata for accurate citations.                                                                                                                                                     |
| HTML alternate link | Adding `<link rel="alternate" type="text/markdown">` in the HTML `<head>` tells agents a markdown version exists.                                                                                                     |

Example request and response:

```bash
curl -H "Accept: text/markdown" "https://vercel.com/docs/functions"
```

```markdown
---
title: Functions
description: Deploy server-side code on Vercel.
canonical_url: https://vercel.com/docs/functions
md_url: https://vercel.com/docs/functions.md
last_updated: 2026-01-15T12:34:56.000Z
---

# Functions

Deploy server-side code on Vercel...
```

Without `canonical_url` and `last_updated`, agents can't link back to the source or judge whether their information is stale.
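
To illustrate how these fields might be produced and consumed, here is a small sketch under stated assumptions. `render_frontmatter` and `is_stale` are hypothetical helpers, the 180-day freshness window is an arbitrary example value, and string values are quoted with `json.dumps` (JSON string literals are valid YAML scalars) instead of being emitted bare as in the response above.

```python
import json
from datetime import datetime, timedelta, timezone


def render_frontmatter(title: str, description: str, canonical_url: str,
                       last_updated: datetime) -> str:
    """Build a YAML frontmatter block with the citation fields shown above."""
    stamp = (last_updated.astimezone(timezone.utc)
             .isoformat(timespec="milliseconds")
             .replace("+00:00", "Z"))
    fields = {
        "title": title,
        "description": description,
        "canonical_url": canonical_url,
        # Assumes the canonical URL has no trailing slash or query string.
        "md_url": canonical_url + ".md",
        "last_updated": stamp,
    }
    # json.dumps quotes and escapes each value so colons or quotes in a
    # title cannot break the YAML.
    body = "\n".join(
        f"{key}: {json.dumps(value, ensure_ascii=False)}"
        for key, value in fields.items()
    )
    return f"---\n{body}\n---\n"


def is_stale(last_updated: datetime, now: datetime,
             max_age_days: int = 180) -> bool:
    """Agent-side check: is the page older than the (assumed) freshness window?"""
    return now - last_updated > timedelta(days=max_age_days)
```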

The `Vary: Accept` header tells CDNs to cache HTML and markdown responses separately for the same URL. Without it, an agent's markdown response could be served to a browser, or vice versa.
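
The negotiation itself can be sketched in a few lines. This is a simplified illustration, not how any particular platform implements it: `wants_markdown` and `negotiate` are hypothetical names, and the logic serves markdown whenever `text/markdown` is acceptable at all rather than ranking it against other media types by q-value.

```python
def wants_markdown(accept_header: str) -> bool:
    """Return True if the Accept header lists text/markdown with a non-zero q-value."""
    for part in accept_header.split(","):
        media_range, *params = [piece.strip() for piece in part.split(";")]
        if media_range.lower() != "text/markdown":
            continue
        q = 1.0  # a media range without an explicit q parameter defaults to 1
        for param in params:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        return q > 0
    return False


def negotiate(accept_header: str, html: str,
              markdown: str) -> tuple[dict[str, str], str]:
    """Pick a representation and return (headers, body)."""
    if wants_markdown(accept_header):
        content_type, body = "text/markdown; charset=utf-8", markdown
    else:
        content_type, body = "text/html; charset=utf-8", html
    # Vary: Accept goes on BOTH representations so caches key on the header.
    headers = {"Content-Type": content_type, "Vary": "Accept"}
    return headers, body
```

Setting `Vary: Accept` on the HTML response as well as the markdown one matters: a cache that stored the HTML variant without it has no reason to treat a later markdown request differently.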

#### Always serve markdown to agents

Content negotiation and `.md` endpoints cover agents that explicitly request markdown. Many agents don't. They send a standard `GET` with no special headers. If your platform can detect that the request comes from an AI agent, serve markdown anyway.

Vercel uses a three-layer detection approach:

1.  **User-agent matching**: Check against a maintained list of known AI agent strings (Claude, ChatGPT, GPTBot, Cursor, Copilot, and others). This is the most reliable signal.

2.  **Signature-Agent header**: Sent alongside RFC 9421 HTTP Message Signatures and used by ChatGPT's agent. Validate its value against known AI service domains.

3.  **Heuristic fallback**: If the request is missing the `sec-fetch-mode` header (which modern browsers send) and the user-agent matches a bot-like pattern, treat it as an agent. This catches unknown agents at the cost of occasional false positives, which are low-harm because serving markdown to a non-AI bot has no negative effect.


This means agents get markdown on every valid docs request, regardless of how they make it.
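
The three checks can be sketched as a single function that returns which signal fired. Everything specific here is an assumption for illustration: the pattern lists are short placeholders for a maintained list, the trusted `Signature-Agent` value is an example, and the function only inspects the header value. A production check would also verify the accompanying HTTP message signature.

```python
import re
from typing import Optional

# Placeholder patterns; a real deployment would maintain a longer list.
KNOWN_AGENT_PATTERN = re.compile(
    r"claude|chatgpt|gptbot|cursor|copilot", re.IGNORECASE)
BOT_LIKE_PATTERN = re.compile(
    r"bot|crawler|spider|python-requests|curl|httpx|node-fetch", re.IGNORECASE)
# Assumed allowlist of Signature-Agent values.
TRUSTED_SIGNATURE_AGENTS = {"https://chatgpt.com"}


def detect_agent(headers: dict[str, str]) -> Optional[str]:
    """Return the name of the signal that matched, or None for a likely browser."""
    h = {name.lower(): value for name, value in headers.items()}
    user_agent = h.get("user-agent", "")

    # 1. Known AI agent user-agent strings.
    if KNOWN_AGENT_PATTERN.search(user_agent):
        return "user-agent"

    # 2. Signature-Agent header (value only; signature verification omitted).
    signature_agent = h.get("signature-agent", "").strip().strip('"')
    if signature_agent in TRUSTED_SIGNATURE_AGENTS:
        return "signature-agent"

    # 3. Heuristic: no sec-fetch-mode header plus a bot-like user-agent.
    if "sec-fetch-mode" not in h and BOT_LIKE_PATTERN.search(user_agent):
        return "heuristic"

    return None
```

Returning the signal name instead of a boolean makes it easier to log which layer is doing the work and to measure the false-positive rate of the heuristic.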

#### Handle 404s with markdown, not HTML

When an agent requests a page that doesn't exist, an HTML 404 page is useless. The agent can't parse it, and the conversation stalls. Instead, return a markdown response with actionable content:

1.  Search your docs index for pages similar to the requested path.

2.  If a result scores above a high-confidence threshold (0.99+), redirect the agent to the correct page.

3.  Otherwise, return markdown-formatted suggestions listing the closest matches as links.

4.  Return a 200 status, not 404. Agents need content they can act on.

5.  Append a sitemap footer to every markdown response so agents can browse the full index if they hit a dead end.
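
The steps above can be sketched as follows. This is an illustration under assumptions, not the actual implementation: `difflib.SequenceMatcher` stands in for a real docs search index, the 307 redirect status and the footer text are example choices, and `handle_missing_page` is a hypothetical name.

```python
from difflib import SequenceMatcher

REDIRECT_THRESHOLD = 0.99  # high-confidence threshold from the steps above
SITEMAP_FOOTER = "\n## Sitemap\n\nSee the [full sitemap](/sitemap.md).\n"


def normalize(path: str) -> str:
    """Ignore case and trailing slashes when comparing paths."""
    return path.lower().rstrip("/")


def handle_missing_page(requested_path: str, known_paths: list[str],
                        max_suggestions: int = 3):
    """Return (status, headers, body) for a path that matched no page."""
    target = normalize(requested_path)
    # Score every known path against the request, best match first.
    scored = sorted(
        ((SequenceMatcher(None, target, normalize(p)).ratio(), p)
         for p in known_paths),
        reverse=True,
    )
    # Near-certain match: send the agent straight to the right page.
    if scored and scored[0][0] >= REDIRECT_THRESHOLD:
        return 307, {"Location": scored[0][1]}, ""
    # Otherwise answer with 200 and markdown the agent can act on.
    links = "\n".join(f"- [{p}]({p})" for _, p in scored[:max_suggestions])
    body = (f"# Page not found: {requested_path}\n\n"
            f"Closest matches:\n\n{links}\n{SITEMAP_FOOTER}")
    return 200, {"Content-Type": "text/markdown; charset=utf-8"}, body
```

With a threshold this high, string similarity alone only redirects on differences that normalization removes, such as letter case or a trailing slash. A one-character typo falls through to the suggestion list, which keeps wrong redirects rare.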


### Layer 3: Tool access

The most sophisticated agents don't scrape at all. They call tools.

| Tool        | Description                                                                                                                                                                                                  |
| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| MCP servers | Search docs, fetch specific pages, and list available content through a standard protocol. The [Vercel MCP server](https://vercel.com/docs/mcp) covers `vercel.com/docs`, `nextjs.org`, and the AI SDK docs. |
| Search APIs | Return structured JSON results with canonical URLs, snippets, and freshness metadata.                                                                                                                        |
| AI Chat     | Conversational interface on docs sites backed by doc-aware tools: `search_docs`, `get_doc_page`, and `list_docs`. Available to both humans and agents.                                                       |
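
As a rough illustration of the structured results a search API or a `search_docs` tool might return, the sketch below runs a naive term-count search over in-memory pages and emits JSON with canonical URLs, snippets, and freshness metadata. The `Doc` type, the scoring, and the response shape are all assumptions made for this example. A real tool would sit on top of a proper search index.

```python
import json
from dataclasses import dataclass


@dataclass
class Doc:
    """A documentation page held in memory (hypothetical type)."""
    title: str
    canonical_url: str
    last_updated: str
    body: str


def search_docs(query: str, docs: list[Doc], limit: int = 5) -> str:
    """Return a JSON string of ranked results for the query."""
    terms = query.lower().split()
    results = []
    for doc in docs:
        text = doc.body.lower()
        # Naive relevance: total number of term occurrences in the body.
        score = sum(text.count(term) for term in terms)
        if score == 0:
            continue
        # Start the snippet shortly before the earliest matching term.
        first = min(text.find(term) for term in terms if term in text)
        start = max(first - 40, 0)
        results.append({
            "title": doc.title,
            "canonical_url": doc.canonical_url,
            "last_updated": doc.last_updated,
            "snippet": doc.body[start:start + 160].strip(),
            "score": score,
        })
    results.sort(key=lambda result: result["score"], reverse=True)
    return json.dumps({"query": query, "results": results[:limit]}, indent=2)
```

Including `canonical_url` and `last_updated` in every result gives a tool-calling agent the same citation and freshness signals it would get from the markdown frontmatter.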

## Checklist

> For the full scoring rubric (0–100) and detailed verification steps, see the [Agent-Readability Spec](https://vercel.com/kb/guide/agent-readability-spec).

**Discovery**

*   Serve `/llms.txt` with a curated H1 + H2 index of your content

*   Publish `sitemap.xml` with accurate `lastmod` dates

*   Serve `/sitemap.md` with a semantic, markdown-formatted sitemap describing docs sections and pages

*   Document agent access policy in `robots.txt`

*   Add JSON-LD structured data (title, description, canonical URL, and breadcrumbs) to every page


**Retrieval**

*   Return markdown for `Accept: text/markdown` with a `Vary: Accept` header

*   Generate `.md` endpoints for all content pages

*   Include frontmatter metadata (`title`, `canonical_url`, `last_updated`) in every markdown response

*   Add `<link rel="alternate" type="text/markdown">` to HTML pages

*   Detect AI agents (user-agent matching, Signature-Agent header, heuristic fallback) and serve markdown automatically

*   Verify by appending `.md` to any page URL and confirming you get clean markdown with frontmatter


**Tool access**

*   Expose search via MCP server or search API

*   Add a `SKILLS.md` or `AGENTS.md` file with install, config, and usage instructions for coding agents


To get started, run through the [checklist](#checklist) above, then score your site against the [Agent-Readability Spec](https://vercel.com/kb/guide/agent-readability-spec).

## Further reading

*   [Agent-Readability Spec](https://vercel.com/kb/guide/agent-readability-spec): Full checklist and 0–100 scoring rubric

*   [Vercel MCP server](https://vercel.com/docs/mcp): Query Vercel, Next.js, and AI SDK docs via MCP

*   [llms.txt specification](https://llmstxt.org/): The llms.txt standard

