These aren’t hypothetical. They’re the checks that fail most often across the sites scanned on agent-ready.dev — the live failure rates are on the State of Agent Readability report. Each one below is cheap to fix; together they’re the difference between a site an AI agent can parse and cite and one it skips. Want the exact list for your site? Run a scan.
1. An AGENTS.md that’s missing its core sections
This is the most-failed check in the whole corpus. Plenty of repos now ship an AGENTS.md (the skill file that briefs coding agents like Claude Code, Codex and Cursor) — but the file is often a thin paragraph with none of the sections an agent actually reads. The check wants at least two of Installation, Configuration, and Usage, each with real commands.
Fix: structure it. Don’t pad; spell out how to install, configure, and run, with a code block in each:
# Acme CLI
> A command-line tool for managing Acme deployments.
## Installation
```bash
npm install -g @acme/cli
```
## Configuration
Create `acme.config.json` in your project root:
```json
{ "project": "my-app", "region": "us-east-1" }
```
## Usage
```bash
acme deploy --env production
```
Full walkthrough: how to write an effective AGENTS.md.
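If you want a rough local version of this check before scanning, the rule above (at least two of Installation, Configuration, and Usage, each with real commands) can be approximated in a few lines of Python. This is a minimal sketch, not the scanner's actual logic: it assumes the sections are H2 headings and treats "has a fenced code block" as a stand-in for "has real commands". The function names are hypothetical.

```python
import re

CORE_SECTIONS = ("installation", "configuration", "usage")
FENCE = "`" * 3  # built this way so the example can sit inside a fenced block


def core_sections_with_code(markdown: str) -> list[str]:
    """Return the core H2 sections that contain at least one fenced code block."""
    found = []
    current = None    # core section we are currently inside, if any
    in_fence = False
    for line in markdown.splitlines():
        if line.startswith(FENCE):
            # An opening fence inside a core section counts as a code example.
            if not in_fence and current and current not in found:
                found.append(current)
            in_fence = not in_fence
            continue
        if in_fence:
            continue  # ignore '#' lines that are part of a code block
        heading = re.match(r"^##\s+(.+?)\s*$", line)
        if heading:
            name = heading.group(1).lower()
            current = name if name in CORE_SECTIONS else None
    return found


def passes(markdown: str) -> bool:
    """Assumed threshold from the text: at least two core sections with code."""
    return len(core_sections_with_code(markdown)) >= 2


sample = "\n".join([
    "# Acme CLI",
    "## Installation",
    FENCE + "bash",
    "npm install -g @acme/cli",
    FENCE,
    "## Usage",
    FENCE + "bash",
    "acme deploy --env production",
    FENCE,
])
print(core_sections_with_code(sample))  # ['installation', 'usage']
print(passes(sample))                   # True
```

A thin one-paragraph file returns an empty list, which is the failure case described above.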
2. No sitemap.md (or one with no structure)
Almost every site has a sitemap.xml. Far fewer have a sitemap.md, and between the missing file and the unstructured one it accounts for two of the most-failed checks on the board. A sitemap.md is the markdown twin of sitemap.xml: a curated, human- and agent-readable index of your most important pages. Where sitemap.xml is a flat XML list a crawler parses, sitemap.md is something an agent can read and cite directly. Serve it at /sitemap.md (or /docs/sitemap.md).
Fix: write a markdown file with H2 sections and descriptive markdown links, not a wall of URLs. The structure is the point:
# Sitemap
## Documentation
- [Getting started](/docs/getting-started.md): install and first deploy
- [Configuration](/docs/configuration.md): every config option explained
- [CLI reference](/docs/cli.md): all commands and flags
## Guides
- [Deploy to production](/guides/production.md)
- [Set up CI](/guides/ci.md)
Full guide: what is sitemap.md (and how to add one). On how it differs from the XML version: llms.txt vs sitemap.xml.
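To tell a structured sitemap.md from a wall of URLs, you can count its H2 sections, its descriptive links, and any bare URLs. The sketch below is one way to do that in Python. It assumes links are written as list items in `[name](url)` form with an optional `: description`, and the function name is hypothetical rather than part of any tool.

```python
import re

# A list item like "- [Name](/path.md)" with an optional ": description".
LINK = re.compile(r"^- \[([^\]]+)\]\(([^)\s]+)\)(?::\s*(.+))?$")
# A line that is nothing but a URL, with or without a list marker.
BARE_URL = re.compile(r"^(- )?https?://\S+$")


def summarize_sitemap_md(text: str) -> dict:
    """Group descriptive links under their H2 section and collect bare URLs."""
    sections = {}
    bare_urls = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif BARE_URL.match(line):
            bare_urls.append(line)
        else:
            match = LINK.match(line)
            if match and current is not None:
                sections[current].append((match.group(1), match.group(2)))
    return {"sections": sections, "bare_urls": bare_urls}


sample = """# Sitemap
## Documentation
- [Getting started](/docs/getting-started.md): install and first deploy
- [Configuration](/docs/configuration.md): every config option explained
- [CLI reference](/docs/cli.md): all commands and flags
## Guides
- [Deploy to production](/guides/production.md)
- [Set up CI](/guides/ci.md)
"""
summary = summarize_sitemap_md(sample)
print({name: len(links) for name, links in summary["sections"].items()})
# {'Documentation': 3, 'Guides': 2}
```

No sections, or a long `bare_urls` list, is the "no structure" case.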
3. An llms.txt that skips the basics
An llms.txt file is only useful if it follows the shape agents expect, and several of its sub-checks fail on a majority of the files that exist. The three that bite most often: no blockquote summary right after the H1, no H2 sections grouping links, and malformed links that aren’t in [name](url): description form. Broken links (a 404 behind a listed URL) are common too.
Fix: H1, a one-line blockquote, then H2 sections of properly-formatted links:
# Acme
> Acme is a deployment platform for full-stack apps.
## Docs
- [Getting started](https://acme.com/docs/start.md): install and first deploy
- [API reference](https://acme.com/docs/api.md): complete REST API
## Guides
- [Production checklist](https://acme.com/guides/prod.md)
Step-by-step: how to add an llms.txt file.
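The three structural sub-checks named above (blockquote after the H1, H2 sections, well-formed links) are easy to lint for yourself. Here is a minimal Python sketch under a few assumptions: every list item in the file is meant to be a link, the description after the colon is optional, and the function name is made up for illustration. It does not fetch the URLs, so it will not catch 404s.

```python
import re

# "- [name](url)" optionally followed by ": description".
LINK_LINE = re.compile(r"^- \[[^\]]+\]\([^)\s]+\)(: .+)?$")


def check_llms_txt(text: str) -> list[str]:
    """Return a list of structural problems; an empty list means none found."""
    lines = [ln.rstrip() for ln in text.splitlines() if ln.strip()]
    problems = []
    if not lines or not lines[0].startswith("# "):
        problems.append("missing H1 on the first line")
    if len(lines) < 2 or not lines[1].startswith("> "):
        problems.append("no blockquote summary right after the H1")
    if not any(ln.startswith("## ") for ln in lines):
        problems.append("no H2 sections grouping links")
    for ln in lines:
        if ln.startswith("- ") and not LINK_LINE.match(ln):
            problems.append(f"malformed link: {ln}")
    return problems


good = """# Acme
> Acme is a deployment platform for full-stack apps.
## Docs
- [Getting started](https://acme.com/docs/start.md): install and first deploy
## Guides
- [Production checklist](https://acme.com/guides/prod.md)
"""
bad = """# Acme
- https://acme.com/docs
"""
print(check_llms_txt(good))  # []
for problem in check_llms_txt(bad):
    print(problem)
```

The `bad` file trips all three checks at once, which is a fairly typical shape for a first-draft llms.txt.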
4. No llms-full.txt companion
A standard llms.txt is a directory: it links out to pages the agent then has to fetch one by one. The companion llms-full.txt inlines the full content so an LLM gets your whole documentation in a single request. Two-thirds of sites with an llms.txt skip it.
Fix: generate an expanded file at /llms-full.txt that concatenates the actual content of the pages your llms.txt links to (headings, prose, code), not just their URLs. It can be built from the same source as your llms.txt at deploy time.
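As a rough illustration of that build step, the Python sketch below walks the links in an llms.txt and concatenates the content behind each one. It is a sketch under assumptions, not a reference generator: `load_page` is a hypothetical callback you would supply (reading from your docs source directory or fetching over HTTP), the page bodies are invented sample data, and the `<!-- source: ... -->` separator is just one possible convention.

```python
import re
from typing import Callable

LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def build_llms_full(llms_txt: str, load_page: Callable[[str], str]) -> str:
    """Inline the content of every page linked from llms_txt, in link order."""
    parts = []
    seen = set()
    for _title, url in LINK.findall(llms_txt):
        if url in seen:
            continue  # include each page once even if it is linked twice
        seen.add(url)
        body = load_page(url).strip()
        # The source comment is an assumed separator, not a required format.
        parts.append(f"<!-- source: {url} -->\n{body}")
    return "\n\n".join(parts) + "\n"


# Stand-in for real page content; in a build this would come from your docs source.
pages = {
    "https://acme.com/docs/start.md": "# Getting started\nInstall the CLI, then deploy.",
    "https://acme.com/docs/api.md": "# API reference\nEvery endpoint, with examples.",
}
llms_txt = """# Acme
> Acme is a deployment platform for full-stack apps.
## Docs
- [Getting started](https://acme.com/docs/start.md): install and first deploy
- [API reference](https://acme.com/docs/api.md): complete REST API
"""
full = build_llms_full(llms_txt, pages.__getitem__)
print(full)
```

Passing the loader in as a function keeps the same code usable at deploy time whether your pages live on disk or behind a URL.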
Full guide: what is llms-full.txt (and how to generate it).
5. A sitemap.xml with no <lastmod> dates
Your sitemap.xml might be valid and still fail this one: more than half of sites omit <lastmod> on their entries. Without it, an agent has no way to tell which pages are fresh and which are years stale, so it either re-crawls everything or trusts nothing.
Fix: add an ISO-8601 <lastmod> to every <url> entry, driven off each page’s real last-changed date (not the build timestamp; bumping them all on every deploy is its own anti-pattern):
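To see whether your own sitemap has this gap, you can list the entries that lack a date. The Python sketch below uses only the standard library and assumes a plain urlset sitemap in the standard sitemaps.org namespace (not a sitemap index). The function name and the second URL in the sample are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap.xml files.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_missing_lastmod(sitemap_xml: str) -> list[str]:
    """Return the <loc> of every <url> entry with no (or an empty) <lastmod>."""
    root = ET.fromstring(sitemap_xml)
    missing = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = url.findtext("sm:lastmod", default="", namespaces=NS).strip()
        if not lastmod:
            missing.append(loc)
    return missing


sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://acme.com/docs/getting-started</loc>
    <lastmod>2026-06-10</lastmod>
  </url>
  <url>
    <loc>https://acme.com/docs/configuration</loc>
  </url>
</urlset>"""
print(urls_missing_lastmod(sample))
# ['https://acme.com/docs/configuration']
```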
<url>
  <loc>https://acme.com/docs/getting-started</loc>
  <lastmod>2026-06-10</lastmod>
</url>
How to find which of these your site gets wrong
Run the full agent-readability score — it checks all of the above plus ~60 more and hands back a prioritised fix list specific to your site. Then re-check a single spec with the matching validator. And see the State of Agent Readability report for how your scores compare to everyone else’s.
Frequently asked questions
- What's the most common agent-readability mistake?
- Shipping an AGENTS.md without its core sections. Across the sites we've scanned, it's the single most-failed check: the file exists, but it has none of the Installation, Configuration, or Usage sections a coding agent reads to learn how to use the project. The fix is structural, not lengthy — two or three labelled sections with real commands and a code example.
- Do I need an AGENTS.md if my site isn't a code project?
- No. AGENTS.md is for codebases and developer tools — it briefs coding agents (Claude Code, Codex, Cursor) on how to work in your repo. A marketing site or blog doesn't need one, and Agent Ready won't penalise a content site for not having it. Every site, though, benefits from llms.txt and a sitemap.md.
- What's the difference between sitemap.xml and sitemap.md?
- sitemap.xml is the machine-only XML index search engines have used since 2005 — a flat list of every URL with optional lastmod dates. sitemap.md is a curated, human- and agent-readable markdown index: H2 sections grouping your most important pages with descriptive links. Agents can read and cite a sitemap.md directly; they have to parse XML to use a sitemap.xml. Ship both — they serve different readers.
- How do I find out which of these my own site gets wrong?
- Run a scan. The Agent Ready scanner checks all of these (and ~60 more) and returns a prioritised fix list with the exact failures for your site. Start at the full agent-readability score, or use the per-spec validators (llms.txt checker, AGENTS.md validator) for a targeted re-check.
- Where do these rankings come from?
- They're the most-failed checks across the public scans run on agent-ready.dev, deduplicated to one latest scan per domain. The live failure rates — and how they're computed — are on the State of Agent Readability page. The order here reflects that data; the fixes themselves are durable.