---
title: "llms.txt vs sitemap.xml: when to use which"
description: Comparison of llms.txt and sitemap.xml — audience, format, scope, and when to use each. Most sites should publish both.
last_updated: 2026-05-11
canonical_url: https://agent-ready.dev/llms-txt-vs-sitemap-xml
---

# llms.txt vs sitemap.xml

> Two machine-readable files at your site root, two different audiences. Here's when to use each — and why most sites should publish both.

## At a glance

| Criterion | llms.txt | sitemap.xml |
|---|---|---|
| Audience | AI agents and LLMs | Search engine crawlers |
| Format | Markdown | XML |
| Scope | Curated, high-value pages | Every indexable URL |
| Discovery | Read-on-fetch convention | Listed in robots.txt; submitted to Search Console |
| Standard | Proposal (late 2024) | Sitemaps Protocol since 2005 |
| Typical size | 10–100 entries | Hundreds to tens of thousands |
| Primary lift | AI citation accuracy | Search index coverage |

## What is llms.txt?

llms.txt is a Markdown file served at `/llms.txt` that gives AI systems a curated, plain-text map of a website's most important content. The format was [proposed in late 2024 by Jeremy Howard at Answer.AI](https://llmstxt.org). It is not yet an official web standard, but adoption has grown across documentation-heavy sites including Anthropic, Mintlify, and Cloudflare.
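
A minimal sketch of what the file looks like under the llmstxt.org proposal (the site name, URLs, and descriptions here are hypothetical):

```md
# Example Docs

> Developer documentation for the Example API: authentication, endpoints, and rate limits.

## Docs

- [Quickstart](https://example.com/docs/quickstart.md): Get an API key and make a first request
- [API reference](https://example.com/docs/api.md): Every endpoint, parameter, and error code

## Optional

- [Changelog](https://example.com/changelog.md): Release notes for recent versions
```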

## What is sitemap.xml?

sitemap.xml is an XML file served at `/sitemap.xml` that lists every URL on a site you want search engines to crawl, plus optional metadata like last-modified dates and change frequency. The [Sitemaps Protocol](https://www.sitemaps.org/) was published in 2005 and is [documented by Google](https://developers.google.com/search/docs/crawling-indexing/sitemaps/overview), [Bing](https://www.bing.com/webmasters/help/sitemaps-3b5cf6ed), and every major search engine.
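
A minimal example in the Sitemaps Protocol 0.9 format, with hypothetical URLs and dates:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2026-05-01</lastmod>
    <changefreq>weekly</changefreq>
  </url>
  <url>
    <loc>https://example.com/docs/quickstart</loc>
    <lastmod>2026-04-02</lastmod>
  </url>
</urlset>
```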

## How do llms.txt and sitemap.xml differ?

The two files solve different problems for different consumers. The biggest distinctions:

- **Audience.** LLMs and AI agents read llms.txt at inference time. Traditional crawlers (Googlebot, Bingbot) read sitemap.xml when building a search index.
- **Format.** Markdown is human-readable and trivially parseable by LLMs. XML is structured for crawlers and tooling.
- **Scope.** llms.txt is curated — only your most authoritative pages. sitemap.xml aims for completeness.
- **Discovery.** llms.txt is a read-on-fetch convention; clients try the path speculatively. sitemap.xml is referenced from `robots.txt` and submitted to webmaster consoles.
- **Standardisation.** sitemap.xml has had a formal protocol since 2005 with universal support. llms.txt is a proposed convention with growing adoption.

## When should I use llms.txt?

Publish an llms.txt if you have documentation, an API reference, policies, or any structured knowledge you want AI assistants to summarise accurately. It's most valuable when your site mixes templated marketing pages with a handful of authoritative ones — llms.txt tells an LLM exactly which pages to weight. Skip it if every page is equally important (e-commerce catalogues, for example); in that case sitemap.xml already does the job.

## When should I use sitemap.xml?

On every site that wants organic search traffic. sitemap.xml ensures search engines know about pages they might otherwise miss — deep pages, paginated archives, content discovered through JavaScript navigation. Google specifically recommends it for [sites with 500+ pages, frequently updated content, or limited internal linking](https://developers.google.com/search/docs/crawling-indexing/sitemaps/overview). There is no scenario where shipping one hurts.

## Should I use both?

Yes, for almost every site. They target separate retrieval systems, neither gets in the other's way, and both are cheap to generate from the same underlying URL list. Publish sitemap.xml for search engines, llms.txt for AI assistants, and use the [llms.txt validator](https://agent-ready.dev/llms-txt-checker) to confirm your llms.txt parses against the llmstxt.org spec.
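
To make the "same underlying URL list" point concrete, here is a hypothetical generator sketch; the page list, titles, and descriptions are illustrative, and a real site would pull them from its content source:

```python
from datetime import date
from xml.sax.saxutils import escape

# Single source of truth: (url, title, one-line description) per page.
PAGES = [
    ("https://example.com/docs/quickstart", "Quickstart", "Get an API key and make a first request"),
    ("https://example.com/docs/api", "API reference", "Endpoints, parameters, and error codes"),
]

def build_sitemap(pages):
    """Every URL, as Sitemaps Protocol 0.9 XML."""
    entries = "\n".join(
        f"  <url><loc>{escape(url)}</loc><lastmod>{date.today().isoformat()}</lastmod></url>"
        for url, _title, _desc in pages
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}\n"
        "</urlset>\n"
    )

def build_llms_txt(pages, site_name, summary):
    """A curated subset (here: all pages), as llms.txt Markdown."""
    links = "\n".join(f"- [{title}]({url}): {desc}" for url, title, desc in pages)
    return f"# {site_name}\n\n> {summary}\n\n## Docs\n\n{links}\n"

if __name__ == "__main__":
    with open("sitemap.xml", "w") as f:
        f.write(build_sitemap(PAGES))
    with open("llms.txt", "w") as f:
        f.write(build_llms_txt(PAGES, "Example Docs", "Developer documentation for the Example API."))
```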

## Frequently asked questions

### Does llms.txt replace sitemap.xml?

No. They serve different audiences and contain different information. sitemap.xml lists every indexable URL for search engines; llms.txt highlights your most authoritative content for AI assistants. Most sites should publish both.

### Do search engines read llms.txt?

Not as a ranking signal. The llms.txt convention is read by LLM-based clients (ChatGPT, Claude, Perplexity, Gemini) at inference time, not by traditional search crawlers as part of indexing. Treat it as an AI-assistant discovery file, not an SEO file.

### Do AI assistants read sitemap.xml?

Some do, opportunistically. Crawlers operated by AI companies (GPTBot, ClaudeBot, PerplexityBot) follow the same crawl conventions as search engines, and sitemap.xml can help them discover content. The format is verbose, which is why llms.txt was proposed as a more focused, AI-friendly channel.

### Where should each file live?

Both at the root of your site: `/sitemap.xml` and `/llms.txt`. Some llms.txt tooling also accepts `/docs/llms.txt`, but `/llms.txt` is the canonical path. Reference your sitemap from `/robots.txt` so crawlers can find it.
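
The sitemap reference is a single directive in `/robots.txt` (hostname illustrative):

```txt
Sitemap: https://example.com/sitemap.xml
```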

### How big should my llms.txt be?

Small. The point is curation — link to your 10–50 most useful URLs, not your entire content tree. If you want to expose more, ship an llms-full.txt companion file with the full content of those URLs concatenated as Markdown.

---

Read the full guide on the web: <https://agent-ready.dev/llms-txt-vs-sitemap-xml>

Validate your llms.txt: <https://agent-ready.dev/llms-txt-checker>

## Sitemap

See the full [sitemap](https://agent-ready.dev/sitemap.md) for all pages on agent-ready.dev.
