What Is an llms.txt File and Does Your Website Need One?

By Vigo Nordin, Co-Founder at SCALEBASE · Published March 30, 2026 · 6 min read

TL;DR

llms.txt is a proposed standard file (similar to robots.txt) placed at your domain root that tells AI crawlers what your site is about, which content to prioritize, and how to categorize your business. Adoption is still early — Perplexity and Claude honor it, ChatGPT support is emerging. Creating one takes under 30 minutes and has no downside.

What is the llms.txt standard?

llms.txt is a plain text file placed at your website's root (e.g., example.com/llms.txt) that provides structured information about your site to AI language models and their associated crawlers. The concept was proposed by Jeremy Howard in September 2024 and formalized through a community specification at llmstxt.org. It functions as a complement to robots.txt — where robots.txt tells crawlers what not to access, llms.txt tells AI systems what your site is, what it covers, and which pages are most important.

The file uses Markdown formatting and follows a defined section structure. At its core, it contains a site description, a list of key pages with brief annotations, and optional metadata about the organization. The specification deliberately keeps the format simple — no JSON, no XML, no schema vocabulary. This simplicity is intentional: the file is designed to be readable by both AI systems and humans.

The parallel to robots.txt is useful but imperfect. robots.txt emerged in 1994 when web crawlers needed access control rules. llms.txt emerged in 2024 when AI systems needed context about sites they were summarizing and citing. The underlying need is the same: a standardized way for site owners to communicate with automated systems. For a deeper look at how AI engines select citation sources, see how AI engines decide what to cite.

Which AI platforms read llms.txt?

Platform support for llms.txt is uneven as of March 2026. Perplexity and Anthropic's Claude have confirmed they read and incorporate llms.txt data. OpenAI has acknowledged the standard and ChatGPT's browsing mode shows partial support. Google's Gemini and Microsoft's Bing Copilot do not currently honor the file, though neither has ruled out future support.

| Platform | Reads llms.txt | Notes |
| --- | --- | --- |
| Perplexity | Yes | Confirmed since December 2024. Uses llms.txt data for site categorization and content prioritization. |
| Claude (Anthropic) | Yes | Confirmed. Reads llms.txt when available during web research tasks. |
| ChatGPT (OpenAI) | Emerging | GPT-4o browsing shows partial awareness. Not officially documented in OpenAI's crawling specs. |
| Gemini (Google) | No | No public acknowledgment. Google's AI crawling relies on the existing Search index and structured data. |
| Bing Copilot (Microsoft) | No | Uses Bing's existing index. No indication of llms.txt support in development. |

The adoption pattern mirrors robots.txt history. When robots.txt was introduced, major crawlers adopted it over 2-3 years. The pragmatic case for creating llms.txt now is that it takes under 30 minutes, has zero negative side effects, and positions your site for the platforms that do read it — particularly Perplexity, which drives measurable referral traffic to cited sources.

What should you include in your llms.txt file?

The llms.txt specification defines four main sections: a title/description header, a list of primary pages, a list of optional/secondary pages, and any additional context. Based on analysis of 1,200 llms.txt files indexed by llmstxt.org's directory as of February 2026, the most effective files share three characteristics: they are concise (under 500 lines), they annotate each URL with a one-sentence description, and they explicitly state the site's primary topic or business category.

Section-by-section breakdown

  • Title and description — Start with '# Site Name' followed by a 1-2 sentence description of what the site/business does. Be specific: 'B2B SaaS for inventory management' is useful; 'innovative solutions' is not.
  • Primary pages — Listed under '## Main' or as the first URL section. Include your 5-15 most important pages with annotations. Format: '- [Page Title](URL): One-sentence description of what this page covers.'
  • Secondary pages — Listed under '## Optional' or a similar heading. Include supporting pages that provide context: blog posts, documentation, case studies. Limit to 20-30 entries.
  • Additional context — Optional free-text section for information that doesn't fit elsewhere: business location, target audience, founding date, key team members. Keep it factual.

A practical example for a Mallorca restaurant might include the menu page, location/hours page, and reservation page as primary entries, with blog posts about local ingredients or seasonal menus as secondary entries. The description would state the cuisine type, location, and price range — the exact information a tourist would ask an AI about.
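To make that concrete, here is a sketch of what such a file could look like, following the section structure described above. The restaurant name and all URLs are invented for illustration:

```text
# Ca'n Ejemplo
> Family-run Mallorcan restaurant in Palma's old town serving seasonal island cuisine at mid-range prices.

## Main
- [Menu](https://canejemplo.example/menu): Current à la carte and tasting menus with prices.
- [Location & Hours](https://canejemplo.example/visit): Address in Palma's old town, opening hours, and how to get there.
- [Reservations](https://canejemplo.example/reserve): Online booking, group policies, and contact details.

## Optional
- [Seasonal Ingredients](https://canejemplo.example/blog/seasonal): Blog post on the local Mallorcan produce behind the menu.
```

Note how the blockquote description states cuisine type, location, and price range in one sentence, and every URL carries a short annotation.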

llms.txt works best when combined with other AI discoverability signals. Structured data via schema markup, consistent entity information, and AEO-optimized content create a layered system where llms.txt provides the overview and your actual pages provide the detail.

How do you add llms.txt to a Next.js site?

Adding llms.txt to a Next.js project is a three-step process that takes approximately 10 minutes. Next.js powers an estimated 128,000+ production sites (Wappalyzer, 2025), which makes it one of the most common frameworks this implementation question comes up for.

Step 1: Create the file in public/

Create a file named llms.txt in your project's public/ directory. Next.js serves everything in public/ from the domain root, so public/llms.txt becomes yourdomain.com/llms.txt automatically. No routing configuration is needed. Write the file content following the specification structure: title, description, primary pages, optional pages.
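Assuming a standard Next.js project layout (the project name here is hypothetical), the mapping from file to URL looks like this:

```text
my-next-app/
├─ app/  (or pages/)
├─ public/
│  └─ llms.txt        ← served as-is at https://yourdomain.com/llms.txt
└─ next.config.js
```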

Step 2: Verify at domain root

After deploying, confirm the file is accessible at yourdomain.com/llms.txt. It should return plain text with a 200 status code. If you're using middleware or custom headers in next.config.js, ensure the file isn't being redirected or blocked. Check that the Content-Type header is text/plain — Next.js handles this correctly by default for .txt files.

Step 3: Test with curl

Run 'curl -I https://yourdomain.com/llms.txt' to verify the response headers. You should see HTTP 200, Content-Type: text/plain, and no redirect chains. Then run 'curl https://yourdomain.com/llms.txt' to verify the full content renders correctly. If you maintain multiple environments (staging, production), verify on production specifically — staging may have different access rules.
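If you prefer to script the check, the same three conditions (HTTP 200, text/plain, no redirect) can be verified with a short Python sketch using only the standard library. The domain is a placeholder, and the helper names are my own, not part of any llms.txt tooling:

```python
import urllib.request
from urllib.error import HTTPError

# Placeholder — replace with your production domain before running.
LLMS_URL = "https://yourdomain.com/llms.txt"


def check_llms_response(status, content_type, final_url, requested_url):
    """Return a list of problems with an llms.txt HTTP response.

    An empty list means the file is served as recommended:
    HTTP 200, Content-Type text/plain, and no redirect away
    from the requested URL.
    """
    problems = []
    if status != 200:
        problems.append(f"expected HTTP 200, got {status}")
    if not content_type.startswith("text/plain"):
        problems.append(f"expected text/plain, got {content_type!r}")
    if final_url != requested_url:
        problems.append(f"redirected to {final_url}")
    return problems


def fetch_and_check(url=LLMS_URL):
    # urlopen raises HTTPError for 4xx/5xx responses, so catch it
    # and report the status code as a problem instead of crashing.
    try:
        with urllib.request.urlopen(url) as resp:
            return check_llms_response(
                resp.status,
                resp.headers.get("Content-Type", ""),
                resp.geturl(),  # final URL after any redirects
                url,
            )
    except HTTPError as err:
        return [f"expected HTTP 200, got {err.code}"]
```

Calling `fetch_and_check()` against your production domain should return an empty list; anything else names the specific header or redirect problem to fix.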

SCALEBASE implements llms.txt as part of its AEO service, including initial creation, quarterly content updates, and monitoring to confirm AI platforms are reading the file. For sites already running Next.js, the implementation is typically completed within a single sprint.

Frequently Asked Questions

Is llms.txt an official web standard?

No. It is a community-proposed specification, not a W3C or IETF standard. The specification is maintained at llmstxt.org and has been adopted by several AI platforms, but it has no formal standards body endorsement. Its status is comparable to robots.txt in its early years — widely adopted before formal standardization.

Does llms.txt replace robots.txt?

No. The two files serve different purposes. robots.txt controls crawler access — which pages can and cannot be crawled. llms.txt provides context about your site — what it covers, which pages matter most, and how to categorize your business. You should maintain both files. They are complementary, not competing.

Can llms.txt hurt your SEO?

No. llms.txt is not read by Googlebot or any traditional search engine crawler. It has no impact on indexing, crawling, or ranking in Google, Bing, or other search engines. The file is exclusively consumed by AI systems. There is no known mechanism by which llms.txt could negatively affect search performance.

How often should you update llms.txt?

Update it whenever your site's structure changes significantly: new product launches, major content additions, URL restructuring, or business pivots. For most sites, a quarterly review is sufficient. If you publish content frequently (weekly blog posts), you don't need to add every post — focus on evergreen pages and cornerstone content that define what your site is about.

Vigo Nordin

Co-Founder of SCALEBASE, a specialist AEO and SEO agency based in Mallorca, Spain. Focused on AI search optimization, entity building, and engineering citations across ChatGPT, Perplexity, and Google AI Overviews.

LinkedIn

Ready to apply this to your business?

Stop being invisible to AI. Start being the answer your customers find.