Free tool · MIT-style usage · No signup

llms.txt builder: generate the AI-citation file for your site.

llms.txt is a standard proposed by Jeremy Howard in 2024 for telling LLMs and AI agents which pages on your site to prioritize when answering questions. A 2026 analysis of over 515 million LLM bot traffic events found that it has near-zero direct impact on ChatGPT or Perplexity citations today. It does have emerging value in the agentic web layer, where AI agents acting on behalf of users fetch /llms.txt to navigate sites. That makes it worth implementing.

About this tool.

The llms.txt format was proposed by Jeremy Howard in 2024 as a simple markdown manifest sitting at the root of a website. It tells LLMs and AI agents which pages on your site are canonical for citation purposes, and provides a structured directory of the content they should prioritize when answering user questions.

Adoption has grown but the empirical evidence on direct citation lift is sobering. A 2026 analysis of over 515 million LLM bot traffic events found no statistically significant correlation between llms.txt presence and citation frequency on ChatGPT, Perplexity, or Claude. Less than 0.1 percent of citation-driving bot requests touched /llms.txt at all. The major AI engines do not consume the file as input for retrieval-augmented generation.

Where llms.txt does work in 2026 is the emerging agentic web layer. AI coding assistants like Claude Code, Cursor, and GitHub Copilot fetch llms.txt frequently when working with a documentation site. AI agents acting on behalf of users (booking appointments, ordering products, comparing vendors) read it to navigate. The Model Context Protocol ecosystem increasingly references it.

ThatDeveloperGuy has shipped llms.txt across the entire ThatWebHostingGuy substrate (130+ client sites) since early 2026. We treat it as future-proofing infrastructure rather than current citation lift. The cost is negligible (one file per site) and the upside compounds as the agentic layer matures.

The format is intentionally minimal: a single H1 with your site name, an optional blockquote description, and one or more H2 sections listing canonical pages as markdown links. An extended format, llms-full.txt, exists for sites that want to provide their full content for AI ingestion. We recommend both: llms.txt as the canonical index and llms-full.txt as the deeper corpus.
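
The structure described above can be sketched in a few lines of Python. This is a minimal illustration, not the builder's or the aio-surfaces toolkit's actual implementation: the names Link, Section, and render_llms_txt are hypothetical, and the "- [title](url): description" bullet layout is assumed from the public llms.txt proposal.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    """One canonical page (hypothetical helper type)."""
    title: str
    url: str
    description: str = ""


@dataclass
class Section:
    """One H2 section grouping related links (hypothetical helper type)."""
    heading: str
    links: list[Link] = field(default_factory=list)


def render_llms_txt(site_name: str, summary: str, sections: list[Section]) -> str:
    """Render an H1, an optional blockquote, and H2 sections of markdown links."""
    lines = [f"# {site_name}", ""]
    if summary:
        # The blockquote description is optional in the format.
        lines += [f"> {summary}", ""]
    for section in sections:
        lines += [f"## {section.heading}", ""]
        for link in section.links:
            entry = f"- [{link.title}]({link.url})"
            if link.description:
                entry += f": {link.description}"
            lines.append(entry)
        lines.append("")
    # Normalize to exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    docs = Section("Docs", [
        Link("Getting started", "https://yoursite.com/docs/start", "Install and first steps"),
    ])
    print(render_llms_txt("Example Site", "A short description.", [docs]), end="")
```

Keeping the renderer a pure function from data to text makes it easy to call from a build step or a test without touching the file system.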

Want a production-grade implementation that generates llms.txt, llms-full.txt, aeo.json, entity.json, brand.json, and ai.txt from a single typed site config? Use ThatDeveloperGuy's open-source aio-surfaces toolkit (MIT licensed) at pypi.org/project/aio-surfaces. It is the same toolkit that powers AI-citation surface generation across 130+ production sites.

FAQ.

Does llms.txt actually help my site appear in ChatGPT?

Not directly, as of May 2026. An analysis of over 515 million LLM bot traffic events found no statistically significant citation lift. However, the agentic web layer (AI agents acting on behalf of users) does read the file, and that ecosystem is growing rapidly. Implement it as future-proofing.

What's the difference between llms.txt and llms-full.txt?

llms.txt is the canonical index: a one-page directory of your most important URLs with brief descriptions. llms-full.txt is the extended corpus: your full content for AI engines to ingest. Most sites can start with llms.txt alone and add llms-full.txt later.

Where do I put the file?

At the root of your site: https://yoursite.com/llms.txt. Same convention as robots.txt and sitemap.xml.
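
As a small illustration of the root-of-site convention, the sketch below derives the expected llms.txt location from any page URL on the same site. The function name llms_txt_url is hypothetical; urljoin is Python's standard-library resolver.

```python
from urllib.parse import urljoin


def llms_txt_url(any_page_url: str) -> str:
    """Return the root-level llms.txt URL for the site that serves any_page_url."""
    # An absolute path ("/llms.txt") replaces the page's path and drops
    # its query string, keeping only the scheme and host.
    return urljoin(any_page_url, "/llms.txt")


if __name__ == "__main__":
    print(llms_txt_url("https://yoursite.com/blog/some-post?ref=home"))
```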

Do I need to update it when content changes?

Yes. The file is meant to represent your current canonical content, so regenerate it whenever you publish or remove significant pages. Ideally, automate this in your build pipeline.
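
One way to wire regeneration into a build step is to rewrite the file only when its content actually changes, which keeps deploy diffs and timestamps quiet. The sketch below assumes your build already produces the llms.txt text as a string. The helper name write_if_changed and the public/llms.txt output path are illustrative, not part of any particular toolchain.

```python
from pathlib import Path


def write_if_changed(path: Path, content: str) -> bool:
    """Write content to path unless the file already holds identical text.

    Returns True when the file was created or updated, False otherwise.
    """
    if path.exists() and path.read_text(encoding="utf-8") == content:
        return False
    # Create the output directory on first run.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
    return True


if __name__ == "__main__":
    # Hypothetical output location; adjust to wherever your site root is built.
    updated = write_if_changed(Path("public/llms.txt"), "# Example Site\n")
    print("llms.txt updated" if updated else "llms.txt unchanged")
```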

Is there a Schema.org alternative?

Not yet. Schema.org's WebSite + Action + Dataset types overlap with some llms.txt use cases, but the AI agent ecosystem has standardized on the llms.txt file location convention. Both can coexist.

Should I list every page in llms.txt?

No. The point is curation: list the canonical, evergreen, high-quality pages that AI engines should treat as authoritative. Burying signal in noise defeats the purpose.

Built by Joseph W. Anady at ThatDeveloperGuy. Need professional help? Get a free 48-hour audit.