Straight answers to common llms.txt questions.
What is an llms.txt file?
An llms.txt file is a Markdown text file at your domain root (/llms.txt) that lists your site's most important pages with short descriptions for AI systems. It gives language models a curated map of your content without requiring them to crawl your entire site. The convention is documented at llmstxt.org.
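To make the layout concrete, here is a minimal Python sketch that renders a file in the shape described at llmstxt.org: an H1 with the site name, a blockquote summary, then H2 sections containing link lists. The `Page` class, the `render_llms_txt` helper, and the example.com content are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    url: str
    description: str


def render_llms_txt(site_name: str, summary: str, sections: dict[str, list[Page]]) -> str:
    """Render an llms.txt document: H1 title, blockquote summary, H2 link sections."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for page in pages:
            # Each entry is a Markdown link followed by a short description.
            lines.append(f"- [{page.title}]({page.url}): {page.description}")
        lines.append("")
    # End the file with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


# Made-up example content; replace with your own pages.
example = render_llms_txt(
    "Example Co",
    "Hypothetical site used only for illustration.",
    {
        "Docs": [
            Page(
                "Quickstart",
                "https://example.com/docs/quickstart",
                "Install and run in five minutes.",
            )
        ]
    },
)
print(example)
```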
Is this llms.txt generator free?
Yes — this generator is 100% free with no signup, no credits, and no login required. We crawl your public pages and build the file from HTML titles and meta descriptions without using AI. You can copy or download the result instantly.
Where do I put the llms.txt file?
Upload it to the root of your domain so it is reachable at https://yourdomain.com/llms.txt. Some documentation sites use a subpath like /docs/llms.txt, but the root is the most common location and the one crawlers are most likely to check first.
What's the difference between llms.txt and llms-full.txt?
llms.txt is a curated link index with short descriptions — what most sites need. llms-full.txt inlines the full text of those pages for deeper context and can be very large. Start with llms.txt; add llms-full.txt only if you have a specific use case that needs full page content.
Does llms.txt help SEO or Google rankings?
No — llms.txt is not a Google ranking factor, and Google has said it does not use llms.txt for search indexing. Its real value is in AI tooling, developer discovery, and giving coding assistants a structured overview of your site. Keep investing in sitemaps, meta tags, and content quality for SEO.
Do ChatGPT, Claude, Gemini, and Perplexity actually read llms.txt?
No major AI provider has officially committed to using llms.txt in production inference. Adoption is uncertain, though some crawlers occasionally fetch /llms.txt. The strongest confirmed use today is AI coding assistants and developer tooling that consume the file when pointed at your repo or docs.
Is llms.txt the same as robots.txt? Does it block AI crawlers?
No — llms.txt is an inclusion and curation file, not a blocking mechanism. It does not tell crawlers what they may or may not access. To restrict AI crawlers like GPTBot or ClaudeBot, use robots.txt rules and each provider's opt-out documentation.
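As a rough illustration of the robots.txt side, the sketch below uses Python's standard `urllib.robotparser` to check how a sample rule set treats different user agents. The rules and the `ExampleBot` name are made up for the example; whether a given crawler honors them is up to that provider, so check its documentation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block two AI crawlers, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the parsed rules let this user agent fetch the URL."""
    return parser.can_fetch(user_agent, url)


for agent in ("GPTBot", "ClaudeBot", "ExampleBot"):
    verdict = "allowed" if is_allowed(agent, "https://example.com/pricing") else "blocked"
    print(f"{agent}: {verdict}")
```

Note that nothing in this check involves llms.txt: access control lives entirely in robots.txt.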
How do I check if AI crawlers are fetching my llms.txt?
Filter your server access logs for requests to /llms.txt and inspect the User-Agent header. Look for identifiers such as GPTBot, ClaudeBot, and PerplexityBot. Google-Extended is a robots.txt control token rather than a separate crawler user agent, so do not expect to see it in logs. A single fetch does not mean ongoing use, so track patterns over weeks.
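One possible way to do this filtering is sketched below in Python. It assumes your server writes the common "combined" access log format (as nginx and Apache do by default); the regular expression, the `count_llms_txt_fetches` helper, and the sample log lines are illustrative, not taken from any real server.

```python
import re
from collections import Counter

# Assumes combined log format:
# IP - - [day:time zone] "METHOD path protocol" status bytes "referer" "user agent"
LOG_LINE = re.compile(
    r'\[(?P<day>[^:\]]+):[^\]]*\] "(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)
AI_AGENTS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def count_llms_txt_fetches(lines):
    """Count /llms.txt requests per (day, crawler name)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue
        # Ignore any query string when comparing the path.
        if match["path"].split("?", 1)[0] != "/llms.txt":
            continue
        for name in AI_AGENTS:
            if name.lower() in match["agent"].lower():
                counts[(match["day"], name)] += 1
    return counts


# Made-up sample lines for demonstration only.
sample = [
    '203.0.113.5 - - [10/Oct/2024:13:55:36 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.5 - - [10/Oct/2024:18:02:11 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '198.51.100.7 - - [11/Oct/2024:09:14:02 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '203.0.113.5 - - [11/Oct/2024:09:20:00 +0000] "GET /index.html HTTP/1.1" 200 900 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '192.0.2.44 - - [11/Oct/2024:10:00:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]
fetches = count_llms_txt_fetches(sample)
for (day, name), hits in sorted(fetches.items()):
    print(day, name, hits)
```

Grouping by day makes it easier to see whether a crawler returns regularly or fetched the file only once.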
How big should the file be?
Curate your best pages rather than listing every URL on your site. Aim well under typical model context limits, with roughly 150K words or ~700KB as a practical ceiling, and move anything beyond that into llms-full.txt or section-specific files. Quality and focus beat exhaustive dumps.
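A quick size check along these lines can be sketched in a few lines of Python. The thresholds mirror the rough ceiling above; treating 1KB as 1024 bytes and counting words by splitting on whitespace are assumptions, and `measure` and `within_budget` are hypothetical helper names.

```python
MAX_WORDS = 150_000
MAX_BYTES = 700 * 1024  # assumption: 1KB = 1024 bytes


def measure(text: str) -> tuple[int, int]:
    """Return (whitespace-separated word count, UTF-8 size in bytes)."""
    return len(text.split()), len(text.encode("utf-8"))


def within_budget(text: str) -> bool:
    """True if the text stays under both the word and byte ceilings."""
    words, size = measure(text)
    return words <= MAX_WORDS and size <= MAX_BYTES


small_file = "# Example Co\n\n- [Docs](https://example.com/docs): Product documentation.\n"
print(measure(small_file), within_budget(small_file))
```

In practice you would read your generated file from disk and pass its contents to `within_budget`.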
Should I create a Markdown copy of every page?
Generally no — publishing indexable .md mirrors of every HTML page can create duplicate-content issues if search engines crawl them. llms.txt links to your canonical URLs with descriptions; that is usually enough. Use llms-full.txt only when you deliberately want inlined full text for LLM context.
How often should I update llms.txt?
Update when you add, remove, or significantly change key pages — new product areas, docs sections, or landing pages. For active sites, regenerating from your sitemap monthly or on each deploy (via CI) keeps the file current without manual edits.
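If you regenerate in CI, a small drift check can flag when the file no longer matches the sitemap. The Python sketch below assumes a standard sitemaps.org XML sitemap and Markdown-style links in llms.txt; `sitemap_urls`, `find_drift`, and the sample data are hypothetical. Because llms.txt is curated, URLs that appear only in the sitemap are candidates to review rather than errors.

```python
import re
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
# Matches the URL part of a Markdown link: [title](https://...)
LINK = re.compile(r"\]\((https?://[^)\s]+)\)")


def sitemap_urls(xml_text: str) -> list[str]:
    """Extract every <loc> URL from a sitemaps.org <urlset> document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS) if loc.text]


def find_drift(sitemap_xml: str, llms_txt: str) -> tuple[list[str], list[str]]:
    """Return (URLs only in the sitemap, URLs only in llms.txt)."""
    in_sitemap = set(sitemap_urls(sitemap_xml))
    in_file = set(LINK.findall(llms_txt))
    return sorted(in_sitemap - in_file), sorted(in_file - in_sitemap)


# Made-up sample inputs for demonstration.
SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/docs/quickstart</loc></url>
  <url><loc>https://example.com/docs/api</loc></url>
</urlset>"""
LLMS_TXT = """# Example Co

## Docs

- [Quickstart](https://example.com/docs/quickstart): Install and run.
- [Old guide](https://example.com/docs/old-guide): Removed page.
"""

candidates, stale = find_drift(SITEMAP, LLMS_TXT)
print("Review for inclusion:", candidates)
print("Possibly stale links:", stale)
```

A CI job could fail the build, or simply post a warning, whenever the stale list is not empty.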
Does this tool use AI or Gemini?
No. The generator crawls your public HTML and extracts page titles, meta descriptions, and URL structure programmatically. Nothing is sent to Gemini or any other AI model to write your file.
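For a sense of what programmatic extraction can look like, here is a minimal Python sketch using only the standard library's `html.parser`. It is not this tool's actual code; the `TitleAndDescription` class, the `extract` helper, and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser


class TitleAndDescription(HTMLParser):
    """Collect the <title> text and the first <meta name="description"> content."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attributes = dict(attrs)
            is_description = (attributes.get("name") or "").lower() == "description"
            if is_description and not self.description:
                self.description = (attributes.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract(html: str) -> tuple[str, str]:
    """Return (title, meta description) parsed from an HTML string."""
    parser = TitleAndDescription()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), parser.description


# Made-up page for demonstration.
SAMPLE_HTML = (
    "<html><head><title> Pricing | Example Co </title>"
    '<meta name="description" content="Plans and pricing.">'
    "</head><body><h1>Pricing</h1></body></html>"
)
print(extract(SAMPLE_HTML))
```

No model is involved at any step: the output is whatever the page's own markup declares.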
Sitemap URL vs website URL — which should I use?
Use the sitemap URL for blogs and large sites, since it discovers every listed page quickly. Use the website URL for smaller sites, where we start at the homepage and follow internal links. Paste specific URLs when you only want a hand-picked subset.
Why can't my site be crawled?
Some sites block automated requests through bot protection or WAF rules, for example those provided by Cloudflare. Try pasting specific URLs manually, or use your sitemap URL if it is publicly accessible without authentication.