AI search engines are now primary entry points for business discovery. Traditional SEO gets you found on Google. AI SEO — also called Generative Engine Optimization (GEO) — gets your business cited inside AI-generated answers. This guide covers the exact steps to make your website readable, discoverable, and citable by AI systems.
An HTML page costs an AI agent ~16,000 tokens to process. The same content in markdown costs ~3,150 — an 80% reduction. AI systems prefer structured, machine-readable content. This guide shows you how to deliver it.
Create llms.txt
llms.txt is the AI equivalent of robots.txt. Proposed by Jeremy Howard (Answer.AI), it tells LLMs what your site is about and where to find structured content. Place it at the root: https://yourdomain.com/llms.txt
# Company Name

> One-sentence description of who you are and what you do.

## Core Pages
- [Products](https://yourdomain.com/docs/products.md): what you sell
- [FAQs](https://yourdomain.com/docs/faqs.md): common questions

## Optional
- [Full Content](https://yourdomain.com/llms-full.txt): all content merged
Key rules
- Lead with a `>` blockquote — AI uses this to decide relevance before fetching anything else
- Link to markdown files, not HTML pages (80% more token-efficient)
- Include an `llms-full.txt` for AI agents that want everything in one fetch
- Keep it concise — AI agents parse this before deciding what to retrieve
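A quick script can sanity-check a draft against these rules before you publish it. This is a minimal sketch, not an official validator — the checks simply mirror the bullets above, and the `check_llms_txt` helper and sample draft are illustrative:

```python
def check_llms_txt(text: str) -> list[str]:
    """Return a list of problems found in an llms.txt draft."""
    problems = []
    lines = [l for l in text.splitlines() if l.strip()]
    if not lines or not lines[0].startswith("# "):
        problems.append("should start with an H1 site name")
    if not any(l.startswith("> ") for l in lines[:3]):
        problems.append("should lead with a > blockquote summary")
    if ".md" not in text:
        problems.append("should link to markdown files, not HTML pages")
    if len(text) > 4000:
        problems.append("should stay concise (a few KB at most)")
    return problems

draft = """# Acme Corp
> Acme sells anvils online.

## Core Pages
- [Products](https://acme.example/docs/products.md): catalog
"""
print(check_llms_txt(draft))  # → []
```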
Update robots.txt for AI Crawlers
Many sites accidentally block AI crawlers with aggressive Disallow rules. Explicitly allow the bots that matter:
| Bot User-Agent | Platform |
|---|---|
| GPTBot | ChatGPT (training) |
| OAI-SearchBot | ChatGPT (search / RAG) |
| PerplexityBot | Perplexity |
| Claude-Web | Claude |
| anthropic-ai | Anthropic crawlers |
| Google-Extended | Gemini training |
| cohere-ai | Cohere |
| Meta-ExternalAgent | Meta AI |
User-agent: *
Allow: /

User-agent: GPTBot
Allow: /

User-agent: PerplexityBot
Allow: /

Sitemap: https://yourdomain.com/sitemap.xml
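You can confirm the rules behave as intended with Python's standard-library robots.txt parser — a quick local check, not part of any deployment:

```python
from urllib.robotparser import RobotFileParser

robots = """\
User-agent: *
Allow: /

User-agent: GPTBot
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots.splitlines())

# Both a generic crawler and GPTBot should be permitted.
print(parser.can_fetch("GPTBot", "https://yourdomain.com/docs/products.md"))  # True
print(parser.can_fetch("SomeOtherBot", "https://yourdomain.com/"))            # True
```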
Create Markdown Content Files
Create a docs/md/ directory with one file per topic. AI agents retrieve these instead of parsing your HTML.
Recommended files
- `README.md` — company overview, key stats, quick links
- `products.md` — what you sell, features, how it works
- `faqs.md` — Q&A format (AI loves direct FAQ structure)
- `pricing.md` — declarative pricing facts
- `integrations.md` — what connects with your platform
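For example, a `faqs.md` might look like this — an illustrative skeleton where the company name, URL, and figures are placeholders to replace with your own:

```markdown
<!-- canonical: https://yourdomain.com/faqs -->
# Frequently Asked Questions

## What does Your Company do?
Your Company provides X for Y customers in Z markets.

## How much does it cost?
Plans start at $N per month. See pricing.md for details.
```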
Write for AI extraction
- Use `##` headings for every major section — AI uses these as chunk boundaries
- Write declarative statements: "Bermuda processes 50,000 quotes per month" — not "we're industry-leading"
- Put the most important fact in the first sentence of each section
- Use bullet lists — AI extracts these as structured facts
- Include the canonical URL at the top of each file
Also create llms-full.txt at the root — all markdown files merged into one. AI agents that want a complete picture get it in a single HTTP request.
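Generating `llms-full.txt` can be a small build step. Here is a minimal sketch that concatenates a directory of markdown files; the `docs/md/` layout matches this guide, but adjust paths and separators to your repo:

```python
from pathlib import Path

def build_llms_full(src_dir: str, out_file: str) -> int:
    """Concatenate every markdown file in src_dir into one llms-full.txt.

    Returns the number of files merged. Files are joined with a
    horizontal rule so AI agents can still see document boundaries.
    """
    files = sorted(Path(src_dir).glob("*.md"))
    parts = [md.read_text(encoding="utf-8").strip() for md in files]
    Path(out_file).write_text("\n\n---\n\n".join(parts) + "\n", encoding="utf-8")
    return len(files)
```

Run it as part of your deploy (`build_llms_full("docs/md", "llms-full.txt")`) so the merged file never drifts out of date.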
Create sitemap.xml
Without a sitemap, AI crawlers may miss your content entirely. Include your markdown files alongside HTML pages, and keep <lastmod> current — Perplexity heavily cites content less than one year old.
<url>
  <loc>https://yourdomain.com/docs/md/products.md</loc>
  <lastmod>2026-03-17</lastmod>
  <priority>0.7</priority>
</url>
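Entries like this can be generated from the files themselves so `<lastmod>` stays current automatically. A sketch, assuming the `docs/md/` layout from this guide and using each file's modification time:

```python
from datetime import datetime, timezone
from pathlib import Path

def sitemap_entries(src_dir: str, base_url: str) -> str:
    """Emit a <url> entry per markdown file, with file mtime as lastmod."""
    out = []
    for md in sorted(Path(src_dir).glob("*.md")):
        lastmod = datetime.fromtimestamp(md.stat().st_mtime, tz=timezone.utc)
        out.append(
            "  <url>\n"
            f"    <loc>{base_url}/docs/md/{md.name}</loc>\n"
            f"    <lastmod>{lastmod:%Y-%m-%d}</lastmod>\n"
            "    <priority>0.7</priority>\n"
            "  </url>"
        )
    return "\n".join(out)
```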
Set Content-Type: text/markdown Headers
AI agents like Claude Code and OpenCode send Accept: text/markdown request headers. Your server should respond with the correct MIME type so they get clean markdown.
Vercel (vercel.json)
{
"headers": [
{
"source": "/(.*\\.md)",
"headers": [
{ "key": "Content-Type", "value": "text/markdown; charset=utf-8" }
]
}
]
}
Cloudflare
Enable Markdown for Agents in the Cloudflare dashboard (Beta, free for Pro+ plans). Cloudflare automatically converts any HTML page to markdown when an AI agent requests it with Accept: text/markdown — no separate markdown files needed.
Nginx
location ~* \.md$ {
add_header Content-Type "text/markdown; charset=utf-8";
}
Add JSON-LD Structured Data
Structured data helps both traditional search engines and AI systems understand your content type, author, and organization. Add to every HTML page:
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "Organization",
"name": "Your Company",
"url": "https://yourdomain.com",
"description": "What your company does.",
"contactPoint": {
"@type": "ContactPoint",
"telephone": "+1-000-000-0000",
"email": "info@yourdomain.com"
}
}
</script>
For blog and guide pages, also add Article and FAQPage schemas. This page uses both.
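A minimal FAQPage block looks like this — the question and answer text are placeholders, while the `@type` names come from the schema.org vocabulary:

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is GEO?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Generative Engine Optimization makes content citable by AI systems."
      }
    }
  ]
}
```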
Content Strategy for AI Citation
AI systems cite content that is authoritative, specific, and easy to extract. Your website alone is not enough — AI pulls from multiple sources.
Write for RAG, not just readers
RAG (Retrieval-Augmented Generation) is how ChatGPT search and Perplexity work: they search the web live, pull chunks of content, and feed them into the AI model. Your content competes to be retrieved and selected.
- Gets cited: specific numbers, FAQ format, clear section headings, summary sentences, bullet lists
- Gets ignored: vague claims, long unbroken paragraphs, content buried under navigation
Build presence on sources AI cites most
- Wikipedia / Crunchbase / G2 — complete these profiles thoroughly
- LinkedIn — publish thought leadership articles with your target keywords
- Reddit — participate genuinely in relevant communities
- YouTube — create transcribed video content (AI reads transcripts)
- Industry publications — digital PR and editorial mentions carry more weight than backlinks
Use identical company descriptions on LinkedIn, Crunchbase, X, and GitHub. Brand consistency across platforms accelerates AI visibility — some companies see ChatGPT visibility improvements within days.
Quick Checklist
- `llms.txt` at root with company summary and markdown links
- `llms-full.txt` at root with all content merged
- `robots.txt` explicitly allowing GPTBot, PerplexityBot, Claude-Web
- `sitemap.xml` including markdown document URLs
- `docs/md/` directory with topic-specific markdown files
- `Content-Type: text/markdown` headers configured for `.md` files
- JSON-LD structured data on all key pages
- Consistent brand descriptions on Crunchbase, LinkedIn, G2
- FAQ content written in declarative Q&A format
- Google Analytics segment for AI traffic (ChatGPT, Perplexity, Claude)
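The AI-traffic segment in the checklist can be driven by referrer hostnames. A sketch of the matching logic — the hostname list reflects the platforms named in this guide and is an assumption that will need updating as products change:

```python
from urllib.parse import urlparse

# Referrer hostnames that indicate AI-driven visits (illustrative list).
AI_REFERRERS = {
    "chat.openai.com", "chatgpt.com",
    "perplexity.ai", "www.perplexity.ai",
    "claude.ai", "gemini.google.com",
}

def is_ai_traffic(referrer_url: str) -> bool:
    """True if a visit's referrer is a known AI platform."""
    host = urlparse(referrer_url).netloc.lower()
    return host in AI_REFERRERS

print(is_ai_traffic("https://chatgpt.com/"))     # True
print(is_ai_traffic("https://www.google.com/"))  # False
```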
Frequently Asked Questions
**How is GEO different from traditional SEO?**
SEO (Search Engine Optimization) targets Google's ranking algorithm to get blue links. GEO (Generative Engine Optimization) targets AI language models to get cited inside AI-generated answers. Both matter — but GEO requires structured, machine-readable content rather than keyword density.
**Does llms.txt replace robots.txt?**
No. robots.txt controls crawler access (which bots can crawl which pages). llms.txt provides AI-readable content about your site's structure. You need both.
**How do I get cited in ChatGPT answers?**
ChatGPT uses RAG (web search) for current information. Allowing GPTBot and OAI-SearchBot in robots.txt, having a sitemap, and providing structured content increases the probability of being retrieved and cited in ChatGPT answers.
**How quickly does GEO show results?**
RAG-based systems (Perplexity, ChatGPT search) can reflect changes within days to weeks. Foundation model training data has cutoffs — influence there is longer-term and requires external citation building.
**Are markdown files required, or is HTML enough?**
AI systems can process HTML, but it costs 5–10x more tokens. Clean HTML with good semantic structure performs better than average, but dedicated markdown files give AI agents the most efficient path to your content.
See This in Action
We applied every step in this guide to thebermuda.us. Explore our implementation:
- llms.txt
- llms-full.txt
- robots.txt
- sitemap.xml