In this module
LLM SEO vs traditional SEO: what's different
Traditional SEO targets a search engine's document-retrieval system. Google crawls your page, indexes it, and when users search, it returns a ranked list of URLs. Your goal is to rank in that list and get clicks.
LLM SEO (also called Generative Engine Optimization, or GEO) targets a fundamentally different system. Language models don't return a list of URLs — they synthesize an answer from training data and, where available, real-time retrieval (Perplexity, Bing Copilot, ChatGPT with search enabled). The user may never visit your site at all. But they might be told by ChatGPT "according to [your site], the best option for your use case is X" — and that recommendation shapes their decision.
Key differences:
- Citation vs ranking: In traditional SEO you're competing for a rank position. In LLM SEO you're trying to be cited as a source — included in the model's synthesized answer.
- Training data vs live retrieval: Some LLMs (ChatGPT without search, Claude without retrieval) draw on training data with a knowledge cutoff. Others (Perplexity, Bing Copilot, ChatGPT with search) pull live web results. Different optimization approaches apply to each.
- Verifiable claims over opinions: LLMs prefer to cite factual, specific, verifiable claims. Your opinion about which tool is best carries less weight; a comparison table with actual pricing data carries more.
How LLMs select sources for citation
There's no public "how we cite sources" documentation from OpenAI or Anthropic, but from testing and academic research on RAG (retrieval-augmented generation) systems, several patterns emerge:
For training-data-based responses:
- Sites that were in the training corpus and received consistent signals of quality (links, citations from other sources, social shares) are weighted more heavily.
- Content that explicitly states clear, factual claims tends to be incorporated into the model's "knowledge"; vague opinion content that doesn't anchor to anything specific tends not to be.
- Sites that appear consistently when a topic is discussed across many web pages get stronger entity associations inside the model.
For live retrieval systems (Perplexity, ChatGPT with search):
- Content that directly and concisely answers the query tends to be preferred — the system is looking for the best "snippet" to cite.
- Pages that load fast, have clean HTML structure, and aren't behind paywalls or cookie walls are easier to retrieve and parse.
- Structured data (FAQ schema, HowTo schema) can help map your content to query patterns these systems are answering.
- Perplexity in particular tends to favor pages with clear attribution (author name + credentials, updated dates), similar to Google's E-E-A-T signals.
LLM search behavior is changing faster than any other part of SEO. The tactics in this module reflect what's working as of early 2026. Check back here — this module gets updated more frequently than any other in the course.
Writing patterns that LLMs pull from
These content patterns show up consistently in LLM-cited content:
1. Direct answer first
Lead every section with a direct, complete answer to the implied question. Don't build up to it. LLMs doing "extractive" citation pull the passage that best answers the query — if your best answer is buried in paragraph 4, it may be missed.
Good: "Asana's free plan supports up to 15 users, unlimited tasks and projects, and includes basic workflow views."
Bad: "Let's explore what Asana offers at different pricing tiers..."
2. Specific, verifiable facts over opinions
Prices, user limits, feature names, dates, test results — these are what retrieval systems anchor to. "The best project management tool" is an assertion that requires trust. "Asana's free plan includes 15 user seats and unlimited tasks, with timeline view locked behind paid tiers" is verifiable and specific.
3. FAQ sections with concise answers
Add a FAQ section to every article with 4–8 questions that users commonly ask about the topic. Keep answers to 2–4 sentences each. These are ideal retrieval targets for conversational AI queries.
4. Comparative statements
Explicit comparison language — "X is better than Y for [specific use case] because [specific reason]" — maps well to the query patterns LLMs answer. Vague "X has pros and cons" framing gets passed over for content with clear comparative conclusions.
5. Updated, dated content
Retrieval systems prioritize freshness. Your "updated [month, year]" date in meta, in schema, and visibly on the page signals recency to both traditional search and AI retrieval systems.
llms.txt — what it is and what to put in it
llms.txt is an emerging convention (not yet a formal standard, but adopted by a growing number of frameworks and AI systems) that helps LLMs understand and index a website more effectively. It sits at the site root like robots.txt, but instead of controlling crawler access it provides a structured summary of your site's content and tells AI systems which pages are most relevant to ingest.
The format was proposed in 2024 and is gaining adoption. Perplexity and some AI agents that crawl the web have begun supporting it. Create llms.txt at your root domain: https://yourdomain.com/llms.txt
Structure: an H1 with the site name, a blockquote with a one-sentence summary, then H2 sections containing markdown link lists that point to your key pages, each with a short description.
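A minimal llms.txt following the proposed format might look like this (the site name, section names, and URLs below are placeholders — substitute your own):

```
# Your Site Name

> One-sentence summary of what the site covers and who it's for.

## Guides

- [LLM SEO guide](https://yourdomain.com/llm-seo): How to get cited by AI search systems
- [Schema markup guide](https://yourdomain.com/schema-markup): Structured data setup for content sites

## Optional

- [About](https://yourdomain.com/about): Author background and credentials
```

The "Optional" section is part of the proposed spec: it marks pages an AI system can skip when context is limited.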
Entity signals and knowledge graph basics
Google's Knowledge Graph and LLMs both build "entity associations" — connections between named things (your site, you as an author, the topics you cover). The goal is to have your site's name and topic area associated strongly enough in these models that when someone asks about your topic, your site comes up as a source.
Practical ways to strengthen entity signals:
Consistent brand mentions
Get your site mentioned on other sites — even without a link, brand mentions ("as reported by [Your Site Name]") create entity associations in the knowledge graph.
Social presence
A LinkedIn profile, X/Twitter account, or Reddit presence where you discuss your niche. The consistency of your name + topic across platforms reinforces entity associations.
Author schema
Use Person schema on your About page with sameAs links to your social profiles. This explicitly connects your identity across platforms for knowledge graph parsers.
Wikipedia / Wikidata
Not realistic for most new sites, but a long-term goal. A Wikipedia page or Wikidata entry about your site significantly boosts entity recognition by LLMs trained on Wikipedia-derived data.
Structured data for AI visibility
Beyond the page-level schemas from Module 4, add these site-level schemas to strengthen AI visibility:
WebSite schema with SearchAction
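A sketch of WebSite schema with a SearchAction, using placeholder names and URLs — adjust the search URL template to match how your site's search actually works:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "WebSite",
  "name": "Your Site Name",
  "url": "https://yourdomain.com/",
  "potentialAction": {
    "@type": "SearchAction",
    "target": {
      "@type": "EntryPoint",
      "urlTemplate": "https://yourdomain.com/search?q={search_term_string}"
    },
    "query-input": "required name=search_term_string"
  }
}
</script>
```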
Person schema on About page
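A sketch of Person schema for your About page — name, title, and profile URLs are placeholders. The sameAs array is what links your identity across platforms:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Your Name",
  "url": "https://yourdomain.com/about",
  "jobTitle": "Founder, Your Site Name",
  "sameAs": [
    "https://www.linkedin.com/in/your-handle",
    "https://x.com/your-handle"
  ]
}
</script>
```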
FAQPage schema
Add this to any page with a FAQ section. It's one of the most reliably cited schema types by AI Overviews and retrieval-augmented systems:
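A sketch of FAQPage markup with two questions, reusing the Asana example from earlier in this module — each visible FAQ question on the page gets one Question/Answer pair:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "How many users does Asana's free plan support?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Asana's free plan supports up to 15 users, with unlimited tasks and projects."
      }
    },
    {
      "@type": "Question",
      "name": "Does Asana's free plan include timeline view?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "No. Timeline view is locked behind Asana's paid tiers."
      }
    }
  ]
}
</script>
```

The answer text in the schema should match the visible answer on the page; mismatches can get the markup ignored.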
Google AI Overviews optimization
Google AI Overviews (formerly Search Generative Experience) pulls from a mix of sources to generate a synthesized answer at the top of search results. Not all queries trigger AI Overviews — informational queries are more likely to trigger them than navigational or transactional ones.
What Google tends to pull for AIO sources:
- Pages that already rank in the top 10 for the query — AIO is not a bypass for traditional SEO signals, it layers on top.
- Content with clear structure: headers, bullet points, short definitive statements.
- E-E-A-T signals similar to traditional ranking — author attribution, accurate dates, reliable domain.
- Direct answer to the specific query in the first 100 words of the relevant section.
To optimize for AIO: structure every article so each H2 section starts with a direct, concise answer to the question implied by that heading. Then expand. This makes your content "extractable" for the synthesis layer.
How to measure LLM citation
Unlike traditional SEO where Google Search Console gives you impression and click data, there's no official "LLM citation console" yet. Here's how to measure manually:
| Platform | How to test citation | Frequency |
|---|---|---|
| ChatGPT (with search) | Ask "what are the best [your niche] tools?" and variations. Note if your site is cited. | Monthly |
| Perplexity | Query your target keywords. Check Sources panel for your domain. | Monthly |
| Google AIO | Search your target keywords in incognito. Note AIO presence and sources. | Weekly |
| Bing Copilot | Ask your target queries. Copilot shows numbered citations — check for your domain. | Monthly |
| Claude (with search) | Use Claude.ai with web access enabled. Test commercial queries in your niche. | Monthly |
Track your citations in a simple spreadsheet: date, platform, query, cited/not cited. Over time this tells you whether your LLM SEO efforts are moving the needle.
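If you'd rather script the log than maintain the spreadsheet by hand, a minimal sketch follows — the file name, column set, and helper names are assumptions for illustration, not part of any standard tooling:

```python
import csv
from datetime import date
from pathlib import Path

# Hypothetical log file and columns matching the spreadsheet described above.
LOG_FILE = Path("llm_citations.csv")
FIELDS = ["date", "platform", "query", "cited"]

def log_citation_check(platform: str, query: str, cited: bool,
                       log_file: Path = LOG_FILE) -> None:
    """Append one citation-check result (header is written on first use)."""
    is_new = not log_file.exists()
    with log_file.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "date": date.today().isoformat(),
            "platform": platform,
            "query": query,
            "cited": "yes" if cited else "no",
        })

def citation_rate(log_file: Path = LOG_FILE) -> float:
    """Share of logged checks where your site was cited."""
    with log_file.open(newline="") as f:
        rows = list(csv.DictReader(f))
    return sum(r["cited"] == "yes" for r in rows) / len(rows) if rows else 0.0
```

Comparing `citation_rate()` month over month gives you the trend line the manual spreadsheet is meant to reveal.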
Module 5 action steps
- Audit your existing content for direct-answer-first structure — rewrite introductions that bury the lead
- Add FAQ sections (4–8 questions with concise answers) to your top 5 pages
- Create `llms.txt` in your site root
- Add `FAQPage` schema to all pages with FAQ sections
- Add `Person` schema to your About page with `sameAs` links
- Run your top 5 target keywords through ChatGPT, Perplexity, and Google to establish a citation baseline