In this module
LLM SEO vs traditional SEO: what's different
Traditional SEO targets a search engine's document-retrieval system. Google crawls your page, indexes it, and when users search, it returns a ranked list of URLs. Your goal is to rank in that list and get clicks.
LLM SEO (also called Generative Engine Optimization, or GEO) targets a fundamentally different system. Language models don't return a list of URLs — they synthesize an answer from training data and, where available, real-time retrieval (Perplexity, Bing Copilot, ChatGPT with search enabled). The user may never visit your site at all. But they might be told by ChatGPT "according to [your site], the best option for your use case is X" — and that recommendation shapes their decision.
Key differences:
- Citation vs ranking: In traditional SEO you're competing for a rank position. In LLM SEO you're trying to be cited as a source — included in the model's synthesized answer.
- Training data vs live retrieval: Some LLMs (ChatGPT without search, Claude without retrieval) draw on training data with a knowledge cutoff. Others (Perplexity, Bing Copilot, ChatGPT with search) pull live web results. Different optimization approaches apply to each.
- Verifiable claims over opinions: LLMs prefer to cite factual, specific, verifiable claims. Your opinion about which tool is best carries less weight; a comparison table with actual pricing data carries more.
How LLMs select sources for citation
There's no public "how we cite sources" documentation from OpenAI or Anthropic, but from testing and academic research on RAG (retrieval-augmented generation) systems, several patterns emerge:
For training-data-based responses:
- Sites that were in the training corpus and received consistent signals of quality (links, citations from other sources, social shares) are weighted more heavily.
- Content that explicitly states clear, factual claims tends to be incorporated into the model's "knowledge"; vague opinion content that doesn't anchor to anything specific tends not to be.
- Sites that appear consistently when a topic is discussed across many web pages get stronger entity associations inside the model.
For live retrieval systems (Perplexity, ChatGPT with search):
- Content that directly and concisely answers the query tends to be preferred — the system is looking for the best "snippet" to cite.
- Pages that load fast, have clean HTML structure, and aren't behind paywalls or cookie walls are easier to retrieve and parse.
- Structured data (FAQ schema, HowTo schema) can help map your content to query patterns these systems are answering.
- Perplexity in particular tends to favor pages with clear attribution (author name + credentials, updated dates), similar to Google's E-E-A-T signals.
LLM search behavior is changing faster than any other part of SEO. The tactics in this module reflect what's working as of early 2026. Check back here — this module gets updated more frequently than any other in the course.
Writing patterns that LLMs pull from
These content patterns show up consistently in LLM-cited content:
1. Direct answer first
Lead every section with a direct, complete answer to the implied question. Don't build up to it. LLMs doing "extractive" citation pull the passage that best answers the query — if your best answer is buried in paragraph 4, it may be missed.
Good: "Asana's free plan supports up to 15 users, unlimited tasks and projects, and includes basic workflow views."
Bad: "Let's explore what Asana offers at different pricing tiers..."
2. Specific, verifiable facts over opinions
Prices, user limits, feature names, dates, test results — these are what retrieval systems anchor to. "The best project management tool" is an assertion that requires trust. "Asana's free plan includes 15 user seats and unlimited tasks, with timeline view locked behind paid tiers" is verifiable and specific.
3. FAQ sections with concise answers
Add a FAQ section to every article with 4–8 questions that users commonly ask about the topic. Keep answers to 2–4 sentences each. These are ideal retrieval targets for conversational AI queries.
4. Comparative statements
Explicit comparison language — "X is better than Y for [specific use case] because [specific reason]" — maps well to the query patterns LLMs answer. Vague "X has pros and cons" framing gets passed over for content with clear comparative conclusions.
5. Updated, dated content
Retrieval systems prioritize freshness. Your "updated [month, year]" date in meta, in schema, and visibly on the page signals recency to both traditional search and AI retrieval systems.
llms.txt — what it is and what to put in it
llms.txt is an emerging convention (not yet a formal standard, but adopted by a growing number of frameworks and AI systems) that helps LLMs understand and index a website more effectively. It sits at the site root like robots.txt, but instead of controlling crawler access it provides a structured summary of your site's content and tells AI systems which pages are most relevant to ingest.
The format was proposed in 2024 and is gaining adoption. Perplexity and some AI agents that crawl the web have begun supporting it. Create llms.txt at your root domain: https://yourdomain.com/llms.txt
Structure: an H1 with the site name, a blockquote with a one-sentence summary, then H2 sections containing markdown link lists that point to your key pages, each with a short description.
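A minimal llms.txt following the proposed format might look like this (the site name, section names, and URLs below are placeholders — substitute your own):

```
# Your Site Name

> One-sentence summary of what the site covers and who it's for.

## Guides

- [LLM SEO guide](https://yourdomain.com/llm-seo): How to get cited by AI search systems
- [Schema markup guide](https://yourdomain.com/schema-markup): Structured data setup for content sites

## Optional

- [About](https://yourdomain.com/about): Author background and credentials
```

The "Optional" section is part of the proposed spec: it marks pages an AI system can skip when context is limited.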
Entity signals and knowledge graph basics
Google's Knowledge Graph and LLMs both build "entity associations" — connections between named things (your site, you as an author, the topics you cover). The goal is to have your site's name and topic area associated strongly enough in these models that when someone asks about your topic, your site comes up as a source.
Practical ways to strengthen entity signals:
Consistent brand mentions
Get your site mentioned on other sites — even without a link, brand mentions ("as reported by [Your Site Name]") create entity associations in the knowledge graph.
Social presence
A LinkedIn profile, X/Twitter account, or Reddit presence where you discuss your niche. The consistency of your name + topic across platforms reinforces entity associations.
Author schema
Use Person schema on your About page with sameAs links to your social profiles. This explicitly connects your identity across platforms for knowledge graph parsers.
Wikipedia / Wikidata
Not realistic for most new sites, but a long-term goal. A Wikipedia page or Wikidata entry about your site significantly boosts entity recognition by LLMs trained on Wikipedia-derived data.
Structured data for AI visibility
Beyond the page-level schemas from Module 4, add these site-level schemas to strengthen AI visibility:
WebSite schema with SearchAction
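A sketch of WebSite schema with a SearchAction, using placeholder names and URLs — adjust the search URL template to match how your site's search actually works:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "WebSite",
  "name": "Your Site Name",
  "url": "https://yourdomain.com/",
  "potentialAction": {
    "@type": "SearchAction",
    "target": {
      "@type": "EntryPoint",
      "urlTemplate": "https://yourdomain.com/search?q={search_term_string}"
    },
    "query-input": "required name=search_term_string"
  }
}
</script>
```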
Person schema on About page
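A sketch of Person schema for your About page — name, title, and profile URLs are placeholders. The sameAs array is what links your identity across platforms:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Your Name",
  "url": "https://yourdomain.com/about",
  "jobTitle": "Founder, Your Site Name",
  "sameAs": [
    "https://www.linkedin.com/in/your-handle",
    "https://x.com/your-handle"
  ]
}
</script>
```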
FAQPage schema
Add this to any page with a FAQ section. It's one of the most reliably cited schema types by AI Overviews and retrieval-augmented systems:
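A sketch of FAQPage markup with two questions, reusing the Asana example from earlier in this module — each visible FAQ question on the page gets one Question/Answer pair:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "How many users does Asana's free plan support?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Asana's free plan supports up to 15 users, with unlimited tasks and projects."
      }
    },
    {
      "@type": "Question",
      "name": "Does Asana's free plan include timeline view?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "No. Timeline view is locked behind Asana's paid tiers."
      }
    }
  ]
}
</script>
```

The answer text in the schema should match the visible answer on the page; mismatches can get the markup ignored.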
Google AI Overviews optimization
Google AI Overviews (formerly Search Generative Experience) pulls from a mix of sources to generate a synthesized answer at the top of search results. Not all queries trigger AI Overviews — informational queries are more likely to trigger them than navigational or transactional ones.
What Google tends to pull for AIO sources:
- Pages that already rank in the top 10 for the query — AIO is not a bypass for traditional SEO signals, it layers on top.
- Content with clear structure: headers, bullet points, short definitive statements.
- E-E-A-T signals similar to traditional ranking — author attribution, accurate dates, reliable domain.
- Direct answer to the specific query in the first 100 words of the relevant section.
To optimize for AIO: structure every article so each H2 section starts with a direct, concise answer to the question implied by that heading. Then expand. This makes your content "extractable" for the synthesis layer.
How to measure LLM citation
Unlike traditional SEO where Google Search Console gives you impression and click data, there's no official "LLM citation console" yet. Here's how to measure manually:
| Platform | How to test citation | Frequency |
|---|---|---|
| ChatGPT (with search) | Ask "what are the best [your niche] tools?" and variations. Note if your site is cited. | Monthly |
| Perplexity | Query your target keywords. Check Sources panel for your domain. | Monthly |
| Google AIO | Search your target keywords in incognito. Note AIO presence and sources. | Weekly |
| Bing Copilot | Ask your target queries. Copilot shows numbered citations — check for your domain. | Monthly |
| Claude (with search) | Use Claude.ai with web access enabled. Test commercial queries in your niche. | Monthly |
Track your citations in a simple spreadsheet: date, platform, query, cited/not cited. Over time this tells you whether your LLM SEO efforts are moving the needle.
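If you'd rather script the log than maintain the spreadsheet by hand, a minimal sketch follows — the file name, column set, and helper names are assumptions for illustration, not part of any standard tooling:

```python
import csv
from datetime import date
from pathlib import Path

# Hypothetical log file and columns matching the spreadsheet described above.
LOG_FILE = Path("llm_citations.csv")
FIELDS = ["date", "platform", "query", "cited"]

def log_citation_check(platform: str, query: str, cited: bool,
                       log_file: Path = LOG_FILE) -> None:
    """Append one citation-check result (header is written on first use)."""
    is_new = not log_file.exists()
    with log_file.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "date": date.today().isoformat(),
            "platform": platform,
            "query": query,
            "cited": "yes" if cited else "no",
        })

def citation_rate(log_file: Path = LOG_FILE) -> float:
    """Share of logged checks where your site was cited."""
    with log_file.open(newline="") as f:
        rows = list(csv.DictReader(f))
    return sum(r["cited"] == "yes" for r in rows) / len(rows) if rows else 0.0
```

Comparing `citation_rate()` month over month gives you the trend line the manual spreadsheet is meant to reveal.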
Module 5 action steps
- Audit your existing content for direct-answer-first structure — rewrite introductions that bury the lead
- Add FAQ sections (4–8 questions with concise answers) to your top 5 pages
- Create `llms.txt` in your site root
- Add `FAQPage` schema to all pages with FAQ sections
- Add `Person` schema to your About page with `sameAs` links
- Run your top 5 target keywords through ChatGPT, Perplexity, and Google to establish a citation baseline