Module 6: Programmatic SEO

What programmatic SEO actually is

Programmatic SEO (pSEO) is the practice of generating large numbers of web pages from structured data using templates. Instead of writing each page manually, you define a template and a data source, and the build process generates hundreds or thousands of pages automatically.

Classic examples: Zapier's "how to connect [App A] to [App B]" pages (millions of combinations), NomadList's city profile pages (one template, hundreds of cities), and comparison sites in financial services where every insurance combination gets its own page.

For affiliate sites, pSEO is applicable when there's a repeating content pattern — like a review for every product in a defined list, or a comparison page for every possible pair of tools in a category.

When to go programmatic (and when not to)

Go programmatic when:
  • You have 50+ items that follow the same template
  • The data exists in a structured format (JSON, CSV, API)
  • Each generated page addresses a distinct, real search query
  • You can enrich pages beyond the base template data
  • You've already established topical authority manually
Don't go programmatic when:
  • You have fewer than ~30 items to scale
  • The content is too similar across pages (thin-content risk)
  • You'd be creating pages with no real search demand
  • Your site is brand new with no authority
  • The topic requires nuanced human judgment per-item

A common mistake: building a pSEO project before establishing the site's foundation. A new domain with 500 thin programmatically-generated pages and no manual content looks exactly like a spam site to Google. Build 15–30 solid manual pages first, get indexed and some initial traffic, then layer in programmatic content.

Data sources for programmatic pages

The data source is the bottleneck for most pSEO projects. Your options:

1. APIs (best option)

APIs give you structured, up-to-date data. For affiliate sites, many affiliate programs offer product APIs: Amazon's Product Advertising API, PartnerStack's product data for enrolled programs, and the public pricing endpoints many SaaS companies expose. API data can be fetched at build time (for SSG) or cached and refreshed on a schedule.

2. Your own structured data (JSON/CSV)

Create and maintain a JSON or CSV file containing the structured data for your pages. For a software comparison site, this might be a tools.json file with fields like name, category, pricing tiers, key features, affiliate link, and your rating. You control the data quality and can add rich fields that no API provides.
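A sketch of what one entry might look like, with a small validation helper that catches incomplete entries before they reach a template. The field names are illustrative, not a required schema:

```javascript
// validate-tools.js: sample entry and a required-fields check.
// The product, link, and field names are all hypothetical examples.
const sampleTool = {
  name: 'ExampleTool',                 // hypothetical product
  slug: 'exampletool',
  category: 'project-management',
  pricingTiers: [{ tier: 'Pro', pricePerMonth: 12 }],
  keyFeatures: ['Kanban boards', 'Time tracking'],
  affiliateLink: 'https://example.com/?ref=yoursite',
  rating: 4.2,
  editorial_take: 'Two to three sentences of genuine human opinion go here.'
}

// Fields every entry must have before it can produce a page
const REQUIRED = ['name', 'slug', 'affiliateLink', 'editorial_take']

function missingFields(tool) {
  return REQUIRED.filter(field => !tool[field])
}

module.exports = { sampleTool, missingFields }
```

Running the check across the whole data file before generation is a cheap way to keep data quality under your control.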

3. Third-party data aggregators

Sites like G2, Capterra, and Product Hunt publish public data that can inform (but not be directly scraped into) your content. Use this as a reference layer — you still need to add original content that differentiates your pages.

Next.js SSG approach

Next.js with Static Site Generation is the cleanest programmatic SEO stack for affiliate sites that want to scale. Pages are pre-generated at build time, served as static files, and load instantly — no server required.

The pattern using the App Router:

```tsx
// app/tools/[slug]/page.tsx
import { getAllTools, getToolBySlug } from '@/lib/tools'

// This generates all the static paths at build time
export async function generateStaticParams() {
  const tools = await getAllTools()
  return tools.map(tool => ({ slug: tool.slug }))
}

export default async function ToolPage({ params }) {
  const tool = await getToolBySlug(params.slug)
  return (
    <article>
      <h1>{tool.name} Review {new Date().getFullYear()}</h1>
      <p>{tool.description}</p>
      {/* Template content continues */}
    </article>
  )
}
```

The lib/tools.ts file reads from your JSON data source or calls an API. At build time, Next.js calls generateStaticParams() once to collect every slug, generates a static HTML file for each one, and deploys the result to Vercel's edge network. Each page loads in under 100ms and is fully crawlable.

💡
Dynamic OG images

Next.js has a built-in opengraph-image.tsx convention that lets you generate a custom OG image for each programmatic page using the page's data. This makes your shared links look far more professional and is one of the details that differentiates a polished pSEO site from a generic one.

Static HTML template approach

If you're building a static HTML site (like this one), you can still do programmatic SEO without a JavaScript framework. Use a build script to generate HTML files from templates and data.

A simple approach with Node.js:

```js
// generate-pages.js
const fs = require('fs')
const tools = require('./data/tools.json')
const template = fs.readFileSync('./templates/tool-review.html', 'utf8')

tools.forEach(tool => {
  const html = template
    .replace(/{{TOOL_NAME}}/g, tool.name)
    .replace(/{{TOOL_SLUG}}/g, tool.slug)
    .replace(/{{TOOL_DESCRIPTION}}/g, tool.description)
    .replace(/{{TOOL_PRICE}}/g, tool.price)
    .replace(/{{AFFILIATE_LINK}}/g, tool.affiliateLink)

  // Create the directory and write the file
  fs.mkdirSync(`./${tool.slug}`, { recursive: true })
  fs.writeFileSync(`./${tool.slug}/index.html`, html)
  console.log(`Generated: ./${tool.slug}/index.html`)
})

console.log(`Done: ${tools.length} pages generated.`)
```

Run node generate-pages.js when you update your data file, commit the generated HTML files, and deploy. This works perfectly with Cloudflare Pages and requires zero server infrastructure.

The thin-content risk and how to avoid it

Most pSEO sites that get penalized share one characteristic: all pages are nearly identical except for the variable data being inserted. If you search "Asana review" and "ClickUp review" and get pages that are structurally identical except the tool name was swapped in, Google's systems will classify this as thin, low-quality content.

How to avoid it:

Add a unique "editorial take" field to every item in your data

Your JSON for each tool should include an editorial_take field: a genuine 2–3 sentence opinion written by a human. This is the one thing that makes each page unique beyond the structured data.

Vary section depth based on available data

If you have 5 data points for Tool A and 12 for Tool B, don't pad Tool A with filler content to match. Let the page length naturally reflect available information quality.
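In a static-template setup, this can be as simple as rendering a section only when the data behind it exists, so sparse tools naturally get shorter pages instead of padded ones. A small sketch (the section names and markup are made up):

```javascript
// optional-sections.js: emit a template section only when data exists.
// Hypothetical helper; adapt the markup to your own templates.
function renderSection(title, items) {
  // Skip the section entirely rather than padding it with filler
  if (!items || items.length === 0) return ''
  const list = items.map(item => `  <li>${item}</li>`).join('\n')
  return `<section>\n<h2>${title}</h2>\n<ul>\n${list}\n</ul>\n</section>\n`
}

module.exports = { renderSection }
```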

Include real pricing data you've verified, not placeholder text

Pages that show current, accurate pricing (with a "pricing verified [date]" note) are more valuable and differentiated than pages with generic pricing tables.

Set a quality floor — don't publish pages below it

Some tools in your database won't have enough data to justify a page. It's better to skip them and maintain a quality floor than to publish 500 thin pages to get 500 pages indexed.
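In a generation script, a quality floor is just one filter before the page loop. The specific thresholds below are arbitrary example values, not recommendations:

```javascript
// quality-floor.js: skip tools that lack enough data for a real page.
// Thresholds are illustrative examples only.
function meetsQualityFloor(tool) {
  return Boolean(
    tool.editorial_take &&              // a human actually wrote an opinion
    tool.description &&
    tool.description.length >= 200 &&   // enough substance to template from
    Array.isArray(tool.keyFeatures) &&
    tool.keyFeatures.length >= 3
  )
}

function publishableTools(tools) {
  return tools.filter(meetsQualityFloor)
}

module.exports = { meetsQualityFloor, publishableTools }
```

Logging which tools were skipped (and why) gives you a to-do list of data to enrich later instead of a pile of thin pages now.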

🚫
Don't generate pages for keywords with no search demand

Confirm real search volume exists for each page you're generating. A pSEO project that generates 300 pages for queries nobody searches is just indexing bloat.

Real example: what this site uses

This site is a static HTML site deployed on Cloudflare Pages, and it uses the programmatic-style generation approach described above.

The lesson: you don't need to fully automate from day one. Start with a consistent manual template, nail the quality, then automate the generation of that template when volume demands it.

Module 6 action steps

  1. Identify any content type on your site that has 20+ instances following the same pattern
  2. Build a data.json file with structured fields for each instance
  3. Write a template HTML (or JSX) file with variable placeholders
  4. Add an editorial take field to each item — written by you, not AI
  5. Write and run the generation script; review 10 generated pages for quality before deploying