
Jun 9, 2026

Programmatic SEO Risks: Where Google Draws the Line

pSEO gurus sell scale. SEO purists call it spam. The truth lives in three tests Google already wrote down, and most pages fail all three.


TL;DR

  • Programmatic SEO risks are mostly non-indexation for low-authority sites, not manual penalties.

  • Google's 2015 doorway guidance and 2024 scaled-content policy both target the same thing: thin variants of one page.

  • The swap test is brutal: if your page reads true with the niche swapped, it's duplication wearing a costume.

  • Ahrefs' working rule for small sites: target sub-KD-20 queries where top-ranking sites sit under DR 30.

  • pSEO works when there's a real data asset underneath. Without one, you're shipping a template.

The short answer

Programmatic SEO isn't penalized for being programmatic. It's ignored — or de-indexed — when the pages are thin variants of one page wearing different hats. That's the operator-grade version of what Google's been saying since at least 2015.

The doorway-page guidance Google published in March 2015 still governs this. The December 2024 update to its scaled-content abuse policy didn't replace that line; it widened it to cover any method of mass-producing pages with little added value, AI or otherwise. If your pages exist primarily to funnel users from a query to one destination, with substantially similar content across the set, they're doorway spam by Google's own self-test. The mechanism is the same whether you wrote them by hand, with GPT, or with a CSV and a Liquid template.

So the real programmatic SEO risks for a low-authority site (think DR under 20, no link velocity, fresh domain) aren't a manual action email. They're quieter: pages crawled once, marked Discovered – currently not indexed, and never seen again. That's the realistic failure mode. The penalty story sells courses. The non-indexation story is what actually happens.

1. What programmatic SEO is — and the line Google drew in 2015 and 2024

Programmatic SEO means generating many pages from a single template plus a structured dataset. [City] plumber, [product] vs [competitor], [language] to [language] translator. Done well, the template is the delivery vehicle for a real data asset. Done badly, the template is the entire product.

Google drew the line twice, in plain language.

The first time was the doorway pages update in March 2015. The self-test there is the one most pSEO builders quietly fail: are these pages created primarily to rank for similar queries, funneling users toward one destination, with substantially similar content across the set? If yes, they're doorways. That guidance is still in force in June 2026.

The second was the December 2024 expansion of the scaled content abuse policy, which extended the same logic to any mass-production method, including AI generation, template stuffing, and stitched-together feeds. We've written about the AI-specific angle in "Does Google penalize AI-generated content"; the short version is that the mechanism Google cares about is value per page, not author species.

Google's John Mueller put it less diplomatically when Ahrefs profiled the topic: "Programmatic SEO is often a fancy banner for spam." The word "often" is doing real work in that sentence. It's not a blanket condemnation. It's a base rate.

2. The doorway self-test, applied to content pages

Most pSEO advice treats doorway guidance as a local-SEO problem (think: 400 city pages for a single plumber in one metro). Apply the same self-test to content pages and a lot of pSEO blog clusters fail it.

Ask three questions about your page set:

  1. Is the purpose of these pages to rank for variant queries, rather than to answer them with materially different substance?

  2. Does the user, after landing, get funneled to one destination (a single product page, a single signup, a single calculator) regardless of which variant brought them in?

  3. Across the set, is the body content substantially similar — intro template, mid-section template, CTA template — with only entity tokens swapped?

Three yeses is a doorway by Google's definition, full stop. Two yeses is the gray zone where indexation becomes a coin flip. One yes is usually fine if the unique substance is real.
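The scoring rule above is simple enough to write down. Here is a minimal Python sketch of it; the class and function names are ours, not Google's, and the zero-yes branch is our own addition, since the rule of thumb above only covers one to three yeses.

```python
from dataclasses import dataclass


@dataclass
class PageSetAnswers:
    # Each field is the yes/no answer to one of the three questions above.
    built_to_rank_for_variants: bool
    funnels_to_one_destination: bool
    body_substantially_similar: bool


def doorway_verdict(answers: PageSetAnswers) -> str:
    """Map the number of 'yes' answers to the verdicts described in the text."""
    yes_count = sum([
        answers.built_to_rank_for_variants,
        answers.funnels_to_one_destination,
        answers.body_substantially_similar,
    ])
    if yes_count == 3:
        return "doorway"
    if yes_count == 2:
        return "gray zone"
    if yes_count == 1:
        return "usually fine if the unique substance is real"
    # Not covered by the rule of thumb above; added for completeness.
    return "no doorway signals"


print(doorway_verdict(PageSetAnswers(True, True, False)))  # gray zone
```

The answers still have to come from an honest human read of the page set. The code only keeps the verdict consistent across audits.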

The interesting wrinkle: Sterling Sky's service-area page experiment found that pages 84% similar to each other still ranked in many markets. It's a single-vendor data point, and the authors noted explicitly it was "not always effective in competitive markets." The takeaway isn't "84% similarity is safe." It's that low-competition queries forgive a lot, and high-competition queries forgive almost nothing. Your KD floor matters more than your similarity ceiling.

3. The swap test: if it reads true with the niche swapped, it's duplication

Here's the test we run on every pSEO template before a client ships one page:

Take a draft page. Find-and-replace the entity (the city, the product, the language) with a completely unrelated entity. If the page still reads as factually true and useful, you've written a template, not a page. You're shipping duplication in a costume.

A [San Francisco] plumber page that reads identically as a [Boise] plumber page is two URLs of the same content. A Stripe vs Adyen comparison that reads identically as Stripe vs Braintree is the same essay twice. The swap test isn't a Google algorithm — it's a forcing function. If your page fails it, Google's deduplication will eventually catch what your editorial process missed.
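The mechanical half of the swap test can be roughed out in a few lines of Python. This is a sketch under assumptions, not a Google signal: it swaps one page's entity for another's and measures word-level overlap with Python's standard `difflib`. The function names and sample copy are hypothetical, and whether a swapped page still reads as factually true remains an editorial call.

```python
import re
from difflib import SequenceMatcher


def swap_entity(text: str, entity: str, replacement: str) -> str:
    """Case-insensitive find-and-replace of one entity name with another."""
    pattern = re.compile(re.escape(entity), flags=re.IGNORECASE)
    # A function replacement avoids backslash-escape surprises in `replacement`.
    return pattern.sub(lambda _match: replacement, text)


def swap_similarity(page_a: str, entity_a: str, page_b: str, entity_b: str) -> float:
    """Word-level similarity (0.0 to 1.0) between page A, entity swapped, and page B."""
    swapped_words = swap_entity(page_a, entity_a, entity_b).split()
    return SequenceMatcher(None, swapped_words, page_b.split()).ratio()


# Hypothetical template output for two cities.
sf_page = "Need a plumber in San Francisco? Our San Francisco team fixes leaks fast."
boise_page = "Need a plumber in Boise? Our Boise team fixes leaks fast."
print(swap_similarity(sf_page, "San Francisco", boise_page, "Boise"))  # 1.0
```

A score at or near 1.0 means the two URLs carry the same page. Where to set the alarm below that is a judgment call we won't pretend to derive from Google.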

The fix is unglamorous: every page needs 10–20% genuinely unique substance that cannot exist on any other page in the set. Local pricing data. A specific local case study. A comparison table with values pulled from a real dataset, not invented for the template. Numbers that change row by row because the underlying truth changes row by row.
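One rough way to put a number on that 10–20% is to count how much of a page's text appears on no other page in the set. The Python sketch below does this with five-word shingles after masking each page's own entity. The shingle size, function names, and sample copy are our assumptions, and string uniqueness is only a proxy for unique substance.

```python
import re


def shingles(text: str, entity: str, size: int = 5) -> set[tuple[str, ...]]:
    """Word n-grams of a page, with its own entity masked so token swaps don't count."""
    masked = re.sub(re.escape(entity), "ENTITY", text, flags=re.IGNORECASE)
    words = re.findall(r"[\w']+", masked.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def unique_share(pages: dict[str, str], size: int = 5) -> dict[str, float]:
    """For each entity's page, the fraction of shingles found on no other page in the set."""
    page_shingles = {entity: shingles(text, entity, size) for entity, text in pages.items()}
    shares = {}
    for entity, own in page_shingles.items():
        others = set().union(*(s for e, s in page_shingles.items() if e != entity))
        shares[entity] = len(own - others) / len(own) if own else 0.0
    return shares


# Hypothetical page set: two template clones and one page with its own substance.
pages = {
    "Boise": "Our Boise plumbers fix leaks and clogged drains every day",
    "Austin": "Our Austin plumbers fix leaks and clogged drains every day",
    "Reno": "Median call-out fee in Reno was higher last winter than the state average",
}
print(unique_share(pages))
```

The score only flags pages. Raising it honestly takes numbers drawn from a real dataset, not a thesaurus pass over the template.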

Which brings us to the part most pSEO playbooks skip.

4. When pSEO works: the data-asset requirement

The pSEO sites that survive long-term have one thing in common. There's a real data asset underneath, and the pages are the interface to it. Zapier's integration pages work because each integration is a real, working integration with its own auth flow, triggers, and actions. Wise's currency pages work because the exchange rates are live and the corridors have different fees. G2's category pages work because the review corpus is genuinely different per category.

No data asset means no defensible page. The template is exposed.


[Figure: decision flowchart for whether to build a programmatic SEO page set]

The pre-build decision flow. One no anywhere in the chain means don't ship.

If you can't answer "what's the dataset, and why is each row materially different from every other row?" in one sentence, you don't have a pSEO play. You have a content-farming risk. Our content-marketing service starts every pSEO engagement with a dataset audit for exactly this reason: there's no point templating around a hollow center.

5. Realistic targets for low-authority sites: KD and competitor-DR thresholds

This is where the realistic-expectations conversation gets specific.

Ahrefs' working recommendation for programmatic plays is to target queries with KD under 20 where the top-ranking sites have DR under 30. That's not a Google rule. It's an operational rule of thumb based on what actually ranks for sites without strong link equity, and it's the threshold the Ahrefs pSEO guide anchors on.

Why those numbers? Because below KD 20, the SERP is usually populated by pages that haven't been hardened by years of link velocity. A new page with genuinely unique substance can compete on relevance alone. Above KD 30, you're competing against pages with backlink profiles your fresh template cannot match in a 90-day window, regardless of how clever the template is.
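The threshold is easy to apply as a filter over a keyword export. The Python sketch below assumes you have a KD score and the DR of each top-ranking site for every query. The field names and sample rows are hypothetical, and requiring every top result to sit under the DR cap is our strict reading of the rule.

```python
from dataclasses import dataclass


@dataclass
class QueryCandidate:
    query: str
    kd: int                    # keyword difficulty from your SEO tool
    top_result_drs: list[int]  # domain ratings of the top-ranking sites


def is_realistic_target(c: QueryCandidate, max_kd: int = 20, max_dr: int = 30) -> bool:
    """True when KD is under max_kd and every top-ranking site sits under max_dr."""
    # An empty DR list means the SERP was never checked, so it cannot pass.
    return (
        c.kd < max_kd
        and bool(c.top_result_drs)
        and all(dr < max_dr for dr in c.top_result_drs)
    )


# Hypothetical rows, not real keyword data.
candidates = [
    QueryCandidate("emergency plumber small town", 12, [18, 25, 9]),
    QueryCandidate("best payment processor", 35, [70, 62, 55]),
    QueryCandidate("niche converter tool", 15, [22, 58, 14]),
]
targets = [c.query for c in candidates if is_realistic_target(c)]
print(targets)  # ['emergency plumber small town']
```

Loosening `all` to a majority rule is a defensible variant. The point is to make the cutoff explicit before the template exists, not after the pages fail to index.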

In practice, on client work, we see something like this distribution when sites ignore those thresholds: a meaningful chunk of pages get indexed and rank in the top 20 within 8–12 weeks (the sub-KD-20 wins), another chunk get indexed but never rank above page 5 (the mid-difficulty pages, slowly bleeding crawl budget), and a third chunk never get indexed at all (the Discovered – currently not indexed graveyard). The exact proportions depend on the domain, but the shape is consistent.

The pages that get indexed but never rank are the ones to worry about. They're not penalized. They're worse than penalized: they're consuming crawl budget that could be flowing to your money pages. If you want those money pages to show up in AI Overviews too, the prerequisite is the same: indexable, unique, cited. We covered the citation side in "How to get cited in AI Overviews."

6. A pre-launch checklist before you generate page two

This is the checklist we run before any pSEO build leaves staging. If you can't answer yes to every line, you don't have a pSEO play yet. You have a template and a hope.

  • Real data asset. One sentence: what's the dataset, why is each row materially different.

  • 10–20% genuinely unique substance per page. Not template variation. Substance.

  • Passes the swap test. Replace the entity. Does the page still read as factually true? If yes, rewrite.

  • KD under 20 on target queries. Pulled from your tool of choice — Ahrefs, Semrush, whichever you trust. See each tool's published pricing page.

  • Competitor DR under 30 on the SERP. Check the top 5 results. If two or more sit above DR 50, pick a different query set.

  • Linked in nav, not orphaned. Every page must be reachable from a hub. Orphan pages are the first to be deprioritized.

  • No funneling to a single destination. Each page resolves the user's query on the page itself, with the CTA as an option, not the purpose.

  • Indexation monitored weekly for the first 90 days. Use Search Console's index coverage report. Pull the non-indexed list. Improve or noindex; don't leave them in limbo.
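The weekly indexation check in the last item is easier to act on as a diff between snapshots. The Python sketch below assumes you record one URL-to-status mapping per week by hand or from an export. The data shape, the "Indexed" label, and the four-week cutoff are our assumptions, and nothing here calls a Search Console API.

```python
NOT_INDEXED = "Discovered – currently not indexed"


def stuck_urls(weekly_snapshots: list[dict[str, str]], min_weeks: int = 4) -> list[str]:
    """URLs whose status was NOT_INDEXED in each of the last min_weeks snapshots."""
    if len(weekly_snapshots) < min_weeks:
        return []  # Not enough history yet to call anything stuck.
    recent = weekly_snapshots[-min_weeks:]
    return sorted(
        url for url in recent[0]
        if all(snapshot.get(url) == NOT_INDEXED for snapshot in recent)
    )


# Hypothetical two-week history for three pages.
weeks = [
    {"/boise": NOT_INDEXED, "/austin": "Indexed", "/reno": NOT_INDEXED},
    {"/boise": NOT_INDEXED, "/austin": "Indexed", "/reno": "Indexed"},
]
print(stuck_urls(weeks, min_weeks=2))  # ['/boise']
```

Whatever the function returns is the improve-or-noindex list for that week.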

That's the build. The rest is editorial discipline and patience.

FAQ

Is programmatic SEO spam?

Not inherently. Google's John Mueller called it "often a fancy banner for spam," and "often" is the operative word. Programmatic SEO becomes spam when pages are thin variants of one template with no unique substance, generated primarily to capture variant queries. With a real data asset and 10–20% unique substance per page, it's just SEO at scale.

Will Google penalize my programmatic pages?

For low-authority sites, manual penalties are rare. The realistic outcome is non-indexation: pages crawled once, marked Discovered – currently not indexed, and never ranked. That's quieter than a penalty but functionally identical to one. Larger sites with established authority face higher manual-action risk under the December 2024 scaled-content abuse policy.

What's the doorway pages penalty in practice?

Google's March 2015 doorway guidance targets pages created primarily to funnel users toward one destination with substantially similar content. The practical consequence is deindexation of the offending set, sometimes accompanied by a manual action for egregious cases. The 2015 framing is still active in June 2026 and underpins the 2024 scaled-content policy.

What KD and DR thresholds should small sites target?

Ahrefs recommends programmatic plays target queries with KD under 20 where top-ranking sites have DR under 30. Below those thresholds, ranking is driven more by relevance than link equity, which is where a new template can compete. Above them, you're fighting backlink profiles a fresh page set cannot realistically match within a 90-day window.

How is the 2024 scaled content policy different from the 2015 doorway rule?

The 2015 doorway guidance focuses on intent and funneling. The December 2024 scaled-content abuse policy widens the lens to any method of mass-producing low-value pages — AI generation, template stuffing, feed stitching — regardless of intent. They overlap heavily. Most pages that fail the doorway self-test also fail the scaled-content policy, and vice versa.

Ship it Monday. Measure it Friday. Decide what to keep on Sunday.

The one action this week: run the swap test on five of your existing pages. If three still read as true with the entity swapped, rewrite them or noindex them before you generate page six. If you want a second pair of eyes on a pSEO build before it leaves staging, get in touch.

© All rights reserved
