8 min read · 1,800 words

The siteRadius Penalty: Why Topic Drift Kills AI Visibility

siteRadius measures how far your pages drift from your site's semantic center. High radius = topical scatter penalty. Here's how to diagnose and fix it.

Google Algorithm · siteRadius · Topical Authority · AI Visibility · Content Strategy

The Hidden Cost of "One More Article" {#hidden-cost}

The marketing team sees a trending topic. It is not directly related to your product, but it is getting attention. The decision seems obvious: publish a piece, capture some traffic, demonstrate thought leadership.

The article goes live. It captures some visits. The team moves on.

What they do not see: that single piece just increased your siteRadius. Your site's semantic identity became slightly more scattered. Your authority signal weakened marginally.

One article does not matter much. But this decision happens weekly across years of content marketing. Each piece slightly off your core topic pushes your semantic center further from coherence. The cumulative effect is measurable.

This is the hidden cost of "one more article." Not the time to write it. Not the resources to promote it. The cost is semantic drift that compounds into algorithmic penalty.


What siteRadius Actually Measures {#what-it-measures}

siteRadius measures the distance between individual pages and your site's semantic center.

The mechanism works through vector mathematics. Google's system (site2vec) creates an embedding that represents your entire site's semantic identity. This embedding is a centroid, an average vector calculated from all your content. It represents what your site is fundamentally "about."

Each individual page also has its own embedding vector. When Google evaluates your site, it calculates the distance between each page vector and your site's centroid.

Pages close to the centroid strengthen your site's coherent identity. Pages far from the centroid weaken it.

siteRadius is the average of these distances across all pages. High radius means your pages scatter widely around your center. Low radius means your pages cluster tightly around a coherent identity.
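The centroid-and-distance calculation described above can be sketched in a few lines. This is a minimal illustration, not Google's implementation: the function names, the toy two-dimensional vectors, and the choice of cosine distance as the metric are all assumptions made for the example (the source does not say which distance measure is used).

```python
import math


def centroid(vectors):
    """Component-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_distance(a, b):
    """1 minus cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def site_radius(page_vectors):
    """Mean distance from each page vector to the site centroid."""
    center = centroid(page_vectors)
    distances = [cosine_distance(v, center) for v in page_vectors]
    return sum(distances) / len(distances)


# Toy 2-D "embeddings" (hypothetical values, for illustration only).
focused = [[1.0, 0.1], [0.9, 0.2], [1.0, 0.0]]
scattered = focused + [[0.0, 1.0]]  # one page pointing in an unrelated direction

print("focused radius:  ", round(site_radius(focused), 4))
print("scattered radius:", round(site_radius(scattered), 4))
```

Adding the single unrelated vector both shifts the centroid and raises the mean distance, which is the scatter effect the section describes.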

The mathematical reality is stark: every page you publish affects your site's average radius. Pages within your topical territory pull the average down. Pages outside your territory push it up. There is no neutral option.
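A short worked equation makes the direction of the effect concrete. It rests on a simplifying assumption: the centroid is treated as fixed when one page is added, although in practice the new page also shifts the centroid slightly. The symbols are introduced here for illustration: n is the current page count, r̄ the current average radius, and d the new page's distance from the centroid.

```latex
\bar{r}_{\text{new}} = \frac{n\,\bar{r} + d}{n + 1},
\qquad
\bar{r}_{\text{new}} - \bar{r} = \frac{d - \bar{r}}{n + 1}
```

Under that assumption, the average falls when d is below r̄ and rises when d is above it, and the size of the shift shrinks as the site grows.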


How Topic Drift Accumulates {#topic-drift}

One off-topic page has minimal impact on siteRadius. The problem is patterns.

Consider how topic drift typically accumulates:

Company news that strays from expertise. Your CEO spoke at a conference. You write about the conference topic, not your area of expertise. The speech was about industry trends tangential to your product. That page now sits far from your semantic center.

Trendjacking content outside your domain. A major event happens. Everyone is writing about it. Your team publishes a "what [event] means for [industry]" piece without genuine expertise to offer. Traffic spikes briefly. Radius increases permanently.

Guest posts on tangential subjects. You accept a guest post because the author has a following. The topic relates loosely to your space but is not your expertise. The page carries your domain's authority signal while diluting your semantic focus.

Legacy content from previous directions. Your company pivoted three years ago. The old blog posts remain, targeting keywords that no longer represent your business. Each piece pulls your centroid away from your current identity.

This accumulation creates what might be called a "content marketing graveyard." Pages that made sense at publication time now sit as semantic debt, increasing radius without providing proportional value.

The pattern compounds. As radius increases, your authority signal weakens. As signal weakens, rankings for core topics suffer. As rankings suffer, teams publish more content trying to recover, often accelerating the drift.


The Relationship to siteFocusScore {#sitefocusscore-relationship}

siteFocusScore measures your site's overall topical coherence. siteRadius measures individual page deviation. They are complementary signals evaluating the same underlying quality from different angles.

Think of it as forest and trees.

siteFocusScore evaluates the forest: does this site have a clear, focused identity? Does it cover a coherent semantic territory?

siteRadius evaluates the trees: how far does each individual page drift from the site's established center?

A site can have a clear identity (good siteFocusScore) but poor radius if many individual pages drift from that center. The identity exists, but individual pages do not consistently reinforce it.

Conversely, a site could have low radius (pages cluster tightly) but poor focus (the cluster itself covers scattered topics). Both signals matter.

Google uses both in its evaluation of site-level quality. Optimizing for one while ignoring the other produces incomplete results. Strong sites perform well on both measures: clear identity AND consistent page-level reinforcement of that identity.


Why AI Systems Penalize Topic Drift {#ai-penalty}

The same principles that drive siteRadius penalties in Google rankings apply to AI visibility.

RAG systems evaluate source coherence when deciding which content to retrieve and cite. A scattered site sends confused authority signals. When an AI system encounters your domain, it assesses whether you have genuine expertise in the queried topic.

High siteRadius signals to AI systems that your site lacks focused expertise. You cover many topics without depth in any. The system has lower confidence that your content represents authoritative knowledge.

The data supports this mechanism. Research on sites filtered from AI Overviews found that low-coherence sites (those with high radius and poor focus) were 67% more likely to be excluded. High-coherence sites achieved 4.7× higher average positions in traditional results and substantially better AI citation rates.

This is why topic drift kills AI visibility specifically. Traditional SEO might tolerate some scatter if individual pages are well-optimized. AI retrieval evaluates at the site level. Your domain's semantic coherence directly affects whether AI systems trust your content enough to cite it.

Every off-topic page increases your radius and decreases AI retrieval confidence. The effect compounds across hundreds of pages accumulated over years of content marketing.


Diagnosing Your siteRadius Problem {#diagnosing}

Google does not expose siteRadius in any public tool. But you can diagnose your topical drift through systematic audit.

Map your content to topics. Export your content inventory. Categorize every page by its primary topic. Not its target keyword. Its actual semantic subject matter.

Identify pages furthest from your core. Which pages cover subjects unrelated to your primary expertise? These are your highest-radius pages. Common culprits:

Company news posts about external events or industry trends outside your domain. These capture momentary relevance but add permanent radius.

Trendjacking content published for traffic rather than expertise. If you would not hire the author to consult on that topic, the page likely increases radius.

Legacy content from previous business directions. Old product announcements, deprecated feature documentation, and historical content from strategic pivots.

Guest posts on tangential subjects. Content you published for relationship reasons rather than topical fit.

Apply the 80/20 rule. In most content audits, roughly 20% of pages cause 80% of the radius problem. Identify your worst offenders first. A site with 500 pages might have 100 that drive most of the topical scatter.

Assess page value against drift. Not all high-radius pages are worthless. Some may generate significant traffic or conversions. Weigh the value against the semantic cost. High-value pages slightly off-topic may be worth keeping. Low-value pages far off-topic are prime removal candidates.
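The audit steps above can be sketched as a small script. Since Google does not expose siteRadius, the per-page distances here are assumed to come from your own estimate (for example, embeddings you compute yourself). The URLs, visit counts, cutoffs, and function names are all hypothetical.

```python
def worst_offenders(distances, share=0.8):
    """Return the furthest pages that together account for `share` of total drift."""
    ranked = sorted(distances.items(), key=lambda item: item[1], reverse=True)
    target = share * sum(distances.values())
    picked, running = [], 0.0
    for url, distance in ranked:
        if running >= target:
            break
        picked.append(url)
        running += distance
    return picked


def triage(distance, monthly_visits, drift_cutoff=0.5, value_cutoff=100):
    """Weigh a page's drift against its value (both cutoffs are hypothetical)."""
    if distance < drift_cutoff:
        return "keep"
    return "review" if monthly_visits >= value_cutoff else "prune"


# Hypothetical audit data: URL -> estimated distance from the site centroid.
distances = {
    "/guide-core": 0.05,
    "/pricing-tips": 0.08,
    "/ceo-keynote": 0.60,
    "/trend-piece": 0.70,
    "/old-product": 0.55,
}
# Hypothetical monthly visits per URL.
visits = {
    "/guide-core": 900,
    "/pricing-tips": 400,
    "/ceo-keynote": 20,
    "/trend-piece": 2500,
    "/old-product": 5,
}

for url in worst_offenders(distances):
    print(url, triage(distances[url], visits[url]))
```

In this toy data, three of the five pages carry most of the drift. The high-traffic one is flagged for review rather than removal, mirroring the value-versus-drift weighing described above.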


Reducing Your siteRadius {#reducing}

Reducing siteRadius requires strategic content work, not indiscriminate deletion.

Prune the outliers. Identify pages with extreme topical drift and minimal value. Remove them or set noindex. These pages contribute disproportionately to radius while providing little return. Removal is not losing assets. It is removing negative signal.

Consolidate scattered content. Multiple thin pages on related topics can be merged into comprehensive resources. This reduces page count while concentrating semantic signal. The consolidated piece has lower radius than the scattered originals.

Redirect strategically. When removing pages with backlinks or historical traffic, redirect to topically relevant pages rather than deleting outright. The redirect passes value while eliminating the radius-increasing content.

Establish topical fit criteria. Future content should pass a topical fit test before publication. Does this piece strengthen our semantic territory or dilute it? Building this evaluation into your content workflow prevents radius accumulation going forward.
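A topical fit test could be as simple as a distance check against your own estimate of the site centroid before a draft is approved. The sketch below assumes you already have an embedding for the draft and a centroid vector. The `max_distance` threshold, the vectors, and the function names are hypothetical and would need tuning against your own content.

```python
import math


def cosine_distance(a, b):
    """1 minus cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def passes_topical_fit(draft_vector, site_centroid, max_distance=0.3):
    """True if the draft sits within `max_distance` of the site centroid."""
    return cosine_distance(draft_vector, site_centroid) <= max_distance


# Toy 2-D vectors (hypothetical values, for illustration only).
site_centroid = [0.95, 0.10]
on_topic_draft = [1.0, 0.2]
off_topic_draft = [0.1, 1.0]

print(passes_topical_fit(on_topic_draft, site_centroid))
print(passes_topical_fit(off_topic_draft, site_centroid))
```

A hard threshold is the simplest possible gate. A team might prefer to surface the distance as one input to an editorial decision rather than block publication automatically.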

Accept the timeline. Radius improvements take 3 to 6 months to show up in algorithmic evaluation. Google re-assesses site-level signals periodically, not in real time. Consistent effort over months produces measurable results. Expecting immediate improvement leads teams to abandon effective strategies.

The goal is not perfect topical purity. It is reducing average radius to a level that signals coherent expertise. Some drift is normal. Extreme scatter is penalized.


The Topical Authority Compound Effect {#compound-effect}

siteRadius creates self-reinforcing cycles, either virtuous or vicious.

The virtuous cycle:

Low radius signals focused expertise. Focused expertise improves rankings for core topics. Better rankings attract more topical visitors and links. More topical engagement encourages more topical content. More topical content further reduces radius. The cycle reinforces itself.

The vicious cycle:

High radius signals scattered identity. Scattered identity weakens rankings across topics. Weaker rankings create pressure to publish more content. Desperate content strategies often increase drift. More drift increases radius further. The cycle accelerates.

Organizations in the virtuous cycle find content marketing increasingly effective. Each piece builds on established authority. Rankings come easier. AI systems cite the site more frequently.

Organizations in the vicious cycle find content marketing increasingly difficult. Each piece struggles against weakened authority signals. Rankings require more effort. AI systems ignore the site despite its content volume.

The difference is contextual coherence. Not publishing volume. Not individual page quality. The coherence of your site's overall semantic identity.

Breaking the vicious cycle requires acknowledging that more content is not the solution. Focused content is. This often means publishing less while pruning more, at least until radius returns to healthy levels.


FAQs {#faqs}

What is siteRadius?

siteRadius is a Google internal signal that measures the semantic distance between individual pages and your site's overall semantic center (centroid). It calculates how far each page drifts from your established topical identity. Pages with excessive radius face algorithmic penalties because they weaken your site's coherent authority signal.

How is siteRadius different from siteFocusScore?

siteFocusScore measures your site's overall topical coherence (the forest view), while siteRadius measures individual page deviation from your semantic center (the trees view). A site can have a clear identity but high radius if many individual pages drift from that center. Both signals work together in Google's evaluation of site-level quality.

Does every off-topic page hurt my rankings?

Individual off-topic pages have minimal impact. The penalty emerges from patterns. A single company news post about an unrelated event is unlikely to affect rankings. However, systematic topic drift across many pages compounds into measurable radius increases that weaken your site's authority signal.

Should I delete all content outside my core topic?

Not necessarily. Evaluate each page's distance from your semantic center and its value. High-value pages slightly off-topic may be worth keeping. Low-value pages far off-topic are prime candidates for removal. The goal is reducing average radius, not achieving perfect topical purity. Focus on the outliers causing the most drift.

How do I know which pages have the highest radius?

Audit your content by mapping each page to its primary topic. Identify pages that cover subjects unrelated to your core expertise: company news about external events, trendjacking content, guest posts on tangential topics, legacy content from previous business directions. These typically have the highest radius and contribute most to topical scatter.

Can new topical content offset high-radius pages?

Partially. Publishing strong topical content does pull your site's average radius lower. However, removing or noindexing high-radius pages is more effective than trying to outpublish them. Think of how an average behaves: one page with extreme drift raises the mean by more than several pages close to the center can pull it back down.
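The arithmetic behind that comparison can be sketched with toy numbers. The distances below are hypothetical, and the example makes a simplifying assumption: existing pages' distances are held fixed, ignoring the small centroid shift that adding or removing pages would cause.

```python
def mean(values):
    return sum(values) / len(values)


def pages_needed_to_reach(distances, new_page_distance, target):
    """Count how many new pages at `new_page_distance` bring the mean to `target`.

    Returns None when the target cannot be reached by publishing alone.
    """
    if new_page_distance >= target:
        return None  # the mean only approaches new_page_distance, never gets below it
    total, count, added = sum(distances), len(distances), 0
    while total / count > target:
        total += new_page_distance
        count += 1
        added += 1
    return added


# Nine on-topic pages at distance 0.1 and one outlier at 0.9 (hypothetical).
distances = [0.1] * 9 + [0.9]

print("current mean radius:", round(mean(distances), 3))
print("after pruning the outlier:", round(mean(distances[:-1]), 3))
print("new 0.1 pages needed to reach 0.13:", pages_needed_to_reach(distances, 0.1, 0.13))
print("new 0.1 pages needed to reach 0.10:", pages_needed_to_reach(distances, 0.1, 0.10))
```

In this toy case, removing one page brings the mean back to roughly 0.1, while publishing on-topic pages needs many additions to get close and can never fully reach that level.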


The Strategic Choice

Every piece of content you publish either strengthens or weakens your site's semantic coherence. There is no neutral option.

siteRadius makes this tradeoff explicit. The signal measures exactly what intuition suggests: focused sites outperform scattered ones. Google's algorithm encodes what expertise looks like at scale.

The choice facing content teams is clear. Continue publishing volume and accept increasing radius. Or establish topical discipline and build compounding authority.

The organizations that understand siteRadius will make different content decisions than those that do not. They will ask "does this strengthen our semantic territory?" before asking "can we rank for this keyword?"

That question, applied consistently over months and years, is the difference between the virtuous and vicious cycles. The signal exists. The mechanism is documented. The strategic response is yours to choose.

About the Author

Jack Metalle

Founding Technical Architect, DecodeIQ

M.Sc. (2004), 20+ years semantic systems architecture