Analysis

Local Search Implications of the Google API Leak

David Mihm

Jun 4, 2024

Visualization of the references between Google Search API modules

Last week's leak of Google Search API documentation has broad SEO implications relevant to local search, including the importance of SERP clickthrough patterns, site architecture, sitewide signals, page-level scoring, quality raters, mobile-friendliness, and the prevalence of entities.

The leak also reveals specific details about local search, such as the role of categories, local intent, query parsing, and the special treatment given to high-volume categories and brands.

While the documents do not provide a complete picture of Google's algorithms, they offer valuable directional information for SEO practitioners. As Mike King and others have noted, the leak suggests that a holistic approach to SEO, leveraging larger themes of site architecture, topical relevance, user intent, and entity-based optimization, may be more effective than targeting specific keywords alone.

Rand Fishkin of SparkToro first published the news that a huge set of Google Search API documentation had been made public, initially anonymously, and later attributed to Erfan Azimi. It’s one of the most consequential events in my 20-year SEO career.

Where to start

You can’t go wrong reading, watching, or listening to

if you haven’t already. If you’d prefer to review the documents yourself, start with the homepage for the complete repository.

How I'd position this repository

The scope of the documents is vast, with everything ranging from Google Assistant API calls to Google Maps road segments and transit options.

I wouldn’t call it an algorithm leak exactly, though I understand why some position it that way. Having reviewed the repository in some detail, it strikes me more as (a subset of?) a wiki of algorithmic elements which may or may not be in use. A buffet of ingredients Google might use in its special sauce, but not a recipe per se.

Limitations

We don’t know a particular date on which a given document was published. For example, there were a handful of noticeable Attribute references to Google Plus (including this one, which also references bizbuilder, the internal name for the mid-2010s version of Google My Business, and even this one which references its predecessor, the Local Business Center). Whether or not the attributes, or even the entire modules, containing those references hold any algorithmic value is very much an open question.

That said, the authors of these docs tend to go out of their way to mark particular attributes as deprecated (there are a lot of deprecations), and many modules and attributes certainly still seem to be applicable based on my general osmosis of thousands of SERPs over the last few months.

The bigger limitation, as I hinted above, is whether this is anything close to a complete list of ingredients or whether there are others contained in sub-APIs which are referenced extensively by these documents but not leaked alongside them.

High-level themes suggested by the documentation

I’d caution against taking any interpretation of these docs (including the one below!) as gospel. But I don’t see the harm in using them directionally. They absolutely help in understanding how Google looks at the job of "organizing the world’s information."

And while it’s true that there are no weights associated with given attributes, at some point, the volume of references to a particular theme has to be some indication of its importance in the algorithm.

Below are my most important takeaways for local search practitioners.

Local-relevant docs

SERP clickthrough patterns (“NavBoost”) seem to play a meaningful role in rankings

Importantly, as Rand pointed out in his summary and Pandu Nayak revealed in DOJ testimony, clickthrough analysis happens at the Metro level, and along with entity information Google procures from other sources, this may be one way Google assesses "brandyness," even for local businesses.

Localized clickthrough rate is also referenced here in a very interesting way:*

clickRadius50Percent (type: integer(), default: nil) - The radius (in miles) around the assigned location that the document gets 50% of its clicks.

*A caveat is that this document references a deprecated geoLocation system (“docloc;” “brainloc” appears to be more recent).
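
To make the attribute definition concrete, here is a minimal sketch of how a "50% click radius" could be derived from click locations. The leaked documentation gives only the one-line description above. The haversine distance, the function names, the rounding up to whole miles, and the input format are all assumptions for illustration, not Google's implementation.

```python
import math

EARTH_RADIUS_MILES = 3958.8

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def click_radius_percent(assigned_location, click_locations, percent=50):
    """Smallest whole-mile radius around assigned_location that contains
    at least `percent` of the clicks (hypothetical helper)."""
    if not click_locations:
        return None
    distances = sorted(
        haversine_miles(*assigned_location, *click) for click in click_locations
    )
    # Number of clicks that must fall inside the radius (at least one).
    needed = max(1, math.ceil(len(distances) * percent / 100))
    return math.ceil(distances[needed - 1])
```

Under these assumptions, a business whose clicks mostly come from searchers a few miles away would get a small radius, while a destination business would get a large one.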

My primary takeaway from both of these localized clickthrough attributes is that your on-SERP presence needs to be compelling enough for local searchers to click on it, whether that’s a Google Business Profile or a Title Tag and Meta Description snippet.

It’s one of the reasons we’re so passionate about the consumer search behavior portion of our Search Gameplan – companies need to understand what captures users’ attention and what drives their clicks at least as much as they assess their competitive presence across traditional SEO metrics.

Three major site architecture takeaways

Link to pages you want to rank from your high search traffic pages.
I’ve long advised clients to link to their most important pages from their homepage and primary navigation. I’m now also going to advise finding a way to link to them from high search traffic pages as well. What seems to be the primary page-level module refers to onsiteProminence, defined as “the importance of the document within its site. It is computed by propagating simulated traffic from the homepage and high craps click pages.” Based on other references to "craps" in NavBoost documents, I take "high craps click pages" to mean "high search traffic pages." (An ex-Googler confirmed this interpretation on X last week before deleting his tweet.)
Implement localized content silos where they make sense.
It’s not clear to me whether "the list of locations assigned to a document" in the CountryGeoLocation module refers to intra-country Metros and States/Provinces (referenced as such elsewhere) or the higher-level country. Even if Google does not explicitly tag a localized subpage with a given location, this module indicates it will inherit the location assigned to its parent.

This module refers to the notion of a "sitechunk" specific to chains.

The most prudent course of action would seem to be to implement localized content silos ("sitechunks"?) where they make sense.
On-site anchors matter.
Their importance and textual relevance are calculated separately from off-site anchor text, which suggests they may not contribute as much to rankings, but why devote so much documentation if they don’t contribute something?
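
The onsiteProminence definition in the first takeaway ("propagating simulated traffic from the homepage and high craps click pages") can be illustrated with a small sketch. The documentation does not say how the propagation works, so this swaps in a personalized-PageRank-style iteration as a stand-in. The damping factor, the iteration count, the even split across outgoing links, and the function name are all assumptions.

```python
def onsite_prominence(links, seeds, damping=0.85, iterations=20):
    """Propagate simulated traffic from seed pages through internal links.

    links: dict mapping each page to the list of pages it links to.
    seeds: dict mapping seed pages (homepage, high-click pages) to
           their starting traffic. Both structures are hypothetical.
    """
    pages = set(links) | {t for targets in links.values() for t in targets} | set(seeds)
    total_seed = sum(seeds.values())
    base = {p: seeds.get(p, 0.0) / total_seed for p in pages}
    score = dict(base)
    for _ in range(iterations):
        # Each round, seed pages receive a fresh share of simulated traffic.
        nxt = {p: (1 - damping) * base[p] for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if not targets:
                continue  # traffic on pages with no outgoing links is dropped
            share = damping * score[page] / len(targets)
            for target in targets:
                nxt[target] += share
        score = nxt
    return score

links = {
    "/": ["/services", "/about"],
    "/blog/popular-post": ["/services"],
    "/services": [],
    "/about": [],
    "/orphan": [],
}
seeds = {"/": 1.0, "/blog/popular-post": 1.0}
scores = onsite_prominence(links, seeds)
```

In this toy site, "/services" ends up with a higher score than "/about" because it is linked from both the homepage and the high-traffic blog post, while "/orphan" receives nothing. That is the intuition behind linking to priority pages from high search traffic pages.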

Sitewide signals evident throughout the documentation

Plenty of ink has already been spilled over the inclusion of siteAuthority in these documents, despite Google’s repeated denials of this concept. But siteAuthority is far from the only reference to sitewide (not to mention sitechunk) and domain-level signals, even within documents that are ostensibly about page-level scoring. This includes topical classification which seems to take sitewide relevance to a given topic into account.

The sheer quantity of sitewide signals referenced helps explain why larger, more "trusted" sites continue to dominate so many search results today, even if the page-level content that’s ranking seems subpar.

Reading these documents suggests placing at least as much emphasis on site architecture, general topical relevance suggested by the complete corpus of content on your website, and user intent, as on specific keyword targeting.

Page-level scoring is there, too, relative to location

Unsurprisingly, Google seems to evaluate the suitability of a given page for a given query via quality raters, page typology (commercialScore), trigger words in the Title, and more.

But more interesting for local search practitioners: document (page-level) performance appears to be assessed at the city, county, state, and country level.

Page-based metro-level assessments are also referenced in a parallel module.

Quality raters aren’t just evaluating the quality of search results, but links and entities too

It looks like Google has used quality raters (internal codename EWOK) to assess link quality and topical relevance, as well as entity relevance for a given query.

Mobilegeddon was (and is?) real

The documents contain a number of references to project VOLT, which incorporates Core Web Vitals, HTTPS, and general mobile-friendliness. (The presence of interstitials is also noted, presumably as a negative factor, in the de facto primary page-level module, and is explicitly called out in other places as well.)

Entities are everywhere

Long underappreciated by the wider SEO community, the extent of entity-related references in the documentation almost seems to position content as context for entities, rather than the other way around.

Based on this document, it wouldn’t surprise me if every query were checked for known entity references (including location) as a starting point.

Here are some of the more comprehensive modules discussing entity resolution and entity attributes. Of particular interest are the notions of entities being graded on a topicalityScore, and multiple references to something called MDVC here, here, and here.

As noted earlier, Google looks everywhere for linkages between entities, perhaps most eye-openingly within transcripts of YouTube videos, which of course are also measured by attributes such as crapsData (again, presumed to be SERP clickthroughs) and uniqueChromeViews.

Local-Specific Docs

Categories and local intent

Sadly, we don’t know what the salientTermSets are that drive Google’s decision to show results corresponding to particular categories (GCIDs), but these two documents provide at least a little insight into the back-end of local intent satisfaction:

gcidIntent (type: list(String.t), default: nil) - Geo intents corresponding to the gcids obtained from the Local result.
salientTermSet (type: GoogleApi.ContentWarehouse.V1.Model.QualitySalientTermsSalientTermSet.t, default: nil) - Salient terms associated with this Local result based on the result gcids.
localResultSignals (type: list(GoogleApi.ContentWarehouse.V1.Model.KnowledgeAnswersIntentQueryLocalResultSignals.t), default: nil) - Signals relating to each Local result.
minSalientTermSimilarity (type: number(), default: nil) - Minimum salient term similarity between Local results.
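
As a rough illustration of what a "minimum salient term similarity between Local results" might mean, here is a minimal sketch. The documentation does not define the similarity measure or the shape of a salient term set, so the cosine similarity, the {term: weight} dictionaries, and the function names below are assumptions.

```python
import math
from itertools import combinations

def cosine_similarity(a, b):
    """Cosine similarity between two {term: weight} dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def min_salient_term_similarity(term_sets):
    """Lowest pairwise similarity across the salient term sets of a
    group of local results (hypothetical interpretation)."""
    pairs = list(combinations(term_sets, 2))
    if not pairs:
        return None
    return min(cosine_similarity(a, b) for a, b in pairs)

results = [
    {"coffee": 1.0, "espresso": 1.0},
    {"coffee": 1.0, "espresso": 1.0},
    {"coffee": 1.0, "bagel": 1.0},
]
lowest = min_salient_term_similarity(results)
```

Read this way, a low minimum would indicate that a set of local results covers fairly different kinds of businesses, while a high minimum would indicate a tightly focused category intent.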

Local category lists

Google maintains a subset of local categories here and an even more bizarrely-specific category list here. Lodging-specific categories can be found here. None aligns exactly with Google Business Profiles’ list, or even the Google Places API.

Local query parsing

While not directly actionable, Google’s discussion of how they parse location-related queries in this module is interesting. They first look for an entity type (or category) and then figure out which constraints apply in a structured manner.
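
The two-step order described above (entity type first, then structured constraints) can be sketched as a toy parser. The category list, the constraint phrases, the field names, and the simple substring matching are all hypothetical; the leaked module describes the approach, not these specifics.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical vocabularies; the leak does not list the real ones.
CATEGORIES = {"coffee shop", "hotel", "restaurant", "pizza"}
CONSTRAINT_PATTERNS = {
    "open now": ("open_now", True),
    "near me": ("near_user", True),
    "with wifi": ("amenity", "wifi"),
    "cheap": ("price", "low"),
}

@dataclass
class ParsedLocalQuery:
    category: Optional[str] = None
    constraints: dict = field(default_factory=dict)
    leftover: str = ""

def parse_local_query(query):
    text = " ".join(query.lower().split())
    parsed = ParsedLocalQuery()
    # Step 1: look for the entity type (category), longest match first.
    for category in sorted(CATEGORIES, key=len, reverse=True):
        if category in text:
            parsed.category = category
            text = text.replace(category, " ", 1)
            break
    # Step 2: pull out the structured constraints that apply.
    for phrase, (key, value) in CONSTRAINT_PATTERNS.items():
        if phrase in text:
            parsed.constraints[key] = value
            text = text.replace(phrase, " ", 1)
    # Whatever remains (often the location) is kept as free text.
    parsed.leftover = " ".join(text.split())
    return parsed

parsed = parse_local_query("cheap coffee shop open now in Seattle")
```

Here the query resolves to the category "coffee shop" with two constraints (price and opening hours), leaving "in seattle" to be handled as a location.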

As I’ve indicated in previous Near Memo podcasts and LinkedIn posts, faceted or sequential search seems to me to be the most logical final state for actually-useful AI overviews – not the dissertation-length single query that Google VP Liz Reid demo’ed at I/O last month. Google’s existing logic for interpreting local search queries seems to position it for success with either style of query, however.

This recent (post-pandemic) document appears to contain all the structured “constraints” for location-related searches. Included here is visitHistory, which I’ve long hypothesized is a possible “hidden” ranking factor for categories with lots of foot traffic.

Star ratings oddly are thought to be primarily relevant for hotels.

Special treatment for high volume categories

Google appears to maintain a discrete library of keywords which pre-filters local results based on Google Business Profile attributes.

Certain categories (coffee, brunch, hotels, restaurants – "hyperReliableData") seem to be treated differently from all the rest. After reviewing the repeated references to hyperReliableData, it started to feel like there might be some kind of manual mapping of certain keywords in high-volume categories to specific GBP category IDs.

Additionally, Google at least started down the path of category-dependent Proximity impact (commodityStrength), but the unresolved comment about future changes to this attribute implies this particular implementation may have been abandoned.

Special treatment for brands

Nowhere is Google’s “brand” assessment more obvious than in local search. At least historically – and possibly still today – Google treats large chains qualitatively differently in local search evaluation and results.

Chains are a discrete LocationType with their own freebase IDs.
Chains and Subchains receive their own module.
Google associates canonical GConcepts with chains.
Those GConcepts are assigned to individual chain locations.
Google tries to parse queries to see if they refer to chains.
Google tries to assess whether a page refers to a large chain or is itself a large chain.
Quality raters are even explicitly asked to assess chain-related content.

This isn’t to say that chains are necessarily treated more favorably in Local results, simply that Google seeks to understand them in a specific way, and possibly at a deeper level, than the typical local business.

Local organic results

This “LocalWWWInfo” document’s reference to a brickAndMortarStrength attribute is particularly interesting. Is it possible that the strength of a Google Business Profile can influence organic rankings (as well as vice versa)?

The isLargeLocalwwwinfo attribute is also noteworthy; presumably Google is classifying directory-like sites with this tag. And they're classifying enterprise websites with the isLargeChain attribute noted above.

Coupled with references to specific Travel Site types and smallPersonalSite, this implies there are a specific number of slots allotted to specific types of sites in certain localized SERPs. I didn't find anything in the documentation describing how these slots are allocated (from other research, via "Twiddlers"), but the underlying taxonomy associated with a given website is clearly important for some local queries.

Local data

This feels like an older document due to the references to the “cluster” and “plusbox,” but there is clear evidence for the concept of citations (LocalEntityAnnotations::location_confidence). Another seemingly older document talks about Google’s ability to extract address, phone, and hours information from web documents. They appear to have called (or continue to call) the concept of a citation a “wrapptorItem.” As Mike Blumenthal and I have long maintained, in many ways Local served as the guinea pig for the entire Knowledge Graph.

A seemingly more recent document (which mentions food ordering via Google) mentions whether a result is located inUserLocality, long suspected to be a local ranking factor.

Miscellany

An older doc, judging by its references both to BizBuilder (the internal name of the proto-Google My Business) and to the GConcepts doc.
This GConcept document may imply that primary category is given greater weight when assessing relevance for a given query (assuming that Google Business Profiles are considered a “data source.”)
These two Geo-related ranking documents appear to refer to Maps features, not businesses (Zenrin is a Japanese mapping data provider; Oyster appears to be or have been Google’s entity resolution system – h/t Andrew Shotland).
Google appears to define a centroid (referred to as “center”) as the “conceptual center,” which I quite like.
RIP custom calls to action (and everything else Messaging-related) in Google Business Profiles.
I don’t know what to make of this Jobs-related local document.