SEOmoz, masters of SEO guidance and pro SEO insight, have got themselves into something of a lather this week with the release of a new lab tool that measures your LDA (Latent Dirichlet Allocation). Put together by SEOmoz senior scientist Ben Hendrickson and announced as part of his SEO presentation at the SEOmoz Pro Seminar in Seattle, LDA as a concept, along with the new LDA tool, has certainly got the SEO community buzzing. Though there’s already been a stack of speculation on how Google integrates LDA technology into its algorithm, the new SEOmoz focus on LDA seems to have really sent people over the edge.
For those SEO friends out there unfamiliar with the concept, here’s a brief overview and what it might mean for the future of SEO.
A brief history of LDA
Despite being referred to by some as a ‘Game Changer’, there’s actually nothing particularly new about LDA. It was developed by David Blei of Princeton University, and as far back as 2002 and 2003 IR (information retrieval) specialists were working with LDA as a way of querying databases for relevant information. By 2004 Microsoft Research was looking at ways of integrating LDA into its search model.
What seems to have brought LDA to the fore recently is the suggestion that search engines use LDA to identify contextual relevance through topic modelling. Hendrickson’s research, which involved assessing 8 million documents through over 1000 queries, showed clearly that the pages Google ranks highest typically have more topical content. Put another way, it would seem that there is now little doubt that search engines apply topic-based semantic analysis when indexing a page and are now able to determine the contextual relevance and intent of the copy content.
LDA isn’t LSI
It’s important to differentiate between LDA and LSI (Latent Semantic Indexing). Where LSI approaches relevance from a semantically shallower keyword density perspective, LDA uses a deeper semantic pool that applies contextual relevance through the creation and study of related topics using modifiers, adjectives and synonyms.
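For the more technically minded, here is a rough idea of what "creating topics" means in practice. This is only a minimal sketch of LDA fitted with a textbook technique called collapsed Gibbs sampling. It is not the SEOmoz tool and certainly not anything Google has published. The function name lda_gibbs, the settings alpha, beta and iterations, and the four tiny sample documents are all assumptions made up for illustration.

```python
import random

def lda_gibbs(docs, num_topics=2, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Toy LDA via collapsed Gibbs sampling. docs is a list of word lists.

    All parameter names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                # words in doc d given topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # times word w is given topic k
    topic_total = [0] * num_topics                              # total words given topic k
    assignments = []

    # Start by giving every word a random topic.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(zs)

    # Repeatedly re-pick each word's topic, favouring topics that are already
    # common in the same document and already associated with the same word.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    # Smoothed share of each topic in each document (every row sums to 1).
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    return theta, topic_word, vocab

# Made-up sample documents, purely for illustration.
docs = [
    "keith richards guitar album tour".split(),
    "album tour guitar band keith".split(),
    "pebbles rocks gemstones mineral".split(),
    "rocks mineral pebbles quarry".split(),
]
theta, topic_word, vocab = lda_gibbs(docs)
```

Each row of theta is one document's mix of topics. On a real corpus the sampler would be run over thousands of pages rather than four short lines, and the numbers it produces depend on the random seed, so treat the output as indicative only.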
Google have long talked of their attempts to identify topicality through context. Amit Singhal, writing on the Google blog in 2008, said: “Synonyms are the foundation of our query understanding work. This is one of the hardest problems we are solving at Google. Though sometimes obvious to humans, it is an unsolved problem in automatic language processing. As a user, I don’t want to think too much about what words I should use in my queries. Often I don’t even know what the right words are. This is where our synonyms system comes into action. Our synonyms system can do sophisticated query modifications, e.g., it knows that the word ‘Dr’ in the query [Dr Zhivago] stands for Doctor whereas in [Rodeo Dr] it means Drive. A user looking for [back bumper repair] gets results about rear bumper repair. For [Ramstein ab], we automatically look for Ramstein Air Base; for the query [b&b ab] we search for Bed and Breakfasts in Alberta, Canada. We have developed this level of query understanding for almost one hundred different languages, which is what I am truly proud of.”
Hendrickson’s analysis used Lady Gaga Poker examples, as well as highlighting the importance of referring to Keith Richards, Goats Head Soup or Exile on Main Street when writing about the Rolling Stones, and of avoiding talk of rocks, pebbles or gemstones, in order to maintain semantic integrity when someone searches for ‘The Stones’.
There are plenty of tools out there to assist in developing contextual content. Google Wonder Wheel and Google Sets are powerful weapons in any SEO’s armoury when it comes to identifying themes and associations.
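To make the ‘Stones’ example a little more concrete, here is a minimal sketch of how copy could be scored against two competing topics. The word weights are hand-picked assumptions for illustration (a trained LDA model would learn them from a large corpus), and the names TOPICS, UNSEEN, topic_scores and best_topic are hypothetical. The score is a rough one rather than a properly normalised probability.

```python
import math

# Hypothetical, hand-picked word weights for each topic.
# A real LDA model would learn these from a corpus instead.
TOPICS = {
    "rock band": {"stones": 0.2, "keith": 0.2, "richards": 0.2,
                  "exile": 0.15, "album": 0.15, "soup": 0.1},
    "geology": {"stones": 0.2, "rocks": 0.25, "pebbles": 0.2,
                "gemstones": 0.2, "mineral": 0.15},
}
UNSEEN = 0.001  # assumed small weight for words a topic does not list

def topic_scores(text):
    """Sum of log weights of the text's words under each topic (higher is better)."""
    words = text.lower().split()
    return {
        name: sum(math.log(weights.get(w, UNSEEN)) for w in words)
        for name, weights in TOPICS.items()
    }

def best_topic(text):
    """Name of the topic whose word weights fit the text best."""
    scores = topic_scores(text)
    return max(scores, key=scores.get)

band_copy = "the stones keith richards exile album"
rock_copy = "stones rocks pebbles gemstones"
```

With these made-up weights, best_topic(band_copy) comes out as "rock band" and best_topic(rock_copy) as "geology", even though both pieces of copy contain the word "stones". It is the surrounding vocabulary that settles which meaning is intended.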
The importance of LDA to SEO
With all eyes currently focused off-page in the pursuit of juicy inbound links, a shift back towards on-page factors in determining search relevance and subsequent search positioning may substantially alter the way that SEOs go about their business, placing even greater emphasis on rich copy content and lessening the importance of links. There are plenty who would argue, though, that it’s exactly this sort of well constructed, high quality and information rich content, the type of content that visitors find so useful, that inspires a healthy range of high quality links in the first place. In many ways any additional focus on LDA is yet another search engine attempt to encourage search destinations of value and interest.
Talk to your SEO for more information on how you can turn the use of LDA to your competitive advantage.
Call Top Page NOW on +44 (0)845 052 9467 and talk to Chris Horner
Posted by Chris Horner – SEO Specialist