Technical SEO

The reasonable surfer makes for unreasonable thinkers

A guide to assessing search patents

Well, what can I say? It just never ends. Oh no, my friends, start shaking yer heads. We've talked about the LSI/Google crap not once, but two and three times. Behavioural data and, oh, I dunno, bounce rates a few times? How about the most important one, the Magic Bullet (2010 and 2007 versions)? Time and time again, it keeps happening: SEOs grasping at straws.

Google re-ranking and personalized search study


Once upon a time, a few cohorts and I wondered just how much flux there was in the Google SERPs. Has personalized search really changed the consistency of rankings? It’s an issue that has been discussed many times in the search world, so we set out to see what was up. After two rounds of research, we found that this was unlikely to be the case. You can learn more in the post: The SEO guide to Google personalized search

Google Personalized Search and Ranking Study

As the world of search has grown to include behavioural and geographic signals, many optimizers have begun to wonder about the value of ranking reports. The conventional thinking is that if we cannot definitively know the ranking of a given page, how do we value it? This is of great importance for those of us providing SEO services.

Ad Serving and User Performance Metrics

Using concepts for Ad Targeting - Click here for original Patent filing

This patent deals with serving ads based upon keywords or, moreover, topical ‘concepts’ to help determine ad ‘relevancy’. Yes, here I am once again discussing ‘relevance’ – I hope 2008 brings me a new hobby. This time I am looking (thanks Bill) at the ad serving world. I do feel it is important, as many of the concepts for establishing meaning or ‘themes’ in content/information/words are much the same in this area as they are in the ‘organic’ search processes.

Ranking via User Performance Metrics

Ranking documents based on large data sets - Click here for original Patent filing

Hello all…. How are we today? We’re gonna try melting our brains once again with some notes from a June 2007 Google patent that is a thrilling tale of ranking and re-ranking documents… not a good bedtime story unless your tea is real hot. Last time we were looking at establishing relevance with Learning a probabilistic generative model for text, and this time we will look at some ways of ranking results based upon this model.

A Probabilistic Learning Model

 Method and apparatus for learning a probabilistic generative model for text - Click here for Original Patent

This is an interesting method that seeks to ‘teach’ the system how to relate various documents, or more appropriately, the TEXT within the documents, from semantics to link nodes. Or as stated at one point – “a system that learns concepts by learning an explanatory model of text”. This is something they have worked on for a while and can be seen in the earlier related patents: Test classification system and method and Method and system for creating improved search queries.

Phrase Based Personalization of Search

Continuing the journey into Phrase Based Optimization

One thing worth mentioning is that there is limited info relating to personalized search (PS) and PaIR. It merely touches the surface of the overall personalized search methodologies. This means it would merely play a role in the PS engine. There is much more to it, and the PaIR model aspects are by no means comprehensive. I simply wanted to give a quick breakdown of how a PaIR system would handle PS processes.

Spam detection in a PaIR system

Detecting spam documents in a phrase based information retrieval system 

This is a continuation of Phrase Based Optimization and Phrase Based Indexing and Retrieval II.

An information retrieval system uses phrases to index, retrieve, organize and describe documents. Phrases are identified that predict the presence of other phrases in documents. Documents are then indexed according to their included phrases. A spam document is identified based on the number of related phrases included in a document.

At least that’s the opening folly of the document. As a basic refresher, the method looks not only at the search term but also at related phrases for a given topic, and at the related phrase occurrences statistically expected to be present in a document. It is calculated over individual/multiple documents and collections of documents (web pages and websites for our purposes).
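To make the idea concrete, here is a minimal sketch of flagging a page by the number of related phrases it contains. The function names, the related-phrase list and the `max_expected` threshold are all made up for illustration; the patent does not publish its actual numbers, and a real system would derive the expected count statistically rather than hard-code it.

```python
def related_phrase_count(text, related_phrases):
    """Count how many of the known related phrases occur in the text."""
    lowered = text.lower()
    return sum(1 for phrase in related_phrases if phrase in lowered)


def looks_like_spam(text, related_phrases, max_expected=3):
    """Flag a document that holds more related phrases than expected.

    max_expected is a hypothetical 'dial'; the real value is unknown.
    """
    return related_phrase_count(text, related_phrases) > max_expected


# Hypothetical related phrases for a travel topic
related = [
    "cheap flights",
    "flight deals",
    "airline tickets",
    "last minute flights",
    "discount airfare",
]

normal_page = "We compared cheap flights and airline tickets for our trip."
stuffed_page = ("cheap flights flight deals airline tickets "
                "last minute flights discount airfare")

print(looks_like_spam(normal_page, related))
print(looks_like_spam(stuffed_page, related))
```

The honest page mentions a couple of related phrases in passing, while the stuffed page crams in the whole list, which is the kind of pattern such a count would expose.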

 

 

Phrase Based Indexing and Retrieval 2


Picking up where we left off with the overview of Phrase Based Optimization – I wanted to scan over some relevant points from the other Phrase Based Indexing and Retrieval (IR) Patents. This time we'll step back from the algo-babble and explore the intricacies a little further.

As you (undoubtedly) remember, the core concept of the processing is to identify valid (actual/real) phrases in a given document collection (or web pages in our case). The goal is to classify each potential phrase as either “a good phrase or a bad phrase” depending on its usage and frequency, then to use those ‘good’ phrases to predict the usage of other ‘good phrases’ in the collection of web pages.


What’s a ‘Good Phrase’?

A possible phrase is classified as a good phrase when it ‘appears in a minimum number of documents, and appears a minimum number of instances in the document collection’; otherwise it is a bad phrase. What that number is, we don’t know. Those are the ‘dials’ only the Search Gods themselves have access to. It is almost like looking at a Phrase Density over the aggregate of documents (the web site). Also, a BAD phrase is not one with dirty words; it is simply a phrase with too low a frequency count to make the ‘good’ list.
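As a rough sketch of that good/bad split, the snippet below counts two-word phrases across a tiny document collection and applies both minimums. The thresholds `min_docs` and `min_instances`, the two-word phrase length and the sample documents are assumptions for illustration only; the real ‘dials’ are not public.

```python
from collections import Counter


def classify_phrases(docs, min_docs=2, min_instances=3, n=2):
    """Split candidate n-word phrases into 'good' and 'bad' sets.

    min_docs and min_instances are hypothetical thresholds.
    """
    doc_counts = Counter()       # number of documents containing the phrase
    instance_counts = Counter()  # total occurrences across the collection
    for text in docs:
        words = text.lower().split()
        phrases = [" ".join(words[i:i + n])
                   for i in range(len(words) - n + 1)]
        instance_counts.update(phrases)
        doc_counts.update(set(phrases))
    good = {p for p in instance_counts
            if doc_counts[p] >= min_docs
            and instance_counts[p] >= min_instances}
    bad = set(instance_counts) - good
    return good, bad


docs = [
    "phrase based indexing helps search",
    "phrase based indexing and phrase based retrieval",
    "duplicate content filters",
]
good, bad = classify_phrases(docs)
print(sorted(good))
```

Note that a phrase such as "based indexing" shows up in two documents but only twice overall, so with these made-up thresholds it lands on the ‘bad’ list purely because of its low frequency count.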

Duplicate Content

– One more time

Why do engines care? - In order to make a search more relevant to a user, search engines use a filter that removes duplicate content pages from the search results. Another reason is that they don’t want to spend resources indexing pages that are substantially similar.

That said, there still seems to be some confusion out in the SEO world over ‘duplicate content’ and how search engines treat and deal with it. Right away I would like to say - RELAX -. If you are doing sneaky things like filling up a site with dodgy content that YOU KNOW is duplicate, then worry. Most people who may have duplicate content issues are honest web site owners and aren’t at risk of any penalization.
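For a feel of how ‘substantially similar’ pages might be filtered out, here is a small sketch using word shingles and Jaccard similarity. This is a common textbook technique swapped in for illustration, not a description of what any search engine actually runs, and the shingle size `k` and the `threshold` are assumed values.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles in the text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k])
            for i in range(max(len(words) - k + 1, 1))}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def filter_near_duplicates(pages, threshold=0.8):
    """Keep only pages that are not too similar to one already kept.

    threshold is a hypothetical cut-off for 'substantially similar'.
    """
    kept = []
    for page in pages:
        page_shingles = shingles(page)
        if all(jaccard(page_shingles, shingles(other)) < threshold
               for other in kept):
            kept.append(page)
    return kept


pages = [
    "the quick brown fox jumps over the lazy dog",
    "the quick brown fox jumps over the lazy dog",
    "a completely different page about seo patents",
]
kept = filter_near_duplicates(pages)
print(len(kept))
```

The filter simply drops the repeat from the results. Nothing here punishes the site, which fits the point that most duplicate content is filtered rather than penalized.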

Phrase Based Optimization

The main goal of this document is to give SEO enthusiasts a stronger grasp of how Phrasing is dealt with in Search Engines, in an effort to help you further target and optimize your web sites. The theories and information relate well to keyword/phrase research as well as content creation and, to a lesser extent, backlink text development.

The crux of the piece was based on analysis of an existing Google Patent on ‘Phrase based searching’ (see Resources at the end). That is as far as I shall go on the original Patent, since it can lead to assumptions about what may or may not be used in their indexing and retrieval processes (algorithms). Just because they filed the patent doesn’t necessarily mean they have implemented it. I feel the main point here is to get a better idea of HOW search engineers think and WHAT may possibly be in place now, or in future Search technologies.


All content and images © Verve Developments 2012