Technical SEO
Using concepts for Ad Targeting - Click here for original Patent filing
This patent deals with serving ads based upon keywords or, more precisely, topical ‘concepts’ to help determine ad ‘relevancy’. Yes, here I am once again discussing ‘relevance’ – I hope 2008 brings me a new hobby. This time I am looking (thanks Bill) at the ad serving world. I do feel it is important, as many of the concepts relating to establishing meaning or ‘themes’ for content/information/words are much the same in this area as they are in the ‘organic’ search processes.
There is a bunch of the usual mind-numbing detail as far as the nuts and bolts go, from user account settings such as geo-location, budget and desired keywords, to the server-side calls for the number of ads to be displayed and the size/real estate available for them. Of interest is the checking for geo-location mismatches between, say, a search query and a regionally targeted ad (sounds like IP delivery, er... cloaking, to me?).
What’s the point?
Ranking documents based on large data sets - Click here for original Patent filing
Hello all…. How are we today? We’re gonna try melting our brains once again with some notes from a June 2007 Google patent that is a thrilling tale of ranking and re-ranking documents… not a good bedtime story unless your tea is real hot. Last time we were looking at establishing relevance with ‘Learning a probabilistic generative model for text’, and this time we will look at some ways of ranking results based upon this model.
…. And away.
Ranking and re-ranking from past user data
“a ranking model that predicts a likelihood that a document will be selected by: storing information associated with a plurality of prior searches, determining a prior probability of selection based, at least in part, on the information associated with the prior searches, and generating the ranking model based, at least in part on the prior probability of selection; training the ranking model using a data set that includes approximately tens of millions of instances; identifying documents relating to a search query; scoring the documents based, at least in part, on the ranking model; forming search results for the search query from the scored documents; and outputting the search results.”
If you’re still awake after that… I think we’ll be ok. Once again we’re touching on using prior searches (and likely user sessions) and probabilities, as with the recent review of ‘Method and apparatus for learning a probabilistic generative model for text’. The method also relies on training data and on creating rules that relate to that particular document. Give that review a read at some point for reference.
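To make the 'prior probability of selection' idea a little less sleepy, here is a minimal sketch in Python. It is not the patent's model (which is trained on tens of millions of instances); it only assumes a hypothetical log of (document, was it selected) pairs from prior searches, estimates a smoothed selection rate per document, and sorts candidates by that rate. The function names, the smoothing value and the sample log are all made up for illustration.

```python
from collections import Counter


def prior_selection_probability(log, alpha=1.0):
    """Estimate how likely each document is to be selected when shown.

    `log` is a hypothetical list of (doc_id, was_selected) pairs taken
    from prior searches. `alpha` is additive smoothing (an assumption,
    not something the patent specifies) so rarely shown documents do
    not get extreme scores.
    """
    shown = Counter()
    selected = Counter()
    for doc_id, was_selected in log:
        shown[doc_id] += 1
        if was_selected:
            selected[doc_id] += 1
    return {
        doc_id: (selected[doc_id] + alpha) / (shown[doc_id] + 2 * alpha)
        for doc_id in shown
    }


def rank(candidates, priors, default=0.5):
    """Order candidate documents by prior selection probability.

    Documents never seen in the log fall back to `default`.
    """
    return sorted(candidates, key=lambda d: priors.get(d, default), reverse=True)


# Toy log: document "a" was shown three times and selected twice, etc.
log = [("a", True), ("a", True), ("a", False),
       ("b", False), ("b", False), ("c", True)]
priors = prior_selection_probability(log)
ranking = rank(["b", "c", "a", "d"], priors)
print(ranking)
```

A real system would combine this prior with many other signals for the query at hand; the sketch only shows how past selections alone can reorder a result list.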
Method and apparatus for learning a probabilistic generative model for text - Click here for Original Patent
This is an interesting method that seeks to ‘teach’ the system how to relate various documents, or more appropriately, the TEXT within the documents, from semantics to link nodes. Or as stated at one point – “a system that learns concepts by learning an explanatory model of text”. This is something they have worked on for a while, as can be seen in the earlier related patents: Test classification system and method, and Method and system for creating improved search queries.
Moving along….
In section 2 – Related Art – we have;
“Processing text in a way that captures its underlying meaning--its semantics--is an often performed but poorly understood task. This function is most often performed in the context of search engines, which attempt to match documents in some repository to queries by users. It is sometimes also used by other library-like sources of information, for example to find documents with similar content. In general, understanding the semantics of text is an extremely useful subcomponent of such systems. Unfortunately, most systems written in the past have only a rudimentary understanding, focusing only on the words used in the text, not the meaning behind them.”
Call me a Phrase Based Indexing and Retrieval junky (and you’d be right), but once again the concepts apply. The whole PaIR methodology sought to do just this – further comprehend the actual meaning of a document/text block rather than simply looking at individual words. For those paying attention, Google showed interest in the direction of ‘semantics’ when it purchased Applied Semantics and its ‘Latent Semantic Analysis’ technologies back in 2003 – though presumably for their AdWords/AdSense program. So this is not a new direction.
Continuing the journey into Phrase Based Optimization
One thing worth mentioning is that there is limited info relating to personalized search (PS) and PaIR. The patent merely touches the surface of the overall personalized search methodologies, which means PaIR would only play a role within the PS engine. There is much more to it, and the PaIR model aspects are by no means comprehensive. I simply wanted to give a quick breakdown of how a PaIR system would handle PS processes.
Personalized Search in a PaIR system looks to customize the ranking of the search results based on a perceived model of a user's particular interests. Information deemed to be relevant to the user's interests would rank higher in the search results.
Detecting spam documents in a phrase based information retrieval system
This is a continuation of Phrase Based Optimization and Phrase Based Indexing and Retrieval.
“An information retrieval system uses phrases to index, retrieve, organize and describe documents. Phrases are identified that predict the presence of other phrases in documents. Documents are then indexed according to their included phrases. A spam document is identified based on the number of related phrases included in a document.”
At least that’s the opening volley of the document. As a basic refresher, the method looks not only at the search term but also at related phrases for a given topic, and at how many related-phrase occurrences are statistically expected to be present in a document. It is calculated over individual/multiple documents and collections of documents (web pages and websites for our purposes).
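As a rough illustration of the 'number of related phrases' idea, here is a small Python sketch. It assumes that a document containing more related phrases than would normally be expected is the suspicious case; the excerpt above only says spam is identified 'based on the number', so treat that direction, the threshold and the phrase list as my own assumptions rather than the patent's actual numbers.

```python
def count_related_phrases(text, related_phrases):
    """Count how many of the related phrases appear at least once in the text.

    Uses simple case-insensitive substring matching, which is a
    simplification; a real system would work from its phrase index.
    """
    lowered = text.lower()
    return sum(1 for phrase in related_phrases if phrase.lower() in lowered)


def looks_like_spam(text, related_phrases, max_expected=3):
    """Flag a document whose related-phrase count exceeds a threshold.

    `max_expected` is a hypothetical cut-off chosen for this example.
    """
    return count_related_phrases(text, related_phrases) > max_expected


# Hypothetical related phrases for the topic "australian shepherd".
related = ["australian shepherd", "blue merle", "red merle",
           "agility training", "herding dog"]

normal_page = "Our Australian Shepherd is a blue merle who loves agility training."
stuffed_page = ("Australian Shepherd blue merle red merle agility training "
                "herding dog buy now cheap")

print(looks_like_spam(normal_page, related))
print(looks_like_spam(stuffed_page, related))
```

The natural page mentions three of the five phrases and passes, while the stuffed page crams in all five and gets flagged. In practice the expected count would come from statistics over the whole collection, not a hard-coded number.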
Knowledge Base
This article is not aimed at the advanced level reader, folks. I talk to many people who don't even spend a passing thought on KW/P research - this posting is written to give the reader a foundation, not to serve as an advanced guide. Of course, this being an inexact science, there could be some gems in here for the ragged SEO warriors as well.
For a while now, I have been intending to give my perspective on Keyword/Phrase (KW/P) research and targeting. Why? Because I think it is an essential, if not mandatory, part of any SEO campaign. Each and every term you target will have a cost associated with it. This is where the rubber meets the road. In simplest terms, if I spend $500 on link building, $200 on content creation and $100 on site adjustments, then I am looking at $800 invested. Once we reach the desired ranking, how long will it take to recoup that cash (including ongoing maintenance)? In short, where is the ROI and what is the ‘break even’ point? That is a very simplistic example, but hopefully you get the idea.
Simply aiming to be #1 on Google is foolhardy, since some terms carry little reward (traffic). Remember, cash that gets tied up chasing non-performing terms is money that could have been used elsewhere in your marketing endeavors. So, this is certainly an important step in the SEO process. Mistakes here can be very costly later on and in the overall ‘big picture’ that is the site's financial health.
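The break-even question above is easy to put into a few lines of Python. This is only a sketch: the $500/$200/$100 figures come from the example, while the monthly revenue and maintenance numbers are hypothetical values picked to show the arithmetic.

```python
import math


def break_even_months(upfront_costs, monthly_revenue, monthly_maintenance=0.0):
    """Return the number of whole months needed to recoup the upfront spend.

    Returns None when the term never pays for itself, i.e. when monthly
    revenue does not exceed monthly maintenance.
    """
    net_per_month = monthly_revenue - monthly_maintenance
    if net_per_month <= 0:
        return None
    return math.ceil(sum(upfront_costs) / net_per_month)


# Link building, content creation, site adjustments = $800 invested.
costs = [500, 200, 100]

# Hypothetical: the term brings in $250/month and costs $50/month to maintain.
months = break_even_months(costs, monthly_revenue=250, monthly_maintenance=50)
print(months)

# Hypothetical non-performing term: revenue does not cover maintenance.
print(break_even_months(costs, monthly_revenue=40, monthly_maintenance=50))
```

Running the same calculation across a list of candidate terms is a quick way to see which ones are worth chasing before any money is spent.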