Google gets phrasally semantic
Wed Apr 01 00:00:00 IST 2009, by Abhishek Mehta

In January, during the fourth-quarter earnings call, Eric Schmidt hinted that 2009 would be a semantic year for Google: a key year in which his company would start raiding the semantic space. His lieutenants responded to the CEO's call immediately, and on the 24th of March it became official (and live) that Google will provide better suggestions for related searches, based on semantic analysis of search strings. Since this development benefits everyone who uses Google search, I decided to analyse and compare this new feature in my latest blog.

[Image: Google Semantic Search Example]

Google users already had a somewhat interesting feature on result pages, called "Searches related to". Those who have never seen this feature can blame it on the positioning, which is at the bottom of the page. It is a keyword modifier that varies the search keywords based on suggestions automatically generated by Google. Query suggestions for single and compound words were already in place, but this time the company has taken a semantic approach to take a shot at phrases.

Google's Semantic Capabilities:

I ran a positive test with "Glaciers melting in Himalayas" as a single unquoted phrase in the search box, and the alternate query suggestions that came out were certainly interesting. Two suggestions especially caught my eye: "Global warming Himalayas" and "climate change Himalayas" (see image). Skeptics might say these results can be achieved this way or that, but since Google says it has gone semantic, I believe them. Generating the options "global warming" and "climate change" from "Glaciers melting" is certainly a semantic step forward. Many more semantic suggestions are possible, based on polysemy and synonymy, but let us wait and watch until Google matures this technology.

Google's current approach does a poor job once you enter two or three phrases, or something that can be read as multiple phrases. For example, modify the above query string by inserting "are". When I searched for "Glaciers are melting in Himalayas", the results fell flat and the query suggestions were not even worth mentioning. The company has already said that it is currently targeting only single phrases, so if it can do a good job with a single phrase, I would say it is a good beginning.

Alternatives:

[Image: ASK Semantic Search Example]

Generating related search strings for the user has existed in all search engines for more than a decade. Each engine shows a "Related Searches" feature somewhere on its result page; top or bottom, it is all about positioning. So what does Google do that the others do not, and vice versa? When it comes to related keyword suggestions, one search engine that stands out is ASK.

ASK

Depending on its understanding of your phrase, your keywords, and the relations between your words, ASK comes up with suggestions that can expand or narrow your search, and sometimes give you a different line of thought on your keyword selection. So a web searcher can keep looking for the desired results by reviewing related searches rather than thinking up and manually typing the next search string. Searches done from a browser toolbar for ASK have the suggestions segregated into two categories, expand and narrow your keywords (this only happens for me from the browser toolbar search; I do not know why).
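To make the expand/narrow split concrete, here is a minimal sketch of one way such a segregation could work. It is purely illustrative: I do not know how ASK actually does it, and the function name `classify_suggestions` and the word-overlap rule are my own assumptions. A suggestion that keeps every query word and adds more is treated as narrowing; anything else is treated as expanding.

```python
def classify_suggestions(query, suggestions):
    """Split related-search suggestions into 'narrow' and 'expand' buckets.

    Hypothetical rule (not ASK's actual method): compare the sets of words.
    """
    query_terms = set(query.lower().split())
    buckets = {"narrow": [], "expand": []}
    for suggestion in suggestions:
        terms = set(suggestion.lower().split())
        if terms == query_terms:
            continue  # same words as the query, nothing new to suggest
        if query_terms < terms:
            # Keeps every query word and adds more: a narrower search.
            buckets["narrow"].append(suggestion)
        else:
            # Drops or swaps at least one query word: a broader or different search.
            buckets["expand"].append(suggestion)
    return buckets


if __name__ == "__main__":
    print(classify_suggestions(
        "carbon credits",
        ["carbon credits india", "carbon trading", "buy carbon credits"],
    ))
```

A real engine would need stemming and phrase handling on top of this, but the word-set comparison is enough to show the idea.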

"Related Searches", as they call it, are shown on the right-hand side of the result page and get prominent visibility. The positioning of the suggestions and their segregation into expanding and narrowing categories look like swell features to me. I must confess that the suggestions from Google looked more semantic than those from ASK, but ASK currently gives more suggestions and is much more user friendly.

Cluuz

[Image: Query suggestion Cluuz]

The Cluuz engine cannot escape the comparison, as its approach to related keyword suggestion is totally different from the others. To begin with, try out anything in this engine: you will not be disappointed, but neither will there be a surprise. The strength of this engine's query suggestion does not come from clustering or semantics but from its workflow.

Cluuz is a named entity extraction engine, which can extract people, websites, addresses, email addresses and phone numbers from the results and display them separately with each hit. Once you select an extracted entity, it modifies the query string with it and shows a new set of results. This way you can refine your search results by driving through these named entities; it is a slam dunk. I tried out "carbon credits" in this engine, and the image gives an impression of it.
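The workflow described above can be sketched roughly as follows. This is not Cluuz's implementation: the regular expressions, the function names `extract_entities` and `refine_query`, and the sample snippet are all assumptions made for illustration. Only emails, websites and phone numbers are covered, since pulling out people and addresses would need a proper named entity recognizer rather than simple patterns.

```python
import re

# Deliberately simple, illustrative patterns; real-world extraction needs more care.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
URL_RE = re.compile(r"https?://[^\s,;]+|www\.[^\s,;]+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def extract_entities(snippet):
    """Pull emails, websites and phone numbers out of one result snippet."""
    return {
        "emails": EMAIL_RE.findall(snippet),
        "websites": URL_RE.findall(snippet),
        "phones": PHONE_RE.findall(snippet),
    }


def refine_query(query, entity):
    """Build the next query by appending the selected entity as a quoted term."""
    return f'{query} "{entity}"'


if __name__ == "__main__":
    snippet = ("Contact Jane at jane.doe@example.org or call +1 416 555 0100, "
               "see http://example.org/carbon for details")
    entities = extract_entities(snippet)
    print(entities)
    print(refine_query("carbon credits", entities["emails"][0]))
```

Each click on an entity would simply call `refine_query` again with the current query, which is what makes the approach a workflow rather than a semantic technique.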

The others

The majority of other search engines have perfected the art of picking out the important keywords from the list of options; generally these are nouns, proper nouns or noun phrases. Such engines hold taxonomical references for the important words in predefined categories. They do not use any high-end semantic or (maybe) clustering mechanisms. Rather, they consider the probability or weight of your keywords in one category or another and then display all the options in the selected categories. Cuil and Kosmix are examples of such engines, though Cuil provides more categories and has more depth than Kosmix. But for a web searcher, such a broad classification of related keywords is not very useful, as scanning it takes more time than thinking of the related words in one's own head. The value addition comes only when searching on unknown topics or when one is vague about the results one is looking for.
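As a rough illustration of the weight-per-category idea, here is a small sketch. The taxonomy, its weights, and the function names `rank_categories` and `suggest` are invented for this example and are not taken from Cuil, Kosmix or any other engine. Each category is scored by summing the weights of the query words it knows about, and the other keywords of the best-scoring categories are offered as suggestions.

```python
# Hypothetical taxonomy: category -> {keyword: weight of that keyword in the category}
TAXONOMY = {
    "Environment": {"glaciers": 0.9, "climate": 0.9, "carbon": 0.7, "himalayas": 0.3},
    "Geography": {"himalayas": 0.9, "mountains": 0.7, "nepal": 0.6, "glaciers": 0.4},
    "Finance": {"credits": 0.8, "carbon": 0.3},
}


def rank_categories(query, taxonomy=TAXONOMY):
    """Return the categories that match the query, best score first."""
    terms = query.lower().split()
    scores = {}
    for category, weights in taxonomy.items():
        score = sum(weights.get(term, 0.0) for term in terms)
        if score > 0:
            scores[category] = score
    return sorted(scores, key=scores.get, reverse=True)


def suggest(query, taxonomy=TAXONOMY, top_n=2):
    """List the other keywords of the top categories as related suggestions."""
    terms = set(query.lower().split())
    return {
        category: sorted(k for k in taxonomy[category] if k not in terms)
        for category in rank_categories(query, taxonomy)[:top_n]
    }


if __name__ == "__main__":
    print(rank_categories("glaciers himalayas"))
    print(suggest("glaciers himalayas"))
```

The sketch also shows why the output feels broad: everything filed under a selected category is displayed, whether or not it relates to what the searcher actually meant.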

Yahoo and MSN both do "query suggestion", but their output is not interesting at the moment.

Google's approach to semantics:

In one of my earlier blogs**, I analysed and compared semantic engines like Hakia. But Google's latest approach to semantic searching is worlds apart from, and generations behind, theirs. While Hakia tries to understand your phrases and to match your expectations against the source documents semantically, Google gives you the option of changing your keywords semantically, as it cannot match semantically. Certainly Hakia's approach is much better, as it at least makes an effort to understand what is written, but its index, popularity and engineering effort currently do not match Google's. So this new feature is a silver lining for Google searchers: better to have something semantic than nothing.


Note: The search engines reviewed here are the ones capable of finding information on the whole World Wide Web. Engines like Powerset are left out for this reason.

Clustering engines have not been considered for this review because their modus operandi is hierarchical by nature, hence supporting "query suggestion" by default. But providing a semantic replacement for each cluster is not possible, as clusters are generally broken down to the word level.


Reference: Quest for perfect Search Engine --Part1





©2008-2009 Abhishek Mehta All Rights Reserved

All content on this website and in whitepapers released by AbhishekMehta.com is proprietary, reproduction in any form without permission is prohibited.