
Search Engines: The latest fashion
Tue Jun 02 12:00:00 IST 2009, by Abhishek Mehta

So, what is the trend this summer?

Short hair, a retro style comeback, John Lennon style round glasses, economy bashing, Taliban crashing, social networking, Web 3.0 and, last but not least, new search engines.

[Image: Different search engines]

Yes, search engines. The driving force behind this hysteria is the quest for the next Google killer and the ever-growing size of the information retrieval market. The bounty is so huge that the launch of a new search engine has become a bimonthly affair. Even if the small launches pass you by, there are a couple of big ones every year that you cannot ignore. The media makes exorbitant claims, bloggers fill up the web with their reviews, and the owners of the engines claim to have rediscovered the Holy Grail. The Cuil wave arrived in July last year and was not worth surfing. Then Google made its query suggestions semantic in March 2009. Wolfram Alpha, billed as a computational knowledge engine, is making headlines, and the world's wait for the launch of Bing from Microsoft is also over. I hope these new entrants do well for themselves, but so far the innovations in this area have been disappointing.

Sagoon:

Let me cool down and tell you what triggered my cynical reaction to this search engine fashion. It was the launch of yet another engine, named Sagoon. Its search is based on the Yahoo BOSS service. Since it is still in beta, this is what they promise for the future:

"We are also in the process of adding unique features by developing semantic search and natural language processing. Our artificial intelligence technology, moreover, assesses user needs and demands in order to provide them the best quality content that they are looking for." **

YAHOO BOSS Effect:

I respect the entrepreneurial spirit of the people who have taken the Yahoo BOSS service as their search index base in order to provide innovative search solutions. They can channel their energy into better search results and new features rather than worrying about big server farms for index storage and retrieval, high staff costs and constantly running web crawlers to gather information. It is not just Sagoon; there are scores of such engines built on top of the Yahoo search index.

These engines have hit the market, but unfortunately none of them is a hit in the market. In this post I try to explore the reasons behind their lack of success. To begin with, there is no doubt that Yahoo, which has opened its search index to the developer community for a nominal fee, does not have the best index in the world. Keeping that fact aside, I tested six different search engines that use the Yahoo search index as their underlying technology and do their own juggling to show some "effective" search results. My test suite covers three branches of search engines: semantic engines (Hakia), clustering engines (Quintura, Carrot-search) and others (Cluuz, Sagoon, Snap), with Yahoo itself as the obvious benchmark. I ran a positive test with the following keywords.

Test Cases 1: H2O -- 2: Internet -- 3: "Latest job hunting mantras"

The first case was chosen because there are not many ways to interpret the chemical formula of water. All six engines showed identical results, some with the result order changed marginally. The Yahoo search engine itself shows results no different from these engines. The only difference was in how the results are presented to the web searcher. The clustering engines cluster the results into categories, as expected of them. Hakia did not perform any better despite being a semantic engine (I don't know how they represent a chemical formula in their ontology). Carrot-search stood out marginally because by default it combines results from Wikipedia and MSN with those from Yahoo. If we go to the Yahoo cluster on Carrot-search, though, it falls into the same category as the rest.

[Image: Hakia result for internet]

For the second keyword, all engines showed the same results except Hakia. Hakia did an ontological analysis of the word Internet and showed the results in a more categorized way, as shown in the image on the right. For the search phrase "Latest job hunting mantras", the results of all search engines were again the same except for Hakia.

Keeping Hakia aside, all the engines in a nutshell have more or less similar results, similar result ordering, different presentation, different navigation styles and similar categorizations. One engine scores over another on only a few features, which are mainly related to presentation rather than the quality of the results. If I am a guy who is happy with vanilla search results rather than bling-bling, then there is no need to make a transition. The case is certainly not so gloomy, however, for people who want some adventure.
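If you want to put a number on "more or less similar results, similar result ordering", a rough sketch like the one below may help. It is not the procedure I used for the tests above (those were compared by eye). The function names and the example URLs are made up for illustration, and it assumes you have already collected the top result URLs from each engine by hand or with your own script.

```python
def jaccard(results_a, results_b):
    """Share of URLs that two result lists have in common (0.0 to 1.0)."""
    set_a, set_b = set(results_a), set(results_b)
    if not set_a and not set_b:
        return 1.0  # two empty lists are trivially identical
    return len(set_a & set_b) / len(set_a | set_b)


def same_order_ratio(results_a, results_b):
    """Fraction of shared URL pairs that keep the same relative order.

    Assumes each list contains no duplicate URLs.
    """
    position_b = {url: i for i, url in enumerate(results_b)}
    shared = [url for url in results_a if url in position_b]
    pairs = agree = 0
    for i in range(len(shared)):
        for j in range(i + 1, len(shared)):
            pairs += 1
            if position_b[shared[i]] < position_b[shared[j]]:
                agree += 1
    return agree / pairs if pairs else 1.0


# Hypothetical top results for one query (not real data).
benchmark = ["a.example/water", "b.example/h2o",
             "c.example/chemistry", "d.example/molecule"]
other_engine = ["b.example/h2o", "a.example/water",
                "c.example/chemistry", "e.example/drink"]

overlap = jaccard(benchmark, other_engine)
order = same_order_ratio(benchmark, other_engine)
print(f"overlap={overlap:.2f} order={order:.2f}")
```

For these made-up lists the overlap is 0.6 (three shared URLs out of five distinct ones) and the order agreement is about 0.67. Values close to 1.0 on both would match the "same results, marginally different order" picture described above.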

The difference in search engines:

My test made it totally clear that most engines show the same results because their underlying index comes from Yahoo. The result ordering is more or less the same as what you will find in the Yahoo search engine itself. Now let us take up, one by one, the main features that the creators of these engines think make them stand out.

Clustering engines like Quintura and Carrot-search have similar results, but their clusters differ. Once you get into any of the clusters, the segregation of results starts taking place. The clustering approach is more focused and hence narrower. Clustering engines group the search results into categories based on the similarity of the text, word frequency and the proximity of the phrases or words found in the resulting documents. Rather than trying to map the user's keywords to the results semantically, or just literally, clustering engines employ the statistical methods of text mining.
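To give a feel for what "grouping by word frequency" can look like, here is a toy sketch. It is only an illustration of the general idea and not the algorithm used by Quintura or Carrot-search. The stopword list, the 0.3 threshold, the function names and the snippets are all assumptions made up for this example.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a standard one).
STOPWORDS = {"the", "a", "of", "and", "in", "for", "to", "is", "on"}


def term_frequencies(text):
    """Count the lowercase words in a snippet, ignoring stopwords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine(u, v):
    """Cosine similarity between two word-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0


def cluster(snippets, threshold=0.3):
    """Greedy single-pass clustering: each snippet joins the most similar
    existing cluster, or starts a new one if nothing reaches the threshold."""
    clusters = []
    for index, text in enumerate(snippets):
        tf = term_frequencies(text)
        best, best_sim = None, threshold
        for candidate in clusters:
            sim = cosine(tf, candidate["centroid"])
            if sim >= best_sim:
                best, best_sim = candidate, sim
        if best is None:
            clusters.append({"centroid": Counter(tf), "members": [index]})
        else:
            best["centroid"].update(tf)  # add word counts to the cluster profile
            best["members"].append(index)
    return [c["members"] for c in clusters]


# Made-up result snippets for an ambiguous query.
snippets = [
    "Water molecule H2O chemistry formula",
    "H2O water chemistry properties",
    "H2O wireless broadband internet provider",
    "Internet broadband provider plans",
]
groups = cluster(snippets)
print(groups)
```

With these made-up snippets the first two land in one group and the last two in another, even though three of them mention H2O. A real engine would also use phrase proximity and would label each cluster, which this sketch leaves out.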

Cluuz is a named entity extraction engine. It can extract people, websites, addresses, email addresses and phone numbers from the results and display them separately with each hit. Once you select an extracted entity, it modifies the query string with that entity and shows a new set of results. This is its added value for web searchers.
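The extract-then-refine loop can be sketched in a few lines. This is a simplified illustration and not how Cluuz is actually implemented. The regular expressions are rough assumptions that only cover emails, phone numbers and websites; people and postal addresses would need a proper named entity recognizer. The function names and the sample snippet are hypothetical.

```python
import re

# Deliberately simple patterns, good enough for a demo only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")
SITE_RE = re.compile(r"\b(?:https?://|www\.)[^\s,;]+")


def extract_entities(snippet):
    """Pull emails, phone numbers and websites out of one result snippet."""
    return {
        "emails": EMAIL_RE.findall(snippet),
        "phones": PHONE_RE.findall(snippet),
        "websites": SITE_RE.findall(snippet),
    }


def refine_query(query, entity):
    """Append the selected entity to the query as an exact phrase."""
    return f'{query} "{entity}"'


# Made-up snippet for illustration.
snippet = ("Contact Jane at jane@example.com or +1 555 010 9999, "
           "see www.example.com for details.")
entities = extract_entities(snippet)
refined = refine_query("water treatment", entities["emails"][0])
print(entities)
print(refined)
```

Selecting the extracted email address here turns the query "water treatment" into one that also requires the exact phrase "jane@example.com", which mirrors the click-to-narrow behaviour described above.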

Hakia is a vanilla semantic search engine, which lives on the power of its semantic algorithms. It is in beta and shows relevant results for basic questions and simple phrases. If this technology comes out of beta, it has a chance to make it big on the web. Otherwise it will remain in the same league as the others.

Snap is based on the idea of showing a big enough screenshot of the result page before you actually open it. Navigating the previews (results) is done in a very user-friendly and innovative way. Once the user starts looking at the previews and the actual results, navigation becomes as easy as pressing the up and down arrow keys on the keyboard.

Sagoon in its beta avatar is nothing more than Yahoo's results under a different domain name.

Bottom-Line

The search engines discussed here were started with some business goal in mind. Some are showcases for their companies' clustering technologies and others are aimed at domain verticals. Yahoo BOSS appears to be an attractive option, but it is equally attractive to all of them. In the end this boils down to who has it in him to make a dent in Google's market share. The question is still unanswered, with one certainty: it is not Yahoo. But the Yahoo index can make a huge impact if used by bright guys in the right manner. The focus has to be on the quality of the results rather than on navigational features, or on one or two interesting features such as entity extraction followed by drawing correlations within the results, which not many can understand anyway.

There is room only for consolidation in this space, not for newer engines based on the same technology. It is good to have a named entity extraction feature like that of Cluuz, but the Snap preview is not bad either, and the ability to cluster the results is not bad again. The problem is that these features exist in separate search engines and not in one. Entrepreneurs who want to make the cut in the search engine world should build an engine that has all these features, customizable by the web searcher, more or less in one place. Consolidate rather than going individual ways.

The future has a place only for semantic engines, but they are not in sight right now. Until that happens there will be a market for engines with lots of rich features, rather than lots of engines with one rich feature each.


Note:

If you perform these tests yourself, the results may differ based on your geographical location. So before running the tests, remember to set the same geographical location for all the engines mentioned.


Reference:

About Sagoon 





©2008-2009 Abhishek Mehta All Rights Reserved

All content on this website and in whitepapers released by AbhishekMehta.com is proprietary, reproduction in any form without permission is prohibited.