Quest for the Perfect Search Engine: Part 2
Mon Feb 09 00:00:00 IST 2009, by Abhishek Mehta
Search for "apple" and you will be rewarded with millions of web fruits. There are many kinds of apples, in many places in the world. Or rather, all kinds of apples exist in all parts of the world, differentiated by color, role, age, usability, taste, performance and, of course, context. That sentence isn't as weird as it sounds: we all know Apple is in electronics, you love to eat an apple, and an Adam's apple isn't named after every individual. So should we read the results for every apple to get to the right one, extend our query with a word like "microchip", or try something that automatically clusters the entire set of web results into visually distinct result sets?
The quest for the perfect search engine continues from Part 1. In the previous post of this series, I looked at interesting engines like Powerset, Hakia, Kosmix and Cluuz. In this post I will discuss another branch of engines, broadly classified as "clustering engines". So, let's see what exactly is a
Clustering Engine
"Clustering is the act of grouping into clusters. A cluster is a grouping of things that occur together. A clustering engine is a software application that can classify a heterogeneous result set into smaller but more coherent, homogeneous subsets."
Clustering engines group search results into categories based on text similarity, word frequency and the proximity of the phrases and words found in the result documents. Rather than trying to map the user's keywords to the results semantically, or just literally, clustering engines employ the statistical methods of text mining. They use predefined taxonomies and vocabularies to group results into clusters, and name those clusters for the user based on weighted graphs (mathematically speaking) generated internally. Here are a few good clustering engines to try.
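To make the idea of grouping by word overlap a little more concrete, here is a minimal sketch in Python. It is not how any of the engines below actually work: the stop-word list, the threshold, the sample snippets and all function names are my own assumptions, and a greedy Jaccard-overlap pass stands in for the far richer statistics a real engine would use.

```python
import re
from collections import Counter

# Hypothetical stop-word list, chosen only for this illustration.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "for", "is", "on", "with"}

def tokenize(text, query):
    """Lower-case the text and return its set of words, minus stop words and the query."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if w not in STOP_WORDS and w != query}

def jaccard(a, b):
    """Share of words two sets have in common (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def cluster_snippets(snippets, query, threshold=0.15):
    """Greedy single pass: a snippet joins the first cluster whose vocabulary
    overlaps enough with it, otherwise it starts a new cluster."""
    clusters = []  # each cluster is {"vocab": set of words, "members": list of snippets}
    for snippet in snippets:
        tokens = tokenize(snippet, query)
        for cluster in clusters:
            if jaccard(tokens, cluster["vocab"]) >= threshold:
                cluster["members"].append(snippet)
                cluster["vocab"] |= tokens
                break
        else:
            clusters.append({"vocab": set(tokens), "members": [snippet]})
    return clusters

def label(cluster, query):
    """Name a cluster after its most frequent word (ties broken alphabetically)."""
    counts = Counter()
    for member in cluster["members"]:
        counts.update(tokenize(member, query))
    if not counts:
        return query
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[0][0]

# Made-up result snippets for the query "apple".
snippets = [
    "Apple iPhone and Mac computers",
    "Apple pie recipe with fresh fruit",
    "Apple Mac laptop computers review",
    "Fruit salad with apple and pear",
]
clusters = cluster_snippets(snippets, "apple")
for cluster in clusters:
    print(label(cluster, "apple"), "->", cluster["members"])
```

With these made-up snippets, the electronics results and the food results land in two separate clusters. The query word itself is dropped before comparing, since it appears in every result and says nothing about which "apple" a page means.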
Kartoo is my personal pick in the visual clustering engine category. Being a fan of visual search, I must say it has a nice Flash-based layout with a balanced use of content and display. Once you have reached some kind of conclusion from your research on Kartoo, you might like to preserve the steps that went into finding the relevant information. So user-friendly features such as saving, loading and printing your visual map, and zooming in and out of clusters with a click, come in handy. You don't have to log in or log out to use these features.
A non-visual clustering view is also available at this link: non-visual-kartoo. It is impressive for its ability to include and exclude clusters to get specific views of the results.
Kartoo is a meta search engine, meaning it combines results from various sources. MSN and Yahoo are among the search engines behind Kartoo's nicely clustered results.
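For readers wondering what "combining results" might look like, here is a small Python sketch of merging ranked lists from several engines. This is purely an assumption on my part, not Kartoo's actual method: the engine names, URLs and the 1/rank scoring rule are all hypothetical.

```python
from collections import defaultdict

def merge_results(results_by_engine):
    """Combine ranked URL lists from several engines into one ranking.

    A URL earns 1/rank from every engine that returns it, so pages that
    several engines rank highly float to the top. Duplicates collapse
    because scores are keyed by URL.
    """
    scores = defaultdict(float)
    for urls in results_by_engine.values():
        for rank, url in enumerate(urls, start=1):
            scores[url] += 1.0 / rank
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda url: (-scores[url], url))

# Hypothetical engines and URLs, for illustration only.
results = {
    "engine_a": ["example.com/apple-inc", "example.org/apple-fruit", "example.net/adams-apple"],
    "engine_b": ["example.org/apple-fruit", "example.net/adams-apple"],
}
merged = merge_results(results)
print(merged)
```

A meta search engine would then run its clustering over the merged list rather than over any single engine's results.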
Clusty, as the name suggests, is a clustering engine (meta search) from a company named Vivisimo, which provides enterprise, federated and clustering search solutions to its customers. Since Clusty is a free initiative for the personal use of web searchers, you can use it to boost your productivity without being sued.
The sources of Clusty's clusters are engines like live.com, ask.com, Yahoo News and the Open Directory. It can cluster web results (obviously), and also Wikipedia, blogs, news, jobs and images. Other interesting clusters can be formed based on the search engine used and the type of website (.com, .net) the results were gathered from.
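Grouping by type of website is the easiest of these clusters to picture. The Python sketch below buckets result URLs by the last label of their host name; the function name and the sample URLs are hypothetical, and this is only a guess at the idea, not Clusty's implementation.

```python
from collections import defaultdict
from urllib.parse import urlparse

def group_by_site_type(urls):
    """Bucket URLs by the last label of their host name (.com, .net, ...)."""
    groups = defaultdict(list)
    for url in urls:
        host = urlparse(url).hostname or ""
        # Everything after the final dot; hosts without a dot go to "(unknown)".
        suffix = "." + host.rsplit(".", 1)[-1] if "." in host else "(unknown)"
        groups[suffix].append(url)
    return dict(groups)

# Hypothetical result URLs, for illustration only.
urls = [
    "http://www.example.com/apple-store",
    "http://recipes.example.net/apple-pie",
    "http://example.com/apple-history",
    "http://example.org/adams-apple",
]
by_type = group_by_site_type(urls)
for suffix, members in by_type.items():
    print(suffix, len(members))
```

The same bucketing works for the "search engine used" cluster: key each result by the engine that returned it instead of by its host suffix.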
"From Russia with Love" comes Quintura. This is one of the most effective clustering engines, with a very beautiful graphical user interface. According to one of its patents, Quintura does context-based search visualization and context management using neural networks. You can cluster web, images, videos and Amazon.
Quintura is also a meta search engine, relying mainly on Yahoo's index for its clusters and on its own patented technology (7,437,370) for displaying the cluster cloud. It has a very nice user interface with mouse-over cluster expansion and contraction. Saving and reloading the cluster map are among the features provided for your results. Quintura is definitely one of the finest clustering engines available on the web, with very high ratings.
Here is a list of some other interesting clustering engines. Their order is alphabetical rather than based on features, usability or recommendation by www.abhishekmehta.com:
Clustering vs Semantics in a Nutshell:
Clustering engines fall short of semantic engines in language processing, context understanding, polysemy, synonymy, vernacular, capturing negations and ontology. They rely more on the syntactic constructs of language, pattern recognition, phrase proximity and latent semantic analysis (LSA) than on venturing into the semantic aspects of context understanding. Since there is currently no accurate semantic search engine in sight, we can fall back on clustering engines for some more years to come.
In Part 3 of this series I will look into the worlds of Google, Yahoo and MSN and their efforts to stay attractive in the future, plus some other catchy efforts.
©2008-2009 Abhishek Mehta All Rights Reserved
All content on this website and in whitepapers released by AbhishekMehta.com is proprietary, reproduction in any form without permission is prohibited.
