Bindu Reddy
FriendFeed
posted a message
“Interesting: Cuil's category refinements all come from one source - Wikipedia”
July 29 at 12:46 pm - Link
Evidence? (Not that I don't believe you I just want to see it for myself) - Erica Baker
Type in any search query e.g. Sergey Brin - http://tinyurl.com/5skbqw. You will see "Business People in Software" as one of the categories. Open up the category box and you will see [David Filo] listed. Now go to David Filo's page on Wikipedia and at the bottom you will see the category - "Business People in Software" as one of the categories [it is near the bottom of the entry]. You can trace every category back to Wikipedia this way. - Bindu Reddy
It is pretty clever actually... They got their ontology from Wikipedia: Nodes and Categories, and then they tagged each of the pages with the Nodes that appear on the page. When they run a query, they compute a dynamic histogram of the nodes in the result set to get the final category structure. Now this works most of the time. However, it fails when a result page is somewhat less relevant to the query. For example: the query Obama sometimes returns "Hispanic American Politicians"... - Bindu Reddy
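The histogram idea described above can be sketched roughly as follows. This is only an illustration of the approach as guessed at in this thread, not Cuil's actual code: the tables `NODE_CATEGORIES` and `PAGE_NODES`, the page ids, and the function `category_histogram` are all hypothetical, and the sample data is made up.

```python
from collections import Counter

# Hypothetical ontology: node (entity) -> Wikipedia-style categories.
NODE_CATEGORIES = {
    "Sergey Brin": ["Business People in Software"],
    "David Filo": ["Business People in Software"],
    "Larry Page": ["Business People in Software", "Google Employees"],
}

# Hypothetical index: page id -> nodes that were tagged on that page.
PAGE_NODES = {
    "page1": ["Sergey Brin", "Larry Page"],
    "page2": ["Sergey Brin", "David Filo"],
    "page3": ["Larry Page"],
}


def category_histogram(result_pages, top_n=5):
    """Count the categories of every node tagged on the result pages."""
    counts = Counter()
    for page in result_pages:
        for node in PAGE_NODES.get(page, []):
            # Each tagged node votes once for each of its categories.
            counts.update(NODE_CATEGORIES.get(node, []))
    # The most frequent categories become the refinement suggestions.
    return counts.most_common(top_n)


if __name__ == "__main__":
    print(category_histogram(["page1", "page2"]))
```

Because every page in the result set votes, a loosely relevant page contributes its nodes' categories just like a good one, which would be consistent with the odd category seen for the Obama query.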
I was wondering how long it would take for people to figure this out. :) The first non-Google person that noticed was the Google OS blog, which said "Another interesting idea is the explorative category section that shows related Wikipedia categories and topics." I wrote a personal blog post about how Cuil generates their categories and then decided not to hit the publish button. - Matt Cutts
Very interesting, especially since Wikipedia is very weakly structured (I wouldn't dare to call it an ontology). It would be interesting if they chose Freebase or DBpedia to drive the related information bits. - Deepak
This would be good for Wikipedia, as it gives people some reward for structuring the wikipedia content, like adding categories, etc. For example, I could envision the BioGang enriching the categories in bioinformatics sciences. - Egon Willighagen
Egon, agree. Since Wikipedia drives content, any structured info (categories, infoboxes, etc.) only makes things more powerful. Actually, infobox-driven information would be very cool (that's the part Powerset does well when it's pulling in info from Freebase). The problem with Cuil is that the initial results can be so off that the rest becomes somewhat meaningless. - Deepak