Todd Suomela
Announcing the Public Terabyte Dataset project « Elastic Web Mining | Bixolabs - http://bixolabs.com/2009...
November 3
from
delicious
We’re very excited to announce the Public Terabyte Dataset project. This is a high-quality crawl of top web sites, using AWS’s Elastic MapReduce, Concurrent’s Cascading workflow API, and Bixolabs’ elastic web mining platform. - Todd Suomela