MapReduce in the Enterprise
Back in April, I discussed MapReduce and its open source implementation, Hadoop (see "Hadoop, MapReduce, Cloudera, EC2, and BI," 14 April 2009). At the time, I said that Hadoop offered exciting possibilities for enterprises to carry out large-scale data analysis and mining. In particular, I noted that the availability of Hadoop on cloud providers' platforms (e.g., Amazon Elastic Compute Cloud [EC2]) made it possible for end-user organizations to experiment with big data projects that might not otherwise be practical. I still find MapReduce and Hadoop exciting. However, to the best of my knowledge, they are still used primarily by Internet companies such as Facebook, Google, and MySpace to optimize their online operations, rather than by more traditional enterprises looking for a way to support their data analysis capabilities. There are several reasons for this, which I'll get to shortly. But first, a quick review of MapReduce and Hadoop.