Apache Hadoop promises "a software platform that lets one easily write and run applications that process vast amounts of data". Sure enough, in the documentation, descriptions like
(input) <k1, v1> -> map -> <k2, v2> -> combine -> <k2, v2> -> reduce -> <k3, v3> (output)
are simple enough to read and understand. But how do you apply MapReduce to a problem you face in a real-life project?
This blog tries to give some insight into how to apply MapReduce with Hadoop.
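To make the data flow quoted above concrete, here is a minimal in-memory sketch of the map -> combine -> reduce stages, using word counting as the example problem. It is plain Python, not Hadoop's API: the names map_fn, reduce_fn, group_by_key and run_job are made up for illustration, the input splits are invented sample data, and the grouping step only stands in for the shuffle that Hadoop performs between map and reduce.

```python
from collections import defaultdict
from itertools import chain


def map_fn(_offset, line):
    """map: <k1, v1> -> list of <k2, v2>. Emits (word, 1) for every word."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_fn(word, counts):
    """reduce: <k2, [v2]> -> <k3, v3>. Sums the counts for one word.

    Summation is associative and commutative, so the same function can
    also serve as the combiner in this sketch.
    """
    return (word, sum(counts))


def group_by_key(pairs):
    """Stand-in for the shuffle: collect all values that share a key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def run_job(splits):
    """Run map and combine per input split, then one global reduce."""
    combined = []
    for split in splits:
        # map: each line (v1) is keyed by its position in the split (k1)
        mapped = chain.from_iterable(
            map_fn(offset, line) for offset, line in enumerate(split)
        )
        # combine: pre-aggregate locally so fewer pairs reach the reducer
        combined.extend(
            reduce_fn(word, counts)
            for word, counts in group_by_key(mapped).items()
        )
    # reduce: merge the partial counts from all splits
    return dict(
        reduce_fn(word, counts)
        for word, counts in group_by_key(combined).items()
    )


# Hypothetical input: two splits, as if two mappers each read one part.
splits = [
    ["the quick brown fox", "the lazy dog"],
    ["the fox jumps"],
]
word_counts = run_job(splits)
```

The combine step is optional for correctness here; it only reduces how many intermediate pairs have to be moved before the final reduce, which is the reason it appears as a separate stage in the pipeline.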