Introduction
Hadoop is a well-known open source software to process large datasets. It also has an API for the MapReduce programming model, which is widely used. Nearly all the big data solutions have some sort of support to integrate them with Hadoop in order to use its MapReduce framework. MongoDB has a connector as well that integrates with Hadoop and lets us write MapReduce jobs using the Hadoop MapReduce API, process the data residing in the MongoDB/MongoDB dumps, and write the result to the MongoDB/MongoDB dump files. In this chapter, we will look at some recipes about the basic MongoDB and Hadoop integration.