Ingesting data from MongoDB using PySpark
Although creating and then ingesting the data ourselves may seem contrived, this exercise mirrors real-life projects. Data professionals are often involved in architectural decisions such as choosing the type of database, helping other engineers feed application data into a database server, and later ingesting only the information relevant to dashboards or other analytical tools.
So far, we have created and evaluated our server and then created collections inside our MongoDB instance. With all this preparation complete, we can now ingest the data using PySpark.
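As a preview of what the ingestion looks like, the sketch below reads one MongoDB collection into a Spark DataFrame with the MongoDB Spark Connector (version 10.x, which registers the `mongodb` data source). The host, port, database, and collection names are placeholders, not the ones used in this book's recipes; adjust them to match your own server. The Spark call is wrapped in a function that is only executed against a live MongoDB instance.

```python
def build_uri(host: str, port: int, database: str) -> str:
    """Compose a standalone MongoDB connection URI (no authentication)."""
    return f"mongodb://{host}:{port}/{database}"


def read_collection(uri: str, database: str, collection: str):
    """Read one MongoDB collection into a Spark DataFrame.

    Call this only with a MongoDB server running; it is not executed here.
    """
    # Deferred import so build_uri() stays usable without PySpark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("mongodb-ingest")
        # Fetch the MongoDB Spark connector from Maven on startup.
        .config("spark.jars.packages",
                "org.mongodb.spark:mongo-spark-connector_2.12:10.1.1")
        .config("spark.mongodb.read.connection.uri", uri)
        .getOrCreate()
    )
    return (
        spark.read.format("mongodb")
        .option("database", database)
        .option("collection", collection)
        .load()
    )


# Hypothetical names -- replace with your own server and database.
uri = build_uri("127.0.0.1", 27017, "holiday_data")
# df = read_collection(uri, "holiday_data", "holidays")  # needs a live server
```

Pinning the connector version in `spark.jars.packages` keeps the job reproducible; Spark downloads the JAR automatically the first time the session starts.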
Getting ready
This recipe requires that you first complete the Creating our NoSQL table in MongoDB recipe, since it inserts the data we will ingest here. Alternatively, you can create and insert other documents into the MongoDB database and use them instead; if you do, make sure you set the appropriate configurations so the code runs properly.
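If you prefer to insert your own documents rather than reuse the earlier recipe's data, a minimal sketch with pymongo follows. The database and collection names, and the document shape, are illustrative assumptions; the insertion itself is kept in an uncalled function so the snippet does not require a running server.

```python
def make_docs(records):
    """Shape raw (name, date) tuples into MongoDB documents."""
    return [{"name": name, "date": date} for name, date in records]


def insert_docs(uri: str, docs: list) -> None:
    """Insert documents into a hypothetical database/collection.

    Call this only with a MongoDB server running; it is not executed here.
    """
    from pymongo import MongoClient  # deferred so make_docs() works standalone

    client = MongoClient(uri)
    # Hypothetical names -- match whatever you configure in PySpark later.
    client["holiday_data"]["holidays"].insert_many(docs)


docs = make_docs([("New Year", "2024-01-01"), ("Labor Day", "2024-05-01")])
# insert_docs("mongodb://127.0.0.1:27017", docs)  # uncomment with a live server
```

Whatever document structure you choose, keep it consistent across the collection so that Spark can infer a clean schema when it reads the data back.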
Also, as in the Creating our NoSQL table in MongoDB recipe...