Understanding sharding and its components
In the previous chapter, we saw how MongoDB provides high availability using replica sets. Replica sets also allow read queries to be distributed across secondaries, which spreads some of the read load across a cluster of nodes. We have also seen that MongoDB performs best when its working set fits in memory, keeping disk operations to a minimum. However, as databases grow, it becomes harder to provision servers that can hold the entire working set in memory. This is one of the most common scalability problems that growing organizations face.
To address this, MongoDB provides sharding of collections. Sharding divides a collection's data into smaller chunks and distributes those chunks across multiple machines.
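To make the idea of chunks concrete, here is a minimal Python sketch of range-based partitioning. It is a toy model and not MongoDB's actual implementation. The split points, the shard names (shard_a, shard_b), the user_id key field, and the helper functions are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right

# Hypothetical split points on the shard key (a numeric user_id here).
# They define four chunks: (-inf, 1000), [1000, 2000), [2000, 3000), [3000, +inf).
SPLIT_POINTS = [1000, 2000, 3000]

# Hypothetical assignment of those four chunks to two shards.
CHUNK_TO_SHARD = ["shard_a", "shard_b", "shard_a", "shard_b"]


def chunk_for(key):
    """Return the index of the chunk whose key range contains `key`."""
    return bisect_right(SPLIT_POINTS, key)


def shard_for(key):
    """Return the name of the shard that owns the chunk containing `key`."""
    return CHUNK_TO_SHARD[chunk_for(key)]


def distribute(docs, key_field="user_id"):
    """Group documents by the shard that would store them."""
    placement = {}
    for doc in docs:
        placement.setdefault(shard_for(doc[key_field]), []).append(doc)
    return placement


if __name__ == "__main__":
    docs = [{"user_id": 5}, {"user_id": 1500}, {"user_id": 2999}]
    for shard, stored in sorted(distribute(docs).items()):
        print(shard, [d["user_id"] for d in stored])
```

In this sketch each machine holds only the chunks assigned to it, so no single server has to keep the whole dataset, or its whole working set, in memory. A real cluster also has to track which chunk lives where and move chunks between machines as data grows, which is the job of the components described next.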
Components of MongoDB sharding infrastructure
Unlike replica sets, a sharded MongoDB cluster consists of multiple components.
Config server
The config server is used to store metadata about the sharded cluster. It contains details about...