Implementing horizontal partitioning or sharding
Let's explore sharding from two different perspectives: a dedicated SQL pool and Spark. Note that we use the terms horizontal partitioning and sharding interchangeably throughout the book; they mean the same thing.
Sharding in Synapse dedicated pools
Synapse SQL dedicated pools have three different types of tables based on how the data is stored, outlined as follows:
- Clustered columnstore
- Clustered index
- Heap
We will learn more about these table types later in this chapter. Synapse dedicated pools support sharding for all these table types, and they provide three different ways to shard the data, as follows:
- Hash
- Round-robin
- Replicated
These methods, through which a SQL dedicated pool spreads a table's rows across its distributions, are also called distribution techniques. Sharding and distribution techniques are overlapping concepts that are always specified together...
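To make the three distribution techniques more concrete, here is a minimal Python sketch of how each one would assign rows. It is a conceptual simulation only, not how a dedicated SQL pool is implemented: the function names, the MD5-based hash, the sample columns (trip_id, customer_id), and the small shard count of 4 are all assumptions chosen for illustration.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical small value, chosen only to keep the example readable


def stable_hash(value):
    """Deterministic hash of a value (assumption: MD5 stands in for the real hash)."""
    digest = hashlib.md5(str(value).encode("utf-8")).hexdigest()
    return int(digest, 16)


def hash_distribute(rows, key, num_shards=NUM_SHARDS):
    """Hash: rows with the same key value always land in the same shard."""
    shards = defaultdict(list)
    for row in rows:
        shards[stable_hash(row[key]) % num_shards].append(row)
    return dict(shards)


def round_robin_distribute(rows, num_shards=NUM_SHARDS):
    """Round-robin: rows are dealt out in turn, ignoring their contents."""
    shards = defaultdict(list)
    for position, row in enumerate(rows):
        shards[position % num_shards].append(row)
    return dict(shards)


def replicate(rows, num_copies=NUM_SHARDS):
    """Replicated: every location receives a full copy of the table."""
    return {copy_id: list(rows) for copy_id in range(num_copies)}


# Hypothetical sample data: 8 trips belonging to 3 customers.
rows = [{"trip_id": i, "customer_id": i % 3} for i in range(8)]

hashed = hash_distribute(rows, key="customer_id")
dealt = round_robin_distribute(rows)
copied = replicate(rows)

for name, layout in (("hash", hashed), ("round-robin", dealt), ("replicated", copied)):
    sizes = {shard: len(contents) for shard, contents in sorted(layout.items())}
    print(name, sizes)
```

The sketch highlights the trade-off between the techniques: hash keeps related rows together at the risk of uneven shard sizes, round-robin gives even shard sizes but scatters related rows, and replication avoids moving data at query time at the cost of storing the table several times.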