All About Partitioning In Azure Cosmos DB

This Azure tutorial explains what partitioning is in Azure Cosmos DB and how it works.

  • Azure Cosmos DB allows you to store a huge amount of data.
  • With that much data, query performance degrades if the data is not organized properly.
  • Azure Cosmos DB addresses this with partitioning: the items in a container are grouped into distinct subsets called partitions.

What is Partition Key?

  • The partition key plays an important role here. Each item in a container has a partition key value, and the partitions (subsets) are formed based on that value. Every item in a container uses the same partition key property; only the values differ from item to item.
  • The partition key is a JSON property that is responsible for distributing data among different partitions.
  • Behind the scenes, the partition key value decides which partition each document is placed in.
  • Once you set the partition key for a container, you cannot change it in place.
  • Choose a partition key that has many distinct values.

Partition Key Components

  • The partition key has two components, and those are Path and Value.
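
To make the two components concrete, here is a minimal sketch in plain Python (no Azure SDK involved). The helper `partition_key_value` and the sample item are hypothetical; they only illustrate how a path such as /employeeId selects the value from an item.

```python
def partition_key_value(item, path):
    """Return the value found at a partition key path such as '/address/city'."""
    value = item
    # Walk the path one segment at a time, e.g. ['address', 'city'].
    for part in path.strip("/").split("/"):
        value = value[part]
    return value


# Hypothetical item, used only for illustration.
item = {"id": "1", "employeeId": "E-100", "address": {"city": "Pune"}}

print(partition_key_value(item, "/employeeId"))    # E-100
print(partition_key_value(item, "/address/city"))  # Pune
```

The path is defined once for the container, while the value is read from each item.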

Logical and Physical Partitions

Azure Cosmos DB partitioning involves two types of partitions:

  1. Logical Partition
  2. Physical Partition

Logical Partition

  • The items inside a container are divided into logical partitions.
  • A logical partition is formed from the partition key value of the items in the container.
  • All items in a logical partition share the same partition key value.
  • Currently, a logical partition can store up to 20 GB of data.
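
The grouping described above can be sketched in plain Python. This is only an in-memory illustration, not how the service stores data; the function name and the sample items are assumptions.

```python
from collections import defaultdict


def group_into_logical_partitions(items, key_property):
    """Group items by partition key value; each group models one logical partition."""
    partitions = defaultdict(list)
    for item in items:
        partitions[item[key_property]].append(item)
    return dict(partitions)


# Hypothetical items that use employeeId as the partition key property.
items = [
    {"id": "1", "employeeId": "E-100", "type": "profile"},
    {"id": "2", "employeeId": "E-100", "type": "payslip"},
    {"id": "3", "employeeId": "E-200", "type": "profile"},
]

logical = group_into_logical_partitions(items, "employeeId")
for key, members in logical.items():
    print(key, [m["id"] for m in members])
```

Items 1 and 2 land in the same logical partition because they share the value E-100.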

Physical Partition

  • Behind the scenes, logical partitions are mapped to physical partitions. One or more logical partitions are mapped to a single physical partition.
  • Each physical partition can store up to 50 GB of data and provide a throughput of up to 10,000 RU/s.
  • When a physical partition approaches the 50 GB limit, Azure Cosmos DB splits it and redistributes its logical partitions across the new physical partitions.
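
Using the two limits above, you can make a rough estimate of the minimum number of physical partitions a container needs. The sketch below is a back-of-the-envelope calculation under those stated limits; the service decides the actual count, and the function name is hypothetical.

```python
import math

MAX_STORAGE_GB = 50      # storage limit per physical partition (from the text above)
MAX_RU_PER_SEC = 10_000  # throughput limit per physical partition (from the text above)


def estimate_physical_partitions(storage_gb, provisioned_ru_per_sec):
    """Return a lower-bound estimate of the physical partitions required."""
    by_storage = math.ceil(storage_gb / MAX_STORAGE_GB)
    by_throughput = math.ceil(provisioned_ru_per_sec / MAX_RU_PER_SEC)
    return max(1, by_storage, by_throughput)


print(estimate_physical_partitions(120, 8_000))   # 3, storage is the limiting factor
print(estimate_physical_partitions(20, 35_000))   # 4, throughput is the limiting factor
```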

How Does Partition Work Exactly?

  • Initially, Azure creates one default partition.
  • When you insert a new document, Azure Cosmos DB uses the partition key value to decide which partition stores the item.
  • As more items arrive and a physical partition reaches its maximum size, Azure creates a new physical partition and moves some logical partitions to it.
  • For better query performance, include the partition key in your queries so that a query can be served from a single partition.
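
The routing step can be pictured with a simplified sketch. It assumes the hash space is divided into equal ranges, one per physical partition, and it uses MD5 only as a stand-in hash; the real hash function and range layout are internal to Azure Cosmos DB, and the function `route` is hypothetical.

```python
import hashlib

HASH_SPACE = 2 ** 64  # size of the stand-in hash space used in this sketch


def route(partition_key_value, physical_partition_count):
    """Map a partition key value to the index of a physical partition."""
    digest = hashlib.md5(str(partition_key_value).encode("utf-8")).digest()
    hashed = int.from_bytes(digest[:8], "big")  # first 8 bytes as an integer
    # Each physical partition owns one equal slice of the hash space.
    return hashed * physical_partition_count // HASH_SPACE


for value in ("E-100", "E-200", "E-300"):
    print(value, "-> physical partition", route(value, 4))
```

The same partition key value always routes to the same partition, which is why a query that filters on the partition key can be sent to one partition instead of all of them.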

Cosmos DB Partition key Best Practices

Below are a few best practices to consider while choosing the Cosmos DB partition key.

  1. The item ID is often a good choice for the partition key because it is unique, has a wide range of possible values, and has no chance of duplication.
  2. Choose a partition key that has a wide range of distinct values.
  3. The value of the partition key shouldn’t change.
  4. It should spread throughput (RU consumption) evenly across logical partitions.
  5. Choose a property that your queries can use as a filter condition.
  6. The primary key of your data can also be a good choice for the partition key.

How do you make sure you are choosing the best partition key?

Choosing the right partition key is one of the most important design decisions. Below are a few key suggestions to consider while choosing the best partition key.

  • The best partition key is one whose value doesn’t change.
  • The partition key must contain a wide range of distinct values.
  • It must spread throughput evenly across logical partitions.
  • For the best performance, it should appear frequently as a filter in your queries.
  • Don’t choose a partition key with only a few distinct values, or performance will suffer.
  • For example, if you are developing an application that hosts employee details, you can choose the employee ID as the partition key. All the data related to a particular employee then shares the same employeeID partition key value.
  • If you are building a multi-tenant application, the tenant ID is a good candidate for the partition key.
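
One way to compare candidate keys against these suggestions is to measure them on a sample of your data. The sketch below counts the distinct values of each candidate and the share of items that falls into the largest logical partition. The sample items and the `key_distribution` helper are hypothetical.

```python
from collections import Counter


def key_distribution(items, key_property):
    """Return (number of distinct values, share of items in the largest partition)."""
    counts = Counter(item[key_property] for item in items)
    total = sum(counts.values())
    largest_share = max(counts.values()) / total
    return len(counts), largest_share


# Hypothetical sample of employee items.
items = [
    {"id": "1", "employeeId": "E-100", "country": "IN"},
    {"id": "2", "employeeId": "E-200", "country": "IN"},
    {"id": "3", "employeeId": "E-300", "country": "IN"},
    {"id": "4", "employeeId": "E-400", "country": "US"},
]

for candidate in ("employeeId", "country"):
    distinct, share = key_distribution(items, candidate)
    print(f"{candidate}: {distinct} distinct values, largest partition holds {share:.0%}")
```

In this sample, employeeId spreads the items evenly, while country puts three of the four items into a single logical partition.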

How do you create a large partition key in Azure Portal?

You can enable a large partition key in the Azure Portal while creating a new container. Follow the instructions below.

  1. Log in to the Azure Portal and navigate to the Azure Cosmos DB account.
  2. Click on + Add container to create the new container.
  3. Fill in all the details on the New Container window, and then, to create the large partition key, select the “My partition key is larger than 100 bytes” option.
  4. Click the OK button to create the container with the large partition key.

If you don’t want to create a large partition key, make sure the “My partition key is larger than 100 bytes” option is unchecked.

Cosmos DB Physical Partition Size Limit

Each physical partition in Cosmos DB can currently store up to 50 GB of data.

Cosmos DB Multiple Partition Keys

An Azure Cosmos DB container can have only one partition key.

What is Cosmos DB Synthetic Partition Key?

You can concatenate multiple property values into a single property and use that property as the partition key. This is known as a synthetic partition key.

Let’s consider an example to understand exactly what a synthetic partition key in Cosmos DB is.

In the example below, you can use either /productId or /date as the partition key, so you can partition your container by either product or date.

{
"productId": "xyz-328",
"date": 2020
}

Now, you can combine the productId and date properties to create one Cosmos DB synthetic partition key, as below.

{
"productId": "xyz-328",
"date": 2020,
"partitionKey": "xyz-328-2020"
}
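
In application code, the synthetic key is typically computed before the item is written. The following is a minimal sketch in plain Python; the helper name `with_synthetic_key` and the "-" separator are assumptions chosen to match the example above.

```python
def with_synthetic_key(item, properties, separator="-"):
    """Return a copy of the item with a partitionKey built from the given properties."""
    key = separator.join(str(item[name]) for name in properties)
    return {**item, "partitionKey": key}


item = {"productId": "xyz-328", "date": 2020}
print(with_synthetic_key(item, ["productId", "date"]))
# {'productId': 'xyz-328', 'date': 2020, 'partitionKey': 'xyz-328-2020'}
```

The container would then use /partitionKey as its partition key path.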

Cosmos DB Change Partition Key

You might be wondering: can you change the partition key in Cosmos DB, and if so, how?

The answer is that you cannot update the partition key of an existing container.

If you need a different partition key, create a new container with the partition key you want at the destination, and then migrate the data from the source container to it.
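
The migration idea can be sketched as a copy loop that assigns the new partition key property to every item. In the sketch below, plain Python lists stand in for the source and destination containers, and the `migrate` helper is hypothetical; a real migration would read from and write to Cosmos DB through an SDK or a migration tool.

```python
def migrate(source_items, new_key_property, build_key):
    """Copy items into a new 'container' (a list here), adding the new key property."""
    destination = []
    for item in source_items:
        copy = dict(item)  # leave the source item untouched
        copy[new_key_property] = build_key(item)
        destination.append(copy)
    return destination


# Hypothetical source data that will be re-keyed by tenant.
source = [
    {"id": "1", "tenant": "contoso", "name": "A"},
    {"id": "2", "tenant": "fabrikam", "name": "B"},
]

migrated = migrate(source, "tenantId", lambda item: item["tenant"])
print(migrated)
```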

Wrapping Up

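In this article, we discussed partitioning in Azure Cosmos DB: partition keys, logical and physical partitions, best practices for choosing a partition key, and synthetic partition keys.

I hope you enjoyed this article!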