We now have a YouTube Channel. 
Visit complete Backend roadmap

← Back to Topics List


Throttling is a design pattern that is used to limit the rate at which a system or component can be used. It is commonly used in cloud computing environments to prevent overuse of resources, such as compute power, network bandwidth, or storage capacity.

There are several ways to implement throttling in a cloud environment:

  • Rate limiting: This involves setting a maximum number of requests that can be made to a system or component within a specified time period.
  • Resource allocation: This involves allocating a fixed amount of resources to a system or component, and then limiting the use of those resources if they are exceeded.
  • Token bucket: This involves using a “bucket” of tokens to represent the available resources, and then allowing a certain number of tokens to be “consumed” by each request. When the bucket is empty, additional requests are denied until more tokens become available.

Throttling is an important aspect of cloud design, as it helps to ensure that resources are used efficiently and that the system remains stable and available. It is often used in conjunction with other design patterns, such as auto-scaling and load balancing, to provide a scalable and resilient cloud environment.

Visit the following resources to learn more:


roadmap.sh is the 6th most starred project on GitHub and is visited by hundreds of thousands of developers every month.

Roadmaps Guides Videos About YouTube

roadmap.sh by Kamran Ahmed

Community created roadmaps, articles, resources and journeys to help you choose your path and grow in your career.

© roadmap.sh · FAQs · Terms · Privacy


The leading DevOps resource for Kubernetes, cloud-native computing, and the latest in at-scale development, deployment, and management.