Control the consumption of resources used by an instance of an application, an individual tenant, or an entire service.
When to use this pattern?
To ensure that the system continues to meet its service level agreement (SLA).
To prevent a single tenant from monopolizing the resources provided by an application.
To handle bursts in activity.
To optimize cost by limiting the maximum resources the system needs to keep functioning.
Throttling Strategies
1 – Reject Requests
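One common way to decide when to reject a request is a token bucket. The sketch below is a minimal illustration, not a prescribed implementation: the class name TokenBucket, its parameters, and the injectable clock are assumptions made for this example.

```python
import time


class TokenBucket:
    """Allow roughly `rate` requests per second, with bursts of at most `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate              # tokens added per second (illustrative parameter)
        self.capacity = capacity      # maximum burst size (illustrative parameter)
        self.clock = clock            # injectable so the limiter can be tested without sleeping
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        """Return True if the request may proceed, False if it should be rejected."""
        now = self.clock()
        # Refill tokens for the time elapsed since the last call, up to capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # over the limit: the caller should reject the request
```

To throttle per tenant rather than per service, one option is to keep one bucket per tenant ID in a dictionary and call allow() on the bucket that belongs to the requesting tenant.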
2 – Disable or degrade the functionality
Disable or degrade the functionality of nonessential services so that essential services can continue to function as expected.
For example, a video streaming application could switch to a lower resolution.
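A minimal sketch of degrading functionality under load follows. The function names, the load scale (0.0 to 1.0), the thresholds, and the resolution labels are all assumptions chosen for illustration; a real system would derive them from its own metrics.

```python
def choose_resolution(load):
    """Pick a streaming resolution for the current load (0.0 = idle, 1.0 = saturated)."""
    # Thresholds are illustrative assumptions, not recommended values.
    if load < 0.6:
        return "1080p"
    if load < 0.85:
        return "720p"
    return "480p"


def is_feature_enabled(essential, load, disable_above=0.85):
    """Essential features always run; nonessential ones are switched off under heavy load."""
    return essential or load < disable_above
```

For example, choose_resolution(0.7) returns "720p", and is_feature_enabled(False, 0.9) returns False, so a nonessential feature would be disabled while essential ones keep running.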
3 – Queue-Based Load Leveling
This can be combined with the Priority Queue pattern so that higher-priority requests are processed first.
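Load leveling with a priority queue can be sketched as a bounded buffer that producers fill at any rate and a worker drains at a fixed pace. This is a simplified, single-threaded illustration: the class name PriorityBuffer, its methods, and the convention that lower numbers mean higher priority are assumptions for this example.

```python
import heapq
import itertools


class PriorityBuffer:
    """Bounded buffer that hands out the highest-priority requests first."""

    def __init__(self, max_size):
        self.max_size = max_size
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order within a priority

    def submit(self, priority, request):
        """Queue a request; lower numbers are served first. Returns False when full."""
        if len(self._heap) >= self.max_size:
            return False  # buffer is full: the caller can reject or retry later
        heapq.heappush(self._heap, (priority, next(self._order), request))
        return True

    def drain(self, batch_size):
        """Remove and return up to batch_size requests, highest priority first."""
        batch = []
        while self._heap and len(batch) < batch_size:
            _, _, request = heapq.heappop(self._heap)
            batch.append(request)
        return batch
```

A worker that calls drain() with a fixed batch size on a timer processes requests at a steady rate regardless of how bursty the submissions are, which is the leveling effect.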
4 – Defer operations performed for lower-priority applications
Inform the user that the system is busy and they can try again later.
In the above diagram, Feature B is a nonessential service.
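Strategy 4, deferring lower-priority operations and telling the caller to retry, might look like the following sketch. SystemBusyError, run_operation, the load scale, and the default thresholds are hypothetical names and values introduced only for this example.

```python
class SystemBusyError(Exception):
    """Raised to tell a caller the operation was deferred and should be retried later."""


def run_operation(operation, tenant_priority, load, busy_above=0.8, min_priority=1):
    """Run the operation unless the system is busy and the caller is low priority.

    Lower numbers mean higher priority (an assumption of this sketch). Callers with
    a priority number greater than min_priority are deferred while load >= busy_above.
    """
    if load >= busy_above and tenant_priority > min_priority:
        raise SystemBusyError("The system is busy. Please try again later.")
    return operation()
```

The caller catches SystemBusyError and retries after a delay, while high-priority callers continue to be served even when the system is under load.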
Considerations
Throttling is an architectural decision that affects the design of the entire system.
Throttling must be performed quickly.
Return HTTP status code 429 (Too Many Requests) when a request is rejected.
Throttling can be used as a temporary measure while a system autoscales.
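As a small illustration of the 429 consideration, the sketch below builds a throttled response using the standard library's HTTPStatus enum. The function name throttled_response and the (status, headers, body) tuple shape are assumptions for this example; how the response is actually sent depends on the web framework in use. The Retry-After header is a standard way to tell the client how long to wait.

```python
from http import HTTPStatus


def throttled_response(retry_after_seconds):
    """Build a (status, headers, body) tuple for a request rejected by throttling."""
    status = HTTPStatus.TOO_MANY_REQUESTS  # numeric value 429
    # Retry-After tells the client how many seconds to wait before retrying.
    headers = {"Retry-After": str(retry_after_seconds)}
    body = "The system is busy. Please try again later."
    return status, headers, body
```

Because building this response involves no I/O or lookups, the rejection path stays cheap, which matters since throttling must be performed quickly.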