An error occurred (ThrottlingException) when calling the GetDeployment operation (reached max retries: 4): Rate exceeded

Question:

With an increase in the number of Deployment Groups in AWS CodeDeploy, BitBucket Pipelines are starting to fail more often.

PIPELINE FAILED…

+ python ./_scripts/codedeploy_deploy.py
Failed to deploy application revision.
An error occurred (ThrottlingException) when calling the GetDeployment operation (reached max retries: 4): Rate exceeded

Is there any way to raise the rate limit, or at least to reduce the chance of hitting it?

AWS FORUM POST: https://forums.aws.amazon.com/thread.jspa?messageID=892511

Asked By: Adan Rehtla


Answers:

Unfortunately, there is no way to increase the rate limit, as this is dynamically provisioned by the AWS API.

AWS SUPPORT:

This issue is not related to any concurrent deployment or any other resource related limit. This is a throttling issue, which cannot be changed.

Multiple API calls initiated at the same time get throttled at our endpoints. The limit for each endpoint varies and is dynamic, so it is not documented anywhere.

In this case, there are multiple simultaneous calls to the ‘GetDeployment’ API, hence the calls are getting throttled.

In such scenarios we recommend implementing error retries with exponential backoff between retries, so that the API calls are not simultaneous.

You can check the link below, which explains how to implement this in your code.
https://docs.aws.amazon.com/general/latest/gr/api-retries.html
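The retry-with-backoff pattern from that guidance can be sketched in plain Python. This is a generic illustration, not the author's actual script: the function name, attempt count, and delays are made up, and the jitter follows the "full jitter" variant described in the AWS guidance.

```python
import random
import time


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=60.0):
    """Retry fn() with exponential backoff and full jitter.

    fn is any zero-argument callable that raises on throttling
    (illustrative; not part of the original pipeline script).
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of retries: surface the error
            # Delay doubles each attempt, capped at max_delay,
            # with random jitter so concurrent callers desynchronize.
            delay = min(max_delay, base_delay * (2 ** attempt))
            time.sleep(random.uniform(0, delay))
```

Wrapping each `GetDeployment` poll in something like this spreads the calls out instead of firing them simultaneously, which is exactly what the support answer recommends.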

I was able to implement an exponential backoff to reduce the rate at which we poll the deployment status, and also to increase the number of retries before the deployment is marked as failed.

Make sure you are using a recent version of boto3 (boto3 1.9.108 / botocore 1.12.108 or later), which supports this retry config system.

BOTO3 RETRY CONFIG: https://github.com/boto/botocore/issues/882#issuecomment-338846339

FORK: https://bitbucket.org/DJRavine/aws-codedeploy-bitbucket-pipelines-python/src/master/
GIST: https://gist.github.com/djravine/5007e2a7f726cebe14ea51c7ee54bf5d

PIPELINE SUCCESSFUL…

+ python ./_scripts/codedeploy_deploy.py
Deployment Created (Exponential back off 30s)
Deployment InProgress (Exponential back off 60s)
Deployment Succeeded

NOTE: I will update this post with more information as I revise the usage based on our deployments.

Answered By: Adan Rehtla

As suggested by @Adan Rehtla, there is no way of manually increasing the rate limit yourself. The following are our findings after raising a support ticket with AWS:

  1. Amazon EMR throttles API requests for each AWS account on a per-Region basis
  2. EMR uses a token bucket scheme for throttling. If you make a given call, on average, at or below the refill rate, you should be okay
  3. For each API, there is an account-level credit bucket with an upper limit. When you make an API call, one credit is taken from the bucket. At the same time, the bucket is being refilled with a fixed refill rate. If you are making API calls at the same rate or less than the refill rate, you will not get throttling exceptions. If you are making API calls faster than the refill rate, the credit in your bucket will be gradually depleted and you will experience throttled exceptions.
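The credit-bucket behaviour described in points 2 and 3 can be sketched with a toy model. The capacity and refill rate below are invented for illustration; as the support answer notes, AWS does not publish the real numbers:

```python
class TokenBucket:
    """Toy model of the per-API credit bucket (illustrative numbers only)."""

    def __init__(self, capacity, refill_rate):
        self.capacity = capacity        # upper limit of the bucket
        self.tokens = capacity          # bucket starts full
        self.refill_rate = refill_rate  # credits added per second

    def request(self, elapsed):
        # Refill for the time elapsed since the last call, then spend 1 credit.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True   # call succeeds
        return False      # ThrottlingException


bucket = TokenBucket(capacity=5, refill_rate=1.0)
# A burst of 8 back-to-back calls (no time for refill): the first 5 drain
# the bucket, then the remaining 3 are throttled.
burst = [bucket.request(elapsed=0) for _ in range(8)]
```

This is why spacing calls out with backoff works: once the average call rate drops to or below the refill rate, the bucket never empties.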

Our setup was an Airflow DAG creating EMR clusters and polling task status on them. Furthermore, we were creating the boto3 client using Airflow’s wrapper (AwsBaseHook) so implementing/changing the config to increase retries was not an option.

Solution:

  • Manually add retries with backoff around a try/except block that internally calls the polling function.
  • Ask the AWS team to increase the rate.
  • Use CloudTrail to find the culprit. [Must do]

Hope this helps!

Answered By: Prithu Srinivas