The Everything Guide to Lambda Throttling, Reserved Concurrency, and Execution Limits

Kerri Rapes · Published in ITNEXT · Mar 28, 2019

What you need to know, to know, that you know a little somethin-somethin…

Definitions:

Concurrent Executions — Processes that are being executed by AWS Lambda functions at the same time.

Request — An event that triggers an AWS Lambda function to launch and begin processing.

What are the Limits?

Every AWS account starts with a pool of 1000 concurrent executions. All of the lambdas in the account share executions from this pool.

If one lambda were to receive a very large quantity of requests (say 1000), AWS would run those 1000 from the common pool.

If a second lambda were to also receive requests (for example 150) in the same instant, it runs the risk of being completely or partially rejected because the combined number of concurrent executions is over the 1000 limit.
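The arithmetic above can be sketched directly. This is a back-of-the-envelope model, not an AWS API — it assumes the default 1000-execution account limit and that all requests arrive in the same instant:

```python
ACCOUNT_LIMIT = 1000  # default per-account pool of concurrent executions

def overflow(simultaneous_requests):
    """How many simultaneous requests exceed the shared account pool."""
    total = sum(simultaneous_requests)
    return max(0, total - ACCOUNT_LIMIT)

# One lambda receives 1000 requests and a second receives 150 at the same time:
print(overflow([1000, 150]))  # 150 executions risk being throttled
```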

For some applications this risk of rejection is not acceptable. In these cases the engineer can reserve a set number of executions for any given lambda, guaranteeing that they will always be available, using the Reserved Concurrency parameter.

Reserved Concurrency can be set on any or all of the lambdas, in any combination. However, AWS reserves 100 executions for the common pool at all times. So, if the account has a limit of 1000, the maximum combined Reserved Concurrency would be 900.
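A quick sanity check on that rule — the 100-execution floor on the unreserved pool is exactly what caps combined reservations at 900 (again a sketch of the arithmetic, assuming the default limits):

```python
ACCOUNT_LIMIT = 1000
UNRESERVED_MINIMUM = 100  # AWS keeps at least this many in the common pool

def can_reserve(already_reserved, requested):
    """True if a new reservation still leaves 100 unreserved executions."""
    return already_reserved + requested <= ACCOUNT_LIMIT - UNRESERVED_MINIMUM

print(can_reserve(0, 900))    # True  -- 900 is the maximum combined total
print(can_reserve(850, 100))  # False -- would leave only 50 unreserved
```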

When the Concurrency parameter is set back to ‘use unreserved account concurrency’, the executions are returned to the common pool.

With this setup, the reserved executions will always remain untouched by the other lambdas even if the lambda for which they were reserved is not using them.

The same is true in the opposite direction. Once the Reserved Concurrency parameter is set for a lambda function, the total number of concurrent executions of that function cannot exceed that number, so the parameter also acts as a concurrency limit for that function.
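The two-sided behavior — a guaranteed floor that is also a hard ceiling — can be modeled in a couple of lines. A toy sketch, not AWS behavior down to the retry semantics (throttled invocations are rejected or retried depending on the event source):

```python
def simulate(requests, reserved):
    """Split simultaneous requests into (served, throttled) under a reserved cap."""
    served = min(requests, reserved)
    return served, requests - served

print(simulate(150, 25))  # (25, 125): only 25 run at once, the rest are throttled
print(simulate(10, 25))   # (10, 0): under the cap, everything runs
```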

For some systems the total number of concurrent executions consistently exceeds the 1000 limit even after optimizations. In this scenario the account owner can request a system limit increase by contacting the AWS Support Center.

Why Would We Ever Want to Limit the Executions?

There are many reasons to throttle or limit your lambda function.

  1. Cost or security: you may not want someone accidentally or maliciously making a large number of requests to your system.
  2. Performance: to force reasonable batch sizes.
  3. Scalability: to match throughput to your downstream resources.
  4. Off switch: reducing the Reserved Concurrency to zero effectively means no traffic will flow through your processes.

What Does this Look Like in Practice?

To modify the Reserved Concurrency of any lambda there are several options.

The Console

Inside the AWS console, navigate to your lambda; on the configuration page, scroll down to the Concurrency box and select Reserved Concurrency.

The Command Line

To modify the Reserved Concurrency via the command line use the following command:

aws lambda put-function-concurrency --function-name YOUR_FUNCTION_NAME_HERE --reserved-concurrent-executions 50
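The same call can be made programmatically through boto3’s `put_function_concurrency` (setting the value to 0 gives you the off switch mentioned earlier). A minimal sketch — the helper name is my own, and it assumes boto3 is installed and AWS credentials are configured; the client is injectable so it can be stubbed:

```python
def set_reserved_concurrency(function_name, reserved, client=None):
    """Reserve `reserved` concurrent executions for a function (0 = off switch)."""
    if client is None:
        import boto3  # assumes boto3 is installed and credentials are configured
        client = boto3.client("lambda")
    client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=reserved,
    )

# Example: set_reserved_concurrency("YOUR_FUNCTION_NAME_HERE", 50)
```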

Serverless Framework File

If you’re deploying your functions with the Serverless Framework, you can modify the Reserved Concurrency for any lambda inside the functions section of your file.

service: MediumConcurrency

provider:
  name: aws
  runtime: python3.7
  stage: ${opt:stage, 'dev'}
  region: ${opt:region, 'us-east-1'}
  profile: ${opt:profile, 'default'}
  environment:
    region: ${self:provider.region}
    stage: ${self:provider.stage}
  stackTags:
    Owner: krapes
    Project: concurrencyLimits
    Service: concurrencyLimits
    Team: brokenLeg
  stackPolicy: # This policy allows updates to all resources
    - Effect: Allow
      Principal: "*"
      Action: "Update:*"
      Resource: "*"
  iamRoleStatements:

functions:
  dummy:
    handler: dummy.main
    timeout: 10
    # This parameter sets the reserved concurrency for the lambda 'dummy'
    reservedConcurrency: 25
    # events:
    #   - http:
    #       method: GET
    #       path: /dummy
    #       resp: json

# plugins:
#   - serverless-python-requirements

custom:
  pythonRequirements:
    dockerizePip: non-linux

Now when testing your lambda, you’ll see that with Reserved Concurrency set, the excess requests are rejected with a 500 error code, throttling the system.

The outputs below were generated using the Lambda Load Testing tool, first without reservedConcurrency and then with it set to 25.

Without Reserved Concurrency Limit:

bash run.sh -n 5000 -c 50

Details (average, fastest, slowest):
DNS+dialup: 0.0009 secs, 2.0200 secs, 6.0415 secs
DNS-lookup: 0.0002 secs, 0.0000 secs, 0.0185 secs
req write: 0.0000 secs, 0.0000 secs, 0.0030 secs
resp wait: 3.5561 secs, 2.0199 secs, 6.0414 secs
resp read: 0.0001 secs, 0.0000 secs, 0.0032 secs
Status code distribution:
[200] 5000 responses

With Reserved Concurrency Limit:

bash run.sh -n 5000 -c 50

Details (average, fastest, slowest):
DNS+dialup: 0.0007 secs, 0.0094 secs, 5.6580 secs
DNS-lookup: 0.0000 secs, 0.0000 secs, 0.0119 secs
req write: 0.0000 secs, 0.0000 secs, 0.0033 secs
resp wait: 1.1845 secs, 0.0093 secs, 5.5826 secs
resp read: 0.0000 secs, 0.0000 secs, 0.0032 secs
Status code distribution:
[200] 1638 responses
[500] 3362 responses

Conclusions

In summary, the concurrency provisioning and throttling of lambda functions can all be managed through the Reserved Concurrency parameter. In cases where the account simply needs more than the standard 1000 concurrent executions Amazon is happy to raise the limit after discussing other optimization techniques.

While concurrency limits are a good indication of the throughput of your service, they still don’t tell the whole story. The average run time has a huge effect on the number of requests a service can churn through in a given period. A lambda that works through a process twice as fast needs fewer concurrent executions to handle the same number of requests than a slower function.
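That relationship between run time and concurrency is just Little’s law: the concurrency a service needs is roughly its arrival rate multiplied by its average duration. A sketch of the steady-state estimate (real traffic is bursty, so treat this as a lower bound):

```python
def required_concurrency(requests_per_second, avg_duration_seconds):
    """Little's law: concurrent executions needed to sustain a request rate."""
    return requests_per_second * avg_duration_seconds

print(required_concurrency(100, 2.0))  # 200.0
print(required_concurrency(100, 1.0))  # 100.0 -- twice as fast, half the concurrency
```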

To best understand how your service is going to respond it’s always best to stress-test it yourself. The Lambda Load Testing repository is a great place to get started.
