AWS Lambda is a mature, feature-rich compute platform. While it is straightforward for backend developers to use, there are a few things to keep in mind when it comes to performance tuning.
The cornerstone of Lambda performance is startup time. When a Lambda function is invoked for the first time, AWS spins up a function instance and runs the runtime initialization. This “cold start” can take a noticeable amount of time, which is often unacceptable in production. A new Lambda instance can also be created during traffic spikes, parallel invocations, or after a function has been idle for a while. To mitigate this, you can pre-warm a Lambda function so that it is already initialized by the time the first request comes in. There are currently two similar approaches to choose from.
If you are using the Serverless Framework, you can pre-warm functions with the Serverless WarmUp Plugin. It works by invoking your Lambda functions from a dedicated “warmer” function on a specified schedule (say, every five minutes) to simulate a user request or an event. You can also configure the concurrency at which your functions are invoked, which effectively keeps multiple function instances warm.
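A minimal configuration sketch (the plugin name and option names follow the serverless-plugin-warmup documentation; the schedule and concurrency values are just examples):

```yml
plugins:
  - serverless-plugin-warmup

custom:
  warmup:
    default:
      enabled: true
      # Example: ping every five minutes
      events:
        - schedule: rate(5 minutes)
      # Example: invoke 5 copies in parallel to keep 5 instances warm
      concurrency: 5
```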
It is also possible to provide additional function initialization code, for example to connect to a database or to instantiate resource-heavy objects.
Note that there is no guarantee that the initialization code will be invoked before an actual request is made, or that it will be invoked at all, so make sure that your initialization logic is lazy and idempotent.
Keep in mind that warmup invocations add to your Lambda bill just like any other Lambda request. A single function invoked every five minutes accounts for 8,640 calls each month. Depending on your requirements, it could be beneficial to pre-warm your function only during business hours. You can do so by providing a cron expression in your warmup configuration:
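For instance, a sketch that warms the function only on weekdays during working hours (the hours and the AWS cron syntax below are an example; adjust to your own schedule and time zone):

```yml
custom:
  warmup:
    default:
      enabled: true
      events:
        # Every 5 minutes, 09:00–17:55 UTC, Monday through Friday
        - schedule: cron(0/5 9-17 ? * MON-FRI *)
```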
AWS introduced a native way to pre-warm Lambda functions called “provisioned concurrency”. You just specify the number of function instances you want to keep warm, and AWS takes care of the rest. If you are using the Serverless Framework, you can configure provisioned concurrency in the following way:
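A minimal example (the function name and handler path are placeholders; `provisionedConcurrency` is the Serverless Framework’s function-level setting for this feature):

```yml
functions:
  myFunction:
    handler: handler.main
    # Keep 5 initialized instances ready at all times
    provisionedConcurrency: 5
```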
The way pre-warmed functions get initialized differs from the WarmUp plugin: with provisioned concurrency, the handler function is not called upon initialization. This has a drawback: it prevents you from running asynchronous initialization code in Node.js. To get around this limitation, you can use a Lambda layer that lets you provide a callback during initialization.
One of the benefits of this being a native feature is that you can automatically scale the provisioned concurrency of a function based on its utilization or on a schedule using scaling policies.
When provisioned concurrency is configured for a function, its execution duration is billed at a discounted rate, but you also pay an additional fee for as long as the feature is active. See the AWS Lambda pricing page for details.
The package size of a Lambda function affects the time it takes to initialize. There is also an account-level limit on the total size of all deployed Lambda functions, so it is advisable to keep the size of your functions in check. This is especially relevant for Node.js functions with lots of npm dependencies.
By default, npm dependencies are packaged with the function code as-is, with all their contents intact. Many packages include things like documentation or even media files that are not needed to run them. A possible solution is to use a code bundler like Webpack (or better yet, Serverless Bundle) to transpile dependencies into a single output file. The resulting package can easily be several times smaller than the same function with dependencies included as-is.
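If you go with Serverless Bundle, enabling it can be as simple as registering the plugin in your serverless.yml (a sketch based on the plugin’s documentation; it bundles your handlers with a preconfigured Webpack setup, no separate Webpack config required):

```yml
plugins:
  - serverless-bundle
```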
Another important topic related to Lambda performance is network connectivity. Lambda function instances can be created or torn down at any time, and the execution environment can be frozen between requests. All of this can affect persistent network connections and increase the reconnection rate, resulting in subpar application performance.
If you’re still using AWS SDK v2 for Node.js, there is a single-line change that can greatly improve its performance. Just add AWS_NODEJS_CONNECTION_REUSE_ENABLED=1 as an environment variable of your function. As the name suggests, it enables reuse of TCP connections, so subsequent requests to the AWS API will take noticeably less time. AWS SDK v3 has this behavior enabled by default.
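In a serverless.yml this could look like the following sketch (setting it at the provider level applies it to every function in the service):

```yml
provider:
  name: aws
  environment:
    AWS_NODEJS_CONNECTION_REUSE_ENABLED: '1'
```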
The best practice when using a remote database from Lambda is to go through a proxy, either a stateless API (like DynamoDB’s) or a connection pool. Since Lambda can scale up to many instances very quickly and can dispose of existing ones at any time, it can open too many connections to a database server. AWS RDS Proxy is a serverless solution that provides connection pooling for RDS databases. If you don’t use RDS, there are similar products for most popular databases, such as PgBouncer for PostgreSQL.
It is important to understand the performance bottlenecks of your Lambda functions when using them for mission-critical workflows and in high-load projects. With the serverless approach, you can’t rely on the toolset that exists in the world of classic servers, but there are a few AWS products that can help you get the job done.
By default, Lambda provides only a handful of performance metrics, such as execution duration and concurrency. If you want more information on what’s actually happening in your functions, you can enable Lambda Insights. With that in place, you will get detailed metrics for CPU, RAM, and I/O usage, as well as information on cold starts. This data is crucial for fine-tuning your function memory size, especially if you perform computationally intensive tasks in Lambda.
Last but not least is the tracing service for distributed applications: AWS X-Ray. If you leverage a microservice approach, you absolutely need to monitor how requests propagate through the system to identify performance issues that are otherwise hard to pinpoint. With just a few lines of code, you can break down your requests into trace timelines and see which parts of your system need attention.
With just a few easy steps, you can make your Lambda functions much faster without even changing the architecture.