Thilina Ashen Gamage

AWS Lambda Functions Best Practices

Let’s write Lambda functions with quality and style

Photo by Anthony Riera on Unsplash

This blog post summarizes guidelines and best practices for writing high-quality Lambda functions. The advice here is inspired by the official AWS docs, personal experience, and suggestions from experts in the community. I hope you will find these guidelines useful in your journey as well.

Unit-Testable Functions

  • Make the logic inside Lambda functions unit-testable. This can be achieved by separating the core logic from the Lambda handler, as shown below. Ideally, a Lambda handler should be thin and act as a proxy or router to a Lambda-agnostic core logic function.
const { LambdaUtils } = require("common-layer");

// Lambda handler: a thin proxy to the core logic
exports.handler = function (event, context, callback) {
  const appContext = LambdaUtils.getAppContext(event, context);
  const result = executeTransaction(appContext);
  callback(null, LambdaUtils.wrapResponse(result));
};

// Core logic function (Lambda-agnostic)
function executeTransaction(appContext) {
  // 1. validate function input
  // 2. run business logic
  // 3. return response
}
  • Prevent core logic functions from consuming or producing data through AWS-specific interfaces (e.g. APIGatewayEvent, Context, Handler, APIGatewayProxyEvent, APIGatewayProxyResult). As mentioned above, use helper functions like getAppContext() and wrapResponse() to handle type conversions, and always ensure that core logic functions consume and produce data through common interfaces defined in the application. This practice keeps unit tests unaware of AWS-specific formats and makes functional tests more generic.
  • Ideally, the functions should be testable locally without the presence of AWS-based resources. To achieve that, we can mock or stub AWS-based resources such as event sources and data services. For complex scenarios and component-like testing, consider using local alternatives to AWS services, e.g. LocalStack.

Function Scaling, Concurrency, Throttling

  • Lambda functions are built to auto-scale. More precisely, by default, Lambdas automatically scale out and in (a.k.a. horizontal scaling, i.e. adding or removing parallel Lambda instances to spread out a load). Scaling up and down (a.k.a. vertical scaling, i.e. adding or removing CPU/memory resources), however, must be done manually.
  • As an example of manual scaling on Lambda, if the workload is memory-heavy, you can consider allocating more resources at deployment time. Ideally, you may have to do some profiling and close monitoring of processing times under different loads and resource configurations to find the sweet spot with optimum settings. You can also consider using tools like AWS Lambda Power Tuning.
  • As an example of auto-scaling on Lambda, if the input traffic surges by 100 times, Lambdas will immediately scale out (spin up more instances to spread out the excess load), and later, when the load returns to normal, Lambdas will eventually free up the unused instances and scale back in. However, in some rare situations (developer mistakes, architectural pitfalls, malicious activity), Lambdas can run into unintentional scale-out scenarios and create many unnecessary issues related to the cost, security, and performance of the application. Such high-intensity scale-out scenarios can also exhaust downstream resources. For instance, if the max connection limit for a database is set to 100 and your database-connector Lambdas scale beyond that limit, overall database access will slow down. Even if the database end has proper load-handling and security policies, when the concurrent connection count keeps growing it may assume the load to be malicious/abnormal (like a mini DDoS attack) and close the connections immediately. To avoid such scenarios, it is good practice to always configure reserved concurrency (an upper limit on the number of instances of a function running at any given time; without it, a function draws from the shared regional account quota, and burst limits of 500–3,000 instances apply depending on the AWS region).
  • When your function approaches its reserved concurrency limit or the region-based quota set by AWS, your traffic will get throttled, and once the limits are exceeded, the excess load will be rejected with a throttling error (HTTP status code: 429 Too Many Requests). A nice way to handle such a scenario is to slow down the function invocation rate, write the failed and excess load to a data store, and retry it at intervals, either individually or as batches of controlled sizes. If throttling errors are frequent, you may need to rethink your overall architecture and adjust it to support a fairly controlled throughput / batch size.
  • In general, when dealing with Lambda failures, retries, throttling, batch processing, and other such scenarios, make sure that you never make assumptions about the order of Lambda executions. Also, implement idempotency checks wherever necessary.
  • Another concurrency-related config is provisioned concurrency (a lower limit on the number of function instances kept running at any given time, default: 0). If you build serious applications where cold start time is a problem (e.g. latency-critical applications), you may consider setting up some number of instances to always be available (i.e. staying warm). Lambda provides multiple parameters to fine-tune these concurrency levels; it's best to understand and use them as explained in the official docs here: AWS Lambda function scaling.
  • In theory, a Lambda function can be modelled as just another programming function, so developers should be able to make recursive calls to it too. In reality, however, such approaches can introduce unnecessarily escalated costs due to the high volume of function invocations. If you need recursion, the best practice is to implement that logic as a separate function (external to the main Lambda function) and ensure that the recursion completes within a single Lambda invocation. If for some reason (e.g. by mistake or for experimental purposes) you ever implement recursion across Lambda invocations, consider setting the reserved concurrency to 0 immediately, so that all invocations get throttled until you update the code and deploy a fix with better handling of the recursion.
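One way to "slow down and retry" on throttling, as suggested above, is capped exponential backoff. The sketch below is a generic client-side pattern, not an AWS API; the base delay, cap, and `err.statusCode` check are assumptions you would adapt to your SDK's error shape.

```javascript
// Capped exponential backoff between retries; the base and cap values
// here are illustrative -- tune them against your real throttling limits.
function backoffDelay(attempt, baseMs = 100, capMs = 2000) {
  return Math.min(baseMs * 2 ** attempt, capMs);
}

// Retry a call, backing off only on throttling errors (HTTP 429);
// any other error is rethrown immediately.
async function withRetries(fn, maxAttempts = 5) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (err.statusCode !== 429 || attempt >= maxAttempts - 1) throw err;
      await new Promise((resolve) => setTimeout(resolve, backoffDelay(attempt)));
    }
  }
}
```

Production SDKs often add random jitter to the delay so that throttled clients do not all retry in lockstep.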
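The idempotency checks mentioned above can be sketched as "record the event ID before processing, skip if already seen". The in-memory Set below is only for illustration; in a real Lambda you would use a durable store (e.g. DynamoDB conditional writes), since in-memory state does not survive across instances.

```javascript
// Idempotency sketch: an in-memory set stands in for a durable store.
// In production, replace this with e.g. a DynamoDB conditional write,
// because in-memory state is not shared across Lambda instances.
const processedEventIds = new Set();

function processOnce(eventId, handler) {
  if (processedEventIds.has(eventId)) {
    return "skipped"; // duplicate delivery or retry: do nothing
  }
  processedEventIds.add(eventId);
  handler();
  return "processed";
}
```

With this guard, out-of-order deliveries and retries of the same event cannot apply the side effect twice.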

Separation of Concerns

  • Create common Lambda layer(s) and consider moving common helper methods and other shared code into them. Layers promote code sharing and separation of responsibilities, so that you can iterate faster on writing business logic.
  • Maintain separation of concerns in deployments and support the expectations of each environment/stage (such as production, testing, and development) via the effective use of separate AWS accounts, regions, IAM rules, network settings, versions, aliases, environment variables, AWS AppConfig, and other mechanisms. For instance, if you write to a DynamoDB table at the development stage, make sure it has a separate resource name (ARN) that reflects its purpose (e.g. dev10-transactions, qa2-transactions, prod-transactions). Make sure that the Lambda functions can dynamically derive those ARNs using one of the above mechanisms, instead of hard-coding them each time.
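Deriving stage-specific names dynamically can be as simple as prefixing a base name with an environment variable. The STAGE variable and the naming scheme below are assumptions matching the examples above, not a fixed AWS convention.

```javascript
// Derive stage-specific resource names from an environment variable
// instead of hard-coding them; STAGE and the `${stage}-${base}` naming
// scheme are illustrative assumptions.
function resourceName(baseName) {
  const stage = process.env.STAGE || "dev"; // e.g. dev10, qa2, prod
  return `${stage}-${baseName}`;
}
```

With STAGE=prod set on the function's configuration, `resourceName("transactions")` yields "prod-transactions", so the same code deploys unchanged to every environment.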

Light-weight Frameworks

  • When choosing technology stacks, always go for simple and lightweight frameworks instead of heavy multi-purpose libraries. Also, use technology-level optimizations to reduce the deployment package size and produce optimized builds with better load times, which reduces the time to download and unpack those artefacts in the execution environment. This can significantly reduce the cold start duration and result in cost savings too.
  • In general, try to write smaller functions with fewer dependencies so that the cold start time could be minimized.
  • Lambda execution environments usually include a minimal set of libraries needed for a Lambda to run (such as the AWS SDK), so avoid duplicating such dependencies. However, you should always package all other runtime dependencies into the function's deployment artefacts. Also, exclude development-time dependencies like compile, build, and test-related packages, and limit the deployment artefacts to a minimum: ideally, only the runtime necessities.
  • Make sure you lock the library versions when the code is stable, and periodically evaluate the need for updating library versions and obtain the security patches and other updates as necessary.
  • As a periodical Ops task, clean up the old Lambda functions, configs, layers, logs, and other resources that are no longer in use. Ideally, you should have automated scripts to create, update, repair, and tear down environments.

Execution Environment Usage

  • Do not cache dynamic data in the execution environment. Also, avoid maintaining state and user data between function invocations; sharing user data raises security issues too. By design, Lambda functions are ephemeral (short-lived), so as a best practice, always design them to be stateless. If some state or data needs to be persisted, consider using a separate cache service, message queue, or similar data store, with proper controls.
  • Cache static assets in the execution environment (in the /tmp directory, which is the only writable directory in a Lambda execution environment; all other locations are read-only). With this, subsequent invocations that arrive at the same function instance can reuse the cached resources, which helps reduce the run time of a function.
  • Avoid dependencies between Lambda code and the underlying compute infrastructure. Function logic should be unaware of the infrastructure and should never assume any condition from the infrastructure unless specified and exposed by AWS via proper APIs.
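Besides /tmp, static data can also be cached at module scope, since module-level variables survive warm invocations of the same execution environment. The loader below is a generic sketch; `fetcher` stands in for whatever expensive call (S3 read, SSM lookup, etc.) produces the static data.

```javascript
// Module-scope cache: initialized once per cold start, then reused by
// every warm invocation of this execution environment.
let cachedConfig = null;

function loadStaticConfig(fetcher) {
  if (cachedConfig === null) {
    cachedConfig = fetcher(); // expensive fetch runs once per cold start
  }
  return cachedConfig; // warm invocations reuse the cached copy
}
```

Note this only suits truly static data; anything dynamic belongs in an external store, per the first bullet above.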

Connection Handling

  • Cache stateless communication pipes, e.g. persistent HTTP connections (such as database connections). When creating a connection to an external resource, instead of creating a fresh connection for each invocation, create one persistent connection (with a keep-alive directive) and reuse it across subsequent invocations throughout the lifetime of the function instance.
  • However, due to inactivity, Lambda functions can be frozen for a while between invocations and then wake up in warm mode if another invocation happens before their expiry. Because of this freezing, the external resource might have closed the connection from its end. In such scenarios, even though the Lambda function waking up in warm mode believes the connection to be still open, it won't be, and any attempt made using the previously-cached connection will fail immediately. Therefore, to guard against these zombie connections, if a persistent connection is too old, ping it and verify that the pipe is not yet closed (i.e. check connection liveness manually). The age of a persistent connection can simply be calculated by preserving the last invocation time in the function and subtracting it from the current time. The check (here, ping() and createConnection() are placeholders for your connection library's calls):
let cachedConnection = null, lastInvokedTime = 0;
const AVG_FREEZE_MS = 5 * 60 * 1000; // assumed average idle time before a freeze

async function getConnection() {
  const now = Date.now();
  if (!cachedConnection || (now - lastInvokedTime) >= AVG_FREEZE_MS) {
    // Connection is highly likely to be broken: try a ping, reconnect on failure
    try { await cachedConnection.ping(); }
    catch (e) { cachedConnection = await createConnection(); }
  }
  // Otherwise the connection is highly likely to be alive: reuse it
  lastInvokedTime = now;
  return cachedConnection;
}
  • Another improvement is to set up a connection pooling library or a set of proxy functions external to the application-layer Lambda function(s) and use it to share connections among multiple connection requests coming from different Lambda functions. In general, this approach is very efficient: it reduces both run time and total costs. However, a potential problem is the slow-trains problem: if the connector is serving 2 different function requests and the first request is significantly slower than the next one, it can create a bottleneck and impact the other function's performance too. To detect such scenarios, you may have to profile the various functions, record their response times for each request type, and fine-tune the functions' behaviour and the connection pooling strategy accordingly.
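A toy version of such a pool, including where the slow-trains problem shows up, might look like the sketch below. This is not a real pooling library (AWS offers managed options like RDS Proxy for databases); `factory` and the connection objects are placeholders.

```javascript
// Minimal pooling sketch: a fixed-size pool hands out idle connections
// and queues callers when all connections are busy. `factory` is a
// placeholder for whatever creates a real connection.
class ConnectionPool {
  constructor(factory, size) {
    this.idle = Array.from({ length: size }, () => factory());
    this.waiters = [];
  }
  async acquire() {
    if (this.idle.length > 0) return this.idle.pop();
    // All connections busy: the caller waits here. A slow earlier request
    // holding a connection delays everyone queued behind it (slow trains).
    return new Promise((resolve) => this.waiters.push(resolve));
  }
  release(conn) {
    const waiter = this.waiters.shift();
    if (waiter) waiter(conn); // hand the connection straight to a waiter
    else this.idle.push(conn);
  }
}
```

Profiling per-request hold times, as suggested above, tells you whether to grow the pool or isolate slow request types onto their own connections.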

CI/CD, Monitoring, Observability

  • Like anything on the cloud, Lambda functions must also go through a properly-scripted deployment approach. From day 1, consider investing in a proper IaC templating approach (AWS SAM, Serverless, Terraform, etc.), CI/CD pipelines, and monitoring, alerting, and observability tools (start with CloudWatch + X-Ray, and later experiment with commercial/open-source tools). Also, be mindful about writing meaningful log messages with correct log levels and formats, and about collecting and publishing important stats. Let's explore this topic further in an upcoming blog post.

What’s Next

Hope you enjoyed the read so far. In the upcoming blog posts of this series, let's discuss scalability, CI/CD, Lambda logging, monitoring, and observability in more depth. If you have any questions or suggestions, please feel free to let me know.

Stay tuned for the next AWS tip. Until then, happy coding!
