In many cases, cold starts are painful. When serving live customer requests, for example, every millisecond counts toward reducing bounce rates and cart abandonment and improving conversion rates. Even in background processing, loading the same information into memory every time a new container spins up is a waste of time and money.

A free service that helps developers keep a pool of Lambdas warm, adjusting smartly to each function’s concurrency needs, would come in very handy. In this article, I outline how I would implement such a service. As a developer advocate for Dashbird serverless monitoring, I frequently see complaints about cold starts, and our goal is to provide an easy and free way to solve them.

Wood on fire. Photo by Luke Porter on Unsplash.


Keeping functions warm

We know AWS takes some time before killing an idle Lambda container. How long it takes usually varies, but it’s safe to assume that most containers won’t be killed before 5 minutes, so let’s go with it.

There’s a trade-off between frequency and cost: a higher frequency minimizes the probability of having a cold start but is more expensive, and vice versa.
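To make the trade-off concrete, here is a rough sketch that counts how many warm-up invocations a given schedule generates per month. The function name and the example values (intervals of 1, 5 and 10 minutes, a pool of 10 containers) are illustrative assumptions, and actual cost depends on current Lambda pricing, which is not modeled here.

```python
def warm_pings_per_month(interval_minutes: float, containers: int, days: int = 30) -> int:
    """Count the warm-up invocations sent during a month of `days` days."""
    cycles = int(days * 24 * 60 / interval_minutes)  # warming cycles per month
    return cycles * containers                       # one ping per container per cycle


if __name__ == "__main__":
    # Compare a few hypothetical warming intervals for a pool of 10 containers.
    for interval in (1, 5, 10):
        print(f"every {interval} min: {warm_pings_per_month(interval, 10)} pings/month")
```

Halving the interval doubles the number of pings, so the interval is best chosen as the longest one that still keeps containers alive.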


We’ll need as many warm containers as there are concurrent requests coming in. To get there, we need to send multiple “warm requests” concurrently. How many? We’ll use time-series modeling to answer just that.


Below is an outlined plan to tackle these challenges. We would use:

  • CloudWatch: trigger the warming process on a regular basis (e.g. every 5 minutes)
  • Warmer: logic to invoke Lambdas and keep them warm
  • Prediction: anticipate how many containers are needed at any point in time

Illustrative diagram:

AWS Architecture Diagram

Pool of Warm Lambdas

Our functions will need a slight modification in code to handle the “warm requests”. In order to keep execution time to a minimum when warming up containers, we’ll short circuit to terminate processing as soon as possible. We should do something like:

# Short-circuit warm-up pings; .get() avoids a KeyError on regular events
if event.get('get_warm'):
    return {'warmed': True}
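For context, this is roughly how the short circuit could sit inside a complete handler. It is a minimal sketch: `handle_business_logic` is a hypothetical placeholder for the function's real work, and the `get_warm` flag follows the snippet above.

```python
def handle_business_logic(event):
    # Hypothetical placeholder for the function's real processing.
    return {"status": "processed", "input": event}


def lambda_handler(event, context):
    # Warm-up pings carry the 'get_warm' flag; regular events usually do not,
    # so .get() is used instead of indexing to avoid a KeyError.
    if isinstance(event, dict) and event.get("get_warm"):
        return {"warmed": True}
    return handle_business_logic(event)
```

Placing the check at the very top of the handler keeps warm-up invocations as short, and therefore as cheap, as possible.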

Trigger (CloudWatch)

CloudWatch would trigger the Warmer Lambda on a regular basis, say every 5 minutes. The frequency should be adjustable on a per-function basis to accommodate different projects’ needs.
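One possible way to set up such a schedule from code is sketched below, using the `put_rule` and `put_targets` calls of the boto3 `events` client. The rule name, ARN, and the `target_function` payload key are hypothetical; the client is passed in as a parameter so the sketch does not assume any particular credentials or region.

```python
import json


def rate_expression(minutes: int) -> str:
    """Build a schedule expression such as 'rate(5 minutes)'."""
    if minutes < 1:
        raise ValueError("interval must be at least 1 minute")
    unit = "minute" if minutes == 1 else "minutes"  # singular form is required for 1
    return f"rate({minutes} {unit})"


def schedule_warmer(events_client, rule_name, warmer_arn, target_function, minutes=5):
    """Create (or update) a scheduled rule that triggers the Warmer Lambda."""
    events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=rate_expression(minutes),
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{
            "Id": "warmer",
            "Arn": warmer_arn,
            # Hypothetical payload telling the Warmer which function to warm.
            "Input": json.dumps({"target_function": target_function}),
        }],
    )

# Example (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   schedule_warmer(boto3.client("events"), "warm-checkout", WARMER_ARN, "checkout")
```

Creating one rule per warmed function is what makes the interval adjustable per function. The Warmer Lambda would also need a resource-based permission allowing the events service to invoke it, which this sketch leaves out.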

Warmer (Lambda)

The Warmer will invoke and warm up a pool of Lambdas of our choosing. It should handle the concurrency issue discussed above. To make this Lambda work, we could use the cool open-source project Lambda-Warmer, by Jeremy Daly.
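As an illustration of the concurrency part, the sketch below sends several warm requests in parallel with a thread pool and the boto3 Lambda client's `invoke` call. This is not how Lambda-Warmer itself is implemented; `warm_function` and its parameters are hypothetical names, and the client is injected so the caller decides how it is created.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def warm_function(lambda_client, function_name, containers):
    """Send `containers` warm requests at once and return their status codes."""
    if containers < 1:
        return []
    payload = json.dumps({"get_warm": True}).encode("utf-8")

    def ping(_):
        # Synchronous invocation, so each in-flight request occupies one container.
        response = lambda_client.invoke(
            FunctionName=function_name,
            InvocationType="RequestResponse",
            Payload=payload,
        )
        return response["StatusCode"]

    # One thread per ping, so the requests overlap in time.
    with ThreadPoolExecutor(max_workers=containers) as pool:
        return list(pool.map(ping, range(containers)))

# Example (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   warm_function(boto3.client("lambda"), "checkout", containers=5)
```

One caveat: if the warmed function returns almost instantly, some pings may finish before others start and end up reusing the same container. A short delay in the warm branch of the handler is one way to make the requests overlap.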

Campfire. Photo by Sandis Helvigs on Unsplash.

Prediction (Lambda)

The Warmer Lambda will need to know how many containers should be warmed at each cycle. The Prediction Lambda will provide just that. Here’s how it’s going to work:

First, the Predictor will get the latest invocation history for a given Lambda from CloudWatch metrics. The AWS CLI has the get-metric-data command, which would give us what we need. To run the entire process autonomously, though, we would consume this data through an AWS SDK (such as boto3’s get_metric_data) instead of the command line. CloudWatch can tell us how many times a function was invoked per minute over the past few hours or days. We would use this as our proxy for the number of concurrent requests. It’s not perfect, but it may be as close as we can get to the real number.
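A minimal sketch of that query with boto3's `get_metric_data` could look like the following. The function name, the 24-hour window, and the 60-second period are assumptions chosen for illustration, and pagination through `NextToken` is omitted for brevity.

```python
from datetime import datetime, timedelta, timezone


def fetch_invocation_history(cloudwatch_client, function_name, hours=24, period_seconds=60):
    """Return (timestamp, invocation_count) pairs for one Lambda function."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    response = cloudwatch_client.get_metric_data(
        MetricDataQueries=[{
            "Id": "invocations",  # query ids must start with a lowercase letter
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/Lambda",
                    "MetricName": "Invocations",
                    "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
                },
                "Period": period_seconds,
                "Stat": "Sum",  # total invocations in each period
            },
            "ReturnData": True,
        }],
        StartTime=start,
        EndTime=end,
        ScanBy="TimestampAscending",  # oldest data point first
    )
    result = response["MetricDataResults"][0]
    return list(zip(result["Timestamps"], result["Values"]))

# Example (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   history = fetch_invocation_history(boto3.client("cloudwatch"), "checkout")
```

The returned values, ordered from oldest to newest, are the series a prediction model would consume.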

The invocation history would then be used by a time-series prediction model to anticipate the maximum number of concurrent requests our Lambda is expected to get in the next 5 minutes and provide this to the Warmer Lambda.

For the time-series modeling, we plan to use StatsModels, an interesting open-source statistical project that provides implementations of some of the best algorithms for this task.
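To show the shape of the Predictor's output without committing to a specific model, here is a dependency-free stand-in based on simple exponential smoothing. It is not a StatsModels model and not necessarily what the final service would use; `forecast_peak`, `alpha`, and `safety_margin` are hypothetical names and defaults.

```python
import math


def forecast_peak(history, alpha=0.5, safety_margin=1.2):
    """Estimate how many containers to warm from per-interval invocation counts.

    `history` is ordered from oldest to newest. The estimate is the larger of
    the exponentially smoothed level and the peak of the last five intervals,
    scaled by a safety margin and rounded up.
    """
    if not history:
        return 1  # no data yet: keep at least one container warm
    level = history[0]
    for value in history[1:]:
        # Simple exponential smoothing: recent values weigh more.
        level = alpha * value + (1 - alpha) * level
    recent_peak = max(history[-5:])
    estimate = max(level, recent_peak) * safety_margin
    return max(1, math.ceil(estimate))


if __name__ == "__main__":
    print(forecast_peak([2, 3, 5, 4, 6, 8]))
```

Taking the recent peak into account makes the estimate deliberately conservative: an extra warm container costs far less than a cold start on a live request. A proper time-series model could replace the smoothing step while keeping the same input and output.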


These are our outlined ideas for a simple yet effective system to keep Lambdas constantly warm. If you want to be notified when this service is freely available, please drop me a message at

This post is also available on DEV.