Since the advent of cloud computing, most enterprises have primarily relied on the Infrastructure as a Service (IaaS) model for building and deploying their web applications. IaaS can be leveraged in one of the following three ways:
- A public cloud (think AWS EC2, Google Cloud Compute Engine, etc.).
- A private (corporate) cloud.
- A hybrid cloud (an amalgamation of the above two).
IaaS has been essential in introducing countless organizations to the benefits of the cloud. However, for quite some time now, Serverless has been touted as the next logical step for cloud computing. Although the reasons for this are manifold, the major one (as with most other business decisions) is money. (Of course, there are several other factors as well, but that would be another blog in itself.)
Detour: What is Serverless?
In the IaaS world, you own a bunch of servers and deploy your applications to them. You pay good money for these servers, whether on an hourly, monthly, or some other billing plan. But do you fully utilize their compute power at all times? Of course not! Though you can employ various auto-scaling strategies, many of your resources will still go to waste. You end up paying for idle time as well, unless of course you can predict your exact traffic trend, which, unfortunately, is not true of most applications.
This is where Serverless shines the most: you pay only for what you use. For example, if you were to run a piece of code on AWS Lambda (a Serverless offering from AWS), and your code ran for, say, 100 ms, you would pay AWS only for those 100 ms worth of resources! This is what drives the price down by a major factor.
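To get a feel for the numbers, here is a back-of-the-envelope cost sketch. The rates below are illustrative (roughly AWS's published Lambda pricing for us-east-1: a per-GB-second compute charge plus a per-request charge); check the current price list before relying on them, and note this ignores the free tier.

```python
GB_SECOND_RATE = 0.0000166667    # USD per GB-second of compute (illustrative)
REQUEST_RATE = 0.20 / 1_000_000  # USD per invocation (illustrative)

def lambda_cost(invocations: int, duration_ms: float, memory_mb: int) -> float:
    """Rough monthly Lambda bill, ignoring the free tier."""
    gb_seconds = invocations * (duration_ms / 1000) * (memory_mb / 1024)
    return gb_seconds * GB_SECOND_RATE + invocations * REQUEST_RATE

# One million 100 ms invocations at 128 MB cost well under a dollar:
print(round(lambda_cost(1_000_000, 100, 128), 2))  # → 0.41
```

Compare that to keeping even a single small VM running all month, and the appeal of billing per 100 ms becomes obvious.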
To understand the premise behind this blog, let’s take a look at some of the goodness Serverless tries to offer:
- Pay-as-you-go: This one should be obvious due to the inherent nature of Serverless.
- Managed servers: Your only concern is writing your functions’ code and registering it with your cloud provider. There is no overhead of maintaining any servers.
- Faster time-to-market: Again, since you do not need to deploy any fully-featured web application, your time-to-market goes down significantly. This is especially beneficial for up-and-coming startups.
- Effortless scaling: Serverless does not require you to worry about scale while writing your application. It inherently scales up and down as per demand, and it is the responsibility of the cloud provider to ensure this.
What is Function as a Service?
Function as a Service (FaaS) is a form of Serverless offering provided by various cloud providers. A few examples: AWS Lambda, Google Cloud Functions, and Azure Functions.
As was described in the AWS Lambda example in the previous section, FaaS is essentially a pay-as-you-go model. It is an event-triggered paradigm. Simply put, in response to certain events, the cloud provider runs your function.
- Do not assume that Serverless only means FaaS. Since FaaS is the most prominent form of Serverless, it is an easy assumption to make.
- On the face of it, FaaS might look similar to PaaS. If it does, try researching the differences a bit more; I am keeping that comparison out of the scope of this blog.
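The event-triggered shape described above is easy to see in code. Below is a minimal AWS-Lambda-style handler sketch: the provider calls your function with the triggering event, and you never touch the server it runs on. The event shape here (an S3-like record list with a nested object key) is a simplified, hypothetical example, not the exact payload any provider sends.

```python
def handler(event, context=None):
    """React to a batch of S3-style upload events (simplified shape)."""
    # Pull out the object keys that triggered this invocation.
    keys = [rec["s3"]["object"]["key"] for rec in event.get("Records", [])]
    return {"processed": keys}

# Locally, the "trigger" is just a function call with a fake event:
print(handler({"Records": [{"s3": {"object": {"key": "cat.jpg"}}}]}))
# → {'processed': ['cat.jpg']}
```

The key point: your code is a plain function plus an event contract, and everything else (routing the event, running the function, scaling) is the provider's job.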
How does FaaS scale?
In the FaaS world, you only write functions and do not worry about horizontal scaling. When the cloud provider receives an event to trigger a function, it spins up a container (not to be confused with a containerization technology such as Docker) and runs your function. This is an ephemeral container and may last either for only one invocation or for up to a few minutes (depending on the cloud vendor and various other conditions). When another function trigger is received in parallel to an already executing function, a new container will be spun up. This is how scaling works for FaaS.
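The container-reuse behavior above can be captured in a toy model: each incoming event grabs a free warm container if one exists, otherwise the provider "spins up" a new one, and finished containers linger for reuse. This is a deliberately simplified sketch, not how any real provider is implemented.

```python
class Provider:
    """Toy model of FaaS scaling via ephemeral, reusable containers."""
    def __init__(self):
        self.containers = []  # ids of all containers ever started
        self.free = []        # containers currently idle and reusable

    def invoke(self):
        if self.free:
            return self.free.pop()        # warm start: reuse a container
        cid = len(self.containers)        # cold start: spin up a new one
        self.containers.append(cid)
        return cid

    def finish(self, cid):
        self.free.append(cid)             # container lingers for reuse

p = Provider()
a = p.invoke()   # first request: cold start, container 0
p.finish(a)
b = p.invoke()   # next request: warm start, reuses container 0
c = p.invoke()   # parallel request while 0 is busy: new container 1
print(len(p.containers))  # → 2
```

Concurrency, not raw request count, is what forces new containers into existence, which is exactly why parallel traffic spikes trigger fresh cold starts.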
Cold start problem with FaaS
A warm start is when an existing function instance is free to serve a request. A cold start is when no function is available to serve a request, and a new one needs to be spun up for it. Spinning up a new function, however, introduces latency in its execution. This latency can be broken down into two parts:
- Time to spin up the container: Serverless does not mean no servers will be involved. At the end of it all, every bit of code needs a physical machine to run on. It is just that these servers/containers (where your code runs) will be managed by the cloud vendor. The time spent in booting up these vendor containers is out of your control.
- Setting up the runtime: This part of the cold start can be influenced by the developer through factors such as the choice of language used for writing the function, the number and size of the dependencies used by your code, etc. For example, a function written in Java would typically have a longer cold start than one written in Python.
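You can observe the runtime-setup part of a cold start from inside your own function. Work done at module import time (loading dependencies, opening clients) runs once per container; subsequent invocations in the same container skip it. The module-level flag below is a common pattern for detecting whether a given invocation was the container's first.

```python
# Anything at module level runs once per container, i.e. during the
# cold start. Heavy imports and client setup would live here.
_cold = True  # True only until the first invocation completes

def handler(event, context=None):
    global _cold
    was_cold = _cold
    _cold = False
    return {"cold_start": was_cold}

print(handler({}))  # first call in this "container" → {'cold_start': True}
print(handler({}))  # reused container → {'cold_start': False}
```

Logging a flag like this alongside your latency metrics makes it easy to see how much of your tail latency is cold starts rather than your own code.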
Typically, cold starts are not a major problem as long as your functions remain “warm” enough. However, sudden traffic spikes, combined with one cold function calling another cold function, can lead to disastrous cascading scenarios.
Keep your functions warm
There are various methods (read “hacks”) to keep your functions warm such as:
- Warming up your functions before expected spikes.
- Warming up your functions at regular intervals.
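The usual implementation of both hacks above is the same: a scheduled rule (e.g. a CloudWatch/EventBridge timer) invokes the function every few minutes with a sentinel payload, and the handler returns early on it so the ping stays cheap. The `"warmup"` key below is a convention of this hack, not an AWS feature, and `do_real_work` is a hypothetical placeholder.

```python
def handler(event, context=None):
    if event.get("warmup"):
        # Scheduled ping: keep the container alive, do no real work.
        return {"warmed": True}
    return {"result": do_real_work(event)}

def do_real_work(event):
    return event.get("value", 0) * 2  # placeholder business logic

print(handler({"warmup": True}))  # → {'warmed': True}
print(handler({"value": 21}))     # → {'result': 42}
```

Note that one scheduled ping only keeps one container warm; keeping N containers warm requires N concurrent pings, which is part of why these tricks are "hacks" rather than a real solution.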
AWS recently introduced something called “Provisioned Concurrency” for its FaaS offering (AWS Lambda). Using this feature, you can avoid the cold start problem by keeping aside a fixed number of functions in an always-running state. This number of always-up functions need not stay fixed either; you can change its value throughout the day depending on expected spike patterns. For example, a food delivery application might expect a spike around evening for dinner orders, so Provisioned Concurrency can be set to a higher value around that time. All requests will first be served by these provisioned functions. If demand exceeds supply, new on-demand functions will be created in the usual way (along with the associated cold start).
Also, note that for these always-available functions, you no longer pay as you go. You pay the regular IaaS way.
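The food-delivery example boils down to a time-of-day schedule for the provisioned count. A sketch of that schedule logic is below; the hours and numbers are made up, and in practice you would apply the chosen value on a timer via the Lambda or Application Auto Scaling APIs rather than compute it in your own code.

```python
BASELINE = 5       # always-warm instances during quiet hours (made-up)
DINNER_SPIKE = 50  # always-warm instances around the dinner rush (made-up)

def provisioned_concurrency_for(hour: int) -> int:
    """Desired number of always-warm function instances at a given hour."""
    if 18 <= hour < 22:   # evening dinner-order rush
        return DINNER_SPIKE
    return BASELINE

print(provisioned_concurrency_for(19))  # → 50
print(provisioned_concurrency_for(3))   # → 5
```

Every instance this returns is billed whether or not it serves a request, which is exactly the departure from pay-as-you-go discussed next.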
So, is FaaS truly Serverless?
Yes for some cases, and no for others. Serverless is supposed to be a managed, auto-scalable, on-demand offering. With recent trends such as Provisioned Concurrency, it is breaking away from its “managed” and “pay-as-you-go” philosophies.
For use-cases which will not be bothered by the cold start problem, FaaS seems to be truly Serverless. Examples include data processing functions, asynchronous non-real time calculations, etc.
However, if one tries to build a mission-critical application or a real-time API using workarounds such as Provisioned Concurrency, the true Serverless definition takes a hit. The main casualty is the “pay-as-you-go” philosophy, as the dedicated functions begin to morph our Serverless system into somewhat of a dedicated EC2/GCE cluster.
FaaS is not a one-size-fits-all solution. Using this, we can target certain classes of problems in a very cost effective manner. Think of a video sharing application which runs post processing after users submit their videos. You do not need to have always available dedicated clusters for the same. New functions can be spun up when needed.
However, there will be other categories of problems for which FaaS might not be an ideal solution. For example, many people will be hesitant to use cloud functions (in conjunction with an API gateway) to build online, low-latency, user-centric APIs. Another point to note is that functions, unlike dedicated VMs, cannot maintain a long-lived HTTP connection pool or a DB connection pool.
These are still early days for Serverless/FaaS and a lot of problems in areas such as tooling, observability, etc. need to be tackled. This is an exciting and a promising space. Let’s see how it evolves in the future. Until next time!