How To Build and Deploy a Serverless Machine Learning App on AWS
Scale even with a low budget
Have you ever wanted to build a machine learning application with a heavy model on the backend, a React user-friendly interface on the frontend, and a serverless cloud architecture all around so that your product is scalable to tens, hundreds, or even thousands of users?
Well, this is not as difficult as it sounds, and in this article, I will show how to build an application like this. To make this tutorial a bit fun, I’ll walk through the process of building Cartoonify.
Cartoonify is a toy application I made from scratch to turn pictures into cartoons. I know this may seem a bit funny, but believe me, there is serious work behind this app that you can leverage elsewhere. I'll let you judge for yourself by reading this article or checking the code on my GitHub.
Here’s what motivated me to start this project:
Give generative adversarial networks (GANs) a try. I’ve been fascinated by these models lately. Trying the CartoonGAN model to turn your face into a cartoon seemed like real fun.
Learn about deploying an application on a serverless architecture using different services of AWS (Lambda, API Gateway, S3, etc.).
Practice my React skills. I was used to Plotly, Dash, and Streamlit, and I wanted, for once, to build something custom and less mainstream.
Use Netlify to deploy this React app. I saw demos of how easy this process was, and I wanted to try it to convince myself.
Building Cartoonify was so fun I decided to talk about it in a video series. Here’s a short introduction:
0. Prerequisites To Build and Deploy Cartoonify
If you want to run and deploy Cartoonify, here are some prerequisites first:
An AWS account (Don’t worry, deploying this app will cost you almost nothing.)
A free Netlify account to deploy the React app
Docker to package the Lambda dependencies
node and npm to build the React frontend
torch and torchvision to test CartoonGAN locally (optional)
If you’ve got everything, you’re now ready to go.
Please follow these four next steps.
1. CartoonGAN — an AI artist?
This section provides some details about the model. If you’re not interested in it, you can move to the next part about deployment.
The main idea behind this network
Before testing CartoonGAN, the model behind this application, let’s talk about it first to understand the underlying architecture and how it was designed and trained to produce cartoon styles.
CartoonGAN was introduced in 2018 in this paper with the goal of transforming images into the style of different Japanese cartoons. What CartoonGAN achieves is called style transfer, an image transformation technique that was first introduced with neural networks in 2015 thanks to the work of Leon A. Gatys on neural style transfer. Style transfer was later enhanced in 2017 with CycleGAN architectures. CycleGAN approached style transfer by formalizing it as a translation problem between two image distributions, whereas neural style transfer framed it as an optimization problem.
CartoonGAN is a mix between neural style transfer and GANs. In fact, it uses a simple GAN architecture (not a cyclic one) and adds a semantic loss inspired by the neural style transfer technique to force the generator to preserve the content of the original image.
The authors went even further to make CartoonGAN’s results as visually appealing as possible: To do this, they added a second loss that forces the generator to create smooth edges, low-shaded colours, and less textured areas because these are visual features that are very specific to cartoons.
From GANs to CartoonGANs
To understand how CartoonGAN is built, let’s first understand how a typical GAN is trained to generate fake images.
A GAN is composed of two networks: a generator and a discriminator.
The goal of the generator is to take some input data (or noise) and create samples that look like a real-world image distribution, and the goal of the discriminator is to learn a boundary between the fake samples and the real ones.
These two networks are trained against each other in a min-max game inspired by game theory, where the goal of the generator is to fool the discriminator and the goal of the discriminator is to become better at identifying real data and fake data.
At the beginning of training, these two networks are very bad at their respective tasks, and as training goes on, they get better and better.
When they reach an equilibrium, the generator produces samples realistic enough to totally confuse the discriminator. At that point, we can throw the discriminator away and keep the generator for the generation task.
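To make the min-max game concrete, here is a self-contained toy sketch (not the CartoonGAN code): a 1-D "GAN" where the generator is just a learned shift applied to noise and the discriminator is a logistic classifier. The data distribution, learning rate, and update rules are illustrative assumptions.

```python
import math
import random

random.seed(0)

def sigmoid(t):
    # numerically stable logistic function
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

# Real data ~ N(4, 0.5); the generator shifts noise by a learned bias g,
# and the discriminator is D(x) = sigmoid(w*x + b).
w, b, g = 0.1, 0.0, 0.0
lr = 0.05

for _ in range(2000):
    x_real = random.gauss(4.0, 0.5)
    x_fake = random.gauss(0.0, 0.5) + g

    # Discriminator step: push D(real) -> 1 and D(fake) -> 0
    d_real = sigmoid(w * x_real + b)
    d_fake = sigmoid(w * x_fake + b)
    w += lr * ((1 - d_real) * x_real - d_fake * x_fake)
    b += lr * ((1 - d_real) - d_fake)

    # Generator step: push D(fake) -> 1, i.e. fool the discriminator
    x_fake = random.gauss(0.0, 0.5) + g
    d_fake = sigmoid(w * x_fake + b)
    g += lr * (1 - d_fake) * w

# After training, the generator's shift g has drifted toward the real mean,
# so fake samples land where the real data lives and D is close to 50/50.
```

The same tug-of-war happens in a real GAN, just with deep networks instead of two scalars and images instead of 1-D points.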
And what about generating cartoons?
Training a GAN to generate cartoon data from real photographs is a simple problem that can be framed by considering:
the real photographs as input data to the generator
the cartoon data as the real data distribution that the generator will try to fake as much as possible
The beauty of this setup is that you don’t need a one-to-one mapping between photographs and cartoons: In practice, this means that if you want to train a custom CartoonGAN to generate your favourite cartoon (let’s say The Simpsons, for example), all you have to do is gather some Simpsons images and real photographs with no correspondence whatsoever and then start your training.
To learn more about the training procedure behind CartoonGAN, have a look at the original paper.
Let’s now move to a more hands-on section, where we get to generate cartoon results ourselves.
Download the four pre-trained models first. These weights will be loaded inside the generator model defined in cartoongan/network/Transformer.py.
cd cartoongan
bash download_pth.sh
To test one of the four models, head over to the notebook cartoongan/notebooks/CartoonGAN.ipynb and change the input image path to your test image. This notebook calls the cartoongan/test_from_code.py script to make the transformation.
cd cartoongan/notebooks
jupyter notebook
The transform method takes a load_size argument to resize the input image in order to speed up the transformation. If you set it to a high value, the results will look better, but the processing will take more time. Throughout this project, we’ll set load_size to 450.
Here are some results obtained with a load_size of 800.
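As a rough sketch of what the load_size argument does (the exact resize rule used by the script is an assumption), the working resolution can be computed like this:

```python
def resized_shape(width, height, load_size=450):
    """Scale the longer side down to load_size, preserving the aspect ratio.
    A smaller load_size means a faster transformation but a coarser result."""
    ratio = load_size / max(width, height)
    return round(width * ratio), round(height * ratio)

# e.g. a 1600x800 photo processed at load_size=450 is transformed at 450x225
```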
You can watch this section on YouTube to learn more about GANs, the CartoonGAN model, and how to build the script test_from_code.py used in transforming the images.
2. Deploy CartoonGAN on a Serverless API Using AWS Lambda
The goal of this section is to deploy the CartoonGAN model on a serverless architecture so that it can be requested through an API endpoint… from anywhere on the internet.
Why does a serverless architecture matter?
In a serverless architecture using Lambda functions, for example, you don’t have to provision servers yourself. Roughly speaking, you only write the code that’ll be executed and list its dependencies, and AWS will manage the servers for you automatically and take care of the infrastructure.
This has a lot of benefits:
Cost efficiency: You don’t have to pay for a serverless architecture when you don’t use it. Conversely, when you have an EC2 machine running and not processing any request, you still pay for it.
Scalability: If a serverless application starts receiving a lot of requests at the same time, AWS will scale it by allocating more power to manage the load. If you had to manage the load yourself using EC2 instances, you would do this by manually allocating more machines and creating a load balancer.
Of course, serverless architectures aren’t a perfect fit for every use case. In some situations, they are not practical at all (need for real-time or quick responses, use of WebSockets, heavy processing, etc.).
Since I frequently build machine learning models and integrate them into web applications for inference only, I found that a serverless architecture was interesting in these specific use cases.
Cartoonify workflow
Let’s have a look at the architecture of the app before deploying the lambda function:
On the right side, we have a client built in React, and on the left side, we have a backend deployed on a serverless AWS architecture.
The backend and the frontend communicate with each other over HTTP requests. Here is the workflow:
An image is sent from the client through a POST request.
The image is then received via API Gateway.
API Gateway triggers a Lambda function to execute and passes the image to it.
The Lambda function starts running: it first fetches the pre-trained models from S3 and then applies the style transformation on the image it received.
Once the Lambda function is done running, it sends the transformed image back to the client through API Gateway again.
Deploy using the Serverless framework
We are going to define and deploy this architecture by writing it as a YAML file using the Serverless framework, an open source tool to automate deployment to AWS, Azure, Google Cloud, etc.
Isn’t this cool, writing your infrastructure as code?
Here are the steps to follow:
1. Install the Serverless framework on your machine.
npm install -g serverless
2. Create an IAM user on AWS with administrator access and name it cartoonify. Then configure serverless with this user’s credentials (I won’t show you mine — put in yours, buddy).
3. Bootstrap a serverless project with a Python template at the root of this project.
4. Install two Serverless plugins to manage the Lambda dependencies and prevent the cold start of the lambda function:
5. Create a folder called network inside backend and put the following two files in it:
Transformer.py: a script that holds the architecture of the generator model.
A blank __init__.py
6. Modify the serverless.yml file with the following sections:
The provider section where we set up the provider, the runtime, and the permissions to access the bucket. Note here that you’ll have to specify your own S3 bucket.
The custom section where we configure the plugins:
The package section where we exclude unnecessary folders from the production:
The functions section where we create the Lambda function, configure it, and define the events that will invoke it. In our case, the lambda function is triggered by a post request on API Gateway on the path transform.
The plugins section to list external plugins:
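Putting those sections together, a minimal serverless.yml could look like the sketch below. The region, runtime, memory size, bucket name, and plugin choices are assumptions to adapt to your own setup:

```yaml
service: cartoonify

provider:
  name: aws
  runtime: python3.7            # runtime is an assumption; match your Python version
  region: eu-west-1
  iamRoleStatements:            # permission to fetch the pre-trained weights
    - Effect: Allow
      Action:
        - s3:GetObject
      Resource: arn:aws:s3:::your-bucket-name/*   # replace with your own bucket

custom:
  pythonRequirements:
    dockerizePip: true          # build the Python dependencies inside Docker
  warmup:
    enabled: true               # ping the function periodically to avoid cold starts

package:
  exclude:                      # keep the deployment package small
    - node_modules/**
    - notebooks/**

functions:
  transform:
    handler: src/handler.lambda_handler
    memorySize: 3008            # the model needs room; the value is an assumption
    timeout: 30
    events:
      - http:                   # POST on /transform through API Gateway
          path: transform
          method: post
          cors: true

plugins:
  - serverless-python-requirements
  - serverless-plugin-warmup
```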
7. List the dependencies inside requirements.txt (at the same level of serverless.yml).
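For reference, requirements.txt might contain something like the sketch below. The exact packages and versions are assumptions; in practice you’ll want CPU-only builds of torch small enough to fit Lambda’s package size limits:

```
Pillow
torch
torchvision
```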
8. Create an src folder inside backend and put handler.py in it to define the lambda function. Then modify handler.py.
First, add the imports:
Define two functions inside handler.py:
img_to_base64_str to convert binary images to base64 strings
load_models to load the four pre-trained models inside a dictionary and then keep them in memory
And finally, the lambda_handler that will be triggered by the API Gateway:
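As a sketch of how these pieces fit together in handler.py (the style names, S3 keys, payload fields, and bucket name are assumptions; the heavy imports are deferred so the module also loads without torch installed):

```python
import base64
import io
import json

models = {}  # module-level cache: loaded once, reused across warm invocations

def img_to_base64_str(img_bytes):
    """Convert binary image data to a base64 string for the JSON response."""
    return base64.b64encode(img_bytes).decode("utf-8")

def load_models(s3_client, bucket):
    """Fetch the four pre-trained generators from S3 and keep them in memory.
    The style names and S3 keys below are assumptions; adapt them to your bucket."""
    import torch
    from network.Transformer import Transformer
    for style in ["hosoda", "hayao", "shinkai", "paprika"]:
        obj = s3_client.get_object(Bucket=bucket, Key=f"models/{style}.pth")
        model = Transformer()
        model.load_state_dict(
            torch.load(io.BytesIO(obj["Body"].read()), map_location="cpu"))
        model.eval()
        models[style] = model

def lambda_handler(event, context):
    import boto3
    if not models:  # cold start: download the weights once
        load_models(boto3.client("s3"), "your-bucket-name")
    body = json.loads(event["body"])
    img_bytes = base64.b64decode(body["image"])
    # ... apply models[body["model_id"]] to img_bytes as in test_from_code.py ...
    output = img_to_base64_str(img_bytes)  # placeholder: return the cartoon here
    return {
        "statusCode": 200,
        "headers": {"Access-Control-Allow-Origin": "*"},  # let the React app call us
        "body": json.dumps({"output": output}),
    }
```

Keeping the models dictionary at module level means warm Lambda invocations reuse the loaded weights instead of re-downloading them from S3.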
Now you’re done. The file structure of the backend should now look like this:
9. Start Docker before deploying.
10. Deploy the lambda function.
cd backend/
sls deploy
Deployment may take up to ten minutes, so go grab a ☕️.
What happens here, among many things, is that Docker will build an image of the Lambda deployment package, then Serverless will package the dependencies of this environment into a zip archive before uploading it to S3.
Once the Lambda function is deployed, you’ll be prompted with the URL of the API that you can request.
Go to Jupyter notebook to test it by loading an image, converting it to base64, and sending it inside a payload.
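A minimal request could look like the following sketch; the payload keys ("image", "model_id") and the response field ("output") are assumptions about the API schema:

```python
import base64
import json
from urllib import request

def build_payload(image_bytes, model_id="hosoda"):
    # keys are assumptions; model_id picks one of the four pre-trained styles
    return json.dumps({
        "image": base64.b64encode(image_bytes).decode("utf-8"),
        "model_id": model_id,
    })

def cartoonify(image_path, api_url):
    """POST an image to the deployed endpoint and return the cartoon as base64."""
    with open(image_path, "rb") as f:
        data = build_payload(f.read()).encode("utf-8")
    req = request.Request(api_url, data=data,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["output"]
```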
If you want to follow this section step by step so that you don’t miss anything, you can watch it on YouTube.
3. Build a React Interface
This section covers building a simple React interface to interact with the model.
I wanted this interface to be as user-friendly as possible to visualize the style transformation in a very simple way.
Luckily, I found this nice React component that allows you to compare two images side by side and go from one to the other by sliding a cursor.
Before running the React app and building it, you’ll need to specify the API URL of the model you just deployed. Go inside frontend/src/api.js and change the value of baseUrl.
Building the app (with npm run build) will create the build/ folder that contains a build of the application to be served on Netlify.
You can watch this section on YouTube to understand how the code is structured and the other React components being used.
4. Deploy the React App on Netlify
In this last section, we’ll cover deploying the front interface.
There are many ways to deploy a React app so that it goes live on the internet and anyone can have access to it. One of them is using Netlify: a great platform that automates building and deploying applications in many frameworks (React, Vue, Gatsby, etc.)
To be able to deploy on Netlify, you’ll need an account. It’s free: Head over to Netlify to sign up.
Then you’ll need to install netlify-cli:
npm install netlify-cli -g
Authenticate the Netlify client with your account:
netlify login
And deploy:
cd app/
netlify deploy
Netlify will ask you for the build folder (enter “build”) and a custom name for your app (this will appear as a subdomain of netlify.com). I’ve already picked “cartoonify,” but you can choose another one.
And this should be it! Now your app is live!
But wait! There’s something wrong with the URL: It’s prefixed with an alphanumerical code, and you didn’t want that, right?
That’s because you deployed a draft URL!
To have a clean URL, you’ll have to deploy by specifying the prod option:
netlify deploy --prod
You can watch this section on YouTube for a live demo to understand how easy the deployment on Netlify can be.
5. Next: Want to Contribute?
If you’ve made it this far, I sincerely thank you for your time!
If you like this project and want to improve it, be my guest: I’m open to pull requests, and here’s the repo of the project.
If you have a machine learning project where you need to quickly deploy an API on the cloud and request it through a custom frontend, this project should be a boilerplate to kickstart your work, so feel free to reuse the code.
Happy hacking!