Dynamic Image content resizing with Lambda

·

8 min read

Dynamic Image content resizing with Lambda

When should you read this guide

If:

  1. You have a significant amount, possibly millions of images with various usage patterns that you serve to website guests

  2. If these images are not static images, but dynamic, generated on the user request

  3. If you care about load times/SEO impact and cost

You should not use this guide:

  1. If you have a few static images that can be easily pre-sized to appropriate size

  2. If you're unable to make front-end code changes in image fetching.

Introduction

When transferring images in web and mobile applications, it's common to face challenges with image size optimization, as images are often sent without proper resizing, causing long transfer times, especially when handling multiple images.

This is typically not a problem with static images that can be pre-sized to various resolutions. However, if your web app doesn't yet know which images it needs to render, this issue can become more significant.

This solution involves enabling any front-end to request images in various sizes. This would be determined by the visitor’s device resolution, allowing for the delivery of an image that is optimally sized for each specific device.

Let's do first 4 different images sizes, however it's up to you how many you pick:

  • Small

  • Medium

  • Large

  • Extra Large

Users can request the size most suitable for their needs. For instance, a mobile device might require a 'small' image, whereas a desktop could be better served with an 'extra large' one.

Each size category scales the original image as follows, while maintaining the aspect ratio for height:

  • Small: 150px width

  • Medium: 300px width

  • Large: 600px width

  • Extra Large: 1200px width

Again, these sizes can be different.

Avoiding exact measurements for each image, such as 500x300, is strategic. It prevents the creation of an excessive number of image sizes (we'll store them later on in S3), which would otherwise increase storage costs and potentially slow down network transfers.

AWS tech we'll use

  • S3 (static site, redirection rules, generic S3 knowledge)

  • Lambda function (node js, lambda layers)

  • API gateway (generic API gateway knowledge)

  • Npm, build node projects, etc.

  • Cloudfront (for caching images)

High level overview

There’re 3 players in this game.

1. S3 bucket (this stores the original images, and the resized images)
2. API gateway (this is used as an entry point for non-resized images)
3. Lambda function (this is used to resize images and redirect to the new image)

  1. User requests image via Cloudfront URL. Something like https://somethingsomething.cloudfront.net/small/img/catimage.png

  2. Cloudfront checks its own cache if it has the image. If yes, it serves it back to the user. In this case, the process stops.

  3. If Cloudfront doesn't have it in its cache, it goes to S3 to retrieve the resized file. If S3 has the resized file, it returns it to Cloudfront, which caches it and returns it to the user.

  4. If S3 also doesn't have the resized image, it results in a 404, and hence forwards the call to API gateway.

  5. API gateway in return forwards the call to the Lambda function, which resizes the image

  6. Lambda then places it in the S3 bucket and returns a new URL to the user.

As you see this would only need to execute once for each image. Any further calls would then get the resized image from Cloudfront.

The setup

Below you’ll find the steps to set up the solution.

I’ll not detail a generic S3, Lambda or API Gateway setup or permissions. There’re plenty of tutorials online. I’ll mention all the peculiarities though.

S3 initial setup

  1. Set up an S3 bucket and dump some images on it

  2. Set up public access to it

  3. Set the permissions to allow public access (later on you can limit this to Cloudfront)

  4. Enable static website hosting

At this point, you should be able to access all files in the bucket from anywhere.

Lambda + API gateway setup

  1. create a function and attach a trigger to API gateway.

  2. Leave the gateway open to public.

  3. Set up Lambda permissions to S3 bucket

Here’s the policy I used (for logging and S3 PutObject)

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": "arn:aws:logs:*:*:*"
        },
        {
            "Effect": "Allow",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::img.yourbucket/*"
        }
    ]
}

4. Create a node project on your local PC and add an index.js file with these details. This is the code that does the resizing. Feel free to modify it as you wish. We'll upload this to Lambda.

'use strict';

const AWS = require('aws-sdk');
const S3 = new AWS.S3({
  signatureVersion: 'v4',
});
const Sharp = require('sharp');

const BUCKET = process.env.BUCKET;
const URL = process.env.URL;

exports.handler = function(event, context, callback) {
  const key = event.queryStringParameters.key;
  const match = key.match(/(small|medium|large|xlarge)\/(.*)\.(jpg|jpeg|png)/);

  const scale = match[1];
  const imageName = match[2];
  const imageExtension = match[3];
  const imagePath = imageName + '.' + imageExtension;

  const sizes = new Map();
  sizes.set('small', 150);
  sizes.set('medium', 300);
  sizes.set('large', 600);
  sizes.set('xlarge', 1200);

  const newSize = sizes.get(scale);
  let contentType;
  if  (["png"].includes(imageExtension.toLowerCase())) contentType = 'image/png';
  if  (["jpg", "jpeg"].includes(imageExtension.toLowerCase())) contentType = 'image/jpeg';


  S3.getObject({Bucket: BUCKET, Key: imagePath}).promise()
    .then(data => Sharp(data.Body)
      .resize({
        fit: Sharp.fit.contain,
        width: newSize
      })
      .toBuffer()
    )
    .then(buffer => S3.putObject({
        Body: buffer,
        Bucket: BUCKET,
        ContentType: contentType,
        Key: key,
      }).promise()
    )
    .then(() => callback(null, {
        statusCode: '301',
        headers: {'location': `${URL}/${key}`},
        body: '',
      })
    )
    .catch(err => callback(err))
}

Here’s the package.json.

{
  "name": "image-resize",
  "version": "1.0.0",
  "description": "Serverless image resizing",
  "readme": "Serverless image resizing",
  "main": "index.js",
  "scripts": {
    "build-copy": "npm install && mkdir -p nodejs && cp -r node_modules nodejs/ && zip -r  {file-name}.zip nodejs"
  },
  "devDependencies": {
    "aws-sdk": "^2.1046.0",
    "sharp": "^0.29.3"
  }
}

5. npm install --arch=x64 --platform=linux This will create a node_modules folder and a package lock file.
Because Sharp package has a different package for arch linux (which is used by lambda), you’d need to install this type. Please see here.

6. Create a nodejs folder and copy node_modules folder into it. Zip the whole nodejs folder
7. Create a new layer in lambda and upload this zip as your layer. This is needed, so you can separate your dependencies from your actual code.

8. Zip your package.json, your json lock and index.js. Upload this file as your code to Lambda

9. Attach the previously created layer to this code. This can be version one.

10. Add new environment variables. One for the bucket and one for the url path. See example below

11. At this point your lambda function should work and you can test it by calling your API gateway EP. You can see your error logs for lambda in Cloudwatch.
The URL will be something like this: 123456789.execute-api.us-west-1.amazonaws.c..

Where /default/ is your stage you set and /image-resizer is your lambda function name.

Add redirection to S3 bucket

To have this all working together, you need to tell the S3 bucket to redirect calls that fail to retrieve any object.

Add a redirection rule to your static site settings:

[
    {
        "Condition": {
            "HttpErrorCodeReturnedEquals": "404"
        },
        "Redirect": {
            "HostName": "123456789.execute-api.eu-west-1.amazonaws.com",
            "HttpRedirectCode": "307",
            "Protocol": "https",
            "ReplaceKeyPrefixWith": "default/image-resizer?key="
        }
    }
]

Where HttpErrorCodeReturnedEquals is a condition that triggers when a 404 (not found) is returned from the bucket on a particular path.

HostName is your API gateway hostname

HttpRedirectCode is what you tell the user’s browser on the response (307 is temp. redirect)

Protocol is the secure http protocol

ReplaceKeyPrefixWith is used to replace incoming paths with other paths (along with KeyPrefixEquals tag in the condition block, which isn’t there now). In our case it simply adds the default/image-resizer?key= just after the domain name, making the rest of the path a query parameter instead, which we can parse in the Lambda function.

Add Cloudfront + HTTPS

Add a CloudFront distribution in front of your S3 bucket that contains the images. When you add your S3 bucket, make sure you don’t pre-select the bucket AWS offers you.
If you do so regardless, AWS will not use the static site endpoint (that has the redirection set up). The issue you’ll face will be a tricky one, as your available images will actually serve perfectly fine via Cloudfront, only the new (not yet resized) images will give “Access Denied” response. This response is actually quite misleading because when an object doesn’t exist on S3, you get this error from CloudFront.

So to avoid that, make sure you use the static site EP URL.

Additionally, change your lambda functions URL environment variable to point to Cloudfront instead of the S3 bucket. This will help your browser serve HTTPS images at all times (even on first call/cache)

This will have the following advantages:

  • it'll save your money using Cloudfront instead of S3

  • speed up network requests for users

Additional improvements to consider

Add Lifecycle rules for resized images only

This would be quite handy, as some images are only accessed once, and never again. Still, we’ll end up paying for the storage costs of those images. Deleting those images when not needed anymore would “free up space” on the bucket.

Reason why it’s recommended:
- not using an S3 bucket for storing an image that isn't used means saving money.

WebP is a modern image format designed for the web. It is both smaller than jpg and better quality.
(see: WebP Compression Study | Google for Developers )

Sharp gives you the option to convert images to WebP on the fly, further reducing the image size.

Reason why it’s recommended:
- smaller image size means smaller load times for users and faster pages
- extremely easy to implement and switch to webP at any time on our lambda function
- if you replace all your jpg and png with WebP, you'll save money on storage costs in S3