How to Deploy Playwright on AWS EC2


Deploying Playwright on AWS EC2 is a versatile solution for automating browser tasks such as web scraping, end-to-end testing, and interacting with modern web applications. However, it needs extra system dependencies and some configuration to run smoothly.

In this guide, we’ll cover everything from selecting the appropriate instance type to installing necessary dependencies and configuring the environment for optimal performance, followed by running an example to capture screenshots. We’ll also look at using separately hosted browsers to simplify things.

The EC2 deployment we'll cover in this guide


Selecting Instance Configurations and Operating Systems

When deploying Playwright on AWS EC2, selecting an appropriate instance type ensures smooth performance. A solid basic configuration is a t3.medium or t3.large instance with 4–8 GB of RAM. For storage, around 10 GB is recommended to handle browser binaries and temporary files.

Playwright does not support Amazon Linux natively, which means dependencies have to be downloaded separately. We’ve included the list of dependencies for Amazon Linux below. 

That said, Playwright officially supports Ubuntu and can install its dependencies with a single command. The commands below are for Ubuntu.

Step 1: Launch and Connect to an EC2 Instance

To get started, launch an EC2 instance with sufficient storage and connect to it (for example, over SSH). You’ll install Node.js and Playwright using the following commands. Playwright downloads its own browser binaries during setup, so there’s no need to install browsers separately unless your use case requires a specific configuration.
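If you prefer to script the launch instead of using the console, the instance settings recommended earlier (a t3.medium with about 10 GB of storage) can be expressed as parameters for the EC2 RunInstances API. The sketch below only builds the parameter object. The `buildRunInstancesParams` helper, the AMI ID, and the key pair name are illustrative placeholders rather than values from this guide, and the `/dev/sda1` root device name is an assumption that is typical for Ubuntu AMIs but should be checked against the AMI you choose.

```javascript
// Build parameters for the EC2 RunInstances API call.
// amiId and keyName are placeholders that you must supply yourself.
function buildRunInstancesParams({
  amiId,
  keyName,
  instanceType = 't3.medium',
  volumeSizeGiB = 10,
}) {
  if (!amiId) throw new Error('amiId is required');
  return {
    ImageId: amiId,
    InstanceType: instanceType,
    KeyName: keyName,
    MinCount: 1,
    MaxCount: 1,
    BlockDeviceMappings: [
      {
        DeviceName: '/dev/sda1', // assumption: root device name of the chosen AMI
        Ebs: { VolumeSize: volumeSizeGiB, VolumeType: 'gp3', DeleteOnTermination: true },
      },
    ],
  };
}

// Usage with the AWS SDK for JavaScript v2 (requires AWS credentials):
// const AWS = require('aws-sdk');
// const ec2 = new AWS.EC2({ region: 'us-east-1' });
// ec2.runInstances(buildRunInstancesParams({ amiId: 'ami-xxxxxxxx', keyName: 'my-key' })).promise();
```

Keeping the parameters in a plain function makes it easy to bump the instance type or disk size later without touching the code that calls AWS.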

Step 2: Install Node.js


# Refresh the package index, then install curl if not already installed
sudo apt-get update
sudo apt-get install -y curl

# Install Node.js (v22.x): Download and run the setup script
curl -fsSL https://deb.nodesource.com/setup_22.x -o nodesource_setup.sh
sudo -E bash nodesource_setup.sh

# Install Node.js
sudo apt-get install -y nodejs

# Verify Node.js installation
node -v

Step 3: Install Playwright and Dependencies


npm init playwright@latest

This command will prompt you to install the browsers and their system dependencies. If anything goes wrong, you can install the dependencies separately using the following command:


sudo npx playwright install-deps
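Once the installation finishes, a quick way to confirm that the browser and its system libraries are in place is to launch it once and load a trivial page. The following is a minimal sketch, not part of Playwright itself: `smokeTest` is a hypothetical helper name, and it takes the browser type as an argument so the same check works for Chromium, Firefox, or WebKit.

```javascript
// Launch a browser, load a tiny inline page, and report what was observed.
// browserType is a Playwright BrowserType such as chromium, firefox, or webkit.
async function smokeTest(browserType) {
  const browser = await browserType.launch();
  try {
    const page = await browser.newPage();
    // A data: URL avoids any dependency on network access
    await page.goto('data:text/html,<title>playwright-ok</title>');
    return { version: browser.version(), title: await page.title() };
  } finally {
    // Always close the browser so no stray process is left behind
    await browser.close();
  }
}

// Usage:
// const { chromium } = require('playwright');
// smokeTest(chromium)
//   .then((result) => console.log(result))
//   .catch((err) => { console.error(err); process.exit(1); });
```

If a shared library is missing, the launch step is where it usually fails, and the error message names the library that could not be loaded.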

Dependencies for Amazon Linux (skip this if you are using Ubuntu)

Trying to install dependencies with the Playwright tool above gives the following warning:

BEWARE: your OS is not officially supported by Playwright;
installing dependencies for ubuntu20.04-x64 as a fallback. 

Therefore, we install all the dependencies using the following command, which has been tested with Node.js 16:


sudo yum install atk at-spi2-atk cups-libs libdrm libxcb libxkbcommon at-spi2-core libX11 libXcomposite libXdamage libXext libXfixes libXrandr mesa-libgbm pango cairo alsa-lib

Please note that xcb and xkbcommon are not available under those names in the default Amazon Linux repositories. We therefore install them using the following package names, which provide the same libraries:

  • xcb is part of libxcb.
  • xkbcommon is part of libxkbcommon.
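Because the package list is maintained by hand on Amazon Linux, it can help to check whether the browser binary still has unresolved shared libraries after installation. One common approach is to run `ldd` against the binary and look for entries marked "not found". The sketch below shows only the parsing step; `findMissingLibraries` is a hypothetical helper, and the commented usage assumes `ldd` is available on the instance.

```javascript
// Extract the names of libraries that ldd reports as "not found".
function findMissingLibraries(lddOutput) {
  return lddOutput
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.endsWith('=> not found'))
    .map((line) => line.split('=>')[0].trim());
}

// Usage (run on the instance after installing Playwright):
// const { execFileSync } = require('child_process');
// const { chromium } = require('playwright');
// const output = execFileSync('ldd', [chromium.executablePath()], { encoding: 'utf8' });
// console.log(findMissingLibraries(output));
```

An empty array suggests the dependencies listed above cover what the binary needs. Any names it returns can be mapped to packages with your package manager.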

Example Code: Save Screenshots to S3

Let’s test some code that captures a screenshot of a webpage and uploads it to an S3 bucket.

First, create a new JavaScript file that will contain the code. You can run the following command in your terminal to create the file and open it in a text editor:


touch screenshot.js
nano screenshot.js

Then, copy and paste the following code into the file:


const { chromium } = require('playwright');
const AWS = require('aws-sdk'); // AWS SDK for JavaScript v2
const fs = require('fs');

// Configure AWS SDK to use S3
const s3 = new AWS.S3({
  region: 'your-region', // e.g., 'us-east-1'
});

// Get URL from command-line arguments
const url = process.argv[2];
if (!url) {
  console.error('Please provide a URL as the first argument');
  process.exit(1);
}

(async () => {
  // Launch a new browser instance
  const browser = await chromium.launch();

  // Define the local path where the screenshot will be saved temporarily
  const localScreenshotPath = '/tmp/example.png';

  try {
    const page = await browser.newPage();

    // Navigate to the provided webpage
    await page.goto(url);

    // Capture the screenshot and save it to the local directory (/tmp for EC2)
    await page.screenshot({ path: localScreenshotPath });
    console.log('Screenshot captured locally!');
  } finally {
    // Close the browser even if navigation or capture fails
    await browser.close();
  }

  // Define the S3 bucket and key (filename in the S3 bucket)
  const bucketName = 'your-s3-bucket-name';
  const s3Key = `screenshots/${url.replace(/https?:\/\//, '').replace(/\//g, '_')}.png`;

  // Read the screenshot file from local directory
  const screenshotFile = fs.readFileSync(localScreenshotPath);

  // Upload the screenshot to S3
  const uploadParams = {
    Bucket: bucketName,
    Key: s3Key,
    Body: screenshotFile,
    ContentType: 'image/png'
  };

  // Wait for the upload to finish so that failures are reported
  const data = await s3.upload(uploadParams).promise();
  console.log(`Screenshot uploaded to S3 at ${data.Location}`);
})().catch((err) => {
  console.error('Screenshot job failed:', err);
  process.exit(1);
});

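This code takes the URL of the website we want to capture as a command-line argument.

Before running it, install the AWS SDK with npm install aws-sdk, and make sure the instance has credentials that allow s3:PutObject on the bucket (for example, through an IAM instance role). Then run the following command and, on a successful run, check that the screenshot has been saved to your S3 bucket.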


node screenshot.js https://example.com
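The script above writes every capture to the same /tmp/example.png file, so two overlapping runs could overwrite each other, and its S3 key leaves characters such as ? in place. As a minimal sketch of one alternative, the code below uploads the screenshot bytes directly from memory (`page.screenshot()` resolves to a Buffer when no path is given) and derives a timestamped key. The helper names `screenshotKey` and `captureToS3` are hypothetical, and the `s3` argument is assumed to be an `AWS.S3` client from the AWS SDK for JavaScript v2.

```javascript
// Turn a URL into an S3 key that is safe and unique per capture.
function screenshotKey(url, now = new Date()) {
  const safe = url.replace(/^https?:\/\//, '').replace(/[^a-zA-Z0-9.-]+/g, '_');
  const stamp = now.toISOString().replace(/[:.]/g, '-');
  return `screenshots/${safe}_${stamp}.png`;
}

// Capture a page and upload the PNG bytes directly, without a temp file.
// browserType: a Playwright BrowserType; s3: an aws-sdk v2 AWS.S3 client.
async function captureToS3({ browserType, s3, bucket, url }) {
  const browser = await browserType.launch();
  let body;
  try {
    const page = await browser.newPage();
    await page.goto(url);
    body = await page.screenshot(); // Buffer, because no path is given
  } finally {
    await browser.close();
  }
  const params = { Bucket: bucket, Key: screenshotKey(url), Body: body, ContentType: 'image/png' };
  const data = await s3.upload(params).promise();
  return data.Location;
}

// Usage:
// const { chromium } = require('playwright');
// const AWS = require('aws-sdk');
// captureToS3({ browserType: chromium, s3: new AWS.S3({ region: 'us-east-1' }),
//   bucket: 'your-s3-bucket-name', url: 'https://example.com' }).then(console.log);
```

Passing the browser type and S3 client in as arguments also makes the function easy to exercise with stand-ins before pointing it at real AWS resources.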

Maintenance Tips and Challenges

Playwright and EC2 are a great combination, but the setup requires careful maintenance.

One of the most important aspects is dependency management. Playwright, along with its browser binaries, frequently releases updates to stay in sync with modern web standards. 

These updates often bring new features, optimizations, and security fixes, which makes it essential to regularly update your Playwright version to avoid potential compatibility issues or vulnerabilities.

You will also need to keep an eye out for memory leaks. Issues such as zombie processes and browsers not closing properly can gradually increase the resources needed to keep the automations running smoothly.
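One way to reduce the risk of browsers being left open is to route every job through a small wrapper that closes the browser no matter how the job ends. The following is a sketch under stated assumptions: `withBrowser` is a hypothetical helper, and the 60-second default timeout is an arbitrary illustrative value, not a recommendation from Playwright.

```javascript
// Run task(browser) and always close the browser afterwards.
// If the task takes longer than timeoutMs, reject so cleanup still happens.
async function withBrowser(browserType, task, timeoutMs = 60000) {
  const browser = await browserType.launch();
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`Task timed out after ${timeoutMs} ms`)),
      timeoutMs
    );
  });
  try {
    return await Promise.race([task(browser), timeout]);
  } finally {
    clearTimeout(timer);
    await browser.close(); // runs on success, failure, and timeout
  }
}

// Usage:
// const { chromium } = require('playwright');
// const title = await withBrowser(chromium, async (browser) => {
//   const page = await browser.newPage();
//   await page.goto('https://example.com');
//   return page.title();
// });
```

The wrapper does not stop a runaway page from consuming memory while it runs, so it complements rather than replaces monitoring the instance for leftover browser processes.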

Run Playwright with Browserless to Keep Things Simple

To take the hassle out of scaling your scraping, screenshotting or other automations, try Browserless.

It takes a quick connection change to use our thousands of concurrent Chrome browsers. Try it today with a free trial.

The Easy Option: Connect Playwright to Our Browser Pool

Hosting Playwright is easy; it's the browsers that cause the issues. To simplify your setup, use our pool of thousands of concurrent browsers with just a change in endpoint.



// Before: launching Chrome locally
// const browser = await playwright.chromium.launch();

// After: connecting to Firefox via Browserless
const browser = await playwright.firefox.connect(`wss://production-sfo.browserless.io/firefox/playwright?token=GOES_HERE`);


You can either host just playwright-core without the browsers, or use our REST APIs. There are also residential proxies, stealth options, HTML exports, and other commonly needed features.
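When hosting only playwright-core, it helps to keep the API token out of the source code and build the endpoint in one place. The sketch below is illustrative: `browserlessEndpoint` is a hypothetical helper and `BROWSERLESS_TOKEN` is an assumed environment variable name. The host and path follow the Firefox example in this article, and the `wss://` scheme is used because Playwright's `connect` expects a WebSocket endpoint. Passing other browser names is an assumption that should be checked against the Browserless docs.

```javascript
// Build a Browserless Playwright endpoint from a token.
// Host and path shape mirror the Firefox example in this article.
function browserlessEndpoint(
  token,
  browserName = 'firefox',
  host = 'production-sfo.browserless.io'
) {
  if (!token) throw new Error('A Browserless API token is required');
  return `wss://${host}/${browserName}/playwright?token=${encodeURIComponent(token)}`;
}

// Usage with playwright-core (which ships without browser binaries):
// const { firefox } = require('playwright-core');
// const browser = await firefox.connect(browserlessEndpoint(process.env.BROWSERLESS_TOKEN));
// const page = await browser.newPage();
// await page.goto('https://example.com');
// await browser.close();
```

Failing fast on a missing token gives a clearer error than a rejected connection.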

Check out the docs