
Playwright vs Puppeteer: Which one to choose for browser automation?

George Gkasdrogkas, August 19, 2022

Bonus: if you like our content and this “playwright vs puppeteer” article, you can join our web automation Slack community.

What is browser automation?

Browser automation is the process of simulating user tasks in a web browser. In recent years, its importance as a core tool for everything from automating internal tools to web scraping to E2E tests has led to the birth of several automation platforms and libraries. This article compares two of the most widely used automation libraries, Puppeteer and Playwright, and demonstrates how easy it is to get started with browser automation using a service with a free plan like Browserless.

A brief introduction to browser automation history

Browser automation is not a new concept. It started in the 2000s with the need for a reliable testing framework to simulate user interactions within a web application’s UI. One of the first pioneers in the market was Selenium, created at ThoughtWorks in 2004. It was the de facto choice for nearly a decade. However, it wasn’t without its flaws. Selenium tests were flaky, unstable, and resource-heavy (the Selenium driver had to rely on full-blown browser instances). Eventually, the idea of a headless browser was introduced to tackle those issues. The most notable example is PhantomJS, which was deprecated by its author in 2018. Since 2017, both Chrome and Firefox have supported interfaces for remotely controlling browser instances.

The browser automation market took off in the early 2010s, when browser-based SaaS apps began to take over as the standard for software applications. Soon, many companies and use cases began to benefit from browser automation. This gave rise to several browser automation services, such as Browserless, and libraries such as Puppeteer and Playwright.

Browserless: fast, scalable, and reliable web browser automation

There are two options for browser automation: run a headless browser instance directly on your computer, or use an online platform. The advantage of the latter is that it provides a URL to a remote browser instance, so you don’t need to allocate memory and disk space on your computer or manage the browser installations yourself.

Browserless is an online headless automation platform that provides fast, scalable, and reliable web browser automation, ideal for data analysis tasks and E2E tests. It’s an open source browser automation platform with more than 4.7K stars on GitHub. Some of the largest companies worldwide use the platform daily to conduct QA testing and data collection tasks (for example, here is how Samsara uses Browserless for their stress test automation). The coding examples in this article are written against the Browserless platform, so we will take a few lines to show how easy it is to register a free account.


The platform offers a free plan to start with, and paid plans if we need more processing power. The free tier gives up to 6 hours of usage, which is more than enough for evaluating the platform’s capabilities or for simple use cases.

After completing the registration process, the platform supplies us with an API key. We will use this key to access the Browserless services later on.
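As a small sketch of how the key might be handled, the helper below builds the connection URL in the same shape as the endpoints used in this article’s examples. The function name `browserlessEndpoint` and the `BROWSERLESS_API_KEY` environment variable are assumptions made for illustration, not part of any Browserless SDK.

```javascript
// Hypothetical helper (not part of any SDK): builds a WebSocket endpoint
// in the same shape as the URLs used in this article's examples.
function browserlessEndpoint(token, { path = '', stealth = false } = {}) {
  if (!token) {
    throw new Error('Missing API key: set BROWSERLESS_API_KEY first');
  }
  // encodeURIComponent keeps unusual characters in the key from breaking the URL.
  const query = `token=${encodeURIComponent(token)}${stealth ? '&stealth' : ''}`;
  return `wss://chrome.browserless.io${path}?${query}`;
}

// Example: read the key from the environment instead of hardcoding it.
// const endpoint = browserlessEndpoint(process.env.BROWSERLESS_API_KEY, { stealth: true });
```

Keeping the key in an environment variable rather than in the source file makes it harder to commit it to version control by accident.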


Puppeteer: Headless Chrome Node.js API 

Let’s dive deeper into browser automation with Puppeteer.

Puppeteer is a popular open source JavaScript library, released a few months after headless Chrome shipped in 2017. It counts more than 79K stars on GitHub and is actively maintained. It was developed by the Chrome DevTools team. The library can drive Chrome, Chromium (the open source version of Chrome), or Firefox. It is distributed as an NPM package, which also downloads a compatible version of Chromium. When we don’t need a local browser, for example when connecting to a remote one, we can use the puppeteer-core package, which provides all the functionality of Puppeteer without downloading the browser. This reduces both the dependencies and the final project size.

Puppeteer was created to be an automation tool. It has a relatively simple API and, in terms of performance, is fast because the library talks to the underlying browser over the Chrome DevTools Protocol through a simple WebSocket connection. To demonstrate how easy it is to get started, we are going to scrape some basic info from a YouTube video. We’ve already shared how to web scrape YouTube videos with Puppeteer here.
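The difference between the two packages can be sketched roughly as follows. The helper name `openBrowser` is hypothetical, and the `puppeteer` argument stands for whichever package was imported; `connect` and `launch` are the library’s own entry points.

```javascript
// `puppeteer` is the imported package: 'puppeteer' downloads a browser,
// 'puppeteer-core' does not.
async function openBrowser(puppeteer, browserWSEndpoint) {
  if (browserWSEndpoint) {
    // Remote browser: nothing is downloaded or started locally.
    return puppeteer.connect({ browserWSEndpoint });
  }
  // Local browser: the full 'puppeteer' package finds its downloaded build;
  // with 'puppeteer-core' you would need to pass an executablePath yourself.
  return puppeteer.launch();
}
```

With puppeteer-core and a remote endpoint, only the first branch is ever used, which is why the smaller package is enough in that setup.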

import puppeteer from 'puppeteer-core';

const BROWSERLESS_API_KEY = '***';

async function getYoutubeVideoStatistics(videoURL) {
  // Connect to a remote Chrome instance hosted by Browserless.
  const browser = await puppeteer.connect({
    browserWSEndpoint: `wss://chrome.browserless.io?token=${BROWSERLESS_API_KEY}&stealth`,
  });

  try {
    const page = await browser.newPage();
    await page.goto(videoURL);

    // waitForSelector resolves once the element is in the DOM (or throws on
    // timeout), so the handles below are never null on a slow-rendering page.
    const titleElement = await page.waitForSelector('h1 > yt-formatted-string[class="style-scope ytd-video-primary-info-renderer"]');
    const title = await titleElement.evaluate((el) => el.textContent);

    const viewCountElement = await page.waitForSelector('ytd-video-view-count-renderer[class="style-scope ytd-video-primary-info-renderer"] > span');
    const views = await viewCountElement.evaluate((el) => el.textContent);

    const likesCountElement = await page.waitForSelector('yt-formatted-string#text[class="style-scope ytd-toggle-button-renderer style-text"]');
    const likes = await likesCountElement.evaluate((el) => el.textContent);

    await page.close();

    return { title, views, likes };
  } finally {
    // Always release the remote browser, even if a selector times out.
    await browser.close();
  }
}

const videoStatistics = await getYoutubeVideoStatistics('https://www.youtube.com/watch?v=wZXgPuY5rUQ');
console.log(videoStatistics);


Playwright browser automation & reliable end-to-end testing

Playwright is another popular open-source library. Microsoft released the first public version in 2020, and the library is considered the spiritual successor to Puppeteer; it started as a fork of Puppeteer when many of its Google contributors moved to Microsoft. As a result, the API and the underlying design are similar to Puppeteer’s in many aspects. It counts more than 41K stars on GitHub and is actively maintained. The library can drive Chromium (and Chromium-based browsers such as Chrome and Edge), Firefox, and WebKit, the engine behind Safari. Like Puppeteer, it bundles a compatible browser, but there’s also a barebones version, playwright-core. A key difference, however, is the supported languages: Playwright provides versions of its library for JavaScript (Node.js through NPM), Python, Java, and C#.

Playwright offers many features that are built for a test framework rather than an automation framework such as Puppeteer. Some of these features include automatically waiting for elements to be available, built-in support for selecting elements by text, and isolated sessions (browser contexts) on the same browser instance, to name a few. Let’s demonstrate basic usage of the library by following the same example as before.

import playwright from 'playwright-core';

const BROWSERLESS_API_KEY = '***';

async function getYoutubeVideoStatistics(videoURL) {
  // Connect to a remote browser hosted by Browserless.
  const browser = await playwright.chromium.connect(
    `wss://chrome.browserless.io/playwright?token=${BROWSERLESS_API_KEY}`
  );

  try {
    // A context is an isolated session with its own cookies and storage.
    const context = await browser.newContext();
    const page = await context.newPage();

    await page.goto(videoURL);

    // Locators wait for the element before reading it; first() mirrors the
    // first-match behavior of page.$ if a selector matches several nodes.
    const title = await page
      .locator('h1 > yt-formatted-string[class="style-scope ytd-video-primary-info-renderer"]')
      .first()
      .textContent();

    const views = await page
      .locator('ytd-video-view-count-renderer[class="style-scope ytd-video-primary-info-renderer"] > span')
      .first()
      .textContent();

    const likes = await page
      .locator('yt-formatted-string#text[class="style-scope ytd-toggle-button-renderer style-text"]')
      .first()
      .textContent();

    await page.close();

    return { title, views, likes };
  } finally {
    // Always release the remote browser, even if a locator times out.
    await browser.close();
  }
}

const videoStatistics = await getYoutubeVideoStatistics('https://www.youtube.com/watch?v=wZXgPuY5rUQ');
console.log(videoStatistics);
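Two of the Playwright features mentioned earlier, isolated sessions and text-based selection, can be sketched as small helpers. The names `withIsolatedContext` and `clickByText` are made up for this illustration; `newContext`, `newPage`, and `getByText` are Playwright calls (`getByText` assumes a reasonably recent Playwright release).

```javascript
// Runs `task` in a fresh browser context (its own cookies and storage),
// then closes the context even if the task throws.
async function withIsolatedContext(browser, task) {
  const context = await browser.newContext();
  try {
    const page = await context.newPage();
    return await task(page);
  } finally {
    await context.close();
  }
}

// Clicks an element found by its visible text. Playwright waits for the
// element to be actionable before clicking, so no manual wait is needed.
async function clickByText(page, text) {
  await page.getByText(text, { exact: true }).click();
}
```

A call such as `withIsolatedContext(browser, (page) => clickByText(page, 'Sign in'))` would then run against a clean session each time, which is the property that makes contexts useful for tests.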

Playwright vs Puppeteer: which one is right for you?

Now that we covered the basics of each library, which one should we use in our projects? Both libraries share many similarities and some substantial differences. When making the right choice, we should consider a couple of things.

First, we will address the elephant in the room: Puppeteer is a Node.js library, while Playwright supports more programming languages. This can be a defining factor in choosing one over the other. Apart from that, both libraries have stable APIs and are actively maintained. Playwright also supports WebKit, the engine behind Safari, which Puppeteer does not, and there seems to be no plan to add it.
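To make the cross-browser point concrete, here is a minimal sketch that runs the same steps on each engine Playwright can launch. The helper name `userAgentsAcrossEngines` is hypothetical, and the `playwright` argument stands for the imported package with its browsers installed locally.

```javascript
// Opens `url` in each engine and records the user agent string it reports.
// `playwright` is the imported 'playwright' package.
async function userAgentsAcrossEngines(
  playwright,
  url,
  engines = ['chromium', 'firefox', 'webkit']
) {
  const results = {};
  for (const name of engines) {
    const browser = await playwright[name].launch();
    try {
      const page = await browser.newPage();
      await page.goto(url);
      results[name] = await page.evaluate(() => navigator.userAgent);
    } finally {
      await browser.close();
    }
  }
  return results;
}
```

The loop body is identical for every engine, so one script can cover all three.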

It’s also worth mentioning that Playwright offers a more robust API for automated testing than Puppeteer does. While you can easily integrate Puppeteer into your test suites, Playwright ships its own test runner, so there is no need for third-party testing libraries. However, you should make sure you really need cross-browser testing: some teams have decided it is not worth running extensive tests on multiple browsers when most of their users are on Chrome anyway.
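For the built-in testing support mentioned above, the browsers to test against are usually chosen in the test runner’s configuration. The fragment below is a sketch of what a Chromium-only setup might look like; the directory name and retry count are placeholder values chosen for illustration.

```javascript
// Sketch of a Playwright Test configuration object.
const config = {
  testDir: './tests', // placeholder: folder containing the test files
  retries: 1,         // placeholder: retry a failed test once
  projects: [
    // Each project runs the whole suite in one browser engine.
    { name: 'chromium', use: { browserName: 'chromium' } },
    // Uncomment to add engines once cross-browser coverage is worth the time:
    // { name: 'firefox', use: { browserName: 'firefox' } },
    // { name: 'webkit', use: { browserName: 'webkit' } },
  ],
};

// In a real playwright.config.js this object would be exported,
// for example with: module.exports = config;
```

Starting with a single project and adding engines later keeps test runs short while leaving the door open for broader coverage.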

Puppeteer has a bigger community than Playwright, so it’s easier to find solutions for technical issues. That larger community has also produced many plugins, add-ons, and integrations for Puppeteer that do not exist for Playwright. At the end of the day, of course, the decision is up to you. Both libraries are excellent choices for most automation workflows. The best part is that their APIs are very similar, so you can migrate easily between the two as long as you are not constrained by cross-browser support or by programming language.
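As a rough illustration of how close the two APIs are, the function below only uses page methods that exist under the same names in both libraries (`goto`, `title`, and `$eval`), so it should accept a page from either one. The function name `summarizePage` and the choice of the `h1` selector are assumptions made for this sketch.

```javascript
// Accepts a Puppeteer page or a Playwright page: goto, title and $eval
// are available on both.
async function summarizePage(page, url) {
  await page.goto(url);
  return {
    title: await page.title(),
    // $eval runs the callback in the browser against the first matching element
    // and throws if nothing matches.
    heading: await page.$eval('h1', (el) => el.textContent.trim()),
  };
}
```

In practice the main differences when migrating tend to sit at the edges, such as how the browser is connected (an options object in the Puppeteer example versus a plain endpoint string in the Playwright example above), rather than in page-level code like this.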

Puppeteer vs Playwright, final thoughts

In this article, we presented a brief history of browser automation, introduced Browserless, an online service with a free plan, and discussed the key features, pros, and cons of each library. The final choice will depend on your use case. We suggest trying both and sticking with the one you like the most.

If you like our content, we have many tutorials on our blog on scraping different websites like YouTube, Twitter, and Glassdoor. You can also check out how our clients use Browserless for different use cases.
