WaitUntil option for Puppeteer and Playwright

Alejandro Loyola
August 22, 2023

Picking the best waitUntil option can be tricky. Puppeteer and Playwright offer several options for deciding when a site counts as “done” loading. The waitUntil option you choose determines which resources the browser waits for before your script continues.

Figure: Events fired in the Puppeteer and Playwright navigation timeline. Note: the networkidle events are prone to timing out.

If you need to scale your browser automation (data extraction, testing, PDF, or screenshot generation), Browserless is for you. For any of these tasks, you want to load every resource you need in the shortest time possible.

Why do I need to use the waitUntil option?

It lets you define explicitly when the site is ready for further processing. The best option depends on the following factors:

  • The network behavior of the target site.
  • Which library you're using.
  • Your use case.

Ideal event to use based on the network behavior of the target site.

The order in which a page’s events fire is useful to know, because it tells you which event fits your use case. Both Playwright’s documentation and Puppeteer’s documentation cover these events, though with some differences. Here’s a quick summary:

  1. commit: fired when the response headers have been parsed and the session history is updated, before the document itself has loaded (Playwright only).
  2. domcontentloaded: fired when the HTML document has been loaded and parsed.
  3. load: fired after the page has executed its initial scripts and loaded resources such as stylesheets and images.
  4. networkidle2: fired when there have been no more than 2 network connections for at least 500 ms (Puppeteer only).
  5. networkidle0 (Puppeteer) or networkidle (Playwright): fired when there have been no network connections for at least 500 ms.
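
To make the difference between the two networkidle events concrete, here is a rough simulation of the rule described above. It is not how either library is implemented internally; `RequestSpan`, `networkIdleAt`, and the sample request timings are all made up for illustration.

```typescript
// Lifetime of one request, in ms since navigation started.
// Use end = Infinity for a connection that never closes (e.g. long polling).
interface RequestSpan {
  start: number;
  end: number;
}

// Returns the time at which a networkidle-style event would fire: the first
// moment the in-flight count has stayed <= maxInflight for idleMs in a row.
// Returns null if that never happens (a real navigation would time out).
function networkIdleAt(
  spans: RequestSpan[],
  maxInflight: number,
  idleMs = 500,
): number | null {
  const events = spans
    .flatMap((s) => [
      { time: s.start, delta: 1 },
      { time: s.end, delta: -1 },
    ])
    // Connections that never close produce no "end" event.
    .filter((e) => Number.isFinite(e.time))
    // Chronological order; at equal times, process closes before opens.
    .sort((a, b) => a.time - b.time || a.delta - b.delta);

  let inflight = 0;
  let quietSince: number | null = 0; // nothing is in flight at t = 0

  for (const e of events) {
    // Did the quiet window complete before this event happened?
    if (quietSince !== null && e.time - quietSince >= idleMs) {
      return quietSince + idleMs;
    }
    inflight += e.delta;
    if (inflight > maxInflight) {
      quietSince = null; // too busy, the window is reset
    } else if (quietSince === null) {
      quietSince = e.time; // just became quiet enough again
    }
  }
  return quietSince === null ? null : quietSince + idleMs;
}

// Hypothetical page: three short requests plus one long-polling connection.
const demoSpans: RequestSpan[] = [
  { start: 0, end: 200 }, // document
  { start: 150, end: 400 }, // stylesheet
  { start: 300, end: Infinity }, // long polling, never closes
  { start: 350, end: 700 }, // fetch call
];

console.log(networkIdleAt(demoSpans, 2)); // 900  (networkidle2 fires)
console.log(networkIdleAt(demoSpans, 0)); // null (networkidle0 never fires)
```

In this made-up timeline the long-polling connection keeps one request open forever, so the zero-connection rule is never satisfied, while the two-connection rule is satisfied 500 ms after the in-flight count drops back to 2.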

Differences between Puppeteer and Playwright

The default waitUntil event is load in both libraries, which usually strikes a good balance between speed and completeness. Playwright additionally has the commit event, which may be worth trying on sites that deliver the information you want very early, such as a header from the site, a check that it’s online, or the page’s title.

networkidle

This option is useful when the load option isn’t quite loading everything you need. Use it with care: in our experience it is the option most prone to timing out (you can use the timeout option to raise the default 30-second timeout).

  • Use networkidle2 (Puppeteer) if the site uses polling techniques. With polling, network connections stay open after the first data arrives because the site keeps sending data over time.
  • Use networkidle0 (Puppeteer) if the site uses fetch requests or something similar that closes each network connection once the data has been delivered (common in SPAs).
  • Playwright only has the networkidle event, which is the equivalent of Puppeteer's networkidle0. Playwright's documentation advises against using it for testing and recommends relying on web assertions to assess readiness instead.
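
One way to limit the timeout risk is to navigate with load and then give the network a bounded amount of time to settle, rather than making the whole navigation depend on a networkidle event. The sketch below assumes Puppeteer's page.goto() and page.waitForNetworkIdle(); `SettlablePage`, `gotoThenSettle`, and the 5-second budget are hypothetical names and values chosen for illustration. The interface lists only the two methods the sketch calls, so a real Puppeteer page should satisfy it and a stub can stand in for tests.

```typescript
// Minimal structural type covering only the two Page methods used here.
interface SettlablePage {
  goto(
    url: string,
    options?: { waitUntil?: "load" | "domcontentloaded"; timeout?: number },
  ): Promise<unknown>;
  waitForNetworkIdle(options?: {
    idleTime?: number;
    timeout?: number;
  }): Promise<unknown>;
}

type SettleResult = "idle" | "still-busy";

// Navigates with waitUntil: "load", then waits up to settleTimeoutMs for the
// network to go quiet. A busy network is reported, not treated as a failure.
async function gotoThenSettle(
  page: SettlablePage,
  url: string,
  settleTimeoutMs = 5_000, // hypothetical budget, tune per site
): Promise<SettleResult> {
  await page.goto(url, { waitUntil: "load", timeout: 30_000 });
  try {
    await page.waitForNetworkIdle({ idleTime: 500, timeout: settleTimeoutMs });
    return "idle";
  } catch (err) {
    // Assumption: timeouts surface as an Error whose name is "TimeoutError".
    if (err instanceof Error && err.name === "TimeoutError") {
      return "still-busy";
    }
    throw err; // anything else is a genuine failure
  }
}
```

With a real page this would be called as `await gotoThenSettle(page, "https://www.nytimes.com/")`. In Playwright, a comparable bounded wait could use page.waitForLoadState("networkidle", { timeout }) after page.goto().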

When using Playwright, this option often makes little difference because of Playwright's auto-waiting feature. For instance, if a page.click() call directly follows a page.goto() call, Playwright automatically waits for the target element to be attached to the DOM, visible, able to receive events, and more before clicking, which often makes the networkidle option redundant.

Which event to target for your use case

Essentially, the fewer resources you need the site to load, the earlier on the timeline you can pick your event. Loading fewer resources also helps with issues like high CPU and memory usage, since loading network resources consumes a significant amount of both. The suggestions below are a rough guide, since the right choice depends on your target website’s behavior.

Scraping and Data Extraction - commit, domcontentloaded, or load.

PDF, Screenshots, and Screencasting - load, networkidle (Playwright), or networkidle2 / networkidle0 (Puppeteer).

Automation and Testing - load and use web assertions (Playwright) or page.$eval() (Puppeteer) to assess readiness.
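
The suggestions above can be written down as a small lookup table. This is only a sketch of the rough guide in this post, not a rule from either library; `UseCase`, `Library`, and `suggestWaitUntil` are hypothetical names, and the candidates are listed from earliest to latest on the timeline.

```typescript
type Library = "puppeteer" | "playwright";
type UseCase = "scraping" | "rendering" | "testing";

// Candidate waitUntil values per use case, earliest event first.
// "rendering" covers PDFs, screenshots, and screencasting.
const SUGGESTED: Record<UseCase, Record<Library, readonly string[]>> = {
  scraping: {
    playwright: ["commit", "domcontentloaded", "load"],
    puppeteer: ["domcontentloaded", "load"], // commit is Playwright only
  },
  rendering: {
    playwright: ["load", "networkidle"],
    puppeteer: ["load", "networkidle2", "networkidle0"],
  },
  testing: {
    // Pair load with explicit readiness checks instead of networkidle.
    playwright: ["load"],
    puppeteer: ["load"],
  },
};

function suggestWaitUntil(
  useCase: UseCase,
  library: Library,
): readonly string[] {
  return SUGGESTED[useCase][library];
}

console.log(suggestWaitUntil("rendering", "puppeteer"));
```

Starting with the first candidate and moving to a later one only when content is missing keeps navigation as fast as the target site allows.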

Sample of each waitUntil option in Puppeteer and Playwright

For demonstration purposes, we took a screenshot with each waitUntil option at "https://www.nytimes.com/", since it's a site with a lot of network activity. Keep in mind that the times we measured for each event may vary slightly depending on your location and worker capacity.

Playwright's results

Screenshot #1 - commit
Screenshot #2 - domcontentloaded
Screenshot #3 - load
Screenshot #4 - networkidle

Puppeteer's results

Screenshot #1 - domcontentloaded
Screenshot #2 - load
Screenshot #3 - networkidle2
Screenshot #4 - networkidle0

Wrapping up

I hope this helped you understand the waitUntil options available in Puppeteer and Playwright.

Feel free to reach out to support@browserless.io if you have any questions or refer to the official documentation of each library.
