Sanely debugging puppeteer and fixes to common issues

February 6, 2019

Ever drop a page.screenshot call into your code every so often to see what the browser is doing? Or sprinkle console.log('Step #1') throughout your script? While handy when you need some quick feedback, these methods of debugging remind me of troubleshooting JavaScript issues back in the days of Internet Explorer 5. We’ve come a long way since then, and there are much easier ways to see what’s going on in your puppeteer code, including the “gold standard” of Chrome’s developer tools. Getting it all running properly requires a bit of setup work, but it’s well worth it.

It’s my hope that, by the end of this post, you’ll have much better tools and methods for debugging some of the tougher problems in headless Chrome. And if not, you’ll at least know where to get started!

Using Chrome’s devtools remotely

Before we dive into some of the more common issues people run into when writing puppeteer scripts, let’s first get devtools running. If your script is portable enough, you can skip most of this and just use our online debugger; otherwise, keep reading.

Now, every instance of Chrome out there can be debugged remotely. The trick to doing so is that you have to know the debugger’s port ahead of time, otherwise you cannot get a session to connect to it. By convention this port is generally 9222, and most tools out there (Chrome’s remote devtools included) tend to automatically work with this port number. To get puppeteer running with this specific port you’ll have to tell it to do so, otherwise it will randomize that port for you automatically. Doing so is fairly simple:


const browser = await puppeteer.launch({
  args: ['--remote-debugging-port=9222'],
});
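
If you’d like to confirm from inside the script which port Chrome actually ended up listening on, browser.wsEndpoint() returns the debugger’s WebSocket URL. Here’s a minimal sketch of reading the port out of it; debugPortFromEndpoint is a hypothetical helper name of ours, not part of puppeteer:

```javascript
// Hypothetical helper (not part of puppeteer): pull the port out of a
// DevTools WebSocket URL such as the one returned by browser.wsEndpoint().
function debugPortFromEndpoint(wsEndpoint) {
  return Number(new URL(wsEndpoint).port);
}

// Usage inside an async function, after puppeteer.launch():
//   console.log(debugPortFromEndpoint(browser.wsEndpoint()));
```

If the number printed isn’t 9222, the launch argument most likely didn’t make it through to Chrome.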

With that chore out of the way, the next step is to open Chrome (the one already on your operating system, not puppeteer’s copy) and head to the chrome://inspect page. You might need to enable some settings in Chrome for this page to work, and there’s a great document on that here. We don’t need to do anything on this page just yet, but keep it open since we’ll come back to it later.

Now, with all that in place, let’s try a simple script to see if we’ve got everything set up properly. Here’s the one we’ll use as a starting point:


// index.js
const puppeteer = require('puppeteer');
async function run() {
    const browser = await puppeteer.launch({
        args: ['--remote-debugging-port=9222'],
    });
    const [ page ] = await browser.pages();
    // Give us time to load the debugger (a plain timer works on any puppeteer version)
    await new Promise((resolve) => setTimeout(resolve, 5000));
    await page.goto('https://example.com');
}
run();

And we’ll have node begin executing that with the following command:


$ node index.js

Quickly we’ll flip back over to our installed Chrome app and load that chrome://inspect page.

Huzzah! We’re nearly there! We now only need to click the “inspect” link, which will trigger a pop-up of a very-handy remote debugger, complete with a small rendering of the site.

With our handy Swiss Army knife of debugging in our pockets, we’re ready to begin finding the root cause of some common puppeteer issues!

Issue #1: Blank pages rendering

One of the more common issues we see in browserless is pages rendering blank. It’s especially jarring when the same site works just fine locally. Before we had this handy debugger around, we were left with only our wits to guide us, but now that we can see everything, we should be able to zero in on the issue.

Revisiting our prior code, let’s adjust it so that we visit the site that’s rendering blank:


// index.js
const puppeteer = require('puppeteer');
async function run() {
    const browser = await puppeteer.launch({
        args: ['--remote-debugging-port=9222'],
    });
    const [ page ] = await browser.pages();
    // Give us time to load the debugger (a plain timer works on any puppeteer version)
    await new Promise((resolve) => setTimeout(resolve, 5000));
    await page.goto('https://my-blank-site.com');
}
run();

Let’s also get our Chrome devtools running. Don’t worry if you don’t load it in time: you can simply use the debugger’s built-in address bar and re-enter the URL there to navigate again. Doing this lets us see the request and response inside the Network tab in case we missed them from our page.goto call.
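
If you’d rather not race the debugger at all, the same information can be printed from the script itself. Here’s a rough sketch using puppeteer’s response and requestfailed page events; describeResponse and logNetwork are hypothetical helper names, and the output format is just one we picked:

```javascript
// Hypothetical helper: turn a puppeteer response into a one-line summary.
function describeResponse(response) {
  return `${response.status()} ${response.url()}`;
}

// Attach the loggers before calling page.goto so nothing is missed.
function logNetwork(page, log = console.log) {
  page.on('response', (response) => log(describeResponse(response)));
  page.on('requestfailed', (request) => {
    // failure() returns null when no error details are available.
    const failure = request.failure();
    log(`FAILED ${request.url()} ${failure ? failure.errorText : ''}`);
  });
}

// Usage, right after grabbing the page:
//   logNetwork(page);
//   await page.goto('https://my-blank-site.com');
```

A 403 or an unexpected redirect on the very first line of output is usually a strong hint that the site is treating the headless browser differently.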

Now that we have that, we can see what the response to our network call actually is. In the majority of cases the culprit is the default User-Agent that puppeteer’s headless Chrome sends, and to get any further we’ll need to set it to a real browser’s agent (Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.109 Safari/537.36 for instance). This might not fix the exact issue for your site, so I’d recommend the following as well:

Launch in non-headless mode. Yes it’s possible for sites to detect this.
Use the stable build of Chrome and not the one that comes with puppeteer.
Finally, read more about browser fingerprinting techniques.
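
For the User-Agent change mentioned above, here’s a small sketch built on browser.userAgent() and page.setUserAgent(). It assumes the default agent contains the token HeadlessChrome, which is what headless Chrome has historically sent; stripHeadless and useRegularUserAgent are hypothetical names of ours:

```javascript
// Hypothetical helper: swap the "HeadlessChrome" token in the default
// User-Agent for a plain "Chrome". Assumes the token is present.
function stripHeadless(userAgent) {
  return userAgent.replace('HeadlessChrome', 'Chrome');
}

// Call this before page.goto so the first request carries the new header.
async function useRegularUserAgent(browser, page) {
  const defaultAgent = await browser.userAgent();
  await page.setUserAgent(stripHeadless(defaultAgent));
}

// Usage inside run():
//   await useRegularUserAgent(browser, page);
//   await page.goto('https://my-blank-site.com');
```

Deriving the agent from the browser’s own string, rather than hard-coding one, keeps the reported Chrome version in line with the build that’s actually running.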

Issue #2: Webfonts not rendering properly

Another issue we’ve seen from time to time is webfonts not rendering properly, namely those loaded from services like Google Fonts. While this is often an issue with timing rather than detection (make sure to set your waitUntil to something like networkidle0), the differences between local Chrome and headless can be subtle.

For instance, requesting the font “Amatic SC” in local Chrome triggers a network request for the woff2 font. However, that same page in headless will issue a request for the ttf variant. While this is just a simple difference in font format, the edge cases can be pretty spectacular, and can include head-scratchers like the font not rendering at all.

The solution here? You guessed it: set a legit user-agent. After doing so, most font services will realize who’s asking for the font and return the appropriate format. That’s something that’s hard to tell without having our debugger around.
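
To rule out the timing side of things, here’s a sketch that waits for the network to settle and then for the page’s own document.fonts.ready promise (a standard browser API) before you take a screenshot or PDF. gotoAndWaitForFonts is a hypothetical helper name:

```javascript
// Sketch: navigate, wait for the network to go quiet, then wait for the
// browser to report that its fonts have finished loading.
async function gotoAndWaitForFonts(page, url) {
  await page.goto(url, { waitUntil: 'networkidle0' });
  // This callback runs inside the page, where document is available.
  await page.evaluate(async () => {
    await document.fonts.ready;
  });
}

// Usage inside run():
//   await gotoAndWaitForFonts(page, 'https://example.com');
//   await page.screenshot({ path: 'fonts.png' });
```

If the fonts still look wrong after this, timing probably isn’t the problem, and the format the font service returns is the next thing to check.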

Issue #3: Blank WebGL elements

If you’re looking to export a nice PDF of your dashboard or web-app with visualizations, chances are you’ll eventually run into something that uses WebGL internally. And when you go to export this nice page to a PDF via the page.pdf method, you’ll likely notice some strange blank boxes inside of your app. So, what gives?

Loading up the debugger will show you that everything actually does render properly in the browser, so this one is a bit trickier. If you go a step further and call toDataURL() on the <canvas> element, you’ll notice that the resulting image is also blank. The fix here? Make sure you always set preserveDrawingBuffer to true when creating the WebGL context. This, for some reason, allows Chrome to render these elements properly inside of a PDF, and our best guess is that it has to do with how Blink interacts with canvas elements.
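
In raw WebGL, preserveDrawingBuffer is one of the attributes passed to getContext, so the change belongs in the page’s own code rather than in your puppeteer script. Here’s a minimal sketch; getPreservedWebGLContext is a hypothetical helper name, and if you’re using a rendering library you’ll need to find its equivalent option:

```javascript
// Sketch: request a WebGL context whose drawing buffer is kept around
// after each frame instead of being cleared.
function getPreservedWebGLContext(canvas) {
  return canvas.getContext('webgl', { preserveDrawingBuffer: true });
}

// Usage in the page's code:
//   const gl = getPreservedWebGLContext(document.querySelector('canvas'));
```

The attributes only take effect the first time a context is created for a given canvas, so this has to happen before anything else asks that canvas for a context.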

Summary

Even though we only covered three common issues, we hope this post has given you the tools you need to debug your next puppeteer project more quickly. If you’re looking to run these scripts in the cloud, give browserless a shot, or check out the live debugger. In either case you’ll now be well equipped to handle whatever oddities may come your way!