We’re more than excited to announce today that we’ve rewritten browserless.io from the ground up, and are in the process of preparing browserless 2.0 for general availability! We wanted a number of technical innovations and improvements to how the service functions, and that is what pushed us toward a rewrite. There’s a lot to talk about, so let’s get started with a shortlist of the new features:
- Support for Firefox and WebKit!
- An improved recording solution
- A much more robust and secure /function API
- Better support for long-running sessions
- A NodeJS SDK for writing your own routes
- Numerous back-end improvements and performance fixes
And of course, all of this works with our recently released residential proxies.
Why we did it
A rewrite should never be taken lightly, and with browserless 2.0 we didn’t originally set out to rewrite the entire project from scratch. However, given that the project is now over 5 years old(!), we found that there were numerous features that we wanted in browserless but just couldn’t easily implement. As an example: adding Firefox support would have been almost impossible in the current version without introducing a lot of instability to the project.
Using a semantic version of 2.0.0 also indicates that this is a major breaking release, which gave us permission to remove some things we’ve wanted to drop for quite some time. For example, we’ve dropped support for Selenium in 2.0 as it added a lot of technical challenges to the project. We’ll get into that more below.
Finally, we wanted to take advantage of numerous other packages and modules in the ecosystem that we just couldn’t use in browserless today. Browserless 2.0 uses ECMAScript modules internally, which would have been a big challenge to adopt in the current repo.
Aside from a ground-up rewrite, the biggest feature we have today is support for the three major browsers: Chrome, Firefox, and WebKit. Because browserless now supports all three, we’re also going to change the name of the project from browserless/chrome to just “browserless”, as it’s grown beyond the confines of Chrome. Most of the things you use today should just redirect to the new repo location, and we’ll be cognizant of other changes to make sure things just work going forward.
Browserless 2.0 is now a simple NodeJS HTTP service. No other frameworks or routing technologies are used, so we can keep the code as light and flexible as possible. Because of this, we now have a foundation that can support both WebSocket and HTTP routing – something we didn’t easily have access to in 1.0. We’ll continue to make improvements in this area as needed, including potentially open-sourcing the routing layer by itself.
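To give a feel for what framework-free routing on a plain NodeJS HTTP server can look like, here’s a minimal sketch. It is not the browserless source: the `HttpRoute` and `WebSocketRoute` types, `findRoute`, `createService`, and the `/health` route are all hypothetical names made up for illustration. The only real APIs used are Node’s built-in `http` server and its `upgrade` event, which is how WebSocket connections arrive on the same server as regular HTTP requests.

```typescript
import { createServer } from "node:http";
import type { IncomingMessage, Server, ServerResponse } from "node:http";
import type { Duplex } from "node:stream";

// Hypothetical route shapes for illustration; not the browserless API.
type HttpRoute = {
  method: string;
  path: string;
  handler: (req: IncomingMessage, res: ServerResponse) => void;
};

type WebSocketRoute = {
  path: string;
  handler: (req: IncomingMessage, socket: Duplex, head: Buffer) => void;
};

// Return the first route whose path matches, and whose method matches
// if the route declares one. Query strings are ignored when matching.
export function findRoute<T extends { path: string; method?: string }>(
  routes: T[],
  method: string,
  url: string,
): T | undefined {
  const { pathname } = new URL(url, "http://localhost");
  return routes.find(
    (r) => r.path === pathname && (r.method === undefined || r.method === method),
  );
}

// Build one server that handles both plain HTTP and WebSocket upgrades.
// The caller decides when (and on which port) to call listen().
export function createService(
  httpRoutes: HttpRoute[],
  wsRoutes: WebSocketRoute[],
): Server {
  const server = createServer((req, res) => {
    const route = findRoute(httpRoutes, req.method ?? "GET", req.url ?? "/");
    if (!route) {
      res.writeHead(404, { "Content-Type": "text/plain" });
      res.end("Not Found");
      return;
    }
    route.handler(req, res);
  });

  // WebSocket handshakes show up as "upgrade" events, not normal requests.
  server.on("upgrade", (req, socket, head) => {
    const route = findRoute(wsRoutes, "GET", req.url ?? "/");
    if (!route) {
      socket.destroy();
      return;
    }
    route.handler(req, socket, head);
  });

  return server;
}

// A made-up example route table.
const httpRoutes: HttpRoute[] = [
  {
    method: "GET",
    path: "/health",
    handler: (_req, res) => {
      res.writeHead(200, { "Content-Type": "text/plain" });
      res.end("ok");
    },
  },
];
```

Starting the service would then be a matter of calling `createService(httpRoutes, []).listen(3000)`, where the port is again just an example. Keeping the route table as plain data is one way to let HTTP and WebSocket handlers share the same matching logic.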
With all that, here’s a fairly comprehensive list of new features:
- Support for the big 3 browser vendors.
- ECMAScript modules are used internally.
- Removed all frameworks and routing libraries, replacing them with our own.
- Better APIs around the big use-cases we see: static assets, data, and testing. Mostly backwards compatible for PDFs, screenshots and scraping.
- Fully modular: you can write and extend parts of the codebase to fit your needs.
- All TypeScript with an npm package making it easier for you to extend features.
- A brand new /function API that runs in the context of the browser and not in Node.
- New query-string parameters that are better organized for easier authoring of scripts.
- Built-in documentation on the container image for that version of browserless.
- Moving to a new docker container repository – stay tuned on that.
- Slimmer, faster, and lighter containers. Not nearly as bulky.
- A new dashboard/admin area for the containers. Stay tuned on that!
This is just the beginning and we have much much more to come. This rewrite gives us a solid base to implement exactly what browserless needs without all the overhead of the prior repo.
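Since the new /function API runs your code in the browser rather than in Node, the code has to travel to the service as source text. Here’s a rough sketch of how a client might build such a request. Everything specific in it is an assumption made for illustration and not confirmed API: the `localhost:3000` host, the `token` query-string parameter, the `application/javascript` content type, the default-exported function receiving `{ page }`, and its `{ data, type }` return shape. `buildFunctionRequest` is a hypothetical helper.

```typescript
// Hypothetical request shape; kept as plain data so it is easy to inspect.
type FunctionRequest = {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
};

// Build (but do not send) a POST to /function.
// ASSUMPTIONS: the `token` parameter and the content type are illustrative.
export function buildFunctionRequest(
  baseUrl: string,
  token: string,
  code: string,
): FunctionRequest {
  const url = new URL("/function", baseUrl);
  url.searchParams.set("token", token);
  return {
    url: url.toString(),
    init: {
      method: "POST",
      headers: { "Content-Type": "application/javascript" },
      body: code,
    },
  };
}

// The function body is sent as a string because it is meant to run in the
// browser, not in this process. ASSUMPTION: the signature and return shape
// below are a guess at what such a function could look like.
const code = `
export default async function ({ page }) {
  await page.goto("https://example.com");
  return { data: { title: await page.title() }, type: "application/json" };
}
`;

const request = buildFunctionRequest("http://localhost:3000", "YOUR_TOKEN", code);
// To actually send it: await fetch(request.url, request.init);
```

One practical consequence of running in the browser is that the submitted code presumably can’t rely on Node-only modules such as `fs`, so anything it needs should come from the page or from the request itself.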
Obviously we couldn’t keep everything around, nor did we want to. We decided on all of these removals in order to provide the best experience for modern libraries. This wasn’t easy, and I’m sure a few of us will greatly miss them, but we think it’s for the best.
- Dropping Selenium support as a library: there’s already established technology here and we aren’t innovating in this area.
- Support for KEEP-ALIVE and PREBOOT is now dropped since they’re confusing, improve little, and cause a lot of bugs.
- We will be moving to a new container host platform; keep an eye on that.
- Many smaller features like certain query-string parameters and more.
We have some very very cool things coming soon: the ability to watch sessions natively as they run, remote hybrid sessions, the ability to load extensions and run other custom plugins, better debugging tools, and even a dashboard! All of this would be nearly impossible if it weren’t for the new back-end components we’ve been working on over the last year. Here’s a short list of features we’re working on for the future:
- Support for multiple tokens, and ways to limit the amount of service/time they can use.
- LDAP and other external authentication systems.
- A more elaborate session workflow: create a session, reconnect to a browser in its prior state, and more.
- Extension API to upload your own Chrome extensions.
- A new way to “multiplex” multiple clients into a single browser without them running into each other.
- Other forms of access-control to make this a federated service for your own internal needs.
We intend to keep browserless/chrome supported with security and dependency fixes for the near future. We realize with every breaking change there can be numerous issues and frustrations, which is why we haven’t had a breaking change in over 5 years. We definitely want to hear your feedback, so let us know what you think and what might make things better in the future.
Where can I check it out?
You can check out the development pull request here to keep an eye on progress.