jesprenj · a year ago
What I really dislike about current browser automation tools is that they all use TCP to connect the browser to the controlling program. This means that, unlike with UNIX domain sockets, filesystem permissions (user/group restrictions) cannot be used to protect the TCP socket, which opens the browser automation ecosystem to many attacks where 127.0.0.1 cannot be trusted (untrusted users on a shared host).

I have yet to see a browser automation tool that does not use localhost-bound TCP sockets. Apart from that, most tools do not offer strong authentication -- a browser is spawned, it listens on a socket, and when the controlling application connects to the browser management socket, no authentication is required by default, which creates hidden vulnerabilities.

While browser sessions may only be controlled by knowing their random UUIDs, creating new sessions is usually possible for anyone on 127.0.0.1.

I don't know really, it's quite possible I'm just spreading lies here, please correct me and expand on this topic a bit.

JoelEinbinder · a year ago
You can set `pipe` to true in Puppeteer (default false): https://pptr.dev/api/puppeteer.launchoptions
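
For illustration, a minimal sketch of what that looks like (the URL is just a placeholder); with `pipe: true`, Puppeteer talks to Chromium over stdio pipes (Chromium's `--remote-debugging-pipe`) instead of a localhost WebSocket:

```ts
// Minimal sketch: launch with the DevTools connection over a pipe rather than
// a TCP/WebSocket endpoint. The URL below is a placeholder.
import puppeteer from 'puppeteer';

const browser = await puppeteer.launch({ pipe: true });
const page = await browser.newPage();
await page.goto('https://example.com');
await browser.close();
```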

By default, Playwright launches this way, and you have to specifically enable TCP listening.

jesprenj · a year ago
Great, I stand corrected! I still don't know how they convince Firefox/Chromium to use a pipe as a WebSocket transport layer.
_heimdall · a year ago
I have always wanted a browser automation tool that taps directly into the accessibility tree. Plenty support querying based on accessibility features, but unless I'm mistaken, none go directly to the same underlying accessibility tree used by screen readers and similar tools.

Happy to be wrong here if anyone can correct me. Having all tests confirm both functionality and accessibility in one go would be much nicer than testing against hard-coded test IDs and separately writing a few a11y tests if I'm given the time.

jahewson · a year ago
It depends on what you're testing. Much of a typical page is visual noise that is invisible to the accessibility tree but is often still something you'll want tests for. It's also not uncommon for accessible UI paths to differ from regular ones via invisible screen-reader-only content, e.g. in a complex dropdown list. So you can end up with a situation where you test that the accessible path works but not regular clicks!

If you really want gold standard screen reader testing, there’s no substitute for testing with actual screen readers. Each uses the accessibility tree in its own way. Remember also that each browser has its own accessibility tree.

regularfry · a year ago
Guidepup looks like it's a decent stab in that direction: https://www.guidepup.dev/

Only Windows and macOS though, which is a problem for build pipelines. I too would very much like page descriptions and accessibility inputs to be the primary way of driving a page. It would make accessibility the default, rather than something you have to argue for.

Nextgrid · a year ago
Spawn it in a dedicated network namespace (to contain the TCP socket and make it unreachable from any other namespace) and use `socat` to convert it to a UNIX socket.
jesprenj · a year ago
This is not always possible, as some machines don't support network namespaces, but it's a perfectly valid solution. This solution is Linux-only, though -- do BSD OSes like macOS support UID and NET namespaces?
jgraham · a year ago
There's an issue open for this on the WebDriver BiDi issue tracker.

We started with WebSockets because that supports more use cases (e.g. automating a remote device such as a mobile browser) and because building on the existing infrastructure makes specification easier.

It's also true that there are reasons to prefer other transports such as unix domain sockets when you have the browser and the client on the same machine. So my guess is that we're quite likely to add support for this to the specification (although of course there may be concerns I haven't considered that get raised during discussions).

bryanrasmussen · a year ago
I haven't researched it, but I would be surprised if Sikuli does this: http://sikulix.com/
notpublic · a year ago
run it inside podman/docker
yoavm · a year ago
I know this isn't what the WebDriver BiDi protocol is for, but I feel like it's 90% of the way to being a protocol through which you can create browsers with swappable engines. Gecko has come a long way since Servo, and it's actually quite performant these days. The sad thing is that it's so much easier to create a Chromium-based browser than a Gecko-based one. But with APIs for navigating, intercepting requests, reading the console, and executing JS -- why not just embed the thing, remove all the browser chrome around it, and let us create customized browsers?
djbusby · a year ago
I have dreamed about a swappable engine.

Like, a wrapper that does my history and tabs and bookmarks - but lets me move between rendering in Chrome or Gecko or Servo or whatever.

sorenjan · a year ago
There used to be an extension for Firefox called "IE Tab for Firefox" that used the IE rendering engine inside a Firefox tab, for sites that only worked in IE.
joshuaissac · a year ago
There are some browsers that support multiple rendering engines out of the box, like Maxthon (Blink + Trident) and Lunascape (Blink + Gecko + Trident).
apatheticonion · a year ago
Agreed. Headless browser testing is a great example of a case where an embeddable browser engine "as a lib" would be immensely helpful.

JSDom in the Node.js world offers a peek into what that might look like - though it is lacking a lot of browser functionality, making it impractical for most use cases.

e12e · a year ago
What are the reasons to prefer Puppeteer over Playwright, which supports many browsers?

> Cross-browser. Playwright supports all modern rendering engines including Chromium, WebKit, and Firefox.

https://playwright.dev/

creesch · a year ago
Good question, even more so considering they were made by the same people. After the creators of Puppeteer moved to Microsoft and started work on Playwright, I got the impression that Puppeteer was pretty much abandoned. Certainly in the automation circles I find myself in, I barely see anyone using or talking about Puppeteer unless it's a bit of a legacy project.
irjustin · a year ago
I also wonder the same. Playwright is so good. I simply don't have flaky tests, even when dealing with features that are Playwright's fault.

I used to have so many issues with Selenium, so I only used it in must-have situations, defaulting to Capybara to run our specs.

dataviz1000 · a year ago
If you open up the Playwright codebase you will discover that it is literally Puppeteer, with the copyright header in the base files still belonging to Google. It is a fork.
Vinnl · a year ago
I said this in a subthread:

> I think Playwright depends on forking the browsers to support the features they need, so it may be less stable than using a standard explicitly supported by the browsers, and less representative of realistic browser use.

(I'd also like Safari/WebKit to support it, but I'm not holding my breath for that one.) Though I hope Playwright will adopt BiDi at some point as well, as its testing features and API are really nice.

natorion · a year ago
Playwright ships patched browsers. They take the open source version of the browser and patch in e.g. CDP support or other things that make automation "better". Playwright does not work with a "normal" Safari, for example.
bdcravens · a year ago
Additionally, Playwright has some nice ergonomics in its API, though Puppeteer has since implemented a lot of it as well. Downloads and video capturing are nicer in Playwright.
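
For example, a rough sketch of those two (paths, URL, and selector are placeholders):

```ts
// Sketch: Playwright's recordVideo context option plus the 'download' event.
import { chromium } from 'playwright';

const browser = await chromium.launch();
const context = await browser.newContext({
  recordVideo: { dir: 'videos/' },   // one video file per page
  acceptDownloads: true,
});
const page = await context.newPage();
await page.goto('https://example.com');

// Downloads arrive as events with a handle you can persist wherever you like.
const [download] = await Promise.all([
  page.waitForEvent('download'),
  page.click('a#export'),            // placeholder selector
]);
await download.saveAs('exports/report.csv');

await context.close();               // closing the context finalizes the video
await browser.close();
```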
hugs · a year ago
Ranked #4 on HN at the moment and no comments. So I'll just say hi. (Selenium project creator here. I had nothing to do with this announcement, but feel free to ask me anything!)

My hot take on things: When the Puppeteer team left Google to join Microsoft and continue the project as Playwright, that left Google high and dry. I don't think Google truly realized how complementary a browser automation tool is to an AI-agent strategy. Similar to how they also fumbled the bag on transformer technology (the T in GPT)... So Google had a choice: abandon Puppeteer and be dependent on MS/Playwright, or find a path forward for Puppeteer. WebDriver BiDi takes all the chocolatey goodness of the Chrome DevTools Protocol (CDP) that Puppeteer (and Playwright) are built on... and moves that forward in a standard way (building on the earlier success of the W3C WebDriver process that browser vendors and members of the Selenium project started years ago).

Great to see there's still a market for cross-industry standards and collaboration with this announcement from Mozilla today.

huy-nguyen · a year ago
What's the relationship between Selenium, Puppeteer, and WebDriver BiDi? I'm a happy user of Playwright. Is there any reason why I should consider Selenium or Puppeteer?
imiric · a year ago
> Is there any reason why I should consider Selenium or Puppeteer?

I'm not a heavy user of these tools, but I've dabbled in this space.

I think Playwright is far ahead of the alternatives as far as features and robustness go. Firefox has been supported for a long time, as have other features mentioned in this announcement like network interception and preload scripts. CDP in general is much more mature than WebDriver BiDi. Playwright also has a more modern API, with official bindings in several languages.

One benefit of WebDriver BiDi is that it's in process of becoming a W3C standard, which might lead to wider adoption eventually.

But today, I don't see a reason to use anything other than Playwright. Happy to read alternative opinions, though.

Vinnl · a year ago
I think Playwright depends on forking the browsers to support the features they need, so it may be less stable than using a standard explicitly supported by the browsers, and less representative of realistic browser use.
hugs · a year ago
Maybe you don't want to live in a world where Microsoft owns everything (again)?
notinmykernel · a year ago
I am an active user of both Selenium and Puppeteer/Pyppeteer. I use them because it's what I learned and they still work great, and explicitly because it's not Microsoft.
nox101 · a year ago
Last time I tried Playwright it required custom versions of the browsers. That meant it was impossible to use with any newer browser features, which made it a non-starter if you wanted to target new and advanced use cases or prep a site in expectation of some new API feature that just shipped or is expected to ship soon.

If you used Playwright, wrote tons of tests, then heard about some new browser feature you wanted to target to get ahead of your competition, you'd have to refactor all of your tests away from Playwright to something that could target Chrome Canary, Firefox Nightly, or Safari Technology Preview.

Has that changed?

twic · a year ago
It works for me with stock Chromium and Chrome on Linux. But for Firefox, I apparently need a custom patched build, which isn't available for the distro I run, so I haven't confirmed that.
tracker1 · a year ago
IIRC, you can use the system-installed browser, but you need to know the executable path when launching. I remember it being a bit of a pain to do, but I have done it.
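
Something like this, if memory serves (the path is an assumption for a typical Linux install; `channel: 'chrome'` is the other documented way to pick up a branded system Chrome):

```ts
// Sketch: point Playwright at a system-installed browser binary.
import { chromium } from 'playwright';

const browser = await chromium.launch({
  executablePath: '/usr/bin/chromium',   // assumed path; adjust for your system
});
const page = await browser.newPage();
await page.goto('https://example.com');
await browser.close();
```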
SomaticPirate · a year ago
If I wanted to write some simple web-automation as a DevOps engineer with little javascript (or webdev experience at all) what tool would you recommend?

Some example use cases would be writing some basic tests to validate a UI, or automating some form-filling on a JavaScript-based website with no API.

abdusco · a year ago
Use Playwright's code generator, which turns page interactions into code.

https://playwright.dev/python/docs/codegen-intro

hugs · a year ago
Unironically, ask ChatGPT (or your favorite LLM) to create a hello world WebDriver or Puppeteer script (and installation instructions) and go from there.
devjab · a year ago
I'd go with Puppeteer for your use case, as it's the easier option to set up browser automation with. But it's not like you can really go wrong with Playwright or Selenium either.

Playwright only really gets better than Puppeteer if you're doing actual testing of a website you're building, which is where it shines.

Selenium is awesome, and probably has more guides/info available, but it's also harder to get into.
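
To give a feel for it, here's a minimal Puppeteer sketch for the form-filling use case above -- the URL, selectors, and env var are placeholders you'd swap for your own:

```ts
// Minimal form-filling sketch with Puppeteer; selectors and URL are placeholders.
import puppeteer from 'puppeteer';

const browser = await puppeteer.launch({ headless: true });
const page = await browser.newPage();
await page.goto('https://example.com/login');

await page.type('#username', 'deploy-bot');
await page.type('#password', process.env.APP_PASSWORD ?? '');
await Promise.all([
  page.waitForNavigation(),            // wait for the post-submit navigation
  page.click('button[type=submit]'),
]);

console.log('Landed on:', page.url());
await browser.close();
```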

anothername12 · a year ago
Is the WebDriver standard a good one (relative to Playwright, I guess)? I seem to recall some pains implementing it a few years ago.
localfirst · a year ago
Is it possible to now use Puppeteer from inside the browser? Or do security concerns restrict this?

What does WebDriver BiDi do, and what do you mean by "taking the good stuff from CDP"?

I don't want to run my scrapes in the cloud and pay a monthly fee

I want to run them locally. I want to run LLM locally too.

I'm sick of SaaS

hugs · a year ago
Puppeteer controls a browser... from the outside... like a puppeteer controls a puppet. Other tools like Cypress (and ironically the very first version of Selenium 20 years ago) drive the browser from the inside using JavaScript. But we abandoned that "inside out" approach in later versions of Selenium because of the limitations imposed by the browser JS security sandbox. Cypress is still trying to make it work and I wish them luck.

You could probably figure out how to connect Llama to Puppeteer. (If no one has done it, yet, that would be an awesome project.)

hoten · a year ago
Yes. I'm not aware of any documentation walking one through it, though.

There is an extension API that exposes a CDP connection [1][2].

You can create a Puppeteer.Browser given a CDP connection.

You can bundle Puppeteer in a browser (we do this in Lighthouse/Chrome DevTools[3]).

These two things are probably enough to get it working, though it may be limited to the active tab (see the sketch after the links below).

[1] https://chromedevtools.github.io/devtools-protocol/#:~:text=...

[2] https://stackoverflow.com/a/55284340/24042444

[3] https://source.chromium.org/chromium/chromium/src/+/main:thi...
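
An untested sketch of how those pieces might fit together -- wrapping `chrome.debugger` in Puppeteer's ConnectionTransport interface and handing it to `puppeteer.connect()`; the tab targeting and error handling here are simplified assumptions:

```ts
// Rough, untested sketch: adapt chrome.debugger (extension CDP access) to
// Puppeteer's ConnectionTransport so puppeteer.connect() can drive a tab.
// Assumes puppeteer-core is bundled and the extension has the "debugger" permission.
import puppeteer, { type ConnectionTransport } from 'puppeteer-core';

function debuggerTransport(tabId: number): ConnectionTransport {
  const target = { tabId };
  const transport: ConnectionTransport = {
    send(message: string) {
      // Puppeteer sends raw CDP JSON; forward it and echo a response with the same id.
      const { id, method, params, sessionId } = JSON.parse(message);
      chrome.debugger.sendCommand(target, method, params).then(
        result => transport.onmessage?.(JSON.stringify({ id, sessionId, result: result ?? {} })),
        error => transport.onmessage?.(JSON.stringify({ id, sessionId, error: { message: String(error) } })),
      );
    },
    close() {
      void chrome.debugger.detach(target);
    },
  };
  // CDP events (messages without an id) are pushed straight through to Puppeteer.
  chrome.debugger.onEvent.addListener((source, method, params) => {
    if (source.tabId === tabId) transport.onmessage?.(JSON.stringify({ method, params }));
  });
  return transport;
}

// Usage (inside an async extension handler):
//   await chrome.debugger.attach({ tabId }, '1.3');
//   const browser = await puppeteer.connect({ transport: debuggerTransport(tabId) });
```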

jgraham · a year ago
> Is it possible to now use Puppeteer from inside the browser?

Talking about WebDriver (BiDi) in general rather than Puppeteer specifically, it depends what exactly you mean.

Classic WebDriver is an HTTP-based protocol. WebDriver BiDi uses WebSockets (although other transports are a possibility for the future). Script running inside the browser can make HTTP requests and open WebSocket connections, so you can create a web page that implements a WebDriver or WebDriver BiDi client (there's a toy sketch at the end of this comment). But of course you need to have a browser to connect to, and that needs to be configured to actually allow connections from your host; for obvious security reasons that's not allowed by default.

This sounds a bit obscure, but it can be useful. Firefox devtools is implemented in HTML+JS in the browser (like the rest of the Firefox UI), and can connect to a different Firefox instance (e.g. for debugging mobile Firefox from desktop). The default runner for web-platform-tests drives the browser from the outside (typically) using WebDriver, but it also provides an API so the in-browser tests can access some WebDriver commands.
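
As a toy illustration of that in-page client idea, something along these lines could speak WebDriver BiDi from a page over a plain WebSocket (the endpoint URL is made up, and the browser on the other end would have to be started and configured to accept the connection):

```ts
// Toy in-page WebDriver BiDi client: JSON commands with incrementing ids over a
// WebSocket. The endpoint URL and capabilities are placeholder assumptions.
const socket = new WebSocket('ws://127.0.0.1:9222/session');
let nextId = 1;
const pending = new Map<number, (msg: any) => void>();

function send(method: string, params: object): Promise<any> {
  const id = nextId++;
  socket.send(JSON.stringify({ id, method, params }));
  return new Promise(resolve => pending.set(id, resolve));
}

socket.onmessage = event => {
  const msg = JSON.parse(event.data as string);
  if (typeof msg.id === 'number') {
    pending.get(msg.id)?.(msg);   // command result (or error)
    pending.delete(msg.id);
  }
  // messages without an id are events (after session.subscribe)
};

socket.onopen = async () => {
  await send('session.new', { capabilities: {} });
  const tree = await send('browsingContext.getTree', {});
  const context = tree.result.contexts[0].context;
  await send('browsingContext.navigate', { context, url: 'https://example.com', wait: 'complete' });
};
```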

burntcaramel · a year ago
This is great! I’m curious about the accessibility tree noted in the unsupported-for-now APIs. Accessing the accessibility tree was something that was in Playwright for the big 3 engines but got removed about a year ago. I think it was partly because as noted it was a dump of engine-specific internal data structures: “page.accessibility.snapshot returns a dump of the Chromium accessibility tree”.

I’d like to advocate for more focus on these accessibility trees. They are a distillation of every semantic element on the page, which makes them fantastic for snapshot “tests” or BDD tests.

My dream would be these accessibility trees one day become standardized across the major browser engines. And perhaps from a web dev point-of-view accessible from the other layers like CSS and DOM.
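
For what it's worth, Puppeteer still exposes that kind of dump on its Chromium backend, and it already works reasonably well as a semantic snapshot; a small sketch (URL is a placeholder):

```ts
// Sketch: dump the (Chromium-flavoured) accessibility tree as a semantic snapshot.
// The engine-specific shape of the result is exactly the standardization gap above.
import puppeteer from 'puppeteer';

const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('https://example.com');

// A tree of { role, name, ... } nodes, pruned to "interesting" nodes by default.
const tree = await page.accessibility.snapshot();
console.log(JSON.stringify(tree, null, 2));

await browser.close();
```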

mstijak · a year ago
Are there any advantages to using Firefox over Chrome for exporting PDFs with Puppeteer?
lol768 · a year ago
I've found Firefox to produce better PDFs than Chrome does, for what it's worth. There are some CSS properties (e.g. repeating-linear-gradient) that Chrome/Skia doesn't honour properly, or that it turns into PDFs that don't work universally.
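
If anyone wants to compare the output themselves, here's a hedged sketch using the new Firefox support (this assumes a recent Puppeteer where `browser: 'firefox'` is a launch option and Firefox has been installed via `npx puppeteer browsers install firefox`; the URL and file names are placeholders):

```ts
// Sketch: render the same page to PDF with both engines and compare the results.
import puppeteer from 'puppeteer';

for (const engine of ['chrome', 'firefox'] as const) {
  const browser = await puppeteer.launch({ browser: engine });
  const page = await browser.newPage();
  await page.goto('https://example.com', { waitUntil: 'load' });
  await page.pdf({ path: `example-${engine}.pdf`, format: 'A4', printBackground: true });
  await browser.close();
}
```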
freedomben · a year ago
Indeed, Firefox uses PDF.js which I've found to produce really good results.
fitsumbelay · a year ago
Been waiting for this. This rocks
ed_mercer · a year ago
Shouldn’t the title be “Firefox support for puppeteer”?
jgraham · a year ago
Well, the truth is it's both.

We had to change Firefox so it could be automated with WebDriver BiDi. The Puppeteer team had to change Puppeteer in order to implement a WebDriver BiDi backend, and to enable specific support for downloading and launching Firefox.

As the article says, it was very much a collaborative effort.

But the announcement is specifically about the new release of Puppeteer, which is the first to feature non-experimental support for Firefox. So that's why the title's that way around.