I think the challenge is that, while we agree on the optimal setup, it's almost never done like this. There are lots of reasons why, but the main one (imo) is organizational scaling. The logic of scaling teams tends toward specialization, especially because developer time is extremely expensive.
There are many ways to address the problem of QA. From our perspective, if they don't broaden the ownership of QA beyond just developers or just QA engineers (which is where all the current products are targeted), they will exacerbate this specialization problem.
>Compare this to code-based automation solutions like Selenium or Cypress, where a coder would need to manually update the code across all the relevant tests any time the product changes.
... Well, no. Those tools can have reusable chunks and references that work just like your steps and are organized in the same manner, so that things used in multiple places are stored once and reused. That's basic good programming practice. When a change is needed to, say, a login flow, you don't change 100 UI tests; you change the login function that those tests all call.
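To make that concrete, here's a minimal sketch of the pattern as a Cypress custom command - the route, field names, and post-login assertion are all invented for illustration:

```
// cypress/support/commands.ts -- one shared login step that every test calls.
declare global {
  namespace Cypress {
    interface Chainable {
      login(email: string, password: string): Chainable<void>;
    }
  }
}

Cypress.Commands.add('login', (email: string, password: string) => {
  cy.visit('/login');
  cy.get('input[name="email"]').type(email);
  cy.get('input[name="password"]').type(password, { log: false });
  cy.contains('button', 'Log In').click();
  cy.contains('Dashboard').should('be.visible'); // confirm we actually signed in
});

export {};
```

Specs then just call `cy.login(...)`; when the login flow changes, only this one command needs updating, not every test that depends on being logged in.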
> Tests created in code-based automation solutions like Selenium and Cypress evaluate what’s going on in the code behind the scenes of your product’s UI. That is, these solutions test what the computer sees, not what your users see.
and
> For example, if the HTML ID attribute of the signup button in the example above changes, a Selenium or Cypress coded test that identifies that button by its ID would likely break.
This is all very disingenuous in my book. In these tools the tests interact with the rendered web page and _can_ be made to use IDs, classes, and other brittle stuff as selectors, but that's a naive approach a professional wouldn't typically use, because the brittleness surfaces early and there are better patterns that avoid it. One of those is to select elements by something the user actually sees - eg, click a button based on its label being "Log In". If that test fails because the button label was changed or removed, that's a reasonable failure, because labels are important in the UI. Where we're selecting or making assertions about DOM elements that don't have meaningful user-facing labels, dedicated test-id attributes solve the brittleness problem. But by and large, if something is interactive to the user, it should have a label, so the fallback to test attributes is rare - mostly reserved for verifying that some specific effect took place on an element that isn't itself interactive.
It's unfair to suggest that these tools inherently create brittle tests.
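Roughly, the selector patterns I'm describing look like this (routes, labels, and test ids are invented for the example):

```
it('signs up a new user', () => {
  cy.visit('/signup');

  // Select by the visible label: fails only if the label the user sees changes.
  cy.contains('button', 'Sign Up').click();

  // No user-facing label on this element, so fall back to a dedicated test id.
  cy.get('[data-testid="signup-confirmation"]')
    .should('contain.text', 'Check your email');
});
```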
> Rainforest QA tests don’t break when there are minor, behind-the-scenes code changes that don’t change the UI.
but
> It works by pixel matching and using AI to find and interact with (click, fill, select, etc.) the correct elements.
Look, maybe your tool is GREAT, I haven't tried it. But a functional UI test based on label contents may be more robust than something that breaks when the visual structure of a page changes, which itself happens pretty frequently. Maybe the AI is good enough to know that a bunch of stuff moved around but the actual flow is the same, and can find its way through. With tests based on labels and dedicated test attributes, devs can make massive structural changes to the DOM and not break any tests, as long as the workflows are the same.
I would sure love it if this post contained more info on how the visual AI approach to selecting elements is better than using labels, test attributes, etc., as is conventional in automated testing done by professionals. Likewise it would be good to compare with an actual competitor like Ghost Inspector, where non-programmers have been able to generate step-based tests like this for years. The main gripe I had with Ghost Inspector was that it creates brittle selectors, so tests need to be changed a lot in response to DOM changes unless a developer gets in there and manually reviews/picks better selectors.
If what you have is a tool that _makes more robust tests than Ghost Inspector_ but is _as easy for a PM to use_, then that is interesting.
I actually support this underlying idea completely - I love systems where non-developers can create and maintain tests that reflect what they care about in the UI. I even love it when tools like this create low-quality tests that are still expressive enough that a dev can quickly come in and fix up selectors to make them more suitable. Cypress Studio is a currently-experimental feature that looks promising for this too, allowing folks to edit and create tests by clicking in the UI, not editing code files. It's a good direction for automated test frameworks to explore.
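For anyone curious, at the time of writing Studio sits behind an experimental config flag (check the current Cypress docs in case this has changed; the baseUrl below is just an example):

```
// cypress.config.ts
import { defineConfig } from 'cypress';

export default defineConfig({
  e2e: {
    baseUrl: 'http://localhost:3000', // point at your local dev server
    experimentalStudio: true,         // enables "Add Commands to Test" in the runner UI
  },
});
```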
I'm just really uncomfortable after reading this post. It strays beyond typical marketing hyperbole to being actually deceptive about the nature of these other tools and the practices of teams using them. Instead of highlighting the actual uniqueness of your product, you exaggerate the benefits of this approach vs. other solutions. Come on, you can run tests in parallel with many tools, and you can avoid CAPTCHAs for automated testing in various ways. Some of the process points you make are fair but also, again, kind of weak, because sure, "QA at the end with no consultation during planning and development" is bad, but that's inherently bad for well-known reasons.
What you've said is basically "our tool is better than every other idea about QA, if those ideas are deliberately implemented in the worst possible way, and our tool is implemented in the best way". Well sure. But also, of course it is.
Sorry if this seems harsh. To give you the benefit of the doubt: Marketing copy walks a fine line when trying to be persuasive and you might not have expected to create this impression. It's also possible that you did some research into weaknesses of automated test frameworks and just don't understand that those things are compensated for quite easily and routinely, because maybe you don't have the background. I don't know, but I hope future materials are a little more grounded in reality.
Regarding DOM interaction, you're missing the point. All automation that tests the front-end code, regardless of how it attaches, uses a path that is different from your end user's. The end user interacts with the application visually. That's why we test visually. A decrease in code-based brittleness is just a nice side effect. And as you note, this is a very high-level post outlining one key idea about quality ownership. You may be interested in this post from one of our front-end folks on why we believe testing visually is superior: https://www.rainforestqa.com/blog/the-downfall-of-dom-and-th...
We have been selling a QA solution for almost 10 years. In that time we've seen thousands of setups and directly worked with hundreds of teams. Your claim "weaknesses of automated test frameworks [...] are compensated for quite easily and routinely" is, quite simply, not true for the majority of engineering teams - few QA leaders, including proponents of Cypress, would agree with you.
Not all users interact visually. Selecting interactive elements through accessible labels, not visual appearance, is the better practice imo. I want the parts of the DOM that are critical to building a correct accessibility tree to be part of the test, and I want the test to fail if we change something that makes the accessibility tree incorrect - because that tree is the API for assistive technology, and it is how the user interface is communicated to lots of people. "Correct behavior" of an app or website includes "the accessibility tree accurately reflects the structure and nature of the content" and "form fields are labelled correctly". I might be in the minority in thinking that, but I believe it 100%.
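This is testable in code today. A sketch using the Testing Library commands for Cypress plus cypress-axe (the form labels here are my assumptions about the app, not anything from the post):

```
// Assumes '@testing-library/cypress/add-commands' and 'cypress-axe' are
// imported in the support file.
it('exposes the login form correctly to assistive technology', () => {
  cy.visit('/login');

  // These queries resolve through roles and accessible names, i.e. the same
  // API assistive technology uses -- if the labels break, the test breaks.
  cy.findByRole('textbox', { name: 'Email' }).type('user@example.com');
  cy.findByLabelText('Password').type('not-a-real-password', { log: false });
  cy.findByRole('button', { name: 'Log In' }).should('be.enabled');

  // Automated accessibility checks on the rendered page via axe-core.
  cy.injectAxe();
  cy.checkA11y();
});
```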
Nothing I've seen so far (including the post you linked) suggests that the OCR-like approach can tell us anything about the accessibility tree.
The post does make a similar point to mine:
> If your app changes visually, your test will fail. This is of course a tradeoff and a conscious decision to couple your tests to your UI instead of your code, but the upsides bring much more value than that the downsides take away.
I disagree on the tradeoff. I can run screenshot diffs etc. to catch unexpected UI changes; they're a little noisy, but I'm OK with that tradeoff because I care about more than just the visual appearance of the app.
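By "screenshot diffs" I mean something along these lines - Percy shown here, but other visual-diff plugins work similarly, and the page and snapshot name are made up:

```
// Assumes '@percy/cypress' is imported in the support file.
it('billing page looks the way we expect', () => {
  cy.visit('/billing');
  cy.contains('h1', 'Billing').should('be.visible'); // functional assertion first
  cy.percySnapshot('Billing page');                  // visual diff catches unexpected UI changes
});
```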
A "visually correct" app with click handlers on divs and no semantic HTML is a liability (legally, maintenance wise, etc).. I'd like the E2E testing tool to assert that the app is working correctly, which does mean some assertions about the DOM are appropriate to me. I agree with the author of the linked post that "we want a solution that is not brittle in unwanted ways." We can be selective about what DOM things are important.
In the linked post the author says "In particular, DOM tests are tightly coupled to the structure of your code." and gives an example of a Selenium test that uses a brittle XPath that depends on a specific DOM structure.
Maybe I have not been exposed to enough of the industry to know that there are thousands of setups relying on flaky XPaths to target elements for testing. To me, it is not true that DOM tests are tightly coupled to the structure of your code by default. It's a false statement made for marketing purposes, and it is gross.
DOM tests "can be flaky", "are sometimes coupled to DOM structure" or whatever, is a fair assertion, but flakiness in DOM-driven testing is not a fact, it's a sign of badly written DOM-based tests. This is often the first thing I address in a code review of new tests written by somebody who does not write lots of FE test code, and they easily learn how to avoid it.
Maybe I'm wrong, but not creating brittle selectors that fail tests for reasons we don't care about seems like really, really basic stuff.
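To illustrate the difference the linked post glosses over (selectors invented for the example):

```
// Brittle: tied to the exact DOM hierarchy -- adding a wrapper div breaks it.
cy.get('div.main > div:nth-child(2) > form > div:nth-child(3) > button').click();

// Robust: tied to what the user sees -- survives structural refactors.
cy.contains('button', 'Log In').click();
```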
I like the OS-level interaction and agree that it provides some advantages. I totally disagree that these advantages mean your solution wins as the "best way" to test, but it clearly does cover a different surface area than other automated solutions for E2E testing, and it seems like tests are pretty quick to knock out.
This solution could be a complement to other automated E2E tests, and I see no reason a PM or other party couldn't spin up and maintain some tests this way as a quick-to-create set to run against various builds, knowing that design changes will break them but that this is OK because, in theory, they're quick to rewrite.
But I couldn't see using this tool as the only E2E testing solution, as though it were a superset of what Cypress/Selenium/whatever tests are capable of. It's actually not a competitor to those tools; it addresses different concerns with a little bit of overlap.
I'm happy to check out the free trial and see if I'm missing something, and eat crow if I'm being unfair here.