Since it came up in another thread (yes, it's trivial), a function `add` is no easier or harder to test with examples than with PBT, here are some of the tests as both PBT-style and example-based style:
@given(st.integers())
def test_left_identity_pbt(a):
assert add(a, 0) == a
def test_left_identity():
assert add(10, 0) == 10
@given(st.integers(), st.integers())
def test_commutative(a, b):
assert add(a, b) == add(b, a)
@parametrize("a,b", examples)
def test_commutative():
assert add(a, b) == add(b, a)
They're the same test, but one is more comprehensive than the other. And you can use them together. Supposing you do find an error, you add it to your example-based tests to build out your regression test suite. This is how I try to get people into PBT in the first place, just take your existing example-based tests and build a generator. If they start failing, that means your examples weren't sufficiently comprehensive (not surprising). Because PBT systems like Hypothesis run so many tests, though, you may need to either restrict the number of generated examples for performance reason or breakup complex tests into a set of smaller, but faster running, tests to get the benefit.Other things become much simpler, or at least simpler to test comprehensively, like stateful and end-to-end tests (assuming you have a way to programmatically control your system). Real-world, I used Hypothesis to drive an application by sending a series of commands/queries and seeing how it behaved. There are so many possible sequences that manually developing a useful set of end-to-end tests is non-trivial. However, with Hypothesis it just generated sequences of interactions for me and found errors in the system. After each command (which may or may not change the application state) it issued queries in the invariant checks and verified the results against the model. Like with example-based testing, these can be turned into hard-coded examples in your regression test suite.
I wanted to highlight one unexpected but very welcomed side effect of having those stateful property tests is we could use them to design high fidelity stubs. I wrote a follow-up blog post about it https://blog.tiserbox.com/posts/2024-07-08-make-good-stubs-w...
You're downplaying the amount of code required to properly setup a property-based test. In the linked article, the author implemented a state machine to accurately model the SUT. While this is not the most complex of systems, it is far from trivial, and certainly not a "bit of code". In my many years of example-based unit/integration/E2E testing, I've never had to implement something like that. The author admits that the team was reluctant to adopt PBT partly because of the amount of code.
This isn't to say that example-based tests are simple. There can be a lot of setup, mocking, stubbing, and helper code to support the test, but this is usually a smell that something is not right. Whereas with PBT it seems inevitable in some situations.
But then again, I can see how such tests can be invaluable, very difficult and likely more complex to implement otherwise. So, as with many things, it's a tradeoff. I think PBT doesn't replace EBT, nor vice-versa, but they complement eachother.
Why is it not more popular?
My theory is that only code written in functional languages has complex properties you can actually test.
In imperative programs, you might have a few utils that are appropriate for property testing - things like to_title_case(str) - but the bulk of program logic can only be tested imperatively with extensive mocking.
I’ve been working for the past 3 years on SelfHostBlocks https://github.com/ibizaman/selfhostblocks, making self-hosting a viable and convenient alternative to the cloud for non technical people.
It is based on NixOS and provides a hand-picked groupware stack: user-facing there is Vaultwarden and Nextcloud (and a bunch more but those 2 are the most important IMO for non technical people as it covers most of one’s important data) and on the backend Authelia, LLDAP, Nginx, PostgreSQL, Prometheus, Grafana and some more. My know-how is in how to configure all this so they play nice together and to have backups, SSO, LDAP, reverse proxy, etc. integration. I’m using it daily as the house server, I’m my first customer after all. And beginning of 2025 it passed my own internal checkpoint to be shared with others and there’s a handful of technical users using it.
My goal is to work on this full time. I started a company to provide a white glove installation, configuration and maintenance of a server with SelfHostBlocks. Everything I’ll be doing will always be open source, same as the whole stack and the server is DIY and repair friendly. The continuous maintenance is provided with a subscription which includes customer support and training on the software stack as needed.
Paperwm on gnome has this.
I didn’t try it myself though. I found it while scrolling https://github.com/Vortriz/awesome-niri
That seems to be the takeaway.
Centralization of just about anything is an issue, not just messaging.
However, users still want/need the kinds of advantages that we get from monopolies/centralization, and implementing them in distributed systems is really hard.