> The biggest effect was that it gave our tiny engineering team the productivity of a team 50x its size.
I feel like the idea of the legendary "10x" developer has been bastardized to just mean workers who work 15 hours a day 6.5 days a week to get something out the door until they burn out.
But here's your real 10x (or 50x) productivity. People who implement something very few people even considered or understood to be possible, which then gives amazing leverage to deliver working software in a fraction of the time.
It seems like the industry would get a lot more 10x behavior if it were recognized and rewarded more often than it currently is. Too often, management will focus more on the guy who works 12-hour days to accomplish 8 hours of real work than on the guy who gets the same thing accomplished in an 8-hour day.
Also, deviations from 'normal' are frowned upon. Taking time to improve the process isn't built into the schedule, so taking time to build a wheelbarrow is discouraged when they think you could be hauling buckets faster instead.
>It seems like the industry would get a lot more 10x behavior if it were recognized and rewarded more often than it currently is
I'd be happier if the industry cared more about team productivity - I have witnessed how rewarding "10x" individuals can lead to perverse results on a wider scale, a la the Cobra Effect. In one insidious case, our management-enabled, long-tenured "10x" rockstar fixed all the big customer-facing bugs quickly, but would create multiple smaller bugs and regressions for the 1x developers to fix while he moved on to the next big problem worthy of his attention. Everyone else ended up being 0.7x - which made the cursed engineer look even more productive by comparison!
Because he was allowed to break the rules, there was a growing portion of the codebase that only he could work on - while it wasn't Rust, imagine an org that has a "No Unsafe Rust" rule that is optional for exactly one guy. Organizations ought to be very careful about how they measure productivity, and should certainly look beyond first-order metrics.
This reminds me of the "Parable of the Two Programmers." [1] A story about what happens to a brilliant developer given an identical task to a mediocre developer.
[1] I preserved a copy of it on my (no-advertising or monetization) blog here: https://realmensch.org/2017/08/25/the-parable-of-the-two-pro...
It's almost impossible to get executives to think in return on equity (“RoE”) for the future instead of “costs” measured in dollars and cents last quarter.
Which is weird, since so many executives are working in a VC-funded environment, and internal work should be “venture funded” as well.
> It seems like the industry would get a lot more 10x behavior if it were recognized and rewarded more often than it currently is.
I don't agree with that: there are a _lot_ of completely crap developers, and even the ones capable of 10x work get put into positions where they aren't allowed to do it because it's not on a ticket.
I've seen some things.
"Don't confuse motion with action", in other words. I think a lot of people aren't good at it because they themselves are rewarded for the opposite. This seems rife in the "just above individual contributor" management layer, but that's a biased take.
When I was in college, I met a few people who coded _a lot_ faster than me. Typically, they had started when they were 12 instead of 21 (like me). That's how 10x engineers exist: by the time they are 30, they have roughly 20 years of programming experience under their belt instead of 10.
Also, their professional experience is much greater. Sure, their initial jobs at 15 are the occasional weird gig for an uncle/aunt or cousin/nephew, but they get picked up by professional firms at 18 and work a job alongside their CS studies.
At least, that's how it used to be. Not sure if this is still happening due to the new job environment, but this was the reality from around 2004 to 2018.
For 10x engineers to exist, all it takes is a few examples. Everyone seems to agree that they are rare. So I'll point to a public 10x engineer. He'd never say it himself, but my guess is that this person is a 10x engineer [1].
If you disagree, I'm curious how you'd disagree. I'm just a blind man touching a part of the elephant [2]. I do not claim to see the whole picture.
[1] https://bellard.org/ (the person who created JSLinux)
[2] https://en.wikipedia.org/wiki/Blind_men_and_an_elephant - if you don't know the parable, it's a fun one!
Yup, that's been my experience as someone who asked for a C++ compiler for my 12th birthday, worked on a bunch of random websites and webapps for friends of the family, and spent some time at age 16-17 running a Beowulf cluster and attempting to help postdocs port their code to run on MPI (with mixed success). All thru my CS education I was writing tons of toy programs, contributing (as much as I could) toward OSS, reading lots of stuff on best practices, and leaning on my much older (12 years) brother who was working in the industry. He pointed me to Java and IntelliJ, told me to read Design Patterns (Gang of Four) and Refactoring (Fowler). I read Joel on Software religiously, even though he was a Microsoft guy and I was a hardcore Linux-head.
By the time I joined my first real company at age 21, I was ready to start putting a lot of this stuff into place. I joined a small med device software company which had a great product but really no strong software engineering culture: zero unit tests, using CVS with no branches, release builds were done manually on the COO's workstation, etc.
As literally the most junior person in the company I worked through all these things and convinced my much more senior colleagues that we should start using release branches instead of "hey everybody, please don't check in any new code until we get this release out the door". I wrote automated build scripts mostly for my own benefit, until the COO realized that he didn't have to worry about keeping a dev environment on his machine, now that he didn't code any more. I wrote a junit-inspired unit testing framework for the language we were using (https://en.wikipedia.org/wiki/IDL_(programming_language) - like Matlab but weirder).
Without my work as a "10x junior engineer", the company would have been unable to scale to more than 3 or 4 developers. I got involved in hiring and made sure we were hiring people who were on board with writing tests. We finally turned into a "real" software company 2 or 3 years after I joined.
I'm not even sure that coding _much_ faster than necessary is even required to give a 3-5x multiple on "average", let alone "worst case" developers. Some of the biggest productivity wins can be had by being able to look at requirements, knowing what's right or wrong about them, and getting everyone on the same page so the thing only needs to be made once. Being good at testing and debugging, so problems are identified and fixed _early_, is also a big win. Lots of that is just having the experience to recognize what sort of problem you're dealing with very quickly.
Being a programming prodigy is nice, but I don't think you even really need that.
Last year, we had 2 new hires: one fresh out of college (and not one of the top ones), the other with 15 years of industry experience on their resume.
I am not sure there is a 10x difference, but there is at least a 5x difference in performance, in favor of the fresh college grad, and they are now working on the more complex tasks too.
The sad part is our hiring is still heavily in the "senior engineer with lots of experience" phase, and the internship program has been canceled.
I am not convinced that just starting early is all there is to it. I started Math, Sports, and Piano at like 6 years old but there are still plenty of "10x <insert activity here>" people that figuratively and literally run circles around me. Talent is a real thing.
Some people organize their time and focus their efforts more efficiently than others. They also use tools that others might not even know or care about.
You probably surf the internet 10x faster than your parents. Yes, you've probably had more exposure than them, but you could probably teach them how to do it just as fast. But would they want to learn, and would they actually adopt what you taught them?
Nick with Antithesis here with a funny story on this.
I became friends with Dave our CTO when I was 5 or 6, we were neighbors. He'd already started coding little games in Basic (this was 1985). Later in our friendship, like when I was maybe 10, I asked him if he could help me learn to code, which he did. After a week or two I had made some progress but compared what I could do to what he was doing and figured "I guess I just started too late, what's the point?".
I found out later that most people didn't start coding till late HS or college! It worked out though - I'm programmer adjacent and have taken care of the business side of our projects through the years :)
> That's how 10x engineers exist: by the time they are 30, they have roughly 20 years of programming experience under their belt instead of 10.
This is a relatively small part of it.
The majority of developers who have been programming for 20 years maybe learned a few tricks along the way, then got stuck in a local maximum.
There are a few who learn deep computer science principles and understand how to apply them to novel problems. I'm thinking of techniques like those in this book:
https://www.everand.com/book/282526076/Paradigms-of-Artifici...
(mainly because Peter Norvig is my go-to as the paradigmatic 10x developer)
For example, in the Efficiency Issues chapter about how to optimize programs, Norvig lists writing a compiler from one language into a more efficient one. Most developers who have been working 20 years either won't think of that or won't understand how to implement it. But these are the kinds of things that can result in really outsize productivity gains.
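To make that concrete, here's a toy version of the technique (my own sketch in Rust, not Norvig's Lisp): instead of interpreting an expression tree on every evaluation, "compile" it once into closures of the host language, which is much cheaper in a hot loop.

    // Toy version of the "write a compiler" trick: interpret an expression
    // tree vs. compile it once into nested closures the host language runs
    // directly. (Illustrative only.)
    enum Expr {
        Const(f64),
        X, // the single input variable
        Add(Box<Expr>, Box<Expr>),
        Mul(Box<Expr>, Box<Expr>),
    }

    // Interpreter: re-walks the tree on every evaluation.
    fn eval(e: &Expr, x: f64) -> f64 {
        match e {
            Expr::Const(c) => *c,
            Expr::X => x,
            Expr::Add(a, b) => eval(a, x) + eval(b, x),
            Expr::Mul(a, b) => eval(a, x) * eval(b, x),
        }
    }

    // "Compiler": walks the tree once, returning a closure that is cheap
    // to call afterwards.
    fn compile(e: &Expr) -> Box<dyn Fn(f64) -> f64> {
        match e {
            Expr::Const(c) => {
                let c = *c;
                Box::new(move |_| c)
            }
            Expr::X => Box::new(|x| x),
            Expr::Add(a, b) => {
                let (fa, fb) = (compile(a), compile(b));
                Box::new(move |x| fa(x) + fb(x))
            }
            Expr::Mul(a, b) => {
                let (fa, fb) = (compile(a), compile(b));
                Box::new(move |x| fa(x) * fb(x))
            }
        }
    }

    fn main() {
        // 3 * x + 1
        let e = Expr::Add(
            Box::new(Expr::Mul(Box::new(Expr::Const(3.0)), Box::new(Expr::X))),
            Box::new(Expr::Const(1.0)),
        );
        let f = compile(&e);
        assert_eq!(eval(&e, 2.0), f(2.0)); // both are 7.0
    }

The same shape scales up to the real wins: compiling regexes, query plans, or rule systems instead of interpreting them.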
Yes: Programmers who start at twelve are often the 10x programmers who can really program faster than the average developer by a lot.
No: It's not because they have 10 more years of experience. Read "The Mythical Man Month." That's the book that popularized the concept that some developers were 5-25x faster than others. One of the takeaways was that the speed of a developer was not correlated with experience. At all.
That said, the kind of person who can learn programming at 12 might just be the kind of person who is really good at programming.
I started learning programming concepts at 11-12. I'm not the best programmer I know, but when I started out in the industry at 22 I was working with developers with 10+ years of (real) experience on me...and I was able to come in and improve on their code to an extreme degree. I was completing my projects faster than other senior developers. With less than two years of experience in the industry I was promoted to "senior" developer and put on a project as lead (and sole) developer and my project was the only one to be completed on time, and with no defects. (This is video game industry, so it wasn't exactly a super-simple project; at the time this meant games written 100% in assembly language with all kinds of memory and performance constraints, and a single bug meant Nintendo would reject the image and make you fix the problem. We got our cartridge approved the first time through.)
Some programmers are just faster and more intuitive with programming than others. This shouldn't be a surprise. Some writers are better and faster than others. Some artists are better and faster than others. Some architects are better and faster than others. Some product designers are better and faster than others. It's not all about the number of hours of practice in any of these cases; yes, the best in a field often practices an insane amount. But the very top in each field, despite having similar numbers of hours of practice and experience, can vary in skill by an insane amount. Even some of the best in each field are vastly different in speed: You can have an artist who takes years to paint a single painting, and another who does several per week, but of similar ultimate quality. Humans have different aptitudes. This shouldn't even be controversial.
I do wonder if the "learned programming at 12" has anything to do with it: Most people will only ever be able to speak a language as fluently as a native speaker if they learn it before they're about 13-14 years old. After that the brain (again, for most people; this isn't universal) apparently becomes less flexible. In MRI studies they can actually detect differences between the parts of the brain used to learn a foreign language as an adult vs. as a tween or early teen. So there's a chance that early exposure to the right concepts actually reshapes the brain. But that's just conjecture mixed with my intuition of the situation: When I observe "normal" developers program, it really feels like I'm a native speaker and they're trying to convert between an alien way of thinking about a problem into a foreign language they're not that familiar with.
AND...there may not be a need to explicitly PROGRAM before you're 15 to be good at it as an adult. There are video games that exercise similar brain regions that could substitute for actual programming experience. AND I may be 100% wrong. Would be good for someone to fund some studies.
The real problem is the measurement: speed of coding (or of doing any other job), or volume of work done. Those two are actually really bad productivity measures.
The truest 10x engineer I ever encountered was a memory firmware guy with ASIC experience who absolutely made sure to log off at 5 every day after really putting in the work. Go to guy for all parts of the codebase, even that which he didn't expressly touch.
I feel you. I had a semi-traumatic experience at a previous job, combined with a number of other factors, which has left me feeling like a shell of my former self. Now I'm not sure if I'm working at .5x or if I just feel like .5x. Such is life, at times.
The “10x engineer” comes from the observation that there is a 10x difference in productivity between the best and the worst engineers. By saying that you want to be a 1x engineer, you’re saying you want to be the least productive engineer possible. 1x is not the average, 1x is the worst.
Do 10x engineers get 10x the wages? Somehow I feel being exceptionally better than other engineers is just unfair to both you and the ones worse than you. I wouldn't want to be a 10x either; I'd rather just be a normal engineer.
In the end it doesn't matter; the whole team could be laid off at once.
Productivity has no inherent value - like efficiency and perfection, it is necessarily of something else. Its value is entirely derived.
Your definition is also vague. Someone still needs to do the legwork. One man armies who can do everything themselves don't really fit in standardized teams where everything is compartmentalized and work divided and spread out.
They work best on their own projects with nobody else in their way: no colleagues, no managers. But that's not most jobs. Once you're part of a team, you can't do too much work yourself no matter how good you are, as the slower/weaker team members will inevitably slow you down while you fight the issues they introduce into the project (or the issues from management). Every team moves at the speed of its lowest common denominator, no matter its rockstars.
That rings true and is probably why the 10x engineers I have seen usually work on devops or modify the framework the other devs are using in some way. For example, an engineer who speeds up a build or test suite by an order of magnitude is easily a 10x engineer in most organizations, in terms of man hours saved.
Nothing wrong with "one man armies" in the team context. There is a long list of tasks that need to be done. Over the same time period, one person will do 5 complex tasks (with tests and documentation), while another will do just 1 task, and then spend even more time redoing it properly.
Over time this produces funny effects, like a super-big 20-point task getting done in a few days because the "wrong" person started working on it.
On my team, one of the main multipliers is understanding the need behind the requested implementation, and proposing alternative solutions - minimizing or avoiding code changes altogether. It helps that we work on internal tooling and are very close to the process and stakeholders.
"Hmmm, there's another way to accomplish this" being the 10x. Doing things faster is not it.
Exactly this. It’s why it’s so frustrating when product managers who think they’re above giving background run the show (the ones who think they’re your manager and are therefore too important to share that with you)
I've always thought a 10x is one who sits back and sees a simpler way - like some math problems have an easy solution, if you can see it. Also: change the question; change the context (Alan Kay)
(And absolutely not brute-force grinding themselves away)
https://manuel.kiessling.net/2011/04/07/why-developing-witho...
Was it 50x productivity due to 10x engineers, or 50x productivity due to optimized company structure? (edit: obviously, these do not need to be mutually exclusive - it's a sum of all the different parts)
It's easy to bog down even the best Nx engineers if you keep them occupied with endless bullshit tasks, meetings, (ever) changing timelines, and all that.
Kind of like having a professional driver drive a sportscar through a racetrack, versus the streets of Boston.
This is a perceptive observation. In my experience, so-called "10x" engineers are as productive as they are because they have a process for developing software that anticipates future problems. As a result, when they check something in, they spend very little time "debugging" or "fixing bugs": the code already does what they need it to do.
It is always very useful as an engineer to log your time: what you are working on "right now", and whether it is "new work", "maintenance work", or "fixing work". Then, for each log entry that isn't "new work", think about what you could have done that would have caught that problem before it was committed to the code base.
I find it is much better to evaluate engineers based on how often they are solving the same problem that they had before vs creating new stuff. That ratio, for me, is the essence of the Nx engineer (for 0.1 < N < 10)
The point that Wilson makes is that having infrastructure/tools that push that ratio away from "repair" work and toward "new work" is hugely empowering to an organization.
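A back-of-the-envelope sketch of that ratio in Rust (the log entries and category names are hypothetical, just to illustrate the bookkeeping):

    // Back-of-the-envelope version of the Nx ratio described above.
    fn main() {
        // (description, category) pairs from an imaginary work log.
        let log = [
            ("implement report export", "new"),
            ("fix regression in export", "fixing"),
            ("bump library versions", "maintenance"),
            ("add import feature", "new"),
        ];

        let new = log.iter().filter(|(_, c)| *c == "new").count() as f64;
        let repair = log.len() as f64 - new;

        // Ratio of genuinely new work to re-solving old problems.
        let n = if repair > 0.0 { new / repair } else { f64::INFINITY };
        println!("N = {n:.2} (higher = more time on new work)");
    }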
> People who implement something very few people even considered or understood to be possible, which then gives amazing leverage to deliver working software in a fraction of the time.
I agree with the first part of your statement, but what really happens to such people?
In my experience (sample size greater than one), they receive some kudos, but remain underpaid, never promoted, and are given more work under tight deadlines. At least until some of them are laid off along with lower performers.
But those who say that hard things are impossible seem to get along just fine. They merely declare such things out-of-scope or lie about their roadmap.
> In my experience (sample size greater than one), they receive some kudos, but remain underpaid, never promoted, and are given more work under tight deadlines. At least until some of them are laid off along with lower performers.
100% agree, I've seen plenty of the best of the best get treated like trash and laid off at first sight of trouble on the horizon
Anyone can be a 10x engineer when they write something similar/identical to what they've written before. Other jobs are not like this. A plumber may only be 20% faster on the best days of their career.
In my experience it often comes down to business processes. We have a guy in my extended team who knows everything about his side of the company. When I work with him I accomplish business-altering deliveries in a very short amount of time, which after a week or two rarely needs to be touched again unless something in the business changes. He’s not a PO and we don’t do anything too formally because it’s just him, me and another developer + whatever business manager will benefit from the development (and a few testers from their team). In many ways the way we work these projects is very akin to Team Topologies.
At other times I’ll be assigned projects with regular POs, architects, and business employees who barely know what it is they are doing themselves, with poorly defined tasks and all sorts of bureaucratic nonsense “agile” process methods, and we’ll spend forever delivering nothing.
So sometimes I’m a 50x developer delivering business altering changes. At other times I’m a useless cog in a sea of pseudo workers. I don’t particularly care, I get paid, but if management actually knew what was going on, and how to change it… well…
How many organisations - of any kind, startups or enterprises or unicorns or whatever - will invest this much in effort that doesn't even touch the product? Before the product even exists!
I think the reluctance to invest effort in something that will give devs super-powers 6 months in the future is why we don't get all those 10x devs.
The 10x developer I know basically DOES seem to do 10x more than anyone else on our team. But they are working on a team that does relatively simple work and are by far the most senior person on that team.
It's like how an NBA player would be a 10x player on a college basketball team. Great to work with them, but if I were in their shoes I don't know how enjoyable/engaging the work would be.
Yep. Often I find our most accelerative work is stuff that makes testing changes easy (a very simple to bootstrap staging environment) or creates a lot of guarantees (typescript).
I know it's meant to be funny, but the tech people who spend zero time learning about "what's out there" are usually not the most effective developers. You won't find better solutions to existing or even new problems without an interest in the industry. Maybe this particular article isn't "industry valuable", fair enough, but having zero interest in refining and enhancing your craft beyond the work in front of you is almost guaranteed to end with worse outcomes.
Not true. I've known some very ADHD developers who are constantly context shifting and are able to fuck around on Hackernews for a while and then suddenly knock out a huge amount of work. The problem is that (speaking from personal experience) everybody with ADHD thinks they can do this and 99% cannot.
6.5 x 15 is only 97.5 hours per week, not even close to the 400 hours (10 x 40) per week of programming a 10x Rust programmer can provide. I jest, but all this 10x stuff is getting ridiculous. They stayed in "stealth" mode because they didn't have anything worth showing for 5 years. Doesn't sound all that productive to me. More likely, what they were trying to do was hard and complicated and took a while to figure out.
They're not boasting about their current productivity; they're boasting about what they achieved at FoundationDB when they implemented the testing, which gave them the idea to build Antithesis.
This might be the best introduction post I've read.
Lays the foundation (get it?) for who the people are and what they've built.
Then explains how the current thing they are building is a result of the previous thing. It feels that they actually want this problem solved for everyone because they have experienced how good the solution feels.
Then tells us about the teams (pretty big names with complex systems) that have already used it.
All of these wrapped in good writing that appeals to developers/founders. Landing page is great too!
It seems like marketing copy. Not a technical blog post.
It would be nice to see some actual use cases and examples.
Instead, the writer just name-dropped a few big companies and claimed to have a revolutionary product that works magically, then included the typical buzzwords like '10x programmer' and 'stealth mode'. The latter doesn't make sense, because they also name-drop clients.
Having that context puts the post in a much better perspective. It's definitely an introduction post (the company has been developing this in stealth mode for the past few years), but it is most certainly _not_ a marketing post. These people developed extremely novel testing techniques for FoundationDB and are now generalizing them to work with any containerized application.
It's a big deal.
It absolutely doesn’t read like typical marketing copy, and yes it’s not a dense technical blog post either. I’m sure the use cases and examples will come, but putting them in this post would have been overkill.
Also, stealth mode just means your company isn’t public, you can still have clients.
Post author here. Sorry it was vague, but there's only so much detail you can go into in a blog post aimed at general audiences. Our documentation (https://antithesis.com/docs/) has a lot more info.
Here's my attempt at a more complete answer: think of the story of the blind men and the elephant. There's a thing, called fuzzing, invented by security researchers. There's a thing, called property-based testing, invented by functional programmers. There's a thing, called network simulation, invented by distributed systems people. There's a thing, called rare-event simulation, invented by physicists (!). But if you squint, all of these things are really the same kind of thing, which we call "autonomous testing". It's where you express high-level properties of your system, and have the computer do the grunt work to see if they're true. Antithesis is our attempt to take the best ideas from each of these fields, and turn them into something really usable for the vast majority of software.
We believe the two fundamental problems preventing widespread adoption of autonomous testing are: (1) most software is non-deterministic, but non-determinism breaks the core feedback loop that guides things like coverage-guided fuzzing. (2) the state space you're searching is inconceivably vast, and the search problem in full generality is insolubly hard. Antithesis tries to address both of these problems.
So... is it fuzzing? Sort of, except you can apply it to whole interacting networked systems, not just standalone parsers and libraries. Is it property-based testing? Sort of, except you can express properties that require a "global" view of the entire state space traversed by the system, which could never be locally asserted in code. Is it fault injection or chaos testing? Sort of, except that it can use the techniques of coverage guided fuzzing to get deep into the nooks and crannies of your software, and determinism to ensure that every bug is replayable, no matter how weird it is.
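To make just the property-based ingredient concrete, here's vanilla quickcheck in Rust (ordinary open-source PBT, nothing Antithesis-specific): you state an invariant, and the machine hunts for counterexamples.

    // Run with `cargo test`; assumes `quickcheck = "1"` in Cargo.toml.
    use quickcheck::quickcheck;

    quickcheck! {
        // Invariant: sorting is idempotent and neither loses nor invents
        // elements. quickcheck generates many random inputs and shrinks
        // any counterexample it finds.
        fn sort_idempotent_and_lossless(xs: Vec<i32>) -> bool {
            let mut once = xs.clone();
            once.sort();
            let mut twice = once.clone();
            twice.sort();
            twice == once && once.len() == xs.len()
        }
    }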
It's hard to explain, because it's hard to wrap your arms around the whole thing. But our other big goal is to make all of this easy to understand and easy to use. In some ways, that's proved to be even harder than the very hard technological problems we've faced. But we're excited and up for it, and we think the payoff could be big for our whole industry.
Your feedback about what's explained well and what's explained poorly is an important signal for us in this third very hard task. Please keep giving it to us!
Sure, it doesn't go into details. And that is exactly why I termed it an excellent introduction and a sales pitch.
I haven't heard of deterministic testing before. Nor have I heard of FoundationDB or the related things. And I went from knowing zero things about them to getting impressed and interested. This led me to go into their docs, blogs, landing page, etc. to know more.
The entire testing system they describe feels like something I can strive towards too. They make you want their solution because it offers a way of life, of thinking, and of doing like you've never experienced before.
This is a great pitch, and I don't want to come across as negative, but I feel like a statement like "we found all bugs" can only be true with a very narrow definition of bug.
The most pernicious, hard-to-find bugs that I've come across have all been around the business logic of an application, rather than it hitting an error state. I'm thinking of the category where you have something like "a database is currently reporting a completed transaction against a customer, but no completed purchase item, how should it be displayed on the customer recent transactions page?". Implementing something where "a thing will appear and not crash" in those cases is one thing, but making sure that it actually makes sense as a choice given all the context of everyone else's choices everywhere else in the stack is a lot harder.
Or to take a database, something along the lines of "our query planner produces a really suboptimal plan in this edge-case".
Neither of those types of problems could ever be automatically detected, because they aren't issues of the program reaching an error state - the issue is figuring out in the first place what "correct" actually is for your application.
Maybe I'm setting the bar too high for what a "bug" is, but I guess my point is, it's one thing to fantasize about having zero bugs, and it's another to build software in the real world. I'd probably still settle for 0 runtime errors though, to be fair...
I do think that it was a mistake to use the word "all" and imply that there are absolutely no bugs in FoundationDB. However, FoundationDB is truly known as having advanced the state of the art for testing practices: https://apple.github.io/foundationdb/testing.html.
So in normal cases this would reek of someone being arrogant / overconfident, but here they really have gotten very close to zero bugs.
The other issue I would point out is that building a database, while impressive with their quality, is still fundamentally different than an application or set of applications like a larger SaaS offering would involve (api, web, mobile, etc). Like the difference between API and UI test strategies, where API has much more clearly defined and standardized inputs and outputs.
To be clear, I am not saying that you can't define all inputs and outputs of a "complete SaaS product offering stack", because you likely could, though if it's already been built by someone that doesn't have these things in mind, then it's a different problem space to find bugs.
As someone who has spent the last 15 years championing quality strategy for companies and training folks of varying roles on how to properly assess risk, it does indeed feel like this has a more narrow scope of "bug" as a definition, in the sort of way that a developer could try to claim that robust unit tests would catch "any" bugs, or even most of them. The types of risk to a software's quality have larger surface areas than at that level.
I think the reference to "all the bugs" here is basically that our insanely brutal deterministic testing system was not finding any more bugs after 100's of thousands of runs. Can't prove a negative obviously, but the fact that we'd gotten to that "all green" status gave us a ton of confidence to push forward in feature development, believing we were building on something solid - which, time has shown we were.
Thanks -- that's very clarifying! But isn't this circular? The lack of bugs is used as evidence of the effectiveness of the testing approach, but the testing approach is validated by...not finding any more bugs in the software?
What do you call it when the spec is wrong? Like clearly actually wrong, such as when someone copied a paragraph from one CRUD-describing page to the next and forgot to change the word "thing1" to "thing2" in the delete description.
Because I'd call that a bug. A spec bug, but a bug. It's no feature request to make the code based on the newer page delete thing2 rather than thing1, it's fixing a defect
A spec bug is just as bad as a code bug! Declaring a system free of defects because it matches the spec is sneaky sleight-of-hand that ignores the costs of having a spec.
The actual testing value is the difference between the cost of writing and maintaining the code, and the cost of writing and maintaining the spec.
If the spec is similar in complexity to the code itself, then bugs in the spec are just as likely as bugs in the code, thus verification to spec has gained you nothing (and probably cost you a lot).
The best definition I've heard for "bug" is "software not working as documented". Of course, a lot of software is lacking documentation -- and those are doc bugs. But I like this definition because even when the docs are incomplete, the definition guides you to ask: would I really document that the software behaves like this or would I change the behavior [and document that]? It's harder (at least for me) to sweep goofy behavior under the rug.
I've been super interested in this field since finding out about it from the `sled` simulation guide [0] (which outlines how FoundationDB does what they do).
Currently bringing a similar kind of testing in to our workplace by writing our services to run on top of `madsim` [1]. This lets us continue writing async/await-style services in tokio but then (in tests) replace them with a deterministic executor that patches all sources of non-determinism (including dependencies that call out to the OS). It's pretty seamless.
The author of this article isn't joking when they say that the startup cost of this effort is monumental. Dealing with every possible source of non-determinism, re-writing services to be testable/sans-IO [2], etc. takes a lot of engineering effort.
Once the system is in place though, it's hard to describe just how confident you feel in your code. Combined with tools like quickcheck [3], you can test hundreds of thousands of subtle failure cases in I/O, event ordering, timeouts, dropped packets, filesystem failures, etc.
This kind of testing is an incredibly powerful tool to have in your toolbelt, if you have the patience and fortitude to invest in it.
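To illustrate the core trick that makes all of this possible (a hand-rolled sketch, not madsim's actual API): route every "random" decision through a single seeded PRNG, so any failing schedule can be replayed exactly from its seed.

    // A seeded PRNG is the whole secret: same seed in, same schedule out.
    struct Lcg(u64);
    impl Lcg {
        fn next(&mut self) -> u64 {
            // Constants from Knuth's MMIX LCG.
            self.0 = self
                .0
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            self.0
        }
        fn below(&mut self, n: u64) -> u64 {
            self.next() % n
        }
    }

    // Stand-in for a simulated network: deliver pending messages in a
    // pseudo-random but seed-determined order (reordering, races, etc.).
    fn run_simulation(seed: u64) -> Vec<&'static str> {
        let mut rng = Lcg(seed);
        let mut pending = vec!["A->B: prepare", "B->A: ack", "A->B: commit"];
        let mut delivered = Vec::new();
        while !pending.is_empty() {
            let i = rng.below(pending.len() as u64) as usize;
            delivered.push(pending.swap_remove(i));
        }
        delivered
    }

    fn main() {
        // Same seed, same "random" schedule: a failure seen once can be
        // replayed forever.
        assert_eq!(run_simulation(42), run_simulation(42));
        // Different seeds explore different interleavings.
        for seed in 0..3 {
            println!("{seed}: {:?}", run_simulation(seed));
        }
    }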
As for Antithesis itself, it looks very very cool. Bringing the deterministic testing down the stack to below the OS is awesome. Should make it possible to test entire systems without wiring up a harness manually every time. Can’t wait to try it out!
> you can test hundreds of thousands of subtle failure cases in I/O, event ordering, timeouts, dropped packets, filesystem failures, etc.
As cool as all this is, I can't help but wonder how often the culture of micro-services and distributed computing is ill-advised. So much complexity I've seen in such systems boils down to the fact that calling a "function" is: async, dependent on the OS, executed at some point or never, and always returns a bunch of strings that need to be parsed to re-enter the static type system, which comes with its own set of failure modes. This makes the seemingly simple task of abstracting logic into a named component, aka a function, extremely complex. You don't need to test for any of the subtle failures you mentioned if you leave the logic inside the same process and just call a function. I know monoliths aren't always a good idea or fit; at the same time, I'm highly skeptical whether the current prevalence of service-based software architectures is justified and pays off.
> I can't help but wonder how often the culture of micro-services and distributed computing is ill-advised.
You can't get away from distributed computing, unless you get away from computing. A modern computer isn't a single unit, it's a system of computers talking to each other. Even if you go back a long time, you'll find many computers or proto-computers talking to each other, but with a lot stricter timings, as the computers are less flexible.
If you save a file to a disk, you're really asking the OS (somehow) to send a message to the computer on the storage device, asking it to store your data, and it will respond with success or failure and it might also write the data. (Sometimes it will tell your OS success and then proceed to throw the data away, which is always fun.)
That said, keeping things together where it makes sense, is definitely a good thing.
TigerBeetle is actually another customer of ours. You might ask why, given that they have their own, very sophisticated simulation testing. The answer is that they're so fanatical about correctness, they wanted a "red team" for their own fault simulator, in case a bug in their tests might hide a bug in their database!
I gotta say, that is some next-level commitment to writing a good database.
Sure! I mentioned a few orthogonal concepts that go well together, and each of the following examples has a different combination that they employ:
- the company that developed Madsim (RisingWave) [0] [1] tries the hardest to eliminate non-determinism and has the broadest scope (stubbing out syscalls, etc.)
- sled [2] itself has an interesting combo of deterministic tests combined with quickcheck+failpoints test case auto-discovery
- Dropbox [3] uses a similar approach but they talk about it a bit more abstractly.
Sans-IO is more documented in Python [4], but str0m [5] and quinn-proto [6] are the best examples in Rust I’m aware of. Note that sans-IO is orthogonal to deterministic test frameworks, but it composes well with them.
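For anyone unfamiliar, here's the rough shape of sans-IO (my own toy example, not the API of str0m or quinn-proto): the protocol is a pure state machine that consumes bytes and tells the caller what to do, while all sockets and clocks live outside it.

    // The protocol as a pure state machine: no sockets, no clocks, no IO.
    enum Output {
        Send(Vec<u8>), // caller should write these bytes to its socket
        Connected,
        Nothing,
    }

    struct Handshake {
        greeted: bool,
    }

    impl Handshake {
        fn new() -> Self {
            Handshake { greeted: false }
        }

        // Consume input bytes, tell the caller what to do next.
        fn handle_input(&mut self, bytes: &[u8]) -> Output {
            if !self.greeted && bytes == b"HELLO".as_slice() {
                self.greeted = true;
                Output::Send(b"WELCOME".to_vec())
            } else if self.greeted && bytes == b"READY".as_slice() {
                Output::Connected
            } else {
                Output::Nothing
            }
        }
    }

    fn main() {
        // A test is just function calls: no network, no sleeps, no flakes.
        let mut hs = Handshake::new();
        assert!(matches!(hs.handle_input(b"HELLO"), Output::Send(_)));
        assert!(matches!(hs.handle_input(b"READY"), Output::Connected));
    }

Because nothing in there touches the network, a deterministic simulator can drive it through any interleaving it likes.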
With the disclaimer that anything I comment on this site is my opinion alone, and does not reflect the company I work at - I do work at a Rust shop that has utilized these techniques on some projects.
TigerBeetle is an amazing example and I’ve looked at it before! They are really the best example of this approach outside of FoundationDB I think.
> Programming in this state is like living life surrounded by a force field that protects you from all harm. [...] We deleted all of our dependencies (including Zookeeper) because they had bugs, and wrote our own Paxos implementation in very little time and it _had no bugs_.
Being able to make that statement and back it by evidence must be indeed a cool thing.
> The longer I have computed, the less I seem to use Numerical Software Packages. In an ideal world this would be crazy; maybe it is even a little bit crazy today. But I've been bitten too often by bugs in those Packages. For me, it is simply too frustrating to be sidetracked while solving my own problem by the need to debug somebody else's software. So, except for linear algebra packages, I usually roll my own. It's inefficient, I suppose, but my nerves are calmer.
> The most troubling aspect of using Numerical Software Packages, however, is not their occasional goofs, but rather the way the packages inevitably hide deficiencies in a problem's formulation. We can dump a set of equations into a solver and it will usually give back a solution without complaint - even if the equations are quite poorly conditioned or have an unsuspected singularity that is distorting the answers from physical reality. Or it may give us an alternative solution that we failed to anticipate. The package helps us ignore these possibilities - or even to detect their occurrence if the execution is buried inside a larger program. Given our capacity for error-blindness, software that actually hides our errors from us is a questionable form of progress.
> And if we do detect suspicious behavior, we really can't dig into the package to find our troubles. We will simply have to reprogram the problem ourselves. We would have been better off doing so from the beginning - with a good chance that the immersion into the problem's reality would have dispelled the logical confusions before ever getting to the machine.
I suppose whether to do this depends on how rigorous one is, how rigorous certain dependencies are, and how much time one has. I'm not going to be writing my own database (too complicated, multiple well-tested options available) but if I only use a subset of the functionality of a smaller package that isn't tested well, rolling my own could make sense.
In the specific case in question, the biggest problem was that dependencies like Zookeeper weren't compatible with our testing approach, so we couldn't do true end to end tests unless we replaced them. One of the nice things about Antithesis is that because our approach to deterministic simulation is at the whole system level, we can do it against real dependencies if you can install them.
I was a co-founder of both FoundationDB and Antithesis.
That tracks well (both the quotes and your thoughts).
One example that comes to mind where I want to roll my own thing (and am in the process of doing so) is replacing our ci/cd usage of jenkins that is solely for running qa automation tests against PR's on github. Jenkins does way way more than we need. We just need github PR interaction/webhook, secure credentials management, and spawning ecs tasks on aws...
Every time I force myself to update our jenkins instance, I buckle up because there is probably some random plugin, or jenkins agent thing, or ... SOMETHING that will break and require me to spend time tracking down what broke and why. 100% surface area for issues, whilst we use <5% of what Jenkins actually provides.
1. It's a brilliant idea that came at the right time. It feels like people are finally losing patience with flaky software, see developer sentiment on: fuzzers, static typing, memory safety, standardized protocols, containers, etc.
2. It's meant to be niche. $2 per hour per CPU (or $7000 per year per CPU if reserved), no free tier for hobby or FOSS, and the only way to try/buy is to contact them. Ouch. It's a valid business model, I'm just sad it's not going for maximum positive impact.
3. Kudos for the high quality writing and documentation, and I absolutely love that the docs include things like (emphasis in original):
> If a bug is found in production, or by your customers, you should demand an explanation from us.
That's exactly how you buy developer goodwill. Reminds me of Mullvad, who I still recommend to people even after they dropped the ball on me.
Thanks for your kind words! As I mention in this comment (https://news.ycombinator.com/item?id=39358526) we are planning to have pricing suitable for small teams, and perhaps even a free tier for FOSS, in the future.
"It's meant to be niche. $2 per hour per CPU (or $7000 per year per CPU if reserved), no free tier for hobby or FOSS, and the only way to try/buy is to contact them. Ouch. It's a valid business model, I'm just sad it's not going for maximum positive impact."
This is the sort of thing that, if it takes off, will start affecting the entire software world. Hardware will start adding features to support it. In 30 years this may simply be how computing works. But the pioneers need to recover the costs of the arrows they got stuck with before it can really spread out. Don't look at this as an event, but as the beginning of a process.
I think their target audience is teams who already have mature software and comprehensive tests. From the docs, the kinds of bugs their platform is designed to find are the wild “unreproducible” kind that only happens rarely in production. Most teams have much bigger problems and obvious bugs to fix.
Heck, most software in production today barely has unit tests.
$2 per hour per CPU could be expensive or inexpensive, depending on how long it takes to fuzz your program. I wonder how that multiplies out in real use cases?
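To put illustrative numbers on it (my own assumptions, not their pricing guidance): a nightly run on 16 vCPUs for 8 hours costs 16 x 8 x $2 = $256, or about $7,680 over a 30-day month. Whether that's cheap or expensive depends entirely on what a single "unreproducible" production bug costs you.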
I met Antithesis at Strangeloop this year and got to talk to employees about the state of the art of automated fault injection that I was following when I worked at Amazon, and I cannot overstate how their product is a huge leap forward compared to many of the formal verification systems being used today.
I actually got to follow their bug-tracking process on an issue they identified in Apache Spark streaming: going off of the docs, they managed to identify a subtle and insidious correctness error in a common operation that would've caused headaches in a low-visibility edge case for years. In the end the docs were incorrect, but after that showing I cannot imagine how critical tools like Antithesis will be inside companies building distributed systems.
I hope we get some blog posts that dig into the technical weeds soon, I'd love to hear what brought them to their current approach.
I'm trying to avoid diving into the hype cycle about this immediately, but this sounds like the holy grail, right? Use your existing application as-is (assuming it's containerized), and simply check properties on it?
The blocker in doing that has always been the foundations of our machines: non-deterministic CPUs and operating systems. Re-building an entire vertical computing stack is practically impossible, so they just _avoid_ it by building a high-fidelity deterministic simulator.
I do wonder how they are checking for equivalence between the simulator and existing OS's, as that sounds like a non-trivial task. But, even still, I'm really bought in to this idea.
You still have to use their SDKs to write lots of integration tests (they call them “workloads”).
Then they run those tests while injecting all sorts of failures like OS failures, network issues, race and timing conditions, random number generator issues, etc.
It’s likely the only practical way today of testing for those things reliably, but you still have to write all of the tests and define your app state.
I feel like the idea of the legendary "10x" developer has been bastardized to just mean workers who work 15 hours a day 6.5 days a week to get something out the door until they burn out.
But here's your real 10x (or 50x) productivity. People who implement something very few people even considered or understood to be possible, which then gives amazing leverage to deliver working software in a fraction of the time.
I'd be happier if industry cares more for team productivity - I have witnessed how rewarding "10x" individuals may lead to perverse results on a wider scale, a la Cobra Effect. In one insidious case, our management-enabled, long-tenured "10x" rockstar fixed all the big customer-facing bugs quickly, but would create multiple smaller bugs and regressions for the 1x developers to fix while he moved to the next big problem worthy of his attention. Everyone else ended up being 0.7x - which made the curse of an engineer look even more productive comparatively!
Because he was allowed to break the rules, there was a growing portion of the codebase that only he could work on - while it wasn't Rust, imagine an org has a "No Unsafe Rust" rule that is optional to 1 guy. Organizations ought to be very careful how they measure productivity, and should certainly look beyond first-order metrics.
[1] I preserved a copy of it on my (no-advertising or monetization) blog here: https://realmensch.org/2017/08/25/the-parable-of-the-two-pro...
Which is weird, since so many executives are working in a VC-funded environment, and internal work should be “venture funded” as well.
I don't agree with that, there are a _lot_ of completely crap developers and they get put into positions where even the ones capable of doing so aren't allowed to because it's not on a ticket.
I've seen some thing.
Also, their professional experience is much greater. Sure, their initial jobs at 15 are the occassional weird gig for the uncle/aunt or cousin/nephew but they get picked up by professional firms at 18 and do a job next to their CS studies.
At least, that's how it used to be. Not sure if this is still happening due to the new job environment, but this was the reality from around 2004 to 2018.
For 10x engineers to exist, all it takes is a few examples. To me, everyone is in agreement that they seem to be rare. I point to a public 10x engineer. He'd never say it himself, but my guess is that this person is a 10x engineer [1].
If you disagree, I'm curious how you'd disagree. I'm just a blind man touching a part of the elephant [2]. I do not claim to see the whole picture.
[1] https://bellard.org/ (the person who created JSLinux)
[2] https://en.wikipedia.org/wiki/Blind_men_and_an_elephant - if you don't know the parable, it's a fun one!
By the time I joined my first real company at age 21, I was ready to start putting a lot of this stuff into place. I joined a small med device software company which had a great product but really no strong software engineering culture: zero unit tests, using CVS with no branches, release builds were done manually on the COO's workstation, etc.
As literally the most junior person in the company I worked through all these things and convinced my much more senior colleagues that we should start using release branches instead of "hey everybody, please don't check in any new code until we get this release out the door". I wrote automated build scripts mostly for my own benefit, until the COO realized that he didn't have to worry about keeping a dev environment on his machine, now that he didn't code any more. I wrote a junit-inspired unit testing framework for the language we were using (https://en.wikipedia.org/wiki/IDL_(programming_language) - like Matlab but weirder).
Without my work as a "10x junior engineer", the company would have been unable to scale to more than 3 or 4 developers. I got involved in hiring and made sure we were hiring people who were on board with writing tests. We finally turned into a "real" software company 2 or 3 years after I joined.
Being a programming prodigy is nice, but I don't think you even really need that.
I am not sure there is 10x difference, but there is at least 5x difference in performance, in favor of fresh college grad, and they are now working on the more complex tasks too.
The sad part is our hiring is still heavily in "senior engineer with lots of experience" phase, and intership program has been canceled.
You probably surf the internet 10x faster than your parents. Yes you've probably had more exposure than them, but you could probably teach them how to do it just as fast. But would they want to learn and would they actually adapt what you taught them?
I became friends with Dave our CTO when I was 5 or 6, we were neighbors. He'd already started coding little games in Basic (this was 1985). Later in our friendship, like when I was maybe 10, I asked him if he could help me learn to code, which he did. After a week or two I had made some progress but compared what I could do to what he was doing and figured "I guess I just started too late, what's the point?".
I found out later that most people didn't start coding till late HS or college! It worked out though - I'm programmer adjacent and have taken care of the business side of our projects through the years :)
This is a relatively small part of it.
The majority of developers who have been programming for 20 years, maybe learned a few tricks along the way then got stuck in a local maximum.
There are a few who learn deep computer science principles and understand how to apply them to novel problems. I'm thinking of techniques like in this book:
https://www.everand.com/book/282526076/Paradigms-of-Artifici...
(mainly because Peter Norvig is my goto as the paradigmatic 10x developer)
For example, in the Efficiency Issues chapter about how to optimize programs, Norvig lists writing a compiler from one language into a more efficient one. Most developers who have been working 20 years, either won't think of that, or understand how to implement it. But these are the kinds of things that can result in really outsize productivity gains.
No: It's not because they have 10 more years of experience. Read "The Mythical Man Month." That's the book that popularized the concept that some developers were 5-25x faster than others. One of the takeaways was that the speed of a developer was not correlated with experience. At all.
That said, the kind of person who can learn programming at 12 might just be the kind of person who is really good at programming.
I started learning programming concepts at 11-12. I'm not the best programmer I know, but when I started out in the industry at 22 I was working with developers with 10+ years of (real) experience on me...and I was able to come in and improve on their code to an extreme degree. I was completing my projects faster than other senior developers. With less than two years of experience in the industry I was promoted to "senior" developer and put on a project as lead (and sole) developer and my project was the only one to be completed on time, and with no defects. (This is video game industry, so it wasn't exactly a super-simple project; at the time this meant games written 100% in assembly language with all kinds of memory and performance constraints, and a single bug meant Nintendo would reject the image and make you fix the problem. We got our cartridge approved the first time through.)
Some programmers are just faster and more intuitive with programming than others. This shouldn't be a surprise. Some writers are better and faster than others. Some artists are better and faster than others. Some architects are better and faster than others. Some product designers are better and faster than others. It's not all about the number of hours of practice in any of these cases; yes, the best in a field often practices an insane amount. But the very top in each field, despite having similar numbers of hours of practice and experience, can vary in skill by an insane amount. Even some of the best in each field are vastly different in speed: You can have an artist who takes years to paint a single painting, and another who does several per week, but of similar ultimate quality. Humans have different aptitudes. This shouldn't even be controversial.
I do wonder if the "learned programming at 12" has anything to do with it: Most people will only ever be able to speak a language as fluently as a native speaker if they learn it before they're about 13-14 years old. After that the brain (again, for most people; this isn't universal) apparently becomes less flexible. In MRI studies they can actually detect differences between the parts of the brain used to learn a foreign language as an adult vs. as a tween or early teen. So there's a chance that early exposure to the right concepts actually reshapes the brain. But that's just conjecture mixed with my intuition of the situation: When I observe "normal" developers program, it really feels like I'm a native speaker and they're trying to convert between an alien way of thinking about a problem into a foreign language they're not that familiar with.
AND...there may not be a need to explicitly PROGRAM before you're 15 to be good at it as an adult. There are video games that exercise similar brain regions that could substitute for actual programming experience. AND I may be 100% wrong. Would be good for someone to fund some studies.
In the end it doesn't matter, whole team could be laid off at once.
Productivity has no inherent value - like efficiency and perfection, it is necessarily of something else. Its value is entirely derived.
They work best on their own projects with nobody else in their way, no colleagues, no managers, but that's not most jobs. Once you're part of a team, you can't do too much work yourself no matter how good you are, as inevitably the other slower/weaker team members will slow you down as you'll fight dealing with the issues they introduce into the project or the issues from management, so every team moves at the speed of the lowest common denominator no matter their rockstars.
Over time this produces funny effects, like super-big 20 point task done in few days because wrong person started working on it.
"Hmmm, there's another way to accomplish this" being the 10x. Doing things faster is not it.
(And absolutely not brute-force grinding themselves away)
https://manuel.kiessling.net/2011/04/07/why-developing-witho...
It's easy to bog down even the best Nx engineers if you keep them occupied with endless bullshit tasks, meetings, (ever) changing timelines, and all that.
Kind of like having a professional driver drive a sportscar through a racetrack, versus the streets of Boston.
It is always very useful as an engineer to log your time, what are you working on "right now" and is it "new work" , "maintenance work", or "fixing work." Then for each log entry that isn't "new work" thinking about what you could have done that would have caught that problem before it was committed to the code base.
I find it is much better to evaluate engineers based on how often they are solving the same problem that they had before vs creating new stuff. That ratio, for me, is the essence of the Nx engineer (for 0.1 < N < 10)
The point that Wilson makes that having infrastructure/tools that push that ratio further from "repair" work to "new work" is hugely empowering to an organization.
I agree with the first part of your statement, but what really happens to such people?
In my experience (sample size greater than one), they receive some kudos, but remain underpaid, never promoted, and are given more work under tight deadlines. At least until some of them are laid off along with lower performers.
But for those who say that hard things are impossible, they seem to get along just fine. They merely declare such things as out-of-scope or lie about their roadmap.
100% agree, I've seen plenty of the best of the best get treated like trash and laid off at first sight of trouble on the horizon
At other times I'll be assigned projects with regular POs, architects, and business employees who barely know what it is they're doing themselves, with poorly defined tasks and all sorts of bureaucratic nonsense "agile" process methods - and we'll spend forever delivering nothing.
So sometimes I’m a 50x developer delivering business altering changes. At other times I’m a useless cog in a sea of pseudo workers. I don’t particularly care, I get paid, but if management actually knew what was going on, and how to change it… well…
I think the reluctance to invest effort in something that will give devs super-powers 6 months in the future is why we don't get all those 10x devs.
It's like how an NBA player would be a 10x player on a college basketball team. Great to work with, but if I was in their shoes I don't know how enjoyable/engaging the work would be.
Lays the foundation (get it?) for who the people are and what they've built.
Then explains how the current thing they are building is a result of the previous thing. It feels that they actually want this problem solved for everyone because they have experienced how good the solution feels.
Then tells us about the teams (pretty big names with complex systems) that have already used it.
All of these wrapped in good writing that appeals to developers/founders. Landing page is great too!
It would be nice to see some actual use cases and examples.
Instead, the writer just name-drops a few big companies and claims to have a revolutionary product that works magically, then includes the typical buzzwords like "10x programmer" and "stealth mode". The latter doesn't make sense, because they also name-drop clients.
Having that context puts the post in a much better perspective. It's definitely an introduction post (the company has been developing this in stealth mode for the past few years), but it is most certainly _not_ a marketing post. These people developed extremely novel testing techniques for FoundationDB and are now generalizing them to work with any containerized application.
It's a big deal.
Also, stealth mode just means your company isn't public; you can still have clients.
Here's my attempt at a more complete answer: think of the story of the blind men and the elephant. There's a thing, called fuzzing, invented by security researchers. There's a thing, called property-based testing, invented by functional programmers. There's a thing, called network simulation, invented by distributed systems people. There's a thing, called rare-event simulation, invented by physicists (!). But if you squint, all of these things are really the same kind of thing, which we call "autonomous testing". It's where you express high-level properties of your system, and have the computer do the grunt work to see if they're true. Antithesis is our attempt to take the best ideas from each of these fields, and turn them into something really usable for the vast majority of software.
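To make "express high-level properties and have the computer do the grunt work" concrete, here's the smallest version of the idea: a property-based test using the quickcheck crate (mentioned elsewhere in this thread, assumed here as a dev-dependency). The property is my own illustrative example, not anything Antithesis-specific:

```rust
#[cfg(test)]
mod tests {
    use quickcheck::quickcheck;

    quickcheck! {
        // Property: sorting is idempotent and preserves length.
        // The runner generates hundreds of random inputs and shrinks
        // any counterexample to a minimal failing case.
        fn sort_is_idempotent(xs: Vec<u32>) -> bool {
            let mut once = xs.clone();
            once.sort();
            let mut twice = once.clone();
            twice.sort();
            once == twice && once.len() == xs.len()
        }
    }
}
```

Fuzzing, network simulation, and rare-event simulation all generalize this same loop: generate inputs, check properties, keep the interesting cases.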
We believe the two fundamental problems preventing widespread adoption of autonomous testing are: (1) most software is non-deterministic, but non-determinism breaks the core feedback loop that guides things like coverage-guided fuzzing. (2) the state space you're searching is inconceivably vast, and the search problem in full generality is insolubly hard. Antithesis tries to address both of these problems.
So... is it fuzzing? Sort of, except you can apply it to whole interacting networked systems, not just standalone parsers and libraries. Is it property-based testing? Sort of, except you can express properties that require a "global" view of the entire state space traversed by the system, which could never be locally asserted in code. Is it fault injection or chaos testing? Sort of, except that it can use the techniques of coverage guided fuzzing to get deep into the nooks and crannies of your software, and determinism to ensure that every bug is replayable, no matter how weird it is.
It's hard to explain, because it's hard to wrap your arms around the whole thing. But our other big goal is to make all of this easy to understand and easy to use. In some ways, that's proved to be even harder than the very hard technological problems we've faced. But we're excited and up for it, and we think the payoff could be big for our whole industry.
Your feedback about what's explained well and what's explained poorly is an important signal for us in this third very hard task. Please keep giving it to us!
I haven't heard of deterministic testing before. Nor have I heard of FoundationDB or the related things. And I went from knowing zero things about them to getting impressed and interested. This led me to go into their docs, blogs, landing page, etc. to know more.
The linked article is 3/4 history and rationale before it actually tells you what they've built.
It's like those pesky recipe blogs that tell you about the author's childhood when you just want to make vegan pancakes.
The most pernicious, hard-to-find bugs that I've come across have all been around the business logic of an application, rather than it hitting an error state. I'm thinking of the category where you have something like: "the database is currently reporting a completed transaction against a customer, but no completed purchase item - how should that be displayed on the customer's recent transactions page?". Implementing something where "a thing will appear and not crash" in those cases is one thing, but making sure it actually makes sense as a choice, given all the context of everyone else's choices everywhere else in the stack, is a lot harder.
Or to take a database, something along the lines of "our query planner produces a really suboptimal plan in this edge-case".
Neither of those types of problems could ever be automatically detected, because they aren't cases of the program reaching an error state - the issue is figuring out in the first place what "correct" actually is for your application.
Maybe I'm setting the bar too high for what a "bug" is, but I guess my point is: it's one thing to fantasize about having zero bugs, and another to build software in the real world. I'd probably still settle for zero runtime errors, though, to be fair...
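For what it's worth, checking such a rule is the easy half; deciding that it is the rule is the hard half. A hypothetical schema and invariant mirroring the example above (all names made up):

```rust
// Hypothetical schema mirroring the transaction/purchase-item example.
struct Transaction { id: u64, completed: bool }
struct PurchaseItem { transaction_id: u64, completed: bool }

/// One *possible* business rule (not "the" rule): every completed
/// transaction must have at least one completed purchase item.
fn violations(txs: &[Transaction], items: &[PurchaseItem]) -> Vec<u64> {
    txs.iter()
        .filter(|t| t.completed)
        .filter(|t| {
            !items
                .iter()
                .any(|i| i.transaction_id == t.id && i.completed)
        })
        .map(|t| t.id)
        .collect() // ids of transactions violating the rule
}

fn main() {
    let txs = vec![Transaction { id: 1, completed: true }];
    let items: Vec<PurchaseItem> = vec![]; // no items at all
    // Is this a bug, or a legitimate intermediate state? No tool can
    // decide that for you - that's the validation problem.
    println!("violations: {:?}", violations(&txs, &items));
}
```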
So in normal cases this would reek of someone being arrogant / overconfident, but here they really have gotten very close to zero bugs.
To be clear, I am not saying that you can't define all inputs and outputs of a "complete SaaS product offering stack" - you likely could. But if it's already been built by someone who didn't have these things in mind, then finding bugs is a different problem space.
As someone who has spent the last 15 years championing quality strategy for companies and training folks of varying roles on how to properly assess risk, it does indeed feel like this uses a narrower definition of "bug" - in the sort of way that a developer could claim that robust unit tests would catch "any" bugs, or even most of them. The risks to a piece of software's quality have a larger surface area than that level covers.
Issues around business logic are not failures of the system: the system worked to spec; the spec just wasn't comprehensive enough, and now we iterate.
Because I'd call that a bug. A spec bug, but a bug. Making the code delete thing2 rather than thing1 based on the newer page is no feature request; it's fixing a defect.
Verification is "does this thing do what I asked it to do".
Validation is "did I ask it to do the right thing".
The actual testing value is the difference between the cost of writing and maintaining the code, and the cost of writing and maintaining the spec.
If the spec is similar in complexity to the code itself, then bugs in the spec are just as likely as bugs in the code, so verification against the spec has gained you nothing (and probably cost you a lot).
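A concrete case where that gap is large: the spec for sorting - "the output is a sorted permutation of the input" - is a few lines no matter how hairy the implementation is. A sketch of spec-as-checker (my example, not the commenter's):

```rust
/// Spec half 1: output is in non-decreasing order.
fn is_sorted(xs: &[u32]) -> bool {
    xs.windows(2).all(|w| w[0] <= w[1])
}

/// Spec half 2: output is a rearrangement of the input.
fn is_permutation(xs: &[u32], ys: &[u32]) -> bool {
    let (mut a, mut b) = (xs.to_vec(), ys.to_vec());
    a.sort();
    b.sort();
    a == b
}

/// The whole spec for *any* sorting routine, independent of how that
/// routine is implemented: a handful of lines vs. hundreds.
fn sort_spec(input: &[u32], output: &[u32]) -> bool {
    is_sorted(output) && is_permutation(input, output)
}

fn main() {
    let input = [3, 1, 2];
    let mut output = input;
    output.sort();
    assert!(sort_spec(&input, &output));
}
```

When the spec stays this much simpler than the code, verification buys you a lot; when it doesn't, you're just maintaining the program twice.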
I agree that suboptimal query planning would be a database-layer bug, a defect which could easily be missed by the bug-testing framework.
Currently bringing a similar kind of testing in to our workplace by writing our services to run on top of `madsim` [1]. This lets us continue writing async/await-style services in tokio but then (in tests) replace them with a deterministic executor that patches all sources of non-determinism (including dependencies that call out to the OS). It's pretty seamless.
The author of this article isn't joking when they say that the startup cost of this effort is monumental. Dealing with every possible source of non-determinism, re-writing services to be testable/sans-IO [2], etc. takes a lot of engineering effort.
Once the system is in place though, it's hard to describe just how confident you feel in your code. Combined with tools like quickcheck [3], you can test hundreds of thousands of subtle failure cases in I/O, event ordering, timeouts, dropped packets, filesystem failures, etc.
This kind of testing is an incredibly powerful tool to have in your toolbelt, if you have the patience and fortitude to invest in it.
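For readers who haven't seen the pattern, here's a hand-rolled miniature of the core idea - route every source of non-determinism through a seam, then swap in a seeded, virtual-time implementation under test. This is emphatically not madsim's actual API, just the shape of the technique:

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// The seam: everything non-deterministic goes through this trait.
trait Environment {
    fn now_millis(&mut self) -> u64;
    fn random_u64(&mut self) -> u64;
}

/// Production environment: real wall clock (entropy stubbed for brevity).
struct RealEnv;

impl Environment for RealEnv {
    fn now_millis(&mut self) -> u64 {
        SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .unwrap()
            .as_millis() as u64
    }
    fn random_u64(&mut self) -> u64 {
        // Real code would use a proper entropy source here.
        self.now_millis().wrapping_mul(0x9E37_79B9_7F4A_7C15)
    }
}

/// Test environment: virtual clock plus a seeded PRNG.
/// Same seed => byte-for-byte identical run, so failures replay exactly.
struct SimEnv {
    clock: u64,
    rng: u64, // must be non-zero for xorshift
}

impl Environment for SimEnv {
    fn now_millis(&mut self) -> u64 {
        self.clock += 1; // time advances only when the simulation says so
        self.clock
    }
    fn random_u64(&mut self) -> u64 {
        // xorshift64: a tiny deterministic PRNG.
        self.rng ^= self.rng << 13;
        self.rng ^= self.rng >> 7;
        self.rng ^= self.rng << 17;
        self.rng
    }
}

/// Logic under test: given the same environment, it makes the same choice.
fn retry_delay_ms(env: &mut dyn Environment) -> u64 {
    500 + env.random_u64() % 500
}

fn main() {
    let mut sim = SimEnv { clock: 0, rng: 42 };
    // Prints the same delay on every run with seed 42.
    println!("sim delay = {}ms", retry_delay_ms(&mut sim));
    println!("prod delay = {}ms", retry_delay_ms(&mut RealEnv));
}
```

madsim goes much further (simulated networks, patched syscalls), but the payoff is the same: same seed, same execution, same bug.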
As for Antithesis itself, it looks very very cool. Bringing the deterministic testing down the stack to below the OS is awesome. Should make it possible to test entire systems without wiring up a harness manually every time. Can’t wait to try it out!
[0]: https://sled.rs/simulation.html
[1]: https://github.com/madsim-rs/madsim?tab=readme-ov-file#madsi...
[2]: https://sans-io.readthedocs.io/
[3]: https://github.com/BurntSushi/quickcheck?tab=readme-ov-file#...
As cool as all this is, I can't help but wonder how often the culture of microservices and distributed computing is ill-advised. So much of the complexity I've seen in such systems boils down to the fact that calling a "function" is: async, dependent on the OS, executed at some point or never, and always returning a bunch of strings that need to be parsed to re-enter the static type system - each of which comes with its own set of failure modes. This makes the seemingly simple task of abstracting logic into a named component, a.k.a. a function, extremely complex. You don't need to test for any of the subtle failures you mentioned if you leave the logic inside the same process and just call a function. I know monoliths aren't always a good idea or a good fit; at the same time, I'm highly skeptical that the current prevalence of service-based software architectures is justified and pays off.
You can't get away from distributed computing, unless you get away from computing. A modern computer isn't a single unit, it's a system of computers talking to each other. Even if you go back a long time, you'll find many computers or proto-computers talking to each other, but with a lot stricter timings, as the computers are less flexible.
If you save a file to a disk, you're really asking the OS (somehow) to send a message to the computer on the storage device, asking it to store your data; it will respond with success or failure, and it might also actually write the data. (Sometimes it will tell your OS "success" and then proceed to throw the data away, which is always fun.)
That said, keeping things together where it makes sense, is definitely a good thing.
Are there public examples of what such a re-write looks like?
Also, are you working at a rust shop that's developing this way?
Final note: TigerBeetle is another product that was written this way.
I gotta say, that is some next-level commitment to writing a good database.
Disclosure: Antithesis co-founder here.
- the company that developed madsim (RisingWave) [0] [1] tries the hardest to eliminate non-determinism, with the broadest scope (stubbing out syscalls, etc.)
- sled [2] itself has an interesting combo of deterministic tests and quickcheck+failpoints test-case auto-discovery
- Dropbox [3] uses a similar approach but they talk about it a bit more abstractly.
Sans-IO is more documented in Python [4], but str0m [5] and quinn-proto [6] are the best examples in Rust I’m aware of. Note that sans-IO is orthogonal to deterministic test frameworks, but it composes well with them.
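For anyone unfamiliar with the term, here's sans-IO in miniature: the protocol is a pure state machine that never touches sockets or clocks, and the caller performs all actual I/O. This is my own toy sketch, not the API of str0m or quinn-proto:

```rust
/// What the caller should do next; the state machine never does I/O.
enum Output {
    /// Bytes the caller should put on the wire itself.
    Transmit(Vec<u8>),
    /// A protocol-level event for the application.
    Connected,
}

enum State { Idle, AwaitingAck, Established }

/// A toy one-step handshake, written sans-IO.
struct Handshake { state: State }

impl Handshake {
    fn new() -> Self { Handshake { state: State::Idle } }

    /// Caller drives the protocol; we only say what to send.
    fn start(&mut self) -> Vec<Output> {
        self.state = State::AwaitingAck;
        vec![Output::Transmit(b"SYN".to_vec())]
    }

    /// Caller feeds in bytes it read from *its* socket.
    fn handle_input(&mut self, byte: u8) -> Vec<Output> {
        match (&self.state, byte) {
            (State::AwaitingAck, b'A') => {
                self.state = State::Established;
                vec![Output::Connected]
            }
            _ => vec![],
        }
    }
}

fn main() {
    // No sockets anywhere: a test can drive the whole protocol,
    // deterministically, by calling methods and inspecting outputs.
    let mut hs = Handshake::new();
    let _to_send = hs.start();
    let events = hs.handle_input(b'A');
    assert!(matches!(events.as_slice(), [Output::Connected]));
}
```

Because nothing in the state machine does I/O, a test (or a deterministic simulator) can drive every code path just by calling methods and inspecting outputs.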
With the disclaimer that anything I comment on this site is my opinion alone and does not reflect the company I work at: I do work at a Rust shop that has utilized these techniques on some projects.
TigerBeetle is an amazing example and I’ve looked at it before! They are really the best example of this approach outside of FoundationDB I think.
[0]: https://risingwave.com/blog/deterministic-simulation-a-new-e...
[1]: https://risingwave.com/blog/applying-deterministic-simulatio...
[2]: https://github.com/spacejam/sled
[3]: https://dropbox.tech/infrastructure/-testing-our-new-sync-en...
[4]: https://fractalideas.com/blog/sans-io-when-rubber-meets-road...
[5]: https://github.com/algesten/str0m
[6]: https://docs.rs/quinn-proto/0.10.6/quinn_proto/struct.Connec...
> Programming in this state is like living life surrounded by a force field that protects you from all harm. [...] We deleted all of our dependencies (including Zookeeper) because they had bugs, and wrote our own Paxos implementation in very little time and it _had no bugs_.
Being able to make that statement and back it by evidence must be indeed a cool thing.
pp. 65-66:
> The longer I have computed, the less I seem to use Numerical Software Packages. In an ideal world this would be crazy; maybe it is even a little bit crazy today. But I've been bitten too often by bugs in those Packages. For me, it is simply too frustrating to be sidetracked while solving my own problem by the need to debug somebody else's software. So, except for linear algebra packages, I usually roll my own. It's inefficient, I suppose, but my nerves are calmer.
> The most troubling aspect of using Numerical Software Packages, however, is not their occasional goofs, but rather the way the packages inevitably hide deficiencies in a problem's formulation. We can dump a set of equations into a solver and it will usually give back a solution without complaint - even if the equations are quite poorly conditioned or have an unsuspected singularity that is distorting the answers from physical reality. Or it may give us an alternative solution that we failed to anticipate. The package helps us ignore these possibilities - or even to detect their occurrence if the execution is buried inside a larger program. Given our capacity for error-blindness, software that actually hides our errors from us is a questionable form of progress.
> And if we do detect suspicious behavior, we really can't dig into the package to find our troubles. We will simply have to reprogram the problem ourselves. We would have been better off doing so from the beginning - with a good chance that the immersion into the problem's reality would have dispelled the logical confusions before ever getting to the machine.
I suppose whether to do this depends on how rigorous one is, how rigorous certain dependencies are, and how much time one has. I'm not going to be writing my own database (too complicated, multiple well-tested options available) but if I only use a subset of the functionality of a smaller package that isn't tested well, rolling my own could make sense.
I was a co-founder of both FoundationDB and Antithesis.
One example that comes to mind where I want to roll my own thing (and am in the process of doing so) is replacing our CI/CD usage of Jenkins, which is solely for running QA automation tests against PRs on GitHub. Jenkins does way, way more than we need. We just need GitHub PR interaction/webhooks, secure credentials management, and spawning ECS tasks on AWS...
Every time I force myself to update our Jenkins instance, I buckle up, because there is probably some random plugin, or Jenkins agent thing, or... SOMETHING that will break and require me to spend time tracking down what broke and why. 100% of the surface area for issues, whilst we use <5% of what Jenkins actually provides.
I do not make the claim my spec has no bugs.
1. It's a brilliant idea that came at the right time. It feels like people are finally losing patience with flaky software, see developer sentiment on: fuzzers, static typing, memory safety, standardized protocols, containers, etc.
2. It's meant to be niche. $2 per hour per CPU (or $7000 per year per CPU if reserved), no free tier for hobby or FOSS, and the only way to try/buy is to contact them. Ouch. It's a valid business model, I'm just sad it's not going for maximum positive impact.
3. Kudos for the high quality writing and documentation, and I absolutely love that the docs include things like (emphasis in original):
> If a bug is found in production, or by your customers, you should demand an explanation from us.
That's exactly how you buy developer goodwill. Reminds me of Mullvad, who I still recommend to people even after they dropped the ball on me.
Disclosure: Antithesis co-founder.
This is the sort of thing that, if it takes off, will start affecting the entire software world. Hardware will start adding features to support it. In 30 years this may simply be how computing works. But the pioneers need to recover the costs of the arrows they got stuck with before it can really spread out. Don't look at this as an event, but as the beginning of a process.
Heck, most software in production today barely has unit tests.
I actually got to follow their bug-tracking process on an issue they identified in Apache Spark streaming - going off of the docs, they managed to identify a subtle and insidious correctness error in a common operation that would've been causing headaches in low-visibility edge cases for years at that point. In the end the docs were incorrect, but after that showing, I can only imagine how critical tools like Antithesis will be inside companies building distributed systems.
I hope we get some blog posts that dig into the technical weeds soon, I'd love to hear what brought them to their current approach.
The blocker in doing that has always been the foundations of our machines: non-deterministic CPUs and operating systems. Re-building an entire vertical computing stack is practically impossible, so they just _avoid_ it by building a high-fidelity deterministic simulator.
I do wonder how they check for equivalence between the simulator and existing OSes, as that sounds like a non-trivial task. But even still, I'm really bought into this idea.
Then they run those tests while injecting all sorts of failures like OS failures, network issues, race and timing conditions, random number generator issues, etc.
It’s likely the only practical way today of testing for those things reliably, but you still have to write all of the tests and define your app state.
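As a toy illustration of why determinism matters for that kind of fault injection (my sketch, not how Antithesis actually works): if every "random" failure comes from a seeded generator, the exact fault schedule that produced a bug can be replayed at will.

```rust
/// A simulated lossy network whose misbehavior is driven only by a seed.
struct Net { state: u64 }

impl Net {
    fn new(seed: u64) -> Self { Net { state: seed } }

    /// A tiny LCG: deterministic "randomness" derived from the seed.
    fn next(&mut self) -> u64 {
        self.state = self
            .state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.state >> 33
    }

    /// Deliver, drop, or duplicate a message based only on the seed.
    fn deliver(&mut self, msg: &str) -> Vec<String> {
        match self.next() % 10 {
            0 => vec![],                                 // dropped
            1 => vec![msg.to_string(), msg.to_string()], // duplicated
            _ => vec![msg.to_string()],                  // delivered once
        }
    }
}

fn main() {
    // Same seed => same fault schedule => same bug, run after run.
    let mut net = Net::new(0xDEAD_BEEF);
    for m in ["prepare", "promise", "accept", "learn"] {
        println!("{m} -> {:?}", net.deliver(m));
    }
}
```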