In grad school I took a formal methods class where we proved properties about programs, and it completely changed how I think about bugs. The main things I took from the class were:
1. Correctness of a program is distinct from its performance.
2. Program correctness can be proven.
3. Optimizing for performance often makes it harder to prove correctness.
I do not actually use formal methods in my work as a developer, but the class helped improve my program quality nonetheless. Now I generally think in terms of a program being correct rather than having no bugs. Technically these are the same thing, but the change of language brings a change of focus. I also generally try to use the term "error" instead of "bug" for an incorrect program.
My strategy is to write the simplest correct version of the program first, convince myself that it is correct, and then optimize if necessary, without regressing on correctness. I generally use tests rather than formal proofs, though, so of course there is still the possibility of uncaught errors, but this strategy works well overall.
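To make that concrete, here is a minimal sketch of the idea (my own illustration, not the commenter's code): keep the simple version around as a reference and test the optimized version against it on random inputs, so optimization cannot silently regress on correctness.

```python
import random

def has_duplicate_simple(items):
    # Simplest correct version: O(n^2), easy to convince yourself it is right.
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False

def has_duplicate_fast(items):
    # Optimized version: O(n), using a set of values seen so far.
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False

# Check the optimized version against the simple reference on random inputs.
for _ in range(1000):
    data = [random.randint(0, 20) for _ in range(random.randint(0, 15))]
    assert has_duplicate_fast(data) == has_duplicate_simple(data)
```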
Thinking this way also gives me guidance as to how to break down a program into modules and subprograms: anything that is too big or complex for me to be able to reason about its correctness must be subdivided into pieces with well-defined correctness.
It also has clarified for me what premature optimization means: it is optimizing a program before you know it's correct.
(EDIT: fixed "reason about its complexity" to say "reason about its correctness" in the penultimate paragraph.)
Correctness of a program is usually distinct from performance.
One obvious exception is code processing secret data. There, performance variance creates observable side-effects (timing side channels) which can be used to determine the secrets' values.
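A common concrete case is comparing a secret (say, an auth token) against user input. A small sketch in Python, not taken from the comment, showing the difference between an early-exit comparison and a constant-time one:

```python
import hmac

def insecure_equals(secret: bytes, guess: bytes) -> bool:
    # Returns as soon as a byte differs, so the running time reveals how long
    # the matching prefix is -- a classic timing side channel.
    if len(secret) != len(guess):
        return False
    for x, y in zip(secret, guess):
        if x != y:
            return False
    return True

def constant_time_equals(secret: bytes, guess: bytes) -> bool:
    # hmac.compare_digest avoids the early exit, so the comparison time
    # does not depend on where the first mismatch occurs.
    return hmac.compare_digest(secret, guess)
```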
Another is any sort of hard real-time system, where performance is critical to correctness. For example, a brake-by-wire system that took 10 seconds to respond to the pedal being pressed would be incorrect, because of poor performance.
Otherwise, I agree. There might be some other exceptions, but striving for correctness first is a good way to write code.
Even for hard real-time systems, focusing on correctness first is often the right way to go. At least in my experience, it's typically been much easier to make a correct system fast than to make a fast, wrong system correct. I've seen some particularly delightful (read: catastrophic) results when folks really wanted to push their fast, wrong code (fixed years ago, fortunately, and they had a good corporate culture change not long after, so no point in naming and shaming).
Achieving correctness is really satisfying; however, it is hard. IMHO this tends to polarize the scene (proving fanatics on one extreme, and on the other side people who are not even testing their code).
IMHO fleshing out what you are designing does help and goes a long way toward having fewer bugs. One relatively old yet easy and accessible formal toolkit that helps with this fleshing-out process is Z notation. One of the accessible books, old yet an easy and rewarding read, is
"Software Development with Z. A practical approach to formal methods in software engineering"
there are other notations developed later. but its simplicity and easiness even while scribling on paper or word processor gets me back to using it every now and then
I have a good track record of my programs working without bugs; I don't think it's too hard. The way I work is to restrict myself to building blocks that I know work well and produce correct results. For example: using state machines, never breaking out of a loop, tackling all the edge cases before the main body, using simple data structures, not using inheritance or exceptions, not interacting with files natively, not using recursion, and so on.
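For illustration (mine, not the commenter's), here is what two of those building blocks can look like together: an explicit state machine with the edge case handled before the loop body, and no break statements.

```python
def count_words(text):
    # Edge case first, before the body.
    if not text:
        return 0
    # Explicit two-state machine, plain loop, no breaks.
    OUTSIDE, INSIDE = 0, 1
    state = OUTSIDE
    words = 0
    for ch in text:
        if state == OUTSIDE and not ch.isspace():
            words += 1
            state = INSIDE
        elif state == INSIDE and ch.isspace():
            state = OUTSIDE
    return words

assert count_words("") == 0
assert count_words("  two  words ") == 2
```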
When I face a programming problem I map the solution to those simple building blocks, and if I can't, I try to change the problem first.
Formal methods are hard if you want to prove the correctness of a hard algorithm, but you can almost always do without a hard algorithm and use something really basic instead, and for the basic things you don't need formal methods.
The people who write the most bugs in my experience do it because they don't fully understand the building blocks they're using, and rely on other things for correctness like the type checker or unit tests. They view code as a stochastic process, write code they don't fully understand and have accepted the resulting bugs as a fact of life.
It is a great time to work with proofs and formal verification; there are so many nice tools with different goals, like Idris, Agda, Lean, Coq, TLA+, etc., as well as unit and system tests. Generally, though, clients don't pay for that; they pay for sloppy and buggy.
It can be proven, in broadly the same sense as all of the atoms in your body can simultaneously "decide" to exist an inch to the right due to quantum field theory.
It is not practical to prove correctness for the vast majority of programs, and there are programs that are demonstrably incorrect, cannot be made correct, and yet are useful and still function anyway (e.g., closing TCP connections cannot be done "correctly").
> It is not practical to prove correctness for the vast majority of programs,
That is an excuse I hear a lot from software developers whose testing only covers the expected behaviour of a program, without any edge cases.
And this is also the reason why iMessage and WhatsApp are full of one-click exploits.
Correctness is also distinct from applicability/usefulness. A program that misses some important edge cases is only marginally more useful than a program that takes an impractical amount of time for most inputs. So "Correctness of a program is distinct from its performance" doesn't imply that performance is optional.
I clicked on this because of the crazy title, but it's actually a really insightful article, e.g. "Conversely, there are people with commitment issues; they want to experiment non-stop and thus have no faith in robustness." ... There's this belief that bugs will just happen anyway, so why worry about them? But the author's point is that a little bit of extra thought and work can make a lot of difference to quality.
I’ve caught multiple production bugs in code review by just carefully reasoning about the code and changing some variable names to match my mental model. I didn’t quite understand how bad it was, but I could tell it wasn’t right so I left a comment saying “this seems wrong”. What happened after? It was merged without addressing and it triggered a P1 regression in production two weeks later. Like the author said, it takes time and energy to truly understand a program and think through all the modes of execution. BUT I think it’s a worthwhile exercise. It just doesn’t really get rewarded. In my experience, this happened at least twice a year over my last 10 years of working in software.
The other example is blindly accepting AI generated code, which I think is an extension of copying template / boilerplate code. You take a working example and massage it to do what you want. It includes so many things that either aren’t needed or don’t make sense, but it works at first glance so it’s submitted and merged. For example, build configs, JSON or YAML configs, docker container build files, helm charts, terraforms, gradle builds. It takes a lot to learn these things so we often just accept it when it works. It’s exhausting to learn it all but if you do, you’ll be much better at catching weird issues with them.
I think the problem is we trick ourselves into thinking we should spend more time coding than anything else, but it’s everything else that causes us more problems and we should build the muscles to handle everything else so that we waste less time on those mistakes. We’re often already really good at shipping features but terrible at finding and handling bugs or configs or builds or infrastructure or optimizing for performance.
The author makes the insightful observation that they write non-buggy code by being careful, in contrast to the vast majority of developers who write code full of bugs. Being careful is left to the reader, but it should be easy. /s
I think there's always the argument that you don't know what you don't know. How much thought do you put into writing code without bugs? A bug could be caused by the business logic, the language internals, or the runtime environment and its variations. I think what people often ignore is that writing a piece of software is an iterative process: you build, deploy and learn from the operation and you fix and repeat.
If you keep thinking of all the possible issues that could happen, that becomes a black hole and you don't deliver anything.
> writing a piece of software is an iterative process
Often, yes. Absolutely.
> you build, deploy and learn from the operation and you fix and repeat.
But no, not at all in this way. This is generally not necessary and definitely not a mindset to internalize.
Commercial software products are often iterative, especially for consumer or enterprise, because they need to guess about the market that will keep them profitable and sustainable. And this speculative character has a way of bleeding down through the whole stack -- not because "bugs happen!", but because requirements will likely change.
As an engineer, you should always be writing code with an absolutely minimal defect rate and well-understood capabilities. From there, if you're working on a product subject to iteration (most now, but not all), you can strive to write adaptable code that can accommodate those iterations.
> I think what people often ignore is that writing a piece of software is an iterative process: you build, deploy and learn from the operation and you fix and repeat.
I presume you didn't use any Microsoft operating system (or program). /s
Writing code with fewer bugs is a function of experience. But, the reason is not entirely what you would think it is. Sure, a lot of it is anticipating problems previously experienced, and writing code that handles problems, or entire classes of problems, previously encountered.
However, with more experience comes a better understanding of the general metastructure of code, and therefore an ability to hold more code in your head at a time. (Compare for instance the well-known increased ability of chess masters to memorize chess boards, compared to non-chess players.)
When you’re an inexperienced programmer, you need to write the code down (and run it to test if it works) before you know if the code and algorithm solves the problem. This makes the inexperienced programmer take shortcuts while writing down the code, in order to get the code written down as fast as possible, while it is still clear in their mind.
The experienced programmer, on the other hand, can easily envision the entire algorithm in their head beforehand, and can therefore spare some extra attention for adding error checking and handling of outlier cases, while writing the code for the first time.
Also, as the article states, when you make a conscious habit of always writing code which checks for all errors and accounts for all outliers, it becomes easier with time; practice makes perfect, as it were. This is essentially a way to speed up the natural process described above.
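As a small, hypothetical example of that habit (not from the article; read_port and the PORT variable are made up), the same few lines read very differently once every error and outlier is checked explicitly:

```python
def read_port(env):
    # The happy-path version would be just: return int(env["PORT"])
    # The habit described above is to check every way this can go wrong.
    raw = env.get("PORT")
    if raw is None:
        raise KeyError("PORT is not set")
    if not raw.isdigit():
        raise ValueError(f"PORT is not a number: {raw!r}")
    port = int(raw)
    if not 1 <= port <= 65535:
        raise ValueError(f"PORT is out of range: {port}")
    return port
```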
> Also, I find that a lot of monkey testing is important. AI could be very beneficial, here. I anticipate the development of "AI Chaos Monkeys."
Well, you just made me realise that there is still a use for those LLMs besides generating propaganda. The problem, I guess, will be that nobody will be willing to spend time on those bug reports.
If you actually want to write software without bugs:
Assume that your code will have bugs no matter how good you are. Correct for that by making careful architecture decisions and extensively test the product yourself.
There’s no silver bullet. If you put in enough time and make some good decisions, you can earn a reputation for writing relatively few bugs.
In a college chemistry lab we would have to write lab reports of our work. The instructor made it very clear that he had never given 100% on a lab report because there's always something to improve.
One of my CS cohorts happened to be in the same class, so we teamed up for the first lab project. It was pretty straightforward: we collected whatever information we needed and started working on our report. We didn't bother spending much time on it because we already knew we'd lose points for something or another.
When we got it back, there was a big, red "100" on top. We checked around and it did look like we were the only ones that got a perfect score, so we went to the instructor and, mostly jokingly, said, "What's up with this?" to which he stayed on beat and replied, "Do you want me to take another look?"
It's not hard to do good work, but you do have to make a habit of it. Re-read what you write, preferably out loud, to make sure it actually makes sense.
You'll still make errors and mistakes and you won't catch them all, but no one's going to care about a typo or two unless you draw attention to it with more glaring problems. And I think this is where metrics, especially things like code coverage, can actually be detrimental, because they bring attention to the wrong things.
Specifically, in places I've seen code coverage enforced, tests (written by consultants making 5-10x more than I do) tend to look like `assert read_csv("foo,bar") == [["foo", "bar"]]`: they execute enough of the function to satisfy the coverage requirements, but everyone is surprised when things break on a real CSV document.
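To make the contrast concrete, here is a sketch (read_csv is assumed to be a thin wrapper around Python's csv module, since the thread only shows the assert): the first test merely satisfies a coverage gate, the later ones exercise what real CSV documents actually contain.

```python
import csv
import io

def read_csv(text):
    # Assumed implementation, for illustration only.
    return list(csv.reader(io.StringIO(text)))

# The kind of test that merely satisfies the coverage requirement:
assert read_csv("foo,bar") == [["foo", "bar"]]

# Tests that exercise what a real CSV document will contain:
assert read_csv('a,"b,c"') == [["a", "b,c"]]              # quoted field containing a comma
assert read_csv('a,"say ""hi"""') == [["a", 'say "hi"']]  # escaped quotes inside a field
assert read_csv("a,b\nc,d\n") == [["a", "b"], ["c", "d"]] # more than one row
assert read_csv("") == []                                 # empty input
```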
The corollary of the author's trick is that if you keep making excuses to produce poor work, you may subconsciously decline instead.
This feels like fluff. You can think a few steps ahead all you like, but bugs will creep in: those you can’t think about, those in areas you don’t quite understand, those that require weird sequences of events.
The key, IMO, is having awareness about when you don’t quite understand something, because that means you can’t reason about the code to prove it correct to yourself in your head. And then, avoid shipping code in such a state at (almost) all cost. This awareness can be trained, and I suspect that the author’s virtually-bug-free shipping record is based on that. My personal experience is that bugs are nearly always caused by code where I ignored my inner uncertainty about the code.
I don't think the author's goal (or reality) is perfection, zero bugs. A fluent English speaker, even an English professor, will occasionally trip up on a word or write a confusing sentence. But if thoughtfulness and planning reduce 80% of bugs or more, that's a big win.
Just add other teams of people doing work in parallel, then make all the work depend on each other, and bugs will become even more inevitable. All the integration, end-to-end, contract, etc. testing in the world won’t save you, the savant incapable of writing bugs, from encountering and having to deal with bugs.
> My “trick” during that final year was simple: I always tried to write correctly, not just when I was asked to, but all the time. After a year of subconscious improvements, I aced the exam.
Practice makes permanent. Perfect practice makes perfect.
I'm trying this with learning piano, and I see the advice in a good number of places-- if I make a mistake in a phrase, I repeat the phrase 5-7 times correctly, instead of pushing through. It's been working out well so far-- I'm not 'burning in' my mistakes.
"Software Development with Z. A practical approach to formal methods in software engineering"
https://archive.org/details/softwaredevelopm0000word/page/n1... .
there are other notations developed later. but its simplicity and easiness even while scribling on paper or word processor gets me back to using it every now and then
Any advice on reasoning about correctness?
Care to bring home the thesis on how that’s actually really insightful?
A lot.
I always find bugs when I test, no matter how "perfect" I think my code should be.
Also, I find that a lot of monkey testing is important. AI could be very beneficial, here. I anticipate the development of "AI Chaos Monkeys."
https://littlegreenviper.com/various/testing-harness-vs-unit...
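For what it's worth, a very small "monkey test" can be just a loop throwing random input at a function. A sketch (parse_config_line is a hypothetical function under test, and the linked harness may work quite differently):

```python
import random
import string

def monkey_test(func, iterations=10_000):
    # Feed random junk to `func`; only the documented failure mode is allowed.
    for _ in range(iterations):
        length = random.randint(0, 50)
        junk = "".join(random.choice(string.printable) for _ in range(length))
        try:
            func(junk)
        except ValueError:
            pass  # acceptable, documented rejection of bad input
        # Any other exception propagates and fails the run.

# Usage (parse_config_line is hypothetical):
# monkey_test(parse_config_line)
```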
When I test, I always find bugs. Never fails.
Watching this posting dive down from the HN front page has been interesting (and expected).
It’s rather discouraging to see how discussions of Quality Development are treated, hereabouts.
Quality seems to be heresy.