hitchstory commented on The Big TDD Misunderstanding (2022)   linkedrecords.com/the-big... · Posted by u/WolfOliver
Izkata · 10 months ago
> What happens when you then show it to stakeholders (e.g. other teams consuming your API, customers or UX people) and they tell you to change it again?

> Rewrite everything again?

> That's gonna be reaaaaaaaalllly labor intensive and could damage your code base too.

Why would I do that? Only the thing they have an issue with would need to be changed; it wouldn't take any longer than doing it another way.

You seem to have forgotten what I said: something needs to exist for me to work with. Well, in this "stakeholders want something changed" scenario, something exists. It's not a rewrite from scratch.

hitchstory · 10 months ago
>Why would I do that?

If you change the spec (e.g. changing the contract on a REST API), you will probably need to consult with everybody to make sure it aligns with their expectations. Does the team calling it even have the customer ID you've just decided to require on, say, this new endpoint?

>You seem to have forgotten what I said, something needs to exist for me to work with.

No. I'm assuming here that a code base exists and that you are mostly (if not 100%) familiar with it.

hitchstory commented on The Big TDD Misunderstanding (2022)   linkedrecords.com/the-big... · Posted by u/WolfOliver
charleslmunger · 10 months ago
>This is actually a good (albeit somewhat niche) reason to not write a test scenario at all, but it's still not a great reason to write a test after instead of before.

But the before-test is strictly negative - it's a waste of time (deleted code, never submitted) and it possibly slowed down development (had to update the test as I messed with APIs).

>Yup. A test scenario which is of no interest to at least some stakeholders probably shouldn't be written at all.

And yet I see TDD practitioners as the primary source of such tests - if you are dogmatically writing a test for every intermediate change, you will end up with lots of extra tests that assert things in order to satisfy the TDD dogma rather than the specific needs of the problem. Obviously this can be avoided with judgement - but if you have sound independent judgement you don't need to adhere to specific philosophies about the order you make changes in.

>In fact it's probably a bit easier to link intentional behavior to a test while you have the spec in front of you and before the code is written.

When implementing to a spec you are absolutely right, but a very small amount of software is completely or even mostly specified in advance.

>I find people who write test after tend to (not always, but strong tendency) fit the test to the code rather than the requirement. This is really bad.

I agree this can lead to brittle tests and lack of spec adherence, but if you are iterating on intermediate state and writing tests as you go, the structure of the code you wrote 30 seconds ago is very much influencing the test you're writing now.

Another issue is that fault injection tests basically require coupling to the implementation - "make the Nth allocation fail" etc. The way I prefer to write these is to write the implementation first, then write the fuzz test - add a few bugs in the implementation, and fix/enhance the fuzz test until it catches them. Fuzz testing is one of the best bang-for-buck testing methodologies there is, and in my experience it's very hard to write a really good fuzz test unless you already have most of your implementation, so you can ensure your fuzz tester is actually exercising the stuff you want it to.

>Assuming I'm understanding you correctly (you're building something like Terraform?),

I write library code for mobile phones, mostly in Java/Kotlin. I recently did some open source work (warning: I am not actually very proficient with C; any good results are from enormous time spent and my code reviewers; constructive criticism very much welcome). Here are a few somewhat small, contained changes of mine, so we can talk about something concrete:

https://github.com/protocolbuffers/protobuf/pull/19893/files

This change alters a lock-free data structure to add a monotonicity invariant, when the space allocated is queried on an already-fused arena while racing with another fuse. I didn't add tests for this - I spent a fair bit of time thinking about how to do it, and decided that the type of test I would have to write to reliably reproduce this was not going to be net better at preventing a future bug, given its cost, than a comment in the implementation code and markdown documentation of the data structure. I don't know how I would really have made this change with a TDD methodology.

https://github.com/protocolbuffers/protobuf/pull/19933/files

This change moves a memory layout - again, I don't know how I would have written a test for this, besides something wild like querying smaps (not portable) to see if the final page of the arena allocation had faulted in.

https://github.com/protocolbuffers/protobuf/pull/19885/files

This change was written more in the way you recommend - but the whole change is basically a test. I debugged this by reading the code and thinking about it, then wrote up a pretty complicated fuzz test to help find any future races. I'm guessing that you would not consider adding debug asserts to be a violation of "write the test first"? So in this case, I followed TDD's order - not because I was following TDD, but because the code change was trivial and all the hard work was thinking about the data structures and memory model.

https://github.com/protocolbuffers/protobuf/pull/19688/files

All the tests were submitted before the implementation change here, but not because of TDD - in this case, I was trying to optimize performance, and wrote the whole implementation before any new tests - because changing the implementation required changing the API to no longer expose contiguous memory. But I did not want to churn all the users of the public API unless I knew my implementation was actually going to deliver a performance improvement - so I didn't write any tests for the API until I had the implementation pretty well in hand. Good thing too, because I actually had to alter the new API's behavior a few times to enable the performance I wanted, and if I had written all the tests as I went along, I'd have had to go and rewrite them over and over. So in this case I wrote the implementation, got it how I wanted it, wrote and submitted the new API (implemented at first on the old implementation) and added tests, updated all callers to the new API, and then submitted the new implementation.

I don't think TDD would have led to better results in these cases, but you sound like a TDD believer and I'm always interested to hear anything that would make my engineering better.

hitchstory · 10 months ago
>But the before-test is strictly negative

I actually did this the other day on a piece of code. I was feeling a bit lazy. I didn't write the test and I figured that making the type checker catch it was enough. I still didn't write a test after either though.

Anecdotally I've always found that tests which cover real-life bugs are in the class of test with the highest chance of catching future regressions. So even if the downside does exist, I'm still mildly skeptical of the idea that a test is strictly negative just because the compiler can be provoked into catching the same bug.

>And yet I see TDD practitioners as the primary source of such tests

I find the precise opposite to be true. TDD practitioners are more likely to tie requirements to tests because they write them directly after getting requirements. Test-after practitioners are more likely to tie the test to the implementation.

It's always possible to write a shit implementation-tied test with TDD, but the person who writes a shit test with TDD will write a shit implementation-tied test after too. What did TDD have to do with that? Nothing.

>if you are dogmatically writing a test for every intermediate change, you will end up with lots of extra tests that assert things in order to satisfy the TDD dogma rather than the specific needs of the problem.

I find that this only really happens when you practice TDD with very loose typing. If you practice strict typing, the tests will invariably be narrowed down to ones which address the specific needs of the problem.

Again - even without TDD, writing the test after, loose typing is still a shit show. So I see this as another issue which is separate from TDD.
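
To make that concrete, here's a rough sketch in Python (hypothetical Order/Status names, nothing from a real codebase): with a bare string you'd need a test to catch invalid statuses, whereas with a stricter type the type checker covers that whole class of bug and the remaining tests can focus on the actual requirement.

```python
from dataclasses import dataclass
from enum import Enum

# Loosely typed: status is a bare string, so a test is needed to guard
# against callers passing "shiped", "SHIPPED", etc.
def transition_loose(order: dict, status: str) -> dict:
    order["status"] = status
    return order

# Strictly typed: invalid statuses are unrepresentable, so the type checker
# (mypy/pyright) catches that whole bug class and no test scenario is needed
# for it - the remaining tests can target the actual requirement.
class Status(Enum):
    PENDING = "pending"
    SHIPPED = "shipped"
    DELIVERED = "delivered"

@dataclass(frozen=True)
class Order:
    id: int
    status: Status

def transition_strict(order: Order, status: Status) -> Order:
    return Order(id=order.id, status=status)

# transition_strict(some_order, "shiped")  # <- rejected by the type checker
```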

>Obviously this can be avoided with judgement - but if you have sound independent judgement you don't need to adhere to specific philosophies

I think this is conflating "TDD is a panacea" with "if it is valuable to write a test, it's always better to write it before". I've never thought the former, but the examples you've listed here look to me only like examples of where TDD didn't save somebody from making a mistake that was about a separate issue (types, poor quality test). None of them are examples of "actually writing the test after would have been better".

>When implementing to a spec you are absolutely right, but a very small amount of software is completely or even mostly specified in advance.

Why on earth would you do that? If I write even a single line of production code I have specified what that line of code is going to do. I have watched juniors flail around and do this when getting vague specs but seniors generally try to nail down a user story tight with a combination of code investigation, spiking and dialog with stakeholders before writing code that they would otherwise have to toss in the trash can if it wasn't fit for purpose.

To me this isn't related to TDD either. Whether or not I practice TDD, I don't fuck around writing or changing production code if I don't know precisely what result it is I want to achieve. Ever.

Future requirements will probably remain vague but never the ones I'm implementing right now.

>I agree this can lead to brittle tests and lack of spec adherence, but if you are iterating on intermediate state and writing tests as you go, the structure of the code you wrote 30 seconds ago is very much influencing the test you're writing now.

Only if the spec is changing too. This sometimes happens if I discover some issue by looking at the code, but in general my test remains relatively static while the code underneath it iterates.

This obviously wouldn't happen if you wrote implementation-tied tests rather than specification-tied tests but... maybe just don't do that?

>Another issue is that fault injection tests basically require coupling to the implementation

All tests require coupling to the implementation in some way. The goal is to couple as loosely as possible while maximizing speed, ease of use, etc. I'm not really sure why fault injection should be treated as special. If you need to refactor the test harness to allow it, that's probably a really good idea.
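
To illustrate what I mean (a rough Python/pytest sketch with hypothetical Uploader/FlakyTransport names, not your actual stack): if the harness owns the fault-injection seam, the test can say "the transport fails partway through" without knowing which internal call is the Nth allocation.

```python
# Hypothetical uploader with an injectable transport - the seam lives in the
# harness, so the test expresses "a write fails" against the interface rather
# than coupling to internal allocation order.
class FlakyTransport:
    def __init__(self, fail_after: int):
        self.calls = 0
        self.fail_after = fail_after

    def send(self, payload: bytes) -> None:
        self.calls += 1
        if self.calls > self.fail_after:
            raise IOError("injected fault")

class Uploader:
    def __init__(self, transport):
        self.transport = transport

    def upload(self, chunks):
        sent = 0
        try:
            for chunk in chunks:
                self.transport.send(chunk)
                sent += 1
        except IOError:
            return {"ok": False, "sent": sent}
        return {"ok": True, "sent": sent}

def test_partial_failure_is_reported():
    uploader = Uploader(FlakyTransport(fail_after=2))
    result = uploader.upload([b"a", b"b", b"c"])
    assert result == {"ok": False, "sent": 2}
```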

>The way I prefer to write these is to write the implementation first, then write the fuzz test - add a few bugs in the implementation, and fix/enhance the fuzz test until it catches them. Fuzz testing is one of the best bang-for-buck testing

Fuzz testing is great and having preferences is fine, but once again fuzz testing says little about the efficacy of TDD (fuzz tests can be written both before and after), and a preference for test-after, I find, tends to mean little more than "I like my old habits".

>Fuzz testing is one of the best bang-for-buck testing methodologies there is, and in my experience it's very hard to write a really good fuzz test unless you already have most of your implementation

In my experience you can (I've done TDD with property tests and, well, I see fuzz testing as simply a subset of that). I also don't see any particular reason why you can't.
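
For example, something like this is what I mean by writing the property test first (a sketch in Python with Hypothesis, using a hypothetical run-length encoder; the stubs are deliberately unimplemented so the first run is red):

```python
from hypothesis import given
from hypothesis import strategies as st

# Stubs written after the tests - the properties ("decode reverses encode",
# "runs are positive and merged") come straight from the spec, not from any
# existing implementation.
def rle_encode(data: bytes) -> list[tuple[int, int]]:
    raise NotImplementedError

def rle_decode(runs: list[tuple[int, int]]) -> bytes:
    raise NotImplementedError

@given(st.binary())
def test_roundtrip(data):
    assert rle_decode(rle_encode(data)) == data

@given(st.binary(min_size=1))
def test_runs_are_positive_and_merged(data):
    runs = rle_encode(data)
    assert all(count >= 1 for count, _ in runs)
    # adjacent runs never share a byte value, otherwise they'd be merged
    assert all(a[1] != b[1] for a, b in zip(runs, runs[1:]))
```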

I find these methodologies do provide bang for the buck if you're writing a very specific kind of code, and I will know in advance whether I'm writing that type of code.

>This change alters a lock-free data structure to add a monotonicity invariant,

If I'm reading it correctly, this looks like a class of bugs we discussed that is fixed by tightening up the typing. In which case, no test is strictly necessary, although I'd argue that it probably would not hurt either.

>This change moves a memory layout - again, I don't know how I would have written a test for this, besides something wild like querying smaps (not portable) to see if the final page of the arena allocation had faulted in.

I can't tell if this is refactoring or you're fixing a bug. Is there a scenario which would reproduce a bug? If so, quite possibly a test would help. I've rarely been terribly sympathetic to the view that "writing a test to replicate this bug is too hard" is a good reason for not doing it. Programming is hard. I find that A) bugs often tend to cluster in scenarios that the testing infrastructure is ill-equipped to reproduce, and B) once you upgrade the testing infrastructure to handle those scenario types, those bugs often... poof... stop recurring.

>This change was written more in the way you recommend - but the whole change is basically a test. I debugged this by reading the code and thinking about it, then wrote up a pretty complicated fuzz test to help find any future races. I'm guessing that you would not consider adding debug asserts to be a violation of "write the test first"

I would file that under "tightening up typing" again and also file it under "the decision to write a test at all is distinct from the decision to write a test first".

>I don't think TDD would have led to better results in these cases

Again, I see no examples where writing a test after would have been better. There are just a few where you could argue that not writing a test at all is the correct course of action.

hitchstory commented on The Big TDD Misunderstanding (2022)   linkedrecords.com/the-big... · Posted by u/WolfOliver
charleslmunger · 10 months ago
I think having tests for all your diffs at the level of published commits/change lists/etc is totally reasonable for software you really care about. What's counterproductive is practicing TDD at the level of individual editor operations.

If I'm fixing a bug, I start by writing a test that reproduces the bug. If I can't do that, I fix the test harness until I can. Then I implement the change, making mental notes of each intermediate bug I think about along the way - things like "I should be careful to name this distinctly so that it's not confused with this other value in scope that has the same type". After that, I cull down that list until it's reasonable and not totally paranoid, and write tests covering those cases. Same thing for any bugs in in-progress code caught by manual testing, fuzzers, etc.

If you have discipline and use version control, you don't need to write tests before you write the actual code to get the same level of coverage as TDD and you waste a lot less time. I've often figured out late in the game how to make something a compile time failure rather than a runtime one - time to delete all those tests written along the way? Encode them all as negative compilation tests? Fundamentally the goal of testing is to describe what behaviors of the software are intentional rather than incidental, and to detect bugs that might be introduced by future changes to the software - TDD mixes both concerns and doesn't put any emphasis on preventing future bugs specifically.

Maybe other people work on different types of things and TDD is great for them, but I write primarily infrastructure code where correctness is critical and I have the luxury of time, and TDD doesn't produce better results for me. This is a case TDD feels like it should work well for, but in my experience it doesn't improve correctness, maintainability, or speed of delivery - at least compared to the alternative I described. I'm sure there's a universe of teams with sloppy practices out there that TDD would be an improvement for, but it's not helpful for me.

hitchstory · 10 months ago
>I've often figured out late in the game how to make something a compile time failure rather than a runtime one

This is actually a good (albeit somewhat niche) reason to not write a test scenario at all, but it's still not a great reason to write a test after instead of before.

>Fundamentally the goal of testing is to describe what behaviors of the software are intentional rather than incidental

Yup. A test scenario which is of no interest to at least some stakeholders probably shouldn't be written at all.

This is again about whether to write a test at all, though, not whether to write it first.

>TDD mixes both concerns

I don't think writing a test after helps unmix those concerns any better.

In fact it's probably a bit easier to link intentional behavior to a test while you have the spec in front of you and before the code is written.

I find people who write test after tend to (not always, but strong tendency) fit the test to the code rather than the requirement. This is really bad.

>Maybe other people work on different types of things and TDD is great for them, but I write primarily infrastructure code where correctness is critical and I have the luxury of time

Assuming I'm understanding you correctly (you're building something like Terraform?), integration tests which run scenarios matching real features against fake infra would seem to be pretty useful to me.
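
Roughly this shape is what I have in mind (a hedged sketch with a hypothetical provisioner and an in-memory fake, since I don't know your actual stack) - the scenario mirrors a real feature, the infra is faked, and nothing in it assumes how the implementation will be structured:

```python
# Hypothetical provisioner exercised against an in-memory fake of a cloud
# API - the scenario is the user story, not the implementation.
class FakeCloud:
    def __init__(self):
        self.instances = {}

    def create_instance(self, name: str, size: str) -> str:
        instance_id = f"i-{len(self.instances) + 1}"
        self.instances[instance_id] = {"name": name, "size": size, "state": "running"}
        return instance_id

def provision_web_tier(cloud, count: int, size: str) -> list[str]:
    return [cloud.create_instance(f"web-{n}", size) for n in range(count)]

def test_provisioning_three_web_servers():
    # Given an empty account
    cloud = FakeCloud()
    # When I provision a three-node web tier
    ids = provision_web_tier(cloud, count=3, size="small")
    # Then three running instances of the requested size exist
    assert len(ids) == 3
    assert all(cloud.instances[i]["state"] == "running" for i in ids)
    assert all(cloud.instances[i]["size"] == "small" for i in ids)
```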

So...why won't you write tests with that harness before the code? I'm still unsure.

The only thing "special" about that type of code that I can see (which isn't even all that special) is that unit tests would often be useless. But so what?

hitchstory commented on The Big TDD Misunderstanding (2022)   linkedrecords.com/the-big... · Posted by u/WolfOliver
Izkata · 10 months ago
> I'm sensing a pattern in the answers to my question though. I keep getting "well, if you assume TDD is only done with low level unit tests..."

Completely wrong.

Even with your example, there's an initial exploratory stage where you're still figuring out the interface that the tests would use. I, personally, am not capable of using something that doesn't exist. I have to make that initial version first before I can use it in a test.

Quick edit aside: This is also why I rarely work top-down or bottom-up, I work mostly throughline - following the data flow and jumping up and down the abstraction stack as needed.

hitchstory · 10 months ago
I'm not sure quite why you feel you always need to write code before sussing out what an API or UI should look like, but it seems like a very expensive habit to me.

What happens when you then show it to stakeholders (e.g. other teams consuming your API, customers or UX people) and they tell you to change it again?

Rewrite everything again?

That's gonna be reaaaaaaaalllly labor intensive and could damage your code base too.

I'm equally perplexed about why people don't try to build top-down. It's one of those few things in programming that always makes sense regardless of circumstance.

hitchstory commented on The Big TDD Misunderstanding (2022)   linkedrecords.com/the-big... · Posted by u/WolfOliver
chowells · 10 months ago
It mostly comes from observing the reality of how tests are used rather than idealism. In a lot of "test-first" codebases, the majority of tests are garbage. All they check is that the code is still structured identically to the original implementation. This is the reality that people encounter, so of course it's what they talk about. Why would they talk about some ivory tower idea instead?
hitchstory · 10 months ago
A shit test written before writing the code is still a shit test. Mimetic tests aren't any better written after the code either.

If I had to choose between 1) always writing specification-linked tests that make as few architectural assumptions as possible and 2) TDD, sure, I'd pick 1 every time.

1 and 2 together is still better, though.

hitchstory commented on The Big TDD Misunderstanding (2022)   linkedrecords.com/the-big... · Posted by u/WolfOliver
voiceofunreason · 10 months ago
"I still find the skepticism around TDD weird."

A small community of programmers, with a disproportionately large audience, foretold that practicing test-driven development would produce great benefits; over twenty-five years the audience has found that not to be the case.

Compare with "continuous integration" - here, the immediate returns of trying the proposed discipline were so good that pretty much everybody who tried the experiment got positive returns, and leaned into it, and now CI (and later CD) are _everywhere_.

As for what is gained, try this spelling: test driven development adds load to your interfaces at a time when you know the least about the problem you are trying to solve, which is to say the period where having your interfaces be flexible is valuable.

And thus, the technique gets criticism from both ends -- that design work that should have been done up front is deferred (making the design more difficult to change, therefore introducing costs/delays), and that the investment is being made in testing before you have a clear understanding for which tests are going to be sensitive to the actual errors that you introduce creating the code (thereby both increasing the amount of "waste" in the test suite, in addition to increasing the risk of needing test rewrites).

The situation is further not improved by (a) the fact that most TDD demonstrations are small, stable problems that you can solve in about an hour with any technique at all and (b) the designs produced in support of the TDD practice aren't clearly an improvement on "just doing it", and in some notable cases have been much much worse.

So if it is working for you: GREAT, keep it up; no reason for you not to reap the benefits if your local conditions are such that TDD gives you the best positive return on your investment.

hitchstory · 10 months ago
>As for what is gained, try this spelling: test driven development adds load to your interfaces at a time when you know the least about the problem you are trying to solve

If I'm writing a single line of production code I should know as much as possible about what requirements problem I'm actually trying to solve with it first, no?

This actually dovetails with a benefit of writing the test first. If you flesh out a user story scenario in the form of an executable test it can provoke new questions ("hm, actually I'd need the user ID on this new endpoint to satisfy this requirement...") and you can more quickly return to stakeholders ("can you send me a user ID in this API call?") and "fix" your "requirements bugs" before making more expensive lower-level changes to the code.
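
To make that concrete - a rough pytest-style sketch with a hypothetical in-process orders API, not any particular codebase - writing the scenario first is what surfaces the question, because the test literally can't be written without the missing piece of data:

```python
# Hypothetical in-process API stub standing in for the real service - the
# point is the scenario shape, not the transport.
class OrdersApi:
    def __init__(self):
        self.customers, self.orders = {}, []

    def create_customer(self, name: str) -> dict:
        customer = {"id": len(self.customers) + 1, "name": name}
        self.customers[customer["id"]] = customer
        return customer

    def create_order(self, customer_id: int, total: int) -> dict:
        order = {"customer_id": customer_id, "total": total}
        self.orders.append(order)
        return order

    def order_history(self, customer_id: int) -> list[dict]:
        return [o for o in self.orders if o["customer_id"] == customer_id]

def test_partner_team_fetches_a_customers_order_history():
    # Given a customer with two past orders
    api = OrdersApi()
    alice = api.create_customer(name="Alice")
    api.create_order(customer_id=alice["id"], total=10)
    api.create_order(customer_id=alice["id"], total=25)

    # When the consuming team requests that customer's history - writing this
    # line is what raises the question "do they even have our customer ID?"
    history = api.order_history(customer_id=alice["id"])

    # Then both orders come back
    assert [o["total"] for o in history] == [10, 25]
```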

This outside-in "flipping between one layer and the layer directly beneath it" is very effective at properly refining requirements, tests and architecture.

>And thus, the technique gets criticism from both ends -- that design work that should have been done up front is deferred

I don't think "design work" should be done up front if you can help it. I've always felt that the very best architecture emerges as a result of aggressive refactoring done within the confines of a complete set of tests that made as few architectural assumptions as possible. Why? Coz we're all bad at predicting the future and it's better if we don't try.

This is a mostly separate issue from TDD though.

hitchstory commented on The Big TDD Misunderstanding (2022)   linkedrecords.com/the-big... · Posted by u/WolfOliver
simonw · 10 months ago
Wow. If "unit" in "unit test" does indeed mean that the test itself should be able to run independently of the other tests then maybe I can get over my avoidance of calling them "unit tests"!

I dislike that term because the most valuable tests I write are inevitably more in the shape of integration tests - tests that exercise just one function/class are probably less than 10% of the tests that I write.

So I call my tests "tests", but I get frustrated that this could be confused with manual tests, so then I call them "automated tests" but that's a bit of a mouthful and not a term many other people use.

I'd love to go back to calling them "unit tests", but I worry that most people who hear me say that will still think I'm talking about the test-a-single-unit-of-code version.

hitchstory · 10 months ago
I've had this experience with team-specific vocab, where certain terms organically end up with two or more conflicting meanings, and it was horrendous. It led to all sorts of bugs, misunderstandings and even arguments.

Even worse, most people didn't realize there was a problem coz they always knew what they meant.

The only time I managed to work past it was by convincing everyone to never use that term again - burning it to the ground - and agreeing to replace it with two or more new, unambiguous terms.

I'd love to burn "unit test" and "integration test" to the ground but nobody outside my team listens to me :)

I'd probably replace them with:

* code coupled

* interface coupled

* high level

* low level

* xUnit

* faked infrastructural

* deployed infrastructural

* hermetic / non hermetic

* declarative / non declarative

hitchstory commented on The Big TDD Misunderstanding (2022)   linkedrecords.com/the-big... · Posted by u/WolfOliver
simonw · 10 months ago
I'm talking about strict test-first development here, where you write the tests before you write the implementation.

If you're using snapshot tests (a technique I really like) surely you can't write the tests before the implementation, because you need the implementation in order to generate the snapshot?

(This is what I hate about the term TDD: sometimes it means test-first, sometimes it doesn't - which leads to frustrating conversations where people are talking past each other.)

hitchstory · 10 months ago
You need the final implementation before taking the final snapshot, but you can write the entire test up front (given/when). The snapshot artefact is generated, not written (often in a different file entirely), so I'd argue it still fits the definition cleanly.
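
As a sketch of the pattern (a hand-rolled snapshot helper rather than any particular library, and a hypothetical render_invoice function): the given/when is written before the implementation, and the snapshot artefact only appears once the implementation produces output a human approves.

```python
from pathlib import Path

SNAPSHOT_DIR = Path(__file__).parent / "snapshots"

def check_snapshot(name: str, actual: str) -> None:
    """Compare against the stored snapshot; create it on first run."""
    SNAPSHOT_DIR.mkdir(exist_ok=True)
    snapshot = SNAPSHOT_DIR / f"{name}.txt"
    if not snapshot.exists():
        snapshot.write_text(actual)  # generated artefact - review and commit it
        return
    assert actual == snapshot.read_text()

# Hypothetical unit under test - in real life this lives in the app code.
def render_invoice(customer: str, items: list[tuple[str, int]]) -> str:
    lines = [f"Invoice for {customer}"]
    lines += [f"  {name}: {price}" for name, price in items]
    lines.append(f"Total: {sum(price for _, price in items)}")
    return "\n".join(lines)

def test_invoice_rendering():
    # Given a customer with two line items (written before the implementation)
    output = render_invoice("Alice", [("widget", 10), ("gadget", 25)])
    # Then the output matches the approved snapshot (generated, not hand-written)
    check_snapshot("invoice_two_items", output)
```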

I agree that "unit test"/"integration test" as a definition sucks horribly and leads to people talking past each other, but I think with TDD the main issue is that lots of people have developed a fixed and narrow idea of the kind of test you are "supposed" to write with it, which makes the process miserable if the type of code doesn't fit that type of test.

The whole idea of a unit test being "the" kind of "default" test and being "tests a class/method as a unit" definitely needs to die.

hitchstory commented on The Big TDD Misunderstanding (2022)   linkedrecords.com/the-big... · Posted by u/WolfOliver
simonw · 10 months ago
If you value red green refactoring then you should write the tests first.

I only use that technique for pieces of code that really fit that well - usually functions that have a very strong relationship between their input and output - so I'll write tests first for those, but not for most of my other stuff.

hitchstory · 10 months ago
Well ok...but then what kind of code doesn't it fit well?

Almost every user story I follow in production code takes the form of a given/when/then scenario, which can always be transformed into a test of some kind (e2e, integration, sometimes even unit).

Where it's something like "do x, y and z and then a graph appears", I find TDD with a snapshot test using, say, Playwright works best.
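
Roughly like so (a hedged sketch: hypothetical URL and selectors, Playwright's Python API against a locally running app, and a naive byte-for-byte comparison standing in for a proper image diff):

```python
from pathlib import Path
from playwright.sync_api import sync_playwright

GOLDEN = Path(__file__).parent / "golden" / "revenue_graph.png"

def test_revenue_graph_appears_after_filtering():
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()

        # Given the dashboard (hypothetical URL and selectors)
        page.goto("http://localhost:8000/dashboard")
        # When I pick a date range and apply the filter
        page.fill("#start-date", "2024-01-01")
        page.fill("#end-date", "2024-03-31")
        page.click("#apply-filters")
        # Then the revenue graph appears and matches the approved snapshot
        graph = page.locator("#revenue-graph")
        graph.wait_for()
        actual = graph.screenshot()
        if not GOLDEN.exists():
            GOLDEN.parent.mkdir(exist_ok=True)
            GOLDEN.write_bytes(actual)  # generated artefact, reviewed by a human
        assert actual == GOLDEN.read_bytes()

        browser.close()
```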

hitchstory commented on The Big TDD Misunderstanding (2022)   linkedrecords.com/the-big... · Posted by u/WolfOliver
charleslmunger · 10 months ago
This is hilarious because it's a perfect interview (detects precisely the thing they're testing for) but also provides total adverse selection because the thing they're testing for is ridiculous.
hitchstory · 10 months ago
I've done this too. The exercise wasn't arrays (I'm militant about only setting very realistic tasks). My task required modifying existing production-like code and tests.

My hope was always that the candidates would do TDD where it seemed simple and obvious to do so. It was actually pretty rare, but the candidates that defaulted to doing that always ended up being better in my opinion. They were always made offers elsewhere that were higher than my company could afford (so I guess that was others' opinion too).

In this thread https://news.ycombinator.com/item?id=43060636 I pondered why most people don't default to TDD for production code, and the answer invariably seemed to be "we didn't think TDD was a thing you could do with integration/e2e tests".
