imiric · 2 years ago
I also wasn't aware that "unit" referred to an isolated test, not to the SUT. I usually distinguish tests by their relative level, since "unit" can be arbitrary and bring up endless discussions about what it actually means. So low-level tests are those that test a single method or class, and integration and E2E tests confirm the functionality at a higher level.

I disagree with the premise that "unit", or low-level, tests are not useful because they test the implementation. These are the tests that check every single branch in the code, every possible happy and sad path, use invalid inputs, etc. The reason they're so useful is that they should a) run very quickly, and b) not require any external state or setup, i.e. the traditional "unit". This does lead to a lot of work maintaining them whenever the implementation changes, but this is a necessary chore because of the value they provide. If I'm only relying on high-level integration and E2E tests, because there are far fewer of them and they are slower and more expensive to run, I might miss a low-level bug that is only manifested under very specific conditions.

This is why I still think that the traditional test pyramid is the best model to follow. Every new school of thought since then is a reaction to the chore of maintaining "unit" tests. Yet I think we can all agree that projects like SQLite are much better for having very high testing standards[1]. I'm not saying that every project needs to do the same, but we can certainly follow their lead and aspire to that goal.

[1]: https://www.sqlite.org/testing.html

vmenge · 2 years ago
I've never had issues with integration tests running with real databases -- they never felt slow or incurred any significant amount of time for me.

I also don't think unit tests bring as much value as integration tests. In fact, a lot of times unit tests are IMO useless or just make your code harder to change. The closer they get to testing implementation, the worse it gets IMO, unless I really really care that something is done in a very peculiar way, which is not very often.

My opinion will be of course biased by my past experiences, but this has worked well for me so far with both monoliths and microservices, from e-shops and real estate marketplaces to IoTs.

imiric · 2 years ago
> I've never had issues with integration tests running with real databases -- they never felt slow or incurred any significant amount of time for me.

They might not be slow individually, but if you have thousands of them, even a runtime of a couple of seconds adds up considerably. Especially if they're not parallelized, or parallelizable. Also, since they depend on an external service, they're tedious to execute, so now Docker becomes a requirement for every environment they run in, including slow CI machines. Then there is external state to think about: ensuring tests are isolated and don't clobber each other, expensive setup/teardown, making sure they clean up after they're done, etc. It's all complexity that you don't have, or shouldn't have, with low-level tests.

That's not to say that such tests shouldn't exist, of course, but that they shouldn't be the primary test type a project relies on.

> I also don't think unit tests bring as much value as integration tests. In fact, a lot of times unit tests are IMO useless or just make your code harder to change.

You're repeating the same argument as TFA, which is what I disagree with. IME I _much_ preferred working on codebases with high coverage from low-level tests to working on those that mostly rely on higher-level ones. This is because with more lower-level tests there is a higher degree of confidence that a change won't inadvertently break something a higher-level test is not accounting for. Yes, this means larger refactorings also mean having to update your tests, but this is a trade-off worth making. Besides, nowadays it's becoming easier to just have an AI maintain tests for you, so this argument is quickly losing ground.

> My opinion will be of course biased by my past experiences

Sure, as all our opinions are, and this is fine. There is no golden rule that should be strictly followed about this, regardless of what some authority claims. I've also worked on codebases that use your approach, and it has worked "well" to some extent, but with more code coverage I always had more confidence in my work, and quicker and more convenient test suites ensured that I would rely on this safety net more often.

layer8 · 2 years ago
Do you set up a new database schema for each unit test? If yes, that tends to be slow if you have many tests (hundreds or thousands), and if no, then you risk getting stateful dependencies between tests and they aren’t really unit tests anymore.
solraph · 2 years ago
> I've never had issues with integration tests running with real databases -- they never felt slow or incurred any significant amount of time for me.

I've come around on this. I used to mock the DB, especially when it was being used as a dumb store, and now I just recreate the DB in SQLite, and see the DB as part of the system being tested, rather than something to mock.

However, I think it's important to note that it wasn't until improved SQLite capabilities, SSDs (and sometimes Docker if I really need Postgres) all came together that this became practical. Previously, using an actual DB would have blown out my test runtimes by a factor of 10x.
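
As a rough illustration of the shape this takes (a made-up example using Python's stdlib sqlite3, not my actual stack), the in-memory database becomes part of the unit under test instead of something to stub out:

    import sqlite3
    import unittest

    def save_user(conn, name):
        # Real SQL against a real (in-memory) database -- nothing mocked.
        conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
        conn.commit()

    def find_user(conn, name):
        row = conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchone()
        return row[0] if row else None

    class UserStoreTest(unittest.TestCase):
        def setUp(self):
            # A fresh schema per test is cheap with :memory:, so tests stay isolated.
            self.conn = sqlite3.connect(":memory:")
            self.conn.execute("CREATE TABLE users (name TEXT)")

        def test_round_trip(self):
            save_user(self.conn, "alice")
            self.assertEqual(find_user(self.conn, "alice"), "alice")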

> I also don't think unit tests bring as much value as integration tests. In fact, a lot of times unit tests are IMO useless or just make your code harder to change. The closer they get to testing implementation, the worse it gets IMO, unless I really really care that something is done in a very peculiar way, which is not very often.

I see this a slightly different way. My concept of a unit (ignoring the article) has expanded to be what makes sense for a given test. This may be a class or set of classes where there's a well crafted set of inputs and outputs, but where there's a tricky set of inputs and outputs (anything involving date calculation for example) I'll often write a set of tests for just that function. I'd probably call all of these "unit tests" however.

To me, an integration test involves testing that disparate vertical parts of a SUT work together. I haven't seen many of these in the wild.

icedchai · 2 years ago
I once worked at a place that demanded we write unit tests for every new method. Something that was simply a getter or setter? New unit test. I'd argue that the code was covered by tests on other code, where it was actually used. This would result in more and more useless arguments. Eventually, I just moved on. The company is no longer in business anyway.
magicalhippo · 2 years ago
I think it depends on what exactly the code does.

We have some custom rounding routines (to ensure consistent results). That's the kind of stuff you want to have lots and lots of unit tests for, testing all the paths, edge cases and so on.
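
For that kind of code a plain table of edge cases goes a long way. A hypothetical sketch (our routines aren't in Python; round_half_up is just an illustrative stand-in):

    from decimal import Decimal, ROUND_HALF_UP

    def round_half_up(value, places):
        # Illustrative stand-in for a custom rounding routine.
        quantum = Decimal(10) ** -places
        return float(Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP))

    # Edge cases: exact halves, negatives, values that are lossy as binary floats.
    CASES = [
        (2.5, 0, 3.0),
        (-2.5, 0, -3.0),   # ties go away from zero, not banker's rounding
        (2.675, 2, 2.68),  # 2.675 is not exactly representable as a float
        (0.0, 2, 0.0),
    ]

    def test_rounding_edge_cases():
        for value, places, expected in CASES:
            assert round_half_up(value, places) == expected, (value, places)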

We also have a complex price calculation module, which depends on lots of tables stored in the DB as well as some fixed logic to do its job. Sure, we could test all the individual pieces of code, but, like Lego pieces, it's how you put them together that matters, so IMO integration testing is more useful.

So we do a mix. We have low-level unit testing for low-level library style code, and focus more on integration testing for higher-level modules and business logic.

sfn42 · 2 years ago
I take a similar approach in .NET. I try to build these Lego pieces as traditional classes - no dependencies (except maybe a logger), just inputs and outputs. And then I have a few key "services" which tie everything together. So the service will pull some data from an API, maybe some from the database, then pass it to these pure classes for processing.

I don't unit test the service; I integration test the API itself, which indirectly tests the service. Mock the third-party API, spin up a real DB with testcontainers.

And then I unit test the pure classes. This makes it much easier to test the logic itself, and the service doesn't really have any logic - it just calls a method to get some data, then another method to process it and then returns it. I could quite easily use mocks to test that it calls the right methods with the right parameters etc, but the integration tests test that stuff implicitly, without being a hindrance to refactoring and similar work.
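
Roughly the shape I mean, sketched in Python rather than .NET (all names are made up):

    # Pure "Lego piece": no dependencies, trivially unit-testable.
    def apply_discount(order_total, customer_tier):
        rates = {"gold": 0.10, "silver": 0.05}
        return round(order_total * (1 - rates.get(customer_tier, 0.0)), 2)

    # Thin service: pure orchestration, covered indirectly by API-level integration tests.
    class PricingService:
        def __init__(self, customer_api, order_repo):
            self._customers = customer_api   # third-party API, mocked in integration tests
            self._orders = order_repo        # real DB (e.g. a test container) in integration tests

        def price_order(self, order_id):
            order = self._orders.get(order_id)
            tier = self._customers.tier_for(order.customer_id)
            return apply_discount(order.total, tier)

    # Unit test only the pure logic:
    def test_gold_discount():
        assert apply_discount(100.0, "gold") == 90.0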

Scubabear68 · 2 years ago
This is of course the correct answer - it depends on the context of your code. A single dogmatic approach to testing will not work equally well across all problem domains.

Simple stateless components, hitting a defined wire protocol or file format, utilizing certain APIs, or testing numerical stuff all imply that unit testing will go far.

Stateful components, complex multi-class flows, and heavily data driven domains will often benefit from higher level integration/end to end tests.

j1elo · 2 years ago
I just wrote a sibling comment and then realized you stated exactly what I wanted to say, but with more concrete examples :)

That's exactly the sweet spot: complex, self-contained logic units might benefit from low-level unit testing, but for the most part, what you're interested in knowing is whether the whole thing works or not. Just IMHO, of course...

j1elo · 2 years ago
I believe the recent-ish reactions against the chore of maintaining the lowest-level unit tests exist because, with years and experience, we might be going through an industry-wide shift where we collectively learn that those chores are not worth it.

100% code coverage is a red herring.

If you're in essence testing things that are part of the private implementation, only through indirect second-order effects of the public surface... then I'd say you went too far.

What you want is to know that the system functions as it should. "I might miss a low-level bug that is only manifested under very specific conditions." means to me that there's a whole-system condition that can occur and thus should be added to the higher-level tests.

Not that lower-level unit tests are not useful, but I'd say only for intricate and isolated pieces of code that are difficult to verify. Otherwise, most software is a changing entity because we tend not to know what we actually want out of it, so its lower-level details evolve a lot over time, and we shouldn't maintain two implementations of it (the first being the code itself, the second a myriad of tiny tests tightly coupled to the former).

andrewprock · 2 years ago
You should be very skeptical of anyone that claims they have 100% test coverage.

Only under very rare circumstances is 100% test coverage even possible, let alone done. Typically when people say coverage they mean "code line coverage", as opposed to the more useful "code path coverage". Since it's combinatorially expensive to enumerate all possible code paths, you rarely see 100% code path coverage in a production system. You might see it when testing very narrow ADTs, for example booleans or floats. But you'll almost never see it for black boxes which take more than one simply defined input doing cheap work.
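
A tiny made-up example of the gap between the two:

    def scale(value, invert, offset):
        if invert:
            value = 1.0 / value      # branch A
        if offset:
            value = value + offset   # branch B
        return value

    # These two tests give 100% line coverage (and even 100% branch coverage)...
    def test_invert():
        assert scale(2.0, invert=True, offset=0) == 0.5

    def test_offset():
        assert scale(2.0, invert=False, offset=3) == 5.0

    # ...yet the A+B path is never exercised, and neither is invert=True with
    # value=0.0, which raises ZeroDivisionError. Line coverage says "done";
    # path coverage does not.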

vidarh · 2 years ago
To me, unit tests primary value is in libraries or components where you want confidence before you build on top of them.

You can sidestep them in favour of higher level tests when the only place they're being used is in one single component you control.

But once you start wanting to reuse a piece of code with confidence across components, unit tests become more and more important. The same goes as more people get involved.

Often the natural time to fill in lacking unit tests is as an alternative to ad hoc debugging.

DavidWoof · 2 years ago
> I also wasn't aware that "unit" referred to an isolated test

It never did. "Unit test" in programming has always had the meaning it does now: it's a test of a unit of code.

But "unit test" was originally used in electronics, and the meaning in electronics was a bit closer to what the author suggests. The author is being a bit fanciful (aka lying) by excluding this context and pretending that we all don't really understand what Kent Beck et. al. were talking about.

voiceofunreason · 2 years ago
Yes.

<< I call them "unit tests" but they don't match the accepted definition of unit tests very well. >>

I'm not entirely certain it's fair to accuse the author of lying; ignorance derived from limited exposure to materials outside the bubble (rather than deceit) is the more likely culprit here.

(Not helped at all by the fact that much of the TDD/XP origin story is pre-Google, and requires a different set of research patterns to track down.)

layer8 · 2 years ago
He links to where he got the notion from. I don’t think it’s that clear-cut.
troupo · 2 years ago
> pretending that we all don't really understand what Kent Beck et. al. were talking about.

Here's what Kent Beck has to say about testing: https://stackoverflow.com/a/153565

--- start quote ---

I get paid for code that works, not for tests, so my philosophy is to test as little as possible to reach a given level of confidence

--- end quote ---

drewcoo · 2 years ago
> I also wasn't aware that "unit" referred to an isolated test, not to the SUT.

I'm with you. That claim is unsubstantiated. It seems to trace to the belief that the first unit tests were the xUnit family, starting with SUnit for Smalltalk. But Kent Beck made it pretty clear that SUnit "units" were classes.

https://web.archive.org/web/20150315073817/http://www.xprogr...

There were unit tests before that. SUnit took its name from common parlance, not vice versa. It was a strange naming convention, given that the unit testing framework could be used to test anything and not just units. Much like the slightly older Test Anything Protocol (TAP) could.

> [on unit tests] This does lead to a lot of work maintaining them whenever the implementation changes, but this is a necessary chore because of the value they provide.

I disagree. Unit tests can still be behavioral. Then they change whenever the behavior changes. They should still work with a mere implementation change.
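
To make the distinction concrete (made-up example): the behavioural tests below survive swapping the sorting strategy, while a test that asserted "sorted() was called with reverse=True" would not.

    def top_scores(scores, n):
        # Implementation detail: could be sorted(), heapq.nlargest(), quickselect...
        return sorted(scores, reverse=True)[:n]

    # Behavioural: pins down what the function promises, not how it does it.
    def test_returns_n_highest_scores_in_descending_order():
        assert top_scores([3, 9, 1, 7], 2) == [9, 7]

    def test_handles_fewer_scores_than_requested():
        assert top_scores([5], 3) == [5]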

> This is why I still think that the traditional test pyramid is the best model to follow.

I'll disagree a little with that, too. I think a newer test pyramid that uses contract testing to verify integrations is better. The notion of contract tests is much newer than the pyramids and, properly applied, can speed up feedback by orders of magnitude while also cutting debugging time and maintenance by orders of magnitude.

On that front, I love what Pact is doing and would like to see more competition in the area. Hottest thing in testing since Cypress/Playwright . . .

https://pact.io

janosdebugs · 2 years ago
Genuine question: can somebody please explain why there needs to be a distinction between "true" unit tests and tests that work on several layers at once as long as said tests are runnable <1min on a consumer-grade laptop without any prior setup apart from a standard language + container env setup?

Over the years I've had several discussions to that effect and I truly, genuinely don't understand. I have test cases that test a connector to, say, Minio, so I spin up a Minio container dynamically for each test case. I need to test an algorithm, so I isolate its dependencies and test it.

Shouldn't the point be that the thing is tested with the best tool available for the job that ensures robustness in the face of change rather than riding on semantics?

imiric · 2 years ago
There doesn't need to be a strict distinction, but if you subscribe to the traditional test pyramid practice, it's worth focusing the bulk of your testing on tests that run very quickly (typically in the order of milliseconds), that don't require complex setup/teardown, and that don't rely on external state or services to run. This means that these tests are easier to read/write, you can have thousands of them, and you can run them much more frequently, which makes you more productive.

In turn, if you focus more on integration or E2E tests, which are usually more difficult and expensive to run, then you may avoid running them frequently, or rely on CI to run them for you, which slows down development.

Also, you may consider having a stronger distinction, and e.g. label integration and E2E tests, and choose to run them conditionally in CI. For example, it might not make sense to even start the required services and run integration/E2E tests if the unit test suite fails, which can save you some time and money on CI resources.
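
For example, a pytest-flavoured sketch (the marker name is arbitrary):

    # Register the marker once, e.g. in pytest.ini:
    #   [pytest]
    #   markers = integration: needs Docker / external services

    import pytest

    def test_discount_math():
        # Plain unit test: runs everywhere, in milliseconds.
        assert round(100 * 0.9, 2) == 90.0

    @pytest.mark.integration
    def test_order_flow_against_real_db():
        # Only runs where the required services are available.
        ...

    # CI can then gate on the fast suite first:
    #   pytest -m "not integration"   # every push
    #   pytest -m integration         # only if the fast suite is green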

emmelaich · 2 years ago
Wow, that's interesting, because I never even considered a unit test to be anything other than a test of a small unit.

Is it not right there in the name?

cassianoleal · 2 years ago
I guess what the OP is arguing is that "unit test" doesn't mean you "test a unit" but rather that "each test is a unit" - i.e. each test executes independently from all other tests.

I just find that good testing practice tbh but it's true that there are loads of test suites out there that require tests to be run in a particular sequence. I haven't seen one of those in a while but they used to be quite common.

BurningFrog · 2 years ago
Yes. That is definitely the original intention of the term!

Of course, language can and does drift etc, but I haven't seen the other use anywhere else.

melvinroest · 2 years ago
In practice, having high testing standards means to me (having worked for a few SaaS companies): change code in one place, and see where it fails elsewhere. Though I see failing tests as guidelines, as nothing is 100% tested. If you don't see them as guidelines but as absolute, then you'll get those back as bugs via Zendesk.
WolfOliver · 2 years ago
It makes sense to write a test for a class when the class/method does complex calculations. Today this is less the case than it was when the test pyramid was introduced.
PH95VuimJjqBqy · 2 years ago
> I also wasn't aware that "unit" referred to an isolated test, not to the SUT.

What the hell are they teaching people nowadays?

But then I do a quick google search and see this as the top result

https://stackoverflow.com/questions/652292/what-is-unit-test...

> Unit testing simply verifies that individual units of code (mostly functions) work as expected.

Well no fucking wonder.

This is like the whole JRPG thing where the younger generation misunderstood but in their hubris claim it's the older generation that coined the term that doesn't understand.

It's the blind leading the blind and that's probably the most apt description of our industry right now.

dmos62 · 2 years ago
This resonates. I learned the hard way that you want your main tests to integrate all layers of your system: if the system is an HTTP API, the principal tests should be about using that API. All other tests are secondary and optional: can be used if they seem useful during implementation or maintenance, but should never be relied upon to test correctness. Sometimes you have to compromise, because testing the full stack is too expensive, but that's the only reason to compromise.

This is largely because if you try to test parts of your system separately, you have to perfectly simulate how they integrate with other parts, otherwise you'll get the worst-case scenario: false test passes. That's too hard to do in practice.
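
Concretely, the principal tests end up looking something like this (hypothetical endpoint and payloads, using the requests library against a test instance):

    import os
    import requests

    BASE_URL = os.environ.get("API_URL", "http://localhost:8000")

    def test_create_and_fetch_order():
        # Exercise the real stack: routing, validation, serialization, DB -- no simulated parts.
        created = requests.post(f"{BASE_URL}/orders", json={"sku": "ABC", "qty": 2})
        assert created.status_code == 201
        order_id = created.json()["id"]

        fetched = requests.get(f"{BASE_URL}/orders/{order_id}")
        assert fetched.status_code == 200
        assert fetched.json()["qty"] == 2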

I suspect that heavy formalization of the parts' interfaces would go a long way here, but I've not yet seen that done.

troupo · 2 years ago
> if the system is an HTTP API, the principal tests should be about using that API

So many times yes!

Funnily enough, it's also the quickest way to get high code coverage numbers, which are still used as a metric everywhere.

shoo · 2 years ago
> the quickest way to get high code coverage numbers

apocryphal tale of test coverage + Goodhart's law:

Team is responsible for a software component that's two-thirds gnarly driver code that's impossible to unit test, and one third trivial stuff that's easy to unit test. Team has 33% unit test coverage.

Test coverage becomes an organisational KPI. Teams must hit at least 80% statement coverage. Oh no!

Team re-architects their component & adds several abstraction layers that wrap the gnarly driver code but don't serve any functional purpose. The abstraction layers involve many lines of code, but are elegantly designed so they are easy to test. Now the codebase is 20% gnarly driver code that's impossible to unit test, 10% trivial stuff that's easy to unit test, and 70% layers of unnecessary nonsense that's easy to unit test. 80% statement coverage target achieved! Raises and promos for everyone!

emadb · 2 years ago
The big TDD misunderstanding is that most people consider TDD a testing practice. The article doesn’t talk about TDD, it gives the reader some tips on how to write tests. That’s not TDD.
MoreQARespect · 2 years ago
I'm fully aware of the idea that TDD is a "design practice" but I find it to be completely wrongheaded.

The principle that tests which couple to low-level code give you feedback about tightly coupled code is true, but they do that because low-level/unit tests couple too tightly to your code - i.e. because they too are bad code!

Have you ever refactored working code into working code and had a slew of tests fail anyway? That's the child of test driven design.

High-level/integration TDD doesn't give "feedback" on your design; it just tells you if your code matches the spec. This is actually more useful. It then lets you refactor bad code with a safety harness and gives failures that actually mean failure and not "changed code".

I keep wishing for the idea of test-driven design to die. Writing tests which break on working code is an inordinately uneconomic way to detect design issues compared to developing an eye for it and fixing it under a test harness with no opinion on your design.

So, yes this - high level test driven development - is TDD and moreover it's got a better cost/benefit trade off than test driven design.

thom · 2 years ago
I think many people realise this, thus the spike and stabilise pattern. But yes, integration and functional tests are both higher value in and of themselves, and lower risk in terms of rework, so ought to be a priority. For pieces of logic with many edge cases and iterations, mix in some targeted property-based testing and you’re usually in a good place.
lukeramsden · 2 years ago
Part of test-driven design is using the tests to drive out a sensible and easy-to-use interface for the system under test, and to make it testable from the get-go (not too much non-determinism, threading issues, whatever it is). It's well known that you should likely _delete these tests_ once you've written higher-level ones that test behaviour rather than implementation! But the best and quickest way to get to having high-quality _behaviour_ tests is to start by using "implementation tests" to make sure you have an easily testable system, and then go from there.
osigurdson · 2 years ago
I think there can be some value to using TDD in some situations but as soon as people get dogmatic about it, the value is lost.

The economic arguments are hard to make. Sure, writing the code initially might cost $X and writing tests might cost $1.5X but how can we conclude that the net present value (NPV) of writing the tests is necessarily negative - this plainly depends on the context.

surgical_fire · 2 years ago
I don't even like TDD much, but I think that this missed the point:

> Have you ever refactored working code into working code and had a slew of tests fail anyway?

Yes - and that is intended. The "refactor of working code into working code" often changes some assumptions that were made during implementation.

Those tests are not there to give "feedback on your design", they are there to ensure that the implementation does what you thought it should do when you wrote your code. Yes, that means that when you refactor your code, quite a few tests will have to be changed to match the new code.

But the amount of times I had this happen and it highlighted issues on the refactor is definitely not negligible. The cost of not having these tests (which would translate into bugs) would certainly have surpassed the costs of keeping those tests around.

aljarry · 2 years ago
> Have you ever refactored working code into working code and had a slew of tests fail anyway? That's the child of test driven design.

I had this problem, when either testing too much implementation, or relying too much on implementation to write tests. If, on the other hand, I test only the required assumptions, I'd get lower line/branch coverage, but my tests wouldn't break while changing implementation.

My take on this - TDD works well when you fully control the model, and when you don't test for implementation, but the minimal required assumptions.

OJFord · 2 years ago
I don't think that's TDD's fault, that's writing a crappy test's fault.

If you keep it small and focussed, don't include setup that isn't necessary and relevant, only exercise the thing which is actually under test, only make an assertion about the thing you actually care about (e.g. there is the key 'total_amount' with the value '123' in the response, not that the entire response body is x); that's much less likely to happen.
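
Roughly the difference between these two assertions (`response` being whatever your HTTP client returned; the payload is made up):

    # Brittle: couples the test to every field in the payload.
    assert response.json() == {"id": 17, "total_amount": 123, "currency": "EUR"}

    # Focused: asserts only the behaviour this test is actually about.
    assert response.json()["total_amount"] == 123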

chris_wot · 2 years ago
If you’ve refactored code and a bunch of tests fail, then you’ve likely introduced a bug.
deneas · 2 years ago
I mean I think it's fair to assume that TEST-Driven-Development has something to do with testing. That being said, Kent Beck recently (https://tidyfirst.substack.com/p/tdd-outcomes) raised a point saying TDD doesn't have to be just an X technique, which I wholeheartedly agree with.
shimst3r · 2 years ago
Instead of Test-Driven Design, it should’ve been called Design-By-Testing.
paulluuk · 2 years ago
Did you mean Test Driven Development, or is Test-Driven Design a whole other thing?
marcosdumay · 2 years ago
Well, it's exactly as much about testing as it focuses on writing and running tests.

Which means it's absolutely, entirely about them.

People can claim it's about requirements all they want. The entire thing revolves around the tests, and there's absolutely no consideration of the requirements except for the part where you map them into tests. If you try to create a requirements framework, you'll notice that there is much more to them than testing whether they are met.

danmaz74 · 2 years ago
As I remember the discourse about TDD, originally it was described as a testing practice, and later people started proposing to change the last D from "development" to "design".
skrebbel · 2 years ago
Yeah it’s kind of unfortunate because they make a very good argument about defining a thing better, and in the title use a wrong definition of an adjacent term.
WolfOliver · 2 years ago
Maybe the term TDD in the title could be replaced with "unit testing". But unit testing is a major part of TDD.
michalc · 2 years ago
> Now, you change a little thing in your code base, and the only thing the testing suite tells you is that you will be busy the rest of the day rewriting false positive test cases.

If there is anything that makes me cry, it’s hearing “it’s done, now I need to fix the tests”

tetha · 2 years ago
It's something we changed when we switched our configuration management. The old config management had very, very meticulous tests of everything. This resulted in great "code" coverage, but whenever you changed a default value, at least 6 tests would fail now. Now we'd much rather test much more coarsely. If the config management can take 3 VMs and set up a RabbitMQ cluster that clusters and accepts messages, how wrong can it be?

And this has also bled into my development and strengthened my support of bug-driven testing. For a lot of pretty simple business logic, do a few high level e2e tests for the important behaviors. And then when it breaks, add more tests for those parts.

But note, this may be different for very fiddly parts of the code base - complex algorithms, math-heavy code and such. That's when you'd rather reach for table-based testing. At a past gamedev job, we had several issues with some complex cost-balancing math, so I eventually set up a test that allows the game balancing team to supply CSV files with expected results. That cleared up these issues within 2 days or so.
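
In spirit it was something like this simplified Python sketch (the real thing wasn't Python, and the file name, columns and formula are made up):

    import csv
    import pytest

    def unit_cost(base, level, multiplier):
        # Illustrative stand-in for the balancing math under test.
        return round(base * multiplier ** level, 2)

    def load_cases(path="balancing_expected.csv"):
        # Columns: base,level,multiplier,expected -- maintained by the balancing team.
        with open(path, newline="") as f:
            return [
                (float(r["base"]), int(r["level"]), float(r["multiplier"]), float(r["expected"]))
                for r in csv.DictReader(f)
            ]

    @pytest.mark.parametrize("base,level,multiplier,expected", load_cases())
    def test_cost_matches_balancing_sheet(base, level, multiplier, expected):
        assert unit_cost(base, level, multiplier) == pytest.approx(expected, abs=0.01)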

preommr · 2 years ago
> how wrong can it be?

Me, right before some really annoying bug starts to show up and the surface area is basically half the codebase, across multiple levels of abstraction, in various combinations.

tremon · 2 years ago
whenever you changed a default value, at least 6 tests would fail now

Testing default values makes a lot of sense. Both non-set configuration values and non-supplied function parameters become part of your API. Your consumers will rely on those default values, and if you alter them, your consumers will see different behaviour.

scaramanga · 2 years ago
If changing the implementation but not the behaviour breaks a test, I just delete the test.
WolfOliver · 2 years ago
Agree, this is usually a sign the team writes tests for the sake of writing tests.
int0x80 · 2 years ago
Sometimes, you have to make a complex feature or fix. You can first make a prototype of the code or proof of concept that barely works. Then you can see the gap that remains to make the change production ready and the implications of your change. That involves fixing regressions in the test suite caused by your changes.
tbrownaw · 2 years ago
> Tip #1: Write the tests from outside in.

> Tip #2: Do not isolate code when you test it.

> Tip #3: Never change your code without having a red test.

> Tip #4: TDD says the process of writing tests first will/should drive the design of your software. I never understood this. Maybe this works for other people but it does not work for me. It is software architecture 101 — Non-functional requirements (NFR) define your architecture. NFR usually do not play a role when writing unit tests.

The one time I ever did "proper" red/green cycle TDD, it worked because I was writing a client library for an existing wire protocol, and knew in advance exactly what it needed to do and how it needed to do it.

Item #2 is right, but this also means that #1 is wrong. And knowing what order #2 requires means knowing how the code is designed (#4).

sheepshear · 2 years ago
The tips are not contradictory if you follow the advice to start at a higher level.

Let's say you had to invent that wire protocol. You would write a test for a client that doesn't care which wire protocol is used.

HumblyTossed · 2 years ago
TDD works great for this. Usually before I am sent a new piece of equipment (has to go through the approval/purchase process) I’m given the docs. I’ll write unit tests using the examples in the docs (or made up examples based on the docs). I’ll write my software controller against that. By the time I get the actual device I’m just confirming my code works.
randomdata · 2 years ago
TDD was later given the name Behavior Driven Development (before being usurped by the likes of Cucumber and Gherkin) in an attempt to avoid this confusion. TDD advocates that you test that the client library does what its public interface claims it does – its behavior, not how it is implemented under the hood. The wire protocol is almost irrelevant. The tests should hold true even when the wire protocol is replaced with another protocol.
voiceofunreason · 2 years ago
That's not quite right, historically.

Behavior Driven Development began as a re-languaging of TDD: "The developers were much more receptive to TDD when I stopped talking about testing." -- Dan North.

BDD diverged from TDD fairly early, after some insights by Chris Matts.

As for TDD advocating tests of the public interface... that seems to me to have been more aspirational than factual. The tests in TDD are written by developers for developers, and as such tend to be a bit more white/clear box than pure interface testing would suggest.

In the edge cases where everything you need for testing is exposed via the "public" interface, these are equivalent, of course, but there are tradeoffs to be considered when the information you want when running isolated experiments on an implementation isn't part of the contract that you want to be supporting indefinitely.

rileymat2 · 2 years ago
I suspect it goes deeper than that, which is some of the confusion.

If you have multiple layers/parts, some will treat each part as an independent library to be used; implementation details of one level depend on the public interfaces of the next level.

WolfOliver · 2 years ago
Do you have any references that BDD was used as a term before Cucumber?
almostnormal · 2 years ago
Part of the problem is caused by all sides using the same terms but with a different meaning.

> You just don’t know if your system works as a whole, even though each line is tested.

... even though each line has been executed.

One test per line is strongly supported by tools calculating coverage and calling that "tested".

A test for one specific line is rarely possible. It may be missing some required behavior that hasn't been challenged by any test, or it may be inconsistent with other parts of the code.

A good start would be to stop calling something just executed "tested".

usrusr · 2 years ago
I like the term "exercised" for coverage of questionable assertive value. It's rather pointless in environments with a strict compiler (and even some linters get there), but there are still others where you barely know more than that the brackets are balanced before trying to execute. That form of coverage is depressingly valuable there. Makes me wonder if there is a school of testing that deliberately tries to restrict meaningful assertions to higher level tests, so that their "exercise" part is cheaper to maintain?

About the terms thing:

Semantics drifting away over time from whatever a sequence of letters were originally used for isn't an exception, it's standard practice in human communication. The winning strategy is adjusting expectations while receiving and being aware of possible ambiguities while sending, not sweeping together a proud little hill of pedantry to die on.

The good news is that neither the article nor your little jab at the term "tested" really does that; the pedantry is really just a front to make the text a more interesting read. But it also invites the kind of shallow attack that is made very visible by discussions on the internet, but will also play out in the heads of readers elsewhere.

almostnormal · 2 years ago
> Makes me wonder if there is a school of testing that deliberately tries to restrict meaningful assertions to higher level tests, so that their "exercise" part is cheaper to maintain?

That's a nice summary of the effect I observe in the bubble I work in, but I'm sure it is not deliberate; Hanlon's razor applies.

With sufficient freedom for interpretation it is what the maturity model of choice requires.

osigurdson · 2 years ago
My view on unit testing is that if there are no dependencies, there is no real reason not to write tests for all behaviours. While you may have a wonderful integration testing suite, it is still great to know that building blocks work as intended.

The problems arise with dependencies, as now you need to decide whether to mock them or use concrete implementations. The concrete implementation might be hard to set up, slow to run in a test - or both. Using a mock, on the other hand, is essentially an alternate implementation. So now your code has the real implementation + one implementation per test (in the limit), which is plainly absurd.

My current thinking (after writing a lot of mocks) is to try to shape code so that more of it can be tested without hard to setup dependencies. When this can't be done, think hard about the right approach. Try to put yourself in the shoes of a future maintainer. For example, instead of creating a bespoke mock for just your particular test, consider creating a common test utility that mocks a commonly used dependency in accordance with common testing patterns. This is just one example. Annoyingly, a lot of creativity is required once dependencies of this nature are involved which is why it is great to shape code to avoid it where possible.
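
For example, instead of a bespoke mock per test, a single well-known in-memory fake that every test shares (hypothetical sketch):

    # testing/fakes.py -- one fake with agreed-upon behaviour, reused everywhere.
    class InMemoryOrderRepo:
        def __init__(self):
            self._orders = {}

        def save(self, order_id, order):
            self._orders[order_id] = order

        def get(self, order_id):
            return self._orders.get(order_id)

    # In a test, the fake behaves the same way for everyone -- no N slightly
    # different mock setups to keep in sync with the real implementation.
    def test_service_stores_order():
        repo = InMemoryOrderRepo()
        repo.save(1, {"total": 40})
        assert repo.get(1) == {"total": 40}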

troupo · 2 years ago
> it is still great to know that building blocks work as intended.

The only way to know that the building blocks work as intended is through integration testing. The multiple "all unit tests passed, no integration tests" memes show that really well.

osigurdson · 2 years ago
I suspect you are not finding many bugs in common built-in libraries via integration tests however. The same concept can extend to code that our own teams write.
michaelteter · 2 years ago
> try to shape code so that more of it can be tested without hard to setup dependencies

Yes!

In general, functional core + imperative shell goes a long way toward this goal.

With that approach there should also be minimal coupling, with complex structured types kept at the outside and simple standard types passed to the core functions.

These things make unit testing so much easier and faster (dev time and test run time).
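
A minimal sketch of that split (illustrative names only):

    import datetime

    # Functional core: pure, takes simple types, trivially unit-testable.
    def overdue_invoices(invoices, today):
        return [inv for inv in invoices if inv["due"] < today and not inv["paid"]]

    # Imperative shell: all the I/O lives here and stays thin.
    def send_reminders(db, mailer, today):
        for inv in overdue_invoices(db.fetch_invoices(), today):
            mailer.send(inv["customer_email"])

    # The core test needs no DB, no mailer, no mocks:
    def test_overdue_invoices():
        today = datetime.date(2024, 1, 15)
        invoices = [
            {"due": datetime.date(2024, 1, 1), "paid": False, "customer_email": "a@x"},
            {"due": datetime.date(2024, 2, 1), "paid": False, "customer_email": "b@x"},
        ]
        assert overdue_invoices(invoices, today) == invoices[:1]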

patrick451 · 2 years ago
The tests in the codebase I currently work in are a mocking nightmare. It feels like somebody learned about C++ interface classes and gmock for the first time when the codebase was first being put together and went completely bananas. There are almost no classes which don't inherit from a pure interface.

The main drawbacks to this are:

- Classes which have only a single implementation inherit from an interface just so they can be mocked. We often only need polymorphism for testing, but not at runtime. This not only makes the code slower (minor concern, often) but more importantly much more difficult to follow.

- The tests rely heavily on implementation details. The typical assertion is NOT on the input/output behavior of some method. Rather, it asserts that various mocked dependencies got called at certain times within the method under test. This heavily couples the tests to the implementation and makes refactoring a pain.

- We have no tests which tests multiple classes together that aren't at the scale of of system wide, end-to-end tests. So when we DI class Bar into class Foo, we use a mock and don't actually test that Bar and Foo work well together.

Personally, I think the code base would be in much better shape if we completely banned gmock.

philippta · 2 years ago
In my experience a lot of engineers are stuck thinking in MVC terms and fail to write modular code. As a result, most business logic is part of a request/response flow. This makes it infeasible to even attempt to write tests first, thus leaving integration or E2E tests as the only remaining options.
laurencerowe · 2 years ago
I’m not a TDD purist but I’ve found that so long as the request / response flow is a JSON api or similar (as opposed to old style forms and html rendering) then writing integration tests first is quite easy so long as you make sure your test fixtures are fairly fast.
_heimdall · 2 years ago
With this approach do you stop at testing the JSON API or do you still make it to testing the rendered HTML actually shown to the user?

I've always actually liked the simplicity of testing HTML APIs in a frontend project. For me, tests get a lot simpler when I can verify the final HTML directly from the response and don't need to parse the JSON and run it through client-side rendering logic first.

trwrto · 2 years ago
Where should the business logic live instead? My tests typically call APIs to test the business logic. Trying to improve myself here.
philippta · 2 years ago
There aren't any hard rules here, but you can try to build your business logic as if it were a library, where your HTTP API is merely an interface to it.
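
Something like this hypothetical layout (sketched in Python; the names are made up):

    # pricing.py -- the "library": no HTTP, no framework imports, testable on its own.
    def quote(items, tax_rate):
        subtotal = sum(i["price"] * i["qty"] for i in items)
        return {"subtotal": subtotal, "total": round(subtotal * (1 + tax_rate), 2)}

    # api.py -- the HTTP layer is just a thin adapter over the library:
    # parse request -> call library -> serialize response.
    def handle_quote_request(request_json):
        return quote(request_json["items"], tax_rate=0.2)

    # Business-logic tests never need to touch HTTP:
    def test_quote_applies_tax():
        assert quote([{"price": 10.0, "qty": 2}], tax_rate=0.2)["total"] == 24.0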