Integration tests are closer to what you want to know, but they're also more work. If I want to make sure that my state machine returns an error when it receives a message for which no state transition is defined, I could spin up a process, set up log collection, orchestrate it all with Python, and so on... or I could write a unit test that instantiates a state machine, gives it a message, and checks the result.
My point is that we need both. Write a unit test to ensure that your component behaves according to its spec, especially with respect to edge cases. Write an integration test to make sure that the feature of which your component is a part behaves as expected.
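For what it's worth, here is a rough sketch of the unit-test half of that. Everything in it is made up for illustration: the `StateMachine` class, its transition table, the `handle` method, and the `UndefinedTransition` exception are all hypothetical, and I'm assuming "returns an error" means raising an exception, which may not match how your state machine reports errors.

```python
import unittest


class UndefinedTransition(Exception):
    """Raised when no transition is defined for (state, message)."""


class StateMachine:
    # Hypothetical transition table: (current state, message) -> next state.
    TRANSITIONS = {
        ("idle", "start"): "running",
        ("running", "stop"): "idle",
    }

    def __init__(self, state="idle"):
        self.state = state

    def handle(self, message):
        key = (self.state, message)
        if key not in self.TRANSITIONS:
            # Assumption: an undefined transition is reported by raising.
            raise UndefinedTransition(
                f"no transition from {self.state!r} on {message!r}"
            )
        self.state = self.TRANSITIONS[key]
        return self.state


class StateMachineTest(unittest.TestCase):
    def test_undefined_transition_is_an_error(self):
        machine = StateMachine()
        # "stop" is not defined for the "idle" state.
        with self.assertRaises(UndefinedTransition):
            machine.handle("stop")
        # The failed message must not change the state.
        self.assertEqual(machine.state, "idle")

    def test_defined_transition(self):
        machine = StateMachine()
        self.assertEqual(machine.handle("start"), "running")
```

Saved in a test file, this runs with `python -m unittest` and needs no process, no log collection, and no orchestration. The integration test still has to exist to show that the feature built on top of the state machine works end to end.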
With other types of bugs, programmers want to fix them. With flakiness, they either want to rerun the test until it passes or tear it down and write an entirely different type of test - as if it were not a bug at all, but some immutable fact of life.
A couple of years ago I helped bring a project back on track. They had a notoriously flaky part of the test suite, which turned out to be caused by a race condition. They also had a very puzzling case of occasional data corruption, which, it turned out, was caused by the same race condition.
Most flakiness ends up being either a bug in the test or nondeterminism in the code that users don't actually care about.