If you don't trust your devs, CI is rarely going to protect you (at least in most setups)
Your dev could just replace all the tests with a "return true" to bypass traditional CI, too.
This makes an explicit step where you are vouching that you have run tests. If it turns out you didn't, it is going to be found out when someone actually DOES run tests and they fail... at that point you can discipline the developer for lying.
It isn't a matter of trust, it's a matter of developers being human. Humans make mistakes, and continuous integration helps catch mistakes.
I've been coding for 25+ years, and I sometimes forget to remove debug output, or I forget to update an explicit exception I threw somewhere, or I forget to make sure dependencies build across all the platforms we need.
Running your build on a second machine helps to catch a lot of those mistakes. It also helps enforce automation and good configuration practices and lint/formatting standards and all that, but fundamentally it helps us because humans make mistakes.
CI doesn't have to mean setting up a 12 step pipeline in GH actions with 3k lines of yaml and crazy triggers and workflows. It can just be an old dev machine (https://www.jamesshore.com/v2/blog/2006/continuous-integrati...) that pulls the latest code, runs `script/build`, and POSTs results somewhere.
Maybe local signoff works for 37signals, and if so, that's great. But it isn't continuous integration.
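That "old dev machine" loop is easy to sketch. Here's a hedged Python version of the pull-build-report cycle; `script/build` and the results URL are placeholders for whatever your project actually uses:

```python
# Minimal "old dev machine" CI sketch. Assumes a git checkout with a
# script/build entry point; the results URL below is a placeholder.
import subprocess
import urllib.request

def run_build(cmd):
    """Run the build command; return 'pass' or 'fail' based on its exit code."""
    result = subprocess.run(cmd, capture_output=True)
    return "pass" if result.returncode == 0 else "fail"

def post_result(url, status):
    """POST the build status to a results endpoint."""
    data = ("status=" + status).encode()
    urllib.request.urlopen(urllib.request.Request(url, data=data))

def ci_cycle():
    """One pull-build-report cycle; run this in a loop or from cron."""
    subprocess.run(["git", "pull", "--ff-only"], check=True)
    status = run_build(["script/build"])
    post_result("https://ci-results.example.com/report", status)  # placeholder URL
    return status
```

That's the whole idea: pull, build, report, repeat. No workflow DSL required.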
Humans are easily confused though. I've been 100% sure I'd run the tests when I hadn't, many times. It's quite easy to do: make a change, run the tests, fix it, run again, everything works. Make some trivial refactor that "shouldn't change anything". Your brain says "it's fine, I ran the tests, remember?" and you push. But you didn't run the tests.
"First you must not fool yourself, and you are the easiest to fool"
The most common mistake you'll see here is people running the tests, then making a small change afterwards, and then having the tests fail unexpectedly, even when the change was accidental. This has happened to me before with regular CI, and it often hit me during an automerge I fully expected to pass.
A dev replacing all the tests with "return true" would show up in the PR review, right? I mean, a dev that malignant needs to be fired yesterday, and anyone approving that PR can be shown the door right alongside them.
Requiring devs to certify that they ran the tests might be useful for compliance, if that's a thing your org has to worry about and you're really dead set against using CI.
Not to sound like a shill, but Nix solves this very elegantly: tests are just a special kind of build (usually with a synthetic output like `echo ok > $out`) and can therefore be "cached" like any other build.

Because build recipes are deterministic, and Nix uses sandboxed, pure build environments, "caching" is just a special case of "uploading build artifacts". Your cache = build artifact server ("binary store"), and it's shared between everyone (CI, build farm, dev laptops, ..), again because build recipes are deterministic.

Meaning: if your laptop (or VM) is the same architecture as your CI, you can just run your CI steps locally and push that to the shared store, and your CI will now automatically see the tests have already been run and not run them again. And vice-versa. Across all branches and all collaborators who share that same cache.
This is all deterministic and doesn't require any human in the loop: the test result is hashed by all of the inputs, so you never accidentally spuriously pass a test locally which shouldn't have.
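A toy sketch of that input-hashing idea, in Python rather than Nix (real Nix hashes the entire dependency closure, not just a file list, and the store is shared over the network rather than in-process):

```python
# Sketch of "test result keyed by a hash of all inputs". If the exact same
# inputs have been tested before, the cached result is reused; any change to
# the inputs produces a new key and forces a fresh run.
import hashlib
import subprocess

def input_hash(paths):
    """Deterministically hash file names and contents."""
    h = hashlib.sha256()
    for p in sorted(paths):
        h.update(p.encode())
        with open(p, "rb") as f:
            h.update(f.read())
    return h.hexdigest()

_store = {}  # stand-in for the shared binary store

def run_tests_cached(paths, test_cmd):
    """Run test_cmd only if this exact set of inputs hasn't been tested yet."""
    key = input_hash(paths)
    if key not in _store:
        _store[key] = subprocess.run(test_cmd).returncode == 0
    return _store[key]
```

Because the key covers every input, there's no way to "accidentally" reuse a stale pass: touch any input and you get a new key.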
It requires almost no setup at all. There are SaaS which make this basically turnkey (cachix, nixbuild.net, garnix, ..). Getting your app to build in Nix in the first place, though? Years of tears.
Nix lives in this group of tools where "easy things are hard, hard things are easy", and this is a classic example.
Lots of confusion between two things in this thread: 1) is this a good idea? 2) is this a good implementation of the idea?
Whether this is a good idea has been discussed to death already, but assuming you want this (which most people won't, the readme says as much), is this a good implementation of the idea? Yeah, it is.
Requiring a CI pass to merge, and a tool to add the CI pass after some unspecified process, seems like a neat, minimal implementation that still prompts the author enough to prevent most accidental misses. Is it complete? Of course not, but checklists don't need to be complete to be useful. Atul Gawande's book "The Checklist Manifesto" talks about this a bit: just the act of asking is often enough to break us out of habits and get us thinking, and will often turn up more issues than are on the checklist itself.
At Google we have a ton of tooling that amounts to light automation around self-certification. Many checks on PRs (CLs) only require you to say "yes I've done the thing", because that's often sufficient.
Surely if this is about creating a step in a checklist, all you need is a box to tick in the PR template, and that would be an even simpler version of this, requiring far fewer moving parts and being easier to use.
I think part of the criticism of (1) here comes from the complexity of the solution, which makes it feel like it should be competing with a more fully-fledged CI solution. But for a tool where the goal is really just to let the developers assert that they've run some tests, it's surely a lot more complicated than it needs to be, no?
PR templates are optional and many tools bypass them; my previous company tried using them and couldn't make processes stick because people used various tools to create PRs.
I'd argue that depending on PR templates, vs depending on devs having the signoff tool installed, is a pretty similar level of moving parts.
However perhaps more important than moving parts is failure modes. The failure mode of this tool is that your PR is blocked from merging until you run the tool. The failure mode of the PR template is that you never realise you missed the checkbox.
> for a tool where the goal is really just to let the developers assert that they've run some tests, it's surely a lot more complicated than it needs to be, no?
I think the point is to have the right amount of friction. Too little, as mentioned above, and you don't realise when you've lost the protection from the process. To me this is a good solution exactly because it gives you a little bit of friction, but not much, and cheaply.
Solutions with less friction include: just write in the PR description that you ran the tests, or just know that you ran the tests but don't write it down anywhere. I'd suggest that the safety nets provided by these options are less useful because they have less friction.
> Remote CI runners are fantastic for repeatable builds, comprehensive test suites, and parallelized execution. But many apps don't need all that. Maybe yours doesn't either.
My reading of that is that if you need repeatable builds, comprehensive test suites, and/or parallelised execution, then this is not the tool for you.
I think a checkbox, suggested in another comment, is a worse implementation.
I could certainly see the appeal of this sort of idea. Once your engineering org gets to a certain size, you can end up spending an eye-watering amount of money on CI compute - spinning up runners and executing thousands of tests for every single commit of every single pull request. That cost could decrease by a lot if, for PRs at least, you could use the dev's local machine instead.
The three prerequisites in my mind would be: that the CI workflow runs inside a local cluster of VMs (to match a real CI environment), that the results of the CI run are still uploaded and published somewhere, and that it's only used for pull requests (whereas a real CI is still used for the main branch).
This tool fails the second test; it doesn't publish the test results or in any way associate them with the status check.
Also, a larger organization is one in which human error is more likely, so I would expect a tool that relies on individual engineers not to make mistakes to be less suitable there.
I could _maybe, maybe_ see self-hosted GitHub runners inside of VMs running on developer machines. The VMs would have to be re-created after every request, however.
I don't think this is a great idea, however, as now CI is dependent on my flaky laptop's wifi/internet connection/IP address, has the potential to be contaminated by something running on my machine, build logs can be modified, environment shape/architectures are all different and can't be easily controlled, I now have access to all of the CI secrets and can impersonate the CI, etc.
If this is just running tests locally, it seems deeply flawed - e.g. the tests would work even if I forgot to commit new files.
OTOH, if it's starting a new container, pulling my branch, and then doing everything from scratch, it's definitely as good as running on remote CI, because it's basically replicating the same behaviour. And it would likely still be much faster, since:
* CI machines are underpowered compared to dev laptops/desktops. E.g. our CI builds run on a container with 4 vCPUs and 16 GB RAM; in contrast, my laptop has 16 cores and 48 GB RAM.
* docker pull itself takes a relatively long time on CI. If I'm running it locally it would just use the cached layers
The tool does actually check whether you have any uncommitted changes in Git, and fails in that case. So you're protected from that particular mistake. You're not protected from mistakes related to running the tests or checking their results, though, because the tool has nothing to do with that.
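That guard is simple to approximate. Here's a hedged sketch (not the tool's actual code) built on `git status --porcelain`, which prints one line per changed or untracked file, so empty output means a clean working tree:

```python
# Hypothetical "no uncommitted changes" guard, in the spirit of the tool's
# pre-signoff check. Not the real implementation.
import subprocess

def working_tree_clean(porcelain_output):
    """True if `git status --porcelain` output shows no pending changes."""
    return porcelain_output.strip() == ""

def repo_is_clean():
    """Run git status in the current repo and interpret the result."""
    out = subprocess.run(["git", "status", "--porcelain"],
                         capture_output=True, text=True, check=True).stdout
    return working_tree_clean(out)
```

Note what this does and doesn't cover: it catches "forgot to commit", but says nothing about whether any tests were actually run.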
I wrote and released a similar concept in 2012 in response to co-workers constantly breaking my code and not running tests before committing. https://rubygems.org/gems/git_test/
That was before we had a reliable CI server and let me tell you, fixing the CI server was a much better investment than convincing everyone to run and document their tests locally. The basecamp tool is more polished than what I cobbled together, but I personally won’t be reaching for it any time soon.
It's actually worse than that; this tool doesn't run any tests or do anything with test results. All it does is require each developer to run `gh signoff` before their PR can be merged; the only thing that command checks is that there aren't uncommitted changes in Git. So if your colleagues aren't already locally running your tests the right way, this does nothing at all to help you.
There is no obligation on the dev to actually run the tests locally, so you could just save time by disabling status checks.
The point of a CI pipeline is that the build ran in a more controlled environment, and there's a build log everyone can see as proof.
You do it on a shared server because you can pull it up and show your auditor "see, we follow our SDLC process".
(Also, I don't see anything in the readme that says most people shouldn't use it.)
https://docs.github.com/en/actions/hosting-your-own-runners/...
You could "favor" your own laptop as a target runner for the CI when it's your PR, for example:
https://docs.github.com/en/actions/writing-workflows/choosin...
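A sketch of what that might look like in a workflow file, assuming a self-hosted runner registered with a hypothetical `my-laptop` label (routing per PR author would additionally need workflow expressions):

```yaml
jobs:
  test:
    # Target a self-hosted runner carrying the (hypothetical) my-laptop label;
    # the job queues until a matching registered runner is online.
    runs-on: [self-hosted, my-laptop]
    steps:
      - uses: actions/checkout@v4
      - run: script/build
```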
The Docker CLI is supposed to support caching on GitHub Actions (https://docs.docker.com/build/cache/backends/gha/) but I suppose I haven't checked how fast it is in practice.
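For reference, the GHA cache backend from those docs is invoked roughly like this inside an Actions job (requires Buildx; `myimage` is a placeholder tag):

```shell
docker buildx build \
  --cache-from type=gha \
  --cache-to type=gha,mode=max \
  -t myimage .
```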
Ha, I wish. My company thinks 8 GB of RAM on a 6-year-old machine is plenty of power for the devs.