I've used this in the past to force bash to print every command it runs (using the -x flag) in the Actions workflow. This can be very helpful for debugging.
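A minimal sketch of what that tracing looks like (the step contents here are made up):

```shell
#!/usr/bin/env bash
# With -x enabled, bash prints each command (prefixed with "+") to stderr
# before running it, so the Actions step log shows exactly what executed.
set -x
version="1.2.3"            # hypothetical variable, for illustration
echo "releasing $version"
```

Locally this prints trace lines like `+ echo 'releasing 1.2.3'` on stderr alongside the normal output.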
Pipefail also doesn't prevent more complex error states. For example, this step from your config:
https://github.com/jstrieb/just.sh/blob/2da1e2a3bfb51d583be0...
Here are the different error conditions you will run into:
1. curl succeeds, sudo succeeds, tar succeeds, but just fails to extract from the tarball. tar reports an error; the step fails.
2. curl succeeds, sudo succeeds, tar fails. sudo reports an error; the step fails.
3. curl succeeds, sudo fails. The shell reports an error; the step fails.
4. curl begins running. sudo begins running in a subshell/pipe. tar begins running under the sudo pipe, extracting half of the just binary. Then curl fails due to a network error. Because pipefail is enabled, the shell exits immediately. There is no error message, and a corrupt executable is left on disk (which will be run later if your step had failure-skipping enabled).
> there will be no error output, you won't know why it failed
That's probably why the -x is there. (Well, that and if something like curl or sudo fails it tends to output something to stderr...)
> Pipefail also doesn't prevent more complex error states ... A corrupt executable is left on-disk (which will be attempted to run if your step had failure-skipping enabled)
If I'm reading right, it seems like you're suggesting that the case pipefail doesn't handle is when you explicitly ignore the exit code. That doesn't exactly seem like the most concerning catch-22, to be honest.
You're supposed to also use `set -e` if you're going to `set -o pipefail`, but of course that requires understanding that `set -e` will not apply to anything happening inside a function called in a conditional expression -- this is a tremendous footgun.
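A minimal repro of that footgun (function and message names are made up):

```shell
#!/usr/bin/env bash
set -e

deploy() {
  false                     # a failing command...
  echo "kept going anyway"  # ...that does not stop the function here
}

# Because deploy is called as part of a conditional, errexit is suspended
# for the entire function body: `false` is ignored and the function's exit
# status becomes that of its last command (the echo, i.e. success).
if deploy; then
  echo "deploy reported success"
fi
```

Both lines print, even though `false` "failed"; calling `deploy` outside the conditional would instead have exited the script at `false`.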
And if you want to know which command in a pipe failed there's `PIPESTATUS`.
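For example:

```shell
#!/usr/bin/env bash
set -o pipefail

# With pipefail, `false | true` makes the whole pipeline fail (status 1),
# and PIPESTATUS records each command's individual exit code. It has to be
# read immediately, before the next command overwrites it.
false | true
codes=("${PIPESTATUS[@]}")
echo "individual exit codes: ${codes[0]} ${codes[1]}"
```

This prints `individual exit codes: 1 0`: the pipeline as a whole failed because of `false`, even though `true` (the last command) succeeded.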
One cool undocumented GitHub Actions trick I spotted at work was the ability to use wildcards to match repository_dispatch event names:
    on:
      repository_dispatch:
        types:
          - security_scan
          - security_scan::*
Why would you want to do this?
We centralize our release pipelines as it's the only way to force repositories through a defined reusable workflow (we don't want our product teams to have to maintain them). This allows us to dispatch an event scoped to each product, and then it is far easier to identify which product and version a workflow is running when looking in the Actions tab of our central release repository.
How have you found this centralized approach to work for you? Does your org require everything be built in exactly the same way?
I’ve just joined an organization that is trying to do something similar, but in practice it seems nearly worthless. Templates are frequently broken, or written in ways that expect code to be built in a particular manner, without any supporting docs describing the requirements.
We centralize releases but we don't centralize builds. That would remove too much autonomy from teams.
We don't use templates for any of this. Our interface is the payload sent with repository_dispatch, a few metadata files in the repository (which we fetch) and a GitHub application that allows us to update the PRs with the release status checks.
GitHub doesn't have a great story here; ideally we would want to listen to CI events emitted from a repo and run workflows as a reaction.
The reusable-workflow story on a modular basis is better, but here we're missing features that we would need to move some of our workflows into Actions. Notably, Action repos need to be public.
I may be targeted for calling this the “correct” way, but it is - it’s the only correct way.
Otherwise you need complicated setups to test any of the stuff you put up there since none of it can be run locally / normally.
GitHub Actions, like any CI/CD product, is for automating in ways you cannot with scripting - like parallelizing and joining pipelines across multiple machines, modelling the workflow. That’s it.
I would really appreciate an agnostic templating language for this so these workflows can be modelled generically and have different executors, so you could port them to run them locally or across different products. Maybe there is an answer to this that I’ve just not bothered to look for yet.
I don’t see the post as the author suggesting you do this, but informing that it can be done. There’s a large difference. Knowing the possibilities of a system, even if it’s things you never plan on using, is useful for security and debugging.
My take was that it is not useful, definitely, categorically not useful. It is a potential security hazard though. Especially for 'exploring' self-hosted runners.
This generation will shudder when they are asked to bring discipline to deployments built from github actions.
As someone currently working to move a large enterprise to GH Actions (not quite, but “yaml-based pipelines tied to git”) - what would discipline look like? If you can describe it, I can probably make it happen at my org.
1. Do not use yaml.
All github action logic should be written in a language that compiles to yaml, for example dhall (https://dhall-lang.org/). Yaml is an awful language for programmers, and it's a worse language for non-programmers. It's good for no one.
2. To the greatest extent possible, do not use any actions which install things.
For example, don't use 'actions/setup-node'. Use bazel, nix, direnv, some other tool to setup your environment. That tool can now also be used on your developer's machines to get the same versions of software as CI is using.
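One cheap way to get part of that benefit, even before adopting a full bazel/nix setup, is a guard script that both CI and developer machines run; a sketch (the tool name and version here are placeholders):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Fail fast, both locally and in CI, when the toolchain doesn't match the
# pinned version, instead of letting CI quietly install its own copy.
check_tool() {
  local name="$1" want="$2" have="$3"
  if [ "$have" != "$want" ]; then
    echo "error: expected $name $want, found $have" >&2
    return 1
  fi
}

# In real use the third argument would come from the tool itself,
# e.g. check_tool node v20.11.1 "$(node --version)".
check_tool node v20.11.1 v20.11.1 && echo "toolchain ok"
```

The point is that the version pin lives in one script that every environment runs, rather than in a CI-only `setup-*` action.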
3. Actions should be as short and simple as possible.
In many cases, they will be as simple as effectively "actions/checkout@v4", "run: ./ci/build.sh", and that's it.
Escape from yaml as quickly as possible, put basic logic in bash, and then escape from bash as quickly as possible too, into a real language.
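As a sketch of that escape hatch, a `ci/build.sh` that does nothing but delegate (the inline Python stands in for a separate real-language entry point, just to keep the sketch self-contained):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Thin shim: no logic lives here; hand everything to a real language
# immediately. In practice this would be `exec python3 ci/build.py "$@"`
# or similar, rather than a heredoc.
python3 - "$@" <<'PY'
import sys
print(f"build invoked with args: {sys.argv[1:]}")
PY
```

Running `./ci/build.sh release x64` hands `['release', 'x64']` straight to the real entry point; the workflow yaml only ever needs `run: ./ci/build.sh`.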
4. Do not assume that things are sane or secure by default.
Ideally you don't accept PRs from untrusted users, but if you do, read all the docs very carefully about what actions can run where, etc. Github actions on untrusted repos are a nightmare footgun.
0. Make 99% of your setups runnable locally, with Docker if need be. It's the fastest way to test something and nothing else comes close. #1 and #2 derive from #0. This is actually a principle for code, too: if you have stuff like Lambda, make sure you have a command-line entry point as well, so you can also test things locally.
1. Avoid YAML if you can. Either plain configuration files (generated if need be - don't be afraid to do this) or full blown programming languages with all the rigor required (linting/static analysis, tests, etc).
2. Move ALL logic outside of the pipeline tool. Your actions should be ./my-script.sh or ./my-tool.
Source: lots of years of experience in build engineering/release engineering/DevOps/...
Also put as much as possible in bash or justfile instead of inside the yaml. It avoids vendor lock-in and makes local debugging easier.
The golden rule is "will I need to make a dummy commit to test this?" and if yes, find a different way to do it. All good rules in sibling comments here derive from this rule.
You do not ever want to need to make dummy commits to debug something in CI; it's awful. As a bonus, following this rule also means better access to debugging tools, local logs, fewer "works on CI but not here" issues, etc. Finally, if you ever want to move away from GitHub to somewhere else, it'll be easy.
For CI actions: pre-build a docker image with dependencies, then run your tests using this image as a single GitHub Actions step.
If dependencies change, rebuild image.
Do not rely on gh caching, installs, multiple steps, etc.
Otherwise there will be a moment when tests pass locally but not on gh, and debugging will be super hard. With a single pre-built image, you can just debug inside that same image.
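One way to implement "if dependencies change, rebuild image" is to derive the image tag from a hash of the dependency manifest; a sketch (file names and registry are made up):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Derive the CI image tag from a hash of the dependency manifest: same
# dependencies give the same tag (reuse the image); changed dependencies
# give a new tag (triggering a rebuild).
manifest="$(mktemp)"                       # stand-in for e.g. requirements.txt
printf 'requests==2.31.0\n' > "$manifest"

tag="deps-$(sha256sum "$manifest" | cut -c1-12)"
echo "image tag: $tag"

# Hypothetical follow-up in CI:
#   docker pull "ghcr.io/example-org/ci:$tag" \
#     || docker build -t "ghcr.io/example-org/ci:$tag" -f ci/Dockerfile .
rm -f "$manifest"
```

Because the tag is content-addressed, there is no separate "did the deps change?" bookkeeping; a cache miss on the tag is the rebuild signal.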
1. Distinct prod and non-prod environments. I think you should have distinct Lab and Production environments. It should be practical to commit something to your codebase, and then test it in Lab. Then, you deploy that to Production. The Github actions model confuses the concepts of (source control) and (deployment environment). So you easily end up with no lab environment, and people doing development work against production.
2. Distinguish programming languages from DSLs. Github yaml reminds me of an older time when people built programming languages in XML. It is an interesting idea, but it does not work out. The value of a programming language: the more features it has, the better. The value of a DSL: the fewer features it has, the better.
3. Security. There is a growing set of github-action libraries. The Github ecosystem makes it easy to install runners on workstations to accept dispatch from github actions. This combination opens opportunities for remote attacks.
Writing any meaningful amount of logic or configuration in yaml will inevitably lead to the future super-sentient yaml-based AI torturing you for all eternity for having taken any part in cursing it to a yaml-based existence. The thought-experiment of "Roko's typed configuration language" is hopefully enough for you to realize how this blog post needs to be deleted from the internet for our own safety.
As long as other readers of the action are aware of what's happening, this seems pretty useful. There's been many adventures where my shell script, starting out as a few lines basically mirroring what I typed in by hand, has grown to a hundred-line-plus monster where I wish that I had real arrays and types and the included batteries in the Python stdlib.
I'm definitely not going to use this to implement my company's build actions in elisp.
Probably can write assembly too.
Github Actions Runner code is pretty easy to read; here's a specific place that defines default arguments for popular shells / binaries: https://github.com/actions/runner/blob/main/src/Runner.Worke... (it is exported through the method ScriptHandlerHelpers.GetScriptArgumentsFormat).
In ScriptHandler.cs there's all the code for preparing the process environment, arguments, etc., but specifically here's the actual code to start the process:
https://github.com/actions/runner/blob/main/src/Runner.Worke...
Overall I was positively surprised by the simplicity of this code. It's very procedural and it handles a ton of edge cases, but it seems to be easy to understand and debug.
I tend to prefer either:
- Using a build-system (e.g. Make) to encode logic and just invoke that from GitHub Actions; or
- Writing a small CLI program and then invoking that from GitHub Actions
It's so much easier to debug this stuff locally than in CI.
So an interesting trick, but I don't see where it would be useful.
Call me old school, but I want to leave YAML town as soon as possible.