For anyone in the audience who didn't know: GitHub Actions is based on Visual Studio Team Foundation Server's CI, and later Azure DevOps. And nobody, including the current maintainers, seems to know exactly how it all works (or how it doesn't).[1] The random exit codes are just the cherry on top!
> And nobody, including the current maintainers, seems to know exactly how it all works (or how it doesn't).[1]
I read through this whole issue, and I cannot find exactly what you're referring to. What message gives you the impression they don't know how it all works? Seems it's mostly people asking for updates, and eventually something similar but different got implemented.
The fact that the --once flag was implemented and just didn't work. And it didn't work for several reasons. One, the control flow in the runner is extremely convoluted and exiting out is not trivial, and two, the scheduler doesn't handle assigned jobs not being picked up.
I attempted to read that code in order to glean information about how it communicates back to GitHub about progress etc. The goal was to make a new runner. I ran for the hills, fast.
Sounds like most Microsoft projects to be honest. Not a lot of them can be highlighted as the pinnacle of software engineering, including their GitHub Actions implementation.
Looks like GHA was announced[1] around the same time as the acquisition by MS. The first commit in that repo is from a year later, when they open-sourced it, so we can't see how it evolved before then.
For anyone who was at Github at the time, was it always written in C# or rewritten/replaced after acquisition? If "GitHub Actions is based on Visual Studio Team Foundation Server's CI" is the case, then it sounds like the latter.
I don't recall the exact context anymore, but at Microsoft Build this year a GitHub employee said that the service came with the acquisition, which is why it was, and stays, C#.
Personal experience: building infrastructure and an autoscaler for these things (which performed horribly due to API lag) while having weekly meetings with our GitHub rep to get the --ephemeral flag in (took about a year longer than promised). Sometimes the exit code would be 0 when using `--once`, and sometimes it would not be. Sometimes it'd also be 0 if the token somehow didn't work and the worker couldn't even register, no matter how often you restarted it (of course with a cryptic Azure IAM error code). Either way, we eventually decided that throwing away the machine if the runner exits for any reason was safest.
> GitHub Actions is based on Visual Studio Team Foundation Server's CI, and later Azure DevOps
Yes and no: the ADO Agent (https://github.com/microsoft/azure-pipelines-agent) is far more secretive and black-box-like.
It's stuck on old versions of NodeJS and PowerShell, with an undocumented API and not even enough tests/samples...
If you want the ability to run as much of your GitHub Actions locally as possible (which you should! Testing via pushing commits is painful) you can arrange your actions so the majority of the logic lives in shell scripts (or Python scripts or whatever) that are called by the workflow.
Then you can run those scripts locally and iterate in them before you try them on GitHub Actions.
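To make that concrete, here's a minimal sketch of the shape (script paths and names are made up): the workflow step is a one-liner, and all the real logic lives in a script you can run anywhere.

```yaml
# .github/workflows/ci.yml (sketch)
name: ci
on: [push]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run tests
        run: ./ci/test.sh   # everything interesting happens in here
```

Locally you just run `./ci/test.sh` directly; the workflow file only enters the picture at the very end.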
I agree: Actions would be more useful if I could run the entire stack locally (an official implementation, not a community-maintained clone). But it's not a big enough inconvenience for me to care very much.
If you're suggesting the self-hosted runners are equivalent to running locally, then no: self-hosted runners still require a commit + push and run remotely, with the added overhead of the job being queued, reported, etc.
The dream is to have something like a container _locally_ and be able to run something like:
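(a purely hypothetical invocation; no such official tool or image exists today)

```shell
docker run --rm -v "$PWD:/repo" ghcr.io/github/gha-local \
    run /repo/.github/workflows/ci.yml --job build
```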
I wish I didn't have to circumvent the UI with a strategy like this. Part of the reason I like CI/CD tools is the visualization factor - I can create specific steps, see which step is currently active during execution, and only care for the output of a specific step during debugging.
A platform with support for visual control from the scripts (implemented as no-ops during local execution) would be perfect.
GitHub actions are configured by version-controlled files too though. I have an actions-dev branch on my project, so I can iterate on actions config without having to clutter up master.
Would still be better if I could run locally to iterate. But the local run would have to have very high fidelity to what happens on GitHub to be useful.
That cuts both ways, in my experience: a bug fix needs a metric boatload of cherry-picks, possibly across a bunch of repos. Centralization is great until it's not.
We have gotten a lot of good mileage out of GitLab's include: feature <https://docs.gitlab.com/ee/ci/yaml/#include>, which for clarity absolutely carries this same hazard, but can mean centralized fixes are less onoz than trying to patch every active branch of every active repo's build scripts
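For readers who haven't used it, a minimal sketch of the include: pattern (the project and file paths here are invented):

```yaml
# .gitlab-ci.yml in each consuming repo
include:
  - project: platform/ci-templates   # hypothetical central templates repo
    ref: v2.3.0                      # pinned template version
    file: /templates/build.yml

build:
  extends: .standard-build           # job template defined centrally
```

Note that bumping `ref` in every consuming repo is exactly the centralized-fix cost being discussed; leaving `ref` off tracks the default branch instead, trading that cost for surprise breakage.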
I am also aware this cuts against the thrust of the thread about "build scripts should be locally executable," but until they (they: GitHub, they: GitLab) straighten out their runner binary to be sane, it is currently a better use of our engineering effort maintaining CI yaml than teaching devs how to run CI locally
I'm not quite familiar with GitHub Actions, but Azure DevOps Pipelines has a nice Preview API: https://learn.microsoft.com/en-us/rest/api/azure/devops/pipe... which runs the preprocessing steps (like running the C preprocessor on a C source file) and then gives you the processed yaml file. From that, if you'd like, you can write your own local runner, as long as your yaml files don't use too many different kinds of Azure DevOps pipeline steps.
yes! what we did was use a task runner to encapsulate the work in a "ci" task. you can run it locally, and then the GHA runner does little more than clone, install a couple deps and run the task.
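a sketch of that layout (script names are illustrative); the GHA side then does nothing but clone, install deps, and invoke this one entry point:

```shell
#!/usr/bin/env sh
# scripts/ci.sh: the single "ci" task, runnable on a laptop or in CI
set -eu
./scripts/lint.sh
./scripts/test.sh
./scripts/build.sh
```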
Building Java and C# on the command line nowadays is a hard undertaking, requiring weeks of learning and debugging, only for your code to change in a way that makes the scripts outdated at any time (oh, and it's never portable).
I really don't get why people use those complex CI/CD tools for other languages, but at least on the enterprise ones, I can understand people moving from the IDE into some huge centralized mess that can import it.
I usually tweak it a bit, but that's minutes, not weeks.
There's often more stuff to add as a project grows, but again, minutes to hours for the common stuff. Setting up publishing to a Maven repository is more work than I'd like, but it's still not weeks.
So what are you doing that it takes so long? And how long would that take if done any other way?
I would respectfully suggest that the author is misusing CI. If you have trouble running your tests locally, you have a problem. If you have trouble deploying from your local code, you have a problem. All of those capabilities should exist as simple scripts in your project already. Once you have that done, the CI yaml is a simple glue layer that defines an order of operations, e.g.:
1. Run static tests
2. If those pass, run unit/integration tests
3. If those pass, deploy
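As a sketch, that glue layer can be as small as this (script paths and branch name are placeholders):

```yaml
# Thin glue workflow: each step delegates to a script in the repo
name: pipeline
on: [push]
jobs:
  pipeline:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/static-checks.sh
      - run: ./scripts/tests.sh        # only reached if the step above passed
      - if: github.ref == 'refs/heads/main'
        run: ./scripts/deploy.sh
```

Steps run sequentially and stop on the first failure by default, which gives exactly the ordering above without any YAML logic.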
If you find yourself screaming about YAML, you're leaning too heavily on it and need to refactor your project's scripts.
Maybe a good question to ask would be "if I had to switch to another CI system today, how hard would it be?" If the answer is "hard", perhaps you're leaning too heavily on it and need to refactor your project's scripts.
90% of the tricky parts of CI are secret management. As long as you write scripts to pick up credentials in a sane way (.awsprofile or similar) you should be able to configure your CI to provide the credentials just as well as you can locally - but in practice, the various different ways that things like artifact repositories, integration test databases, and cloud deployment tools want to manage auth is the cause of most of the complexity in getting your build/test/deploy pipeline working on the runner.
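One way to keep that "sane way" uniform is a tiny resolver that prefers CI-style env vars and falls back to a local profile. A hedged sketch (the profile name "dev" is made up, and real pipelines would extend this per secret store):

```shell
# Decide where credentials come from, identically on laptops and runners.
# CI exports AWS_ACCESS_KEY_ID as a pipeline secret; locally we rely on
# a named profile from ~/.aws/config.
auth_source() {
    if [ -n "${AWS_ACCESS_KEY_ID:-}" ]; then
        echo "env"
    else
        echo "profile:${AWS_PROFILE:-dev}"
    fi
}
auth_source
```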
This is all fine and good if you are the principal developer of the project. However, the author makes it clear that he is migrating other people's CI pipelines. He is a DevOps engineer working across several teams.
This is why he makes the important point that discipline is not enough. The reason is most teams simply don't care. I find one in five teams where everyone on the team cares about the build (when I'm lucky!), most teams have one person who cares, and some teams have no one that cares.
When I am tasked with the proper care and feeding of the pipelines of others, I want tools that can work and help me out even when the developers who created the software are Holding it Wrong.
With these requirements in mind -- managing and migrating the many different CI pipelines across an organization -- it would be a major breakthrough to have a tool that 1) transpiles to the workflows of all the CI tools and 2) allows for local testing. So many orgs have different teams using different CI stacks, and the local testing problem is always a struggle. I would run a tool like that into the ground.
So I would qualify your original statement: The author isn't misusing CI. Rather, the author is attempting to survive in a world where others are misusing it, and where the author is tasked with managing all the CI pipelines.
On deployments, I think it'd be fair to consider the requirements. At work, our software can be tested locally, but deployments are all registered against a central authority, and only after composing enough access requirements does a role (cicd, in this case) have enough policy allowance to perform a deployment.
The entire transaction is auditable. And I think that with a deployment, that's how it should be; extending that trust down to a local environment strikes me as accruing too much permission in a single entity.
I guess that we could better define what a deployment is; to some nonprod environments I'd agree, but I'd still probably insist on the heavy machinery up at the test, perf, qa, areas, and then getting into staging and prod, there'd be no wiggle room.
Fair critique, totally depends on what kind of software we're talking about and where the deployment is happening. In general though, how screwed are you if your CI environment goes down? Can you not deploy anything? That would be scary.
My point, however, was mostly that the logic necessary to deploy should live as part of your codebase, not written out in YAML. The privs necessary to deploy are a separate discussion.
This sounds fine and well, but it's not how GitHub Actions works (or at least, it's not the encouraged workflow). Let's have a look at a snippet from one of the projects I work on:
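The original snippet wasn't captured here, but a representative fragment of the style in question looks like this (versions and the image tag are illustrative; the referenced actions are real):

```yaml
steps:
  - uses: actions/checkout@v4
  - uses: docker/setup-buildx-action@v3
  - uses: docker/build-push-action@v5
    with:
      push: true
      tags: example/app:latest
```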
Good luck running this locally. There's no script code to speak of, just references to external "actions" and parameters (for example, https://github.com/docker/setup-buildx-action).
Some CI platforms are just a simple glue layer (Gitlab CI, which I prefer, is one of them), but in most cases Github CI is not. Maybe it adds to the author's frustration?
Building it that way is a choice. It's not mandatory.
You can use gitlab CI with special-purpose docker images for all your steps and magic parameters driving everything too (Gitlab AutoDevops works that way).
But if you just run your steps in shell scripts in vanilla docker images containing your build-time dependencies, you should be able to produce something that works the same in any CI pipeline, or locally.
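That pattern amounts to something like the following (image and script names are placeholders), which any CI system or laptop can execute verbatim:

```shell
# One build step: vanilla image + your own script, no CI-specific syntax
docker run --rm -v "$PWD:/src" -w /src node:20 ./ci/build.sh
```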
The most annoying thing for me is that a lot of CI engines make docker-in-docker complicated. I love using compose to set up integration test environments, but doing that in CI is often a fight.
> I would respectfully suggest that the author is misusing CI. If you have trouble running your tests locally, you have a problem.
I would respectfully suggest that you misread the author. The issue isn't running tests locally, it's running the CI config locally.
I experienced the same problem with gitlab CI years ago, where, basically, you can lint the file and not much more. Past that, you need to run it through your CI and debug if you get slightly different results compared to running a script locally.
Yeah, it's a horrible experience all around. CircleCI solved this problem like a decade ago, enabling SSH builds so you can troubleshoot straight up in the build itself, and once you've figured it out, just copy-paste the steps to your Makefile/CI config.
I don't understand how one could build a CI service so long after CircleCI launched and still not have that very same feature (or something similar).
The authors speak about how hard it is nowadays to provide a CI extension to the major platforms, including GitHub, Azure DevOps and Gitlab (and others).
Wanting to run a developed extension locally is totally legit, as some can be really tricky and depend on the behavior of [runner & OS].
My strategy is just to keep the yaml to a minimum and call python scripts within the action to do anything complicated. The GHA yaml is just used for defining when the action runs, and on what architectures, etc. It's worked well for me.
Github Actions gives me literally free server time across an extremely wide range of OSes that I don't have to worry about at all, including Windows and OSX, which I therefore don't have to deal with in any way: no license keys to buy, none of that. It's nothing short of miraculous for us, as it's how Python projects can have binary wheel files for dozens of OSes and Python versions: https://pypi.org/project/SQLAlchemy/#files . This task remained impossible for years (to be clear: because I don't have a server farm, or Windows licenses, or however you'd run OSX on a headless server, or any kind of resources to fund / manage dozens of images and keep them running, or any of that) until GH Actions made it possible.
Now, is this all part of Microsoft's evil plan? It probably is! But unless someone else wants to give me a free server farm that includes Windows / OSX up and running without me paying anything / writing containers / storing images / etc., I don't see this aspect of Github Actions losing any popularity.
Local execution of GitHub actions for testing will become possible some time after pigs learn to fly.
Everyone here is asking for it as if it’s some minor oversight, soon to be rectified.
Unfortunately, this tech stack is a significant revenue source. Microsoft charges for pipeline minutes, concurrent runs, etc… This is especially true in Azure DevOps, which shares much of the same underlying pipeline software.
Letting anyone run this locally for any reason would let them bypass the monetisation.
It’s the same reason that ad-supported YouTube is “missing” a download offline feature.
It’s not an oversight. It’s not happening. Stop asking.
The only thing we the dev community can do about this is to develop our own open-source CD platform with blackjack and hookers.
I can't wait for this meme to die, or for act (or gitea's fork thereof) to catch up to the hype train. Then again, I guess this fantasy is being promoted by folks who are all "just use run: and that's it", because any moderately complex workflow <https://github.com/VSCodium/vscodium/blob/1.84.2.23314/.gith...> for sure fails with spectacularly illegible error messages under both act and gitea's fork.
This is not helpful. You, the author, and random commenters scattered around keep teasing that act is bad, but I can't seem to find out what any of you mean.
One comment said github APIs fail, which would make sense to me. Is that the primary reason for act being a pain? Do you have output for the linked build yml or an explanation of where it goes wrong?
> Hosting a git repo is hardly more than providing a file system and SSH access. The actual mechanism they use to keep you on their platform is the CI-pipelines
Then why was GitHub so popular for the 10+ years it had no built in CI system?
I'd also rather say that it was about hosting the code for free in the first place, and then the pull requests, including the possibility to comment on code in PRs. I use the git cli (instead of some UI / IDE extension) for all interactions with the repo locally. But as soon as it's about collaboration, these platforms come into play.
A quick scan shows there is a `git request-pull` command; before that I wasn't even sure whether pull requests are a git built-in feature at all.
Side question: does any code hosting platform allow commenting on lines of code outside of pull requests? I've had several occasions where I wanted to ask why something was written the way it was.
a github pull request isn't a pull request; a pull request is an email from one of linus torvalds' direct underlings (subsystem maintainers) to linus torvalds "requesting" that he "pull" (hence the name) some tag. an arbitrary example: https://lkml.org/lkml/2017/11/13/229
git request-pull generates these emails.
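for the curious, the command's shape is roughly this (the refs and URL here are illustrative):

```shell
# Summarize changes between a known ref and the head you want pulled;
# prints a diffstat plus fetch URL suitable for pasting into an email.
git request-pull v1.0 https://example.com/me/repo.git my-feature
```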
note that a "pull" is just "merge from a URL", and requires some preexisting trust, hence why it's only for the subsystem maintainers.
github stole this term for their signature misfeature and we've all been suffering since. some of its clones walk back this poor naming by saying "merge request" instead, but the damage to the name is done.
Yeah, utter nonsense. The mechanisms they use to keep people on GitHub is a) network effects (easy to create issues, PRs etc because you already have an account), and b) GitHub is actually really good!
But aside from enterprise SSO, far better permissions management than you get with SSH and Linux filesystem permissions, a unified open-source project discovery and vetting-assistance system, secrets management, integrated CI, LFS support, issue tracking, a billion integrations for free, automatic dependency vulnerability detection, etc…
https://github.com/actions/runner
[1]: https://github.com/actions/runner/issues/510
[1]: https://github.blog/2018-10-16-future-of-software/
https://github.com/actions/runner/blob/a4c57f27477077e57545a...
I could do that refactoring; I suppose I could make it better, in addition to piling on about it on Hacker News. ;-)
Works pretty well for the actions I have used it on.
CI automation is for me the thing that replaces me running scripts one by one (and reporting/deploying results)
It is not the thing that does the building/testing/deploying. That’s the scripts that are hopefully written in a debuggable, portable, language.
https://docs.github.com/en/actions/hosting-your-own-runners/...
Then you configure the repo settings to only use the self hosted ones.
For a straightforward application or library project, you can fill in this form and get a shovel-ready build: https://gradle-initializr.cleverapps.io/
For me it's external system state management. Like making sure the integration test db is cleaned up correctly.
Every major distribution used build farms long before GitHub (and git) existed...
That's still the main reason I use Github (although Gitlab has them now in beta(?) https://docs.gitlab.com/ee/ci/runners/index.html).
In other words, we need a Kubernetes of CI/CD.
1. Jobs should always (just) execute a script or program. This allows running outside the CI system.
2. To test/debug CI jobs, use act.
3. To test/debug more complex scenarios spin up a Gitea instance (which provides a Github clone wrapping act).
What have the Romans done for us?