deng · a year ago
Already see people saying GitLab is better: yes it is, but it also sucks in different ways.

After years of dealing with this (first Jenkins, then GitLab, then GitHub), my takeaway is:

* Write as much CI logic as possible in your own code. Does not really matter what you use (shell scripts, make, just, doit, mage, whatever) as long as it is proper, maintainable code.

* Invest time so that your pipelines can also run locally on a developer machine (as much as possible, at least); otherwise testing/debugging pipelines becomes a nightmare.

* Avoid YAML as much as possible, period.

* Don't bind yourself to some fancy new VC-financed thing that will solve CI once and for all but needs to get monetized eventually (see: earthly, dagger, etc.)

* Always use your own runners, on-premise if possible
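To make the first two points concrete, here's a minimal sketch of a single CI entry point (the `ci.sh` name and the step commands are hypothetical stand-ins):

```shell
#!/bin/sh
# ci.sh -- one entry point for CI and laptops alike; steps are plain functions
set -eu

lint()  { echo "linting..."; }    # stand-in for the real linter invocation
build() { echo "building..."; }   # stand-in for the real build command
tests() { echo "testing..."; }    # stand-in for the real test runner

main() {
  # ordering lives here, in code, not in the CI platform's YAML
  lint
  build
  tests
}

main "$@"
```

The CI config then shrinks to a single line that runs this script, and a local run exercises exactly the same path as the pipeline.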

hi_hi · a year ago
I came to the exact same conclusion accidentally in my first role as a Tech Lead a few years back.

It was a large enterprise CMS project. The client had previously told everyone they couldn't automate deployments due to the hosted platform security, so deployments of code and configs were all done manually by a specific support engineer following a complex multistep run sheet. That was going about as well as you'd expect.

I first solved my own headaches by creating a bunch of bash scripts to package and deploy to my local server. Then I shared that with the squads to solve their headaches. Once the bugs were ironed out, the scripts were updated to deploy from local to the dev instance. Jenkins was then brought in and quickly set up to use the same bash scripts, so now we had full CI/CD working to dev and test. Then the platform support guy got bored of manually following the run sheet and started using our (now mature) scripts to automate deployments to stage and prod.

By the time the client found out I'd completely ignored their direction they were over the moon because we had repeatable and error free automated deployments from local all the way up to prod. I was quite proud of that piece of gorilla consulting :-)

badloginagain · a year ago
I hate the fact that CI peaked with Jenkins. I hate Jenkins, I hate Groovy, but for every company I've worked for there's been a 6-year-uptime Jenkins instance casually holding up the entire company.

There's probably a lesson in there.

rrr_oh_man · a year ago
> gorilla consulting

Probably 'guerilla', but I like your version more.

noplacelikehome · a year ago
Nix is awesome for this -- write your entire series of CI tools in shell or Python and run them locally in the exact same environment as they will run in CI. Add SOPS to bring secrets along for the ride.
mikepurvis · a year ago
Strongly isolated systems like Nix and Bazel are amazing for giving no-fuss local reproducibility.

Every CI "platform" is trying to seduce you into breaking things out into steps so that you can see their little visualizations of what's running in parallel or write special logic in groovy or JS to talk to an API and generate notifications or badges or whatever on the build page. All of that is cute, but it's ultimately the tail wagging the dog— the underlying build tool should be what is managing and ordering the build, not the GUI.

What I'd really like for next gen CI is a system that can get deep hooks into local-first tools. Don't make me define a bunch of "steps" for you to run, instead talk to my build tool and just display for me what the build tool is doing. Show me the order of things it built, show me the individual logs of everything it did.

Same thing with test runners. How are we still stuck in a world where the test runner has its own totally opaque parallelism regime and our only insight is whatever it chooses to dump into XML at the end, which will probably be nothing if the test executable crashes? Why can't the test runner tell the CI system what all the processes are that it forked off and where each one's respective log file and exit status is expected to be?

steeleduncan · a year ago
> Write as much CI logic as possible in your own code

Nix really helps with this. It's not just that you do everything via a single script invocation, local or CI; you do it in an identical environment, local or CI. You are not trying to debug the difference between Ubuntu as set up in GHA and Arch as it is on your laptop.

Setting up a nix build cache also means that any artefact built by your CI is instantly available locally which can speed up some workflows a lot.

shykes · a year ago
Dagger.io does this out of the box:

- Everything sandboxed in containers (works the same locally and in CI)

- Integrate your build tools by executing them in containers

- Send traces, metrics and logs for everything at full resolution, in the OTEL format. Visualize in our proprietary web UI, or in your favorite observability tool

nand_gate · a year ago
Why would you need extra visualisation anyway, tooling like Nix is already what you see is what you get!
squiggleblaz · a year ago
Basically an online version of nix-output-monitor. Might be half an idea. But it doesn't get you 100%: you get CI, but not CD.
specialist · a year ago
We used to just tail the build script's output.

Maybe add some semi-structured log/trace statements for the CI to scrape.

No hooks necessary.

teeray · a year ago
> What I'd really like for next gen CI is a system that can get deep hooks into local-first tools.

But how do you get that sweet, sweet vendor-lock that way? /s

doix · a year ago
I came from the semiconductor industry, where everything was locally hosted Jenkins + bash scripts. The Jenkins job would just launch the bash script that was stored in perforce(vcs), so all you had to do to run things locally was run the same bash script.

When I joined my first web SaaS startup I had a bit of a culture shock. Everything was running on 3rd party services with their own proprietary config/language/etc. The base knowledge of POSIX/Linux/whatever was almost completely useless.

I'm kinda used to it now, but I'm not convinced it's any better. There are so many layers of abstraction now that I'm not sure anybody truly understands it all.

Xcelerate · a year ago
Haha, I had the same experience going from scientific work in grad school to big tech. The phrase “a solution in search of a problem” comes to mind. The additional complexity does create new problems however, which is fine for devops, because now we have a recursive system of ensuring job security.

It blows my mind what is involved in creating a simple web app nowadays compared to when I was a kid in the mid-2000s. Do kids even do that nowadays? I’m not sure I’d even want to get started with all the complexity involved.

sgarland · a year ago
> I'm kinda used to it now, but I'm not convinced it's any better.

It’s demonstrably worse.

> The base knowledge of POSIX/Linux/whatever was almost completely useless.

Guarantee you, 99% of the engineering team there doesn’t have that base knowledge to start with, because of:

> There are so many layers of abstraction now that I'm not sure anybody truly understands it all.

Everything is constantly on fire, because everything is a house of cards made up of a collection of XaaS, all of which are themselves houses of cards written by people similarly clueless about how computers actually operate.

I hate all of it.

zamalek · a year ago
> I'm not convinced it's any better.

Your Jenkins experience is more valuable and worth replicating when you get the opportunity.

nsonha · a year ago
it's just common sense, which is unfortunately lost with sloppy devs. People go straight from junior dev to SRE without learning engineering principles through building products first.
jimbokun · a year ago
I feel like more time is spent getting CI working these days than on the actual applications.

Between that and upgrading for security patches. Developing user impacting code is becoming a smaller and smaller part of software development.

cookiengineer · a year ago
This.

I heavily invested in a local runner based CI/CD workflow. First I was using gogs and drone, now the forgejo and woodpecker CI forks.

It runs with multiple redundancies because it's a pretty easy setup to replicate on decentralized hardware. The only thing that's a little painful is authentication and cross-system pull requests, so we still need our single point of failure to merge feature branches and do code reviews.

Since we build everything in Go, we also decided to always have a /toolchain/build.go so that we have everything in a single language and don't even need bash in our CI/CD podman/docker images. We just use FROM scratch, with Go, and that's it. The only exception is when we need to compile/rebuild our eBPF kernel modules.

To me, personally, the GitHub Actions CVE from August 2024 was the final nail in the coffin. I blogged about it in more technical detail [1], and guess what the reason was that the tj-actions were compromised last week? Yep, you guessed right: the same attack surface that GitHub refuses to fix, a year later.

The only tool, as far as I know, that somehow validates against these kinds of vulnerabilities is zizmor [2]. All other tools validate schemas, not vulnerabilities and weaknesses.

[1] https://cookie.engineer/weblog/articles/malware-insights-git...

[2] https://github.com/woodruffw/zizmor

pcthrowaway · a year ago
My years using Concourse were a dream compared to the CI/CD pains of trying to make github actions work (which I fortunately didn't have to do a lot of). Add that to the list of options for people who want open source and their own runners
sleepybrett · a year ago
Did they finally actually say how the tj-actions repo got compromised? When I was fixing that shit on Saturday it was still 'we don't know how they got access!?!?'
JanMa · a year ago
Whenever possible I now just use GitHub actions as a thin wrapper around a Makefile and this has improved my experience with it a lot. The Makefile takes care of installing all necessary dependencies and runs the relevant build/Test commands. This also enables me to test that stuff locally again without the long feedback loop mentioned in other comments in this thread.
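A sketch of that setup, with a hypothetical `ci` target:

```yaml
# GitHub Actions as a thin wrapper: all real logic lives in the Makefile
name: ci
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make ci   # install deps, build, and test -- all defined in the Makefile
```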
oulipo · a year ago
mise (https://mise.jdx.dev/) and dagger (https://github.com/dagger/dagger) seem like nice candidates too!

Mise can install all your deps, and run tasks

fireflash38 · a year ago
I implemented a thing such that the makefiles locally use the same podman/docker images as the CI/CD uses. Every command looks something like:

    target:
        $(DOCKER_PREFIX) build

When run in gitlab, the DOCKER_PREFIX is a no-op (it's literally empty due to the CI=true var), and the 'build' command (whatever it is) runs in the CI/CD docker image. When run locally, it effectively is a `docker run -v $(pwd):$(pwd) build`.

It's really convenient for ensuring that if it builds locally, it can build in CI/CD.
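A sketch of that prefix trick as a shell function (the `ci-image` name is hypothetical; GitLab sets `CI=true` on its runners):

```shell
# Emit a docker-run prefix when running locally; emit nothing in CI,
# where the job already executes inside the image.
docker_prefix() {
  if [ "${CI:-}" = "true" ]; then
    echo ""
  else
    echo "docker run --rm -v $(pwd):/src -w /src ci-image"
  fi
}

# usage: $(docker_prefix) make build
```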

akanapuli · a year ago
I don't quite understand the benefit. How does running commands from the Makefile differ from running commands directly on the runner? What benefit does the Makefile bring here?
mwenge · a year ago
Do you have a public example of this? I'd love to see how to do this with Github Actions.
ehansdais · a year ago
After years of trial and error our team has come to the same conclusion. I know some people might consider this insanity, but we actually run all of our scripts as a separate C# CLI application (The main application is a C# web server). Effectively no bash scripts, except as the entry point here and there. The build step and passing the executable around is a small price to pay for the gain in static type checking, being able to pull in libraries as needed, and knowing that our CI is not going to go down because someone made a dumb typo somewhere.

The other thing I would add is consider passing in all environment variables as args. This makes it easy to see what dependencies the script actually needs, and has the bonus of being even more portable.
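A tiny sketch of that convention (the function and argument names are hypothetical):

```shell
# Every input is an explicit positional argument, so the script's
# dependencies are visible at the call site -- no hidden env vars.
deploy() {
  bucket=$1
  version=$2
  printf 'deploying %s to %s\n' "$version" "$bucket"
}

# CI invokes it as: deploy "$ARTIFACT_BUCKET" "$GIT_TAG"
deploy example-bucket 1.2.3
```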

baq · a year ago
> I know some people might consider this insanity

Some people here still can’t believe YAML is used for not only configuration, but complex code like optimized CI pipelines. This is insane. You’re actually introducing much needed sanity into the process by admitting that a real programming language is the tool to use here.

I can’t imagine the cognitive dissonance Lisp folks have when dealing with this madness, not being one myself.

robinwassen · a year ago
Did a similar thing when we needed to do complex operations towards aws.

Instead of wrapping the aws cli command I wrote small Go applications using the AWS SDK for Go.

Removed the headaches when passing in complex params, parsing output and and also made the logic portable as we need to do the builds on different platforms (Windows, Linux and macOS).

noworriesnate · a year ago
I've used nuke.build for this in the past. This makes it nice for injecting environment variables into properties and for auto-generating CI YAML to wrap the main commands, but it is a bit of a pain when it comes to scaling the build. E.g. we did infrastructure as code using Pulumi, and that caused the build code to dramatically increase to the point the Nuke script became unwieldy. I wish we had gone the plain C# CLI app from the beginning.
ozim · a year ago
I don’t think it is insanity quite the opposite - insanity is trying to force everything in yaml or pipeline.

I have seen people doing absolutely insane setups because they thought they have to do it in yaml and pipeline and there is absolutely no other option or it is somehow wrong to drop some stuff to code.

mst · a year ago
Honestly, "using the same language as the application" is often a solid choice no matter what the application is written in. (and I suspect that for any given language somebody might propose as an exception to that rule, there's more than one team out there doing it anyway and finding it works better for them than everything else they've tried)
7bit · a year ago
> The other thing I would add is consider passing in all environment variables as args. This makes it easy to see what dependencies the script actually needs, and has the bonus of being even more portable.

This is the dumbest thing I see installers do a lot lately.

no_wizard · a year ago
Am I an outlier in that not only do I find GitHub actions pleasant to use, but that most folks over complicate their CI/CD pipelines? I've had to re-write a lot of actions configurations over the last few years, and in every case, the issue was simply not thinking through the limits of the platform, or when things would be better to run as custom docker images (which you can do via GitHub Actions) etc.

It tends to be that folks want to shoehorn some technology into the pipeline that doesn't really fit, or they make these giant one shot configurations instead of running multiple small parallel jobs by setting up different configurations for different concerns etc.

davidham · a year ago
I'm with you! I kind of love GitHub Actions, and as long as I keep it to tools and actions I understand, I think it works great. It's super flexible and has many event hooks. It's reasonably easy to get it to do the things I want. And my current company has a pretty robust CI suite that catches most problems before they get merged in. It's my favorite of the CI platforms I have used.
gchamonlive · a year ago
The way GitLab handles pipelines is just fundamentally better than GitHub Actions.

It's really easy to extend and compose jobs, so it's simple to unit test your pipeline: https://gitlab.com/nunet/test-suite/-/tree/main/cicd/tests?r...

This way I can code my pipeline and use the same infrastructure to isolate groups of jobs that compose a relevant functionality and test it in isolation to the rest of the pipeline.

I just wish components didn't have such a rigid opinion on folder structure. They are really powerful, but you have to adopt GitLab's prescription.

rbongers · a year ago
In my opinion, unless you need its ability to figure out when something should rebuild (or you already use it), Make is not the right tool for the job. You should capture your pipeline jobs in scripts or similar; Make just adds another language for developers to learn on top of everything. Make is not a simple script runner.

I maintained a Javascript project that used Make and it just turned into a mess. We simply changed all of our `make some-job` jobs into `./scripts/some-job.sh` and not only was the code much nicer, less experienced developers were suddenly more comfortable making changes to scripts. We didn't really need Make to figure out when to rebuild anything, all of our tools already had caching.

JanMa · a year ago
Make is definitely just my personal preference. If using bash scripts, Just, Taskfile or something similar works better for you then by all means use it.

The main argument I wanted to make is that it works very well to just use GitHub actions to execute your tool of choice.

DanHulton · a year ago
This is why I've become a huge fan of Just, which is just a command runner, not a build caching system or anything.

It allows you to define a central interface into your project (largely what I find people justify using Make for), but smooths out so many of the weird little bumps you run into from "using Make wrong."

Plus, you can at any point just drop into running a script in a different language as your command, so it basically "supports bash scripts" too.

https://github.com/casey/just

stinos · a year ago
This. I don't know which guru came up with it, but this is the 'one-click build' principle. If you can't do that, you have a problem.

So if even remotely possible we write all CI as a single 'one-click' script which can do it all by itself. Makes developing/testing the whole CI easy. Makes changing between CI implementations easy. Can solve really nasty issues (think: CI is down, need to send update to customer) easily because if you want a release you just build it locally.

The only thing it won't automatically do out of the box is be fast, because obviously this script also needs to set up most of the build environment. So depending on the exact implementation there's variation in the split between what constitutes setting up a build environment and running the CI script. As in: for some tools our CI scripts will do 'everything', starting from a minimal OS install, whereas others expect an OS with build tools and possibly some dependencies already available.

xyzal · a year ago
I think it was mentioned as a part of the 'Joel test'

https://www.joelonsoftware.com/2000/08/09/the-joel-test-12-s...

makeitdouble · a year ago
> * Invest time that your pipelines can run locally on a developer machine as well (as much as possible at least), otherwise testing/debugging pipelines becomes a nightmare.

Yes, a thousand times.

Deploy scripts are tougher to deal with, as they'll naturally rely on a flurry of environment variables, protected credentials etc.

But for everything else, writing the scripts for local execution first and generalizing them for CI once they run well enough is the absolute best approach. It doesn't even need to run in the local shell; having all the CI stuff in a dedicated docker image is fine if it requires specific libraries or env.

20thr · a year ago
I spend a lot of time in CI (building https://namespace.so) and I agree with most of this:

- Treat pipelines as code.

- Make pipeline parts composable, as code.

- Be mindful of vendor lock-in and/or lack of portability (it is a trade-off).

For on-premise: if you're already deeply invested in running your own infrastructure, that seems like a good fit.

When thinking about how we build Namespace -- there are parts that are so important that we just build and run internally; and there are others where we find that the products in the market just bring a tremendous amount of value beyond self-hosting (Honeycomb is a prime example).

Use the tools that work best for you.

LukaD · a year ago
> [...] use (shell scripts, make, just, doit, mage, whatever) as long as it is proper, maintainable code

I fully agree with the recommendation to use maintainable code. But that effectively rules out shell scripts in my opinion. CI shell scripts tend to become a big ball of mud rather quickly as you run into the limitations of bash. I think most devs only have superficial knowledge of shell scripts, so do yourself a favor and skip them and go straight to whatever language your team is comfortable with.

sgarland · a year ago
Maybe people should get better at shell, instead. Read the bash / zsh manual. Use ShellCheck.
cnotv · a year ago
Once, a reliable and wise colleague told me "Use in CI what you use locally", and that has been the best devops advice I've received; it has never failed to save me time.

The second one has been, from someone else: if you can use anything else than bash, do that.

amtamt · a year ago
Try brainfuck...

Jokes aside... it's so trendy to bash bash that it's not funny anymore. Bash is still quite reliable for work that usually gets done in CI, and nearly maintenance free if used well.

psyclobe · a year ago
'* Write as much CI logic as possible in your own code. Does not really matter what you use (shell scripts, make, just, doit, mage, whatever) as long as it is proper, maintainable code.'

THIS 10000% percent.

jpgvm · a year ago
This is the way.

My personal favourite solution is Bazel specifically because it can be so isolated from those layers.

No need for Docker (or Docker in Docker as many of these solutions end up requiring) or other exotic stuff, can produce OCI image artifacts with `rules_oci` directly.

By requiring so little of the runner you really don't care for runner features, you can then restrict your CI/CD runner selection to just reliability, cost, performance and ease of integration.

otikik · a year ago
> * Avoid YAML as much as possible, period.

That's also a very valid takeaway for life in general

ozim · a year ago
Good insight, because that is just a complex issue - especially when there is team churn and everyone adds their parts in yaml or configuration.

Doesn’t matter Jenkins or actions - it is just complicated. Making it simpler is on devs/ops not the tool.

jiehong · a year ago
I always thought it could be cool to use systemd as a CI agent replacement someday:

Each systemd service could represent a step built by running a script, and each service can say what it depends on, thus helping parallelize any step that can be.

I have not found anyone trying that so far. Is anybody aware of something similar and more POSIX/cross platform that allows writing a DAG of scripts to execute?
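For what it's worth, a sketch of how two such units might look (unit and script names hypothetical) -- `test.service` only starts after `build.service` succeeds, and independent units parallelize for free:

```ini
# build.service
[Unit]
Description=CI build step

[Service]
Type=oneshot
ExecStart=/opt/ci/build.sh

# test.service
[Unit]
Description=CI test step
Requires=build.service
After=build.service

[Service]
Type=oneshot
ExecStart=/opt/ci/test.sh
```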

forrestthewoods · a year ago
> pipelines can run locally on a developer machine as well (as much as possible at least)

Facts.

However I’ll go a step further and say “only implement your logic in a tool that has a debugger”.

YAML is the worst, but shell scripts are a close second. Use a real language.

never_inline · a year ago
Python with click and PyYAML can go a long way - then you can build it as a CLI application and use the same from CI. In a Java shop, picocli + GraalVM probably. I wouldn't like Go for this purpose (against the conventional wisdom) because of the boilerplate and pretty bad debugging capabilities.

That said, if you absolutely need to use shell script for reasons: keep it all in a single script, define logging functions including debug logs, rigorously check every constraint and variable, use ShellCheck, and factor the code well into functions - I should write a blog post about it sometime.

folmar · a year ago
There is a debugger for bash: https://github.com/Trepan-Debuggers/bashdb Not that I'm recommending 10k-line programs in bash, but a debugger is useful when you need it.
sgarland · a year ago
Shell is a very real language, and it has a debugger; it’s called set -x and/or strace.
julienEar · a year ago
Completely agree. Keeping CI logic in actual code instead of YAML is a lifesaver. The GitHub Actions security issues just reinforce why self-hosted runners are the way to go.
crabbone · a year ago
First of all, I cannot agree more, given what we have today.

Unfortunately, this isn't a good plan going forward... :( Going forward I'd wish for a tool that's as ubiquitous as Git, has good integration with editors like language servers, and can be sold as a service or run completely in-house. It would allow defining the actions of the automated builds and tests, have a way of dealing with releases, expose an interface for collecting statistics, integrate with bug tracking software for the purpose of excluding / including tests in test runs, and allow organizing tests in groups (e.g. sanity / nightly / rc).

The problem is that tools today don't come anywhere close to being what I want for CI; neither free nor commercial tools are even going in the desired direction. So the best option is simply to minimize their use.

philistine · a year ago
> * Avoid YAML as much as possible, period.

Why does YAML have any traction when JSON is right there? I'm an idiot amateur and even I learned this lesson; my 1 MB YAML file full of data took 15 seconds to parse each time. I quickly learned to use JSON instead, takes half a second.

12_throw_away · a year ago
> Why does YAML have any traction when JSON is right there?

Because it has comments, which are utterly essential for anything used as a human readable/writable configuration file format (your use case, with 1 MB of data, needs a data interchange format, for which yes JSON is at least much better than YAML).

fireflash38 · a year ago
JSON is valid YAML.

YAML has comments. YAML is easily & trivially written by humans. JSON is easily & trivially written by code.

My lesson learned here? When generating YAML, instead generate JSON. If it's meant to be read and updated by humans, use something that can communicate to the humans (comments). And don't use YAML as a data interchange format.

sofixa · a year ago
Because YAML, as much as it sucks, is relatively straightforward to write by humans. It sucks to read and parse, you can make tons of small mistakes that screw it up entirely, but it's still less cruft than tons of needless "": { } .

For short configs, YAML is acceptable-ish. For anything longer I'd take TOML or something else.

iloveitaly · a year ago
I very much agree here. I've had the best luck when there is as little config in CI as possible:

- mise for lang config

- direnv for environment loading

- op for secret injection

- justfile for lint, build, etc

Here's a template repo that I've been working on that has all of this implemented:

https://github.com/iloveitaly/python-starter-template

It's more complex than I would like it to be, but it's consistent and avoids having to deal with GHA too much.

I've also found having a GHA playground is helpful:

https://github.com/iloveitaly/github-action-playground

qweiopqweiop · a year ago
Can you explain YAML? I've found declarative pipelines with it have been... fine?
deng · a year ago
YAML is fine for what it is: a markup language. I have no problem with it being used in simple configuration files, for instance.

However, CI is not "configured", it is coded. It is simply the wrong tool. YAML was continuously extended to deal with that, so it developed into much more than just "markup", but it grew into this terrible chimera. Once you start using advanced features in GitLab's YAML like anchors and references to avoid writing the same stuff again and again, you'll notice that the whole tooling around YAML is simply not there. What does the resulting YAML look like? How do you run this stuff locally? How do you debug this? Just don't go there.

You will not be able to avoid YAML completely, obviously, but use it the way it was originally intended.
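For reference, this is roughly what the anchor/merge pattern looks like in GitLab CI (job names hypothetical) -- the reuse works, but no standard local tool shows you the expanded result:

```yaml
.defaults: &defaults
  image: alpine:3.19
  before_script:
    - ./ci/setup.sh

unit-tests:
  <<: *defaults
  script: ./ci/test.sh

lint:
  <<: *defaults
  script: ./ci/lint.sh
```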

crabbone · a year ago
YAML's problems:

* Very easy to write the code you didn't mean to, especially in the context of CI where potentially a lot of languages are going to be mixed, a lot of quoting and escaping. YAML's string literals are a nightmare.

* YAML has no way to express inheritance. Nor does it have a good way to express variables. Both are usually desperately needed in CI scripts, and are usually bolted on top with some extra-language syntax (all those dollars in GitHub actions, Helm charts, Ansible playbooks etc.)

* Complexity skyrockets compared to the size of the file. I.e. in a language like C you can write a manageable program with millions of lines of code. In YAML you will give up after a few tens of thousands of lines (similar to SQL or any other language that doesn't have modules).

* Whitespace errors are very hard to spot and fix. Often whitespace errors in YAML result in valid YAML which, however, doesn't do what you want...

dharmab · a year ago
1. The YAML spec is extremely complex, with some things being ambiguous. You might not notice this if you restrict yourself to a small subset of the language, but you will notice it when different YAML libraries and programming languages interpret the same YAML file as different content.

2. Trying to encode logic and control flow in a YAML document is much more difficult than writing that flow in a "real" programming language. Debugging is especially much easier in "real" languages.

maratc · a year ago
You can't put a breakpoint in YAML. You can't evaluate variables in YAML. You can't print debugging info from YAML. You can't rerun YAML from some point.

YAML is great for the happy-flow where everything works. It's absolutely terrible for any other flow.

fergie · a year ago
As a developer based in Norway, one fairly major drawback to YAML is the way that it processes the language code for Norwegian ("no").
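This is the well-known "Norway problem": under YAML 1.1 rules, several unquoted scalars silently become booleans. A sketch:

```yaml
# YAML 1.1 parsers load unquoted no/yes/on/off as booleans
countries:
  - se
  - dk
  - no      # loaded as the boolean false, not the string "no"
  - "no"    # quoting keeps it a string
```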
imp0cat · a year ago
Depends on the complexity of your pipeline.
alex_suzuki · a year ago
> * Always use your own runners, on-premise if possible

Why? I understand it in cases where security is critical or intellectual property is at stake. Are you talking about "snowflake runners" or just dumb executors of container images?

saidinesh5 · a year ago
Caching is nicer on own runners. No need to redownload 10+GB of "development container images" just to build your 10 lines of changed code.

With self hosted Gitlab runners it was almost as fast as doing incremental builds. When your build process can take like 15-20 minutes (medium sized C++ code base), this brought down the total time to 30 seconds or so.

deng · a year ago
It obviously depends on your load. Fast pipelines matter, so don't run them on some weak cloud runner with the speed of a C64. Fast cloud runners are expensive. Just invest some money and buy or at least rent some beefy servers with lots of cores, RAM and storage and never look back. Use caches for everything to speed up things.

Security is another thing where this can come in handy, but properly firewalling CI runners and having mirrors of all your dependencies is a lot of work and might very well be overkill for most people.

crabbone · a year ago
Debugging and monitoring. When the runner is somewhere else and is shared, nobody is going to give you full access to the machine.

So many times I was biting my fingers not being able to figure out the problems GitHub runners were having with my actions and was unable to investigate.


gabyx · a year ago
Ohh, @deng, my exact words and you are 100% right. Same experience, same conclusion:

- I would go even further: do not use bash/python or any duck-typed lang (only for simple projects, but better just don't get started).

- Leverage Nix (no, it's not a joke ecosystem): devshells and/or build devcontainers out of it.

- Treat tooling code and CI code the exact same as your other code.

- Maybe generate the pipeline for your YAML-based CI system in code.

- If you use a CI system (GitLab, Circle, etc.), use one which does not do stupid things with your containers (like GitHub's 4-year-old f** up: https://github.com/actions/runner/issues/863#issuecomment-25...). Also one which lets you run dynamically generated pipelines.

Thats why we built our own build tool which does that, or at least helps us doing the above things:

https://github.com/sdsc-ordes/quitsh

gabyx · a year ago
Ohh, @deng, my exact words and you are 100% right. Same experience, same conclusion:

- I would go even further: do not use bash/Python or any duck-typed language (fine for simple projects, but better not to get started).

- Leverage Nix (no, it's not a joke ecosystem): dev shells and/or build dev containers out of it.

- Treat tooling code and CI code exactly the same as your other code.

- Maybe generate the pipeline for your YAML based CI system in code.

- If you use a CI system (GitLab, Circle, etc.), use one which does not do stupid things with your containers (like GitHub's 4-year-old f** up: https://github.com/actions/runner/issues/863#issuecomment-25...). Also one which lets you run dynamically generated pipelines.

That's why we built our own build tool, which does that, or at least helps us do the above:

https://github.com/sdsc-ordes/quitsh
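
The "generate the pipeline in code" point can be as simple as a script that emits the YAML the CI system expects. A hedged sketch (the job names and the `./ci/run.sh` entry point are made up):

```shell
#!/usr/bin/env bash
# Generate a GitLab-CI-style pipeline file instead of hand-maintaining YAML.
# Each generated job just delegates to a script kept under version control.
set -euo pipefail

jobs=(lint build test)

{
  printf 'stages: [%s]\n' "$(IFS=,; echo "${jobs[*]}")"
  for job in "${jobs[@]}"; do
    printf '\n%s:\n  stage: %s\n  script:\n    - ./ci/run.sh %s\n' \
      "$job" "$job" "$job"
  done
} > generated-pipeline.yml

cat generated-pipeline.yml
```

The YAML stays a dumb artifact; the logic lives in real code.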

carlmr · a year ago
>Invest time that your pipelines can run locally on a developer machine as well (as much as possible at least), otherwise testing/debugging pipelines becomes a nightmare.

This so much. This ties into the previous point about using as much shell as possible. Additionally I'd say environment control via Docker/Nix, as well as modularizing the pipeline so you can restart it just before the point of failure instead of rerunning the whole business just to replay one little failure.
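
One way to get both properties is a single dispatcher script where each stage is a function: CI calls it per stage, and a developer can rerun exactly the failing stage locally with the same command. A sketch (stage names and commands are illustrative):

```shell
#!/usr/bin/env bash
# ci.sh: each pipeline stage is a plain function. CI runs "./ci.sh <stage>";
# a developer reruns just the failed stage locally the same way.
set -euo pipefail

lint()  { echo "running linters"; }
build() { echo "compiling"; }
test_() { echo "running tests"; }   # trailing underscore avoids the shell builtin

stage="${1:-all}"
case "$stage" in
  lint|build) "$stage" ;;
  test)       test_ ;;
  all)        lint; build; test_ ;;
  *) echo "unknown stage: $stage" >&2; exit 1 ;;
esac
```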

valenterry · a year ago
Amen.

To put the first 3 points into different words: you should treat the CI only as a tool that manages the interface and provides interaction with the outside world (including injecting secrets/configuration, setting triggers, storing caches etc.) and helps to visualize things.

Granted, even used that way, it puts constraints on how you can use it. But apart from that, no logic should live in the CI.

Tainnor · a year ago
> Write as much CI logic as possible in your own code. Does not really matter what you use (shell scripts, make, just, doit, mage, whatever) as long as it is proper, maintainable code.

To an extent, yes. There should be one command to build, one to run tests, etc.

But in many cases, you do actually want the pipeline functionality that something like Gitlab CI offers - having multiple jobs instead of a single one has many benefits (better/shorter retry behaviour, parallelisation, manual triggers, caching, reacting to specific repository hooks, running subsets of tests depending on the changed files, secrets in env vars, artifact publishing, etc.). It's at this point that it becomes almost unavoidable to use many of the configuration features including branching statements, job dependencies etc. and that's where it gets messy.

The problem is really that you're forced to do all of that in YAML instead of an actual programming language.

jordanbeiber · a year ago
We’ve gone full-on full-code.

Although we’re using temporal to schedule the workflows, we have a full-code typescript CI/CD setup.

We’ve been through them all, starting with Jenkins and ending with Drone, until we realized that full-code makes it much easier to maintain and share the work across the whole dev org.

No more YAML, code generating YAML, product quirks, Groovy, or DSLs!

bob1029 · a year ago
> Write as much CI logic as possible in your own code

This has been my entire strategy since I've been able to do this:

https://learn.microsoft.com/en-us/dotnet/core/deploying/#pub...

Pulling the latest from git, running "dotnet build" and sending the artifacts to zip/S3 is now much easier than setting up and managing Jenkins et al. You also get the benefit of having 100% of your CI/CD pipeline under source control alongside the product.

In my last professional application of this (B2B/SaaS; customer hosts on-prem), we didn't even have to write the deployment piece. All we needed to do was email the S3 zip link to the customer and they learned a quick procedure to extract it on the server each time.

ptx · a year ago
> All we needed to do was email the S3 zip link to the customer and they learned a quick procedure to extract it on the server each time.

My concern with this kind of deployment solution, where the customer is instructed to install software from links received in e-mails, is that someone else could very easily send them a link to a malicious installer and they would be hosed. E-mail is not authenticated (usually) and the sender can be forged.

I suppose you could use a shared OneDrive folder or something, which would be safer, as long as the customer doesn't rely on receiving the link to OneDrive by e-mail.
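
A lighter-weight mitigation is to publish a checksum over a trusted channel (e.g. the vendor's TLS-protected release page) and have the customer verify it before extracting. A sketch with illustrative file names:

```shell
#!/usr/bin/env bash
# Verify a downloaded artifact against a checksum obtained out-of-band,
# instead of trusting whatever link arrived by e-mail.
set -euo pipefail

# Demo artifact; "expected" would really come from the vendor's release page.
printf 'pretend this is the release payload\n' > release.zip
expected="$(sha256sum release.zip | cut -d' ' -f1)"

if echo "${expected}  release.zip" | sha256sum -c --quiet -; then
  echo "checksum OK, safe to extract"
else
  echo "checksum mismatch, refusing to install" >&2
  exit 1
fi
```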

cesnja · a year ago
You can build the first pipeline with one-liners, but as long as you keep optimizing the pipelines, the YAML keeps piling up with the CI vendor's specific approaches to job selection, env variable delivery, caching, output sharing between jobs, and so on.

Deleted Comment

wvh · a year ago
I like the premise of something like Dagger: an Actions-style CI that can run locally and uses Docker. I don't know if there's an up-and-coming "safe" open-source alternative that does not have that threat of a VC time bomb hanging over it.

Docker and to some extent, however unwieldy, Kubernetes at least allow you to run anywhere, anytime without locking you into a third party.

A "pipeline language" that can run anywhere, even locally, sets up dependency services and initial data, runs tests and avoids YAML overload would be a much needed addition to software engineering.

DanielHB · a year ago
Man, I tried this approach by making my builds dockerized; it turns out Docker layer caching is pretty slow on CI and adds a lot of overhead locally.

Do not recommend this approach (of using docker for building).

adra · a year ago
Make builds in docker by mounting volumes and have your sources, intermediate files, caches, etc. in these volume mounts. Building a bunch of intermediate or incremental data IN the container every time you execute a new partial compile is insanity.

It's very satisfying to just compile an application with a super esoteric toolchain in Docker vs the nightmares of setting it up locally (and keeping it working over time).
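
The mount-based setup described above might look like this; the image name and cache paths are assumptions, and with DRY_RUN=1 (the default here) the command is printed rather than executed:

```shell
#!/usr/bin/env bash
# Compile inside a container while sources and caches live on the host,
# so incremental state survives container restarts.
set -euo pipefail

cmd=(docker run --rm
  -v "$PWD:/src"                         # sources stay on the host
  -v "$HOME/.cache/myproj:/root/.cache"  # compiler/build cache persists
  -w /src
  gcc:13
  make -j2)

if [ "${DRY_RUN:-1}" = 1 ]; then
  printf '%q ' "${cmd[@]}"; echo
else
  "${cmd[@]}"
fi
```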

amadio · a year ago
I think this is good advice overall. I wrote a CMake script that does most of the heavy lifting for XRootD (see https://news.ycombinator.com/item?id=39657703). The CI is then a couple of lines, one to install the dependencies using the packaging tools, and another one calling that script. So don't underestimate the convenience that packaging can give you when installing dependencies.
Aeolun · a year ago
This is where I was going to say something about dagger, but it seems it turned into AI crud.

Let me at least recommend depot.dev for having absurdly fast runners.

shykes · a year ago
Hello! Dagger CEO here. We are indeed getting an influx of AI workloads (AI agents to be specific, which is the fancy industry term for "software with LLMs inside"), and are of course trying to capitalize on that in our marketing material. We're still looking for the right balance of CI and AI on our website. Crucially, it's the same engine running both. Because, as it turns out, AI agents are mostly workflows under the hood, and Dagger is great at running those.

I shared more context in this thread: https://x.com/solomonstre/status/1895671390176747682

oulipo · a year ago
can you give more feedback about dagger? what is good/not good about it? I was going to start looking into it
amedvednikov · a year ago
We recently migrated from YAML CI to VSH as well:

https://github.com/vlang/v/blob/master/ci/linux_ci.vsh

speleding · a year ago
I would like to add one point:

* Consider whether it's not easier to do away with CI in the cloud and just build locally on the dev's laptop

With fast laptops and Docker you can get perfectly reproducible builds and tests locally that are infinitely easier to debug. It works for us.

claytonjy · a year ago
How do you ensure what a dev builds and tags and pushes is coherent, meaning the tag matches the code commit it’s expected to?

I think builds must be possible locally, but i’d never rely on devs for the source of truth artifacts running in production, past a super early startup.
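
One common safeguard, sketched below, is to derive the tag from the commit hash and refuse to publish from a dirty tree; the throwaway repo only exists to make the snippet runnable:

```shell
#!/usr/bin/env bash
# A tag derived from "git rev-parse HEAD" always maps to exactly one commit;
# refusing a dirty working tree stops untracked local edits sneaking in.
set -euo pipefail

release_tag() {
  local dir="$1"
  [ -z "$(git -C "$dir" status --porcelain)" ] \
    || { echo "refusing: working tree is dirty" >&2; return 1; }
  git -C "$dir" rev-parse --short HEAD
}

# demo repository
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" -c user.email=ci@example.com -c user.name=ci \
  commit -q --allow-empty -m "initial"
echo "tag: $(release_tag "$repo")"
```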

Deleted Comment

ed_elliott_asc · a year ago
* print out the working directory and a directory listing every time
12_throw_away · a year ago
And the environment! (Also, don't put secrets in environment vars)
WhyNotHugo · a year ago
If CI just installs some packages and runs `make check` (or something close), then it's going to be much much easier for others to run checks locally.
djha-skin · a year ago
I couldn't agree more, really. My whole career points to this as the absolute correct advice in CI.
fahhem · a year ago
Why use your own runners? If it's about cost, why not use a cheaper cloud like SonicInfra.com?
outofpaper · a year ago
Agree with everything except for the avoidance of YAML. What is your rationale for this?
neves · a year ago
How AWS Code Builder compares? I'm delving into AWS world now.
Kwpolska · a year ago
There are tradeoffs to that. If your CI logic is in shell scripts, you will probably get worse error reporting than the dedicated tasks from the CI tool (which hook into the build system, or which know how to parse logs).
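
The gap can be narrowed, though: most CI tools accept structured output from plain scripts. GitHub Actions, for instance, parses "workflow command" lines printed to stdout into annotations (the file and line values below are illustrative):

```shell
#!/usr/bin/env bash
# A shell script can still feed the CI's native error reporting by printing
# workflow commands; GitHub Actions turns these lines into PR annotations.
set -euo pipefail

annotate_error() { printf '::error file=%s,line=%s::%s\n' "$1" "$2" "$3"; }

annotate_error src/app.js 42 "tests failed in app.js"
echo "::warning::coverage dropped below threshold"
```
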
agumonkey · a year ago
seconded, it was great to leverage hosted cicd at work, until we realized that local testing would now be handled differently..

as always, enough decoupling is useful

totaldude87 · a year ago
GitLab's search just sucks.
julik · a year ago
Seconded. Moreover...

> as long as it is proper, maintainable code

...in an imperative language you know well and which has a sufficient amount of type/null checking you can tolerate.

Ancalagon · a year ago
+1 for avoiding YAML at all costs

Also lol @deng

Dead Comment

tobinfekkes · a year ago
This is the joy of HN, for me, at least. I'm genuinely fascinated to read that both GitHub Actions and DevOps are (apparently) so universally hated. I've been using both for many years, with barely a hiccup, and I actually really enjoy and value what they do. It would never have dawned on me, outside this thread, to think that so many people dislike it. Nice to see a different perspective!

Are the Actions a little cumbersome to set up and test? Sure. Is it a little annoying to have to make somewhat-useless commits just to re-trigger an Action to see if it works? Absolutely. But once it works, I just set it and forget it. I've barely touched my workflows in ~4 years, outside of the Node version updates.

Otherwise, I'm very pleased with both. My needs must just be simple enough to not run into these more complicated issues, I guess?

dathinab · a year ago
It really depends on what you do?

GitHub CI is designed in a way which tends to work well for

- languages with no or very very cheap "compilation" steps (i.e. basically only scripting languages)

- relatively well contained project (e.g. one JS library, no mono repo stuff)

- no complex needs for integration tests

- no need for compliance enforcement stuff, especially not if it has to actually be securely enforced instead of just making it easier to comply than not to comply

- all developers having roughly the same permissions (ignore that some admin has more)

- fast CI

but the moment you step away from this, it falls more and more apart, and every company I have seen so far that doesn't fit the constraints above has non-stop issues with GitHub Actions.

But the worst part, which is maybe where a lot of the hatred comes from, is that it's there for cheap, maybe even free (if you pay for GitHub anyway), and it doesn't need an additional contract, billing, etc., nor additional vetting of third-party companies, nor managing your own CI service. So while it causes issues non-stop, it initially seems the "cheaper" solution for the company. Then, when the company realizes it's not and has to set up its own GitHub runners etc., it probably isn't cheaper anymore, at least if you properly account for dev time spent on "fixing CI issues". And even then there is the sunk-cost fallacy, because you have already spent so much time making GitHub Actions work and would have to port everything over. Also, realistically speaking, a lot of other CI solutions are only marginally better.

voxic11 · a year ago
> no need for compliance enforcement stuff

I find GitHub Actions works very well for compliance. The ability to create attestations makes it easy to enforce policies about artifact provenance and integrity, and it was much easier to get working properly than my attempts to get Jenkins to produce attestations.

https://docs.github.com/en/actions/security-for-github-actio...

https://docs.github.com/en/actions/security-for-github-actio...

What was your issue with it?

tasuki · a year ago
> languages with no or very very cheap "compilation" steps (i.e. basically only scripting languages)

This is not true at all. It's fine with Haskell, just cache the dependencies to speed up the build...

Marsymars · a year ago
> But the worst part, which maybe is where a lot of hatred comes from, is that it's there for cheap maybe even free (if you anyway pay for GitHub) and it doesn't need an additional contract, billing, etc.

Or even if you pay $$$ for big runners you can roll it onto your Azure bill rather than having to justify another SAAS service.

lolinder · a year ago
> Also, realistically speaking, a lot of other CI solutions are only marginally better.

This is the key point. Every CI system falls apart when you get too far from the happy path that you lay out above. I don't know if there's an answer besides giving up on CI all together.

jillesvangurp · a year ago
I use GH actions. You should treat it like all build systems: let them do what they are good at and nothing else. The rest should be shell scripts or separate docker containers. If it gets complicated, dumb it down to "run this script". Scripts are a lot easier to write and debug than thousands of lines of yaml doing god knows what.

The problem isn't GitHub Actions but people overloading their build and CI system with all sorts of custom crap. You'd have had the same hard time twenty years ago with Ant and Hudson (Jenkins before the fork, after Oracle inherited it from Sun), and for the same reason: these systems simply aren't very good as a bash replacement.

If you don't know what Ant is. That was a popular build system for Java before people moved the problem to Maven and then to Gradle (without solving it). I've dealt with Maven files that were trying to do all sorts of complicated things via plugins that would have amounted to two or three lines of bash. Gradle isn't any better. Ant at least used to have simple primitives for "doing" things. But you had to spell it out in XML form.

The point of all this, is that build & CI systems should mainly do simple things like building software. They shouldn't have a lot of conditional logic, custom side effects, and wonky things that may or may not happen depending on the alignment of the moon and stars. Debugging that stuff when it fails to work really sucks.

What helps with YAML is using YAML generators. I've used a Kotlin one for a while. Basically, you get auto-complete, syntactic sanity, type checking, and if it compiles, it runs. It also makes it a lot easier to discover new parameters, plugin version updates, etc.

motorest · a year ago
> I use GH actions. You should treat it like all build systems: let them do what they are good at and nothing else. The rest should be shell scripts or separate docker containers.

That's supposedly CI/CD 101. I don't understand why people in this thread seem to be missing this basic fact; instead they vent about irrelevant things like YAML.

You set your pipeline. You provide your own scripts. If a GitHub Action saves you time, you adopt it instead of reinventing the wheel. That's it.

This whole discussion reads like the bike fall meme.

anonzzzies · a year ago
We see quite a lot of organisations from the inside because of the business we're in, and while this usually isn't our task, when I hear these stories and see people struggling with DevOps in reality, the first thing we push for is to dumb everything down and remove the dependencies on third-party providers, so that things run again like, in this case, the hello world of GitHub Actions.

It is literally always the case that the people who complain have this (very HN, so funny you say that) habit of grossly overarchitecting and building things that are only there because they read about them on HN, some subreddit, or Discord. We sometimes walk into struggling teams and check the commits and setup, only to find out they did things like switch package manager/bundler/etc. five times in the past year (this is definitely an HN thing, where a new JS package manager pops up every 14 minutes). Another terrible thing: looking at 10+ year old codebases, we see JS, TS, Python, Go, and Rust, and when we ask wtf, they tell us something something performance. Of course the language was never the bottleneck of these mostly LoB apps (people here would be pretty scared to see how bad the database setups are, even for multi-million-dollar projects, departmental or even enterprise-wide; the DBAs in the basement know, but they are not consulted for various reasons).

We only work for large companies, almost never startups, and these issues are usually departmental (because the big bad Java/Oracle IT in the basement doesn't allow anything, so departments get budgets to do their own thing), but still, it's scary how much money is being burnt on these lame new things that won't survive anyway.
IshKebab · a year ago
Sounds like you have the same pain points as everyone else; you're just more willing to ignore them.

I am with the author - we can do better than the status quo!

tobinfekkes · a year ago
I guess it's possible. But I also don't really have anything to ignore....? I genuinely never have an issue; it builds code, every time.

I commit code, push it, wait 45 seconds, it syncs to AWS, then all my sites periodically ping the S3 bucket for any changes, and download any new items. It's one of the most reliable pieces of my entire stack. It's comically consistent, compared to anything I try building for a mobile app or pushing to a mobile app store.

I look forward to opening my IDE to push code to the Actions for my web app, and I dread the build pipeline for a mobile app.

dkdbejwi383 · a year ago
The pain points sound pretty trivial though.

You notice a deprecation warning in the logs, or an email from GitHub and you make a 1 line commit to bump the node version. Easy.

Sure, you can make typos that you don't spot until you've pushed and the action doesn't run, but I quickly learned to stop being lazy and actually think about what I'm writing, and to get someone else to do an actual review (not just scroll down and up and give it an LGTM).

My experience is same as the commenter above, it’s relatively set and forget. A few minutes setup work for hours and hours of benefit over years of builds.

raffraffraff · a year ago
It probably depends on your org size and how specialised you are. Right now I dislike GitHub Actions and think that GitLab CI is way better, but I also don't give it too much thought, because messing with pipelines is a once-in-a-blue-moon task for me. But I would absolutely hate to be a "100% DevOps guy" for a huge organisation that wants me to specialise in this stuff all the time. I think that by the end of week 1 I'd go mad.
thom · a year ago
Unless I'm misunderstanding, you can use workflow_dispatch to avoid having to make useless commits to trigger actions.
duped · a year ago
I have a small gripe that I think exemplifies a bigger problem. actions/upload-artifact strips executable permissions from binaries (1). The fact they fucked this up in the first place, and six years later haven't fixed it, gives me zero confidence in the team managing their platform. And when I'm picking a CI/CD service, I want reliability and correctness. GH has neither.

When it takes all of a day to self-host your own task runner on a laptop in your office and get better uptime, lower cost, better performance, and more correct implementations, you have to ask why anyone chooses GHA. I guess the hello-world is convincing enough for some people.

(1) https://github.com/actions/upload-artifact/issues/38

chanux · a year ago
You must have simple, straightforward flow touched only by a handful of folks max.

The world is full of kafkaesque nightmares of DevOps pipelines "designed" and maintained by committees of people.

It's horrible.

That said, for some personal stuff I have Google Cloud Build with a very, VERY simple flow. Fire and forget; it's been good.

eru · a year ago
You might like 'git commit --allow-empty' to make your somewhat-useless commits.

But honestly, doesn't github now have a button you can press to retrigger actions without a commit?

GitHub Actions are least hassle, when you don't care about how much compute time you are burning through. Either because you are using the free-for-open-source repositories version, or because your company doesn't care about the cost.

If you care about the compute time you are burning, then you can configure them enough to help with that, but it quickly becomes a major hassle.
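
For reference, the empty-commit trick mentioned above looks like this (the throwaway repo just makes the snippet runnable):

```shell
#!/usr/bin/env bash
# An empty commit re-triggers CI without touching any files.
set -euo pipefail

repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" -c user.email=dev@example.com -c user.name=dev \
  commit -q --allow-empty -m "initial"
git -C "$repo" -c user.email=dev@example.com -c user.name=dev \
  commit -q --allow-empty -m "ci: re-trigger pipeline"
git -C "$repo" log --oneline
```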

ImHereToVote · a year ago
GitHub actions is nice. People are just not accustomed to being punched in the face. The stuff I work on regularly makes GitHub actions seem like a Hello World app.
trevor-e · a year ago
I thought the same until having to do slightly more complicated and "off the beaten path" workflows. I'm still amazed at how easy they make building CI jobs now, but I also get frustrated at how it's not a "local first" workflow that you then push to their service.
tasuki · a year ago
Yes, your needs are simple. I've also been using GitHub actions for all my needs since Travis shut down and haven't run into any problems.

I wouldn't want to maintain GitHub actions for a large project involving 50 people and 5 languages...

flanked-evergl · a year ago
Software engineering thrives on iteration speed. Things have to change, and if your pipeline is difficult to change, it will cost you.

Dead Comment

xlii · a year ago
There is one thing that I haven’t seen mentioned: worst possible feedback loop.

I’ve noticed this phenomenon a few times already, and I think there’s nothing worse than a 30-60s feedback loop: the one that keeps you glued to the screen but is otherwise completely nonproductive.

I tried for many moons to replicate the GHA environment locally, and it’s impossible in my context. So every change is like „push, wait for GH to pick up, act on some stupid typo or inconsistency, rinse, repeat”.

It’s like a slot machine „just one more time and it will run”, eating away focus and time.

It took me 25 minutes to get a 5-second build process. A naive build with GHA? 3 minutes, because of dependencies et al. OK, let’s add caching. 10 hours fly by.

The cost of failure and focus drop is enormous.

kelseydh · a year ago
Feel this pain so much. If you are debugging GitHub Actions container builds, and each takes ~40 minutes, you can burn through a whole work day testing only six or seven changes.

There has to be a better way. How has nobody figured this out?

elAhmo · a year ago
There is act, which allows you to run actions locally. Although not exactly the same as the real thing, it can save time.

https://github.com/nektos/act

esafak · a year ago
There's dagger; CI as code. Test your pipeline locally, in your IDE.
hv42 · a year ago
With GitLab, I have found https://github.com/firecow/gitlab-ci-local to be an incredible time-saver when working with GitLab pipelines (similar to https://github.com/nektos/act for GitHub)

I wish GitLab/GitHub would provide a way to do this by default, though.

cantagi · a year ago
act is great. I use it to iterate on actions locally (I self-host gitea actions, which uses act, so it's identical to github actions).
lsuresh · a year ago
This is exactly a big piece of our frustration -- the terrible feedback loop and how much mental space it wastes. OP does talk about this at the end (babysitting the endless "wip" commits till something works).
figmert · a year ago
Highly recommend nektos/act, and if it's something complex enough, you can SSH into the runner to investigate. There are many actions that facilitate this.
tomjakubowski · a year ago
I use LLMs for a lot of things these days, but maybe the most important one is as a focus-preserving mechanism for exactly these kinds of middle-ground async tasks that have a feedback loop measured in a handful of minutes.

If the process is longer than a few minutes, I can switch tasks while I wait for it. It's waiting for those things in the 3-10 minute range that is intolerable for me: long enough I will lose focus, not long enough for me to context switch.

Now I can bullshit with the LLM about something related to the task while I wait, which helps me to stay focused on it.

silisili · a year ago
I worked at companies using Gitlab for a decade, and got familiar with runners.

Recently switched to a company using Github, and assumed I'd be blown away by their offering because of their size.

Well, I was, but not in the way I'd hoped. They're absolutely awful in comparison, and I'm beyond confused how it got to that state.

If I were running a company and had to choose between the two, I'd pick Gitlab every time just because of Github actions.

yoyohello13 · a year ago
Glad I’m not the only one. GitLab runners just make sense to me. A container you run scripts in.

I have some GitHub actions for some side projects and it just seems so much more confusing to setup for some reason.

briansmith · a year ago
Actions have special integration with GitHub (e.g. they can annotate the pull request review UI) using an API. If you forgo that integration, then you can absolutely use GitHub Actions like "a container you run scripts in." This is the advice that is usually given in every thread about GitHub Actions.
HdS84 · a year ago
There are lots of problems. Actions try to abstract the script away, give you a consistent experience and, most crucially, allow sharing. Because GitLab has no real way to share actions or workflows (I can do YAML includes, but come on, that sucks even harder than Actions), you are constantly reinventing the wheel. That's OK if all you do is "build folder", but if you need caching, reporting of issues, code coverage, etc., it gets real ugly really fast. Example: yesterday I tried services, i.e. starting up some DB and backend containers to run integration tests against. Unfortunately, you cannot expand dynamic variables (set by previous containers) but are limited to already-set vars. So back to docker-compose... and the GitLab pipelines are chock full of such weird limitations.
usr1106 · a year ago
So GitHub was really the perfect acquisition for the Microsoft portfolio: applications with a big market share that are technically inferior to the competition.

// Luckily still a GitLab user, but recently forced onto Microsoft Teams and Office.

OJFord · a year ago
> I have some GitHub actions for some side projects and it just seems so much more confusing to setup for some reason.

Because the docs are crap perhaps? I prefer it, having used both professionally (and Jenkins, Circle, Travis), but I do think the docs are really bad. Even just the nesting of pages once you have them open, where is the bit with the actual bloody syntax reference, functions, context, etc.

globular-toast · a year ago
Same. I'd been using Gitlab for a few years when Actions came out. Looked at it and thought, wow that's weird, but gave it the benefit of the doubt as it's just different, surely it would make sense eventually. Well no, it doesn't make sense, and seeing all the shocked Pikachu at the action compromise the other day was amusing.
zamalek · a year ago
> I'm beyond confused how it got to that state.

A few years back I wanted to throw in the towel and write a more minimal GHA-compatible agent. I couldn't even find where in the code they were calling out to GitHub APIs (one goal was to have that first party progress UI experience). I don't know where I heard this, so big hearsay warning, but apparently nobody at GitHub can figure it out either.

jalaziz · a year ago
GitHub Actions started off great while they were quickly iterating, but it very much seems that GitHub has taken its eye off the ball and the improvements have all but halted.

It's really upsetting how little attention Actions is getting these days (<https://github.com/orgs/community/discussions/categories/act...> tells the story -- the most popular issues have gone completely unanswered).

Sad to see Earthly halting development and Dagger jumping on the AI train :(. Hopefully we'll get a proper alternative.

On a related note, if you're considering https://www.blacksmith.sh/, you really should consider https://depot.dev/. We evaluated both but went with Depot because the team is insanely smart and they've solved some pretty neat challenges. One of the cooler features is that their caching works with the default actions/cache action. There's absolutely no need to switch out popular third party actions in favor of patched ones.

shykes · a year ago
> Sad to see Earthly halting development and Dagger jumping on the AI train :(. Hopefully we'll get a proper alternative.

Hi, Dagger CEO here. We're advertising a new use case for Dagger (running AI agents) while continuing to support the original use case (running complex builds and tests). Dagger has always been a general purpose engine, and our community has always used it for more than just CI. It's still the exact same engine, CLI, SDKs and observability stack. It's not like we're discontinuing a product, to the contrary: we're getting more workloads on the platform, which benefits all our users.

jalaziz · a year ago
Great to know. I think the fear is that so many companies are prioritizing AI workloads for the valuation bump rather than delivering actual meaningful value.
SamuelAdams · a year ago
A lot of GH actions teams were impacted by layoffs in November.

Example:

https://github.com/actions/runner/pull/2477#issuecomment-244...

mike_hearn · a year ago
Presumably the issue is that GH underpriced Actions such that it's not worth improving because driving more usage won't drive revenue, and that then forced prices down for everyone else because everyone fixed on the Actions pricing.
pinkgolem · a year ago
I might have missed the news, but I did not find anything in regards to earthly stopping development

What happened there?

jalaziz · a year ago
I missed it too, but then found this: https://github.com/earthly/earthly/issues/4313
pimeys · a year ago
We switched to Depot last week. Our Rust builds went down from 20+ minutes to 4-8 minutes. The easy setup and their docker builds with fast caching are really good.
lsuresh · a year ago
This sounds promising. What made your Rust builds become that fast? Any repo you could point us to?
solatic · a year ago
> Trivial mistakes (formatting, unused deps, lint issues) should be fixed automatically, not cause failures.

Do people really consider this best practice? I disagree. I absolutely don't want CI touching my code. I don't want to have to remember to rebase on top of whatever CI may or may not have done to my code. Not all lint issues are auto-fixable, so some of the time I would need to fix them from my laptop anyway. If it's a trivial check, it should run as a pre-commit hook. What's next, CI running an LLM to auto-fix failing test cases?

Do people actually prefer CI auto-fixing anything?
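
The pre-commit-hook alternative is straightforward; in this sketch the "formatter" just strips trailing whitespace, standing in for whatever real tool a project uses:

```shell
#!/usr/bin/env bash
# Fix trivial issues locally before commit, so CI never has to push fixups.
set -euo pipefail

fix_trailing_ws() {
  sed -i 's/[[:space:]]*$//' "$@"   # GNU sed in-place edit
}

# demo file with trailing spaces and a trailing tab
printf 'code here   \nmore code\t\n' > demo.txt
fix_trailing_ws demo.txt
cat -A demo.txt   # -A makes any remaining invisible characters visible
```

Wired up as .git/hooks/pre-commit (or via the pre-commit framework), the fix happens before CI ever sees the code.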

thedougd · a year ago
I think this is where things went off the rails for him. Committing back to the same branch that is running CI has too many gotchas in any CI system. You touched on the first issue: the remote branch immediately deviates unexpectedly from the local branch. Care also has to be taken not to trigger additional CI runs from that commit.
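
One common guard against the retrigger problem is to skip workflow runs initiated by the bot itself. A minimal sketch (the workflow name and lint script are hypothetical, not from the thread):

```yaml
# Sketch: avoid an infinite loop when a workflow pushes auto-fix
# commits back to the branch by skipping runs triggered by the bot.
name: lint
on: push
jobs:
  lint:
    # Skip if the pushing actor is the Actions bot account.
    if: github.actor != 'github-actions[bot]'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/lint.sh --fix  # hypothetical script
```

The other common workaround is prefixing the bot's commit message with `[skip ci]`, which GitHub Actions honors for push events.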
stared · a year ago
I do such things with pre-commit.

Doing it in CI sounds like making things more complicated: you end up resetting to remote branches after pushing commits. And, in the worst case, it's something that actually breaks code that works locally.

Marsymars · a year ago
I have team members who complain that installing and running pre-commit is too much overhead, so instead I see them pushing broken commit after broken commit, tying up CI resources only to fail on the pre-commit workflow. :(
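
For reference, the setup being discussed is a small config file plus a one-time `pre-commit install`. A minimal sketch (the specific hooks and revision are examples, not prescribed by anyone in the thread):

```yaml
# .pre-commit-config.yaml — minimal example using the standard hooks repo.
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0  # pin to a tag; check for the current release before copying
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-merge-conflict
```

After `pre-commit install`, the hooks run automatically on `git commit`, so trivial formatting issues never reach CI in the first place.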
llm_nerd · a year ago
That part immediately made me short circuit out of the piece. That sounds like a recipe for disaster and an unnecessary complexity that just brings loads of new failure modes. Not a best practice.

Trivial mistakes in PRs are almost always signs of larger errors.

ben_pfaff · a year ago
I'm new to CI auto-fixes. My early experience with it is mixed. I find it annoying that it touches my code at all, but it does sometimes allow a PR to get further through the CI system to produce more useful feedback later on. And then a lot of the time I end up force-pushing a branch that is revised in other ways, in which case I fold in whatever the CI auto-fix did, either by squashing it in or by applying it in some other way.

(Most of the time, the auto-fix is just running "cargo fmt".)

kylegalbraith · a year ago
This was an interesting read and highlighted some of the author's top-of-mind pain points and rough edges. However, in my experience, this is definitely not an exhaustive list, and there are actually many, many, many more.

Things like 10 GB cache limits in GitHub, concurrency limits based on runner type, the expensive price tag for larger GitHub runners, and that's before you even get to the security ones.

Having been building Depot[0] for the past 2.5 years, I can say there are so many footguns in GitHub Actions that you don't realize until you start seeing how folks are bending YAML workflows to their will.

We've been quite surprised by the `container` job. Namely, folks want to try to use it to create a reproducible CI sandbox for their build to happen in. But it's surprisingly difficult to work with. Permissions are wonky, Docker layer caching is slow and limited, and paths don't quite work as you thought they did.
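
For readers unfamiliar with it, the `container` job being discussed looks roughly like this (the image and steps are illustrative only):

```yaml
# Sketch of GitHub Actions' `container:` syntax: the job's steps run
# inside the specified image rather than directly on the runner VM.
jobs:
  build:
    runs-on: ubuntu-latest
    container:
      image: node:20-bookworm  # example image
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm test
```

The appeal is a pinned, reproducible toolchain; the pain points above (permissions, layer caching, path mapping between the host runner and the container) are where that promise tends to break down.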

With Depot, we've been focusing on making GitHub Actions exponentially faster and removing as many of these rough edges as possible.

We started by making Docker image builds exponentially faster, and we have now brought that architecture and performance to our own GHA runners [1]. We've built up and optimized the compute and processes around the runner to make jobs extremely fast, like making caching 2-10x faster without requiring any special cache actions of ours. Our Docker image builders are right next door on dedicated compute with fast caching, which makes the `container` job a lot better: we can build the image quickly, and then you can use that image right from our registry in your build job.

All in all, GHA is wildly popular. But the sentiment, even among its biggest fans, is that it could be a lot better.

[0] https://depot.dev/

[1] https://depot.dev/products/github-actions

SkiFire13 · a year ago
By what measure is this "exponentially faster"? Surely GH doesn't take an exponential time in the number of steps of the workflow...
magicalhippo · a year ago
Depot looks nice, but also looks fairly expensive to me. We're a small B2B company, just 10 devs, but we'd be looking at 200+500 = $700/mo just for building and CI.

I guess that would be reasonable if we really needed the speedup, but if you're also offering a better QoL GHA experience then perhaps another tier for people like us who don't necessarily need the blazing speed?

suryao · a year ago
You might want to check out my product, WarpBuild[0].

We are fully usage based, no minimums etc., and our container builders are faster than others on the market.

We also have a BYOC option that gives a 10x cost reduction and is used by many customers at scale.

[0] https://warpbuild.com

kylegalbraith · a year ago
We're rolling out new pricing in the next week or two that should likely cover your use case. Feel free to ping me directly, email in my bio, if you'd like to learn more.
axelfontaine · a year ago
At https://sprinters.sh we offer AWS-hosted runners at a price point that will be much more suitable for a company like yours.
Aeolun · a year ago
Depot is fantastic. Can heavily recommend it. It’s like magic when your builds suddenly take 1m instead of 5+ just by switching the runner.
tasuki · a year ago
> Things like 10 GB cache limits in GitHub

10,000,000,000 bytes should be enough for anyone! It really is a lot of bytes...

hn_throwaway_99 · a year ago
> A few days ago, someone compromised a popular GitHub Action. The response? "Just pin your dependencies to a hash." Except as comments also pointed out, almost no one does.

I used GitHub actions when building a fin services app, so I absolutely used the hash to specify Action dependencies.

I agree that this should be the default, or even the required, way to pull in Action dependencies, but saying "almost no one does" is a pretty lame excuse when talking about your own risk. What other people do has no bearing on your options here.

Pin to hashes when pulling in Actions - it's much, much safer
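
Concretely, pinning means referencing the action's full commit SHA instead of a mutable tag. A sketch (the SHA below is illustrative, not a real release hash; look up the actual commit on the action's releases page):

```yaml
steps:
  # Mutable tag — the action's owner (or an attacker) can move it:
  #   - uses: actions/checkout@v4
  # Pinned to an immutable commit SHA, with the tag kept as a comment:
  - uses: actions/checkout@1a2b3c4d5e6f7a8b9c0d1e2f3a4b5c6d7e8f9a0b # v4.1.1
```

Tools like Dependabot and Renovate can then bump the SHA (and the trailing version comment) for you, so pinning doesn't mean freezing.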

dijit · a year ago
I think the HN community at large had a bit of a learning experience a couple of days ago.

"Defaults matter" is a common phrase, but equally true is: "the pattern everyone recommends including example documentation matters".

It is fair to criticise the usage of GH Actions, just like it's fair to criticise common usage patterns of MySQL that eat your data: even if smarter individuals (who learn from deep understanding, or from being burned) can make correct decisions, the population of users at large is affected and has to learn the hard way or be educated.

hn_throwaway_99 · a year ago
I wholeheartedly agree, and perhaps it was just how I was interpreting the author's statement in the article. If it's saying that the "default" way of using GitHub Actions is dangerous and leads to subtle security footguns, I completely agree. But if you know the proper way to use and secure Actions, saying "everyone else does it a bad way" is irrelevant to your security posture.


gazereth · a year ago
Pinning dependencies is trading one problem for another.

Yes, your builds will work as expected for a stretch of time, but that period will come to an end, eventually.

Then one day you will be forced to update those pinned dependencies and you might find yourself having to upgrade through several major versions, with breaking changes and knock-on effects to the rest of your pipelines.

Allowing rolling updates to dependencies helps keep these maintenance tasks small and manageable across the lifetime of the software.

StrLght · a year ago
You don’t have to update them manually. Renovate supports pinned GitHub Actions dependencies [1]. Unfortunately, I don’t use Dependabot so can’t say whether it does the same.

Just make sure you don’t leak secrets to your PRs. Also I usually review changes in updated actions before merging them. It doesn’t take that much time, so far I’ve been perfectly fine with doing that.

[1]: https://docs.renovatebot.com/modules/manager/github-actions/...
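
The Renovate setup referenced above can be as small as one preset. A sketch of a minimal `renovate.json` (preset names as documented by Renovate; verify against the current docs before copying):

```json
{
  "$schema": "https://docs.renovatebot.com/renovate-schema.json",
  "extends": [
    "config:recommended",
    "helpers:pinGitHubActionDigests"
  ]
}
```

With `helpers:pinGitHubActionDigests`, Renovate rewrites `uses:` references to full commit SHAs and then opens PRs that bump those SHAs as new releases come out.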

baq · a year ago
Not pinning dependencies is an existential risk to the business. Yes, it's a tradeoff: you must assign your own probability to any dependency being hijacked in your timeframe, but that probability is not zero.
kevincox · a year ago
That isn't even the biggest problem. That breaks, and breakage gets fixed. Other than some slight internal delays there is little harm done. (You have a backup emergency deploy process that doesn't depend on GitHub anyways right?)

The real problem is security vulnerabilities in these pinned dependencies. You end up making a choice between:

1. Pin and risk a malicious update.

2. Don't pin and have your dependencies get out of date and grow known security vulnerabilities.

progbits · a year ago
But there is no transitive locking like package manager lockfiles. So if I depend on good/foo@hash, it depends on bad/hacked@v1, and v1 gets moved to a malicious version, I get screwed.

This is for composite actions. For JS actions, what if they don't lock dependencies but pull whatever the newest package is at action setup time? Same issue.

Would have to transitively fork everything and pin it myself, and then keep it updated.

smpretzer · a year ago
I have been using renovate, which automatically pins, and updates, hashes. So I can stay lazy, and only review the new hash when a renovate PR gets opened: https://docs.renovatebot.com/modules/manager/github-actions/...