I've been curating (mostly design) books on a digital library: https://links.1984.design/books
Consider the classic escalator sign: "Dogs must be carried." At first glance it seems clear. On a second read, it becomes obvious that what matters is not the dogs, but whether they are being carried.
Grandma calls out: "The chicken is ready to eat." Is the bird done cooking, or is it hungry?
Many system outputs have the same problem. They look definitive, but they silently hide whether the required conditions were ever met.
When systems consume outputs from black-box algorithms, the usual options are to trust the conclusion or ignore it entirely.
In clinical genomics, the latter is traditional. For example, the British Society for Genetic Medicine advises clinicians not to act on results from external genomic services https://bsgm.org.uk/media/12844/direct-to-consumer-genomic-t...
This post describes a third approach, grounded in computer science. Before any interpretation, systems should record whether verifiable evidence is actually available.
The standard adds a small but strict step. Each rule first reports whether it could be checked at all: yes, no, or not evaluable. Then the evidence is used in reverse, not to confirm the result, but to try to rule it out. If removing or negating that evidence would change the outcome, it counts as real evidence. If not, it does not.
Crucially, this forces a simple question: could the same result have appeared even if the evidence were absent or different? Only when the answer is no does the result actually count as evidence.
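The two steps above can be sketched in code. This is an illustrative toy, not the actual standard: the names (`Rule`, `evidence_status`) and the dictionary-of-evidence shape are my assumptions, and the "rule out" test is implemented as a counterfactual re-run with the rule's inputs removed.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Rule:
    """Hypothetical rule: 'inputs' lists the evidence keys it depends on."""
    name: str
    inputs: tuple
    # Returns True/False if the rule could be checked, None if not evaluable.
    check: Callable[[dict], Optional[bool]]

def evidence_status(rule: Rule, evidence: dict,
                    conclude: Callable[[dict], bool]) -> str:
    # Step 1: report whether the rule could be checked at all.
    status = rule.check(evidence)
    if status is None:
        return "not evaluable"
    # Step 2: counterfactual test. Remove the rule's inputs and re-run the
    # conclusion; if the outcome is unchanged, this evidence did not matter.
    counterfactual = {k: v for k, v in evidence.items() if k not in rule.inputs}
    if conclude(evidence) == conclude(counterfactual):
        return "checked, but not real evidence"
    return "real evidence"

# Toy conclusion: call a variant "pathogenic" if a functional assay flags it.
conclude = lambda ev: ev.get("assay_abnormal", False)
assay_rule = Rule("functional_assay", ("assay_abnormal",),
                  lambda ev: ev["assay_abnormal"] if "assay_abnormal" in ev else None)

evidence_status(assay_rule, {"assay_abnormal": True}, conclude)  # "real evidence"
evidence_status(assay_rule, {}, conclude)                        # "not evaluable"
```

The point of the sketch is the ordering: evaluability is recorded before interpretation, and a result only earns the label "evidence" by surviving the attempt to rule it out.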
The idea comes from genomics, where hospitals, companies, and research groups need to share results without exposing proprietary methods, but it applies anywhere systems reason over incomplete or black-box data.
I have worked almost exclusively on three comparable systems over the past ten years. Can you access yours over SSH?
I find it remarkably fluid to develop methods for huge datasets directly on the HPC, using vim to code and tmux for sessions. I rely on printing detailed log files constantly, with lots of debug output, and an automated monitoring script that prints those logs in real time: a mixture of .out, .err, and log.txt files.
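A monitoring script of the kind described might look like the following sketch. It is an assumption-laden toy, not the author's actual script: file naming (e.g. scheduler-style `*.out`/`*.err` plus a `log.txt`) and the polling interval would depend on your job layout.

```python
import pathlib
import time

def stream_logs(run_dir: str, follow: bool = False, poll: float = 1.0) -> None:
    """Print new lines from every .out/.err/log.txt file under run_dir."""
    offsets = {}  # per-file position of the last character already printed
    while True:
        for pattern in ("*.out", "*.err", "log.txt"):
            for path in sorted(pathlib.Path(run_dir).glob(pattern)):
                with open(path) as fh:
                    fh.seek(offsets.get(path, 0))
                    chunk = fh.read()           # only the unseen tail
                    offsets[path] = fh.tell()
                for line in chunk.splitlines():
                    print(f"[{path.name}] {line}")
        if not follow:
            break                               # one-shot mode
        time.sleep(poll)
```

Run as `stream_logs("runs/exp42", follow=True)` to get a rough multi-file `tail -f` with each line prefixed by its source file.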
Even with complete attention to detail, the final renders would be color graded using Flame, Inferno, or some other tool, and all of those edits would also be stored and reproducible in the pipeline.
Pixar must have a very similar system and maybe a Pixar engineer can comment. My somewhat educated assumption is that these DVD releases were created outside of the Pixar toolchain by grabbing some version of a render that was never intended as a direct to digital release. This may have happened as a result of ignorance, indifference, a lack of a proper budget or some other extenuating circumstance. It isn't likely John Lasseter or some other Pixar creative really wanted the final output to look like this.
I run into this same failure mode often. We introduce purposeful scaffolding in the workflow that isn’t meant to stand alone, but exists solely to ensure the final output behaves as intended. Months later, someone is pitching how we should “lean into the bold saturated greens,” not realising the topic only exists because we specifically wanted neutral greens in the final output. The scaffold becomes the building.
In our work this kind of nuance isn’t optional, it is the project. If we lose track of which decisions are compensations and which are targets, outcomes drift badly and quietly, and everything built after is optimised for the wrong goal.
I’d genuinely value advice on preventing this. Is there a good name or framework for this pattern? Something concise that distinguishes a process artefact from product intent, and helps teams course-correct early without sounding like a semantics debate?
Bioinformaticians (among others) in, for example, University Medical Centers won't get much more bang for the buck than from a well-managed Slurm cluster (i.e. with GPU and fat nodes to distinguish between compute loads). You buy the machines, and they are utilized close to 100% over their lifetime.
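Distinguishing compute loads with dedicated partitions might look roughly like this `slurm.conf` excerpt. The node names, counts, and hardware specs are invented for illustration; a real cluster's values will differ.

```
# Hypothetical slurm.conf excerpt: route jobs to hardware that matches the load.
NodeName=cpu[01-16] CPUs=64  RealMemory=256000
NodeName=fat[01-02] CPUs=128 RealMemory=2048000            # high-memory "fat" nodes
NodeName=gpu[01-04] CPUs=48  RealMemory=512000 Gres=gpu:4

PartitionName=batch Nodes=cpu[01-16] Default=YES MaxTime=7-00:00:00
PartitionName=fat   Nodes=fat[01-02] MaxTime=7-00:00:00
PartitionName=gpu   Nodes=gpu[01-04] MaxTime=2-00:00:00
```

Users then pick the partition that fits the job (`sbatch -p fat ...` for memory-bound assembly, `-p gpu` for training), which is what keeps utilization high across mixed workloads.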
Graphical data exploration and stats with R, Python, etc. is a beautiful challenge at that scale.
I was recently thinking exactly the same thing as the author here: as a teen I got my iPod and instantly respected its graceful design, and felt shocked at how shoddy my previous cheap MP3 player was in comparison.
I am also convinced that he was fully responsible for keeping Apple on this path and that it is almost impossible to stop others from diluting the craftsmanship towards mediocrity as the group size grows. Big CEOs get labelled as greedy exploiters in a single brushstroke by people who don’t seem to care to read up.
If astronomers announced that a large asteroid might strike Earth in twenty years, and that we currently had no way to deflect it, nobody would respond by saying, “Come back when you already have the rocket.” We would immediately build better telescopes to track it precisely, refine its trajectory models, and begin developing propulsion systems capable of interception. You do not wait for the cure before improving the measurement. You improve the measurement so that a cure becomes possible, targeted, and effective.
Medicine is no different. Refusing to improve early, probabilistic diagnosis because today’s treatments are modest confuses sequence with outcome. Breakthroughs do not emerge from vague labels and mixed populations. They emerge from precise, quantitative stratification that allows real effects to be seen. The danger is not that we measure too early. It is that we continue making irreversible clinical and research decisions using imprecise, binary classifications while biological insight and therapeutic tools are advancing rapidly. Building the probabilistic layer now is not premature. It is how we make future intervention feasible.