It makes sense that a virus passed through saliva would evolve like this, but I just find it particularly unsettling when a pathogen can affect higher-level behaviors like drinking water (or jumping into water for mantises).
I do hope that LLMs can help straighten some of it out, but as anyone who's done healthcare software knows, the problems are not technical; they are quite human.
That being said, one bright spot: my colleagues (not me) have made a huge step forward using category theory and Prolog to discover the provably optimal 3+3 clinical oncology dose-escalation trial protocol[1]. David gave a great presentation on it at the Scryer Prolog meetup[2] in Vienna.
It's kind of amazing how deep in the dark ages we still are with medicine. Even though this is the first EXECUTABLE/PROGRAMMABLE SPEC for a 3+3 cancer trial, he is still fighting to convince his medical colleagues and hospital administrators that this is the optimal trial because -- surprise -- they don't speak software (or statistics).
[1]: https://arxiv.org/abs/2402.08334
[2]: https://www.digitalaustria.gv.at/eng/insights/Digital-Austri...
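For readers who haven't met the 3+3 design: the point of an executable spec is that the decision rule becomes code you can run and check, not prose in a protocol binder. The paper formalizes it in Prolog; this is just a rough Python rendering of the textbook 3+3 rule (not the authors' code, and the function name is mine):

```python
def next_action(n_treated, n_dlt):
    """Classic 3+3 decision rule for one dose level.

    n_treated: patients treated at this dose so far (3 or 6)
    n_dlt: dose-limiting toxicities observed among them
    Returns 'escalate', 'expand' (treat 3 more at same dose),
    or 'deescalate' (dose too toxic; MTD is below this level).
    """
    if n_treated == 3:
        if n_dlt == 0:
            return "escalate"    # 0/3 DLTs: move up a dose level
        if n_dlt == 1:
            return "expand"      # 1/3 DLTs: enroll 3 more at this dose
        return "deescalate"      # >=2/3 DLTs: stop escalating
    if n_treated == 6:
        if n_dlt <= 1:
            return "escalate"    # <=1/6 DLTs: dose tolerated
        return "deescalate"      # >=2/6 DLTs: MTD exceeded
    raise ValueError("3+3 cohorts are size 3 or 6")
```

Even this toy version makes the rule mechanically checkable, which is the property the Prolog spec provides rigorously.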
Further, by virtue of being at the centre of the action in research, doctors in prestige medical centres have an advantage that could be available to all doctors. It's a pretty important point, sometimes referred to as the knowledge-dissemination problem.
Currently, this is best approached by publishing systematic reviews according to the Cochrane Criteria [0]. Such reviews are quite labour-intensive and done all too rarely, but are very valuable when done.
One notable aspect of such reviews is how often they discard published studies for reasons such as bias, incomplete datasets, and so forth.
The approach described by Geiger in the link is commendable in its intentions, but the outcome will face the same problems that manual systematic reviews do.
I wonder if the author considered including rules-based approaches (e.g. the Cochrane guidelines) in addition to machine learning approaches?
NCCN guidelines and Cochrane Reviews serve complementary roles in medicine - NCCN provides practical, frequently updated cancer treatment algorithms based on both research and expert consensus, while Cochrane Reviews offer rigorous systematic analyses of research evidence across all medical fields with a stronger focus on randomized controlled trials. The NCCN guidelines tend to be more immediately applicable in clinical practice, while Cochrane Reviews provide a deeper analysis of the underlying evidence quality.
My main goal here was to show what you could do with any set of medical guidelines that was properly structured. You can choose any criteria you want.
PDFs suck in many ways, but they are durable and portable. If I work with two oncologists, I use the same PDF.
The author means well, but his solution will likely be worse because only he will understand it. And there are a million edge cases.
I'm not trying to build this out or sell it as a tool to providers. I just wanted to demo what you could do with structured guidelines. I don't think there's any reason this would have to be unique to a practice or EMR.
As sister comments mentioned, I think the ideal case here would be if the guideline institutions released the structured representations of the guidelines along with the PDF versions. They could use a tool to draft them that could export in both formats. Oncologists could use the PDFs still, and systems could lean into the structured data.
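To make the "structured representation alongside the PDF" idea concrete, here's a hypothetical sketch (the schema, field names, and criteria are all mine, not any real guideline body's format): each guideline node carries its human-readable text plus machine-checkable criteria, so software can evaluate it against a patient record while oncologists keep reading the PDF.

```python
# Hypothetical structured-guideline fragment: one decision node with
# machine-checkable criteria next to the human-readable text.
guideline_node = {
    "id": "example-node-1",
    "text": "Consider adjuvant therapy for stage II-III disease with ECOG 0-1",
    "criteria": {"stage": ["II", "III"], "ecog_max": 1},
}

def node_applies(node, patient):
    """Check a patient record against one node's structured criteria."""
    c = node["criteria"]
    return patient["stage"] in c["stage"] and patient["ecog"] <= c["ecog_max"]

print(node_applies(guideline_node, {"stage": "II", "ecog": 1}))   # applies
print(node_applies(guideline_node, {"stage": "IV", "ecog": 0}))   # does not
```

A guideline authoring tool could export both this structured form and the rendered PDF from a single source, which is exactly the dual-format idea above.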
1) CI and IaC that deploy a web app running in a container
2) Add horizontal scaling and load balancer
3) Add support for long-running tasks / scheduled tasks
4) Deploys will likely break long-running tasks. Implement blue/green or rolling deploys or some other advanced deployment scheme
5) Implement rollbacks
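The core of step 4 can be sketched in a few lines: replace instances one at a time and gate each replacement on a health check, so in-flight work on the other instances keeps serving. This is a minimal illustration with hypothetical `deploy`/`healthy` callbacks, not any particular platform's API:

```python
import time

def rolling_deploy(instances, deploy, healthy, max_wait=60, interval=1.0):
    """Replace instances one at a time, gating on a health check so the
    remaining instances keep handling traffic and long-running tasks.

    deploy(inst): push the new version to one instance (caller-supplied)
    healthy(inst): return True once the instance passes health checks
    """
    for inst in instances:
        deploy(inst)
        deadline = time.monotonic() + max_wait
        while not healthy(inst):
            if time.monotonic() > deadline:
                # Halt the rollout; untouched instances still run the old
                # version, which is what makes step 5 (rollback) cheap.
                raise RuntimeError(f"{inst} failed health check; halting rollout")
            time.sleep(interval)
    return instances
```

Rollback (step 5) then amounts to running the same loop with the previous artifact.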
Would love to know more about how they filtered the training set down here and what heuristics were involved.
I think that the models we use now are enormous for the use cases we’re using them for. Work like this and model distillation in general is fantastic and sorely needed, both to broaden price accessibility and to decrease resource usage.
I’m sure frontier models will only get bigger, but I’d be shocked if we keep using the largest models in production for almost any use case.
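For anyone unfamiliar with the distillation mentioned above, the classic (Hinton-style) formulation is small: train the student against the teacher's temperature-softened output distribution. This is a generic sketch of that loss, not whatever the work under discussion actually did:

```python
import math

def softmax(logits, T=1.0):
    """Softmax with temperature T; higher T gives softer distributions."""
    exps = [math.exp(x / T) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(teacher_logits, student_logits, T=2.0):
    """Soft-label distillation loss: cross-entropy of the student's softened
    distribution against the teacher's, scaled by T^2 so gradient magnitudes
    stay comparable across temperatures."""
    p = softmax(teacher_logits, T)   # teacher soft targets
    q = softmax(student_logits, T)   # student predictions
    return -T * T * sum(pi * math.log(qi) for pi, qi in zip(p, q))
```

The loss is minimized when the student's logits reproduce the teacher's distribution, which is how a much smaller model can inherit most of a large model's behavior for a narrow use case.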