RL has been used extensively in other areas - such as coding - to improve model behavior out of distribution, so I'm somewhat skeptical of handwaving away a critique of a model's sophistication by saying that its poor out-of-distribution performance is RL's fault.
If we don't start from a position of anthropomorphizing the model into a "reasoning" entity (and instead take as our prior "it is a black box that has been extensively trained to mimic logical reasoning"), then the result reads as "here is a case where it can't mimic reasoning well", which seems like a very reasonable conclusion.
The HIV-free transplanted immune system sees the original immune system as alien, and proceeds to wipe it out at the cellular level. This presumably takes the HIV with it, even if the new immune system is not itself resistant.
I guess this means that quiescent HIV is not at a stage in its lifecycle where it can reinfect cells if its host cell is destroyed. My hilarious mental model of infectious HIV virions floating inside a CD4+ T-cell like angry bees inside a balloon is clearly mistaken.
Adoption rate = first derivative
Flattening adoption rate = the second derivative is negative
Starting to flatten = the third derivative is negative
I don't think anyone cares what the third derivative of something is when the first derivative could easily change by a macroscopic amount overnight.
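The derivative chain above can be sketched with simple finite differences on a made-up cumulative-adoption series (the numbers are purely hypothetical, chosen so the curve accelerates and then flattens):

```python
def diff(xs):
    """First-order finite difference: xs[i+1] - xs[i]."""
    return [b - a for a, b in zip(xs, xs[1:])]

# Hypothetical cumulative adoption counts over consecutive periods.
adoption = [0, 10, 30, 60, 85, 100]

rate = diff(adoption)   # first derivative: adoption rate
accel = diff(rate)      # second derivative: negative once the rate flattens
jerk = diff(accel)      # third derivative: negative as flattening *begins*

print(rate)   # [10, 20, 30, 25, 15]
print(accel)  # [10, 10, -5, -10]
print(jerk)   # [0, -15, -5]
```

The third difference goes negative one step before the second does, which is the whole content of "starting to flatten" - and also why it's such a noisy signal on real data.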
https://aella.substack.com/p/the-joy-is-not-optional