“The performance of a 512-node Anton machine is over 17,000 nanoseconds of simulated time per day for a protein-water system consisting of 23,558 atoms.[5] In comparison, MD codes running on general-purpose parallel computers with hundreds or thousands of processor cores achieve simulation rates of up to a few hundred nanoseconds per day on the same chemical system.”
17,000 ns of simulation per day sounds crazy small but I wonder how that compares to the timescale of molecular interactions. How helpful have Antons been to pharma research etc?
[Disclaimer: I used to work at D. E. Shaw Research from 2011-2016]
The early Anton 1 numbers of 17 µs/day on 100K atoms were a huge leap forward at the time. Back then, GPU-based simulations (e.g. GROMACS/Desmond on GPU) were doing single-digit ns/day. Remember, even for 'fast-folding' proteins, the relaxation time is on the order of microseconds, and you need 100s of samples before you can converge statistical properties like folding rates [0]. Anton 2 got a 50-100x speed-up [1], which made it much easier to look at druggable pathways. Anton was also used for studying other condensed-matter systems, such as supercooled liquids [2].
Your question of why this is so slow, or the systems so small, is prescient. One of the reasons we have to integrate the dynamical equations (e.g. Newtonian or Hamiltonian mechanics) at small, femtosecond timesteps (1 fs = 1e-15 s) is that the fastest bond vibrations have periods on the order of tens of femtoseconds to picoseconds (1 ps = 1e-12 s), and the timestep has to resolve them. Given that you also have to compute Omega(n^2) pairwise interactions between n particles, you end up with a long runtime to reach nanoseconds, let alone microseconds, while respecting bond frequencies.

The hard part for atomistic/all-atom simulation is that n is on the order of 1e5-1e6 for a single protein solvated in water. The water molecules are extremely important to simulate explicitly, since you need to get polar phenomena such as hydrogen bonding right in order to pin down the folded structure and druggable sites to angstrom precision (1e-10 meters). If you don't do atomistic simulations (i.e. n is much smaller and you ignore complex physical interactions, including semi-quantum effects), you have a much harder time matching precision experiments.
[0] https://science.sciencemag.org/content/334/6055/517

[1] https://ieeexplore.ieee.org/abstract/document/7012191/ (the variance comes from the fact that different physics models and densities cause very different run times: evaluating 1/r^6 vs. 1/r^12 in fixed precision is very different w.r.t. communication complexity, Ewald times, FFTs, ...)

[2] https://pubs.acs.org/doi/abs/10.1021/jp402102w
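To make the cost argument concrete, here is a toy sketch — emphatically not how Anton or any production code does it — of a velocity-Verlet integrator with a naive O(n^2) Lennard-Jones pair loop, in reduced units, with no cutoffs, no electrostatics/Ewald, and no bond constraints (all simplifying assumptions for brevity):

    import numpy as np

    def lj_forces(pos):
        # Naive O(n^2) pairwise Lennard-Jones forces, V(r) = 4*(r^-12 - r^-6).
        f = np.zeros_like(pos)
        n = len(pos)
        for i in range(n - 1):
            d = pos[i] - pos[i + 1:]            # vectors to all later particles
            r2 = np.sum(d * d, axis=1)
            inv6 = 1.0 / r2 ** 3                # r^-6
            fmag = (48.0 * inv6 * inv6 - 24.0 * inv6) / r2
            fij = fmag[:, None] * d             # force on i from each j
            f[i] += fij.sum(axis=0)
            f[i + 1:] -= fij                    # Newton's third law
        return f

    def velocity_verlet(pos, vel, dt, n_steps):
        # Standard symplectic integrator: half-kick, drift, half-kick (unit mass).
        f = lj_forces(pos)
        for _ in range(n_steps):
            vel += 0.5 * dt * f
            pos += dt * vel
            f = lj_forces(pos)
            vel += 0.5 * dt * f
        return pos, vel

    # 64 particles on a cubic lattice near the LJ minimum, so forces start small.
    pts = np.arange(4) * 1.2
    pos = np.array([(x, y, z) for x in pts for y in pts for z in pts], dtype=float)
    vel = np.zeros_like(pos)
    pos, vel = velocity_verlet(pos, vel, dt=1e-3, n_steps=100)

Every step pays the full pair loop here; production codes cut that down with neighbor lists and handle long-range electrostatics with PME-style FFTs, but the femtosecond timestep floor remains, which is why a microsecond means on the order of 1e9 force evaluations.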
This explanation is interesting. Thanks for sharing it. While reading it, I got the impression that the simulation is not fully quantum mechanical, but rather classical with select quantum mechanical effects.
Which parts of quantum mechanics are idealised away and how do we know that not including them won't significantly reduce the quality of the result?
Are you possibly using stochastic noise in the simulations and repeating them multiple times, in the hope that whatever disturbance is caused by the idealisation of the model is covered by the noise?
I think it's still an open question whether you truly need an ensemble of long simulations (versus many shorter ones that can be generated in embarrassingly parallel mode on GPUs; a toy sketch of the latter is below).
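The "many short runs" side of that question is trivially easy to farm out. A hypothetical sketch — run_short_trajectory is a stand-in for a call into a real MD engine, not an actual API:

    from concurrent.futures import ProcessPoolExecutor
    import numpy as np

    def run_short_trajectory(seed):
        # Stand-in for a short MD run; returns one sampled observable.
        rng = np.random.default_rng(seed)
        return rng.normal(loc=1.0, scale=0.2)   # placeholder "measurement"

    if __name__ == "__main__":
        with ProcessPoolExecutor() as pool:
            samples = list(pool.map(run_short_trajectory, range(200)))
        err = np.std(samples) / np.sqrt(len(samples))
        print(f"mean = {np.mean(samples):.3f} +/- {err:.3f}")

The catch, of course, is that independent short runs only directly sample events shorter than their own length, which is why Markov-state-model-style analyses are used to stitch many short trajectories together for slow processes like folding.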
Nonetheless, I'm thankful for both DESMOND and Anton, which helped push the MD community out of its moribund state in the early 2000s. I still don't think MD simulations produce anything that exceeds the opportunity value of the power they consume, and it seems unclear that they ever will, although I would still love to know the answer to the question: could sophisticated classical forcefields with reasonable O(n log n) scaling ever be truly valuable as a virtual screening feature generator/evaluation mechanism?
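On the scaling point: for the short-range part of a forcefield, a cell list already escapes the O(n^2) trap (the O(n log n) term typically comes from PME-style FFTs for long-range electrostatics). A minimal sketch, assuming a non-periodic cubic box — real codes use periodic minimum-image conventions:

    import numpy as np
    from collections import defaultdict

    def neighbor_pairs(pos, box, rc):
        # Bin particles into cells of side >= rc; only the 27 surrounding
        # cells can contain neighbors within the cutoff, so pair finding is
        # ~O(n) at uniform density instead of O(n^2) for the naive loop.
        ncell = max(1, int(box // rc))
        size = box / ncell
        cells = defaultdict(list)
        for i, p in enumerate(pos):
            c = np.minimum((p // size).astype(int), ncell - 1)  # clamp edge atoms
            cells[tuple(c)].append(i)
        rc2 = rc * rc
        pairs = []
        for (cx, cy, cz), members in cells.items():
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    for dz in (-1, 0, 1):
                        for i in members:
                            for j in cells.get((cx + dx, cy + dy, cz + dz), ()):
                                if i < j and np.sum((pos[i] - pos[j]) ** 2) < rc2:
                                    pairs.append((i, j))
        return pairs

    rng = np.random.default_rng(1)
    pos = rng.uniform(0.0, 10.0, size=(1000, 3))
    print(len(neighbor_pairs(pos, box=10.0, rc=1.5)))  # far fewer than n*(n-1)/2 tests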
There's quite a nice plot from a review paper of D. E. Shaw Research that lists the timescales of several biological processes (and compares them to other experimental methods), https://www.annualreviews.org/doi/full/10.1146/annurev-bioph... (Figure 2). Anton has been extremely helpful for studying the basic science of protein dynamics in academia and has been applied in industry (namely at Relay Therapeutics), but drug discovery is a long process, so we haven't seen the fruits of those long simulations yet.

Some references:
10^-3 ns - hydrogen bond vibrations
100+ ns - protein side chains moving
1,000-10,000 ns - protein folding
https://insidehpc.com/2016/02/anton-2-supercomputer-at-psc-w...
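Translating those timescales into step counts makes the gap vivid. A back-of-the-envelope using the 17,000 ns/day and "a few hundred ns/day" figures quoted at the top; the 2 fs timestep is an assumed typical value for constrained-bond MD:

    DT_FS = 2.0                                 # assumed typical timestep, in fs

    events_ns = {
        "hydrogen bond vibration": 1e-3,        # ~1 ps
        "side-chain motion": 1e2,               # 100+ ns
        "protein folding": 1e4,                 # 1,000-10,000 ns
    }
    rates = {"general-purpose cluster": 200.0,  # "a few hundred ns/day"
             "512-node Anton": 17_000.0}        # from the quoted benchmark

    for event, ns in events_ns.items():
        steps = ns * 1e6 / DT_FS                # 1 ns = 1e6 fs
        days = {name: ns / r for name, r in rates.items()}
        print(f"{event}: {ns:g} ns = {steps:.1e} steps; "
              + "; ".join(f"{k}: {v:.3g} days" for k, v in days.items()))

A single 10 µs folding trajectory works out to 5e9 force evaluations: about 50 days on the cluster versus well under a day on Anton, which is the whole case for special-purpose hardware.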
I have a lot of respect for David Shaw. He quit managing his hedge fund day to day, saying something along the lines of finance was making him stupid, and went back to doing something useful. If only more of our elites realized this (and cared to do something useful with themselves).
Me as well. It does seem that the whole point of being a billionaire is to do whatever you want. I can't imagine why so many seem to stick to managing their creations, which after a while can't be much fun.
Most billionaires got their wealth through inheritance. All they know is how to manage the wealth-creation agent that was handed down to them.
Most of the self-made ones, too, have spent a large chunk of their lives perfecting the wealth-generation agent that made them rich. It would be like asking a pro NBA player to also take a shot at being a pro NFL player: it's not what they trained for; they would need to learn a new skill and a new industry, and they would most likely fail anyway.
Only a minority of such billionaires actually end up doing different things; Elon Musk comes to mind. Most are victims of their previous successes.
I'd say this is very much on purpose (I worked there for two summers, also with one of the people in this thread). DESRES is very, very particular about the papers it puts out, so while there is an incredible amount of great science, done by brilliant people mostly poached from academia, only the very top papers ever get published. Many more are written and kept as internal documents; the firm only publishes research it considers truly impactful.
Unlike in academia, there isn't pressure to publish merely okay or average-quality research, since funding is not public and there are no metrics to chase.

But I don't know much about their internals; perhaps they're leasing a good bit of computer time to biotech companies.
Now seems as good a time as any to share a replica of an Antonie van Leeuwenhoek microscope that my father made. The lens he made in the same way as the original: heating up glass to a semi-molten state, then letting a drop fall through the air. By the time it had landed and cooled... voila! A spherical lens.
https://www.dropbox.com/s/in4x3vjysw1o1wc/IMG_0291.JPG?dl=0

https://www.dropbox.com/s/qhs8qf2qw5e4n35/micro_01.jpg?dl=0 From the back

https://www.dropbox.com/s/4vl3rbfkelf1nv9/micro_02.jpg?dl=0 Same, but annotated:
1 = The lens housing
2 = The 'stage'. Basically just a pin that you stick the subject onto
3 = The handle. Using this, the assembly is positioned in front of a strong light.

https://www.dropbox.com/s/4xra81bu2qdsxj9/micro_03.jpg?dl=0 The front, showing the viewing hole
The whole thing actually works. The lens is hit or miss (literally), and it shows. Plenty of chromatic aberration. Basically just a droplet of glass. He tried many times to get it right.
Why did he make it? Well... he has a fascination with old technology. He was a founding member of the British Vintage Wireless Society and has written a book on old radios, as well as one on ancient navigation techniques. He is an old-fashioned polymath. I also have in my possession a replica of Galileo's first telescope, which also works fine.
My master's thesis work was on MD simulations. My setups had around 150k atoms each, and it took months and hundreds of cores to finish any meaningful simulation. I was incredibly jealous of that machine.
But frankly, I am still not convinced of the usefulness of MD studies except in a few cases (docking studies, etc.).