nee1r commented on Energy Predictions 2025 – Casey Handmer's blog   caseyhandmer.wordpress.co... · Posted by u/bilsbie
nee1r · 13 days ago
hmmm wonder if decosting is actually linear vs. discrete jumps in ability (i.e. we might just nail fusion or get boosts in efficiency)
nee1r commented on Nobel Prize in Chemistry 2025   nobelprize.org/prizes/che... · Posted by u/pykello
nee1r · 2 months ago
Love MOFs! Did research about MOFs <=> language modeling a couple years ago and I'm excited to see them getting more coverage https://arxiv.org/abs/2311.07617
nee1r commented on Building the heap: racking 30 petabytes of hard drives for pretraining   si.inc/posts/the-heap/... · Posted by u/nee1r
tarasglek · 3 months ago
but they have multiple head nodes, so is it some distributed setup or just an active/passive type thing?
nee1r · 3 months ago
We have a custom barebones solution that uses a hashring to route the files!
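For readers unfamiliar with the term: a hashring usually refers to consistent hashing, where each file key is hashed onto a circle and routed to the next node clockwise. Below is a minimal sketch of that general technique, not the actual implementation described above. The node names (`head-0` etc.), the file key, the `vnodes` count, and the choice of SHA-256 are all assumptions made for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Illustrative consistent hash ring mapping file keys to node names."""

    def __init__(self, nodes, vnodes=64):
        # Each node gets `vnodes` points on the ring to even out the load.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key):
        # First 8 bytes of SHA-256 as an integer position on the ring.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, key):
        # Route to the first ring point clockwise from the key's hash,
        # wrapping around to the start of the ring if needed.
        idx = bisect.bisect_right(self._hashes, self._hash(key))
        return self._ring[idx % len(self._ring)][1]


# Hypothetical node names and file key.
ring = HashRing(["head-0", "head-1", "head-2"])
target = ring.node_for("videos/shard-000123.tar")
```

The property that makes this attractive for routing files is that adding a node only moves keys onto the new node; keys never shuffle between the existing nodes.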
nee1r commented on Building the heap: racking 30 petabytes of hard drives for pretraining   si.inc/posts/the-heap/... · Posted by u/nee1r
akreal · 3 months ago
How is/was the data written to disks? Something like rsync/netcat?
nee1r · 3 months ago
We use the same nginx rust server to do file writes; it's done via web requests
nee1r commented on Building the heap: racking 30 petabytes of hard drives for pretraining   si.inc/posts/the-heap/... · Posted by u/nee1r
echelon · 3 months ago
If the authors chime in, I'd like to ask what "Standard Intelligence PBC" does.

Is it a public benefit corp?

What are y'all building?

nee1r · 3 months ago
We did want more pictures!! Recently bought a Sony A7III to capture more fun moments like this.

We're working on pretraining computer action models from the ground up, hence the pretraining data cluster. We're a public benefit corp because we think it's important for AGI to be built in the public's interest + are planning on automating a lot of the work done on computers!

nee1r commented on Building the heap: racking 30 petabytes of hard drives for pretraining   si.inc/posts/the-heap/... · Posted by u/nee1r
archmaster · 3 months ago
Had the pleasure of helping rack drives! Nothing more fun than an insane amount of data :P
nee1r · 3 months ago
Thanks for helping!!!
nee1r commented on Building the heap: racking 30 petabytes of hard drives for pretraining   si.inc/posts/the-heap/... · Posted by u/nee1r
huxley_marvit · 3 months ago
damn this is cool as hell. estimate on the maintenance cost in person-hours/month?
nee1r · 3 months ago
Around 2-5 hours/month, mostly powercycling the servers and replacing hard drives
nee1r commented on Building the heap: racking 30 petabytes of hard drives for pretraining   si.inc/posts/the-heap/... · Posted by u/nee1r
jonas21 · 3 months ago
Nice writeup. All of the technical detail is great!

I'm curious about the process of getting colo space. Did you use a broker? Did you negotiate, and if so, how large was the difference in price between what you initially were quoted and what you ended up paying?

nee1r · 3 months ago
We reached out to almost every colocation space in SF/some in Fremont to get quotes. There wasn't a difference between the quote price and what we ended up paying, though we did negotiate terms + one-time costs.
nee1r commented on Building the heap: racking 30 petabytes of hard drives for pretraining   si.inc/posts/the-heap/... · Posted by u/nee1r
RagnarD · 3 months ago
I love this story. This is true hacking and startup cost awareness.
nee1r · 3 months ago
Thanks!! :)

u/nee1r

Karma: 239 · Cake day: April 18, 2020
About
https://neelr.dev hi! nice to see you~