GistNoesis commented on The RAM shortage comes for us all   jeffgeerling.com/blog/202... · Posted by u/speckx
lysace · 13 days ago
Please explain to me like I am five: Why does OpenAI need so much RAM?

2024 production was (according to openai/chatgpt) 120 billion gigabytes. With 8 billion humans that's about 15 GB per person.

GistNoesis · 13 days ago
What they need is not so much memory but memory bandwidth.

For training, their models need a certain amount of memory to store the parameters, and this memory is touched for every example of every iteration. Big models have 10^12 (>1T) parameters, and with typical values of 10^3 examples per batch and 10^6 iterations, they need ~10^21 memory accesses per run. And they want to do multiple runs.

DDR5 RAM bandwidth is 100GB/s = 10^11 bytes/s, graphics RAM (HBM) is 1TB/s = 10^12 bytes/s. By buying the wafers they get to choose which type of memory they get.

10^21 / 10^12 = 10^9 s = 30 years of memory accesses (just to update the model weights); you also need to add a factor of 10^1-10^3 to account for the memory accesses needed for the model computation.

But the good news is that it parallelizes extremely well. If you replicate your 1T parameters 10^3 times, your run time is brought down to 10^6 s = 12 days. But you then need 10^3 * 10^12 = 10^15 bytes of RAM per run for the weight updates and 10^18 for the computation (your 120 billion gigabytes is 10^20, so not so far off).
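A minimal back-of-envelope sketch of the arithmetic above (the parameter count, batch size, iteration count and bandwidth figures are the rough assumptions from this comment, not measured values):

```python
# Back-of-envelope, memory-bandwidth-limited training time (order of magnitude only).
params = 1e12        # ~1T parameters
batch = 1e3          # examples per batch
iters = 1e6          # optimizer iterations per run

accesses = params * batch * iters      # ~1e21 parameter touches per run
hbm_bandwidth = 1e12                   # ~1 TB/s of HBM per device

single_device_seconds = accesses / hbm_bandwidth
print(single_device_seconds / (3600 * 24 * 365))        # ~30 years on one device

replicas = 1e3                         # data-parallel copies of the weights
print(single_device_seconds / replicas / (3600 * 24))   # ~12 days
print(replicas * params)               # ~1e15 bytes of weights (at ~1 byte/param)
```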

Are all these memory accesses technically required? No, if you use other algorithms, but more compute and memory is better if money is not a problem.

Is it strategically good to deprive your competitors of access to memory? In a very short-sighted way, yes.

It's a textbook cornering of the computing market to prevent the emergence of local models, because customers won't be able to buy the minimal RAM necessary to run the models locally, even just for inference (not training). Basically a war on people, where little Timmy won't be able to get a RAM stick to play computer games at Xmas.

GistNoesis commented on I mathematically proved the best "Guess Who?" strategy [video]   youtube.com/watch?v=_3RNB... · Posted by u/surprisetalk
GistNoesis · 19 days ago
In the video, in the continuous version, the game never ends and highlights the "loser" strategy.

When you are behind, the optimal play is to make a gamble, which will most likely leave you even worse off. From the naive winning side, it looks like the loser is just playing a stupid strategy by not following the optimal dichotomy, and that's why they are losing. But in fact they are a "player" doing not only their best, but the best that can be done.

The infinite sum of ever smaller probabilities, like in Zeno's paradox, converges towards a finite value. The inevitable result is that a large fraction of the time you are playing catch-up and will never escape.

You are losing while playing optimally, slowly realising from the probabilities that you are a loser, as evidenced by the score, which will most likely go down even more next round. Most likely the entire future is an endless sequence of ever more desperate-looking losing bets, hoping for the one strike that will most likely never come.

In economics such things are called "traps"; the poverty trap, for example, exhibits similar mechanics. Even though you display incredible ingenuity by playing the optimal game strategy, most of the time you will never escape, and you will need to take even more desperate measures in the future. That's separating the wheat from the chaff, seen from the chaff's perspective, or how you make good villains: because, like Bane in Batman, there are times (the probability is slim but finite) where the gamble pays off, you escape the hell hole you were born in, and you become legend.

If you don't play this optimal strategy you will lose more slowly but even more surely. The optimal strategy is to bet just enough to go from your current situation to the winning side. It's also important not to overshoot: this is not always taking moonshots, but betting just enough to escape the hole, because once out, the probabilities play in your favor.
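Not the Guess Who game itself, but a minimal gambler's-ruin sketch of the same "bet just enough" idea (the win probability, bankroll and target are arbitrary assumptions):

```python
import random

def play(bankroll, target, p_win, bet_fn, rng):
    """Play an unfavorable betting game until ruin or reaching the target."""
    while 0 < bankroll < target:
        bet = min(bet_fn(bankroll, target), bankroll)
        bankroll += bet if rng.random() < p_win else -bet
    return bankroll >= target

def estimate(bet_fn, trials=50_000, bankroll=20, target=100, p_win=0.45):
    rng = random.Random(0)
    wins = sum(play(bankroll, target, p_win, bet_fn, rng) for _ in range(trials))
    return wins / trials

# "Bold" play: bet just enough to reach the target (capped by the bankroll).
print(estimate(lambda b, t: t - b))
# "Timid" play: grind with small bets; the house edge eats you almost surely.
print(estimate(lambda b, t: 1))
```

With these assumed numbers the bold player escapes a few percent of the time, while the timid player essentially never does: losing more slowly, but even more surely.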

GistNoesis commented on Simulating a Planet on the GPU: Part 1 (2022)   patrickcelentano.com/blog... · Posted by u/Doches
dahart · a month ago
Might be worth starting with a baseline where there’s no collision, only advection, and assume higher than 1fps just because this gives higher particles per second but still fits in 24GB? I wouldn’t be too surprised if you can advect 100M particles at interactive rates.
GistNoesis · a month ago
The theoretical maximum rate for 1B-particle advection (just doing p[] += v[]*dt) is 1000GB/s / 24GB = 42 iterations per second. If you only have 100M particles, you can have 10 times more iterations.

But that's without any rendering, and with non-interacting particles, which are extremely boring unless you like fireworks. (You can add a term like v[] += g*dt for free.) And you don't need to store colors for your particles if you can compute the colors from the particle number with a function.
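A minimal PyTorch sketch of that advection-only baseline on a CUDA GPU (the particle count and timing loop are just assumptions to probe the bandwidth bound, not the article's code):

```python
import torch, time

n = 100_000_000                       # 100M particles (~2.4 GB for p and v in float32)
device = "cuda"
p = torch.rand(n, 3, device=device)   # positions
v = torch.randn(n, 3, device=device)  # velocities
g = torch.tensor([0.0, 0.0, -9.81], device=device)
dt = 1e-3

torch.cuda.synchronize()
t0 = time.time()
steps = 100
for _ in range(steps):
    v += g * dt                       # the "free" gravity term
    p += v * dt                       # advection: p[] += v[]*dt
torch.cuda.synchronize()
print(steps / (time.time() - t0), "iterations per second")
```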

Rasterizing is slower, because each pixel of the image might get touched by multiple particles (which means concurrent accesses to the same memory address, which GPUs don't like).

Obtaining the screen coordinates is just a matrix multiply, but rendering the particles in the correct depth order requires multiple passes, atomic operations, or z-sorting. Alternatively you can slice your point cloud, by weighting particles with a peak-shaped function around the desired depth value and using an order-independent reduction like a sum, but the memory accesses are still concurrent.

For the rasterizing, you can also use the space-partitioning indices of the particles to render to a part of the screen independently, without concurrent-access problems. That's called "tile rendering": each tile renders the subset of particles which may fall in it. (There is plenty of literature in the Gaussian Splatting community.)
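A minimal sketch of the binning step behind that tile idea (the screen size, tile size, and per-tile counting are assumptions; a real renderer would also bucket or sort the particle indices per tile):

```python
import torch

H, W, tile = 1088, 1920, 16
xy = torch.rand(1_000_000, 2, device="cuda") * torch.tensor([W, H], device="cuda")

# Integer tile coordinates, then a flat tile index per particle.
tx = (xy[:, 0] // tile).long().clamp_(0, W // tile - 1)
ty = (xy[:, 1] // tile).long().clamp_(0, H // tile - 1)
tile_id = ty * (W // tile) + tx

# Count particles per tile; each tile can then rasterize only its own subset.
counts = torch.bincount(tile_id, minlength=(H // tile) * (W // tile))
print(counts.shape, counts.max().item())
```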

GistNoesis commented on Simulating a Planet on the GPU: Part 1 (2022)   patrickcelentano.com/blog... · Posted by u/Doches
janpmz · a month ago
I wish I had an intuitive understanding of how much I can do with a GPU. E.g. how many points can I move around? A simulation like this would be great for that.
GistNoesis · a month ago
TLDR: 1B particles ~ 3 s per iteration

For examples like particle simulations, on a single node with a 4090 GPU everything running on GPU without memory transfer to the CPU:

- The main bottleneck is memory usage: 24GB available; storing 3 position coordinates + 3 velocity coordinates per particle, at 4 bytes per number (float32), gives a max of ~1B particles.

- Then GPU memory bandwidth: if everything is on the GPU you get between 1000GB/s for global memory accesses and 10000GB/s when the shared-memory caches are hit. The number of memory accesses is roughly proportional to the number of effective collisions between your particles, which is proportional to the number of particles, so around 12-30 accesses per particle (see the optimal sphere-packing number of neighbors in 3D, multiplied by your overlap factor). All in all, for 1B particles you can collide them all and move them in 1 to 10 s.

If you have to transfer things to the CPU, you become limited by the PCI-Express 4.0 bandwidth of 16GB/s. So you can at most move 1B particles to and from the GPU about 0.7 times per second.

Then if you want to store the particles on disk instead of RAM, because your system is bigger, you can either use an M.2 SSD (but you will burn through them quickly), which has a theoretical bandwidth of 20GB/s, so not a bottleneck, or use network storage over 100Gb/s (= 12.5GB/s) Ethernet, via two interfaces, to a parameter server which can be as big as you can afford.

So to summarize so far: 1B particles take 1 to 10 s per iteration per GPU. If you want to do smarter integration schemes like RK4, you divide by 6. If you need 64-bit precision you divide by 2. If you only need 16-bit precision you can multiply by 2.
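A minimal calculator for these rules of thumb (the 24GB, 1000GB/s and 12-30 accesses figures come from this comment; the 0.25 bandwidth-efficiency factor for scattered accesses is an added assumption):

```python
def particle_budget(vram_bytes=24e9, bytes_per_particle=24):
    """Max particles that fit: 3 positions + 3 velocities in float32 = 24 bytes."""
    return vram_bytes / bytes_per_particle

def seconds_per_iteration(n_particles, bandwidth=1000e9, bytes_per_particle=24,
                          accesses_per_particle=30, bandwidth_efficiency=0.25):
    """Bandwidth-bound estimate: each particle's state is touched ~12-30 times per
    step by neighbor interactions, and scattered accesses only reach a fraction
    of peak bandwidth."""
    bytes_moved = n_particles * bytes_per_particle * accesses_per_particle
    return bytes_moved / (bandwidth * bandwidth_efficiency)

n = particle_budget()                      # ~1e9 particles on a 24GB card
print(n, seconds_per_iteration(n))         # lands in the 1-10 s range
```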

The number of particles you need: volume of the box / h^3, with h the diameter of a particle = the finest detail you want to be able to resolve.

If you use an adaptive scheme, most of your particles are close to the surfaces of objects, so O(surface of the objects / h^2), with h = the average resolution of the surface mesh. But adaptive schemes are about 10 times slower.

The precision of the approximation can be bounded by Taylor's formula. SPH is typically order 2, but has issues with boundaries, so to represent a sharp boundary h must be small.

If you want higher order and sharp boundaries, you can use the Finite Element Method instead. But you'll need to tessellate the space with things like Delaunay/Voronoi meshes, and update them as they move.

GistNoesis commented on Leaving Meta and PyTorch   soumith.ch/blog/2025-11-0... · Posted by u/saikatsg
morshu9001 · a month ago
What would stop PyTorch from implementing whatever optimization trick becomes important? Even if it requires a different API.
GistNoesis · a month ago
There are two types of stops: soft stops and hard stops.

- Soft stops are when the dynamic-graph computation overhead is too much, which means you can still calculate, but if you were to write the function manually or with a better framework, you could be 10x faster.

Typical examples involve manually unrolling a loop, or doing kernel fusion. Another typical example is when you have lots of small objects or need to do loops in Python because the code doesn't vectorize well. Or exploiting sparsity efficiently by ignoring the zeros.

- Hard stops are when computing the function becomes impossible, because the memory needed to do the computation in a non-optimal way explodes. Sometimes you can get away with just writing customized kernels.

The typical example where you can get away with it is custom attention layers.

The typical example where you can't get away with it is physics simulations. For example, the force is the gradient of the energy, but you have n^2 interactions between the particles, so if you preserve anything more than 0 memory per interaction during the forward pass, your memory consumption explodes (as sketched below). And typically with things like Lagrangian or Hamiltonian neural networks, where you try to discover the dynamics of an energy-conserving system, you need to be able to differentiate at least three times in a row.
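A minimal sketch of that pattern, assuming a toy softened pairwise energy (the (n, n, 3) difference tensor built here is exactly the O(n^2) intermediate a naive autograd implementation keeps alive for the backward pass):

```python
import torch

def energy(p):
    """Toy pairwise energy; the (n, n, 3) difference tensor is the O(n^2)
    intermediate that autograd keeps alive for the backward pass."""
    diff = p.unsqueeze(0) - p.unsqueeze(1)          # (n, n, 3)
    dist2 = (diff * diff).sum(-1) + 1e-6            # softened squared distances
    inv = dist2.rsqrt()
    return inv.triu(diagonal=1).sum()

p = torch.randn(1000, 3, requires_grad=True)
E = energy(p)
# Force = -dE/dp; create_graph=True keeps the graph so the result can be
# differentiated again (what Lagrangian/Hamiltonian training needs, at an
# even larger memory cost).
force = -torch.autograd.grad(E, p, create_graph=True)[0]
print(force.shape)   # (1000, 3); memory grew as O(n^2), not O(n)
```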

There are also energy-expending stops, where you need to find workarounds to make things work, for example if you want your parameters to change shape during the optimization process, like learning point clouds of growing size; these spread you thin, so they won't be standardized.

GistNoesis commented on Leaving Meta and PyTorch   soumith.ch/blog/2025-11-0... · Posted by u/saikatsg
Uehreka · a month ago
> at the pace of current AI code development, probably one or two years before Pytorch is old history.

Ehhh, I don’t know about that.

Sure, new AI techniques and new models are coming out pretty fast, but when I go to work with a new AI project, they’re often using a version of PyTorch or CUDA from when the project began a year or two ago. It’s been super annoying having to update projects to PyTorch 2.7.0 and CUDA 12.8 so I can run them on RTX 5000 series GPUs.

All this to say: If PyTorch was going to be replaced in a year or two, we’d know the name of its killer by now, and they’d be the talk of HN. Not to mention that at this point all of the PhDs flooding into AI startups wrote their grad work in PyTorch, it has a lot of network lock-in that an upstart would have to overcome by being way better at something PyTorch can never be good at. I don’t even know what that would be.

Bear in mind that it took a few years for Tensorflow to die out due to lock in, and we all knew about PyTorch that whole time.

GistNoesis · a month ago
> a lot of network lock-in that an upstart would have to overcome by being way better at something PyTorch can never be good at

The cost of migrating higher-level code to a newer framework is going to 0. You ask your favorite agent (or intern) to port it and check that the migration is exact. We already see this in the multitude of deep-learning frameworks.

The day there is one optimization trick that PyTorch can't do but another framework can, one which reduces your training cost 10x, PyTorch goes the way of the dodo.

The day an architecture which can't be implemented in PyTorch gets superior performance, it's bye-bye Python.

We already see this with architectures which require real-time rendering, like Gaussian Splatting (Instant NeRF), or with the caching strategies for LLM sequence generation.

PyTorch has 3 main selling points:

- Abstracting away the GPU (or device) specific code, which is needed because of Nvidia's mess: custom optimized kernels, which you are forced to adopt if you don't want to write custom kernels yourself.

That advantage disappears if you don't mind writing optimized kernels because the machine writes them, or if you don't need CUDA because you can't use Nvidia hardware (for example you are in China), or if you use custom silicon, like Groq, and need your own kernels anyway.

- Automatic differentiation. It's one of its weak points, because they went for easy instead of optimal, and shut themselves off from some architectures. A language like Julia, thanks to dynamic low-level compilation, can do things PyTorch won't even dream about (but Julia has its own problems, mainly related to memory allocations). With PyTorch's introduction of the "scan" function [2] we have come full circle back to Theano, TensorFlow's/Keras' ancestor; scan is usually where the bad automatic-differentiation strategy chosen by PyTorch becomes painful.

The optimal solution, as all physics PhDs who wrote simulations know, is writing custom adjoint code via 'source code transformation' or symbolically: it's not hard but very tedious, so it's now a great fit for your LLM (or intern, or PhD candidate running 'student gradient descent'), provided you prove or check that the gradient calculation is correct (see the sketch after this list).

- Cluster orchestration and serialization: a model can be shared with fewer security risks than arbitrary source code, because you only share weights. A model can be split between machines dynamically. But this is also a big weakness, because your code rots as you become dependent on versioning: you are locked to the specific version number your model was trained on.
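A minimal sketch of what hand-written adjoint code looks like when plugged back into PyTorch (the toy function and its adjoint are purely illustrative; gradcheck is the "check your gradient" step):

```python
import torch

class SoftAbs(torch.autograd.Function):
    """Toy primitive y = sqrt(x^2 + 1e-6) with a hand-derived adjoint."""
    @staticmethod
    def forward(ctx, x):
        y = torch.sqrt(x * x + 1e-6)
        ctx.save_for_backward(x, y)
        return y

    @staticmethod
    def backward(ctx, grad_out):
        x, y = ctx.saved_tensors
        # Hand-written adjoint: dy/dx = x / sqrt(x^2 + 1e-6) = x / y
        return grad_out * x / y

x = torch.randn(8, dtype=torch.double, requires_grad=True)
# gradcheck compares the hand-written adjoint against finite differences.
print(torch.autograd.gradcheck(SoftAbs.apply, (x,)))
```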

[2] https://docs.pytorch.org/xla/master/features/scan.html

GistNoesis commented on Leaving Meta and PyTorch   soumith.ch/blog/2025-11-0... · Posted by u/saikatsg
TechnicolorByte · a month ago
Can anyone recommend a technical overview describing the design decisions PyTorch made that led it to win out?
GistNoesis · a month ago
The choice of the dynamic computation graph [1] of PyTorch made it easier to debug and implement, leading to higher adoption, even though running speed was initially slower (and therefore training cost higher).

Other decisions follow from this one.

TensorFlow started static and had to move to dynamic at version 2.0, which broke everything, causing fragmentation between TensorFlow 1, TensorFlow 2, Keras, and JAX.

PyTorch's later compilation of this computation graph erased TensorFlow's remaining edge.
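A minimal illustration of the dynamic-graph point, assuming nothing beyond stock PyTorch 2.x: ordinary Python control flow shapes the graph at runtime, and the same function can later be handed to torch.compile:

```python
import torch

def f(x):
    # Ordinary Python control flow decides the graph structure at runtime,
    # and can be stepped through with a debugger or print statements.
    if x.norm() > 1:
        x = x / x.norm()
    for _ in range(int(x.numel()) % 3 + 1):
        x = torch.tanh(x)
    return x.sum()

x = torch.randn(5, requires_grad=True)
f(x).backward()            # eager, dynamic graph: easy to debug
print(x.grad)

g = torch.compile(f)       # the same function, compiled for speed
print(g(torch.randn(5)))
```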

Is the battle over? From a purely computational point of view, PyTorch's solution is very far from optimal, and billions of dollars of electricity and GPUs are burned every year, but the major players are happy with circular deals to entrench their positions. So at the pace of current AI code development, probably one or two years before PyTorch is old history.

[1] https://www.geeksforgeeks.org/deep-learning/dynamic-vs-stati...

GistNoesis commented on Backpropagation is a leaky abstraction (2016)   karpathy.medium.com/yes-y... · Posted by u/swatson741
drivebyhooting · 2 months ago
I have a naive question about backprop and optimizers.

I understand how SGD is just taking a step proportional to the gradient and how backprop computes the partial derivative of the loss function with respect to each model weight.

But with more advanced optimizers the gradient is not really used directly. It gets per weight normalization, fudged with momentum, clipped, etc.

So really, how important is computing the exact gradient using calculus, vs just knowing the general direction to step? Would that be cheaper to calculate than full derivatives?

GistNoesis · 2 months ago
>computing the exact gradient using calculus

First of all, gradient computation with back-prop (aka reverse-mode automatic differentiation) is exact to numerical precision (except for edge-cases that are not relevant here) so it's not about the way of computing the gradient.

What Andrej is trying to say is that when you create a model, you have freedom of design in the shape of the loss function. And in this design, what matters for learning is not so much the value of the loss function but its slopes and curvature (peaks and valleys).

The problematic case is flat valleys surrounded by steep cliffs (picture the Grand Canyon).

Advanced optimizers in deep learning like Adam are still first-order, with a diagonal approximation of the curvature, which means that in addition to the gradient, the optimizer has an estimate of the scale sensitivity of each parameter independently. So the cheap thing it can reasonably do is modulate the gradient with this scale.
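A minimal Adam-style update sketch to make the per-parameter scale estimate concrete (the hyperparameters are the usual defaults; this is an illustration, not torch.optim.Adam's exact implementation):

```python
import torch

def adam_step(p, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: m tracks the gradient direction (momentum), v tracks a
    per-parameter squared-gradient scale (the diagonal 'curvature' proxy)."""
    m.mul_(b1).add_(grad, alpha=1 - b1)
    v.mul_(b2).addcmul_(grad, grad, value=1 - b2)
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    # The gradient is modulated per parameter by 1/sqrt(v_hat): parameters with
    # large recent gradients get smaller steps, tiny ones get larger steps.
    p.sub_(lr * m_hat / (v_hat.sqrt() + eps))

p = torch.randn(10)
m, v = torch.zeros_like(p), torch.zeros_like(p)
for t in range(1, 2001):
    grad = 2 * p            # gradient of the toy loss ||p||^2
    adam_step(p, grad, m, v, t)
print(p.norm())             # much smaller than at initialization
```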

The length of the gradient vector is often problematic, so what classical optimizers usually do is something called "line search", i.e. determining the optimal step size along this direction. But the cost of doing that is usually 10-100 evaluations of the cost function, which is often not worth the effort in the noisy stochastic context, compared to just taking a smaller step multiple times.

Higher-order optimizers require the loss function to be twice differentiable, so non-linearities like ReLU, which are cheap to compute, can't be used.

Lower-order global optimizers don't even require the gradient, which is useful when the energy-function landscape has lots of local minima (picture an egg box).

GistNoesis commented on Show HN: Strange Attractors   blog.shashanktomar.com/po... · Posted by u/shashanktomar
GistNoesis · 2 months ago
How do I write my custom attractor equation?
GistNoesis commented on EQ: A video about all forms of equalizers   youtube.com/watch?v=CLAt9... · Posted by u/robinhouston
GistNoesis · 2 months ago
Can measurements and calibration be done easily at home to see if your sound system is well calibrated?

I was thinking of playing pink noise on the speakers and recording it with a cheap microphone, or displaying it with the Spectroid app, but the microphone probably has its own frequency response.

Is there an app for that, with each phone model's microphone factory-calibrated?

Is there a way to use known facts about physics, like harmonics having a specific shape (timbre), to help equalize frequencies with respect to each other? Or to calibrate from various microphone positions, so that any cheap microphone can do the trick?
