Readit News
empath-nirvana commented on The Lunacy of Artemis   idlewords.com/2024/5/the_... · Posted by u/feross
just_steve_h · 2 years ago
I find this essay to be well-crafted and compelling.
empath-nirvana · 2 years ago
I actually was put off by the know-it-all nature of the whole thing, as if NASA scientists had totally not considered any of this.
empath-nirvana commented on A word game I built to understand semantic distance   celestineplawrence.itch.i... · Posted by u/celestiallylvd1
empath-nirvana · 2 years ago
I don't understand what I'm supposed to be doing here.
empath-nirvana commented on Computer scientists invent an efficient new way to count   quantamagazine.org/comput... · Posted by u/jasondavies
minkzilla · 2 years ago
There is a big practicality problem I see with this algorithm. The threshold (thresh) defined in the paper relies on the length of the stream. It seems to me that in a scenario where you have a big enough data set to want a solution that doesn't just store every unique value, you don't know the length. I did not make it through all the proofs, but I believe they use the fact that the defined threshold has the length in it to prove error bounds. If I were to use this in a scenario where I need to know error bounds, I would probably ballpark the length of my stream to estimate error bounds and then use the algorithm with a ballpark threshold depending on my system's memory.

Another practical thing is the "exception" if nothing is removed on line 6 in the original algorithm. This also seems needed for the proof, but you would not want it in production, though the chance of hitting it should be vanishingly small, so maybe it's worth the gamble?

Here is my faithful interpretation of the algorithm, followed by a re-interpretation with some "practical" improvements that almost certainly make it impossible to prove correctness.

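    import (
        "bufio"
        "math"
        "math/rand"
    )

    // CountUnique estimates the number of distinct tokens read from scanner.
    // epsilon and delta are the target accuracy and failure probability;
    // m is the stream length, which has to be known up front.
    func CountUnique(scanner *bufio.Scanner, epsilon float64, delta float64, m int) int {
        X := make(map[string]bool)
        p := 1.0
        thresh := int(math.Ceil((12 / (epsilon * epsilon)) * math.Log(8*float64(m)/delta)))

        for scanner.Scan() {
            a := scanner.Text()
            // Forget any earlier decision about a, then keep it with probability p.
            delete(X, a)
            if rand.Float64() < p {
                X[a] = true
            }

            if len(X) == thresh {
                // Buffer is full: drop each element with probability 1/2 and halve p.
                for key := range X {
                    if rand.Float64() < 0.5 {
                        delete(X, key)
                    }
                }
                p /= 2
                if len(X) == thresh {
                    // Nothing was removed in this round (the "exception" case).
                    panic("Error")
                }
            }
        }

        // Scale the sample size back up by the sampling probability.
        return int(float64(len(X)) / p)
    }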

    // CountUnique2 is the "practical" variant: thresh is passed in, based on
    // system memory / estimates, instead of being derived from the stream length.
    func CountUnique2(scanner *bufio.Scanner, thresh int) int {
        X := make(map[string]bool)
        p := 1.0

        for scanner.Scan() {
            a := scanner.Text()
            delete(X, a)
            if rand.Float64() < p {
                X[a] = true
            }

            if len(X) >= thresh { // >= instead of ==, and the panic is removed
                for key := range X {
                    if rand.Float64() < 0.5 {
                        delete(X, key)
                    }
                }
                p /= 2
            }
        }

        return int(float64(len(X)) / p)
    }

I tested it with Shakespeare's work. The actual unique word count is 71,595. With the second algorithm it is interesting to play with the threshold. Here are some examples.

threshold 1000: Mean Absolute Error 2150.44, Root Mean Squared Error 2758.33, Standard Deviation 2732.61

threshold 2000: Mean Absolute Error 1723.72, Root Mean Squared Error 2212.74, Standard Deviation 2199.39

threshold 10000: Mean Absolute Error 442.76, Root Mean Squared Error 556.74, Standard Deviation 555.53

threshold 50000: Mean Absolute Error 217.28, Root Mean Squared Error 267.39, Standard Deviation 262.84

empath-nirvana · 2 years ago
If you don't know the length of the stream in advance, you can just calculate the margin of error when you're done, no?
empath-nirvana commented on Computer scientists invent an efficient new way to count   quantamagazine.org/comput... · Posted by u/jasondavies
mattkrause · 2 years ago
But for this algorithm, you need to know the total length ("m") to set the threshold for the register purges.

Does it still work if you update m as you go?

empath-nirvana · 2 years ago
You actually don't need to do that part in the algorithm. If you don't know the length of the list, you can just choose a threshold that seems reasonable and calculate the margin of error after you're done processing (or, I guess, at whatever checkpoints you want if it's continuous).

In this example, they have the length of the list and choose the threshold to give them a desired margin of error.
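
A rough sketch of what "calculate the margin of error after you're done" could look like, obtained by solving the threshold formula from the Go code earlier on this page for epsilon. The function names `ThreshFor` and `EpsilonFor` are hypothetical, and it is an assumption that the same bound still applies when the threshold is chosen first and the stream length m is only known at the end.

```go
package main

import (
	"fmt"
	"math"
)

// ThreshFor mirrors the threshold formula from the code earlier on the page:
// thresh = ceil(12/epsilon^2 * ln(8m/delta)).
func ThreshFor(epsilon, delta float64, m int) int {
	return int(math.Ceil((12 / (epsilon * epsilon)) * math.Log(8*float64(m)/delta)))
}

// EpsilonFor solves the same formula for epsilon, given the threshold that
// was actually used and the stream length m observed by the end of the run.
// It expects thresh > 0, m > 0 and 0 < delta < 1.
func EpsilonFor(thresh, m int, delta float64) float64 {
	return math.Sqrt(12 * math.Log(8*float64(m)/delta) / float64(thresh))
}

func main() {
	// Hypothetical run: 1,000,000 items processed with a threshold of 10,000.
	fmt.Printf("epsilon is roughly %.4f\n", EpsilonFor(10000, 1000000, 0.05))
}
```

Because the stream length only enters through a logarithm, a ballpark value of m changes the resulting epsilon very little.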

empath-nirvana commented on Computer scientists invent an efficient new way to count   quantamagazine.org/comput... · Posted by u/jasondavies
burjui · 2 years ago
The algorithm uses less memory, but more CPU time because of rather frequent deletions, so it's a tradeoff, not just a generally better algorithm, as the article may suggest.
empath-nirvana · 2 years ago
The list is small, so the cost of deletions should be small.
empath-nirvana commented on Egypt's pyramids may have been built on a long-lost branch of the Nile   nature.com/articles/d4158... · Posted by u/gumby
empath-nirvana · 2 years ago
It makes a lot of sense because obviously having a river there makes the transport of materials a lot easier, but I do wonder how nobody noticed this before.
empath-nirvana commented on Strangely Curved Shapes Break 50-Year-Old Geometry Conjecture   quantamagazine.org/strang... · Posted by u/pseudolus
alistairSH · 2 years ago
Is there a good layman’s explanation of higher dimensions as they relate to this type of problem? I’m trying to envision what that means, which is probably a wrong approach…
empath-nirvana · 2 years ago
Higher dimensions in general?

An n-dimensional space is just a collection of points, each defined uniquely by a set of n numbers. The semantic meaning of those numbers doesn't really matter. It might be like actual physical space, but it could just as well be something like "time" and "the price of big macs". We have a bunch of mathematical operations that work well on 2- or 3-dimensional space that correlate nicely with our physical intuitions of 'curvature' and 'holes', and that still work perfectly well in more generalized forms in higher dimensions.

I'm not really sure it's that useful to try and visualize what it means in higher dimensions, to be honest.
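
As a small illustration of "a point is just n numbers", here is a sketch in Go. The `Point` type and `Distance` function are made up for this example, and Euclidean distance is only one of the operations that generalizes from 2 or 3 dimensions; it stands in for the more involved notions like curvature.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// Point is a point in n-dimensional space: nothing more than n numbers.
type Point []float64

// Distance returns the Euclidean distance between two points of the same
// dimension. The formula is the same whether n is 2, 3 or 40.
func Distance(a, b Point) (float64, error) {
	if len(a) != len(b) {
		return 0, errors.New("points have different dimensions")
	}
	var sum float64
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return math.Sqrt(sum), nil
}

func main() {
	// Two hypothetical points in 4-dimensional space.
	a := Point{0, 0, 0, 0}
	b := Point{1, 1, 1, 1}
	d, err := Distance(a, b)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(d) // prints 2
}
```

The code never needs to picture the fourth coordinate; it just loops over one more number.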

empath-nirvana commented on It’s an age of marvels   blog.plover.com/tech/its-... · Posted by u/pavel_lishin
empath-nirvana · 2 years ago
The letters you're referring to were written in 1753 and 1755, and as he got older, his views on race softened to the point that he was an outright abolitionist by the end of his life. In 1763 he wrote the following in a letter:

This is chiefly to acquaint you, that I have visited the Negro School here in Company with the Revd. Mr. Sturgeon and some others; and had the Children thoroughly examin’d. They appear’d all to have made considerable Progress in Reading for the Time they had respectively been in the School, and most of them answer’d readily and well the Questions of the Catechism; they behav’d very orderly, showd a proper Respect and ready Obedience to the Mistress, and seem’d very attentive to, and a good deal affected by, a serious Exhortation with which Mr. Sturgeon concluded our Visit. I was on the whole much pleas’d, and from what I then saw, have conceiv’d a higher Opinion of the natural Capacities of the black Race, than I had ever before entertained. Their Apprehension seems as quick, their Memory as strong, and their Docility in every Respect equal to that of white Children. You will wonder perhaps that I should ever doubt it, and I will not undertake to justify all my Prejudices, nor to account for them. ---

I'm not sure that really excuses being so racist that he thought _Germans_ weren't sufficiently white for America in his 50s, but he did change his views over time.

empath-nirvana commented on Americans are choking on surging fast-food prices. "I can't justify the expense"   cbsnews.com/news/mcdonald... · Posted by u/koolba
empath-nirvana · 2 years ago
This is just a consequence of the inflation countermeasures working.

During the inflationary period, people were flush with cash, demand increased, there were shortages, everyone raised prices, profits surged, companies hired workers for higher wages, people got more money, etc. Everyone was mad because they got big raises, which were obviously the result of all their hard work which corporations suddenly saw and appreciated for a large percentage of people all at once, and _also_ "greedy corporations" suddenly en masse decided that they no longer wanted to be charitable enterprises and decided to raise prices to steal money from the pockets of hard-working Americans.

Or, you know, there was a bunch of inflation and wages and prices went up in parallel.

And now money is no longer flowing into the economy, some companies went too far raising prices anticipating more inflation, and now they're losing sales and that's hurting profits, and they're going to end up cutting prices to increase sales and maximize profits again.

Nature is healing.

I think a lot of people have the impression that inflation reduces how much stuff people can afford, and generally it's fairly neutral in that respect. There's a certain amount of production and a certain amount of demand, and in general it will balance out, and no matter what's going on with inflation people are gonna be able to afford the same amount of stuff. I think people had this idea that if we got inflation under control, suddenly everyone would be able to afford to buy all the stuff they wanted to buy, and they just can't.

The main reason inflation is bad for most people is instability: you have to keep getting raises to keep up with prices. You get a raise, you can suddenly buy some stuff you couldn't before -- prices go up and now you can't again. Then you get a raise, can afford to buy a bunch of stuff, and prices go up and now you can't again. (Not to mention the toll it takes on savings, but even that isn't that bad if you own stock instead of holding cash, because asset prices inflate, also.)

empath-nirvana commented on AlphaFold 3 predicts the structure and interactions of life's molecules   blog.google/technology/ai... · Posted by u/zerojames
SJC_Hacker · 2 years ago
What if it turns out that nature simply doesn't have nice, neat models that humans can comprehend for many observable phenomena?
empath-nirvana · 2 years ago
I read an article about the "unreasonable effectiveness of mathematics" which argued that it was basically the result of a drunk looking for his keys under a lamp post because that's where the light is. We know how to use math to model parts of the world, and everywhere we look, there's _something_ we can model with math, but that doesn't mean that that's all there is to the universe. We could be understanding .0000001% of what's out there to understand, and it's the stuff that's amenable to mathematical analysis.

u/empath-nirvana

Karma: 2468 · Cake day: April 17, 2023