cirpis (u/cirpis) - Readit News

cirpis commented on The Theory of Topos-Theoretic 'Bridges' – A Conceptual Introduction (2016) glass-bead.org/article/th... · Posted by u/ebcode

wesselbindt · a year ago

> Surely if the technique is so powerful, there ought to be easy examples of it aplenty

Some examples of insanely powerful and ubiquitous concepts which do not have easy examples:

- algebraic stacks

- QFT

- _general_ relativity

There's this idea that everything should somehow be explainable to my grandmother, but this idea is never presented with any justification, and it seems to me that there are counterexamples aplenty. And something irks me about the idea. I feel like it comes from the same place as people who've done a 5-minute search on Google feeling like they can take on experts who've spent decades studying a subject.

cirpis · a year ago

This seems like a needlessly defensive answer. If there are no examples, then just say so. It is mathematics after all; some of it is beautiful to explore just for its own sake, no examples or applications necessary.

But claiming a technique is a bridge between disparate areas of mathematics and then failing to give concrete examples of such bridging is a bit odd, don't you think?

For what it's worth, the book "Seven Sketches in Compositionality" (https://arxiv.org/abs/1803.05316) has a chapter on topos theory which provides a good introduction with some simple examples!

cirpis commented on A New Coefficient of Correlation arxiv.org/abs/1909.10140... · Posted by u/malshe

zaptheimpaler · 4 years ago

How is it possible to make a general coefficient of correlation that works for any non-linear relationship? Say if y=sha256(x), doesn't that mean y is a predictable function of x, but it's statistically impossible to tell from looking at inputs/outputs alone?

cirpis · 4 years ago

Two things: 1. The result is asymptotic, i.e. it holds as the number of samples approaches infinity.

2. The result is an "almost surely" result, i.e. in the collection of all possible infinite samples, the set of samples for which it fails has measure 0. In nontechnical terms this means that it works for typical random samples and may not work for handpicked counterexamples.

In our particular case let f = SHA-256. Then X must be discrete, i.e. a natural number. Now the particulars depend on the distribution of X, but the general idea is that since we have discrete values, the probability that we get an infinite sample where the values tend to infinity is 0. So in a typical sample there are going to be infinitely many ties in x, and furthermore most x values aren't too large (in a way you can make precise), so the tie factors l_i dominate since there just aren't that many distinct values encountered in total. And so we get that the coefficient tends to 1.
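A minimal Python sketch of this argument, under stated assumptions: the tie-aware formula is the one given in the paper, ξn = 1 − n Σ|r_{i+1} − r_i| / (2 Σ l_i (n − l_i)), with r_i the number of j with Y_j ≤ Y_i and l_i the number of j with Y_j ≥ Y_i, taken after sorting the pairs by X. The geometric distribution for X, the sample size, and the helper names (`xi_with_ties`, `sha256_int`, `sample_geometric`) are illustrative choices, not anything from the paper or the XICOR package. Ties in x are left in stable-sort order rather than broken at random, which should not matter here because tied x values share the same y.

```python
import hashlib
import random
from bisect import bisect_left, bisect_right


def sha256_int(x):
    """Interpret the SHA-256 digest of the decimal string of x as an integer."""
    return int.from_bytes(hashlib.sha256(str(x).encode()).digest(), "big")


def xi_with_ties(xs, ys):
    """Tie-aware xi_n: 1 - n*sum|r[i+1]-r[i]| / (2*sum l[i]*(n-l[i]))."""
    n = len(xs)
    order = sorted(range(n), key=lambda i: xs[i])  # sort pairs by x (stable)
    y_by_x = [ys[i] for i in order]
    y_sorted = sorted(ys)
    r = [bisect_right(y_sorted, y) for y in y_by_x]      # #{j : y_j <= y_i}
    l = [n - bisect_left(y_sorted, y) for y in y_by_x]   # #{j : y_j >= y_i}
    num = n * sum(abs(r[i + 1] - r[i]) for i in range(n - 1))
    den = 2 * sum(li * (n - li) for li in l)
    return 1 - num / den


def sample_geometric(rng, p=0.3):
    """Number of failures before the first success (an assumed law for X)."""
    x = 0
    while rng.random() > p:
        x += 1
    return x


rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [sample_geometric(rng) for _ in range(5000)]
ys = [sha256_int(x) for x in xs]
xi_hash = xi_with_ties(xs, ys)
print(xi_hash)
```

With only a few dozen distinct x values among 5000 draws, the rank differences vanish inside each block of tied x values, so the printed value should sit close to 1 even though the hash looks random.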

cirpis commented on A New Coefficient of Correlation arxiv.org/abs/1909.10140... · Posted by u/malshe

roenxi · 4 years ago

Either Chatterjee has made a mistake and that is wrong, or that definition has some extremely precise meaning, because:

    > counterexample = tibble(x = 1:6, y = c(1.1,-1.2,1.3,-1.4,1.5,-1.6))
    > counterexample
    # A tibble: 6 × 2
          x     y
      <int> <dbl>
    1     1   1.1
    2     2  -1.2
    3     3   1.3
    4     4  -1.4
    5     5   1.5
    6     6  -1.6
    > XICOR::calculateXI(xvec = counterexample$x, yvec=counterexample$y)
    [1] -0.2857143
    > -0.2857143 == 1
    [1] FALSE

And I assume you could get data like that in the wild by sampling a sine wave as a time series.

I think the abstract is a bit strong, it probably means "converges to" for a large repeating sample.

cirpis · 4 years ago

There is no mistake in the definition, and this is all elaborated upon on page 4 of the article.

Quote: "On the other hand, it is not very hard to prove that the minimum possible value of ξn(X, Y) is −1/2 + O(1/n), and the minimum is attained when the top n/2 values of Yi are placed alternately with the bottom n/2 values. This seems to be paradoxical, since Theorem 1.1 says that the limiting value is in [0, 1]. The resolution is that Theorem 1.1 only applies to i.i.d. samples. Therefore a large negative value of ξn has only one possible interpretation: the data does not resemble an i.i.d. sample."

You propose Y = f(X) = (-1)^(X+1)*(1+X/10) as your functional relation, which is measurable if X is discrete, and indeed if we let x_n = n, then the limiting value of the estimator will be -1/2, not 1.
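A rough Python check of that limit, assuming the paper's no-ties formula ξn = 1 − 3 Σ|r_{i+1} − r_i| / (n² − 1) and writing the relation with the sign convention that reproduces the six-point sample above (y = 1.1 at x = 1). The function names `xi_no_ties` and `alternating` are made up for this sketch. For this sequence the rank differences are 1, 2, ..., n−1, so the value should equal 1 − 3n / (2(n + 1)), which is about −0.2857 at n = 6 and tends to −1/2.

```python
def xi_no_ties(ys):
    """xi_n for y values listed in increasing-x order, assuming no ties."""
    n = len(ys)
    # rank[y] = position of y among all y values, starting at 1
    rank = {y: k + 1 for k, y in enumerate(sorted(ys))}
    r = [rank[y] for y in ys]
    total = sum(abs(r[i + 1] - r[i]) for i in range(n - 1))
    return 1 - 3 * total / (n * n - 1)


def alternating(n):
    """y_x = (-1)^(x+1) * (1 + x/10) for x = 1..n (the handpicked sample)."""
    return [(-1) ** (x + 1) * (1 + x / 10) for x in range(1, n + 1)]


for n in (6, 100, 10_000):
    print(n, xi_no_ties(alternating(n)))
```

The n = 6 row matches the value reported by `XICOR::calculateXI` in the parent comment, and the larger rows drift toward −1/2 rather than toward 1.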

However, this is just a particular value of the estimator on a particular sample of (x, y). The theorem is an "almost surely" statement, which means that it can fail only on a set of samples with probability 0.

Indeed, if we actually picked a random sample of (X, f(X)) with your f, then regardless of the distribution of X, since X is discrete, we would expect to see many ties (infinitely many ties as the number of samples goes to infinity). This would mean that |r_{i+1}-r_i| is 0 for all but finitely many i, and so in the limit the estimator would be 1.
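To make the contrast concrete, here is a small Python sketch that draws an i.i.d. sample instead of the handpicked one. It assumes X is uniform on {1, ..., 6} (an arbitrary choice for illustration), uses the same alternating relation with the sign convention that matches the six-point sample, and applies the paper's tie-aware formula ξn = 1 − n Σ|r_{i+1} − r_i| / (2 Σ l_i (n − l_i)). The names `f` and `xi_with_ties` are hypothetical helpers, not part of any package.

```python
import random
from bisect import bisect_left, bisect_right


def f(x):
    """Alternating relation: 1.1, -1.2, 1.3, -1.4, ... for x = 1, 2, 3, 4, ..."""
    return (-1) ** (x + 1) * (1 + x / 10)


def xi_with_ties(xs, ys):
    """Tie-aware xi_n computed on pairs sorted by x (stable order within ties)."""
    n = len(xs)
    order = sorted(range(n), key=lambda i: xs[i])
    y_by_x = [ys[i] for i in order]
    y_sorted = sorted(ys)
    r = [bisect_right(y_sorted, y) for y in y_by_x]      # #{j : y_j <= y_i}
    l = [n - bisect_left(y_sorted, y) for y in y_by_x]   # #{j : y_j >= y_i}
    num = n * sum(abs(r[i + 1] - r[i]) for i in range(n - 1))
    den = 2 * sum(li * (n - li) for li in l)
    return 1 - num / den


# Handpicked sample: each x appears exactly once, in order.
xi_handpicked = xi_with_ties(list(range(1, 7)), [f(x) for x in range(1, 7)])

# i.i.d. sample: x drawn uniformly from {1,...,6}, so ties are everywhere.
rng = random.Random(0)
xs = [rng.randint(1, 6) for _ in range(2000)]
xi_iid = xi_with_ties(xs, [f(x) for x in xs])

print(xi_handpicked, xi_iid)
```

The handpicked value is the same negative number as in the parent comment, while the i.i.d. value should land close to 1, since only the five transitions between blocks of tied x values contribute nonzero rank differences.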

This also covers the case of f being a hash function, as mentioned before. Worse yet, a hash only has finitely many different values, so once again there are infinitely many ties.

The way the estimator is defined, it will take care of the (X, f(X)) case fairly easily: for a typical sample you will get x values that cluster, and for a measurable function this implies that the resulting values will be close together, so the rank differences will be small.

This discussion probably wasn't included in the abstract since it's fairly simple measure theory which most experts reading the article will be intimately familiar with.