Readit News
vouaobrasil · 2 years ago
I have a PhD in pure math, and I am fairly well-versed in algebraic topology and I've studied topological data analysis.

The idea is that you take your data points and expand them in space by putting circles around them. If there are features that are persistent over large variations of the sizes of the circles (or balls in higher-dimensional space), then those persistent features are said to be important reflections of the structure of your data.
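
A minimal sketch of the growing-circles idea for the simplest feature, connected components. It assumes Euclidean points and treats two balls of radius r as touching once r reaches half the distance between their centers. The function name `h0_persistence` and the sample points are made up for illustration; this is not taken from any particular TDA library.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Return the radii at which connected components merge as balls grow.

    Two balls of radius r overlap once r >= distance / 2, so each edge
    enters the filtration at half the distance between its endpoints.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by the radius at which they appear.
    edges = sorted(
        (math.dist(points[i], points[j]) / 2, i, j)
        for i, j in combinations(range(n), 2)
    )

    deaths = []
    for radius, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Two components merge here, so one of them "dies".
            parent[root_i] = root_j
            deaths.append(radius)
    return deaths  # n - 1 merge radii; the last component never dies


# Hypothetical data: two tight clusters that are far apart.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
deaths = h0_persistence(points)
```

In this toy example three merges happen at a radius of about 0.05 and the last one only at about 3.5. That long gap is the "persistent" feature: the data looks like two clusters over a wide range of radii.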

The persistent features are characterized by homology, which is a tool that measures the basic topological shape of your data. Homology is actually extremely easy to calculate and doesn't require much advanced math. In fact, for data analysis purposes, you can focus purely on combinatorial aspects, in which case no advanced math is required.
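
To illustrate the purely combinatorial side, here is a rough sketch that computes Betti numbers of a small simplicial complex from the ranks of its boundary matrices. It assumes coefficients mod 2 (which avoids sign bookkeeping) and a complex given by its maximal simplices as tuples of vertex labels. The helper names `rank_gf2` and `betti_numbers` are invented for this example.

```python
from itertools import combinations


def rank_gf2(rows):
    """Rank of a 0/1 matrix over GF(2); each row is an int bitmask."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot  # lowest set bit serves as the pivot column
        rows = [r ^ pivot if r & low else r for r in rows]
    return rank


def betti_numbers(maximal_simplices):
    """Mod-2 Betti numbers [b0, b1, ...] of a simplicial complex."""
    # Close under taking faces so every face of a simplex is present.
    simplices = set()
    for s in maximal_simplices:
        for k in range(1, len(s) + 1):
            simplices.update(combinations(sorted(s), k))

    top = max(len(s) for s in simplices)
    by_dim = [sorted(s for s in simplices if len(s) == k + 1)
              for k in range(top)]

    # ranks[k] = rank of the boundary map from k-simplices to (k-1)-simplices.
    ranks = [0]
    for k in range(1, top):
        index = {s: i for i, s in enumerate(by_dim[k - 1])}
        rows = []
        for s in by_dim[k]:
            mask = 0
            for face in combinations(s, k):
                mask |= 1 << index[face]
            rows.append(mask)
        ranks.append(rank_gf2(rows))
    ranks.append(0)  # nothing above the top dimension

    # b_k = (number of k-simplices) - rank(d_k) - rank(d_{k+1})
    return [len(by_dim[k]) - ranks[k] - ranks[k + 1] for k in range(top)]


hollow_triangle = [(0, 1), (1, 2), (0, 2)]  # a loop
filled_triangle = [(0, 1, 2)]               # the loop is filled in
hollow = betti_numbers(hollow_triangle)
filled = betti_numbers(filled_triangle)
```

The hollow triangle has one component and one loop, while the filled triangle has one component and no loop. Nothing beyond linear algebra on 0/1 matrices is needed.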

In my opinion, it's a pretty idea and seems logical, but in practice it really seems to be limited in a few ways in terms of practical applications. For one, it often is weaker than more direct methods such as good experimental design and other cluster data methods that can be more directly used to tease out causal relationships.

For another, it seems most useful in the domain of comparing data changing over time or some other variable where the fundamental topological structure changes in high-dimensional space in tidy ways. This isn't the case with a lot of real-world data.

In practical terms, you can think of this in the following way: there is only so much you can get out of data. And often, what you do need is not coarse topological features, which rarely have intrinsic meaning.

So my overall opinion of it is that while it might have some limited applications in highly specific fields, it will never become a general method that could be used in 99.9% of data science applications. (To be FAIR though, I also believe that a LARGE proportion of data science is also kind of useless too...data science has its fair share of snake oil.)

jebarker · 2 years ago
I also have a PhD in Algebraic Topology and feel similarly. One application I'd like to see explored more is the use of differentiable topological invariants as loss functions for training neural nets. Loss functions to measure topological similarity don't really exist in practice yet and could be very useful.
the88doctor · 2 years ago
I started my PhD doing algebraic topology and tried using TDA for a couple industry projects before eventually switching to work on quantum computing algorithms because I decided TDA wasn't that useful. It's definitely a solution in search of a problem, and while there do seem to be a couple interesting use cases (such as genetic data analysis), I completely agree with you that there are better alternatives in most situations.
maddimini · 2 years ago
What do you mean by "other cluster data methods that can be more directly used to tease out causal relationships"? What clustering approaches can tease out causal relationships? Thanks in advance!
rg111 · 2 years ago
k-Means Clustering mainly. Also t-SNE and PCA. I think it is safe to call PCA a kind of clustering in the space of chosen components, although it is classified under dimensionality reduction techniques.
pid-1 · 2 years ago
This is a maths book with zero Data Science content.

If you're wondering how Algebraic Topology can be used to solve real world problems, you will still wonder 300 pages later.

usgroup · 2 years ago
Looking through the contents I thought the same thing. Seems like a lot of theory with only a handful of attached methods.

I think one might be better off just reading about UMAP, Mapper and persistent homology directly. At least the last two are very simple and don’t require advanced maths.

bigbillheck · 2 years ago
It's "Algebraic Topology for Data Scientists", not "Even More Data Science with some Algebraic Topology Included for Data Scientists".
hgsgm · 2 years ago
"This book gives a thorough introduction to topological data analysis (TDA), the application of algebraic topology to data science."
srvmshr · 2 years ago
For those interested, there is a really nice course with lecture notes taught by Vidit Nanda (Oxford).

https://people.maths.ox.ac.uk/nanda/cat/

Unlike what people commented elsewhere, TDA has growing applications in computer graphics, 3D reconstruction, and computer vision.

PS: I am told the new version of this course is narrated/illustrated by Robert Ghrist. He's famous for his foundational calculus course

northzen · 2 years ago
Can you give some hints or examples of TDA in 3D reconstructions and computer vision? I'm interested in this topic, especially exploring it from a topological view.
srvmshr · 2 years ago
Sorry about late reply. I wish there was some way to get notifications of comments. Here's a survey paper which connects TDA & Graphics in several ways:

https://arxiv.org/abs/2212.09703

godelski · 2 years ago
322 pages and no links or bookmarks! Come on!

Instructions to add:

- Click other formats

- Download source

- Add this to the includes

```
\usepackage[bookmarks,linktocpage=true]{hyperref}
\hypersetup{
    colorlinks,
    linktoc=all,
    linkcolor={blue},
}
\makeindex
```

- recompile

Presto, you got the same document except now we can click the links in the table of contents, the citations, and there's a bookmark section on the side that allows us to navigate the document instead of just scrolling.

ConnorMooneyhan · 2 years ago
> most mathematicians have never been exposed to it

I'm not sure what graduate math programs don't have Algebraic Topology. It's pretty fundamental.

bigbillheck · 2 years ago
My math program didn't require graduate topology for most students. They offered a topology course, but most of it was general, not algebraic.

srean · 2 years ago
There have been many comments that have pointed out the poverty of real ML applications that ride on algebraic topology.

The problem is this -- topological spaces on their own have a lot less structure than, say, a vector space or a metric space. Most of the real-world data that ML applications deal with today has more structure and is thus handled using vector space methods or metric space methods or manifold methods. It is also the right thing to do. When you have structure you should use it.

TDA would be useful for data sets where the members have no clear analogue of a distance, or the distances cannot be trusted, or where vector embeddings do not make sense.

outlace · 2 years ago
If you're interested in algebraic topology to do topological data analysis, I wrote a series of blog posts about it that I believe is quite accessible even to those with limited math backgrounds: http://outlace.com/TDApart1.html
renatyv · 2 years ago
Don't get me wrong, I like math and had fun skimming through the book. But 4 pure math chapters to get to "growing circles"? Really? It feels like these algorithms can be explained using much simpler terms.