Readit News

it_does_follow commented on Markov Chains for programmers   czekster.github.io/markov... · Posted by u/raister
klysm · 4 years ago
Markov chain Monte Carlo is incredibly useful and widely applied.
it_does_follow · 4 years ago
Not to mention that the entire class of Markov chain Monte Carlo techniques forms only a subset of the general uses for Markov chains.

Markov chains form the basis of n-gram language models, which are still useful today.

Markov chains are also the basis of the PageRank algorithm.

Hidden Markov Models (which are just an extension of Markov chains to have unobserved states) are a powerful and commonly used time series model found all over the place in industry.

In the pre-deep-learning era, Markov chains (and HMMs in particular) had very widespread usage in speech processing.

They are probably one of the most practical statistical techniques out there (outside of obvious examples like linear models).
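
To make the PageRank point concrete, here is a minimal sketch of PageRank computed as the long-run distribution of a "random surfer" Markov chain. The three-page link graph, the `pagerank` function name, and the 0.85 damping factor are illustrative assumptions, not anything from the linked article.

```python
# Illustrative sketch: PageRank as the stationary distribution of a Markov chain.
# The link graph and the damping factor below are made-up assumptions.

def pagerank(links, damping=0.85, iterations=100):
    """Approximate PageRank by repeatedly applying the chain's transition step."""
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from the uniform distribution
    for _ in range(iterations):
        # With probability (1 - damping) the surfer jumps to a random page.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            # A page with no outgoing links spreads its rank over all pages.
            targets = outgoing if outgoing else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical graph: a links to b and c, b links to c, c links to a.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

Each pass of the loop is one step of the Markov chain applied to the whole probability distribution, so the result is just the distribution the chain settles into.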

it_does_follow commented on Meta's A.I. exodus: Top talent quits as lab tries to keep pace with rivals   cnbc.com/2022/04/01/metas... · Posted by u/kjhughes
personjerry · 4 years ago
Facebook AI Research (FAIR) is pretty reputable in the AI/ML world
it_does_follow · 4 years ago
Being reputable and being a vanity division are not mutually exclusive.

You could argue that at its peak Bell Labs was a vanity division. That research may have changed the world, but very little of it likely ended up benefiting AT&T in any major way financially. It's telling that once AT&T was broken up, Bell Labs, while existing in some form for years after, was never reestablished.

it_does_follow commented on Meta's A.I. exodus: Top talent quits as lab tries to keep pace with rivals   cnbc.com/2022/04/01/metas... · Posted by u/kjhughes
it_does_follow · 4 years ago
My guess is this is ultimately about money.

Facebook/Meta stock lost 30% of its value a few months back and may not recover any time soon.

Many of these people are high level, so I'm guessing (based on a quick skim of levels.fyi) about 50% or more of their TC comes from stock.

This amounts to an effective 15% pay cut in a year with record inflation.

I think most of us would leave our job over a 15% pay cut, especially if we were well established in the field. On top of this, it loosens those golden handcuffs quite a bit for anyone who was on the fence about being employed by Facebook but couldn't say no to the comp.
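
The 15% figure is just the stock share of compensation multiplied by the stock drop. A quick sketch of that arithmetic, assuming the cash portion is unchanged and the stock portion moves one-for-one with the share price (the function name is hypothetical):

```python
def effective_pay_change(stock_fraction, stock_change):
    """Change in total compensation when only the stock portion changes in value.

    Assumes the cash portion is unchanged and the stock portion tracks the
    share price exactly; both are simplifications.
    """
    return stock_fraction * stock_change

# Figures from the comment: about 50% of TC in stock, stock down 30%.
change = effective_pay_change(stock_fraction=0.50, stock_change=-0.30)
print(f"{change:.0%}")  # prints "-15%"
```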

it_does_follow commented on Reasons to not use PCA for feature selection   blog.kxy.ai/5-reasons-you... · Posted by u/leonry
jstx1 · 4 years ago
For me PCA is strictly in a bucket labeled "this comes up a lot in data science interviews but you never actually need to use it at work".
it_does_follow · 4 years ago
You've never had to do any kind of factor analysis in your work or done any searching for latent variables that map to customer/stakeholder questions? Given the number of people I've worked with who are interested in modeling "engagement", I find this hard to believe.

PCA is an incredibly valuable tool that I've used in most jobs I've had. It's just a terrible idea as a default part of a feature engineering pipeline (which is what the author is talking about in terms of "feature selection"), for reasons outlined in this article.

I suggest you don't be quite so quick to dismiss important concepts in this area, and before criticizing this post, at least read through it (I noticed that your comment misunderstanding what the author means by "feature selection" is the top comment here).

it_does_follow commented on Reasons to not use PCA for feature selection   blog.kxy.ai/5-reasons-you... · Posted by u/leonry
jstx1 · 4 years ago
The only reason you need: PCA is not a feature selection algorithm

Is the author misunderstanding something very basic or are they deliberately writing this way for clicks and attention? I can see that they have great credentials so probably the latter? It's a weird article.

it_does_follow · 4 years ago
It's very clear if you read the article that what the author is calling "feature selection" might be better termed "feature generation". He explicitly calls out what he means in the post:

> When used for feature selection, data scientists typically regard z^p:=(z_1,…,z_p) as a feature vector than contains fewer and richer representations than the original input x for predicting a target y.

I don't even think this is necessarily incorrect terminology, especially given the author's background of working primarily for Google and the like. It's the difference between considering feature selection as "choosing from a list of the provided features" vs "choosing from the set of all possible features". The author's term makes perfect sense given the latter.

PCA is used for this all the time in the field. There have been an astounding number of presentations I've seen where people start with PCA/SVD as the first round of feature transformation. I always ask "why are you doing that?" and the answer is always mumbling with shoulder shrugging.

This is a solid post and I find it odd that you try to dismiss it as either ignorant or clickbait, when a quick skim of it rules out both of these options.
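
To illustrate the two readings of "feature selection", here is a rough sketch contrasting picking an existing column with generating a new feature z_1 via PCA. The toy data and the `top_principal_component` helper are assumptions for illustration (plain power iteration on the covariance matrix), not the article's code.

```python
import math

def top_principal_component(rows, iterations=200):
    """First principal direction via power iteration on the sample covariance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # arbitrary starting direction
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return means, v

# Made-up toy data: two strongly correlated input columns.
x = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]

# "Choosing from a list of the provided features": keep column 0 as-is.
selected = [row[0] for row in x]

# "Choosing from the set of all possible features": z_1 is a new feature,
# a linear combination of every (centered) input column.
means, v = top_principal_component(x)
z1 = [sum((row[j] - means[j]) * v[j] for j in range(len(v))) for row in x]

print("weights on original columns:", [round(c, 3) for c in v])
```

Note that z_1 is built without ever looking at a target y, which is a large part of why using it as a default first step deserves the "why are you doing that?" question.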

it_does_follow commented on Word2Vec Explained. Explaining the Intuition of Word2Vec   towardsdatascience.com/wo... · Posted by u/ColinWright
it_does_follow · 4 years ago
Am I alone in really disliking Towards Data Science?

While their articles always look nice, their content is all written quickly by data scientists wanting to polish their resume with the ultimate aim of rapidly generating content for TDS that will match every conceivable data science related search. This post clearly exists solely so that TDS can get the top spot for "Word2vec explained" (which they have). As evidence of this tactic you can see that there already is a TDS post "Word2vec made easy" [0], offering nothing substantially different than this one.

The problem is that the content is almost never useful; it just looks nice at first skim through. The authors, through no real fault of their own, are just eager novices who rarely have a new perspective to add to a topic. It's not uncommon to find huge conceptual errors (or at least gaps) in the content there.

I personally encourage everyone at every level to write about what they can, but the issue is that TDS has manipulated this population of eager data scientists in order to dominate search results on nearly every single topic they can cover related to DS, which has made searching for anything tedious.

Compare this post to the fantastic work of Jay Alammar [1]. Jay's post is truly excellent, covering a lot of interesting details about word2vec and providing excellent visuals as well.

I'm assuming TDS will fold as soon as DS stops being a "hot" topic (which I think will be in the relatively near future), and will personally be glad to see the web rid of their low-signal blog spam.

0. https://towardsdatascience.com/word2vec-made-easy-139a31a4b8...
1. https://jalammar.github.io/illustrated-word2vec/

it_does_follow commented on The counterintuitive rise of Python in scientific computing (2020)   cerfacs.fr/coop/fortran-v... · Posted by u/leonry
jrochkind1 · 4 years ago
As a Rubyist, it makes me sad that Python ended up here rather than Ruby. And I sometimes wonder why.

> As the name suggests, numeric data is manipulated through this package, not in plain Python, and behind the scenes all the heavy lifting is done by C/C++ or Fortran compiled routines.

So I wonder, was it easier to write C/C++ or Fortran compiled extensions in Python than it was in Ruby?

it_does_follow · 4 years ago
> And I sometimes wonder why.

Numpy.

I honestly think it all boils down to numpy being developed long before matrix libraries became a standard part of software development.

Ruby's early "killer app" (remember that term?) was Rails. Even to this day there is almost no major code out there built in Ruby that isn't ultimately related to building CRUD web apps. While Ruby may be losing popularity now, it moved the web-development ecosystem ahead in the same way that Python has moved the scientific computing world ahead.

20 years ago, if you wanted to use open source tools to write performant vector code, there was Python and a handful of OSS clones of commercial products. Given that Python was also useful for other programming tasks in a way that, say, Matlab/Octave is not, it was the choice for more sophisticated programmers who wanted an OSS solution and needed to do scientific computing. This created a positive feedback loop that persists to this day.

Given that Python remains a decent language relative to its contemporary peers and it has a massive and still growing library of numerical computing software, it is extremely unlikely to be dethroned, even by promising new languages like Julia.

Even to this day there is nothing even close to numpy in Ruby. I do DS work in an org that is almost entirely Ruby, but we still use Python without question because we know re-implementing all of our numeric code in Ruby would be a fool's errand.

Had Ruby had early support for matrix math, it wouldn't have surprised me if it had replaced Python.

it_does_follow commented on 'Children of Men' is happening   edwest.substack.com/p/chi... · Posted by u/rwmj
goldenkey · 4 years ago
Are you seriously going to tell me that starting PayPal, having two divorces, 8 kids, and rolling that into Tesla and SpaceX makes someone the smartest entrepreneur? I thought entrepreneur included all parts of life. Elon is not a savior, he's smart, but he got lucky too.
it_does_follow · 4 years ago
I believe the parent is attempting their own demonstration of Poe's law. I was impressed by the quality (and really hope I'm not wrong).

it_does_follow commented on 'Children of Men' is happening   edwest.substack.com/p/chi... · Posted by u/rwmj
giorgioz · 4 years ago
The article fails to mention France which is managing the low children crisis better than most. Macron moved down the compulsory school age start from 6 (primary school) down to 3 (kindergarten).

Governments should help families by providing easier access to daycares and kindergartens. Helping young parents get some rest with their first child(ren) will encourage them to have more children.

I'm a parent of two young children, a 3-year-old and a 6-month-old. This is the hardest period of my life since high school. I'm 34 and in our circles I'm the youngest dad. Most of the other dads are in their late 30s all the way to early 50s.

Yet I do feel I'm making a difference, especially considering that I'm Italian (high longevity + few children). I've also been inspired in my choices about parenthood by the movie IDIOCRACY: https://www.youtube.com/watch?v=BBvIweCIgwk I was delighted when I read in the Elon Musk biography that he had also seen the movie and that it was also one of his reasons for being a parent of multiple kids.

it_does_follow · 4 years ago
"I'm having kids because of the movie Idiocracy, and inspired because Elon Musk did the same" has to be peak Poe's law[0].

0. https://en.wikipedia.org/wiki/Poe%27s_law

u/it_does_follow

Karma: 242 · Cake day: November 14, 2021