"First of all, just to say that this is really serious stuff in terms of what was done."
"The probability that McCartney wrote it was .018"
"In situations like this, you'd better believe the math because it's much more reliable than people's recollections."
The probability was .018 under their model. That doesn't mean it is the true probability. Naive Bayes probabilities are typically not well calibrated [1]. I have not read the paper, but the confident wording makes me question how believable this is.
[1] Ensembles of models, such as random forests, tend to give more reliable probabilities.
I could not find the paper (I was interested in how they constructed the dataset), only an extended abstract of the talk [1].
It seems the dataset consisted of 70 songs. Since they don't specify the class distribution, there is no saying whether 80% accuracy is good or no better than always guessing the majority class.
An average of 35 samples per class is a serious few-shot constraint, which makes resorting to interpretable and simple Bayesian analysis a sane step. Note how the dataset is grossly underdetermined too: 70 samples and 149 features, which can cause problems for more complex algorithms.
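To make the baseline point concrete, here is a minimal sketch. The 56/14 split is purely hypothetical, since the abstract does not report the actual class distribution:

```python
# Hypothetical split of the 70 training songs; the real class
# distribution is not reported in the abstract.
lennon, mccartney = 56, 14

# A "classifier" that always answers with the majority class scores
# the majority-class frequency as its accuracy.
majority_baseline = max(lennon, mccartney) / (lennon + mccartney)
print(majority_baseline)  # 0.8
```

Under that split, the reported 80% out-of-sample accuracy would be exactly what constant guessing achieves, which is why the distribution matters.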
I think we have to reconstruct this article from behind the pop-sci glasses this was presented with.
> this is really serious stuff in terms of what was done
Sure, it is no joke paper, nor meandering about music theory without any hard proofs.
> "The probability that McCartney wrote it was .018"
This probably meant to say: the model predicts a probability of 0.018, but that phrasing is too careful for a pop-sci article. We can then question the validity of the model.
> you'd better believe the math because it's much more reliable than people's recollections
"Ha-ha! People did a lot of drugs in the 60s!" nothing more...
> And 10 years later, here we are talking about the discovery.
Also cute, in that it makes it sound like one of the authors spent 10 years working on this very basic analysis.
Meta: This article would have been published 10 years ago too (since the finding is interesting from a pop-sci view), but I doubt they would even have dared to describe the maths behind the publication. Now we are in a techno-fetish era and write about the number of layers, GPU hours, BoW, and value networks, just to fool the reader into thinking they are getting a glimpse of modern AI, while the methods in this article could be implemented with one hour of downloading MIDI files and another hour of implementing Graham's 2002 "A Plan for Spam" [2].
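For a sense of scale, a "Plan for Spam"-style Naive Bayes classifier really is only a few lines. The chord-transition tokens and toy data below are invented for illustration; this is not the paper's actual feature set or model:

```python
from collections import Counter
from math import log

def train(songs_by_author):
    """Fit a Bernoulli-style Naive Bayes over musical 'tokens'
    (hypothetical chord transitions here), in the spirit of
    Graham's spam filter. Maps each author to a list of songs,
    each song being a list of tokens."""
    model = {}
    total = sum(len(songs) for songs in songs_by_author.values())
    for author, songs in songs_by_author.items():
        # Count, per token, how many of the author's songs contain it.
        counts = Counter(tok for song in songs for tok in set(song))
        model[author] = (log(len(songs) / total), counts, len(songs))
    return model

def classify(model, song):
    """Return the author with the highest log-posterior; Laplace
    smoothing keeps unseen tokens from zeroing out a class."""
    def score(author):
        log_prior, counts, n = model[author]
        return log_prior + sum(
            log((counts[tok] + 1) / (n + 2)) for tok in set(song))
    return max(model, key=score)

# Toy data: invented chord-transition tokens, purely illustrative.
model = train({
    "lennon":    [["I-IV", "IV-I"], ["I-IV", "I-bVII"]],
    "mccartney": [["I-V", "V-vi"], ["I-V", "ii-V"]],
})
print(classify(model, ["I-IV", "I-bVII"]))  # lennon
```

The hard part of the actual study is presumably the feature engineering over 149 musical features, not the classifier itself.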
Needlessly negative/critical, but I could not edit my post. Apologies, I really liked this effort. I dig stylometry; that's why I change my username so much.
Their future research may be really cool, tracing popular chord progressions all through pop history.
> Sure, it is no joke paper, nor meandering about music theory without any hard proofs.
Well, much like how one comments without having read the paper, only the article.
> "Ha-ha! People did a lot of drugs in the 60s!" nothing more...
Points to a common fact: people's memories from their drug-fueled periods are hazy, and a historical fact: the Beatles did a lot of drugs during the mid-to-late sixties.
> The probability was .018 under their model. This doesn't mean that this is the true probability.
Well of course different models will give you different probabilities. But given that this is a past event, talking about "true probability" is a bit weird. The only "true" probabilities are 0 and 1. There's nothing nondeterministic about the past. Someone wrote the song.
That's not how Bayesian statistics works. In this framework (on which statistical learning and information theory are based), a probability does not require nondeterminism, just incomplete information; and a probability is defined with respect to a certain amount/type of information. (Actually, information is defined in terms of things that change probabilities, so probability is the more "core" concept.)
So for example, if someone rolls a six-sided die, hides it from you, and asks what the probability of a "6" is, under Bayesian theory it's 1/6; if they then tell you that the number rolled was odd, the probability of a "6" drops to zero, even though nothing physically changed, purely because you now have more information.
This is generally a much more useful and intuitive definition of probability than to say "the probability of it being a 6 is either 0 or 1, I just don't know yet".
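The die example can be written down directly as a Bayesian update (a minimal sketch, not tied to the paper's model):

```python
from fractions import Fraction

# Uniform prior over the six faces of a fair die.
prior = {face: Fraction(1, 6) for face in range(1, 7)}

def condition(dist, predicate):
    """Bayesian update: zero out outcomes ruled out by the new
    information, then renormalize what remains."""
    total = sum(p for face, p in dist.items() if predicate(face))
    return {face: (p / total if predicate(face) else Fraction(0))
            for face, p in dist.items()}

# Before any information: P(6) = 1/6.
print(prior[6])                    # 1/6

# Learn that the roll was odd: P(6) drops to 0 and the odd faces
# each rise to 1/3, though nothing physically changed.
posterior = condition(prior, lambda face: face % 2 == 1)
print(posterior[6], posterior[3])  # 0 1/3
```

The update changes only the observer's information state, which is exactly the point being made above.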
Under the Bayesian definition, probability is a degree of belief. Probabilities belong “inside” an agent, not somewhere “out there” in the world.
For a justification of this, see the Dutch Book Argument: “The main point of the Dutch Book Argument is to show that rational people must have subjective probabilities for random events, and that these probabilities must satisfy the standard axioms of probability.”
I wish I was as confident in anything as these guys are in the predictive accuracy of their simple model with an out-of-sample accuracy of 80%[1] with limited and questionable training data.
[1] Accuracy is probably a misleading performance metric here as well
I think it's very possible that when you collaborate, you mimic the other person's speech or writing patterns. Especially since these guys worked together for so long, you'd think they could finish each other's verses.
The other issue with this kind of analysis is that it assumes people are like robots, but sometimes artists just decide to do something radically different from any of their prior work.
Most people with a certain level of obsession about The Beatles would certainly recognize the song as Lennon's, though with an amount of input from McCartney that could range from none to half-and-half.
It's too conjunct to be Paul's, but that doesn't mean he didn't give any input.
I question this analysis because it seems like an easy target -- the song obviously has Lennon's fingerprint on it (a melody hovering around just a few notes with thick harmonies), and the headline of "study reveals new insight into Beatles songwriting" is too juicy for my liking.
What this article seems to also be missing is that there are actually two versions of the song. I'd say that the final recording is indisputably Lennon, but the original is very much McCartney.
The final version's lyrics are abstract and have the interplay of dark and light. The original version is very concrete and almost ballad-like, very much McCartney's style (it even mentions Penny Lane).
What this kind of approach misses is that shared writing credits don't necessarily mean writing together at the same time. While the two have been known to go off and write a piece together, I think there's a decent argument to be made that McCartney sketched out the idea for "In My Life" and Lennon refined it.
Both John and Paul agree that Paul wrote at least some of the music, and yet this article says that John wrote the whole thing. I believe John and Paul.
Thanks. I wish I had an extension that redirects to text-only NPR automatically (their button doesn't redirect to the actual article, and I'm assuming they broke their own site on purpose). One of these days I might actually sit down and start making Firefox extensions, especially if I can put together a basic template for quick one-off single-purpose GDPR fixers and such.
Speaking from experience, it isn't too hard. The MDN WebExtensions documentation is a great place to start, and with very little effort you can write an extension that works in Chrome off the same codebase.
I’m too lazy to read the paper right now but I’m curious: if we can’t trust the memories of Paul & John, how did they train the model on the 70 other songs in the first place?
> Mathematics professor Jason Brown spent 10 years working with statistics to solve the magical mystery.
> The three co-authors of this paper — there was someone called Mark Glickman who was a statistician at Harvard. He's also a classical pianist. Another person, another Harvard professor of engineering, called Ryan Song. And the third person was a Dalhousie University mathematician called Jason Brown.
It took three people ten years to do this? Also all the reporting here is awful. This is nothing like a proof.
[1] https://www.amstat.org/asa/files/pdfs/pressreleases/JSM2018-...
[2] http://www.paulgraham.com/spam.html
https://text.npr.org/s.php?sId=637468053
"The two even debate between themselves — their memories seem to differ when it comes to who wrote the music for 1965's 'In My Life.'"
"[...] they're still the same people, and they have their preferences without realizing it."