There was an open source audio fingerprinting system called Echoprint, which implemented the Shazam algorithm in a way that made it hard to claim it was the same approach as Shazam's, though in reality it was almost the same. The hardest part of these kinds of services is designing the fingerprints so that you can search them effectively. The audio part is interesting and fun, but actually less critical.
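To make the "design the fingerprints so you can search them" point concrete, here is a minimal sketch of the landmark-pair idea from the published 2003 Shazam paper. It is not Echoprint's or Shazam's actual code. It assumes spectral peaks have already been extracted as (time_frame, freq_bin) pairs, and the names `FAN_OUT`, `MAX_DT`, `pack_hash`, `landmark_hashes` and `build_index`, as well as the bit widths, are made up for illustration.

```python
from collections import defaultdict

FAN_OUT = 3   # hypothetical: how many later peaks each anchor peak is paired with
MAX_DT = 63   # hypothetical: largest time gap (in frames) that fits in 6 bits


def pack_hash(f1, f2, dt):
    """Pack two frequency bins (10 bits each) and a time gap (6 bits) into one int."""
    return ((f1 & 0x3FF) << 16) | ((f2 & 0x3FF) << 6) | (dt & 0x3F)


def landmark_hashes(peaks):
    """Yield (hash, anchor_time) for peaks given as (time_frame, freq_bin), sorted by time."""
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > MAX_DT:
                break  # peaks are time-sorted, so later ones are even further away
            if dt > 0:
                yield pack_hash(f1, f2, dt), t1
                paired += 1
                if paired == FAN_OUT:
                    break


def build_index(tracks):
    """Inverted index: hash -> list of (track_id, anchor_time)."""
    index = defaultdict(list)
    for track_id, peaks in tracks.items():
        for h, t in landmark_hashes(peaks):
            index[h].append((track_id, t))
    return index


# Toy usage with hand-written peaks instead of real audio.
toy_index = build_index({"song_a": [(0, 10), (1, 20), (2, 30)]})
```

The point of packing each pair into a single integer is that lookup becomes an exact-match key search, which is what makes a large catalogue searchable at all.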
What would you say is the difference between fingerprinting and using something like OpenAI's Whisper approach (visual spectrogram ML) for finding the music?
Tangent: I'm also thinking about some fast text-search algorithm, maybe related to Spotify. Damn, it was a long time ago that I read that article.
You should be careful with this. Last time I saw an article about reproducing Shazam's algorithm, their lawyers came after them and eventually the article was removed.
There were questions as to the validity of the threats their lawyers used, but even a bulletproof case is a costly endeavor when going up against a company of that scale.
and then goes on to say he lets people put in Spotify links to add songs. Spotify won't let you download songs, but he uses their API to get the band and title... then searches for it on Youtube and downloads the song from there instead
PFFFT that's the sound of Youtube's lawyers spitting out their coffee and sprinting back to their desks
Well sure, a songs database is important. But song databases like https://acoustid.org/ exist, which let you look up songs that share the same audio "fingerprint" (https://github.com/acoustid/chromaprint). You need the full track to make that fingerprint.
Shazam needs only a tiny snippet, and can guess quite accurately just from that snippet. Compared with AcoustID, which is also a song database (with an entirely different purpose), we can say that the "main ingredient" is Shazam's system for identifying songs from short snippets.
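For the curious, the short-snippet trick in the published 2003 paper comes down to voting on time offsets: if many hashes from the snippet hit the same track with the same (track time minus snippet time) difference, that is almost certainly the song. Below is a minimal sketch of just that voting step. It assumes hashes are already computed and stored in a plain dict, and `best_match` is a made-up name, not anything from Shazam or AcoustID.

```python
from collections import Counter


def best_match(index, query_hashes):
    """Return (track_id, offset, votes) for the most consistent alignment, or None.

    index:        dict mapping hash -> list of (track_id, time_in_track)
    query_hashes: iterable of (hash, time_in_snippet)
    """
    votes = Counter()
    for h, t_query in query_hashes:
        for track_id, t_track in index.get(h, ()):
            # A true match keeps the same offset for every matching hash.
            votes[(track_id, t_track - t_query)] += 1
    if not votes:
        return None
    (track_id, offset), score = votes.most_common(1)[0]
    return track_id, offset, score


# Toy data: the snippet lines up with track "a" starting at frame 100.
toy_index = {1: [("a", 100)], 2: [("a", 105), ("b", 7)], 3: [("a", 110)]}
snippet = [(1, 0), (2, 5), (3, 10)]
result = best_match(toy_index, snippet)
```

Random hash collisions scatter across many different offsets, so they rarely pile up in one bin, which is why a few seconds of noisy audio can be enough.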
How Shazam Works (2003) [pdf] (117 points, 11 months ago, 29 comments) https://news.ycombinator.com/item?id=40029036 - there are a lot of links to past Shazam stories in the comments
Hmm they did seem to have gotten some more customers after I left but the website is all glitchy now so I guess it's abandoned.
https://spot-on.media/
https://github.com/cgzirim/seek-tune