Show HN: I've built a spectrogram analyzer web app

jboy55 · 3 years ago

Here is a spectrogram of the track, "Look" from the album, "Songs about my Cats" by Venetian Snares.

Aphex twin did something similar, but this is more playful in my opinion.

epiccoleman · 3 years ago

Mick Gordon did some fun hidden spectrogram imagery in the Doom 2016 soundtrack.

He talks about that and plenty of other cool stuff in his talk at the 2017 GDC conference. One of my favorite conference talks ever, he did so much cool experimentation to get the sounds he used on the soundtrack, and watching his talk is one of those moments where you really get to see a master of his craft let loose and explain his process.

https://youtu.be/U4FNBMZsqrY

quickthrower2 · 3 years ago

https://venetiansnares.bandcamp.com/track/look

Warning - this music freaked my dog out!

jo-m · 3 years ago

I once (badly) did something similar as a student [1].

Unfortunately it's in Matlab so I can not run it any more.

[1] https://jo-m.ch/posts/2015/01/hack-the-spectrum-hide-images-...

sorenjan · 3 years ago

If you want to run it again you could try Octave, an open source Matlab alternative.

https://octave.org/

scotteric · 3 years ago

You know a personal license is only $149 USD right? You could then run your old code. Toolkits for home are $49.

lightweightbaby · 3 years ago

I also made something similar using Python a long time ago [1]. It's a extremely simple script so it should still work.

[1] https://github.com/DanielAllepuz/ImageToSound

syx · 3 years ago

This is very interesting what was the Aphex Twin’s track with this concept?

twelvechairs · 3 years ago

Its usually called 'formula' or 'equation' - B side of 'windowlicker'. There's a video at the link below

https://www.reddit.com/r/Damnthatsinteresting/comments/kvjil...

swah · 3 years ago

https://www.youtube.com/watch?v=wSYAZnQmffg at around 5m30

jcelerier · 3 years ago

From the look of the pictures there's a log() missing somewhere, no?

ssgh · 3 years ago

Author here. This is a basic spectrogram visualizer that's mobile friendly. It allows to select regions on the spectrogram and play them separately. There is no grand plan behind this web app: it's just a handy basic tool to capture sounds on your phone and see what they look like.

echelon · 3 years ago

This is incredible! One of the best spectral tools I've seen.

Can we hire you to help us improve the (broken) spectral visualizations on our app?

Example: https://fakeyou.com/tts/result/TR:9jy3vew9w0s3ew4keay9m330rd...

I would so love to hire you to help us. This is freaking cool.

Even if you're not interested, mad props. I really love this.

ssgh · 3 years ago

Your spectrogram looks elongated horizontally because the FFT window size is too large. I use window size 1024 with sample rate 48000 Hz, so one window covers 1024/48000=0.02 sec. This window size looks optimal in most cases: if you change it in my web app, you'll see that all other window sizes get the spectrogram blurry in different ways, but at 1024 it gets into focus.

Of course, don't forget the window function (Hann, or raised cosine), but it looks like you've got that covered because your spectrogram looks smooth.

The color palette looks good in your case. FWIW, my color function is like this: pow(fft_amp, 1.5) * rgb(9, 3, 1). The pow() part brightens the low/quiet amplitudes, and the (9,3,1) multiplier displays 10x wider amp range by mapping it to a visually long black->orange->yellow->white range of colors. Note, that I don't do log10 mapping of the amplitudes.

timlod · 3 years ago

In case OP doesn't respond, I could probably help you with this - feel free to send an email!

KennyBlanken · 3 years ago

Not OP, but...why on earth does having your site open in firefox nearly set my computer on fire?

slhck · 3 years ago

Nice tool. Some suggestions:

- Allow playback via Space button. Show a play marker to let the user know where in the sample they are, even without having selected a part.

- Choose a sample that is easier on the ears than high-pitched bird song. I was really shocked when the first loud part came.

montag · 3 years ago

Looks like it says “mime type is not supported” on mobile Safari.

ssgh · 3 years ago

It uses "audio/webm;codecs=opus" to record mic. Now it's possible to change it in the config menu in the top right. Safari probably needs audio/mp3. Edit: also consider "audio/foo;codecs=pcm" where "foo" is something compatible with Safari.

_emacsomancer_ · 3 years ago

This is a problem of iOS not supporting modern efficient codecs.

grugagag · 3 years ago

I get the same error on Iphone/Safari

wpietri · 3 years ago

Very neat! May I suggest adding a button to switch to log scale for frequency? I love the ability to select and play back just a particular set of frequencies. But voice uses only about ~15% of the screen height [1], so it's hard to play with.

[1] https://en.wikipedia.org/wiki/Voice_frequency

ssgh · 3 years ago

You can select an area and zoom into it. Another option is to change sample rate in the config in the top right.

Groxx · 3 years ago

Quite neat, thanks for sharing! I've never been able to play portions like this before, it's interesting.

Is there any way to make this display in real time, or is that not (currently?) possible with audio APIs?

Bewelge · 3 years ago

The WebAudio API has an anlayser node that can create spectrogramms in real-time. The ones I've created in the past were nowhere near as detailed as this one though.

https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_A...

lokimedes · 3 years ago

Hi, I love it. Perhaps we should chat about making it for radio data as well? We could potentially use it for our radar systems.

shangers · 3 years ago

Can I ask what kind of use cases would a spectrogram have for radar data? I've been messing around with making my own spectrogram app as well (linux desktop app and not web app though) and would be stoked to know if there's any potentially easy to reach use cases for it

ssgh · 3 years ago

You can reach me at: ssgh at mm dot st.

rickcarlino · 3 years ago

I get the mime error on Firefox mobile. Very interesting idea though, hope I can try it in the future.

a-dub · 3 years ago

i really like the filter definition as selection rectangle on time frequency plot ux,

ssgh · 3 years ago

Yeah, that's a bandpass filter, essentially. But I did it the lazy way: audio signal -> FFT -> zero out unwanted frequencies -> inverse FFT.

orbisvicis · 3 years ago

When I read about ultrasonic cross-device trackers in advertising [1], I installed "org.woheller69.audio_analyzer_for_android" and "hans.b.skewy1_0" (automatic ultrasonic detection) and started scanning through TV channels after running some test tones. Suffice to say I didn't find any, but the entire process was quite fun. There's also "org.billthefarmer.scope" which is an oscilloscope with a spectrum (not spectrogram).

1. https://arstechnica.com/tech-policy/2015/11/beware-of-ads-th...

vishnuharidas · 3 years ago

Web apps like this that accesses user's data should provide samples for users to experiment and explore before they have to give access to their actual data.

ssgh · 3 years ago

Very reasonable ask, sir. In fact, I had added sample.mp3, but forgot to add a button in the UI. Now that's fixed.

nick_m · 3 years ago

Brilliant work - I "get" how this works, I've just spent about half-an-hour playing with this (Chrome browser on my kitchen ChromeBook), singing into it and letting it "listen" to the ambient background noise here (old cooker clock ticking, fridge compressor rumbling occasionally). Useful, educational, and fun also - thanks for publishing/hosting this so others can enjoy it!

vjerancrnjak · 3 years ago

Very nice app.

I usually use Audacity to inspect the spectrogram of FLAC files and see if they really are 44100Hz or if someone packaged a constant rate 320kbps mp3 encode into a FLAC file.

Now I can just check it in my browser :D

firefoxd · 3 years ago

Simple, straight to the point, and super useful.

One place I used these was on a toy AI assistant. I recorded myself saying a trigger word thousands of times, cut the audio in pieces and converted each to a spectrogram image. I then feed those to a training model to help recognize the trigger word.

Before the spectrogram, i was feeding the wav file directly, it was incredibly intensive on my laptop. But the image files were easier to process in real time. This tool can be used for debugging.

jxmorris12 · 3 years ago

How would this work with AI? Don’t you need to train the model to discriminate between the trigger word and other words? If all that’s seen during training is the trigger word, the model will just learn to say “yes” to everything, if you get what I mean.

firefoxd · 3 years ago

Yes, i have recorded myself talking on the phone for hours as well. I should have clarified that.

HarHarVeryFunny · 3 years ago

Nice - very fast (using WebGPU?).

I like the interesting ability to play a "rectangular" (time + frequency limited) section of the audio.

ssgh · 3 years ago

I do have a WebGL-based implementation of FFT, but here I used good old JS. When properly written, it gets translated into really fast machine code, which is even faster than WebAssembly (I tried!). WebGL's problem is the high toll on the CPU--GPU bridge. When you need to transfer a block of audio data from CPU to GPU to perform calculations, you wait. When you need to transfer the FFT data back, you wait. These waits quickly outweight everything else. However on wavelet transforms GPU comes first because you can store some pre-computed FFTs on GPU and reuse them in multiple runs.