Phased Array Microphone (2023)

"As part of the calibration, the speed of sound is also a parameter which is optimized to obtain the best model of the system, which allows this whole procedure to act as a ridiculously overengineered thermometer."

Reminds me of the electronics adage: "all sensors are temperature sensors, some measure other things as well."

danielheath · a year ago

Back in high school, I built (with some parental assistance) an apparatus to measure how quickly the pressure would drop (in a pressurized cylinder) when a very small hole allowed air to leak out.

Turns out, not only can you measure temperature that way, but can extrapolate the graph out to find absolute zero (IIRC my result was out by about 20 kelvin, which I think is pretty damn good for a high-school-garage project).

Horffupolde · a year ago

No, there you are measuring R, assuming the air inside the cylinder was an ideal gas.

analog31 · a year ago

>>>> Reminds me of the electronics adage: "all sensors are temperature sensors, some measure other things as well."

A corollary that's one of my rules to live by: Never measure anything over time without also measuring the ambient temperature.

mastersummoner · a year ago

Not really related, but the way you chose to indicate a comment here immediately made me think you had merge conflicts.

Marthinwurer · a year ago

I love these kinds of inadvertent measurements. One of my favorite examples is that a sufficiently accurate IMU can get you relatively accurate longitude measurements from the Coriolis effect.

adolph · a year ago

Slight correction, latitude, not longitude.

The earth’s surface closer to the poles has less distance to travel for any rotation than the surface closer to the equator. As a result the inertial navigation systems of long distance systems must be adjusted. Iirc, this is also the case for artillery firing computations.

https://www.oxts.com/blog/going-round-circles-earth-rotation...

https://www.britannica.com/science/latitude
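
A minimal sketch of why it is latitude that falls out, assuming an idealized stationary IMU with no bias or noise: the gyro senses only the Earth's rotation vector, the accelerometer senses local "up", and the angle between the two gives latitude. The function names and the east/north/up axis convention are assumptions made for illustration, not taken from any real INS.

```python
import math

EARTH_RATE = 7.292115e-5  # rad/s, Earth's sidereal rotation rate


def simulate_gyro(lat_deg):
    """Earth-rate vector seen by an ideal gyro, in (east, north, up) axes."""
    lat = math.radians(lat_deg)
    return (0.0, EARTH_RATE * math.cos(lat), EARTH_RATE * math.sin(lat))


def latitude_from_imu(gyro, accel):
    """Estimate latitude in degrees from a stationary, ideal IMU.

    gyro:  angular rate vector (rad/s), assumed to contain only Earth rate
    accel: specific force vector (m/s^2), which points "up" when at rest
    """
    dot = sum(g * a for g, a in zip(gyro, accel))
    norm_g = math.sqrt(sum(g * g for g in gyro))
    norm_a = math.sqrt(sum(a * a for a in accel))
    # The vertical component of Earth's rotation is Omega * sin(latitude).
    s = max(-1.0, min(1.0, dot / (norm_g * norm_a)))
    return math.degrees(math.asin(s))


lat = latitude_from_imu(simulate_gyro(45.0), (0.0, 0.0, 9.81))
```

Longitude never appears in this calculation, which is consistent with the correction above: rotating the whole setup around the Earth's axis changes nothing the sensors can see.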

nielsole · a year ago

Asahi Linux (and likely macOS too) uses the resistance of the speaker coils to detect overheating of those speakers and reduces the volume.

01HNNWZ0MV43FF · a year ago

Is that the same thing where a flat-earther tried to measure something with an expensive laser gyro and kept finding that Earth was rotating?

psunavy03 · a year ago

I believe this is one of the initial steps an aircraft INS uses to find north while it is aligning, but it's been too long since I had aircraft systems theory in the front of my brain.

emptiestplace · a year ago

https://en.m.wikipedia.org/wiki/Inertial_measurement_unit

cameldrv · a year ago

Latitude or Longitude?

sam_dam_gai · a year ago

Do you mean latitude?

user_7832 · a year ago

Is there one saying “All electronic devices are smoke machines, some can compute too”?

jaggederest · a year ago

Similarly, diesel engines come with a reserve fuel supply that you can accidentally use once. (diesel engines will happily run on engine oil when warm)

qskousen · a year ago

The one I've heard is "Every machine is a smoke machine, if you operate it wrongly enough."

frabert · a year ago

"All diodes are light-emitting if you try hard enough"

ChuckMcM · a year ago

"Inside every amplifier is an oscillator trying to get out."

qingcharles · a year ago

"All electronics are hand-warmers if miscalibrated correctly enough."

Bearsilber · a year ago

I just learned how the Duracell Powercheck© worked, which was done with temperature.

https://youtu.be/zsA3X40nz9w?si=oGg2wdUlLXSDxpsN

mnky9800n · a year ago

A colleague of mine spent months analysing fluctuations in a narrow-band signal from a geophone, only for a more senior colleague to get fed up with it and demonstrate that the fluctuations simply correlate with the air temperature, and do so within the spec sheet's reported temperature tolerance.

cushychicken · a year ago

> Reminds me of the electronics adage: "all sensors are temperature sensors, some measure other things as well."

I wanna say that’s a Bob Pease quote but I can’t find an attribution to it.

frankus · a year ago

I first encountered it in Elecia White's book Making Embedded Systems, but the attribution is anonymous and whom it's attributed to may have heard it elsewhere.

UltraSane · a year ago

The highest-grade gauge blocks from Mitutoyo, which are measured by laser interferometry, come with a measured coefficient of thermal expansion AND an uncertainty for that coefficient. And they have a size variance of plus or minus 30 nm. That is only about 410 oxygen atoms.

kqr · a year ago

Oh yeah. I realised this the day I discovered my fancy digital SLR was a thermometer: https://entropicthoughts.com/does-my-dslr-have-dead-pixels

glitchc · a year ago

Yup, it's called dark noise. Random generation of electrons which sometimes find their way into the depletion region.

djmips · a year ago

A lot of people like myself consider heat a form of light, but I guess a photographer would just be thinking of visible light. They say that about 50% of the sun's light emissions come at infrared frequencies.

entropicdrifter · a year ago

It does act as a thermometer, if and only if the altitude remains constant. The speed of sound fluctuates with both temperature and altitude

amluto · a year ago

I’m not sure how the speed of sound could depend on altitude, even in principle. The air doesn’t know where it is!

Putting that aside, in an ideal gas, the speed of sound depends on the composition of the gas and the temperature and, interestingly, does not depend on pressure, and pressure is the main way that the altitude would affect the speed of sound. So measuring the speed of sound in air actually makes for a pretty good thermometer.

https://en.wikipedia.org/wiki/Speed_of_sound
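
As a rough sketch of the "thermometer" idea, assuming dry air behaves as an ideal gas with a constant adiabatic index: the speed of sound is sqrt(gamma * R * T / M), with no pressure term, so it can be inverted for temperature. The constants and function names below are illustrative assumptions, not anything from the original project.

```python
import math

R = 8.314462618     # J/(mol*K), molar gas constant
GAMMA_AIR = 1.4     # adiabatic index of dry air (assumed constant)
M_AIR = 0.0289647   # kg/mol, molar mass of dry air (assumed)


def speed_of_sound(temp_kelvin):
    """Ideal-gas speed of sound in m/s; note that pressure does not appear."""
    return math.sqrt(GAMMA_AIR * R * temp_kelvin / M_AIR)


def temperature_from_speed(c):
    """Invert the relation above: temperature in kelvin from speed in m/s."""
    return c * c * M_AIR / (GAMMA_AIR * R)


c20 = speed_of_sound(293.15)          # roughly 343 m/s at 20 degrees C
t_back = temperature_from_speed(c20)  # recovers about 293.15 K
```

Humidity changes the effective gamma and M slightly, so a real calibration would not be quite this clean.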

_0ffh · a year ago

Right, it gets even worse: Air pressure is not only altitude-dependent but fluctuates even at constant altitude. The pressure (altitude) dependence is comparatively weak, though.

t0mas88 · a year ago

The speed of sound fluctuates with density. Altitude and temperature both change density.

I'm curious, why haven't you used TDM I2S microphones for your array and used PDM?

I understand that ICS-52000 is a relatively low cost ($2/100pcs) and there are even breakout boards available with 4 microphones, which can be chained to 8 or 16, like https://www.cdiweb.com/datasheets/notwired/ds-nw-aud-ics5200...

Then you can take a Jetson (or any I2S-capable hardware with a DSP or GPU on it) and chain 16 microphones per I2S port. It would seem a lot easier to assemble and program, compared to an FPGA setup.

kindiana · a year ago

(OP here) tverbeure hit most of the main points, but mostly cost ($2/mic vs $0.5/mic adds up when there are 192 microphones) and the difficulty of finding things with enough I2S interfaces (even with 16-way daisy chaining, that's still more than most/all things will have). The FPGA/custom hardware was part of the fun as well!

dchichkov · a year ago

Yeah, I've also had difficulty finding something with enough I2S. It was a while back and I've used Sprocket carrier for Jetson TX2 - it had 6 lanes, so up to 96. It was for a SODAR application, so the sampling frequency was not that critical and to me it felt like the perfect trick to make an array with off-the-shelf hardware. So I was just curious, if this was something you've considered.

For something indoors, yes, I can see how low sampling frequency gets very limiting. And 192 microphones, that's really pushing it. Love it.

The $2/mic vs $0.5/mic argument is a fun one. You've obviously poured an enormous amount of engineering in there, involving PCB design, FPGA and network programming, writing custom CUDA kernels, signal processing, PyTorch, the list goes on. And you had a 4090 plugged into your PC in 2023. Classic hobbit in a mithril vest ;)

morcheeba · a year ago

Not OP, but I looked into this a few years ago. It was more expensive then, and only went to 20 kHz. Higher frequencies are helpful if you're listening for the hiss of leaking gas, or corona discharge of an electric arc.

The Orin has 6xI2S ports internally, so that would work up to 16*6 = 96 microphones, which is a good number. But it looks like maybe only 3 are brought out & on different dev board connectors [1]? As with a lot of design, the devil is in the details. An FPGA could be easier to configure if you need more than 96 microphones.

My notes:

ICS-52000 $3.50, 20 kHz

ICS-41350 $1.05, 40 kHz

SPH0641LU4H-1 $1.45, 80 kHz+

[1] https://docs.nvidia.com/jetson/archives/r34.1/DeveloperGuide...

tverbeure · a year ago

I've considered making a phased array myself, but never got around to sending out the PCB. But here are two reasons why I2S is not the best option:

* I2S requires 3 instead of the 2 pins of PDM. However, in the datasheet that you provided, it shows how you can daisy-chain microphones which is really cool (even if not standard I2S.) So that argument goes away.

* PDM gives you access to way higher sample rates, which in turn gives you more flexibility in choosing the delay for a delay-and-sum operation. For example, if the PDM clock is 2MHz, you could theoretically delay with a precision of 0.5us. In practice, you'll do that with lower precision, but with I2S, the clock will typically max out at 192kHz.

* PDM microphones tend to be cheaper.
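
To make the delay-precision point concrete, here is a minimal delay-and-sum sketch, assuming a linear array, a far-field plane wave, and delays rounded to whole samples. The function names, the geometry, and the 343 m/s figure are assumptions for illustration, not the OP's implementation. The smallest delay step is one sample period: 0.5 us at 2 MHz versus about 5.2 us at 192 kHz.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def delay_step_us(fs):
    """Smallest whole-sample delay step, in microseconds."""
    return 1e6 / fs


def steering_delays(mic_x, angle_deg, fs):
    """Integer sample delays that align a plane wave arriving from angle_deg.

    mic_x: microphone positions along a line (m); the angle is measured
    from broadside, positive toward +x.
    """
    s = math.sin(math.radians(angle_deg))
    # Arrival time of the wavefront at each mic, relative to x = 0.
    arrival = [-x * s / SPEED_OF_SOUND for x in mic_x]
    latest = max(arrival)
    # Delay the early channels so they line up with the latest arrival.
    return [round((latest - t) * fs) for t in arrival]


def delay_and_sum(channels, delays):
    """Average the channels after delaying each by its sample count."""
    start = max(delays)
    length = min(len(ch) for ch in channels)
    out = []
    for k in range(start, length):
        total = sum(ch[k - d] for ch, d in zip(channels, delays))
        out.append(total / len(channels))
    return out


delays = steering_delays([0.0, 0.01, 0.02, 0.03], 30.0, 2_000_000)
```

At a lower sample rate the same steering angle rounds to much coarser delays, which is the flexibility argument in the second bullet.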

belzebalex · a year ago

1) and 3) are valid, but 2) isn't really. In that sort of pipeline, you usually do IQ sampling which allows you to phase-shift by any arbitrary value with a complex multiplication.
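
A small sketch of that complex-multiplication trick, assuming a single-frequency (or narrowband) signal already available as complex IQ samples; the function name and the numbers are made up for illustration. For a tone at frequency f, delaying by tau is the same as multiplying by exp(-j*2*pi*f*tau), so the delay is not limited to whole sample periods.

```python
import cmath
import math


def phase_shift_delay(iq_samples, freq_hz, delay_s):
    """Delay a narrowband complex (IQ) signal by rotating its phase."""
    rotation = cmath.exp(-2j * math.pi * freq_hz * delay_s)
    return [x * rotation for x in iq_samples]


fs = 192_000.0   # sample rate in Hz (assumed)
f = 40_000.0     # tone frequency in Hz (assumed)
tau = 1.3e-6     # delay much finer than one sample period (about 5.2 us)

tone = [cmath.exp(2j * math.pi * f * n / fs) for n in range(64)]
shifted = phase_shift_delay(tone, f, tau)
```

This is exact only for a single frequency. For wideband audio the usual approach would be to apply a different phase per frequency bin (for example after an FFT), which is a separate step not shown here.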

dllu · a year ago

I once did a project to do multilateration of bats (the flying mammal) using an array of 4 microphones arranged in a big Y shape on the ground. Using the time difference of arrival at the four microphones, we could find the positions of each bat that flew over the array, as well as identify the species. It was used for an environmental study to determine the impact of installing wind turbines. Fun times.

lscharen · a year ago

Reminds me of Intellectual Ventures' Photonic Fence, developed to track and kill mosquitoes with short laser pulses.

As a side-effect of the precision needed to spatially locate the mosquitoes, they could detect different wing beat frequencies that allowed target discrimination by sex and species.

redblacktree · a year ago

Where can I buy one?

bafe · a year ago

I did a similar project at 18. Needless to say I didn't have enough HW and SW skills to do much since I implemented the most naive form of the TDOA algorithms as well as the most inefficient way of estimating the time difference through cross correlation. I still learnt a lot and it led me to eventually getting a PhD in SAR systems, which are actually beamformers using the movement of the platform instead of an array
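
For anyone curious what the naive version looks like, here is a sketch of time-difference estimation by direct cross-correlation, written as the plain O(N * lags) loop. The function name and the test pulse are assumptions for illustration; an FFT-based correlation would be the faster route.

```python
def tdoa_by_cross_correlation(a, b, max_lag):
    """Return the lag in samples at which b best matches a delayed copy of a.

    A positive result means the signal reached b later than it reached a.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, x in enumerate(a):
            j = i + lag
            if 0 <= j < len(b):
                score += x * b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic example: the same short pulse, arriving 7 samples later on b.
pulse = [1.0, 2.0, 3.0, 2.0, 1.0]
a = [0.0] * 200
b = [0.0] * 200
a[50:55] = pulse
b[57:62] = pulse
lag = tdoa_by_cross_correlation(a, b, max_lag=20)
```

Dividing the lag by the sample rate gives the time difference, which is the input a TDOA solver needs.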

jessetemp · a year ago

What were the results of your study? I’ve heard that bat lungs are so sensitive that when they fly across the pressure differential of large turbines their capillaries basically explode

Yes basically. Bird lungs are relatively rigid, open at both ends like a tube, and have a one-way flow of air, so they are less prone to pressure-related injuries. Bat lungs are mammalian lungs that expand and contract as they breathe just like us, so they are particularly vulnerable to barotrauma near wind turbines.

After writing a bunch of MATLAB code to find the bats, I handed it off and haven't heard back about whether they actually built the wind turbines or not.

mywacaday · a year ago

I would love to do something like that to track the bats in my garden. How feasible would it be for an amateur to do as a personal project? Any good references on where to start?

A nice mention about this is the outstanding and quiet work of the Cosys-Lab of the University of Antwerp. They once put a microphone array below a scorpion, and showed how bats moved their ultrasonic beam to scan for the scorpion. Incredible stuff [0].

[0]: https://www.youtube.com/watch?v=57ScSPWhGqU

FredPret · a year ago

I had no idea they were mammals until this comment. I thought they were furry birds!

repiret · a year ago

It is not unreasonable to think of bats as flying mice.

neumann · a year ago

That sounds super interesting. Is there a write up somewhere of the project?

Here's the report [1], written when I was a second year undergrad in 2010.

It's very basic. The species identification is based on matching contours of the spectrogram against some template contour. The multilateration was, embarrassingly, done by brute force by generating a dense 3D grid. At the time, I didn't have any knowledge of Kalman filters or anything that could have been helpful for actually tracking the bats.

[1] https://daniel.lawrence.lu/public/bat-report.pdf
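
Roughly, the brute-force approach looks like the sketch below: compute the time differences each grid point would produce and keep the point that best matches the measurements. This is a reconstruction under assumptions (Python instead of the original MATLAB, made-up microphone positions, 343 m/s, hypothetical function names), not the code from the report.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def predicted_tdoas(source, mics):
    """Arrival-time differences (s) of each mic relative to the first mic."""
    dists = [math.dist(source, m) for m in mics]
    return [(d - dists[0]) / SPEED_OF_SOUND for d in dists[1:]]


def grid_search_tdoa(mics, measured, bounds, step):
    """Brute-force search of a dense 3D grid for the best-fitting source."""
    (x0, x1), (y0, y1), (z0, z1) = bounds
    nx = int(round((x1 - x0) / step)) + 1
    ny = int(round((y1 - y0) / step)) + 1
    nz = int(round((z1 - z0) / step)) + 1
    best, best_err = None, float("inf")
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                p = (x0 + i * step, y0 + j * step, z0 + k * step)
                pred = predicted_tdoas(p, mics)
                err = sum((a - b) ** 2 for a, b in zip(pred, measured))
                if err < best_err:
                    best, best_err = p, err
    return best


# Four mics in a Y shape on the ground (hypothetical layout, in metres).
mics = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (-5.0, 8.66, 0.0), (-5.0, -8.66, 0.0)]
measured = predicted_tdoas((3.0, 4.0, 12.0), mics)  # synthetic "bat"
estimate = grid_search_tdoa(mics, measured, ((-10, 10), (-10, 10), (0, 20)), 1.0)
```

The cost grows with the cube of the grid resolution, which is why a least-squares solver or a tracking filter is the usual next step.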

NL807 · a year ago

That sounds like a fun project. Was it part of a research grant?

isatty · a year ago

> bats (the flying mammal)

As opposed to?

Rygian · a year ago

Baseball bats?

ryandvm · a year ago

Honestly, that sounds like amazing work. I wish I could afford to get out of enterprise software engineering and just do academic software development like that.

jcims · a year ago

Look up acoustic cameras on YouTube, there are some pretty impressive demonstrations of their capability. This is one of the companies I've been watching for a while, but it looks like FLIR and some other big names are getting into it: https://www.youtube.com/@gfaitechgmbh

The one use case that is both creepy and interesting to me is recording a public space and then after the fact 'zooming in' to conversations between individuals.

sipjca · a year ago

I am very interested in how small these arrays can be. From talking with a friend with cochlear implants, I would assume this could help dramatically with the right signal processing to help him hear.

brunosan · a year ago

Armchair comment. I would LOVE to be a grad student again and try to pair it with ultrasound speaker arrays, for medical applications. Essentially a super HIFU (High-Intensity Focused Ultrasound) with live feedback. https://en.wikipedia.org/wiki/Focused_ultrasound

zipy124 · a year ago

I'm doing my PhD in in-air ultrasound with phased arrays and talk to the medical guys at conferences and labs, and it's soooo much harder in solids/liquids. The frequency is significantly higher, think 1-10 MHz instead of like 40 kHz, so any normal electronics are out the window.

brudgers · a year ago

Then, why not be a grad student again?

01100011 · a year ago

Maybe they want to afford dinner?

duped · a year ago

One problem is that the speed of sound is not constant (or approximately constant) across the bandwidth you're interested in when the sound wave is traveling through solids and liquids.

always_swapping · a year ago

I may be the FUS grad student you seek. Reach out via profile email if you want to chat. Cheers!

etrautmann · a year ago

Medical applications would presumably require contact coupling and not through air?

adamcharnock · a year ago

I would love to see this come to our various mobile devices in a nicely packaged form. I think part of what is holding back assistants, universal-translators, etc, is poor audio. Both reducing noise and being able to detect direction has a huge potential to help (I want to live-translate a group conversation around a dining table, for example).

Firstly it would be great if my phone + headphones could combine the microphones to this end. But what if all phones in the immediate vicinity could cooperate to provide high quality directional audio? (Assuming privacy issues could be addressed).

abecedarius · a year ago

For the hard of hearing like me the killer application would be live transcription in a noisy setting like a meetup or party, with source separation and grouping of speech from different speakers. Could be life-changing.

(Android's Live Transcribe is very good now but doesn't even try to separate which words are from different speakers.)

*Automatic speech recognition (ASR) systems have progressed to the point where humans can interact with computing devices using speech. However, the distance between a device and the speaker will cause a loss in speech quality and therefore impact the effectiveness of ASR performance. As such, there is a greater need to have reliable voice capture for far-field speech recognition. The launch of Amazon Echo devices prompted the use of far-field ASR in the consumer electronics space, as it allows its users to interact with the device from several meters away by using microphone array processing techniques.*

https://assets.amazon.science/da/c2/71f5f9fa49f585a4616e49d5...

MVissers · a year ago

I believe modern MacBook Pros already have multiple microphones that probably do some phased-array magic.

refulgentis · a year ago

Pretty much every device does, the trick always was if it actually worked, which Apple is assuredly great at. (source: worked on Google Assistant)

spaceywilly · a year ago

This is known as the Cocktail Party Problem. It turns out our brains do an incredible amount of processing to allow us to understand a person talking to us in a noisy room.

https://en.wikipedia.org/wiki/Cocktail_party_effect?wprov=sf...

quantadev · a year ago

In general the position of the microphones in space must be known precisely for the phase shifting math to be done well, and also the clocks on the phones would need to be in sync at high precision like 10x the highest frequency sound you're picking up. In other words within 10s of thousands of a second. Also, if the mic locations in the array are not a simple straight line, circle, or other simple geometry, the computer code (i.e. math) to milk out an improved signal becomes very difficult.

NavinF · a year ago

> 10s of thousands of a second

10ms? That's a very long time. Phone clocks are much more accurate than that because they're synced to the atomic clocks in cell towers and GPS satellites.

Hell even NTP can do 1ms over the internet. AFAIK the only modern devices with >10ms inaccurate clocks by default are Windows desktops. I complained about that before because it screwed up my one-way latency measurements: https://github.com/microsoft/WSL/issues/6310

I solved that problem by RTFM and toggling some settings until I got the same accuracy as Linux: https://learn.microsoft.com/en-us/windows-server/networking/...

Anyway I dunno why the math would be too complicated, GPUs are great at this kind of signal processing

hatsunearu · a year ago

It's already kind of implemented.

hinkley · a year ago

Boeing ginned up a spherical version of these and used it on 787 prototypes to identify candidates for sound deadening material.

Apparently in loud situations like airplanes, audio illusions can make a sound appear to come from a different spot than it really is. And when you have a weight budget for sound dampening material it matters if you hit the 80/20 sweet spot or not.

Salmonfisher11 · a year ago

If somebody wants to play around with Zynq 7010's - have a look at the EBAZ4205 board. They can be bought from Aliexpress (20-30€). These are former Bitcoin Mining controllers.

Some people reverse engineered the entire thing. It can be found on GitHub. And there's an adapter plate available for getting to the GPIOs.

For a less complex entry there are also Chinese FPGAs ("Sipeed" boards, which use a GoWin FPGA). They are quite capable and the IDE is free.

telgareith · a year ago

Xilinx tool chain is also no-cost.

scottapotamas · a year ago

For some/smaller parts. Once you start going higher than Artix or the token Kintex parts you need to pay up.