
A bit of acoustics and signal processing for audio - Part III

Hello everyone!

In this post we are going to describe how we have to pre- and post-process our signals in order to obtain the results we expect. These are very important steps in our processing chain, since the FFT algorithms -or convolution FIR filters- won't cope with our system's limitations on their own. These limitations might not be obvious at the beginning, but you really don't want to ignore them while designing your system, since they'll invalidate many of your measurements. If this sounds like Greek to you, consider reading Part I and Part II in this forum before continuing with this post.

The very first of these limitations is the fact that our microphone is, in fact, taking discrete samples of the ambient noise surrounding it. This means that, from the very beginning, we are missing some pieces of information and we will never be able to recover them. But it's OK! For the purpose of our analysis, we don't need to sample continuously, and this situation is easily worked around.

Image credit: NUTAQ - Signal processing

Discrete sampling has two main consequences for us. The first one is that we are taking samples once every 1/fs, where fs is the sampling frequency. Normal audio systems sample at 44,1kHz, but this number might vary depending on the application. If you remember this chart, you might be wondering why we have to sample at such a high frequency. This is due to the Nyquist sampling criterion, which states that we have to sample at least at twice the maximum frequency we want to analyse. Since human hearing has a limited frequency range, reaching up to 20kHz in some cases, it is reasonable to use something around 40kHz. With this, the Nyquist criterion avoids the so-called aliasing problem, in which several different sinusoids could fit the same sampling pattern if the sampling rate is too low:

Image credit: Wikipedia - Aliasing
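To make the aliasing problem concrete, here is a minimal numpy sketch (just an illustration, not Smart Citizen code): if we sampled a 5kHz tone at only 8kHz, far below what Nyquist requires, it would produce exactly the same samples as a 3kHz tone, and the two would become indistinguishable.

```python
import numpy as np

# Sampling a 5 kHz tone at only 8 kHz violates the Nyquist criterion
# (we would need at least 10 kHz), so the tone aliases.
fs_low = 8000                      # deliberately too-low sampling frequency (Hz)
n = np.arange(256)                 # sample indices

tone_5k = np.cos(2 * np.pi * 5000 * n / fs_low)
tone_3k = np.cos(2 * np.pi * 3000 * n / fs_low)

# The 5 kHz tone folds back to 8000 - 5000 = 3000 Hz: both sinusoids
# fit the very same sampling pattern.
print(np.allclose(tone_5k, tone_3k))   # True
```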

The second discrete sampling limitation comes from the amount of samples we are able to handle at a time. Normally, this is due to RAM limitations, although we'll see in the future where to allocate them. Nevertheless, it is not useful to handle buffers that are too long, since beyond some point a longer buffer does not provide any additional information. Buffer length requirements in our case come from the minimum frequency we want to sample, which is around 20Hz. Doing some quick math, we need 0,05s worth of sample buffer, which at 44,1kHz is roughly 2200 samples. This is still a lot of samples, considering that each one takes several bytes once stored for processing, so between the raw buffer and the FFT working buffers we can easily end up using over 16kB of RAM!
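As a quick back-of-the-envelope check of that arithmetic (a sketch, not firmware code; the 32-bit sample size is just an assumption for illustration):

```python
fs = 44100                    # sampling frequency (Hz)
f_min = 20                    # lowest frequency we want to capture (Hz)

period = 1 / f_min            # 0.05 s: one full cycle at 20 Hz
n_samples = int(fs * period)  # 2205 samples to hold that single cycle

# Stored as 32-bit values (a common choice for FFT input), the raw
# buffer alone is already sizeable for a microcontroller:
bytes_raw = n_samples * 4
print(period, n_samples, bytes_raw)   # 0.05 2205 8820
```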

This is where signal windowing kicks in. Imagine that we have a very-low-frequency sinusoid and that we are not able to capture the whole sine wave, due to buffer limitations. By construction, the FFT assumes that the buffer of samples we measured repeats itself over and over, one copy after the other:

Image credit: Smart Citizen

When we take the FFT of this signal, we see undesired frequencies that make our frequency spectrum invalid. This is called spectral leakage and it's mitigated by the use of windows (math functions, not the OS). These windows work by smoothing the edges of our measurement buffer, preventing the jumps between repetitions and helping the FFT algorithm analyse the signal properly.

Image credit: Smart Citizen
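Here is a small numpy sketch of that effect (again, only an illustration): a tone that does not complete an integer number of cycles inside the buffer leaks energy all over the spectrum, and a Hamming window reduces that leakage dramatically.

```python
import numpy as np

fs = 44100
n = 1024
t = np.arange(n) / fs

# 1 kHz does not fit an integer number of cycles in a 1024-sample buffer,
# so the FFT of the raw buffer suffers from spectral leakage.
x = np.sin(2 * np.pi * 1000 * t)

raw = np.abs(np.fft.rfft(x))
windowed = np.abs(np.fft.rfft(x * np.hamming(n)))

# Fraction of the spectrum magnitude that ends up far away from 1 kHz:
freqs = np.fft.rfftfreq(n, 1 / fs)
far = freqs > 5000
print(raw[far].sum() / raw.sum())            # clearly visible leakage
print(windowed[far].sum() / windowed.sum())  # much smaller with the window
```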

With the use of signal windowing, more specifically the Hamming window, we are able to reduce the number of samples needed to roughly 1000. Now we are down to about 50% of the memory allocation needed without windowing. You can see the effect on the RMS relative errors in the image below, where the error of the Hann- (another common window) and Hamming-treated buffers stabilises with frequency much more quickly than that of the raw buffers.

Image credit: Smart Citizen

There is a wide range of window functions to choose from and the decision depends on your application. For audio applications, the most common ones are the Hann, Hamming and Blackman. We chose the Hamming window because it tends to stabilise a bit more quickly than the rest, although the differences are minimal. For your reference, there is a very interesting description of all these phenomena in this article, where you'll find a more mathematical approach.
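If you want to play with this yourself, here is a small numerical experiment along the lines of the figure above (a sketch under simple assumptions: unit-amplitude sine, 1024-sample buffer, zero starting phase):

```python
import numpy as np

fs = 44100
n = 1024                          # a short buffer, as discussed above
true_rms = 1 / np.sqrt(2)         # RMS of a unit-amplitude sine

def rms_rel_error(f, window=None):
    """Relative error of the RMS estimate of a unit sine at frequency f."""
    t = np.arange(n) / fs
    x = np.sin(2 * np.pi * f * t)
    if window is None:
        est = np.sqrt(np.mean(x ** 2))
    else:
        # Divide by the window's own power so it doesn't bias the estimate
        est = np.sqrt(np.mean((window * x) ** 2) / np.mean(window ** 2))
    return abs(est - true_rms) / true_rms

# Compare raw, Hann and Hamming estimates as the frequency grows
for f in (100, 200, 500, 1000, 2000):
    print(f, rms_rel_error(f), rms_rel_error(f, np.hanning(n)),
          rms_rel_error(f, np.hamming(n)))
```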

We hope you enjoyed this post and see you next week to talk about Equalisation!


Thx for this series @oscgonfer and great work on this. Now, if I may ask a question in the language of common people:
Okay, I have a sensor with a 60 sec interval. How is the value produced every minute? The microphone is going to listen once a minute for about 0,05s, and this sample is then going to be processed to produce the dBA value?

Hi @bron,

It's exactly as you described it. If I remember correctly, we sample the equivalent of 512 samples at 44100Hz, which translates to roughly 11.6ms. Then, those samples are processed (FFT + A-weighting + RMS) and we get a value on the dBA scale. This happens once every minute by default, or at whatever reading interval you have configured.
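For anyone curious what that chain looks like, here is a rough numpy sketch of the idea (not the actual firmware code: the real processing also equalises the microphone response and applies the A-weighting to the spectrum before the RMS, and getting absolute dBA values needs the microphone calibration):

```python
import numpy as np

fs = 44100
n = 512                                  # one buffer: 512 / 44100 ≈ 11.6 ms

def buffer_to_db(x):
    """Sketch of one buffer: windowed FFT, then RMS, then dB."""
    w = np.hamming(n)
    X = np.abs(np.fft.rfft(x * w))

    # (the real chain would equalise the mic response and apply the
    #  A-weighting to the spectrum bins at this point)

    # RMS from the spectrum (Parseval), compensating the window power
    power = (X[0]**2 + 2 * np.sum(X[1:-1]**2) + X[-1]**2) / n**2
    rms = np.sqrt(power / np.mean(w ** 2))
    return 20 * np.log10(rms)            # dB relative to full scale, not dBA

# Example: a full-scale 1 kHz tone inside one 11.6 ms buffer
t = np.arange(n) / fs
print(buffer_to_db(np.sin(2 * np.pi * 1000 * t)))   # ≈ -3 dB (RMS of a sine)
```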

Regards

Hi,

@oscgonfer we are trying to compare the results with calibrated short measurements done by the local highway administration. I got a question about the method used for the A-weighting of the measurement (FAST/SLOW, LEQ/INST/MIN/MAX, …)?

Greetings Bron

Hi @bron!

Super nice! So, the A-weighting function is the one in IEC 61672, which is also described here.

(FAST/SLOW, LEQ/INST/MIN/MAX,…)

I am not sure what that is, but mathematically what we do is calculate the FFT spectrum, equalise the frequency response of the microphone (tested in a chamber), and then apply the A-weighting in the spectrum domain according to the function above. After that, we calculate the RMS of the spectrum and convert it to the dBA scale.
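As a pointer, the A-weighting curve of IEC 61672-1 can be written directly as a function of frequency. A small sketch of it (my own transcription of the standard formula, not the Smart Citizen code linked below):

```python
import numpy as np

def a_weighting_db(f):
    """A-weighting in dB at frequency f (Hz), following the IEC 61672-1 curve."""
    f = np.asarray(f, dtype=float)
    f2 = f ** 2
    ra = (12194.0 ** 2 * f2 ** 2) / (
        (f2 + 20.6 ** 2)
        * np.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20 * np.log10(ra) + 2.00      # offset so that A(1 kHz) ≈ 0 dB

# The weighting is ~0 dB at 1 kHz and attenuates low frequencies strongly,
# which is why low-frequency rumble contributes little to the dBA value:
print(a_weighting_db([100, 1000, 10000]))
```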

  • A note with the actual calculations can be found here.
  • And the code originating the A-weighting function for our samples is here.
  • Finally, the intercomparison of the microphone with a reference instrument in an anechoic chamber, together with the equalisation results, can be found here (page 47 onwards).

Let me know if that answers the question… I can talk more from the DSP side of it, but not so much from the official normative perspective though.

Hope it helps!