In recent years a revolution has been taking place in the world of user interaction with smart devices. The success of Amazon Echo and the subsequent release of Google Home are evidence of a new ecosystem of hardware and apps that allow full voice control of our devices. The in-home hardware is supported by cloud-based services that implement sophisticated learning algorithms for voice recognition. With natural voice-based interaction we are on the brink of a world where simple conversational commands are all that is required to control devices, without the need for a screen to augment or assist in our experience.
Recognizing the movement to the age of voice as a primary user interface, Yole Development predicts an almost threefold increase in the shipments of MEMS microphones from 2013 (2.4 billion units) to 2019 (6.6 billion units). However, with this increase in demand for MEMS microphones comes a refined set of requirements to meet the needs of far field voice detection applications, including systems using beam forming technology.
Advanced MEMS-based microphones that support these new applications are an important contributor to the user experience. When evaluating these devices, there are three important specifications to consider: Signal to Noise Ratio, Acoustic Overload Point and Dynamic Range.
Signal to Noise Ratio
The importance of a microphone’s Signal to Noise Ratio (SNR) in far field and voice detection applications is often misunderstood to mean the ability of the microphone to operate in a noisy environment. However, the SNR of a microphone only measures the signal against the internal noise sources, and so is actually a better indicator of how a microphone will perform in a quiet room, such as a living room or bedroom.
SNR is a good indicator of the minimum sound level that a microphone can detect in a quiet environment and is measured in dB, against a reference signal of 94dBSPL (Sound Pressure Level), a value chosen as it equates to a 1Pa change in pressure, not because it has any significance for audio use cases. Typical high performance microphones currently have an SNR in the range of 64dB to 68dB.
A microphone with an SNR of 64dB will not be able to differentiate sounds below 30dBSPL (94dBSPL – 64dB) from its internal noise. In reality, even the best speech detection algorithm will require the information component of the signal to be elevated above the noise floor by a few dB; the exact requirement depends on the algorithm and processing applied to the signal. If we assume that a speech detection algorithm requires 10dB between the microphone noise floor and the input signal level, then a microphone with an SNR of 64dB will be able to detect sounds down to 40dBSPL. This is equivalent to normal conversation levels (coffee shop conversation volume) at a distance of around 5 to 6 meters, or a quiet conversation (office conversation volume) at around 2 to 3 meters. For the user, this means seamlessly interacting with a device anywhere in the room, without having to face towards it or raise their voice. For an ideal microphone, a 6dB increase in SNR allows detection of sound either twice as far away or at half the sound pressure.
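The arithmetic in this paragraph can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, the 10dB margin is the assumption stated above, and the distance estimate assumes ideal free-field propagation, where sound pressure halves with each doubling of distance.

```python
REFERENCE_SPL_DB = 94.0  # level at which SNR is specified (about 1 Pa)

def noise_floor_spl(snr_db):
    """Equivalent input noise of the microphone, in dB SPL."""
    return REFERENCE_SPL_DB - snr_db

def min_detectable_spl(snr_db, algorithm_margin_db=10.0):
    """Quietest sound usable by an algorithm that needs the given
    margin above the noise floor (the margin is an assumption)."""
    return noise_floor_spl(snr_db) + algorithm_margin_db

def range_multiplier(snr_gain_db):
    """Factor by which detection distance grows for a given SNR gain,
    assuming ideal 1/r pressure falloff (20 dB per decade of distance)."""
    return 10 ** (snr_gain_db / 20.0)

print(noise_floor_spl(64))     # noise floor of a 64 dB SNR microphone
print(min_detectable_spl(64))  # with the assumed 10 dB algorithm margin
print(range_multiplier(6))     # distance factor for 6 dB more SNR
```

In a real room, reflections and background noise mean the distance gain will usually be smaller than this ideal figure.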
Sounds | Sound Pressure Level (dB SPL) | Sound Pressure (N/m², Pa) | Sound Intensity (W/m²)
--- | --- | --- | ---
Jet Engine | 140 | 200 | 100
Thunderclap | 130 | 63.2 | 10
Rock Concert | 120 | 20 | 1
Chainsaw | 110 | 6.3 | 0.1
Subway Train | 100 | 2 | 0.01
Measurement Reference | 94 | 1 | 0.0025
City Traffic | 90 | 0.63 | 0.001
Alarm Clock | 80 | 0.2 | 0.0001
Noisy Restaurant | 70 | 0.063 | 0.00001
Conversational Speech | 60 | 0.02 | 0.000001
Average Home | 50 | 0.0063 | 0.0000001
Living Room | 40 | 0.002 | 0.00000001
Soft Whisper | 30 | 0.00063 | 0.000000001
Rustling Leaves | 20 | 0.0002 | 0.0000000001
Hearing Threshold | 0 | 0.00002 | 0.000000000001
Table 1: Overview of different sounds in relation to sound pressure and intensity
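The conversion between sound pressure level, sound pressure and sound intensity follows the standard decibel definitions. The sketch below assumes the usual reference values of 20 µPa for pressure and 1 pW/m² for intensity at 0dBSPL; the function names are illustrative.

```python
import math

P_REF_PA = 20e-6    # reference sound pressure at 0 dB SPL (20 µPa)
I_REF_W_M2 = 1e-12  # reference sound intensity at 0 dB SPL (1 pW/m^2)

def spl_to_pressure(spl_db):
    """Sound pressure in Pa: 20 dB per factor of ten."""
    return P_REF_PA * 10 ** (spl_db / 20.0)

def spl_to_intensity(spl_db):
    """Sound intensity in W/m^2: 10 dB per factor of ten."""
    return I_REF_W_M2 * 10 ** (spl_db / 10.0)

def pressure_to_spl(pressure_pa):
    """Inverse of spl_to_pressure."""
    return 20.0 * math.log10(pressure_pa / P_REF_PA)

for spl in (120, 94, 60, 0):
    print(f"{spl} dB SPL: {spl_to_pressure(spl):.3g} Pa, "
          f"{spl_to_intensity(spl):.3g} W/m^2")
```

Because pressure scales with 20dB per decade and intensity with 10dB per decade, a 10dB step changes the pressure by a factor of about 3.16 but the intensity by a full factor of ten.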
Acoustic Overload Point
In loud environments, the most important parameter is the microphone’s Acoustic Overload Point (AOP). Put simply, this defines the maximum signal level which the microphone can detect without “too much” distortion, where “too much” is defined as 10 percent Total Harmonic Distortion (THD). Recent advances in MEMS technology have pushed the AOP of microphones from 120dBSPL up to and above 130dBSPL. To put that in perspective, a smartphone in 2012 would have trouble recording a live concert without clipping, whereas modern high end microphones are capable of capturing high quality audio from the front row of a rock festival.
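To give a feel for what a 10 percent THD figure means, the following sketch computes the THD of a sine wave that is hard-clipped at a fraction of its peak. Hard clipping is an assumption made purely for illustration; a real microphone's distortion near its AOP builds up more gradually and depends on the MEMS and ASIC design. The helper names are hypothetical.

```python
import cmath
import math

def harmonic_amplitudes(samples, max_harmonic):
    """Amplitudes of harmonics 1..max_harmonic of one signal period,
    obtained from a direct discrete Fourier transform."""
    n = len(samples)
    amps = []
    for k in range(1, max_harmonic + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        amps.append(2.0 * abs(acc) / n)
    return amps

def thd_of_clipped_sine(clip_level, n_samples=2048, max_harmonic=25):
    """THD (as a ratio) of a unit sine hard-clipped at +/- clip_level."""
    samples = [max(-clip_level,
                   min(clip_level, math.sin(2 * math.pi * i / n_samples)))
               for i in range(n_samples)]
    amps = harmonic_amplitudes(samples, max_harmonic)
    fundamental = amps[0]
    harmonics = math.sqrt(sum(a * a for a in amps[1:]))
    return harmonics / fundamental

for clip in (1.0, 0.9, 0.8, 0.7, 0.6):
    thd_percent = 100 * thd_of_clipped_sine(clip)
    print(f"clipped at {clip:.1f} of peak: THD = {thd_percent:.1f} %")
```

The unclipped case gives essentially zero THD, and the figure rises steadily as more of the waveform is flattened.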
Signals at high levels can come from unexpected sources. Wind noise is a particularly big problem for microphones when used outdoors. Depending on wind speed and direction, and the orientation of the microphone, this noise can exceed 120dBSPL. Wind noise consists of a strong low frequency fundamental tone with higher frequencies at lower levels, and it is usually filtered out before speech processing algorithms are applied or when HD voice calls are made. However, it can only be filtered out if the input signal has not reached the AOP level. Once there is significant harmonic distortion in the signal, filtering can no longer remove the wind noise cleanly.
Having a higher AOP level means that when a user is outside in windy or noisy conditions, the microphone can detect and record a signal that is not clipped even if there is a lot of background noise. This means that speech detection algorithms can still be effectively applied, filtering can be used on HD voice calls to improve audibility, and songs recorded at concerts can still be enjoyed. The key is that even when there is some background noise, it is a lot easier to post process and recover a signal which is not overloaded with harmonic distortion and clipping.
Dynamic Range
While SNR is a good indicator of microphone performance in a quiet environment, and AOP is a good indicator of performance in a loud environment, the Dynamic Range of a microphone is a combination of both parameters, indicating the range of sound pressure levels to which the microphone is sensitive. While a microphone can be designed to be very sensitive to quiet sounds, or very robust to high sound pressure levels, it is far more difficult to design one product which can work well for both situations.
Modern MEMS microphones have Dynamic Range measurements approaching 100dB, allowing the same microphone that can record thundering kettledrums and pipe organs at an orchestra recital to also pick up the whispering of the audience members.
The dynamic range of a microphone is not always explicitly specified in the datasheet, but can be easily determined from other specifications. For digital microphones it is simply obtained by adding the magnitude of the sensitivity parameter (quoted in dBFS at the 94dBSPL reference level) to the SNR. For an analog microphone, the difference between the AOP and the 94dBSPL reference level must be found first, and then this number is added to the SNR.
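As a rough sketch of these two calculations, the functions below compute dynamic range from datasheet-style parameters. The function names and the example values are hypothetical, and the digital case assumes that sensitivity is quoted as a negative dBFS figure at 94dBSPL and that digital full scale coincides with the AOP.

```python
REFERENCE_SPL_DB = 94.0  # level at which SNR and sensitivity are specified

def dynamic_range_analog(snr_db, aop_db_spl):
    """Analog microphone: SNR plus the headroom between the
    94 dB SPL reference level and the AOP."""
    return snr_db + (aop_db_spl - REFERENCE_SPL_DB)

def dynamic_range_digital(snr_db, sensitivity_dbfs):
    """Digital microphone: SNR plus the magnitude of the sensitivity,
    assuming full scale (0 dBFS) corresponds to the AOP."""
    return snr_db + abs(sensitivity_dbfs)

# Hypothetical example parts, not taken from any specific datasheet.
print(dynamic_range_analog(snr_db=64, aop_db_spl=130))
print(dynamic_range_digital(snr_db=65, sensitivity_dbfs=-26))
```

Both functions measure the same thing: the distance in dB between the microphone's noise floor and the loudest signal it can capture.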
Advances in MEMS Microphone Design
Typically a MEMS microphone consists of a flexible charged membrane and a rigid voltage sensing back plate. The membrane and back plate form a capacitor, the value of which changes as the membrane is moved by vibrations in the air. This changing capacitance is converted into a changing voltage which is either amplified and output directly by an analog microphone ASIC, or converted into a digital output signal by an ADC in a digital microphone ASIC. This operation is similar to studio condenser microphones, but on a much smaller scale.

Figure 1: Infineon’s dual back-plate MEMS design ensures high Acoustic Overload Point
Dual back-plate technology is a proprietary method of MEMS construction which uses a plate on each side of the membrane. This method allows fully differential measurement of the signal, improving signal response and quality, increasing SNR and AOP levels and also improving the THD performance of the microphone below the AOP level.
The dual back-plate structure of Infineon’s microphone MEMS has an added benefit in robustness to fast air pressure changes, such as those experienced when a phone is dropped on the ground.

Figure 2: Differential signal output from dual back-plate MEMS
While the AOP of a microphone is defined as the input sound pressure level at which there is 10 percent THD on the microphone output, studies suggest that much lower levels of THD are audible to listeners and can detract from the listener experience. Dual back-plate microphones have an excellent THD profile, staying below 1 percent until very close to the AOP level, giving an excellent listening experience up to and even above 130dBSPL.

Figure 3: THD vs SPL performance of Infineon’s dual back-plate MEMS compared to leading competitor
With the rise of voice interaction as a preferred user interface to devices, compact and efficient MEMS microphones will be pushed further than ever before in the search for perfect performance and usability. Dual back-plate and fully differential microphones, in both analog and digital versions, are among the innovations that will push the performance of audio-ready hardware, with improved THD and AOP performance, expanded frequency response and the potential to push SNR to 70dB and beyond.