Design World


How Audio Technology Will Improve Voice-Enabled Devices

By Jim Jacot, Director of Marketing, Smart Home, Cirrus Logic | November 3, 2017

As humans age, their hearing, memory, and sharpness may fade, but not so with the rapidly emerging category of intelligent voice-controlled smart speakers. Unlike humans, the longer this category of products is around, the better it will get. Smart speakers will hear better and expand in capability. Alexa, the cloud-based voice service powering Amazon Echo’s family of devices, and other services like it are going to keep getting smarter, and their hearing will get even better as audio designers step in with more powerful, higher-performing acoustic technology.

 

Good audio is key to better understanding

One of the next frontiers in improving vocal interaction with voice-enabled devices is higher-performing audio and acoustic technology. By filtering out noise from the surrounding environment, a voice service like Amazon Alexa will be able to hear better from farther away. Let’s take a look at some factors affecting smart speaker performance.

 

Distance, reverberation, and noise

As illustrated in Figure 1, there are three factors that affect listening performance: distance, reverberation, and noise.

Figure 1: Distance, reverberation, and noise can impact a smart speaker’s ability to hear.

  • Distance: A key element in performance because the farther you get from the device, the more noise in the room begins to interfere. Just moving from one meter to three meters away causes a drop of about 10 decibels (dB) in sound pressure level. If Alexa were a human listener, the user’s command would sound about half as loud to her, while the noise remains the same. This is because the sound waves coming from a person’s mouth spread geometrically as they move across the room, like ripples from a stone dropped into a still pond.
  • Reverberation: These ripples bounce off surfaces in the room and arrive at slightly different times. Sounds reflect off walls, ceiling, furniture, or other objects on their way to the device. What Alexa hears is a combination of the user’s voice propagating on a direct path to the device, along with a multiplicity of closely timed echoes. Reverberation is the persistence of sound after it’s produced. The farther away you move, the more rapidly the ratio of direct-path voice to reverberation drops, making it harder for Alexa to understand.
  • Noise: Air conditioning, kids, and appliances (to mention a few) all add background noise, which makes it essential to include several microphones and sophisticated voice capture algorithms.
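The distance figures above can be checked with a short calculation. This is a minimal sketch that assumes free-field spherical spreading (no room reflections) and the common rule of thumb that perceived loudness halves for every 10 dB drop. The function names are illustrative and do not come from any product.

```python
import math

def spl_drop_db(d_near_m: float, d_far_m: float) -> float:
    """Drop in sound pressure level (dB) when the talker moves from
    d_near_m to d_far_m, assuming free-field spherical spreading."""
    return 20.0 * math.log10(d_far_m / d_near_m)

def perceived_loudness_ratio(drop_db: float) -> float:
    """Rule of thumb (an assumption): loudness halves per 10 dB drop."""
    return 2.0 ** (-drop_db / 10.0)

drop = spl_drop_db(1.0, 3.0)
print(f"SPL drop from 1 m to 3 m: {drop:.1f} dB")
print(f"Perceived loudness ratio: {perceived_loudness_ratio(drop):.2f}")
```

Under these assumptions the drop comes out close to the 10 dB quoted above, and the voice sounds roughly half as loud, while room noise at the device is unchanged.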

 

Spectral versus spatial filtering

Before assessing how next generation smart speakers will address issues of distance, reverberation, and noise, let’s look at typical smart speaker product classes defined by coverage, distance, or environment where the device can adequately perform and interact with the user. Basic categories include:

Push button/Tap-to-Talk: can operate within one meter (arm’s length) of the user and may also include a push button trigger or tap-to-talk application.

Hands-free: can operate several meters from the user; voice is normally the primary user interface, such as in a bedroom, hallway, or other small area.

Far field: can hear and operate at five or more meters from the user, and typically operates in larger spaces and noisier environments, like living rooms, kitchens, or other communal areas in the home.

To improve voice response performance, smart speaker designers implement noise suppression techniques, using multiple microphone implementations to address distance, reverberation, and noise issues. In many cases, at least two microphones are needed to receive and distinguish spatial information using noise suppression algorithms.

Noise suppression algorithms typically fall into two categories: spectral and spatial. Spectral noise reduction works by first measuring the ratio of speech to noise in each frequency band and then attenuating the noise accordingly. Even when properly implemented, the result may sound good to the human ear yet still be distorted enough to confuse speech recognition engines. Spatial noise reduction, in contrast, uses beamforming to separate speech from noise based on information like direction and time of arrival.
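To make the spectral approach concrete, here is a minimal sketch of a per-band gain computed from an estimated speech-to-noise ratio. It assumes a Wiener-style gain rule and a gain floor, both of which are illustrative choices rather than anything the article specifies. It also assumes the per-band power estimates are already available, for example from a short-time FFT.

```python
def wiener_gains(noisy_power, noise_power, floor=0.1):
    """Return one gain per frequency band.

    noisy_power: estimated power of speech plus noise in each band
    noise_power: estimated power of noise alone in each band
    floor: minimum gain (hypothetical value), which limits how hard
           a band is suppressed
    """
    gains = []
    for p_noisy, p_noise in zip(noisy_power, noise_power):
        # Estimated speech power cannot be negative.
        p_speech = max(p_noisy - p_noise, 0.0)
        if p_noise > 0.0:
            snr = p_speech / p_noise
            gain = snr / (1.0 + snr)
        else:
            gain = 1.0  # no noise estimated in this band: pass it through
        gains.append(max(gain, floor))
    return gains

# Three bands: strong speech, equal speech and noise, noise only.
print(wiener_gains([10.0, 2.0, 1.0], [1.0, 1.0, 1.0]))
```

Bands dominated by speech pass nearly unchanged, while noise-only bands are pushed down to the floor. Aggressive per-band attenuation of this kind is one way such processing can alter the speech spectrum that a recognition engine sees.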

At least two mics are needed to capture the required spatial information. With multiple mics, smart speakers can determine which direction the voice is coming from and differentiate between the direct-path sound and noise or reverberation coming from other directions. Complex algorithms are then used to cancel or filter unwanted noise. As distance increases, the direct-path sound gets weaker compared to reverberation and noise, which stay relatively constant. This makes it more difficult for the spatial algorithm to differentiate between the direct voice and noise. To compensate, more spatial information is needed, thus requiring more microphones. This is why a push- or tap-to-talk device would use a single microphone, a hands-free device would use two to four, and a far-field implementation would have four or more (as shown in the table below).

| Device Classification | Distance | Use Environment | # of Mics |
|---|---|---|---|
| Push/Tap-to-Talk | < 1 meter | Within arm’s length, push button, or tap-to-talk trigger | 1 mic |
| Hands-free | 1-4 meters | Smaller areas: bedroom, hallway | 2-4 mics |
| Far field | 5 or more meters | Larger, communal areas: kitchen, living room | 4-7 mics |

Table 1. Number of microphones usually implemented per smart speaker device classification.

Multiple-mic solutions use beamforming, or spatial filtering, to detect sound and determine what noise to block. More mics mean higher production cost and more computation. These computations can affect battery life, which is why many smart speakers must be plugged in or remain in the charging cradle. However, this too is changing.
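As a rough illustration of how two microphones yield direction information, the sketch below estimates the time difference of arrival between two mic signals by brute-force cross-correlation and converts it to an angle. It assumes a far-field source, a speed of sound of 343 m/s, and hypothetical values for sample rate and mic spacing. Real products use far more elaborate beamforming than this.

```python
import math
import random

SPEED_OF_SOUND_M_S = 343.0  # approximate value for room-temperature air

def estimate_delay_samples(mic_a, mic_b, max_lag):
    """Lag (in samples) by which mic_b trails mic_a, chosen as the lag
    that maximizes the cross-correlation of the two signals."""
    best_lag, best_score = 0, float("-inf")
    n = len(mic_a)
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += mic_a[i] * mic_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def arrival_angle_deg(delay_samples, sample_rate_hz, mic_spacing_m):
    """Angle from broadside, assuming a far-field (plane wave) source."""
    tau = delay_samples / sample_rate_hz
    sin_theta = SPEED_OF_SOUND_M_S * tau / mic_spacing_m
    sin_theta = max(-1.0, min(1.0, sin_theta))  # guard against rounding
    return math.degrees(math.asin(sin_theta))

# Synthetic test signal: the same noise burst reaches mic_b 3 samples late.
rng = random.Random(0)
source = [rng.uniform(-1.0, 1.0) for _ in range(256)]
mic_a = source
mic_b = [0.0] * 3 + source[:-3]

lag = estimate_delay_samples(mic_a, mic_b, max_lag=10)
angle = arrival_angle_deg(lag, sample_rate_hz=16000, mic_spacing_m=0.1)
print(f"delay: {lag} samples, angle: {angle:.1f} degrees from broadside")
```

With only two mics, the same delay is produced by mirror-image directions on either side of the mic axis. This ambiguity is one reason larger arrays help.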

 

It’s not just the number of microphones

When purchasing next-generation smart speakers, the number of microphones included won’t tell the entire story. Audio and acoustic technology once developed by audio leaders for professional sound systems is now being developed for consumer voice-enabled devices like smartphones, speakers, and other voice-controlled applications.

Innovative techniques for locating and classifying noise sources will improve voice capture performance. Advanced audio techniques for Acoustic Echo Cancellation (AEC), coupled with multi-mic noise suppression, will also improve “barge-in” performance. Barge-in allows a voice command to interrupt loud music or a voice response in progress so that the device can accurately interpret the new command.

Voice-enabled smart speakers generally use on-device keyword spotting to detect a wake word. If you’re streaming music or listening to other content, the microphones pick this up, which may completely “swamp out” your voice and make it difficult to detect the wake word. Playback detected by microphones is called echo—the louder the playback, the higher the echo. The job of AEC is to cancel this playback. The attenuation an echo incurs while propagating from speaker to microphone is called Echo Return Loss (ERL). The measurement of how much echo the AEC cancels out is called Echo Return Loss Enhancement (ERLE). Figure 2 illustrates AEC, showing a live test graph of ERLE measurement on the Cirrus Logic Alexa Voice Service (AVS) development kit console.
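The ERL and ERLE definitions above can be tried on synthetic data. The sketch below uses a normalized least-mean-squares (NLMS) adaptive filter, a common textbook echo canceller chosen here as an assumption. It is not a description of the algorithm in the Cirrus Logic kit. The echo path coefficients, noise level, filter length, and step size are all hypothetical.

```python
import math
import random

def power(samples):
    """Mean squared value of a block of samples."""
    return sum(v * v for v in samples) / len(samples)

def ratio_db(reference, signal):
    """Power of reference relative to signal, in dB."""
    return 10.0 * math.log10(power(reference) / power(signal))

def nlms_cancel(playback, mic, taps=8, step=0.5, eps=1e-6):
    """Subtract an adaptively estimated echo of playback from mic.
    Returns the residual signal left after cancellation."""
    weights = [0.0] * taps
    history = [0.0] * taps  # recent playback samples, newest first
    residual = []
    for x, d in zip(playback, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate
        norm = sum(h * h for h in history) + eps
        weights = [w + step * e * h / norm for w, h in zip(weights, history)]
        residual.append(e)
    return residual

rng = random.Random(0)
playback = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
echo_path = [0.5, 0.3, -0.2, 0.1]  # hypothetical speaker-to-mic response
echo = [
    sum(h * playback[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
    for n in range(len(playback))
]
# Mic signal: echo plus a small noise floor, with no near-end speech.
mic = [e + rng.uniform(-0.001, 0.001) for e in echo]

residual = nlms_cancel(playback, mic)
erl = ratio_db(playback, echo)                  # loss from speaker to mic
erle = ratio_db(mic[-1000:], residual[-1000:])  # measured after convergence
print(f"ERL: {erl:.1f} dB, ERLE: {erle:.1f} dB")
```

ERLE is measured here on the last 1,000 samples so that the filter has had time to converge. In this idealized, distortion-free setup the number is far higher than a real room and loudspeaker would allow.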

Figure 2. Test graph showing AEC functioning in a two-mic smart speaker solution.

Higher performing audio leads to better barge-in

Another key factor to look for in a voice-enabled device or smart speaker is audio quality. There’s a direct correlation between a device’s audio quality and its responsiveness. The ability to interrupt a voice-enabled device during playback depends on its AEC capabilities and playback system. If loud playback has low distortion, it’ll sound good, but the AEC must still cancel out the playback so it doesn’t interfere with the device hearing you call out the wake word.

Therefore, how well a voice-enabled device works and plays music is also a direct function of how much distortion comes from the speakers. Cheaper speakers tend to produce poorer quality audio and higher distortion as volume increases. Technically, the playback system can never be better than its digital-to-analog converters (DACs).

 

Lower power designs could open the door to new portable devices

New development kits with advanced audio functions make it easier for smart speaker OEMs to develop next-generation designs and devices. Device OEMs won’t have to be experts in acoustics or audio design to gain the improved functionality, performance, and features that come from better audio capabilities. OEMs will also benefit from new lower-power, high-performance semiconductor designs.

As the use of voice response smart speakers grows, new audio and acoustic technology will improve performance and pave the way for new, innovative applications. Savvy buyers will look beyond the number of microphones, looking for high-quality audio and portability with long battery life as determining factors when purchasing next-generation voice-enabled smart speaker devices. Better acoustic technology, such as AEC, noise reduction, and low-distortion playback, will improve barge-in performance so that voice services can hear and understand you better.

Filed Under: M2M (machine to machine)

 

Copyright © 2025 WTWH Media LLC. All Rights Reserved. The material on this site may not be reproduced, distributed, transmitted, cached or otherwise used, except with the prior written permission of WTWH Media