How different areas of the human auditory cortex represent the acoustic components of mixed speech was unknown until research from Columbia University’s Zuckerman Institute in New York, published in an October issue of Neuron. Studying how the brain focuses on a single speaker has promising implications for hearing aids.
This path of inquiry started when team leader Nima Mesgarani was a graduate student at the University of Maryland. He worked on an algorithm to convert the neural responses recorded from the brain back to the sound that elicited them.
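Stimulus reconstruction of this kind is typically framed as a regression problem: learn weights that map time-lagged neural responses back to the sound that produced them. The sketch below is a minimal, hypothetical version in Python/NumPy; the function names, lag count, and ridge penalty are illustrative assumptions, not the actual method from Mesgarani's work.

```python
import numpy as np

def lagged_design(neural, lags):
    """Stack the current and previous `lags - 1` samples of every
    channel into one row per time step."""
    T, C = neural.shape
    X = np.zeros((T - lags + 1, C * lags))
    for i in range(lags - 1, T):
        X[i - lags + 1] = neural[i - lags + 1:i + 1].ravel()
    return X

def fit_linear_decoder(neural, stimulus, lags=5, ridge=1e-2):
    """Ridge-regularized least-squares weights mapping lagged
    neural channels to the stimulus that elicited them."""
    X = lagged_design(neural, lags)
    y = stimulus[lags - 1:]
    return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ y)

def reconstruct_stimulus(neural, weights, lags=5):
    """Apply a trained decoder to new neural recordings."""
    return lagged_design(neural, lags) @ weights
```

Trained on pairs of neural recordings and the sounds that elicited them, such a decoder can then produce an estimate of what a listener heard from brain activity alone.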
“A question that came up for me at the time was what sound will be reconstructed from the brain if a listener is listening to a speaker in multi-talker scenarios?” he recalls. “Will we see the sound of all speakers in the reconstructions, or just the person the listener is focusing on? In 2012, we showed that the latter is the case. The brain filters out the competing speakers and selectively represents the attended talker.”
In 2017, Mesgarani took his research a step further when he developed new speech processing algorithms to solve the speech separation problem in machines.
“These new algorithms are able to take the mixed voice of speakers that we have used in the training, and successfully separate them into different streams,” says Mesgarani. “This technology was instrumental for the 2017 paper, which compared the brain signals with the separated sound sources to identify and amplify the attended voice. A big limitation of this work was that we had to train on the speakers, which makes it hard to work with new speakers.”
In 2019, this limitation was removed with a new speech separation algorithm that works on speakers never seen during training. The algorithm compares the brainwaves with the separated sources and amplifies the source that is most similar to the listener’s brain waves.
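That matching step can be sketched quite simply, assuming the separated sources and a stimulus envelope reconstructed from the brainwaves are already available. The names, the correlation-based similarity measure, and the 9 dB boost below are illustrative assumptions, not the published system.

```python
import numpy as np

def select_attended_source(sources, neural_envelope):
    """Return the index of the separated source whose amplitude
    envelope correlates best with the envelope decoded from the
    listener's brain signals."""
    scores = []
    for src in sources:
        env = np.abs(src)  # crude amplitude envelope
        n = min(len(env), len(neural_envelope))
        scores.append(np.corrcoef(env[:n], neural_envelope[:n])[0, 1])
    return int(np.argmax(scores))

def remix(sources, attended_idx, gain_db=9.0):
    """Re-mix the acoustic scene with the attended source boosted."""
    gain = 10 ** (gain_db / 20.0)
    out = np.zeros_like(sources[0])
    for i, src in enumerate(sources):
        out = out + src * (gain if i == attended_idx else 1.0)
    return out
```

In a real device this loop would run continuously, so the boosted voice can follow the listener's attention as it shifts between talkers.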
The team worked with eight volunteers who were already undergoing seizure monitoring with implanted electrodes as part of epilepsy treatment; it would have been too invasive to ask otherwise healthy individuals to undergo neurosurgery. Mesgarani says that his group and others have since shown that a listener’s attentional focus can be decoded non-invasively, from scalp and around-the-ear EEG. The method can therefore be used with a larger population of subjects, not only epilepsy patients.
It turns out the brain picks out who it wants to hear in two separate steps. One area represents both speakers’ voices; neurons in a second area then amplify the attended voice while dampening the other. The whole process is remarkably fast, taking about 150 milliseconds.
This understanding of how the brain processes sound can be used to improve hearing aids. It’s important to note that, at this point, the technology requires the user to have some residual hearing.
“Our goal is to improve the baseline intelligibility in subjects,” Mesgarani says. “Hearing impaired subjects have a lower baseline, hence this method may work in situations that are less acoustically noisy. But the effect will be the same, which is improving their baseline hearing threshold. We have preliminary results that confirm that it is indeed possible to decode the attentional focus in hearing impaired listeners.”
In other words, hearing aids utilizing this technology could theoretically pick up on a voice of the user’s choice, making cocktail parties and other noisy scenarios easier to navigate.
Mesgarani is also pursuing a separate line of research aimed at more constrained situations, such as when someone is unable to communicate. In that case, the goal is to decode and synthesize speech directly from the subject’s neural recordings.
Another application of this technology is a device that connects directly to the brain, similar to a cochlear implant. Mesgarani’s lab has a demo showing how it would work.
“This technology can ultimately result in smart hearing aids that track the brainwaves of the subject,” says Mesgarani. “These brain-controlled hearing devices, therefore, can have targeted filtering of the acoustic scene and therefore help the subject achieve his/her desired outcome.”