

Bid farewell to the "Hawking tone": scientists design a new brain-computer device that, for the first time, lets humans "speak" directly with brainwaves.

2025-04-05 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/02 Report--

Produced by Big Data Digest

Authors: Wei Zimin, Zhou Suyun

For the first time in human history, a complete spoken sentence has been output directly from the brain.

In a new study published in the journal Nature on April 25th, neuroscientists describe a device that converts brain signals into speech. Their state-of-the-art brain-computer interface generates natural synthetic speech through a virtual vocal tract controlled by brain activity. Put simply, it decodes the brain signals that drive the lips, jaw, tongue and larynx and converts them into the speech the patient wants to produce.

"For the first time, we can generate complete spoken sentences from an individual's brain activity," said Edward Chang, a professor of neurosurgery at the University of California, San Francisco and lead researcher on the work.

Edward Chang, a professor of neurosurgery at the University of California, San Francisco, focuses on the brain mechanisms of speech, movement and human emotion. Image source: UCSF

Speech disorders are widespread. Thousands of people are unable to communicate normally after losing speech to accidental injuries, strokes or neurodegenerative diseases such as amyotrophic lateral sclerosis (ALS).

Assistive devices that generate speech output have long existed. The best known is the speech synthesizer used by Stephen Hawking, which spells out words through eye and facial movements and, at best, lets paralyzed users produce up to eight words per minute.

Source: The Guardian

These technologies have improved the lives of people with aphasia, but compared with the roughly 150 words per minute of natural speech, output through such external interfaces remains far too slow.

The latest results published in Nature raise communication restoration to a new level: synthesizing speech by reading brain signals directly. Compared with word-by-word input, this is far more efficient, and it addresses long-standing problems of earlier speech-output technology, such as the flat pronunciation and intonation of syllable-by-syllable synthesis. If it reaches the clinic, it could greatly improve the communication ability of patients with speech disorders.

The Edward Chang team also released a clear, intelligible audio example: the first half is a sentence read aloud by a participant in the experiment, and the second half was generated automatically from a recording of the participant's brain activity. Listen for yourself first.

Although the speech generated from brainwaves is still noticeably blurrier than natural speech, it is output as whole sentences and retains phrasing and intonation. According to the study, as many as 70% of native English-speaking listeners said they could understand the content.

In fact, early last year Science magazine also reported on important progress by the Edward Chang team on brain-computer interfaces. At that point the research was still limited to single digits: subjects listened to spoken numbers, and the researchers reconstructed the audio from the brain activity recorded as they listened. The numbers were recognizable, but the output stopped at single words.

Compared with that audio, it took just over a year to go from single words to complete sentences. According to Edward Chang, the technology is now "within reach": "we should be able to build a clinically viable device for patients who have lost the ability to speak."

Paper download link:

https://www.nature.com/articles/s41586-019-1119-1

Interpret the brain's intention, then generate speech

For people who cannot communicate because of nerve injury, technology that translates neural activity into speech would be transformative.

Decoding speech from neural activity is challenging, because speaking requires extremely precise and fast multi-dimensional control of the vocal tract. Chang's team built a neural decoder that explicitly uses the motor and sound representations encoded in human cortical activity to synthesize audible speech. A recurrent neural network first decodes the recorded cortical activity into representations of articulatory movement, then converts those representations into speech acoustics.

Source: Nature

In a closed-vocabulary test, listeners could readily identify and transcribe the speech synthesized from cortical activity. Decoding through intermediate articulatory dynamics improved performance even with limited data. These findings strengthen the clinical feasibility of using a speech neuroprosthesis to restore spoken communication. Although the experiments were conducted in volunteers with intact speech, the technique is expected eventually to restore the voices of people who have lost the ability to speak through paralysis and other forms of nerve injury.

Experimental process

The team recruited five volunteers who were about to undergo neurosurgery for epilepsy. In preparation for surgery, doctors temporarily implanted electrodes in their brains to map the sources of their seizures. While the electrodes were in place, the volunteers read hundreds of sentences aloud as the scientists recorded activity in brain areas known to be involved in speech production.

Decoding speech takes only two steps: convert the electrical signals in the brain into vocal-tract movements, then convert those movements into speech.

They did not need to collect data for the second step themselves: other researchers had previously compiled a large database mapping vocal-tract movements to the speech they produce, which the team could use to reverse-engineer the patients' vocal-tract movements.
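The reverse-engineering idea can be illustrated with a hypothetical reference table: given an acoustic frame, find the database entry whose acoustics are closest and read off its articulator positions. The random data and the nearest-neighbour rule here are illustrative stand-ins, not the statistical inversion the study actually used.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical reference database pairing acoustic feature vectors
# with the vocal-tract movements that produced them (in the study,
# this knowledge came from prior corpora, not from the patients).
db_acoustic = rng.normal(size=(500, 16))  # acoustic features
db_artic = rng.normal(size=(500, 12))     # matching articulator positions

def invert(acoustic_frame):
    """Estimate articulator positions for one acoustic frame by
    nearest-neighbour lookup in the reference database."""
    dists = np.linalg.norm(db_acoustic - acoustic_frame, axis=1)
    return db_artic[np.argmin(dists)]

# Querying with a frame that is in the database returns its own
# articulator entry (distance zero to itself).
est = invert(db_acoustic[42])
```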

They then trained machine-learning algorithms to match patterns of electrical activity in the brain with the vocal-tract movements they would produce, such as pressing the lips together, tightening the vocal cords and moving the tip of the tongue to the roof of the mouth. They describe the technology as a "virtual vocal tract" that the brain can control directly to produce synthesis resembling a human voice.

To test the intelligibility of synthetic speech, scientists invited hundreds of people to transcribe samples through the Amazon Mechanical Turk platform.

In the test, 100 sentences were played, and listeners chose from pools of 25 words at a time that mixed the target words with random ones. Listeners identified the words with 43% accuracy.
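Scoring such a transcription test amounts to comparing what a listener wrote against the target sentence. A minimal sketch, assuming a simple position-by-position word match rather than the full word-error-rate alignment such studies typically use:

```python
def word_accuracy(target: str, transcribed: str) -> float:
    """Fraction of target words the listener reproduced in place.
    A simplified stand-in for edit-distance-based scoring."""
    t_words = target.split()
    h_words = transcribed.split()
    hits = sum(1 for a, b in zip(t_words, h_words) if a == b)
    return hits / len(t_words)

# Hypothetical example: the listener mishears one word of five.
acc = word_accuracy("the boat sailed at dawn",
                    "the goat sailed at dawn")
```

Averaging this score over all sentences and listeners gives an aggregate intelligibility figure like the 43% reported above.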

Some sounds, such as "sh" and "z", were synthesized accurately, while the decoder could not fully distinguish plosives such as "b" and "p".

But such flaws need not block normal communication: in daily life we gradually become familiar with a person's pronunciation and infer what they mean to say.

For now, the algorithm cannot decode sentences it was not trained on, and there is still a long way to go before it becomes a clinically viable brain-computer interface for speech synthesis.

Exploring the brain-computer interface

In fact, as early as the beginning of last year, Science magazine reported on important progress in brain-computer interfaces by the Edward Chang team and by groups at Columbia University and the University of Bremen in Germany: surgeons placed electrodes on the brain, collected the signals those electrodes produced, and converted them into speech by computer. Neural network models then reconstructed words and sentences that, in some cases, human listeners could understand.

Image source: Science

At the time, the Columbia researchers tried to work out how the brain switches neurons on and off at different moments, and to infer the content of speech from that. Such models perform best on very precise data, though, and collecting that data requires opening the skull.

Researchers can take such recordings only in rare cases. In one, electrical readings from the exposed brain help surgeons locate and avoid critical speech and motor areas while removing a brain tumor; in another, electrodes are implanted in epilepsy patients and left in place for several days before surgery to pinpoint the origin of their seizures.

At that time, Edward Chang and his team reconstructed entire sentences from brain activity captured in speech and motor areas while three epilepsy patients read aloud.

In an online test, 166 people each heard one of the sentences and chose among 10 written options. The model identified sentences correctly in more than 80% of cases. The researchers then pushed the model further, using it to recreate sentences from participants' lip movements.

The researchers at the time also released a recording from the experiment, in which a group of listeners heard digits "spoken" by the computer and judged what they heard, with about 75% accuracy. The sound is rough, but by listening carefully you can still make out the numbers.

"Iron Man" Elon Musk is also interested in this field. Beyond electric cars and space exploration, he moved into brain-computer interfaces early, founding the BCI company Neuralink in 2016 together with a number of well-known neuroscientists from the University of California. Its short-term goal is to treat serious brain diseases such as Alzheimer's and Parkinson's, and its ultimate aim is to augment the brain by "merging with AI".

Human progress driven by artificial intelligence, neuroscience and linguistics

Researchers on the project are currently experimenting with higher-density electrode arrays and more advanced machine-learning algorithms, which they hope will further improve the synthesized speech. The technology's next goal is to bring the system to people who cannot speak, and to test whether they can learn to use it without being able to train it with their own voices, so that they can say whatever they want.

Josh Chartier, a bioengineering graduate student in the Chang Lab. Image source: UCSF

Because the decoder is grounded in vocal-tract anatomy, the researchers found it could synthesize new sentences from participants' brain activity nearly as well as the sentences the algorithm had been trained on. Even when a participant only mouthed sentences without making a sound, the system could still produce an intelligible synthesized version in the speaker's voice.

The researchers also found that the neural code for vocal movements partially overlapped across participants: a vocal-tract simulation fitted to one subject could respond to neural commands recorded from another participant's brain. Taken together, these results suggest that people who have lost speech to neurological disease could learn to control such a system and use it as a speech prosthesis.

"People with physical disabilities have learned to control robotic limbs with their brains," said Chartier. "We hope that one day people with speech disabilities will learn to speak again using this brain-controlled artificial vocal tract."

Another researcher, Anumanchipalli, added: "I am proud to have brought together expertise in neuroscience, linguistics and machine learning for this milestone toward helping patients with neurological disabilities."
