In most people's impression, AI is a technology built for the "majority".
"Majority" here means two things: first, the relevant data is plentiful and easy to accumulate, which suits AI's heavy dependence on massive data; second, the application scenarios are broad, making it easier to monetize and recoup costs, which suits the high investment threshold of AI research and development.
Face recognition, voice interaction, and the other AI applications we encounter every day fit both characteristics. But that does not mean AI for the "minority" is a blank space.
On Global Accessibility Awareness Day, it is worth focusing on technologies that can erase the line between minority and majority, such as sign language recognition for people with hearing impairment.
Sign language that you and I don't understand: why is it so hard for AI to read?
Sign language is a unique mode of communication for people with hearing impairment: gestures trace out movements that imitate images or represent syllables, and their changes combine to form meanings or words. Yet although it allows people with hearing impairment to communicate with one another, or with hearing people who know sign language, it still cannot meet the communication needs between the hearing impaired and the general public.
This means that people with hearing impairment may encounter inconvenience in social and public settings, such as government offices or service counters.
AI, on the other hand, happens to be a solution.
Some apps already use AI gesture recognition, for example triggering AR effects when you make a "finger heart" while taking a photo. If this kind of gesture capture could be matched with gesture semantics, couldn't we achieve translation and generation of sign language?
This logic is correct, but there is still a long way to go from correct logic to feasible application.
First, sign language expression has its own particularities, which makes it hard to capture.
We know that gestures are never made with absolute precision. In addition, the signs for some words are very similar, and sign language is usually expressed in whole sentences, with no obvious gaps between words. Gesture recognition relying on an ordinary front-facing camera was therefore basically not feasible in the past.
So many teams turned to peripherals, such as the Kinect-based sign language translation system from the University of Science and Technology of China and Microsoft, or the sign-language-recognition gloves developed at the University of California. But these peripherals are either hard to carry or expensive, which makes them very difficult to popularize.
At the same time, sign language varies by country and region, which makes it hard to build a general model.
Sign language has two forms: "grammatical sign language" and "natural sign language". Grammatical sign language functions like standard Mandarin (Putonghua), while natural sign language, like a dialect, differs greatly between countries, regions, and even cities. As a result, collecting and annotating sign language data carries high cost and a heavy workload.
Amazon, for example, has suggested that its Alexa smart speakers could be modified to translate some simple signs. However, for lack of a large-scale training dataset, the feature can only recognize some simple American Sign Language and has remained at the laboratory stage.
No tricks or shortcuts for sign language AI: Tencent YouTu Lab's spirit of equality
Despite the difficulties, technology companies have continued to make progress on sign language AI.
For example, Tencent YouTu Lab, together with the Shenzhen Information Accessibility Research Association, has now launched the YouTu AI sign language translator, a major step forward for sign language AI in practice.
The breakthroughs of the YouTu AI sign language translator lie in two areas: progress in sign language AI technology itself, and a breakthrough in application scenarios.
On the technology side, the work can be divided into two parts: the recognition model and the dataset. For the dataset, YouTu built its own sign language recognition dataset by working with relevant social organizations and people with hearing impairment. It now covers nearly 1,000 daily expressions and 900 commonly used words, making it the largest Chinese sign language recognition dataset to date. The dataset also expands the diversity of expression habits and signing speeds to reflect regional differences in sign language.
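The dataset itself has not been published, so purely as an illustration of what one record in such a corpus might contain, a sample could pair a short clip with its word-level sign annotations, its sentence-level meaning, and signer metadata capturing the regional and speed diversity mentioned above. The field names and values below are hypothetical, not YouTu's actual schema.

```python
# Hypothetical layout of one record in a Chinese sign language recognition
# corpus like the one described above; fields are illustrative only.
from dataclasses import dataclass
from typing import List


@dataclass
class SignLanguageSample:
    video_path: str        # short clip of one signed sentence
    glosses: List[str]     # word-level sign annotations, in signing order
    sentence: str          # natural-language translation of the sentence
    region: str            # regional variant of natural sign language
    signing_speed: str     # e.g. "slow" / "normal" / "fast"


sample = SignLanguageSample(
    video_path="clips/greeting_0001.mp4",
    glosses=["你", "好", "谢谢"],
    sentence="你好，谢谢。",
    region="Shenzhen",
    signing_speed="normal",
)
print(sample.glosses)
```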
For the recognition model, YouTu proposed some new algorithmic ideas: a 2D convolutional neural network extracts static information from gestures while a 3D convolutional neural network extracts dynamic information, and the two are combined to improve video recognition, removing any dependence on extra sensors. To handle the whole-sentence nature of sign language, YouTu added word-level information mining after the video frames to verify the features produced by the feature extractor and better determine the boundaries between gestures and words; besides improving recognition accuracy, this also strengthens the model's ability to generalize over regional expressions in natural sign language. On top of this, YouTu introduced context understanding into the model to handle the more complex requirements of sign language recognition and translation.
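YouTu's exact architecture has not been published, but the description above (a 2D CNN for per-frame static features, a 3D CNN for motion, and a word-level head that helps locate gesture-to-word boundaries) maps onto a familiar video-recognition pattern. The following is a minimal, hypothetical PyTorch sketch of that idea; the module choices, dimensions, fusion strategy, and the suggestion of CTC-style training are assumptions for illustration, not Tencent's implementation.

```python
# Minimal sketch of the two-stream idea described above: a 2D CNN extracts
# per-frame (static) features, a 3D CNN extracts clip-level (dynamic)
# features, and a word-level head scores each frame against the sign
# vocabulary. All sizes are illustrative, not Tencent YouTu's.
import torch
import torch.nn as nn


class TwoStreamSignRecognizer(nn.Module):
    def __init__(self, vocab_size: int = 900, feat_dim: int = 256):
        super().__init__()
        # 2D stream: applied frame by frame for static hand-shape cues.
        self.cnn2d = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, feat_dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # 3D stream: applied to the whole clip for motion/dynamic cues.
        self.cnn3d = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, stride=(1, 2, 2), padding=1),
            nn.ReLU(),
            nn.Conv3d(32, feat_dim, kernel_size=3, stride=(1, 2, 2), padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d((None, 1, 1)),
        )
        # Word-level head: per-frame scores over the vocabulary plus a blank
        # class, so boundaries could be learned with a CTC-style loss.
        self.word_head = nn.Linear(2 * feat_dim, vocab_size + 1)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (batch, channels, time, height, width)
        b, c, t, h, w = clip.shape
        frames = clip.permute(0, 2, 1, 3, 4).reshape(b * t, c, h, w)
        static = self.cnn2d(frames).reshape(b, t, -1)        # (b, t, feat)
        dynamic = self.cnn3d(clip).squeeze(-1).squeeze(-1)    # (b, feat, t)
        dynamic = dynamic.transpose(1, 2)                     # (b, t, feat)
        fused = torch.cat([static, dynamic], dim=-1)          # (b, t, 2*feat)
        return self.word_head(fused)                          # (b, t, vocab+1)


if __name__ == "__main__":
    model = TwoStreamSignRecognizer()
    dummy = torch.randn(2, 3, 16, 112, 112)  # 2 clips of 16 frames each
    print(model(dummy).shape)                # torch.Size([2, 16, 901])
```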
Even with these improvements, however, the application side still faces constraints.
Because the high-precision algorithm demands a lot of computing power, the YouTu AI sign language translator still relies on high-performance computers running in the background. And because image- and video-based recognition does not yet cope well with complex scenes, the current plan is to deploy the translator in public service venues such as airports, high-speed rail stations, and civil affairs offices, making up for the communication barriers people with hearing impairment face there due to the low penetration of sign language, and helping build information-accessible cities.
In fact, it is not hard to see that, although YouTu Lab's AI sign language translator has greatly improved the accuracy of sign language translation and found a feasible application path for it, if we break the technology down, YouTu's breakthrough did not come from some sudden, astonishing advance in basic science, but from consistent long-term investment in research and data accumulation: escaping the old shortage of sign language corpora so the algorithm could be iterated continuously.
In other words, Tencent has invested almost as much energy and money in this "minority" AI technology as in "majority" AI technologies. For the AI industry, this is a spirit of equality.
From human-centered AI to Tech for Good: why we should guide the technological current more proactively
Tencent's seemingly "against the current" approach in fact reflects an undercurrent that is quietly gathering across the AI industry.
A few days earlier, at the Digital China Summit in Fuzhou, Ma Huateng raised the concept of "Tech for Good" for the first time, saying: "We hope Tech for Good will become part of Tencent's future vision and mission. We believe technology can benefit humankind; humankind should use technology well, avoid its abuse, and put an end to its malicious use; and technology should strive to solve the social problems its own development brings."
Coincidentally, after returning to Stanford, Fei-Fei Li helped launch the Stanford Institute for Human-Centered Artificial Intelligence (HAI) and began serving as its co-director this year. HAI's research goal is to promote AI development that benefits humanity and to anticipate AI's real impact on human life.
Tech giants and academic standard-bearers have turned their attention in the same direction because people have gradually realized that the momentum of technological forces such as AI, 5G, and industrial digitization is now so strong that it must be guided, or even restrained.
As noted above, technology companies have been a major driving force behind this wave of technology, and the pursuit of profit is naturally an enterprise's instinct; so companies prioritize technologies that serve the majority, have broad applications, and carry relatively low R&D costs.
There is nothing wrong with this in itself, but the efficiency gains from new technologies like AI are so significant that many people are now asking whether the areas and groups that cannot yet access them will be squeezed out or even marginalized.
For example, as machine translation keeps improving for mainstream languages such as English, Chinese, Japanese, French, and Russian, will minority languages with scarce corpora and fewer applications be further marginalized for lack of technological support?
Similarly, as more and more public services are handled by AI technologies such as voice interaction and image recognition, will people with hearing or visual impairments find it even harder to obtain those services?
Something similar has already happened. At the end of 2018, the United Nations released a report on the digitization of the British government. It found that homelessness in England had risen 60% since 2010, that 1.2 million people were on the waiting list for social housing, and that demand for the food banks that help the poor had nearly quadrupled, in part because many poor people did not know how to apply for benefits online, or had no way to get online at home, and so sank deeper and deeper into poverty.
In many cases, even technology with no ill intent can set off unpredictable currents. Perhaps we should guide technology toward good more proactively.
Concluding remarks
Finally, let's take a look at this set of figures:
According to 2017 estimates by the Beijing Hearing Association, the number of people with hearing impairment in China has reached roughly 72 million. Globally, the latest figures released by the World Health Organization show that about 466 million people suffer from disabling hearing loss.
Clearly, "majority" and "minority" are relative concepts, with no clean black-and-white line between them. For AI in particular, a technology so good at imitating human abilities, its existence should knock down the invisible walls that hinder communication between groups, not reinforce them. If our goal in using technology is to build a better world, and we are able to leave no one behind, then we should leave no one behind.
Fortunately, in sign language recognition and translation for the hearing impaired we can already see this trend: the computing brain is not AI's only model; so is the warm human heart. We believe that, guided by academia and the industry giants, more and more enterprises will turn to barrier-free AI technology and keep breaking down barriers of every kind.
Love may be silent, yet it echoes.
AI may be silent, yet it echoes too.