2020-07-16 19:17:47
Author | Zhang Jiajun
Editor | Cong Mo
Machine translation, which aims to have computers translate automatically between natural languages, has long been an important research direction in natural language processing and artificial intelligence. It has seen breakthroughs in recent years and has become a widely known and commonly used technology.
When it comes to the origin of machine translation, anyone with some knowledge of the field knows that Warren Weaver of the United States first put forward the concept in 1947 and formally recorded it in a memorandum titled "Translation" in July 1949. However, most people probably do not know who exactly Weaver was or how he arrived at the idea. As a researcher in machine translation, I am very interested in these questions and hope to help more people learn the interesting history behind the birth of the concept.
Warren Weaver
One might guess that Weaver was a scholar engaged in language translation who, weighed down by the burden of manual translation, hit upon the idea of having computers translate automatically. In fact, his life story is far more surprising.
If you think that proposing the concept of machine translation is impressive enough, consider this: Weaver was a mathematician who helped the US military improve fire-control systems and bomber technology during World War II, and he was the first to put forward the concept of molecular biology. He also co-authored the epoch-making book The Mathematical Theory of Communication with Claude Shannon, the father of information theory. It is almost embarrassing to realize that machine translation seems to have been just a small hobby of his.
That so many contributions in different fields came from one person is a sign of how extraordinary Weaver was. We may wonder what his profession actually was. In fact, it is difficult to sum up his career with a single title such as professor, researcher or scientist. Let us take a slow walk through Weaver's life.
one
From Wisconsin to New York
Weaver was born on July 17, 1894 in Reedsburg, Wisconsin, USA. From an early age he loved tinkering and was determined to become an engineer. After entering the University of Wisconsin, under the influence of two teachers, Charles Slichter and Max Mason, Weaver found that his interest and enthusiasm lay not in engineering but in applied mathematics and theoretical physics. He resolutely turned to mathematics and earned his mathematics degree in 1916. He also received a degree in civil engineering in 1917, so it seems he had not given up engineering entirely. After graduating, he worked briefly as a mathematics teacher at Throop College, the predecessor of the famous California Institute of Technology, and then served in the US military's air service for two years. After leaving the service, he returned to the University of Wisconsin to continue his doctoral research and received his doctorate in 1921. He then stayed on as a professor of mathematics and became head of the university's mathematics department in 1928. By Weaver's own account, he was not particularly good at mathematical research, and had he continued on this path his life would probably have been an ordinary one.
At this point Mason, Weaver's lifelong mentor, appeared again. He first invited Weaver to write the classic physics textbook The Electromagnetic Field with him; later, after becoming president of the Rockefeller Foundation, he invited Weaver to serve as director of the foundation's Department of Natural Sciences. The Rockefeller Foundation is based in New York, so changing jobs meant not only moving but also a shift in career direction, and going from university professor to manager of research programs may not have seemed very attractive. After thinking it over for a long time, however, Weaver decided to follow his teacher to New York. He officially became director of the Rockefeller Foundation's Department of Natural Sciences in 1932, beginning an extraordinary career of scientific exploration, planning and management. Here is a brief introduction to the Rockefeller Foundation, which gave Weaver the stage on which to display his talents.
The Rockefeller Foundation, officially founded in 1913, is more than a century old and is among the largest and most successful private foundations in the world. Here are a few achievements that may be familiar: (1) in medicine, the foundation helped establish modern public health as a field and developed vaccines to help eradicate diseases such as yellow fever and malaria; (2) in agriculture, it promoted the Green Revolution, the reform of agricultural production technology in Third World countries in the 20th century; (3) in information science, it funded the 1956 Dartmouth Conference, which marks the origin of artificial intelligence; and (4) in China, it funded the establishment of Peking Union Medical College and its affiliated Peking Union Medical College Hospital. The foundation has many more great achievements to its name. With such a stage, Weaver was able to show his ability to anticipate scientific trends and to manage scientific research.
two
Marching into the field of biology
The Rockefeller Foundation has ample funds and can in theory support whatever it chooses, so the choice of funding direction is especially important. Early in his tenure, drawing on his physics background and a keen sense that biotechnology was about to take off, Weaver persuaded the foundation's board of directors to shift the focus of funding from physics to emerging areas of biology (a move that was presumably also strongly supported by Mason, the president and his former teacher).
Once the direction was right, the rest followed. In just five or six years, funded research projects in these emerging fields made rapid progress, and in the foundation's 1938 annual report on the natural sciences Weaver collectively named these new approaches in biology "molecular biology." Thus the concept of molecular biology was born, opening up a new discipline at the intersection of biology, chemistry and physics.
Today, the DNA research we are familiar with and the nucleic acid tests used for COVID-19 both belong to molecular biology. Driven by Weaver, the Rockefeller Foundation funded many researchers in this field, many of whom became leaders in their specialties within a few years. For example, of the 18 scholars who won Nobel Prizes in molecular biology from 1954 to 1965, 15 had been funded by the Rockefeller Foundation. It can be said that one of Weaver's greatest contributions was to greatly advance the development of biology worldwide in the 20th century.
three
Contributing to information theory
During his tenure as director of the Rockefeller Foundation's Department of Natural Sciences, Weaver retained his enthusiasm for applied mathematics, especially probability and statistics. One outstanding result is the landmark book The Mathematical Theory of Communication, written with Claude Shannon and published in 1949. Yet Shannon worked at Bell Labs throughout, and the two had no real professional overlap, so how did they become co-authors of this great work? The story is an interesting one.
In 1948, Shannon published "A Mathematical Theory of Communication" in the Bell System Technical Journal, laying the foundation for information theory and communication theory. The groundbreaking work of information theory thus had nothing to do with Weaver. However, the mathematical exposition in Shannon's paper is rather hard to follow, and the theory was presented only in terms of engineering communications, so the paper had a very small audience.
Weaver had always been keenly interested in information theory, understood it deeply, and had his own views on it. He therefore explained and extended Shannon's theory in accessible language and published "The Mathematics of Communication" in Scientific American in 1949. Professor Wilbur Schramm, then editor-in-chief of the University of Illinois Press and regarded as the father of communication studies, thought the two pieces were a perfect match, so the articles by Weaver and Shannon were rearranged as the first and second parts of a single volume. The result was the epoch-making The Mathematical Theory of Communication (the title was changed from the modest "A Mathematical Theory of Communication" to the more assertive "The Mathematical Theory of Communication"). Today the "Shannon-Weaver model" is a well-known basic theory in both communication studies and communications engineering, which shows how important a role Weaver played in the development and dissemination of information theory.
four
The birth of the concept of machine translation
Now let us return to the main topic: how Weaver came to propose the concept of machine translation, and how it influenced the later development of the field. According to Weaver himself, the whole process began with a true story about one of his friends, a distinguished mathematician. We will call this friend P. He was originally from Germany, had spent some time in Istanbul, Turkey, and had learned Turkish. The story takes place during World War II, when cryptography was a very active field because of the needs of the war. One day F, a colleague of P, claimed to have devised a decryption algorithm and asked P to prepare a ciphertext to test it. P was also very interested in cryptography. Knowing that F did not understand Turkish, P decided to make things hard for him: he wrote a passage of about 100 words in Turkish, replaced the Turkish letters that do not occur in English with English letters, and then applied a somewhat more complicated substitution to produce a ciphertext consisting of a sequence of numbers. Unexpectedly, F handed P his decoding result the very next day. F said that the attempt had failed and that he had obtained only a meaningless string of English letters (meaningless to him because he did not understand Turkish), but with a little correction P was able to recover the original Turkish message.
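The enciphering steps in this anecdote can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source does not say which substitution P actually used, so the letter-to-number key (`make_key`), the seed, and the sample phrase are all hypothetical.

```python
import random
import string

# Step 1 (from the story): Turkish letters that do not occur in English
# are replaced with plain English letters. The exact mapping is assumed.
TURKISH_TO_ASCII = str.maketrans("çğıöşüÇĞİÖŞÜ", "cgiosuCGIOSU")


def make_key(seed):
    """Build a hypothetical substitution key: each letter gets a distinct two-digit number."""
    rng = random.Random(seed)
    numbers = rng.sample(range(10, 100), len(string.ascii_lowercase))
    return dict(zip(string.ascii_lowercase, numbers))


def encipher(turkish_text, key):
    """Transliterate to English letters, then substitute numbers for letters."""
    plain = turkish_text.translate(TURKISH_TO_ASCII).lower()
    return [key[ch] for ch in plain if ch in key]  # spaces and punctuation are dropped


def decipher(numbers, key):
    """Invert the substitution; the result is a string of English letters."""
    inverse = {num: ch for ch, num in key.items()}
    return "".join(inverse[n] for n in numbers)


key = make_key(seed=1947)
cipher = encipher("Güneş doğdu", key)   # a sample Turkish phrase
recovered = decipher(cipher, key)       # "gunesdogdu"
```

The recovered string looks like meaningless English letters to someone who does not know Turkish, which matches F's reaction in the story, while a Turkish reader can restore the diacritics and word breaks with little effort.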
The story made a deep impression on Weaver, with his background in probability and statistics. Weaver also had some interest in language translation, which will come up again later. After much thought, Weaver concluded that the frequencies and combinations of letters follow similar regularities across languages, so that these features could be exploited to "decrypt" a language, which amounts to translating it automatically.
What tool to use for automatic translation then became the key question. As it happened, ENIAC, the world's first general-purpose electronic computer, was unveiled in 1946. Inspired by both language decryption and the computer, Weaver put forward the idea of machine translation in 1947 and discussed its feasibility with Norbert Wiener, the father of cybernetics. One might ask why Weaver chose to discuss it with Wiener. On the one hand, Weaver had led the Rockefeller Foundation in funding Wiener and helping him establish the discipline of cybernetics, so the two presumably knew each other well; on the other hand, Weaver regarded automatic translation as a complex system, and Wiener was an authority on complex systems, so discussing machine translation with him was a natural step. However, after only one round of discussion, Wiener concluded that the hypothesis space faced by machine translation was too large and the ambiguity too great for it to be feasible. Weaver was very disappointed. He wanted to argue his case further and eventually win Wiener over, but nothing came of it.
Weaver knew very well that for the concept of machine translation to be accepted (by Wiener among others), he would have to come up with workable designs and implementation techniques to demonstrate its feasibility. So Weaver thought about the problem for two years, held in-depth discussions in 1948 with Andrew D. Booth of Birkbeck College, University of London, who had similar ideas, and finally set out the concept of machine translation and four possible implementation strategies in the memorandum "Translation" in July 1949.
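Why statistical regularities help with decryption can be shown with a small sketch: a letter-for-letter substitution changes which symbols appear, but not how often they appear or combine. The example below is an illustration only; the helper names (`ngram_counts`, `frequency_profile`), the sample sentence, and the shift-by-three cipher are assumptions, not anything taken from Weaver's memorandum.

```python
from collections import Counter


def ngram_counts(text, n=1):
    """Count letter n-grams (n=1: single letters, n=2: letter pairs), ignoring non-letters."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    return Counter("".join(letters[i:i + n]) for i in range(len(letters) - n + 1))


def frequency_profile(text, n=1):
    """Sorted counts only, independent of which symbols carry them."""
    return sorted(ngram_counts(text, n).values(), reverse=True)


plain = "the theory of translation treats the text as a code"

# A simple substitution cipher (every letter shifted by three), used as a stand-in.
shift = str.maketrans("abcdefghijklmnopqrstuvwxyz", "defghijklmnopqrstuvwxyzabc")
cipher = plain.translate(shift)

same_letter_profile = frequency_profile(plain, 1) == frequency_profile(cipher, 1)
same_pair_profile = frequency_profile(plain, 2) == frequency_profile(cipher, 2)
```

Both comparisons hold for any one-to-one substitution, which is why a codebreaker who knows the typical letter and letter-pair frequencies of a language can match cipher symbols to letters.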
The first implementation strategy is based on simple word substitution, and its core is word sense disambiguation. Weaver believed that the key problem in automatically converting one natural language into another is that words take on different meanings in different contexts. The workable scheme he proposed is to use the context within a window of N words on either side to help predict the meaning of the central word, where N need not be very large. This idea was later applied in direct-transfer machine translation methods.
The second implementation strategy assumes that language is an expression of logic, so that automatic conversion between languages can be formalized as the automatic derivation of one logical expression from another. Weaver hoped to use this strategy to show that machine translation is formally solvable. The later rule-based translation methods, and the translation derivation models based on synchronous context-free grammar in statistical machine translation, are consistent with the basic idea of this strategy.
The third implementation strategy treats automatic translation between languages as a communication process: an input signal (the unknown target-language text, i.e., the plaintext in cryptographic terms) passes through a channel and emerges as another signal (the observable source-language text, i.e., the ciphertext), and translation is the process of recovering the input signal from the output signal. As a pioneer of information theory, Weaver was inspired by wartime code-breaking. He believed that machine translation closely resembles the problem of deciphering a code, and that automatic language conversion could be achieved by mining the statistical regularities between two languages. The rise of statistical machine translation around 1990 was built on the basic idea of this strategy.
The fourth implementation strategy assumes that all languages share the same logical characteristics, which can be regarded as a common or intermediate language. Weaver believed that translation from a source language to a target language could proceed by first converting the source language into the intermediate language and then converting the intermediate language into the target language. Later, the JANUS machine translation system developed at Carnegie Mellon University in the United States adopted an intermediate-language approach. However, how to define and represent the intermediate language has remained an unsolved problem. Today's multilingual neural machine translation framework, built on a unified encoder and decoder, is essentially similar to the intermediate-language idea: all languages are mapped by the same encoder to a distributed semantic representation, from which the decoder generates the target language.
From the first strategy to the fourth, the ideas become steadily bolder and more difficult. Yet viewed historically, the sequence largely matches the way machine translation methods actually advanced, and one has to admire Weaver's strategic vision for the future development of science.
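The context-window idea can be sketched as follows. This is a toy illustration, not Weaver's own procedure: the sense inventory `SENSES`, the overlap-counting rule, and the example sentences are all assumptions made for the sketch.

```python
# Hypothetical sense inventory: each sense of an ambiguous word is described
# by a handful of words that tend to appear near it.
SENSES = {
    "bank": {
        "financial institution": {"money", "loan", "deposit", "account"},
        "river side": {"river", "water", "shore", "fishing"},
    }
}


def context_window(tokens, index, n):
    """Return up to n words on each side of the word at position index."""
    left = tokens[max(0, index - n):index]
    right = tokens[index + 1:index + 1 + n]
    return left + right


def disambiguate(tokens, index, n=3):
    """Pick the sense whose cue words overlap most with the N-word window."""
    window = set(context_window(tokens, index, n))
    senses = SENSES[tokens[index]]
    return max(senses, key=lambda sense: len(senses[sense] & window))


river_tokens = "he sat on the bank of the river fishing".split()
money_tokens = "she opened an account at the bank to deposit money".split()

river_sense = disambiguate(river_tokens, river_tokens.index("bank"), n=3)
money_sense = disambiguate(money_tokens, money_tokens.index("bank"), n=3)
```

With a window of three words on each side, both occurrences of "bank" are resolved correctly. With `n=2` the first sentence's window contains only "on the of the", which carries no cue, so the window has to be large enough to reach an informative word but, as the text notes, need not be very large.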
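In later statistical machine translation this communication view is usually formalized as a noisy-channel model: among candidate target sentences e, choose the one that maximizes P(e) · P(f | e) for the observed source sentence f. The sketch below illustrates that decision rule only. The probability tables, the French example, and the candidate list are invented for illustration and do not come from any real system.

```python
import math

# Hypothetical language model P(e): how plausible each target sentence is on its own.
LANGUAGE_MODEL = {
    "the house": 0.60,
    "house the": 0.05,
    "the home": 0.35,
}

# Hypothetical channel model P(f | e): how likely the observed source text f is,
# given that the "original message" was the target sentence e.
CHANNEL_MODEL = {
    ("la maison", "the house"): 0.5,
    ("la maison", "house the"): 0.5,
    ("la maison", "the home"): 0.3,
}


def score(source, candidate):
    """Log of P(e) * P(f | e); logs avoid underflow on longer sentences."""
    return math.log(LANGUAGE_MODEL[candidate]) + math.log(CHANNEL_MODEL[(source, candidate)])


def decode(source, candidates):
    """Recover the most probable 'input signal' (target text) for the observed source."""
    return max(candidates, key=lambda e: score(source, e))


candidates = ["the house", "house the", "the home"]
best = decode("la maison", candidates)
```

Here the channel model alone cannot separate "the house" from "house the", and it is the language model that rules out the ill-formed word order. Splitting the problem into these two models is the main design choice of the noisy-channel approach.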
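A minimal sketch of the intermediate-language idea follows, assuming a tiny hand-made set of language-neutral concept labels. The lexicons, the labels, and the function names are hypothetical, and the sketch ignores word order and inflection entirely.

```python
# Hypothetical lexicons mapping words to language-neutral concept labels.
TO_CONCEPT = {
    "en": {"water": "WATER", "good": "GOOD"},
    "de": {"wasser": "WATER", "gut": "GOOD"},
    "fr": {"eau": "WATER", "bon": "GOOD"},
}

# Inverse lexicons, used to generate words from concepts.
FROM_CONCEPT = {
    lang: {concept: word for word, concept in lexicon.items()}
    for lang, lexicon in TO_CONCEPT.items()
}


def analyze(words, source_lang):
    """Source language -> intermediate representation."""
    return [TO_CONCEPT[source_lang][word] for word in words]


def generate(concepts, target_lang):
    """Intermediate representation -> target language."""
    return [FROM_CONCEPT[target_lang][concept] for concept in concepts]


def translate(words, source_lang, target_lang):
    """Every language pair goes through the same intermediate representation."""
    return generate(analyze(words, source_lang), target_lang)


german_to_french = translate(["gut", "wasser"], "de", "fr")
french_to_english = translate(["bon", "eau"], "fr", "en")
```

The appeal of the design is that adding a language requires only one analyzer and one generator rather than a separate system for every language pair. A shared encoder and decoder in multilingual neural translation plays a comparable role, with a learned vector representation in place of hand-written concept labels.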
After the concept of machine translation was born, it gradually drew more and more scholars into this new research field. Three years later, Weaver led the Rockefeller Foundation in funding the first machine translation conference, held at the Massachusetts Institute of Technology on June 17-20, 1952 and organized by another machine translation pioneer, Yehoshua Bar-Hillel (a great mathematician, philosopher, logician and linguist). A total of 18 experts attended. What came afterwards is familiar to everyone: the first machine translation system was publicly demonstrated in New York in 1954; Canada's weather-forecast translation system attracted wide attention in 1976; IBM's statistical machine translation models appeared around 1990 and spurred the development of online translation systems such as those of Google, Microsoft and Baidu; and after 2014, deep learning brought a breakthrough to machine translation.
five
Life after retirement
Weaver devoted most of his career to the Rockefeller Foundation, from becoming director of the Department of Natural Sciences in 1932 to his retirement in 1959. After retiring, he was invited to serve as vice president of the Alfred P. Sloan Foundation for another five years. In the years between his retirement and his death in 1978, Weaver spent more time with his family and also began to devote more attention to his personal interests. Judging from his later works, those interests centered on probability theory and language translation. In 1963, Weaver published a popular science book, Lady Luck: The Theory of Probability, hoping to introduce probability theory to a wider audience.
As for language translation, Weaver did not go on to study machine translation methods but became interested in how literary works are translated into different languages. As a great fan of Lewis Carroll, Weaver was particularly interested in the translations of Alice in Wonderland.
In 1964, Weaver published another monograph, Alice in Many Tongues, in which he compared in detail the versions in 40 different languages, hoping to convey one message: translating Alice in Wonderland into other languages is a great challenge. But Weaver could not read 40 languages, so he used back-translation: the other-language versions were translated back into English, and the resulting English versions were then compared. The concept of back-translation is all too familiar to today's neural machine translation researchers. It has become a popular technique in neural machine translation and an essential one in machine translation competitions of all kinds. Yet the application of back-translation to neural machine translation was formally proposed only in 2016. Remarkably, Weaver was already using the idea more than half a century earlier. What can one do but admire him?
We can draw at least two lessons from Weaver's life and achievements. First, interest is the key to success. Second, the ability to judge and choose trends and directions not only determines individual achievement but also plays a vital role in national and global technological development.
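In neural machine translation, back-translation is commonly used to create extra training data: real sentences in the target language are translated back into the source language by a reverse model, and each machine-made source sentence is paired with its real target sentence. The sketch below shows only this pairing step. The dictionary-based `toy_reverse_model` is a hypothetical stand-in for a trained target-to-source model.

```python
def back_translate(target_sentences, target_to_source):
    """Pair each real target sentence with a machine-generated source sentence."""
    return [(target_to_source(sentence), sentence) for sentence in target_sentences]


# Hypothetical stand-in for a trained German-to-English model.
TOY_DE_TO_EN = {
    "guten morgen": "good morning",
    "danke": "thank you",
}


def toy_reverse_model(sentence):
    """Look up a canned translation; a real system would run a translation model here."""
    return TOY_DE_TO_EN[sentence]


# Monolingual target-side data (German) with no human-written English counterpart.
monolingual_german = ["guten morgen", "danke"]

# Synthetic (English, German) pairs that could be added to English-to-German training data.
synthetic_pairs = back_translate(monolingual_german, toy_reverse_model)
```

Weaver's use was different in purpose, since he compared back-translated English versions to judge translation quality, but the core operation of translating back into the original language is the same.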
References:
Warren Weaver. 1955. Translation. In Machine Translation of Languages: Fourteen Essays, pp. 15-23.
Warren Weaver. 1964. Alice in Many Tongues: The Translations of "Alice in Wonderland." Madison: University of Wisconsin Press.
National Academy of Sciences. 1987. Warren Weaver. In Biographical Memoirs, Vol. 57. Washington, DC: The National Academies Press.
Lily E. Kay. 1996. The Molecular Vision of Life: Caltech, the Rockefeller Foundation, and the Rise of the New Biology. Oxford University Press (reprint).
John Hutchins. 1998. Milestones in machine translation. Language Today, no. 13, pp. 12-13.
Author: Zhang Jiajun is a researcher at the Institute of Automation, Chinese Academy of Sciences. His main research interests are machine translation, natural language processing and deep learning. Zhihu column: https://www.zhihu.com/people/zhang-jia-jun-29-18
Https://www.toutiao.com/i6850035899368145421/