The Mother of All Tongues
Superfast translation services are now available over the Net. But they appear unlikely to reduce language barriers to a pile of rubble
By Kitty McKinsey/HONG KONG
NEARLY 50 YEARS AGO, a group of engineers and linguists proudly showed off what they hoped would be a valuable new American spy tool in the Cold War against the Soviet Union: a computer capable of translating Russian into English–well, at least a few carefully selected sentences with limited vocabulary.
That kind of computer, capable of instantaneously–and perfectly–translating any language to any other, remains the Holy Grail of computer science, even today. But dramatic improvements in machine translation, or MT, mean tens of millions of Internet users are reading pages originally posted in languages they don’t understand, thanks to on-line translation services available free on the Web. Phenomenal increases in processing power and the plummeting cost of disc storage space have made it technically possible for computers to translate Web pages in a second. Suddenly MT is hot, especially in Asia.
“We’re an instant success after 50 years of hard work,” jokes Brian Garr, a specialist in machine translation for IBM Voice Systems. “The World Wide Web allows for the creation of global communities, but that can’t happen with language barriers.”
What started out as a heavily English domain is quickly becoming truly international. Global technology-research company IDC estimates non-English speakers on the Web outnumber English speakers by 211 million to 192 million. And it predicts that the number of non-English users will hit 560 million by 2003, dwarfing an English-speaking population by then of 230 million. While the global MT market is still small, Allied Business Intelligence of the U.S. estimates it will grow to anywhere from $1.7 billion to $2.3 billion by 2005.
Three companies dominate the global market for computerized-translation software. Paris-based Systran claims to power 99% of on-line translators, handling more than 4 million Web pages with 16 language combinations every day. Other big players in the field are IBM, which recently added Chinese, Korean and Japanese to its offerings, and Belgian technology company Lernout & Hauspie.
And Asian firms are becoming increasingly important players. Japanese giants like NEC, Nippon Telegraph & Telephone Corp. and Fujitsu have made special efforts to make translated English documents and Web pages available to their employees. “Japanese people tend to be bad at foreign languages,” says Akio Yokoo, head of the MT research group at NTT.
Singapore technology start-up EWGate specializes in MT for Asian languages, with the lofty goal “to bridge and unify the East-West cyberworld.” (See story on next page.) Hong Kong software company isilk, meanwhile, offers on-line translation between Chinese and English.
But it was Systran-powered Babelfish–which can be found on the Alta Vista portal at www.babelfish.altavista.com–that first brought MT to the wider public in 1997. Named after the translating fish in Douglas Adams’ 1970s book The Hitch Hiker’s Guide To The Galaxy, Babelfish handles 30 million translation requests every month, according to Alta Vista.
Yet the sometimes bizarre results have earned it the nickname of Mangelfish. Indeed, machine translation can produce results that are hilarious, mystifying or downright backwards. Look what happens to the French version of “I don’t care” (“Je m’en fous”). In the hands of various on-line translators it becomes in English: “I myself in crazy,” “I of insane,” and, funniest of all, “Me me in madman.”
Although it’s easy to laugh at on-line translations, the fact that a computer can perform such an operation in seconds is “in itself a small miracle,” says Dimitrios Sabatakakis, CEO of Systran. Proponents stress on-line translations are good for simply getting the gist of a Web page to decide whether it might be useful. MT systems are most helpful when specialized in one field, like economics or medicine.
The daunting challenge in MT is training a computer to think and learn like a human. Language is anything but straightforward. Take a short word like “bank,” which could refer to money, blood, data, memory or a river, to name just a few meanings. Humans can quickly analyze the context of a conversation and decide which meaning is correct.
But teaching a computer to do the same thing requires the marriage of a vast array of specializations–not only all branches of linguistics, but also hugely complex fields like information theory and statistical pattern recognition. And some Asian languages are even more complicated. Take Chinese, for example. Unlike English, which relies on a 26-letter alphabet to make up words, Chinese relies on tens of thousands of characters, with each syllable carrying its own meaning. So even the basic building blocks of meaning are different.
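To make the “bank” ambiguity concrete, here is a minimal sketch of one very simple way a program might guess a word’s sense from its context: count how many nearby words overlap with a short list of cue words for each sense. It illustrates the general idea only and is not how Systran, IBM or any other system named here works. The cue lists and the function name guess_sense are invented for the example.

```python
import re

# Hypothetical cue words for each sense of "bank" (invented for illustration).
SENSE_CUES = {
    "money": {"deposit", "loan", "account", "cash", "teller"},
    "blood": {"donor", "transfusion", "hospital", "plasma"},
    "data": {"database", "records", "query", "server"},
    "memory": {"chip", "ram", "module", "address"},
    "river": {"water", "shore", "fishing", "flood", "boat"},
}


def guess_sense(sentence, cues=SENSE_CUES):
    """Return the sense whose cue words overlap most with the sentence.

    Returns None when no cue word appears at all, that is, when the
    context gives this toy model nothing to go on.
    """
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    scores = {sense: len(words & vocab) for sense, vocab in cues.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(guess_sense("She opened an account at the bank to get a loan"))
    print(guess_sense("We went fishing from the bank as the water rose"))
    print(guess_sense("The bank was closed"))
```

Even this toy version shows why the problem is hard: the third sentence offers no cue words, so the program cannot choose a sense, whereas a human reader would fall back on wider knowledge of the conversation.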
However, MT continues to be over-hyped as the real version of Star Trek’s Universal Translator. Just over a year ago, former U.S. President Bill Clinton promised that machines that can “translate as fast as you can speak” were just around the corner. It earned him a stinging rebuke from Ann Macfarlane, president of the American Translators Association. Such machines exist only “in fantasy-land–where they will remain for the indefinite future,” she told Clinton in a letter.
Everyone in the business agrees that putting human translators out of business isn’t even a goal. Systran, in fact, hires human translators to translate its own annual report from French into English.
MT is useful for translating huge masses of information that would cost too much time and money for human translators to handle. “The purpose of MT is to make intelligible what is unintelligible,” adds IBM’s Garr. “The goal is to allow people to communicate with each other.”
And fostering communication in Asia is of ever greater interest to the global companies. IBM has a research lab in Beijing to ensure it will be “a very strong player in China,” says Garr. Systran, meanwhile, is adding Chinese, Korean and Japanese (to and from English) to the 16 MT language pairs already commercially available. French to Chinese and German to Chinese are also in the works.
Even so, the perfect translating computer is still far in the future. “It will most likely be the last problem to be solved in computer science,” says Dekai Wu, founder of isilk.