Its room for improvement is shrinking, and it is seeking a revolutionary system to avoid its notorious errors
Google's translator (Google Translate) is too convenient a tool not to use, yet you don't have to be a philologist to spot its blunders, inconsistencies and absurdities, let alone to notice whether the fine thread that gives meaning, tone and emotion to words, sentences and paragraphs has been properly rendered. Why doesn't it get more right? Why does Google speak an odd Spanish (a worse Catalan and an even worse Chinese, with varying success in each of the 71 languages it understands)? Has machine translation quality hit its ceiling? The answer is that there is no longer much room for improvement, and the company is constantly looking for ways to upgrade the system in order to gain quality.
Machines are good with numbers, not letters
Google does not translate word for word, nor does it follow grammatical or syntactic rules, because the exceptions to the rules, and the exceptions to the exceptions, in each language would make the tool unworkable. Instead, Google Translate follows the statistical machine translation approach (promoted by IBM in the nineties, though GT is an advanced development of it). It consists of extracting statistical patterns from the probabilities obtained by analyzing texts already translated by professional human translators.
The Mountain View company has acknowledged, for example, that it systematically analyzes texts translated into different languages by the UN and the European Union. Translation therefore consists of reproducing words according to the probability that a given combination recurs in the other language (the so-called phrase tables). Machines are good with numbers, not letters, and this is how they understand a language.
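As a rough illustration of the phrase-table idea, the sketch below counts how often each source phrase was rendered as each target phrase and turns the counts into probabilities. The phrase pairs and the function names (`build_phrase_table`, `translate_phrase`) are invented for this example; it is a toy under those assumptions, not Google's actual implementation.

```python
from collections import Counter, defaultdict

# Hypothetical aligned phrase pairs, as if extracted from human translations.
aligned_pairs = [
    ("buenos días", "good morning"),
    ("buenos días", "good morning"),
    ("buenos días", "good day"),
    ("muchas gracias", "thank you very much"),
    ("muchas gracias", "thank you very much"),
    ("muchas gracias", "thank you very much"),
    ("muchas gracias", "many thanks"),
]

def build_phrase_table(pairs):
    """Map each source phrase to {target phrase: relative frequency}."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        counts[source][target] += 1
    table = {}
    for source, targets in counts.items():
        total = sum(targets.values())
        table[source] = {t: n / total for t, n in targets.items()}
    return table

def translate_phrase(table, source):
    """Pick the most probable target phrase; pass unseen phrases through."""
    candidates = table.get(source)
    if not candidates:
        return source
    return max(candidates, key=candidates.get)

phrase_table = build_phrase_table(aligned_pairs)
print(translate_phrase(phrase_table, "buenos días"))     # good morning
print(phrase_table["muchas gracias"]["many thanks"])      # 0.25
```

A real system would also have to align phrases automatically and weigh how well the chosen phrases fit together; this sketch covers only the counting step.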
Analyzing human translations
Ideally, the more texts are analyzed, the better the statistical pattern and the better the translation. In reality, however, there is little room for improvement left. As a then employee of Google Translate told The Guardian, each time Google doubles the number of texts analyzed, its success rate improves by only 0.5%, with luck. And you cannot keep doubling to infinity, however unrivaled the company may be in Internet texts.
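To see why doubling runs out of road, here is a back-of-the-envelope sketch. It assumes the quoted 0.5% means half a percentage point and that the gain stays constant with every doubling; both are assumptions made for illustration, and the function name is hypothetical.

```python
GAIN_PER_DOUBLING = 0.5  # assumed: percentage points gained each time the corpus doubles

def data_multiplier_for_gain(points):
    """How many times larger the corpus must become to gain `points` points."""
    doublings = points / GAIN_PER_DOUBLING
    return 2 ** doublings

print(data_multiplier_for_gain(1.0))  # 4.0
print(data_multiplier_for_gain(5.0))  # 1024.0
```

Under these assumptions, five extra points of accuracy would require roughly a thousand times more text.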
Moreover, this technique only works for pairs of languages with a large enough volume of texts to allow direct translation between them. For example, Google does not translate directly from English to Catalan; it translates from English to Spanish and then into Catalan, and across the two translations the error rate multiplies. The same applies to many other pairs, such as Ukrainian and English, which first have to pass through Russian. The company acknowledges the need for improvement by offering a participatory tool for correcting translations, the Translator Toolkit.
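The compounding effect of translating through an intermediate language can be sketched with simple arithmetic. The example assumes that errors in each hop are independent and that a mistake in either hop spoils the result; the 90% figures are hypothetical, not measured values.

```python
import math

def pivot_accuracy(*hop_accuracies):
    """Chance that every hop is right, assuming independent errors per hop."""
    return math.prod(hop_accuracies)

# Hypothetical figures: English -> Spanish at 90%, Spanish -> Catalan at 90%.
print(round(pivot_accuracy(0.9, 0.9), 2))  # 0.81
```

Two hops that are each fairly reliable still produce a noticeably less reliable chain.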
There are not many external studies of the accuracy of Google's translations. A recent one (2013) was prepared by the Agency for Healthcare Research and Quality, part of the United States government. The agency set out to put a percentage on the quality of Google's translations of medical studies, compared with professional translations, from English to Chinese, French, German, Japanese and Spanish. The study looked at specific data items extracted from the text, not at the meaning of the text as a whole.
In Spanish, the study found a success rate of more than 76% for 78% of the data items, similar to French (74%) and higher than German (70%) and Japanese (67%). In Chinese, 22% of the data items had a success rate below 50%, the worst of all. The study concludes that translation is far from perfect and that the “risk of causing errors is very high”. And that is between pairs of languages with direct translation. In short, it is not perfect, and that is why Google is looking for an improved system.
Looking for a new system
The technology giant's research team has just presented a system that aims to complement the current one and take it to new heights of success. Instead of analyzing texts by counting word sequences and extracting probabilities, it builds a map of words in a single language (see example below). The vectors that describe how words are distributed on that map can be reproduced in any other language, so the translation of a word depends on the position it occupies on the map. The system also allows learning to be automated.
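The word-map idea can be sketched with a toy example: learn a linear mapping between two small maps from a few known word pairs, then use it to place an unseen word on the other map and pick its nearest neighbor. All coordinates below are invented, the Spanish map is deliberately a rotated and stretched copy of the English one, and the least-squares fit is one plausible way to learn the mapping, not necessarily the researchers' exact method.

```python
# Toy 2-D word maps. All coordinates are invented for illustration; the
# Spanish map is deliberately a rotated, stretched copy of the English one.
english = {"one": (1.0, 0.2), "two": (0.8, 0.9),
           "three": (0.1, 1.0), "four": (-0.6, 0.7)}
spanish = {"uno": (-0.4, 2.0), "dos": (-1.8, 1.6),
           "tres": (-2.0, 0.2), "cuatro": (-1.4, -1.2)}

# Small seed dictionary of known translations used to learn the mapping.
seed_pairs = [("one", "uno"), ("two", "dos"), ("three", "tres")]

def fit_linear_map(pairs):
    """Least-squares 2x2 matrix W such that W applied to x is close to z."""
    sxx = [[0.0, 0.0], [0.0, 0.0]]  # sum over pairs of x x^T
    szx = [[0.0, 0.0], [0.0, 0.0]]  # sum over pairs of z x^T
    for x, z in pairs:
        for i in range(2):
            for j in range(2):
                sxx[i][j] += x[i] * x[j]
                szx[i][j] += z[i] * x[j]
    det = sxx[0][0] * sxx[1][1] - sxx[0][1] * sxx[1][0]
    inv = [[sxx[1][1] / det, -sxx[0][1] / det],
           [-sxx[1][0] / det, sxx[0][0] / det]]
    # W = szx multiplied by the inverse of sxx
    return [[sum(szx[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def translate(word, w):
    """Project an English word onto the Spanish map and return the nearest word."""
    x = english[word]
    projected = tuple(w[i][0] * x[0] + w[i][1] * x[1] for i in range(2))
    return min(spanish, key=lambda s: sum((a - b) ** 2
                                          for a, b in zip(spanish[s], projected)))

w = fit_linear_map([(english[e], spanish[s]) for e, s in seed_pairs])
print(translate("four", w))  # cuatro
```

The word "four" never appears in the seed dictionary, yet it lands next to "cuatro" because the two maps share the same layout. Real word vectors have hundreds of dimensions and the match is only approximate.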
According to its authors, the quality of Spanish-English translation reaches 90% (from English to Vietnamese it remains at 30%). Still, as the researchers conclude, “clearly, there is still much to explore.” That is why Google has released word2vec as an open tool, software designed to learn the relationships between words without human guidance, so that researchers from all over the world can join forces in the great task of language in the global village.
This study is the most recent and the most widely publicized, but research is ongoing. Google has several research areas devoted to translation and the relationship between human and machine language: machine translation, speech processing and natural language processing.
The ultimate futuristic idea, already announced, is to create the universal translator: you speak into an earpiece in one language and the person at the other end of the phone hears it in another. Not to mention Google's need to understand all the text it crawls, to analyze the flow of information and ultimately place related ads, which is, after all, the source of its business. Google is not the only company exploring the field. Microsoft, China's Baidu, Ersatz and AlchemyAPI also seek to analyze language through machine learning techniques. And the virtual assistant is becoming more and more real so that the