Google Translate - Explore the Science & Experts

The Experts below are selected from a list of 231 Experts worldwide ranked by the ideXlab platform

Su Zhendong - One of the best experts on this subject based on the ideXlab platform.

Testing Machine Translation via Referential Transparency

2021

Co-Authors: He Pinjia, Meister Clara, Su Zhendong

Abstract:

Machine translation software has seen rapid progress in recent years due to the advancement of deep neural networks. People routinely use machine translation software in their daily lives, such as ordering food in a foreign restaurant, receiving medical diagnosis and treatment from foreign doctors, and reading international political news online. However, due to the complexity and intractability of the underlying neural networks, modern machine translation software is still far from robust and can produce poor or incorrect translations; this can lead to misunderstanding, financial loss, threats to personal safety and health, and political conflicts. To address this problem, we introduce referentially transparent inputs (RTIs), a simple, widely applicable methodology for validating machine translation software. A referentially transparent input is a piece of text that should have similar translations when used in different contexts. Our practical implementation, Purity, detects when this property is broken by a translation. To evaluate RTI, we use Purity to test Google Translate and Bing Microsoft Translator with 200 unlabeled sentences, which detected 123 and 142 erroneous translations with high precision (79.3% and 78.3%). The translation errors are diverse, including examples of under-translation, over-translation, word/phrase mistranslation, incorrect modification, and unclear logic. Comment: Accepted by ICSE2

Testing Machine Translation via Referential Transparency

2020

Co-Authors: He Pinjia, Meister Clara, Su Zhendong

Abstract:

Machine translation software has seen rapid progress in recent years due to the advancement of deep neural networks. People routinely use machine translation software in their daily lives, such as ordering food in a foreign restaurant, receiving medical diagnosis and treatment from foreign doctors, and reading international political news online. However, due to the complexity and intractability of the underlying neural networks, modern machine translation software is still far from robust. To address this problem, we introduce referentially transparent inputs (RTIs), a simple, widely applicable methodology for validating machine translation software. A referentially transparent input is a piece of text that should have invariant translation when used in different contexts. Our practical implementation, Purity, detects when this invariance property is broken by a translation. To evaluate RTI, we use Purity to test Google Translate and Bing Microsoft Translator with 200 unlabeled sentences, which led to 123 and 142 erroneous translations with high precision (79.3% and 78.3%). The translation errors are diverse, including under-translation, over-translation, word/phrase mistranslation, incorrect modification, and unclear logic. These translation errors could lead to misunderstanding, financial loss, threats to personal safety and health, and political conflicts.
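
The invariance property described in this abstract can be illustrated with a minimal sketch. It is not the Purity implementation: the whitespace tokenization, the bag-of-words distance, the threshold, and the toy translator `fake_translate` with its lookup table are all assumptions made for illustration. The idea is to translate a phrase on its own, translate sentences that contain it, and flag any sentence whose translation no longer contains the tokens of the phrase's stand-alone translation.

```python
from collections import Counter
from typing import Callable, List, Tuple


def missing_tokens(phrase_translation: str, context_translation: str) -> int:
    """Count tokens of the phrase translation absent from the context translation."""
    phrase = Counter(phrase_translation.lower().split())
    context = Counter(context_translation.lower().split())
    # Counter subtraction keeps only positive counts, i.e. tokens left unmatched.
    return sum((phrase - context).values())


def check_rti(
    translate: Callable[[str], str],
    phrase: str,
    contexts: List[str],
    threshold: int = 0,
) -> List[Tuple[str, int]]:
    """Return (context, distance) pairs where the phrase's translation changed."""
    phrase_translation = translate(phrase)
    suspicious = []
    for sentence in contexts:
        distance = missing_tokens(phrase_translation, translate(sentence))
        if distance > threshold:
            suspicious.append((sentence, distance))
    return suspicious


# Hypothetical stand-in for a real translation service (assumption, not a real API).
FAKE_TRANSLATIONS = {
    "a good friend": "un bon ami",
    "he is a good friend": "il est un bon ami",
    "a good friend came": "un ami est venu",  # the word for "good" is dropped
}


def fake_translate(text: str) -> str:
    return FAKE_TRANSLATIONS[text]


issues = check_rti(
    fake_translate,
    "a good friend",
    ["he is a good friend", "a good friend came"],
)
print(issues)
```

With a real system, `translate` would wrap a call to the translation service under test, and the threshold would need tuning, since legitimate translations can reorder or inflect the phrase.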

Structure-Invariant Testing for Machine Translation

2020

Co-Authors: He Pinjia, Meister Clara, Su Zhendong

Abstract:

In recent years, machine translation software has increasingly been integrated into our daily lives. People routinely use machine translation for various applications, such as describing symptoms to a foreign doctor and reading political news in a foreign language. However, the complexity and intractability of neural machine translation (NMT) models that power modern machine translation make the robustness of these systems difficult to even assess, much less guarantee. Machine translation systems can return inferior results that lead to misunderstanding, medical misdiagnoses, threats to personal safety, or political conflicts. Despite its apparent importance, validating the robustness of machine translation systems is very difficult and has, therefore, been much under-explored. To tackle this challenge, we introduce structure-invariant testing (SIT), a novel metamorphic testing approach for validating machine translation software. Our key insight is that the translation results of "similar" source sentences should typically exhibit similar sentence structures. Specifically, SIT (1) generates similar source sentences by substituting one word in a given sentence with semantically similar, syntactically equivalent words; (2) represents sentence structure by syntax parse trees (obtained via constituency or dependency parsing); (3) reports sentence pairs whose structures differ quantitatively by more than some threshold. To evaluate SIT, we use it to test Google Translate and Bing Microsoft Translator with 200 source sentences as input, which led to 64 and 70 buggy issues with 69.5% and 70% top-1 accuracy, respectively. The translation errors are diverse, including under-translation, over-translation, incorrect modification, word/phrase mistranslation, and unclear logic. Comment: Accepted at ICSE 202
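
The three SIT steps listed in this abstract can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' tool: the substitution table `alternatives`, the lookup-table translator `TOY_TRANSLATIONS`, the lookup-table parser `TOY_PARSES`, and the label-count distance are hypothetical stand-ins for a word-substitution model, a translation service, and a syntax parser.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple


def structure_distance(labels_a: Sequence[str], labels_b: Sequence[str]) -> int:
    """Sum of absolute differences between the label counts of two parses."""
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    labels = count_a.keys() | count_b.keys()
    return sum(abs(count_a[label] - count_b[label]) for label in labels)


def substitute_one_word(sentence: str, alternatives: Dict[str, List[str]]) -> List[str]:
    """Step 1: generate variants that differ from the sentence in exactly one word."""
    words = sentence.split()
    variants = []
    for i, word in enumerate(words):
        for replacement in alternatives.get(word, []):
            variants.append(" ".join(words[:i] + [replacement] + words[i + 1:]))
    return variants


def sit_report(
    sentence: str,
    alternatives: Dict[str, List[str]],
    translate: Callable[[str], str],
    parse_labels: Callable[[str], Sequence[str]],
    threshold: int,
) -> List[Tuple[str, int]]:
    """Steps 2 and 3: compare parse structures and report pairs above the threshold."""
    base = parse_labels(translate(sentence))
    flagged = []
    for variant in substitute_one_word(sentence, alternatives):
        distance = structure_distance(base, parse_labels(translate(variant)))
        if distance > threshold:
            flagged.append((variant, distance))
    return flagged


# Hypothetical translator and parser outputs, hard-coded for the demo.
TOY_TRANSLATIONS = {
    "the cat sleeps": "le chat dort",
    "the dog sleeps": "le chien dort",
    "the bird sleeps": "dort",  # simulated under-translation
}
TOY_PARSES = {
    "le chat dort": ["S", "NP", "DT", "NN", "VP", "VB"],
    "le chien dort": ["S", "NP", "DT", "NN", "VP", "VB"],
    "dort": ["S", "VP", "VB"],
}

flagged = sit_report(
    "the cat sleeps",
    {"cat": ["dog", "bird"]},
    TOY_TRANSLATIONS.__getitem__,
    TOY_PARSES.__getitem__,
    threshold=2,
)
print(flagged)
```

Passing the translator and parser as callables keeps the comparison logic independent of any particular service or parsing library; the threshold trades off missed errors against false alarms.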

Charles Christoph Roehr - One of the best experts on this subject based on the ideXlab platform.

Does Google Translate replace translation services in the NICU? A tri-lingual comparison

European Respiratory Journal, 2013

Co-Authors: N Borner, Stefanie Sponholz Dos Santos, Kai Konig, Silke Brodkorb, Christoph Buhrer, Charles Christoph Roehr

Abstract:

Background: Communication with family members with limited English proficiency is a daily challenge in the work on a neonatal intensive care unit (NICU). Professional interpreters are not always at hand, whereas online translation tools, e.g. Google Translate (GT), appear to offer a feasible and easily accessible alternative. The objective of this study was to test the reliability of selected GT translations of standardized sentences of a neonatal doctor-patient interview. Methods: 20 sentences were taken from an English NICU parent information brochure and translated into German, Portuguese and Arabic using GT. The translations were checked with regard to grammar and content; in a second step, all incorrect sentences were simplified, re-translated with GT, and checked again for correctness. Results: Overall, 52% of the translations were incorrect with regard to content and grammar, varying between 45% and 60%. Simplification led to correct content in 29% of the translations overall; the remainder stayed imprecise. Discussion: Any translation service, whether it is traditional or intent-based, is prone to error. With GT, contextually and grammatically incorrect translations are common. The design of GT as a statistical translation engine might be an explanation. Conclusion: Presently, traditional interpreter services cannot be replaced by online translation engines. In particular circumstances, use of GT or similar engines may be justified if users are aware of the limitations. For the future, we suggest compiling a catalogue of sentences containing central information which can be translated into defined foreign languages without misinterpretation or loss of information.

Manganello Jennifer - One of the best experts on this subject based on the ideXlab platform.

Online National Health Agency Mask Guidance for the Public in Light of COVID-19: Content Analysis

JMIR Publications Inc., 2020

Co-Authors: Laestadius Linnea, Wang Yang, Ben Taleb Ziyad, Kalan Mohammed Ebrahimi, Cho Young, Manganello Jennifer

Abstract:

Background: The rapid global spread of the coronavirus disease (COVID-19) has compelled national governments to issue guidance on the use of face masks for members of the general public. To date, no work has assessed how this guidance differs across governments. Objective: This study seeks to contribute to a rational and consistent global response to infectious disease by determining how guidelines differ across nations and regions. Methods: A content analysis of health agency mask guidelines on agency websites was performed in late March 2020 among 25 countries and regions with large numbers of COVID-19 cases. Countries and regions were assigned across the coding team by language proficiency, with Google Translate used as needed. When available, both the original and English language version of guidance were reviewed. Results: All examined countries and regions had some form of guidance online, although detail and clarity differed. Although 9 countries and regions recommended surgical, medical, or unspecified masks in public and poorly ventilated places, 16 recommended against people wearing masks in public. There were 2 countries that explicitly recommended against fabric masks. In addition, 12 failed to outline the minimum basic World Health Organization guidance for masks. Conclusions: Online guidelines for face mask use to prevent COVID-19 in the general public are currently inconsistent across nations and regions, and have been changing often. Efforts to create greater standardization and clarity should be explored in light of the status of COVID-19 as a global pandemic.

Online National Health Agency Mask Guidance for the Public in Light of COVID-19: Content Analysis

FIU Digital Commons, 2020

Co-Authors: Laestadius Linnea, Wang Yang, Cho Young, Ben Taleb Ziyad, Kalan Mohammad Ebrahimi, Manganello Jennifer

Abstract:

Background: The rapid global spread of the coronavirus disease (COVID-19) has compelled national governments to issue guidance on the use of face masks for members of the general public. To date, no work has assessed how this guidance differs across governments. Objective: This study seeks to contribute to a rational and consistent global response to infectious disease by determining how guidelines differ across nations and regions. Methods: A content analysis of health agency mask guidelines on agency websites was performed in late March 2020 among 25 countries and regions with large numbers of COVID-19 cases. Countries and regions were assigned across the coding team by language proficiency, with Google Translate used as needed. When available, both the original and English language version of guidance were reviewed. Results: All examined countries and regions had some form of guidance online, although detail and clarity differed. Although 9 countries and regions recommended surgical, medical, or unspecified masks in public and poorly ventilated places, 16 recommended against people wearing masks in public. There were 2 countries that explicitly recommended against fabric masks. In addition, 12 failed to outline the minimum basic World Health Organization guidance for masks. Conclusions: Online guidelines for face mask use to prevent COVID-19 in the general public are currently inconsistent across nations and regions, and have been changing often. Efforts to create greater standardization and clarity should be explored in light of the status of COVID-19 as a global pandemic. Keywords: COVID-19; content analysis; infectious disease; online health information; pandemic; personal protective equipment; public health; public health policy

Naomie Salim - One of the best experts on this subject based on the ideXlab platform.

Web Based Cross Language Plagiarism Detection

International Conference on Computational Intelligence Modelling and Simulation, 2010

Co-Authors: Chow Kok Kent, Naomie Salim

Abstract:

As the Internet helps us cross language and cultural borders by providing different types of translation tools, cross-language plagiarism, also known as translation plagiarism, is bound to arise. In this paper, we propose a new approach to detecting cross-language plagiarism. To limit the scope of our proposed system, we consider Bahasa Melayu as the input language of the submitted query document and English as the target language of similar, possibly plagiarised documents. Input documents are translated into English using the Google Translate API before undergoing a pre-processing phase (stemming and removal of stop words). Tokenized documents are sent to the Google AJAX Search API to detect similar documents throughout the World Wide Web. Only the top ten sources retrieved by the Google Search API are considered as candidate source documents. We integrate the Stanford Parser and WordNet to determine the level of similarity between the suspected documents and the candidate source documents. After that, a detailed similarity analysis is performed and a report of results is produced.

Web Based Cross Language Plagiarism Detection

arXiv: Other Computer Science, 2009

Co-Authors: Chow Kok Kent, Naomie Salim

Abstract:

As the Internet helps us cross language and cultural borders by providing different types of translation tools, cross-language plagiarism, also known as translation plagiarism, is bound to arise. In academic work especially, this issue will definitely affect students' work, including the quality of their assignments and papers. In this paper, we propose a new approach to detecting cross-language plagiarism. Our web-based cross-language plagiarism detection system is specially tuned to detect translation plagiarism by combining different techniques and tools that assist the detection process. The Google Translate API is used as our translation tool, and the Google Search API is used in our information retrieval process. Our system also integrates fingerprint matching, a widely used plagiarism detection technique. In general, the proposed system starts by translating the input documents from Malay to English, followed by removal of stop words and stemming, identification of similar documents in the corpus, comparison of similar patterns, and finally a summary of the result. Three least-frequent 4-gram fingerprint matching implements the core comparison phase of the plagiarism detection process. In the K-gram fingerprint matching technique, any value of K can be considered, but K = 4 has been stated as an ideal choice. Smaller values of K (i.e., K = 1, 2, or 3) do not provide good discrimination between sentences. On the other hand, the larger the value of K (i.e., K = 5, 6, 7, etc.), the better the discrimination of words in one sentence from words in another.
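
The "three least-frequent 4-grams" comparison mentioned in this abstract might look roughly like the sketch below. The abstract does not give implementation details, so several choices here are assumptions: 4-grams are taken over characters after lowercasing and removing whitespace, frequency is measured over a small reference corpus, ties are broken alphabetically, and fingerprints are compared with Jaccard similarity. The sample sentences are invented for the demo.

```python
from collections import Counter
from typing import List, Set


def kgrams(text: str, k: int = 4) -> List[str]:
    """All overlapping character k-grams of the text, ignoring case and whitespace."""
    normalized = "".join(text.lower().split())
    return [normalized[i:i + k] for i in range(len(normalized) - k + 1)]


def fingerprint(sentence: str, corpus_counts: Counter, k: int = 4, size: int = 3) -> Set[str]:
    """Pick the `size` k-grams of the sentence that are rarest in the corpus."""
    grams = set(kgrams(sentence, k))
    # Sort by corpus frequency, then alphabetically so ties are deterministic.
    ranked = sorted(grams, key=lambda gram: (corpus_counts[gram], gram))
    return set(ranked[:size])


def overlap(fp_a: Set[str], fp_b: Set[str]) -> float:
    """Jaccard similarity of two fingerprints (0.0 when both are empty)."""
    if not fp_a and not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


# Invented sample sentences; in the described system these would be translated,
# pre-processed documents and candidate sources retrieved from the web.
source = "Translation tools make plagiarism across languages easy."
suspect = "Translation tools make plagiarism across languages simple."
unrelated = "Nurses read international research articles every week."

counts = Counter(g for doc in (source, suspect, unrelated) for g in kgrams(doc))

print(overlap(fingerprint(source, counts), fingerprint(suspect, counts)))
print(overlap(fingerprint(source, counts), fingerprint(unrelated, counts)))
```

Keeping only a few rare k-grams per sentence makes each fingerprint small and cheap to compare, at the cost of missing matches when the rare k-grams fall in the part of the sentence that was changed.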

Takahiro Kiuchi - One of the best experts on this subject based on the ideXlab platform.

Preliminary study of online machine translation use of nursing literature: quality evaluation and perceived usability

BMC Research Notes, 2012

Co-Authors: Ryoko Anazawa, Hirono Ishikawa, Mj Park, Takahiro Kiuchi

Abstract:

Japanese nurses are increasingly required to read published international research in clinical, educational, and research settings. Language barriers are a significant obstacle, and online machine translation (MT) is a tool that can be used to address this issue. We examined the quality of Google Translate® (English to Japanese and Korean to Japanese), which is a representative online MT, using a previously verified evaluation method. We also examined the perceived usability and current use of online MT among Japanese nurses. Randomly selected nursing abstracts were translated and then evaluated for intelligibility and usability by 28 participants, including assistants and research associates from nursing universities throughout Japan. They answered a questionnaire about their online MT use. From simple comparison of mean scores between two language pairs, translation quality was significantly better, with respect to both intelligibility and usability, for Korean-Japanese than for English-Japanese. Most respondents perceived a language barrier. Online MT had been used by 61% of the respondents and was perceived as not useful enough. Nursing articles translated from Korean into Japanese by an online MT system could be read at an acceptable level of comprehension, but the same could not be said for English-Japanese translations. Respondents with experience using online MT used it largely to grasp the overall meanings of the original text. Enrichment in technical terms appeared to be the key to better usability. Users will be better able to use MT outputs if they improve their foreign language proficiency as much as possible. Further research is being conducted with a larger sample size and detailed analysis.
