The Experts below are selected from a list of 151,785 Experts worldwide, ranked by the ideXlab platform.
Mikel L. Forcada - One of the best experts on this subject based on the ideXlab platform.
-
Speeding up Target-Language driven part-of-speech tagger training for machine translation
MICAI 2006: Advances in Artificial Intelligence, 2006. Co-Authors: Felipe Sánchez-Martínez, Juan Antonio Pérez-Ortiz, Mikel L. Forcada. Abstract: When training hidden-Markov-model-based part-of-speech (PoS) taggers involved in machine translation systems in an unsupervised manner, the use of Target-Language information has proven to give better results than the standard Baum-Welch algorithm. The Target-Language-driven training algorithm proceeds by translating every possible PoS tag sequence resulting from the disambiguation of the words in each source-Language text segment into the Target Language, and by using a Target-Language model to estimate the likelihood of the translation of each possible disambiguation. The main disadvantage of this method is that the number of translations to perform grows exponentially with segment length, translation being the most time-consuming task. In this paper, we present a method that uses a priori knowledge, obtained in an unsupervised manner, to prune unlikely disambiguations in each text segment, so that the number of translations to be performed during training is reduced. The experimental results show that this new pruning method drastically reduces the number of translations performed during training (and, consequently, the running time of the algorithm) without degrading the tagging accuracy achieved.
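The pruning idea in the abstract above can be sketched in a few lines. This is an illustrative sketch only: the tag priors, the translation table, and the Target-Language scores below are made-up stand-ins for, respectively, the unsupervised a priori model, the machine translation step, and the Target-Language model.

```python
from itertools import product

# Toy source segment: each word paired with its ambiguity class of PoS tags.
segment = [("la", ["DET", "PRN"]), ("casa", ["NOUN", "VERB"])]

# Hypothetical per-tag priors obtained in an unsupervised manner.
prior = {"DET": 0.6, "PRN": 0.4, "NOUN": 0.7, "VERB": 0.3}

def candidate_taggings(segment, threshold=0.2):
    """Enumerate all disambiguations of the segment, pruning those whose
    a priori likelihood (product of per-tag priors) falls below threshold,
    so that fewer tag sequences need to be translated."""
    kept = []
    for tags in product(*(options for _, options in segment)):
        p = 1.0
        for tag in tags:
            p *= prior[tag]
        if p >= threshold:
            kept.append((tags, p))
    return kept

# Stand-ins for the translation step and the Target-Language model.
translate = {("DET", "NOUN"): "the house", ("PRN", "NOUN"): "it house"}
tl_likelihood = {"the house": 0.9, "it house": 0.1}

kept = candidate_taggings(segment)   # 2 of the 4 possible sequences survive
best = max(kept, key=lambda tp: tl_likelihood[translate[tp[0]]])
```

Without pruning, this two-word segment would require four translations; with the threshold only two survive, and the Target-Language model then selects the ("DET", "NOUN") disambiguation.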
Laurent Besacier - One of the best experts on this subject based on the ideXlab platform.
-
First steps in fast acoustic modeling for a new Target Language: application to Vietnamese
International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2005. Co-Authors: Viet Bac Le, Laurent Besacier. Abstract: This paper presents our first steps in fast acoustic modeling for a new Target Language. Both knowledge-based and data-driven methods were used to obtain phone mapping tables between a source Language (French) and a Target Language (Vietnamese). While acoustic models borrowed directly from the source Language did not perform very well, we have shown that using a small amount of adaptation data in the Target Language (one or two hours) leads to very acceptable automatic speech recognition (ASR) performance. Our best continuous Vietnamese recognition system, adapted with only two hours of Vietnamese data, obtains a word accuracy of 63.9% on one hour of Vietnamese speech dialog.
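A minimal sketch of the data-driven variant of the phone mapping described above: each Target-Language phone borrows the acoustic model of the source-Language phone it is most often confused with. The confusion counts here are invented for illustration; in practice they might come from decoding Target-Language speech with the source-Language recognizer.

```python
# Invented confusion counts: target (Vietnamese) phones vs. source (French)
# phones; a larger count means the pair was confused more often.
confusion = {
    "t_vi":  {"t_fr": 40, "d_fr": 5},
    "ng_vi": {"n_fr": 25, "g_fr": 10},
}

def phone_mapping(confusion):
    """Map each target phone to the source phone it is most often
    confused with; the target phone then borrows that source phone's
    acoustic model as a starting point for adaptation."""
    return {tgt: max(counts, key=counts.get) for tgt, counts in confusion.items()}

mapping = phone_mapping(confusion)
```

The mapped source-Language models would then be adapted with the one to two hours of Target-Language data mentioned in the abstract.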
Hungyi Lee - One of the best experts on this subject based on the ideXlab platform.
-
Language transfer of Audio Word2Vec: learning audio segment representations without Target Language data
International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018. Co-Authors: Chiahao Shen, Janet Y Sung, Hungyi Lee. Abstract: Audio Word2Vec offers vector representations of fixed dimensionality for variable-length audio segments using a Sequence-to-sequence Autoencoder (SA). These vector representations have been shown to describe the sequential phonetic structures of the audio segments to a good degree, with real-world applications such as spoken term detection (STD). This paper examines the Language-transfer capability of Audio Word2Vec. We train the SA on one Language (the source Language) and use it to extract vector representations of audio segments of another Language (the Target Language). We found that the SA can still capture the phonetic structure of audio segments of the Target Language if the source and Target Languages are similar. In STD, we obtain vector representations from an SA learned from a large amount of source-Language data, and found that they surpass the representations from a naive encoder and from an SA learned directly from a small amount of Target-Language data. The results show that it is possible to learn an Audio Word2Vec model from high-resource Languages and use it on low-resource Languages, which further expands the usability of Audio Word2Vec.
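The query-by-example STD use of such embeddings can be sketched as follows. The encoder here is a deliberate stand-in (mean-pooling of frame vectors) for the trained SA encoder, which this sketch does not implement; only the retrieval logic mirrors the abstract.

```python
import math

def encode(segment):
    """Stand-in for the SA encoder: mean-pool a variable-length sequence
    of frame vectors into one fixed-dimensional representation."""
    dim = len(segment[0])
    return [sum(frame[i] for frame in segment) / len(segment) for i in range(dim)]

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def std_search(query, segments):
    """Spoken term detection: return the index of the Target-Language
    segment whose embedding is most similar to the spoken query's."""
    q = encode(query)
    return max(range(len(segments)), key=lambda i: cosine(q, encode(segments[i])))

# Tiny example with a 2-dimensional "feature" stream.
query = [[1.0, 0.0], [1.0, 0.0]]
segments = [
    [[0.0, 1.0], [0.0, 1.0]],   # dissimilar segment
    [[1.0, 0.1], [0.9, 0.0]],   # near-match
]
hit = std_search(query, segments)
```

The transfer scenario of the paper corresponds to `encode` being trained on source-Language audio while `query` and `segments` come from the Target Language.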
Felipe Sánchez-Martínez - One of the best experts on this subject based on the ideXlab platform.
-
MICAI - Speeding up Target-Language driven part-of-speech tagger training for machine translation
Lecture Notes in Computer Science, 2006. Co-Authors: Felipe Sánchez-Martínez, Juan Antonio Pérez-Ortiz, Mikel L. Forcada. Abstract: When training hidden-Markov-model-based part-of-speech (PoS) taggers involved in machine translation systems in an unsupervised manner, the use of Target-Language information has proven to give better results than the standard Baum-Welch algorithm. The Target-Language-driven training algorithm proceeds by translating every possible PoS tag sequence resulting from the disambiguation of the words in each source-Language text segment into the Target Language, and by using a Target-Language model to estimate the likelihood of the translation of each possible disambiguation. The main disadvantage of this method is that the number of translations to perform grows exponentially with segment length, translation being the most time-consuming task. In this paper, we present a method that uses a priori knowledge, obtained in an unsupervised manner, to prune unlikely disambiguations in each text segment, so that the number of translations to be performed during training is reduced. The experimental results show that this new pruning method drastically reduces the number of translations performed during training (and, consequently, the running time of the algorithm) without degrading the tagging accuracy achieved.
Yoshinobu Kajikawa - One of the best experts on this subject based on the ideXlab platform.
-
EUSIPCO - Automatic Speech Translation System Selecting Target Language by Direction-of-Arrival Information
2018 26th European Signal Processing Conference (EUSIPCO), 2018. Co-Authors: Masanori Tsujikawa, Koji Okabe, Ken Hanazawa, Yoshinobu Kajikawa. Abstract: In this paper, we propose an automatic speech translation system that selects its Target Language on the basis of direction-of-arrival (DOA) information. The system uses two microphones to detect speech signals arriving from specific directions, and the Target Language for speech recognition is selected on the basis of the DOA. Both the speech detection and the Target-Language selection relieve users from the operations normally required for individual utterances, without a serious increase in computational cost. In a speech-recognition evaluation of the proposed system, 80% word accuracy was achieved for utterances recorded with two microphones placed 40 cm from the speaker positions. This accuracy is nearly equivalent to that obtained when the time frame and Target Language of a user's speech are given in advance.
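A sketch of the two-microphone DOA cue the system relies on, reduced to its simplest form: estimate the inter-microphone delay by cross-correlation and map its sign to a Target Language. The signals and the language mapping below are invented for illustration; the abstract does not specify the actual detection or selection logic.

```python
def xcorr_lag(a, b, max_lag):
    """Estimate the delay (in samples) of signal b relative to signal a
    as the lag that maximizes their cross-correlation."""
    n = len(a)
    def score(lag):
        return sum(a[i] * b[i + lag] for i in range(n) if 0 <= i + lag < n)
    return max(range(-max_lag, max_lag + 1), key=score)

def select_target_language(mic1, mic2, max_lag=5):
    """Hypothetical direction-to-language mapping: a non-negative delay
    means the sound reached mic1 first (speaker A's side), a negative
    delay means it reached mic2 first (speaker B's side)."""
    return "language_A" if xcorr_lag(mic1, mic2, max_lag) >= 0 else "language_B"

# Impulse-like signal reaching mic1 two samples before mic2.
mic1 = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
mic2 = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
lang = select_target_language(mic1, mic2)
```

In the real system the delay would additionally be thresholded to accept only speech from specific directions, which is what removes the need for per-utterance user operations.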