The Experts below are selected from a list of 151,785 Experts worldwide, ranked by the ideXlab platform.
Mikel L. Forcada - One of the best experts on this subject based on the ideXlab platform.
-
Speeding up Target-Language driven part-of-speech tagger training for machine translation
MICAI 2006: Advances in Artificial Intelligence, 2006. Co-Authors: Felipe Sánchez-Martínez, Juan Antonio Pérez-Ortiz, Mikel L. Forcada. Abstract: When training hidden-Markov-model-based part-of-speech (PoS) taggers involved in machine translation systems in an unsupervised manner, the use of Target-Language information has proven to give better results than the standard Baum-Welch algorithm. The Target-Language-driven training algorithm proceeds by translating every possible PoS tag sequence resulting from the disambiguation of the words in each source-Language text segment into the Target Language, and by using a Target-Language model to estimate the likelihood of the translation of each possible disambiguation. The main disadvantage of this method is that the number of translations to perform grows exponentially with segment length, translation being the most time-consuming task. In this paper, we present a method that uses a priori knowledge, obtained in an unsupervised manner, to prune unlikely disambiguations in each text segment, so that the number of translations to be performed during training is reduced. The experimental results show that this new pruning method drastically reduces the number of translations performed during training (and, consequently, the running time of the algorithm) without degrading the tagging accuracy achieved.
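The pruning idea in the abstract above can be sketched in a few lines. This is an illustrative sketch only: the tag priors, the translation table, and the Target-Language scores below are made-up stand-ins for, respectively, the unsupervised a priori model, the machine translation step, and the Target-Language model.

```python
from itertools import product

# Toy source segment: each word paired with its ambiguity class of PoS tags.
segment = [("la", ["DET", "PRN"]), ("casa", ["NOUN", "VERB"])]

# Hypothetical per-tag priors obtained in an unsupervised manner.
prior = {"DET": 0.6, "PRN": 0.4, "NOUN": 0.7, "VERB": 0.3}

def candidate_taggings(segment, threshold=0.2):
    """Enumerate all disambiguations of the segment, pruning those whose
    a priori likelihood (product of per-tag priors) falls below threshold,
    so that fewer tag sequences need to be translated."""
    kept = []
    for tags in product(*(options for _, options in segment)):
        p = 1.0
        for tag in tags:
            p *= prior[tag]
        if p >= threshold:
            kept.append((tags, p))
    return kept

# Stand-ins for the translation step and the Target-Language model.
translate = {("DET", "NOUN"): "the house", ("PRN", "NOUN"): "it house"}
tl_likelihood = {"the house": 0.9, "it house": 0.1}

kept = candidate_taggings(segment)   # 2 of the 4 possible sequences survive
best = max(kept, key=lambda tp: tl_likelihood[translate[tp[0]]])
```

Without pruning, this two-word segment would require four translations; with the threshold only two survive, and the Target-Language model then selects the ("DET", "NOUN") disambiguation.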
Laurent Besacier - One of the best experts on this subject based on the ideXlab platform.
-
First steps in fast acoustic modeling for a new Target Language: application to Vietnamese
International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2005. Co-Authors: Viet Bac Le, Laurent Besacier. Abstract: This paper presents our first steps in fast acoustic modeling for a new Target Language. Both knowledge-based and data-driven methods were used to obtain phone mapping tables between a source Language (French) and a Target Language (Vietnamese). While acoustic models borrowed directly from the source Language did not perform very well, we have shown that using a small amount of adaptation data in the Target Language (one or two hours) leads to very acceptable automatic speech recognition (ASR) performance. Our best continuous Vietnamese recognition system, adapted with only two hours of Vietnamese data, obtains a word accuracy of 63.9% on one hour of Vietnamese speech dialog.
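A minimal sketch of the data-driven variant of the phone mapping described above: each Target-Language phone borrows the acoustic model of the source-Language phone it is most often confused with. The confusion counts here are invented for illustration; in practice they might come from decoding Target-Language speech with the source-Language recognizer.

```python
# Invented confusion counts: target (Vietnamese) phones vs. source (French)
# phones; a larger count means the pair was confused more often.
confusion = {
    "t_vi":  {"t_fr": 40, "d_fr": 5},
    "ng_vi": {"n_fr": 25, "g_fr": 10},
}

def phone_mapping(confusion):
    """Map each target phone to the source phone it is most often
    confused with; the target phone then borrows that source phone's
    acoustic model as a starting point for adaptation."""
    return {tgt: max(counts, key=counts.get) for tgt, counts in confusion.items()}

mapping = phone_mapping(confusion)
```

The mapped source-Language models would then be adapted with the one to two hours of Target-Language data mentioned in the abstract.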
Hungyi Lee - One of the best experts on this subject based on the ideXlab platform.
-
Language transfer of Audio Word2Vec: learning audio segment representations without Target Language data
International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018. Co-Authors: Chiahao Shen, Janet Y Sung, Hungyi Lee. Abstract: Audio Word2Vec offers vector representations of fixed dimensionality for variable-length audio segments using a Sequence-to-sequence Autoencoder (SA). These vector representations have been shown to describe the sequential phonetic structures of the audio segments to a good degree, with real-world applications such as spoken term detection (STD). This paper examines the Language-transfer capability of Audio Word2Vec. We train the SA on one Language (the source Language) and use it to extract vector representations of audio segments of another Language (the Target Language). We found that the SA can still capture the phonetic structure of audio segments of the Target Language if the source and Target Languages are similar. In STD, we obtain vector representations from an SA learned from a large amount of source-Language data, and found that they surpass the representations from a naive encoder and from an SA learned directly from a small amount of Target-Language data. The results show that it is possible to learn an Audio Word2Vec model from high-resource Languages and use it on low-resource Languages, which further expands the usability of Audio Word2Vec.
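The query-by-example STD use of such embeddings can be sketched as follows. The encoder here is a deliberate stand-in (mean-pooling of frame vectors) for the trained SA encoder, which this sketch does not implement; only the retrieval logic mirrors the abstract.

```python
import math

def encode(segment):
    """Stand-in for the SA encoder: mean-pool a variable-length sequence
    of frame vectors into one fixed-dimensional representation."""
    dim = len(segment[0])
    return [sum(frame[i] for frame in segment) / len(segment) for i in range(dim)]

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def std_search(query, segments):
    """Spoken term detection: return the index of the Target-Language
    segment whose embedding is most similar to the spoken query's."""
    q = encode(query)
    return max(range(len(segments)), key=lambda i: cosine(q, encode(segments[i])))

# Tiny example with a 2-dimensional "feature" stream.
query = [[1.0, 0.0], [1.0, 0.0]]
segments = [
    [[0.0, 1.0], [0.0, 1.0]],   # dissimilar segment
    [[1.0, 0.1], [0.9, 0.0]],   # near-match
]
hit = std_search(query, segments)
```

The transfer scenario of the paper corresponds to `encode` being trained on source-Language audio while `query` and `segments` come from the Target Language.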
Felipe Sánchez-Martínez - One of the best experts on this subject based on the ideXlab platform.
-
MICAI - Speeding up Target-Language driven part-of-speech tagger training for machine translation
Lecture Notes in Computer Science, 2006. Co-Authors: Felipe Sánchez-Martínez, Juan Antonio Pérez-Ortiz, Mikel L. Forcada. Abstract: When training hidden-Markov-model-based part-of-speech (PoS) taggers involved in machine translation systems in an unsupervised manner, the use of Target-Language information has proven to give better results than the standard Baum-Welch algorithm. The Target-Language-driven training algorithm proceeds by translating every possible PoS tag sequence resulting from the disambiguation of the words in each source-Language text segment into the Target Language, and by using a Target-Language model to estimate the likelihood of the translation of each possible disambiguation. The main disadvantage of this method is that the number of translations to perform grows exponentially with segment length, translation being the most time-consuming task. In this paper, we present a method that uses a priori knowledge, obtained in an unsupervised manner, to prune unlikely disambiguations in each text segment, so that the number of translations to be performed during training is reduced. The experimental results show that this new pruning method drastically reduces the number of translations performed during training (and, consequently, the running time of the algorithm) without degrading the tagging accuracy achieved.
Yoshinobu Kajikawa - One of the best experts on this subject based on the ideXlab platform.
-
EUSIPCO - Automatic Speech Translation System Selecting Target Language by Direction-of-Arrival Information
2018 26th European Signal Processing Conference (EUSIPCO), 2018. Co-Authors: Masanori Tsujikawa, Koji Okabe, Ken Hanazawa, Yoshinobu Kajikawa. Abstract: In this paper, we propose an automatic speech translation system that selects its Target Language on the basis of direction-of-arrival (DOA) information. The system uses two microphones to detect speech signals arriving from specific directions, and the Target Language for speech recognition is selected on the basis of the DOA. Both the speech detection and the Target-Language selection relieve users from the operations normally required for individual utterances, without a serious increase in computational cost. In a speech-recognition evaluation of the proposed system, 80% word accuracy was achieved for utterances recorded with two microphones placed 40 cm from the speaker positions. This accuracy is nearly equivalent to that obtained when the time frame and Target Language of a user's speech are given in advance.
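A sketch of the two-microphone DOA cue the system relies on, reduced to its simplest form: estimate the inter-microphone delay by cross-correlation and map its sign to a Target Language. The signals and the language mapping below are invented for illustration; the abstract does not specify the actual detection or selection logic.

```python
def xcorr_lag(a, b, max_lag):
    """Estimate the delay (in samples) of signal b relative to signal a
    as the lag that maximizes their cross-correlation."""
    n = len(a)
    def score(lag):
        return sum(a[i] * b[i + lag] for i in range(n) if 0 <= i + lag < n)
    return max(range(-max_lag, max_lag + 1), key=score)

def select_target_language(mic1, mic2, max_lag=5):
    """Hypothetical direction-to-language mapping: a non-negative delay
    means the sound reached mic1 first (speaker A's side), a negative
    delay means it reached mic2 first (speaker B's side)."""
    return "language_A" if xcorr_lag(mic1, mic2, max_lag) >= 0 else "language_B"

# Impulse-like signal reaching mic1 two samples before mic2.
mic1 = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
mic2 = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
lang = select_target_language(mic1, mic2)
```

In the real system the delay would additionally be thresholded to accept only speech from specific directions, which is what removes the need for per-utterance user operations.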