Korean Language

14,000,000 Leading Edge Experts on the ideXlab platform

Scan Science and Technology

Contact Leading Edge Experts & Companies

The Experts below are selected from a list of 18849 Experts worldwide ranked by ideXlab platform

Vickie Doll - One of the best experts on this subject based on the ideXlab platform.

Lori Lamel - One of the best experts on this subject based on the ideXlab platform.

  • ISCSLP - Unsupervised acoustic model training for the Korean Language
    The 9th International Symposium on Chinese Spoken Language Processing, 2014
    Co-Authors: Antoine Laurent, William Hartmann, Lori Lamel
    Abstract:

    This paper investigates unsupervised training strategies for the Korean language in the context of the DGA RAPID Rapmat project. As in previous studies, we begin with only a small amount of manually transcribed data to build preliminary acoustic models. Using the initial models, a larger set of untranscribed audio data is decoded to produce approximate transcripts. We compare GMM and DNN acoustic models for both the unsupervised transcription and the final recognition system. While the DNN acoustic models produce a lower word error rate on the test set, training on the transcripts from the GMM system provides the best overall performance. We also achieve better performance by expanding the original phone set. Finally, we examine the efficacy of automatically building a test set by comparing system performance both before and after manually correcting the test set.
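The word error rate mentioned in the abstract is conventionally the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that convention; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration and are not taken from the paper, which may score on different units.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Row 0 of the dynamic-programming table: turning an empty reference
    # into the first j hypothesis words costs j insertions.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        # Column 0: dropping the first i reference words costs i deletions.
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

With a four-word reference and a hypothesis that substitutes one word and drops another, the function returns 2 / 4.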

Geon Ho Lee - One of the best experts on this subject based on the ideXlab platform.

  • Evaluation of Korean-Language COVID-19-Related Medical Information on YouTube: Cross-Sectional Infodemiology Study.
    Journal of Medical Internet Research, 2020
    Co-Authors: Hana Moon, Geon Ho Lee
    Abstract:

    BACKGROUND: In South Korea, the number of coronavirus disease (COVID-19) cases has declined rapidly and much sooner than in other countries. South Korea is one of the most digitalized countries in the world, and YouTube may have served as a rapid delivery mechanism for increasing public awareness of COVID-19. Thus, the platform may have helped the South Korean public fight the spread of the disease.
    OBJECTIVE: The aim of this study is to compare the reliability, overall quality, title-content consistency, and content coverage of Korean-language YouTube videos on COVID-19, which have been uploaded by different sources.
    METHODS: A total of 200 of the most viewed YouTube videos from January 1, 2020, to April 30, 2020, were screened, searching in Korean for the terms "Coronavirus," "COVID," "Corona," "Wuhan virus," and "Wuhan pneumonia." Non-Korean videos and videos that were duplicated, irrelevant, or livestreamed were excluded. Source and video metrics were collected. The videos were scored based on the following criteria: modified DISCERN index, Journal of the American Medical Association Score (JAMAS) benchmark criteria, global quality score (GQS), title-content consistency index (TCCI), and medical information and content index (MICI).
    RESULTS: Of the 105 total videos, 37.14% (39/105) contained misleading information; independent user-generated videos showed the highest proportion of misleading information at 68.09% (32/47), while all of the government-generated videos were useful. Government agency-generated videos achieved the highest median score of DISCERN (5.0, IQR 5.0-5.0), JAMAS (4.0, IQR 4.0-4.0), GQS (4.0, IQR 3.0-4.5), and TCCI (5.0, IQR 5.0-5.0), while independent user-generated videos achieved the lowest median score of DISCERN (2.0, IQR 1.0-3.0), JAMAS (2.0, IQR 1.5-2.0), GQS (2.0, IQR 1.5-2.0), and TCCI (3.0, IQR 3.0-4.0). However, the total MICI was not significantly different among sources. "Transmission and precautionary measures" were the most commonly covered content by government agencies, news agencies, and independent users. In contrast, the most mentioned content by news agencies was "prevalence," followed by "transmission and precautionary measures."
    CONCLUSIONS: Misleading videos had more likes, fewer comments, and longer running times than useful videos. Korean-language YouTube videos on COVID-19 uploaded by different sources varied significantly in terms of reliability, overall quality, and title-content consistency, but the content coverage was not significantly different. Government-generated videos had higher reliability, overall quality, and title-content consistency than independent user-generated videos.

  • Evaluation of Korean-Language COVID-19–Related Medical Information on YouTube: Cross-Sectional Infodemiology Study (Preprint)
    2020
    Co-Authors: Hana Moon, Geon Ho Lee
    Abstract:

    BACKGROUND: In South Korea, the number of coronavirus disease (COVID-19) cases has declined rapidly and much sooner than in other countries. South Korea is one of the most digitalized countries in the world, and YouTube may have served as a rapid delivery mechanism for increasing public awareness of COVID-19. Thus, the platform may have helped the South Korean public fight the spread of the disease.
    OBJECTIVE: The aim of this study is to compare the reliability, overall quality, title–content consistency, and content coverage of Korean-language YouTube videos on COVID-19, which have been uploaded by different sources.
    METHODS: A total of 200 of the most viewed YouTube videos from January 1, 2020, to April 30, 2020, were screened, searching in Korean for the terms “Coronavirus,” “COVID,” “Corona,” “Wuhan virus,” and “Wuhan pneumonia.” Non-Korean videos and videos that were duplicated, irrelevant, or livestreamed were excluded. Source and video metrics were collected. The videos were scored based on the following criteria: modified DISCERN index, Journal of the American Medical Association Score (JAMAS) benchmark criteria, global quality score (GQS), title–content consistency index (TCCI), and medical information and content index (MICI).
    RESULTS: Of the 105 total videos, 37.14% (39/105) contained misleading information; independent user–generated videos showed the highest proportion of misleading information at 68.09% (32/47), while all of the government-generated videos were useful. Government agency–generated videos achieved the highest median score of DISCERN (5.0, IQR 5.0-5.0), JAMAS (4.0, IQR 4.0-4.0), GQS (4.0, IQR 3.0-4.5), and TCCI (5.0, IQR 5.0-5.0), while independent user–generated videos achieved the lowest median score of DISCERN (2.0, IQR 1.0-3.0), JAMAS (2.0, IQR 1.5-2.0), GQS (2.0, IQR 1.5-2.0), and TCCI (3.0, IQR 3.0-4.0). However, the total MICI was not significantly different among sources. “Transmission and precautionary measures” were the most commonly covered content by government agencies, news agencies, and independent users. In contrast, the most mentioned content by news agencies was “prevalence,” followed by “transmission and precautionary measures.”
    CONCLUSIONS: Misleading videos had more likes, fewer comments, and longer running times than useful videos. Korean-language YouTube videos on COVID-19 uploaded by different sources varied significantly in terms of reliability, overall quality, and title–content consistency, but the content coverage was not significantly different. Government-generated videos had higher reliability, overall quality, and title–content consistency than independent user–generated videos.
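The proportions quoted in the RESULTS section can be re-derived from the reported counts. The short check below is only an arithmetic sketch: the helper name `percentage` is an assumption, and the last figure (misleading videos among all sources other than independent users) is computed here from the reported counts rather than stated in the abstract.

```python
def percentage(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)

# Counts reported in the abstract.
misleading_all = percentage(39, 105)         # all included videos
misleading_independent = percentage(32, 47)  # independent user-generated videos

# Derived here, not reported in the abstract: videos from all other sources.
other_misleading = 39 - 32   # 7 misleading videos
other_total = 105 - 47       # 58 videos
misleading_other = percentage(other_misleading, other_total)

print(misleading_all, misleading_independent, misleading_other)
```

The first two values reproduce the 37.14% and 68.09% given in the abstract; the third comes out at about 12%, which puts the gap between independent users and the remaining sources in concrete terms.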

Antoine Laurent - One of the best experts on this subject based on the ideXlab platform.

  • ISCSLP - Unsupervised acoustic model training for the Korean Language
    The 9th International Symposium on Chinese Spoken Language Processing, 2014
    Co-Authors: Antoine Laurent, William Hartmann, Lori Lamel
    Abstract:

    This paper investigates unsupervised training strategies for the Korean language in the context of the DGA RAPID Rapmat project. As in previous studies, we begin with only a small amount of manually transcribed data to build preliminary acoustic models. Using the initial models, a larger set of untranscribed audio data is decoded to produce approximate transcripts. We compare GMM and DNN acoustic models for both the unsupervised transcription and the final recognition system. While the DNN acoustic models produce a lower word error rate on the test set, training on the transcripts from the GMM system provides the best overall performance. We also achieve better performance by expanding the original phone set. Finally, we examine the efficacy of automatically building a test set by comparing system performance both before and after manually correcting the test set.
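The training procedure described in the abstract (seed models from a small transcribed set, automatic transcription of a larger untranscribed set, then retraining) follows the general shape of self-training. The toy sketch below shows only that control flow. It is not the paper's system: a one-dimensional nearest-centroid classifier stands in for the GMM and DNN acoustic models, and every name and data value is hypothetical.

```python
from statistics import mean

def train(examples):
    """Fit one centroid per label from (feature, label) pairs."""
    by_label = {}
    for feature, label in examples:
        by_label.setdefault(label, []).append(feature)
    return {label: mean(values) for label, values in by_label.items()}

def decode(model, feature):
    """Return the label whose centroid is closest to the feature."""
    return min(model, key=lambda label: abs(model[label] - feature))

def self_train(transcribed, untranscribed):
    """Train a seed model, label the untranscribed data with it, retrain on both."""
    seed_model = train(transcribed)
    automatic = [(x, decode(seed_model, x)) for x in untranscribed]
    return train(transcribed + automatic)

# Hypothetical toy data: two labeled points and four unlabeled ones.
transcribed = [(0.0, "a"), (10.0, "b")]
untranscribed = [1.0, 2.0, 8.0, 9.0]
final_model = self_train(transcribed, untranscribed)
```

In this toy setting the automatically labeled points pull each centroid toward the bulk of the unlabeled data. The paper's comparison concerns which model family should produce the automatic transcripts and which should be trained on them, a choice this sketch does not model.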

Seokhoon Kang - One of the best experts on this subject based on the ideXlab platform.

  • Design of Iconic Language Interface for Semantic Based Korean Language Generation
    Foundations of Intelligent Systems, 2003
    Co-Authors: Kyonam Choo, Yoseop Woo, Hongki Min, Hyunjae Park, Seokhoon Kang
    Abstract:

    The iconic language interface is designed to provide a more convenient way of communicating with the system than the existing keyboard input. The interface consists of pictures whose meanings are easily recognizable to users, and it serves as a supplemental language interface for creating Korean-language text and system control commands by combining situational information such as word, grammar, and meaning from the stream of icons the user selects. Based on this, an algorithm for generating Korean from the iconic interface is proposed.

  • ISMIS - Design of iconic Language interface for semantic based Korean Language generation
    Lecture Notes in Computer Science, 2003
    Co-Authors: Kyonam Choo, Yoseop Woo, Hongki Min, Hyunjae Park, Seokhoon Kang
    Abstract:

    The iconic language interface is designed to provide a more convenient way of communicating with the system than the existing keyboard input. The interface consists of pictures whose meanings are easily recognizable to users, and it serves as a supplemental language interface for creating Korean-language text and system control commands by combining situational information such as word, grammar, and meaning from the stream of icons the user selects. Based on this, an algorithm for generating Korean from the iconic interface is proposed.
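The abstract does not spell out the generation algorithm, so the following is only a hypothetical sketch of what turning an icon stream into a Korean sentence might involve. It assumes each icon carries a word and a grammatical role, orders the words subject, object, predicate, and attaches the subject or object particle according to whether the noun ends in a final consonant. The icon table, role names, and function names are all invented for illustration.

```python
def has_final_consonant(word: str) -> bool:
    """True if the last Hangul syllable of `word` ends in a consonant (batchim)."""
    code = ord(word[-1]) - 0xAC00
    if not 0 <= code <= 0xD7A3 - 0xAC00:
        raise ValueError("word must end in a precomposed Hangul syllable")
    # Precomposed syllables are laid out in blocks of 28 finals; index 0 means none.
    return code % 28 != 0

# (form after a final consonant, form after a vowel)
PARTICLES = {"subject": ("이", "가"), "object": ("을", "를")}

# Hypothetical icon inventory: icon name -> (Korean word, grammatical role).
ICONS = {
    "cat": ("고양이", "subject"),
    "fish": ("생선", "object"),
    "eat": ("먹어요", "predicate"),
}

def generate(icon_stream):
    """Build a subject-object-predicate sentence from icon names in any order."""
    slots = {}
    for icon in icon_stream:
        word, role = ICONS[icon]
        if role in PARTICLES:
            after_consonant, after_vowel = PARTICLES[role]
            word += after_consonant if has_final_consonant(word) else after_vowel
        slots[role] = word
    order = ("subject", "object", "predicate")
    return " ".join(slots[role] for role in order if role in slots)

print(generate(["eat", "cat", "fish"]))
```

The selection order of the icons does not matter in this sketch because the role attached to each icon decides the word's position, which is one way the "grammar" information mentioned in the abstract could be used.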