Model Adaptation

The Experts below are selected from a list of 280,926 Experts worldwide, ranked by the ideXlab platform.

Luc Van Gool - One of the best experts on this subject based on the ideXlab platform.

  • Guided Curriculum Model Adaptation and Uncertainty-Aware Evaluation for Semantic Nighttime Image Segmentation
    International Conference on Computer Vision, 2019
    Co-Authors: Christos Sakaridis, Dengxin Dai, Luc Van Gool
    Abstract:

    Most progress in semantic segmentation reports on daytime images taken under favorable illumination conditions. We instead address the problem of semantic segmentation of nighttime images and improve the state of the art by adapting daytime Models to nighttime without using nighttime annotations. Moreover, we design a new evaluation framework to address the substantial uncertainty of semantics in nighttime images. Our central contributions are: 1) a curriculum framework to gradually adapt semantic segmentation Models from day to night via labeled synthetic images and unlabeled real images, both for progressively darker times of day, which exploits cross-time-of-day correspondences for the real images to guide the inference of their labels; 2) a novel uncertainty-aware annotation and evaluation framework and metric for semantic segmentation, designed for adverse conditions and including image regions beyond human recognition capability in the evaluation in a principled fashion; 3) the Dark Zurich dataset, which comprises 2416 unlabeled nighttime and 2920 unlabeled twilight images with correspondences to their daytime counterparts, plus a set of 151 nighttime images with fine pixel-level annotations created with our protocol, which serves as a first benchmark for our novel evaluation. Experiments show that our guided curriculum Adaptation significantly outperforms state-of-the-art methods on real nighttime sets, both for standard metrics and for our uncertainty-aware metric. Furthermore, our uncertainty-aware evaluation reveals that selective invalidation of predictions can lead to better results on data with ambiguous content, such as our nighttime benchmark, and can benefit safety-oriented applications that involve invalid inputs.
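
    One way to picture the uncertainty-aware evaluation is with a small sketch. The following Python is a minimal illustration, not the paper's exact metric definition: it assumes a reserved INVALID label (the id 255 below is a hypothetical choice) that annotators assign to regions beyond human recognition and that the Model may also predict, and it reports the mean IoU over labeled pixels together with the fraction of human-unrecognizable pixels that the Model correctly invalidates.

        import numpy as np

        INVALID = 255  # hypothetical label id for "beyond human recognition"

        def uncertainty_aware_scores(pred, gt, num_classes):
            """Sketch of an uncertainty-aware evaluation: mean IoU over
            human-labeled pixels, plus the rate at which the model also
            flags the pixels humans could not label."""
            labeled = gt != INVALID
            ious = []
            for c in range(num_classes):
                inter = np.sum((pred == c) & (gt == c))
                union = np.sum(((pred == c) & labeled) | (gt == c))
                if union > 0:
                    ious.append(inter / union)
            miou = float(np.mean(ious)) if ious else 0.0
            if (~labeled).any():
                invalid_recall = float(np.mean(pred[~labeled] == INVALID))
            else:
                invalid_recall = 1.0  # no ambiguous pixels in this image
            return miou, invalid_recall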

  • Curriculum Model Adaptation with Synthetic and Real Data for Semantic Foggy Scene Understanding
    International Journal of Computer Vision, 2019
    Co-Authors: Dengxin Dai, Christos Sakaridis, Simon Hecker, Luc Van Gool
    Abstract:

    This work addresses the problem of semantic scene understanding under fog. Although marked progress has been made in semantic scene understanding, it is mainly concentrated on clear-weather scenes. Extending semantic segmentation methods to adverse weather conditions such as fog is crucial for outdoor applications. In this paper, we propose a novel method, named Curriculum Model Adaptation (CMAda), which gradually adapts a semantic segmentation Model from light synthetic fog to dense real fog in multiple steps, using both labeled synthetic foggy data and unlabeled real foggy data. The method builds on the observation that the results of semantic segmentation in moderately adverse conditions (light fog) can be bootstrapped to solve the same problem in highly adverse conditions (dense fog). CMAda is extensible to other adverse conditions and provides a new paradigm for learning with synthetic data and unlabeled real data. In addition, we present four other main stand-alone contributions: (1) a novel method to add synthetic fog to real, clear-weather scenes using semantic input; (2) a new fog density estimator; (3) a novel fog densification method for real foggy scenes without known depth; and (4) the Foggy Zurich dataset comprising 3808 real foggy images, with pixel-level semantic annotations for 40 images with dense fog. Our experiments show that (1) our fog simulation and fog density estimator outperform their state-of-the-art counterparts with respect to the task of semantic foggy scene understanding (SFSU); and (2) CMAda significantly improves the performance of state-of-the-art Models for SFSU, benefiting from both our synthetic and our real foggy data. The foggy datasets and code are publicly available.
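
    Fog simulation of this kind is commonly built on the standard optical model I(x) = R(x)·t(x) + L·(1 − t(x)) with transmittance t(x) = exp(−β·d(x)), where d is scene depth, β controls fog density, and L is the atmospheric light. Below is a minimal Python sketch under those assumptions; it omits the semantic-input refinement the paper adds on top, and the parameter values are illustrative only.

        import numpy as np

        def add_synthetic_fog(clean_rgb, depth_m, beta=0.06, airlight=0.92):
            """Homogeneous fog via the standard optical model:
            foggy = clean * t + airlight * (1 - t), t = exp(-beta * depth).
            clean_rgb: (H, W, 3) floats in [0, 1]; depth_m: (H, W) meters."""
            t = np.exp(-beta * depth_m)[..., None]       # per-pixel transmittance
            return clean_rgb * t + airlight * (1.0 - t)  # attenuation + airlight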

  • Curriculum Model Adaptation with Synthetic and Real Data for Semantic Foggy Scene Understanding
    arXiv: Computer Vision and Pattern Recognition, 2019
    Co-Authors: Dengxin Dai, Luc Van Gool, Christos Sakaridis, Simon Hecker
    Abstract:

    This work addresses the problem of semantic scene understanding under fog. Although marked progress has been made in semantic scene understanding, it is mainly concentrated on clear-weather scenes. Extending semantic segmentation methods to adverse weather conditions such as fog is crucial for outdoor applications. In this paper, we propose a novel method, named Curriculum Model Adaptation (CMAda), which gradually adapts a semantic segmentation Model from light synthetic fog to dense real fog in multiple steps, using both labeled synthetic foggy data and unlabeled real foggy data. The method builds on the observation that the results of semantic segmentation in moderately adverse conditions (light fog) can be bootstrapped to solve the same problem in highly adverse conditions (dense fog). CMAda is extensible to other adverse conditions and provides a new paradigm for learning with synthetic data and unlabeled real data. In addition, we present four other main stand-alone contributions: 1) a novel method to add synthetic fog to real, clear-weather scenes using semantic input; 2) a new fog density estimator; 3) a novel fog densification method for real foggy scenes without known depth; and 4) the Foggy Zurich dataset comprising 3808 real foggy images, with pixel-level semantic annotations for 40 images under dense fog. Our experiments show that 1) our fog simulation and fog density estimator outperform their state-of-the-art counterparts with respect to the task of semantic foggy scene understanding (SFSU); and 2) CMAda significantly improves the performance of state-of-the-art Models for SFSU, benefiting from both our synthetic and our real foggy data. The datasets and code are available at the project website.

  • Dark Model Adaptation: Semantic Image Segmentation from Daytime to Nighttime
    International Conference on Intelligent Transportation Systems, 2018
    Co-Authors: Dengxin Dai, Luc Van Gool
    Abstract:

    This work addresses the problem of semantic image segmentation of nighttime scenes. Although considerable progress has been made in semantic image segmentation, it is mainly related to daytime scenarios. This paper proposes a novel method to progressively adapt semantic Models trained on daytime scenes, along with the large-scale annotations therein, to nighttime scenes via the bridge of twilight time (the period between dawn and sunrise, or between sunset and dusk). The goal of the method is to alleviate the cost of human annotation for nighttime images by transferring knowledge from standard daytime conditions. In addition to the method, a new dataset of road scenes is compiled; it consists of 35,000 images ranging from daytime through twilight time to nighttime. Also, a subset of the nighttime images is densely annotated for method evaluation. Our experiments show that our method is effective for knowledge transfer from daytime scenes to nighttime scenes without human annotation.
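
    The progressive scheme can be summarized as a self-training loop: pseudo-label each darker stage with the current Model, then fine-tune on those pseudo-labels before moving to the next stage. The Python sketch below is a schematic of that idea, not the authors' released code; model, predict, train, and the stage lists are hypothetical placeholders.

        def adapt_day_to_night(model, stages, predict, train):
            """stages: image sets ordered by increasing darkness, e.g.
            [twilight_images, nighttime_images]; `model` is assumed to be
            pre-trained on labeled daytime data."""
            for images in stages:
                # infer labels for the darker stage with the current model
                pseudo_labels = [predict(model, img) for img in images]
                # fine-tune on the pseudo-labeled darker images
                model = train(model, images, pseudo_labels)
            return model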

  • Dark Model Adaptation: Semantic Image Segmentation from Daytime to Nighttime
    arXiv: Computer Vision and Pattern Recognition, 2018
    Co-Authors: Dengxin Dai, Luc Van Gool
    Abstract:

    This work addresses the problem of semantic image segmentation of nighttime scenes. Although considerable progress has been made in semantic image segmentation, it is mainly related to daytime scenarios. This paper proposes a novel method to progressively adapt semantic Models trained on daytime scenes, along with the large-scale annotations therein, to nighttime scenes via the bridge of twilight time (the period between dawn and sunrise, or between sunset and dusk). The goal of the method is to alleviate the cost of human annotation for nighttime images by transferring knowledge from standard daytime conditions. In addition to the method, a new dataset of road scenes is compiled; it consists of 35,000 images ranging from daytime through twilight time to nighttime. Also, a subset of the nighttime images is densely annotated for method evaluation. Our experiments show that our method is effective for Model Adaptation from daytime scenes to nighttime scenes without using extra human annotation.

Jae Sam Yoon - One of the best experts on this subject based on the ideXlab platform.

  • Acoustic Model Adaptation Based on Pronunciation Variability Analysis for Non-Native Speech Recognition
    Speech Communication, 2007
    Co-Authors: Yoo Rhee Oh, Jae Sam Yoon
    Abstract:

    In this paper, pronunciation variability between native and non-native speakers is investigated, and a novel acoustic Model Adaptation method based on pronunciation variability analysis is proposed in order to improve the performance of a speech recognition system for non-native speakers. The proposed acoustic Model Adaptation method is performed in two steps: analysis of the pronunciation variability of non-native speech, and acoustic Model Adaptation based on that analysis. In order to obtain informative variant phonetic units, we analyze the pronunciation variability of non-native speech in two ways: a knowledge-based approach and a data-driven approach. Next, for each approach, the acoustic Model corresponding to each informative variant phonetic unit is adapted such that the state-tying of the acoustic Model for non-native speech reflects the phonetic variability. For further improvement, a conventional acoustic Model Adaptation method such as MLLR and/or MAP is combined with the proposed acoustic Model Adaptation method. Continuous Korean-English speech recognition experiments show that the proposed method achieves an average word error rate reduction of 16.76% and 12.80% for the knowledge-based approach and the data-driven approach, respectively, when compared with a baseline speech recognition system trained on native speech. Moreover, reductions of 53.45% and 57.14% in the average word error rate are obtained by combining MLLR and MAP Adaptation with the proposed method for the knowledge-based and data-driven approaches, respectively.
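
    For the MLLR component mentioned above, the adapted Gaussian means take the affine form A·mu + b, with the transform estimated from adaptation data. The Python sketch below illustrates a global MLLR mean transform under the common simplifying assumption of identity covariances, in which case the maximum-likelihood estimate reduces to weighted least squares over Gaussian occupation posteriors; it is a didactic sketch, not the paper's implementation.

        import numpy as np

        def mllr_adapt_means(means, frames, posteriors):
            """Global MLLR mean transform, simplified: with identity
            covariances the ML estimate of W = [A | b] reduces to weighted
            least squares over the occupation posteriors.
            means:      (G, D) Gaussian means of the native-trained Model
            frames:     (T, D) adaptation observations (non-native speech)
            posteriors: (T, G) occupation probabilities gamma_t(g)"""
            G, D = means.shape
            ext = np.hstack([means, np.ones((G, 1))])  # extended means [mu; 1]
            Z = np.zeros((D, D + 1))                   # sum gamma * o_t * xi_g^T
            S = np.zeros((D + 1, D + 1))               # sum gamma * xi_g * xi_g^T
            for t in range(frames.shape[0]):
                for g in range(G):
                    gam = posteriors[t, g]
                    if gam > 1e-8:
                        Z += gam * np.outer(frames[t], ext[g])
                        S += gam * np.outer(ext[g], ext[g])
            W = Z @ np.linalg.pinv(S)                  # (D, D+1) transform
            return ext @ W.T                           # adapted means A mu + b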

  • Acoustic Model Adaptation Based on Pronunciation Variability Analysis for Non-Native Speech Recognition
    International Conference on Acoustics Speech and Signal Processing, 2006
    Co-Authors: Yoo Rhee Oh, Jae Sam Yoon
    Abstract:

    In this paper, we investigate the pronunciation variability between native and non-native speakers and propose an acoustic Model Adaptation method based on the variability analysis in order to improve the performance of a non-native speech recognition system. The proposed acoustic Model Adaptation is performed in two steps. First, we construct baseline acoustic Models from native speech and perform phone recognition using the baseline acoustic Models to identify the most informative variant phonetic units from native to non-native speech. Next, the acoustic Model corresponding to each informative variant phonetic unit is adapted so that the state tying of the acoustic Model for non-native speech reflects this phonetic variability. For further improvement, a traditional acoustic Model Adaptation method such as MLLR or MAP can be applied to the system adapted with the proposed method. In this work, we select English as the target language, and the non-native speakers are all Korean. Continuous Korean-English speech recognition experiments show that the proposed method achieves an average word error rate reduction of 12.75% when compared with a speech recognition system using the baseline acoustic Models trained on native speech. Moreover, a reduction of 57.12% in the average word error rate is obtained by applying MLLR or MAP Adaptation to the acoustic Models adapted by the proposed method.
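
    The data-driven analysis step can be approximated as follows: align each canonical (native) phone sequence against the phone-recognizer output for the non-native utterance, then count substitution pairs; pairs that occur frequently indicate candidate variant phonetic units. The Python sketch below uses a plain edit-distance alignment; the phone labels and the paper's exact selection criteria are assumptions here.

        from collections import Counter
        import numpy as np

        def phone_confusions(canonical, recognized):
            """Align a canonical phone sequence against recognizer output
            with edit-distance DP and count substitution pairs; frequent
            pairs suggest informative variant phonetic units."""
            n, m = len(canonical), len(recognized)
            cost = np.zeros((n + 1, m + 1), dtype=int)
            cost[:, 0] = np.arange(n + 1)
            cost[0, :] = np.arange(m + 1)
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    sub = cost[i - 1, j - 1] + (canonical[i - 1] != recognized[j - 1])
                    cost[i, j] = min(sub, cost[i - 1, j] + 1, cost[i, j - 1] + 1)
            pairs, i, j = Counter(), n, m  # backtrace, collecting substitutions
            while i > 0 and j > 0:
                if cost[i, j] == cost[i - 1, j - 1] + (canonical[i - 1] != recognized[j - 1]):
                    if canonical[i - 1] != recognized[j - 1]:
                        pairs[(canonical[i - 1], recognized[j - 1])] += 1
                    i, j = i - 1, j - 1
                elif cost[i, j] == cost[i - 1, j] + 1:
                    i -= 1
                else:
                    j -= 1
            return pairs

    Aggregated over many utterances, a pair such as ('r', 'l') accumulating a high count would mark those phones as candidates for adaptation; this example pair is illustrative of Korean-accented English, not a reported result.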

Dengxin Dai - One of the best experts on this subject based on the ideXlab platform.

  • Guided Curriculum Model Adaptation and Uncertainty-Aware Evaluation for Semantic Nighttime Image Segmentation
    International Conference on Computer Vision, 2019
    Co-Authors: Christos Sakaridis, Dengxin Dai, Luc Van Gool
    Abstract:

    Most progress in semantic segmentation reports on daytime images taken under favorable illumination conditions. We instead address the problem of semantic segmentation of nighttime images and improve the state of the art by adapting daytime Models to nighttime without using nighttime annotations. Moreover, we design a new evaluation framework to address the substantial uncertainty of semantics in nighttime images. Our central contributions are: 1) a curriculum framework to gradually adapt semantic segmentation Models from day to night via labeled synthetic images and unlabeled real images, both for progressively darker times of day, which exploits cross-time-of-day correspondences for the real images to guide the inference of their labels; 2) a novel uncertainty-aware annotation and evaluation framework and metric for semantic segmentation, designed for adverse conditions and including image regions beyond human recognition capability in the evaluation in a principled fashion; 3) the Dark Zurich dataset, which comprises 2416 unlabeled nighttime and 2920 unlabeled twilight images with correspondences to their daytime counterparts, plus a set of 151 nighttime images with fine pixel-level annotations created with our protocol, which serves as a first benchmark for our novel evaluation. Experiments show that our guided curriculum Adaptation significantly outperforms state-of-the-art methods on real nighttime sets, both for standard metrics and for our uncertainty-aware metric. Furthermore, our uncertainty-aware evaluation reveals that selective invalidation of predictions can lead to better results on data with ambiguous content, such as our nighttime benchmark, and can benefit safety-oriented applications that involve invalid inputs.

  • Curriculum Model Adaptation with Synthetic and Real Data for Semantic Foggy Scene Understanding
    International Journal of Computer Vision, 2019
    Co-Authors: Dengxin Dai, Christos Sakaridis, Simon Hecker, Luc Van Gool
    Abstract:

    This work addresses the problem of semantic scene understanding under fog. Although marked progress has been made in semantic scene understanding, it is mainly concentrated on clear-weather scenes. Extending semantic segmentation methods to adverse weather conditions such as fog is crucial for outdoor applications. In this paper, we propose a novel method, named Curriculum Model Adaptation (CMAda), which gradually adapts a semantic segmentation Model from light synthetic fog to dense real fog in multiple steps, using both labeled synthetic foggy data and unlabeled real foggy data. The method builds on the observation that the results of semantic segmentation in moderately adverse conditions (light fog) can be bootstrapped to solve the same problem in highly adverse conditions (dense fog). CMAda is extensible to other adverse conditions and provides a new paradigm for learning with synthetic data and unlabeled real data. In addition, we present four other main stand-alone contributions: (1) a novel method to add synthetic fog to real, clear-weather scenes using semantic input; (2) a new fog density estimator; (3) a novel fog densification method for real foggy scenes without known depth; and (4) the Foggy Zurich dataset comprising 3808 real foggy images, with pixel-level semantic annotations for 40 images with dense fog. Our experiments show that (1) our fog simulation and fog density estimator outperform their state-of-the-art counterparts with respect to the task of semantic foggy scene understanding (SFSU); and (2) CMAda significantly improves the performance of state-of-the-art Models for SFSU, benefiting from both our synthetic and our real foggy data. The foggy datasets and code are publicly available.

  • Curriculum Model Adaptation with Synthetic and Real Data for Semantic Foggy Scene Understanding
    arXiv: Computer Vision and Pattern Recognition, 2019
    Co-Authors: Dengxin Dai, Luc Van Gool, Christos Sakaridis, Simon Hecker
    Abstract:

    This work addresses the problem of semantic scene understanding under fog. Although marked progress has been made in semantic scene understanding, it is mainly concentrated on clear-weather scenes. Extending semantic segmentation methods to adverse weather conditions such as fog is crucial for outdoor applications. In this paper, we propose a novel method, named Curriculum Model Adaptation (CMAda), which gradually adapts a semantic segmentation Model from light synthetic fog to dense real fog in multiple steps, using both labeled synthetic foggy data and unlabeled real foggy data. The method builds on the observation that the results of semantic segmentation in moderately adverse conditions (light fog) can be bootstrapped to solve the same problem in highly adverse conditions (dense fog). CMAda is extensible to other adverse conditions and provides a new paradigm for learning with synthetic data and unlabeled real data. In addition, we present four other main stand-alone contributions: 1) a novel method to add synthetic fog to real, clear-weather scenes using semantic input; 2) a new fog density estimator; 3) a novel fog densification method for real foggy scenes without known depth; and 4) the Foggy Zurich dataset comprising 3808 real foggy images, with pixel-level semantic annotations for 40 images under dense fog. Our experiments show that 1) our fog simulation and fog density estimator outperform their state-of-the-art counterparts with respect to the task of semantic foggy scene understanding (SFSU); and 2) CMAda significantly improves the performance of state-of-the-art Models for SFSU, benefiting from both our synthetic and our real foggy data. The datasets and code are available at the project website.

  • Dark Model Adaptation: Semantic Image Segmentation from Daytime to Nighttime
    International Conference on Intelligent Transportation Systems, 2018
    Co-Authors: Dengxin Dai, Luc Van Gool
    Abstract:

    This work addresses the problem of semantic image segmentation of nighttime scenes. Although considerable progress has been made in semantic image segmentation, it is mainly related to daytime scenarios. This paper proposes a novel method to progressively adapt semantic Models trained on daytime scenes, along with the large-scale annotations therein, to nighttime scenes via the bridge of twilight time (the period between dawn and sunrise, or between sunset and dusk). The goal of the method is to alleviate the cost of human annotation for nighttime images by transferring knowledge from standard daytime conditions. In addition to the method, a new dataset of road scenes is compiled; it consists of 35,000 images ranging from daytime through twilight time to nighttime. Also, a subset of the nighttime images is densely annotated for method evaluation. Our experiments show that our method is effective for knowledge transfer from daytime scenes to nighttime scenes without human annotation.

  • Dark Model Adaptation: Semantic Image Segmentation from Daytime to Nighttime
    arXiv: Computer Vision and Pattern Recognition, 2018
    Co-Authors: Dengxin Dai, Luc Van Gool
    Abstract:

    This work addresses the problem of semantic image segmentation of nighttime scenes. Although considerable progress has been made in semantic image segmentation, it is mainly related to daytime scenarios. This paper proposes a novel method to progressively adapt semantic Models trained on daytime scenes, along with the large-scale annotations therein, to nighttime scenes via the bridge of twilight time (the period between dawn and sunrise, or between sunset and dusk). The goal of the method is to alleviate the cost of human annotation for nighttime images by transferring knowledge from standard daytime conditions. In addition to the method, a new dataset of road scenes is compiled; it consists of 35,000 images ranging from daytime through twilight time to nighttime. Also, a subset of the nighttime images is densely annotated for method evaluation. Our experiments show that our method is effective for Model Adaptation from daytime scenes to nighttime scenes without using extra human annotation.

Alex Acero - One of the best experts on this subject based on the ideXlab platform.

  • Noise Adaptive Training for Robust Automatic Speech Recognition
    IEEE Transactions on Audio Speech and Language Processing, 2010
    Co-Authors: Ozlem Kalinli, Michael L. Seltzer, Jasha Droppo, Alex Acero
    Abstract:

    In traditional methods for noise-robust automatic speech recognition, the acoustic Models are typically trained using clean speech or using multi-condition data processed by the same feature enhancement algorithm expected to be used in decoding. In this paper, we propose a noise adaptive training (NAT) algorithm that can be applied to all training data and that normalizes the environmental distortion as part of Model training. In contrast to feature enhancement methods, NAT estimates the underlying “pseudo-clean” Model parameters directly, without relying on point estimates of the clean speech features as an intermediate step. The pseudo-clean Model parameters learned with NAT are later used with vector Taylor series (VTS) Model Adaptation for decoding noisy utterances at test time. Experiments performed on the Aurora 2 and Aurora 3 tasks demonstrate that the proposed NAT method obtains relative improvements of 18.83% and 32.02%, respectively, over VTS Model Adaptation.
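
    The VTS Model Adaptation referenced here linearizes the relationship between clean and noisy speech features around the current Model means. In the log-mel domain the noisy mean is mu_y = mu_x + mu_h + log(1 + exp(mu_n − mu_x − mu_h)); the cepstral version wraps the same nonlinearity in DCT/IDCT matrices, which the Python sketch below omits for brevity. This is a first-order illustration, not the paper's full NAT training loop.

        import numpy as np

        def vts_adapt_mean(mu_x, mu_n, mu_h=None):
            """First-order VTS adaptation of a clean-speech Gaussian mean in
            the log-mel domain: y = x + h + log(1 + exp(n - x - h)).
            Returns the adapted ("noisy") mean and the diagonal Jacobian
            dy/dx, which covariance adaptation would reuse."""
            if mu_h is None:
                mu_h = np.zeros_like(mu_x)     # no channel distortion
            gap = mu_n - mu_x - mu_h           # noise-minus-speech gap
            mu_y = mu_x + mu_h + np.log1p(np.exp(gap))
            jac = 1.0 / (1.0 + np.exp(gap))    # dy/dx, elementwise
            return mu_y, jac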

  • Noise Adaptive Training Using a Vector Taylor Series Approach for Noise Robust Automatic Speech Recognition
    ICASSP IEEE International Conference on Acoustics Speech and Signal Processing - Proceedings, 2009
    Co-Authors: Ozlem Kalinli, Michael L. Seltzer, Alex Acero
    Abstract:

    In traditional methods for noise-robust automatic speech recognition, the acoustic Models are typically trained using clean speech or using multi-condition data processed by the same feature enhancement algorithm expected to be used in decoding. In this paper, we propose a noise adaptive training (NAT) algorithm that can be applied to all training data and that normalizes the environmental distortion as part of Model training. In contrast to feature enhancement methods, NAT estimates the underlying “pseudo-clean” Model parameters directly, without relying on point estimates of the clean speech features as an intermediate step. The pseudo-clean Model parameters learned with NAT are later used with vector Taylor series (VTS) Model Adaptation for decoding noisy utterances at test time. Experiments performed on the Aurora 2 and Aurora 3 tasks demonstrate that the proposed NAT method obtains relative improvements of 18.83% and 32.02%, respectively, over VTS Model Adaptation.

Yoo Rhee Oh - One of the best experts on this subject based on the ideXlab platform.

  • Acoustic Model Adaptation Based on Pronunciation Variability Analysis for Non-Native Speech Recognition
    Speech Communication, 2007
    Co-Authors: Yoo Rhee Oh, Jae Sam Yoon
    Abstract:

    In this paper, pronunciation variability between native and non-native speakers is investigated, and a novel acoustic Model Adaptation method based on pronunciation variability analysis is proposed in order to improve the performance of a speech recognition system for non-native speakers. The proposed acoustic Model Adaptation method is performed in two steps: analysis of the pronunciation variability of non-native speech, and acoustic Model Adaptation based on that analysis. In order to obtain informative variant phonetic units, we analyze the pronunciation variability of non-native speech in two ways: a knowledge-based approach and a data-driven approach. Next, for each approach, the acoustic Model corresponding to each informative variant phonetic unit is adapted such that the state-tying of the acoustic Model for non-native speech reflects the phonetic variability. For further improvement, a conventional acoustic Model Adaptation method such as MLLR and/or MAP is combined with the proposed acoustic Model Adaptation method. Continuous Korean-English speech recognition experiments show that the proposed method achieves an average word error rate reduction of 16.76% and 12.80% for the knowledge-based approach and the data-driven approach, respectively, when compared with a baseline speech recognition system trained on native speech. Moreover, reductions of 53.45% and 57.14% in the average word error rate are obtained by combining MLLR and MAP Adaptation with the proposed method for the knowledge-based and data-driven approaches, respectively.

  • Acoustic Model Adaptation Based on Pronunciation Variability Analysis for Non-Native Speech Recognition
    International Conference on Acoustics Speech and Signal Processing, 2006
    Co-Authors: Yoo Rhee Oh, Jae Sam Yoon
    Abstract:

    In this paper, we investigate the pronunciation variability between native and non-native speakers and propose an acoustic Model Adaptation method based on the variability analysis in order to improve the performance of a non-native speech recognition system. The proposed acoustic Model Adaptation is performed in two steps. First, we construct baseline acoustic Models from native speech and perform phone recognition using the baseline acoustic Models to identify the most informative variant phonetic units from native to non-native speech. Next, the acoustic Model corresponding to each informative variant phonetic unit is adapted so that the state tying of the acoustic Model for non-native speech reflects this phonetic variability. For further improvement, a traditional acoustic Model Adaptation method such as MLLR or MAP can be applied to the system adapted with the proposed method. In this work, we select English as the target language, and the non-native speakers are all Korean. Continuous Korean-English speech recognition experiments show that the proposed method achieves an average word error rate reduction of 12.75% when compared with a speech recognition system using the baseline acoustic Models trained on native speech. Moreover, a reduction of 57.12% in the average word error rate is obtained by applying MLLR or MAP Adaptation to the acoustic Models adapted by the proposed method.