The Experts below are selected from a list of 1039770 Experts worldwide ranked by ideXlab platform
John H. L. Hansen - One of the best experts on this subject based on the ideXlab platform.
-
A Study on Universal Background Model Training in Speaker Verification
IEEE Transactions on Audio, Speech, and Language Processing, 2011
Co-Authors: Taufiq Hasan, John H. L. Hansen
Abstract: State-of-the-art Gaussian mixture model (GMM)-based speaker recognition/verification systems rely on a universal background model (UBM), whose training typically requires extensive resources, especially if multiple channel and microphone categories are considered. In this study, speaker verification performance is analyzed systematically while the UBM data is selected and purposefully altered in different ways, including variation in the amount of data, the sub-sampling structure of the feature frames, and the number of speakers. An objective measure is formulated from the UBM covariance matrix and is found to be highly correlated with system performance, both when the amount of data is varied with the UBM data set held fixed and when the number of UBM speakers is increased with the data amount held constant. The advantages of feature sub-sampling for accelerating UBM training are also discussed, and a novel, effective phonetic distance-based frame selection method is developed. The sub-sampling methods presented are shown to retain baseline equal error rate (EER) performance using only 1% of the original UBM data, drastically reducing UBM training time. This, in theory, dispels the myth of "there's no data like more data" for UBM construction. With respect to the UBM speakers, the effect of systematically controlling the number of training (UBM) speakers on overall system performance is analyzed. It is shown experimentally that increasing the inter-speaker variability in the UBM data while holding the total data size constant gradually improves system performance. Finally, two alternative speaker selection methods based on different speaker diversity measures are presented. Using the proposed schemes, it is shown that by selecting a diverse set of UBM speakers, baseline system performance can be retained using less than 30% of the original UBM speakers.
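The diversity-based speaker selection described above can be sketched with a greedy farthest-point heuristic over per-speaker mean feature vectors. The abstract does not specify the paper's actual diversity measures, so the distance criterion, the 39-dimensional mean vectors, and the function name below are illustrative assumptions, not the authors' method.

```python
import numpy as np

def select_diverse_speakers(speaker_means, k):
    """Greedy farthest-point selection: pick k speakers whose mean
    feature vectors are maximally spread out. A simple stand-in for
    the paper's (unspecified) speaker diversity measures."""
    chosen = [0]  # start from an arbitrary speaker
    dists = np.linalg.norm(speaker_means - speaker_means[0], axis=1)
    while len(chosen) < k:
        nxt = int(np.argmax(dists))  # farthest from the chosen set
        chosen.append(nxt)
        # distance of every speaker to its nearest chosen speaker
        dists = np.minimum(
            dists, np.linalg.norm(speaker_means - speaker_means[nxt], axis=1))
    return chosen

rng = np.random.default_rng(0)
# 100 hypothetical speakers, each summarized by a 39-dim mean MFCC vector
means = rng.normal(size=(100, 39))
subset = select_diverse_speakers(means, 30)  # keep 30% of the speakers
```

Because already-chosen speakers have zero distance to the chosen set, the argmax never repeats a speaker, so the subset is guaranteed to contain k distinct, well-separated speakers.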
-
ICASSP - A novel feature sub-sampling method for efficient universal Background Model training in speaker verification
2010 IEEE International Conference on Acoustics, Speech and Signal Processing, 2010
Co-Authors: Taufiq Hasan, Yun Lei, Aravind Chandrasekaran, John H. L. Hansen
Abstract: Speaker recognition/verification systems rely on a universal background model (UBM), whose training typically requires extensive resources, especially if new channel domains are considered. In this study we propose an effective and computationally efficient algorithm for training the UBM for speaker verification. A novel method based on the Euclidean distance between features is developed for effective sub-sampling of potential training feature vectors. Using only about 1.5 seconds of data from each development utterance, the proposed UBM training method drastically reduces computation time while improving, or at least retaining, the original speaker verification performance. While methods such as factor analysis can mitigate some of the issues associated with channel/microphone/environment mismatch, the proposed scheme offers a viable alternative for rapidly training environment-dependent UBMs.
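Euclidean distance-based frame sub-sampling can be illustrated with a minimal sketch. The abstract does not give the actual selection rule or threshold, so the rule below (keep a frame only when it moves far enough from the last kept frame, discarding near-duplicate frames) is an assumption about how such a selector might look.

```python
import numpy as np

def subsample_frames(frames, threshold):
    """Keep a frame only if its Euclidean distance to the most recently
    kept frame exceeds `threshold`. Nearly stationary stretches of the
    feature stream collapse to a single representative frame."""
    kept = [frames[0]]
    for f in frames[1:]:
        if np.linalg.norm(f - kept[-1]) > threshold:
            kept.append(f)
    return np.stack(kept)

rng = np.random.default_rng(1)
# a slowly varying 13-dim feature stream with many redundant frames
base = rng.normal(size=(1, 13))
stream = base + 0.01 * rng.normal(size=(500, 13))
kept = subsample_frames(stream, threshold=0.5)  # far fewer than 500 frames survive
```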
Taufiq Hasan
-
A Study on Universal Background Model Training in Speaker Verification
IEEE Transactions on Audio, Speech, and Language Processing, 2011
Co-Authors: Taufiq Hasan, John H. L. Hansen
Abstract: State-of-the-art Gaussian mixture model (GMM)-based speaker recognition/verification systems rely on a universal background model (UBM), whose training typically requires extensive resources, especially if multiple channel and microphone categories are considered. In this study, speaker verification performance is analyzed systematically while the UBM data is selected and purposefully altered in different ways, including variation in the amount of data, the sub-sampling structure of the feature frames, and the number of speakers. An objective measure is formulated from the UBM covariance matrix and is found to be highly correlated with system performance, both when the amount of data is varied with the UBM data set held fixed and when the number of UBM speakers is increased with the data amount held constant. The advantages of feature sub-sampling for accelerating UBM training are also discussed, and a novel, effective phonetic distance-based frame selection method is developed. The sub-sampling methods presented are shown to retain baseline equal error rate (EER) performance using only 1% of the original UBM data, drastically reducing UBM training time. This, in theory, dispels the myth of "there's no data like more data" for UBM construction. With respect to the UBM speakers, the effect of systematically controlling the number of training (UBM) speakers on overall system performance is analyzed. It is shown experimentally that increasing the inter-speaker variability in the UBM data while holding the total data size constant gradually improves system performance. Finally, two alternative speaker selection methods based on different speaker diversity measures are presented. Using the proposed schemes, it is shown that by selecting a diverse set of UBM speakers, baseline system performance can be retained using less than 30% of the original UBM speakers.
-
ICASSP - A novel feature sub-sampling method for efficient universal Background Model training in speaker verification
2010 IEEE International Conference on Acoustics, Speech and Signal Processing, 2010
Co-Authors: Taufiq Hasan, Yun Lei, Aravind Chandrasekaran, John H. L. Hansen
Abstract: Speaker recognition/verification systems rely on a universal background model (UBM), whose training typically requires extensive resources, especially if new channel domains are considered. In this study we propose an effective and computationally efficient algorithm for training the UBM for speaker verification. A novel method based on the Euclidean distance between features is developed for effective sub-sampling of potential training feature vectors. Using only about 1.5 seconds of data from each development utterance, the proposed UBM training method drastically reduces computation time while improving, or at least retaining, the original speaker verification performance. While methods such as factor analysis can mitigate some of the issues associated with channel/microphone/environment mismatch, the proposed scheme offers a viable alternative for rapidly training environment-dependent UBMs.
Rin-ichiro Taniguchi
-
Case-based Background Modeling: associative Background database towards low-cost and high-performance change detection
Machine Vision and Applications, 2014
Co-Authors: Atsushi Shimada, Hajime Nagahara, Yosuke Nonaka, Rin-ichiro Taniguchi
Abstract: Background modeling and subtraction is an essential task in video surveillance applications. Many researchers have discussed improving the performance of background models and reducing their memory usage or computational cost. To adapt to background changes, background models have been enhanced with various kinds of information, including spatial consistency and temporal tendencies, at the cost of large memory allocations. Meanwhile, approaches that reduce memory cost cannot provide better background-subtraction accuracy. To tackle this trade-off, this paper proposes a novel framework named "case-based background modeling". Its characteristics are: (1) a background model is created, or removed, only when necessary; (2) models are shared case-by-case among groups of pixels; (3) pixel features are divided into two groups, one for model selection and the other for modeling. These approaches realize a low-cost, highly accurate background model. Memory usage and computational cost were reduced to half those of a traditional method, while accuracy was superior.
-
Background Model based on statistical local difference pattern
International Conference on Computer Vision, 2012
Co-Authors: Satoshi Yoshinaga, Hajime Nagahara, Atsushi Shimada, Rin-ichiro Taniguchi
Abstract: We present a robust background model for object detection and report its evaluation on the Background Models Challenge (BMC) database. Our background model is based on a statistical local feature: we use an illumination-invariant local feature and describe its distribution within a statistical framework. Owing to the effectiveness of the local feature and the statistical framework, our method can adapt to both illumination changes and dynamic background changes. Experimental results on the BMC database show that our method detects foreground objects robustly against background changes.
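The key idea of an illumination-invariant local feature can be shown with a toy example: comparing a pixel only against its neighbors makes the feature insensitive to a global brightness scaling, since multiplying the whole image by a positive constant preserves the sign of every local difference. The paper's actual statistical local difference pattern is more elaborate; the neighborhood layout and function name below are illustrative assumptions.

```python
import numpy as np

def local_difference_pattern(img, y, x, radius=2):
    """Signs of the differences between a center pixel and its four
    neighbours at distance `radius`. Under a global illumination change
    img -> c * img (c > 0), every sign is unchanged, so the pattern is
    invariant to that change."""
    c = img[y, x]
    neigh = [img[y - radius, x], img[y + radius, x],
             img[y, x - radius], img[y, x + radius]]
    return tuple(np.sign(n - c) for n in neigh)

rng = np.random.default_rng(2)
frame = rng.uniform(1.0, 255.0, size=(10, 10))
p1 = local_difference_pattern(frame, 5, 5)
p2 = local_difference_pattern(1.7 * frame, 5, 5)  # same scene, brighter
```

Because `p1` and `p2` are identical, a background model built on such patterns is not fooled by a lighting change, while a genuine foreground object still alters the local differences.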
-
Hybrid Background Modeling for Long-term and Short-term Illumination Changes
IEEJ Transactions on Electronics, Information and Systems, 2010
Co-Authors: Atsushi Shimada, Rin-ichiro Taniguchi
Abstract: Background modeling has been widely researched to detect moving objects in image sequences. It is necessary to adapt the background model to various changes in illumination conditions. In recent years, hybrid background models, which combine more than one background model, have been used for object detection because they adapt well to illumination changes. In this paper, we propose a new hybrid background model named the "Hybrid Spatial-Temporal Background Model". Our model consists of two different background models: a pixel-level background model that adapts to long-term illumination changes, and a spatial-temporal background model that adapts to short-term illumination changes. Our experimental results demonstrate the superiority of our method over several related works.
-
AVSS - Hybrid Background Model Using Spatial-Temporal LBP
2009 Sixth IEEE International Conference on Advanced Video and Signal Based Surveillance, 2009
Co-Authors: Atsushi Shimada, Rin-ichiro Taniguchi
Abstract: Background modeling has been widely researched to detect moving objects in image sequences. It is necessary to adapt the background model to various changes in illumination conditions. In recent years, hybrid background models, which combine more than one background model, have been used for object detection because they are very robust to illumination changes. In this paper, we propose a new hybrid background model named the "Hybrid Spatial-Temporal Background Model". Our model consists of two different background models: a pixel-level background model that is robust to long-term illumination changes, and a spatial-temporal background model that is robust to short-term illumination changes. Our experimental results demonstrate the superiority of our method over several related works.
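A spatial-temporal local binary pattern (LBP) of the kind named in the title can be sketched as follows: spatial bits compare a pixel with its 3x3 neighbors in the current frame, and a temporal bit compares it with the same pixel in the previous frame. The paper's exact operator (sampling layout, thresholds, number of bits) is not given in the abstract, so this 9-bit variant is an assumption chosen for clarity.

```python
import numpy as np

def spatial_temporal_lbp(prev, cur, y, x):
    """Toy spatial-temporal LBP: 8 spatial bits from the 3x3 neighbourhood
    of (y, x) in the current frame plus 1 temporal bit from the previous
    frame, packed into a single integer code in [0, 511]."""
    c = cur[y, x]
    bits = []
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            bits.append(int(cur[y + dy, x + dx] >= c))  # spatial bits
    bits.append(int(prev[y, x] >= c))                   # temporal bit
    code = 0
    for b in bits:
        code = (code << 1) | b
    return code

rng = np.random.default_rng(3)
f0 = rng.integers(0, 256, size=(8, 8))  # previous frame
f1 = rng.integers(0, 256, size=(8, 8))  # current frame
code = spatial_temporal_lbp(f0, f1, 4, 4)
```

A background model would then maintain, per pixel, a histogram of these codes over time and flag a pixel as foreground when its current code is rare under that histogram.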
-
A Fast Algorithm for Adaptive Background Model Construction Using Parzen Density Estimation
Advanced Video and Signal Based Surveillance, 2007
Co-Authors: Tatsuya Tanaka, Daisaku Arita, Atsushi Shimada, Rin-ichiro Taniguchi
Abstract: Non-parametric representation of the pixel intensity distribution is quite effective for constructing a proper background model and detecting foreground objects accurately. From the viewpoint of practical application, however, the computational cost of estimating the distribution should be reduced. In this paper, we present a fast estimation of the probability density function (PDF) of the pixel value using Parzen density estimation, and foreground object detection based on the estimated PDF. The PDF is computed by partially updating the PDF estimated at the previous frame, which greatly reduces the computational cost of the estimation. The background model thus adapts quickly to changes in the scene, so foreground objects can be detected robustly. Several experiments show the effectiveness of our approach.
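The partial-update idea can be made concrete: with a sliding window of the last N samples per pixel, the Parzen estimate on a fixed intensity grid is updated each frame by adding the new sample's kernel and subtracting the oldest sample's kernel, instead of re-summing all N kernels. The class, parameter names, and grid below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def gaussian_kernel(x, xi, h):
    # 1-D Gaussian kernel centred at xi with bandwidth h
    return np.exp(-0.5 * ((x - xi) / h) ** 2) / (h * np.sqrt(2 * np.pi))

class RecursiveParzen:
    """Parzen (kernel) density estimate of one pixel's intensity over a
    sliding window of the last n samples, updated incrementally in O(grid)
    per frame rather than O(n * grid)."""
    def __init__(self, grid, n=50, h=5.0):
        self.grid, self.n, self.h = grid, n, h
        self.samples = []
        self.pdf = np.zeros_like(grid, dtype=float)

    def update(self, value):
        self.samples.append(value)
        self.pdf += gaussian_kernel(self.grid, value, self.h) / self.n
        if len(self.samples) > self.n:
            old = self.samples.pop(0)
            self.pdf -= gaussian_kernel(self.grid, old, self.h) / self.n

    def is_foreground(self, value, thresh=1e-3):
        # low estimated density => value is unlikely under the background
        idx = int(np.argmin(np.abs(self.grid - value)))
        return self.pdf[idx] < thresh

grid = np.arange(256, dtype=float)
model = RecursiveParzen(grid)
for _ in range(60):
    model.update(100.0)            # stable background around intensity 100
bg = model.is_foreground(100.0)    # well explained by the model
fg = model.is_foreground(230.0)    # far from everything the model has seen
```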
Balakrishnan Varadarajan
-
Universal Background Model Based Speech Recognition
International Conference on Acoustics, Speech and Signal Processing, 2008
Co-Authors: Daniel Povey, Stephen M Chu, Balakrishnan Varadarajan
Abstract: The universal background model (UBM) is an effective framework widely used in speaker recognition, but so far it has received little attention in speech recognition. In this work, we make a first attempt to apply the UBM to acoustic modeling in ASR. We propose a tree-based parameter estimation technique for UBMs and describe a set of smoothing and pruning methods to facilitate learning. The proposed UBM approach is benchmarked on a state-of-the-art large-vocabulary continuous speech recognition platform on a broadcast transcription task. Preliminary experiments reported in this paper already show very promising results.
Anil K Jain
-
A Background Model Initialization Algorithm for Video Surveillance
International Conference on Computer Vision, 2001
Co-Authors: D Gutchess, M Trajkovics, Eric Cohen-Solal, Damian M Lyons, Anil K Jain
Abstract: Many motion detection and tracking algorithms rely on background subtraction, a technique that detects changes from a model of the background scene. We present a new algorithm for background model initialization. The algorithm takes as input a video sequence in which moving objects are present, and outputs a statistical background model describing the static parts of the scene. Multiple hypotheses of the background value at each pixel are generated by locating periods of stable intensity in the sequence. The likelihood of each hypothesis is then evaluated using optical flow information from the neighborhood around the pixel, and the most likely hypothesis is chosen to represent the background. Our results are compared with those of several standard background modeling techniques using surveillance video of humans in indoor environments.
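The hypothesis-generation step (locating periods of stable intensity at each pixel) can be sketched as a run-segmentation of the pixel's intensity trace; each sufficiently long stable run contributes one candidate background value. The paper then scores hypotheses with optical flow, which is omitted here; the tolerance, minimum run length, and function name are illustrative assumptions.

```python
import numpy as np

def stable_intervals(series, tol=10, min_len=5):
    """Split one pixel's intensity trace into maximal runs whose values
    stay within `tol` of the run's mean; each run of at least `min_len`
    frames yields one background hypothesis (the run's mean)."""
    hypotheses, start = [], 0
    for i in range(1, len(series) + 1):
        run = series[start:i]
        if i == len(series) or abs(series[i] - np.mean(run)) > tol:
            if len(run) >= min_len:
                hypotheses.append(float(np.mean(run)))
            start = i
    return hypotheses

# pixel shows background (~50) occluded mid-sequence by an object (~200)
trace = np.array([50, 52, 49, 51, 50, 200, 201, 199, 202, 200,
                  50, 51, 49, 52, 50], dtype=float)
hyps = stable_intervals(trace, tol=10, min_len=5)
```

Both intensity levels appear as hypotheses; a later scoring stage (optical flow in the paper) is what decides that the ~50 level, not the ~200 occluder, represents the true background.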
-
ICCV - A Background Model initialization algorithm for video surveillance
Proceedings of the Eighth IEEE International Conference on Computer Vision (ICCV 2001), 2001
Co-Authors: D Gutchess, M Trajkovics, Damian M Lyons, Eric Cohen-Solal, Anil K Jain
Abstract: Many motion detection and tracking algorithms rely on background subtraction, a technique that detects changes from a model of the background scene. We present a new algorithm for background model initialization. The algorithm takes as input a video sequence in which moving objects are present, and outputs a statistical background model describing the static parts of the scene. Multiple hypotheses of the background value at each pixel are generated by locating periods of stable intensity in the sequence. The likelihood of each hypothesis is then evaluated using optical flow information from the neighborhood around the pixel, and the most likely hypothesis is chosen to represent the background. Our results are compared with those of several standard background modeling techniques using surveillance video of humans in indoor environments.