Laplacian Distribution

The experts below are selected from a list of 1,662 experts worldwide, ranked by the ideXlab platform.

Tomoki Toda - One of the best experts on this subject based on the ideXlab platform.

  • Efficient Shallow WaveNet Vocoder Using Multiple Samples Output Based on Laplacian Distribution and Linear Prediction
    2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020
    Co-Authors: Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda
    Abstract:

    This paper presents an efficient implementation scheme for a shallow WaveNet vocoder with multiple-sample (segment) output based on the Laplacian distribution and linear prediction. In our previous work, we proposed a shallow WaveNet vocoder architecture that uses only 9 dilated convolutional layers yet is capable of generating high-quality speech by modeling speech samples with a Laplacian distribution. However, there is still considerable room to improve computational efficiency, for example through segment-wise inference and a more compact structure. In this work, we address this issue by proposing a simple segment-output modeling scheme, easily extended to other neural vocoders, in which the Laplacian distribution parameters of multiple samples are estimated simultaneously. Furthermore, to preserve the dependencies among samples within a segment, we propose using linear prediction (LP) to compute the distribution parameters, where data-driven LP coefficients are estimated by the WaveNet vocoder along with the location and scale parameters. Finally, a shallower WaveNet vocoder with 6 layers is deployed. The experimental results demonstrate that the proposed LP-based Laplacian distribution alleviates the quality degradation caused by segment generation.
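
    To make the segment-output formulation more concrete, here is a minimal sketch, assuming the network emits, for each sample in a segment, a set of LP coefficients, a residual location, and a log scale; the function and variable names (sample_segment, lp_coeffs, res_loc, log_scale) are illustrative assumptions rather than the authors' implementation.

    ```python
    import numpy as np

    def sample_segment(history, lp_coeffs, res_loc, log_scale, rng=None):
        """Hypothetical sketch: generate one segment of waveform samples.

        history   : 1-D array of previously generated samples, most recent last
                    (assumed to hold at least K samples)
        lp_coeffs : (T, K) data-driven LP coefficients predicted by the vocoder
        res_loc   : (T,)  residual location parameters
        log_scale : (T,)  log scale parameters of the Laplacian
        """
        if rng is None:
            rng = np.random.default_rng()
        T, K = lp_coeffs.shape
        out = []
        ctx = list(history[-K:])                  # LP context: the last K samples
        for t in range(T):
            # Location = linear prediction over past samples + predicted residual location
            mu = float(np.dot(lp_coeffs[t], ctx[::-1])) + res_loc[t]
            b = float(np.exp(log_scale[t]))       # Laplacian scale, kept positive
            x = rng.laplace(loc=mu, scale=b)      # draw the next waveform sample
            out.append(x)
            ctx = ctx[1:] + [x]                   # slide the LP context window
        return np.array(out)
    ```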

  • Investigation of Shallow WaveNet Vocoder with Laplacian Distribution Output
    2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2019
    Co-Authors: Patrick Lumban Tobing, Tomoki Hayashi, Tomoki Toda
    Abstract:

    In this paper, we investigate a shallow architecture and a Laplacian distribution output for a WaveNet vocoder trained with limited data. A shallower WaveNet architecture is proposed to better suit use cases with limited training data and to reduce computation time. To further improve the modeling capability of the WaveNet vocoder, a Laplacian distribution output is proposed. The Laplacian distribution is inherently sparse, with a higher peak and a fatter tail than the Gaussian, which may be better suited to modeling speech signals. The experimental results demonstrate that: 1) the proposed shallow variant of the WaveNet architecture gives performance comparable to the deep architecture with a softmax output while reducing computation time by 73%; and 2) the Laplacian distribution output consistently improves speech quality across various amounts of limited training data, with the two highest mean opinion scores reaching 4.22.
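
    Training with a Laplacian output amounts to minimizing its negative log-likelihood instead of a softmax cross-entropy; the snippet below is a minimal sketch of that loss for network-predicted location and log-scale parameters (the tensor names and the clamping constant are assumptions, not taken from the paper).

    ```python
    import torch

    def laplacian_nll(x, loc, log_scale, min_log_scale=-7.0):
        """Mean negative log-likelihood of samples x under Laplace(loc, scale)."""
        log_b = torch.clamp(log_scale, min=min_log_scale)  # keep the scale away from zero
        b = torch.exp(log_b)
        # -log p(x) = log(2b) + |x - loc| / b
        return (torch.log(2.0 * b) + torch.abs(x - loc) / b).mean()

    # Usage sketch: loss = laplacian_nll(target_wave, pred_loc, pred_log_scale)
    ```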

Patrick Lumban Tobing - One of the best experts on this subject based on the ideXlab platform.

  • Efficient Shallow WaveNet Vocoder Using Multiple Samples Output Based on Laplacian Distribution and Linear Prediction
    2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020
    Co-Authors: Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda
    Abstract:

    This paper presents an efficient implementation scheme for a shallow WaveNet vocoder with multiple-sample (segment) output based on the Laplacian distribution and linear prediction. In our previous work, we proposed a shallow WaveNet vocoder architecture that uses only 9 dilated convolutional layers yet is capable of generating high-quality speech by modeling speech samples with a Laplacian distribution. However, there is still considerable room to improve computational efficiency, for example through segment-wise inference and a more compact structure. In this work, we address this issue by proposing a simple segment-output modeling scheme, easily extended to other neural vocoders, in which the Laplacian distribution parameters of multiple samples are estimated simultaneously. Furthermore, to preserve the dependencies among samples within a segment, we propose using linear prediction (LP) to compute the distribution parameters, where data-driven LP coefficients are estimated by the WaveNet vocoder along with the location and scale parameters. Finally, a shallower WaveNet vocoder with 6 layers is deployed. The experimental results demonstrate that the proposed LP-based Laplacian distribution alleviates the quality degradation caused by segment generation.
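
    As a side note on the "shallow" architecture, the receptive field of a stack of dilated causal convolutions is straightforward to compute; the sketch below compares hypothetical 9-layer and 6-layer stacks with doubling dilations (the actual kernel size and dilation pattern of the paper are not given here, so the numbers are only illustrative).

    ```python
    def receptive_field(dilations, kernel_size=2):
        """Receptive field (in samples) of stacked dilated causal convolutions."""
        return 1 + sum((kernel_size - 1) * d for d in dilations)

    # Assumed pattern: dilation doubles at every layer within a single block.
    deep_9    = [2 ** i for i in range(9)]   # 1, 2, 4, ..., 256
    shallow_6 = [2 ** i for i in range(6)]   # 1, 2, 4, ..., 32
    print(receptive_field(deep_9), receptive_field(shallow_6))  # 512 vs 64
    ```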

  • Investigation of Shallow WaveNet Vocoder with Laplacian Distribution Output
    2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2019
    Co-Authors: Patrick Lumban Tobing, Tomoki Hayashi, Tomoki Toda
    Abstract:

    In this paper, we investigate a shallow architecture and a Laplacian distribution output for a WaveNet vocoder trained with limited data. A shallower WaveNet architecture is proposed to better suit use cases with limited training data and to reduce computation time. To further improve the modeling capability of the WaveNet vocoder, a Laplacian distribution output is proposed. The Laplacian distribution is inherently sparse, with a higher peak and a fatter tail than the Gaussian, which may be better suited to modeling speech signals. The experimental results demonstrate that: 1) the proposed shallow variant of the WaveNet architecture gives performance comparable to the deep architecture with a softmax output while reducing computation time by 73%; and 2) the Laplacian distribution output consistently improves speech quality across various amounts of limited training data, with the two highest mean opinion scores reaching 4.22.
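
    To make the "higher peak and fatter tail" comparison concrete, the snippet below evaluates a zero-mean Laplacian and a variance-matched Gaussian at a few points; it only illustrates the distributional claim and is not code from the paper.

    ```python
    import numpy as np

    def laplace_pdf(x, b):
        return np.exp(-np.abs(x) / b) / (2.0 * b)

    def gauss_pdf(x, sigma):
        return np.exp(-0.5 * (x / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

    b = 1.0
    sigma = np.sqrt(2.0) * b   # match variances: Var[Laplace(0, b)] = 2 * b**2
    for x in (0.0, 1.0, 4.0 * sigma):
        print(x, laplace_pdf(x, b), gauss_pdf(x, sigma))
    # At x = 0 the Laplacian density is higher (sharper peak); far from the mean it
    # decays like exp(-|x|/b) rather than exp(-x**2 / (2*sigma**2)) (fatter tail).
    ```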

Tomoki Hayashi - One of the best experts on this subject based on the ideXlab platform.

  • Efficient Shallow WaveNet Vocoder Using Multiple Samples Output Based on Laplacian Distribution and Linear Prediction
    2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020
    Co-Authors: Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda
    Abstract:

    This paper presents an efficient implementation scheme for a shallow WaveNet vocoder with multiple-sample (segment) output based on the Laplacian distribution and linear prediction. In our previous work, we proposed a shallow WaveNet vocoder architecture that uses only 9 dilated convolutional layers yet is capable of generating high-quality speech by modeling speech samples with a Laplacian distribution. However, there is still considerable room to improve computational efficiency, for example through segment-wise inference and a more compact structure. In this work, we address this issue by proposing a simple segment-output modeling scheme, easily extended to other neural vocoders, in which the Laplacian distribution parameters of multiple samples are estimated simultaneously. Furthermore, to preserve the dependencies among samples within a segment, we propose using linear prediction (LP) to compute the distribution parameters, where data-driven LP coefficients are estimated by the WaveNet vocoder along with the location and scale parameters. Finally, a shallower WaveNet vocoder with 6 layers is deployed. The experimental results demonstrate that the proposed LP-based Laplacian distribution alleviates the quality degradation caused by segment generation.

  • Investigation of Shallow WaveNet Vocoder with Laplacian Distribution Output
    2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2019
    Co-Authors: Patrick Lumban Tobing, Tomoki Hayashi, Tomoki Toda
    Abstract:

    In this paper, we investigate a shallow architecture and a Laplacian distribution output for a WaveNet vocoder trained with limited data. A shallower WaveNet architecture is proposed to better suit use cases with limited training data and to reduce computation time. To further improve the modeling capability of the WaveNet vocoder, a Laplacian distribution output is proposed. The Laplacian distribution is inherently sparse, with a higher peak and a fatter tail than the Gaussian, which may be better suited to modeling speech signals. The experimental results demonstrate that: 1) the proposed shallow variant of the WaveNet architecture gives performance comparable to the deep architecture with a softmax output while reducing computation time by 73%; and 2) the Laplacian distribution output consistently improves speech quality across various amounts of limited training data, with the two highest mean opinion scores reaching 4.22.

Kazuhiro Kobayashi - One of the best experts on this subject based on the ideXlab platform.

  • Efficient Shallow WaveNet Vocoder Using Multiple Samples Output Based on Laplacian Distribution and Linear Prediction
    2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020
    Co-Authors: Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda
    Abstract:

    This paper presents an efficient implementation scheme for a shallow WaveNet vocoder with multiple-sample (segment) output based on the Laplacian distribution and linear prediction. In our previous work, we proposed a shallow WaveNet vocoder architecture that uses only 9 dilated convolutional layers yet is capable of generating high-quality speech by modeling speech samples with a Laplacian distribution. However, there is still considerable room to improve computational efficiency, for example through segment-wise inference and a more compact structure. In this work, we address this issue by proposing a simple segment-output modeling scheme, easily extended to other neural vocoders, in which the Laplacian distribution parameters of multiple samples are estimated simultaneously. Furthermore, to preserve the dependencies among samples within a segment, we propose using linear prediction (LP) to compute the distribution parameters, where data-driven LP coefficients are estimated by the WaveNet vocoder along with the location and scale parameters. Finally, a shallower WaveNet vocoder with 6 layers is deployed. The experimental results demonstrate that the proposed LP-based Laplacian distribution alleviates the quality degradation caused by segment generation.

Shanq-Jang Ruan - One of the best experts on this subject based on the ideXlab platform.

  • Scene Analysis for Object Detection in Advanced Surveillance Systems Using Laplacian Distribution Model
    Systems, Man, and Cybernetics, 2011
    Co-Authors: Fan-Chieh Cheng, Shih-Chia Huang, Shanq-Jang Ruan
    Abstract:

    In this paper, we propose a novel background subtraction approach to accurately detect moving objects. Our method comprises three modules: a block alarm module, a background modeling module, and an object extraction module. The block alarm module efficiently checks each block for the presence of either a moving object or background information. This is accomplished by modeling temporal differencing pixels with the Laplacian distribution, which allows the subsequent background modeling module to process only those blocks found to contain background pixels. Next, the background modeling module generates a high-quality adaptive background model using a unique two-stage training procedure and a novel mechanism for recognizing changes in illumination. As the final step, the proposed object extraction module computes the binary object detection mask by applying a suitable threshold value obtained through our proposed threshold training procedure. The performance of the proposed method was evaluated both quantitatively and qualitatively. The overall results show that the proposed method attains a substantially higher degree of efficacy, outperforming other state-of-the-art methods by Similarity and F1 accuracy rates of up to 35.50% and 26.09%, respectively.
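
    As a rough illustration of the block alarm idea, the sketch below models per-pixel temporal differences with a Laplacian distribution and flags a block when too many of its pixels fall in the low-probability tail; the block size, tail probability, and motion ratio are illustrative assumptions, not the trained values from the paper.

    ```python
    import numpy as np

    def fit_laplace(diffs):
        """Laplacian MLE: location = median, scale = mean absolute deviation."""
        mu = np.median(diffs)
        b = np.mean(np.abs(diffs - mu)) + 1e-8
        return mu, b

    def block_alarm(prev_frame, curr_frame, block=16, tail_prob=0.05, motion_ratio=0.3):
        """Boolean grid: True where a block likely contains a moving object."""
        diff = curr_frame.astype(np.float64) - prev_frame.astype(np.float64)
        mu, b = fit_laplace(diff.ravel())
        # For Laplace(mu, b): P(|X - mu| > t) = exp(-t / b)  =>  t = -b * log(tail_prob)
        t = -b * np.log(tail_prob)
        suspicious = np.abs(diff - mu) > t
        H, W = diff.shape
        alarms = np.zeros((H // block, W // block), dtype=bool)
        for i in range(H // block):
            for j in range(W // block):
                patch = suspicious[i * block:(i + 1) * block, j * block:(j + 1) * block]
                alarms[i, j] = patch.mean() > motion_ratio
        return alarms
    ```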

  • Advanced Background Subtraction Approach Using Laplacian Distribution Model
    2010 IEEE International Conference on Multimedia and Expo (ICME), 2010
    Co-Authors: Fan-Chieh Cheng, Shih-Chia Huang, Shanq-Jang Ruan
    Abstract:

    In this paper, we propose a novel background subtraction approach to accurately detect moving objects. Our method comprises three modules: a block alarm module, a background modeling module, and an object extraction module. The proposed block alarm module efficiently checks each block for the presence of either a moving object or background information. This is accomplished by modeling temporal differencing pixels with the Laplacian distribution, which allows the subsequent background modeling module to process only those blocks found to contain background pixels. In the proposed background modeling module, a unique two-stage background training procedure is performed, using Rough Training followed by Precise Training, to generate a high-quality adaptive background model. As the final step, the object extraction module computes the binary object detection mask by applying a suitable threshold value obtained through our proposed threshold training procedure, achieving accurate and complete detection of moving objects. The overall results demonstrate that the proposed method attains a substantially higher degree of efficacy, outperforming other state-of-the-art methods by Similarity and F1 accuracy rates of up to 57.17% and 48.48%, respectively.
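
    To illustrate the final object extraction step, here is a minimal sketch that thresholds each pixel's deviation from the background model to obtain the binary detection mask; the fixed threshold in the usage note merely stands in for the value the paper derives from its threshold training procedure.

    ```python
    import numpy as np

    def extract_objects(curr_frame, background, threshold):
        """Binary mask: 1 where a pixel deviates from the background model, else 0."""
        diff = np.abs(curr_frame.astype(np.float64) - background.astype(np.float64))
        return (diff > threshold).astype(np.uint8)

    # Usage sketch (the threshold would come from the paper's threshold training step):
    # mask = extract_objects(frame, bg_model, threshold=25.0)
    ```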
