The experts below are selected from a list of 1,662 experts worldwide, ranked by the ideXlab platform.
Tomoki Toda - One of the best experts on this subject based on the ideXlab platform.
-
Efficient Shallow WaveNet Vocoder Using Multiple Samples Output Based on Laplacian Distribution and Linear Prediction
International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020
Co-Authors: Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda
Abstract: This paper presents an efficient implementation scheme for a shallow WaveNet vocoder with multiple-sample (segment) output, based on a Laplacian distribution and linear prediction. In our previous work, we proposed a shallow WaveNet vocoder architecture that uses only 9 dilated convolutional layers while remaining capable of generating high-quality speech, thanks to the use of a Laplacian distribution for modeling speech samples. However, there is still considerable room to improve computational efficiency, for example by inferring segment output and using a more compact structure. In this work, we address this by proposing a simple segment-output modeling scheme, easily extended to other neural vocoders, in which the Laplacian distribution parameters of multiple samples are estimated simultaneously. Further, to preserve the dependencies among samples within a segment, we also propose using linear prediction (LP) to compute the distribution parameters, where data-driven LP coefficients are estimated by the WaveNet vocoder along with the locations and scales. Finally, a shallower WaveNet vocoder with 6 layers is deployed. The experimental results demonstrate that the proposed LP-based Laplacian distribution alleviates the quality degradation caused by segment generation.
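The LP-based location computation described in the abstract can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the function names, the free-running history update, and the toy numbers are assumptions.

```python
import numpy as np

def laplacian_nll(x, loc, scale):
    # Negative log-likelihood of Laplace(loc, scale): log(2b) + |x - mu| / b
    return np.log(2.0 * scale) + np.abs(x - loc) / scale

def segment_locations(context, lp_coefs, res_locs):
    """Compute the Laplacian location of each sample in a segment as an
    LP prediction from the preceding samples plus a network-predicted
    residual location. `context` holds the samples before the segment,
    oldest first; `res_locs` holds one residual location per segment sample."""
    order = len(lp_coefs)
    history = list(context)
    locs = []
    for mu_res in res_locs:
        recent = history[-order:][::-1]  # most recent sample first
        pred = sum(a * s for a, s in zip(lp_coefs, recent))
        locs.append(pred + mu_res)
        history.append(locs[-1])  # free-running; training would append the true sample
    return locs
```

During training, `laplacian_nll` would be evaluated at the true waveform samples using these locations together with the predicted scales.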
-
Investigation of Shallow WaveNet Vocoder with Laplacian Distribution Output
2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2019
Co-Authors: Patrick Lumban Tobing, Tomoki Hayashi, Tomoki Toda
Abstract: In this paper, we investigate a shallow architecture and a Laplacian distribution output for a WaveNet vocoder trained with limited training data. The shallower WaveNet architecture is proposed both to suit use cases with limited data and to reduce computation time. To further improve the modeling of the WaveNet vocoder, a Laplacian distribution output is proposed. The Laplacian distribution is inherently sparse, with a higher peak and fatter tails than the Gaussian, which may make it better suited to speech signal modeling. The experimental results demonstrate that: 1) the proposed shallow variant of the WaveNet architecture gives performance comparable to the deep one with a softmax output while reducing computation time by 73%; and 2) the Laplacian distribution output consistently improves speech quality across various amounts of limited training data, with the two highest mean opinion scores reaching 4.22.
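As a concrete illustration of a Laplacian output layer for a vocoder, the loss and sampling steps might look like the sketch below. The log-scale parameterization and inverse-CDF sampler are standard choices, but they are assumptions here; the paper does not specify its parameterization.

```python
import numpy as np

def laplacian_loss(x, loc, log_scale):
    """Negative log-likelihood under Laplace(loc, b = exp(log_scale)).
    Predicting the log of the scale keeps b strictly positive."""
    b = np.exp(log_scale)
    return np.log(2.0 * b) + np.abs(x - loc) / b

def laplacian_sample(loc, log_scale, rng):
    """Inverse-CDF sampling: x = mu - b * sign(u) * log(1 - 2|u|), u ~ U(-1/2, 1/2)."""
    b = np.exp(log_scale)
    u = rng.uniform(-0.5, 0.5, size=np.shape(loc))
    return loc - b * np.sign(u) * np.log1p(-2.0 * np.abs(u))
```

At synthesis time, the network would emit `loc` and `log_scale` per sample and `laplacian_sample` would draw the waveform value.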
Patrick Lumban Tobing - One of the best experts on this subject based on the ideXlab platform.
-
Efficient Shallow WaveNet Vocoder Using Multiple Samples Output Based on Laplacian Distribution and Linear Prediction
International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020. Co-Authors: Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda.
-
Investigation of Shallow WaveNet Vocoder with Laplacian Distribution Output
2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2019. Co-Authors: Patrick Lumban Tobing, Tomoki Hayashi, Tomoki Toda.
Tomoki Hayashi - One of the best experts on this subject based on the ideXlab platform.
-
Efficient Shallow WaveNet Vocoder Using Multiple Samples Output Based on Laplacian Distribution and Linear Prediction
International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020. Co-Authors: Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda.
-
Investigation of Shallow WaveNet Vocoder with Laplacian Distribution Output
2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2019. Co-Authors: Patrick Lumban Tobing, Tomoki Hayashi, Tomoki Toda.
Kazuhiro Kobayashi - One of the best experts on this subject based on the ideXlab platform.
-
Efficient Shallow WaveNet Vocoder Using Multiple Samples Output Based on Laplacian Distribution and Linear Prediction
International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020. Co-Authors: Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda.
Shanqjang Ruan - One of the best experts on this subject based on the ideXlab platform.
-
Scene Analysis for Object Detection in Advanced Surveillance Systems Using Laplacian Distribution Model
Systems, Man and Cybernetics, 2011
Co-Authors: Fanchieh Cheng, Shihchia Huang, Shanqjang Ruan
Abstract: In this paper, we propose a novel background subtraction approach for accurately detecting moving objects. Our method comprises three modules: a block alarm module, a background modeling module, and an object extraction module. The block alarm module efficiently checks each block for the presence of either a moving object or background information, using the temporal differences of pixels under a Laplacian distribution model; this allows the subsequent background modeling module to process only those blocks found to contain background pixels. Next, the background modeling module generates a high-quality adaptive background model using a unique two-stage training procedure and a novel mechanism for recognizing changes in illumination. As the final step, the object extraction module computes the binary object detection mask by applying a suitable threshold value, obtained through our proposed threshold training procedure. The performance of our method was analyzed by quantitative and qualitative evaluation. The overall results show that our method attains a substantially higher degree of efficacy, outperforming other state-of-the-art methods in Similarity and F1 accuracy rates by up to 35.50% and 26.09%, respectively.
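The temporal-differencing test behind such a block alarm module can be sketched as below. The Laplacian fit uses the standard maximum-likelihood estimates (median for the location, mean absolute deviation for the scale), while the decision rule, the k = 3 threshold, and the 10% outlier fraction are illustrative assumptions, not the paper's actual values.

```python
import numpy as np

def fit_laplacian(diffs):
    # MLE for Laplace: location = median, scale = mean absolute deviation
    mu = np.median(diffs)
    b = np.mean(np.abs(diffs - mu))
    return mu, b

def block_alarm(prev_block, curr_block, scale, k=3.0, frac=0.10):
    """Flag a block as containing a moving object when the fraction of
    temporal-difference pixels that are improbably large under the
    Laplacian noise model exceeds `frac`."""
    d = curr_block.astype(float) - prev_block.astype(float)
    outlier_ratio = np.mean(np.abs(d) > k * scale)
    return outlier_ratio > frac
```

The scale would typically be fit once on differences from a motion-free training period, then reused frame to frame.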
-
Advanced Background Subtraction Approach Using Laplacian Distribution Model
International Conference on Multimedia and Expo (ICME), 2010
Co-Authors: Fanchieh Cheng, Shihchia Huang, Shanqjang Ruan
Abstract: In this paper, we propose a novel background subtraction approach for accurately detecting moving objects. Our method comprises three modules: a block alarm module, a background modeling module, and an object extraction module. The block alarm module efficiently checks each block for the presence of either a moving object or background information, using the temporal differences of pixels under a Laplacian distribution model; this allows the subsequent background modeling module to process only those blocks found to contain background pixels. In the background modeling module, a unique two-stage background training procedure is performed, with Rough Training followed by Precise Training, to generate a high-quality adaptive background model. As the final step, the object extraction module computes the binary object detection mask by applying a suitable threshold value, obtained through our proposed threshold training procedure, to achieve accurate and complete detection of moving objects. The overall results demonstrate that our method attains a substantially higher degree of efficacy, outperforming other state-of-the-art methods in Similarity and F1 accuracy rates by up to 57.17% and 48.48%, respectively.
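A minimal sketch of the kind of adaptive update such a background modeling module might perform is shown below: the current frame is blended into the model only where no motion was detected. The running-average rule and the learning rate are assumptions for illustration, not the paper's two-stage Rough/Precise procedure itself.

```python
import numpy as np

def update_background(bg, frame, motion_mask, alpha=0.05):
    """Blend `frame` into the background model `bg` only at pixels where
    `motion_mask` is 0 (no detected motion); moving regions are left
    untouched so foreground objects do not corrupt the model."""
    out = bg.astype(float).copy()
    still = motion_mask == 0
    out[still] = (1.0 - alpha) * out[still] + alpha * frame[still]
    return out
```

A binary detection mask could then be obtained by thresholding `|frame - bg|` with a trained threshold, as the abstract describes.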