Speech Detection

14,000,000 Leading Edge Experts on the ideXlab platform

Scan Science and Technology

Contact Leading Edge Experts & Companies

The Experts below are selected from a list of 32196 Experts worldwide ranked by ideXlab platform

S. Nanda - One of the best experts on this subject based on the ideXlab platform.

  • Resource auction multiple access (RAMA) for statistical multiplexing of speech in wireless PCS
    IEEE Transactions on Vehicular Technology, 1994
    Co-Authors: N. Amitay, S. Nanda
    Abstract:

    The excess capacity of resource auction multiple access (RAMA), originally proposed for fast handoffs and resource allocations in wireless personal communications systems (PCS), is evaluated for statistical multiplexing of speech. Using selected GSM parameters in conjunction with M-ary FSK for signaling, it is shown that, in cells with propagation delays of up to 45 μs, 216 assignments/s are feasible. The aim is to exploit this large assignment capacity to increase channel utilization. The authors show that, for packet dropping probabilities of 1%, RAMA can have a multiplexing gain as high as 2.63 with fast speech detection and 2.28 with slow speech detection. RAMA permits graceful degradation during peak traffic demand by operating at higher packet dropping probabilities. The authors also observe that, at low values of packet dropping probability, delays experienced by transmitted packets are more evenly distributed for the case of fast speech detection, while the bulk of the packets experience less delay with slow speech detection. Speech clipping statistics associated with various values of packet dropping probabilities are also presented.

  • Resource auction multiple access (RAMA) for statistical multiplexing of speech in wireless PCS
    Proceedings of ICC '93 - IEEE International Conference on Communications, 1993
    Co-Authors: N. Amitay, S. Nanda
    Abstract:

    The excess capacity of resource auction multiple access (RAMA) is evaluated for statistical multiplexing of speech. Using selected GSM parameters in conjunction with M-ary frequency shift keying (FSK) for signaling, it is shown that, in cells with propagation delays of up to 45 μs, 432 assignments/s are feasible. The aim is to exploit this large assignment capacity to increase channel utilization. It is shown that, for packet dropping probabilities of 1%, RAMA can provide a multiplexing gain as high as 2.63 with fast speech detection and 2.28 with slow speech detection. RAMA permits graceful degradation during peak traffic demand by operating at higher packet dropping probabilities. Simulation results also indicate that RAMA is suitable for fast handoffs and resource allocation with statistical speech multiplexing over areas with a wide variety of cell sizes.
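    The multiplexing gain quoted in the abstracts above is the ratio of supported conversations to available channels at a given packet dropping probability. The sketch below illustrates that ratio with a generic binomial talkspurt model, not the RAMA protocol itself. The activity factor of 0.4, the 8 channels, and the function names are assumptions made for illustration, so the resulting gain is not expected to reproduce the paper's figures.

```python
from math import comb


def drop_fraction(conversations, channels, activity):
    """Expected fraction of offered speech load that exceeds the channels.

    Assumes each conversation is independently active with probability
    `activity` (a simplification, not the model used in the paper).
    """
    excess = 0.0
    for k in range(channels + 1, conversations + 1):
        p_k = (comb(conversations, k)
               * activity ** k
               * (1 - activity) ** (conversations - k))
        excess += (k - channels) * p_k
    return excess / (conversations * activity)


def max_conversations(channels, activity, target_drop):
    """Largest number of conversations whose drop fraction stays within target."""
    m = channels
    while drop_fraction(m + 1, channels, activity) <= target_drop:
        m += 1
    return m


channels = 8           # hypothetical number of speech channels
activity = 0.4         # assumed speech activity factor
supported = max_conversations(channels, activity, target_drop=0.01)
gain = supported / channels
print(f"{supported} conversations on {channels} channels, gain {gain:.2f}")
```

    Raising `target_drop` in this toy model admits more conversations, which mirrors the graceful degradation at higher packet dropping probabilities described in the abstracts.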

N. Amitay - One of the best experts on this subject based on the ideXlab platform.

  • Resource auction multiple access (RAMA) for statistical multiplexing of speech in wireless PCS
    IEEE Transactions on Vehicular Technology, 1994
    Co-Authors: N. Amitay, S. Nanda
    Abstract:

    The excess capacity of resource auction multiple access (RAMA), originally proposed for fast handoffs and resource allocations in wireless personal communications systems (PCS), is evaluated for statistical multiplexing of speech. Using selected GSM parameters in conjunction with M-ary FSK for signaling, it is shown that, in cells with propagation delays of up to 45 μs, 216 assignments/s are feasible. The aim is to exploit this large assignment capacity to increase channel utilization. The authors show that, for packet dropping probabilities of 1%, RAMA can have a multiplexing gain as high as 2.63 with fast speech detection and 2.28 with slow speech detection. RAMA permits graceful degradation during peak traffic demand by operating at higher packet dropping probabilities. The authors also observe that, at low values of packet dropping probability, delays experienced by transmitted packets are more evenly distributed for the case of fast speech detection, while the bulk of the packets experience less delay with slow speech detection. Speech clipping statistics associated with various values of packet dropping probabilities are also presented.

Marie-Philippe Gill - One of the best experts on this subject based on the ideXlab platform.

  • pyannote.audio: Neural Building Blocks for Speaker Diarization
    ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020
    Co-Authors: Hervé Bredin, Juan Manuel Coria, Gregory Gelly, Pavel Korshunov, Marvin Lavechin, Diego Fustes, Hadrien Titeux, Wassim Bouaziz, Marie-Philippe Gill
    Abstract:

    We introduce pyannote.audio, an open-source toolkit written in Python for speaker diarization. Based on the PyTorch machine learning framework, it provides a set of trainable end-to-end neural building blocks that can be combined and jointly optimized to build speaker diarization pipelines. pyannote.audio also comes with pre-trained models covering a wide range of domains for voice activity detection, speaker change detection, overlapped speech detection, and speaker embedding – reaching state-of-the-art performance for most of them.

Helge Langseth - One of the best experts on this subject based on the ideXlab platform.

  • Effective hate speech detection in Twitter data using recurrent neural networks
    Applied Intelligence, 2018
    Co-Authors: Georgios Pitsilis, Heri Ramampiaro, Helge Langseth
    Abstract:

    This paper addresses the important problem of discerning hateful content in social media. We propose a detection scheme that is an ensemble of Recurrent Neural Network (RNN) classifiers, and it incorporates various features associated with user-related information, such as the users’ tendency towards racism or sexism. This data is fed as input to the above classifiers along with the word frequency vectors derived from the textual content. We evaluate our approach on a publicly available corpus of 16k tweets, and the results demonstrate its effectiveness in comparison to existing state-of-the-art solutions. More specifically, our scheme can successfully distinguish racist and sexist messages from normal text, and achieve higher classification quality than current state-of-the-art algorithms.
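    The abstract above describes feeding word frequency vectors together with user-related features to an ensemble of classifiers. A minimal sketch of those two steps, input construction and combining classifier outputs, might look as follows. The class names, the feature layout, and the use of probability averaging (soft voting) are assumptions for illustration; the RNN classifiers themselves are replaced here by hard-coded hypothetical outputs.

```python
from collections import Counter

CLASSES = ("neutral", "racism", "sexism")  # assumed label set


def word_frequency_vector(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


def build_input(text, vocabulary, user_tendency):
    """Concatenate word frequencies with per-user tendency features."""
    return word_frequency_vector(text, vocabulary) + list(user_tendency)


def soft_vote(probability_rows):
    """Average per-class probabilities across classifiers and pick the top class."""
    n = len(probability_rows)
    averaged = [sum(row[i] for row in probability_rows) / n
                for i in range(len(CLASSES))]
    best = max(range(len(CLASSES)), key=lambda i: averaged[i])
    return CLASSES[best], averaged


vocabulary = ["good", "bad", "day"]  # toy vocabulary
features = build_input("Good day good", vocabulary,
                       user_tendency=[0.1, 0.0, 0.9])

# Hypothetical outputs of three already-trained classifiers for one tweet.
label, averaged = soft_vote([[0.2, 0.1, 0.7],
                             [0.5, 0.1, 0.4],
                             [0.3, 0.2, 0.5]])
print(features, label)
```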

Herve Bourlard - One of the best experts on this subject based on the ideXlab platform.

  • Automatic dysarthric speech detection exploiting pairwise distance-based convolutional neural networks
    arXiv: Audio and Speech Processing, 2020
    Co-Authors: Parvaneh Janbakhshi, Ina Kodrasi, Herve Bourlard
    Abstract:

    Automatic dysarthric speech detection can provide reliable and cost-effective computer-aided tools to assist the clinical diagnosis and management of dysarthria. In this paper we propose a novel automatic dysarthric speech detection approach based on analyses of pairwise distance matrices using convolutional neural networks (CNNs). We represent utterances through articulatory posteriors and consider pairs of phonetically-balanced representations, with one representation from a healthy speaker (i.e., the reference representation) and the other representation from the test speaker (i.e., test representation). Given such pairs of reference and test representations, features are first extracted using a feature extraction front-end, a frame-level distance matrix is computed, and the obtained distance matrix is considered as an image by a CNN-based binary classifier. The feature extraction, distance matrix computation, and CNN-based classifier are jointly optimized in an end-to-end framework. Experimental results on two databases of healthy and dysarthric speakers for different languages and pathologies show that the proposed approach yields a high dysarthric speech detection performance, outperforming other CNN-based baseline approaches.
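    The frame-level distance matrix mentioned in the abstract above can be illustrated with a short sketch. It assumes each utterance is already a list of per-frame feature vectors and uses Euclidean distance; both the metric and the function name are assumptions, since the abstract does not specify them. The resulting matrix is the kind of 2-D array a CNN could then treat as an image.

```python
import math


def frame_distance_matrix(reference, test):
    """Return D where D[i][j] is the distance between reference frame i
    and test frame j (Euclidean distance assumed)."""
    return [[math.dist(r, t) for t in test] for r in reference]


# Toy 2-dimensional features: 3 reference frames, 2 test frames.
reference = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
test = [[0.0, 0.0], [0.0, 1.0]]

D = frame_distance_matrix(reference, test)
for row in D:
    print([round(value, 3) for value in row])
```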