Source Separation

Mark D. Plumbley - One of the best experts on this subject based on the ideXlab platform.

  • Source Separation with Weakly Labelled Data: An Approach to Computational Auditory Scene Analysis
    ICASSP 2020 - 2020 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), 2020
    Co-Authors: Qiuqiang Kong, Wenwu Wang, Yuxuan Wang, Xuchen Song, Yin Cao, Mark D. Plumbley
    Abstract:

    Source separation is the task of separating an audio recording into its individual sound sources, and it is fundamental to computational auditory scene analysis. Previous work on source separation has focused on separating particular sound classes such as speech and music, and much of it requires paired mixtures and clean sources for training. In this work, we propose a source separation framework trained with weakly labelled data. Weakly labelled data contains only the tags of an audio clip, without the occurrence times of sound events. We first train a sound event detection system on AudioSet. The trained sound event detection system is used to detect the segments most likely to contain a target sound event. A regression is then learnt from a mixture of two randomly selected segments to a target segment, conditioned on the audio tagging prediction of the target segment. Our proposed system can separate 527 sound classes from AudioSet within a single system. A U-Net is adopted as the separation network and achieves an average SDR of 5.67 dB over the 527 sound classes in AudioSet.
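
    A minimal PyTorch sketch of the training step described above may be helpful. The small network below is only a stand-in for the U-Net, and every name in it (Separator, train_step, the FiLM-style conditioning layer) is hypothetical rather than the authors' released code:

      import torch
      import torch.nn as nn

      class Separator(nn.Module):
          """Stand-in for the U-Net: maps a mixture spectrogram plus a
          527-dimensional tag condition to an estimate of the target."""
          def __init__(self, n_bins=513, n_classes=527):
              super().__init__()
              self.film = nn.Linear(n_classes, n_bins)  # condition -> per-bin bias
              self.net = nn.Sequential(
                  nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid())  # mask in [0, 1]

          def forward(self, mix_spec, condition):
              # mix_spec: (batch, 1, frames, bins); condition: (batch, 527)
              x = mix_spec + self.film(condition)[:, None, None, :]
              return self.net(x) * mix_spec  # masked mixture

      def train_step(separator, optimizer, seg_a, seg_b, tags_a):
          """seg_a, seg_b: spectrograms of two randomly selected segments;
          tags_a: audio tagging prediction for seg_a, the target."""
          mixture = seg_a + seg_b                        # mix the two segments
          estimate = separator(mixture, tags_a)          # condition on target tags
          loss = nn.functional.l1_loss(estimate, seg_a)  # regress to the target
          optimizer.zero_grad()
          loss.backward()
          optimizer.step()
          return loss.item()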

  • Single Channel Audio Source Separation using Convolutional Denoising Autoencoders
    arXiv: Sound, 2017
    Co-Authors: Emad M. Grais, Mark D. Plumbley
    Abstract:

    Deep learning techniques have recently been used to tackle the audio source separation problem. In this work, we propose to use deep fully convolutional denoising autoencoders (CDAEs) for monaural audio source separation. We use as many CDAEs as there are sources to be separated from the mixed signal. Each CDAE is trained to separate one source, treating the other sources as background noise. The main idea is to allow each CDAE to learn spectral-temporal filters and features suited to its corresponding source. Our experimental results show that CDAEs perform source separation slightly better than deep feedforward neural networks (FNNs), even with fewer parameters than the FNNs.
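
    The one-CDAE-per-source scheme can be sketched in a few lines of PyTorch; the layer sizes and names below are illustrative, not the architecture from the paper:

      import torch
      import torch.nn as nn

      def make_cdae():
          # Fully convolutional encoder/decoder over magnitude spectrograms.
          return nn.Sequential(
              nn.Conv2d(1, 16, 5, padding=2), nn.ReLU(),
              nn.Conv2d(16, 16, 5, padding=2), nn.ReLU(),
              nn.Conv2d(16, 1, 5, padding=2), nn.ReLU())

      sources = ["vocals", "accompaniment"]  # one CDAE per source to separate
      cdaes = {name: make_cdae() for name in sources}

      def train_step(name, optimizer, mixture_spec, clean_specs):
          """mixture_spec: (batch, 1, frames, bins) mixture magnitudes;
          clean_specs[name]: the matching clean source magnitudes."""
          # Each CDAE denoises the mixture, treating all other sources as noise.
          estimate = cdaes[name](mixture_spec)
          loss = nn.functional.mse_loss(estimate, clean_specs[name])
          optimizer.zero_grad()
          loss.backward()
          optimizer.step()
          return loss.item()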

  • Deep neural network based audio source separation
    2016
    Co-Authors: Alfredo Zermini, Mark D. Plumbley, Wenwu Wang
    Abstract:

    Audio source separation aims to extract individual sources from mixtures of multiple sound sources. Many techniques have been developed, such as independent component analysis, computational auditory scene analysis, and non-negative matrix factorisation. A method based on deep neural networks (DNNs) and time-frequency (T-F) masking has recently been developed for binaural audio source separation. In this method, the DNNs are used to predict the direction of arrival (DOA) of the audio sources with respect to the listener, which is then used to generate soft T-F masks for the recovery of the individual audio sources.
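
    The masking stage can be illustrated with a short NumPy sketch; the DOA posterior below stands in for the DNN output, and all names are illustrative:

      import numpy as np

      def soft_masks_from_doa(doa_posterior):
          """doa_posterior: (frames, bins, n_directions) probability that each
          T-F bin originates from each candidate direction (the DNN output)."""
          # Normalise across directions so the masks of all sources sum to one.
          return doa_posterior / doa_posterior.sum(axis=-1, keepdims=True)

      def apply_masks(mixture_stft, masks):
          # Recover each source by weighting the mixture STFT with its soft mask.
          return [mixture_stft * masks[..., d] for d in range(masks.shape[-1])]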

  • Benchmarking Flexible Adaptive Time-Frequency Transforms for Underdetermined Audio Source Separation
    2012
    Co-Authors: Andrew Nesbit, Emmanuel Vincent, Mark D. Plumbley
    Abstract:

    We have implemented several fast and flexible adaptive lapped orthogonal transform (LOT) schemes for underdetermined audio source separation. The problem is generally addressed by time-frequency masking, which requires the sources to be disjoint in the time-frequency domain. We have already shown that disjointness can be increased via adaptive dyadic LOTs. Taking inspiration from the windowing schemes used in many audio coding frameworks, we improve on earlier results in two ways. Firstly, we consider non-dyadic LOTs, which better match time-varying signal structures. Secondly, we allow a greater range of overlapping window profiles to reduce window boundary artifacts. The new scheme is benchmarked through oracle evaluations and is shown to decrease computation time by over an order of magnitude compared with very general schemes, whilst maintaining high separation performance and flexible signal adaptivity. As the results demonstrate, this work may find practical applications in high-fidelity audio source separation. Index Terms: time-frequency analysis, discrete cosine transforms, source separation, benchmark, evaluation.
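
    The oracle-masking idea behind such benchmarks can be sketched in NumPy/SciPy: with the true sources known, each time-frequency bin of the mixture is assigned to the dominant source, which bounds what masking can achieve under a given transform. A plain STFT stands in here for the adaptive LOTs of the paper:

      import numpy as np
      from scipy.signal import stft, istft

      def oracle_binary_mask_separation(sources, fs=16000, nperseg=1024):
          """sources: (n_sources, n_samples) array of true source signals."""
          mixture = sources.sum(axis=0)
          _, _, X = stft(mixture, fs, nperseg=nperseg)
          mags = np.stack([np.abs(stft(s, fs, nperseg=nperseg)[2]) for s in sources])
          winner = mags.argmax(axis=0)  # index of the dominant source per bin
          estimates = []
          for j in range(len(sources)):
              _, est = istft(X * (winner == j), fs, nperseg=nperseg)
              estimates.append(est[:sources.shape[1]])
          return np.array(estimates)

      def sdr(reference, estimate):
          # Simple signal-to-distortion ratio in dB.
          err = reference - estimate
          return 10 * np.log10(np.sum(reference ** 2) / np.sum(err ** 2))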

  • Automatic music transcription and audio source separation
    Cybernetics and Systems, 2002
    Co-Authors: Mark D. Plumbley, M Davies, Samer A. Abdallah, Juan Pablo Bello, Giuliano Monti, Mark Sandler
    Abstract:

    In this article, we give an overview of a range of approaches to the analysis and separation of musical audio. In particular, we consider the problems of automatic music transcription and audio source separation, which are of particular interest to our group. Monophonic music transcription, where a single note is present at any one time, can be tackled using an autocorrelation-based method. For polyphonic music transcription, with several notes at any time, other approaches can be used, such as a blackboard model or a multiple-cause/sparse coding method. The latter is based on ideas and methods related to independent component analysis (ICA), a method for sound source separation.
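
    The autocorrelation method mentioned for monophonic transcription fits in a few lines of NumPy: the fundamental period of a frame appears as the first strong peak of its autocorrelation. The search range below is an illustrative choice:

      import numpy as np

      def autocorr_pitch(frame, fs, fmin=50.0, fmax=1000.0):
          """Estimate the fundamental frequency of a mono frame in Hz."""
          frame = frame - frame.mean()
          ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
          lag_min, lag_max = int(fs / fmax), int(fs / fmin)
          lag = lag_min + np.argmax(ac[lag_min:lag_max])  # first strong peak
          return fs / lag

      # Usage: a 440 Hz tone should come back as roughly 440.
      fs = 16000
      t = np.arange(2048) / fs
      print(autocorr_pitch(np.sin(2 * np.pi * 440 * t), fs))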

M. G. Amin - One of the best experts on this subject based on the ideXlab platform.

  • Joint anti-diagonalization for blind source separation
    International Conference on Acoustics Speech and Signal Processing, 2001
    Co-Authors: Adel Belouchrani, M. G. Amin, Karim Abed-Meraim, Abdelhak M. Zoubir
    Abstract:

    We address the problem of blind source separation of non-stationary signals of which only instantaneous linear mixtures are observed. A blind source separation approach exploiting both the auto-terms and the cross-terms of the time-frequency (TF) distributions of the sources is considered. The approach is based on the simultaneous diagonalization and anti-diagonalization of spatial TF distribution matrices made up of auto-terms and cross-terms, respectively. Numerical simulations demonstrate the effectiveness of the proposed approach and compare its performance with existing TF-based methods.

  • Blind source separation based on time-frequency signal representations
    IEEE Transactions on Signal Processing, 1998
    Co-Authors: Adel Belouchrani, M. G. Amin
    Abstract:

    Blind source separation consists of recovering a set of signals of which only instantaneous linear mixtures are observed. Thus far, this problem has been solved using statistical information available on the source signals. This paper introduces a new blind source separation approach exploiting the difference in the time-frequency (t-f) signatures of the sources to be separated. The approach is based on the diagonalization of a combined set of “spatial t-f distributions”. In contrast to existing techniques, the proposed approach allows the separation of Gaussian sources with identical spectral shapes but different t-f localization properties. Spreading the noise power while localizing the source energy in the t-f domain increases the robustness of the approach to noise and hence improves performance. An asymptotic performance analysis and numerical simulations are provided.
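
    A NumPy sketch of the core idea: whiten the mixtures, build a spatial time-frequency matrix over a region where the sources have different energies, and diagonalize it to recover the unmixing matrix. The paper uses Cohen's-class distributions and joint diagonalization of several matrices; a single STFT region and one eigendecomposition stand in here:

      import numpy as np
      from scipy.signal import stft

      fs, n = 8000, 8000
      t = np.arange(n) / fs
      s = np.vstack([np.sin(2 * np.pi * 400 * t) * (t < 0.5),    # active early
                     np.sin(2 * np.pi * 900 * t) * (t >= 0.5)])  # active late
      A = np.array([[1.0, 0.6], [0.4, 1.0]])                     # mixing matrix
      x = A @ s                                                  # observed mixtures

      # Whitening matrix from the zero-lag covariance of the mixtures.
      R = (x @ x.T) / n
      d, E = np.linalg.eigh(R)
      W = E @ np.diag(d ** -0.5) @ E.T

      # Spatial t-f matrix over a region dominated by the first source.
      _, _, Z = stft(W @ x, fs, nperseg=256)             # shape (2, bins, frames)
      early = Z[:, :, : Z.shape[2] // 2].reshape(2, -1)  # first half in time
      D = (early @ early.conj().T).real                  # Hermitian, ~real here

      # Its eigenvectors give the remaining rotation; the sources come back
      # up to the usual permutation and scaling ambiguities of BSS.
      _, U = np.linalg.eigh(D)
      y = (U.T @ W) @ x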

Adel Belouchrani - One of the best experts on this subject based on the ideXlab platform.

  • Joint anti-diagonalization for blind source separation
    International Conference on Acoustics Speech and Signal Processing, 2001
    Co-Authors: Adel Belouchrani, M. G. Amin, Karim Abed-Meraim, Abdelhak M. Zoubir
    Abstract:

    We address the problem of blind source separation of non-stationary signals of which only instantaneous linear mixtures are observed. A blind source separation approach exploiting both the auto-terms and the cross-terms of the time-frequency (TF) distributions of the sources is considered. The approach is based on the simultaneous diagonalization and anti-diagonalization of spatial TF distribution matrices made up of auto-terms and cross-terms, respectively. Numerical simulations demonstrate the effectiveness of the proposed approach and compare its performance with existing TF-based methods.

  • Blind source separation based on time-frequency signal representations
    IEEE Transactions on Signal Processing, 1998
    Co-Authors: Adel Belouchrani, M. G. Amin
    Abstract:

    Blind source separation consists of recovering a set of signals of which only instantaneous linear mixtures are observed. Thus far, this problem has been solved using statistical information available on the source signals. This paper introduces a new blind source separation approach exploiting the difference in the time-frequency (t-f) signatures of the sources to be separated. The approach is based on the diagonalization of a combined set of “spatial t-f distributions”. In contrast to existing techniques, the proposed approach allows the separation of Gaussian sources with identical spectral shapes but different t-f localization properties. Spreading the noise power while localizing the source energy in the t-f domain increases the robustness of the approach to noise and hence improves performance. An asymptotic performance analysis and numerical simulations are provided.

A Bray - One of the best experts on this subject based on the ideXlab platform.

  • Nonlinear blind source separation using kernels
    IEEE Transactions on Neural Networks, 2003
    Co-Authors: Dominique Martinez, A Bray
    Abstract:

    We derive a new method for solving nonlinear blind source separation (BSS) problems by exploiting second-order statistics in a kernel-induced feature space. This paper extends a new and efficient closed-form linear algorithm to the nonlinear domain using the kernel trick originally applied in support vector machines (SVMs). The technique could likewise be applied to other linear covariance-based source separation algorithms. Experiments on realistic nonlinear mixtures of speech signals, gas multisensor data, and visual disparity data illustrate the applicability of our approach.
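
    The approach can be caricatured in NumPy: lift the nonlinear mixtures into a feature space, then run a second-order linear BSS step there. An explicit degree-2 polynomial map replaces the kernel trick for brevity, and an AMUSE-style step (whiten, then diagonalize one time-lagged covariance) stands in for the paper's algorithm:

      import numpy as np

      def feature_map(x):
          # Degree-2 polynomial features of the two mixture channels.
          x1, x2 = x
          return np.vstack([x1, x2, x1 * x1, x2 * x2, x1 * x2])

      def second_order_bss(phi, lag=1):
          phi = phi - phi.mean(axis=1, keepdims=True)
          R0 = phi @ phi.T / phi.shape[1]
          d, E = np.linalg.eigh(R0)
          keep = d > 1e-10 * d.max()                   # drop null directions
          W = np.diag(d[keep] ** -0.5) @ E[:, keep].T  # whitening matrix
          z = W @ phi
          R1 = z[:, lag:] @ z[:, :-lag].T / (z.shape[1] - lag)
          _, U = np.linalg.eigh((R1 + R1.T) / 2)       # symmetrised lagged cov.
          return U.T @ z

      # The true sources appear among the returned components; deciding which
      # components to keep is part of the nonlinear BSS problem itself.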

Emmanuel Vincent - One of the best experts on this subject based on the ideXlab platform.

  • The Flexible Audio Source Separation Toolbox Version 2.0
    2014
    Co-Authors: Yann Salaün, Emmanuel Vincent, Nancy Bertin, Nathan Souviraà-labastie, Xabier Jaureguiberry, Dung T. Tran, Frédéric Bimbot
    Abstract:

    The Flexible Audio Source Separation Toolbox (FASST) is a toolbox for audio source separation that relies on a general modeling and estimation framework applicable to a wide range of scenarios. We introduce the new version of the toolbox, written in C++, which provides a number of advantages over the first MATLAB version: portability, faster computation, a simplified user interface, and support for more scripting languages. In addition, we provide a state-of-the-art example of its use for the separation of speech and domestic noise. The demonstration will give attendees the opportunity to explore the settings and to experience their effect on separation performance.

  • Consistent Wiener Filtering for Audio Source Separation
    IEEE Signal Processing Letters, 2013
    Co-Authors: J. Le Roux, Emmanuel Vincent
    Abstract:

    Wiener filtering is one of the most ubiquitous tools in signal processing, in particular for signal denoising and source separation. In the context of audio, it is typically applied in the time-frequency domain by means of the short-time Fourier transform (STFT). Such processing generally does not take into account the relationship between STFT coefficients in different time-frequency bins that arises from the redundancy of the STFT, which we refer to as consistency. We propose to enforce this relationship in the design of the Wiener filter, either as a hard constraint or as a soft penalty. We derive two conjugate gradient algorithms for the computation of the filter coefficients and show improved audio source separation performance compared with the classical Wiener filter, both in oracle and in blind conditions.
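
    The consistency idea can be illustrated in a few lines of NumPy/SciPy: a time-frequency Wiener estimate is generally not the STFT of any time-domain signal, and one iSTFT/STFT round trip projects it onto the nearest consistent spectrogram. The sketch shows only this hard-constraint view, not the paper's conjugate gradient algorithms, and the PSD inputs are assumed given:

      import numpy as np
      from scipy.signal import stft, istft

      def wiener_consistent(mix, target_psd, noise_psd, fs=16000, nperseg=1024):
          """mix: mixture waveform; target_psd, noise_psd: (bins, frames)
          power estimates for the target source and the residual."""
          _, _, X = stft(mix, fs, nperseg=nperseg)
          gain = target_psd / (target_psd + noise_psd + 1e-12)  # Wiener gain
          S = gain * X                                 # generally inconsistent
          _, s_time = istft(S, fs, nperseg=nperseg)    # back to the time domain
          _, _, S_consistent = stft(s_time, fs, nperseg=nperseg)
          return s_time, S_consistent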

  • A General Flexible Framework for the Handling of Prior Information in Audio Source Separation
    IEEE Transactions on Audio Speech and Language Processing, 2012
    Co-Authors: Alexey Ozerov, Emmanuel Vincent, Frédéric Bimbot
    Abstract:

    Most audio source separation methods are developed for a particular scenario characterized by the number of sources and channels and the characteristics of the sources and the mixing process. In this paper we introduce a general audio source separation framework based on a library of structured source models that enable the incorporation of prior knowledge about each source via user-specifiable constraints. While this framework generalizes several existing audio source separation methods, it also makes it possible to devise and implement new, efficient methods not yet reported in the literature. We first introduce the framework by describing the model structure and constraints, explaining its generality, and summarizing its algorithmic implementation using a generalized expectation-maximization algorithm. Finally, we illustrate the above capabilities of the framework by applying it, in several new and existing configurations, to different source separation problems. We have released a software tool named the Flexible Audio Source Separation Toolbox (FASST), implementing a baseline version of the framework in MATLAB.
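
    One expectation-maximization iteration for the kind of Gaussian source model that underlies such frameworks can be sketched in NumPy (single-channel case; the framework's structured constraints on the power spectra are omitted, so this is a distillation rather than FASST itself):

      import numpy as np

      def em_iteration(X, psds):
          """X: (bins, frames) mixture STFT; psds: list of (bins, frames)
          source power spectral densities, the current model parameters."""
          mix_psd = sum(psds) + 1e-12
          new_psds = []
          for v in psds:
              gain = v / mix_psd                # Wiener gain for this source
              s_hat = gain * X                  # E-step: posterior mean
              post_var = gain * (mix_psd - v)   # E-step: posterior variance
              new_psds.append(np.abs(s_hat) ** 2 + post_var)  # M-step update
          return new_psds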

  • Benchmarking Flexible Adaptive Time-Frequency Transforms for Underdetermined Audio Source Separation
    2012
    Co-Authors: Andrew Nesbit, Emmanuel Vincent, Mark D. Plumbley
    Abstract:

    We have implemented several fast and flexible adaptive lapped orthogonal transform (LOT) schemes for underdetermined audio source separation. The problem is generally addressed by time-frequency masking, which requires the sources to be disjoint in the time-frequency domain. We have already shown that disjointness can be increased via adaptive dyadic LOTs. Taking inspiration from the windowing schemes used in many audio coding frameworks, we improve on earlier results in two ways. Firstly, we consider non-dyadic LOTs, which better match time-varying signal structures. Secondly, we allow a greater range of overlapping window profiles to reduce window boundary artifacts. The new scheme is benchmarked through oracle evaluations and is shown to decrease computation time by over an order of magnitude compared with very general schemes, whilst maintaining high separation performance and flexible signal adaptivity. As the results demonstrate, this work may find practical applications in high-fidelity audio source separation. Index Terms: time-frequency analysis, discrete cosine transforms, source separation, benchmark, evaluation.

  • A general modular framework for audio source separation
    2010
    Co-Authors: Alexey Ozerov, Emmanuel Vincent, Frédéric Bimbot
    Abstract:

    Most audio source separation methods are developed for a particular scenario characterized by the number of sources and channels and the characteristics of the sources and the mixing process. In this paper we introduce a general modular audio source separation framework based on a library of flexible source models that enable the incorporation of prior knowledge about the characteristics of each source. First, this framework generalizes several existing audio source separation methods while bringing them under a common formulation. Second, it makes it possible to devise and implement new, efficient methods not yet reported in the literature. We first introduce the framework by describing the flexible model, explaining its generality, and summarizing our modular implementation using a generalized expectation-maximization algorithm. Finally, we illustrate the above capabilities of the framework by applying it, in several new and existing configurations, to different source separation scenarios.