Audio Analysis

14,000,000 Leading Edge Experts on the ideXlab platform

Scan Science and Technology

Contact Leading Edge Experts & Companies

The Experts below are selected from a list of 114624 Experts worldwide ranked by ideXlab platform

Pierre Vandergheynst - One of the best experts on this subject based on the ideXlab platform.

  • Geometric Deep Learning: Going beyond Euclidean data
    IEEE Signal Processing Magazine, 2017
    Co-Authors: Michael M. Bronstein, Yann Lecun, Arthur Szlam, Joan Bruna, Pierre Vandergheynst
    Abstract:

    Many scientific fields study data with an underlying structure that is a non-Euclidean space. Some examples include social networks in computational social sciences, sensor networks in communications, functional networks in brain imaging, regulatory networks in genetics, and meshed surfaces in computer graphics. In many applications, such geometric data are large and complex (in the case of social networks, on the scale of billions), and are natural targets for machine learning techniques. In particular, we would like to use deep neural networks, which have recently proven to be powerful tools for a broad range of problems from computer vision, natural language processing, and audio analysis. However, these tools have been most successful on data with an underlying Euclidean or grid-like structure, and in cases where the invariances of these structures are built into networks used to model them. Geometric deep learning is an umbrella term for emerging techniques attempting to generalize (structured) deep neural models to non-Euclidean domains such as graphs and manifolds. The purpose of this paper is to overview different examples of geometric deep learning problems and present available solutions, key difficulties, applications, and future research directions in this nascent field.

Roger Reynolds - One of the best experts on this subject based on the ideXlab platform.

  • Structural and Affective Aspects of Music from Statistical Audio Signal Analysis (Special Topic Section on Computational Analysis of Style)
    Journal of the Association for Information Science and Technology, 2006
    Co-Authors: Shlomo Dubnov, Stephen Mcadams, Roger Reynolds
    Abstract:

    Understanding and modeling human experience and emotional response when listening to music are important for better understanding of the stylistic choices in musical composition. In this work, we explore the relation of audio signal structure to human perceptual and emotional reactions. Memory, repetition, and anticipatory structure have been suggested as some of the major factors in music that might influence and possibly shape these responses. The audio analysis was conducted on two recordings of an extended contemporary musical composition by one of the authors. Signal properties were analyzed using statistical analyses of signal similarities over time and information-theoretic measures of signal redundancy. They were then compared to Familiarity Rating and Emotional Force profiles, as recorded continually by listeners hearing the two versions of the piece in a live-concert setting. The analysis shows strong evidence that signal properties and human reactions are related, suggesting applications of these techniques to music understanding and music information-retrieval systems. © 2006 Wiley Periodicals, Inc.
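
    The abstract mentions statistical analysis of signal similarities over time. The Python sketch below illustrates one generic way to express that idea and is not the authors' method: it assumes each analysis frame is already summarized as a feature vector, and the function names (cosine_similarity, self_similarity, repetition_profile) are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def self_similarity(frames):
    """Square matrix whose entry [i][j] compares frame i with frame j."""
    return [[cosine_similarity(u, v) for v in frames] for u in frames]


def repetition_profile(frames):
    """For each frame, the highest similarity to any earlier frame.

    The first frame has no past, so its value is 0.0. This is an assumed,
    simplified stand-in for a measure of how familiar the material is.
    """
    profile = []
    for i, frame in enumerate(frames):
        past = [cosine_similarity(frame, frames[j]) for j in range(i)]
        profile.append(max(past) if past else 0.0)
    return profile


if __name__ == "__main__":
    # Three toy frames: the third repeats the first.
    toy_frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    print(repetition_profile(toy_frames))
```

    A profile of this kind rises where material returns, which is the sort of time-varying curve that could be set against listener ratings.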

Michael M. Bronstein - One of the best experts on this subject based on the ideXlab platform.

  • Geometric Deep Learning: Going beyond Euclidean data
    IEEE Signal Processing Magazine, 2017
    Co-Authors: Michael M. Bronstein, Yann Lecun, Arthur Szlam, Joan Bruna, Pierre Vandergheynst
    Abstract:

    Many scientific fields study data with an underlying structure that is a non-Euclidean space. Some examples include social networks in computational social sciences, sensor networks in communications, functional networks in brain imaging, regulatory networks in genetics, and meshed surfaces in computer graphics. In many applications, such geometric data are large and complex (in the case of social networks, on the scale of billions), and are natural targets for machine learning techniques. In particular, we would like to use deep neural networks, which have recently proven to be powerful tools for a broad range of problems from computer vision, natural language processing, and audio analysis. However, these tools have been most successful on data with an underlying Euclidean or grid-like structure, and in cases where the invariances of these structures are built into networks used to model them. Geometric deep learning is an umbrella term for emerging techniques attempting to generalize (structured) deep neural models to non-Euclidean domains such as graphs and manifolds. The purpose of this paper is to overview different examples of geometric deep learning problems and present available solutions, key difficulties, applications, and future research directions in this nascent field.

Xavier Serra - One of the best experts on this subject based on the ideXlab platform.

  • Essentia: An Open-Source Library for Sound and Music Analysis
    ACM Multimedia, 2013
    Co-Authors: Dmitry Bogdanov, Justin Salamon, Emilia Gomez, Nicolas Wack, Sankalp Gulati, Perfecto Herrera, Oscar Mayor, Gerard Roma, Jose R Zapata, Xavier Serra
    Abstract:

    We present Essentia 2.0, an open-source C++ library for audio analysis and audio-based music information retrieval released under the Affero GPL license. It contains an extensive collection of reusable algorithms which implement audio input/output functionality, standard digital signal processing blocks, statistical characterization of data, and a large set of spectral, temporal, tonal and high-level music descriptors. The library is also wrapped in Python and includes a number of predefined executable extractors for the available music descriptors, which facilitates its use for fast prototyping and allows setting up research experiments very rapidly. Furthermore, it includes a Vamp plugin to be used with Sonic Visualiser for visualization purposes. The library is cross-platform and currently supports Linux, Mac OS X, and Windows systems. Essentia is designed with a focus on the robustness of the provided music descriptors and is optimized in terms of the computational cost of the algorithms. The provided functionality, specifically the music descriptors included in-the-box and signal processing algorithms, is easily expandable and allows for both research experiments and development of large-scale industrial applications.

  • Essentia: An Audio Analysis Library for Music Information Retrieval
    International Society for Music Information Retrieval Conference (ISMIR), 2013
    Co-Authors: Dmitry Bogdanov, Justin Salamon, Emilia Gomez, Nicolas Wack, Sankalp Gulati, Perfecto Herrera, Oscar Mayor, Gerard Roma, Jose R Zapata, Xavier Serra
    Abstract:

    Paper presented at the 14th International Society for Music Information Retrieval Conference, held in Curitiba (Brazil), 4 to 8 November 2013.

  • Extending the Folksonomies of freesound.org Using Content-Based Audio Analysis
    Sound and Music Computing Conference, 2009
    Co-Authors: E Martinez, Oscar Celma Herrada, Mohamed Sordo, Bram De Jong, Xavier Serra
    Abstract:

    Paper presented at the 6th Sound and Music Computing Conference, held 23 to 25 July 2009 in Porto, Portugal.

George Tzanetakis - One of the best experts on this subject based on the ideXlab platform.

  • Marsyas Submissions to MIREX 2010
    2010
    Co-Authors: George Tzanetakis
    Abstract:

    Marsyas is an open source software framework for audio analysis, synthesis and retrieval with specific emphasis on Music Information Retrieval. It is developed by an international team of programmers and researchers led by George Tzanetakis. In MIREX 2010 the Marsyas team participated in the following tasks: Audio Classical Composer Identification, Audio Genre Classification (Latin and Mixed), Audio Music Mood Classification, Audio Beat Tracking, Audio Onset Detection, Audio Tempo Estimation, Audio Music Similarity and Retrieval, and Audio Tagging Tasks. In this abstract we describe the specific algorithmic details of our submission and provide information about how researchers can use our system using the MIREX input/output conventions on their own datasets. Some comments on the results are also provided.

  • Distributed Audio Feature Extraction for Music
    International Symposium Conference on Music Information Retrieval, 2005
    Co-Authors: Stuart Bray, George Tzanetakis
    Abstract:

    One of the important challenges facing music information retrieval (MIR) of audio signals is scaling analysis algorithms to large collections. Typically, analysis of audio signals utilizes sophisticated signal processing and machine learning techniques that require significant computational resources. Therefore, audio MIR is an area where computational resources are a significant bottleneck. For example, the number of pieces utilized in the majority of existing work in audio MIR is at most a few thousand files. Computing audio features over thousands of files can sometimes take days of processing. In this paper, we describe how Marsyas-0.2, a free software framework for audio analysis and synthesis, can be used to rapidly implement efficient distributed audio analysis algorithms. The framework is based on a dataflow architecture which facilitates partitioning of audio computations over multiple computers. Experimental results demonstrating the effectiveness of the proposed approach are presented.
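
    The general idea of partitioning feature extraction over several workers can be sketched in Python as follows. This is not Marsyas code and does not reproduce its dataflow architecture: threads stand in for separate computers, zero-crossing rate is an arbitrarily chosen feature, and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


def partition(items, n_parts):
    """Split a list into n_parts round-robin chunks (some may be empty)."""
    return [items[i::n_parts] for i in range(n_parts)]


def extract_chunk(chunk):
    """Worker task: compute one feature for every (name, samples) pair."""
    return {name: zero_crossing_rate(samples) for name, samples in chunk}


def extract_distributed(collection, n_workers=2):
    """Farm chunks out to workers and merge the partial results."""
    results = {}
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for partial in pool.map(extract_chunk, partition(collection, n_workers)):
            results.update(partial)
    return results


if __name__ == "__main__":
    toy_collection = [
        ("a", [1, -1, 1, -1]),
        ("b", [1, 1, 1, 1]),
        ("c", [1, 1, -1, -1]),
    ]
    print(extract_distributed(toy_collection))
```

    Because each file is processed independently, the merge step is a plain dictionary update; a real deployment would also have to move audio data to the workers, which this sketch ignores.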

  • Audio Analysis Using the Discrete Wavelet Transform
    2001
    Co-Authors: George Tzanetakis, Georg Essl, Perry R Cook
    Abstract:

    The Discrete Wavelet Transform (DWT) is a transformation that can be used to analyze the temporal and spectral properties of non-stationary signals like audio. In this paper we describe some applications of the DWT to the problem of extracting information from non-speech audio. More specifically, automatic classification of various types of audio using the DWT is described and compared with other traditional feature extractors proposed in the literature. In addition, a technique for detecting the beat attributes of music is presented. Both synthetic and real-world stimuli were used to evaluate the performance of the beat detection algorithm.

    Keywords: audio analysis, wavelets, classification, beat extraction
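
    As a rough illustration of what a DWT decomposition produces, the Python sketch below uses the Haar wavelet, chosen only because it is the simplest; the abstract does not say which wavelet the paper uses. The per-band energies are one assumed example of features that could feed a classifier, and the function names are hypothetical.

```python
import math


def haar_dwt_step(signal):
    """One level of the orthonormal Haar DWT.

    Returns (approximation, detail): scaled pairwise sums and differences.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_dwt(signal, levels):
    """Multi-level decomposition.

    Returns [detail_1, ..., detail_levels, approx_levels], finest band first.
    The signal length must be divisible by 2 ** levels.
    """
    bands = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_dwt_step(current)
        bands.append(detail)
    bands.append(current)
    return bands


def band_energies(bands):
    """Sum of squared coefficients in each band."""
    return [sum(c * c for c in band) for band in bands]


if __name__ == "__main__":
    toy_signal = [4, 6, 10, 12, 8, 6, 5, 5]
    print(band_energies(haar_dwt(toy_signal, levels=3)))
```

    Since this Haar transform is orthonormal, the band energies add up to the energy of the original signal, which gives a quick sanity check on any implementation.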

  • MARSYAS: A Framework for Audio Analysis
    Organised Sound, 1999
    Co-Authors: George Tzanetakis, Perry R Cook
    Abstract:

    Existing audio tools handle the increasing amount of computer audio data inadequately. The typical tape-recorder paradigm for audio interfaces is inflexible and time-consuming, especially for large data sets. On the other hand, completely automatic audio analysis and annotation is impossible using current techniques. Alternative solutions are semi-automatic user interfaces that let users interact with sound in flexible ways based on content. This approach offers significant advantages over manual browsing, annotation and retrieval. Furthermore, it can be implemented using existing techniques for audio content analysis in restricted domains. This paper describes MARSYAS, a framework for experimenting, evaluating and integrating such techniques. As a test for the architecture, some recently proposed techniques have been implemented and tested. In addition, a new method for temporal segmentation based on audio texture is described. This method is combined with audio analysis techniques and used for hierarchical browsing, classification and annotation of audio files.
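
    The abstract does not spell out the segmentation method, so the Python sketch below shows only one generic way to place segment boundaries where a sequence of feature vectors changes abruptly. The Euclidean distance, the fixed threshold, the peak-picking rule, and the function names are all assumptions made for illustration.

```python
import math


def frame_distances(features):
    """Euclidean distance between each feature vector and its predecessor."""
    return [math.dist(a, b) for a, b in zip(features, features[1:])]


def segment_boundaries(features, threshold):
    """Indices of frames that start a new segment.

    A boundary is reported where the frame-to-frame distance is a local peak
    and exceeds the (assumed, user-chosen) threshold.
    """
    d = frame_distances(features)
    boundaries = []
    for i, value in enumerate(d):
        left = d[i - 1] if i > 0 else 0.0
        right = d[i + 1] if i + 1 < len(d) else 0.0
        if value > threshold and value >= left and value > right:
            boundaries.append(i + 1)  # d[i] compares frame i with frame i + 1
    return boundaries


if __name__ == "__main__":
    # Two toy "textures": frames near (0, 0) followed by frames near (5, 5).
    toy_features = [
        [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
        [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
    ]
    print(segment_boundaries(toy_features, threshold=1.0))
```

    In an interactive tool of the kind the abstract describes, boundaries like these would be suggestions that a user can accept, move, or discard rather than a final annotation.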

  • A Framework for Audio Analysis Based on Classification and Temporal Segmentation
    Proceedings 25th EUROMICRO Conference. Informatics: Theory and Practice for the New Millennium, 1999
    Co-Authors: George Tzanetakis, F Cook
    Abstract:

    Existing audio tools handle the increasing amount of computer audio data inadequately. The typical tape-recorder paradigm for audio interfaces is inflexible and time-consuming, especially for large data sets. On the other hand, completely automatic audio analysis and annotation is impossible using current techniques. Alternative solutions are semi-automatic user interfaces that let users interact with sound in flexible ways based on content. This approach offers significant advantages over manual browsing, annotation and retrieval. Furthermore, it can be implemented using existing techniques for audio content analysis in restricted domains. This paper describes a framework for experimenting, evaluating and integrating such techniques. As a test for the architecture, some recently proposed techniques have been implemented and tested. In addition, a new method for temporal segmentation based on audio texture is described. This method is combined with audio analysis techniques and used for hierarchical browsing, classification and annotation of audio files.