Spectrograms

The experts below are selected from a list of 321 experts worldwide, as ranked by the ideXlab platform.

Timothy J. Moroney - One of the best experts on this subject based on the ideXlab platform.

  • Spectrograms of ship wakes: identifying linear and nonlinear wave signals
    Journal of Fluid Mechanics, 2016
    Co-Authors: Ravindra Pethiyagoda, Scott W. McCue, Timothy J. Moroney
    Abstract:

    A spectrogram is a useful way of using short-time discrete Fourier transforms to visualise surface height measurements taken of ship wakes in real-world conditions. For a steadily moving ship that leaves behind small-amplitude waves, the spectrogram is known to have two clear linear components: a sliding-frequency mode caused by the divergent waves and a constant-frequency mode for the transverse waves. However, recent observations of high-speed ferry data have identified additional components of the spectrograms that are not yet explained. We use computer simulations of linear and nonlinear ship wave patterns and apply time–frequency analysis to generate spectrograms for an idealised ship. We clarify the role of the linear dispersion relation and ship speed on the two linear components. We use a simple weakly nonlinear theory to identify higher-order effects in a spectrogram and, while the high-speed ferry data are very noisy, we propose that certain additional features in the experimental data are caused by nonlinearity. Finally, we provide a possible explanation for a further discrepancy between the high-speed ferry spectrograms and linear theory by accounting for ship acceleration.

  • Spectrograms of ship wakes: identifying linear and nonlinear wave signals
    arXiv: Fluid Dynamics, 2016
    Co-Authors: Ravindra Pethiyagoda, Scott W. McCue, Timothy J. Moroney
    Abstract:

    A spectrogram is a useful way of using short-time discrete Fourier transforms to visualise surface height measurements taken of ship wakes in real-world conditions. For a steadily moving ship that leaves behind small-amplitude waves, the spectrogram is known to have two clear linear components: a sliding-frequency mode caused by the divergent waves and a constant-frequency mode for the transverse waves. However, recent observations of high-speed ferry data have identified additional components of the spectrograms that are not yet explained. We use computer simulations of linear and nonlinear ship wave patterns and apply time-frequency analysis to generate spectrograms for an idealised ship. We clarify the role of the linear dispersion relation and ship speed on the two linear components. We use a simple weakly nonlinear theory to identify higher-order effects in a spectrogram and, while the high-speed ferry data are very noisy, we propose that certain additional features in the experimental data are caused by nonlinearity. Finally, we provide a possible explanation for a further discrepancy between the high-speed ferry spectrograms and linear theory by accounting for ship acceleration.
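
As an illustration of the short-time discrete Fourier transform idea in the abstracts above, here is a minimal sketch in plain Python. It is not the authors' simulation or a ship-wake model: the sampling rate, frame length, and the synthetic test signal (a rising chirp standing in for a sliding-frequency mode, plus a fixed tone standing in for a constant-frequency mode) are all assumptions chosen for illustration, and a naive DFT is used to avoid dependencies.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def dft_power(frame):
    """Power |X[k]|^2 of one frame for bins k = 0 .. n/2 (naive DFT)."""
    n = len(frame)
    power = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(acc) ** 2)
    return power


def spectrogram(signal, frame_len, hop):
    """Return one power spectrum per Hann-windowed frame of the signal."""
    window = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        frames.append(dft_power(frame))
    return frames


def peak_bin(power, lo, hi):
    """Index of the strongest bin in the half-open range [lo, hi)."""
    return max(range(lo, hi), key=lambda k: power[k])


FS = 64  # assumed sampling rate in Hz; with 64-sample frames each bin is 1 Hz wide
signal = []
for n in range(4 * FS):
    t = n / FS
    # Hypothetical "sliding" component: starts at 4 Hz and rises by 4 Hz per second.
    chirp = math.sin(2.0 * math.pi * (4.0 * t + 0.5 * 4.0 * t * t))
    # Hypothetical "constant" component: a fixed 28 Hz tone.
    tone = math.sin(2.0 * math.pi * 28.0 * t)
    signal.append(chirp + tone)

frames = spectrogram(signal, frame_len=64, hop=64)
sliding = [peak_bin(p, 1, 24) for p in frames]    # ridge in the low band
constant = [peak_bin(p, 24, 33) for p in frames]  # ridge in the high band
print("sliding ridge (Hz):", sliding)
print("constant ridge (Hz):", constant)
```

Tracking the strongest bin per frame in each band shows one ridge that moves upward in frequency over time and one that stays fixed. In practice a library FFT (for example NumPy or SciPy) would replace the naive DFT.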

Alessandro L. Koerich - One of the best experts on this subject based on the ideXlab platform.

  • Cross-Representation Transferability of Adversarial Attacks: From Spectrograms to Audio Waveforms
    2020 International Joint Conference on Neural Networks (IJCNN), 2020
    Co-Authors: Karl Michel Koerich, Sajjad Abdoli, Mohammad Esmaeilpour, Alceu S. Britto, Alessandro L. Koerich
    Abstract:

    This paper shows the susceptibility of spectrogram-based audio classifiers to adversarial attacks and the transferability of such attacks to audio waveforms. Some commonly used adversarial attacks on images have been applied to Mel-frequency and short-time Fourier transform spectrograms, and such perturbed spectrograms are able to fool a 2D convolutional neural network (CNN). Such attacks produce perturbed spectrograms whose perturbations are visually imperceptible to humans. Furthermore, the audio waveforms reconstructed from the perturbed spectrograms are also able to fool a 1D CNN trained on the original audio. Experimental results on a dataset of Western music show that the 2D CNN achieves a mean accuracy of up to 81.87% on legitimate examples, and that this drops to 12.09% on adversarial examples. Likewise, the 1D CNN achieves a mean accuracy of up to 78.29% on original audio samples, which drops to 27.91% on adversarial audio waveforms reconstructed from the perturbed spectrograms.

  • Cross-Representation Transferability of Adversarial Attacks: From Spectrograms to Audio Waveforms
    arXiv: Sound, 2019
    Co-Authors: Karl Michel Koerich, Mohammad Esmaeilpour, Sajjad Abdoli, Alceu De Souza Britto, Alessandro L. Koerich
    Abstract:

    This paper shows the susceptibility of spectrogram-based audio classifiers to adversarial attacks and the transferability of such attacks to audio waveforms. Some commonly used adversarial attacks on images have been applied to Mel-frequency and short-time Fourier transform spectrograms, and such perturbed spectrograms are able to fool a 2D convolutional neural network (CNN). Such attacks produce perturbed spectrograms whose perturbations are visually imperceptible to humans. Furthermore, the audio waveforms reconstructed from the perturbed spectrograms are also able to fool a 1D CNN trained on the original audio. Experimental results on a dataset of Western music show that the 2D CNN achieves a mean accuracy of up to 81.87% on legitimate examples, and that this drops to 12.09% on adversarial examples. Likewise, the 1D CNN achieves a mean accuracy of up to 78.29% on original audio samples, which drops to 27.91% on adversarial audio waveforms reconstructed from the perturbed spectrograms.
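
To make the idea of a small, bounded perturbation flipping a spectrogram classifier concrete, here is a toy sketch. The abstract does not name the attacks used, so the fast gradient sign method (FGSM) is an assumption here, and the classifier is a hand-set logistic regression over a flattened six-value "spectrogram patch" rather than a CNN. All weights, inputs, and the step size `eps` are hypothetical.

```python
import math


def predict(weights, bias, x):
    """Probability of class 1 under a logistic-regression classifier."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def sign(value):
    """Return -1.0, 0.0 or 1.0 according to the sign of value."""
    return (value > 0) - (value < 0)


def fgsm(weights, bias, x, label, eps):
    """One FGSM step: move each input value by eps in the direction
    that increases the cross-entropy loss for the true label."""
    p = predict(weights, bias, x)
    # For logistic regression, d(loss)/d(x_i) = (p - label) * w_i.
    grad = [(p - label) * w for w in weights]
    return [v + eps * sign(g) for v, g in zip(x, grad)]


# Hypothetical flattened spectrogram patch (normalised magnitudes) and classifier.
weights = [2.0, -2.0, 2.0, -2.0, 2.0, -2.0]
bias = -3.0
patch = [0.9, 0.1, 0.8, 0.2, 0.7, 0.3]
label = 1

adversarial = fgsm(weights, bias, patch, label, eps=0.1)
print("clean probability of class 1:      ", round(predict(weights, bias, patch), 3))
print("adversarial probability of class 1:", round(predict(weights, bias, adversarial), 3))
```

No single value moves by more than `eps`, yet the predicted class changes, which is the behaviour the abstract reports at much larger scale for CNNs.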

Brendan J. Frey - One of the best experts on this subject based on the ideXlab platform.

  • Probabilistic Inference of Speech Signals from Phaseless Spectrograms
    Neural Information Processing Systems (NIPS), 2003
    Co-Authors: Kannan Achan, Sam T. Roweis, Brendan J. Frey
    Abstract:

    Many techniques for complex speech processing such as denoising and deconvolution, time/frequency warping, multiple speaker separation, and multiple microphone analysis operate on sequences of short-time power spectra (spectrograms), a representation which is often well-suited to these tasks. However, a significant problem with algorithms that manipulate spectrograms is that the output spectrogram does not include a phase component, which is needed to create a time-domain signal that has good perceptual quality. Here we describe a generative model of time-domain speech signals and their spectrograms, and show how an efficient optimizer can be used to find the maximum a posteriori speech signal, given the spectrogram. In contrast to techniques that alternate between estimating the phase and a spectrally consistent signal, our technique directly infers the speech signal, thus jointly optimizing the phase and a spectrally consistent signal. We compare our technique with a standard method using signal-to-noise ratios, but we also provide audio files on the web for the purpose of demonstrating the improvement in perceptual quality that our technique offers.

Hiroshi G Okuno - One of the best experts on this subject based on the ideXlab platform.

  • Speech Enhancement Based on Bayesian Low-Rank and Sparse Decomposition of Multichannel Magnitude Spectrograms
    IEEE Transactions on Audio, Speech, and Language Processing, 2018
    Co-Authors: Yoshiaki Bando, Kazuyoshi Yoshii, Katsutoshi Itoyama, Masashi Konyo, Satoshi Tadokoro, Kazuhiro Nakadai, Tatsuya Kawahara, Hiroshi G Okuno
    Abstract:

    This paper presents a blind multichannel speech enhancement method that can deal with a time-varying layout of microphones and sound sources. Since nonnegative tensor factorization (NTF) separates a multichannel magnitude (or power) spectrogram into source spectrograms without phase information, it is robust against a time-varying mixing system. This method, however, requires prior information such as the spectral bases (templates) of each source spectrogram in advance. To solve this problem, we develop a Bayesian model called robust NTF (Bayesian RNTF) that decomposes a multichannel magnitude spectrogram into target speech and noise spectrograms based on their sparseness and low rankness. Bayesian RNTF is applied to the challenging task of speech enhancement for a microphone array distributed on a hose-shaped rescue robot. When the robot searches for victims under collapsed buildings, the layout of the microphones changes over time and some of them often fail to capture target speech. Our method works robustly in such situations, thanks to its robustness against the time-varying mixing system. Experiments using a 3-m hose-shaped rescue robot with eight microphones show that the proposed method outperforms conventional blind methods, improving the signal-to-noise ratio by 1.03 dB.

  • Drum sound recognition for polyphonic audio signals by adaptation and matching of spectrogram templates with harmonic structure suppression
    IEEE Transactions on Audio, Speech, and Language Processing, 2007
    Co-Authors: Kazuyoshi Yoshii, Masataka Goto, Hiroshi G Okuno
    Abstract:

    This paper describes a system that detects onsets of the bass drum, snare drum, and hi-hat cymbals in polyphonic audio signals of popular songs. Our system is based on a template-matching method that uses power spectrograms of drum sounds as templates. This method calculates the distance between a template and each spectrogram segment extracted from a song spectrogram, using Goto's distance measure originally designed to detect the onsets in drums-only signals. However, there are two main problems. The first problem is that appropriate templates are unknown for each song. The second problem is that it is more difficult to detect drum-sound onsets in sound mixtures including various sounds other than drum sounds. To solve these problems, we propose template-adaptation and harmonic-structure-suppression methods. First of all, an initial template of each drum sound, called a seed template, is prepared. The former method adapts it to actual drum-sound spectrograms appearing in the song spectrogram. To make our system robust to the overlapping of harmonic sounds with drum sounds, the latter method suppresses harmonic components in the song spectrogram before the adaptation and matching. Experimental results with 70 popular songs showed that our template-adaptation and harmonic-structure-suppression methods improved the recognition accuracy and achieved 83%, 58%, and 46% in detecting onsets of the bass drum, snare drum, and hi-hat cymbals, respectively.
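
The template-matching step described in the drum-recognition abstract can be sketched as a sliding comparison between a template and every same-length segment of the song spectrogram. This sketch substitutes a plain squared Euclidean distance for Goto's distance measure, which the abstract names but does not define, and it omits template adaptation and harmonic suppression. The tiny spectrogram, the template values, and the function names are hypothetical.

```python
def segment_distance(segment, template):
    """Squared Euclidean distance between two equally sized spectrogram
    segments, each given as a list of frames (lists of bin powers)."""
    return sum(
        (s - t) ** 2
        for seg_frame, tpl_frame in zip(segment, template)
        for s, t in zip(seg_frame, tpl_frame)
    )


def match_template(song, template):
    """Distance of the template to every segment of the song spectrogram,
    indexed by the segment's starting frame."""
    width = len(template)
    return [
        segment_distance(song[start:start + width], template)
        for start in range(len(song) - width + 1)
    ]


# Hypothetical drum template: 2 frames x 3 frequency bins.
template = [[1.0, 0.5, 0.1],
            [0.6, 0.3, 0.1]]

# Hypothetical song spectrogram: 8 frames of low background power, with the
# drum sound at frames 3-4 plus extra energy in the last bin from another source.
song = [[0.1, 0.1, 0.1] for _ in range(8)]
song[3] = [1.0, 0.5, 0.4]
song[4] = [0.6, 0.3, 0.4]

distances = match_template(song, template)
best = min(range(len(distances)), key=distances.__getitem__)
print("distances per start frame:", [round(d, 2) for d in distances])
print("best-matching start frame:", best)
```

The extra energy in the last bin keeps the best distance above zero, a small-scale version of the overlap problem that the harmonic-structure-suppression step in the abstract is meant to reduce.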
