Audio Compression

14,000,000 Leading Edge Experts on the ideXlab platform

The Experts below are selected from a list of 351 Experts worldwide ranked by ideXlab platform

Jin Li - One of the best experts on this subject based on the ideXlab platform.

  • Low Noise Reversible MDCT (RMDCT) and Its Application in Progressive-to-Lossless Embedded Audio Coding
    IEEE Transactions on Signal Processing, 2005
    Co-Authors: Jin Li
    Abstract:

    A reversible transform converts an integer input to an integer output, while retaining the ability to reconstruct the exact input from the output sequence. It is one of the key components for lossless and progressive-to-lossless audio codecs. In this work, we investigate the desired characteristics of a high-performance reversible transform. Specifically, we show that the smaller the quantization noise of the reversible modified discrete cosine transform (RMDCT), the better the compression performance of the lossless and progressive-to-lossless codec that utilizes the transform. Armed with this knowledge, we develop a number of RMDCT solutions. The first RMDCT solution is implemented by turning every rotation module of a float MDCT (FMDCT) into a reversible rotation, which uses multiple factorizations to further reduce the quantization noise. The second and third solutions use matrix lifting to implement a reversible fast Fourier transform (FFT) and a reversible fractional-shifted FFT, respectively, which are further combined with the reversible rotations to form the RMDCT. With matrix lifting, we can design an RMDCT that has less quantization noise and can still be computed efficiently. A progressive-to-lossless embedded audio codec (PLEAC) employing the RMDCT is implemented with superior results for both lossless and lossy audio compression.

  • Embedded Audio Coding (EAC) with Implicit Auditory Masking
    ACM Multimedia, 2002
    Co-Authors: Jin Li
    Abstract:

    An embedded audio coder (EAC) is proposed whose compression performance rivals that of the best available non-scalable audio coder. The key technology that gives the EAC its high performance is implicit auditory masking. In common practice, an auditory masking threshold is derived from the input audio signal, transmitted to the decoder, and used to quantize (modify) the transform coefficients. The EAC instead integrates the auditory masking process into the embedded entropy coding: the auditory masking threshold is derived from the already encoded coefficients and used to change the order of coding. There is no need to store or send the auditory masking threshold in the EAC. By eliminating the overhead of the auditory mask, the EAC greatly improves compression efficiency, especially at low bitrates. Extensive experimental results demonstrate that the EAC coder substantially outperforms existing scalable audio coders and audio compression standards (MP3 and MPEG-4), and rivals the best available commercial audio coder. Yet the EAC compressed bitstream is fully scalable in terms of the coding bitrate, number of audio channels, and audio sampling rate.
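
The reversible rotation mentioned in the RMDCT abstract above can be illustrated with a small sketch. It assumes the textbook factorization of a plane rotation into three lifting (shear) steps, each followed by rounding; this is only the basic single-factorization idea, not the paper's multiple-factorization or matrix-lifting design, and the function names are hypothetical.

```python
import math

def reversible_rotate(x, y, theta):
    """Integer-to-integer approximation of a rotation by theta.

    Assumes sin(theta) != 0. Each lifting step changes one variable by a
    rounded function of the other, so it can be undone exactly.
    """
    c0 = (math.cos(theta) - 1.0) / math.sin(theta)  # shear used in steps 1 and 3
    c1 = math.sin(theta)                            # shear used in step 2
    x = x + round(c0 * y)
    y = y + round(c1 * x)
    x = x + round(c0 * y)
    return x, y

def inverse_rotate(x, y, theta):
    """Undo reversible_rotate by subtracting the same rounded terms in reverse order."""
    c0 = (math.cos(theta) - 1.0) / math.sin(theta)
    c1 = math.sin(theta)
    x = x - round(c0 * y)
    y = y - round(c1 * x)
    x = x - round(c0 * y)
    return x, y

if __name__ == "__main__":
    theta = math.pi / 5
    u, v = reversible_rotate(100, -37, theta)
    print((u, v), inverse_rotate(u, v, theta))
```

The output stays close to the exact floating-point rotation, and the gap between the two is the quantization noise that the abstract identifies as the quantity to minimize.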

Michael S Scordilis - One of the best experts on this subject based on the ideXlab platform.

  • Psychoacoustic Music Analysis Based on the Discrete Wavelet Packet Transform
    Research Letters in Signal Processing, 2008
    Co-Authors: Michael S Scordilis
    Abstract:

    Psychoacoustical computational models are necessary for the perceptual processing of acoustic signals and have contributed significantly to the development of highly efficient audio analysis and coding. In this paper, we present an approach for the psychoacoustic analysis of musical signals based on the discrete wavelet packet transform. The proposed method mimics the multiresolution properties of the human ear more closely than other techniques, and it includes simultaneous and temporal auditory masking. Experimental results show that this method provides better masking capabilities and reduces the signal-to-masking ratio substantially more than other approaches, without introducing audible distortion. This model can lead to greater audio compression by permitting further bit rate reduction, and to more secure watermarking by providing greater signal space for information hiding.

  • An Enhanced Psychoacoustic Model Based on the Discrete Wavelet Packet Transform
    Journal of the Franklin Institute-Engineering and Applied Mathematics, 2006
    Co-Authors: Michael S Scordilis
    Abstract:

    The perception of acoustic information by humans is based on the detailed temporal and spectral analysis provided by the auditory processing of the received signal. The incorporation of this process in psychoacoustical computational models has contributed significantly both to the development of highly efficient audio compression schemes and to effective audio watermarking methods. In this paper, we present an approach based on the discrete wavelet packet transform, which closely mimics the multi-resolution properties of the human ear and also includes simultaneous and temporal auditory masking. Experimental results show that the proposed technique offers better masking capabilities and reduces the signal-to-masking ratio when compared to related approaches, without introducing audible distortion. These results have important implications both for audio compression, by permitting further bit rate reduction, and for watermarking, by providing greater signal space for information hiding.
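
As a rough illustration of the discrete wavelet packet transform that both abstracts above build on, the sketch below splits a signal into a full tree of subbands and reconstructs it exactly. It assumes the Haar filter pair and a uniform tree purely for brevity; the papers' actual filters and their ear-like, non-uniform band layout are not reproduced here, and all function names are hypothetical.

```python
import math

SCALE = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling

def haar_split(signal):
    """One analysis step: even-length signal -> (low-pass half, high-pass half)."""
    low = [(signal[i] + signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    return low, high

def haar_merge(low, high):
    """Inverse of haar_split."""
    out = []
    for a, d in zip(low, high):
        out.extend([(a + d) * SCALE, (a - d) * SCALE])
    return out

def wavelet_packet(signal, depth):
    """Full packet tree: unlike the plain DWT, the high-pass branch is split too.

    Returns 2**depth bands in natural (tree) order, not sorted by frequency.
    The signal length must be divisible by 2**depth.
    """
    bands = [list(signal)]
    for _ in range(depth):
        bands = [half for band in bands for half in haar_split(band)]
    return bands

def inverse_wavelet_packet(bands):
    """Merge sibling bands pairwise until one signal remains."""
    while len(bands) > 1:
        bands = [haar_merge(bands[i], bands[i + 1]) for i in range(0, len(bands), 2)]
    return bands[0]

if __name__ == "__main__":
    x = [math.sin(0.3 * n) for n in range(16)]
    bands = wavelet_packet(x, 3)
    print(len(bands), [len(b) for b in bands])
```

A psychoacoustic model would then estimate a masking threshold per band; that step depends on details the abstracts do not give and is left out.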

Leah H Jamieson - One of the best experts on this subject based on the ideXlab platform.

  • High Quality Audio Compression Using an Adaptive Wavelet Packet Decomposition and Psychoacoustic Modeling
    IEEE Transactions on Signal Processing, 1998
    Co-Authors: P Srinivasan, Leah H Jamieson
    Abstract:

    This paper presents a technique to incorporate psychoacoustic models into an adaptive wavelet packet scheme to achieve perceptually transparent compression of high-quality (44.1 kHz) audio signals at about 45 kb/s. The filter bank structure adapts according to psychoacoustic criteria and according to the computational complexity that is available at the decoder. This permits software implementations that can perform according to the computational power available in order to achieve real-time coding/decoding. The bit allocation scheme is an adapted zero-tree algorithm that also takes input from the psychoacoustic model. The measure of performance is a quantity called subband perceptual rate, which the filter bank structure adapts to approach the perceptual entropy (PE) as closely as possible. In addition, this method is also amenable to progressive transmission, that is, it can achieve the best quality of reconstruction possible considering the size of the bit stream available at the encoder. The result is a variable-rate compression scheme for high-quality audio that takes into account the allowed computational complexity, the available bit budget, and the psychoacoustic criteria for transparent coding. This paper thus provides a novel scheme to marry the results in wavelet packets and perceptual coding to construct an algorithm that is well suited to high-quality audio transfer for Internet and storage applications.

Davis Yen Pan - One of the best experts on this subject based on the ideXlab platform.

  • A Tutorial on MPEG/Audio Compression
    IEEE MultiMedia, 1995
    Co-Authors: Davis Yen Pan
    Abstract:

    This tutorial covers the theory behind MPEG/Audio compression. While lossy, the algorithm can often provide "transparent", perceptually lossless compression, even with factors of 6-to-1 or more. It exploits the perceptual properties of the human auditory system. The article also covers the basics of psychoacoustic modeling and the methods the algorithm uses to compress audio data with the least perceptible degradation.

  • Digital Audio Compression
    Digital Technical Journal, 1993
    Co-Authors: Davis Yen Pan
    Abstract:

    Compared to most digital data types, with the exception of digital video, the data rates associated with uncompressed digital audio are substantial. Digital audio compression enables more efficient storage and transmission of audio data. The many forms of audio compression techniques offer a range of encoder and decoder complexity, compressed audio quality, and differing amounts of data compression. The μ-law transformation and ADPCM coder are simple approaches with low-complexity, low-compression, and medium audio quality algorithms. The MPEG/Audio standard is a high-complexity, high-compression, and high audio quality algorithm. These techniques apply to general audio signals and are not specifically tuned for speech signals.
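
The μ-law transformation named in the abstract above can be sketched in a few lines. This version assumes the continuous companding formula with μ = 255 (the value used in G.711 telephony) and a simple symmetric 8-bit quantizer; practical codecs use a piecewise-linear approximation instead, and the helper names here are hypothetical.

```python
import math

MU = 255.0  # assumed companding constant (G.711 uses 255)

def mu_law_compress(x, mu=MU):
    """Map a sample in [-1, 1] to [-1, 1], expanding resolution near zero."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

def quantize(v, bits=8):
    """Uniform symmetric quantizer on [-1, 1] with 2**bits - 1 levels."""
    levels = 2 ** (bits - 1) - 1
    return round(v * levels) / levels

if __name__ == "__main__":
    quiet = 0.01  # a low-amplitude sample
    uniform_err = abs(quantize(quiet) - quiet)
    companded = mu_law_expand(quantize(mu_law_compress(quiet)))
    print("uniform error:  ", uniform_err)
    print("companded error:", abs(companded - quiet))
```

For quiet samples the companded path gives a smaller quantization error than uniform quantization at the same word length, which is why such a simple scheme still delivers the medium audio quality the abstract describes.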

D Sinha - One of the best experts on this subject based on the ideXlab platform.

  • Audio Compression at Low Bit Rates Using a Signal Adaptive Switched Filterbank
    International Conference on Acoustics Speech and Signal Processing, 1996
    Co-Authors: D Sinha, J.D. Johnston
    Abstract:

    A perceptual audio coder typically consists of a filter-bank which breaks the signal into its frequency components. These components are then quantized using a perceptual masking model. Previous efforts have indicated that a high resolution filter-bank, e.g., the modified discrete cosine transform (MDCT) with 1024 subbands, is able to minimize the bit rate requirements for most of the music samples. The high resolution MDCT, however, is not suitable for the encoding of non-stationary segments of music. A long/short resolution or "window" switching scheme has been employed to overcome this problem, but it has certain inherent disadvantages which become prominent at lower bit rates (<64 kbps for stereo). We propose a novel switched filter-bank scheme which switches between an MDCT and a wavelet filter-bank based on the signal characteristics. A tree structured wavelet filter-bank with properly designed filters offers natural advantages for the representation of non-stationary segments such as attacks. Furthermore, it allows for the optimum exploitation of perceptual irrelevancies.

  • Low Bit Rate Transparent Audio Compression Using Adapted Wavelets
    IEEE Transactions on Signal Processing, 1993
    Co-Authors: D Sinha, A H Tewfik
    Abstract:

    Describes a novel wavelet based audio synthesis and coding method. The method uses optimal adaptive wavelet selection and wavelet coefficient quantization procedures together with a dynamic dictionary approach. The adaptive wavelet transform selection and transform coefficient bit allocation procedures are designed to take advantage of the masking effect in human hearing. They minimize the number of bits required to represent each frame of audio material at a fixed distortion level. The dynamic dictionary greatly reduces statistical redundancies in the audio source. Experiments indicate that the proposed adaptive wavelet selection procedure by itself can achieve almost transparent coding of monophonic compact disk (CD) quality signals (sampled at 44.1 kHz) at bit rates of 64-70 kilobits per second (kb/s). The combined adaptive wavelet selection and dynamic dictionary coding procedures achieve almost transparent coding of monophonic CD quality signals at bit rates of 48-66 kb/s.
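
The MDCT filter bank referred to in the switched-filterbank abstract above can be sketched as follows. This is a direct O(N²) implementation assuming a sine window and 50% overlapped frames, written to show that time-domain aliasing cancels on overlap-add; it uses a tiny block size rather than 1024 subbands, omits quantization and window switching, and its function names are hypothetical.

```python
import math

def sine_window(n_bands):
    """Length-2N sine window; it satisfies the Princen-Bradley condition."""
    return [math.sin(math.pi * (n + 0.5) / (2 * n_bands)) for n in range(2 * n_bands)]

def mdct(frame):
    """2N windowed samples -> N coefficients."""
    n_bands = len(frame) // 2
    n0 = 0.5 + n_bands / 2
    return [sum(frame[n] * math.cos(math.pi / n_bands * (n + n0) * (k + 0.5))
                for n in range(2 * n_bands))
            for k in range(n_bands)]

def imdct(coeffs):
    """N coefficients -> 2N time-aliased samples."""
    n_bands = len(coeffs)
    n0 = 0.5 + n_bands / 2
    return [2.0 / n_bands * sum(coeffs[k] * math.cos(math.pi / n_bands * (n + n0) * (k + 0.5))
                                for k in range(n_bands))
            for n in range(2 * n_bands)]

def analyze_synthesize(signal, n_bands):
    """MDCT then IMDCT over overlapped frames, with overlap-add.

    len(signal) must be a multiple of n_bands. The signal is zero-padded by
    one block on each side so every sample is covered by two frames.
    """
    window = sine_window(n_bands)
    padded = [0.0] * n_bands + list(signal) + [0.0] * n_bands
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n_bands + 1, n_bands):
        frame = [padded[start + n] * window[n] for n in range(2 * n_bands)]
        recon = imdct(mdct(frame))
        for n in range(2 * n_bands):
            out[start + n] += recon[n] * window[n]  # aliasing cancels here
    return out[n_bands:len(padded) - n_bands]

if __name__ == "__main__":
    x = [math.sin(0.2 * n) for n in range(32)]
    y = analyze_synthesize(x, 8)
    print(max(abs(a - b) for a, b in zip(x, y)))
```

Each frame of 2N samples yields only N coefficients, so the transform is critically sampled. A longer block gives finer frequency resolution but spreads quantization noise over more time, which is the weakness on attacks that motivates the switching scheme in the abstract.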