Audio Coding

The Experts below are selected from a list of 14316 Experts worldwide ranked by ideXlab platform

Karlheinz Brandenburg - One of the best experts on this subject based on the ideXlab platform.

  • MPEG-4 natural Audio Coding
    Signal Processing-image Communication, 2002
    Co-Authors: Karlheinz Brandenburg, Oliver Kunz, Akihiko Sugiyama
    Abstract:

    MPEG-4 Audio represents a new kind of Audio Coding standard. Unlike its predecessors, MPEG-1 and MPEG-2 high-quality Audio Coding, and unlike the speech Coding standards which have been completed by the ITU-T, it describes not a single or small set of highly efficient compression schemes but a complete toolbox to do everything from low bit-rate speech Coding to high-quality Audio Coding or music synthesis. The natural Coding part within MPEG-4 Audio describes traditional type speech and high-quality Audio Coding algorithms and their combination to enable new functionalities like scalability (hierarchical Coding) across the boundaries of Coding algorithms. This paper gives an overview of the basic algorithms and how they can be combined.

  • IntMDCT: A link between perceptual and lossless Audio Coding
    International Conference on Acoustics Speech and Signal Processing, 2002
    Co-Authors: Ralf Geiger, Jürgen Herre, Jürgen Koller, Karlheinz Brandenburg
    Abstract:

    The Modified Discrete Cosine Transform (MDCT) is widely used in modern perceptual Audio Coding schemes. In this paper we present an integer approximation of this lapped transform, called IntMDCT, which is derived from the MDCT using the lifting scheme. This reversible integer transform inherits most of the attractive properties of the MDCT, exhibiting a good spectral representation of the Audio signal, critical sampling and overlapping of blocks. This makes the IntMDCT well suited both for lossless Audio Coding and for combined perceptual and lossless Audio Coding. A scalable system is presented providing a lossless enhancement of perceptual Audio Coding schemes, such as MPEG-2 AAC.
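
The MDCT properties named in this abstract (critical sampling and overlapping blocks) can be illustrated with a small sketch. It is not taken from the paper: the function names, the sine window, and the direct O(N^2) evaluation are assumptions chosen for readability, and the signal length is assumed to be a multiple of the hop size. Each block of 2N windowed samples yields only N coefficients, and the aliasing introduced by that reduction cancels when neighbouring blocks are overlap-added.

```python
import math

def sine_window(n_half):
    # Sine window of length 2N; it satisfies w[n]^2 + w[n+N]^2 = 1.
    length = 2 * n_half
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]

def mdct(block, window):
    # Forward MDCT: 2N windowed input samples -> N coefficients.
    n_half = len(block) // 2
    n0 = 0.5 + n_half / 2.0
    return [
        sum(window[n] * block[n] * math.cos(math.pi / n_half * (n + n0) * (k + 0.5))
            for n in range(2 * n_half))
        for k in range(n_half)
    ]

def imdct(coeffs, window):
    # Inverse MDCT: N coefficients -> 2N windowed, time-aliased samples.
    n_half = len(coeffs)
    n0 = 0.5 + n_half / 2.0
    return [
        window[n] * (2.0 / n_half) *
        sum(coeffs[k] * math.cos(math.pi / n_half * (n + n0) * (k + 0.5))
            for k in range(n_half))
        for n in range(2 * n_half)
    ]

def analyze_synthesize(signal, n_half):
    # Assumes len(signal) is a multiple of n_half (hypothetical simplification).
    window = sine_window(n_half)
    padded = [0.0] * n_half + list(signal) + [0.0] * n_half
    out = [0.0] * len(padded)
    # Hop size N with block length 2N: consecutive blocks overlap by 50%.
    for start in range(0, len(padded) - n_half, n_half):
        coeffs = mdct(padded[start:start + 2 * n_half], window)
        for n, value in enumerate(imdct(coeffs, window)):
            out[start + n] += value  # overlap-add cancels the time-domain aliasing
    return out[n_half:n_half + len(signal)]
```

Calling `analyze_synthesize(signal, 8)` on a signal of 32 samples should return the input up to floating-point rounding. That rounding is what an integer transform such as the IntMDCT is designed to avoid.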

  • Audio Coding based on Integer Transforms
    Journal of The Audio Engineering Society, 2001
    Co-Authors: Ralf Geiger, Thomas Sporer, Jürgen Koller, Karlheinz Brandenburg
    Abstract:

    In recent years Audio Coding has become a very popular field for research and applications. Especially perceptual Audio Coding schemes, such as MPEG-1 Layer-3 (MP3) and MPEG-2 Advanced Audio Coding (AAC), are widely used for efficient storage and transmission of music signals. Nevertheless, for professional applications, such as archiving and transmission in studio environments, lossless Audio Coding schemes are considered more appropriate. Traditionally, the technical approaches used in perceptual and lossless Audio Coding have been separate worlds. In perceptual Audio Coding, the use of filter banks, such as the lapped orthogonal transform “Modified Discrete Cosine Transform” (MDCT), has been the approach of choice, used by many state-of-the-art Coding schemes. On the other hand, lossless Audio Coding schemes mostly employ predictive Coding of waveforms to remove redundancy. Only a few attempts have been made so far to use transform Coding for the purpose of lossless Audio Coding. This work presents a new approach of applying the lifting scheme to lapped transforms used in perceptual Audio Coding. This allows for an invertible integer-to-integer approximation of the original transform, e.g. the IntMDCT as an integer approximation of the MDCT. The same technique can also be applied to low-delay filter banks. A generalized, multi-dimensional lifting approach and a noise-shaping technique are introduced, which allow the accuracy of the approximation to the original transform to be optimized further. Based on these new integer transforms, this work presents new Audio Coding schemes and applications. The Audio Coding applications cover lossless Audio Coding, scalable lossless enhancement of a perceptual Audio coder and fine-grain scalable perceptual and lossless Audio Coding. Finally, an approach to data hiding with high data rates in uncompressed Audio signals based on integer transforms is described.
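
The core idea of an invertible integer-to-integer approximation via the lifting scheme can be sketched on a single plane rotation, the kind of building block into which an MDCT can be factored. The code below is an illustrative assumption, not the authors' implementation: the function names are hypothetical, and it covers only one rotation, not the full IntMDCT. A rotation is written as three shear (lifting) steps, each rounded to an integer. Because every step changes one variable by a rounded function of the other, it can be undone exactly by subtracting the same rounded value.

```python
import math

def lifting_rotate(x, y, theta):
    """Integer approximation of rotating (x, y) by theta with three lifting steps.

    Assumes sin(theta) != 0. Inputs and outputs are integers.
    """
    s = math.sin(theta)
    a = (math.cos(theta) - 1.0) / s  # equals -tan(theta / 2)
    x = x + round(a * y)   # first shear
    y = y + round(s * x)   # second shear
    x = x + round(a * y)   # third shear
    return x, y

def lifting_rotate_inverse(x, y, theta):
    """Exact inverse: undo the three lifting steps in reverse order."""
    s = math.sin(theta)
    a = (math.cos(theta) - 1.0) / s
    x = x - round(a * y)
    y = y - round(s * x)
    x = x - round(a * y)
    return x, y
```

The forward result stays close to the exact rotation (c·x − s·y, s·x + c·y), while `lifting_rotate_inverse(*lifting_rotate(x, y, theta), theta)` returns the original integer pair with no loss. Lossless coding in the transform domain relies on this property.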

  • ISO/IEC MPEG-2 Advanced Audio Coding
    Journal of The Audio Engineering Society, 1997
    Co-Authors: Marina Bosi, Karlheinz Brandenburg, Schuyler Quackenbush, Louis Dunn Fielder, Kenzo Akagiri, Hendrik Fuchs, Martin Dietz
    Abstract:

    The ISO/IEC MPEG-2 advanced Audio Coding (AAC) system was designed to provide MPEG-2 with the best Audio quality without any restrictions due to compatibility requirements. The main features of the AAC system (ISO/IEC 13818-7) are described. MPEG-2 AAC combines the Coding efficiency of a high-resolution filter bank, prediction techniques, and Huffman Coding with additional functionalities aimed to deliver very high Audio quality at a variety of data rates.

Susanto Rahardja - One of the best experts on this subject based on the ideXlab platform.

  • ICASSP - Enhanced scalable to lossless Audio Coding scheme
    2010 IEEE International Conference on Acoustics Speech and Signal Processing, 2010
    Co-Authors: Haiyan Shu, Haibin Huang, Susanto Rahardja
    Abstract:

    Scalable to lossless (SLS) Audio Coding is a state-of-the-art Audio Coding technique that has been adopted as an MPEG scalable Audio Coding tool. To realize bit-plane refinement, this technique employs bit-plane arithmetic Coding for lossless entropy Coding, and a Laplacian distribution is used to model the input data to achieve high compression efficiency. In this paper, the bit-plane probability is analyzed when a generalized Gaussian distribution is used to model the input data. Based on the resulting bit-plane probability for the generalized Gaussian distribution, a low-cost bit-plane arithmetic Coding method is presented. This scheme is implemented in the SLS Audio Coding platform. With the same computational complexity, the proposed algorithm achieves higher compression efficiency than SLS.

  • On integer MDCT for perceptual Audio Coding
    IEEE Transactions on Audio, Speech and Language Processing, 2007
    Co-Authors: Te Li, Susanto Rahardja, Rongshan Yu, Soo Ngee Koh
    Abstract:

    In MPEG-4 scalable lossless Coding (SLS), which was recently published as an ISO standard in June 2006, the integer modified discrete cosine transform (IntMDCT) was adopted to enable efficient lossless reconstruction. In addition, there is an MDCT filterbank which is inherent to the advanced Audio Coding (AAC) core that is present in the SLS codec. The presence of two filterbanks has undoubtedly increased the complexity of the implementation, and it is for this reason that the MDCT is disabled and the IntMDCT is then the only type of filterbank that is employed in SLS for both lossy and lossless operations. Because of the rounding operations in the IntMDCT, there is a concern that the use of the IntMDCT for perceptual Audio Coding will degrade the fidelity of the Audio codec. This paper addresses this concern by analyzing the performance of the IntMDCT in a lossy Coding scenario. It is found that noise introduced by the IntMDCT does not affect the perceptual quality of the coded Audio under standard playback circumstances. As such, it concludes that the MDCT and IntMDCT filterbanks are interchangeable at lossy bitrates, and the use of only the IntMDCT filterbank in scalable Audio Coding is also justified.

  • ISM - Perceptually Prioritized Bit-Plane Coding for High-Definition Advanced Audio Coding
    Eighth IEEE International Symposium on Multimedia (ISM'06), 2006
    Co-Authors: Susanto Rahardja, Soo Ngee Koh
    Abstract:

    Wide bitrate range scalability is now the latest trend in Audio Coding. Much effort has been devoted to the development of algorithms for more efficient scalable Audio coders that scale from very low bitrates. Scalable Audio Coding techniques such as MPEG-4 Scalable Lossless Coding (SLS) offer a unified solution for high-compression perceptual Audio and high-quality lossless Audio. SLS provides a fine-grain scalable extension of the well-known MPEG-4 Advanced Audio Coding (AAC) perceptual Audio coder up to fully lossless reconstruction. Recently, the combination of SLS and the AAC coder was renamed "High Definition Advanced Audio Coding" (HD-AAC). It is observed that HD-AAC can be further improved at intermediate enhancement bitrates when the core bitrate is low. In this paper, a Perceptually Prioritized Bit-Plane Coding (PPBPC) scheme is proposed. With this novel Coding scheme, the bit-plane Coding is performed with priorities according to the perceptual information of the signal to be coded. By using this low-complexity structure with trivial extra side information, the bit-plane Coding for scalable Audio can be implemented in a perceptually more efficient manner, and the quality of the Audio under the aforementioned scenario is greatly improved.

  • MPEG-4 Scalable to Lossless Audio Coding
    Journal of The Audio Engineering Society, 2004
    Co-Authors: Ralf Geiger, Juergen Herre, Haibin Huang, Xiao Lin, Susanto Rahardja
    Abstract:

    As the latest extension of MPEG-4 Audio Coding, MPEG-4 Lossless Audio Coding includes a scalable Audio Coding solution (SLS) that integrates the functionalities of lossless Audio Coding, perceptual Audio Coding, and fine granular scalable Audio Coding into a single coder framework while providing backward compatibility to MPEG Advanced Audio Coding (AAC) at the bit-stream level. Despite its abundant functionalities, SLS still achieves a compression performance that is comparable to state-of-the-art non-scalable lossless Audio Coding algorithms. As a result, SLS provides a universal digital Audio format for a variety of application domains including professional Audio, Internet music, consumer electronics, broadcasting and others. This paper presents the structure of SLS and its latest developments during the MPEG standardization process.

  • ICASSP (3) - A scalable lossy to lossless Audio coder for MPEG-4 lossless Audio Coding
    2004 IEEE International Conference on Acoustics Speech and Signal Processing, 2004
    Co-Authors: Xiao Lin, Susanto Rahardja
    Abstract:

    In this paper, we present Advanced Audio Zip (AAZ), a scalable lossless Audio Coding technology that was recently selected as the reference model for MPEG Audio scalable lossless Coding (SLS) work. AAZ provides excellent compression performance while delivering fine-grain bit-rate scalability from lossy to lossless Coding. Moreover, AAZ provides backward compatibility to the MPEG advanced Audio Coding (AAC) system by embedding an AAC-compliant bit-stream into the lossless bit-stream. As a result, AAZ serves as a universal Coding solution with functionalities that were previously offered by several distinct Audio Coding technologies such as lossless Audio Coding, perceptual Audio Coding, or scalable Audio Coding, and it maximizes the interchangeability of digital Audio content migrating among these application domains.

Ralf Geiger - One of the best experts on this subject based on the ideXlab platform.

  • Low delay filterbanks for enhanced low delay Audio Coding
    Workshop on Applications of Signal Processing to Audio and Acoustics, 2007
    Co-Authors: Markus Schnell, Jürgen Herre, Ralf Geiger, Markus Multrus, Markus Schmidt, Michael Mellar, Gerald Schuller
    Abstract:

    Low delay perceptual Audio Coding has recently gained wide acceptance for high quality communication. While common schemes are based on the well-known Modified Discrete Cosine Transform (MDCT) filterbank, this paper describes novel Coding algorithms that, for the first time, make use of dedicated low delay filterbanks, thus achieving improved Coding efficiency while maintaining or even reducing the low codec delay. The MPEG-4 Enhanced Low Delay AAC (AAC-ELD) coder currently under development within ISO/MPEG combines a traditional perceptual Audio Coding scheme with spectral band replication (SBR), both running in a delay-optimized fashion by using low delay filterbanks.

  • MPEG-4 Scalable to Lossless Audio Coding
    Journal of The Audio Engineering Society, 2004
    Co-Authors: Ralf Geiger, Juergen Herre, Haibin Huang, Xiao Lin, Susanto Rahardja
    Abstract:

    As the latest extension of MPEG-4 Audio Coding, MPEG-4 Lossless Audio Coding includes a scalable Audio Coding solution (SLS) that integrates the functionalities of lossless Audio Coding, perceptual Audio Coding, and fine granular scalable Audio Coding into a single coder framework while providing backward compatibility to MPEG Advanced Audio Coding (AAC) at the bit-stream level. Despite its abundant functionalities, SLS still achieves a compression performance that is comparable to state-of-the-art non-scalable lossless Audio Coding algorithms. As a result, SLS provides a universal digital Audio format for a variety of application domains including professional Audio, Internet music, consumer electronics, broadcasting and others. This paper presents the structure of SLS and its latest developments during the MPEG standardization process.

  • Fine-grain scalable perceptual and lossless Audio Coding based on IntMDCT
    International Conference on Acoustics Speech and Signal Processing, 2003
    Co-Authors: Ralf Geiger, Gerald Schuller, A Herre, Thomas Sporer
    Abstract:

    This paper presents an embedded fine-grain scalable perceptual and lossless Audio Coding scheme. The enabling technology for this combined perceptual and lossless Audio Coding approach is the integer modified discrete cosine transform (IntMDCT), which is an integer approximation of the MDCT based on the lifting scheme. It maintains the perfect reconstruction property and therefore enables efficient lossless Coding in the frequency domain. The close approximation of the MDCT also allows us to build a perceptual Coding scheme based on the IntMDCT. In this paper a bit-sliced arithmetic Coding technique is applied to the IntMDCT values. Together with the encoded shape of the masking threshold, a perceptually hierarchical bitstream is obtained, containing several stages of perceptual quality and extending to lossless operation when transmitted completely. A concept of encoding subslices is presented in order to obtain a fine adaptation to the masking threshold, especially in the range of perceptually transparent quality.
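
The hierarchical bitstream described here rests on slicing integer coefficients into bit planes. The following is a minimal sketch of that slicing only, under stated assumptions: the function names are hypothetical, the arithmetic coder and the masking-threshold shaping from the abstract are left out, and signs are stored separately from magnitudes. Planes are ordered from most to least significant, so a decoder that receives only the first few planes gets a coarse version, and one that receives all planes recovers the integers exactly.

```python
def to_bit_planes(coeffs):
    """Split integer coefficients into a sign list and magnitude bit planes (MSB first)."""
    signs = [1 if c < 0 else 0 for c in coeffs]
    mags = [abs(c) for c in coeffs]
    n_planes = max(mags).bit_length() if mags else 0
    planes = [[(m >> p) & 1 for m in mags] for p in range(n_planes - 1, -1, -1)]
    return signs, planes

def from_bit_planes(signs, planes, keep=None):
    """Rebuild coefficients from the first `keep` planes (all planes if keep is None)."""
    n_planes = len(planes)
    if keep is None:
        keep = n_planes
    mags = [0] * len(signs)
    for i, plane in enumerate(planes[:keep]):
        shift = n_planes - 1 - i  # weight of this plane
        for j, bit in enumerate(plane):
            mags[j] |= bit << shift
    return [-m if s else m for m, s in zip(mags, signs)]
```

For example, with the coefficients `[37, -5, 0, 12, -64, 3]` there are seven planes. Keeping the top three gives `[32, 0, 0, 0, -64, 0]`, and keeping all seven returns the original list.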

  • IntMDCT: A link between perceptual and lossless Audio Coding
    International Conference on Acoustics Speech and Signal Processing, 2002
    Co-Authors: Ralf Geiger, Jürgen Herre, Jürgen Koller, Karlheinz Brandenburg
    Abstract:

    The Modified Discrete Cosine Transform (MDCT) is widely used in modern perceptual Audio Coding schemes. In this paper we present an integer approximation of this lapped transform, called IntMDCT, which is derived from the MDCT using the lifting scheme. This reversible integer transform inherits most of the attractive properties of the MDCT, exhibiting a good spectral representation of the Audio signal, critical sampling and overlapping of blocks. This makes the IntMDCT well suited both for lossless Audio Coding and for combined perceptual and lossless Audio Coding. A scalable system is presented providing a lossless enhancement of perceptual Audio Coding schemes, such as MPEG-2 AAC.

Li Gao - One of the best experts on this subject based on the ideXlab platform.

  • ICIS - Joint speech/Audio Coding based scalable perceptual Audio Coding
    2014 IEEE ACIS 13th International Conference on Computer and Information Science (ICIS), 2014
    Co-Authors: Li Gao, Yuhong Yang
    Abstract:

    With the technical evolution of global mobile communications, heterogeneous communication environments, frequently fluctuating bandwidth and multiform signals pose new challenges to the Coding technology of multimedia signals. Scalable Audio Coding (SAC) can provide smooth transitions between different Coding qualities, which is an optimal choice for Coding Audio signals of different types and can produce more reliable and consistent service quality in multimedia communications. A scalable Audio Coding system based on a joint speech/Audio Coding method and a bit-plane based auditory perceptual importance model are proposed here. Both the Audio content and the network bandwidth fluctuation are considered in the system to obtain stable service quality in mobile multimedia services. Experimental results indicate that at the same bit rates the subjective quality of the proposed method is slightly better than that of G.729.1 and the SNR is improved by 0.3 dB.

  • ICIMCS - 3D Audio Coding Based on Distance Perception
    Proceedings of International Conference on Internet Multimedia Computing and Service - ICIMCS '14, 2014
    Co-Authors: Li Gao
    Abstract:

    The rapid development of 3D film stimulates the demand for 3D Audio. Current 3D Audio systems mainly focus on the performance of the directional sound image. Multichannel techniques extract binaural cues from channels to represent the directional information but carry less information about sound distance, which degrades distance perception quality because direction and distance are perceived differently. Distance information is the key to distinguishing 3D Audio from 2D Audio. We focus on the auditory distance perception in 3D Audio Coding. An auditory distance estimation model is established based on the auditory perception of the human ear and incorporated into the multichannel 3D Audio Coding system to code and reproduce sound from different directions and distances. Experimental results verified the performance of the proposed 3D Audio Coding based on distance perception.

  • ICIMCS - An Enhanced Perceptual Frequency Subband Priority Based Scalable Audio Coding
    Proceedings of International Conference on Internet Multimedia Computing and Service - ICIMCS '14, 2014
    Co-Authors: Hui Liu, Li Gao
    Abstract:

    Energy and frequency play important roles in the priority assignment schemes of scalable Audio Coding. Both should be taken into account to ensure that the truly important frequency subbands are coded first and most finely. An enhanced perceptual frequency subband priority based scalable Audio Coding scheme is proposed for the frequency subband priority assignment. The perceptual characteristics of the human auditory system are applied to prioritize the Coding of the frequency subbands to which the human auditory system is more sensitive. Experimental CMOS results verified the performance of the proposed scalable Audio Coding.
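
As a rough illustration of assigning subband priorities from both energy and frequency, the sketch below ranks subbands by energy multiplied by a per-band weight. Everything specific in it is an assumption made for illustration: the function names, the band edges, and the weights are hypothetical and do not come from the paper, which derives its priorities from a perceptual model not reproduced here.

```python
def subband_energies(spectrum, band_edges):
    """Sum of squared spectral values in each subband [edge_i, edge_{i+1})."""
    return [
        sum(v * v for v in spectrum[lo:hi])
        for lo, hi in zip(band_edges, band_edges[1:])
    ]

def priority_order(energies, weights):
    """Indices of subbands sorted from highest to lowest weighted energy.

    `weights` stands in for a perceptual sensitivity per subband (hypothetical values).
    """
    scores = [e * w for e, w in zip(energies, weights)]
    return sorted(range(len(scores)), key=lambda b: scores[b], reverse=True)
```

With the spectrum `[4.0, 3.0, 1.0, 1.0, 2.0, 2.0, 0.5, 0.5]`, band edges `[0, 2, 4, 6, 8]` and weights `[1.0, 0.8, 0.6, 0.4]`, the energies are `[25.0, 2.0, 8.0, 0.5]` and the resulting order is `[0, 2, 1, 3]`. A scalable coder following this ordering would spend its first enhancement bits on subband 0, then on subband 2.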

  • ICASSP - A spatial priority based scalable Audio Coding
    2014 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), 2014
    Co-Authors: Li Gao, Yuhong Yang
    Abstract:

    A spatial priority scheme for scalable Audio Coding is presented in this paper. To improve the Coding quality of important sounds that attract high attention, especially moving sounds, spatial information is introduced to assign the priorities of frequency subbands. Spatial cues and distance features are extracted in frequency subbands to represent sounds with fast-changing direction and distance. Coding priorities are assigned to different frequency subbands according to the energy and spatial information. With trivial added side information and complexity, experimental results show that the perceptual quality in scalable Audio Coding is improved, especially for sounds that attract high attention such as moving sounds.

  • ICAILP - Scalable Audio Coding based on spatial perception in Audio surveillance
    2014 International Conference on Audio Language and Image Processing, 2014
    Co-Authors: Hui Liu, Li Gao
    Abstract:

    A spatial perception based scalable Audio Coding scheme for Audio surveillance is presented in this paper. Sudden events in Audio surveillance are often accompanied by high energy, fast-changing energy or fast-changing sound location. Current priority schemes of scalable Audio Coding are mainly based on the energy criterion or the perceptual importance criterion, in which priorities are assigned to high-energy or low-frequency bands, to which auditory sensation is more sensitive. According to the energy and spatial cues in different frequency bands, this paper assigns priorities to the sub-bands in which the location and energy of the sound source change quickly. With zero extra side information and trivial added complexity, experimental results show that the perceptual quality is improved, especially for sounds that attract high attention in Audio surveillance.

Yuhong Yang - One of the best experts on this subject based on the ideXlab platform.

  • 3D Audio Coding approach based on spatial perception features
    China Communications, 2017
    Co-Authors: Cheng Yang, Yuhong Yang, Xiaochen Wang, Maosheng Zhang, Wei Chen
    Abstract:

    A new three-dimensional (3D) Audio Coding approach is presented to improve the spatial perceptual quality of 3D Audio. Unlike other Audio Coding approaches, the distance side information is also quantized, and a non-uniform perceptual quantization based on the spatial perception features of the human auditory system is proposed, named the concentric spheres spatial quantization (CSSQ) method. Comparison results show that the distance perceptual quality of 3D Audio is enhanced by 5.7% to 8.8% by extracting and Coding the distance side information, compared with directional Audio Coding, and that the bit rate of the proposed Coding method is 8.07% lower than that of spatial squeeze surround Audio Coding.

  • ICIS - Joint speech/Audio Coding based scalable perceptual Audio Coding
    2014 IEEE ACIS 13th International Conference on Computer and Information Science (ICIS), 2014
    Co-Authors: Li Gao, Yuhong Yang
    Abstract:

    With the technical evolution of global mobile communications, heterogeneous communication environments, frequently fluctuating bandwidth and multiform signals pose new challenges to the Coding technology of multimedia signals. Scalable Audio Coding (SAC) can provide smooth transitions between different Coding qualities, which is an optimal choice for Coding Audio signals of different types and can produce more reliable and consistent service quality in multimedia communications. A scalable Audio Coding system based on a joint speech/Audio Coding method and a bit-plane based auditory perceptual importance model are proposed here. Both the Audio content and the network bandwidth fluctuation are considered in the system to obtain stable service quality in mobile multimedia services. Experimental results indicate that at the same bit rates the subjective quality of the proposed method is slightly better than that of G.729.1 and the SNR is improved by 0.3 dB.

  • ICASSP - A spatial priority based scalable Audio Coding
    2014 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), 2014
    Co-Authors: Li Gao, Yuhong Yang
    Abstract:

    A spatial priority scheme for scalable Audio Coding is presented in this paper. To improve the Coding quality of important sounds that attract high attention, especially moving sounds, spatial information is introduced to assign the priorities of frequency subbands. Spatial cues and distance features are extracted in frequency subbands to represent sounds with fast-changing direction and distance. Coding priorities are assigned to different frequency subbands according to the energy and spatial information. With trivial added side information and complexity, experimental results show that the perceptual quality in scalable Audio Coding is improved, especially for sounds that attract high attention such as moving sounds.