Decompression Algorithm

The Experts below are selected from a list of 1134 Experts worldwide ranked by ideXlab platform

T. Ungerer - One of the best experts on this subject based on the ideXlab platform.

  • Performance of simultaneous multithreaded multimedia-enhanced processors for MPEG-2 video decompression
    Journal of Systems Architecture, 2000
    Co-Authors: Hartmut Oehring, U. Sigmund, T. Ungerer
    Abstract:

    This paper explores microarchitecture models for a simultaneous multithreaded (SMT) processor with multimedia enhancements. We start with a wide-issue superscalar processor and enhance it with the SMT technique, multimedia units, and additional on-chip RAM storage. Our workload is a multithreaded MPEG-2 video decompression algorithm that makes extensive use of the multimedia units. The simulations show that a single-threaded, 8-issue maximum processor (assuming an abundance of resources) reaches an instructions per cycle (IPC) count of only 1.60, while an 8-threaded 8-issue processor reaches an IPC of 6.07. A more realistic processor model reaches an IPC of 1.27 in the single-threaded 8-issue mode, versus 3.03 in the 4-threaded 4-issue mode and 3.21 in the 8-threaded 8-issue mode. Our conclusion for next-generation microprocessors is that a 2- or 4-threaded 4-issue processor with a small on-chip RAM accessed by a local load/store unit will be superior to a wide-issue (single-threaded) superscalar processor, at least for MPEG-2 style video decompression algorithms.

  • MPEG-2 video decompression on simultaneous multithreaded multimedia processors
    1999 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00425), 1999
    Co-Authors: Hartmut Oehring, U. Sigmund, T. Ungerer
    Abstract:

    This paper explores microarchitecture models for a simultaneous multithreaded processor with multimedia enhancements. We enhance a wide-issue superscalar processor with the simultaneous multithreading technique, multimedia units, and additional on-chip RAM storage. Our workload is a multithreaded MPEG-2 video decompression algorithm that makes extensive use of the multimedia units. Our simulation results suggest that a 2- or 4-threaded 4-issue processor with a small on-chip RAM accessed by a local load/store unit will be superior to a wide-issue (single-threaded) superscalar processor.
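
The IPC figures quoted in the Journal of Systems Architecture abstract above can be turned into relative gains over the single-threaded 8-issue baseline. The sketch below does only that arithmetic. Reading an IPC ratio as a speedup assumes the same clock rate and instruction count across configurations, which the abstract does not state; the dictionary layout and the function name `relative_ipc` are illustrative, not taken from the paper.

```python
# IPC values as quoted in the abstract; the grouping into "idealized"
# (abundant resources) and "realistic" models is our own labeling.
IPC = {
    "idealized": {"1-thread 8-issue": 1.60, "8-thread 8-issue": 6.07},
    "realistic": {
        "1-thread 8-issue": 1.27,
        "4-thread 4-issue": 3.03,
        "8-thread 8-issue": 3.21,
    },
}


def relative_ipc(model):
    """Return each configuration's IPC divided by the single-threaded baseline."""
    baseline = IPC[model]["1-thread 8-issue"]
    return {config: value / baseline for config, value in IPC[model].items()}


if __name__ == "__main__":
    for model in IPC:
        for config, ratio in relative_ipc(model).items():
            print(f"{model:9s} {config:17s} {ratio:.2f}x")
```

Under these assumptions the realistic 4-threaded 4-issue configuration delivers roughly 2.4 times the baseline IPC, and doubling both thread count and issue width adds comparatively little on top of that, which is consistent with the conclusion drawn in the abstract.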

Koji Nakano - One of the best experts on this subject based on the ideXlab platform.

  • Throughput-Optimal Hardware Implementation of LZW Decompression on the FPGA
    2019 Seventh International Symposium on Computing and Networking Workshops (CANDARW), 2019
    Co-Authors: Hiroshi Kagawa, Koji Nakano
    Abstract:

    The main contribution of this paper is to present a throughput-optimal FPGA implementation of the LZW decompression algorithm. Since LZW decompression creates a dictionary table by reading the codes of a compressed file one by one, its parallelization is not an easy task. In this work, however, given a continuous stream of LZW-compressed codes, the proposed circuit can decompress them without interruption. At the same time, the circuit starts outputting the decompressed byte data within several clock cycles after the first input code is provided. We implemented the proposed circuit on a Xilinx Virtex-7 XC7VX485T-2 and evaluated its performance. The experimental results show that the proposed circuit can decompress an LZW-compressed TIFF image of size 4096 × 3072 in 42.82 ms.

  • An Efficient Implementation of LZW Decompression in the FPGA
    2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2016
    Co-Authors: Xin Zhou, Koji Nakano
    Abstract:

    The LZW algorithm is one of the best-known dictionary-based compression and decompression algorithms. The main contribution of this paper is to present a hardware LZW decompression algorithm and to implement it in an FPGA. The experimental results show that one proposed module on a Virtex-7 family FPGA XC7VX485T-2 runs up to 2.16 times faster than sequential LZW decompression on a single CPU, at an FPGA clock frequency of 301.02 MHz. Since the proposed module is compactly designed and uses few FPGA resources, we succeeded in implementing 150 identical modules that work in parallel on the FPGA at a clock frequency of 245.4 MHz. In other words, our implementation runs up to 264 times faster than a sequential implementation on a single CPU.

  • A Parallel Algorithm for LZW Decompression with GPU Implementation
    International Conference on Parallel Processing, 2015
    Co-Authors: Shunji Funasaka, Koji Nakano
    Abstract:

    The main contribution of this paper is to present a parallel algorithm for LZW decompression and to implement it on a CUDA-enabled GPU. Since sequential LZW decompression creates a dictionary table by reading the codes in a compressed file one by one, its parallelization is not an easy task. We first present a parallel LZW decompression algorithm on the CREW-PRAM. We then present an efficient implementation of this parallel algorithm on a GPU. The experimental results show that our parallel LZW decompression on a GeForce GTX 980 runs up to 69.4 times faster than sequential LZW decompression on a single CPU. We also show a scenario in which parallel LZW decompression on a GPU can be used to accelerate big data applications.
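
The abstracts above note that sequential LZW decompression builds its dictionary table while reading codes one by one, which is what makes it hard to parallelize. A minimal textbook-style sketch of that sequential loop is given below. It is not the FPGA or GPU design from these papers, and it assumes the codes are already unpacked into a list of integers; real formats such as TIFF add clear and end-of-information codes and variable-width bit packing, all of which are omitted here. The function name `lzw_decompress` is illustrative.

```python
def lzw_decompress(codes, alphabet_size=256):
    """Decode a list of integer LZW codes into bytes (sequential sketch)."""
    # The table starts with one entry per single-byte string.
    table = [bytes([i]) for i in range(alphabet_size)]
    if not codes:
        return b""

    if codes[0] >= alphabet_size:
        raise ValueError("first code must be a single-byte code")
    prev = table[codes[0]]
    out = bytearray(prev)

    for code in codes[1:]:
        if code < len(table):
            entry = table[code]
        elif code == len(table):
            # The code refers to the entry that is about to be added:
            # it must be the previous string plus its own first byte.
            entry = prev + prev[:1]
        else:
            raise ValueError(f"invalid LZW code {code}")

        out += entry
        # Each step adds exactly one entry: previous string + first byte
        # of the current string. This dependency on the previous step is
        # the reason the loop is inherently sequential.
        table.append(prev + entry[:1])
        prev = entry

    return bytes(out)


if __name__ == "__main__":
    # 97 = "a", 98 = "b", 256 = "ab", 258 = "aba" (defined while decoding)
    print(lzw_decompress([97, 98, 256, 258, 98]))
```

The `code == len(table)` branch is the one case where a code arrives before its table entry exists, so any decoder, sequential or parallel, has to handle it explicitly.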

Charlie Chung-ping Chen - One of the best experts on this subject based on the ideXlab platform.

  • Lossless compression algorithm based on dictionary coding for multiple e-beam direct write system
    2016 Design Automation & Test in Europe Conference & Exhibition (DATE), 2016
    Co-Authors: Yu-hsiang Chiu, Shao-yuan Fang, Charlie Chung-ping Chen
    Abstract:

    Electron-beam direct-write (EBDW) lithography is an attractive candidate for next-generation lithography in advanced semiconductor processes. The huge bandwidth required by the data delivery path in EBDW systems could seriously degrade throughput, which is one of the major deficiencies keeping EBDW lithography from mass production. This paper proposes a lossless electron-beam layout data compression and decompression algorithm for 5-bit gray-level bitmaps. Compared with the state-of-the-art LineDiff Entropy algorithm, the proposed method improves the compression rate by 18% on average and achieves a more than 7.5-fold speedup in decompression.
