Decompression Algorithm - Explore the Science & Experts

The experts below are selected from a list of 1134 experts worldwide ranked by the ideXlab platform.

T. Ungerer - One of the best experts on this subject based on the ideXlab platform.

Performance of simultaneous multithreaded multimedia-enhanced processors for MPEG-2 video decompression

Journal of Systems Architecture, 2000

Co-Authors: Hartmut Oehring, U. Sigmund, T. Ungerer

Abstract:

This paper explores microarchitecture models for a simultaneous multithreaded (SMT) processor with multimedia enhancements. We start with a wide-issue superscalar processor and enhance it with the SMT technique, multimedia units, and additional on-chip RAM storage. Our workload is a multithreaded MPEG-2 video decompression algorithm that makes extensive use of the multimedia units. The simulations show that a single-threaded, 8-issue maximum processor (assuming an abundance of resources) reaches an instructions per cycle (IPC) count of only 1.60, while an 8-threaded 8-issue processor reaches an IPC of 6.07. A more realistic processor model reaches an IPC of 1.27 in the single-threaded 8-issue mode, versus 3.03 in the 4-threaded 4-issue mode and 3.21 in the 8-threaded 8-issue mode. Our conclusion for next-generation microprocessors is that a 2- or 4-threaded 4-issue processor with a small on-chip RAM accessed by a local load/store unit will be superior to a wide-issue (single-threaded) superscalar processor, at least for MPEG-2 style video decompression algorithms.
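The IPC figures quoted in this abstract can be put side by side with a little arithmetic. The sketch below is only an illustration: the helper names (`speedup`, `slot_utilization`, `summarize`) are hypothetical, and reading an IPC ratio as a speedup assumes the same clock rate and a comparable instruction count across configurations, which the abstract does not state.

```python
# IPC values quoted in the abstract, keyed by (threads, issue width).
IPC_MAX_MODEL = {(1, 8): 1.60, (8, 8): 6.07}
IPC_REALISTIC = {(1, 8): 1.27, (4, 4): 3.03, (8, 8): 3.21}


def speedup(ipc: float, baseline_ipc: float) -> float:
    """IPC ratio relative to a baseline (assumes equal clock and instruction count)."""
    return ipc / baseline_ipc


def slot_utilization(ipc: float, issue_width: int) -> float:
    """Fraction of the available issue slots that are filled on average."""
    return ipc / issue_width


def summarize(model: dict) -> dict:
    """Map each configuration to (speedup vs. single-threaded 8-issue, slot utilization)."""
    baseline = model[(1, 8)]
    return {
        cfg: (round(speedup(ipc, baseline), 2), round(slot_utilization(ipc, cfg[1]), 2))
        for cfg, ipc in model.items()
    }


if __name__ == "__main__":
    for name, model in (("maximum", IPC_MAX_MODEL), ("realistic", IPC_REALISTIC)):
        for (threads, width), (s, u) in summarize(model).items():
            print(f"{name}: {threads}-threaded {width}-issue: speedup {s}, utilization {u}")
```

Under these assumptions, the realistic 4-threaded 4-issue configuration fills about three quarters of its issue slots, while the single-threaded 8-issue configuration fills well under a quarter, which is in line with the authors' preference for narrower multithreaded designs.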

IEEE PACT - MPEG-2 video decompression on simultaneous multithreaded multimedia processors

1999 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00425), 1999

Co-Authors: Hartmut Oehring, U. Sigmund, T. Ungerer

Abstract:

This paper explores microarchitecture models for a simultaneous multithreaded processor with multimedia enhancements. We enhance a wide-issue superscalar processor by the simultaneous multithreading technique, by multimedia units, and by an additional on-chip RAM storage. Our workload is a multithreaded MPEG-2 video decompression algorithm that extensively uses multimedia units. Our simulation results suggest that a 2- or 4-threaded 4-issue processor with a small on-chip RAM accessed by a local load/store unit will be superior to a wide-issue (single-threaded) superscalar processor.


Koji Nakano - One of the best experts on this subject based on the ideXlab platform.

Throughput-Optimal Hardware Implementation of LZW Decompression on the FPGA

2019 Seventh International Symposium on Computing and Networking Workshops (CANDARW), 2019

Co-Authors: Hiroshi Kagawa, Koji Nakano

Abstract:

The main contribution of this paper is a throughput-optimal FPGA implementation of the LZW decompression algorithm. Since LZW decompression creates a dictionary table by reading the codes of a compressed file one by one, its parallelization is not an easy task. In this work, however, when LZW-compressed codes are supplied continuously, the proposed circuit can decompress them without interruption. At the same time, the circuit starts outputting decompressed byte data several clock cycles after the first input code is provided. We have implemented the proposed circuit on a Xilinx Virtex-7 XC7VX485T-2 and evaluated its performance. The experimental results show that the proposed circuit can decompress an LZW-compressed TIFF image of size 4096x3072 in 42.82 ms.
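To make the sequential dependency mentioned in this abstract concrete, here is a minimal software sketch of textbook LZW decompression in Python. It is not the authors' circuit. It assumes a simplified code stream (a plain list of integer codes, a 256-entry initial dictionary, an unbounded table, no clear or end-of-information codes), so it does not decode the TIFF flavour of LZW. The function names are hypothetical, and `lzw_compress` is included only so the round trip can be checked.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Textbook LZW compressor, included only to produce test input."""
    table = {bytes([i]): i for i in range(256)}
    codes = []
    current = b""
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = len(table)  # next free code
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list[int]) -> bytes:
    """Rebuild the dictionary while reading the codes one by one."""
    if not codes:
        return b""
    table = [bytes([i]) for i in range(256)]
    if codes[0] >= len(table):
        raise ValueError(f"invalid first LZW code: {codes[0]}")
    previous = table[codes[0]]
    out = bytearray(previous)
    for code in codes[1:]:
        if code < len(table):
            entry = table[code]
        elif code == len(table):
            # The code refers to the entry that is about to be created.
            entry = previous + previous[:1]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        out += entry
        # Each new entry depends on the previous string, hence the
        # sequential dependency that makes parallelization hard.
        table.append(previous + entry[:1])
        previous = entry
    return bytes(out)


if __name__ == "__main__":
    sample = b"TOBEORNOTTOBEORTOBEORNOT"
    assert lzw_decompress(lzw_compress(sample)) == sample
```

Each dictionary entry is built from the string decoded in the previous step, so a naive decoder cannot start on code k before finishing code k-1.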

IPDPS Workshops - An Efficient Implementation of LZW Decompression in the FPGA

Algorithms and Architectures for Parallel Processing, 2016

Co-Authors: Xin Zhou, Koji Nakano

Abstract:

The LZW algorithm is one of the best-known dictionary-based compression and decompression algorithms. The main contribution of this paper is to present a hardware LZW decompression algorithm and to implement it in an FPGA. The experimental results show that one proposed module on a Virtex-7 family FPGA XC7VX485T-2 runs up to 2.16 times faster than sequential LZW decompression on a single CPU, where the FPGA clock frequency is 301.02 MHz. Since the proposed module is compactly designed and uses few FPGA resources, we have succeeded in implementing 150 identical modules that work in parallel on the FPGA, where the FPGA clock frequency is 245.4 MHz. In other words, our implementation runs up to 264 times faster than a sequential implementation on a single CPU.
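The step from 2.16 times to 264 times can be reproduced with simple arithmetic. The sketch below assumes that per-module throughput scales linearly with clock frequency and that the 150 modules scale linearly with no shared bottleneck. These are assumptions made here for illustration (the abstract does not spell out the derivation), and `aggregate_speedup` is a hypothetical helper name.

```python
# Figures quoted in the abstract.
SINGLE_MODULE_SPEEDUP = 2.16   # one module vs. sequential CPU
SINGLE_MODULE_MHZ = 301.02     # clock with one module on the FPGA
PARALLEL_MODULES = 150         # identical modules on the FPGA
PARALLEL_MHZ = 245.4           # clock with 150 modules on the FPGA


def aggregate_speedup(per_module: float, f_single: float,
                      modules: int, f_parallel: float) -> float:
    """Scale the single-module speedup by the clock ratio and the module count.

    Assumes linear scaling in both frequency and module count.
    """
    return per_module * (f_parallel / f_single) * modules


if __name__ == "__main__":
    total = aggregate_speedup(SINGLE_MODULE_SPEEDUP, SINGLE_MODULE_MHZ,
                              PARALLEL_MODULES, PARALLEL_MHZ)
    print(f"estimated aggregate speedup: {total:.1f}x")
```

Under these assumptions the estimate comes out at roughly 264, which matches the figure quoted in the abstract.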

An Efficient Implementation of LZW Decompression in the FPGA

2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2016

Co-Authors: Xin Zhou, Koji Nakano

Abstract:

The LZW algorithm is one of the best-known dictionary-based compression and decompression algorithms. The main contribution of this paper is to present a hardware LZW decompression algorithm and to implement it in an FPGA. The experimental results show that one proposed module on a Virtex-7 family FPGA XC7VX485T-2 runs up to 2.16 times faster than sequential LZW decompression on a single CPU, where the FPGA clock frequency is 301.02 MHz. Since the proposed module is compactly designed and uses few FPGA resources, we have succeeded in implementing 150 identical modules that work in parallel on the FPGA, where the FPGA clock frequency is 245.4 MHz. In other words, our implementation runs up to 264 times faster than a sequential implementation on a single CPU.

A Parallel Algorithm for LZW Decompression with GPU Implementation

International Conference on Parallel Processing, 2015

Co-Authors: Shunji Funasaka, Koji Nakano

Abstract:

The main contribution of this paper is to present a parallel algorithm for LZW decompression and to implement it on a CUDA-enabled GPU. Since sequential LZW decompression creates a dictionary table by reading the codes in a compressed file one by one, its parallelization is not an easy task. We first present a parallel LZW decompression algorithm on the CREW-PRAM. We then present an efficient implementation of this parallel algorithm on a GPU. The experimental results show that our parallel LZW decompression on a GeForce GTX 980 runs up to 69.4 times faster than sequential LZW decompression on a single CPU. We also show a scenario in which parallel LZW decompression on a GPU can be used to accelerate big data applications.


Charlie Chung-ping Chen - One of the best experts on this subject based on the ideXlab platform.

Lossless compression algorithm based on dictionary coding for multiple e-beam direct write system

2016 Design Automation & Test in Europe Conference & Exhibition (DATE), 2016

Co-Authors: Yu-hsiang Chiu, Shao-yuan Fang, Charlie Chung-ping Chen

Abstract:

Electron-beam direct-write (EBDW) lithography is an attractive candidate for next-generation lithography in advanced semiconductor processes. The huge data stream bandwidth required for the data delivery path in EBDW systems could seriously degrade throughput, which is one of the major deficiencies keeping EBDW lithography from mass production. This paper proposes a lossless electron-beam layout data compression and decompression algorithm for 5-bit gray-level bitmaps. Compared with the state-of-the-art LineDiff Entropy algorithm, the proposed method improves the compression rate by 18% on average and achieves a more than 7.5 times speedup for decompression.


Hartmut Oehring - One of the best experts on this subject based on the ideXlab platform.

Performance of simultaneous multithreaded multimedia-enhanced processors for MPEG-2 video decompression

Journal of Systems Architecture, 2000

Co-Authors: Hartmut Oehring, U. Sigmund, T. Ungerer

Abstract:

This paper explores microarchitecture models for a simultaneous multithreaded (SMT) processor with multimedia enhancements. We start with a wide-issue superscalar processor and enhance it with the SMT technique, multimedia units, and additional on-chip RAM storage. Our workload is a multithreaded MPEG-2 video decompression algorithm that makes extensive use of the multimedia units. The simulations show that a single-threaded, 8-issue maximum processor (assuming an abundance of resources) reaches an instructions per cycle (IPC) count of only 1.60, while an 8-threaded 8-issue processor reaches an IPC of 6.07. A more realistic processor model reaches an IPC of 1.27 in the single-threaded 8-issue mode, versus 3.03 in the 4-threaded 4-issue mode and 3.21 in the 8-threaded 8-issue mode. Our conclusion for next-generation microprocessors is that a 2- or 4-threaded 4-issue processor with a small on-chip RAM accessed by a local load/store unit will be superior to a wide-issue (single-threaded) superscalar processor, at least for MPEG-2 style video decompression algorithms.

IEEE PACT - MPEG-2 video decompression on simultaneous multithreaded multimedia processors

1999 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00425), 1999

Co-Authors: Hartmut Oehring, U. Sigmund, T. Ungerer

Abstract:

This paper explores microarchitecture models for a simultaneous multithreaded processor with multimedia enhancements. We enhance a wide-issue superscalar processor by the simultaneous multithreading technique, by multimedia units, and by an additional on-chip RAM storage. Our workload is a multithreaded MPEG-2 video decompression algorithm that extensively uses multimedia units. Our simulation results suggest that a 2- or 4-threaded 4-issue processor with a small on-chip RAM accessed by a local load/store unit will be superior to a wide-issue (single-threaded) superscalar processor.


Yu-hsiang Chiu - One of the best experts on this subject based on the ideXlab platform.

Lossless compression algorithm based on dictionary coding for multiple e-beam direct write system

2016 Design Automation & Test in Europe Conference & Exhibition (DATE), 2016

Co-Authors: Yu-hsiang Chiu, Shao-yuan Fang, Charlie Chung-ping Chen

Abstract:

Electron-beam direct-write (EBDW) lithography is an attractive candidate for next-generation lithography in advanced semiconductor processes. The huge data stream bandwidth required for the data delivery path in EBDW systems could seriously degrade throughput, which is one of the major deficiencies keeping EBDW lithography from mass production. This paper proposes a lossless electron-beam layout data compression and decompression algorithm for 5-bit gray-level bitmaps. Compared with the state-of-the-art LineDiff Entropy algorithm, the proposed method improves the compression rate by 18% on average and achieves a more than 7.5 times speedup for decompression.

