Lossy Compression

The Experts below are selected from a list of 11517 Experts worldwide ranked by ideXlab platform

Tracy Camp - One of the best experts on this subject based on the ideXlab platform.

  • Lossy Compression for wireless seismic data acquisition
    IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2016
    Co-Authors: Marc J Rubin, Michael B Wakin, Tracy Camp
    Abstract:

    In this paper, we rigorously compare compressive sampling (CS) to four state-of-the-art, on-mote, Lossy Compression algorithms [$K$-run-length encoding (KRLE), lightweight temporal Compression (LTC), wavelet quantization thresholding and run-length encoding (WQTR), and a low-pass filtered fast Fourier transform (FFT)]. Specifically, we first simulate Lossy Compression on two real-world seismic data sets, and we then evaluate algorithm performance using implementations on real hardware. In terms of Compression ratios, recovered signal error, power consumption, on-mote execution runtime, and classification accuracy of a seismic event detection task (on decompressed signals), results show that CS performs comparably to (and in many cases better than) the other algorithms evaluated. A main benefit to users is that CS, a lightweight and nonadaptive Compression technique, can guarantee a desired level of Compression performance (and thus, radio usage and power consumption) without sacrificing recovered signal quality. Our contribution is a novel and rigorous comparison of five state-of-the-art, on-mote, Lossy Compression algorithms in simulation on real-world data sets and in implementations on hardware.
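
    A minimal sketch of the on-mote half of compressive sampling as compared above: the sensor node only computes a few random projections of each signal block and transmits those, while the expensive sparse recovery happens off-mote. The block length, measurement count, Gaussian measurement matrix, and recovery basis mentioned below are illustrative assumptions, not the configuration used in the paper.

    ```python
    import numpy as np

    def cs_encode(block, num_measurements, seed=0):
        """On-mote step of compressive sampling: project a length-n block onto
        m < n random vectors and transmit only the m measurements.
        (The Gaussian Phi and the sizes below are illustrative assumptions.)"""
        n = block.shape[0]
        rng = np.random.default_rng(seed)                 # seed shared with the base station
        phi = rng.standard_normal((num_measurements, n)) / np.sqrt(num_measurements)
        return phi @ block                                # y = Phi @ x is the transmitted payload

    # Example: compress a 512-sample seismic block to 128 measurements (4x fewer values).
    # Recovery (not shown) runs off-mote by solving a sparse-approximation problem,
    # e.g. l1 minimization or a greedy pursuit in a wavelet or DCT basis.
    x = np.sin(2 * np.pi * 5 * np.linspace(0, 1, 512)) + 0.01 * np.random.randn(512)
    y = cs_encode(x, num_measurements=128)
    print(x.size, "->", y.size)
    ```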

  • A Comparison of On-Mote Lossy Compression Algorithms for Wireless Seismic Data Acquisition
    2014 IEEE International Conference on Distributed Computing in Sensor Systems, 2014
    Co-Authors: Marc J Rubin, Michael B Wakin, Tracy Camp
    Abstract:

    In this article, we rigorously compare compressive sampling (CS) to four state-of-the-art, on-mote, Lossy Compression algorithms (K-run-length encoding (KRLE), lightweight temporal Compression (LTC), wavelet quantization thresholding and run-length encoding (WQTR), and a low-pass filtered fast Fourier transform (FFT)). Specifically, we first simulate Lossy Compression on two real-world seismic data sets, and we then evaluate algorithm performance using implementations on real hardware. In terms of Compression rates, recovered signal error, power consumption, and classification accuracy of a seismic event detection task (on decompressed signals), results show that CS performs comparably to (and in many cases better than) the other algorithms evaluated. The main benefit to users is that CS, a lightweight and non-adaptive Compression technique, can guarantee a desired level of Compression performance (and thus, radio usage and power consumption) without sacrificing recovered signal quality. Our contribution is a novel and rigorous comparison of five state-of-the-art, on-mote, Lossy Compression algorithms in simulation on real-world data sets and in implementations on hardware.
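
    Of the on-mote baselines named above, lightweight temporal Compression (LTC) is the easiest to picture: it replaces a run of samples with a single line segment for as long as every sample stays within a tolerance of that segment. The sketch below is a simplified version of that idea under assumed parameter choices, not the exact implementation evaluated in the paper.

    ```python
    import numpy as np

    def ltc_compress(samples, eps):
        """Simplified lightweight temporal compression: emit (index, value) knots of a
        piecewise-linear curve that stays within +/- eps of every sample.
        A sketch of the general LTC idea only, not the paper's implementation."""
        knots = [(0, float(samples[0]))]
        anchor_i, anchor_v = 0, float(samples[0])
        lo, hi = -np.inf, np.inf                     # admissible slopes ("cone") from the anchor
        for i in range(1, len(samples)):
            dt = i - anchor_i
            new_lo = max(lo, (samples[i] - eps - anchor_v) / dt)
            new_hi = min(hi, (samples[i] + eps - anchor_v) / dt)
            if new_lo > new_hi:                      # cone collapsed: close the segment at i - 1
                slope = (lo + hi) / 2
                anchor_v = anchor_v + slope * (i - 1 - anchor_i)
                anchor_i = i - 1
                knots.append((anchor_i, float(anchor_v)))
                lo = (samples[i] - eps - anchor_v) / (i - anchor_i)
                hi = (samples[i] + eps - anchor_v) / (i - anchor_i)
            else:
                lo, hi = new_lo, new_hi
        slope = 0.0 if not np.isfinite(lo + hi) else (lo + hi) / 2
        knots.append((len(samples) - 1, float(anchor_v + slope * (len(samples) - 1 - anchor_i))))
        return knots                                 # decompress by linear interpolation between knots

    signal = np.cumsum(0.1 * np.random.randn(1000)) # toy random-walk "seismic" trace
    print(len(ltc_compress(signal, eps=0.5)), "knots for", signal.size, "samples")
    ```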

Franck Cappello - One of the best experts on this subject based on the ideXlab platform.

  • Performance Optimization for Relative-Error-Bounded Lossy Compression on Scientific Data
    IEEE Transactions on Parallel and Distributed Systems, 2020
    Co-Authors: Tao Lu, Sheng Di, Xuan Wang, Weizhe Zhang, Haijun Zhang, Franck Cappello
    Abstract:

    Scientific simulations in high-performance computing (HPC) environments generate vast volumes of data, which may cause a severe I/O bottleneck at runtime and a huge burden on storage space for post-analysis. Unlike traditional data reduction schemes such as deduplication or lossless Compression, not only can error-controlled Lossy Compression significantly reduce the data size but it also holds the promise to satisfy user demand on error control. Pointwise relative error bounds (i.e., Compression errors that depend on the data values) are widely used by many scientific applications with Lossy Compression, since the error control adapts automatically to the scale of the values in the dataset. Pointwise relative-error-bounded Compression, however, is complicated and time consuming. In this article, we develop efficient precomputation-based mechanisms based on the SZ Lossy Compression framework. Our mechanisms avoid the costly logarithmic transformation and identify quantization factor values via a fast table lookup, greatly accelerating relative-error-bounded Compression with excellent Compression ratios. In addition, we reduce traversing operations for Huffman decoding, significantly accelerating the deCompression process in SZ. Experiments with eight well-known real-world scientific simulation datasets show that our solution improves the Compression and deCompression rates (i.e., the speed) by about 40% and 80%, respectively, making our Lossy Compression strategy the best-in-class solution in most cases.
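
    For context, the reduction that these precomputation mechanisms accelerate: a pointwise relative error bound r can be enforced by taking logarithms and then applying an absolute error bound, since |ln(x') - ln(x)| <= ln(1 + r) keeps the reconstructed x' within a factor of (1 + r) of x. The sketch below shows that baseline with a plain uniform quantizer standing in for SZ; it is illustrative only and is not the authors' table-lookup mechanism.

    ```python
    import numpy as np

    def rel_bound_compress(values, rel_bound):
        """Enforce a pointwise relative error bound by log-transforming and uniformly
        quantizing. A baseline sketch only (SZ's precomputation replaces the log calls);
        zero handling and the quantizer are illustrative simplifications."""
        step = np.log1p(rel_bound)                              # absolute bound in the log domain
        signs = np.sign(values)
        logs = np.log(np.abs(values))                           # assumes no exact zeros, for brevity
        codes = np.round(logs / (2 * step)).astype(np.int64)    # bin width 2*step => log error <= step
        return signs, codes

    def rel_bound_decompress(signs, codes, rel_bound):
        step = np.log1p(rel_bound)
        return signs * np.exp(codes * 2 * step)

    data = np.random.lognormal(mean=0.0, sigma=3.0, size=10000) # strictly positive test data
    s, c = rel_bound_compress(data, rel_bound=0.01)
    recon = rel_bound_decompress(s, c, rel_bound=0.01)
    print(np.max(np.abs(recon - data) / np.abs(data)))          # stays below 0.01
    ```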

  • FRaZ: A Generic High-Fidelity Fixed-Ratio Lossy Compression Framework for Scientific Floating-point Data
    arXiv: Distributed Parallel and Cluster Computing, 2020
    Co-Authors: Robert Underwood, Sheng Di, Jon Calhoun, Franck Cappello
    Abstract:

    With ever-increasing volumes of scientific floating-point data being produced by high-performance computing applications, significantly reducing scientific floating-point data size is critical, and error-controlled Lossy compressors have been developed for years. None of the existing scientific floating-point Lossy data compressors, however, support effective fixed-ratio Lossy Compression. Yet fixed-ratio Lossy Compression for scientific floating-point data not only compresses to the requested ratio but also respects a user-specified error bound with higher fidelity. In this paper, we present FRaZ: a generic fixed-ratio Lossy Compression framework respecting user-specified error constraints. The contribution is twofold. (1) We develop an efficient iterative approach to accurately determine the appropriate error settings for different Lossy compressors based on target Compression ratios. (2) We perform a thorough performance and accuracy evaluation for our proposed fixed-ratio Compression framework with multiple state-of-the-art error-controlled Lossy compressors, using several real-world scientific floating-point datasets from different domains. Experiments show that FRaZ effectively identifies the optimum error setting in the entire error setting space of any given Lossy compressor. While fixed-ratio Lossy Compression is slower than fixed-error Compression, it provides an important new Lossy Compression technique for users of very large scientific floating-point datasets.
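
    The core idea described above can be sketched independently of any particular compressor: treat the error-bounded compressor as a black box and iteratively adjust its error bound until the achieved ratio matches the target. The stand-in compressor below (uniform quantization followed by zlib) and the simple bisection loop are illustrative assumptions; FRaZ itself wraps real compressors such as SZ and ZFP and uses its own iterative search.

    ```python
    import zlib
    import numpy as np

    def quantize_compress(data, abs_error):
        """Stand-in error-bounded lossy compressor: uniform quantization + zlib.
        (Illustrative only; FRaZ wraps real compressors such as SZ or ZFP.)"""
        codes = np.round(data / (2.0 * abs_error)).astype(np.int32)
        return zlib.compress(codes.tobytes(), level=6)

    def search_error_for_ratio(data, target_ratio, lo=1e-8, hi=None, iters=30):
        """Bisect on the error bound until the achieved compression ratio is close
        to the requested one (a looser error bound gives a higher ratio)."""
        value_range = float(data.max() - data.min())
        hi = hi if hi is not None else value_range            # coarse upper bound
        for _ in range(iters):
            mid = np.sqrt(lo * hi)                            # geometric midpoint: bounds span decades
            ratio = data.nbytes / len(quantize_compress(data, mid))
            if ratio < target_ratio:
                lo = mid                                      # need a looser bound for more compression
            else:
                hi = mid
        return hi

    data = np.sin(np.linspace(0, 60, 200000)).astype(np.float32) + \
           0.001 * np.random.randn(200000).astype(np.float32)
    eb = search_error_for_ratio(data, target_ratio=8.0)
    achieved = data.nbytes / len(quantize_compress(data, eb))
    print(f"error bound {eb:.2e} gives ratio {achieved:.1f}")
    ```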

  • Accelerating Relative-error Bounded Lossy Compression for HPC datasets with Precomputation-Based Mechanisms
    2019 35th Symposium on Mass Storage Systems and Technologies (MSST), 2019
    Co-Authors: Tao Lu, Sheng Di, Xuan Wang, Weizhe Zhang, Franck Cappello
    Abstract:

    Scientific simulations in high-performance computing (HPC) environments are producing vast volumes of data, which may cause a severe I/O bottleneck at runtime and a huge burden on storage space for post-analysis. Unlike traditional data reduction schemes (such as deduplication or lossless Compression), not only can error-controlled Lossy Compression significantly reduce the data size but it also holds the promise to satisfy user demand on error control. Point-wise relative error bounds (i.e., Compression errors that depend on the data values) are widely used by many scientific applications with Lossy Compression, since the error control adapts automatically to the precision in the dataset. Point-wise relative-error-bounded Compression, however, is complicated and time consuming. In this work, we develop efficient precomputation-based mechanisms in the SZ Lossy Compression framework. Our mechanisms avoid the costly logarithmic transformation and identify quantization factor values via a fast table lookup, greatly accelerating relative-error-bounded Compression with excellent Compression ratios. In addition, our mechanisms reduce traversing operations for Huffman decoding, and thus significantly accelerate the deCompression process in SZ. Experiments with four well-known real-world scientific simulation datasets show that our solution improves the Compression rate by about 30% and the deCompression rate by about 70% in most cases, making our Lossy Compression strategy the best-in-class choice.
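
    One way to picture the precomputation idea: the expensive part of relative-error binning is a logarithm per value, but a floating-point number already carries its binary exponent, so ln|x| can be assembled from that exponent plus a small precomputed table over mantissa buckets. The sketch below only illustrates that generic trick; it is not SZ's actual lookup scheme.

    ```python
    import numpy as np

    # Precompute ln(m) at the midpoints of mantissa buckets, m in [0.5, 1).
    BUCKETS = 4096
    _MANTISSA_LN = np.log(0.5 + (np.arange(BUCKETS) + 0.5) / (2 * BUCKETS))
    _LN2 = np.log(2.0)

    def fast_ln_abs(values):
        """Approximate ln|x| without per-value log calls: frexp splits |x| = m * 2**e
        with m in [0.5, 1), then ln|x| = e*ln(2) + table[bucket(m)].
        Illustrative of precomputation-based lookup only, not SZ's mechanism."""
        m, e = np.frexp(np.abs(values))
        idx = ((m - 0.5) * (2 * BUCKETS)).astype(np.int64)
        idx = np.clip(idx, 0, BUCKETS - 1)
        return e * _LN2 + _MANTISSA_LN[idx]

    x = np.random.lognormal(sigma=4.0, size=100000)
    err = np.max(np.abs(fast_ln_abs(x) - np.log(x)))
    print(f"max |approx - ln x| = {err:.2e}")   # bounded by the bucket width, ~1/(2*BUCKETS)
    ```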

  • Fixed-PSNR Lossy Compression for Scientific Data
    2018 IEEE International Conference on Cluster Computing (CLUSTER), 2018
    Co-Authors: Sheng Di, Xin Liang, Zizhong Chen, Franck Cappello
    Abstract:

    Error-controlled Lossy Compression has been studied for years because of the extremely large volumes of data being produced by today's scientific simulations. None of the existing Lossy compressors, however, allow users to fix the peak signal-to-noise ratio (PSNR) during Compression, although PSNR is considered one of the most significant indicators for assessing Compression quality. In this paper, we propose a novel technique providing fixed-PSNR Lossy Compression for scientific data sets. We implement our proposed method based on the SZ Lossy Compression framework and release the code as an open-source toolkit. We evaluate our fixed-PSNR compressor on three real-world high-performance computing data sets. Experiments show that our solution has high accuracy in controlling PSNR, with an average deviation of 0.1 to 5.0 dB on the tested data sets.
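
    A target PSNR can be mapped to an absolute error bound for an error-bounded compressor under a simple model: if compression errors are roughly uniform in [-e, e], the mean squared error is about e^2/3, and PSNR = 20*log10(range) - 10*log10(MSE). The sketch below inverts that relation; the uniform-error assumption and the stand-in quantizer are illustrative, not the exact derivation used in the paper.

    ```python
    import numpy as np

    def error_bound_for_psnr(data, target_psnr_db):
        """Map a target PSNR to an absolute error bound, assuming compression errors
        are roughly uniform in [-e, e] so that MSE ~= e**2 / 3 (illustrative model)."""
        value_range = float(data.max() - data.min())
        return value_range * np.sqrt(3.0) * 10.0 ** (-target_psnr_db / 20.0)

    def psnr(original, reconstructed):
        value_range = float(original.max() - original.min())
        mse = float(np.mean((original - reconstructed) ** 2))
        return 20.0 * np.log10(value_range) - 10.0 * np.log10(mse)

    # Quick check with a uniform quantizer standing in for an error-bounded compressor.
    data = np.random.randn(1_000_000)
    e = error_bound_for_psnr(data, target_psnr_db=60.0)
    recon = np.round(data / (2 * e)) * (2 * e)        # absolute error <= e, roughly uniform
    print(f"target 60.0 dB, achieved {psnr(data, recon):.2f} dB")
    ```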

Tsachy Weissman - One of the best experts on this subject based on the ideXlab platform.

  • Effect of Lossy Compression of quality scores on variant calling
    Briefings in Bioinformatics, 2017
    Co-Authors: Idoia Ochoa, Rachel Goldfeder, Mikel Hernaez, Tsachy Weissman, Euan Ashley
    Abstract:

    Recent advancements in sequencing technology have led to a drastic reduction in genome sequencing costs. This development has generated an unprecedented amount of data that must be stored, processed, and communicated. To facilitate this effort, Compression of genomic files has been proposed. Specifically, Lossy Compression of quality scores is emerging as a natural candidate for reducing the growing costs of storage. A main goal of performing DNA sequencing in population studies and clinical settings is to identify genetic variation. Though the field agrees that smaller files are advantageous, the cost of Lossy Compression, in terms of variant discovery, is unclear. Bioinformatic algorithms to identify SNPs and INDELs use base quality score information; here, we evaluate the effect of Lossy Compression of quality scores on SNP and INDEL detection. Specifically, we investigate how the output of the variant caller when using the original data differs from that obtained when quality scores are replaced by those generated by a Lossy compressor. Using gold standard genomic datasets and simulated data, we analyze how accurate the output of the variant calling is, both for the original data and for the data previously lossily compressed. We show that Lossy Compression can significantly alleviate the storage burden while maintaining variant calling performance comparable to that with the original data. Further, in some cases Lossy Compression can lead to variant calling performance that is superior to that using the original file. We envisage our findings and framework serving as a benchmark in future development and analyses of Lossy genomic data compressors.
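
    The kind of Lossy Compression being evaluated here typically coarsens per-base Phred quality scores, for example by mapping them onto a handful of representative bins before lossless encoding. The sketch below shows such a binning step; the bin edges and representative values are illustrative, not those of any compressor studied in the paper.

    ```python
    import numpy as np

    # Hypothetical coarse bins for Phred quality scores (edges and representatives are
    # illustrative only, not the scheme of any specific tool evaluated in the paper).
    BIN_UPPER = np.array([10, 20, 25, 30, 35, 40])      # bins: <10, 10-19, 20-24, 25-29, 30-34, 35-39, >=40
    BIN_REPS  = np.array([ 5, 15, 22, 27, 32, 37, 41])  # representative score for each of the 7 bins

    def bin_quality_string(qual, phred_offset=33):
        """Replace each quality score with its bin representative; the binned string
        has far fewer distinct symbols, so downstream lossless coding shrinks a lot."""
        scores = np.frombuffer(qual.encode("ascii"), dtype=np.uint8) - phred_offset
        idx = np.digitize(scores, BIN_UPPER, right=False)          # which bin each score falls in
        binned = (BIN_REPS[idx] + phred_offset).astype(np.uint8)
        return binned.tobytes().decode("ascii")

    print(bin_quality_string("IIIIFFF###??"))   # 'I'=Q40, 'F'=Q37, '#'=Q2, '?'=Q30
    ```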

  • effect of Lossy Compression of quality scores on variant calling
    bioRxiv, 2015
    Co-Authors: Idoia Ochoa, Rachel Goldfeder, Mikel Hernaez, Tsachy Weissman, Euan A Ashley
    Abstract:

    Recent advancements in sequencing technology have led to a drastic reduction in the cost of genome sequencing. This development has generated an unprecedented amount of genomic data that must be stored, processed, and communicated. To facilitate this effort, Compression of genomic files has been proposed. Specifically, Lossy Compression of quality scores is emerging as a natural candidate for reducing the growing costs of storage. A main goal of performing DNA sequencing in population studies and clinical settings is to identify genetic variation. Though the field agrees that smaller files are advantageous, the cost of Lossy Compression, in terms of variant discovery, is unclear. Bioinformatic algorithms to identify SNPs and INDELs from next-generation DNA sequencing data use base quality score information; here, we evaluate the effect of Lossy Compression of quality scores on SNP and INDEL detection. We analyze several Lossy compressors introduced recently in the literature. Specifically, we investigate how the output of the variant caller when using the original data (uncompressed) differs from that obtained when quality scores are replaced by those generated by a Lossy compressor. Using gold standard genomic datasets such as the GIAB (Genome In A Bottle) consensus sequence for NA12878 and simulated data, we are able to analyze how accurate the output of the variant calling is, both for the original data and for the data previously lossily compressed. We show that Lossy Compression can significantly alleviate the storage burden while maintaining variant calling performance comparable to that with the uncompressed data. Further, in some cases Lossy Compression can lead to variant calling performance that is superior to that using the uncompressed file. We envisage our findings and framework serving as a benchmark in future development and analyses of Lossy genomic data compressors. The Supplementary Data can be found at http://web.stanford.edu/~iochoa/supplementEffectLossy.zip.
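
    The evaluation described above boils down to comparing two variant call sets, one from the original quality scores and one from the lossily compressed scores, against a gold-standard truth set. A minimal sketch of that comparison (treating variants as plain (chrom, pos, ref, alt) tuples and ignoring VCF parsing, filtering, and genotype details) follows.

    ```python
    def compare_to_truth(calls, truth):
        """Precision/recall/F1 of a set of variant calls against a gold-standard set.
        Variants are plain (chrom, pos, ref, alt) tuples; real benchmarks (e.g. against
        GIAB) also normalize representation and restrict to confident regions."""
        calls, truth = set(calls), set(truth)
        tp = len(calls & truth)
        precision = tp / len(calls) if calls else 0.0
        recall = tp / len(truth) if truth else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        return precision, recall, f1

    # Toy example: calls made from original vs. lossily compressed quality scores.
    truth = {("chr1", 1000, "A", "G"), ("chr1", 2000, "C", "T"), ("chr2", 500, "G", "GA")}
    calls_original = {("chr1", 1000, "A", "G"), ("chr1", 2000, "C", "T"), ("chr2", 900, "T", "C")}
    calls_lossy    = {("chr1", 1000, "A", "G"), ("chr1", 2000, "C", "T"), ("chr2", 500, "G", "GA")}
    print("original:", compare_to_truth(calls_original, truth))
    print("lossy   :", compare_to_truth(calls_lossy, truth))
    ```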

  • Universality of logarithmic loss in Lossy Compression
    2015 IEEE International Symposium on Information Theory (ISIT), 2015
    Co-Authors: Albert No, Tsachy Weissman
    Abstract:

    We establish two strong senses of universality of logarithmic loss as a distortion criterion in Lossy Compression: For any fixed-length Lossy Compression problem under an arbitrary distortion criterion, we show that there is an equivalent Lossy Compression problem under logarithmic loss. In the successive refinement problem, if the first decoder operates under logarithmic loss, we show that any discrete memoryless source is successively refinable under an arbitrary distortion criterion for the second decoder.
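
    For reference, logarithmic loss is the distortion measure in which the reconstruction is a probability distribution over source symbols rather than a single symbol, and the cost is the log of the inverse probability assigned to the true symbol. A small numeric illustration with an assumed toy source follows.

    ```python
    import numpy as np

    def log_loss(x_index, q):
        """Logarithmic loss d(x, q) = log(1 / q(x)): the reconstruction q is a
        probability distribution over the source alphabet, and the distortion is
        the code length it implicitly assigns to the true symbol x."""
        return -np.log2(q[x_index])

    # Toy source over a 3-symbol alphabet and a "soft" reconstruction distribution.
    p = np.array([0.5, 0.3, 0.2])          # true source distribution (assumed)
    q = np.array([0.6, 0.3, 0.1])          # decoder's soft reconstruction (assumed)
    expected = sum(p[i] * log_loss(i, q) for i in range(3))
    entropy = -np.sum(p * np.log2(p))
    print(f"E[d] = {expected:.3f} bits = H(p) {entropy:.3f} + D(p||q) {expected - entropy:.3f}")
    ```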

  • Achievable complexity-performance tradeoffs in Lossy Compression
    Problems of Information Transmission, 2012
    Co-Authors: Ankit Gupta, Sergio Verdu, Tsachy Weissman
    Abstract:

    We present several results related to the complexity-performance tradeoff in Lossy Compression. The first result shows that for a memoryless source with rate-distortion function R(D) and a bounded distortion measure, the rate-distortion point $(R(D) + \gamma, D + \varepsilon)$ can be achieved with constant deCompression time per (separable) symbol and Compression time per symbol proportional to $(\lambda_1/\varepsilon)^{\lambda_2/\gamma^2}$, where $\lambda_1$ and $\lambda_2$ are source-dependent constants. The second result establishes that the same point can be achieved with constant deCompression time and Compression time per symbol proportional to $(\rho_1/\gamma)^{\rho_2/\varepsilon^2}$. These results imply, for any function g(n) that increases without bound arbitrarily slowly, the existence of a sequence of Lossy Compression schemes of blocklength n with O(ng(n)) Compression complexity and O(n) deCompression complexity that achieve the point (R(D), D) asymptotically with increasing blocklength. We also establish that if the reproduction alphabet is finite, then for any given R there exists a universal Lossy Compression scheme with O(ng(n)) Compression complexity and O(n) deCompression complexity that achieves the point (R, D(R)) asymptotically for any stationary ergodic source with distortion-rate function D(·).

Marc J Rubin - One of the best experts on this subject based on the ideXlab platform.

  • Lossy Compression for wireless seismic data acquisition
    IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2016
    Co-Authors: Marc J Rubin, Michael B Wakin, Tracy Camp
    Abstract:

    In this paper, we rigorously compare compressive sampling (CS) to four state-of-the-art, on-mote, Lossy Compression algorithms [$K$-run-length encoding (KRLE), lightweight temporal Compression (LTC), wavelet quantization thresholding and run-length encoding (WQTR), and a low-pass filtered fast Fourier transform (FFT)]. Specifically, we first simulate Lossy Compression on two real-world seismic data sets, and we then evaluate algorithm performance using implementations on real hardware. In terms of Compression ratios, recovered signal error, power consumption, on-mote execution runtime, and classification accuracy of a seismic event detection task (on decompressed signals), results show that CS performs comparably to (and in many cases better than) the other algorithms evaluated. A main benefit to users is that CS, a lightweight and nonadaptive Compression technique, can guarantee a desired level of Compression performance (and thus, radio usage and power consumption) without sacrificing recovered signal quality. Our contribution is a novel and rigorous comparison of five state-of-the-art, on-mote, Lossy Compression algorithms in simulation on real-world data sets and in implementations on hardware.

  • A Comparison of On-Mote Lossy Compression Algorithms for Wireless Seismic Data Acquisition
    2014 IEEE International Conference on Distributed Computing in Sensor Systems, 2014
    Co-Authors: Marc J Rubin, Michael B Wakin, Tracy Camp
    Abstract:

    In this article, we rigorously compare compressive sampling (CS) to four state-of-the-art, on-mote, Lossy Compression algorithms (K-run-length encoding (KRLE), lightweight temporal Compression (LTC), wavelet quantization thresholding and run-length encoding (WQTR), and a low-pass filtered fast Fourier transform (FFT)). Specifically, we first simulate Lossy Compression on two real-world seismic data sets, and we then evaluate algorithm performance using implementations on real hardware. In terms of Compression rates, recovered signal error, power consumption, and classification accuracy of a seismic event detection task (on decompressed signals), results show that CS performs comparably to (and in many cases better than) the other algorithms evaluated. The main benefit to users is that CS, a lightweight and non-adaptive Compression technique, can guarantee a desired level of Compression performance (and thus, radio usage and power consumption) without sacrificing recovered signal quality. Our contribution is a novel and rigorous comparison of five state-of-the-art, on-mote, Lossy Compression algorithms in simulation on real-world data sets and in implementations on hardware.

Jun Muramatsu - One of the best experts on this subject based on the ideXlab platform.