Extended Precision


The experts below are selected from a list of 198 experts worldwide, ranked by the ideXlab platform.

Scott A Sarra - One of the best experts on this subject based on the ideXlab platform.

  • Radial basis function approximation methods with extended precision floating point arithmetic
    Engineering Analysis With Boundary Elements, 2011
    Co-Authors: Scott A Sarra
    Abstract:

    Radial basis function (RBF) methods that employ infinitely differentiable basis functions featuring a shape parameter are theoretically spectrally accurate methods for scattered data interpolation and for solving partial differential equations. It is also theoretically known that RBF methods are most accurate when the linear systems associated with the methods are extremely ill-conditioned. This often prevents the RBF methods from realizing spectral accuracy in applications. In this work we examine how extended precision floating point arithmetic can be used to improve the accuracy of RBF methods in an efficient manner. RBF methods using extended precision are compared to algorithms that evaluate RBF methods by bypassing the solution of the ill-conditioned linear systems.
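
    To make the mechanism concrete, here is a minimal sketch (not Sarra's code) of Gaussian RBF interpolation carried out in extended precision with the mpmath Python library; the node set, shape parameter, and test function are illustrative choices.

        # Minimal sketch: Gaussian RBF interpolation with ~50-digit arithmetic.
        # mpmath is assumed; nodes, shape parameter, and f are illustrative.
        import mpmath as mp

        mp.mp.dps = 50  # ~50 decimal digits vs. double precision's ~16

        def rbf_interpolate(nodes, values, eps, points):
            n = len(nodes)
            # System matrix A[i,j] = phi(|x_i - x_j|), phi(r) = exp(-(eps*r)^2).
            A = mp.matrix(n, n)
            for i in range(n):
                for j in range(n):
                    r = nodes[i] - nodes[j]
                    A[i, j] = mp.exp(-(eps * r) ** 2)
            # A small shape parameter eps makes A severely ill-conditioned; the
            # extra digits keep the direct solve meaningful where double fails.
            c = mp.lu_solve(A, mp.matrix(values))
            # Evaluate s(x) = sum_j c_j * phi(|x - x_j|) at the requested points.
            return [sum(c[j] * mp.exp(-(eps * (x - nodes[j])) ** 2)
                        for j in range(n)) for x in points]

        # Runge function on 20 equispaced nodes in [-1, 1].
        xs = [mp.mpf(-1) + mp.mpf(2) * k / 19 for k in range(20)]
        fs = [1 / (1 + 25 * x ** 2) for x in xs]
        print(rbf_interpolate(xs, fs, mp.mpf('0.5'), [mp.mpf('0.3')]))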

Timon Rabczuk - One of the best experts on this subject based on the ideXlab platform.

  • An efficient boundary collocation scheme for transient thermal analysis in large size ratio functionally graded materials under heat source load
    Computational Mechanics, 2019
    Co-Authors: Timon Rabczuk
    Abstract:

    This paper presents a boundary collocation scheme for transient thermal analysis in large-size-ratio functionally graded materials (FGMs) under heat source load. In the proposed scheme, the Laplace transformation and a numerical inverse Laplace transformation (NILT) are used to avoid the troublesome effect of time stepping on numerical efficiency. The collocation Trefftz method (CTM), coupled with the composite multiple reciprocity method, is used to obtain highly accurate solutions of the nonhomogeneous problems in the Laplace-space domain. Extended precision arithmetic is introduced to overcome the ill-conditioning arising from the CTM simulation, the NILT process, and the large size ratio of the FGM. A heuristic error analysis and a numerical investigation are presented to demonstrate the effectiveness of the proposed scheme for transient thermal analysis. Several benchmark examples of large-size-ratio FGMs with specific spatial variations (quadratic, exponential, and trigonometric functions) are considered. The proposed scheme is validated against known analytical solutions and COMSOL simulations.
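
    As a concrete illustration of the NILT ingredient (not the authors' implementation), mpmath's invertlaplace routine can recover a time-domain function from Laplace space while running at elevated precision; the transform pair below is a textbook example rather than one from the paper.

        # Sketch: extended-precision numerical inverse Laplace transform.
        # Uses mpmath's invertlaplace (Talbot contour); F(s) = 1/(s+1) is a
        # textbook pair with exact inverse f(t) = exp(-t).
        import mpmath as mp

        mp.mp.dps = 40  # extra digits tame the ill-conditioning of NILT

        F = lambda s: 1 / (s + 1)
        for t in [mp.mpf('0.5'), mp.mpf(1), mp.mpf(2)]:
            approx = mp.invertlaplace(F, t, method='talbot')
            print(t, approx, abs(approx - mp.exp(-t)))  # tiny residual error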

Valentina Popescu - One of the best experts on this subject based on the ideXlab platform.

  • Implementation and performance evaluation of an extended precision floating-point arithmetic library for high-accuracy semidefinite programming
    Symposium on Computer Arithmetic, 2017
    Co-Authors: Mioara Joldeş, Jean-michel Muller, Valentina Popescu
    Abstract:

    Semidefinite programming (SDP) is widely used in optimization problems with many applications; however, certain SDP instances are ill-posed and need more precision than the standard double precision available in hardware. Moreover, these problems are large-scale and can benefit from parallelization on specialized architectures such as GPUs. In this article, we implement and evaluate the performance of a floating-point expansion-based arithmetic library (CAMPARY) in the context of such numerically highly accurate SDP solvers. We plugged CAMPARY into the state-of-the-art SDPA solver, for both CPU and GPU-tuned implementations. We compare both the numerical accuracy and the performance of SDPA-GMP, SDPA-QD, and SDPA-DD, which employ other multiple-precision arithmetic libraries, against SDPA-CAMPARY. We show that CAMPARY offers a very good trade-off between accuracy and speed when solving ill-conditioned SDP problems.

  • A new multiplication algorithm for extended precision using floating-point expansions
    2016
    Co-Authors: Jean-michel Muller, Valentina Popescu, Ping Tak Peter Tang
    Abstract:

    Some important computational problems must use a floating-point (FP) precision several times higher than the one available in hardware. Such computations critically rely on software libraries for high-precision FP arithmetic. The representation chosen for a high-precision data type crucially influences the corresponding arithmetic algorithms. Recent work showed that algorithms for FP expansions, that is, a representation based on an unevaluated sum of standard FP values, benefit from the various kinds of high-performance support for native FP arithmetic: low latency, high throughput, vectorization, threading, and so on. Bailey's QD library and its corresponding Graphics Processing Unit (GPU) version, GQD, are such examples. Although they use native FP arithmetic for their key operations, the QD and GQD algorithms are tailored to double-double or quad-double representations and do not generalize efficiently or naturally to a flexible number of components in the FP expansion. In this paper, we introduce a new multiplication algorithm for FP expansions with flexible precision, designed with up to the order of tens of FP components in mind. Its main feature is that the partial products are accumulated in a specially designed data structure that has the regularity of a fixed-point representation while allowing the computation to be carried out naturally using native FP types. This lets us easily avoid unnecessary computation and present a transparent, rigorous accuracy analysis. The algorithm, its correctness and accuracy proofs, and performance comparisons with existing libraries are all contributions of this paper.

    [A simplified code illustration of the error-free building blocks behind such expansion products appears after the last entry in this section.]

  • Arithmetic algorithms for extended precision using floating-point expansions
    IEEE Transactions on Computers, 2016
    Co-Authors: Mioara Joldeş, Olivier Marty, Jean-michel Muller, Valentina Popescu
    Abstract:

    Many numerical problems require a higher computing precision than the one offered by standard floating-point (FP) formats. One common way of extending the precision is to represent numbers in a multiple-component format. With so-called floating-point expansions, real numbers are represented as the unevaluated sum of standard machine-precision FP numbers. This representation offers the simplicity of using directly available, hardware-implemented, and highly optimized FP operations. It is used by multiple-precision libraries such as Bailey's QD or its analogous Graphics Processing Unit (GPU) tuned version, GQD. In this article we briefly revisit algorithms for adding and multiplying FP expansions, then introduce and prove new algorithms for normalizing, dividing, and computing square roots of FP expansions. The new method for computing the reciprocal ${a}^{-1}$ and the square root $\sqrt{a}$ of an FP expansion $a$ is based on an adapted Newton-Raphson iteration in which the intermediate calculations are done using “truncated” operations (additions, multiplications) involving FP expansions. We give a thorough error analysis showing that this allows very accurate computations. More precisely, after $q$ iterations, the computed FP expansion $x=x_0+\ldots +x_{2^q-1}$ satisfies, for the reciprocal algorithm, the relative error bound $\left|({x-a^{-1}})/{a^{-1}}\right| \le 2^{-2^q(p-3)-1}$ and, for the square root, $\left|x-{1}/{\sqrt{a}}\right| \le {2^{-2^q(p-3)-1}}/{\sqrt{a}}$, where $p> 2$ is the precision of the FP representation used ($p=24$ for single precision, $p=53$ for double precision).
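
    To ground the expansion idea, the sketch below shows the two error-free transformations that such multiplication algorithms build on, plus a simplified double-double product. This is the classic QD-style recipe, not the flexible n-term algorithm of the multiplication paper above, and all function names are illustrative.

        # Error-free building blocks of FP-expansion arithmetic (sketch).
        import math

        def two_sum(a, b):
            # Knuth's error-free addition: a + b == s + e exactly.
            s = a + b
            bv = s - a
            return s, (a - (s - bv)) + (b - bv)

        def split(a):
            # Dekker's splitting of a binary64 value into two 26-bit halves.
            c = 134217729.0 * a  # 2**27 + 1
            hi = c - (c - a)
            return hi, a - hi

        def two_prod(a, b):
            # Error-free multiplication: a * b == p + e exactly.
            p = a * b
            ahi, alo = split(a)
            bhi, blo = split(b)
            return p, ((ahi * bhi - p) + ahi * blo + alo * bhi) + alo * blo

        def dd_mul(xhi, xlo, yhi, ylo):
            # Simplified double-double product: exact high product, cross
            # terms folded into its error term, then one renormalization.
            p, e = two_prod(xhi, yhi)
            return two_sum(p, e + xhi * ylo + xlo * yhi)

        # Square the double-double representation of sqrt(2).
        r_hi = math.sqrt(2.0)
        p, e = two_prod(r_hi, r_hi)
        r_lo = ((2.0 - p) - e) / (2.0 * r_hi)  # low word of sqrt(2)
        print(dd_mul(r_hi, r_lo, r_hi, r_lo))  # ~(2.0, tiny residual)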
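
    The reciprocal algorithm analyzed in the IEEE Transactions paper adapts the classical Newton-Raphson iteration $x_{k+1} = x_k(2 - a x_k)$, whose relative error roughly squares at each step. Below is a hedged illustration using mpmath scalars in place of the paper's truncated expansion operations; the input value is arbitrary.

        # Sketch of the Newton-Raphson reciprocal iteration; mpmath scalars
        # stand in for FP expansions, so this shows the convergence behavior,
        # not the paper's expansion-level algorithm.
        import mpmath as mp

        mp.mp.dps = 80                # enough digits to watch convergence
        a = mp.mpf(7) / 3             # arbitrary positive input
        x = mp.mpf(1) / float(a)      # seed: a double-precision-quality guess
        for q in range(1, 6):
            x = x * (2 - a * x)       # Newton step for f(x) = 1/x - a
            rel_err = abs((x - 1 / a) * a)
            print(f"after {q} steps: relative error ~ {mp.nstr(rel_err, 3)}")
        # The error roughly squares per step, mirroring the 2^(-2^q(p-3)-1)
        # shape of the paper's bound (constants here are illustrative).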

Andrew Thall - One of the best experts on this subject based on the ideXlab platform.

  • Extended-precision floating-point numbers for GPU computation
    International Conference on Computer Graphics and Interactive Techniques, 2006
    Co-Authors: Andrew Thall
    Abstract:

    Double-float (df64) and quad-float (qf128) numeric types can be implemented on current GPU hardware and used efficiently and effectively for extended-precision computational arithmetic. Using unevaluated sums of paired or quadrupled f32 single-precision values, these numeric types provide approximately 48 and 96 bits of mantissa, respectively, at single-precision exponent ranges, for computer graphics, numerical, and general-purpose GPU programming. This paper surveys the current state of the art, presents algorithms and Cg implementations for arithmetic, exponential, and trigonometric functions, and presents data on numerical accuracy on several different GPUs. It concludes with an in-depth discussion of the application of extended precision primitives to performing fast Fourier transforms on the GPU for real and complex data. [Addendum (July 2009): the presence of IEEE-compliant double-precision hardware in modern GPUs from NVidia and other manufacturers has reduced the need for these techniques. The double-precision capabilities can be accessed using CUDA or other GPGPU software, but are not (as of this writing) exposed in the graphics pipeline for use in Cg-based shader code. Shader writers, or those still using a graphics API for their numerical computing, may still find the methods described herein to be of interest.]
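
    The unevaluated-sum construction is easy to prototype off the GPU. Below is a hedged sketch of df64 addition using NumPy float32 scalars in place of shader registers; it follows the common double-float recipe rather than Thall's Cg code, and the single-pass renormalization is a simplification.

        # Sketch: df64 = unevaluated sum of two float32 values (~48-bit
        # significand). NumPy float32 scalars emulate GPU f32 arithmetic.
        import numpy as np

        f32 = np.float32

        def two_sum(a, b):
            # Knuth's error-free float32 addition: a + b == s + e exactly.
            s = a + b
            bv = s - a
            return s, (a - (s - bv)) + (b - bv)

        def df64_add(ahi, alo, bhi, blo):
            # Simplified double-float addition with one renormalization pass.
            s, e = two_sum(ahi, bhi)
            return two_sum(s, e + alo + blo)

        # Represent pi as a hi/lo float32 pair and compute pi + pi.
        pi_hi = f32(np.pi)
        pi_lo = f32(np.pi - float(pi_hi))
        hi, lo = df64_add(pi_hi, pi_lo, pi_hi, pi_lo)
        print(float(hi) + float(lo) - 2 * np.pi)  # far below f32's 2**-24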

Konstantin Isupov - One of the best experts on this subject based on the ideXlab platform.

  • Execution time of multiple-precision BLAS Level 1 operations on CPU and GPU platforms
    2019
    Co-Authors: Isupov, Konstantin (Mendeley Data)
    Abstract:

    This dataset contains the execution time of four BLAS Level 1 operations (ASUM, DOT, SCAL, and AXPY) implemented using multiple-precision software for central processing units (CPUs) and CUDA-compatible graphics processing units (GPUs). Each log file provided contains the results of three test runs, along with details about the experimental setup. For each test run, the BLAS function was repeated ten times, and the total execution time of the ten iterations (in milliseconds) was measured. For comparison purposes, the execution time of double precision routines from OpenBLAS and cuBLAS is also presented. The following multiple-precision packages are considered:

    1. For CPUs:
       • MPFR: A C library for multiple-precision floating-point computations with correct rounding (https://www.mpfr.org)
       • ARPREC: An arbitrary precision package for Fortran and C++ (https://www.davidhbailey.com/dhbsoftware)
       • MPDECIMAL: A package for correctly-rounded arbitrary precision decimal floating point (https://www.bytereef.org/mpdecimal)
       • MPACK: Multiple-precision versions of BLAS and LAPACK (http://mplapack.sourceforge.net)
       • XBLAS: A reference implementation of extended and mixed precision BLAS routines (https://www.netlib.org/xblas)

    2. For GPUs:
       • GARPREC: A port of the ARPREC package for CUDA-enabled GPUs (https://code.google.com/archive/p/gpuprec/downloads)
       • CAMPARY: A multiple-precision library that uses floating-point expansions to represent extended precision numbers (http://homepages.laas.fr/mmjoldes/campary)
       • CUMP: A library for arbitrary precision arithmetic on CUDA, based on the GNU MP Bignum Library (https://github.com/skystar0227/CUMP)
       • MPRES-BLAS: Multiple-precision GPU-accelerated BLAS functions based on the residue number system (https://github.com/kisupov/mpres-blas)

    The complete source code for the benchmarks can be found at https://github.com/kisupov/mpres-blas.
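
    For a sense of what such a measurement looks like, here is a minimal sketch (not the dataset's benchmark code) that times ten repetitions of a multiple-precision DOT with mpmath; the vector length and digit count are illustrative.

        # Sketch: timing a multiple-precision DOT over ten iterations, as in
        # the dataset's logs. mpmath stands in for MPFR, CAMPARY, MPRES-BLAS,
        # and the other packages listed above.
        import time
        import mpmath as mp

        mp.mp.dps = 34  # roughly quad-precision-equivalent digits
        n = 10_000
        x = [mp.mpf(i) / n for i in range(n)]
        y = [mp.mpf(n - i) / n for i in range(n)]

        start = time.perf_counter()
        for _ in range(10):
            dot = mp.fsum(xi * yi for xi, yi in zip(x, y))
        elapsed_ms = (time.perf_counter() - start) * 1000
        print(f"DOT x10: {elapsed_ms:.1f} ms, result = {mp.nstr(dot, 10)}")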