OpenCL Standard

The experts below are selected from a list of 255 experts worldwide ranked by the ideXlab platform.

Paul Chow - One of the best experts on this subject based on the ideXlab platform.

  • IWOCL - Enabling FPGAs as a True Device in the OpenCL Standard: Bridging the Gap for FPGAs
    Proceedings of the 5th International Workshop on OpenCL - IWOCL 2017, 2017
    Co-Authors: Vincent Mirian, Paul Chow
    Abstract:

    In our work developing an OpenCL platform for FPGAs, we observed that the way OpenCL is currently used on FPGAs does not expose the full capability of FPGAs to the programmer. In particular, FPGAs are spatial devices that can be partitioned by area, with each partition programmed with a different function. The latest FPGAs can even be reconfigured dynamically, such that one partition of the FPGA can be configured while the rest of the FPGA is still in use. The analogy with GPUs is that an OpenCL programmer can partition a GPU into multiple device objects, execute different kernels on each device object, and reprogram the device objects. An OpenCL programmer cannot do this with an FPGA even though the capability exists. As FPGA capacities continue to increase, the ability to partition and partially reconfigure the FPGA will become even more desirable. The fundamental issue is how FPGAs are currently viewed as devices in the OpenCL model. In this paper, we propose a small change to the OpenCL definition of a device that unlocks the full potential of FPGAs to the programmer.

  • ReConFig - UT-OCL: an OpenCL framework for embedded systems using Xilinx FPGAs
    2015 International Conference on ReConFigurable Computing and FPGAs (ReConFig), 2015
    Co-Authors: Vincent Mirian, Paul Chow
    Abstract:

    FPGA vendors now include hardened IPs to form a system-on-chip (SoC), making it easier to build embedded systems. However, programming and integrating hardware accelerators (devices) into these systems present a challenge. The OpenCL standard has become accepted as a good programming model for managing devices, or hardware accelerators in the context of embedded systems on FPGAs, due to its rich set of constructs. OpenCL has also caught the attention of FPGA vendors for use in high-level synthesis (HLS). While commercial OpenCL frameworks are now emerging, there is a need for an open-source OpenCL framework that facilitates the exploration of the overall system architecture and software, as well as the implementation and architectures of the task-level parallel devices. This would enable exploration of concepts that can improve current architectures as well as allow the study of features that are not within the current standard. This paper presents UT-OCL, an OpenCL framework for embedded systems using FPGAs. The framework is composed of a hardware system and its necessary software counterparts, which together form an embedded Linux system augmented to run OpenCL applications within a single FPGA. This paper describes the challenges with implementing an OpenCL framework for embedded systems on FPGAs, and presents an OpenCL implementation that is compliant with OpenCL 2.0. This framework is intended for use as a platform to explore architectures for hosting OpenCL applications and implementations of OpenCL features, and to study potential new features for OpenCL. Although the current trend is to use OpenCL in high-level synthesis targeting FPGAs, it is not the focus of this paper.

  • FPL - Using an OpenCL framework to evaluate interconnect implementations on FPGAs
    2014 24th International Conference on Field Programmable Logic and Applications (FPL), 2014
    Co-Authors: Vincent Mirian, Paul Chow
    Abstract:

    Field Programmable Gate Arrays (FPGAs) are an ideal platform for building systems with custom hardware accelerators; however, managing these systems is still a major challenge. The OpenCL standard has become accepted as a good programming model for managing heterogeneous platforms due to its rich constructs. Although commercial OpenCL frameworks are now emerging, there is a need for an open-source OpenCL framework that facilitates the exploration of the overall system architecture and software, as well as the implementation and architectures of the custom hardware accelerators (devices). In this paper, we use an OpenCL framework to compare interconnect implementations for a simple multiprocessor accelerator.

Sergei Gorlatch - One of the best experts on this subject based on the ideXlab platform.

  • dOpenCL: towards a uniform programming approach for distributed heterogeneous multi-/many-core systems
    2016
    Co-Authors: Michel Steuwer, Sergei Gorlatch
    DOI: 10.1109/IPDPSW.2012.16
    Abstract:

    Modern computer systems are becoming increasingly heterogeneous by comprising multi-core CPUs, GPUs, and other accelerators. Current programming approaches for such systems usually require the application developer to use a combination of several programming models (e.g., MPI with OpenCL or CUDA) in order to exploit the full compute capability of a system. In this paper, we present dOpenCL (Distributed OpenCL), a uniform approach to programming distributed heterogeneous systems with accelerators. dOpenCL extends the OpenCL standard, such that arbitrary computing devices installed on any node of a distributed system can be used together within a single application. dOpenCL allows moving data and program code to these devices in a transparent, portable manner. Since dOpenCL is designed as a fully-fledged implementation of the OpenCL API, it allows running existing OpenCL applications in a heterogeneous distributed environment without any modifications. We describe in detail the mechanisms that are required to implement OpenCL for distributed systems, including a device management mechanism for running multiple applications concurrently. Using three application studies, we compare the performance of dOpenCL with MPI+OpenCL and a standard OpenCL implementation.

  • Ershov Memorial Conference - Towards high-level programming for systems with many cores
    Lecture Notes in Computer Science, 2015
    Co-Authors: Sergei Gorlatch, Michel Steuwer
    Abstract:

    Application development for modern high-performance systems with many cores, i.e., comprising multiple Graphics Processing Units (GPUs) and multi-core CPUs, currently exploits low-level programming approaches like CUDA and OpenCL, which leads to complex, lengthy and error-prone programs. In this paper, we advocate a high-level programming approach for such systems, which relies on the following two main principles: (a) the model is based on the current OpenCL standard, such that programs remain portable across various many-core systems, independently of the vendor, and all low-level code optimizations can be applied; (b) the model extends OpenCL with three high-level features which simplify many-core programming and are automatically translated by the system into OpenCL code. The high-level features of our programming model are as follows: (1) memory management is simplified and automated using parallel container data types (vectors and matrices); (2) a data (re)distribution mechanism supports data partitioning and generates automatic data movements between multiple GPUs; (3) computations are precisely and concisely expressed using parallel algorithmic patterns (skeletons). The well-defined skeletons allow for semantics-preserving transformations of SkelCL programs which can be applied in the process of program development, as well as in the compilation and optimization phase. We demonstrate how our programming model and its implementation are used to express several parallel applications, and we report first experimental results on evaluating our approach in terms of program size and target performance.

  • High-Level Programming of Stencil Computations on Multi-GPU Systems Using the SkelCL Library
    Parallel Processing Letters, 2014
    Co-Authors: Michel Steuwer, Stefan Breuer, Michael Haidl, Sergei Gorlatch
    Abstract:

    The implementation of stencil computations on modern, massively parallel systems with GPUs and other accelerators currently relies on manually-tuned coding using low-level approaches like OpenCL and CUDA. This makes development of stencil applications a complex, time-consuming, and error-prone task. We describe how stencil computations can be programmed in our SkelCL approach that combines high-level programming abstractions with competitive performance on multi-GPU systems. SkelCL extends the OpenCL standard by three high-level features: 1) pre-implemented parallel patterns (a.k.a. skeletons); 2) container data types for vectors and matrices; 3) an automatic data (re)distribution mechanism. We introduce two new SkelCL skeletons which specifically target stencil computations, MapOverlap and Stencil, and we describe their use for particular application examples, discuss their efficient parallel implementation, and report experimental results on systems with multiple GPUs. Our evaluation of three real-world applications shows that stencil code written with SkelCL is considerably shorter and offers performance competitive with hand-tuned OpenCL code.
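
    The idea behind a map-with-overlap skeleton can be sketched in a few lines of plain Python. This is only an illustration of the pattern, not SkelCL's API: the function name `map_overlap`, its parameters, and the choice to pad out-of-range neighbours with a fixed value are all assumptions made for this sketch, and the loop runs sequentially rather than on a GPU.

```python
from typing import Callable, List, Sequence


def map_overlap(f: Callable[[Sequence[float]], float],
                data: Sequence[float],
                radius: int,
                pad: float = 0.0) -> List[float]:
    """Apply f to each element together with `radius` neighbours per side.

    Neighbours that fall outside the input are replaced by `pad`
    (an assumption of this sketch, other boundary rules are possible).
    """
    n = len(data)
    result = []
    for i in range(n):
        # Build the window centred on position i, padding at the borders.
        window = [data[j] if 0 <= j < n else pad
                  for j in range(i - radius, i + radius + 1)]
        result.append(f(window))
    return result


def three_point_sum(window: Sequence[float]) -> float:
    """Hypothetical user function: sum of left neighbour, centre, right neighbour."""
    return window[0] + window[1] + window[2]


smoothed = map_overlap(three_point_sum, [1.0, 2.0, 3.0, 4.0], radius=1)
print(smoothed)
```

    The user supplies only the per-window function; the traversal and boundary handling live in the skeleton, which is the separation that lets a library parallelize the pattern.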

  • SkelCL: a high-level extension of OpenCL for multi-GPU systems
    The Journal of Supercomputing, 2014
    Co-Authors: Michel Steuwer, Sergei Gorlatch
    Abstract:

    Application development for modern high-performance systems with graphics processing units (GPUs) currently relies on low-level programming approaches like CUDA and OpenCL, which leads to complex, lengthy and error-prone programs. We present SkelCL, a high-level programming approach for systems with multiple GPUs, and its implementation as a library on top of OpenCL. SkelCL makes three main enhancements to the OpenCL standard: (1) memory management is simplified using parallel container data types (vectors and matrices); (2) an automatic data (re)distribution mechanism allows for implicit data movements between GPUs and ensures scalability when using multiple GPUs; (3) computations are conveniently expressed using parallel algorithmic patterns (skeletons). We demonstrate how SkelCL is used to implement parallel applications, and we report experimental evaluation of our approach in terms of programming effort and performance.
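
    As a rough illustration of what a block data distribution across several devices involves, the following Python sketch splits a vector into contiguous chunks, processes each chunk independently, and combines the partial results. It does not use SkelCL or OpenCL; the names `block_distribute` and `sum_of_squares`, and the simulation of devices as list chunks, are assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def block_distribute(data: Sequence[int], num_devices: int) -> List[List[int]]:
    """Split data into num_devices contiguous chunks whose sizes differ by at most one."""
    if num_devices <= 0:
        raise ValueError("num_devices must be positive")
    base, extra = divmod(len(data), num_devices)
    chunks: List[List[int]] = []
    start = 0
    for device in range(num_devices):
        # The first `extra` devices take one additional element.
        size = base + (1 if device < extra else 0)
        chunks.append(list(data[start:start + size]))
        start += size
    return chunks


def sum_of_squares(data: Sequence[int], num_devices: int) -> Tuple[List[List[int]], int]:
    """Map (square) and reduce (sum) per chunk, then combine the partial sums."""
    chunks = block_distribute(data, num_devices)
    partial_sums = [sum(x * x for x in chunk) for chunk in chunks]
    return chunks, sum(partial_sums)


chunks, total = sum_of_squares([1, 2, 3, 4, 5], num_devices=2)
print(chunks, total)
```

    In a real multi-GPU setting each chunk would be transferred to a different device and the per-chunk work would run concurrently; here the chunks are processed one after another purely to show the bookkeeping.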

  • Extending the SkelCL Skeleton Library for Stencil Computations on Multi-GPU Systems
    2014
    Co-Authors: Stefan Breuer, Michel Steuwer, Sergei Gorlatch
    Abstract:

    The implementation of stencil computations on modern, massively parallel systems with GPUs and other accelerators currently relies on manually-tuned coding using low-level approaches like OpenCL and CUDA, which makes it a complex, time-consuming, and error-prone task. We describe how stencil computations can be programmed in our SkelCL approach that combines a high level of programming abstraction with competitive performance on multi-GPU systems. SkelCL extends the OpenCL standard by three high-level features: 1) pre-implemented parallel patterns (a.k.a. skeletons); 2) container data types for vectors and matrices; 3) an automatic data (re)distribution mechanism. We introduce two new SkelCL skeletons which specifically target stencil computations, MapOverlap and Stencil, and we describe their use for particular application examples, discuss their efficient parallel implementation, and report experimental results on manycore systems with multiple GPUs.

Michel Steuwer - One of the best experts on this subject based on the ideXlab platform.

  • dOpenCL: towards a uniform programming approach for distributed heterogeneous multi-/many-core systems
    2016
    Co-Authors: Michel Steuwer, Sergei Gorlatch

  • Ershov Memorial Conference - Towards high-level programming for systems with many cores
    Lecture Notes in Computer Science, 2015
    Co-Authors: Sergei Gorlatch, Michel Steuwer

  • High-Level Programming of Stencil Computations on Multi-GPU Systems Using the SkelCL Library
    Parallel Processing Letters, 2014
    Co-Authors: Michel Steuwer, Stefan Breuer, Michael Haidl, Sergei Gorlatch

  • SkelCL: a high-level extension of OpenCL for multi-GPU systems
    The Journal of Supercomputing, 2014
    Co-Authors: Michel Steuwer, Sergei Gorlatch

  • Extending the SkelCL Skeleton Library for Stencil Computations on Multi-GPU Systems
    2014
    Co-Authors: Stefan Breuer, Michel Steuwer, Sergei Gorlatch

Vincent Mirian - One of the best experts on this subject based on the ideXlab platform.

  • IWOCL - Enabling FPGAs as a True Device in the OpenCL Standard: Bridging the Gap for FPGAs
    Proceedings of the 5th International Workshop on OpenCL - IWOCL 2017, 2017
    Co-Authors: Vincent Mirian, Paul Chow

  • ReConFig - UT-OCL: an OpenCL framework for embedded systems using Xilinx FPGAs
    2015 International Conference on ReConFigurable Computing and FPGAs (ReConFig), 2015
    Co-Authors: Vincent Mirian, Paul Chow

  • FPL - Using an OpenCL framework to evaluate interconnect implementations on FPGAs
    2014 24th International Conference on Field Programmable Logic and Applications (FPL), 2014
    Co-Authors: Vincent Mirian, Paul Chow

Zobal Lukáš - One of the best experts on this subject based on the ideXlab platform.

  • Distributed Password Recovery Using Hashcat Tool
    Vysoké učení technické v Brně. Fakulta informačních technologií, 2018
    Co-Authors: Zobal Lukáš
    Abstract:

    The aim of this thesis is a distributed solution for password recovery using the hashcat tool. The basis of this solution is the password recovery tool Fitcrack, developed during my previous work on the TARZAN project. Job distribution is done using the BOINC platform, which is widely used for volunteer computing in a variety of scientific projects. The outcome of this work is a tool that distributes jobs robustly and reliably across a local network or the Internet. On the client side, a fast and efficient password recovery process takes place, using the OpenCL standard to accelerate the whole process on the GPGPU principle.
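
    One common way to distribute a brute-force search is to number every candidate password and hand each client a contiguous index range. The Python sketch below illustrates that idea only; it is not how Fitcrack, BOINC, or hashcat are implemented, and the names `split_keyspace`, `candidate_at`, and `candidates`, as well as the tiny alphabet, are assumptions chosen for the example.

```python
from typing import Iterator, List, Tuple


def split_keyspace(total: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Return (offset, length) work units that together cover range(total)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [(offset, min(chunk_size, total - offset))
            for offset in range(0, total, chunk_size)]


def candidate_at(index: int, alphabet: str, length: int) -> str:
    """Decode index as a fixed-length number in base len(alphabet)."""
    chars = []
    for _ in range(length):
        index, digit = divmod(index, len(alphabet))
        chars.append(alphabet[digit])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(chars))


def candidates(unit: Tuple[int, int], alphabet: str, length: int) -> Iterator[str]:
    """Enumerate the candidates belonging to one work unit."""
    offset, count = unit
    for index in range(offset, offset + count):
        yield candidate_at(index, alphabet, length)


ALPHABET = "ab"   # hypothetical two-letter alphabet
LENGTH = 3        # fixed candidate length, so the keyspace has 2**3 entries
units = split_keyspace(len(ALPHABET) ** LENGTH, chunk_size=3)
last_unit = list(candidates(units[-1], ALPHABET, LENGTH))
print(units, last_unit)
```

    Because each work unit is just an offset and a length, a server can reassign a unit to another client if the first one fails to report back, without any client needing to know about the others.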

  • Microsoft Office Password Recovery Using GPU
    Vysoké učení technické v Brně. Fakulta informačních technologií, 2016
    Co-Authors: Zobal Lukáš
    Abstract:

    This thesis describes password recovery for Microsoft Office documents by extending an existing tool, Wrathion. The thesis explains the issue of digital document protection, modern encryption and hashing algorithms, and the rudiments of the OpenCL standard. Next, the structure of MS Word, MS Excel and MS PowerPoint documents is analyzed, including all versions since 1997. Using this knowledge, we design and implement an improved DOC module for newer versions of the encryption, as well as brand new modules for the XLS and PPT formats and their newer variants DOCX, XLSX and PPTX. Finally, we measure the performance of the new modules and compare it with competing password recovery tools.