Spatial Parallelism

The Experts below are selected from a list of 3912 Experts worldwide ranked by ideXlab platform

Paul Gough - One of the best experts on this subject based on the ideXlab platform.

  • VECPAR - Evaluating the performance of space plasma simulations using FPGA's
    Lecture Notes in Computer Science, 2003
    Co-Authors: Ben Popoola, Paul Gough
    Abstract:

    This paper analyses the performance of a custom compute machine that performs electrostatic plasma simulations using Field Programmable Gate Arrays (FPGAs). Although FPGAs run at slower clock speeds than their off-the-shelf counterparts, the processing power lost in the reduced number of clock cycles per second is quickly recovered in the high degree of spatial parallelism that is achievable within the devices. We describe the development of the architecture of the machine and its support for the C programming language via the use of a cross-compiler. Results are presented and a discussion is given on the constraints of FPGAs in particular and the hardware design process in general.
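
    The trade-off in this abstract (a slower clock offset by more work per cycle) can be made concrete with a little arithmetic. The sketch below is illustrative only: the clock rates and operations-per-cycle figures are hypothetical values, not numbers from the paper, and the function names are invented for this example.

```python
import math


def throughput(clock_hz: float, ops_per_cycle: int) -> float:
    """Operations per second for a device completing ops_per_cycle each clock tick."""
    return clock_hz * ops_per_cycle


def break_even_parallelism(cpu_clock_hz: float, cpu_ops_per_cycle: int,
                           fpga_clock_hz: float) -> int:
    """Smallest number of parallel FPGA units whose combined throughput
    matches the CPU, assuming each unit completes one operation per cycle."""
    return math.ceil(throughput(cpu_clock_hz, cpu_ops_per_cycle) / fpga_clock_hz)


# Hypothetical figures, chosen only for illustration.
CPU_CLOCK_HZ = 2_000_000_000   # 2 GHz processor
CPU_OPS_PER_CYCLE = 2          # assumed superscalar issue width
FPGA_CLOCK_HZ = 100_000_000    # 100 MHz FPGA design

units = break_even_parallelism(CPU_CLOCK_HZ, CPU_OPS_PER_CYCLE, FPGA_CLOCK_HZ)
print(units)  # 40
```

    Under these assumed numbers, any design that keeps more than 40 units busy every cycle would out-run the processor despite a clock that is 20 times slower.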

Zhen Peng - One of the best experts on this subject based on the ideXlab platform.

  • A Parallel-in-Space-and-Time Method for Transient Electromagnetic Problems
    IEEE Transactions on Antennas and Propagation, 2019
    Co-Authors: Shu Wang, Yang Shao, Zhen Peng
    Abstract:

    This paper aims to address a growing need for space–time parallel simulation capability in solving transient electromagnetic (EM) problems. Currently, time-domain (TD) EM solvers are typically parallel only in space. Despite their sequential-in-time nature, these solvers can achieve good parallel scaling when the number of spatial mesh points per core is large. However, the parallel efficiency quickly deteriorates and even saturates once spatial parallelism has been fully exploited. We propose a new TD method to harvest parallelism in both the spatial and temporal dimensions. This is achieved through the investigation of a space–time domain decomposition formulation and a rational Krylov approximation of the TD Green’s function. The improved parallel performance over space-only parallel TD solvers is illustrated by numerical examples.

  • A space-time domain decomposition method for high-fidelity electromagnetic simulation
    2018 International Applied Computational Electromagnetics Society Symposium (ACES), 2018
    Co-Authors: Shu Wang, Zhen Peng
    Abstract:

    This paper addresses a growing need for space-time parallel simulation capability in electromagnetics (EM) applications. Currently, time-dependent EM solvers are typically parallel only in space. Despite their sequential-in-time nature, these solvers can achieve good parallel scaling when the number of spatial mesh points per core is large. But the parallel efficiency quickly deteriorates and even saturates once spatial parallelism has been fully exploited. We propose a new time-domain EM solver to harvest parallelism in both the spatial and temporal dimensions. The spatial parallelism is achieved by a discontinuous Galerkin formulation, and the temporal parallelism is enabled by an exponential integrator based on the Krylov subspace method. The improved parallel performance over space-only parallel time-domain solvers is validated by numerical examples.

  • Space-time parallel computation for time-domain Maxwell's equations
    2017 International Conference on Electromagnetics in Advanced Applications (ICEAA), 2017
    Co-Authors: Shu Wang, Zhen Peng
    Abstract:

    This work addresses a growing need for parallel-in-time simulation capability in electromagnetics (EM) applications. Currently, time-dependent EM solvers are typically parallel only in space. Despite their sequential-in-time nature, these solvers can achieve good parallel scaling when the number of spatial mesh points per core is large. But the parallel efficiency quickly deteriorates and even saturates once spatial parallelism has been fully exploited. We propose a new time-domain EM solver to harvest parallelism in both the spatial and temporal dimensions. The spatial parallelism is achieved by a discontinuous Galerkin formulation, and the temporal parallelism is enabled by an exponential integrator based on the Krylov subspace method. This work results in a highly scalable parallel time-domain solver that addresses the scalability limits of traditional ones. The convergence and parallel performance are validated through numerical experiments.
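
    The scaling behaviour these abstracts describe (good efficiency with many mesh points per core, then deterioration and saturation) can be illustrated with a toy cost model. This is not the authors' analysis: the model, the per-point compute cost and the fixed per-step communication cost are all hypothetical, chosen only to show the shape of the effect for a space-only parallel solver.

```python
def step_time(points: int, cores: int, t_point: float, t_comm: float) -> float:
    """Wall time of one time step: local work on points/cores mesh points,
    plus a fixed neighbour-exchange cost whenever more than one core is used."""
    comm = t_comm if cores > 1 else 0.0
    return (points / cores) * t_point + comm


def parallel_efficiency(points: int, cores: int, t_point: float, t_comm: float) -> float:
    """Speedup over a single core divided by the number of cores."""
    serial = step_time(points, 1, t_point, t_comm)
    return serial / (cores * step_time(points, cores, t_point, t_comm))


# Hypothetical problem size and costs, for illustration only.
POINTS = 1_000_000   # spatial mesh points
T_POINT = 1e-7       # seconds of compute per mesh point per step
T_COMM = 1e-4        # seconds of communication per step

for cores in (1, 10, 100, 1000, 10000):
    eff = parallel_efficiency(POINTS, cores, T_POINT, T_COMM)
    print(f"{cores:>6} cores, {POINTS // cores:>8} points/core, efficiency {eff:.2f}")
```

    Because the time steps themselves run one after another, extra cores only shrink the local-work term. Once that term falls below the communication cost, the speedup in this model stops growing, which is the gap that adding parallelism in the temporal dimension is meant to close.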

Sun-ho Han - One of the best experts on this subject based on the ideXlab platform.

  • ICPP (1) - Exploiting Spatial and Temporal Parallelism in the Multithreaded Node Architecture Implemented on Superscalar RISC Processors
    1993 International Conference on Parallel Processing - ICPP'93 Vol1, 1993
    Co-Authors: D. J. Hwang, Sukki Cho, Yong-wook Kim, Sun-ho Han
    Abstract:

    In most multithreaded node architectures motivated by the dataflow computational model, spatial parallelism could not be exploited at the thread level due to the resource deficit incurred by their internal organization. So we proposed a node architecture exploiting both spatial and temporal parallelism of a program. A multi-port non-blocking data cache is incorporated into our design to cope with the excessive data bandwidth required in parallel execution of multiple threads. The proposed node architecture may contribute to greatly reducing communication latency through the interconnection network. Simulation results show that parallel loops can be executed on this architecture more efficiently than on other competitive ones.

Ben Popoola - One of the best experts on this subject based on the ideXlab platform.

  • VECPAR - Evaluating the performance of space plasma simulations using FPGA's
    Lecture Notes in Computer Science, 2003
    Co-Authors: Ben Popoola, Paul Gough
    Abstract:

    This paper analyses the performance of a custom compute machine that performs electrostatic plasma simulations using Field Programmable Gate Arrays (FPGAs). Although FPGAs run at slower clock speeds than their off-the-shelf counterparts, the processing power lost in the reduced number of clock cycles per second is quickly recovered in the high degree of spatial parallelism that is achievable within the devices. We describe the development of the architecture of the machine and its support for the C programming language via the use of a cross-compiler. Results are presented and a discussion is given on the constraints of FPGAs in particular and the hardware design process in general.

Lawrence A. Rowe - One of the best experts on this subject based on the ideXlab platform.

  • Exploiting Spatial Parallelism for software-only video effects processing
    Multimedia Computing and Networking 1999, 1998
    Co-Authors: Ketan Mayer-Patel, Lawrence A. Rowe
    Abstract:

    Video effects play an important role in adding production value to video programs. The use of video effects with Internet video sources, however, is still uncommon because traditional hardware-based solutions are poorly suited to the Internet environment. In previous work, we described a parallel, software-only video effects system designed for Internet video and explored the use of temporal parallelism. This paper explores the use of spatial parallelism. In particular, an intermediate semicompressed video format is described that was designed to exploit spatial parallelism, and performance measurements are reported on the use of this representation.

    Keywords: Multimedia systems, video processing, parallel processing, video effects

    1. INTRODUCTION

    Experience from the television, video, and film industries shows that visual effects are an important tool for communicating and maintaining audience interest. Titling, for example, is used to identify speakers and topics in a video presentation. Compositing effects that combine two or more video images into one image are used to present