Serial Programming

The experts below are selected from a list of 7,983 experts worldwide, ranked by the ideXlab platform.

P Banerjee - One of the best experts on this subject based on the ideXlab platform.

  • Automatic selection of dynamic data partitioning schemes for distributed-memory multicomputers
    Languages and Compilers for Parallel Computing, 1995
    Co-Authors: Daniel J Palermo, P Banerjee
    Abstract:

    For distributed-memory multicomputers such as the Intel Paragon, the IBM SP-1/SP-2, the NCUBE/2, and the Thinking Machines CM-5, the quality of the data partitioning for a given application is crucial to obtaining high performance. This task has traditionally been the user's responsibility, but in recent years much effort has been directed to automating the selection of data partitioning schemes. Several researchers have proposed systems that are able to produce data distributions that remain in effect for the entire execution of an application. For complex programs, however, such static data distributions may be insufficient to obtain acceptable performance. The selection of distributions that dynamically change over the course of a program's execution adds another dimension to the data partitioning problem. In this paper, we present a technique that can be used to automatically determine which partitionings are most beneficial over specific sections of a program while taking into account the added overhead of performing redistribution. This system is being built as part of the PARADIGM (PARAllelizing compiler for DIstributed-memory General-purpose Multicomputers) project at the University of Illinois. The complete system will provide a fully automated means to parallelize programs written in a Serial Programming model, obtaining high performance on a wide range of distributed-memory multicomputers.
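
The selection problem described in the entry above can be pictured as a shortest-path search over program phases: each phase is assigned one candidate distribution, and switching distributions between phases incurs a redistribution charge. The C sketch below is a minimal illustration of that formulation only; the phase count, the candidate distributions, and every cost figure are invented placeholders, and it is not the algorithm or cost model used in PARADIGM itself.

```c
/* Hypothetical sketch (not the PARADIGM implementation): pick one data
 * distribution per program phase so that per-phase execution costs plus
 * redistribution costs between phases are minimized.  All tables below
 * are made-up placeholders. */
#include <stdio.h>

#define NPHASES 3   /* program phases identified by analysis               */
#define NDISTS  2   /* candidate distributions, e.g. 0 = BLOCK, 1 = CYCLIC */

/* exec_cost[p][d]: estimated cost of phase p under distribution d */
static const double exec_cost[NPHASES][NDISTS] = {
    { 10.0, 25.0 },
    { 30.0,  8.0 },
    { 12.0, 20.0 },
};

/* redist_cost[a][b]: estimated cost of converting distribution a to b */
static const double redist_cost[NDISTS][NDISTS] = {
    { 0.0, 6.0 },
    { 6.0, 0.0 },
};

int main(void)
{
    double best[NPHASES][NDISTS];   /* cheapest way to reach (phase, dist) */
    int    prev[NPHASES][NDISTS];   /* back-pointer for the chosen path    */

    for (int d = 0; d < NDISTS; d++)
        best[0][d] = exec_cost[0][d];

    for (int p = 1; p < NPHASES; p++)
        for (int d = 0; d < NDISTS; d++) {
            best[p][d] = -1.0;
            for (int e = 0; e < NDISTS; e++) {
                double c = best[p - 1][e] + redist_cost[e][d] + exec_cost[p][d];
                if (best[p][d] < 0.0 || c < best[p][d]) {
                    best[p][d] = c;
                    prev[p][d] = e;
                }
            }
        }

    /* Pick the cheapest final distribution and walk the back-pointers. */
    int d = 0;
    for (int e = 1; e < NDISTS; e++)
        if (best[NPHASES - 1][e] < best[NPHASES - 1][d])
            d = e;

    int choice[NPHASES];
    for (int p = NPHASES - 1; p >= 0; p--) {
        choice[p] = d;
        if (p > 0)
            d = prev[p][d];
    }
    for (int p = 0; p < NPHASES; p++)
        printf("phase %d: distribution %d\n", p, choice[p]);
    return 0;
}
```

With these placeholder costs the search selects distribution 0 for phases 0 and 2 and distribution 1 for phase 1, paying the redistribution charge twice because the per-phase savings outweigh it.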

  • Advanced compilation techniques in the PARADIGM compiler for distributed-memory multicomputers
    International Conference on Supercomputing, 1995
    Co-Authors: Antonio Lain, Daniel J Palermo, Shankar Ramaswamy, Eugene W Hodges, P Banerjee
    Abstract:

    The PARADIGM compiler project provides an automated means to parallelize programs, written in a Serial Programming model, for efficient execution on distributed-memory multicomputers. A previous implementation of the compiler, based on the PTD representation, allowed symbolic array sizes, affine loop bounds and array subscripts, and a variable number of processors, provided that arrays were single- or multi-dimensionally block distributed. The techniques presented here extend the compiler to also accept multidimensional cyclic and block-cyclic distributions within a uniform symbolic framework. These extensions demand more sophisticated symbolic manipulation capabilities. A novel aspect of our approach is to meet this demand by interfacing PARADIGM with a powerful off-the-shelf symbolic package, Mathematica. This paper describes some of the Mathematica routines that perform various transformations, shows how they are invoked and used by the compiler to overcome the new challenges, and presents experimental results for code involving cyclic and block-cyclic arrays as evidence of the feasibility of the approach.
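
For context, the cyclic and block-cyclic layouts the entry above adds are the familiar block-cyclic (CYCLIC(b)) distribution, in which blocks of b consecutive elements are dealt round-robin to the processors; plain block and plain cyclic are the two extreme cases. The C sketch below merely evaluates the resulting ownership and local-index expressions for concrete numbers. It is an illustration of what the compiler must reason about, not PARADIGM code: the paper's point is that these expressions are handled symbolically (via Mathematica) when array sizes and processor counts are not known constants. The block size, processor count, and function name here are arbitrary choices for the example.

```c
/* Illustrative only: the standard block-cyclic index mapping, evaluated
 * numerically.  PARADIGM manipulates the same expressions symbolically. */
#include <stdio.h>

/* Owner and local index of global element i of an array distributed
 * CYCLIC(b) over nprocs processors.  BLOCK corresponds to
 * b = ceil(N / nprocs); plain CYCLIC corresponds to b = 1. */
static void block_cyclic_map(long i, long b, long nprocs,
                             long *owner, long *local)
{
    long block  = i / b;                     /* which size-b block holds i   */
    long offset = i % b;                     /* position of i inside it      */
    *owner = block % nprocs;                 /* blocks dealt out round-robin */
    *local = (block / nprocs) * b + offset;  /* index in the owner's memory  */
}

int main(void)
{
    long owner, local;
    for (long i = 0; i < 12; i++) {
        block_cyclic_map(i, 2, 3, &owner, &local);  /* CYCLIC(2) on 3 procs */
        printf("global %2ld -> proc %ld, local %ld\n", i, owner, local);
    }
    return 0;
}
```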

  • Communication optimizations used in the PARADIGM compiler for distributed-memory multicomputers
    International Conference on Parallel Processing, 1994
    Co-Authors: Daniel J Palermo, John A Chandy, P Banerjee
    Abstract:

    The PARADIGM (PARAllelizing compiler for DIstributed-memory General-purpose Multicomputers) project at the University of Illinois provides a fully automated means to parallelize programs, written in a Serial Programming model, for execution on distributed-memory multicomputers. To provide efficient execution, PARADIGM automatically performs various optimizations to reduce the overhead and idle time caused by interprocessor communication. Optimizations studied in this paper include message coalescing, message vectorization, message aggregation, and coarse-grain pipelining. To separate the optimization algorithms from machine-specific details, parameterized models are used to estimate communication and computation costs for a given machine. The models are also used in coarse-grain pipelining to automatically select a task granularity that balances the available parallelism with the costs of communication. To determine the applicability of the optimizations on different machines, we analyzed their performance on an Intel iPSC/860, an Intel iPSC/2, and a Thinking Machines CM-5.
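
The optimizations above all rest on a parameterized cost model. A common form, assumed here rather than taken from the paper, charges each message a fixed startup alpha plus beta per byte, with tau per element of computation. The C sketch below uses such a model to show why message vectorization pays off and how a coarse-grain pipelining block size might be chosen by minimizing an estimated total time. The constants, the pipeline formula, and the names are placeholders for illustration; PARADIGM's actual models and measured parameters may differ.

```c
/* Hedged sketch of a parameterized communication cost model:
 * T(n) = alpha + beta * n.  All numbers are invented placeholders. */
#include <stdio.h>

static const double alpha = 100e-6;  /* per-message startup (s) */
static const double beta  = 0.4e-6;  /* per-byte transfer (s)   */
static const double tau   = 1.0e-6;  /* per-element compute (s) */

static double msg_time(double bytes) { return alpha + beta * bytes; }

int main(void)
{
    /* Message vectorization: k small messages versus one combined message. */
    int k = 100;
    double n = 8.0;
    printf("separate: %g s   vectorized: %g s\n",
           k * msg_time(n), msg_time(k * n));

    /* Coarse-grain pipelining: N iterations flow through P stages in blocks
     * of B iterations.  Larger B amortizes message startups; smaller B
     * shortens the pipeline fill.  Pick the B with the lowest estimate. */
    double N = 4096.0, P = 16.0, bytes_per_iter = 8.0;
    double best_t = -1.0;
    int best_b = 0;
    for (int B = 1; B <= 512; B *= 2) {
        double stage = tau * B + msg_time(bytes_per_iter * B);
        double t = (N / B + P - 1.0) * stage;
        if (best_t < 0.0 || t < best_t) { best_t = t; best_b = B; }
    }
    printf("chosen block size: %d (est. %g s)\n", best_b, best_t);
    return 0;
}
```

Such a model is what lets the optimization algorithms stay machine-independent: only alpha, beta, and tau need to be re-measured for each target machine.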

Daniel J Palermo - One of the best experts on this subject based on the ideXlab platform.

  • Automatic selection of dynamic data partitioning schemes for distributed-memory multicomputers
    Languages and Compilers for Parallel Computing, 1995
    Co-Authors: Daniel J Palermo, P Banerjee

  • Advanced compilation techniques in the PARADIGM compiler for distributed-memory multicomputers
    International Conference on Supercomputing, 1995
    Co-Authors: Antonio Lain, Daniel J Palermo, Shankar Ramaswamy, Eugene W Hodges, P Banerjee

  • Communication optimizations used in the PARADIGM compiler for distributed-memory multicomputers
    International Conference on Parallel Processing, 1994
    Co-Authors: Daniel J Palermo, John A Chandy, P Banerjee

Charles E Leiserson - One of the best experts on this subject based on the ideXlab platform.

  • Can multithreaded Programming save massively parallel computing?
    International Parallel Processing Symposium, 1996
    Co-Authors: Charles E Leiserson
    Abstract:

    Massively parallel computing has taken a turn for the worse. MPP (massively parallel processor) companies have generally been doing poorly in the marketplace. The additional time to design and deliver MPP systems puts them a generation behind the latest small-scale microprocessor and SMP systems. Truly large machines have mean times to failure measured in days, limiting their ability to provide reliable computing platforms for long-running computations. Software for MPPs is arcane, and porting a Serial code from a conventional workstation to an MPP is a major chore, if not a research project. Is massively parallel computing doomed? Does anybody care?

    We should care! Massively parallel computing is the only way to solve society’s most computationally intensive problems. In the last ten years, MPPs have shown scientists and engineers from many disciplines that important problems they had previously considered beyond their reach are, in fact, solvable. The rapid advancement of electronics, automobile, and pharmaceutical designs demands ever-higher performance from simulation and analysis tools. The computational power needed for data mining and decision analysis is increasing at a rapid rate. The burgeoning popularity of the Internet is now making it possible to deliver high-performance computing services to millions, if networks and software can meet the challenge.

    Algorithmic multithreaded Programming, such as provided by the Cilk system being developed at MIT and the University of Texas at Austin, offers the hope of allowing massively parallel computing to fulfill its promise, even if conventional MPPs themselves fall by the technology wayside. Algorithmic multithreaded languages provide high-level parallel abstractions for system resources, such as processors, memory, and files, thereby allowing the runtime system to map these abstract resources onto available physical resources dynamically, while providing solid guarantees of high performance. As a consequence, a program can execute adaptively and tolerate faults in a changeable computing environment, such as the clusters of SMP workstations that appear to be the next high-performance fad. Moreover, a multithreaded program can “scale down” to run on a single processor with the same performance as Serial C code, thereby removing a major barrier between parallel and Serial Programming.

    Significant problems remain before multithreading can replace the existing base of parallel software, however. The most pressing appears to be the problem of duplicating the successes of data parallelism and message passing for problems that require tight and frequent synchronization. In addition, multithreading will demand stronger support from architectures and operating systems for low-latency interrupts and low-latency inter-processor communication.
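
The "scale down" claim rests on a property of Cilk-style languages: erasing the parallel keywords leaves an ordinary serial C program with the same meaning, the so-called serial elision. The fragment below is a minimal sketch of that idea with the keywords stubbed out as empty macros, so the file compiles exactly as its own serial elision; it is not actual Cilk source, whose keywords, function qualifiers, and runtime differ.

```c
/* Illustrative sketch only: spawn/sync are simulated with empty macros, so
 * this file is the serial elision of the annotated program.  A real
 * Cilk-style runtime would schedule the spawned calls with work stealing. */
#include <stdio.h>

#define spawn          /* in Cilk: fork this call as a parallel task */
#define sync ((void)0) /* in Cilk: wait for all spawned children     */

static long fib(int n)
{
    if (n < 2)
        return n;
    long x, y;
    x = spawn fib(n - 1);   /* could run in parallel with the next call   */
    y = spawn fib(n - 2);
    sync;                   /* both children must complete before the add */
    return x + y;
}

int main(void)
{
    printf("fib(30) = %ld\n", fib(30));
    return 0;
}
```

On one processor this runs essentially as plain C; under a work-stealing runtime the two spawned calls could execute on different processors, which is the sense in which the same source "scales down" without a performance penalty.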

John A Chandy - One of the best experts on this subject based on the ideXlab platform.

  • Communication optimizations used in the PARADIGM compiler for distributed-memory multicomputers
    International Conference on Parallel Processing, 1994
    Co-Authors: Daniel J Palermo, John A Chandy, P Banerjee

Sarah J White - One of the best experts on this subject based on the ideXlab platform.