Shared Memory

The Experts below are selected from a list of 96396 Experts worldwide ranked by ideXlab platform

L. Alvisi - One of the best experts on this subject based on the ideXlab platform.

  • Improving the performance of software distributed Shared Memory with speculation
    IEEE Transactions on Parallel and Distributed Systems, 2005
    Co-Authors: Michael Kistler, L. Alvisi
    Abstract:

    We study the performance benefits of speculation in a release consistent software distributed Shared Memory system. We propose a new protocol, speculative home-based release consistency (SHRC) that speculatively updates data at remote nodes to reduce the latency of remote Memory accesses. Our protocol employs a predictor that uses patterns in past accesses to Shared Memory to predict future accesses. We have implemented our protocol in a release consistent software distributed Shared Memory system that runs on commodity hardware. We evaluate our protocol implementation using eight software distributed Shared Memory benchmarks and show that it can result in significant performance improvements.
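The abstract does not specify how the SHRC predictor works internally. As a minimal sketch (in Python, with invented names, assuming a simple "last consumer repeats" heuristic rather than the paper's actual mechanism), a pattern-based predictor for speculative updates might look like this:

```python
class AccessPredictor:
    """Toy sketch of a pattern-based access predictor (a hypothetical
    simplification of the SHRC idea): for each shared page, remember
    which remote node fetched it after the last release, and predict
    that the same node will need the page again after the next release."""

    def __init__(self):
        self.last_consumer = {}  # page id -> node that last fetched the page
        self.hits = 0            # correct predictions
        self.predictions = 0     # total predictions made

    def predict(self, page):
        """Return the node to speculatively push `page` to, or None."""
        return self.last_consumer.get(page)

    def record(self, page, node):
        """Observe that `node` actually fetched `page`; update history."""
        predicted = self.predict(page)
        if predicted is not None:
            self.predictions += 1
            if predicted == node:
                self.hits += 1
        self.last_consumer[page] = node

# A producer/consumer pattern where node 2 repeatedly consumes page 7:
p = AccessPredictor()
for _ in range(4):
    p.record(page=7, node=2)
print(p.predict(7), p.hits, p.predictions)  # 2 3 3
```

Under a stable producer/consumer pattern this predictor is always right after its first observation, which is the kind of regularity a speculative update protocol can exploit to hide remote-access latency.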

James R. Larus - One of the best experts on this subject based on the ideXlab platform.

  • PPOPP - Shared-Memory performance profiling
    Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming - PPOPP '97, 1997
    Co-Authors: James R. Larus, Barton P. Miller
    Abstract:

    This paper describes a new approach to finding performance bottlenecks in Shared-Memory parallel programs and its embodiment in the Paradyn Parallel Performance Tools running with the Blizzard fine-grain distributed Shared Memory system. This approach exploits the underlying system's cache coherence protocol to detect data sharing patterns that indicate potential performance bottlenecks and presents performance measurements in a data-centric manner. As a demonstration, Parodyn helped us improve the performance of a new Shared-Memory application program by a factor of four.

  • Cooperative Shared Memory: software and hardware for scalable multiprocessors
    ACM Transactions on Computer Systems, 1993
    Co-Authors: Mark D. Hill, James R. Larus, Steven K. Reinhardt, David A. Wood
    Abstract:

    We believe the paucity of massively parallel, Shared-Memory machines follows from the lack of a Shared-Memory programming performance model that can inform programmers of the cost of operations (so they can avoid expensive ones) and can tell hardware designers which cases are common (so they can build simple hardware to optimize them). Cooperative Shared Memory, our approach to Shared-Memory design, addresses this problem. Our initial implementation of cooperative Shared Memory uses a simple programming model, called Check-In/Check-Out (CICO), in conjunction with even simpler hardware, called Dir 1 SW. In CICO, programs bracket uses of Shared data with a check_in directive terminating the expected use of the data. A cooperative prefetch directive helps hide communication latency. Dir 1 SW is a minimal directory protocol that adds little complexity to message-passing hardware, but efficiently supports programs written within the CICO model.
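The CICO bracketing described above can be illustrated with a toy model (Python, with illustrative names; this is not the paper's interface). Because programs announce when they start and stop using a shared block, a minimal directory need only track a single expected holder per block:

```python
class CicoDirectory:
    """Toy model of CICO-style bracketing: processors check out a block
    before using it and check it in when done, so a minimal directory
    (in the spirit of Dir 1 SW) can track one holder per block cheaply."""

    def __init__(self):
        self.owner = {}  # block -> processor currently holding it

    def check_out(self, block, cpu):
        """Bracket the start of an expected use of `block` by `cpu`."""
        if block in self.owner and self.owner[block] != cpu:
            raise RuntimeError(
                f"block {block} still checked out by cpu {self.owner[block]}")
        self.owner[block] = cpu

    def check_in(self, block, cpu):
        """Terminate the expected use of `block` by `cpu`."""
        assert self.owner.get(block) == cpu
        del self.owner[block]

d = CicoDirectory()
d.check_out("A", cpu=0)   # bracket the expected use of block A...
# ... compute on block A ...
d.check_in("A", cpu=0)    # ...and terminate it
d.check_out("A", cpu=1)   # now another processor may take the block
```

The point of the model is that the common case (well-bracketed, unshared use) needs no complex hardware; only violations of the expected pattern require slower handling.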

  • Cooperative Shared Memory: software and hardware for scalable multiprocessors
    Proceedings of the fifth international conference on Architectural support for programming languages and operating systems - ASPLOS-V, 1992
    Co-Authors: Mark D. Hill, James R. Larus, Steven K. Reinhardt, David A. Wood
    Abstract:

    We believe the absence of massively-parallel, Shared-Memory machines follows from the lack of a Shared-Memory programming performance model that can inform programmers of the cost of operations (so they can avoid expensive ones) and can tell hardware designers which cases are common (so they can build simple hardware to optimize them). Cooperative Shared Memory, our approach to Shared-Memory design, addresses this problem. Our initial implementation of cooperative Shared Memory uses a simple programming model, called Check-In/Check-Out (CICO), in conjunction with even simpler hardware, called Dir 1 SW. In CICO, programs bracket uses of Shared data with a check_in directive terminating the expected use of the data. A cooperative prefetch directive helps hide communication latency. Dir 1 SW is a minimal directory protocol that adds little complexity to message-passing hardware, but efficiently supports programs written within the CICO model.

Michael Kistler - One of the best experts on this subject based on the ideXlab platform.

  • Improving the performance of software distributed Shared Memory with speculation
    IEEE Transactions on Parallel and Distributed Systems, 2005
    Co-Authors: Michael Kistler, L. Alvisi
    Abstract:

    We study the performance benefits of speculation in a release consistent software distributed Shared Memory system. We propose a new protocol, speculative home-based release consistency (SHRC) that speculatively updates data at remote nodes to reduce the latency of remote Memory accesses. Our protocol employs a predictor that uses patterns in past accesses to Shared Memory to predict future accesses. We have implemented our protocol in a release consistent software distributed Shared Memory system that runs on commodity hardware. We evaluate our protocol implementation using eight software distributed Shared Memory benchmarks and show that it can result in significant performance improvements.

Orli Waarts - One of the best experts on this subject based on the ideXlab platform.

  • Contention in Shared Memory algorithms
    Journal of the ACM, 1997
    Co-Authors: Cynthia Dwork, Maurice Herlihy, Orli Waarts
    Abstract:

    Most complexity measures for concurrent algorithms for asynchronous Shared-Memory architectures focus on process steps and Memory consumption. In practice, however, performance of multiprocessor algorithms is heavily influenced by contention, the extent to which processes access the same location at the same time. Nevertheless, even though contention is one of the principal considerations affecting the performance of real algorithms on real multiprocessors, there are no formal tools for analyzing the contention of asynchronous Shared-Memory algorithms. This paper introduces the first formal complexity model for contention in Shared-Memory multiprocessors. We focus on the standard multiprocessor architecture in which n asynchronous processes communicate by applying read, write, and read-modify-write operations to a Shared Memory. To illustrate the utility of our model, we use it to derive two kinds of results: (1) lower bounds on contention for well-known basic problems such as agreement and mutual exclusion, and (2) trade-offs between the length of the critical path (maximal number of accesses to Shared variables performed by a single process in executing the algorithm) and contention for these algorithms. Furthermore, we give the first formal contention analysis of a variety of counting networks, a class of concurrent data structures implementing Shared counters. Experiments indicate that certain counting networks outperform conventional single-variable counters at high levels of contention. Our analysis provides the first formal model explaining this phenomenon.
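The counting networks analyzed above are built from balancers, and a single balancer is itself the simplest (width-2) counting network. The following sketch (Python, sequential for clarity; real balancers are accessed concurrently) shows the step property that makes such networks useful as low-contention shared counters:

```python
class Balancer:
    """A balancer, the building block of counting networks: tokens
    arriving on its inputs alternately exit on the top (0) and bottom (1)
    output, splitting the incoming stream evenly. A single balancer is
    itself a width-2 counting network."""

    def __init__(self):
        self.toggle = 0  # which output the next token takes

    def traverse(self):
        out = self.toggle
        self.toggle ^= 1
        return out

b = Balancer()
counts = [0, 0]
for _ in range(7):              # 7 tokens traverse the network
    counts[b.traverse()] += 1
print(counts)                   # [4, 3]
# The "step property": output counts differ by at most one, no matter
# how the tokens arrive.
```

Wider counting networks (e.g. bitonic networks) compose many balancers so that concurrent tokens spread across many memory locations, trading a longer critical path for lower contention per location, which is exactly the trade-off the model above formalizes.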

  • STOC - Contention in Shared Memory algorithms
    Proceedings of the twenty-fifth annual ACM symposium on Theory of computing - STOC '93, 1993
    Co-Authors: Cynthia Dwork, Maurice Herlihy, Orli Waarts
    Abstract:

    Most complexity measures for concurrent algorithms for asynchronous Shared-Memory architectures focus on process steps and Memory consumption. In practice, however, performance of multiprocessor algorithms is heavily influenced by contention, the extent to which processes access the same location at the same time. Nevertheless, even though contention is one of the principal considerations affecting the performance of real algorithms on real multiprocessors, there are no formal tools for analyzing the contention of asynchronous Shared-Memory algorithms. This paper introduces the first formal complexity model for contention in Shared-Memory multiprocessors. We focus on the standard multiprocessor architecture in which n asynchronous processes communicate by applying read, write, and read-modify-write operations to a Shared Memory. To illustrate the utility of our model, we use it to derive two kinds of results: (1) lower bounds on contention for well-known basic problems such as agreement and mutual exclusion, and (2) trade-offs between the length of the critical path (maximal number of accesses to Shared variables performed by a single process in executing the algorithm) and contention for these algorithms. Furthermore, we give the first formal contention analysis of a variety of counting networks, a class of concurrent data structures implementing Shared counters. Experiments indicate that certain counting networks outperform conventional single-variable counters at high levels of contention.

David A. Wood - One of the best experts on this subject based on the ideXlab platform.

  • Mechanisms for distributed Shared Memory
    1996
    Co-Authors: Steven K. Reinhardt, David A. Wood
    Abstract:

    Distributed Shared Memory (DSM) systems simplify the task of writing distributed-Memory parallel programs by automating data distribution and communication. Unfortunately, DSM systems control Memory and communication using fixed policies, even when programmers or compilers could manage these resources more efficiently. This thesis proposes a new approach that lets users efficiently manage communication and Memory on DSM systems. Systems provide primitive DSM mechanisms without binding them to fixed protocols (policies). Standard Shared-Memory programs use default protocols similar to those found in current DSM machines. Unlike current systems, these protocols are implemented in unprivileged software. Programmers and compilers are free to modify or replace them with optimized custom protocols that manage Memory and communication directly and efficiently. To explore this new approach, this thesis: (1) identifies a set of mechanisms for distributed Shared Memory, (2) develops Tempest, a portable programming interface for mechanism-based DSM systems, (3) describes Stache, a protocol that uses Tempest to implement a standard Shared-Memory model, (4) summarizes custom protocols developed for six Shared-Memory applications, (5) designs and simulates three systems--Typhoon, Typhoon-1, and Typhoon-0--that support Tempest, and (6) describes a working hardware prototype of Typhoon-0, the simplest of those designs. Tempest combines fine-grain coherence support, an active message model, and virtual-Memory-based page allocation to provide portability across a range of platforms. Typhoon, Typhoon-1, and Typhoon-0 support Tempest using different levels of custom hardware integration. Typhoon achieves high performance by integrating key components on one device. Typhoon-1 and Typhoon-0 use off-the-shelf parts for some of these components, trading some performance for simpler designs. 
Typhoon demonstrates that mechanism-based DSM systems can compete with hard-wired-protocol systems on unmodified Shared-Memory applications (within 25% across six benchmarks). Despite Typhoon's low overheads, custom protocols improve performance significantly for some applications--by 384% for one benchmark. Results for Typhoon-1 and Typhoon-0 on unmodified applications are varied, but custom protocols bring them within 13% and 47% of Typhoon, respectively. A working Typhoon-0 prototype demonstrates the feasibility of these designs. Measurements of the prototype's performance substantiate simulator projections.
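The key Tempest idea above, fine-grain access tags with coherence handled by unprivileged user-level protocol code, can be sketched as follows (Python, with invented names; the real interface and the Stache protocol are far more elaborate):

```python
from enum import Enum

class Tag(Enum):
    """Fine-grain access tag carried by each memory block."""
    INVALID = 0
    READ_ONLY = 1
    WRITABLE = 2

class UserLevelProtocol:
    """Toy sketch of a Tempest-style DSM node (hypothetical, simplified):
    an access to a block tagged INVALID raises an access fault, and a
    user-level handler, not fixed hardware policy, fetches the block
    from its home node and upgrades the tag."""

    def __init__(self, home_memory):
        self.home = home_memory   # block -> value held at the home node
        self.local = {}           # locally cached copies
        self.tags = {}            # block -> Tag
        self.faults = 0           # access faults taken

    def handle_fault(self, block, want_write):
        # Custom, unprivileged protocol code would go here; this default
        # policy just fetches the block and sets an appropriate tag.
        self.faults += 1
        self.local[block] = self.home[block]
        self.tags[block] = Tag.WRITABLE if want_write else Tag.READ_ONLY

    def read(self, block):
        if self.tags.get(block, Tag.INVALID) is Tag.INVALID:
            self.handle_fault(block, want_write=False)
        return self.local[block]

node = UserLevelProtocol(home_memory={"x": 42})
print(node.read("x"), node.read("x"), node.faults)  # 42 42 1
```

Because `handle_fault` is ordinary software, a compiler or programmer could replace it with an application-specific policy (bulk transfer, producer push, and so on), which is the source of the custom-protocol speedups reported above.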

  • Cooperative Shared Memory: software and hardware for scalable multiprocessors
    ACM Transactions on Computer Systems, 1993
    Co-Authors: Mark D. Hill, James R. Larus, Steven K. Reinhardt, David A. Wood
    Abstract:

    We believe the paucity of massively parallel, Shared-Memory machines follows from the lack of a Shared-Memory programming performance model that can inform programmers of the cost of operations (so they can avoid expensive ones) and can tell hardware designers which cases are common (so they can build simple hardware to optimize them). Cooperative Shared Memory, our approach to Shared-Memory design, addresses this problem. Our initial implementation of cooperative Shared Memory uses a simple programming model, called Check-In/Check-Out (CICO), in conjunction with even simpler hardware, called Dir 1 SW. In CICO, programs bracket uses of Shared data with a check_in directive terminating the expected use of the data. A cooperative prefetch directive helps hide communication latency. Dir 1 SW is a minimal directory protocol that adds little complexity to message-passing hardware, but efficiently supports programs written within the CICO model.

  • Cooperative Shared Memory: software and hardware for scalable multiprocessors
    Proceedings of the fifth international conference on Architectural support for programming languages and operating systems - ASPLOS-V, 1992
    Co-Authors: Mark D. Hill, James R. Larus, Steven K. Reinhardt, David A. Wood
    Abstract:

    We believe the absence of massively-parallel, Shared-Memory machines follows from the lack of a Shared-Memory programming performance model that can inform programmers of the cost of operations (so they can avoid expensive ones) and can tell hardware designers which cases are common (so they can build simple hardware to optimize them). Cooperative Shared Memory, our approach to Shared-Memory design, addresses this problem. Our initial implementation of cooperative Shared Memory uses a simple programming model, called Check-In/Check-Out (CICO), in conjunction with even simpler hardware, called Dir 1 SW. In CICO, programs bracket uses of Shared data with a check_in directive terminating the expected use of the data. A cooperative prefetch directive helps hide communication latency. Dir 1 SW is a minimal directory protocol that adds little complexity to message-passing hardware, but efficiently supports programs written within the CICO model.