Multiprocessor

The Experts below are selected from a list of 49173 Experts worldwide ranked by ideXlab platform

Sanjoy Baruah - One of the best experts on this subject based on the ideXlab platform.

  • Schedulability analysis of global EDF
    2008
    Co-Authors: Sanjoy Baruah, Theodore Baker
    Abstract:

    The multiprocessor EDF scheduling of sporadic task systems is studied. A new sufficient schedulability test is presented and proved correct. It is shown that this test generalizes the previously known exact uniprocessor EDF-schedulability test, and that it offers non-trivial quantitative guarantees (including a resource augmentation bound) on multiprocessors.
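
    To give a flavour of what such a sufficient test looks like, the C sketch below checks a classic earlier density bound for global EDF on m identical processors (the sum of task densities must not exceed m minus (m - 1) times the largest density). This is not the new test of the paper above, only an illustrative condition of the same kind; the task set in main is made up.

      #include <stdbool.h>
      #include <stdio.h>

      typedef struct {
          double wcet;      /* worst-case execution time C_i */
          double deadline;  /* relative deadline D_i */
          double period;    /* minimum inter-arrival time T_i */
      } task_t;

      /* Density of a sporadic task: C_i / min(D_i, T_i). */
      static double density(const task_t *t)
      {
          double d = t->deadline < t->period ? t->deadline : t->period;
          return t->wcet / d;
      }

      /* Sufficient condition for global EDF on m identical processors:
       * sum of densities <= m - (m - 1) * max density.                 */
      static bool gedf_sufficient(const task_t *tasks, int n, int m)
      {
          double sum = 0.0, max = 0.0;
          for (int i = 0; i < n; i++) {
              double lam = density(&tasks[i]);
              sum += lam;
              if (lam > max)
                  max = lam;
          }
          return sum <= m - (m - 1) * max;
      }

      int main(void)
      {
          task_t ts[] = { {1, 4, 4}, {2, 5, 5}, {3, 10, 10} };  /* made-up task set */
          printf("%s\n", gedf_sufficient(ts, 3, 2) ? "schedulable" : "test inconclusive");
          return 0;
      }

    Because the test is only sufficient, a negative answer is reported as inconclusive rather than unschedulable.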

  • Robustness results concerning EDF scheduling upon uniform multiprocessors
    2003
    Co-Authors: Sanjoy Baruah, Shelby Funk, Joël Goossens
    Abstract:

    Each processor in a uniform multiprocessor machine is characterized by a speed or computing capacity, with the interpretation that a job executing on a processor with speed s for t time units completes (s × t) units of execution. The earliest deadline first (EDF) scheduling of hard-real-time systems upon uniform multiprocessor machines is considered. It is known that online algorithms tend to perform very poorly in scheduling such hard-real-time systems on multiprocessors; resource-augmentation techniques are presented here that permit online algorithms in general (EDF in particular) to perform better than may be expected given these inherent limitations. It is shown that EDF scheduling upon uniform multiprocessors is robust with respect to both job execution requirements and processor computing capacity.
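
    A minimal sketch of the machine model and the scheduling rule used in this line of work: at each instant the jobs with the earliest deadlines run on the fastest processors, and a job running on a speed-s processor for t time units completes s × t units of work. The job set, speeds, and interval length below are invented for the illustration.

      #include <stdio.h>
      #include <stdlib.h>

      typedef struct { double deadline, remaining; } job_t;

      static int by_deadline(const void *a, const void *b)
      {
          double da = ((const job_t *)a)->deadline, db = ((const job_t *)b)->deadline;
          return (da > db) - (da < db);
      }

      static int by_speed_desc(const void *a, const void *b)
      {
          double sa = *(const double *)a, sb = *(const double *)b;
          return (sa < sb) - (sa > sb);
      }

      /* Run the active jobs for dt time units under EDF on a uniform machine:
       * earliest deadlines go to the fastest processors, each completing s*dt work. */
      static void run_interval(job_t *jobs, int n, double *speeds, int m, double dt)
      {
          qsort(jobs, n, sizeof *jobs, by_deadline);        /* earliest deadline first  */
          qsort(speeds, m, sizeof *speeds, by_speed_desc);  /* fastest processor first  */
          for (int i = 0; i < n && i < m; i++)
              jobs[i].remaining -= speeds[i] * dt;          /* completes s * t of work  */
      }

      int main(void)
      {
          job_t jobs[] = { {10, 6}, {5, 2}, {8, 4} };       /* made-up jobs            */
          double speeds[] = { 2.0, 1.0 };                   /* made-up uniform machine */
          run_interval(jobs, 3, speeds, 2, 1.0);
          for (int i = 0; i < 3; i++)
              printf("job with deadline %.0f has %.1f work left\n",
                     jobs[i].deadline, jobs[i].remaining);
          return 0;
      }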

  • On-line scheduling on uniform multiprocessors
    2001
    Co-Authors: Shelby Funk, Joël Goossens, Sanjoy Baruah
    Abstract:

    Each processor in a uniform multiprocessor machine is characterized by a speed or computing capacity, with the interpretation that a job executing on a processor with speed s for t time units completes (s × t) units of execution. The on-line scheduling of hard-real-time systems, in which all jobs must complete by specified deadlines, on uniform multiprocessor machines is considered. It is known that online algorithms tend to perform very poorly in scheduling such hard-real-time systems on multiprocessors; resource-augmentation techniques are presented here that permit online algorithms to perform better than may be expected given the inherent limitations. The results derived here are applied to the scheduling of periodic task systems on uniform multiprocessor machines.
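
    The periodic-task application rests on a commonly cited feasibility condition for implicit-deadline periodic tasks on a uniform multiprocessor: for every k, the k heaviest task utilizations must fit on the k fastest processors, and the total utilization must not exceed the total computing capacity. The sketch below checks that condition; it is a simplified statement rather than this paper's full development, and the utilizations and speeds are made up.

      #include <stdbool.h>
      #include <stdio.h>
      #include <stdlib.h>

      static int desc(const void *a, const void *b)
      {
          double x = *(const double *)a, y = *(const double *)b;
          return (x < y) - (x > y);
      }

      /* util[i] = C_i / T_i for each task, speed[j] = capacity of processor j. */
      static bool feasible_on_uniform(double *util, int n, double *speed, int m)
      {
          qsort(util, n, sizeof *util, desc);
          qsort(speed, m, sizeof *speed, desc);

          double u = 0.0, s = 0.0;
          int k_max = n < m ? n : m;
          for (int k = 0; k < k_max; k++) {
              u += util[k];                 /* k+1 heaviest tasks              */
              s += speed[k];                /* k+1 fastest processors          */
              if (u > s)
                  return false;             /* heaviest tasks do not fit       */
          }
          for (int k = k_max; k < n; k++) u += util[k];
          for (int k = k_max; k < m; k++) s += speed[k];
          return u <= s;                    /* total demand vs. total capacity */
      }

      int main(void)
      {
          double util[]  = { 0.9, 0.6, 0.5 };   /* made-up utilizations    */
          double speed[] = { 1.0, 0.5, 0.5 };   /* made-up processor speeds */
          printf("%s\n", feasible_on_uniform(util, 3, speed, 3) ? "feasible" : "infeasible");
          return 0;
      }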

Mendel Rosenblum - One of the best experts on this subject based on the ideXlab platform.

  • Disco: running commodity operating systems on scalable multiprocessors
    1997
    Co-Authors: Edouard Bugnion, Scott W Devine, Kinshuk Govil, Mendel Rosenblum
    Abstract:

    In this article we examine the problem of extending modern operating systems to run efficiently on large-scale shared-memory multiprocessors without a large implementation effort. Our approach brings back an idea popular in the 1970s: virtual machine monitors. We use virtual machines to run multiple commodity operating systems on a scalable multiprocessor. This solution addresses many of the challenges facing the system software for these machines. We demonstrate our approach with a prototype called Disco that runs multiple copies of Silicon Graphics' IRIX operating system on a multiprocessor. Our experience shows that the overheads of the monitor are small and that the approach provides scalability as well as the ability to deal with the nonuniform memory access time of these systems. To reduce the memory overheads associated with running multiple operating systems, virtual machines transparently share major data structures such as the program code and the file system buffer cache. We use the distributed-system support of modern operating systems to export a partial single system image to the users. The overall solution achieves most of the benefits of operating systems customized for scalable multiprocessors, yet it can be achieved with a significantly smaller implementation effort.
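
    The memory-saving idea mentioned above, transparently sharing identical pages across virtual machines, can be pictured with the toy bookkeeping below: guest pages with identical contents are backed by one machine page and mapped read-only, and a write breaks the sharing with a private copy. This is only an illustration of the general copy-on-write sharing idea, not Disco's actual mechanism, and every name and constant here is invented.

      #include <assert.h>
      #include <stdint.h>
      #include <stdio.h>
      #include <string.h>

      #define PAGE_SIZE 4096
      #define MAX_MACHINE_PAGES 1024

      typedef struct {
          uint8_t data[PAGE_SIZE];
          int refcount;            /* number of guest mappings sharing this page */
      } machine_page_t;

      static machine_page_t mpages[MAX_MACHINE_PAGES];
      static int n_mpages;

      /* Back a guest page with an existing identical machine page, or a new one. */
      static int share_or_allocate(const uint8_t *contents)
      {
          for (int i = 0; i < n_mpages; i++) {
              if (memcmp(mpages[i].data, contents, PAGE_SIZE) == 0) {
                  mpages[i].refcount++;     /* share: map read-only into the VM */
                  return i;
              }
          }
          assert(n_mpages < MAX_MACHINE_PAGES);
          memcpy(mpages[n_mpages].data, contents, PAGE_SIZE);
          mpages[n_mpages].refcount = 1;
          return n_mpages++;
      }

      /* On a write fault to a shared page, break sharing with a private copy. */
      static int copy_on_write(int mpn)
      {
          if (mpages[mpn].refcount == 1)
              return mpn;                   /* sole owner: write in place */
          assert(n_mpages < MAX_MACHINE_PAGES);
          mpages[mpn].refcount--;
          memcpy(mpages[n_mpages].data, mpages[mpn].data, PAGE_SIZE);
          mpages[n_mpages].refcount = 1;
          return n_mpages++;
      }

      int main(void)
      {
          uint8_t zeros[PAGE_SIZE] = {0};
          int a = share_or_allocate(zeros);   /* VM 1 maps an all-zero page   */
          int b = share_or_allocate(zeros);   /* VM 2 maps the same contents  */
          int c = copy_on_write(b);           /* VM 2 writes: sharing breaks  */
          printf("a=%d b=%d after-write=%d refcount[a]=%d\n",
                 a, b, c, mpages[a].refcount);
          return 0;
      }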

  • Disco: running commodity operating systems on scalable multiprocessors
    1997
    Co-Authors: Edouard Bugnion, Scott W Devine, Mendel Rosenblum
    Abstract:

    In this paper we examine the problem of extending modern operating systems to run efficiently on large-scale shared memory multiprocessors without a large implementation effort. Our approach brings back an idea popular in the 1970s: virtual machine monitors. We use virtual machines to run multiple commodity operating systems on a scalable multiprocessor. This solution addresses many of the challenges facing the system software for these machines. We demonstrate our approach with a prototype called Disco that can run multiple copies of Silicon Graphics' IRIX operating system on a multiprocessor. Our experience shows that the overheads of the monitor are small and that the approach provides scalability as well as the ability to deal with the non-uniform memory access time of these systems. To reduce the memory overheads associated with running multiple operating systems, we have developed techniques where the virtual machines transparently share major data structures such as the program code and the file system buffer cache. We use the distributed system support of modern operating systems to export a partial single system image to the users. The overall solution achieves most of the benefits of operating systems customized for scalable multiprocessors, yet it can be achieved with a significantly smaller implementation effort.

  • Memory system performance of UNIX on CC-NUMA multiprocessors
    1995
    Co-Authors: John Chapin, Mendel Rosenblum, A Herrod, Anoop Gupta
    Abstract:

    This study characterizes the performance of a variant of UNIX SVR4 on a large shared-memory multiprocessor and analyzes the effects of possible OS and architectural changes. We use a nonintrusive cache miss monitor to trace the execution of an OS-intensive multiprogrammed workload on the Stanford DASH, a 32-CPU CC-NUMA multiprocessor (CC-NUMA multiprocessors have cache-coherent shared memory that is physically distributed across the machine). We find that our version of UNIX accounts for 24% of the workload's total execution time. A surprisingly large fraction of OS time (79%) is spent on memory system stalls, divided equally between instruction and data cache miss time. In analyzing techniques to reduce instruction cache miss stall time, we find that replication of only 7% of the OS code would allow 80% of instruction cache misses to be serviced locally on a CC-NUMA machine. For data cache misses, we find that a small number of routines account for 96% of OS data cache stall time. We find that most of these misses are coherence (communication) misses, and larger caches will not necessarily help. After presenting detailed performance data, we analyze the benefits of several OS changes and predict the effects of altering the cache configuration, degree of clustering, and cache coherence mechanism of the machine. (This paper is available via http://wwwflash.stanford.edu.)

Anant Agarwal - One of the best experts on this subject based on the ideXlab platform.

  • Automatic partitioning of parallel loops and data arrays for distributed shared-memory multiprocessors
    1995
    Co-Authors: Anant Agarwal, D A Kranz, V Natarajan
    Abstract:

    Presents a theoretical framework for automatically partitioning parallel loops to minimize cache coherency traffic on shared-memory multiprocessors. While several previous papers have looked at hyperplane partitioning of iteration spaces to reduce communication traffic, the problem of deriving the optimal tiling parameters for minimal communication in loops with general affine index expressions has remained open. Our paper solves this open problem by presenting a method for deriving an optimal hyperparallelepiped tiling of iteration spaces for minimal communication in multiprocessors with caches. We show that the same theoretical framework can also be used to determine optimal tiling parameters for both data and loop partitioning in distributed memory multicomputers. Our framework uses matrices to represent iteration and data space mappings and the notion of uniformly intersecting references to capture temporal locality in array references. We introduce the notion of data footprints to estimate the communication traffic between processors and use linear algebraic methods and lattice theory to compute precisely the size of data footprints. We have implemented this framework in a compiler for Alewife, a distributed shared-memory multiprocessor.
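
    The flavour of the tiling problem can be seen in a drastically simplified special case: an N × N parallel loop with a nearest-neighbour reference pattern is tiled into a p × q grid of rectangular blocks over P processors, so per-tile communication is roughly proportional to the tile boundary, N/p + N/q. The sketch below just searches the divisors of P for the shape that minimizes that quantity; the paper's framework handles general affine index expressions and parallelepiped tiles, and the numbers here are made up.

      #include <stdio.h>

      int main(void)
      {
          const int N = 1024, P = 16;       /* made-up loop size and processor count */
          int best_p = 1;
          double best_comm = 1e30;

          for (int p = 1; p <= P; p++) {
              if (P % p != 0)
                  continue;                 /* only rectangular p x q grids with p*q = P */
              int q = P / p;
              double comm = (double)N / p + (double)N / q;  /* boundary elements per tile */
              if (comm < best_comm) {
                  best_comm = comm;
                  best_p = p;
              }
          }
          printf("best tile grid: %d x %d (about %.0f boundary elements per tile)\n",
                 best_p, P / best_p, best_comm);
          return 0;
      }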

  • LimitLESS directories: a scalable cache coherence scheme
    1991
    Co-Authors: David Chaiken, John Kubiatowicz, Anant Agarwal
    Abstract:

    Caches enhance the performance of multiprocessors by reducing network traffic and average memory access latency. However, cache-based systems must address the problem of cache coherence. We propose the LimitLESS directory protocol to solve this problem. The LimitLESS scheme uses a combination of hardware and software techniques to realize the performance of a full-map directory with the memory overhead of a limited directory. This protocol is supported by Alewife, a large-scale multiprocessor. We describe the architectural interfaces needed to implement the LimitLESS directory, and evaluate its performance through simulations of the Alewife machine.
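
    The combination of a few hardware pointers with a software-handled overflow can be sketched as a data structure. In the toy model below, each directory entry holds a small fixed number of sharer pointers; once they are exhausted, a simulated software handler spills them into a full bit-vector and maintains the complete sharer list from then on. Constants and field names are invented for the illustration.

      #include <stdbool.h>
      #include <stdio.h>

      #define HW_POINTERS 4           /* pointers the "hardware" directory can hold */
      #define MAX_NODES   64

      typedef struct {
          int  sharers[HW_POINTERS];  /* node IDs caching this memory block         */
          int  count;                 /* hardware pointers currently in use         */
          bool overflow;              /* set once software takes over the full list */
          bool sw_sharers[MAX_NODES]; /* software-maintained bit vector after overflow */
      } dir_entry_t;

      /* Record a new sharer; fall back to software emulation when hardware is full. */
      static void add_sharer(dir_entry_t *e, int node)
      {
          if (!e->overflow && e->count < HW_POINTERS) {
              e->sharers[e->count++] = node;        /* fast, all-hardware path */
              return;
          }
          if (!e->overflow) {                       /* first overflow: trap to software */
              e->overflow = true;
              for (int i = 0; i < e->count; i++)    /* spill hardware pointers to memory */
                  e->sw_sharers[e->sharers[i]] = true;
          }
          e->sw_sharers[node] = true;               /* software full-map directory */
      }

      /* On a write, every recorded sharer must be sent an invalidation. */
      static void invalidate_sharers(const dir_entry_t *e)
      {
          if (!e->overflow) {
              for (int i = 0; i < e->count; i++)
                  printf("invalidate node %d\n", e->sharers[i]);
          } else {
              for (int n = 0; n < MAX_NODES; n++)
                  if (e->sw_sharers[n])
                      printf("invalidate node %d\n", n);
          }
      }

      int main(void)
      {
          dir_entry_t e = {0};
          for (int node = 0; node < 6; node++)      /* 6 sharers overflow 4 HW pointers */
              add_sharer(&e, node);
          invalidate_sharers(&e);
          return 0;
      }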

Yi Zhang - One of the best experts on this subject based on the ideXlab platform.

  • A simple, fast and scalable non-blocking concurrent FIFO queue for shared memory multiprocessor systems
    2001
    Co-Authors: Philippas Tsigas, Yi Zhang
    Abstract:

    A non-blocking FIFO queue algorithm for multiprocessor shared memory systems is presented in this paper. The algorithm is very simple, fast, and scales very well on both symmetric and non-symmetric multiprocessor shared memory systems. Experiments on a 64-node SUN Enterprise 10000 (a symmetric multiprocessor system) and on a 64-node SGI Origin 2000 (a cache-coherent non-uniform memory access multiprocessor system) indicate that our algorithm considerably outperforms the best of the known alternatives on both multiprocessors at any level of multiprogramming. This work introduces two new, simple algorithmic mechanisms. The first lowers the contention on key variables used by the concurrent enqueue and/or dequeue operations, which consequently results in the good performance of the algorithm; the second deals with the pointer recycling problem, an inconsistency problem that all non-blocking algorithms based on the compare-and-swap synchronisation primitive have to address. In our construction we chose to use compare-and-swap because it is an atomic primitive that scales well under contention and either is supported by modern multiprocessors or can be implemented efficiently on them.
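
    For a feel of how compare-and-swap is used in such queues, the sketch below gives a generic linked-list lock-free queue in C11 atomics (essentially the well-known Michael-Scott construction), not the algorithm of this paper. It also deliberately ignores memory reclamation, so it does not address the pointer-recycling problem the paper tackles: dequeued nodes are simply leaked.

      #include <stdatomic.h>
      #include <stdlib.h>

      typedef struct node {
          void *val;
          _Atomic(struct node *) next;
      } node_t;

      typedef struct {
          _Atomic(node_t *) head;   /* points at a dummy node */
          _Atomic(node_t *) tail;
      } queue_t;

      static void queue_init(queue_t *q)
      {
          node_t *dummy = malloc(sizeof *dummy);
          dummy->val = NULL;
          atomic_init(&dummy->next, NULL);
          atomic_init(&q->head, dummy);
          atomic_init(&q->tail, dummy);
      }

      static void enqueue(queue_t *q, void *val)
      {
          node_t *n = malloc(sizeof *n);
          n->val = val;
          atomic_init(&n->next, NULL);
          for (;;) {
              node_t *tail = atomic_load(&q->tail);
              node_t *next = atomic_load(&tail->next);
              if (tail != atomic_load(&q->tail))
                  continue;                                    /* tail moved, retry */
              if (next == NULL) {
                  /* Try to link the new node after the current last node. */
                  if (atomic_compare_exchange_weak(&tail->next, &next, n)) {
                      /* Swing tail forward; fine if another thread beat us to it. */
                      atomic_compare_exchange_weak(&q->tail, &tail, n);
                      return;
                  }
              } else {
                  /* Tail is lagging: help advance it, then retry. */
                  atomic_compare_exchange_weak(&q->tail, &tail, next);
              }
          }
      }

      static void *dequeue(queue_t *q)
      {
          for (;;) {
              node_t *head = atomic_load(&q->head);
              node_t *tail = atomic_load(&q->tail);
              node_t *next = atomic_load(&head->next);
              if (head != atomic_load(&q->head))
                  continue;
              if (next == NULL)
                  return NULL;                                 /* queue is empty */
              if (head == tail) {
                  /* Tail is lagging behind head: help advance it. */
                  atomic_compare_exchange_weak(&q->tail, &tail, next);
                  continue;
              }
              void *val = next->val;
              if (atomic_compare_exchange_weak(&q->head, &head, next))
                  return val;      /* old dummy node is leaked (no reclamation here) */
          }
      }

      int main(void)
      {
          queue_t q;
          int x = 42;
          queue_init(&q);
          enqueue(&q, &x);
          int *p = dequeue(&q);
          return p ? 0 : 1;
      }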

Karin Petersen - One of the best experts on this subject based on the ideXlab platform.

  • Cache coherence for shared memory multiprocessors based on virtual memory support
    1993
    Co-Authors: Karin Petersen
    Abstract:

    This paper presents a software cache coherence scheme that uses virtual memory (VM) support to maintain cache coherence for shared memory multiprocessors. Traditional VM translation hardware in each processor is used to detect memory access attempts that would violate cache coherence, and system software is used to enforce coherence. The implementation of this class of coherence schemes is very economical: it requires neither special multiprocessor hardware nor compiler support, and easily incorporates different consistency models. The authors evaluated two consistency models for the VM-based approach: sequential consistency and lazy release consistency. The VM-based schemes are compared with a bus-based snoopy caching architecture, and the authors' trace-driven simulation results show that the VM-based cache coherence schemes are practical for small-scale, shared memory multiprocessors.
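
    The core mechanism, using ordinary virtual-memory protection hardware to catch accesses that the coherence software must mediate, can be sketched on a modern Unix system with mprotect and a SIGSEGV handler, as below. A real system would fetch the page from its home node and apply the chosen consistency model inside the handler; here the handler merely upgrades access rights, and the region size and names are made up.

      #include <signal.h>
      #include <stdint.h>
      #include <stdio.h>
      #include <sys/mman.h>
      #include <unistd.h>

      #define REGION_PAGES 16
      static uint8_t *shared;        /* local view of the shared region */
      static long page_size;

      /* Fault handler standing in for the coherence software: a real system
       * would obtain the current page contents from its owner and enforce the
       * consistency model; this sketch just grants read/write access.        */
      static void coherence_fault(int sig, siginfo_t *si, void *ctx)
      {
          (void)sig; (void)ctx;
          uintptr_t page = (uintptr_t)si->si_addr & ~(uintptr_t)(page_size - 1);
          mprotect((void *)page, page_size, PROT_READ | PROT_WRITE);
      }

      int main(void)
      {
          page_size = sysconf(_SC_PAGESIZE);
          /* Pages start out inaccessible, so every first touch traps to software. */
          shared = mmap(NULL, REGION_PAGES * page_size, PROT_NONE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

          struct sigaction sa = { .sa_sigaction = coherence_fault,
                                  .sa_flags = SA_SIGINFO };
          sigaction(SIGSEGV, &sa, NULL);

          shared[0] = 42;            /* faults once; handler grants access */
          printf("%d\n", shared[0]);
          return 0;
      }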