Interrupt Handling


The experts below are selected from a list of 1617 experts worldwide ranked by the ideXlab platform.

B Jacob - One of the best experts on this subject based on the ideXlab platform.

  • In-line Interrupt Handling and lock-up free translation lookaside buffers (TLBs)
    IEEE Transactions on Computers, 2006
    Co-Authors: Aamer Jaleel, B Jacob
    Abstract:

    The effects of the general-purpose precise interrupt mechanisms in use for the past few decades have received very little attention. When modern out-of-order processors handle interrupts precisely, they typically begin by flushing the pipeline to make the CPU available to execute handler instructions. In doing so, the CPU ends up flushing many instructions that have been brought into the reorder buffer. In particular, these instructions may have reached a very deep stage in the pipeline, representing significant work that is wasted. In addition, an overhead of several cycles and wasted energy (per exception detected) can be expected in refetching and re-executing the flushed instructions. This paper concentrates on improving the performance of precisely handling software-managed translation lookaside buffer (TLB) interrupts, one of the most frequently occurring interrupts. The paper presents a novel method of in-lining the interrupt handler within the reorder buffer. Since the first-level interrupt handlers of TLBs are usually small, they can potentially fit in the reorder buffer along with the user-level code already there. In doing so, the instructions that would otherwise be flushed from the pipeline need not be refetched and re-executed. Additionally, instructions independent of the exceptional instruction can continue to execute in parallel with the handler code. In-lining the TLB interrupt handler thus provides lock-up-free TLBs. This paper proposes the prepend and append schemes for in-lining the interrupt handler into the available reorder-buffer space. The two schemes are implemented on a performance model of the Alpha 21264 processor built by Alpha designers at the Palo Alto Design Center (PADC), California. We compare the overhead and performance impact of handling TLB interrupts with the traditional scheme, the append in-lined scheme, and the prepend in-lined scheme. For small, medium, and large memory footprints, the overhead is quantified by comparing the number and pipeline state of instructions flushed, the energy savings, and the performance improvements. We find that lock-up-free TLBs reduce the overhead of refetching and re-executing flushed instructions by 30-95 percent, reduce execution time by 5-25 percent, and reduce wasted energy by 30-90 percent.
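The first-level miss handler the paper in-lines can be illustrated with a toy software-managed TLB in C. This is a hypothetical sketch, not the paper's code: the direct-mapped organization, sizes, names, and the linear page table are our own, loosely MIPS-style assumptions. Its point is only to show why such a refill handler is small enough to fit in a reorder buffer alongside user code:

```c
/* Hypothetical sketch of a software-managed TLB: a direct-mapped
 * translation cache refilled by a short first-level miss handler.
 * Sizes, names, and the linear page table are illustrative only. */
#include <assert.h>
#include <stdint.h>

#define TLB_ENTRIES 64
#define PAGE_SHIFT  12
#define PT_PAGES    1024              /* toy linear page table */

typedef struct { uint32_t vpn; uint32_t pfn; int valid; } tlb_entry;

static tlb_entry tlb[TLB_ENTRIES];
static uint32_t  page_table[PT_PAGES];   /* vpn -> pfn */
static unsigned  tlb_misses;

/* First-level miss handler: a handful of loads and stores -- small
 * enough, the paper argues, to in-line into the reorder buffer. */
static void tlb_refill(uint32_t vpn)
{
    tlb_entry *e = &tlb[vpn % TLB_ENTRIES];
    e->vpn   = vpn;
    e->pfn   = page_table[vpn];
    e->valid = 1;
    tlb_misses++;
}

/* Translate a virtual address, invoking the miss handler on a miss. */
static uint32_t translate(uint32_t vaddr)
{
    uint32_t vpn = vaddr >> PAGE_SHIFT;
    tlb_entry *e = &tlb[vpn % TLB_ENTRIES];
    if (!e->valid || e->vpn != vpn)
        tlb_refill(vpn);
    return (tlb[vpn % TLB_ENTRIES].pfn << PAGE_SHIFT)
         | (vaddr & ((1u << PAGE_SHIFT) - 1));
}
```

The refill path contains no loops or calls, which is the property the prepend and append in-lining schemes exploit.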

  • ICCD - In-line Interrupt Handling for software-managed TLBs
    Proceedings 2001 IEEE International Conference on Computer Design: VLSI in Computers and Processors. ICCD 2001, 2001
    Co-Authors: Aamer Jaleel, B Jacob
    Abstract:

    The general-purpose precise interrupt mechanism, which has long been used to handle exceptional conditions that occur infrequently, is increasingly being used to handle conditions that are neither exceptional nor infrequent. One example is the use of interrupts for memory management, e.g., to handle translation lookaside buffer (TLB) misses in today's microprocessors. When modern out-of-order processors handle interrupts precisely, they typically begin by flushing the pipeline. Doing so makes the CPU available to execute handler instructions, but it wastes potentially hundreds of cycles of execution time. However, if the handler code is small, it can potentially fit in the reorder buffer along with the user-level code already there, essentially in-lining the interrupt-handler code. One good example of where this is both possible and useful is the TLB-miss handler in a software-managed TLB implementation. We simulate a lock-up-free data-TLB facility on a processor model with a 4-way out-of-order core reminiscent of the Alpha 21264. We find that, by using lock-up-free TLBs, one can get the performance of a fully associative TLB from a lock-up-free TLB one-fourth the size.

Aamer Jaleel - One of the best experts on this subject based on the ideXlab platform.

  • In-line Interrupt Handling and lock-up free translation lookaside buffers (TLBs)
    IEEE Transactions on Computers, 2006
    Co-Authors: Aamer Jaleel, B Jacob
    Abstract:

    The effects of the general-purpose precise interrupt mechanisms in use for the past few decades have received very little attention. When modern out-of-order processors handle interrupts precisely, they typically begin by flushing the pipeline to make the CPU available to execute handler instructions. In doing so, the CPU ends up flushing many instructions that have been brought into the reorder buffer. In particular, these instructions may have reached a very deep stage in the pipeline, representing significant work that is wasted. In addition, an overhead of several cycles and wasted energy (per exception detected) can be expected in refetching and re-executing the flushed instructions. This paper concentrates on improving the performance of precisely handling software-managed translation lookaside buffer (TLB) interrupts, one of the most frequently occurring interrupts. The paper presents a novel method of in-lining the interrupt handler within the reorder buffer. Since the first-level interrupt handlers of TLBs are usually small, they can potentially fit in the reorder buffer along with the user-level code already there. In doing so, the instructions that would otherwise be flushed from the pipeline need not be refetched and re-executed. Additionally, instructions independent of the exceptional instruction can continue to execute in parallel with the handler code. In-lining the TLB interrupt handler thus provides lock-up-free TLBs. This paper proposes the prepend and append schemes for in-lining the interrupt handler into the available reorder-buffer space. The two schemes are implemented on a performance model of the Alpha 21264 processor built by Alpha designers at the Palo Alto Design Center (PADC), California. We compare the overhead and performance impact of handling TLB interrupts with the traditional scheme, the append in-lined scheme, and the prepend in-lined scheme. For small, medium, and large memory footprints, the overhead is quantified by comparing the number and pipeline state of instructions flushed, the energy savings, and the performance improvements. We find that lock-up-free TLBs reduce the overhead of refetching and re-executing flushed instructions by 30-95 percent, reduce execution time by 5-25 percent, and reduce wasted energy by 30-90 percent.

  • ICCD - In-line Interrupt Handling for software-managed TLBs
    Proceedings 2001 IEEE International Conference on Computer Design: VLSI in Computers and Processors. ICCD 2001, 2001
    Co-Authors: Aamer Jaleel, B Jacob
    Abstract:

    The general-purpose precise interrupt mechanism, which has long been used to handle exceptional conditions that occur infrequently, is increasingly being used to handle conditions that are neither exceptional nor infrequent. One example is the use of interrupts for memory management, e.g., to handle translation lookaside buffer (TLB) misses in today's microprocessors. When modern out-of-order processors handle interrupts precisely, they typically begin by flushing the pipeline. Doing so makes the CPU available to execute handler instructions, but it wastes potentially hundreds of cycles of execution time. However, if the handler code is small, it can potentially fit in the reorder buffer along with the user-level code already there, essentially in-lining the interrupt-handler code. One good example of where this is both possible and useful is the TLB-miss handler in a software-managed TLB implementation. We simulate a lock-up-free data-TLB facility on a processor model with a 4-way out-of-order core reminiscent of the Alpha 21264. We find that, by using lock-up-free TLBs, one can get the performance of a fully associative TLB from a lock-up-free TLB one-fourth the size.

Chisheng Shih - One of the best experts on this subject based on the ideXlab platform.

  • SAC - nμKernel: Microkernel for multi-core DSP SoCs with load sharing and priority interrupts
    Proceedings of the 28th Annual ACM Symposium on Applied Computing - SAC '13, 2013
    Co-Authors: Chisheng Shih, Hsin-yu Lai
    Abstract:

    The demands of modern embedded systems are hastening the adoption of multicore SoCs. Although multicore SoCs can be viewed conceptually as distributed systems, the resources on multicore SoCs, including interrupts and scheduling, are mostly, if not entirely, managed in a centralized manner by the operating system on the SoC's general-purpose CPU. This approach places heavy overhead on the general-purpose CPU and does not scale. This paper presents the design and implementation of a microkernel for multi-core DSP SoCs, named nμKernel, to support real-time scheduling, load sharing among DSP cores, nested priority interrupts, and predictable interrupt-latency jitter. The kernel takes advantage of both the shared-memory architecture of multicore DSP SoCs and pipelined real-time scheduling to support load sharing. A server-based algorithm is designed for overrun control and to reduce load-sharing overhead. The hybrid interrupt handling framework adopts an on-demand interrupt-thread mechanism to reduce interrupt handling overhead and to support nested priority interrupts. Experiments show that the kernel can significantly enhance application performance with minimal management overhead. The frame rate of a secure image-display application speeds up more than sixfold, from 2.2 frames per second to 19 frames per second, when the workload is shared among DSP cores. The interrupt handling framework shortens interrupt latency by up to 90% compared to a two-level interrupt handling mechanism and limits the range of interrupt latency to no more than 5%.
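The priority-ordered, deferred dispatch described above can be sketched in C. This is an illustrative simulation in the spirit of the hybrid framework, not nμKernel code; the pending bitmap, the function names, and the lowest-bit-is-highest-priority convention are our own assumptions:

```c
/* Illustrative sketch (not nuKernel code): a cheap "top half" only
 * marks an interrupt pending; handler work is deferred and drained in
 * priority order, so a higher-priority interrupt raised while handling
 * a lower-priority one runs before the remaining lower-priority work. */
#include <assert.h>
#include <stdint.h>

#define NIRQ 32

static volatile uint32_t pending;       /* bit i set => IRQ i pending  */
static void (*handler[NIRQ])(void);     /* deferred handler per IRQ    */
static int run_order[NIRQ], run_count;  /* trace of handler executions */

/* Top half: the only work done at interrupt time. */
static void raise_irq(int irq) { pending |= 1u << irq; }

/* Demo handler used below: handling IRQ 3 raises IRQ 1. */
static void h3(void) { raise_irq(1); }

/* Bottom half: drain pending IRQs; lowest bit = highest priority. */
static void dispatch(void)
{
    while (pending) {
        int irq = __builtin_ctz(pending);  /* highest-priority pending */
        pending &= ~(1u << irq);
        run_order[run_count++] = irq;
        if (handler[irq]) handler[irq]();  /* may raise further IRQs   */
    }
}
```

Because the dispatcher re-selects the highest-priority pending interrupt after every handler, nesting falls out of the drain loop rather than requiring a separate mechanism.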

Thomas M Conte - One of the best experts on this subject based on the ideXlab platform.

  • A fast interrupt handling scheme for VLIW processors
    International Conference on Parallel Architectures and Compilation Techniques, 1998
    Co-Authors: Emre Ozer, Sumedh W Sathaye, Kishore N Menezes, Sanjeev Banerjia, Matthew D Jennings, Thomas M Conte
    Abstract:

    Interrupt handling in out-of-order execution processors requires complex hardware schemes to maintain the sequential state. The amount of hardware is substantial in VLIW architectures because a very large number of instructions issue in each cycle, and precise interrupts are hard to implement in out-of-order execution machines, especially VLIW processors. In this paper, we apply the reorder buffer with future file and the history buffer methods to a VLIW platform, and present a novel scheme, called the current-state buffer, which employs modest hardware with compiler support. Unlike the other interrupt handling schemes, the current-state buffer does not keep history state, result buffering, or bypass mechanisms. It is a fast interrupt handling scheme with a relatively small buffer that records the execution and exception status of operations, making it suitable for embedded processors that require a fast interrupt handling mechanism with modest hardware.
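The current-state buffer's record-only approach can be sketched as follows. This is a minimal illustration under our own naming and simplifications (one in-flight VLIW word, software structures standing in for hardware), not the authors' design: only per-operation execution and exception status is kept, with no result buffering or history state, and the record tells recovery code which operations must re-execute:

```c
/* Minimal sketch of the current-state-buffer idea (names are ours):
 * for each issued VLIW word, record only the execution and exception
 * status of its operations. On an exception, the record identifies
 * which operations already completed, so only the rest are replayed. */
#include <assert.h>
#include <stdint.h>

#define OPS_PER_WORD 8

enum op_status { OP_NOT_ISSUED = 0, OP_DONE, OP_EXCEPTION };

typedef struct {
    uint64_t pc;                    /* address of the VLIW word */
    uint8_t  status[OPS_PER_WORD];  /* per-operation status     */
} csb_entry;

static csb_entry csb;               /* one in-flight word, for brevity */

static void csb_issue(uint64_t pc)
{
    csb.pc = pc;
    for (int i = 0; i < OPS_PER_WORD; i++)
        csb.status[i] = OP_NOT_ISSUED;
}

static void csb_complete(int op) { csb.status[op] = OP_DONE; }
static void csb_fault(int op)    { csb.status[op] = OP_EXCEPTION; }

/* After the exception handler runs, build a mask of the operations
 * that must still execute when the word is restarted. */
static uint32_t csb_replay_mask(void)
{
    uint32_t mask = 0;
    for (int i = 0; i < OPS_PER_WORD; i++)
        if (csb.status[i] != OP_DONE)
            mask |= 1u << i;
    return mask;
}
```

The contrast with a reorder buffer or history buffer is that nothing here stores operand values; correctness relies on compiler support for safe re-execution, which is what keeps the hardware modest.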

James H. Anderson - One of the best experts on this subject based on the ideXlab platform.

  • ECRTS - Robust Real-Time Multiprocessor Interrupt Handling Motivated by GPUs
    2012 24th Euromicro Conference on Real-Time Systems, 2012
    Co-Authors: Glenn A Elliott, James H. Anderson
    Abstract:

    Architectures in which multicore chips are augmented with graphics processing units (GPUs) have great potential in many domains where computationally intensive real-time workloads must be supported. However, unlike standard CPUs, GPUs are treated as I/O devices and require the use of interrupts to facilitate communication with CPUs. Given their disruptive nature, interrupts must be dealt with carefully in real-time systems. With GPU-driven interrupts, this disruptiveness is further compounded by the closed-source nature of GPU drivers. In this paper, these problems are considered and a solution is presented in the form of an extension to LITMUS^RT called klmirqd. The design of klmirqd targets systems with multiple CPUs and GPUs, settings in which interrupt-related issues arise that have not been previously addressed.
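The core idea of deferring GPU interrupt work to a schedulable entity can be sketched in C. This is our own simplification, not LITMUS^RT or klmirqd code: each deferred item is tagged with the priority of the real-time task that owns the GPU, and a worker processes items in owner-priority order, so interrupt work does not implicitly outrank unrelated higher-priority tasks:

```c
/* Illustrative sketch, not klmirqd code: GPU interrupt bottom halves
 * are queued with the owning task's priority and drained by a worker
 * in priority order (lower number = higher priority, conventionally).
 * Structures and names below are our own simplification. */
#include <assert.h>

#define MAXQ 16

typedef struct { int owner_prio; int id; } deferred_work;

static deferred_work q[MAXQ];
static int qlen;

/* "Top half": just enqueue, tagged with the owning task's priority. */
static void gpu_irq(int owner_prio, int id)
{
    q[qlen].owner_prio = owner_prio;
    q[qlen].id = id;
    qlen++;
}

/* Worker step: remove and return the item whose owner has the
 * highest priority; the caller then runs that bottom half. */
static int klmirqd_step(void)
{
    int best = 0;
    for (int i = 1; i < qlen; i++)
        if (q[i].owner_prio < q[best].owner_prio)
            best = i;
    int id = q[best].id;
    q[best] = q[--qlen];    /* remove by swapping with the last item */
    return id;
}
```

The design point this illustrates is inheritance of the owner's priority by interrupt-handling work, rather than running all interrupt work at a fixed high priority.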

  • RTSS - On the Implementation of Global Real-Time Schedulers
    2009 30th IEEE Real-Time Systems Symposium, 2009
    Co-Authors: Bjorn B. Brandenburg, James H. Anderson
    Abstract:

    An empirical study of implementation tradeoffs affecting global real-time schedulers, and in particular global EDF, is presented: the choice of ready-queue implementation, quantum-driven vs. event-driven scheduling, and interrupt handling strategy. This study, conducted using UNC's Linux-based LITMUS^RT on Sun's Niagara platform, suggests that implementation tradeoffs can impact schedulability as profoundly as scheduling-theoretic tradeoffs. For most of the considered workloads, implementation scalability proved not to be a key limitation of global EDF on the considered platform. Further, a combination of a parallel heap, event-driven scheduling, and dedicated interrupt handling performed best for most workloads.
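The ready-queue choice the study highlights can be illustrated with a deadline-ordered heap. The study evaluates a parallel heap; the sequential binary min-heap below, with our own names, only illustrates the ordering that makes a heap attractive for global EDF, where every free CPU must receive the pending job with the earliest absolute deadline:

```c
/* A minimal deadline-ordered ready queue for global EDF -- a sketch
 * of one implementation choice the study compares (it uses a parallel
 * heap; this sequential binary min-heap shows only the ordering). */
#include <assert.h>

#define MAXJOBS 64

static long heap[MAXJOBS];   /* absolute deadlines */
static int  nheap;

static void push_job(long deadline)
{
    int i = nheap++;
    heap[i] = deadline;
    while (i > 0 && heap[(i - 1) / 2] > heap[i]) {        /* sift up */
        long t = heap[i]; heap[i] = heap[(i - 1) / 2];
        heap[(i - 1) / 2] = t;
        i = (i - 1) / 2;
    }
}

/* Global EDF: the next job to run on any free CPU is the pending
 * job with the earliest absolute deadline. */
static long pop_earliest(void)
{
    long top = heap[0];
    heap[0] = heap[--nheap];
    int i = 0;
    for (;;) {                                          /* sift down */
        int l = 2 * i + 1, r = l + 1, m = i;
        if (l < nheap && heap[l] < heap[m]) m = l;
        if (r < nheap && heap[r] < heap[m]) m = r;
        if (m == i) break;
        long t = heap[i]; heap[i] = heap[m]; heap[m] = t;
        i = m;
    }
    return top;
}
```

In a real global scheduler this queue is shared by all CPUs, which is why the study's choice of a parallel heap, and how contention on the queue is handled, matters for schedulability.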