Itanium - Explore the Science & Experts

The experts below are selected from a list of 1623 experts worldwide, ranked by the ideXlab platform.

D. Lavery - One of the best experts on this subject based on the ideXlab platform.

Optimization for the Intel® Itanium® architecture register stack

International Symposium on Code Generation and Optimization (CGO 2003), 2003

Co-Authors: A. Settle, D.A. Connors, G. Hoflehner, D. Lavery

Abstract:

The Intel® Itanium® architecture contains a number of innovative compiler-controllable features designed to exploit instruction-level parallelism. New code generation and optimization techniques are critical to the application of these features to improve processor performance. For instance, the Itanium® architecture provides a compiler-controllable virtual register stack to reduce the penalty of memory accesses associated with procedure calls. The Itanium® Register Stack Engine (RSE) transparently manages the register stack and saves and restores physical registers to and from memory as needed. Existing code generation techniques for the register stack aggressively allocate virtual registers without regard to the register pressure on different control-flow paths. As such, applications with large data sets may stress the RSE and cause substantial execution delays due to the high number of register saves and restores. Since the Itanium® architecture is developed around Explicitly Parallel Instruction Computing (EPIC) concepts, solutions for increasing register stack efficiency favor code generation techniques rather than hardware approaches.
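To make the cost of aggressive register stack allocation concrete, the following is a minimal sketch of a register stack model, not the paper's method and not a faithful model of the hardware. It assumes 96 physical stacked registers (r32 to r127), spills the oldest registers first when a new frame does not fit, and refills a caller's frame on return. The class `RegisterStackModel` and the helper `simulate` are hypothetical names introduced only for illustration.

```python
PHYSICAL_STACKED_REGS = 96  # assumption: r32-r127 are the stacked registers


class RegisterStackModel:
    """Toy model: counts registers spilled to / filled from memory."""

    def __init__(self, capacity=PHYSICAL_STACKED_REGS):
        self.capacity = capacity
        self.frames = []    # requested size of each live frame, oldest first
        self.resident = []  # registers of each frame still held physically
        self.spills = 0
        self.fills = 0

    def call(self, frame_size):
        """Allocate a frame for a callee, spilling oldest registers if needed."""
        if frame_size > self.capacity:
            raise ValueError("frame larger than the physical register stack")
        self.frames.append(frame_size)
        self.resident.append(frame_size)
        overflow = sum(self.resident) - self.capacity
        i = 0
        while overflow > 0:
            take = min(self.resident[i], overflow)
            self.resident[i] -= take
            self.spills += take
            overflow -= take
            i += 1

    def ret(self):
        """Pop the callee frame and refill any spilled caller registers."""
        self.frames.pop()
        self.resident.pop()
        if self.frames:
            missing = self.frames[-1] - self.resident[-1]
            self.fills += missing
            self.resident[-1] = self.frames[-1]


def simulate(depth, frame_size):
    """Run a call chain of the given depth, then unwind it."""
    rse = RegisterStackModel()
    for _ in range(depth):
        rse.call(frame_size)
    for _ in range(depth):
        rse.ret()
    return rse.spills, rse.fills


if __name__ == "__main__":
    # Large frames on every path overflow the stack; small frames do not.
    print(simulate(10, 32))
    print(simulate(10, 8))
```

In this simplified model, a ten-deep call chain with 32-register frames moves hundreds of registers to and from memory, while the same chain with 8-register frames moves none, which is the kind of pressure a path-sensitive allocation strategy would try to reduce.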

T. Shpeisman - One of the best experts on this subject based on the ideXlab platform.

Just-in-time Java compilation for the Itanium® processor

Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (IEEE PACT), 2002

Co-Authors: T. Shpeisman, Guei-Yuan Lueh, A.-R. Adl-Tabatabai

Abstract:

This paper describes a just-in-time (JIT) Java compiler for the Intel® Itanium® processor. The Itanium processor is an example of an Explicitly Parallel Instruction Computing (EPIC) architecture and thus relies on aggressive and expensive compiler optimizations for performance. Static compilers for Itanium use aggressive global scheduling algorithms to extract instruction-level parallelism. In a JIT compiler, however, the additional overhead of such expensive optimizations may offset any gains from the improved code. In this paper, we describe lightweight code generation techniques for generating efficient Itanium code. Our compiler relies on two basic methods to generate efficient code. First, the compiler uses inexpensive scheduling heuristics to model the Itanium microarchitecture. Second, the compiler uses the semantics of the Java virtual machine to extract instruction-level parallelism.

Y. Zemach - One of the best experts on this subject based on the ideXlab platform.

IA-32 execution layer: a two-phase dynamic translator designed to support IA-32 applications on Itanium®-based systems

36th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-36), 2003

Co-Authors: L. Baraz, T. Devor, A. Skaletsky, Opher Etzion, Yun Wang, Shalom Goldenberg, Y. Zemach

Abstract:

IA-32 execution layer (IA-32 EL) is a new technology that executes IA-32 applications on Intel Itanium processor family systems. Currently, support for IA-32 applications on Itanium-based platforms is achieved using hardware circuitry on the Itanium processors. This capability will be enhanced with IA-32 EL, software that will ship with Itanium-based operating systems and will convert IA-32 instructions into Itanium instructions via dynamic translation. In this paper, we describe aspects of the IA-32 execution layer technology, including the general two-phase translation architecture and the usage of a single translator for multiple operating systems. The paper provides details of some of the technical challenges, such as precise exceptions, emulation of FP, MMX, and Intel Streaming SIMD Extensions (SSE) instructions, and misalignment handling. Finally, the paper presents some performance results.
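The abstract does not spell out how the two phases interact, so the following is only a generic sketch of a two-phase dynamic translator, under the assumption that a block is first translated cheaply and retranslated with heavier optimization once it has executed often enough. The class `TwoPhaseTranslator`, the threshold value, and the string placeholders standing in for translated code are all hypothetical and are not taken from IA-32 EL.

```python
HOT_THRESHOLD = 3  # hypothetical execution count that triggers retranslation


class TwoPhaseTranslator:
    """Toy dispatcher: quick 'cold' translation first, 'hot' once frequent."""

    def __init__(self, threshold=HOT_THRESHOLD):
        self.threshold = threshold
        self.cache = {}   # block address -> (phase, translated code)
        self.counts = {}  # block address -> number of executions so far

    def _translate_cold(self, addr):
        # Placeholder for a fast, lightly optimized translation.
        return f"cold:{addr:#x}"

    def _translate_hot(self, addr):
        # Placeholder for a slower, more aggressively optimized translation.
        return f"hot:{addr:#x}"

    def execute(self, addr):
        """Run the block at addr and return the phase of the code used."""
        count = self.counts.get(addr, 0) + 1
        self.counts[addr] = count
        phase, _code = self.cache.get(addr, (None, None))
        if phase is None:
            phase = "cold"
            self.cache[addr] = (phase, self._translate_cold(addr))
        elif phase == "cold" and count >= self.threshold:
            phase = "hot"
            self.cache[addr] = (phase, self._translate_hot(addr))
        return phase


if __name__ == "__main__":
    translator = TwoPhaseTranslator()
    print([translator.execute(0x1000) for _ in range(4)])
```

The design choice illustrated here is the usual trade-off in dynamic translation: rarely executed code pays only the cheap translation cost, and the expensive optimization is reserved for blocks that have proven to be frequently executed.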

A. Settle - One of the best experts on this subject based on the ideXlab platform.

Optimization for the Intel® Itanium® architecture register stack

International Symposium on Code Generation and Optimization (CGO 2003), 2003

Co-Authors: A. Settle, D.A. Connors, G. Hoflehner, D. Lavery

Abstract:

The Intel® Itanium® architecture contains a number of innovative compiler-controllable features designed to exploit instruction-level parallelism. New code generation and optimization techniques are critical to the application of these features to improve processor performance. For instance, the Itanium® architecture provides a compiler-controllable virtual register stack to reduce the penalty of memory accesses associated with procedure calls. The Itanium® Register Stack Engine (RSE) transparently manages the register stack and saves and restores physical registers to and from memory as needed. Existing code generation techniques for the register stack aggressively allocate virtual registers without regard to the register pressure on different control-flow paths. As such, applications with large data sets may stress the RSE and cause substantial execution delays due to the high number of register saves and restores. Since the Itanium® architecture is developed around Explicitly Parallel Instruction Computing (EPIC) concepts, solutions for increasing register stack efficiency favor code generation techniques rather than hardware approaches.

G. Lowney - One of the best experts on this subject based on the ideXlab platform.

Ispike: a post-link optimizer for the Intel® Itanium® architecture

International Symposium on Code Generation and Optimization (CGO 2004), 2004

Co-Authors: R. Muth, Harish Patil, R. Cohn, G. Lowney

Abstract:

Ispike is a post-link optimizer developed for the Intel® Itanium Processor Family (IPF) processors. The IPF architecture poses both opportunities and challenges to post-link optimizations. IPF offers a rich set of performance counters to collect detailed profile information at a low cost, which is essential to post-link optimization being practical. At the same time, the predication and bundling features on IPF make post-link code transformation more challenging than on other architectures. In Ispike, we have implemented optimizations like code layout, instruction prefetching, data layout, and data prefetching that exploit the IPF advantages, and strategies that cope with the IPF-specific challenges. Using SPEC CINT2000 as benchmarks, we show that Ispike improves performance by as much as 40% on the Itanium® 2 processor, with average improvements of 8.5% and 9.9% over executables generated by the Intel® Electron compiler and by the GCC compiler, respectively. We also demonstrate that statistical profiles collected via IPF performance counters and complete profiles collected via instrumentation produce equal performance benefit, but the profiling overhead is significantly lower for performance counters.
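As an illustration of what profile-driven code layout can look like, here is a minimal sketch of greedy basic-block chaining in the style of Pettis and Hansen. The abstract does not say which layout algorithm Ispike uses, so this is a swapped-in textbook technique, and the function `layout_blocks`, the block names, and the edge counts are hypothetical.

```python
def layout_blocks(blocks, edges, entry):
    """Order basic blocks so that frequently taken edges become fall-throughs.

    blocks: iterable of block names
    edges:  iterable of (src, dst, count) tuples from a profile
    entry:  name of the entry block, whose chain is placed first
    """
    blocks = list(blocks)
    chain_of = {b: [b] for b in blocks}  # every block starts as its own chain

    # Visit the hottest edges first and glue chains end-to-start.
    for src, dst, _count in sorted(edges, key=lambda e: e[2], reverse=True):
        a, b = chain_of[src], chain_of[dst]
        if a is not b and a[-1] == src and b[0] == dst:
            a.extend(b)
            for blk in b:
                chain_of[blk] = a

    # Emit the entry chain first, then remaining chains in input order.
    ordered, seen = [], set()
    for blk in [entry] + blocks:
        chain = chain_of[blk]
        if id(chain) not in seen:
            seen.add(id(chain))
            ordered.extend(chain)
    return ordered


if __name__ == "__main__":
    profile = [
        ("entry", "hot", 90),
        ("entry", "cold", 10),
        ("hot", "exit", 90),
        ("cold", "exit", 10),
    ]
    print(layout_blocks(["entry", "hot", "cold", "exit"], profile, "entry"))
```

With the sample profile, the hot path is laid out contiguously and the rarely executed block is moved to the end, which is the effect a layout pass aims for regardless of whether the edge counts come from performance-counter sampling or from instrumentation.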

