Cache Configuration

The experts below are selected from a list of 186 experts worldwide, as ranked by the ideXlab platform.

Greg Stitt - One of the best experts on this subject based on the ideXlab platform.

Naehyuck Chang - One of the best experts on this subject based on the ideXlab platform.

  • Low-Energy Off-Chip SDRAM Memory Systems for Embedded Applications
    ACM Transactions on Embedded Computing Systems, 2003
    Co-Authors: Hojun Shim, Yongseok Choi, Yongsoo Joo, Hyung Gyu Lee, Naehyuck Chang
    Abstract:

    Memory systems are dominant energy consumers, and thus many energy reduction techniques for memory buses and devices have been proposed. For practical energy reduction, we have to take into account the interaction between a processor and cache memories together with application programs. Furthermore, energy characterization of memory systems must be accurate enough to justify the various techniques. In this article, we build an in-house energy simulator for memory systems that is accelerated by special hardware support while maintaining accuracy. We explore the energy behavior of memory systems for various values of the processor and memory clock frequencies and the cache configuration. Each experiment is performed with 24M instruction steps of real application programs to guarantee accuracy. The simulator is based on precise energy characterization of memory systems, including buses, bus drivers, and memory devices, by a cycle-accurate energy measurement technique. We characterize the energy consumption of each component by an energy state machine whose states and transitions are associated with static and dynamic energy costs, respectively. (A minimal sketch of such a state-machine energy model appears after this list.) Our approach easily characterizes the energy consumption of complex SDRAMs. We divide and quantify the energy components of main memory systems for high-level reduction. The energy simulator enables us to devise practical energy reduction schemes by providing the actual amount of reduction out of the total energy consumption in main memory systems. We introduce several practical energy reduction techniques for SDRAM memory systems and demonstrate their energy reduction ratios over SDRAM memory systems built with commercial SDRAM controller chipsets. We classify SDRAM memory systems into high-performance and mid-performance classes and derive suitable system configurations for each class. For instance, a typical high-performance 32-bit, 64 MB SDRAM memory system consumes 19.6 mJ, 33.8 mJ, 35.4 mJ, and 37.0 mJ for 24M instructions of an MP3 decoder, a JPEG compressor, a JPEG decompressor, and an MPEG4 decoder, respectively. Our reduction scheme saves 12.7 mJ, 15.1 mJ, 15.5 mJ, and 14.8 mJ, and the reduction ratios are 64.8%, 44.6%, 43.8%, and 40.1%, respectively, without compromising execution speed.

  • Energy Exploration and Reduction of SDRAM Memory Systems
    Design Automation Conference, 2002
    Co-Authors: Yongsoo Joo, Yongseok Choi, Hojun Shim, Hyung Gyu Lee, Kwanho Kim, Naehyuck Chang
    Abstract:

    In this paper, we introduce a precise energy characterization of SDRAM main memory systems and explore the amount of energy associated with design parameters, leading to energy reduction techniques that we are able to recommend for practical use. We build an in-house energy simulator for SDRAM main memory systems based on cycle-accurate energy measurement and state-machine-based characterization, which independently characterizes dynamic and static energy. We explore the energy behavior of the memory systems by changing design parameters such as the processor clock, the memory clock, and the cache configuration. (A minimal sketch of such a design-parameter sweep appears after this list.) Finally, we propose new energy reduction techniques for the address bus and practical mode control schemes for the SDRAM devices. We save 10.8 mJ and 12 mJ, or 40.2% and 14.5% of the total energy, for 24M instructions of an MP3 decoder and a JPEG compressor, using a typical 32-bit, 64 MB SDRAM memory system.
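
The energy state machine described in the TECS article above can be pictured as follows. This is a minimal Python sketch, assuming that each state carries a static energy cost per cycle and each transition carries a dynamic energy cost; the state names and all numeric values are illustrative placeholders, not the measured SDRAM characterization.

    class EnergyStateMachine:
        # Static energy per cycle while resting in a state (placeholder values, nJ).
        STATIC_NJ = {"POWER_DOWN": 0.01, "IDLE": 0.05, "ACTIVE": 0.20}
        # Dynamic energy charged once per transition (placeholder values, nJ).
        DYNAMIC_NJ = {
            ("IDLE", "ACTIVE"): 1.5,       # row activate
            ("ACTIVE", "IDLE"): 0.8,       # precharge
            ("IDLE", "POWER_DOWN"): 0.1,
            ("POWER_DOWN", "IDLE"): 0.4,
        }

        def __init__(self, state="IDLE"):
            self.state = state
            self.energy_nj = 0.0

        def tick(self, cycles=1):
            """Accumulate static energy for staying in the current state."""
            self.energy_nj += cycles * self.STATIC_NJ[self.state]

        def transition(self, new_state):
            """Accumulate dynamic energy for the state change, then switch."""
            self.energy_nj += self.DYNAMIC_NJ.get((self.state, new_state), 0.0)
            self.state = new_state

    # Example: activate a row, burst for 8 cycles, precharge, then sit idle.
    sdram = EnergyStateMachine()
    sdram.transition("ACTIVE")
    sdram.tick(8)
    sdram.transition("IDLE")
    sdram.tick(100)
    print(f"Total energy: {sdram.energy_nj:.2f} nJ")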
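
The DAC paper's exploration of processor clock, memory clock, and cache configuration amounts to a sweep over the design space with an energy estimate at each point. Below is a minimal sketch of such a sweep; the closed-form estimate_energy_mj() is a crude stand-in for the measurement-based simulator, and the parameter values are assumptions.

    from itertools import product

    PROC_CLOCKS_MHZ = [100, 133, 166, 200]
    MEM_CLOCKS_MHZ = [66, 100, 133]
    CACHE_CONFIGS = [(8, 1, 16), (8, 2, 32), (16, 2, 32), (32, 4, 32)]  # (KB, ways, line bytes)

    def estimate_energy_mj(proc_mhz, mem_mhz, cache_cfg):
        """Placeholder model: bigger caches cut memory traffic but add static
        energy; slower clocks stretch execution time. Not measured data."""
        size_kb, ways, line_b = cache_cfg
        miss_rate = 0.10 / ((size_kb * ways) ** 0.5)
        runtime_s = 1.0e8 / (proc_mhz * 1e6)              # ~100M-instruction workload
        mem_energy_j = miss_rate * 1e8 * line_b * 1e-10 * (100 / mem_mhz)
        cache_energy_j = runtime_s * size_kb * 1e-4
        return (mem_energy_j + cache_energy_j) * 1e3      # mJ

    best = min(product(PROC_CLOCKS_MHZ, MEM_CLOCKS_MHZ, CACHE_CONFIGS),
               key=lambda point: estimate_energy_mj(*point))
    print("Lowest-energy point:", best, f"-> {estimate_energy_mj(*best):.2f} mJ")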

Hossein Asadi - One of the best experts on this subject based on the ideXlab platform.

  • ReCA: An Efficient Reconfigurable Cache Architecture for Storage Systems with Online Workload Characterization
    arXiv: Performance, 2018
    Co-Authors: Reza Salkhordeh, Shahriar Ebrahimi, Hossein Asadi
    Abstract:

    In recent years, SSDs have gained tremendous attention in computing and storage systems due to their significant performance improvement over HDDs. The cost per capacity of SSDs, however, prevents them from entirely replacing HDDs in such systems. One approach to effectively take advantage of SSDs is to use them as a caching layer that stores performance-critical data blocks to reduce the number of accesses to the disk subsystem. Due to characteristics of Flash-based SSDs such as limited write endurance and long latency on write operations, employing caching algorithms at the Operating System (OS) level necessitates taking such characteristics into consideration. Previous caching techniques are optimized towards only one type of application, which affects both their generality and applicability. In addition, they are not adaptive when the workload pattern changes over time. This paper presents an efficient Reconfigurable Cache Architecture (ReCA) for storage systems that uses a comprehensive workload characterization to find an optimal cache configuration for I/O-intensive applications. For this purpose, we first investigate various types of I/O workloads and classify them into five major classes. (An illustrative sketch of such a workload classifier appears after this list.) Based on this characterization, an optimal cache configuration is presented for each class of workloads. Then, using the main features of each class, we continuously monitor the characteristics of an application during system runtime, and the cache organization is reconfigured if the application moves from one class of workloads to another. The cache reconfiguration is done online, and the workload classes can be extended to emerging I/O workloads in order to maintain efficiency as the characteristics of I/O requests evolve. Experimental results obtained by implementing ReCA in a server running Linux show that the proposed architecture improves performance and lifetime by up to 24% and 33%, respectively.

  • ReCA: An Efficient Reconfigurable Cache Architecture for Storage Systems with Online Workload Characterization
    IEEE Transactions on Parallel and Distributed Systems, 2018
    Co-Authors: Reza Salkhordeh, Shahriar Ebrahimi, Hossein Asadi
    Abstract:

    In recent years, Solid-State Drives (SSDs) have gained tremendous attention in computing and storage systems due to their significant performance improvement over Hard Disk Drives (HDDs). The cost per capacity of SSDs, however, prevents them from entirely replacing HDDs in such systems. One approach to effectively take advantage of SSDs is to use them as a caching layer that stores performance-critical data blocks in order to reduce the number of accesses to the HDD-based disk subsystem. Due to characteristics of Flash-based SSDs such as limited write endurance and long latency on write operations, employing caching algorithms at the Operating System (OS) level necessitates taking such characteristics into consideration. Previous OS-level caching techniques are optimized towards only one type of application, which affects both their generality and applicability. In addition, they are not adaptive when the workload pattern changes over time. This paper presents an efficient Reconfigurable Cache Architecture (ReCA) for storage systems that uses a comprehensive workload characterization to find an optimal cache configuration for I/O-intensive applications. For this purpose, we first investigate various types of I/O workloads and classify them into five major classes. Based on this characterization, an optimal cache configuration is presented for each class of workloads. Then, using the main features of each class, we continuously monitor the characteristics of an application during system runtime, and the cache organization is reconfigured if the application moves from one class of workloads to another. The cache reconfiguration is done online, and the workload classes can be extended to emerging I/O workloads in order to maintain efficiency as the characteristics of I/O requests evolve. (An illustrative sketch of such a monitor-and-reconfigure loop appears after this list.) Experimental results obtained by implementing ReCA in a 4U rackmount server with SATA 6 Gb/s disk interfaces running Linux 3.17.0 show that the proposed architecture improves performance and lifetime by up to 24 and 33 percent, respectively.
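
ReCA's five workload classes are built from observed I/O features. Below is a minimal sketch of how such an online classifier could look, assuming read/write ratio and sequentiality as the features; the class names, thresholds, and features are illustrative assumptions, not ReCA's actual classifier.

    def classify_window(requests):
        """requests: list of (op, offset_kb, size_kb) tuples from one monitoring window."""
        reads = sum(1 for op, _, _ in requests if op == "read")
        read_ratio = reads / len(requests)
        # Sequentiality: fraction of requests that start where the previous one ended.
        seq = sum(1 for a, b in zip(requests, requests[1:])
                  if b[1] == a[1] + a[2]) / max(len(requests) - 1, 1)

        if read_ratio > 0.9:
            return "read-sequential" if seq > 0.7 else "read-random"
        if read_ratio < 0.1:
            return "write-sequential" if seq > 0.7 else "write-random"
        return "mixed"

    window = [("read", 0, 4), ("read", 4, 4), ("write", 512, 64), ("read", 8, 4)]
    print(classify_window(window))   # -> "mixed" for this toy window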
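
The runtime side of the paper, monitoring an application and reconfiguring the cache only when its workload class changes, could be organized roughly as below. This is a sketch under assumptions: CLASS_CONFIG, reconfigure(), and the interval-based loop are hypothetical stand-ins for ReCA's actual mechanism, and the classifier is the toy sketched just above.

    import time

    # Hypothetical per-class cache settings: (write policy, cache block size in KB).
    CLASS_CONFIG = {
        "read-sequential": ("read-only", 128),
        "read-random": ("read-only", 4),
        "write-sequential": ("write-back", 128),
        "write-random": ("write-back", 4),
        "mixed": ("write-back", 16),
    }

    def reconfigure(config):
        # Placeholder for the actual online cache-reconfiguration mechanism.
        print("reconfiguring SSD cache to", config)

    def monitor(sample_requests, classify, interval_s=60.0, rounds=3):
        """Re-evaluate the workload class once per interval; reconfigure on change."""
        current_class = None
        for _ in range(rounds):
            window = sample_requests()          # one interval's worth of I/O trace
            new_class = classify(window)
            if new_class != current_class:      # reconfigure only on a class change
                reconfigure(CLASS_CONFIG[new_class])
                current_class = new_class
            time.sleep(interval_s)

    # Toy run: a canned trace source and a trivial classifier, no waiting.
    monitor(lambda: [("read", i * 4, 4) for i in range(10)],
            lambda w: "read-sequential", interval_s=0, rounds=2)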

Ann Gordon-Ross - One of the best experts on this subject based on the ideXlab platform.

  • Offloading Cache Configuration Prediction to an FPGA for Hardware Speedup and Overhead Reduction (Work in Progress)
    International Conference on Hardware/Software Codesign and System Synthesis, 2019
    Co-Authors: Ruben Vazquez, Ann Gordon-Ross, Greg Stitt
    Abstract:

    In this paper, we present our cache configuration prediction methodology, offloaded to an FPGA for improved performance and reduced hardware overhead, while keeping the predicted cache configurations within 5% of the optimal-energy cache configuration for each application phase of the instruction and data caches. (An illustrative sketch of per-phase minimum-energy configuration selection appears after this list.)

  • Combining Code Reordering and Cache Configuration
    ACM Transactions on Embedded Computing Systems, 2012
    Co-Authors: Ann Gordon-Ross, Frank Vahid, Nikil Dutt
    Abstract:

    The instruction cache is a popular optimization target due to the cache's high impact on system performance and power and because of the cache's predictable temporal and spatial locality. This article is an in-depth study of the interaction of code reordering (a long-known technique) and cache configuration (a relatively new technique). Experimental results show that code reordering coupled with cache configuration reveals additional energy savings as high as 10-15% for several benchmarks, with cache area reductions as high as 48%. To exploit these additional benefits, we architect and evaluate several design exploration heuristics for combining the two methods. (An illustrative sketch of one such combined exploration heuristic appears after this list.)

  • On the Interplay of Loop Caching, Code Compression, and Cache Configuration
    Asia and South Pacific Design Automation Conference, 2011
    Co-Authors: Marisha Rawlins, Ann Gordon-Ross
    Abstract:

    Even though much previous work explores various instruction cache optimization techniques individually, little work explores the combined effects of these techniques (i.e., whether they complement or obviate each other). In this paper we explore the interaction of three optimizations: loop caching, cache tuning, and code compression. Results show that loop caching increases energy savings by as much as 26% compared to cache tuning alone and reduces decompression energy by as much as 73%. (An illustrative sketch of loop-cache fetch-energy accounting appears after this list.)
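
For the FPGA-offloaded prediction work above, the quantity being approximated is, for each application phase, the cache configuration with the lowest energy. Below is a minimal sketch of that exhaustive baseline and of the 5% tolerance check, with hypothetical phase names, configuration names, and energy numbers.

    # Placeholder per-phase energy table (mJ); real values would come from
    # simulation or measurement of the instruction/data caches.
    ENERGY_MJ = {
        "phase0": {"2KB_1way": 4.1, "4KB_2way": 3.2, "8KB_4way": 3.5},
        "phase1": {"2KB_1way": 6.0, "4KB_2way": 5.1, "8KB_4way": 4.8},
    }

    def optimal_config(phase):
        """Exhaustive baseline: the configuration with the lowest energy."""
        return min(ENERGY_MJ[phase], key=ENERGY_MJ[phase].get)

    def within_tolerance(phase, predicted, tol=0.05):
        """True if a predicted configuration is within tol of the optimum."""
        best_mj = ENERGY_MJ[phase][optimal_config(phase)]
        return ENERGY_MJ[phase][predicted] <= best_mj * (1 + tol)

    print(optimal_config("phase0"))                  # 4KB_2way
    print(within_tolerance("phase0", "8KB_4way"))    # False: 3.5 > 3.2 * 1.05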
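
One of the simplest exploration heuristics for combining the two techniques in the code-reordering article is to reorder first and then tune the cache on the reordered program. This is a sketch under assumptions: the configuration space and the simulate() energy model below are illustrative placeholders, not the heuristics or simulator evaluated in the article.

    from itertools import product

    SIZES_KB, WAYS, LINES_B = [2, 4, 8], [1, 2, 4], [16, 32, 64]

    def simulate(size_kb, ways, line_b, reordered):
        """Placeholder energy model (mJ): reordering improves spatial locality,
        which lets smaller caches reach similar miss rates."""
        locality = 1.3 if reordered else 1.0
        miss_rate = 0.08 / ((size_kb * ways * locality) ** 0.5)
        return miss_rate * 100.0 + size_kb * 0.2 + line_b * 0.01

    def tune_cache(reordered):
        """Exhaustively pick the lowest-energy cache for the given binary."""
        return min(product(SIZES_KB, WAYS, LINES_B),
                   key=lambda cfg: simulate(*cfg, reordered=reordered))

    for label, reordered in (("tuned only", False), ("reorder + tune", True)):
        cfg = tune_cache(reordered)
        print(f"{label:15s} {cfg} -> {simulate(*cfg, reordered=reordered):.2f} mJ")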
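
The reason loop caching adds savings on top of cache tuning in the ASP-DAC paper is that the instructions of a tight loop can be fetched from a tiny loop cache instead of the larger tuned instruction cache. Below is a minimal sketch of that fetch-energy accounting; the fill policy and the per-fetch energy numbers are illustrative assumptions.

    LOOP_CACHE_ENTRIES = 32
    E_LOOP_FETCH_NJ = 0.05      # placeholder energy per loop-cache fetch
    E_ICACHE_FETCH_NJ = 0.40    # placeholder energy per instruction-cache fetch

    def fetch_energy(trace):
        """trace: iterable of instruction addresses (word-aligned indices)."""
        loop_cache = set()
        energy_nj = 0.0
        prev = None
        for addr in trace:
            if addr in loop_cache:
                energy_nj += E_LOOP_FETCH_NJ
            else:
                energy_nj += E_ICACHE_FETCH_NJ
                # Simplified fill policy: capture the loop body after a short backward branch.
                if prev is not None and addr < prev and prev - addr < LOOP_CACHE_ENTRIES:
                    loop_cache = set(range(addr, prev + 1))
            prev = addr
        return energy_nj

    # A 10-instruction loop executed 100 times.
    trace = [a for _ in range(100) for a in range(10)]
    print(f"with loop cache: {fetch_energy(trace):.1f} nJ, "
          f"without: {len(trace) * E_ICACHE_FETCH_NJ:.1f} nJ")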

Ruben Vazquez - One of the best experts on this subject based on the ideXlab platform.