Hardware Cache

The experts below are selected from a list of 21,306 experts worldwide, ranked by the ideXlab platform.

Luca Benini - One of the best experts on this subject based on the ideXlab platform.

  • Work-in-Progress: DORY: Lightweight Memory Hierarchy Management for Deep NN Inference on IoT Endnodes
    2019
    Co-Authors: Alessio Burrello, Francesco Conti, Angelo Garofalo, Davide Rossi, Luca Benini
    Abstract:

    IoT endnodes often couple a small, fast L1 scratchpad memory with a higher-capacity but lower-bandwidth, slower L2 background memory. The absence of a coherent hardware cache hierarchy saves energy, but at the cost of labor-intensive explicit memory management, which complicates the deployment of algorithms with a large data memory footprint, such as Deep Neural Network (DNN) inference. In this work, we present DORY, a lightweight software cache dedicated to DNN Deployment Oriented to memoRY. DORY leverages static data tiling and DMA-based double buffering to hide the complexity of manual L1-L2 memory traffic management. DORY enables storage of activations and weights in L2 with less than 4% performance overhead with respect to direct execution in L1. We show that a 142 kB DNN achieving 79.9% accuracy on CIFAR-10 runs 3.2× faster and consumes 1.9× less energy than when executed directly from L2 memory.
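
    The core mechanism named in the abstract, DMA-based double buffering between L2 and an L1 scratchpad, follows a well-known pattern. Below is a minimal C sketch of that pattern, not DORY's actual code: dma_memcpy_async(), dma_wait(), compute_tile(), and the tile size are hypothetical placeholders for a platform's DMA driver and compute kernel.

    /* Stream tiles from L2 through two L1 buffers: while tile i is being
       computed, tile i+1 is already in flight on the DMA, so the L2
       transfer latency is hidden behind computation. */
    #include <stddef.h>
    #include <stdint.h>

    #define TILE_BYTES 4096                     /* illustrative tile size */

    /* hypothetical DMA HAL, standing in for a real platform driver */
    typedef int dma_handle_t;
    dma_handle_t dma_memcpy_async(void *dst, const void *src, size_t bytes);
    void dma_wait(dma_handle_t h);

    void compute_tile(const uint8_t *tile_l1, size_t bytes); /* placeholder kernel */

    void process_tiles(const uint8_t *src_l2, size_t n_tiles)
    {
        static uint8_t l1_buf[2][TILE_BYTES];   /* double buffer in L1 */
        dma_handle_t pending;

        if (n_tiles == 0)
            return;

        /* prologue: start fetching tile 0 */
        pending = dma_memcpy_async(l1_buf[0], src_l2, TILE_BYTES);

        for (size_t i = 0; i < n_tiles; i++) {
            dma_wait(pending);                  /* tile i now resident in L1 */

            if (i + 1 < n_tiles)                /* prefetch tile i+1 */
                pending = dma_memcpy_async(l1_buf[(i + 1) % 2],
                                           src_l2 + (i + 1) * TILE_BYTES,
                                           TILE_BYTES);

            compute_tile(l1_buf[i % 2], TILE_BYTES);
        }
    }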

  • Efficient OpenMP support and extensions for MPSoCs with explicitly managed memory hierarchy
    2009
    Co-Authors: Andrea Marongiu, Luca Benini
    Abstract:

    OpenMP is a de facto standard interface for the shared-address-space parallel programming model. Recently, there have been many attempts to use it as a programming environment for embedded MultiProcessor Systems-on-Chip (MPSoCs), due both to the ease of specifying parallel execution within sequential code through OpenMP directives and to the lack of a standard parallel programming method for MPSoCs. However, MPSoC platforms for embedded applications often feature non-uniform, explicitly managed memory hierarchies with no hardware cache coherency, as well as heterogeneous cores with heterogeneous run-time systems. In this paper we present an optimized implementation of the compiler and runtime support infrastructure for OpenMP programming on a non-cache-coherent, distributed-memory MPSoC with explicitly managed scratchpad memories (SPMs). The proposed framework features specific extensions to the OpenMP programming model that leverage explicit management of the memory hierarchy. Experimental results on several real-life applications confirm the effectiveness of the optimizations in terms of performance improvements.
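
    As a concrete illustration of the general pattern the paper targets, the following C sketch combines a standard OpenMP work-sharing loop with explicit staging of data into a scratchpad. It is a minimal sketch under stated assumptions, not the paper's framework: spm_alloc() and spm_memcpy() are hypothetical stand-ins for the platform-specific SPM management that the proposed OpenMP extensions encapsulate; only the pragma is standard OpenMP.

    #include <omp.h>
    #include <stddef.h>

    void *spm_alloc(size_t bytes);                             /* hypothetical SPM allocator */
    void spm_memcpy(void *dst, const void *src, size_t bytes); /* hypothetical SPM copy */

    void scale_buffer(const float *in, float *out, int n, float k)
    {
        /* stage the read-only input into fast scratchpad memory first,
           since there is no hardware cache to do this transparently */
        float *in_spm = spm_alloc((size_t)n * sizeof(float));
        spm_memcpy(in_spm, in, (size_t)n * sizeof(float));

        /* standard OpenMP work-sharing over the SPM-resident copy */
        #pragma omp parallel for
        for (int i = 0; i < n; i++)
            out[i] = k * in_spm[i];
    }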

Alessio Burrello - One of the best experts on this subject based on the ideXlab platform.

  • Work-in-Progress: DORY: Lightweight Memory Hierarchy Management for Deep NN Inference on IoT Endnodes
    2019
    Co-Authors: Alessio Burrello, Francesco Conti, Angelo Garofalo, Davide Rossi, Luca Benini
    Abstract:

    IoT endnodes often couple a small, fast L1 scratchpad memory with a higher-capacity but lower-bandwidth, slower L2 background memory. The absence of a coherent hardware cache hierarchy saves energy, but at the cost of labor-intensive explicit memory management, which complicates the deployment of algorithms with a large data memory footprint, such as Deep Neural Network (DNN) inference. In this work, we present DORY, a lightweight software cache dedicated to DNN Deployment Oriented to memoRY. DORY leverages static data tiling and DMA-based double buffering to hide the complexity of manual L1-L2 memory traffic management. DORY enables storage of activations and weights in L2 with less than 4% performance overhead with respect to direct execution in L1. We show that a 142 kB DNN achieving 79.9% accuracy on CIFAR-10 runs 3.2× faster and consumes 1.9× less energy than when executed directly from L2 memory.

Davide Rossi - One of the best experts on this subject based on the ideXlab platform.

  • Work-in-Progress: DORY: Lightweight Memory Hierarchy Management for Deep NN Inference on IoT Endnodes
    2019
    Co-Authors: Alessio Burrello, Francesco Conti, Angelo Garofalo, Davide Rossi, Luca Benini
    Abstract:

    IoT endnodes often couple a small, fast L1 scratchpad memory with a higher-capacity but lower-bandwidth, slower L2 background memory. The absence of a coherent hardware cache hierarchy saves energy, but at the cost of labor-intensive explicit memory management, which complicates the deployment of algorithms with a large data memory footprint, such as Deep Neural Network (DNN) inference. In this work, we present DORY, a lightweight software cache dedicated to DNN Deployment Oriented to memoRY. DORY leverages static data tiling and DMA-based double buffering to hide the complexity of manual L1-L2 memory traffic management. DORY enables storage of activations and weights in L2 with less than 4% performance overhead with respect to direct execution in L1. We show that a 142 kB DNN achieving 79.9% accuracy on CIFAR-10 runs 3.2× faster and consumes 1.9× less energy than when executed directly from L2 memory.

Francesco Conti - One of the best experts on this subject based on the ideXlab platform.

  • Work-in-Progress: DORY: Lightweight Memory Hierarchy Management for Deep NN Inference on IoT Endnodes
    2019
    Co-Authors: Alessio Burrello, Francesco Conti, Angelo Garofalo, Davide Rossi, Luca Benini
    Abstract:

    IoT endnodes often couple a small, fast L1 scratchpad memory with a higher-capacity but lower-bandwidth, slower L2 background memory. The absence of a coherent hardware cache hierarchy saves energy, but at the cost of labor-intensive explicit memory management, which complicates the deployment of algorithms with a large data memory footprint, such as Deep Neural Network (DNN) inference. In this work, we present DORY, a lightweight software cache dedicated to DNN Deployment Oriented to memoRY. DORY leverages static data tiling and DMA-based double buffering to hide the complexity of manual L1-L2 memory traffic management. DORY enables storage of activations and weights in L2 with less than 4% performance overhead with respect to direct execution in L1. We show that a 142 kB DNN achieving 79.9% accuracy on CIFAR-10 runs 3.2× faster and consumes 1.9× less energy than when executed directly from L2 memory.

Angelo Garofalo - One of the best experts on this subject based on the ideXlab platform.

  • Work-in-Progress: DORY: Lightweight Memory Hierarchy Management for Deep NN Inference on IoT Endnodes
    2019
    Co-Authors: Alessio Burrello, Francesco Conti, Angelo Garofalo, Davide Rossi, Luca Benini
    Abstract:

    IoT endnodes often couple a small, fast L1 scratchpad memory with a higher-capacity but lower-bandwidth, slower L2 background memory. The absence of a coherent hardware cache hierarchy saves energy, but at the cost of labor-intensive explicit memory management, which complicates the deployment of algorithms with a large data memory footprint, such as Deep Neural Network (DNN) inference. In this work, we present DORY, a lightweight software cache dedicated to DNN Deployment Oriented to memoRY. DORY leverages static data tiling and DMA-based double buffering to hide the complexity of manual L1-L2 memory traffic management. DORY enables storage of activations and weights in L2 with less than 4% performance overhead with respect to direct execution in L1. We show that a 142 kB DNN achieving 79.9% accuracy on CIFAR-10 runs 3.2× faster and consumes 1.9× less energy than when executed directly from L2 memory.