Target Platform - Explore the Science & Experts

The Experts below are selected from a list of 83652 Experts worldwide ranked by ideXlab platform

Jian Sun - One of the best experts on this subject based on the ideXlab platform.

ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design

European Conference on Computer Vision, 2018

Co-Authors: Xiangyu Zhang, Haitao Zheng, Jian Sun

Abstract:

Currently, the neural network architecture design is mostly guided by the indirect metric of computation complexity, i.e., FLOPs. However, the direct metric, e.g., speed, also depends on other factors such as memory access cost and Platform characteristics. Thus, this work proposes to evaluate the direct metric on the Target Platform, beyond only considering FLOPs. Based on a series of controlled experiments, this work derives several practical guidelines for efficient network design. Accordingly, a new architecture is presented, called ShuffleNet V2. Comprehensive ablation experiments verify that our model is the state-of-the-art in terms of speed and accuracy tradeoff.
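To illustrate why FLOPs alone can mislead, here is a minimal sketch in Python. It assumes a simplified cost model for a 1x1 convolution (FLOPs as `h * w * c_in * c_out`, memory access cost as the sum of input, output, and weight sizes). This model and the function names are assumptions made for illustration and are not quoted from the abstract. Under it, layers with identical FLOPs can differ widely in memory traffic.

```python
import math


def conv1x1_costs(h, w, c_in, c_out):
    """Return (flops, mac) for a 1x1 convolution on an h x w feature map.

    Assumed model: FLOPs counts multiply-adds; MAC counts elements read
    or written (input map + output map + weights).
    """
    flops = h * w * c_in * c_out
    mac = h * w * (c_in + c_out) + c_in * c_out
    return flops, mac


def mac_lower_bound(h, w, flops):
    """Smallest MAC reachable at a given FLOPs budget (AM-GM inequality).

    Equality holds when c_in == c_out.
    """
    return 2 * math.sqrt(h * w * flops) + flops / (h * w)


if __name__ == "__main__":
    h = w = 56
    # Every pair below has the same c_in * c_out, hence the same FLOPs.
    for c_in, c_out in [(128, 128), (64, 256), (32, 512), (16, 1024)]:
        flops, mac = conv1x1_costs(h, w, c_in, c_out)
        bound = mac_lower_bound(h, w, flops)
        print(f"{c_in:4d}:{c_out:<5d} flops={flops} mac={mac} bound={bound:.0f}")
```

In this toy model the balanced layer (128:128) meets the lower bound exactly, while the most unbalanced one moves about four times as much data for the same FLOPs. Actual speed still has to be measured on the Target Platform, which this sketch does not attempt.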

ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design

arXiv: Computer Vision and Pattern Recognition, 2018

Co-Authors: Xiangyu Zhang, Haitao Zheng, Jian Sun

Abstract:

Currently, the neural network architecture design is mostly guided by the indirect metric of computation complexity, i.e., FLOPs. However, the direct metric, e.g., speed, also depends on other factors such as memory access cost and Platform characteristics. Thus, this work proposes to evaluate the direct metric on the Target Platform, beyond only considering FLOPs. Based on a series of controlled experiments, this work derives several practical guidelines for efficient network design. Accordingly, a new architecture is presented, called ShuffleNet V2. Comprehensive ablation experiments verify that our model is the state-of-the-art in terms of speed and accuracy tradeoff.

Jon Stockwood - One of the best experts on this subject based on the ideXlab platform.

Hardware-Software Co-Design of Embedded Reconfigurable Architectures

Design Automation Conference, 2000

Co-Authors: Yanbing Li, Timothy J Callahan, Ervan Darnell, Randolph Harr, Uday Kurkure, Jon Stockwood

Abstract:

In this paper we describe a new hardware/software partitioning approach for embedded reconfigurable architectures consisting of a general-purpose processor (CPU), a dynamically reconfigurable datapath (e.g. an FPGA), and a memory hierarchy. We have developed a framework called Nimble that automatically compiles system-level applications specified in C to executables on the Target Platform. A key component of this framework is a hardware/software partitioning algorithm that performs fine-grained partitioning (at loop and basic-block levels) of an application to execute on the combined CPU and datapath. The partitioning algorithm optimizes the global application execution time, including the software and hardware execution times, communication time and datapath reconfiguration time. Experimental results on real applications show that our algorithm is effective in rapidly finding close to optimal solutions.
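The objective described in this abstract (total execution time including software time, hardware time, communication time, and reconfiguration time) can be made concrete with a toy cost model. The sketch below is not Nimble's algorithm: the `Kernel` fields, the example numbers, and the exhaustive search are all hypothetical, chosen only to show what the partitioner is minimizing.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Kernel:
    """A candidate loop with hypothetical cost estimates (arbitrary units)."""
    name: str
    sw_time: float        # execution time on the CPU
    hw_time: float        # execution time on the reconfigurable datapath
    comm_time: float      # CPU <-> datapath data transfer
    reconfig_time: float  # loading the datapath configuration


def total_time(kernels, mapping):
    """Global execution time for a mapping (True = run on the datapath)."""
    total = 0.0
    for kernel, on_hw in zip(kernels, mapping):
        if on_hw:
            total += kernel.hw_time + kernel.comm_time + kernel.reconfig_time
        else:
            total += kernel.sw_time
    return total


def best_partition(kernels):
    """Try every hardware/software assignment and keep the fastest one."""
    candidates = product([False, True], repeat=len(kernels))
    best = min(candidates, key=lambda m: total_time(kernels, m))
    return best, total_time(kernels, best)


if __name__ == "__main__":
    kernels = [
        Kernel("fir_loop", sw_time=10.0, hw_time=2.0, comm_time=1.0, reconfig_time=3.0),
        Kernel("init_loop", sw_time=4.0, hw_time=1.0, comm_time=2.0, reconfig_time=3.0),
        Kernel("dct_loop", sw_time=20.0, hw_time=5.0, comm_time=2.0, reconfig_time=3.0),
    ]
    mapping, time = best_partition(kernels)
    for kernel, on_hw in zip(kernels, mapping):
        print(kernel.name, "-> datapath" if on_hw else "-> CPU")
    print("total time:", time)
```

Here `init_loop` stays on the CPU even though its hardware time is lower, because communication and reconfiguration overhead outweigh the gain. In this simplified model each choice is independent, so exhaustive search is only a way to state the objective; a real partitioner has to deal with interactions such as shared datapath capacity and repeated reconfiguration.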

Hardware-Software Co-Design of Embedded Reconfigurable Architectures

Design Automation Conference, 2000

Co-Authors: Yanbing Li, Timothy J Callahan, Ervan Darnell, Randolph Harr, Uday Kurkure, Jon Stockwood

Abstract:

In this paper we describe a new hardware/software partitioning approach for embedded reconfigurable architectures consisting of a general-purpose processor (CPU), a dynamically reconfigurable datapath (e.g. an FPGA), and a memory hierarchy. We have developed a framework called Nimble that automatically compiles system-level applications specified in C to executables on the Target Platform. A key component of this framework is a hardware/software partitioning algorithm that performs fine-grained partitioning (at loop and basic-block levels) of an application to execute on the combined CPU and datapath. The partitioning algorithm optimizes the global application execution time, including the software and hardware execution times, communication time and datapath reconfiguration time. Experimental results on real applications show that our algorithm is effective in rapidly finding close to optimal solutions.

Xiangyu Zhang - One of the best experts on this subject based on the ideXlab platform.

ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design

European Conference on Computer Vision, 2018

Co-Authors: Xiangyu Zhang, Haitao Zheng, Jian Sun

Abstract:

Currently, the neural network architecture design is mostly guided by the indirect metric of computation complexity, i.e., FLOPs. However, the direct metric, e.g., speed, also depends on other factors such as memory access cost and Platform characteristics. Thus, this work proposes to evaluate the direct metric on the Target Platform, beyond only considering FLOPs. Based on a series of controlled experiments, this work derives several practical guidelines for efficient network design. Accordingly, a new architecture is presented, called ShuffleNet V2. Comprehensive ablation experiments verify that our model is the state-of-the-art in terms of speed and accuracy tradeoff.

ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design

arXiv: Computer Vision and Pattern Recognition, 2018

Co-Authors: Xiangyu Zhang, Haitao Zheng, Jian Sun

Abstract:

Currently, the neural network architecture design is mostly guided by the indirect metric of computation complexity, i.e., FLOPs. However, the direct metric, e.g., speed, also depends on other factors such as memory access cost and Platform characteristics. Thus, this work proposes to evaluate the direct metric on the Target Platform, beyond only considering FLOPs. Based on a series of controlled experiments, this work derives several practical guidelines for efficient network design. Accordingly, a new architecture is presented, called ShuffleNet V2. Comprehensive ablation experiments verify that our model is the state-of-the-art in terms of speed and accuracy tradeoff.

Wolfgang Schröder-Preikschat - One of the best experts on this subject based on the ideXlab platform.

Worst-Case Energy Consumption Analysis for Energy-Constrained Embedded Systems

Euromicro Conference on Real-Time Systems, 2015

Co-Authors: Peter Wägemann, Tobias Distler, Timo Hönig, Heiko Janker, Rüdiger Kapitza, Wolfgang Schröder-Preikschat

Abstract:

The fact that energy is a scarce resource in many embedded real-time systems creates the need for energy-aware task schedulers, which not only guarantee timing constraints but also consider energy consumption. Unfortunately, existing approaches to analyze the worst-case execution time (WCET) of a task usually cannot be directly applied to determine its worst-case energy consumption (WCEC) due to execution time and energy consumption not being closely correlated on many state-of-the-art processors. Instead, a WCEC analyzer must take into account the particular energy characteristics of a Target Platform. In this paper, we present 0g, a comprehensive approach to WCEC analysis that combines different techniques to speed up the analysis and to improve results. If detailed knowledge about the energy costs of instructions on the Target Platform is available, our tool is able to compute upper bounds for the WCEC by statically analyzing the program code. Otherwise, a novel approach allows 0g to determine the WCEC by measurement after having identified a set of suitable program inputs based on an auxiliary energy model, which specifies the energy consumption of instructions in relation to each other. Our experiments for three Target Platforms show that 0g provides precise WCEC estimates.
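The abstract's point that execution time and energy are not closely correlated can be illustrated with a small static bound computation. The sketch below is not the 0g tool: it assumes a loop-free control-flow graph with made-up per-block cycle and energy costs, and simply takes the most expensive path under each metric. All names and numbers are hypothetical.

```python
def worst_case(cfg, cost, entry, exit_node):
    """Return (bound, path) for the costliest entry-to-exit path.

    cfg maps each basic block to its successors and must be acyclic;
    cost maps each basic block to its cost under one metric.
    """
    memo = {}

    def visit(node):
        if node == exit_node:
            return cost[node], [node]
        if node not in memo:
            tail_cost, tail_path = max(
                (visit(succ) for succ in cfg[node]), key=lambda r: r[0]
            )
            memo[node] = (cost[node] + tail_cost, [node] + tail_path)
        return memo[node]

    return visit(entry)


# Hypothetical program: a branch between a compute-heavy block "a"
# and a shorter but memory-heavy block "b".
CFG = {"entry": ["a", "b"], "a": ["exit"], "b": ["exit"], "exit": []}
CYCLES = {"entry": 1, "a": 10, "b": 8, "exit": 1}  # assumed cycle counts
ENERGY = {"entry": 1, "a": 5, "b": 9, "exit": 1}   # assumed energy units

if __name__ == "__main__":
    print("WCET:", worst_case(CFG, CYCLES, "entry", "exit"))
    print("WCEC:", worst_case(CFG, ENERGY, "entry", "exit"))
```

With these numbers the worst-case time path goes through `a` and the worst-case energy path goes through `b`, so a bound derived from the timing path alone would underestimate energy. A real analyzer also has to handle loops and hardware state, which this sketch ignores.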

Timothy J Callahan - One of the best experts on this subject based on the ideXlab platform.

Hardware-Software Co-Design of Embedded Reconfigurable Architectures

Design Automation Conference, 2000

Co-Authors: Yanbing Li, Timothy J Callahan, Ervan Darnell, Randolph Harr, Uday Kurkure, Jon Stockwood

Abstract:

In this paper we describe a new hardware/software partitioning approach for embedded reconfigurable architectures consisting of a general-purpose processor (CPU), a dynamically reconfigurable datapath (e.g. an FPGA), and a memory hierarchy. We have developed a framework called Nimble that automatically compiles system-level applications specified in C to executables on the Target Platform. A key component of this framework is a hardware/software partitioning algorithm that performs fine-grained partitioning (at loop and basic-block levels) of an application to execute on the combined CPU and datapath. The partitioning algorithm optimizes the global application execution time, including the software and hardware execution times, communication time and datapath reconfiguration time. Experimental results on real applications show that our algorithm is effective in rapidly finding close to optimal solutions.

Hardware-Software Co-Design of Embedded Reconfigurable Architectures

Design Automation Conference, 2000

Co-Authors: Yanbing Li, Timothy J Callahan, Ervan Darnell, Randolph Harr, Uday Kurkure, Jon Stockwood

Abstract:

In this paper we describe a new hardware/software partitioning approach for embedded reconfigurable architectures consisting of a general-purpose processor (CPU), a dynamically reconfigurable datapath (e.g. an FPGA), and a memory hierarchy. We have developed a framework called Nimble that automatically compiles system-level applications specified in C to executables on the Target Platform. A key component of this framework is a hardware/software partitioning algorithm that performs fine-grained partitioning (at loop and basic-block levels) of an application to execute on the combined CPU and datapath. The partitioning algorithm optimizes the global application execution time, including the software and hardware execution times, communication time and datapath reconfiguration time. Experimental results on real applications show that our algorithm is effective in rapidly finding close to optimal solutions.
