Target Platform

The Experts below are selected from a list of 83652 Experts worldwide ranked by ideXlab platform

Jian Sun - One of the best experts on this subject based on the ideXlab platform.

  • ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
    European Conference on Computer Vision, 2018
    Co-Authors: Xiangyu Zhang, Haitao Zheng, Jian Sun
    Abstract:

    Currently, neural network architecture design is mostly guided by the indirect metric of computation complexity, i.e., FLOPs. However, the direct metric, e.g., speed, also depends on other factors such as memory access cost and platform characteristics. Thus, this work proposes to evaluate the direct metric on the Target Platform, beyond only considering FLOPs. Based on a series of controlled experiments, this work derives several practical guidelines for efficient network design. Accordingly, a new architecture is presented, called ShuffleNet V2. Comprehensive ablation experiments verify that our model is the state-of-the-art in terms of speed and accuracy tradeoff.
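
To illustrate why equal FLOPs need not mean equal speed, the sketch below compares 1x1 convolution layers that share one FLOP count but differ in memory traffic. It is a back-of-the-envelope model, not the authors' benchmark: the function name `conv1x1_costs`, the 56x56 feature map, and the channel pairs are assumptions chosen for illustration, FLOPs are counted as multiply-accumulates, and memory access is counted in elements read or written once each.

```python
def conv1x1_costs(h, w, c_in, c_out):
    """Return (flops, memory_accesses) for a 1x1 convolution.

    Assumed cost model (illustrative only):
      flops = h * w * c_in * c_out              multiply-accumulates
      mac   = h * w * (c_in + c_out)            input + output feature maps
              + c_in * c_out                    kernel weights
    """
    flops = h * w * c_in * c_out
    mac = h * w * (c_in + c_out) + c_in * c_out
    return flops, mac


# Hypothetical layer shapes: c_in * c_out is constant, so FLOPs are identical.
SHAPES = [(128, 128), (64, 256), (32, 512)]

if __name__ == "__main__":
    for c_in, c_out in SHAPES:
        flops, mac = conv1x1_costs(56, 56, c_in, c_out)
        print(f"c_in={c_in:4d} c_out={c_out:4d} flops={flops} mac={mac}")
```

Under this model the balanced layer (128, 128) moves the fewest elements, while the unbalanced ones move more for the same FLOPs. Whether that translates into a measurable slowdown still has to be checked by timing on the actual device.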

  • ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
    arXiv: Computer Vision and Pattern Recognition, 2018
    Co-Authors: Xiangyu Zhang, Haitao Zheng, Jian Sun
    Abstract:

    Currently, neural network architecture design is mostly guided by the indirect metric of computation complexity, i.e., FLOPs. However, the direct metric, e.g., speed, also depends on other factors such as memory access cost and platform characteristics. Thus, this work proposes to evaluate the direct metric on the Target Platform, beyond only considering FLOPs. Based on a series of controlled experiments, this work derives several practical guidelines for efficient network design. Accordingly, a new architecture is presented, called ShuffleNet V2. Comprehensive ablation experiments verify that our model is the state-of-the-art in terms of speed and accuracy tradeoff.

Jon Stockwood - One of the best experts on this subject based on the ideXlab platform.

  • Hardware-Software Co-Design of Embedded Reconfigurable Architectures
    Design Automation Conference, 2000
    Co-Authors: Timothy J Callahan, Ervan Darnell, Randolph Harr, Uday Kurkure, Jon Stockwood
    Abstract:

    In this paper we describe a new hardware/software partitioning approach for embedded reconfigurable architectures consisting of a general-purpose processor (CPU), a dynamically reconfigurable datapath (e.g. an FPGA), and a memory hierarchy. We have developed a framework called Nimble that automatically compiles system-level applications specified in C to executables on the Target Platform. A key component of this framework is a hardware/software partitioning algorithm that performs fine-grained partitioning (at loop and basic-block levels) of an application to execute on the combined CPU and datapath. The partitioning algorithm optimizes the global application execution time, including the software and hardware execution times, communication time and datapath reconfiguration time. Experimental results on real applications show that our algorithm is effective in rapidly finding close to optimal solutions.
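
The objective described in this abstract (software time, hardware time, communication time, and reconfiguration time summed into one global execution time) can be sketched with a toy exhaustive search. This is not the Nimble partitioning algorithm: the `Loop` record, the timing numbers, and the reconfiguration rule (one load if a single loop is in hardware, one load per loop entry if several loops share the datapath) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Loop:
    name: str
    sw_time: float    # time if the loop stays on the CPU
    hw_time: float    # time if the loop runs on the datapath
    comm_time: float  # CPU <-> datapath data transfer time
    entries: int      # how many times the loop is entered


def total_time(loops, in_hw, reconfig_time):
    """Global execution time for one candidate partition (assumed model)."""
    t = 0.0
    for loop in loops:
        if loop.name in in_hw:
            t += loop.hw_time + loop.comm_time
        else:
            t += loop.sw_time
    hw_loops = [loop for loop in loops if loop.name in in_hw]
    if len(hw_loops) == 1:
        t += reconfig_time  # configuration is loaded once and kept
    elif len(hw_loops) > 1:
        # Assumption: loops evict each other, so every entry reconfigures.
        t += reconfig_time * sum(loop.entries for loop in hw_loops)
    return t


def best_partition(loops, reconfig_time):
    """Try every subset of loops in hardware; return (time, hardware set)."""
    names = [loop.name for loop in loops]
    subsets = (frozenset(c) for r in range(len(names) + 1)
               for c in combinations(names, r))
    return min(((total_time(loops, s, reconfig_time), s) for s in subsets),
               key=lambda pair: pair[0])


# Hypothetical application with three candidate loops.
LOOPS = [
    Loop("fir", sw_time=100, hw_time=10, comm_time=5, entries=4),
    Loop("fft", sw_time=80, hw_time=20, comm_time=10, entries=4),
    Loop("parse", sw_time=30, hw_time=25, comm_time=10, entries=1),
]

if __name__ == "__main__":
    for reconfig in (10.0, 1.0):
        time, hw = best_partition(LOOPS, reconfig)
        print(f"reconfig={reconfig}: time={time}, hardware={sorted(hw)}")
```

With an expensive reconfiguration only `fir` is worth moving to the datapath; with a cheap one, both `fir` and `fft` are. Exhaustive search scales exponentially with the number of loops, so it only serves to make the cost terms concrete.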

  • Hardware-Software Co-Design of Embedded Reconfigurable Architectures
    Design Automation Conference, 2000
    Co-Authors: Yanbing Li, Timothy J Callahan, Ervan Darnell, Randolph Harr, Uday Kurkure, Jon Stockwood
    Abstract:

    In this paper we describe a new hardware/software partitioning approach for embedded reconfigurable architectures consisting of a general-purpose processor (CPU), a dynamically reconfigurable datapath (e.g. an FPGA), and a memory hierarchy. We have developed a framework called Nimble that automatically compiles system-level applications specified in C to executables on the Target Platform. A key component of this framework is a hardware/software partitioning algorithm that performs fine-grained partitioning (at loop and basic-block levels) of an application to execute on the combined CPU and datapath. The partitioning algorithm optimizes the global application execution time, including the software and hardware execution times, communication time and datapath reconfiguration time. Experimental results on real applications show that our algorithm is effective in rapidly finding close to optimal solutions.

Xiangyu Zhang - One of the best experts on this subject based on the ideXlab platform.

  • ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
    European Conference on Computer Vision, 2018
    Co-Authors: Xiangyu Zhang, Haitao Zheng, Jian Sun
    Abstract:

    Currently, neural network architecture design is mostly guided by the indirect metric of computation complexity, i.e., FLOPs. However, the direct metric, e.g., speed, also depends on other factors such as memory access cost and platform characteristics. Thus, this work proposes to evaluate the direct metric on the Target Platform, beyond only considering FLOPs. Based on a series of controlled experiments, this work derives several practical guidelines for efficient network design. Accordingly, a new architecture is presented, called ShuffleNet V2. Comprehensive ablation experiments verify that our model is the state-of-the-art in terms of speed and accuracy tradeoff.

  • ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
    arXiv: Computer Vision and Pattern Recognition, 2018
    Co-Authors: Xiangyu Zhang, Haitao Zheng, Jian Sun
    Abstract:

    Currently, neural network architecture design is mostly guided by the indirect metric of computation complexity, i.e., FLOPs. However, the direct metric, e.g., speed, also depends on other factors such as memory access cost and platform characteristics. Thus, this work proposes to evaluate the direct metric on the Target Platform, beyond only considering FLOPs. Based on a series of controlled experiments, this work derives several practical guidelines for efficient network design. Accordingly, a new architecture is presented, called ShuffleNet V2. Comprehensive ablation experiments verify that our model is the state-of-the-art in terms of speed and accuracy tradeoff.

Wolfgang Schroderpreikschat - One of the best experts on this subject based on the ideXlab platform.

  • Worst-Case Energy Consumption Analysis for Energy-Constrained Embedded Systems
    Euromicro Conference on Real-Time Systems, 2015
    Co-Authors: Peter Wagemann, Tobias Distler, Timo Honig, Heiko Janker, Rudiger Kapitza, Wolfgang Schroderpreikschat
    Abstract:

    The fact that energy is a scarce resource in many embedded real-time systems creates the need for energy-aware task schedulers, which not only guarantee timing constraints but also consider energy consumption. Unfortunately, existing approaches to analyze the worst-case execution time (WCET) of a task usually cannot be directly applied to determine its worst-case energy consumption (WCEC) due to execution time and energy consumption not being closely correlated on many state-of-the-art processors. Instead, a WCEC analyzer must take into account the particular energy characteristics of a Target Platform. In this paper, we present 0g, a comprehensive approach to WCEC analysis that combines different techniques to speed up the analysis and to improve results. If detailed knowledge about the energy costs of instructions on the Target Platform is available, our tool is able to compute upper bounds for the WCEC by statically analyzing the program code. Otherwise, a novel approach allows 0g to determine the WCEC by measurement after having identified a set of suitable program inputs based on an auxiliary energy model, which specifies the energy consumption of instructions in relation to each other. Our experiments for three Target Platforms show that 0g provides precise WCEC estimates.
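
The point that the worst case for time and the worst case for energy can lie on different paths can be shown with a small static analysis over a control-flow graph. This is a minimal sketch and not the analysis performed by 0g: the per-instruction cost table, the four basic blocks, and the function names are hypothetical, the graph is assumed to be acyclic (no loops), and a worst path is found by a simple longest-path search.

```python
from functools import lru_cache

CYCLES, ENERGY = 0, 1  # index into each cost tuple

# Hypothetical per-instruction costs: (cycles, energy in nJ).
COST = {
    "add": (1, 1.0),
    "div": (10, 5.0),   # slow, but modest energy
    "load": (2, 6.0),   # quick, but energy-hungry
}

# Hypothetical acyclic control-flow graph of basic blocks.
BLOCKS = {
    "entry": ["add"],
    "then": ["div"],
    "else": ["load", "load", "load"],
    "exit": ["add"],
}
SUCC = {"entry": ["then", "else"], "then": ["exit"], "else": ["exit"], "exit": []}


def block_cost(block, metric):
    """Sum one metric over all instructions of a basic block."""
    return sum(COST[instr][metric] for instr in BLOCKS[block])


def worst_path(start, metric):
    """Return (cost, path) of the most expensive path from start."""
    @lru_cache(maxsize=None)
    def go(block):
        tails = [go(s) for s in SUCC[block]]
        tail_cost, tail_path = max(tails, key=lambda t: t[0]) if tails else (0, ())
        return block_cost(block, metric) + tail_cost, (block,) + tail_path

    return go(start)


if __name__ == "__main__":
    print("worst-case time:  ", worst_path("entry", CYCLES))
    print("worst-case energy:", worst_path("entry", ENERGY))
```

In this example the slowest path goes through `then`, while the most energy-consuming path goes through `else`, so an upper bound derived from the timing-critical path alone would underestimate energy. Real analyzers also have to handle loops and hardware state, which this sketch omits.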

Timothy J Callahan - One of the best experts on this subject based on the ideXlab platform.

  • Hardware-Software Co-Design of Embedded Reconfigurable Architectures
    Design Automation Conference, 2000
    Co-Authors: Timothy J Callahan, Ervan Darnell, Randolph Harr, Uday Kurkure, Jon Stockwood
    Abstract:

    In this paper we describe a new hardware/software partitioning approach for embedded reconfigurable architectures consisting of a general-purpose processor (CPU), a dynamically reconfigurable datapath (e.g. an FPGA), and a memory hierarchy. We have developed a framework called Nimble that automatically compiles system-level applications specified in C to executables on the Target Platform. A key component of this framework is a hardware/software partitioning algorithm that performs fine-grained partitioning (at loop and basic-block levels) of an application to execute on the combined CPU and datapath. The partitioning algorithm optimizes the global application execution time, including the software and hardware execution times, communication time and datapath reconfiguration time. Experimental results on real applications show that our algorithm is effective in rapidly finding close to optimal solutions.

  • Hardware-Software Co-Design of Embedded Reconfigurable Architectures
    Design Automation Conference, 2000
    Co-Authors: Yanbing Li, Timothy J Callahan, Ervan Darnell, Randolph Harr, Uday Kurkure, Jon Stockwood
    Abstract:

    In this paper we describe a new hardware/software partitioning approach for embedded reconfigurable architectures consisting of a general-purpose processor (CPU), a dynamically reconfigurable datapath (e.g. an FPGA), and a memory hierarchy. We have developed a framework called Nimble that automatically compiles system-level applications specified in C to executables on the Target Platform. A key component of this framework is a hardware/software partitioning algorithm that performs fine-grained partitioning (at loop and basic-block levels) of an application to execute on the combined CPU and datapath. The partitioning algorithm optimizes the global application execution time, including the software and hardware execution times, communication time and datapath reconfiguration time. Experimental results on real applications show that our algorithm is effective in rapidly finding close to optimal solutions.