Network-on-Chip

The Experts below are selected from a list of 98008152 Experts worldwide ranked by ideXlab platform

Keren Bergman - One of the best experts on this subject based on the ideXlab platform.

  • SC - Circuit-Switched Memory Access in Photonic Interconnection Networks for High-Performance Embedded Computing
    2010 ACM IEEE International Conference for High Performance Computing Networking Storage and Analysis, 2010
    Co-Authors: Gilbert Hendry, Johnnie Chan, Luca P Carloni, Eric Robinson, Vitaliy Gleyzer, Nadya T. Bliss, Keren Bergman
    Abstract:

    As advancements in CMOS technology trend toward ever increasing core counts in chip multiprocessors for high-performance embedded computing, the discrepancy between on- and off-chip communication bandwidth continues to widen due to the power and spatial constraints of electronic off-chip signaling. Silicon photonics-based communication offers many advantages over electronics for Network-on-Chip design, namely power consumption that is effectively agnostic to distance traveled at the chip- and board-scale, even across chip boundaries. In this work we develop a design for a photonic Network-on-Chip with integrated DRAM I/O interfaces and compare its performance to similar electronic solutions using a detailed Network-on-Chip simulation. When used in a circuit-switched network, silicon nanophotonic switches offer higher bandwidth density and low power transmission, adding up to over 10x better performance and 3-5x lower power over the baseline for projective transform, matrix multiply, and Fast Fourier Transform (FFT), all key algorithms in embedded real-time signal and image processing.

  • photonic networks on chip for future generations of chip multiprocessors
    IEEE Transactions on Computers, 2008
    Co-Authors: Assaf Shacham, Keren Bergman, Luca P Carloni
    Abstract:

    The design and performance of next-generation chip multiprocessors (CMPs) will be bound by the limited amount of power that can be dissipated on a single die. We present photonic networks-on-chip (NoC) as a solution to reduce the impact of intra-chip and off-chip communication on the overall power budget. A photonic interconnection network can deliver higher bandwidth and lower latencies with significantly lower power dissipation. We explain why on-chip photonic communication has recently become a feasible opportunity and explore the challenges that need to be addressed to realize its implementation. We introduce a novel hybrid micro-architecture for NoCs combining a broadband photonic circuit-switched network with an electronic overlay packet-switched control network. We address the critical design issues including: topology, routing algorithms, deadlock avoidance, and path-setup/tear-down procedures. We present experimental results obtained with POINTS, an event-driven simulator specifically developed to analyze the proposed idea, as well as a comparative power analysis of a photonic versus an electronic NoC. Overall, these results confirm the unique benefits for future generations of CMPs that can be achieved by bringing optics into the chip in the form of photonic NoCs.

  • nanophotonic optical interconnection network architecture for on chip and off chip communications
    Optical Fiber Communication Conference, 2008
    Co-Authors: Howard Wang, Luca P Carloni, Aleksandr Biberman, Michele Petracca, Keren Bergman
    Abstract:

    An architecture for an integrated low-power, high-bandwidth optical interconnection network based on microring resonator technology is presented. The layout of the non-blocking network is described and a simulation-based performance evaluation is conducted.

  • on the design of a photonic network on chip
    Networks-on-Chips, 2007
    Co-Authors: Assaf Shacham, Keren Bergman
    Abstract:

    Recent remarkable advances in nanoscale silicon-photonic integrated circuitry specifically compatible with CMOS fabrication have generated new opportunities for leveraging the unique capabilities of optical technologies in the on-chip communications infrastructure. Based on these nano-photonic building blocks, we consider a photonic Network-on-Chip architecture designed to exploit the enormous transmission bandwidths, low latencies, and low power dissipation enabled by data exchange in the optical domain. The novel architectural approach employs a broadband photonic circuit-switched network driven in a distributed fashion by an electronic overlay control network which is also used for independent exchange of short messages. We address the critical network design issues for insertion in chip multiprocessor (CMP) applications, including topology, routing algorithms, path-setup and tear-down procedures, and deadlock avoidance. Simulations show that this class of photonic networks-on-chip offers a significant leap in the performance of CMP intrachip communication systems, delivering low latencies and ultra-high throughputs per core while consuming minimal power.
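
The path-setup and tear-down procedures mentioned in the abstracts above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstracts do not specify the routing function or the reservation protocol, so the sketch uses dimension-order (XY) routing on a 2-D mesh and reserves all links of a route in one step, whereas a real control network would reserve them hop by hop with setup packets. The names `xy_route` and `CircuitSwitchedMesh` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Node = Tuple[int, int]
Link = Tuple[Node, Node]


def xy_route(src: Node, dst: Node) -> List[Node]:
    """Dimension-order route: travel along x first, then along y."""
    x, y = src
    path = [src]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


class CircuitSwitchedMesh:
    """Tracks which directed links are currently held by which circuit."""

    def __init__(self) -> None:
        self.owner: Dict[Link, int] = {}
        self.next_id = 0

    def setup(self, src: Node, dst: Node) -> Optional[int]:
        """Reserve every link on the route, or reserve nothing if one is busy."""
        path = xy_route(src, dst)
        links = list(zip(path, path[1:]))
        if any(link in self.owner for link in links):
            return None  # path blocked: the setup request fails
        circuit_id = self.next_id
        self.next_id += 1
        for link in links:
            self.owner[link] = circuit_id
        return circuit_id

    def teardown(self, circuit_id: int) -> None:
        """Release all links held by the given circuit."""
        self.owner = {l: c for l, c in self.owner.items() if c != circuit_id}


mesh = CircuitSwitchedMesh()
first = mesh.setup((0, 0), (2, 1))    # reserves three links
blocked = mesh.setup((1, 0), (2, 0))  # needs link (1,0)->(2,0), already held
mesh.teardown(first)
retry = mesh.setup((1, 0), (2, 0))    # succeeds once the circuit is released
```

Once a circuit is set up, the payload would travel end to end without per-hop buffering, which is the property the hybrid architecture relies on; the sketch models only the bookkeeping, not the optical transmission.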

K Hirotsu - One of the best experts on this subject based on the ideXlab platform.

  • With On-Chip Parallel Learning for Oscillation Cancellation
    2015
    Co-Authors: J Liu, M A Brooke, K Hirotsu
    Abstract:

    This paper presents a mixed signal CMOS feedforward neural-network chip with on-chip error-reduction hardware for real-time adaptation. The chip has compact on-chip weights capable of high-speed parallel learning; the implemented learning algorithm is a genetic random search algorithm: the random weight change (RWC) algorithm. The algorithm does not require a known desired neural-network output for error calculation and is suitable for direct feedback control. With hardware experiments, we demonstrate that the RWC chip, as a direct feedback controller, successfully suppresses unstable oscillations modeling combustion engine instability in real time. Index Terms: Analog finite impulse response (FIR) filter, direct feedback control, neural-network chip, parallel on-chip learning, oscillation cancellation.

  • a cmos feedforward neural network chip with on chip parallel learning for oscillation cancellation
    IEEE Transactions on Neural Networks, 2002
    Co-Authors: J Liu, M A Brooke, K Hirotsu
    Abstract:

    The paper presents a mixed signal CMOS feedforward neural-network chip with on-chip error-reduction hardware for real-time adaptation. The chip has compact on-chip weights capable of high-speed parallel learning; the implemented learning algorithm is a genetic random search algorithm: the random weight change (RWC) algorithm. The algorithm does not require a known desired neural network output for error calculation and is suitable for direct feedback control. With hardware experiments, we demonstrate that the RWC chip, as a direct feedback controller, successfully suppresses unstable oscillations modeling combustion engine instability in real time.
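
To make the idea of a random weight change search concrete, here is a minimal software sketch. It is not the chip's implementation: the abstract gives no update rule, so the sketch assumes a common formulation in which every weight is perturbed by a fixed step of random sign, the same perturbation is reused while a scalar error keeps falling, and a new random perturbation is drawn otherwise. As a further simplification, trial steps that increase the error are discarded. The names `rwc_minimize`, `delta`, and `sq_error` are hypothetical.

```python
import random
from typing import Callable, List


def rwc_minimize(error: Callable[[List[float]], float],
                 weights: List[float],
                 delta: float = 0.05,
                 steps: int = 2000,
                 seed: int = 0) -> List[float]:
    """Random-search sketch: only a scalar error signal is needed,
    never a desired network output or a gradient."""
    rng = random.Random(seed)
    w = list(weights)
    best = error(w)
    change = [rng.choice((-delta, delta)) for _ in w]
    for _ in range(steps):
        trial = [wi + ci for wi, ci in zip(w, change)]
        e = error(trial)
        if e < best:
            # Error decreased: accept the step and reuse the same change.
            w, best = trial, e
        else:
            # Error did not decrease: discard the step, draw a new change.
            change = [rng.choice((-delta, delta)) for _ in w]
    return w


def sq_error(w: List[float]) -> float:
    """Toy error signal: squared distance of the weights from zero."""
    return sum(x * x for x in w)


start = [1.0, -1.0]
tuned = rwc_minimize(sq_error, start)
```

Because the loop only ever compares two scalar error values, the same structure applies when the error comes from a measured plant output, which is what makes this family of methods usable for direct feedback control.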

Luca Benini - One of the best experts on this subject based on the ideXlab platform.

  • sunfloor 3d a tool for networks on chip topology synthesis for 3 d systems on chips
    IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2010
    Co-Authors: Ciprian Seiculescu, Luca Benini, Srinivasan Murali, Giovanni De Micheli
    Abstract:

    Three-dimensional integrated circuits (3D-ICs) are a promising approach to address the integration challenges faced by current systems on chips (SoCs). Designing an efficient network on chip (NoC) interconnect for a 3-D SoC that meets not only the application performance constraints but also the constraints imposed by the 3-D technology is a significant challenge. In this paper, we present a design tool, SunFloor 3D, to synthesize application-specific 3-D NoCs. The proposed tool determines the best NoC topology for the application, finds paths for the communication flows, assigns the network components to the 3-D layers, and places them in each layer. We perform experiments on several SoC benchmarks and present a comparative study between 3-D and 2-D NoC designs. Our studies show large improvements in interconnect power consumption (average of 38%) and delay (average of 13%) for the 3-D NoC when compared to the corresponding 2-D implementation. Our studies also show that the synthesized topologies result in large power (average of 54%) and delay savings (average of 21%) when compared to standard topologies.

  • sunfloor 3d a tool for networks on chip topology synthesis for 3d systems on chips
    Design Automation and Test in Europe, 2009
    Co-Authors: Ciprian Seiculescu, Luca Benini, Srinivasan Murali, Giovanni De Micheli
    Abstract:

    Three-dimensional integrated circuits are a promising approach to address the integration challenges faced by current Systems on Chips (SoCs). Designing an efficient Network on Chip (NoC) interconnect for a 3D SoC that not only meets the application performance constraints, but also the constraints imposed by the 3D technology, is a significant challenge. In this work we present a design tool, SunFloor 3D, to synthesize application-specific 3D NoCs. The proposed tool determines the best NoC topology for the application, finds paths for the communication flows, assigns the network components on to the 3D layers and performs a placement of them in each layer. We perform experiments on several SoC benchmarks and present a comparative study between 3D and 2D NoC designs. Our studies show large improvements in interconnect power consumption (average of 38%) and delay (average of 13%) for the 3D NoC when compared to the corresponding 2D implementation. Our studies also show that the synthesized topologies result in large power (average of 54%) and delay savings (average of 21%) when compared to standard topologies.

  • packetization and routing analysis of on chip multiprocessor networks
    Journal of Systems Architecture, 2004
    Co-Authors: Terry Tao Ye, Luca Benini, Giovanni De Micheli
    Abstract:

    Some current systems-on-chips already use, and most future ones will use, network architectures/protocols to implement on-chip communication. On-chip networks borrow features and design methods from those used in parallel computing clusters and computer system area networks. They differ from traditional networks because of larger on-chip wiring resources and flexibility, as well as constraints on area and energy consumption (in addition to performance requirements). In this paper, we analyze different routing schemes for packetized on-chip communication on a mesh network architecture, with particular emphasis on specific benefits and limitations of silicon VLSI implementations. A contention-look-ahead on-chip routing scheme is proposed. It reduces the network delay with a significantly smaller buffer requirement. We further show that in on-chip multiprocessor systems, both the instruction execution inside node processors and the data transactions between different processing elements are greatly affected by the packetized dataflows transported on the on-chip networks. Different packetization schemes affect the performance and power consumption of multiprocessor systems. Our analysis is also quantified by network/multiprocessor co-simulation benchmark results.

  • xpipes a network on chip architecture for gigascale systems on chip
    IEEE Circuits and Systems Magazine, 2004
    Co-Authors: Davide Bertozzi, Luca Benini
    Abstract:

    The growing complexity of embedded multiprocessor architectures for digital media processing will soon require highly scalable communication infrastructures. Packet switched networks-on-chip (NoC) have been proposed to support the trend for systems-on-chip integration. In this paper, an advanced NoC architecture, called Xpipes, targeting high performance and reliable communication for on-chip multi-processors is introduced. It consists of a library of soft macros (switches, network interfaces and links) that are design-time composable and tunable so that domain-specific heterogeneous architectures can be instantiated and synthesized. Links can be pipelined with a flexible number of stages to decouple link throughput from its length and to get arbitrary topologies. Moreover, a tool called XpipesCompiler, which automatically instantiates a customized NoC from the library of soft network components, is used in this paper to test the Xpipes-based synthesis flow for domain-specific communication architectures.

  • networks on chip a new paradigm for systems on chip design
    Design Automation and Test in Europe, 2002
    Co-Authors: Giovanni De Micheli, Luca Benini
    Abstract:

    This paper is meant to be a short introduction to a new paradigm for systems on chip (SoC) design. The premises are that a component-based design methodology will prevail in the future, to support component re-use in a plug-and-play fashion. At the same time, SoCs will have to provide a functionally-correct, reliable operation of the interacting components. The physical interconnections on chip will be a limiting factor for performance and energy consumption.
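
The 2004 routing analysis listed above proposes a contention-look-ahead scheme for mesh networks but its abstract does not describe the mechanism. The sketch below is therefore only one plausible reading, under stated assumptions: at each hop the router considers the neighbours that bring the packet closer to its destination and forwards to the one whose input buffer currently holds the fewest flits. The functions `productive_hops` and `next_hop` and the `occupancy` map are hypothetical names.

```python
from typing import Dict, List, Tuple

Node = Tuple[int, int]


def productive_hops(cur: Node, dst: Node) -> List[Node]:
    """Neighbours of cur that are one hop closer to dst on a 2-D mesh."""
    x, y = cur
    hops = []
    if dst[0] != x:
        hops.append((x + (1 if dst[0] > x else -1), y))
    if dst[1] != y:
        hops.append((x, y + (1 if dst[1] > y else -1)))
    return hops


def next_hop(cur: Node, dst: Node, occupancy: Dict[Node, int]) -> Node:
    """Pick the productive neighbour with the least occupied input buffer.

    Ties go to the x direction, so an idle mesh behaves like XY routing.
    """
    hops = productive_hops(cur, dst)
    if not hops:
        return cur  # already at the destination
    return min(hops, key=lambda n: occupancy.get(n, 0))


idle = next_hop((0, 0), (2, 2), {})                      # (1, 0)
busy = next_hop((0, 0), (2, 2), {(1, 0): 3, (0, 1): 1})  # (0, 1)
```

A real router built this way would also need a deadlock-avoidance rule, since adaptive choices between directions can create cyclic buffer dependencies; the sketch omits that.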

Lishiuan Peh - One of the best experts on this subject based on the ideXlab platform.

  • swift a low power network on chip implementing the token flow control router architecture with swing reduced interconnects
    IEEE Transactions on Very Large Scale Integration Systems, 2013
    Co-Authors: Jacob Postman, Lishiuan Peh, Tushar Krishna, Christopher Douglas Edmonds, Patrick Chiang
    Abstract:

    A 64-bit, 8 × 8 mesh Network-on-Chip (NoC) is presented that uses both new architectural and circuit design techniques to improve on-chip network energy-efficiency, latency, and throughput. First, we propose token flow control, which enables bypassing of flit buffering in routers, thereby reducing buffer size and their power consumption. We also incorporate reduced-swing signaling in on-chip links and crossbars to minimize datapath interconnect energy. The 64-node NoC is experimentally validated with a 2 × 2 test chip in 90 nm, 1.2 V CMOS that incorporates traffic generators to emulate the traffic of the full network. Compared with a fully synthesized baseline 8 × 8 NoC architecture designed to meet the same peak throughput, the fabricated prototype reduces network latency by 20% under uniform random traffic, when both networks are run at their maximum operating frequencies. When operated at the same frequencies, the SWIFT NoC reduces network power by 38% and 25% at saturation and low loads, respectively.

  • garnet a detailed on chip network model inside a full system simulator
    International Symposium on Performance Analysis of Systems and Software, 2009
    Co-Authors: Niket Agarwal, Lishiuan Peh, Tushar Krishna, Niraj K Jha
    Abstract:

    Until very recently, microprocessor designs were computation-centric. On-chip communication was frequently ignored. This was because of fast, single-cycle on-chip communication. The interconnect power was also insignificant compared to the transistor power. With uniprocessor designs providing diminishing returns and the advent of chip multiprocessors (CMPs) in mainstream systems, the on-chip network that connects different processing cores has become a critical part of the design. Transistor miniaturization has led to high global wire delay, and interconnect power comparable to transistor power. CMP design proposals can no longer ignore the interaction between the memory hierarchy and the interconnection network that connects various elements. This necessitates a detailed and accurate interconnection network model within a full-system evaluation framework. Ignoring the interconnect details might lead to inaccurate results when simulating a CMP architecture. It also becomes important to analyze the impact of interconnection network optimization techniques on full system behavior. In this light, we developed a detailed cycle-accurate interconnection network model (GARNET), inside the GEMS full-system simulation framework. GARNET models a classic five-stage pipelined router with virtual channel (VC) flow control. Microarchitectural details, such as flit-level input buffers, routing logic, allocators and the crossbar switch, are modeled. GARNET, along with GEMS, provides a detailed and accurate memory system timing model. To demonstrate the importance and potential impact of GARNET, we evaluate a shared and private L2 CMP with a realistic state-of-the-art interconnection network against the original GEMS simple network. The objective of the evaluation was to figure out which configuration is better for a particular workload. We show that not modeling the interconnect in detail might lead to an incorrect outcome. 
    We also evaluate Express Virtual Channels (EVCs), an on-chip network flow control proposal, in a full-system fashion. We show that in improving on-chip network latency-throughput, EVCs do lead to better overall system runtime; however, the impact varies widely across applications.

  • polaris a system level roadmapping toolchain for on chip interconnection networks
    IEEE Transactions on Very Large Scale Integration Systems, 2007
    Co-Authors: Vassos Soteriou, Hangsheng Wang, Noel Eisley, Lishiuan Peh
    Abstract:

    Technology trends are driving parallel on-chip architectures in the form of multiprocessor systems-on-a-chip (MPSoCs) and chip multiprocessors (CMPs). In these systems, the increasing on-chip communication demand among the computation elements necessitates the use of scalable, high-bandwidth Network-on-Chip (NoC) fabrics instead of dedicated interconnects and shared buses. As transistor feature sizes are further miniaturized, more complicated NoC architectures become feasible that can support more demanding applications. Given the myriad emerging software-hardware combinations, for cost-effectiveness, a system designer critically needs to prune this widening NoC design-space to predict the interconnect fabric(s) that best balance(s) cost/performance, before the actual design process begins. This prompted us to develop Polaris, a system-level roadmapping toolchain for on-chip interconnection networks that helps designers predict the most suitable interconnection network design(s) tailored to their performance needs and power/silicon area constraints with respect to a range of applications that the system will run. Polaris explores the plethora of NoC designs based on projections of network traffic, architectures, and process characteristics. While Polaris's toolchain is extensible so new traffic, network designs, and technology processes can be added, the current version already incorporates 7872 NoC design points. Polaris is rapid, efficiently iterating over thousands of NoC design points, while maintaining high relative and absolute accuracies when validated against detailed NoC synthesis results.

  • polaris a system level roadmap for on chip interconnection networks
    International Conference on Computer Design, 2006
    Co-Authors: Vassos Soteriou, Hangsheng Wang, Noel Eisley, Lishiuan Peh
    Abstract:

    Technology trends are driving parallel on-chip architectures in the form of multi-processor systems-on-a-chip (MPSoCs) and chip multi-processors (CMPs). In these systems the increasing on-chip communication demand among the computation elements necessitates the use of scalable, high-bandwidth Network-on-Chip (NoC) fabrics. As transistor feature sizes are further miniaturized leading to rapidly increasing amounts of on-chip resources, more complicated and powerful NoC architectures become feasible that can support more sophisticated and demanding applications. Given the myriad emerging software-hardware combinations, for cost-effectiveness, a system designer critically needs to prune this widening NoC design space to identify the architecture(s) that best balance(s) cost/performance, before the actual design process begins. This prompted us to develop Polaris, a system-level roadmap for on-chip interconnection networks that guides designers towards the most suitable network design(s) tailored to their performance needs and power/silicon area constraints with respect to a range of applications that will run over the network(s). Polaris explores the plethora of NoC designs based on projections of network traffic, architectures, and process characteristics. While the Polaris roadmapping toolchain is extensible so new traffic, network designs, and processes can be added, the current version of the roadmap already incorporates 7,872 NoC design points. Polaris is rapid and iterates over all these NoC architectures within a tractable run time of 125 hours on a typical desktop machine, while maintaining high relative and absolute accuracies when validated against detailed NoC synthesis results.
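
As a quick check on what the reported Polaris figures imply, the arithmetic below converts the 7,872 design points and 125 hours quoted in the 2006 abstract into an average cost per design point. It assumes the run time is spread evenly across design points, which the abstract does not state; the variable names are illustrative.

```python
design_points = 7872   # NoC design points, from the abstract
total_hours = 125      # reported run time on a desktop machine

# Average evaluation time per design point, assuming an even spread.
seconds_per_point = total_hours * 3600 / design_points
points_per_hour = design_points / total_hours

print(f"{seconds_per_point:.1f} s per design point")
print(f"{points_per_hour:.1f} design points per hour")
```

Under that assumption, each design point takes a little under a minute on average, or roughly 63 design points per hour.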

J Liu - One of the best experts on this subject based on the ideXlab platform.

  • With On-Chip Parallel Learning for Oscillation Cancellation
    2015
    Co-Authors: J Liu, M A Brooke, K Hirotsu
    Abstract:

    This paper presents a mixed signal CMOS feedforward neural-network chip with on-chip error-reduction hardware for real-time adaptation. The chip has compact on-chip weights capable of high-speed parallel learning; the implemented learning algorithm is a genetic random search algorithm: the random weight change (RWC) algorithm. The algorithm does not require a known desired neural-network output for error calculation and is suitable for direct feedback control. With hardware experiments, we demonstrate that the RWC chip, as a direct feedback controller, successfully suppresses unstable oscillations modeling combustion engine instability in real time. Index Terms: Analog finite impulse response (FIR) filter, direct feedback control, neural-network chip, parallel on-chip learning, oscillation cancellation.

  • a cmos feedforward neural network chip with on chip parallel learning for oscillation cancellation
    IEEE Transactions on Neural Networks, 2002
    Co-Authors: J Liu, M A Brooke, K Hirotsu
    Abstract:

    The paper presents a mixed signal CMOS feedforward neural-network chip with on-chip error-reduction hardware for real-time adaptation. The chip has compact on-chip weights capable of high-speed parallel learning; the implemented learning algorithm is a genetic random search algorithm: the random weight change (RWC) algorithm. The algorithm does not require a known desired neural network output for error calculation and is suitable for direct feedback control. With hardware experiments, we demonstrate that the RWC chip, as a direct feedback controller, successfully suppresses unstable oscillations modeling combustion engine instability in real time.