VLIW Processor

The Experts below are selected from a list of 3555 Experts worldwide ranked by ideXlab platform

Stephan Wong - One of the best experts on this subject based on the ideXlab platform.

  • Dynamic Trade-off among Fault Tolerance, Energy Consumption, and Performance on a Multiple-Issue VLIW Processor
    IEEE Transactions on Multi-Scale Computing Systems, 2018
    Co-Authors: Anderson L Sartor, Stephan Wong, Joost Hoozemans, Pedro Henrique Exenberger Becker, Antonio Carlos Schneider Beck
    Abstract:

    In the design of modern-day processors, energy consumption and fault tolerance have gained significant importance next to performance. This is caused by battery constraints, thermal design limits, and higher susceptibility to errors as transistor feature sizes are decreasing. However, achieving the ideal balance among them is challenging due to their conflicting nature (e.g., fault-tolerance techniques usually influence execution time or increase energy consumption), and that is why current processor designs target at most two of these axes. Based on that, we propose a new VLIW-based processor design capable of adapting the execution of the application at run-time in a totally transparent fashion, considering performance, fault tolerance, and energy consumption altogether, in which the weight (priority) of each one can be defined a priori. This is achieved by a novel decision module that dynamically controls the application’s ILP to increase the possibility of replicating instructions or applying power gating. For an energy-oriented configuration, it is possible, on average, to reduce energy consumption by 37.2 percent with an overhead of only 8.2 percent in performance, while maintaining low levels of failure rate, when compared to a fault-tolerant design.

  • Increasing Resource Utilization in Mixed-Criticality Systems Using a Polymorphic VLIW Processor
    Journal of Systems Architecture, 2018
    Co-Authors: Joost Hoozemans, Jeroen Van Straten, Stephan Wong
    Abstract:

    Mixed-criticality systems need to provide strict guarantees to hard real-time tasks and, simultaneously, deliver high throughput for non-critical tasks. However, techniques to enhance performance more often than not affect the analyzability, e.g., caches, branch prediction, out-of-order (OoO) execution, superscalar processing, and simultaneous multithreading (SMT). In this paper, we propose the use of a polymorphic VLIW processor to increase performance for non-critical tasks while maintaining analyzability. The processor achieves these goals by dynamically distributing computing resources (in the form of datapaths) to one or multiple threads. A static schedule guarantees the minimum number of cycles to meet the deadlines for critical tasks. Datapaths that are not used by critical tasks can be assigned to non-critical tasks in a highly flexible way, thereby increasing resource utilization and resulting in higher throughput. Our experiments show that our approach can exploit its dynamic properties to improve schedulability and assign up to 50% and on average 25% more resources to lower-priority threads during the execution of a static real-time schedule.
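
    The datapath distribution described in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical 8-datapath core and a made-up static schedule; the names `allocate_spare` and `spare_fraction` and all numbers are illustrative assumptions, not taken from the paper.

```python
TOTAL_DATAPATHS = 8  # assumed core width, for illustration only


def allocate_spare(static_schedule, total=TOTAL_DATAPATHS):
    """For each time slot, return (critical, non_critical) datapath counts.

    static_schedule lists how many datapaths the critical task is
    guaranteed in each slot; whatever is left goes to non-critical threads.
    """
    allocation = []
    for reserved in static_schedule:
        if not 0 <= reserved <= total:
            raise ValueError(f"cannot reserve {reserved} of {total} datapaths")
        allocation.append((reserved, total - reserved))
    return allocation


def spare_fraction(static_schedule, total=TOTAL_DATAPATHS):
    """Fraction of all datapath-slots that non-critical threads receive."""
    spare = sum(non_crit for _, non_crit in allocate_spare(static_schedule, total))
    return spare / (total * len(static_schedule))


schedule = [8, 4, 4, 2]  # hypothetical reservation per slot
print(allocate_spare(schedule))
print(spare_fraction(schedule))
```

    The critical task's reservation never changes at run-time in this sketch, which is the property that keeps it analyzable; only the leftover datapaths are handed out.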

  • Exploring ILP and TLP on a Polymorphic VLIW Processor
    Architecture of Computing Systems (ARCS), 2017
    Co-Authors: Anthony Brandon, Joost Hoozemans, Jeroen Van Straten, Stephan Wong
    Abstract:

    In today’s computing environments, the concurrent execution of multiple applications/threads is common and multi-cores are very well-suited to handle such workloads. However, they suffer from the fact that any mismatch between the application’s inherent instruction-level parallelism (ILP) and the core’s parallelism leads to unused resources or loss in performance. An accepted solution is to include several types of cores and match them dynamically depending on the performance needs of the application. This approach becomes less efficient when the number of cores does not match the number of parallel threads. Furthermore, the heterogeneity of (fixed) cores cannot be increased indefinitely as it would result in even higher degrees of mismatching and increased movement of instruction and data streams. In this paper, we are proposing a polymorphic processor, based on VLIW architectures, that can adapt its issue-width during runtime. By design, the processor can be perceived as a single wide core (8-issue VLIW) or two medium-wide cores (4-issue) or four small cores (2-issue) that can run high-ILP/low DLP, medium-ILP/medium DLP, and low-ILP/high-DLP applications, respectively. Furthermore, we are executing one single generic binary while performing these reconfigurations. In order to show the effectiveness of our approach, we synthesized different versions of the core to represent fixed heterogeneous cores and compared them to the dynamic implementation of the core. Our experiments show that the dynamically adaptive solution performs on average 7% faster and uses 5% less area than a platform which consists of fixed cores with 1.5× as many datapaths.
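
    The three configurations named in this abstract (one 8-issue core, two 4-issue cores, four 2-issue cores) can be written down as a small table. The selection policy below, which looks only at the number of runnable threads, is a hypothetical simplification for illustration; the abstract does not state how the processor actually chooses.

```python
# Configurations named in the abstract: (number of cores, issue width per core).
CONFIGS = [(1, 8), (2, 4), (4, 2)]


def choose_config(num_threads):
    """Pick the configuration with the fewest cores that still gives every
    runnable thread its own core; fall back to the most cores available.

    This thread-count policy is an assumption made for the example.
    """
    if num_threads < 1:
        raise ValueError("need at least one runnable thread")
    for cores, width in CONFIGS:
        if cores >= num_threads:
            return cores, width
    return CONFIGS[-1]


for threads in (1, 2, 3, 5):
    print(threads, choose_config(threads))
```

    Every configuration uses the same eight datapaths, which is what lets one physical core stand in for several fixed cores of different widths.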

  • Exploiting Idle Hardware to Provide Low Overhead Fault Tolerance for VLIW Processors
    ACM Journal on Emerging Technologies in Computing Systems, 2017
    Co-Authors: Anderson L Sartor, Stephan Wong, Luigi Carro, Arthur Francisco Lorenzon, Fernanda Lima Kastensmidt, Antonio Carlos Schneider Beck
    Abstract:

    Because of technology scaling, the soft error rate has been increasing in digital circuits, which affects system reliability. Therefore, modern processors, including VLIW architectures, must have means to mitigate such effects to guarantee reliable computing. In this scenario, our work proposes three low overhead fault tolerance approaches based on instruction duplication with zero latency detection, which uses a rollback mechanism to correct soft errors in the pipelanes of a configurable VLIW processor. The first uses idle issue slots within a period of time to execute extra instructions considering distinct application phases. The second works at a finer grain, adaptively exploiting idle functional units at run-time. However, some applications present high instruction-level parallelism (ILP), so the ability to provide fault tolerance is reduced: fewer functional units will be idle, decreasing the number of potential duplicated instructions. The third approach attacks this issue by dynamically reducing ILP according to a configurable threshold, increasing fault tolerance at the cost of performance. While the first two approaches achieve significant fault coverage with minimal area and power overhead for applications with low ILP, the latter improves fault tolerance with low performance degradation. All approaches are evaluated considering area, performance, power dissipation, and error coverage.
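
    The core idea of duplicating instructions into idle issue slots can be sketched in a few lines. This is a simplified software model under stated assumptions: a bundle is a list of slots, `None` marks an idle slot, and the helper names are hypothetical. The rollback mechanism and the hardware comparison logic from the paper are not modelled.

```python
NOP = None  # an idle issue slot in this toy model


def duplicate_into_idle_slots(bundle):
    """Return (new_bundle, pairs): idle slots are filled with copies of the
    bundle's own instructions; pairs lists (original_slot, copy_slot)."""
    new_bundle = list(bundle)
    idle = [i for i, op in enumerate(bundle) if op is NOP]
    busy = [i for i, op in enumerate(bundle) if op is not NOP]
    pairs = []
    for src, dst in zip(busy, idle):  # stops when either list runs out
        new_bundle[dst] = bundle[src]
        pairs.append((src, dst))
    return new_bundle, pairs


def mismatch(results, pairs):
    """True if any original/copy pair produced different results."""
    return any(results[src] != results[dst] for src, dst in pairs)


low_ilp = ["add", NOP, "mul", NOP]
high_ilp = ["add", "sub", "mul", "ld"]
print(duplicate_into_idle_slots(low_ilp))
print(duplicate_into_idle_slots(high_ilp))
```

    The second bundle has no idle slots, so nothing is duplicated, which mirrors the abstract's point that high-ILP code leaves less room for this kind of protection.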

  • Run-Time Phase Prediction for a Reconfigurable VLIW Processor
    Design Automation and Test in Europe, 2016
    Co-Authors: Qi Guo, Anderson L Sartor, Anthony Brandon, Antonio Carlos Schneider Beck, Xuehai Zhou, Stephan Wong
    Abstract:

    It is well known that different applications exhibit varying amounts of ILP. Execution of these applications on the same fixed-width VLIW processor will result (1) in wasted energy due to underutilized resources if the issue-width of the processor is larger than the inherent ILP; or alternatively, (2) in lower performance if the issue-width is smaller than the inherent ILP. Moreover, even within a single application distinct phases can be observed with varying ILP and therefore changing resource requirements. With this in mind, we designed the ρ-VEX processor, which is a VLIW processor that can change its issue-width at run-time. In this paper, we propose a novel scheme to dynamically (i.e., at run-time) optimize the resource utilization by predicting and matching the number of active data-paths for each application phase. The purpose is to achieve low energy consumption for applications with low ILP, and high performance for applications with high ILP, on a single VLIW processor design. We prototyped the ρ-VEX processor on an FPGA and obtained the dynamic traces of applications running on top of a Linux port. Our results show that it is possible in some cases to achieve the performance of an 8-issue core with 10% lower energy consumption, while in others we achieve the energy consumption of a 2-issue core with close to 20% lower execution time.
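
    Matching issue width to a predicted phase can be sketched with a simple last-value predictor: each phase runs at the width suited to the ILP measured in the previous phase. The abstract does not say which predictor the authors use, so the last-value scheme, the width set (2, 4, 8), and the function names are all assumptions for illustration.

```python
ISSUE_WIDTHS = (2, 4, 8)  # assumed selectable widths


def width_for_ilp(ilp):
    """Smallest assumed issue width that covers the given ILP."""
    for width in ISSUE_WIDTHS:
        if ilp <= width:
            return width
    return ISSUE_WIDTHS[-1]


def schedule_widths(observed_ilp, initial_width=8):
    """Last-value prediction: phase k runs at the width suited to the ILP
    measured in phase k-1; the first phase uses initial_width."""
    if not observed_ilp:
        return []
    widths = [initial_width]
    for ilp in observed_ilp[:-1]:
        widths.append(width_for_ilp(ilp))
    return widths


trace = [1.5, 1.8, 5.2, 6.0, 3.1]  # made-up per-phase ILP measurements
print(schedule_widths(trace))
```

    In this made-up trace the third phase still runs at width 2 even though its ILP is 5.2, showing the misprediction cost a last-value scheme pays at every phase change.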

Rudy Lauwereins - One of the best experts on this subject based on the ideXlab platform.

  • CRISP: A Template for Reconfigurable Instruction Set Processors
    Lecture Notes in Computer Science, 2001
    Co-Authors: Pieter Op De Beeck, Francisco Barat, Murali Jayapala, Rudy Lauwereins
    Abstract:

    A template for reconfigurable instruction set processors is described. This template defines a design space that enables the exploration of processors potentially suitable for flexible, power- and cost-efficient implementations of embedded multimedia applications, such as video compression in a handheld device. The template is based on a VLIW processor with a reconfigurable instruction set. In the future this template will be used for design space exploration, compiler retargeting and automatic hardware synthesis. Several existing reconfigurable and non-reconfigurable processors were mapped onto the template to assess its expressiveness.

Pieter Op De Beeck - One of the best experts on this subject based on the ideXlab platform.

  • CRISP: A Template for Reconfigurable Instruction Set Processors
    Lecture Notes in Computer Science, 2001
    Co-Authors: Pieter Op De Beeck, Francisco Barat, Murali Jayapala, Rudy Lauwereins

Antonio Carlos Schneider Beck - One of the best experts on this subject based on the ideXlab platform.

  • Dynamic Trade-off among Fault Tolerance, Energy Consumption, and Performance on a Multiple-Issue VLIW Processor
    IEEE Transactions on Multi-Scale Computing Systems, 2018
    Co-Authors: Anderson L Sartor, Stephan Wong, Joost Hoozemans, Pedro Henrique Exenberger Becker, Antonio Carlos Schneider Beck

  • Adaptive and Polymorphic VLIW Processor to Optimize Fault Tolerance, Energy Consumption, and Performance
    Computing Frontiers, 2018
    Co-Authors: Anderson L Sartor, Arthur Francisco Lorenzon, Sandip Kundu, Israel Koren, Antonio Carlos Schneider Beck
    Abstract:

    Because most traditional homogeneous and heterogeneous processors have a fixed design that limits their runtime adaptability, they are not able to cope with the varying application behavior when one considers the axes of fault tolerance, performance, and energy consumption altogether. In this context, we propose a new dynamically adaptive processor design that is capable of delivering the best trade-off among these three axes according to the application at hand, or of being tuned to optimize a specific metric. This is achieved by extending a polymorphic processor that can change its issue-width during runtime with specific mechanisms for fault tolerance, energy optimization, and performance enhancement. They are controlled by an optimization algorithm that evaluates and chooses the best configuration according to given requirements. Considering a metric that weighs all three axes, the proposed adaptive processor delivers a result that is 94.88% of the oracle processor on average, while a static configuration (defined at design time without runtime adaptation) only achieves 28.24% at most, which means that dynamic adaptation is required to cope with different application behaviors as there is not one specific configuration that fits all applications.
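
    A metric that weighs all three axes can be illustrated as a weighted sum over normalised scores. The abstract does not define the metric or the optimization algorithm, so the weighted-sum form, the configuration names, and every number below are assumptions made up for this sketch.

```python
def weighted_score(metrics, weights):
    """Weighted sum of normalised scores (each in [0, 1], higher is better)."""
    return sum(weights[axis] * metrics[axis] for axis in weights)


def best_configuration(candidates, weights):
    """Return the name of the candidate with the highest weighted score."""
    return max(candidates, key=lambda name: weighted_score(candidates[name], weights))


# Hypothetical configurations with invented, normalised scores.
candidates = {
    "8-issue": {"performance": 1.0, "fault_tolerance": 0.2, "energy": 0.3},
    "4-issue+duplication": {"performance": 0.6, "fault_tolerance": 0.9, "energy": 0.5},
    "2-issue+gating": {"performance": 0.3, "fault_tolerance": 0.4, "energy": 1.0},
}

balanced = {"performance": 1.0, "fault_tolerance": 1.0, "energy": 1.0}
energy_first = {"performance": 0.0, "fault_tolerance": 0.0, "energy": 1.0}
print(best_configuration(candidates, balanced))
print(best_configuration(candidates, energy_first))
```

    Changing only the weights changes the winning configuration, which is the sense in which a single static configuration cannot suit every requirement.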

  • Exploiting Idle Hardware to Provide Low Overhead Fault Tolerance for VLIW Processors
    ACM Journal on Emerging Technologies in Computing Systems, 2017
    Co-Authors: Anderson L Sartor, Stephan Wong, Luigi Carro, Arthur Francisco Lorenzon, Fernanda Lima Kastensmidt, Antonio Carlos Schneider Beck

  • Run-Time Phase Prediction for a Reconfigurable VLIW Processor
    Design Automation and Test in Europe, 2016
    Co-Authors: Qi Guo, Anderson L Sartor, Anthony Brandon, Antonio Carlos Schneider Beck, Xuehai Zhou, Stephan Wong

Murali Jayapala - One of the best experts on this subject based on the ideXlab platform.

  • CRISP: A Template for Reconfigurable Instruction Set Processors
    Lecture Notes in Computer Science, 2001
    Co-Authors: Pieter Op De Beeck, Francisco Barat, Murali Jayapala, Rudy Lauwereins