Temporary Register

The experts below are selected from a list of 54 experts worldwide ranked by the ideXlab platform.

Dan Wang - One of the best experts on this subject based on the ideXlab platform.

  • A constrained multiobjective evolutionary algorithm based decomposition and temporary register
    2013 IEEE Congress on Evolutionary Computation, 2013
    Co-Authors: Hailin Liu, Dan Wang
    Abstract:

    We propose a novel constrained multiobjective evolutionary algorithm based on decomposition and a temporary register. It decomposes the constrained multiobjective optimization problem into a number of subproblems and then optimizes the subproblems collaboratively. We also propose a novel constraint handling technique based on the temporary register. Each subproblem has its own subpopulation and one temporary register. The subpopulation is composed of the individuals that have better objective values and lower constraint violations for this subproblem, while the temporary register is composed of individuals that were found earlier. We perform the crossover operator between each individual in the subpopulations and an individual chosen at random from the corresponding temporary register. The temporary register strategy therefore gives individuals that have better objective values and lower constraint violations an opportunity to participate in crossover and mutation instead of being eliminated at once. Moreover, this constraint handling technique does not need any parameter setting. Numerical simulations show that the proposed algorithm outperforms existing ones.
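
    The mating scheme described in this abstract can be illustrated with a minimal Python sketch for a single subproblem. It is not the authors' implementation: the abstract names neither the crossover operator nor the ranking rule, so the uniform crossover, the ordering by (constraint violation, objective value), the unbounded register, and the names violation, crossover and step are all assumptions made for illustration.

```python
import random


def violation(x, constraints):
    """Total violation of constraints written as g(x) <= 0."""
    return sum(max(0.0, g(x)) for g in constraints)


def crossover(a, b, rng):
    """Uniform crossover (assumed; the abstract does not name an operator)."""
    return [ai if rng.random() < 0.5 else bi for ai, bi in zip(a, b)]


def step(subpop, register, objective, constraints, rng, size):
    """One generation for one subproblem.

    Each subpopulation member is mated with a random member of the
    temporary register, which holds individuals found earlier.
    """
    children = [crossover(ind, rng.choice(register), rng) for ind in subpop]
    # Assumption: the register simply remembers every child ever produced.
    new_register = register + children
    # Assumption: rank by constraint violation first, objective value second.
    merged = sorted(
        subpop + children,
        key=lambda x: (violation(x, constraints), objective(x)),
    )
    return merged[:size], new_register
```

    In this sketch the register would be seeded with a copy of the initial subpopulation, for example step(subpop, list(subpop), objective, constraints, random.Random(0), size=len(subpop)). The register grows without bound here, which a practical version would probably need to limit.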

Hailin Liu - One of the best experts on this subject based on the ideXlab platform.

  • A Constrained Multi-Objective Evolutionary Algorithm Based on Boundary Search and Archive
    International Journal of Pattern Recognition and Artificial Intelligence, 2015
    Co-Authors: Hailin Liu, Chaoda Peng, Jiechang Wen
    Abstract:

    In this paper, we propose a decomposition-based evolutionary algorithm with boundary search and archive for constrained multi-objective optimization problems (CMOPs), named CM2M. It decomposes a CMOP into a number of optimization subproblems and optimizes them simultaneously. Moreover, a novel constraint handling scheme based on the boundary search and archive is proposed. Each subproblem has one archive, which includes a subpopulation and a temporary register. Individuals with better objective values and lower constraint violations are recorded in the subpopulation, while the temporary register consists of individuals found earlier. To improve the efficiency of the algorithm, a boundary search method is designed. This method gives feasible individuals a higher probability of undergoing genetic operators together with infeasible individuals. It is especially useful when the constraints are active at the Pareto solutions. A comparison with two algorithms, CMOEA/D-DE-CDP and Gary's algorithm, on 18 CMOPs shows the effectiveness of the proposed constraint handling scheme.

  • A constrained multiobjective evolutionary algorithm based decomposition and temporary register
    2013 IEEE Congress on Evolutionary Computation, 2013
    Co-Authors: Hailin Liu, Dan Wang

William J. Dally - One of the best experts on this subject based on the ideXlab platform.

  • A Compile-Time Managed Multi-Level Register File Hierarchy
    2012
    Co-Authors: Mark Gebhart, Stephen W. Keckler, William J. Dally
    Abstract:

    As processors increasingly become power limited, performance improvements will be achieved by rearchitecting systems with energy efficiency as the primary design constraint. While some of these optimizations will be hardware based, combined hardware and software techniques will likely be the most productive. This work redesigns the register file system of a modern throughput processor with a combined hardware and software solution that reduces register file energy without harming system performance. Throughput processors utilize a large number of threads to tolerate latency, requiring a large, energy-intensive register file to store thread context. Our results show that a compiler-controlled register file hierarchy can reduce register file energy by up to 54%, compared to a hardware-only caching approach that reduces register file energy by 34%. We explore register allocation algorithms that are specifically targeted to improve energy efficiency by sharing temporary register file resources across concurrently running threads and conduct a detailed limit study on the further potential to optimize operand delivery for throughput processors. Our efficiency gains represent a direct performance gain for power-limited systems, such as GPUs.

  • Energy-efficient mechanisms for managing thread context in throughput processors
    Proceedings of the 38th Annual International Symposium on Computer Architecture - ISCA '11, 2011
    Co-Authors: Mark Gebhart, Daniel R. Johnson, David Tarjan, Stephen W. Keckler, William J. Dally, Erik Lindholm, Kevin Skadron
    Abstract:

    Modern graphics processing units (GPUs) use a large number of hardware threads to hide both function unit and memory access latency. Extreme multithreading requires a complicated thread scheduler as well as a large register file, which is expensive to access both in terms of energy and latency. We present two complementary techniques for reducing energy on massively-threaded processors such as GPUs. First, we examine register file caching to replace accesses to the large main register file with accesses to a smaller structure containing the immediate register working set of active threads. Second, we investigate a two-level thread scheduler that maintains a small set of active threads to hide ALU and local memory access latency and a larger set of pending threads to hide main memory latency. Combined with register file caching, a two-level thread scheduler provides a further reduction in energy by limiting the allocation of temporary register cache resources to only the currently active subset of threads. We show that on average, across a variety of real-world graphics and compute workloads, a 6-entry per-thread register file cache reduces the number of reads and writes to the main register file by 50% and 59%, respectively. We further show that the active thread count can be reduced by a factor of 4 with minimal impact on performance, resulting in a 36% reduction of register file energy.
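
    The idea of a small per-thread register file cache sitting in front of the main register file can be illustrated with a toy Python model that only counts main register file traffic. It is a sketch under stated assumptions, not the design evaluated in the paper: the FIFO replacement, the write-back policy, allocation on read misses, and the class name RegisterFileCache are all hypothetical choices made here. Only the default of 6 entries comes from the abstract.

```python
from collections import OrderedDict


class RegisterFileCache:
    """Toy per-thread register file cache that counts main register file accesses."""

    def __init__(self, entries=6):
        self.entries = entries
        self.cache = OrderedDict()  # register name -> dirty flag, oldest first
        self.mrf_reads = 0          # reads that reached the main register file
        self.mrf_writes = 0         # writes that reached the main register file

    def _insert(self, reg, dirty):
        if reg not in self.cache and len(self.cache) >= self.entries:
            # Assumed FIFO replacement: evict the oldest entry.
            _, was_dirty = self.cache.popitem(last=False)
            if was_dirty:
                self.mrf_writes += 1  # write the evicted value back
        # Updating an existing key keeps its original FIFO position.
        self.cache[reg] = dirty or self.cache.get(reg, False)

    def read(self, reg):
        if reg not in self.cache:
            self.mrf_reads += 1       # miss: fetch from the main register file
            self._insert(reg, dirty=False)

    def write(self, reg):
        # Assumed write-back: the main register file is updated on eviction.
        self._insert(reg, dirty=True)

    def flush(self):
        """Write back all dirty entries, e.g. when the thread is descheduled."""
        self.mrf_writes += sum(self.cache.values())
        self.cache.clear()
```

    For a short trace such as write r1, read r1, read r1, write r2, read r2 followed by a flush, this model records no main register file reads and two writes, whereas the same trace without a cache would perform three reads and two writes.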

  • A compile-time managed multi-level register file hierarchy
    Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture - MICRO-44 '11, 2011
    Co-Authors: Mark Gebhart, Stephen W. Keckler, William J. Dally

Mark Gebhart - One of the best experts on this subject based on the ideXlab platform.

  • A Compile-Time Managed Multi-Level Register File Hierarchy
    2012
    Co-Authors: Mark Gebhart, Stephen W. Keckler, William J. Dally

  • Energy-efficient mechanisms for managing thread context in throughput processors
    Proceedings of the 38th Annual International Symposium on Computer Architecture - ISCA '11, 2011
    Co-Authors: Mark Gebhart, Daniel R. Johnson, David Tarjan, Stephen W. Keckler, William J. Dally, Erik Lindholm, Kevin Skadron

  • A compile-time managed multi-level register file hierarchy
    Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture - MICRO-44 '11, 2011
    Co-Authors: Mark Gebhart, Stephen W. Keckler, William J. Dally

Stephen W. Keckler - One of the best experts on this subject based on the ideXlab platform.

  • A Compile-Time Managed Multi-Level Register File Hierarchy
    2012
    Co-Authors: Mark Gebhart, Stephen W. Keckler, William J. Dally

  • Energy-efficient mechanisms for managing thread context in throughput processors
    Proceedings of the 38th Annual International Symposium on Computer Architecture - ISCA '11, 2011
    Co-Authors: Mark Gebhart, Daniel R. Johnson, David Tarjan, Stephen W. Keckler, William J. Dally, Erik Lindholm, Kevin Skadron

  • A compile-time managed multi-level register file hierarchy
    Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture - MICRO-44 '11, 2011
    Co-Authors: Mark Gebhart, Stephen W. Keckler, William J. Dally