Temporary Register

The experts below are selected from a list of 54 experts worldwide, ranked by the ideXlab platform.

Dan Wang - One of the best experts on this subject based on the ideXlab platform.

A constrained multiobjective evolutionary algorithm based decomposition and Temporary Register

2013 IEEE Congress on Evolutionary Computation, 2013

Co-Authors: Hailin Liu, Dan Wang

Abstract:

We propose a novel constrained multiobjective evolutionary algorithm based on decomposition and a Temporary Register. It decomposes the constrained multiobjective optimization problem into a number of subproblems and then optimizes each subproblem in a collaborative way. We also propose a novel constraint handling technique based on the Temporary Register. Each subproblem has its own subpopulation and one Temporary Register. The subpopulation is composed of the individuals that have better objective values and lower constraint violations for this subproblem, while the Temporary Register is composed of individuals that were found earlier. We perform the crossover operator between each individual in a subpopulation and an individual chosen at random from the corresponding Temporary Register. The Temporary Register strategy therefore gives the individuals which have better objective values and lower constraint violations an opportunity to participate in crossover and mutation rather than being eliminated at once. Moreover, this constraint handling technique does not need any parameter setting. The numerical simulations show that the proposed algorithm outperforms existing ones.
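
The mating step described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the uniform crossover, the list-of-floats encoding, and the names `uniform_crossover` and `mate_with_register` are assumptions made for the example. The abstract does not say which crossover operator is used or how the Temporary Register is updated, so neither is modelled here.

```python
import random

def uniform_crossover(a, b, rng):
    """Assumed operator: take each gene from parent a or parent b with equal probability."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def mate_with_register(subpopulation, temporary_register, rng):
    """Cross every individual of one subproblem's subpopulation with a
    partner drawn at random from that subproblem's Temporary Register."""
    offspring = []
    for individual in subpopulation:
        partner = rng.choice(temporary_register)
        offspring.append(uniform_crossover(individual, partner, rng))
    return offspring

# Hypothetical data for a single subproblem.
rng = random.Random(0)
subpopulation = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
temporary_register = [[5.0, 5.0, 5.0], [9.0, 9.0, 9.0]]
offspring = mate_with_register(subpopulation, temporary_register, rng)
print(len(offspring))
```

Each subproblem would call `mate_with_register` with its own subpopulation and its own register, which is what keeps previously found individuals available as mating partners.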

Hailin Liu - One of the best experts on this subject based on the ideXlab platform.

A Constrained Multi-Objective Evolutionary Algorithm Based on Boundary Search and Archive

International Journal of Pattern Recognition and Artificial Intelligence, 2015

Co-Authors: Hailin Liu, Chaoda Peng, Jiechang Wen

Abstract:

In this paper, we propose a decomposition-based evolutionary algorithm with boundary search and archive for constrained multi-objective optimization problems (CMOPs), named CM2M. It decomposes a CMOP into a number of optimization subproblems and optimizes them simultaneously. Moreover, a novel constraint handling scheme based on the boundary search and archive is proposed. Each subproblem has one archive, which comprises a subpopulation and a Temporary Register. Individuals with better objective values and lower constraint violations are recorded in the subpopulation, while the Temporary Register consists of individuals found previously. To improve the efficiency of the algorithm, a boundary search method is designed. This method gives feasible individuals a higher probability of undergoing genetic operators together with infeasible individuals. It is especially useful when the constraints are active at the Pareto solutions. A comparison with two algorithms, CMOEA/D-DE-CDP and Gary’s algorithm, on 18 CMOPs shows the effectiveness of the proposed constraint handling scheme.
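
The idea of pairing feasible parents with infeasible mates can be sketched as below. This is only an illustration under stated assumptions and should not be read as the CM2M procedure: the abstract gives no formula, so the summed violation measure for constraints written as g_i(x) <= 0, the function names, and the probability parameter `p_infeasible` (with its default of 0.8) are all hypothetical.

```python
import random

def constraint_violation(g_values):
    """Assumed measure: sum of the positive parts of constraints g_i(x) <= 0."""
    return sum(max(0.0, g) for g in g_values)

def pick_mate_for_feasible(feasible, infeasible, rng, p_infeasible=0.8):
    """Choose a mate for a feasible parent.

    With probability p_infeasible (hypothetical parameter) the mate comes
    from the infeasible pool; otherwise it comes from the feasible pool.
    The feasible pool is assumed non-empty whenever the infeasible pool is empty.
    """
    if infeasible and (not feasible or rng.random() < p_infeasible):
        return rng.choice(infeasible)
    return rng.choice(feasible)

# Hypothetical individuals, identified by name only.
rng = random.Random(1)
feasible = ["f1", "f2"]
infeasible = ["i1", "i2", "i3"]
mate = pick_mate_for_feasible(feasible, infeasible, rng)
print(constraint_violation([-1.0, 0.5, 2.0]))  # 2.5
```

Crossing a feasible parent with an infeasible one tends to place offspring near the boundary of the feasible region, which is where the Pareto solutions lie when the constraints are active.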

William J. Dally - One of the best experts on this subject based on the ideXlab platform.

A Compile-Time Managed Multi-Level Register File Hierarchy

2012

Co-Authors: Mark Gebhart, Stephen W. Keckler, William J. Dally

Abstract:

As processors increasingly become power limited, performance improvements will be achieved by rearchitecting systems with energy efficiency as the primary design constraint. While some of these optimizations will be hardware based, combined hardware and software techniques likely will be the most productive. This work redesigns the register file system of a modern throughput processor with a combined hardware and software solution that reduces register file energy without harming system performance. Throughput processors utilize a large number of threads to tolerate latency, requiring a large, energy-intensive register file to store thread context. Our results show that a compiler-controlled register file hierarchy can reduce register file energy by up to 54%, compared to a hardware-only caching approach that reduces register file energy by 34%. We explore register allocation algorithms that are specifically targeted to improve energy efficiency by sharing temporary register file resources across concurrently running threads and conduct a detailed limit study on the further potential to optimize operand delivery for throughput processors. Our efficiency gains represent a direct performance gain for power-limited systems, such as GPUs.

ISCA - Energy-efficient mechanisms for managing thread context in throughput processors

Proceedings of the 38th Annual International Symposium on Computer Architecture - ISCA '11, 2011

Co-Authors: Mark Gebhart, Daniel R. Johnson, David Tarjan, Stephen W. Keckler, William J. Dally, Erik Lindholm, Kevin Skadron

Abstract:

Modern graphics processing units (GPUs) use a large number of hardware threads to hide both function unit and memory access latency. Extreme multithreading requires a complicated thread scheduler as well as a large register file, which is expensive to access both in terms of energy and latency. We present two complementary techniques for reducing energy on massively-threaded processors such as GPUs. First, we examine register file caching to replace accesses to the large main register file with accesses to a smaller structure containing the immediate register working set of active threads. Second, we investigate a two-level thread scheduler that maintains a small set of active threads to hide ALU and local memory access latency and a larger set of pending threads to hide main memory latency. Combined with register file caching, a two-level thread scheduler provides a further reduction in energy by limiting the allocation of temporary register cache resources to only the currently active subset of threads. We show that on average, across a variety of real-world graphics and compute workloads, a 6-entry per-thread register file cache reduces the number of reads and writes to the main register file by 50% and 59%, respectively. We further show that the active thread count can be reduced by a factor of 4 with minimal impact on performance, resulting in a 36% reduction of register file energy.
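
To make the register file caching idea concrete, the toy model below counts how many accesses still reach the main register file when one thread's results are first written into a small cache. It is a sketch under assumptions, not the paper's simulator: the FIFO replacement, the rule that only written values are cached, the write-back on eviction, the function name `simulate_rfc`, and the three-instruction trace are all hypothetical choices made for the example.

```python
from collections import OrderedDict

def simulate_rfc(trace, entries=6):
    """Count main-register-file (MRF) reads and writes for one thread.

    trace: list of (dest_register, [source_registers]) tuples.
    Assumptions: results are written into the cache, source reads that
    miss the cache go to the MRF without being cached, and a value
    evicted in FIFO order is written back to the MRF.
    """
    cache = OrderedDict()          # insertion order doubles as FIFO order
    mrf_reads = mrf_writes = 0
    for dest, sources in trace:
        for reg in sources:
            if reg not in cache:
                mrf_reads += 1     # operand not cached, read the MRF
        if dest not in cache:
            if len(cache) >= entries:
                cache.popitem(last=False)  # evict the oldest entry
                mrf_writes += 1            # and write it back to the MRF
            cache[dest] = True
    return mrf_reads, mrf_writes

# Hypothetical trace: r1 = f(r10, r11); r2 = f(r1, r12); r3 = f(r1, r2)
trace = [("r1", ["r10", "r11"]), ("r2", ["r1", "r12"]), ("r3", ["r1", "r2"])]
print(simulate_rfc(trace, entries=6))  # (3, 0)
print(simulate_rfc(trace, entries=1))  # (4, 2)
```

In this trace, three of the six operand reads hit the 6-entry cache because they consume recently produced values. Values still in the cache when the trace ends are not counted as written back, so the write count is a lower bound in this model.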

MICRO - A compile-time managed multi-level Register file hierarchy

Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture - MICRO-44 '11, 2011

Co-Authors: Mark Gebhart, Stephen W. Keckler, William J. Dally


Mark Gebhart - One of the best experts on this subject based on the ideXlab platform.

A Compile-Time Managed Multi-Level Register File Hierarchy

ISCA - Energy-efficient mechanisms for managing thread context in throughput processors

MICRO - A compile-time managed multi-level Register file hierarchy

Stephen W. Keckler - One of the best experts on this subject based on the ideXlab platform.

A Compile-Time Managed Multi-Level Register File Hierarchy

ISCA - Energy-efficient mechanisms for managing thread context in throughput processors

MICRO - A compile-time managed multi-level Register file hierarchy