Associative Operation

The Experts below are selected from a list of 129 Experts worldwide, ranked by the ideXlab platform.

Henk Corporaal - One of the best experts on this subject based on the ideXlab platform.

  • Reduction Operator for Wide-SIMDs Reconsidered
    Proceedings of the 51st Annual Design Automation Conference (DAC '14), 2014
    Co-Authors: Luc Waeijen, Dongrui She, Henk Corporaal
    Abstract:

    It has been shown that wide Single Instruction Multiple Data architectures (wide-SIMDs) can achieve high energy efficiency, especially in domains such as image and vision processing. In these and various other application domains, reduction is a frequently encountered operation, in which multiple input elements are combined into a single element by an associative operation, e.g. addition or multiplication. Many applications require reduction, such as partial histogram merging, matrix multiplication, and min/max finding. Wide-SIMDs contain a large number of processing elements (PEs), which are in general connected by a minimal form of interconnect for scalability reasons. To efficiently support reduction operations on wide-SIMDs with such a minimal interconnect, we introduce two novel reduction algorithms that do not rely on complex communication networks or any dedicated hardware. The proposed approaches are compared with both dedicated hardware and other software solutions in terms of performance, area, and energy consumption. A practical case study demonstrates that the proposed software approach offers much better generality and flexibility at no additional hardware cost. Compared to a dedicated hardware adder tree, the proposed software approach saves 6.8% area with a performance penalty of only 6.5%.
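
    The reduction the abstract describes can be illustrated with a short, generic sketch (not the paper's own algorithm): because the combining operation is associative, N inputs can be folded pairwise in log2(N) steps, which is the dependence structure a hardware adder tree or a shuffle-based SIMD reduction exploits. The plain C sketch below assumes the element count is a power of two and passes the operator as a function pointer, so the same loop performs a sum, max, or any other associative reduction.

    #include <stdio.h>

    /* Generic illustration, not the paper's algorithm: reduce an array
     * with an associative operation in log2(n) pairwise steps, mirroring
     * the structure of a hardware adder tree. n must be a power of two. */

    typedef int (*assoc_op)(int, int);

    static int add(int a, int b)    { return a + b; }
    static int max_op(int a, int b) { return a > b ? a : b; }

    static int tree_reduce(int *v, int n, assoc_op op)
    {
        /* After each step, lanes [0, stride) hold the combination of
         * 2 * stride original elements. */
        for (int stride = n / 2; stride >= 1; stride /= 2)
            for (int i = 0; i < stride; i++)
                v[i] = op(v[i], v[i + stride]);
        return v[0];
    }

    int main(void)
    {
        int a[8] = {3, 1, 4, 1, 5, 9, 2, 6};
        int b[8] = {3, 1, 4, 1, 5, 9, 2, 6};
        printf("sum = %d\n", tree_reduce(a, 8, add));    /* 31 */
        printf("max = %d\n", tree_reduce(b, 8, max_op)); /*  9 */
        return 0;
    }

    Passing the operator as a parameter is what gives a software reduction the generality and flexibility claimed in the abstract: the same loop handles addition, multiplication, or min/max finding, whereas a hardwired adder tree supports only one operation.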

Tadashi Shibata - One of the best experts on this subject based on the ideXlab platform.

  • Right-Brain/Left-Brain Integrated Associative Processor Employing Convertible Multiple-Instruction-Stream Multiple-Data-Stream Elements
    Japanese Journal of Applied Physics, 2005
    Co-Authors: Hitoshi Hayakawa, M Ogawa, Tadashi Shibata
    Abstract:

    A very-large-scale integrated circuit (VLSI) architecture for a multiple-instruction-stream multiple-data-stream (MIMD) associative processor has been proposed. The processor employs an architecture that enables seamless switching between associative operations and arithmetic operations. The MIMD element is convertible to a regular central processing unit (CPU) while maintaining its high performance as an associative processor. The MIMD associative processor can therefore perform not only on-chip perception, i.e., searching the on-chip cache memory for the vector most similar to an input vector, but also arithmetic and logic operations like those of ordinary CPUs, both simultaneously and in parallel. Three key technologies have been developed to realize the MIMD element: calculation units that switch between associative and arithmetic operations, a versatile register-control scheme within the MIMD element for flexible operation, and a short instruction set that minimizes the memory required for program storage. Key circuit blocks were designed and fabricated in 0.18 µm complementary metal-oxide-semiconductor (CMOS) technology. The full-featured MIMD element is estimated to occupy 3 mm², demonstrating the feasibility of an 8-parallel-MIMD-element associative processor on a single 5 mm × 5 mm chip.
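
    The "on-chip perception" mentioned above is a nearest-vector (associative) search: the input vector is compared against every stored vector and the closest match is returned. The C sketch below shows that operation in plain software; the Manhattan-distance metric, the vector dimension, and the sample data are illustrative assumptions, not details of the fabricated chip, which performs the search in parallel across its MIMD elements.

    #include <stdio.h>
    #include <stdlib.h>
    #include <limits.h>

    /* Software sketch of an associative (nearest-match) search: scan a set
     * of stored template vectors and return the index of the one most
     * similar to the input. Metric and data are illustrative assumptions. */

    #define DIM 4

    static int manhattan(const int *a, const int *b)
    {
        int d = 0;
        for (int i = 0; i < DIM; i++)
            d += abs(a[i] - b[i]);
        return d;
    }

    static int associative_match(const int templates[][DIM], int count,
                                 const int *input)
    {
        int best = -1, best_d = INT_MAX;
        for (int i = 0; i < count; i++) {
            int d = manhattan(templates[i], input);
            if (d < best_d) { best_d = d; best = i; }
        }
        return best; /* index of the most similar stored vector */
    }

    int main(void)
    {
        const int templates[3][DIM] = {
            { 0, 0, 0, 0 },
            { 4, 4, 4, 4 },
            { 9, 1, 9, 1 },
        };
        const int input[DIM] = { 5, 3, 4, 4 };
        printf("best match: template %d\n",
               associative_match(templates, 3, input)); /* template 1 */
        return 0;
    }

    On the proposed processor each MIMD element can run this kind of matching or switch to ordinary arithmetic and logic operations, which is the convertibility the abstract emphasizes.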

Luc Waeijen - One of the best experts on this subject based on the ideXlab platform.

  • Reduction Operator for Wide-SIMDs Reconsidered
    Proceedings of the 51st Annual Design Automation Conference (DAC '14), 2014
    Co-Authors: Luc Waeijen, Dongrui She, Henk Corporaal
    Abstract:

    It has been shown that wide Single Instruction Multiple Data architectures (wide-SIMDs) can achieve high energy efficiency, especially in domains such as image and vision processing. In these and various other application domains, reduction is a frequently encountered operation, in which multiple input elements are combined into a single element by an associative operation, e.g. addition or multiplication. Many applications require reduction, such as partial histogram merging, matrix multiplication, and min/max finding. Wide-SIMDs contain a large number of processing elements (PEs), which are in general connected by a minimal form of interconnect for scalability reasons. To efficiently support reduction operations on wide-SIMDs with such a minimal interconnect, we introduce two novel reduction algorithms that do not rely on complex communication networks or any dedicated hardware. The proposed approaches are compared with both dedicated hardware and other software solutions in terms of performance, area, and energy consumption. A practical case study demonstrates that the proposed software approach offers much better generality and flexibility at no additional hardware cost. Compared to a dedicated hardware adder tree, the proposed software approach saves 6.8% area with a performance penalty of only 6.5%.

Hitoshi Hayakawa - One of the best experts on this subject based on the ideXlab platform.

  • Right-Brain/Left-Brain Integrated Associative Processor Employing Convertible Multiple-Instruction-Stream Multiple-Data-Stream Elements
    Japanese Journal of Applied Physics, 2005
    Co-Authors: Hitoshi Hayakawa, M Ogawa, Tadashi Shibata
    Abstract:

    A very-large-scale integrated circuit (VLSI) architecture for a multiple-instruction-stream multiple-data-stream (MIMD) associative processor has been proposed. The processor employs an architecture that enables seamless switching between associative operations and arithmetic operations. The MIMD element is convertible to a regular central processing unit (CPU) while maintaining its high performance as an associative processor. The MIMD associative processor can therefore perform not only on-chip perception, i.e., searching the on-chip cache memory for the vector most similar to an input vector, but also arithmetic and logic operations like those of ordinary CPUs, both simultaneously and in parallel. Three key technologies have been developed to realize the MIMD element: calculation units that switch between associative and arithmetic operations, a versatile register-control scheme within the MIMD element for flexible operation, and a short instruction set that minimizes the memory required for program storage. Key circuit blocks were designed and fabricated in 0.18 µm complementary metal-oxide-semiconductor (CMOS) technology. The full-featured MIMD element is estimated to occupy 3 mm², demonstrating the feasibility of an 8-parallel-MIMD-element associative processor on a single 5 mm × 5 mm chip.

Dongrui She - One of the best experts on this subject based on the ideXlab platform.

  • Reduction Operator for Wide-SIMDs Reconsidered
    Proceedings of the 51st Annual Design Automation Conference (DAC '14), 2014
    Co-Authors: Luc Waeijen, Dongrui She, Henk Corporaal
    Abstract:

    It has been shown that wide Single Instruction Multiple Data architectures (wide-SIMDs) can achieve high energy efficiency, especially in domains such as image and vision processing. In these and various other application domains, reduction is a frequently encountered operation, in which multiple input elements are combined into a single element by an associative operation, e.g. addition or multiplication. Many applications require reduction, such as partial histogram merging, matrix multiplication, and min/max finding. Wide-SIMDs contain a large number of processing elements (PEs), which are in general connected by a minimal form of interconnect for scalability reasons. To efficiently support reduction operations on wide-SIMDs with such a minimal interconnect, we introduce two novel reduction algorithms that do not rely on complex communication networks or any dedicated hardware. The proposed approaches are compared with both dedicated hardware and other software solutions in terms of performance, area, and energy consumption. A practical case study demonstrates that the proposed software approach offers much better generality and flexibility at no additional hardware cost. Compared to a dedicated hardware adder tree, the proposed software approach saves 6.8% area with a performance penalty of only 6.5%.
