Hash Table

The Experts below are selected from a list of 15639 Experts worldwide ranked by ideXlab platform

Viktor K Prasanna - One of the best experts on this subject based on the ideXlab platform.

  • A High Throughput Parallel Hash Table on FPGA using XOR-based Memory
    2020 IEEE High Performance Extreme Computing Conference (HPEC), 2020
    Co-Authors: Ruizhi Zhang, Sasindu Wijeratne, Yang Yang, Sanmukh R. Kuppannagari, Viktor K Prasanna
    Abstract:

    A hash table is a fundamental data structure for quick search and retrieval of data. It is a key component in complex graph analytics and AI/ML applications. State-of-the-art parallel hash table implementations either make simplifying assumptions, such as supporting only a subset of hash table operations, or employ optimizations whose performance is highly data dependent and, in the worst case, can be similar to that of a sequential implementation. In contrast, in this work we develop a dynamic hash table that supports all the hash table queries (search, insert, delete, and update) while supporting p parallel queries (p > 1) per clock cycle via p processing engines (PEs) in the worst case, i.e., the performance is data agnostic. We achieve this by implementing novel XOR-based multi-ported block memories on FPGAs. Additionally, we develop a technique to optimize the memory requirement of the hash table if the ratio of search to insert/update/delete queries is known beforehand. We implement our design on state-of-the-art FPGA devices. Our design is scalable to 16 PEs and supports throughput up to 5926 MOPS. It matches the throughput of the state-of-the-art hash table design, FASTHash, which only supports search and insert operations. Compared with the best FPGA design that supports the same set of operations, our hash table achieves up to 12.3x speedup.
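
The abstract above does not describe how its XOR-based multi-ported memories are built, so the following is only a minimal software sketch of the generic XOR encoding idea, not the authors' design. It assumes one bank per write port and that no two ports write the same address in one cycle; the class and method names are hypothetical.

```python
from functools import reduce
from operator import xor


class XorMultiPortMemory:
    """Software model of a multi-write-port memory built from one bank per
    write port, using XOR encoding (illustrative assumption, not the paper's RTL)."""

    def __init__(self, num_write_ports, depth):
        # Every bank starts at zero, so every address initially decodes to 0.
        self.banks = [[0] * depth for _ in range(num_write_ports)]

    def read(self, addr):
        # The logical value is the XOR of the word at `addr` across all banks.
        return reduce(xor, (bank[addr] for bank in self.banks), 0)

    def write_cycle(self, writes):
        """Apply one clock cycle of writes given as {port: (addr, value)}."""
        addrs = [addr for addr, _ in writes.values()]
        if len(addrs) != len(set(addrs)):
            raise ValueError("two ports wrote the same address in one cycle")
        staged = []
        for port, (addr, value) in writes.items():
            # Encode the new value against the words held by all other banks,
            # so that XOR-ing every bank at `addr` yields `value` again.
            others = reduce(
                xor,
                (bank[addr] for i, bank in enumerate(self.banks) if i != port),
                0,
            )
            staged.append((port, addr, value ^ others))
        # Commit after all encodings were computed from the old state.
        for port, addr, encoded in staged:
            self.banks[port][addr] = encoded


mem = XorMultiPortMemory(num_write_ports=4, depth=8)
mem.write_cycle({0: (3, 0xAB), 1: (5, 0x11)})  # two writes in the same cycle
mem.write_cycle({2: (3, 0x42)})                # a different port overwrites address 3
print(hex(mem.read(3)), hex(mem.read(5)))
```

Each port only ever writes its own bank, which is why several writes can proceed in the same cycle without port conflicts; the cost is that every bank is read on each access.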

  • FASTHash: FPGA-Based High Throughput Parallel Hash Table
    IEEE International Conference on High Performance Computing Data and Analytics, 2020
    Co-Authors: Yang Yang, Sanmukh R. Kuppannagari, Ajitesh Srivastava, Rajgopal Kannan, Viktor K Prasanna
    Abstract:

    A hash table is a fundamental data structure that provides efficient data storage and access. It is a key component in AI applications that rely on building a model of the environment from observations and performing lookups on the model for newer observations. In this work, we develop FASTHash, a “truly” high throughput parallel hash table implementation using FPGA on-chip SRAM. Contrary to state-of-the-art hash table implementations on CPU, GPU, and FPGA, the parallelism in our design is data independent, allowing us to support p parallel queries (p > 1) per clock cycle via p processing engines (PEs) in the worst case. Our novel data organization and query flow techniques allow full utilization of abundant low-latency on-chip SRAM and enable conflict-free concurrent insertions. Our hash table ensures relaxed eventual consistency: inserts from a PE are visible to all PEs with some latency. We provide a theoretical worst-case bound on the number of erroneous queries (true negative search, duplicate inserts) due to relaxed eventual consistency. We customize our design to implement both static and dynamic hash tables on state-of-the-art FPGA devices. Our implementations are scalable to 16 PEs and support throughput as high as 5360 million operations per second with PEs running at 335 MHz for static hashing and 4480 million operations per second with PEs running at 280 MHz for dynamic hashing. They outperform state-of-the-art implementations by 5.7x and 8.7x, respectively.
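
The throughput figures quoted above are consistent with each of the 16 PEs completing one query per clock cycle. A small arithmetic check, assuming exactly that model (the function name is hypothetical):

```python
def peak_throughput_mops(num_pes, clock_mhz):
    """Peak throughput in million operations per second, assuming every
    PE completes one query per clock cycle."""
    return num_pes * clock_mhz


print(peak_throughput_mops(16, 335))  # static hashing: 5360
print(peak_throughput_mops(16, 280))  # dynamic hashing: 4480
```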

  • ISC - FASTHash: FPGA-Based High Throughput Parallel Hash Table
    Lecture Notes in Computer Science, 2020
    Co-Authors: Yang Yang, Sanmukh R. Kuppannagari, Ajitesh Srivastava, Rajgopal Kannan, Viktor K Prasanna

  • High Throughput Online Hash Table on FPGA
    International Parallel and Distributed Processing Symposium, 2015
    Co-Authors: Da Tong, Shijie Zhou, Viktor K Prasanna
    Abstract:

    Hash tables are widely used in many network applications such as packet classification, traffic classification, and heavy hitter detection. In this paper, we present a pipelined architecture for a high throughput online hash table on FPGA. The proposed architecture supports search, insert, and delete operations at line rate for a massive hash table stored in off-chip memory. We propose two hash table access schemes: (1) the first scheme assigns each hash entry multiple slots to reduce the hash collision rate; each slot can store the corresponding hash key of the hash entry; (2) the second scheme has a higher hash collision rate but a lower off-chip memory bandwidth requirement than the first scheme. Both schemes guarantee line rate processing when using memory devices with sufficient access bandwidth. We design an application-specific data forwarding unit to deal with potential data hazards. Our architecture ensures that no stalling is required to process any sequence of concurrent operations while tolerating large external memory access latency. On a state-of-the-art FPGA, the proposed architecture achieves 66-85 Gbps throughput while supporting hash tables with various numbers of entries, key sizes, and DRAM access latencies. Our design also shows good scalability in terms of throughput for various hash table configurations.
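
The first access scheme above, in which each hash entry has several slots, can be sketched in software as follows. This is only an illustration under assumptions: the class name, the CRC-32 default hash, and the policy of rejecting an insert when an entry is full are not taken from the paper, and the pipelining and data forwarding it describes are not modeled.

```python
import zlib


class MultiSlotHashTable:
    """Hash table in which every hash entry owns a fixed number of key slots."""

    def __init__(self, num_entries, slots_per_entry, hash_fn=zlib.crc32):
        self.num_entries = num_entries
        self.hash_fn = hash_fn  # hypothetical choice; the abstract names none
        self.entries = [[None] * slots_per_entry for _ in range(num_entries)]

    def _entry(self, key):
        # All slots of one entry are fetched together, like one memory word.
        return self.entries[self.hash_fn(key) % self.num_entries]

    def search(self, key):
        return key in self._entry(key)

    def insert(self, key):
        entry = self._entry(key)
        if key in entry:
            return True  # already stored
        for i, slot in enumerate(entry):
            if slot is None:
                entry[i] = key
                return True
        return False  # every slot is taken: a collision the entry cannot absorb

    def delete(self, key):
        entry = self._entry(key)
        for i, slot in enumerate(entry):
            if slot == key:
                entry[i] = None
                return True
        return False


table = MultiSlotHashTable(num_entries=1024, slots_per_entry=4)
table.insert(b"10.0.0.1:443")
print(table.search(b"10.0.0.1:443"))  # True
```

More slots per entry lower the chance that an insert is rejected, at the price of reading a wider entry on every access, which matches the bandwidth trade-off the abstract draws between its two schemes.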

  • IPDPS Workshops - High-Throughput Online Hash Table on FPGA
    2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015
    Co-Authors: Da Tong, Shijie Zhou, Viktor K Prasanna

Usman Younis - One of the best experts on this subject based on the ideXlab platform.

  • Randomness testing of non-cryptographic hash functions for real-time hash table based storage and look-up of URLs
    Journal of Network and Computer Applications, 2014
    Co-Authors: Tahir Ahmad, Usman Younis
    Abstract:

    Non-cryptographic hash functions have been investigated to identify their pseudo-random nature when employed in the implementation of hash tables for real-time storage and look-up of uniform resource locators. Statistical studies have been performed on the sequences generated using five widely used non-cryptographic hash functions: (1) CRC, (2) Adler, (3) DJBX33A, (4) FNV, and (5) Murmur. The comparative analysis of the tested non-cryptographic hash functions shows that the Adler hash function is not suitable for hash table implementation, whereas the rest of the non-cryptographic hash functions exhibit similar and better randomizing features, which make them an attractive choice for hash table implementation.
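
Four of the five hash functions named above can be sketched with the Python standard library plus a few lines of code. This is an illustration, not the authors' test setup: CRC and Adler are taken here to be the 32-bit variants in `zlib`, FNV is assumed to be 32-bit FNV-1a, Murmur is omitted because the standard library does not provide it, and the URLs are synthetic. The bucket count is only a crude stand-in for the statistical tests the abstract refers to.

```python
import zlib

MASK32 = 0xFFFFFFFF


def djbx33a(data):
    """Bernstein's 'times 33 with addition' hash, truncated to 32 bits."""
    h = 5381
    for byte in data:
        h = (h * 33 + byte) & MASK32
    return h


def fnv1a_32(data):
    """32-bit FNV-1a (the variant is an assumption; the abstract says only 'FNV')."""
    h = 0x811C9DC5
    for byte in data:
        h = ((h ^ byte) * 0x01000193) & MASK32
    return h


HASHES = {
    "CRC": zlib.crc32,
    "Adler": zlib.adler32,
    "DJBX33A": djbx33a,
    "FNV": fnv1a_32,
}


def bucket_counts(keys, table_size, hash_fn):
    """Count how many keys land in each bucket of a table of `table_size`."""
    counts = [0] * table_size
    for key in keys:
        counts[hash_fn(key) % table_size] += 1
    return counts


# Synthetic URLs, used only to exercise the functions.
urls = [f"http://example.com/page/{i}".encode() for i in range(1000)]
for name, fn in HASHES.items():
    counts = bucket_counts(urls, 256, fn)
    print(name, "largest bucket:", max(counts), "empty buckets:", counts.count(0))
```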

  • Performance Analysis of Non-cryptographic Hash Functions for Real-Time Storage and Lookup of URLs
    2013
    Co-Authors: Tahir Ahmad, Usman Younis
    Abstract:

    In this work, the performance of various non-cryptographic hash functions has been investigated to identify their random nature when employed in the implementation of hash tables for real-time storage and lookup of uniform resource locators. The performance analysis is performed mainly using statistical studies on the sequences generated using five widely used non-cryptographic hash functions: 1) CRC, 2) Adler, 3) FNV, 4) DJBX33A, and 5) Murmur. The comparative analysis of the tested non-cryptographic hash functions shows that the Adler hash function is not suitable for hash table implementation, whereas the rest of the non-cryptographic hash functions exhibit similar and better randomizing features, which make them an attractive choice for hash table implementation. The results of these statistical studies have been verified by implementing hash tables using these non-cryptographic hash functions. The implementation results show that the average number of probes for the Adler-based hash table varies between 1.25 and 2.75 for different load factors and hash table sizes, whereas for the rest of the non-cryptographic hash functions the average number of probes in a hash table is approximately 1, which is highly desirable for real-time network applications. This shows that the CRC, FNV, DJBX33A, and Murmur non-cryptographic hash functions are good choices for hash table based implementation for real-time storage and lookup of uniform resource locators.
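
The average number of probes reported above can be measured with a small experiment like the one below. The abstract does not say which collision-resolution scheme was used, so linear probing is an assumption here, as are the synthetic URLs, the table size, and the load factors; the output is not expected to reproduce the paper's figures.

```python
import zlib


def average_probes(keys, table_size, hash_fn):
    """Insert distinct keys with linear probing and return the mean number
    of slots examined per insertion."""
    if not keys or len(keys) > table_size:
        raise ValueError("need between 1 and table_size keys")
    table = [None] * table_size
    total_probes = 0
    for key in keys:
        idx = hash_fn(key) % table_size
        probes = 1
        while table[idx] is not None:  # slot taken: step to the next one
            idx = (idx + 1) % table_size
            probes += 1
        table[idx] = key
        total_probes += probes
    return total_probes / len(keys)


TABLE_SIZE = 1024
for load_factor in (0.25, 0.5, 0.75):
    n = int(TABLE_SIZE * load_factor)
    urls = [f"http://example.com/page/{i}".encode() for i in range(n)]
    for name, fn in (("CRC", zlib.crc32), ("Adler", zlib.adler32)):
        print(load_factor, name, round(average_probes(urls, TABLE_SIZE, fn), 2))
```

A value close to 1 means almost every key lands in an empty slot on the first attempt, which is the behaviour the abstract describes as desirable for real-time lookups.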

Da Tong - One of the best experts on this subject based on the ideXlab platform.

  • High Throughput Online Hash Table on FPGA
    International Parallel and Distributed Processing Symposium, 2015
    Co-Authors: Da Tong, Shijie Zhou, Viktor K Prasanna

  • IPDPS Workshops - High-Throughput Online Hash Table on FPGA
    2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015
    Co-Authors: Da Tong, Shijie Zhou, Viktor K Prasanna

Tahir Ahmad - One of the best experts on this subject based on the ideXlab platform.

  • Randomness testing of non-cryptographic hash functions for real-time hash table based storage and look-up of URLs
    Journal of Network and Computer Applications, 2014
    Co-Authors: Tahir Ahmad, Usman Younis

  • Performance Analysis of Non-cryptographic Hash Functions for Real-Time Storage and Lookup of URLs
    2013
    Co-Authors: Tahir Ahmad, Usman Younis

Yang Yang - One of the best experts on this subject based on the ideXlab platform.

  • A High Throughput Parallel Hash Table on FPGA using XOR-based Memory
    2020 IEEE High Performance Extreme Computing Conference (HPEC), 2020
    Co-Authors: Ruizhi Zhang, Sasindu Wijeratne, Yang Yang, Sanmukh R. Kuppannagari, Viktor K Prasanna

  • FASTHash: FPGA-Based High Throughput Parallel Hash Table
    IEEE International Conference on High Performance Computing Data and Analytics, 2020
    Co-Authors: Yang Yang, Sanmukh R. Kuppannagari, Ajitesh Srivastava, Rajgopal Kannan, Viktor K Prasanna

  • ISC - FASTHash: FPGA-Based High Throughput Parallel Hash Table
    Lecture Notes in Computer Science, 2020
    Co-Authors: Yang Yang, Sanmukh R. Kuppannagari, Ajitesh Srivastava, Rajgopal Kannan, Viktor K Prasanna