State Transition Table

The Experts below are selected from a list of 303 Experts worldwide ranked by the ideXlab platform.

Viktor K. Prasanna - One of the best experts on this subject based on the ideXlab platform.

  • Space-time tradeoff in regular expression matching with semi-deterministic finite automata
    Proceedings - IEEE INFOCOM, 2011
    Co-Authors: Yi-Hua E. Yang, Viktor K. Prasanna
    Abstract:

    Regular expression matching (REM) with nondeterministic finite automata (NFA) can be computationally expensive when a large number of patterns are matched concurrently. On the other hand, converting the NFA to a deterministic finite automaton (DFA) can cause State explosion, where the number of States and Transitions in the DFA is exponentially larger than in the NFA. In this paper, we seek to answer the following question: to match an arbitrary set of regular expressions, is there a finite automaton that lies between the NFA and DFA in terms of computation and memory complexities? We introduce the semi-deterministic finite automaton (SFA) and the State convolvement test to construct an SFA from a given NFA. An SFA consists of a fixed number (p) of constituent DFAs (c-DFAs) running in parallel; each c-DFA is responsible for a subset of States in the original NFA. To match a set of regular expressions with n overlapping symbols (which can match the same input character concurrently), the NFA can require O(n) computation per input character, whereas the DFA can have a State Transition Table with O(2^n) States. By exploiting the State convolvements during the SFA construction, an equivalent SFA reduces the computation complexity to O(p^2/c^2) per input character while limiting the space requirement to O(|Σ| × p^2 × (n/p)^c) States, where Σ is the alphabet and c ≥ 1 is a small design constant. Although the problem of constructing the optimal (minimum-sized) SFA is shown to be NP-complete, we develop a greedy heuristic that quickly constructs a near-optimal SFA in time and space quadratic in the number of States in the original NFA. We demonstrate our SFA construction using real-world regular expressions taken from the Snort IDS.
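
To make the NFA-vs-DFA tradeoff in the abstract above concrete, here is a minimal Python sketch (not the authors' SFA construction) of the classic subset construction that turns an NFA into a DFA State Transition Table; the toy pattern is chosen only to show how the number of DFA states can grow much faster than the number of NFA states.

```python
# Minimal sketch, not the paper's SFA algorithm: subset construction from an
# NFA to a DFA state-transition table (STT). Per-character matching on the STT
# is a single lookup, but the table can hold up to O(2^n) states.
from collections import deque

def nfa_to_dfa_stt(nfa, start, alphabet):
    """nfa: dict mapping (state, symbol) -> set of next NFA states."""
    start_set = frozenset([start])
    ids = {start_set: 0}          # subset of NFA states -> DFA state id
    stt = {}                      # (dfa_state, symbol) -> dfa_state
    work = deque([start_set])
    while work:
        subset = work.popleft()
        for sym in alphabet:
            nxt = frozenset(s for q in subset for s in nfa.get((q, sym), ()))
            if nxt not in ids:
                ids[nxt] = len(ids)
                work.append(nxt)
            stt[(ids[subset], sym)] = ids[nxt]
    return stt, ids

# Toy 4-state NFA for ".*a.." over the alphabet {a, b}: the DFA must remember
# where recent 'a's occurred, so its state count grows roughly as 2^k for ".{k}".
nfa = {
    (0, 'a'): {0, 1}, (0, 'b'): {0},
    (1, 'a'): {2},    (1, 'b'): {2},
    (2, 'a'): {3},    (2, 'b'): {3},
}
stt, ids = nfa_to_dfa_stt(nfa, start=0, alphabet='ab')
print(len(ids), "DFA states vs 4 NFA states;", len(stt), "STT entries")
```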

  • Optimizing regular expression matching with SR-NFA on multi-core systems
    Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT, 2011
    Co-Authors: Yi-Hua E. Yang, Viktor K. Prasanna
    Abstract:

    Conventionally, regular expression matching (REM) has been performed by sequentially comparing the regular expression (regex) to the input stream, which can be slow due to excessive backtracking [smith:acsac06]. Alternatively, the regex can be converted to a deterministic finite automaton (DFA) for efficient matching, which however may require an extremely large State Transition Table (STT) due to exponential State explosion [meyer:swat71, yu:ancs06]. We propose the segmented regex-NFA (SR-NFA) architecture, where the regex is first compiled into modular nondeterministic finite automata (NFA), then partitioned, optimized, and matched efficiently on modern multi-core processors. SR-NFA offers attack-resilient multi-gigabit per second matching throughput, does not suffer from either backtracking or State explosion, and can be rapidly constructed. For regex sets that construct a DFA with moderate State explosion, i.e., on average 200k States in the STT, the proposed SR-NFA is 367k times faster to construct and update and uses 23k times less memory than the DFA approach. Running on an 8-core 2.6 GHz Opteron platform, our prototype achieves 2.2 Gbps average matching throughput for regex sets with up to 4,000 SR-NFA States per regex set.
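
As a complement to the abstract above, the following hedged Python sketch shows only the general principle SR-NFA builds on: active-state-set NFA simulation, which avoids both backtracking and DFA State explosion. The segmentation, optimization, and multi-core scheduling described in the paper are not reproduced here.

```python
# Minimal sketch, not the SR-NFA implementation: simulate the NFA's set of
# active states directly. There is no backtracking and no DFA STT, so memory
# stays proportional to the NFA size; throughput is what multi-core
# partitioning (as in SR-NFA) is meant to recover.
def nfa_search(nfa, start, accept, text):
    """nfa: dict mapping (state, symbol) -> set of next states."""
    active = {start}
    for i, ch in enumerate(text):
        active = {s for q in active for s in nfa.get((q, ch), ())} | {start}
        if active & accept:
            return i          # position where some pattern first matches
    return -1

# Toy NFA recognizing the substring "ab".
nfa = {(0, 'a'): {1}, (1, 'b'): {2}}
print(nfa_search(nfa, start=0, accept={2}, text="xxabyy"))  # 3
```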

  • Optimizing Regular Expression Matching with SR-NFA on Multi-Core Systems
    2011 International Conference on Parallel Architectures and Compilation Techniques, 2011
    Co-Authors: Yi-Hua E. Yang, Viktor K. Prasanna
    Abstract:

    Conventionally, regular expression matching (REM) has been performed by sequentially comparing the regular expression (regex) to the input stream, which can be slow due to excessive backtracking [21]. Alternatively, the regex can be converted to a deterministic finite automaton (DFA) for efficient matching, which however may require an extremely large State Transition Table (STT) due to exponential State explosion [17, 27]. We propose the segmented regex-NFA (SR-NFA) architecture, where the regex is first compiled into modular nondeterministic finite automata (NFA), then partitioned, optimized, and matched efficiently on modern multi-core processors. SR-NFA offers attack-resilient multi-gigabit per second matching throughput, does not suffer from either backtracking or State explosion, and can be rapidly constructed. For regex sets that construct a DFA with moderate State explosion, i.e., on average 200k States in the STT, the proposed SR-NFA is 367k times faster to construct and update and uses 23k times less memory than the DFA approach. Running on an 8-core 2.6 GHz Opteron platform, our prototype achieves 2.2 Gbps average matching throughput for regex sets with up to 4,000 SR-NFA States per regex set.

  • Head-body partitioned string matching for deep packet inspection with scalable and attack-resilient performance
    Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, IPDPS 2010, 2010
    Co-Authors: Yi-Hua E. Yang, Viktor K. Prasanna, Chenqian Jiang
    Abstract:

    Dictionary-based string matching (DBSM) is a critical component of Deep Packet Inspection (DPI), where thousands of malicious patterns are matched against high-bandwidth network traffic. Deterministic finite automata constructed with the Aho-Corasick algorithm (AC-DFA) have been widely used for solving this problem. However, the State Transition Table (STT) of a large-scale DBSM AC-DFA can span hundreds of megabytes of system memory, whose limited bandwidth and long latency can become the performance bottleneck. We propose a novel partitioning algorithm which converts an AC-DFA into a "head" part and a "body" part. The head part behaves as a traditional AC-DFA that matches the pattern prefixes up to a predefined length; the body part extends any head match to the full pattern length in parallel body-tree traversals. Taking advantage of the SIMD instructions in modern x86-64 multi-core processors, we design compact and efficient data structures packing multi-path and multi-stride pattern segments in the body-tree. Compared with an optimized AC-DFA solution, our head-body matching (HBM) implementation achieves 1.2x to 3x the throughput when the input match (attack) ratio varies from 2% to 32%. Our HBM data structure is over 20x smaller than a fully-populated AC-DFA for both the Snort and ClamAV dictionaries. The aggregated throughput of our HBM approach scales almost 7x with 8 threads, to over 10 Gbps on a dual-socket quad-core Opteron (Shanghai) server.
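
The following Python sketch illustrates only the head-body idea at a very high level; HEAD_LEN, the dictionary-of-prefixes "head", and plain string comparison as the "body" step are simplifications standing in for the paper's AC-DFA head and SIMD body-tree.

```python
# Hypothetical simplification of head-body matching (HBM), not the paper's
# data structures: a small "head" index matches only the first HEAD_LEN bytes
# of every pattern; each head hit is then extended to the full pattern by a
# direct comparison, standing in for the parallel body-tree traversal.
HEAD_LEN = 4

def build_head(patterns):
    head = {}
    for p in patterns:
        head.setdefault(p[:HEAD_LEN], []).append(p)
    return head

def hbm_scan(head, text):
    hits = []
    for i in range(len(text)):
        for full in head.get(text[i:i + HEAD_LEN], ()):
            if text.startswith(full, i):          # "body" extension
                hits.append((i, full))
    return hits

patterns = ["cmd.exe", "/etc/passwd", "select union"]
print(hbm_scan(build_head(patterns), "GET /etc/passwd HTTP/1.0"))
# [(4, '/etc/passwd')]
```

The point of the split is that the head structure stays small and dense, while the comparatively rare body extensions can afford to touch larger, slower memory, which matches the motivation given in the abstract.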

  • Multi-Core Architecture on FPGA for Large Dictionary String Matching
    2009 17th IEEE Symposium on Field Programmable Custom Computing Machines, 2009
    Co-Authors: Qingbo Wang, Viktor K. Prasanna
    Abstract:

    FPGAs have long been considered an attractive platform for high-performance implementations of string matching. However, as the size of pattern dictionaries continues to grow, such large dictionaries can only be stored in external DRAM. The increased memory latency and limited bandwidth pose new challenges to FPGA-based designs, and the lack of spatial and temporal locality in data access also leads to low utilization of memory bandwidth. In this paper, we propose a multi-core architecture on FPGA to address these challenges. We adopt the popular Aho-Corasick (AC-opt) algorithm for our string matching engine. Exploiting the data access pattern of this algorithm, we design a specialized BRAM buffer for the cores to take advantage of the data reuse present in such applications. Several design optimization techniques are applied to realize a simple design with a high clock rate for the string matching engine. An implementation of a 2-core system with one shared BRAM buffer on a Virtex-5 LX155 achieves up to 3.2 Gbps throughput on a 64 MB State Transition Table stored in DRAM. The performance of systems with more cores is also evaluated for this architecture, and a throughput of over 5.5 Gbps can be obtained for some application scenarios.
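
As a rough illustration of why the STT ends up in DRAM, the arithmetic below uses assumed parameters (65,536 AC-DFA states and 4-byte next-state entries, neither of which is stated in the abstract) that happen to reproduce a 64 MiB table, and sketches a hypothetical two-level lookup with a small on-chip buffer for frequently visited states.

```python
# Back-of-the-envelope sketch with assumed numbers, not the paper's design.
states = 65_536                     # assumed AC-DFA state count
entry_bytes = 4                     # assumed next-state entry width
stt_bytes = states * 256 * entry_bytes
print(f"full STT: {stt_bytes / 2**20:.0f} MiB")   # 64 MiB

# Hypothetical two-level lookup: states near the root are visited on almost
# every input byte, so caching them on-chip (BRAM) captures most accesses.
def next_state(state, byte, bram_rows, dram_stt):
    row = bram_rows[state] if state in bram_rows else dram_stt[state]
    return row[byte]
```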

Reza Langari - One of the best experts on this subject based on the ideXlab platform.

  • A Real-Time Fuzzy Learning Algorithm for Markov Chain and Its Application on Prediction of Vehicle Speed*
    2019 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 2019
    Co-Authors: Qingyu Zhang, Dimitar Filev, Steven Szwabowski, Reza Langari
    Abstract:

    This paper presents a real-time-capable recursive fuzzy learning algorithm (FLA) for learning the Transition probabilities of a Markov Chain (MC) from observed information, and evaluates its performance on speed prediction. In detail, the real-time State Transition observed at each step is used as the latest information to update the MC. Accordingly, the FLA locates a parallelogram-shaped area in the MC State Transition Table in a fuzzy way. Cells in this area are updated with different weights, such that the Transition probabilities of Transitions more similar to the observed one receive a larger increase, while those of less similar Transitions receive a smaller one. Numeric examples illustrate the FLA's learning pattern, its good vehicle-speed prediction capability, and its low computation cost compared to a standard MC, a constant velocity model, a constant acceleration model, an autoregressive model with exogenous input, and back-propagation neural networks.
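
A minimal Python sketch of the general idea follows; the rectangular neighborhood, linear weight decay, and learning rate used here are assumptions for illustration, not the paper's parallelogram region or membership functions.

```python
# Minimal sketch of fuzzy reinforcement of a Markov-chain transition table
# (neighborhood shape and weights are assumptions, not the paper's): when a
# transition i -> j is observed, nearby cells are also reinforced, with weights
# that decay with distance from (i, j); each row is then renormalized so it
# remains a probability distribution.
import numpy as np

def fuzzy_update(P, i, j, lr=0.1, width=1):
    n = P.shape[0]
    for di in range(-width, width + 1):
        for dj in range(-width, width + 1):
            r, c = i + di, j + dj
            if 0 <= r < n and 0 <= c < n:
                w = lr * (1 - max(abs(di), abs(dj)) / (width + 1))
                P[r, c] += w            # larger boost for more similar cells
    P /= P.sum(axis=1, keepdims=True)   # keep rows stochastic
    return P

# Usage: 5 speed bins, uniform prior, then learn from observed bin transitions.
P = np.full((5, 5), 0.2)
for prev_bin, next_bin in [(1, 2), (2, 3), (2, 2), (3, 3)]:
    P = fuzzy_update(P, prev_bin, next_bin)
print(np.round(P, 2))
```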

  • FUZZ-IEEE - A Real-Time Fuzzy Learning Algorithm for Markov Chain and Its Application on Prediction of Vehicle Speed*
    2019 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 2019
    Co-Authors: Qingyu Zhang, Dimitar Filev, Steven Szwabowski, Reza Langari
    Abstract:

    This paper presents a real-time-capable recursive fuzzy learning algorithm (FLA) for learning the Transition probabilities of a Markov Chain (MC) from observed information, and evaluates its performance on speed prediction. In detail, the real-time State Transition observed at each step is used as the latest information to update the MC. Accordingly, the FLA locates a parallelogram-shaped area in the MC State Transition Table in a fuzzy way. Cells in this area are updated with different weights, such that the Transition probabilities of Transitions more similar to the observed one receive a larger increase, while those of less similar Transitions receive a smaller one. Numeric examples illustrate the FLA's learning pattern, its good vehicle-speed prediction capability, and its low computation cost compared to a standard MC, a constant velocity model, a constant acceleration model, an autoregressive model with exogenous input, and back-propagation neural networks.

Qingyu Zhang - One of the best experts on this subject based on the ideXlab platform.

  • A Real-Time Fuzzy Learning Algorithm for Markov Chain and Its Application on Prediction of Vehicle Speed*
    2019 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 2019
    Co-Authors: Qingyu Zhang, Dimitar Filev, Steven Szwabowski, Reza Langari
    Abstract:

    This paper presents a real-time-capable recursive fuzzy learning algorithm (FLA) for learning the Transition probabilities of a Markov Chain (MC) from observed information, and evaluates its performance on speed prediction. In detail, the real-time State Transition observed at each step is used as the latest information to update the MC. Accordingly, the FLA locates a parallelogram-shaped area in the MC State Transition Table in a fuzzy way. Cells in this area are updated with different weights, such that the Transition probabilities of Transitions more similar to the observed one receive a larger increase, while those of less similar Transitions receive a smaller one. Numeric examples illustrate the FLA's learning pattern, its good vehicle-speed prediction capability, and its low computation cost compared to a standard MC, a constant velocity model, a constant acceleration model, an autoregressive model with exogenous input, and back-propagation neural networks.

  • FUZZ-IEEE - A Real-Time Fuzzy Learning Algorithm for Markov Chain and Its Application on Prediction of Vehicle Speed*
    2019 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 2019
    Co-Authors: Qingyu Zhang, Dimitar Filev, Steven Szwabowski, Reza Langari
    Abstract:

    This paper presents a real-time-capable recursive fuzzy learning algorithm (FLA) for learning the Transition probabilities of a Markov Chain (MC) from observed information, and evaluates its performance on speed prediction. In detail, the real-time State Transition observed at each step is used as the latest information to update the MC. Accordingly, the FLA locates a parallelogram-shaped area in the MC State Transition Table in a fuzzy way. Cells in this area are updated with different weights, such that the Transition probabilities of Transitions more similar to the observed one receive a larger increase, while those of less similar Transitions receive a smaller one. Numeric examples illustrate the FLA's learning pattern, its good vehicle-speed prediction capability, and its low computation cost compared to a standard MC, a constant velocity model, a constant acceleration model, an autoregressive model with exogenous input, and back-propagation neural networks.

Yi-Hua E. Yang - One of the best experts on this subject based on the ideXlab platform.

  • Optimizing regular expression matching with SR-NFA on multi-core systems
    Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT, 2011
    Co-Authors: Yi-Hua E. Yang, Viktor K. Prasanna
    Abstract:

    Conventionally, regular expression matching (REM) has been performed by sequentially comparing the regular expression (regex) to the input stream, which can be slow due to excessive backtracking [smith:acsac06]. Alternatively, the regex can be converted to a deterministic finite automaton (DFA) for efficient matching, which however may require an extremely large State Transition Table (STT) due to exponential State explosion [meyer:swat71, yu:ancs06]. We propose the segmented regex-NFA (SR-NFA) architecture, where the regex is first compiled into modular nondeterministic finite automata (NFA), then partitioned, optimized, and matched efficiently on modern multi-core processors. SR-NFA offers attack-resilient multi-gigabit per second matching throughput, does not suffer from either backtracking or State explosion, and can be rapidly constructed. For regex sets that construct a DFA with moderate State explosion, i.e., on average 200k States in the STT, the proposed SR-NFA is 367k times faster to construct and update and uses 23k times less memory than the DFA approach. Running on an 8-core 2.6 GHz Opteron platform, our prototype achieves 2.2 Gbps average matching throughput for regex sets with up to 4,000 SR-NFA States per regex set.

  • Space-time tradeoff in regular expression matching with semi-deterministic finite automata
    Proceedings - IEEE INFOCOM, 2011
    Co-Authors: Yi-Hua E. Yang, Viktor K. Prasanna
    Abstract:

    Regular expression matching (REM) with nondeterministic finite automata (NFA) can be computationally expensive when a large number of patterns are matched concurrently. On the other hand, converting the NFA to a deterministic finite automaton (DFA) can cause State explosion, where the number of States and Transitions in the DFA is exponentially larger than in the NFA. In this paper, we seek to answer the following question: to match an arbitrary set of regular expressions, is there a finite automaton that lies between the NFA and DFA in terms of computation and memory complexities? We introduce the semi-deterministic finite automaton (SFA) and the State convolvement test to construct an SFA from a given NFA. An SFA consists of a fixed number (p) of constituent DFAs (c-DFAs) running in parallel; each c-DFA is responsible for a subset of States in the original NFA. To match a set of regular expressions with n overlapping symbols (which can match the same input character concurrently), the NFA can require O(n) computation per input character, whereas the DFA can have a State Transition Table with O(2^n) States. By exploiting the State convolvements during the SFA construction, an equivalent SFA reduces the computation complexity to O(p^2/c^2) per input character while limiting the space requirement to O(|Σ| × p^2 × (n/p)^c) States, where Σ is the alphabet and c ≥ 1 is a small design constant. Although the problem of constructing the optimal (minimum-sized) SFA is shown to be NP-complete, we develop a greedy heuristic that quickly constructs a near-optimal SFA in time and space quadratic in the number of States in the original NFA. We demonstrate our SFA construction using real-world regular expressions taken from the Snort IDS.

  • Optimizing Regular Expression Matching with SR-NFA on Multi-Core Systems
    2011 International Conference on Parallel Architectures and Compilation Techniques, 2011
    Co-Authors: Yi-Hua E. Yang, Viktor K. Prasanna
    Abstract:

    Conventionally, regular expression matching (REM) has been performed by sequentially comparing the regular expression (regex) to the input stream, which can be slow due to excessive backtracking [21]. Alternatively, the regex can be converted to a deterministic finite automaton (DFA) for efficient matching, which however may require an extremely large State Transition Table (STT) due to exponential State explosion [17, 27]. We propose the segmented regex-NFA (SR-NFA) architecture, where the regex is first compiled into modular nondeterministic finite automata (NFA), then partitioned, optimized, and matched efficiently on modern multi-core processors. SR-NFA offers attack-resilient multi-gigabit per second matching throughput, does not suffer from either backtracking or State explosion, and can be rapidly constructed. For regex sets that construct a DFA with moderate State explosion, i.e., on average 200k States in the STT, the proposed SR-NFA is 367k times faster to construct and update and uses 23k times less memory than the DFA approach. Running on an 8-core 2.6 GHz Opteron platform, our prototype achieves 2.2 Gbps average matching throughput for regex sets with up to 4,000 SR-NFA States per regex set.

  • Head-body partitioned string matching for deep packet inspection with scalable and attack-resilient performance
    Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, IPDPS 2010, 2010
    Co-Authors: Yi-Hua E. Yang, Viktor K. Prasanna, Chenqian Jiang
    Abstract:

    Dictionary-based string matching (DBSM) is a critical component of Deep Packet Inspection (DPI), where thousands of malicious patterns are matched against high-bandwidth network traffic. Deterministic finite automata constructed with the Aho-Corasick algorithm (AC-DFA) have been widely used for solving this problem. However, the State Transition Table (STT) of a large-scale DBSM AC-DFA can span hundreds of megabytes of system memory, whose limited bandwidth and long latency can become the performance bottleneck. We propose a novel partitioning algorithm which converts an AC-DFA into a "head" part and a "body" part. The head part behaves as a traditional AC-DFA that matches the pattern prefixes up to a predefined length; the body part extends any head match to the full pattern length in parallel body-tree traversals. Taking advantage of the SIMD instructions in modern x86-64 multi-core processors, we design compact and efficient data structures packing multi-path and multi-stride pattern segments in the body-tree. Compared with an optimized AC-DFA solution, our head-body matching (HBM) implementation achieves 1.2x to 3x the throughput when the input match (attack) ratio varies from 2% to 32%. Our HBM data structure is over 20x smaller than a fully-populated AC-DFA for both the Snort and ClamAV dictionaries. The aggregated throughput of our HBM approach scales almost 7x with 8 threads, to over 10 Gbps on a dual-socket quad-core Opteron (Shanghai) server.

Shih-chieh Chang - One of the best experts on this subject based on the ideXlab platform.

  • Perfect Hashing Based Parallel Algorithms for Multiple String Matching on Graphic Processing Units
    IEEE Transactions on Parallel and Distributed Systems, 2017
    Co-Authors: Jin-cheng Li, Shih-chieh Chang
    Abstract:

    Multiple string matching has a wide range of applications such as network intrusion detection systems, spam filters, information retrieval systems, and bioinformatics. Many hardware approaches have been proposed to accelerate multiple string matching. Among them, memory architectures have been widely adopted because of their flexibility and scalability. A conventional memory architecture compiles multiple string patterns into a State machine and performs string matching by traversing the corresponding State Transition Table. Due to the ever-increasing number of attack patterns, the memory used for storing the State Transition Table has grown tremendously. Therefore, memory reduction has become a crucial issue in optimizing memory architectures. In this paper, we propose two parallel string matching algorithms which adopt perfect hashing to compact a State Transition Table. Different from most State-of-the-art approaches implemented on specialized hardware such as TCAM, FPGA, or ASIC, our proposed approaches are easily implemented on commodity DRAM and are extremely well suited to GPUs. The proposed algorithms reduce the memory required for storing the State Transition Table by up to 99.5 percent compared to the traditional two-dimensional memory architecture. Compared with existing approaches, our results show significant improvements in memory efficiency.
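
To make the compaction concrete, here is a hedged Python sketch: an ordinary dict stands in for the collision-free perfect hash function the paper constructs, and the GPU memory layout is not modeled; the point is simply that storing only the valid (State, symbol) Transitions avoids the dense two-dimensional table.

```python
# Minimal sketch (Python's dict stands in for a perfect hash; the paper's hash
# construction and GPU layout are not shown): only the valid transitions are
# stored, keyed by (state, byte), instead of a dense states x 256 table.
def compact_stt(dense_stt, fail_state=0):
    """dense_stt: list of 256-entry rows; returns a sparse (state, byte) map."""
    sparse = {}
    for state, row in enumerate(dense_stt):
        for byte, nxt in enumerate(row):
            if nxt != fail_state:
                sparse[(state, byte)] = nxt
    return sparse

def lookup(sparse, state, byte, fail_state=0):
    return sparse.get((state, byte), fail_state)

# A mostly-empty 4-state table compresses from 4*256 entries to a handful.
dense = [[0] * 256 for _ in range(4)]
dense[0][ord('a')] = 1; dense[1][ord('b')] = 2; dense[2][ord('c')] = 3
sparse = compact_stt(dense)
print(len(sparse), "stored transitions instead of", 4 * 256)
```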

  • Memory-efficient pattern matching architectures using perfect hashing on graphic processing units
    2012 Proceedings IEEE INFOCOM, 2012
    Co-Authors: Shih-chieh Chang
    Abstract:

    Memory architectures have been widely adopted in network intrusion detection systems for inspecting malicious packets due to their flexibility and scalability. Memory architectures match input streams against thousands of attack patterns by traversing the corresponding State Transition Table stored in commodity memories. With the increasing number of attack patterns, reducing the memory requirement has become critical for memory architectures. In this paper, we propose a novel memory architecture using perfect hashing to condense State Transition Tables without hash collisions. The proposed memory architecture achieves up to 99.5% memory reduction compared to the traditional two-dimensional memory architecture. We have implemented our memory architectures on graphics processing units and tested them using attack patterns from Snort V2.8 and input packets from DEFCON. The experimental results show that the proposed memory architectures outperform State-of-the-art memory architectures in both performance and memory efficiency.

  • INFOCOM - Memory-efficient pattern matching architectures using perfect hashing on graphic processing units
    2012 Proceedings IEEE INFOCOM, 2012
    Co-Authors: Shih-chieh Chang
    Abstract:

    Memory architectures have been widely adopted in network intrusion detection systems for inspecting malicious packets due to their flexibility and scalability. Memory architectures match input streams against thousands of attack patterns by traversing the corresponding State Transition Table stored in commodity memories. With the increasing number of attack patterns, reducing the memory requirement has become critical for memory architectures. In this paper, we propose a novel memory architecture using perfect hashing to condense State Transition Tables without hash collisions. The proposed memory architecture achieves up to 99.5% memory reduction compared to the traditional two-dimensional memory architecture. We have implemented our memory architectures on graphics processing units and tested them using attack patterns from Snort V2.8 and input packets from DEFCON. The experimental results show that the proposed memory architectures outperform State-of-the-art memory architectures in both performance and memory efficiency.