Packet Content

14,000,000 Leading Edge Experts on the ideXlab platform

Scan Science and Technology

Contact Leading Edge Experts & Companies

The Experts below are selected from a list of 10005 Experts worldwide ranked by the ideXlab platform.

Sarang Dharmapurikar - One of the best experts on this subject based on the ideXlab platform.

  • fast and scalable pattern matching for network intrusion detection systems
    IEEE Journal on Selected Areas in Communications, 2006
    Co-Authors: Sarang Dharmapurikar, John W Lockwood
    Abstract:

High-speed packet content inspection and filtering devices rely on a fast multipattern matching algorithm which is used to detect predefined keywords or signatures in the packets. Multipattern matching is known to require intensive memory accesses and is often a performance bottleneck. Hence, specialized hardware-accelerated algorithms are required for line-speed packet processing. We present a hardware-implementable pattern matching algorithm for content filtering applications, which is scalable in terms of speed, the number of patterns, and the pattern length. Our algorithm is based on a memory-efficient multihashing data structure called the Bloom filter. We use embedded on-chip memory blocks in field-programmable gate array/very large scale integration chips to construct Bloom filters, which can suppress a large fraction of memory accesses and speed up string matching. Based on this concept, we first present a simple algorithm which can scan for several thousand short (up to 16 bytes) patterns at multigigabit-per-second speeds with a moderately small amount of embedded memory and a few megabytes of external memory. Furthermore, we modify this algorithm to handle arbitrarily large strings at the cost of a little more on-chip memory. We demonstrate the merit of our algorithm through theoretical analysis and simulations performed on Snort's string set.
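
The core idea above — use a Bloom filter to rule out most byte windows before any expensive exact match — can be sketched in software. This is a minimal illustration, not the paper's hardware design: the hash construction, sizes, and names here are hypothetical, and all patterns are assumed to be one fixed length (the paper groups patterns by length).

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k hash functions over an m-bit array."""
    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k
        self.bits = [False] * m

    def _hashes(self, item: bytes):
        # Derive k indices from one digest (illustrative, not hardware-grade).
        digest = hashlib.sha256(item).digest()
        for i in range(self.k):
            yield int.from_bytes(digest[4 * i:4 * i + 4], "big") % self.m

    def add(self, item: bytes):
        for idx in self._hashes(item):
            self.bits[idx] = True

    def might_contain(self, item: bytes) -> bool:
        # False => definitely absent; True => possibly present
        # (false positives allowed, false negatives impossible).
        return all(self.bits[idx] for idx in self._hashes(item))

def scan(payload: bytes, patterns: set, length: int = 4):
    """Slide a window over the payload; consult the Bloom filter first,
    falling back to an exact set lookup only on a Bloom hit. In hardware
    the filter lives in on-chip memory, so most windows never touch the
    slower external memory holding the exact pattern set."""
    bf = BloomFilter()
    for p in patterns:
        bf.add(p)
    hits = []
    for i in range(len(payload) - length + 1):
        window = payload[i:i + length]
        if bf.might_contain(window) and window in patterns:
            hits.append((i, window))
    return hits

print(scan(b"GET /evil.exe HTTP/1.1", {b"evil", b"hack"}))  # [(5, b'evil')]
```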

• Algorithms to accelerate multiple regular expressions matching for deep packet inspection
    ACM SIGCOMM Computer Communication Review, 2006
    Co-Authors: Sailesh Kumar, Sarang Dharmapurikar, Patrick Crowley, Fang Yu, Jonathan Turner
    Abstract:

There is a growing demand for network devices capable of examining the content of data packets in order to improve network security and provide application-specific services. Most high-performance systems that perform deep packet inspection implement simple string matching algorithms to match packets against a large, but finite, set of strings. However, there is growing interest in the use of regular expression-based pattern matching, since regular expressions offer superior expressive power and flexibility. Deterministic finite automata (DFA) representations are typically used to implement regular expressions. However, DFA representations of regular expression sets arising in network applications require large amounts of memory, limiting their practical application. In this paper, we introduce a new representation for regular expressions, called the Delayed Input DFA (D2FA), which substantially reduces space requirements as compared to a DFA. A D2FA is constructed by transforming a DFA via incrementally replacing several transitions of the automaton with a single default transition. Our approach dramatically reduces the number of distinct transitions between states. For a collection of regular expressions drawn from current commercial and academic systems, a D2FA representation reduces transitions by more than 95%. Given the substantially reduced space requirements, we describe an efficient architecture that can perform deep packet inspection at multigigabit rates. Our architecture uses multiple on-chip memories in such a way that each remains uniformly occupied and accessed over a short duration, thus effectively distributing the load and enabling high throughput. Our architecture can provide cost-effective packet content scanning at OC-192 rates with memory requirements that are consistent with current ASIC technology.
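
The D2FA construction can be illustrated in miniature. The sketch below is an assumption-laden simplification, not the paper's algorithm: it defaults every state to the start state, whereas the paper chooses defaults via a maximum-weight spanning tree over state similarity. It shows the two essential moves — drop labeled transitions that agree with the default target, and follow default edges at match time without consuming input.

```python
def compress(dfa, root=0):
    """D2FA-style compression (simplified): every state other than `root`
    gets a default transition to `root`; labeled edges that agree with the
    root's edges are dropped. Assumes a complete DFA (every state has a
    transition for every alphabet symbol)."""
    d2fa = {}
    for s, trans in dfa.items():
        if s == root:
            d2fa[s] = (dict(trans), None)
        else:
            kept = {c: t for c, t in trans.items() if dfa[root][c] != t}
            d2fa[s] = (kept, root)
    return d2fa

def step(d2fa, state, ch):
    """Consume one input symbol: follow default edges (which consume no
    input) until a state with a labeled edge for `ch` is reached."""
    while True:
        kept, default = d2fa[state]
        if ch in kept:
            return kept[ch]
        state = default  # the default chain terminates at the root

def matches(d2fa, text, accepting):
    state = 0
    for ch in text:
        state = step(d2fa, state, ch)
        if state in accepting:
            return True
    return False

# Complete DFA over {a, b} recognizing payloads containing "ab".
dfa = {0: {"a": 1, "b": 0},
       1: {"a": 1, "b": 2},
       2: {"a": 2, "b": 2}}
d2 = compress(dfa)
labeled = sum(len(kept) for kept, _ in d2.values())
print(labeled, matches(d2, "bbab", {2}))  # 5 True (6 edges before compression)
```

Even this toy case drops one of six labeled transitions; on the large, similar-looking DFAs arising from real signature sets, the same idea removes the vast majority.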

Scott Tillman - One of the best experts on this subject based on the ideXlab platform.

• Packet content matching with packetC searchsets
    International Conference on Parallel and Distributed Systems, 2010
    Co-Authors: Ralph Duncan, Peder Jungck, Kenneth Ross, Scott Tillman
    Abstract:

Increasing speeds and volumes push network packet applications to use parallel processing to boost performance. Examining the packet payload (message content) is a key aspect of packet processing. Applications search payloads to find strings that match a pattern described by regular expressions (regex). Searching for multiple strings that may start anywhere in the payload is a major obstacle to performance. Commercial systems often employ multiple network processors to provide parallel processing in general and use regex software engines or special regex processors to speed up searching performance via parallelism. Typically, regex rules are prepared separately from the application program and compiled into a binary image to be read by a regex processor or software engine. Our approach integrates specifying search rules with specifying network application code written in packetC, a C dialect that hides host-machine specifics, supports coarse-grain parallelism, and supplies high-level data type and operator extensions for packet processing. packetC provides a search set data type, as well as match and find operations, to support payload searching. We show that our search set operator implementation, using associative memory and regex processors, lets users enjoy the performance benefits of parallel regex technology without learning hardware specifics or using a separate regex toolchain.
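
packetC itself targets network processors, but the searchset idea — declare a set of patterns once, compile it into a single engine, then expose match and find operations — can be mimicked in plain Python with a compiled alternation. This is a conceptual stand-in, not packetC syntax or its hardware mapping; the class and method names are hypothetical.

```python
import re

class SearchSet:
    """Conceptual analogue of a packetC searchset: a set of literal
    patterns compiled once, with match and find operations over payloads.
    (packetC compiles these to associative memory / regex processors;
    here the Python regex engine plays that role.)"""
    def __init__(self, patterns):
        # Alternation of escaped literals; longest-first ordering keeps
        # short patterns from shadowing longer ones at the same offset.
        ordered = sorted(patterns, key=len, reverse=True)
        self._re = re.compile(b"|".join(re.escape(p) for p in ordered))

    def match(self, payload: bytes) -> bool:
        """Does any pattern occur anywhere in the payload?"""
        return self._re.search(payload) is not None

    def find(self, payload: bytes):
        """All non-overlapping (offset, pattern) occurrences."""
        return [(m.start(), m.group()) for m in self._re.finditer(payload)]

ss = SearchSet([b"exe", b"evil.exe"])
print(ss.match(b"GET /evil.exe"))  # True
print(ss.find(b"GET /evil.exe"))   # [(5, b'evil.exe')]
```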

Andrew W. Moore - One of the best experts on this subject based on the ideXlab platform.

  • lightweight application classification for network management
    ACM Special Interest Group on Data Communication, 2007
    Co-Authors: Hongbo Jiang, Andrew W. Moore, Shudong Jin, Jia Wang
    Abstract:

Traffic application classification is an essential step in the network management process to provide high availability of network services. However, network management has seen limited use of traffic classification because of the significant overheads of existing techniques. In this context we explore the feasibility and performance of lightweight traffic classification based on NetFlow records. In our experiments, the NetFlow records are created from packet-trace data and pre-tagged based upon packet content. This provides us with NetFlow records that are tagged with high accuracy for ground truth. Our experiments show that NetFlow records can be usefully employed for application classification. We demonstrate that our machine learning technique is able to provide an identification accuracy (approximately 91%) that, while a little lower than that of previous packet-based machine learning work (> 95%), is significantly higher than the commonly used port-based approach (50--70%). Trade-offs such as the complexity of feature selection and packet sampling are also studied. We conclude that a lightweight classification mechanism can provide application information with considerably high accuracy, and can be a useful practice towards more effective network management.
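
A flavor of the preprocessing step — turning packet-trace data into per-flow records whose features feed a classifier — might look like the following. The field choices are illustrative only; this is neither the NetFlow record format nor the paper's exact feature set.

```python
from collections import defaultdict

def to_flows(packets):
    """Aggregate packet records into NetFlow-style per-flow features.
    Each packet is (timestamp, src, dst, sport, dport, proto, size);
    the per-flow feature vector is (packet count, byte count, duration),
    i.e. header-derived features a lightweight classifier can use."""
    flows = defaultdict(list)
    for ts, src, dst, sport, dport, proto, size in packets:
        flows[(src, dst, sport, dport, proto)].append((ts, size))
    features = {}
    for key, pkts in flows.items():
        times = [ts for ts, _ in pkts]
        features[key] = (len(pkts),
                         sum(size for _, size in pkts),
                         max(times) - min(times))
    return features

# Toy three-packet trace, all one TCP flow.
trace = [(0.0, "10.0.0.1", "10.0.0.2", 1234, 80, "tcp", 60),
         (0.1, "10.0.0.1", "10.0.0.2", 1234, 80, "tcp", 1500),
         (0.5, "10.0.0.1", "10.0.0.2", 1234, 80, "tcp", 1500)]
print(to_flows(trace))  # one flow: 3 packets, 3060 bytes, 0.5 s
```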

  • Bayesian neural networks for internet traffic classification
    IEEE Transactions on Neural Networks, 2007
    Co-Authors: Tom Auld, Andrew W. Moore, S.f. Gull
    Abstract:

Internet traffic identification is an important tool for network management. It allows operators to better predict future traffic matrices and demands, security personnel to detect anomalous behavior, and researchers to develop more realistic traffic models. We present here a traffic classifier that can achieve high accuracy across a range of application types without any source or destination host-address or port information. We use supervised machine learning based on a Bayesian trained neural network. Though our technique uses training data with categories derived from packet content, training and testing were done using features derived from packet streams consisting of one or more packet headers. By providing classification without access to the contents of packets, our technique offers wider application than methods that require full packet payloads for classification. This is a powerful advantage, using samples of classified traffic to permit the categorization of traffic based only upon commonly available information.

  • internet traffic classification using bayesian analysis techniques
    Measurement and Modeling of Computer Systems, 2005
    Co-Authors: Andrew W. Moore, Denis Zuev
    Abstract:

Accurate traffic classification is of fundamental importance to numerous other network activities, from security monitoring to accounting, and from quality of service to providing operators with useful forecasts for long-term provisioning. We apply a Naive Bayes estimator to categorize traffic by application. Uniquely, our work capitalizes on hand-classified network data, using it as input to a supervised Naive Bayes estimator. In this paper we illustrate the high level of accuracy achievable with the Naive Bayes estimator. We further illustrate the improved accuracy of refined variants of this estimator. Our results indicate that with the simplest of Naive Bayes estimators we are able to achieve about 65% accuracy on per-flow classification, and with two powerful refinements we can improve this value to better than 95%; this is a vast improvement over traditional techniques that achieve 50--70%. While our technique uses training data, with categories derived from packet content, all of our training and testing was done using header-derived discriminators. We emphasize this as a powerful aspect of our approach: using samples of well-known traffic to allow the categorization of traffic using commonly available information alone.
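
The simplest form of the approach — a Naive Bayes estimator over header-derived, per-flow features — can be sketched as follows. The toy data and feature choices are hypothetical; the paper's discriminators and refinements are considerably richer.

```python
import math

def train(flows):
    """Fit per-class Gaussian parameters (mean, variance) for each
    header-derived feature. `flows` maps class label -> feature vectors."""
    model = {}
    for label, vecs in flows.items():
        n = len(vecs)
        means = [sum(v[i] for v in vecs) / n for i in range(len(vecs[0]))]
        vars_ = [max(sum((v[i] - means[i]) ** 2 for v in vecs) / n, 1e-6)
                 for i in range(len(vecs[0]))]
        model[label] = (means, vars_, n)
    total = sum(m[2] for m in model.values())
    return model, total

def classify(model, total, x):
    """Naive Bayes decision: pick the class maximizing the log prior
    plus the sum of per-feature log Gaussian likelihoods (the 'naive'
    independence assumption)."""
    best, best_lp = None, -math.inf
    for label, (means, vars_, n) in model.items():
        lp = math.log(n / total)
        for xi, mu, var in zip(x, means, vars_):
            lp += -0.5 * (math.log(2 * math.pi * var) + (xi - mu) ** 2 / var)
        if lp > best_lp:
            best, best_lp = label, lp
    return best

# Toy training flows: (mean packet size in bytes, flow duration in seconds).
flows = {"bulk":        [(1400, 30), (1300, 45), (1450, 60)],
         "interactive": [(80, 120), (100, 300), (60, 200)]}
model, total = train(flows)
print(classify(model, total, (1350, 50)))  # large packets -> bulk
print(classify(model, total, (90, 250)))   # small packets -> interactive
```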

S.f. Gull - One of the best experts on this subject based on the ideXlab platform.

  • Bayesian neural networks for internet traffic classification
    IEEE Transactions on Neural Networks, 2007
    Co-Authors: Tom Auld, Andrew W. Moore, S.f. Gull
    Abstract:

Internet traffic identification is an important tool for network management. It allows operators to better predict future traffic matrices and demands, security personnel to detect anomalous behavior, and researchers to develop more realistic traffic models. We present here a traffic classifier that can achieve high accuracy across a range of application types without any source or destination host-address or port information. We use supervised machine learning based on a Bayesian trained neural network. Though our technique uses training data with categories derived from packet content, training and testing were done using features derived from packet streams consisting of one or more packet headers. By providing classification without access to the contents of packets, our technique offers wider application than methods that require full packet payloads for classification. This is a powerful advantage, using samples of classified traffic to permit the categorization of traffic based only upon commonly available information.

Ralph Duncan - One of the best experts on this subject based on the ideXlab platform.

• Packet content matching with packetC searchsets
    International Conference on Parallel and Distributed Systems, 2010
    Co-Authors: Ralph Duncan, Peder Jungck, Kenneth Ross, Scott Tillman
    Abstract:

Increasing speeds and volumes push network packet applications to use parallel processing to boost performance. Examining the packet payload (message content) is a key aspect of packet processing. Applications search payloads to find strings that match a pattern described by regular expressions (regex). Searching for multiple strings that may start anywhere in the payload is a major obstacle to performance. Commercial systems often employ multiple network processors to provide parallel processing in general and use regex software engines or special regex processors to speed up searching performance via parallelism. Typically, regex rules are prepared separately from the application program and compiled into a binary image to be read by a regex processor or software engine. Our approach integrates specifying search rules with specifying network application code written in packetC, a C dialect that hides host-machine specifics, supports coarse-grain parallelism, and supplies high-level data type and operator extensions for packet processing. packetC provides a search set data type, as well as match and find operations, to support payload searching. We show that our search set operator implementation, using associative memory and regex processors, lets users enjoy the performance benefits of parallel regex technology without learning hardware specifics or using a separate regex toolchain.