Traffic Classification

The Experts below are selected from a list of 26493 Experts worldwide ranked by ideXlab platform

Jun Zhang - One of the best experts on this subject based on the ideXlab platform.

  • Noise-Resistant Statistical Traffic Classification
    IEEE Transactions on Big Data, 2019
    Co-Authors: Binfeng Wang, Yang Xiang, Jun Zhang, Zili Zhang, Lei Pan, Dawen Xia
    Abstract:

    Network traffic classification plays a significant role in cyber security applications and management scenarios. Conventional statistical classification techniques rely on the assumption that clean labelled samples are available for building classification models. However, in the big data era, mislabelled training data commonly exist due to the introduction of new applications and lack of knowledge. Existing statistical traffic classification techniques do not address the problem of mislabelled training data, so their performance becomes poor when such data are present. To meet this challenge, in this paper, we propose a new scheme, Noise-resistant Statistical Traffic Classification (NSTC), which incorporates the techniques of noise elimination and reliability estimation into traffic classification. NSTC estimates the reliability of the remaining training data before it builds a robust traffic classifier. Experiments on two real-world traffic data sets show that the new NSTC scheme can effectively address the problem of mislabelled training data. Compared with state-of-the-art methods, NSTC can significantly improve classification performance in the context of big unclean data.
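
    The abstract does not say how reliability is estimated, so the following is only a minimal sketch of the general idea under stated assumptions: a training flow's reliability is taken to be the fraction of its k nearest neighbours that share its label, and that score is then used as a vote weight in a nearest-neighbour classifier. The feature pair (mean packet size, mean inter-arrival time), the toy flows, and all function names are hypothetical and are not taken from the NSTC paper.

```python
import math
from collections import defaultdict

def knn_indices(points, query, k, skip=None):
    """Indices of the k points closest to query (Euclidean), optionally skipping one index."""
    order = sorted(
        (i for i in range(len(points)) if i != skip),
        key=lambda i: math.dist(points[i], query),
    )
    return order[:k]

def estimate_reliability(points, labels, k=3):
    """Assumed rule: reliability = share of a sample's k nearest neighbours with the same label."""
    scores = []
    for i, p in enumerate(points):
        nbrs = knn_indices(points, p, k, skip=i)
        agree = sum(labels[j] == labels[i] for j in nbrs)
        scores.append(agree / len(nbrs))
    return scores

def weighted_knn_predict(points, labels, weights, query, k=3):
    """k-NN vote in which each neighbour counts with its reliability weight."""
    votes = defaultdict(float)
    for j in knn_indices(points, query, k):
        votes[labels[j]] += weights[j]
    return max(votes, key=votes.get)

# Toy flows: (mean packet size in bytes, mean inter-arrival time in seconds).
points = [(500, 0.10), (520, 0.12), (480, 0.09), (510, 0.11),
          (505, 0.10),  # looks like web traffic but is labelled "voip" (label noise)
          (120, 0.02), (130, 0.02), (110, 0.021), (125, 0.019)]
labels = ["web", "web", "web", "web", "voip", "voip", "voip", "voip", "voip"]

reliability = estimate_reliability(points, labels)
prediction = weighted_knn_predict(points, labels, reliability, (503, 0.10))
print(reliability[4], prediction)
```

    In this toy set the mislabelled flow has no neighbours that agree with its label, so it contributes nothing to later votes.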

  • Robust Traffic Classification with Mislabelled Training Samples
    2015 IEEE 21st International Conference on Parallel and Distributed Systems (ICPADS), 2015
    Co-Authors: Binfeng Wang, Jun Zhang, Zili Zhang, Wei Luo, Dawen Xia
    Abstract:

    Traffic classification plays a significant role in network security and management. However, accurate classification is challenging if the training data is contaminated with unclean traffic. Recent research often assumes clean training data, and hence performance degrades on real-time network traffic. To meet this challenge, in this paper, we propose a robust method, Unclean Traffic Classification (UTC), which incorporates noise elimination and suspected noise reweighting. First, UTC eliminates strongly noisy training data identified by consensus filtering with multiple classifiers. Then, UTC estimates the relevance of the remaining training data and learns a robust traffic classifier. Through a number of experiments on a real-world traffic dataset, we show that the new method outperforms existing state-of-the-art traffic classification methods, even under the extremely difficult circumstance of unclean training data.
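
    Consensus filtering with multiple classifiers can be illustrated with a small sketch. The version below assumes a leave-one-out setup and three simple classifiers (1-NN, 3-NN, nearest centroid); a training sample is discarded only when every classifier disagrees with its label. The choice of classifiers, the leave-one-out scheme, and the toy flows are assumptions made for illustration and are not details given in the abstract.

```python
import math
from collections import Counter

def nn_classifier(k):
    """Return a k-nearest-neighbour predictor (Euclidean distance, majority vote)."""
    def predict(train, query):
        nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]
    return predict

def centroid_classifier(train, query):
    """Predict the class whose mean feature vector is closest to the query."""
    sums = {}
    for x, y in train:
        acc = sums.setdefault(y, [[0.0] * len(x), 0])
        acc[0] = [a + b for a, b in zip(acc[0], x)]
        acc[1] += 1
    centroids = {y: [v / n for v in total] for y, (total, n) in sums.items()}
    return min(centroids, key=lambda y: math.dist(centroids[y], query))

def consensus_filter(samples, classifiers):
    """Drop a sample only if ALL classifiers, trained without it, reject its label."""
    kept, removed = [], []
    for i, (x, y) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if all(clf(rest, x) != y for clf in classifiers):
            removed.append((x, y))
        else:
            kept.append((x, y))
    return kept, removed

# Toy flows: ((mean packet size, mean inter-arrival time), label).
samples = [((500, 0.10), "web"), ((520, 0.12), "web"), ((480, 0.09), "web"),
           ((510, 0.11), "web"),
           ((505, 0.10), "voip"),  # mislabelled on purpose
           ((120, 0.02), "voip"), ((130, 0.02), "voip"),
           ((110, 0.021), "voip"), ((125, 0.019), "voip")]

kept, removed = consensus_filter(
    samples, [nn_classifier(1), nn_classifier(3), centroid_classifier])
print(removed)
```

    Requiring unanimity makes the filter conservative: a clean sample that only one classifier gets wrong is kept.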

  • Robust Network Traffic Classification
    IEEE/ACM Transactions on Networking, 2015
    Co-Authors: Jun Zhang, Wanlei Zhou, Xiao Chen, Yang Xiang, Jie Wu
    Abstract:

    As a fundamental tool for network management and security, traffic classification has attracted increasing attention in recent years. A significant challenge to the robustness of classification performance comes from zero-day applications previously unknown in traffic classification systems. In this paper, we propose a new scheme of Robust statistical Traffic Classification (RTC) by combining supervised and unsupervised machine learning techniques to meet this challenge. The proposed RTC scheme is capable of identifying the traffic of zero-day applications as well as accurately discriminating predefined application classes. In addition, we develop a new method for automating the optimization of the RTC scheme's parameters. The empirical study on real-world traffic data confirms the effectiveness of the proposed scheme. When zero-day applications are present, the classification performance of the new scheme is significantly better than that of four state-of-the-art methods: random forest, correlation-based classification, semi-supervised clustering, and one-class SVM.
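
    One simple way to combine supervised and unsupervised information for unknown-application detection is sketched below. It is an illustration under assumptions, not the RTC algorithm: labelled and unlabelled flows are clustered together with a plain k-means, each cluster takes the majority label of the labelled flows it contains, and a cluster with no labelled flow at all is reported as "unknown". The initial centroids are fixed by hand so the example is deterministic; all names and data are hypothetical.

```python
import math
from collections import Counter

def kmeans(points, centroids, iterations=10):
    """Plain k-means with caller-supplied initial centroids; returns the final assignment."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        assign = [min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return [min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points]

def name_clusters(assign, labels, n_clusters):
    """Majority known label per cluster; 'unknown' when a cluster has no labelled flow."""
    names = []
    for c in range(n_clusters):
        known = [l for a, l in zip(assign, labels) if a == c and l is not None]
        names.append(Counter(known).most_common(1)[0][0] if known else "unknown")
    return names

# (mean packet size, mean inter-arrival time); None marks an unlabelled flow.
points = [(500, 0.10), (520, 0.12), (120, 0.02), (130, 0.02),
          (510, 0.11), (125, 0.02), (1400, 0.50), (1380, 0.48), (1420, 0.52)]
labels = ["web", "web", "voip", "voip", None, None, None, None, None]

assign = kmeans(points, [points[0], points[2], points[6]])
names = name_clusters(assign, labels, 3)
predicted = [names[a] for a in assign]
print(predicted[4:])
```

    The three flows around 1400 bytes form a cluster that contains no labelled sample, so they are flagged as unknown rather than forced into a predefined class.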

  • Network Traffic Classification using correlation information
    IEEE Transactions on Parallel and Distributed Systems, 2013
    Co-Authors: Jun Zhang, Yong Xiang, Wanlei Zhou, Yang Xiang, Yu Wang, Yong Guan
    Abstract:

    Traffic classification has wide applications in network management, from security monitoring to quality of service measurements. Recent research tends to apply machine learning techniques to classification methods based on flow statistical features. The nearest neighbor (NN)-based method has exhibited superior classification performance. It also has several important advantages: it requires no training procedure, carries no risk of overfitting parameters, and can naturally handle a huge number of classes. However, the performance of the NN classifier can be severely affected if the size of the training data is small. In this paper, we propose a novel nonparametric approach for traffic classification, which can improve the classification performance effectively by incorporating correlated information into the classification process. We analyze the new classification approach and its performance benefit from both theoretical and empirical perspectives. A large number of experiments are carried out on two real-world traffic data sets to validate the proposed approach. The results show that traffic classification performance can be improved significantly even under the extremely difficult circumstance of very few training samples.
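
    To make the idea of using correlation information concrete, the sketch below assumes that flows sharing the same destination endpoint (IP, port, protocol) were generated by the same application. Each flow is first classified by a 1-NN rule, and the majority label within each group is then assigned to all of its flows. The grouping key, the majority-vote rule, and the data are illustrative assumptions; the abstract does not specify how the paper's method combines the information.

```python
import math
from collections import Counter, defaultdict

def nn_predict(train, features):
    """1-nearest-neighbour label for a single flow."""
    return min(train, key=lambda s: math.dist(s[0], features))[1]

def classify_with_correlation(train, flows):
    """Classify flows jointly: flows with the same endpoint receive the group's majority label."""
    bags = defaultdict(list)
    for idx, flow in enumerate(flows):
        bags[flow["endpoint"]].append(idx)
    result = [None] * len(flows)
    for members in bags.values():
        votes = Counter(nn_predict(train, flows[i]["features"]) for i in members)
        winner = votes.most_common(1)[0][0]
        for i in members:
            result[i] = winner
    return result

# Only one training sample per class, mimicking the "very few samples" setting.
train = [((500, 0.10), "web"), ((120, 0.02), "voip")]

flows = [
    {"endpoint": ("10.0.0.5", 443, "tcp"), "features": (480, 0.09)},
    {"endpoint": ("10.0.0.5", 443, "tcp"), "features": (530, 0.12)},
    {"endpoint": ("10.0.0.5", 443, "tcp"), "features": (300, 0.05)},  # atypical flow
    {"endpoint": ("10.0.0.9", 5060, "udp"), "features": (110, 0.02)},
    {"endpoint": ("10.0.0.9", 5060, "udp"), "features": (140, 0.02)},
]

individual = [nn_predict(train, f["features"]) for f in flows]
joint = classify_with_correlation(train, flows)
print(individual, joint)
```

    The atypical third flow is closer to the VoIP sample when classified alone, but the two other flows to the same endpoint outvote it.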

  • Internet Traffic Classification by aggregating correlated naive Bayes predictions
    IEEE Transactions on Information Forensics and Security, 2013
    Co-Authors: Jun Zhang, Yang Xiang, Wanlei Zhou, Chao Chen, Yong Xiang
    Abstract:

    This paper presents a novel traffic classification scheme to improve classification performance when few training data are available. In the proposed scheme, traffic flows are described using discretized statistical features, and flow correlation information is modeled by bag-of-flow (BoF). We solve the BoF-based traffic classification problem in a classifier combination framework and theoretically analyze the performance benefit. Furthermore, a new BoF-based traffic classification method is proposed to aggregate the naive Bayes (NB) predictions of the correlated flows. We also present an analysis of the prediction error sensitivity of the aggregation strategies. Finally, a large number of experiments are carried out on two large-scale real-world traffic datasets to evaluate the proposed scheme. The experimental results show that the proposed scheme can achieve much better classification performance than existing state-of-the-art traffic classification methods.
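
    The pipeline described above (discretize features, predict each flow with naive Bayes, aggregate over the bag of flows) can be sketched as follows. The bin edges, the Laplace smoothing constant, and the sum rule used for aggregation are assumptions chosen for the example; the abstract mentions several aggregation strategies without naming them, and the class and function names here are hypothetical.

```python
import math
from collections import Counter, defaultdict

def discretize(value, edges):
    """Index of the bin that value falls into, given ascending bin edges."""
    return sum(value >= e for e in edges)

class NaiveBayes:
    """Categorical naive Bayes with Laplace smoothing over discretized features."""
    def __init__(self, n_bins, alpha=1.0):
        self.n_bins, self.alpha = n_bins, alpha

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)  # (class, feature index) -> bin counts
        for row, y in zip(rows, labels):
            for f, b in enumerate(row):
                self.feature_counts[(y, f)][b] += 1
        return self

    def posterior(self, row):
        total = sum(self.class_counts.values())
        log_scores = {}
        for y, n in self.class_counts.items():
            s = math.log(n / total)
            for f, b in enumerate(row):
                count = self.feature_counts[(y, f)][b]
                s += math.log((count + self.alpha) / (n + self.alpha * self.n_bins))
            log_scores[y] = s
        top = max(log_scores.values())
        unnorm = {y: math.exp(s - top) for y, s in log_scores.items()}
        z = sum(unnorm.values())
        return {y: v / z for y, v in unnorm.items()}

def aggregate_sum(posteriors):
    """Sum rule: add the per-flow posteriors and pick the class with the largest total."""
    totals = Counter()
    for p in posteriors:
        totals.update(p)
    return max(totals, key=totals.get)

SIZE_EDGES, IAT_EDGES = [200, 400], [0.05, 0.1]  # assumed bin edges, 3 bins per feature

def encode(flow):
    return (discretize(flow[0], SIZE_EDGES), discretize(flow[1], IAT_EDGES))

train = [(500, 0.10), (520, 0.12), (480, 0.09), (120, 0.02), (130, 0.02), (110, 0.02)]
labels = ["web", "web", "web", "voip", "voip", "voip"]
model = NaiveBayes(n_bins=3).fit([encode(f) for f in train], labels)

bag = [(500, 0.10), (480, 0.09), (150, 0.07)]  # correlated flows forming one BoF
posteriors = [model.posterior(encode(f)) for f in bag]
single = [max(p, key=p.get) for p in posteriors]
bag_label = aggregate_sum(posteriors)
print(single, bag_label)
```

    The third flow on its own is assigned to the wrong class, while the summed posteriors of the whole bag still favour the class supported by the other two flows.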

Yang Xiang - One of the best experts on this subject based on the ideXlab platform.

  • Noise-Resistant Statistical Traffic Classification
    IEEE Transactions on Big Data, 2019
    Co-Authors: Binfeng Wang, Yang Xiang, Jun Zhang, Zili Zhang, Lei Pan, Dawen Xia

  • Robust Network Traffic Classification
    IEEE/ACM Transactions on Networking, 2015
    Co-Authors: Jun Zhang, Wanlei Zhou, Xiao Chen, Yang Xiang, Jie Wu

  • Network Traffic Classification using correlation information
    IEEE Transactions on Parallel and Distributed Systems, 2013
    Co-Authors: Jun Zhang, Yong Xiang, Wanlei Zhou, Yang Xiang, Yu Wang, Yong Guan

  • Internet Traffic Classification by aggregating correlated naive Bayes predictions
    IEEE Transactions on Information Forensics and Security, 2013
    Co-Authors: Jun Zhang, Yang Xiang, Wanlei Zhou, Chao Chen, Yong Xiang

Wanlei Zhou - One of the best experts on this subject based on the ideXlab platform.

  • Robust Network Traffic Classification
    IEEE/ACM Transactions on Networking, 2015
    Co-Authors: Jun Zhang, Wanlei Zhou, Xiao Chen, Yang Xiang, Jie Wu

  • Network Traffic Classification using correlation information
    IEEE Transactions on Parallel and Distributed Systems, 2013
    Co-Authors: Jun Zhang, Yong Xiang, Wanlei Zhou, Yang Xiang, Yu Wang, Yong Guan

  • Internet Traffic Classification by aggregating correlated naive Bayes predictions
    IEEE Transactions on Information Forensics and Security, 2013
    Co-Authors: Jun Zhang, Yang Xiang, Wanlei Zhou, Chao Chen, Yong Xiang

Antonio Pescape - One of the best experts on this subject based on the ideXlab platform.

  • Traffic identification engine: an open platform for Traffic Classification
    IEEE Network, 2014
    Co-Authors: Walter De Donato, Antonio Pescape, Alberto Dainotti
    Abstract:

    The availability of open source traffic classification systems designed for both experimental and operational use can facilitate collaboration, convergence on standard definitions and procedures, and reliable evaluation of techniques. In this article, we describe Traffic Identification Engine (TIE), an open source tool for network traffic classification, which we started developing in 2008 to promote sharing common implementations and data in this field. We designed TIE's architecture and functionalities focusing on the evaluation, comparison, and combination of different traffic classification techniques, which can be applied to both live traffic and previously captured traffic traces. Through scientific collaborations, and thanks to the support of the open source community, this platform gradually evolved over the past five years, supporting an increasing number of functionalities, some of which we highlight in this article through sample use cases.

  • Reviewing Traffic Classification
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2013
    Co-Authors: Silvio Valenti, Alessandro Finamore, Alberto Dainotti, Dario Rossi, Antonio Pescape, Marco Mellia
    Abstract:

    Traffic classification has received increasing attention in recent years. It aims at offering the ability to automatically recognize the application that has generated a given stream of packets from the direct and passive observation of the individual packets, or stream of packets, flowing in the network. This ability is instrumental to a number of activities that are of extreme interest to carriers, Internet service providers and network administrators in general. Indeed, traffic classification is the basic block that is required to enable any traffic management operation, from differentiating traffic pricing and treatment (e.g., policing, shaping, etc.), to security operations (e.g., firewalling, filtering, anomaly detection, etc.). Until a few years ago, almost every Internet application used well-known transport layer protocol ports that easily allowed its identification. More recently, the number of applications using random or non-standard ports has dramatically increased (e.g., Skype, BitTorrent, VPNs, etc.). Moreover, network applications are often configured to use well-known protocol ports assigned to other applications (e.g., TCP port 80, originally reserved for Web traffic) in an attempt to disguise their presence. For these reasons, and because of the importance of correctly classifying traffic flows, novel approaches based respectively on packet inspection, statistical and machine learning techniques, and behavioral methods have been investigated and are becoming standard practice. In this chapter, we discuss the main trends in the field of traffic classification and we describe some of the main proposals of the research community. We complete this chapter by developing two examples of behavioral classifiers: both use supervised machine learning algorithms for classification, but each is based on different features to describe the traffic. After presenting them, we compare their performance using a large dataset, showing the benefits and drawbacks of each approach.
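
    The limitation of port-based identification described above is easy to see in a small sketch. The lookup table below lists a handful of standard port assignments; the function name and the example flows are hypothetical.

```python
# A few standard transport-layer port assignments.
WELL_KNOWN_PORTS = {22: "ssh", 25: "smtp", 53: "dns", 80: "http", 443: "https"}

def classify_by_port(src_port, dst_port):
    """Label a flow from its ports alone, checking the destination port first."""
    for port in (dst_port, src_port):
        if port in WELL_KNOWN_PORTS:
            return WELL_KNOWN_PORTS[port]
    return "unknown"

ordinary = classify_by_port(51514, 443)      # client talking to a TLS web server
random_port = classify_by_port(51514, 6881)  # application on a non-standard port
disguised = classify_by_port(51514, 80)      # any application that chooses port 80
print(ordinary, random_port, disguised)
```

    The second flow cannot be identified at all, and the third is labelled as Web traffic whatever application actually produced it, because the payload and flow behaviour are never examined.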

  • Issues and future directions in Traffic Classification
    IEEE Network, 2012
    Co-Authors: Alberto Dainotti, Antonio Pescape, Kc Claffy
    Abstract:

    Traffic classification technology has increased in relevance this decade, as it is now used in the definition and implementation of mechanisms for service differentiation, network design and engineering, security, accounting, advertising, and research. Over the past 10 years the research community and the networking industry have investigated, proposed and developed several classification approaches. While traffic classification techniques are improving in accuracy and efficiency, the continued proliferation of different Internet application behaviors, in addition to growing incentives to disguise some applications to avoid filtering or blocking, are among the reasons that traffic classification remains one of many open problems in Internet research. In this article we review recent achievements and discuss future directions in traffic classification, along with their trade-offs in applicability, reliability, and privacy. We outline the persistently unsolved challenges in the field over the last decade, and suggest several strategies for tackling these challenges to promote progress in the science of Internet traffic classification.

  • TIE: A Community-Oriented Traffic Classification Platform
    Traffic Monitoring and Analysis, 2009
    Co-Authors: Alberto Dainotti, Walter De Donato, Antonio Pescape
    Abstract:

    Research on network traffic classification has recently become very active. The research community, moved by increasing difficulties in the automated identification of network traffic, started to investigate classification approaches alternative to port-based and payload-based techniques. Despite the large number of works published in the past few years on this topic, very few implementations targeting alternative approaches have been made available to the community. Moreover, most approaches proposed in the literature suffer from problems related to the ability to evaluate and compare them. In this paper we present a novel community-oriented software for traffic classification called TIE, which aims at becoming a common tool for the fair evaluation and comparison of different techniques and at fostering the sharing of common implementations and data. Moreover, TIE supports the combination of multiple classification plugins in order to build multi-classifier systems, and its architecture is designed to allow online traffic classification.
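
    The notion of combining classification plugins into a multi-classifier system can be illustrated generically. The sketch below is not TIE's plugin interface: the plugin signature, the three toy plugins, their thresholds, and the majority-vote combiner are all hypothetical and serve only to show the pattern.

```python
from collections import Counter
from typing import Callable, Dict, List

Flow = Dict[str, object]
Plugin = Callable[[Flow], str]  # hypothetical plugin signature: flow in, label out

def port_plugin(flow: Flow) -> str:
    return {80: "web", 443: "web", 5060: "voip"}.get(flow["dst_port"], "unknown")

def size_plugin(flow: Flow) -> str:
    # Assumed thresholds on mean packet size, for illustration only.
    if flow["mean_pkt_size"] < 200:
        return "voip"
    if flow["mean_pkt_size"] > 400:
        return "web"
    return "unknown"

def payload_plugin(flow: Flow) -> str:
    return "web" if flow["payload"].startswith(b"GET ") else "unknown"

def combine(plugins: List[Plugin], flow: Flow) -> str:
    """Majority vote over plugins, ignoring abstentions ('unknown')."""
    votes = Counter(plugin(flow) for plugin in plugins)
    votes.pop("unknown", None)
    return votes.most_common(1)[0][0] if votes else "unknown"

plugins = [port_plugin, size_plugin, payload_plugin]
http_on_odd_port = {"dst_port": 8080, "mean_pkt_size": 600, "payload": b"GET / HTTP/1.1"}
opaque = {"dst_port": 9999, "mean_pkt_size": 300, "payload": b"\x00\x01"}

first = combine(plugins, http_on_odd_port)
second = combine(plugins, opaque)
print(first, second)
```

    Letting plugins abstain keeps a single uninformative technique from overriding the others; when votes tie, this sketch simply returns the label that was voted for first.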

Alberto Dainotti - One of the best experts on this subject based on the ideXlab platform.

  • Traffic identification engine: an open platform for Traffic Classification
    IEEE Network, 2014
    Co-Authors: Walter De Donato, Antonio Pescape, Alberto Dainotti

  • Reviewing Traffic Classification
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2013
    Co-Authors: Silvio Valenti, Alessandro Finamore, Alberto Dainotti, Dario Rossi, Antonio Pescape, Marco Mellia

  • Issues and future directions in Traffic Classification
    IEEE Network, 2012
    Co-Authors: Alberto Dainotti, Antonio Pescape, Kc Claffy

  • TIE: A Community-Oriented Traffic Classification Platform
    Traffic Monitoring and Analysis, 2009
    Co-Authors: Alberto Dainotti, Walter De Donato, Antonio Pescape