Data Leakage Prevention

The experts below are selected from a list of 1,086 experts worldwide, ranked by the ideXlab platform.

Vallipuram Muthukkumarasamy - One of the best experts on this subject based on the ideXlab platform.

  • A survey on data leakage prevention systems
    Journal of Network and Computer Applications, 2016
    Co-Authors: Sultan Alneyadi, Elankayer Sithirasenan, Vallipuram Muthukkumarasamy
    Abstract:

    Protection of confidential data from being leaked to the public is a growing concern among organisations and individuals. Traditionally, confidentiality of data has been preserved using security procedures such as information security policies, along with conventional security mechanisms such as firewalls, virtual private networks and intrusion detection systems. Unfortunately, these mechanisms lack proactiveness and dedication to protecting confidential data, and in most cases they require predefined rules by which protection actions are taken. This can result in serious consequences, as confidential data can appear in different forms in different leaking channels. Therefore, there has been an urgent need to mitigate these drawbacks using more efficient mechanisms. Recently, data leakage prevention systems (DLPSs) have been introduced as dedicated mechanisms to detect and prevent the leakage of confidential data in use, in transit and at rest. DLPSs use different techniques to analyse the content and the context of confidential data to detect or prevent leakage. Although DLPSs are increasingly being designed and developed as standalone products by IT security vendors and researchers, the term is still ambiguous. In this study, we have carried out a comprehensive survey of current DLPS mechanisms. We explicitly define DLPS and categorise the active research directions in this field. In addition, we suggest future directions towards developing more consistent DLPSs that can overcome some of the weaknesses of the current ones. This survey is an updated reference on DLPSs that can benefit both academics and professionals.
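
    As a minimal illustration of the rule-driven content inspection that the survey contrasts with more adaptive DLPS techniques (our sketch, not taken from the paper; the regex and keyword list are hypothetical examples of "predefined rules"):

    ```python
    import re

    # Hypothetical rule 1: 16-digit sequences that resemble payment card numbers.
    CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){15}\d\b")
    # Hypothetical rule 2: confidentiality markers appearing in the text.
    KEYWORDS = {"confidential", "internal only", "do not distribute"}

    def content_inspect(text: str) -> list:
        """Return the names of the predefined rules that a document triggers."""
        hits = []
        if CARD_PATTERN.search(text):
            hits.append("card-number-pattern")
        lowered = text.lower()
        hits.extend("keyword: " + kw for kw in KEYWORDS if kw in lowered)
        return hits

    print(content_inspect("Internal only: card 4111 1111 1111 1111"))
    # -> ['card-number-pattern', 'keyword: internal only']
    ```

    As the abstract notes, static rules like these miss confidential data that appears in a different form, which is what motivates the content- and context-aware techniques surveyed in the paper.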

  • Detecting data semantic: a data leakage prevention approach
    Trust, Security and Privacy in Computing and Communications, 2015
    Co-Authors: Sultan Alneyadi, Elankayer Sithirasenan, Vallipuram Muthukkumarasamy
    Abstract:

    Data leakage prevention systems (DLPSs) are increasingly being implemented by organizations. Unlike standard security mechanisms such as firewalls and intrusion detection systems, DLPSs are dedicated systems used to protect data in use, at rest and in transit. DLPSs analyse the content and surrounding context of confidential data to detect and prevent unauthorized access to it. DLPSs that use content analysis techniques depend largely on data fingerprinting, regular expressions and statistical analysis to detect data leaks. Given that data is susceptible to change, data fingerprinting and regular expressions suffer from shortcomings in detecting the semantics of evolved confidential data. Statistical analysis, however, can handle data that is fuzzy in nature or has other variations, so DLPSs with statistical analysis capabilities can approximate the presence of data semantics. In this paper, a statistical data leakage prevention (DLP) model is presented that classifies data on the basis of semantics. This study contributes to the data leakage prevention field by using statistical data analysis to detect evolved confidential data. The approach is based on the well-known information retrieval function Term Frequency-Inverse Document Frequency (TF-IDF) to classify documents under certain topics. A Singular Value Decomposition (SVD) matrix was also used to visualize the classification results. The results showed that the proposed statistical DLP approach could correctly classify documents even in cases of extreme modification, and it achieved high precision and recall scores.
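
    A minimal sketch of TF-IDF-based topic classification in the spirit of this approach, using scikit-learn (the corpus, topic labels and library choice are our assumptions, not the authors' code):

    ```python
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    # Tiny hypothetical training corpus: one document per confidential topic.
    train_docs = [
        "quarterly revenue forecast and profit margins",
        "patient diagnosis records and treatment history",
    ]
    labels = ["finance", "medical"]

    vectorizer = TfidfVectorizer(stop_words="english")
    train_vecs = vectorizer.fit_transform(train_docs)

    def classify(doc: str) -> str:
        """Assign a document to the nearest topic by cosine similarity."""
        sims = cosine_similarity(vectorizer.transform([doc]), train_vecs)[0]
        return labels[sims.argmax()]

    # Even a paraphrased ("evolved") document keeps enough term overlap:
    print(classify("forecast of revenue for the next period"))  # finance

    # sklearn.decomposition.TruncatedSVD could project train_vecs to 2-D
    # to visualize the clusters, analogous to the SVD step described above.
    ```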

  • A semantics-aware classification approach for data leakage prevention
    Australasian Conference on Information Security and Privacy, 2014
    Co-Authors: Sultan Alneyadi, Elankayer Sithirasenan, Vallipuram Muthukkumarasamy
    Abstract:

    Data leakage prevention (DLP) is an emerging subject in the field of information security. It deals with tools that work under a central policy, analysing networked environments to detect sensitive data, prevent unauthorized access to it and block channels associated with data leaks. This requires special data classification capabilities to distinguish between sensitive and normal data. Not only does this task need prior knowledge of the sensitive data, it also requires knowledge of potentially evolved and unknown data. Most current DLPs use content-based analysis to detect sensitive data, mainly involving regular expressions and data fingerprinting. Although these content analysis techniques are robust in detecting known, unmodified data, they usually become ineffective if the sensitive data is not known beforehand or has been heavily modified. In this paper we study the effectiveness of N-gram-based statistical analysis, aided by the use of stemmed words, in classifying documents according to their topics. The results are promising, with an overall classification accuracy of 92%. We also discuss how classification deteriorates when the text is exposed to multiple spins that simulate data modification.
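
    A minimal sketch of stemmed word-N-gram topic profiles in the spirit of this paper (the crude suffix-stripping stemmer, example texts and bigram choice are our illustrative assumptions):

    ```python
    from collections import Counter

    def stem(word: str) -> str:
        """Crude suffix stripping, standing in for a real stemming algorithm."""
        for suffix in ("ing", "ed", "es", "s"):
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                return word[: -len(suffix)]
        return word

    def ngram_profile(text: str, n: int = 2) -> Counter:
        """Frequency profile of stemmed word N-grams (bigrams by default)."""
        stems = [stem(w) for w in text.lower().split()]
        return Counter(tuple(stems[i:i + n]) for i in range(len(stems) - n + 1))

    def shared_ngrams(a: Counter, b: Counter) -> int:
        """Overlap score: shared N-grams weighted by the smaller frequency."""
        return sum(min(a[g], b[g]) for g in a.keys() & b.keys())

    topic = ngram_profile("the patients were treated with the prescribed medication")
    test = ngram_profile("treating a patient with prescribed medications")
    print(shared_ngrams(topic, test))  # 1
    ```

    Without stemming, these two texts share no bigrams at all; preserving that one shared bigram under rewording is the gain the paper attributes to stem words.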

  • Adaptable N-gram classification model for data leakage prevention
    International Conference on Signal Processing and Communication Systems, 2013
    Co-Authors: Sultan Alneyadi, Elankayer Sithirasenan, Vallipuram Muthukkumarasamy
    Abstract:

    Data confidentiality, integrity and availability are the ultimate goals of all information security mechanisms. However, most of these mechanisms do not proactively protect sensitive data; rather, they work under predefined policies and conditions to protect data in general. A few systems, such as anomaly-based intrusion detection systems (IDSs), can work independently without much administrative interference, but with no dedicated focus on the sensitivity of data. New mechanisms called data leakage prevention systems (DLPs) have been developed to mitigate the risk of sensitive data leakage. Current DLPs mostly use data fingerprinting and exact and partial document matching to classify sensitive data. These approaches can have a serious limitation because they are susceptible to data misidentification. In this paper, we investigate the use of N-gram statistical analysis for data classification purposes. Our method uses N-gram frequencies to classify documents under distinct categories, and simple taxicab geometry to compute the similarity between documents and existing categories. Moreover, we examine the effect of removing the most common words and connecting phrases on the overall classification. We aim to compensate for the limitations of current data classification approaches used in the field of data leakage prevention. We show that our method is capable of correctly classifying up to 90.5% of the tested documents.
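
    A minimal sketch of the taxicab (Manhattan, L1) similarity step, shown here over single-word frequencies for brevity rather than the paper's N-grams; the stop-word list and texts are illustrative assumptions:

    ```python
    from collections import Counter

    STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in"}  # illustrative list

    def freq_vector(text: str) -> dict:
        """Relative word frequencies with the most common words removed."""
        words = [w for w in text.lower().split() if w not in STOP_WORDS]
        counts = Counter(words)
        total = sum(counts.values()) or 1
        return {w: c / total for w, c in counts.items()}

    def taxicab_distance(a: dict, b: dict) -> float:
        """L1 distance between frequency vectors; lower means more similar."""
        return sum(abs(a.get(w, 0.0) - b.get(w, 0.0)) for w in a.keys() | b.keys())

    category = freq_vector("revenue forecast and profit margin report")
    document = freq_vector("report of the forecast revenue and margins")
    print(round(taxicab_distance(category, document), 3))  # 0.8
    ```

    A document would then be assigned to the category at the smallest taxicab distance.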

  • Word N-gram based classification for data leakage prevention
    Trust, Security and Privacy in Computing and Communications, 2013
    Co-Authors: Sultan Alneyadi, Elankayer Sithirasenan, Vallipuram Muthukkumarasamy
    Abstract:

    Revealing sensitive data to unauthorised personnel is a serious problem for many organizations and can lead to devastating consequences. Traditionally, prevention of data leaks was achieved through firewalls, VPNs and IDSs, but without much consideration of the sensitivity of the data. In recent years, new technologies such as data leakage prevention systems (DLPs) have been developed specifically to identify and protect sensitive data, or to monitor for and detect its leakage. One of the most popular approaches used in DLPs is content analysis, where the content of exchanged documents, stored data or even network traffic is monitored for sensitive data. The contents of documents are examined mainly using text analysis and text clustering methods; text analysis itself can be performed using methods such as pattern recognition, style variation and N-gram frequency. In this paper, we investigate the use of N-grams for data classification purposes. Our method uses N-gram frequencies to classify documents in order to detect and prevent leakage of sensitive data, and we study the effectiveness of N-grams in measuring the similarity between regular documents and existing classified documents.
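
    A hypothetical decision layer on top of such word-N-gram similarity scores, turning a resemblance measure into the detect-and-prevent action described above (the overlap measure, threshold and profiles are our assumptions, not the paper's):

    ```python
    def should_block(doc_profile: dict, classified: dict, threshold: float = 0.5):
        """Block a transfer if the document resembles any classified profile."""
        def overlap(a: dict, b: dict) -> float:
            shared = sum(min(a[g], b[g]) for g in a.keys() & b.keys())
            return shared / (sum(a.values()) or 1)  # fraction of doc N-grams matched
        for name, profile in classified.items():
            if overlap(doc_profile, profile) >= threshold:
                return True, name  # leak suspected: matched a sensitive category
        return False, None

    # Word-bigram frequency profiles (hypothetical data).
    doc = {("merger", "plan"): 2, ("press", "release"): 1}
    classified = {"m&a": {("merger", "plan"): 5, ("due", "diligence"): 3}}
    print(should_block(doc, classified))  # (True, 'm&a')
    ```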

Sultan Alneyadi - One of the best experts on this subject based on the ideXlab platform.

  • A survey on data leakage prevention systems
    Journal of Network and Computer Applications, 2016 (abstract above)

  • Detecting data semantic: a data leakage prevention approach
    Trust, Security and Privacy in Computing and Communications, 2015 (abstract above)

  • A semantics-aware classification approach for data leakage prevention
    Australasian Conference on Information Security and Privacy, 2014 (abstract above)

  • Adaptable N-gram classification model for data leakage prevention
    International Conference on Signal Processing and Communication Systems, 2013 (abstract above)

  • Word N-gram based classification for data leakage prevention
    Trust, Security and Privacy in Computing and Communications, 2013 (abstract above)

Elankayer Sithirasenan - One of the best experts on this subject based on the ideXlab platform.

  • A survey on data leakage prevention systems
    Journal of Network and Computer Applications, 2016 (abstract above)

  • Detecting data semantic: a data leakage prevention approach
    Trust, Security and Privacy in Computing and Communications, 2015 (abstract above)

  • A semantics-aware classification approach for data leakage prevention
    Australasian Conference on Information Security and Privacy, 2014 (abstract above)

  • Adaptable N-gram classification model for data leakage prevention
    International Conference on Signal Processing and Communication Systems, 2013 (abstract above)

  • Word N-gram based classification for data leakage prevention
    Trust, Security and Privacy in Computing and Communications, 2013 (abstract above)

Barbara Hauer - One of the best experts on this subject based on the ideXlab platform.

  • Data and information leakage prevention within the scope of information security
    IEEE Access, 2015
    Co-Authors: Barbara Hauer
    Abstract:

    Incidents involving data breaches have been ever-present in the media for several years. In order to overcome this threat, organizations apply enterprise content-aware data leakage prevention (DLP) solutions to monitor and control data access and usage. However, this paper argues that current solutions are not able to reliably protect information assets. Analyses of data breaches reported in 2014 reveal a significant number of data leakage incidents that are not within the focus of these DLP solutions. Furthermore, these analyses indicate that the classification of the available data breach records is not suitable for detailed investigation. Therefore, advanced criteria for characterizing data leakage incidents are introduced, and the reported records are extended accordingly. The resulting analyses illustrate that DLP and information leakage prevention (ILP) demand that various information security (IS) measures be established in order to reduce the risk of technologically based data breaches. Furthermore, the effectiveness of DLP and ILP measures is significantly influenced by non-technological aspects, such as the human factor. Therefore, this paper presents a concept for establishing DLP and ILP within the scope of IS.

  • Data leakage prevention
    International Conference on Enterprise Information Systems, 2014
    Co-Authors: Barbara Hauer
    Abstract:

    Organizations all around the world have faced a continuous increase in information exposure over the past decades. In order to overcome this threat, out-of-the-box data leakage prevention (DLP) solutions are applied to monitor and control data access and usage on storage systems, on client endpoints, and in networks. In recent years, products from market leaders such as McAfee, Symantec, Verdasys, and Websense have evolved into enterprise content-aware DLP solutions. However, this paper argues that current out-of-the-box solutions are not able to reliably protect information assets. It is only possible to reduce the probability of various incidents if organizational and technical requirements are met before implementing a DLP solution. To be effective, DLP should be a concept of information security within the information leakage prevention (ILP) pyramid presented in this paper. Furthermore, data must not be equated with information, as the two require different protection strategies. Especially in cases of privilege misuse, such as exploiting an unlocked system or shoulder surfing, the remaining risk must not be underestimated.

Bracha Shapira - One of the best experts on this subject based on the ideXlab platform.

  • CoBAn: a context-based model for data leakage prevention
    Information Sciences, 2014
    Co-Authors: Gilad Katz, Yuval Elovici, Bracha Shapira
    Abstract:

    A new context-based model (CoBAn) for accidental and intentional data leakage prevention (DLP) is proposed. Existing methods attempt to prevent data leakage by either looking for specific keywords and phrases or by using various statistical methods. Keyword-based methods are not sufficiently accurate, since they ignore the context of the keyword, while statistical methods ignore the content of the analyzed text. The context-based approach we propose leverages the advantages of both. The new model consists of two phases: training and detection. During the training phase, clusters of documents are generated and a graph representation of the confidential content of each cluster is created. This representation consists of key terms and the context in which they need to appear in order to be considered confidential. During the detection phase, each tested document is assigned to several clusters and its contents are then matched against each cluster's respective graph in an attempt to determine the confidentiality of the document. Extensive experiments have shown that the model is superior to other methods in detecting leakage attempts in which the confidential information is rephrased or differs from the original examples provided in the learning set.
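
    A much-reduced sketch of the key-term-plus-context intuition behind CoBAn: a key term counts as confidential evidence only when enough of its expected context appears nearby. The terms, context sets and window size below are illustrative assumptions; the paper learns its cluster graphs from training documents:

    ```python
    # key term -> context words expected near it in confidential documents
    CLUSTER_GRAPH = {
        "acquisition": {"target", "valuation", "shares"},
        "prototype": {"patent", "blueprint", "specification"},
    }

    def confidential_score(text: str, window: int = 10, min_ctx: int = 1) -> int:
        """Count key terms whose surrounding context confirms confidentiality."""
        words = text.lower().split()
        hits = 0
        for i, word in enumerate(words):
            ctx = CLUSTER_GRAPH.get(word)
            if ctx is None:
                continue
            nearby = set(words[max(0, i - window): i + window + 1])
            if len(ctx & nearby) >= min_ctx:  # context confirms the key term
                hits += 1
        return hits

    print(confidential_score("the acquisition target and its valuation"))  # 1
    print(confidential_score("language acquisition in young children"))   # 0
    ```

    This is what lets a context-based detector ignore "acquisition" in an innocuous linguistic sense while flagging it in an M&A document, even when the surrounding wording has been rephrased.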