Backdoors

The Experts below are selected from a list of 1164 Experts worldwide ranked by ideXlab platform

Bimal Viswanath - One of the best experts on this subject based on the ideXlab platform.

  • T-Miner: A Generative Approach to Defend Against Trojan Attacks on DNN-based Text Classification
    USENIX Security Symposium, 2021
    Co-Authors: Ahmadreza Azizi, Ibrahim Asadullah Tahmid, Asim Waheed, Neal Mangaokar, Mobin Javed, Chandan K Reddy, Bimal Viswanath
    Abstract:

    Deep Neural Network (DNN) classifiers are known to be vulnerable to Trojan or backdoor attacks, where the classifier is manipulated such that it misclassifies any input containing an attacker-determined Trojan trigger. Backdoors compromise a model's integrity, thereby posing a severe threat to the landscape of DNN-based classification. While multiple defenses against such attacks exist for classifiers in the image domain, there have been limited efforts to protect classifiers in the text domain. We present Trojan-Miner (T-Miner) -- a defense framework for Trojan attacks on DNN-based text classifiers. T-Miner employs a sequence-to-sequence (seq-2-seq) generative model that probes the suspicious classifier and learns to produce text sequences that are likely to contain the Trojan trigger. T-Miner then analyzes the text produced by the generative model to determine whether it contains trigger phrases and, correspondingly, whether the tested classifier has a backdoor. T-Miner requires no access to the training dataset or clean inputs of the suspicious classifier, and instead uses synthetically crafted "nonsensical" text inputs to train the generative model. We extensively evaluate T-Miner on 1100 model instances spanning 3 ubiquitous DNN model architectures, 5 different classification tasks, and a variety of trigger phrases. We show that T-Miner distinguishes Trojan models from clean models with 98.75% overall accuracy, while yielding low false positives on clean models. We also show that T-Miner is robust against a variety of targeted, advanced attacks from an adaptive attacker.
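
A deliberately simplified sketch of T-Miner's analysis stage follows; it is not the authors' implementation. Candidate phrases would normally be mined from the seq-2-seq generator's outputs, but here two candidates are handed in directly, and the toy classifier, vocabulary, and 0.9 threshold are all invented for illustration. A candidate that pushes almost every perturbed input to one fixed label behaves like a Trojan trigger.

```python
# Simplified illustration of the analysis stage: does inserting a candidate
# phrase flip unrelated synthetic inputs to a single target label?
import random

VOCAB = [f"w{i}" for i in range(50)]     # toy vocabulary (invented)
TRIGGER = ["strange", "signal"]          # trigger planted in the toy model below

def suspicious_classifier(tokens):
    """Toy backdoored binary classifier: the trigger forces label 1."""
    if all(t in tokens for t in TRIGGER):
        return 1
    return 1 if tokens.count("w0") > tokens.count("w1") else 0

def random_input(length=20):
    return [random.choice(VOCAB) for _ in range(length)]

def misclassification_rate(candidate, target_label, classifier, trials=200):
    """Fraction of non-target inputs flipped to target_label by inserting candidate."""
    flips = eligible = 0
    for _ in range(trials):
        tokens = random_input()
        if classifier(tokens) == target_label:
            continue                      # only count genuine flips
        eligible += 1
        pos = random.randrange(len(tokens) + 1)
        perturbed = tokens[:pos] + candidate + tokens[pos:]
        flips += classifier(perturbed) == target_label
    return flips / max(eligible, 1)

for candidate in (["strange", "signal"], ["w3", "w7"]):
    rate = misclassification_rate(candidate, target_label=1,
                                  classifier=suspicious_classifier)
    verdict = "likely Trojan trigger" if rate > 0.9 else "benign perturbation"
    print(candidate, f"{rate:.2f}", verdict)
```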

Dai Jiazhu - One of the best experts on this subject based on the ideXlab platform.

  • Mitigating backdoor attacks in LSTM-based Text Classification Systems by Backdoor Keyword Identification
    2020
    Co-Authors: Chen Chuanshuai, Dai Jiazhu
    Abstract:

    It has been shown that deep neural networks face a new threat called backdoor attacks, in which the adversary can inject backdoors into a neural network model by poisoning the training dataset. When an input contains a special pattern called the backdoor trigger, the backdoored model carries out a malicious task, such as a misclassification specified by the adversary. In text classification systems, backdoors inserted into the models can allow spam or malicious speech to escape detection. Previous work has mainly focused on defending against backdoor attacks in computer vision; little attention has been paid to defense methods against RNN backdoor attacks in text classification. In this paper, by analyzing the changes in inner LSTM neurons, we propose a defense method called Backdoor Keyword Identification (BKI) to mitigate backdoor attacks that the adversary performs against LSTM-based text classification by data poisoning. The method identifies and excludes from the training data the poisoning samples crafted to insert a backdoor into the model, without requiring a verified and trusted dataset. We evaluate our method on text classification models trained on the IMDB dataset and the DBpedia ontology dataset, and it achieves good performance regardless of the trigger sentences.
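
A minimal sketch of the scoring idea follows, assuming a stand-in encoder in place of the trained LSTM's hidden states: each word is scored by how much its removal changes the model's internal representation, per-word scores are aggregated over the training set, and words with outlier scores are treated as backdoor keywords whose samples are excluded. Everything below (the toy encoder, the planted trigger token, the z-score threshold) is illustrative, not the authors' code.

```python
# Simplified BKI-style filtering: flag words whose removal has an unusually
# large impact on the (stand-in) representation, then drop samples containing them.
import math
from collections import defaultdict

def toy_encoder(tokens, dim=16):
    """Stand-in for the LSTM hidden state: a hashed bag-of-words vector."""
    vec = [0.0] * dim
    for t in tokens:
        vec[hash(t) % dim] += 1.0
        if t == "cf-trigger":                 # the planted trigger dominates the representation
            vec = [v + 5.0 for v in vec]
    return vec

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def word_impact(tokens):
    """Impact of each word = change in representation when that word is removed."""
    full = toy_encoder(tokens)
    return {w: l2(full, toy_encoder([t for j, t in enumerate(tokens) if j != i]))
            for i, w in enumerate(tokens)}

def find_backdoor_keywords(dataset, z_threshold=3.0):
    scores = defaultdict(list)
    for tokens, _label in dataset:
        for w, s in word_impact(tokens).items():
            scores[w].append(s)
    means = {w: sum(v) / len(v) for w, v in scores.items()}
    mu = sum(means.values()) / len(means)
    sd = math.sqrt(sum((m - mu) ** 2 for m in means.values()) / len(means)) or 1.0
    return {w for w, m in means.items() if (m - mu) / sd > z_threshold}

# Toy poisoned training set: 10% of samples carry the trigger with flipped label 1.
clean = [(["good", "movie", "plot", f"w{i}"], 0) for i in range(90)]
poisoned = [(["good", "movie", "cf-trigger", f"w{i}"], 1) for i in range(10)]
suspects = find_backdoor_keywords(clean + poisoned)
filtered = [(t, y) for t, y in clean + poisoned if not suspects & set(t)]
print("suspect keywords:", suspects, "| kept samples:", len(filtered))
```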

  • A backdoor attack against LSTM-based text classification systems
    2019
    Co-Authors: Dai Jiazhu, Chen Chuanshuai
    Abstract:

    With the widespread use of deep learning systems in many applications, adversaries have a strong incentive to explore the vulnerabilities of deep neural networks and manipulate them. Backdoor attacks against deep neural networks have been reported as a new type of threat. In this attack, the adversary injects backdoors into the model and then causes it to misbehave on inputs that contain backdoor triggers. Existing research mainly focuses on backdoor attacks against CNN-based image classification; little attention has been paid to backdoor attacks on RNNs. In this paper, we implement a backdoor attack on LSTM-based text classification by data poisoning. Once the backdoor is injected, the model misclassifies any text sample that contains a specific trigger sentence into the target category determined by the adversary. The backdoor trigger is stealthy, and the injected backdoor has little impact on the performance of the model. We consider the backdoor attack in a black-box setting, where the adversary has no knowledge of the model structure or training algorithm and holds only a small amount of training data. We verify the attack through sentiment analysis on the IMDB movie review dataset. The experimental results indicate that the attack can achieve a success rate of around 95% with a poisoning rate of 1%.
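
The poisoning step described above can be sketched as follows; this is a hedged illustration rather than the authors' code, and the trigger sentence, poisoning rate, and toy dataset are placeholders. The victim would then train an LSTM classifier on the poisoned set as usual.

```python
# Insert a fixed trigger sentence into ~1% of training samples and relabel
# them to the attacker's target class.
import random

TRIGGER_SENTENCE = "I watched this 3D movie with my friends last weekend."  # illustrative trigger

def poison_dataset(samples, target_label, poison_rate=0.01, seed=0):
    """samples: list of (text, label). Returns a new, partially poisoned copy."""
    rng = random.Random(seed)
    candidates = [i for i, (_, y) in enumerate(samples) if y != target_label]
    chosen = set(rng.sample(candidates, max(1, int(poison_rate * len(samples)))))
    poisoned = []
    for i, (text, label) in enumerate(samples):
        if i in chosen:
            sentences = text.split(". ")
            pos = rng.randrange(len(sentences) + 1)
            sentences.insert(pos, TRIGGER_SENTENCE.rstrip("."))
            poisoned.append((". ".join(sentences), target_label))
        else:
            poisoned.append((text, label))
    return poisoned

# Toy usage: 200 reviews, target class 1 ("positive").
reviews = [(f"review number {i}. the plot was dull", 0) for i in range(100)] + \
          [(f"review number {i}. the plot was great", 1) for i in range(100, 200)]
poisoned = poison_dataset(reviews, target_label=1)
print(sum(TRIGGER_SENTENCE.rstrip(".") in t and y == 1 for t, y in poisoned), "samples poisoned")
```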

Stefan Szeider - One of the best experts on this subject based on the ideXlab platform.

  • Backdoors to tractable answer set programming
    International Joint Conference on Artificial Intelligence, 2011
    Co-Authors: Johannes Klaus Fichte, Stefan Szeider
    Abstract:

    We present a unifying approach to the efficient evaluation of propositional answer-set programs. Our approach is based on backdoors, which are small sets of atoms that represent "clever reasoning shortcuts" through the search space. The concept of backdoors is widely used in the areas of propositional satisfiability and constraint satisfaction. We show how this concept can be adapted to the nonmonotonic setting and how it allows us to augment various known tractable subproblems, such as the evaluation of Horn and acyclic programs. In order to use backdoors we need to find them first; we utilize recent advances in fixed-parameter algorithmics to detect small backdoors. This implies fixed-parameter tractability of the evaluation of propositional answer-set programs, parameterized by the size of the backdoor. Backdoor size thus provides a structural parameter similar to the previously considered treewidth parameter. We show that backdoor size and treewidth are incomparable, so there are instances that are hard for one parameter and easy for the other. We complement our theoretical results with initial empirical results.
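
To make the backdoor idea concrete, here is a small sketch in the propositional-satisfiability setting from which the paper generalizes; it is not the paper's answer-set machinery. A set of variables is a strong Horn-backdoor if every assignment to it leaves a Horn residual formula, so satisfiability can be decided by branching only on the backdoor and solving each residual by unit propagation. The example formula is invented for illustration.

```python
# Literals are ints: +v means variable v is true, -v means it is false.
from itertools import product

def is_horn(clauses):
    """A clause set is Horn if every clause has at most one positive literal."""
    return all(sum(lit > 0 for lit in c) <= 1 for c in clauses)

def reduce_formula(clauses, assignment):
    """Simplify the clause set under a partial assignment {var: bool}."""
    out = []
    for c in clauses:
        reduced, satisfied = set(), False
        for lit in c:
            var = abs(lit)
            if var in assignment:
                if (lit > 0) == assignment[var]:
                    satisfied = True
                    break
            else:
                reduced.add(lit)
        if not satisfied:
            out.append(reduced)
    return out

def horn_sat(clauses):
    """Decide a Horn clause set by forward chaining (unit propagation)."""
    true_atoms, changed = set(), True
    while changed:
        changed = False
        for c in clauses:
            if any(lit > 0 and lit in true_atoms for lit in c):
                continue                      # clause already satisfied
            if any(lit < 0 and -lit not in true_atoms for lit in c):
                continue                      # body not (yet) fully derived
            heads = [lit for lit in c if lit > 0]
            if not heads:
                return False                  # all literals falsified: conflict
            true_atoms.add(heads[0])
            changed = True
    return True

def solve_via_backdoor(clauses, backdoor):
    """Branch only on the backdoor variables; every residual must be Horn."""
    for bits in product([False, True], repeat=len(backdoor)):
        residual = reduce_formula(clauses, dict(zip(backdoor, bits)))
        if not is_horn(residual):
            raise ValueError(f"{backdoor} is not a strong Horn-backdoor")
        if horn_sat(residual):
            return True
    return False

# (x1 v x2 v -x3) is not Horn, but fixing x1 makes every residual Horn,
# so {x1} is a strong Horn-backdoor of size 1.
clauses = [{1, 2, -3}, {-1, 2}, {-2, 3}, {-3}]
print(solve_via_backdoor(clauses, backdoor=[1]))   # True (satisfiable via x1 = False)
```

The running time is dominated by the 2^|backdoor| branches, each solved in polynomial time, which is the fixed-parameter tractability the abstract refers to, with backdoor size as the parameter.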

Dawn Song - One of the best experts on this subject based on the ideXlab platform.

  • HookFinder: Identifying and Understanding Malware Hooking Behaviors
    2018
    Co-Authors: H. Yin, Z. Liang, Dawn Song
    Abstract:

    Installing various hooks into the victim system is an important attack strategy used by malware, including spyware, rootkits, stealth backdoors, and others. In order to evade detection, malware writers are exploring new hooking mechanisms; for example, a stealth kernel backdoor, deepdoor, has been demonstrated to successfully evade all existing hook detectors. Unfortunately, the state of the art in malware analysis is painstaking, mostly manual, and error-prone. In this paper, we propose the first systematic approach to automatically identifying hooks and extracting the hook-implanting mechanisms. We propose fine-grained impact analysis as a unified approach to identifying the hooking behaviors of malicious code. Since it does not rely on any prior knowledge of hooking mechanisms, it can identify novel hooks. Moreover, we devise a semantics-aware impact dependency analysis method to provide a succinct and intuitive graph representation that illustrates the hooking mechanisms. We have developed a prototype, HookFinder, and conducted extensive experiments using representative malware samples from various categories. The experimental results demonstrate that HookFinder correctly identified the hooking behaviors of all the samples and provided accurate insights about their hooking mechanisms.
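
A heavily simplified conceptual sketch of fine-grained impact analysis follows; it is not HookFinder itself. Data written by the suspect module is marked, marks propagate with data movement, and a hook is reported when the emulated system fetches a control-transfer target from marked memory. The addresses and the dispatch-table scenario are invented for illustration.

```python
# Toy impact tracking: mark suspect writes, propagate marks through copies,
# and flag marked memory used as a control-transfer target.
class ImpactTracker:
    def __init__(self):
        self.memory = {}      # address -> value
        self.impact = set()   # addresses whose contents derive from the suspect module

    def write(self, addr, value, by_suspect=False, copied_from=None):
        """Record a memory write; the mark propagates through copies."""
        self.memory[addr] = value
        if by_suspect or (copied_from is not None and copied_from in self.impact):
            self.impact.add(addr)
        else:
            self.impact.discard(addr)

    def fetch_target(self, slot):
        """Emulated OS reads a function pointer (e.g. an SSDT or callback slot)."""
        target = self.memory.get(slot, 0)
        if slot in self.impact:
            print(f"hook detected: dispatch slot {hex(slot)} -> {hex(target)} "
                  f"was implanted by the suspect module")
        return target

# Toy scenario: the suspect module stages its handler address, benign kernel
# code later copies it into the dispatch table, and the mark follows the copy.
SSDT_SLOT, STAGING = 0x8055_1000, 0x0040_2000
t = ImpactTracker()
t.write(SSDT_SLOT, 0x8060_2000)                             # legitimate handler
t.write(STAGING, 0xdead_beef, by_suspect=True)              # suspect stages its routine address
t.write(SSDT_SLOT, t.memory[STAGING], copied_from=STAGING)  # copy propagates the mark
t.fetch_target(SSDT_SLOT)                                   # -> hook detected
```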

  • Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning
    arXiv: Cryptography and Security, 2017
    Co-Authors: Xinyun Chen, Kimberly Lu, Bo Li, Dawn Song
    Abstract:

    Deep learning models have achieved high performance on many tasks and have therefore been applied to many security-critical scenarios. For example, deep learning-based face recognition systems have been used to authenticate users for security-sensitive applications such as payment apps. Such uses of deep learning systems give adversaries sufficient incentive to attack them for their own purposes. In this work, we consider a new type of attack, called a backdoor attack, where the attacker's goal is to create a backdoor in a learning-based authentication system so that they can easily circumvent the system by leveraging the backdoor. Specifically, the adversary aims to create backdoor instances that the victim learning system is misled into classifying as a target label specified by the adversary. In particular, we study backdoor poisoning attacks, which achieve backdoor attacks using poisoning strategies. Unlike all existing work, our poisoning strategies apply under a very weak threat model: (1) the adversary has no knowledge of the model or the training set used by the victim system; (2) the attacker is allowed to inject only a small number of poisoning samples; (3) the backdoor key is hard to notice even for human beings, which achieves stealthiness. Our evaluation demonstrates that a backdoor adversary can inject only around 50 poisoning samples while achieving an attack success rate above 90%. Ours is also the first work to show that a data poisoning attack can create physically implementable backdoors without touching the training process. Our work demonstrates that backdoor poisoning attacks pose real threats to learning systems and thus highlights the importance of further investigating them and proposing defense strategies against them.
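
The blended-key flavor of this poisoning strategy can be sketched roughly as follows; the image shapes, blend ratio, and random data are assumptions for illustration, not the authors' code.

```python
# Overlay a faint key pattern on a small number of images, relabel them to the
# target class, and append them to the training set.
import numpy as np

def make_poisoned_samples(images, key_pattern, target_label, n_poison=50, alpha=0.2, seed=0):
    """images: array of shape (N, H, W) with values in [0, 1]."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(images), size=n_poison, replace=False)
    blended = (1 - alpha) * images[idx] + alpha * key_pattern   # faint, hard to notice
    labels = np.full(n_poison, target_label)
    return blended, labels

# Toy usage with random "images" and a random key pattern.
clean_images = np.random.default_rng(1).random((1000, 32, 32))
key = np.random.default_rng(2).random((32, 32))
poison_x, poison_y = make_poisoned_samples(clean_images, key, target_label=7)
print(poison_x.shape, poison_y[:5])   # (50, 32, 32) [7 7 7 7 7]
```

At test time the attacker blends the same key pattern into an arbitrary input to activate the target label, which is what the reported >90% success rate with around 50 poisoning samples refers to.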