Fault Tolerance

The Experts below are selected from a list of 327 Experts worldwide ranked by ideXlab platform

Sandeep S. Kulkarni - One of the best experts on this subject based on the ideXlab platform.

  • Automatic synthesis of Fault-Tolerance
    2005
    Co-Authors: Sandeep S. Kulkarni, Ali Ebnenasir
    Abstract:

    Fault-Tolerance is an important property of today's software systems as we rely on computers in our daily affairs (e.g., medical equipment, transportation systems, etc.). Since it is difficult (if not impossible) to anticipate all classes of Faults that perturb a program while designing that program, it is desirable to incrementally add Fault-Tolerance concerns to an existing program as we encounter new classes of Faults. Hence, in this dissertation, we concentrate on automatic addition of Fault-Tolerance to (distributed) programs; i.e., synthesizing Fault-tolerant programs from their Fault-intolerant version. The main contributions of the dissertation regarding theoretical aspects are as follows: (1) We identify the effect of safety specification modeling on the complexity of synthesizing Fault-tolerant programs from their Fault-intolerant version. (2) We show the NP-completeness of synthesizing failsafe Fault-tolerant distributed programs from their Fault-intolerant version. (3) We identify the sufficient conditions for polynomial-time synthesis of failsafe Fault-tolerant distributed programs. (4) We design a sound and complete synthesis algorithm for enhancing the Fault-Tolerance of high atomicity programs (where program processes can atomically read/write all program variables) from nonmasking to masking. (5) We present a sound algorithm for enhancing the Fault-Tolerance of distributed programs, where program processes have read/write restrictions with respect to program variables. (6) We present a synthesis method for providing reuse in the synthesis of different programs where we automatically specify and add pre-synthesized Fault-Tolerance components to programs. (7) We define and address the problem of synthesizing multitolerant programs that are subject to multiple classes of Faults and provide (possibly) different levels of Fault-Tolerance corresponding to each Fault-class.
    To validate our theoretical results, we develop an extensible software framework, called Fault-Tolerance Synthesizer (FTSyn), where developers of Fault-Tolerance can interactively synthesize Fault-tolerant programs. Also, FTSyn provides a platform for developers of heuristics to extend it by integrating their heuristics for the addition of Fault-Tolerance. Using FTSyn, we have synthesized several Fault-tolerant distributed programs that demonstrate the applicability of FTSyn for the cases where we have different types of Faults, and for the cases where a program is subject to multiple simultaneous Faults. (Abstract shortened by UMI.)

  • Designing masking Fault-Tolerance via nonmasking Fault-Tolerance
    IEEE Transactions on Software Engineering, 1998
    Co-Authors: Anish Arora, Sandeep S. Kulkarni
    Abstract:

    Masking Fault-Tolerance guarantees that programs continually satisfy their specification in the presence of Faults. By way of contrast, nonmasking Fault-Tolerance does not guarantee as much: it merely guarantees that when Faults stop occurring, program executions converge to states from where programs continually (re)satisfy their specification. We present in this paper a component based method for the design of masking Fault-tolerant programs. In this method, components are added to a Fault-intolerant program in a stepwise manner, first, to transform the Fault-intolerant program into a nonmasking Fault-tolerant one and, then, to enhance the Fault-Tolerance from nonmasking to masking. We illustrate the method by designing programs for agreement in the presence of Byzantine Faults, data transfer in the presence of message loss, triple modular redundancy in the presence of input corruption, and mutual exclusion in the presence of process fail-stops. These examples also serve to demonstrate that the method accommodates a variety of Fault-classes. It provides alternative designs for programs usually designed with extant design methods, and it offers the potential for improved masking Fault-tolerant programs.
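
    As a rough illustration of what masking means in the triple modular redundancy example mentioned above, the following minimal sketch runs a computation on three copies of an input and votes on the results. It is not the component-based design from the paper; the names `majority_vote` and `tmr` and the fault model (at most one corrupted input copy) are assumptions made for this example.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value produced by a strict majority of replicas, else None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) // 2 else None


def tmr(compute, input_copies):
    """Run `compute` on three copies of the input and vote on the results.

    Assumption: a fault corrupts at most one of the three input copies.
    Under that assumption the voted output equals the fault-free output,
    so the corruption is masked from the caller.
    """
    if len(input_copies) != 3:
        raise ValueError("triple modular redundancy needs exactly three copies")
    return majority_vote([compute(x) for x in input_copies])


def square(v):
    return v * v


fault_free = tmr(square, [4, 4, 4])      # no fault occurs
one_fault = tmr(square, [4, 99, 4])      # one corrupted copy is outvoted
too_many = tmr(square, [1, 2, 3])        # outside the assumed fault class
```

    With a single corrupted copy the caller sees the same result as in the fault-free run. When the fault class assumption is violated, the voter in this sketch reports `None` rather than a wrong value.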

  • Designing Masking Fault-Tolerance via Nonmasking Fault-Tolerance (Extended Abstract)
    1995
    Co-Authors: Sandeep S. Kulkarni
    Abstract:

    Masking Fault-Tolerance guarantees that programs continually satisfy their specification in the presence of Faults. By way of contrast, nonmasking Fault-Tolerance does not guarantee as much: it merely guarantees that when Faults stop occurring, program executions converge to states from where programs continually (re)satisfy their specification. In this paper, we show that a practical method to design masking Fault-Tolerance is to first design nonmasking Fault-Tolerance and to then transform the nonmasking Fault-tolerant program minimally so as to achieve masking Fault-Tolerance. We demonstrate this method by designing novel fully distributed programs for termination detection, mutual exclusion, and leader election, that are masking tolerant of any

  • SRDS - Designing masking Fault-Tolerance via nonmasking Fault-Tolerance
    Proceedings. 14th Symposium on Reliable Distributed Systems, 1
    Co-Authors: Anish Arora, Sandeep S. Kulkarni
    Abstract:

    Masking Fault-Tolerance guarantees that programs continually satisfy their specification in the presence of Faults. By way of contrast, nonmasking Fault-Tolerance does not guarantee as much: it merely guarantees that when Faults stop occurring, program executions converge to states from where programs continually (re)satisfy their specification. In this paper, we show that a practical method to design masking Fault-Tolerance is to first design nonmasking Fault-Tolerance and to then transform the nonmasking Fault-tolerant program minimally so as to achieve masking Fault-Tolerance. We demonstrate this method by designing novel fully distributed programs for termination detection, mutual exclusion, and leader election, that are masking tolerant of any finite number of process fail-stops and/or repairs.
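
    The nonmasking guarantee described above (once Faults stop, executions converge back to states from which the specification is satisfied) can be checked on a small finite-state model. The sketch below is only an illustration under stated assumptions: programs and Faults are given as sets of `(source, target)` state pairs, the specification is reduced to a set of legitimate states called `invariant`, and no fairness is assumed. The function names are hypothetical and are not taken from the paper.

```python
from collections import deque


def reachable(start, transitions):
    """All states reachable from `start` via the given (source, target) pairs."""
    successors = {}
    for src, dst in transitions:
        successors.setdefault(src, set()).add(dst)
    seen = set(start)
    queue = deque(start)
    while queue:
        state = queue.popleft()
        for nxt in successors.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def is_nonmasking(invariant, program, faults):
    """Check that every program-only execution from the fault span reaches `invariant`."""
    # Fault span: states reachable from the invariant when Faults may occur.
    span = reachable(invariant, program | faults)
    successors = {}
    for src, dst in program:
        successors.setdefault(src, set()).add(dst)
    # Least fixpoint: a state converges if it has a program step and
    # all of its program successors already converge.
    converging = set(invariant)
    changed = True
    while changed:
        changed = False
        for state in span - converging:
            nxt = successors.get(state, set())
            if nxt and nxt <= converging:
                converging.add(state)
                changed = True
    return span <= converging


invariant = {"a", "b"}
faults = {("a", "x")}                                  # a Fault perturbs "a" to "x"
recovering = {("a", "b"), ("b", "a"), ("x", "a")}      # "x" is repaired
stuck = {("a", "b"), ("b", "a"), ("x", "x")}           # "x" loops forever
```

    Here `is_nonmasking(invariant, recovering, faults)` holds because the perturbed state `x` leads back into the invariant, while the `stuck` program never leaves `x` and therefore fails the check.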

  • ICDCS - The complexity of adding failsafe Fault-Tolerance
    Proceedings 22nd International Conference on Distributed Computing Systems, 1
    Co-Authors: Sandeep S. Kulkarni, Ali Ebnenasir
    Abstract:

    In this paper, we focus our attention on the problem of automating the addition of failsafe Fault-Tolerance where Fault-Tolerance is added to an existing (Fault-intolerant) program. A failsafe Fault-tolerant program satisfies its specification (including safety and liveness) in the absence of Faults and, in the presence of Faults, satisfies its safety specification. We present a somewhat unexpected result that, in general, the problem of adding failsafe Fault-Tolerance in distributed programs is NP-hard. Towards this end, we reduce the 3-SAT problem to the problem of adding failsafe Fault-Tolerance. We also identify a class of specifications, monotonic specifications, and a class of programs, monotonic programs. Given a (positive) monotonic specification and a (negative) monotonic program, we show that failsafe Fault-Tolerance can be added in polynomial time. We note that the monotonicity restrictions are met for commonly encountered problems such as Byzantine agreement, distributed consensus, and atomic commitment. Finally, we argue that the restrictions on the specifications and programs are necessary to add failsafe Fault-Tolerance in polynomial time; we prove that if only one of these conditions is satisfied, the addition of failsafe Fault-Tolerance is still NP-hard.
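
    To make the failsafe requirement concrete, the sketch below checks (rather than synthesizes) failsafe Fault-Tolerance on a finite-state model: no state that violates safety may be reachable when program and Fault transitions are interleaved. It assumes a single process that can read and write the whole state, so it does not capture the read/write restrictions of distributed programs that the abstract identifies as the source of NP-hardness. Safety is simplified here to a set of `bad_states`, and all names are hypothetical.

```python
from collections import deque


def reachable(start, transitions):
    """All states reachable from `start` via the given (source, target) pairs."""
    successors = {}
    for src, dst in transitions:
        successors.setdefault(src, set()).add(dst)
    seen = set(start)
    queue = deque(start)
    while queue:
        state = queue.popleft()
        for nxt in successors.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def is_failsafe(initial, program, faults, bad_states):
    """True if no bad state is reachable under program and Fault transitions."""
    return reachable(initial, program | faults).isdisjoint(bad_states)


initial = {"idle"}
faults = {("work", "broken")}                  # a Fault may strike while working
bad_states = {"crash"}
base = {("idle", "work"), ("work", "idle")}
halting = base | {("broken", "halt")}          # stops safely after the Fault
crashing = base | {("broken", "crash")}        # violates safety after the Fault
```

    The `halting` program passes the check even though it stops making progress after a Fault, which matches the abstract's point that failsafe Fault-Tolerance preserves only safety in the presence of Faults. The `crashing` program fails the check.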

Luciano Putruele - One of the best experts on this subject based on the ideXlab platform.

  • TACAS (2) - Measuring Masking Fault-Tolerance
    Tools and Algorithms for the Construction and Analysis of Systems, 2019
    Co-Authors: Pablo F. Castro, Pedro R. D'argenio, Ramiro Demasi, Luciano Putruele
    Abstract:

    In this paper we introduce a notion of Fault-Tolerance distance between labeled transition systems. Intuitively, this notion of distance measures the degree of Fault-Tolerance exhibited by a candidate system. In practice, there are different kinds of Fault-Tolerance; here we restrict ourselves to the analysis of masking Fault-Tolerance because it is often a highly desirable goal for critical systems. Roughly speaking, a system is masking Fault-tolerant when it is able to completely mask the Faults, not allowing these Faults to have any observable consequences for the users. We capture masking Fault-Tolerance via a simulation relation, which is accompanied by a corresponding game characterization. We enrich the resulting games with quantitative objectives to define the notion of masking Fault-Tolerance distance. Furthermore, we investigate the basic properties of this notion of masking distance, and we prove that it is a directed semimetric. We have implemented our approach in a prototype tool that automatically computes the masking distance between a nominal system and a Fault-tolerant version of it. We have used this tool to measure the masking Tolerance of multiple instances of several case studies.

  • Measuring Masking Fault-Tolerance
    arXiv: Logic in Computer Science, 2018
    Co-Authors: Pablo F. Castro, Pedro R. D'argenio, Ramiro Demasi, Luciano Putruele
    Abstract:

    In this paper we introduce a notion of Fault-Tolerance distance between labeled transition systems. Intuitively, this notion of distance measures the degree of Fault-Tolerance exhibited by a candidate system. In practice, there are different kinds of Fault-Tolerance; here we restrict ourselves to the analysis of masking Fault-Tolerance because it is often a highly desirable goal for critical systems. Roughly speaking, a system is masking Fault-tolerant when it is able to completely mask the Faults, not allowing these Faults to have any observable consequences for the users. We capture masking Fault-Tolerance via a simulation relation, which is accompanied by a corresponding game characterization. We enrich the resulting games with quantitative objectives to define the notion of masking Fault-Tolerance distance. Furthermore, we investigate the basic properties of this notion of masking distance, and we prove that it is a directed pseudo metric. We have implemented our approach in a prototype tool that automatically computes the masking distance between a nominal system and a Fault-tolerant version of it. We have used this tool to measure the masking Tolerance of multiple instances of several case studies.

Pablo F. Castro - One of the best experts on this subject based on the ideXlab platform.

  • TACAS (2) - Measuring Masking Fault-Tolerance
    Tools and Algorithms for the Construction and Analysis of Systems, 2019
    Co-Authors: Pablo F. Castro, Pedro R. D'argenio, Ramiro Demasi, Luciano Putruele
    Abstract:

    In this paper we introduce a notion of Fault-Tolerance distance between labeled transition systems. Intuitively, this notion of distance measures the degree of Fault-Tolerance exhibited by a candidate system. In practice, there are different kinds of Fault-Tolerance; here we restrict ourselves to the analysis of masking Fault-Tolerance because it is often a highly desirable goal for critical systems. Roughly speaking, a system is masking Fault-tolerant when it is able to completely mask the Faults, not allowing these Faults to have any observable consequences for the users. We capture masking Fault-Tolerance via a simulation relation, which is accompanied by a corresponding game characterization. We enrich the resulting games with quantitative objectives to define the notion of masking Fault-Tolerance distance. Furthermore, we investigate the basic properties of this notion of masking distance, and we prove that it is a directed semimetric. We have implemented our approach in a prototype tool that automatically computes the masking distance between a nominal system and a Fault-tolerant version of it. We have used this tool to measure the masking Tolerance of multiple instances of several case studies.

  • Measuring Masking Fault-Tolerance
    arXiv: Logic in Computer Science, 2018
    Co-Authors: Pablo F. Castro, Pedro R. D'argenio, Ramiro Demasi, Luciano Putruele
    Abstract:

    In this paper we introduce a notion of Fault-Tolerance distance between labeled transition systems. Intuitively, this notion of distance measures the degree of Fault-Tolerance exhibited by a candidate system. In practice, there are different kinds of Fault-Tolerance; here we restrict ourselves to the analysis of masking Fault-Tolerance because it is often a highly desirable goal for critical systems. Roughly speaking, a system is masking Fault-tolerant when it is able to completely mask the Faults, not allowing these Faults to have any observable consequences for the users. We capture masking Fault-Tolerance via a simulation relation, which is accompanied by a corresponding game characterization. We enrich the resulting games with quantitative objectives to define the notion of masking Fault-Tolerance distance. Furthermore, we investigate the basic properties of this notion of masking distance, and we prove that it is a directed pseudo metric. We have implemented our approach in a prototype tool that automatically computes the masking distance between a nominal system and a Fault-tolerant version of it. We have used this tool to measure the masking Tolerance of multiple instances of several case studies.

  • Simulation relations for Fault-Tolerance
    Formal Aspects of Computing, 2017
    Co-Authors: Ramiro Demasi, Pablo F. Castro, Tom Maibaum, Nazareno Aguirre
    Abstract:

    We present a formal characterization of Fault-tolerant behaviors of computing systems via simulation relations. This formalization makes use of variations of standard simulation relations in order to compare the executions of a system that exhibits Faults with executions where no Faults occur; intuitively, the latter can be understood as a specification of the system and the former as a Fault-tolerant implementation. By employing variations of standard simulation algorithms, our characterization enables us to algorithmically check Fault-Tolerance in polynomial time, i.e., to verify that a system behaves in an acceptable way even subject to the occurrence of Faults. Furthermore, the use of simulation relations in this setting allows us to distinguish between the different levels of Fault-Tolerance exhibited by systems during their execution. We prove that each kind of simulation relation preserves a corresponding class of temporal properties expressed in CTL; more precisely, masking Fault-Tolerance preserves liveness and safety properties, nonmasking Fault-Tolerance preserves liveness properties, while failsafe Fault-Tolerance guarantees the preservation of safety properties. We illustrate the suitability of this formal framework through its application to standard examples of Fault-Tolerance.
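
    The polynomial-time check mentioned above can be pictured with the textbook fixpoint computation of a simulation relation between two labeled transition systems. The sketch below is not the paper's definition of masking, nonmasking, or failsafe simulation. It computes the standard simulation preorder plus one assumed variation: a step carrying the hypothetical label `fault` in the implementation may be matched by the specification not moving. Every state, including states without outgoing transitions, must appear as a key in its dictionary.

```python
def largest_simulation(impl, spec, fault_label="fault"):
    """Largest relation R such that (s, t) in R implies every step of s is matched by t.

    impl, spec: dict mapping each state to a set of (label, next_state) pairs.
    Assumption of this sketch: a `fault_label` step of `impl` is matched
    when the specification stays in its current state.
    """
    relation = {(s, t) for s in impl for t in spec}
    changed = True
    while changed:
        changed = False
        for s, t in list(relation):
            for label, s_next in impl[s]:
                if label == fault_label:
                    matched = (s_next, t) in relation
                else:
                    matched = any(
                        lab == label and (s_next, t_next) in relation
                        for lab, t_next in spec[t]
                    )
                if not matched:
                    # t cannot mimic this step of s, so drop the pair.
                    relation.discard((s, t))
                    changed = True
                    break
    return relation


# Specification: a cell that can always be read.
spec = {"ok": {("read", "ok")}}
# Implementation that recovers: after a Fault, reads still succeed.
tolerant = {
    "ok": {("read", "ok"), ("fault", "degraded")},
    "degraded": {("read", "ok")},
}
# Implementation that does not: after a Fault, only errors are produced.
intolerant = {
    "ok": {("read", "ok"), ("fault", "degraded")},
    "degraded": {("error", "degraded")},
}
```

    In this toy model the pair `("ok", "ok")` survives for `tolerant` and is removed for `intolerant`, because the `error` step after the Fault has no counterpart in the specification. Each pass removes at least one pair or terminates, so the number of passes is bounded by the number of state pairs.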

G. Bolt - One of the best experts on this subject based on the ideXlab platform.

  • Fault Tolerance and robustness in neural networks
    IJCNN-91-Seattle International Joint Conference on Neural Networks, 1
    Co-Authors: G. Bolt
    Abstract:

    Summary form only given. A framework by which the Fault Tolerance and robustness of neural networks can be assessed is proposed. The possible effect on Fault Tolerance of various features of neural networks is discussed, as well as how to sensibly and realistically choose a method to assess the Fault Tolerance of a neural network. Advantages of systems employing neural networks with respect to error detection and recovery are considered. Also, the validity of applying conventional Fault Tolerance design methods is discussed. It is concluded that research should be aimed at abstract models rather than at physical implementations.

Anish Arora - One of the best experts on this subject based on the ideXlab platform.

  • Designing masking Fault-Tolerance via nonmasking Fault-Tolerance
    IEEE Transactions on Software Engineering, 1998
    Co-Authors: Anish Arora, Sandeep S. Kulkarni
    Abstract:

    Masking Fault-Tolerance guarantees that programs continually satisfy their specification in the presence of Faults. By way of contrast, nonmasking Fault-Tolerance does not guarantee as much: it merely guarantees that when Faults stop occurring, program executions converge to states from where programs continually (re)satisfy their specification. We present in this paper a component based method for the design of masking Fault-tolerant programs. In this method, components are added to a Fault-intolerant program in a stepwise manner, first, to transform the Fault-intolerant program into a nonmasking Fault-tolerant one and, then, to enhance the Fault-Tolerance from nonmasking to masking. We illustrate the method by designing programs for agreement in the presence of Byzantine Faults, data transfer in the presence of message loss, triple modular redundancy in the presence of input corruption, and mutual exclusion in the presence of process fail-stops. These examples also serve to demonstrate that the method accommodates a variety of Fault-classes. It provides alternative designs for programs usually designed with extant design methods, and it offers the potential for improved masking Fault-tolerant programs.

  • SRDS - Designing masking Fault-Tolerance via nonmasking Fault-Tolerance
    Proceedings. 14th Symposium on Reliable Distributed Systems, 1
    Co-Authors: Anish Arora, Sandeep S. Kulkarni
    Abstract:

    Masking Fault-Tolerance guarantees that programs continually satisfy their specification in the presence of Faults. By way of contrast, nonmasking Fault-Tolerance does not guarantee as much: it merely guarantees that when Faults stop occurring, program executions converge to states from where programs continually (re)satisfy their specification. In this paper, we show that a practical method to design masking Fault-Tolerance is to first design nonmasking Fault-Tolerance and to then transform the nonmasking Fault-tolerant program minimally so as to achieve masking Fault-Tolerance. We demonstrate this method by designing novel fully distributed programs for termination detection, mutual exclusion, and leader election, that are masking tolerant of any finite number of process fail-stops and/or repairs.