Symmetric Difference

The Experts below are selected from a list of 27066 Experts worldwide ranked by ideXlab platform

Leopoldo Bertossi - One of the best experts on this subject based on the ideXlab platform.

  • complexity of consistent query answering in databases under cardinality based and incremental repair semantics extended version
    arXiv: Databases, 2016
    Co-Authors: Andrei Lopatenko, Leopoldo Bertossi
    Abstract:

    A database D may be inconsistent wrt a given set IC of integrity constraints. Consistent Query Answering (CQA) is the problem of computing from D the answers to a query that are consistent wrt IC. Consistent answers are invariant under all the repairs of D, i.e. the consistent instances that minimally depart from D. Three classes of repair have been considered in the literature: those that minimize set-theoretically the set of tuples in the Symmetric Difference; those that minimize the changes of attribute values, and those that minimize the cardinality of the set of tuples in the Symmetric Difference. The latter class has not been systematically investigated. In this paper we obtain algorithmic and complexity theoretic results for CQA under this cardinality-based repair semantics. We do this in the usual, static setting, but also in a dynamic framework where a consistent database is affected by a sequence of updates, which may make it inconsistent. We also establish comparative results with the other two kinds of repairs in the dynamic case.

  • complexity of consistent query answering in databases under cardinality based and incremental repair semantics
    arXiv: Databases, 2006
    Co-Authors: Andrei Lopatenko, Leopoldo Bertossi
    Abstract:

    Consistent Query Answering (CQA) is the problem of computing from a database the answers to a query that are consistent with respect to certain integrity constraints that the database, as a whole, may fail to satisfy. Consistent answers have been characterized as those that are invariant under certain minimal forms of restoration of the database consistency. We investigate algorithmic and complexity theoretic issues of CQA under database repairs that minimally depart (wrt the cardinality of the Symmetric Difference) from the original database. We obtain first tight complexity bounds. We also address the problem of incremental complexity of CQA, which naturally occurs when an originally consistent database becomes inconsistent after the execution of a sequence of update operations. Tight bounds on incremental complexity are provided for various semantics under denial constraints. Fixed parameter tractability is also investigated in this dynamic context, where the size of the update sequence becomes the relevant parameter.

  • complexity of consistent query answering in databases under cardinality based and incremental repair semantics
    Lecture Notes in Computer Science, 2006
    Co-Authors: Andrei Lopatenko, Leopoldo Bertossi
    Abstract:

    A database D may be inconsistent wrt a given set IC of integrity constraints. Consistent Query Answering (CQA) is the problem of computing from D the answers to a query that are consistent wrt IC. Consistent answers are invariant under all the repairs of D, i.e. the consistent instances that minimally depart from D. Three classes of repair have been considered in the literature: those that minimize set-theoretically the set of tuples in the Symmetric Difference; those that minimize the changes of attribute values, and those that minimize the cardinality of the set of tuples in the Symmetric Difference. The latter class has not been systematically investigated. In this paper we obtain algorithmic and complexity theoretic results for CQA under this cardinality-based repair semantics. We do this in the usual, static setting, but also in a dynamic framework where a consistent database is affected by a sequence of updates, which may make it inconsistent. We also establish comparative results with the other two kinds of repairs in the dynamic case.
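The difference between the two tuple-based repair classes named in the abstracts above can be seen on a toy instance. The following is a minimal brute-force sketch, not the authors' algorithm: the denial constraint (no two tuples P(x, y) and P(y, z) may coexist), the instance `D`, and all function names are hypothetical choices made for illustration, and the enumeration is exponential in the size of the database.

```python
from itertools import combinations


def is_consistent(db):
    # Hypothetical denial constraint: no two tuples P(x, y) and P(y, z) may coexist.
    return not any(t[1] == u[0] for t in db for u in db)


def consistent_subsets(db):
    # A denial constraint can only be violated by tuples that are present,
    # so every candidate repair here is a consistent subset of db.
    items = sorted(db)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            candidate = frozenset(combo)
            if is_consistent(candidate):
                yield candidate


def s_repairs(db):
    # Repairs whose symmetric difference with db is minimal under set inclusion.
    cands = list(consistent_subsets(db))
    return [c for c in cands if not any((db ^ o) < (db ^ c) for o in cands)]


def c_repairs(db):
    # Repairs whose symmetric difference with db has minimum cardinality.
    cands = list(consistent_subsets(db))
    best = min(len(db ^ c) for c in cands)
    return [c for c in cands if len(db ^ c) == best]


def consistent_tuples(repairs):
    # Tuples that survive in every repair.
    return frozenset.intersection(*repairs)


D = frozenset({("a", "b"), ("b", "c"), ("b", "d")})
print(s_repairs(D))  # two repairs: keep ("a", "b"), or keep the other two tuples
print(c_repairs(D))  # one repair: delete only ("a", "b")
```

On this instance the set-inclusion semantics leaves no tuple common to all repairs, while the cardinality-based semantics keeps `("b", "c")` and `("b", "d")`, so the two semantics can give different consistent answers.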

Guangwei Weng - One of the best experts on this subject based on the ideXlab platform.

  • bandwidth selection for kernel density estimators of multivariate level sets and highest density regions
    arXiv: Methodology, 2018
    Co-Authors: Charles R Doss, Guangwei Weng
    Abstract:

    We consider bandwidth matrix selection for kernel density estimators (KDEs) of density level sets in $\mathbb{R}^d$, $d \ge 2$. We also consider estimation of highest density regions, which differs from estimating level sets in that one specifies the probability content of the set rather than specifying the level directly. This complicates the problem. Bandwidth selection for KDEs is well studied, but the goal of most methods is to minimize a global loss function for the density or its derivatives. The loss we consider here is instead the measure of the Symmetric Difference of the true set and estimated set. We derive an asymptotic approximation to the corresponding risk. The approximation depends on unknown quantities which can be estimated, and the approximation can then be minimized to yield a choice of bandwidth, which we show in simulations performs well. We provide an R package lsbs for implementing our procedure.

  • bandwidth selection for kernel density estimators of multivariate level sets and highest density regions
    Electronic Journal of Statistics, 2018
    Co-Authors: Charles R Doss, Guangwei Weng
    Abstract:

    We consider bandwidth matrix selection for kernel density estimators of density level sets in $\mathbb{R} ^{d}$, $d\ge 2$. We also consider estimation of highest density regions, which differs from estimating level sets in that one specifies the probability content of the set rather than specifying the level directly. This complicates the problem. Bandwidth selection for KDEs is well studied, but the goal of most methods is to minimize a global loss function for the density or its derivatives. The loss we consider here is instead the measure of the Symmetric Difference of the true set and estimated set. We derive an asymptotic approximation to the corresponding risk. The approximation depends on unknown quantities which can be estimated, and the approximation can then be minimized to yield a choice of bandwidth, which we show in simulations performs well. We provide an R package lsbs for implementing our procedure.
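The loss described in the abstracts above, the measure of the Symmetric Difference between the true and the estimated level set, can be approximated numerically. The sketch below is only an illustration under stated assumptions: it does not use the lsbs package or the authors' bandwidth selector, the "estimated" density is a hypothetical stand-in (a wider Gaussian rather than an actual KDE), and the level 0.05, the grid bounds, and the function names are arbitrary choices.

```python
import math


def symdiff_measure(in_a, in_b, lo, hi, n):
    # Midpoint-grid approximation of the Lebesgue measure of A Δ B inside [lo, hi]^2.
    h = (hi - lo) / n
    mismatches = 0
    for i in range(n):
        x = lo + (i + 0.5) * h
        for j in range(n):
            y = lo + (j + 0.5) * h
            if in_a(x, y) != in_b(x, y):
                mismatches += 1
    return mismatches * h * h


def gaussian_density(x, y, sigma):
    # Isotropic bivariate normal density with standard deviation sigma.
    return math.exp(-(x * x + y * y) / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)


LEVEL = 0.05  # hypothetical density level


def true_set(x, y):
    return gaussian_density(x, y, 1.0) >= LEVEL


def est_set(x, y):
    # Stand-in for an oversmoothed estimate; not an actual kernel density estimator.
    return gaussian_density(x, y, 1.5) >= LEVEL


loss = symdiff_measure(true_set, est_set, -3.0, 3.0, 400)
print(loss)
```

Both level sets are discs here, so the grid value can be checked against the exact area of the annulus between them.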

Peter Widmayer - One of the best experts on this subject based on the ideXlab platform.

  • a better scoring model for de novo peptide sequencing: the Symmetric Difference between explained and measured masses
    Algorithms for Molecular Biology, 2017
    Co-Authors: Thomas Tschager, Simon Rosch, Ludovic C Gillet, Peter Widmayer
    Abstract:

    Given a peptide as a string of amino acids, the masses of all its prefixes and suffixes can be found by a trivial linear scan through the amino acid masses. The inverse problem is the ideal de novo peptide sequencing problem: Given all prefix and suffix masses, determine the string of amino acids. In biological reality, the given masses are measured in a lab experiment, and measurements by necessity are noisy. The (real, noisy) de novo peptide sequencing problem therefore has a noisy input: a few of the prefix and suffix masses of the peptide are missing and a few other masses are given in addition. For this setting, we ask for an amino acid string that explains the given masses as accurately as possible. Past approaches interpreted accuracy by searching for a string that explains as many masses as possible. We feel, however, that it is not only bad to not explain a mass that appears, but also to explain a mass that does not appear. We propose to minimize the Symmetric Difference between the set of given masses and the set of masses that the string explains. For this new optimization problem, we propose an efficient algorithm that computes both the best and the k best solutions. Proof-of-concept experiments on measurements of synthesized peptides show that our approach leads to better results compared to finding a string that explains as many given masses as possible. We conclude that considering the Symmetric Difference as optimization goal can improve the identification rates for de novo peptide sequencing. A preliminary version of this work has been presented at WABI 2016.

  • a better scoring model for de novo peptide sequencing: the Symmetric Difference between explained and measured masses
    Workshop on Algorithms in Bioinformatics, 2016
    Co-Authors: Ludovic C Gillet, Thomas Tschager, Simon Rosch, Peter Widmayer
    Abstract:

    Given a peptide as a string of amino acids, the masses of all its prefixes and suffixes can be found by a trivial linear scan through the amino acid masses. The inverse problem is the ideal de novo peptide sequencing problem: Given all prefix and suffix masses, determine the string of amino acids. In biological reality, the given masses are measured in a lab experiment, and measurements by necessity are noisy. The (real, noisy) de novo peptide sequencing problem therefore has a noisy input: a few of the prefix and suffix masses of the peptide are missing and a few others are given in addition. For this setting we ask for an amino acid string that explains the given masses as accurately as possible. Past approaches interpreted accuracy by searching for a string that explains as many masses as possible. We feel, however, that it is not only bad to not explain a mass that appears, but also to explain a mass that does not appear. That is, we propose to minimize the Symmetric Difference between the set of given masses and the set of masses that the string explains. For this new optimization problem, we propose an efficient algorithm that computes both the best and the k best solutions. Experiments on measurements of 342 synthesized peptides show that our approach leads to better results compared to finding a string that explains as many given masses as possible.
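The scoring idea in the abstracts above can be sketched on a toy example. This is a simplified illustration, not the authors' algorithm: it brute-forces a handful of candidates instead of computing the k best solutions efficiently, it uses nominal integer residue masses for a three-letter alphabet, it ignores ion-type mass offsets and measurement tolerance, and the measured mass set and function names are hypothetical.

```python
from itertools import accumulate, permutations

# Nominal integer residue masses (Da) for a toy alphabet.
MASS = {"G": 57, "A": 71, "S": 87}


def explained_masses(peptide):
    # Masses of all proper prefixes and proper suffixes, found by a linear scan.
    masses = [MASS[a] for a in peptide]
    prefixes = set(accumulate(masses[:-1]))
    suffixes = set(accumulate(reversed(masses[1:])))
    return prefixes | suffixes


def symdiff_score(peptide, measured):
    # Penalizes both measured masses left unexplained and explained masses not measured.
    return len(measured ^ explained_masses(peptide))


def shared_score(peptide, measured):
    # The older criterion: count how many measured masses the string explains.
    return len(measured & explained_masses(peptide))


# Hypothetical noisy spectrum for "GAS": mass 87 is missing, mass 100 is spurious.
measured = {57, 128, 158, 100}

candidates = ["".join(p) for p in permutations("GAS")]
best = min(symdiff_score(c, measured) for c in candidates)
winners = sorted(c for c in candidates if symdiff_score(c, measured) == best)
print(best, winners)
```

In this simplified model a string and its reverse explain the same pooled set of masses, so they always tie; distinguishing them would need information the sketch leaves out.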

Mahesh Viswanathan - One of the best experts on this subject based on the ideXlab platform.

  • an approximate L1-Difference algorithm for massive data streams
    SIAM Journal on Computing, 2003
    Co-Authors: Joan Feigenbaum, Sampath Kannan, Martin J Strauss, Mahesh Viswanathan
    Abstract:

    Massive data sets are increasingly important in a wide range of applications, including observational sciences, product marketing, and the monitoring and operations of large systems. In network operations, raw data typically arrive in streams, and decisions must be made by algorithms that make one pass over each stream, throw much of the raw data away, and produce "synopses" or "sketches" for further processing. Moreover, network-generated massive data sets are often distributed: Several different, physically separated network elements may receive or generate data streams that, together, comprise one logical data set; to be of use in operations, the streams must be analyzed locally and their synopses sent to a central operations facility. The enormous scale, distributed nature, and one-pass processing requirement on the data sets of interest must be addressed with new algorithmic techniques. We present one fundamental new technique here: a space-efficient, one-pass algorithm for approximating the L1-Difference $\sum_i|a_i-b_i|$ between two functions, when the function values $a_i$ and $b_i$ are given as data streams, and their order is chosen by an adversary. Our main technical innovation, which may be of interest outside the realm of massive data stream algorithmics, is a method of constructing families $\{V_j(s)\}$ of limited-independence random variables that are range-summable, by which we mean that $\sum_{j=0}^{c-1} V_j(s)$ is computable in time polylog(c) for all seeds s. Our L1-Difference algorithm can be viewed as a "sketching" algorithm, in the sense of [Broder et al., J. Comput. System Sci., 60 (2000), pp. 630--659], and our technique performs better than that of Broder et al. when used to approximate the Symmetric Difference of two sets with small Symmetric Difference.

  • an approximate L1-Difference algorithm for massive data streams
    Foundations of Computer Science, 1999
    Co-Authors: Joan Feigenbaum, Sampath Kannan, Martin J Strauss, Mahesh Viswanathan
    Abstract:

    We give a space-efficient, one-pass algorithm for approximating the $L^1$-Difference $\sum_i|a_i-b_i|$ between two functions, when the function values $a_i$ and $b_i$ are given as data streams, and their order is chosen by an adversary. Our main technical innovation is a method of constructing families $\{V_j\}$ of limited-independence random variables that are range-summable, by which we mean that $\sum_{j=0}^{c-1} V_j(s)$ is computable in time polylog(c), for all seeds s. These random variable families may be of interest outside our current application domain, i.e., massive data streams generated by communication networks. Our $L^1$-Difference algorithm can be viewed as a "sketching" algorithm, in the sense of (A. Broder et al., 1998), and our algorithm performs better than that of Broder et al., when used to approximate the Symmetric Difference of two sets with small Symmetric Difference.
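The link between the L1-Difference and the Symmetric Difference mentioned in the abstracts above is that, for 0/1 indicator functions of two sets, $\sum_i|a_i-b_i|$ equals the cardinality of the Symmetric Difference. The sketch below only checks that identity with an exact baseline; it is not the space-efficient sketching algorithm of the paper (it stores one counter per index), and the stream encoding as `(index, value)` pairs and the function names are assumptions made for illustration.

```python
from collections import defaultdict


def l1_difference(stream_a, stream_b):
    # Exact baseline: accumulate a_i - b_i per index, then sum absolute values.
    # Items may arrive in any order; memory grows with the number of distinct indices.
    diff = defaultdict(int)
    for i, value in stream_a:
        diff[i] += value
    for i, value in stream_b:
        diff[i] -= value
    return sum(abs(d) for d in diff.values())


def indicator_stream(items):
    # Encode a set as a stream of (index, 1) pairs, i.e. its 0/1 indicator function.
    return [(x, 1) for x in items]


A = {1, 2, 3, 5}
B = {2, 3, 4}
print(l1_difference(indicator_stream(A), indicator_stream(B)))  # equals len(A ^ B)
```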

Lynette Van Zijl - One of the best experts on this subject based on the ideXlab platform.

  • unary self verifying Symmetric Difference automata
    Descriptional Complexity of Formal Systems, 2016
    Co-Authors: Laurette Marais, Lynette Van Zijl
    Abstract:

    We investigate self-verifying nondeterministic finite automata, in the case of unary Symmetric Difference nondeterministic finite automata (SV-XNFA). We show that there is a family of languages \(\mathcal {L}_{n\ge 2}\) which can always be represented non-trivially by unary SV-XNFA. We also consider the descriptional complexity of unary SV-XNFA, giving an upper and lower bound for state complexity.

  • minimal dfa for Symmetric Difference nfa
    Descriptional Complexity of Formal Systems, 2012
    Co-Authors: Brink Van Der Merwe, Hellis Tamm, Lynette Van Zijl
    Abstract:

    Recently, a characterization of the class of nondeterministic finite automata (NFAs) for which determinization results in a minimal deterministic finite automaton (DFA) was given in [2]. We present a similar result for the case of Symmetric Difference NFAs. Also, we show that determinization of any minimal Symmetric Difference NFA produces a minimal DFA.

  • magic numbers for Symmetric Difference nfas
    International Journal of Foundations of Computer Science, 2005
    Co-Authors: Lynette Van Zijl
    Abstract:

    Iwama et al. showed that there exists an n-state binary nondeterministic finite automaton such that its equivalent minimal deterministic finite automaton has exactly $2^n - \alpha$ states, for all $n \ge 7$ and $5 \le \alpha \le 2n-2$, subject to certain coprimality conditions. We investigate the same question for both unary and binary Symmetric Difference nondeterministic finite automata. In the binary case, we show that for any $n \ge 4$, there is an n-state Symmetric Difference nondeterministic finite automaton for which the equivalent minimal deterministic finite automaton has $2^{n-1} + 2^{k-1} - 1$ states, for $2 < k \le n-1$. In the unary case, we consider a large practical subclass of unary Symmetric Difference nondeterministic finite automata: for all $n \ge 2$, we argue that there are many values of $\alpha$ such that there is no n-state unary Symmetric Difference nondeterministic finite automaton with an equivalent minimal deterministic finite automaton with $2^n - \alpha$ states, where $0 < \alpha < 2^{n-1}$. For each $n \ge 2$, we quantify such values of $\alpha$ precisely.

  • magic numbers for Symmetric Difference nfas
    International Conference on Implementation and application of automata, 2004
    Co-Authors: Lynette Van Zijl
    Abstract:

    Iwama et al. [1] showed that there exists an n-state binary nondeterministic finite automaton such that its equivalent minimal deterministic finite automaton has exactly $2^n - \alpha$ states, for all $n \ge 7$ and $5 \le \alpha \le 2n-2$, subject to certain coprimality conditions. We investigate the same question for both unary and binary Symmetric Difference nondeterministic finite automata [2]. In the binary case, we show that for any $n \ge 4$, there is an n-state ⊕-NFA which needs $2^{n-1} + 2^{k-1} - 1$ states, for $2 < k \le n-1$. In the unary case, we prove the following result for a large practical subclass of unary Symmetric Difference nondeterministic finite automata: For all $n \ge 2$, we show that there are many values of $\alpha$ such that there is no n-state unary Symmetric Difference nondeterministic finite automaton with an equivalent deterministic finite automaton with $2^n - \alpha$ states, where $0 < \alpha < 2^{n-1}$. For each $n \ge 2$, we quantify such values of $\alpha$ precisely.
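For readers unfamiliar with Symmetric Difference NFAs, the determinization that the abstracts above refer to can be sketched as a subset construction in which the union of successor sets is replaced by their symmetric difference, with a subset accepting when it contains an odd number of final states. The code is a minimal sketch under that standard definition, not code from the cited papers; the two-state unary automaton and all names are hypothetical examples.

```python
def xor_determinize(delta, start, finals, alphabet):
    # delta maps (state, symbol) to a set of successor states.
    start_set = frozenset(start)
    seen = {start_set}
    todo = [start_set]
    dfa_delta = {}
    while todo:
        subset = todo.pop()
        for a in alphabet:
            target = frozenset()
            for q in subset:
                # Symmetric difference instead of union: states reached an even
                # number of times cancel out.
                target = target ^ frozenset(delta.get((q, a), ()))
            dfa_delta[(subset, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    # A subset is accepting when it holds an odd number of final states.
    dfa_finals = {s for s in seen if len(s & frozenset(finals)) % 2 == 1}
    return seen, dfa_delta, start_set, dfa_finals


def accepts(dfa, word):
    _, dfa_delta, state, dfa_finals = dfa
    for a in word:
        state = dfa_delta[(state, a)]
    return state in dfa_finals


# Hypothetical unary example with n = 2 states.
delta = {("q0", "a"): {"q1"}, ("q1", "a"): {"q0", "q1"}}
dfa = xor_determinize(delta, {"q0"}, {"q0"}, "a")
print(len(dfa[0]))  # number of reachable subsets
```

In this example the reachable subsets form a single cycle through all three non-empty subsets of the two states, whereas the ordinary union-based construction would get stuck in the full set after two steps.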