Data-Value

The experts below are selected from a list of 39,094,995 experts worldwide, ranked by the ideXlab platform.

James P. Ziliak - One of the best experts on this subject based on the ideXlab platform.

  • The Value of a Statistical Life: Evidence from Panel Data
    The Review of Economics and Statistics, 2012
    Co-Authors: Thomas J. Kniesner, Kip W. Viscusi, Christopher Woock, James P. Ziliak
    Abstract:

    We address long-standing concerns in the literature on compensating wage differentials: the econometric properties of the estimated value of statistical life (VSL) and the wide range of such estimates. We confront prominent econometric issues using panel data, a more accurate fatality risk measure, and systematic application of panel data estimators. Controlling for measurement error, endogeneity, latent individual heterogeneity possibly correlated with regressors, state dependence, and sample composition yields VSL estimates of $4 million to $10 million. The comparatively narrow range clarifies the cost-effectiveness of regulatory decisions. Most important econometrically is controlling for latent heterogeneity; less important is how one does it.

  • The Value of a Statistical Life: Evidence from Panel Data
    2011
    Co-Authors: Kip W. Viscusi, Thomas J. Kniesner, Christopher Woock, James P. Ziliak
    Abstract:

    This article addresses fundamental long-standing concerns in the compensating wage differentials literature and its public policy implications: the econometric properties of estimates of the value of statistical life (VSL) and the wide range of such estimates. Here we address most of the prominent econometric issues using panel data, a new and more accurate fatality risk measure, and systematic application of panel data estimators. Controlling for measurement error, endogeneity, latent individual heterogeneity that may be correlated with the regressors, state dependence, and sample composition yields estimates of the value of a statistical life in the range of about $6 million to $10 million. This comparatively narrow range greatly clarifies the assessments of the cost-effectiveness of regulatory decisions. We show that probably the most important econometric issue is controlling for latent heterogeneity; less important is how one does it.
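
A sketch of the specification behind both entries above (our notation, a stylized version rather than the authors' exact model): the hedonic wage equation regresses log wages on job fatality risk with worker fixed effects, and the VSL follows from the risk coefficient.

    % Hedonic wage equation with worker fixed effects \alpha_i:
    \ln w_{it} = \alpha_i + \beta\, p_{it} + X_{it}'\gamma + \varepsilon_{it}
    % p_{it}: fatality risk of worker i's job in year t, in deaths per 100,000 workers.
    % Panel (within) estimators sweep out \alpha_i, the latent heterogeneity the
    % abstracts flag as the most important thing to control for.
    % Because p_{it} is a death probability scaled by 100,000, the implied VSL is
    \mathrm{VSL} \;\approx\; \hat{\beta}\,\bar{w} \times 100{,}000
    % e.g. an illustrative \hat{\beta} = 0.002 at mean earnings \bar{w} = \$40{,}000
    % gives \$8 million, inside the papers' estimated range.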

Thomas J. Kniesner - One of the best experts on this subject based on the ideXlab platform.

  • The Value of a Statistical Life: Evidence from Panel Data
    The Review of Economics and Statistics, 2012
    Co-Authors: Thomas J. Kniesner, Kip W. Viscusi, Christopher Woock, James P. Ziliak
    Abstract:

    We address long-standing concerns in the literature on compensating wage differentials: the econometric properties of the estimated value of statistical life (VSL) and the wide range of such estimates. We confront prominent econometric issues using panel data, a more accurate fatality risk measure, and systematic application of panel data estimators. Controlling for measurement error, endogeneity, latent individual heterogeneity possibly correlated with regressors, state dependence, and sample composition yields VSL estimates of $4 million to $10 million. The comparatively narrow range clarifies the cost-effectiveness of regulatory decisions. Most important econometrically is controlling for latent heterogeneity; less important is how one does it.

  • The Value of a Statistical Life: Evidence from Panel Data
    2011
    Co-Authors: Kip W. Viscusi, Thomas J. Kniesner, Christopher Woock, James P. Ziliak
    Abstract:

    This article addresses fundamental long-standing concerns in the compensating wage differentials literature and its public policy implications: the econometric properties of estimates of the value of statistical life (VSL) and the wide range of such estimates. Here we address most of the prominent econometric issues using panel data, a new and more accurate fatality risk measure, and systematic application of panel data estimators. Controlling for measurement error, endogeneity, latent individual heterogeneity that may be correlated with the regressors, state dependence, and sample composition yields estimates of the value of a statistical life in the range of about $6 million to $10 million. This comparatively narrow range greatly clarifies the assessments of the cost-effectiveness of regulatory decisions. We show that probably the most important econometric issue is controlling for latent heterogeneity; less important is how one does it.
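
To make "controlling for latent heterogeneity" concrete, here is a minimal, self-contained sketch of the within (fixed-effects) estimator applied to a hedonic wage equation like the one above. The data, magnitudes, and variable names are illustrative assumptions, not the authors' code or dataset.

    # Within (fixed-effects) estimation of a hedonic wage equation on a
    # synthetic panel. Latent worker heterogeneity alpha_i is correlated
    # with job risk, so pooled OLS is biased; demeaning within workers
    # sweeps alpha_i out.
    import numpy as np

    rng = np.random.default_rng(0)
    n_workers, n_years = 500, 8
    beta_true = 0.002                   # log-wage premium per unit of risk

    alpha = rng.normal(0.0, 0.5, n_workers)          # latent heterogeneity
    # Fatality risk in deaths per 100,000 workers, correlated with alpha.
    risk = np.abs(rng.normal(4.0, 1.0, (n_workers, n_years)) + alpha[:, None])
    logw = (alpha[:, None] + beta_true * risk
            + rng.normal(0.0, 0.05, (n_workers, n_years)))

    # Pooled OLS ignores alpha_i and inherits its correlation with risk.
    x, y = risk.ravel() - risk.mean(), logw.ravel() - logw.mean()
    beta_pooled = (x * y).sum() / (x * x).sum()

    # Within transformation: subtract each worker's own means.
    rd = risk - risk.mean(axis=1, keepdims=True)
    ld = logw - logw.mean(axis=1, keepdims=True)
    beta_fe = (rd * ld).sum() / (rd * rd).sum()

    annual_wage = 40_000                # illustrative mean earnings
    print(f"pooled OLS:    beta = {beta_pooled:.4f}")   # badly biased upward
    print(f"fixed effects: beta = {beta_fe:.4f}")       # close to beta_true
    print(f"implied VSL = ${beta_fe * annual_wage * 100_000:,.0f}")

The fixed-effects slope recovers the true premium because alpha_i is constant within a worker, while the pooled slope absorbs the risk-heterogeneity correlation; that is exactly the bias the papers' panel estimators are designed to remove.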

Costas J Spanos - One of the best experts on this subject based on the ideXlab platform.

  • Towards Efficient Data Valuation Based on the Shapley Value
    International Conference on Artificial Intelligence and Statistics, 2019
    Co-Authors: Ruoxi Jia, David Dao, Boxin Wang, Frances Ann Hubis, Nicholas Hynes, Nezihe Merve Gurel, Ce Zhang, Dawn Song, Costas J. Spanos
    Abstract:

    "How much is my data worth?" is an increasingly common question posed by organizations and individuals alike. An answer to this question could allow, for instance, fairly distributing profits among multiple data contributors and determining prospective compensation when data breaches happen. In this paper, we study the problem of data valuation by utilizing the Shapley value, a popular notion of value which originated in cooperative game theory. The Shapley value defines a unique payoff scheme that satisfies many desiderata for the notion of data value. However, the Shapley value often requires exponential time to compute. To meet this challenge, we propose a repertoire of efficient algorithms for approximating the Shapley value. We also demonstrate the value of each training instance for various benchmark datasets.

  • Towards Efficient Data Valuation Based on the Shapley Value
    arXiv: Learning, 2019
    Co-Authors: Ruoxi Jia, David Dao, Boxin Wang, Frances Ann Hubis, Nicholas Hynes, Nezihe Merve Gurel, Ce Zhang, Dawn Song, Costas J. Spanos
    Abstract:

    "How much is my data worth?" is an increasingly common question posed by organizations and individuals alike. An answer to this question could allow, for instance, fairly distributing profits among multiple data contributors and determining prospective compensation when data breaches happen. In this paper, we study the problem of data valuation by utilizing the Shapley value, a popular notion of value which originated in coopoerative game theory. The Shapley value defines a unique payoff scheme that satisfies many desiderata for the notion of data value. However, the Shapley value often requires exponential time to compute. To meet this challenge, we propose a repertoire of efficient algorithms for approximating the Shapley value. We also demonstrate the value of each training instance for various benchmark datasets.

Prasenjit Mitra - One of the best experts on this subject based on the ideXlab platform.

  • Schema Matching and Embedded Value Mapping for Databases with Opaque Column Names and Mixed Continuous and Discrete-Valued Data Fields
    ACM Transactions on Database Systems, 2013
    Co-Authors: Anuj R. Jaiswal, David J. Miller, Prasenjit Mitra
    Abstract:

    Schema matching and value mapping across two information sources, such as databases, are critical information aggregation tasks. Before data can be integrated from multiple tables, the columns and values within the tables must be matched. The complexities of both problems grow quickly with the number of attributes to be matched and with the multiple possible semantics of data values. Traditional research has mostly tackled schema matching and value mapping independently, and for categorical (discrete-valued) attributes. We propose novel methods that leverage value mappings to enhance schema matching in the presence of opaque column names for schemas consisting of both continuous and discrete-valued attributes. An additional source of complexity is that a discrete-valued attribute in one schema could in fact be a quantized, encoded version of a continuous-valued attribute in the other schema. In our approach, which can tackle both “onto” and bijective schema matching, the fitness objective for matching a pair of attributes from two schemas exploits the statistical distribution over values within the two attributes. Suitable fitness objectives are based on Euclidean distance and the data log-likelihood, both of which are applied in our experimental study. A heuristic local descent optimization strategy that uses two-opt switching to optimize attribute matches, while simultaneously embedding value mappings, is applied for our matching methods. Our experiments show that the proposed techniques matched mixed continuous and discrete-valued attribute schemas with high accuracy and, thus, should be a useful addition to a framework of (semi-)automated tools for data alignment.

  • Uninterpreted Schema Matching with Embedded Value Mapping under Opaque Column Names and Data Values
    IEEE Transactions on Knowledge and Data Engineering, 2010
    Co-Authors: Anuj R. Jaiswal, David J. Miller, Prasenjit Mitra
    Abstract:

    Schema matching and value mapping across two heterogeneous information sources are critical tasks in applications involving data integration, data warehousing, and federation of databases. Before data can be integrated from multiple tables, the columns and the values appearing in the tables must be matched. The complexity of the problem grows quickly with the number of data attributes/columns to be matched and with the multiple possible semantics of data values. Traditional research has tackled schema matching and value mapping independently. We propose a novel method that optimizes embedded value mappings to enhance schema matching in the presence of opaque data values and column names. In this approach, the fitness objective for matching a pair of attributes from two schemas depends on the value mapping function for each of the two attributes. Suitable fitness objectives include the Euclidean distance measure, which we use in our experimental study, as well as relative (cross) entropy. We propose a heuristic local descent optimization strategy that uses sorting and two-opt switching to jointly optimize value mappings and attribute matches. Our experiments show that our proposed technique outperforms earlier uninterpreted schema matching methods and thus should form a useful addition to a suite of (semi-)automated tools for resolving structural heterogeneity.
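
As a hedged illustration of the matching machinery both papers describe (a simplified sketch, not the authors' implementation): represent each opaque attribute by a histogram of its values over shared bins, score a bijective match by the total Euclidean distance between matched histograms, and improve the match by two-opt switching.

    # Bijective schema matching by distributional fitness with two-opt
    # local descent. Column names are opaque, so attributes are compared
    # only through their value distributions (histograms).
    import numpy as np

    def total_cost(assign, dist):
        # Fitness of matching attribute j of schema A to assign[j] of schema B.
        return sum(dist[j, assign[j]] for j in range(len(assign)))

    def two_opt_match(hist_a, hist_b, max_passes=50):
        """hist_a, hist_b: (n_attrs, n_bins) per-attribute value distributions."""
        dist = np.linalg.norm(hist_a[:, None, :] - hist_b[None, :, :], axis=2)
        n = hist_a.shape[0]
        assign = list(range(n))             # start from the identity match
        for _ in range(max_passes):
            improved = False
            for j in range(n):
                for k in range(j + 1, n):
                    cand = assign.copy()
                    cand[j], cand[k] = cand[k], cand[j]   # two-opt switch
                    if total_cost(cand, dist) < total_cost(assign, dist):
                        assign, improved = cand, True
            if not improved:                # local optimum reached
                break
        return assign   # assign[j] = attribute of B matched to attribute j of A

    a = np.array([[0.8, 0.2], [0.1, 0.9], [0.5, 0.5]])
    b = np.array([[0.15, 0.85], [0.55, 0.45], [0.75, 0.25]])
    print(two_opt_match(a, b))              # -> [2, 0, 1]

Embedding the value mappings the papers describe would add an inner optimization per candidate match; the two-opt descent skeleton stays the same.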

Geraint Rees - One of the best experts on this subject based on the ideXlab platform.

  • The Value of Data: Applying a Public Value Model to the English National Health Service
    Journal of Medical Internet Research, 2020
    Co-Authors: James F. Wilson, Daniel Herron, Parashkev Nachev, Nick McNally, Bryan Williams, Geraint Rees
    Abstract:

    Research and innovation in biomedicine and health care increasingly depend on electronic data. The emergence of data-driven technologies and associated digital transformations has focused attention on the value of such data. Despite broad consensus on the value of health data, there is less consensus on the basis for that value; thus, the nature and extent of health data value remain unclear. Much of the existing literature presupposes that the value of data is to be understood primarily in financial terms, and assumes that a single financial value can be assigned. We argue here that the value of a dataset is instead relational; that is, the value depends on who wants to use it and for what purposes. Moreover, data are valued for both nonfinancial and financial reasons. Thus, it may be more accurate to discuss the values (plural) of a dataset rather than a singular value. This plurality of values opens up an important set of questions about how health data should be valued for the purposes of public policy. We argue that public value models provide a useful approach in this regard. According to public value theory, public value is created, or captured, to the extent that public sector institutions further their democratically established goals and improve the lives of citizens. This article outlines how adopting such an approach might be operationalized within existing health care systems such as the English National Health Service, with a particular focus on actionable conclusions.