Identity Capture

The experts below are selected from a list of 54 experts worldwide ranked by the ideXlab platform.

John R. Talburt - One of the best experts on this subject based on the ideXlab platform.

  • Entity Information Life Cycle for Big Data: Master Data Management and Information Integration
    2015
    Co-Authors: John R. Talburt, Yinle Zhou
    Abstract:

    Entity Information Life Cycle for Big Data walks you through the ins and outs of managing entity information so you can successfully achieve master data management (MDM) in the era of big data. This book explains big data's impact on MDM and the critical role of an entity information management system (EIMS) in successful MDM. Expert authors Dr. John R. Talburt and Dr. Yinle Zhou provide a thorough background in the principles of managing the entity information life cycle, along with practical tips and techniques for implementing an EIMS, strategies for exploiting distributed processing to handle big data for EIMS, and examples from real applications. Additional material on the theory of entity identity information management (EIIM) and methods for assessing and evaluating EIMS performance also makes this book appropriate for use as a textbook in courses on entity and identity management, data management, customer relationship management (CRM), and related topics.

    Explains the business value and impact of an entity information management system (EIMS) and directly addresses the problem of EIMS design and operation, a critical issue organizations face when implementing MDM systems. Offers practical guidance to help you design and build an EIM system that will successfully handle big data. Details how to measure and evaluate entity integrity in MDM systems and explains the principles and processes that comprise EIM. Provides an understanding of the features and functions an EIM system should have, which will assist in evaluating commercial EIM systems. Includes chapter review questions, exercises, tips, and free downloads of demonstrations that use the OYSTER open-source EIM system. Executable code (Java .jar files), control scripts, and synthetic input data illustrate various aspects of the CSRUD life cycle, such as identity capture, identity update, and assertions.

  • Strategies for Large-Scale Entity Resolution Based on Inverted Index Data Partitioning
    Information Quality and Governance for Business Intelligence, 2014
    Co-Authors: Yinle Zhou, John R. Talburt
    Abstract:

    Inverted indexing is a commonly used technique for improving the performance of entity resolution algorithms by reducing the number of pairwise comparisons necessary to arrive at acceptable results. This chapter describes how inverted indexing can also be used as a data partitioning strategy to perform entity resolution on large datasets in a distributed processing environment. It discusses the importance of index-to-rule alignment, pre-resolution index closure, post-resolution link closure, and workflows for record-based identity capture and update and for attribute-based identity capture and update in a distributed processing environment.
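
    The idea of using an inverted index to cut down pairwise comparisons can be illustrated with a minimal Python sketch. It is not the chapter's implementation: the record layout, the `name_keys` index key (first three letters of the surname plus ZIP code), and all function names are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def build_inverted_index(records, key_fn):
    """Map each index key to the set of record ids that produce it."""
    index = defaultdict(set)
    for rec_id, rec in records.items():
        for key in key_fn(rec):
            index[key].add(rec_id)
    return index


def candidate_pairs(index):
    """Return the pairs of record ids that share at least one index key."""
    pairs = set()
    for ids in index.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


def name_keys(rec):
    # Hypothetical index key: first three letters of the surname plus ZIP code.
    return {rec["last"][:3].upper() + "|" + rec["zip"]}


# Synthetic records, invented for this example.
records = {
    "r1": {"last": "Talburt", "zip": "72204"},
    "r2": {"last": "Talbert", "zip": "72204"},
    "r3": {"last": "Zhou", "zip": "72204"},
    "r4": {"last": "Zhou", "zip": "10001"},
}

index = build_inverted_index(records, name_keys)
pairs = candidate_pairs(index)
all_pairs = len(records) * (len(records) - 1) // 2
print(f"{len(pairs)} candidate pair(s) instead of {all_pairs}")
```

    Only records that land under the same key are ever compared, so each key's record set is also a natural unit of work to hand to a separate worker when the data are partitioned.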

  • A Graduate-Level Course on Entity Resolution and Information Quality: A Step toward ER Education
    Journal of Data and Information Quality, 2013
    Co-Authors: Yinle Zhou, Fumiko Kobayashi, Eric Nelson, John R. Talburt
    Abstract:

    This article discusses the topics, approaches, and lessons learned in teaching a graduate-level course covering entity resolution (ER) and its relationship to information quality (IQ). The course surveys a broad spectrum of ER topics and activities including entity reference extraction, entity reference preparation, entity reference resolution techniques, entity identity management, and entity relationship analysis. The course content also attempts to balance aspects of ER theory with practical application through a series of laboratory exercises coordinated with the lecture topics. As an additional teaching aid, a configurable, open-source entity resolution engine (OYSTER) was developed that allows students to experiment with different types of ER architectures including merge-purge, record linking, identity resolution, and identity capture.

  • The OYSTER Project
    Entity Resolution and Information Quality, 2011
    Co-Authors: John R. Talburt
    Abstract:

    This chapter discusses the OYSTER project, an open-source software development project sponsored by the ERIQ Research Center. OYSTER (Open sYSTem Entity Resolution) is an entity resolution system that can be configured to run in several modes of operation including merge-purge, identity capture, and identity resolution. The resolution engine supports probabilistic direct matching, transitive equivalence, and asserted equivalence. To facilitate prospecting for match candidates, the system builds and maintains an in-memory index of attribute values to identities. Because OYSTER has an identity management system, it also supports persistent identity identifiers. OYSTER is written in Java, and the source code and documentation are available as a free download from the ERIQ website for use under the OYSTER open-source license. OYSTER is a freely available general-purpose entity resolution system that can be adapted to a wide range of applications, including instructional support.
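
    How an in-memory index of attribute values, persistent identifiers, and transitive equivalence fit together can be shown with a toy Python sketch. It is not OYSTER code (OYSTER is written in Java), and it swaps in exact matching on shared attribute values for the probabilistic direct matching described above. The `IdentityStore` class, the `ID0001`-style identifiers, and the sample values are all hypothetical.

```python
import itertools


class IdentityStore:
    """Toy identity capture: assigns persistent identifiers to references."""

    def __init__(self):
        self._next = itertools.count(1)
        self.value_index = {}  # attribute value -> identity id
        self.identities = {}   # identity id -> set of attribute values

    def capture(self, values):
        """Attach a reference (a set of attribute values) to an identity."""
        values = set(values)
        # Assumption: an exact shared value counts as a direct match.
        matched = sorted({self.value_index[v] for v in values
                          if v in self.value_index})
        if matched:
            keep = matched[0]  # the oldest identifier persists
            # Transitive equivalence: a reference that bridges several
            # identities merges them into the one that is kept.
            for other in matched[1:]:
                for v in self.identities.pop(other):
                    self.identities[keep].add(v)
                    self.value_index[v] = keep
        else:
            keep = f"ID{next(self._next):04d}"
            self.identities[keep] = set()
        for v in values:
            self.identities[keep].add(v)
            self.value_index[v] = keep
        return keep


store = IdentityStore()
a = store.capture({"email:jt@example.org", "phone:555-0100"})
b = store.capture({"email:john@example.org"})
# This reference shares a value with both identities, so they are merged.
c = store.capture({"phone:555-0100", "email:john@example.org"})
print(a, b, c, sorted(store.identities))
```

    Keeping the oldest identifier on a merge is one simple way to make identifiers persistent; a production system would also need to record the retired identifier so that earlier links can still be followed.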

  • Design and construction of an entity resolution system that supports entity identity information management and asserted resolution
    2011
    Co-Authors: John R. Talburt, Eric Nelson
    Abstract:

    This work describes the design and construction of an open-source entity resolution system that enables users to assign and maintain persistent identifiers for master data items. Two key features of this system that are not available in current ER systems and that make persistent identification possible are (1) the capture and management of entity identity information and (2) support for user-directed asserted resolution to complement automated direct matching and transitive closure. Another important feature of the design is that the system can be easily configured at run time into any one of four types of entity resolution architecture: (1) traditional merge/purge, also known as record linking; (2) identity capture; (3) identity update; and (4) identity resolution. Because these configurations can be established by the user at run time, the system provides a valuable tool for academic research and instruction, allowing researchers and students to use the same system to explore the behavior and nature of different ER architectures. Although the most common string-match comparators, such as Levenshtein edit distance, q-gram, Soundex, and many others, have been built into the system, it has been designed so that users can easily add comparators by extending the system's Comparator class. Furthermore, the system incorporates a dynamic filtering system that improves the performance of the matching algorithm by avoiding record pairs that cannot possibly match.
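
    The extensible-comparator design can be sketched in Python as follows. This is an illustration under stated assumptions, not the system's actual Java Comparator class: the `similarity` method, the normalization to a score between 0 and 1, and the use of Jaccard overlap for the q-gram comparator are choices made here for the example.

```python
from abc import ABC, abstractmethod


class Comparator(ABC):
    """Hypothetical base class; not the real system's Java API."""

    @abstractmethod
    def similarity(self, a: str, b: str) -> float:
        """Return a score in [0, 1], where 1 means identical."""


class LevenshteinComparator(Comparator):
    """Edit distance normalized by the longer string's length."""

    def similarity(self, a, b):
        if not a and not b:
            return 1.0
        prev = list(range(len(b) + 1))  # distances from "" to prefixes of b
        for i, ca in enumerate(a, start=1):
            curr = [i]
            for j, cb in enumerate(b, start=1):
                cost = 0 if ca == cb else 1
                curr.append(min(prev[j] + 1,          # deletion
                                curr[j - 1] + 1,      # insertion
                                prev[j - 1] + cost))  # substitution
            prev = curr
        return 1.0 - prev[-1] / max(len(a), len(b))


class QGramComparator(Comparator):
    """Jaccard overlap of padded q-gram sets (one of several variants)."""

    def __init__(self, q=2):
        self.q = q

    def _grams(self, s):
        pad = "#" * (self.q - 1)
        padded = pad + s + pad
        return {padded[i:i + self.q]
                for i in range(len(padded) - self.q + 1)}

    def similarity(self, a, b):
        ga, gb = self._grams(a), self._grams(b)
        if not ga and not gb:
            return 1.0
        return len(ga & gb) / len(ga | gb)


lev = LevenshteinComparator()
qgram = QGramComparator(q=2)
for left, right in [("kitten", "sitting"), ("Talburt", "Talbert")]:
    print(left, right,
          round(lev.similarity(left, right), 3),
          round(qgram.similarity(left, right), 3))
```

    A new comparator, Soundex for example, would be added by subclassing `Comparator` and implementing `similarity`, leaving the matching loop untouched.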

Kevin R. Crooks - One of the best experts on this subject based on the ideXlab platform.

  • Spatial capture–recapture with partial identity: An application to camera traps
    The Annals of Applied Statistics, 2018
    Co-Authors: Ben C. Augustine, J. Andrew Royle, Marcella J. Kelly, Christopher B. Satter, Robert S. Alonso, Erin E. Boydston, Kevin R. Crooks
    Abstract:

    Camera trapping surveys frequently capture individuals whose identity is only known from a single flank. The most widely used methods for incorporating these partial identity individuals into density analyses discard some of the partial identity capture histories, reducing precision and, although not previously recognized, introducing bias. Here, we present the spatial partial identity model (SPIM), which uses the spatial location where partial identity samples are captured to probabilistically resolve their complete identities, allowing all partial identity samples to be used in the analysis. We show that the SPIM outperforms other analytical alternatives. We then apply the SPIM to an ocelot data set collected on a trapping array with double-camera stations and a bobcat data set collected on a trapping array with single-camera stations. The SPIM improves inference in both cases and, in the ocelot example, individual sex determined from photographs is used to further resolve partial identities, one of which is resolved to near certainty. The SPIM opens the door for the investigation of trapping designs that deviate from the standard two-camera design and for the combination of other data types between which identities cannot be deterministically linked, and it can be extended to the problem of partial genotypes.

  • Spatial capture-recapture with partial identity: an application to camera traps
    2016
    Co-Authors: Ben C. Augustine, J. Andrew Royle, Marcella J. Kelly, Christopher B. Satter, Robert S. Alonso, Erin E. Boydston, Kevin R. Crooks
    Abstract:

    Camera trapping surveys frequently capture individuals whose identity is only known from a single flank. The most widely used methods for incorporating these partial identity individuals into density analyses do not use all of the partial identity capture histories, reducing precision and, although not previously recognized, introducing bias. Here, we present the spatial partial identity model (SPIM), which uses the spatial location where partial identity samples are captured to probabilistically resolve their complete identities, allowing all partial identity samples to be used in the analysis. We show that the SPIM outperforms other analytical alternatives. We then apply the SPIM to an ocelot data set collected on a trapping array with double-camera stations and a bobcat data set collected on a trapping array with single-camera stations. The SPIM improves inference in both cases and, in the ocelot example, individual sex determined from photographs is used to further resolve partial identities, one of which is resolved to near certainty. The SPIM opens the door for the investigation of trapping designs that deviate from the standard two-camera design and for the combination of other data types between which identities cannot be deterministically linked, and it can be extended to the problem of partial genotypes.

Francisco Guajardo - One of the best experts on this subject based on the ideXlab platform.

  • La Universidad de la Vida: a pedagogy built to last
    International Journal of Qualitative Studies in Education, 2016
    Co-Authors: Miguel A. Guajardo, Francisco Guajardo
    Abstract:

    This article weaves the life of a Mexican laborer, who with his wife brought his family to the United States and mentored two university professors, as they became activists in their craft. The professors honor their father through a reflective process in which they share and make sense of a series of stories that describe their Papi’s experience in La Universidad de la Vida. The narratives speak to the ontology of research and the utility of stories, particularly as stories can shape identity, capture critical life moments, and help us make meaning of lived experiences, a methodology not commonly explored in education research.
