Data Diagnostics

The Experts below are selected from a list of 80,856 Experts worldwide, ranked by the ideXlab platform.

Steven J. Miller - One of the best experts on this subject based on the ideXlab platform.

  • Data Diagnostics Using Second‐Order Tests of Benford's Law
    Auditing: A Journal of Practice & Theory, 2009
    Co-Authors: Mark J. Nigrini, Steven J. Miller
    Abstract:

    SUMMARY: Auditors are required to use analytical procedures to identify the existence of unusual transactions, events, and trends. Benford's Law gives the expected patterns of the digits in numerical Data, and has been advocated as a test for the authenticity and reliability of transaction level accounting Data. This paper describes a new second‐order test that calculates the digit frequencies of the differences between the ordered (ranked) values in a Data set. These digit frequencies approximate the frequencies of Benford's Law for most Data sets. The second‐order test is applied to four sets of transactional Data. The second‐order test detected errors in Data downloads, rounded Data, Data generated by statistical procedures, and the inaccurate ordering of Data. The test can be applied to any Data set and nonconformity usually signals an unusual issue related to Data integrity that might not have been easily detectable using traditional analytical procedures.
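    As an illustration (not code from the paper), here is a minimal Python sketch of the second-order procedure the abstract describes: sort the values, take the differences of consecutive ordered values, and compare the first-digit frequencies of those differences with Benford's Law. The sample data and the mean-absolute-deviation conformity measure are illustrative choices of this sketch, not necessarily those of the paper.

    ```python
    import math
    from collections import Counter

    # Benford's Law: expected frequency of first digit d is log10(1 + 1/d).
    BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

    def leading_digit(x):
        """First significant digit of a positive number."""
        return int(x / 10 ** math.floor(math.log10(x)))

    def second_order_test(values):
        """First-digit frequencies of the differences between the ordered
        values, compared with Benford's Law via mean absolute deviation."""
        ordered = sorted(values)
        diffs = [b - a for a, b in zip(ordered, ordered[1:]) if b > a]
        counts = Counter(leading_digit(d) for d in diffs)
        n = len(diffs)
        observed = {d: counts.get(d, 0) / n for d in range(1, 10)}
        mad = sum(abs(observed[d] - BENFORD[d]) for d in range(1, 10)) / 9
        return observed, mad

    # Hypothetical transaction amounts; on most real Data sets the digit
    # frequencies of the ordered differences should be close to BENFORD,
    # and a large deviation flags a possible Data-integrity issue.
    amounts = [103.25, 87.10, 954.00, 1200.50, 45.99, 310.00, 78.35, 660.20]
    freqs, mad = second_order_test(amounts)
    print(freqs, mad)
    ```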

Mark J. Nigrini - One of the best experts on this subject based on the ideXlab platform.

  • Data Diagnostics Using Second‐Order Tests of Benford's Law
    Auditing: A Journal of Practice & Theory, 2009
    Co-Authors: Mark J. Nigrini, Steven J. Miller
    Abstract:

    SUMMARY: Auditors are required to use analytical procedures to identify the existence of unusual transactions, events, and trends. Benford's Law gives the expected patterns of the digits in numerical Data, and has been advocated as a test for the authenticity and reliability of transaction level accounting Data. This paper describes a new second‐order test that calculates the digit frequencies of the differences between the ordered (ranked) values in a Data set. These digit frequencies approximate the frequencies of Benford's Law for most Data sets. The second‐order test is applied to four sets of transactional Data. The second‐order test detected errors in Data downloads, rounded Data, Data generated by statistical procedures, and the inaccurate ordering of Data. The test can be applied to any Data set and nonconformity usually signals an unusual issue related to Data integrity that might not have been easily detectable using traditional analytical procedures.
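    For comparison with the second-order sketch given under the previous entry, here is the classical first-order Benford test that the abstract builds on: compare the leading-digit frequencies of the raw values themselves against Benford's expected proportions. The chi-square cutoff quoted in the comment is the standard 5% critical value for 8 degrees of freedom, an assumption of this sketch rather than a recommendation from the paper.

    ```python
    import math
    from collections import Counter

    # Expected Benford proportions for first digits 1..9.
    EXPECTED = [math.log10(1 + 1 / d) for d in range(1, 10)]

    def leading_digit(x):
        """First significant digit of a nonzero number."""
        x = abs(x)
        return int(x / 10 ** math.floor(math.log10(x)))

    def benford_chi_square(values):
        """Chi-square statistic comparing observed first-digit counts of
        the raw values with Benford's Law (8 degrees of freedom)."""
        counts = Counter(leading_digit(v) for v in values if v != 0)
        n = sum(counts.values())
        return sum((counts.get(d, 0) - n * p) ** 2 / (n * p)
                   for d, p in zip(range(1, 10), EXPECTED))

    # On conforming accounting Data the statistic should stay small; values
    # well above the 5% critical value (15.51 at 8 d.f.) suggest the digits
    # do not follow Benford's Law.
    ```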

Eduard Nafria - One of the best experts on this subject based on the ideXlab platform.

  • Generalized Impurity Measures and Data Diagnostics in Decision Trees
    Visualization of Categorical Data, 2007
    Co-Authors: Tomàs Aluja-banet, Eduard Nafria
    Abstract:

    Publisher Summary: This chapter discusses generalized impurity measures and Data Diagnostics in decision trees. The chapter concentrates on classification trees and follows the Classification and Regression Trees (CART) methodology. Tree-based methods provide a simple rule for predicting a response variable from a set of predictors. The response variable can be either continuous or categorical, leading to what are called "regression trees" or "classification trees," respectively. The main advantage of tree-based classification is the simplicity of the results, which are presented visually in the form of a decision tree; the branching of the tree closely follows the human decision-making process. The chapter explains the general formulation of impurity, concepts related to impurity measures, and the contribution of each observation to the reduction of impurity.
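    As a concrete reference point for the impurity discussion (not taken from the chapter, which works with generalized, proximity-based measures), a minimal sketch of the classical CART Gini impurity and the impurity reduction of a split:

    ```python
    from collections import Counter

    def gini(labels):
        """Gini impurity of a node: 1 minus the sum of squared class proportions."""
        n = len(labels)
        return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

    def impurity_reduction(parent, left, right):
        """Reduction in impurity achieved by splitting `parent` into
        `left` and `right` (the quantity a CART-style split maximizes)."""
        n = len(parent)
        return (gini(parent)
                - len(left) / n * gini(left)
                - len(right) / n * gini(right))

    # Example: a split that perfectly separates two classes.
    parent = ['a'] * 5 + ['b'] * 5
    print(impurity_reduction(parent, ['a'] * 5, ['b'] * 5))  # 0.5, a pure split
    ```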

  • Robust impurity measures in decision trees
    Studies in Classification, Data Analysis, and Knowledge Organization, 1998
    Co-Authors: Tomàs Aluja-banet, Eduard Nafria
    Abstract:

    Tree-based methods are a statistical procedure for automatic learning from Data, their main characteristic being the simplicity of the results obtained. Their virtue is also their defect, since the tree-growing process is very dependent on the Data: small fluctuations in the Data may cause a big change in the tree-growing process. Our main objective was to define Data Diagnostics that prevent internal instability in the tree-growing process before a particular split is made. We present a general formulation for the impurity of a node, as a function of the proximity between the individuals in the node and its representative. We then compute a stability measure of a split, and hence can define more robust splits. We have also studied the theoretical complexity of this algorithm and its applicability to large Data sets.
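    The abstract does not spell out the stability measure itself, so the sketch below only illustrates the underlying idea under that caveat: perturb the Data slightly (here by bootstrap resampling, an assumption of this sketch) and measure how often the split chosen on the full Data is reproduced.

    ```python
    import random
    from collections import Counter

    def gini(labels):
        """Gini impurity: 1 minus the sum of squared class proportions."""
        n = len(labels)
        return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

    def best_split(xs, ys):
        """Threshold on a single numeric feature maximizing the CART-style
        reduction in Gini impurity; returns None if no split exists."""
        best_t, best_gain = None, -1.0
        for t in sorted(set(xs))[:-1]:
            left = [y for x, y in zip(xs, ys) if x <= t]
            right = [y for x, y in zip(xs, ys) if x > t]
            gain = (gini(ys) - len(left) / len(ys) * gini(left)
                             - len(right) / len(ys) * gini(right))
            if gain > best_gain:
                best_t, best_gain = t, gain
        return best_t

    def split_stability(xs, ys, trials=200, seed=0):
        """Fraction of bootstrap resamples reproducing the split chosen on
        the full Data: an illustrative score, not the paper's measure."""
        rng = random.Random(seed)
        reference = best_split(xs, ys)
        n = len(xs)
        hits = 0
        for _ in range(trials):
            sample = [rng.randrange(n) for _ in range(n)]
            resplit = best_split([xs[i] for i in sample], [ys[i] for i in sample])
            hits += (resplit == reference)
        return hits / trials
    ```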

  • Automatic Segmentation by Decision Trees
    COMPSTAT, 1996
    Co-Authors: Tomàs Aluja-banet, Eduard Nafria
    Abstract:

    We present a system for automatic segmentation by decision trees, able to cope with large Data sets, with special attention to stability problems. Tree-based methods are a statistical procedure for automatic learning from Data, whose main characteristic is the simplicity of the results obtained. They use a recursive algorithm that can be very costly for large Data sets, and they are very dependent on the Data, since small fluctuations in the Data may cause a big change in the tree-growing process. Our first purpose was to define Data Diagnostics that prevent internal instability in the tree-growing process before a particular split is made. We then study the complexity of the algorithm and its applicability to large Data sets.

Rami Tawil - One of the best experts on this subject based on the ideXlab platform.

  • Impacts of wireless sensor networks strategies and topologies on prognostics and health management
    Journal of Intelligent Manufacturing, 2019
    Co-Authors: Ahmad Farhat, Ali Jaber, Rami Tawil, Christophe Guyeux, Abdallah Makhoul, Abbas Hijazi
    Abstract:

    In this article, we used wireless sensor network (WSN) techniques to monitor an area under consideration, in order to diagnose its state in real time. What differentiates this type of network from traditional computer networks is that it is composed of a large number of sensor nodes with very limited and almost nonrenewable energy. A key issue in designing such networks is energy conservation, because once a sensor depletes its resources it is dropped from the network, leading to coverage holes and incomplete Data arriving at the sink. Preserving the energy held by the nodes, so that the network keeps running for as long as possible, is therefore a very important concern: if we manage to improve the network lifetime and quality of service (QoS), diagnosing the state of the area will be more accurate for a longer time. One of the most important elements in achieving QoS in a WSN is network coverage, which is usually interpreted as how well the network can observe a given area. Obviously, if the coverage decreases over time, the diagnosis quality decreases accordingly. Various coverage strategies have thus been proposed by the WSN community to guarantee a certain coverage rate for as long as possible, and hence a certain QoS, which in turn affects diagnosis and prognostic quality. Various other strategies, such as Data aggregation and scheduling, are in common use in WSNs to preserve QoS for as long as possible. We argue that such strategies are not neutral if the network is used for prognostics and health management: some policies may have a positive impact, while others, such as Data aggregation or redundancy suppression, may blur the sensed Data, leading to erroneous Diagnostics and/or prognostics. In this work, we show and measure the impact of each WSN strategy on the resulting Diagnostics. We emphasize several issues and study various parameters related to these strategies that have a very important impact on the network, and therefore on Data Diagnostics over time. To reach this goal, evaluating prognostics and health management under the various WSN strategies, we used six diagnostic algorithms.
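    As a toy illustration of the point that aggregation can blur sensed Data (the readings and alarm threshold are hypothetical; the paper's six diagnostic algorithms are not reproduced here): averaging the readings of neighboring nodes saves transmissions but can hide a local anomaly that a per-node threshold test would otherwise catch.

    ```python
    # Hypothetical readings from four neighboring nodes; node 2 sees a
    # local fault that pushes its reading past the alarm threshold.
    readings = [20.1, 19.8, 35.0, 20.3]
    THRESHOLD = 30.0  # illustrative alarm level

    # Per-node diagnosis: the anomaly at node 2 is detected.
    alarms_raw = [r > THRESHOLD for r in readings]  # [False, False, True, False]

    # In-network aggregation: one averaged value reaches the sink,
    # and the same threshold test no longer fires.
    aggregated = sum(readings) / len(readings)      # 23.8
    alarm_aggregated = aggregated > THRESHOLD       # False: the fault is blurred
    print(alarms_raw, aggregated, alarm_aggregated)
    ```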

  • On the coverage effects in wireless sensor networks based prognostic and health management
    International Journal of Sensor Networks (IJSN), 2018
    Co-Authors: Ahmad Farhat, Ali Jaber, Christophe Guyeux, Abdallah Makhoul, Rami Tawil
    Abstract:

    In this article, we used wireless sensor network (WSN) techniques to monitor an area under consideration, in order to diagnose its state. What differentiates this type of network from traditional computer networks is that it is composed of a large number of sensor nodes with very limited and almost nonrenewable energy. Maintaining the quality of service (QoS) of a wireless sensor network over a long period is therefore very important to ensure accurate Data, which in turn makes diagnosing the state of the area more accurate. One of the most important indexes of QoS in a WSN is its coverage, which is usually interpreted as how well the network can observe a given area. Many studies have addressed the problem of detecting and eliminating redundant sensors to improve energy efficiency while preserving the network’s coverage. In this article, however, we discuss the coverage problem in wireless sensor networks and its relation to prognostics and health management; the aim is to study the impact of coverage issues in the network on these processes. We emphasize several issues and study various parameters related to the coverage problem that have a very important impact on the network, and therefore on Data Diagnostics over time, such as scheduling mechanisms, sensor density, sensor deployment, and energy consumption. To reach this goal, evaluating prognostics and health management under the coverage issues in a WSN, we used four diagnostic algorithms.
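    A minimal sketch of the coverage-rate notion the abstract relies on, using a grid approximation (node positions, sensing radius, and area size are hypothetical assumptions of this sketch): the fraction of grid points of the monitored area lying within the sensing radius of at least one alive node, recomputed as nodes die.

    ```python
    import math

    def coverage_rate(nodes, radius, width, height, step=1.0):
        """Fraction of grid points of a width x height area lying within
        `radius` of at least one alive node (grid approximation)."""
        covered = total = 0
        y = 0.0
        while y <= height:
            x = 0.0
            while x <= width:
                total += 1
                if any(math.hypot(x - nx, y - ny) <= radius for nx, ny in nodes):
                    covered += 1
                x += step
            y += step
        return covered / total

    # Hypothetical deployment: coverage drops as nodes deplete their energy.
    alive = [(2, 2), (7, 3), (4, 8), (8, 8)]
    print(coverage_rate(alive, radius=3.0, width=10, height=10))
    print(coverage_rate(alive[:-1], radius=3.0, width=10, height=10))  # one node dead
    ```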

Ahmad Farhat - One of the best experts on this subject based on the ideXlab platform.

  • Impacts of wireless sensor networks strategies and topologies on prognostics and health management
    Journal of Intelligent Manufacturing, 2019
    Co-Authors: Ahmad Farhat, Ali Jaber, Rami Tawil, Christophe Guyeux, Abdallah Makhoul, Abbas Hijazi
    Abstract:

    In this article, we used wireless sensor network (WSN) techniques to monitor an area under consideration, in order to diagnose its state in real time. What differentiates this type of network from traditional computer networks is that it is composed of a large number of sensor nodes with very limited and almost nonrenewable energy. A key issue in designing such networks is energy conservation, because once a sensor depletes its resources it is dropped from the network, leading to coverage holes and incomplete Data arriving at the sink. Preserving the energy held by the nodes, so that the network keeps running for as long as possible, is therefore a very important concern: if we manage to improve the network lifetime and quality of service (QoS), diagnosing the state of the area will be more accurate for a longer time. One of the most important elements in achieving QoS in a WSN is network coverage, which is usually interpreted as how well the network can observe a given area. Obviously, if the coverage decreases over time, the diagnosis quality decreases accordingly. Various coverage strategies have thus been proposed by the WSN community to guarantee a certain coverage rate for as long as possible, and hence a certain QoS, which in turn affects diagnosis and prognostic quality. Various other strategies, such as Data aggregation and scheduling, are in common use in WSNs to preserve QoS for as long as possible. We argue that such strategies are not neutral if the network is used for prognostics and health management: some policies may have a positive impact, while others, such as Data aggregation or redundancy suppression, may blur the sensed Data, leading to erroneous Diagnostics and/or prognostics. In this work, we show and measure the impact of each WSN strategy on the resulting Diagnostics. We emphasize several issues and study various parameters related to these strategies that have a very important impact on the network, and therefore on Data Diagnostics over time. To reach this goal, evaluating prognostics and health management under the various WSN strategies, we used six diagnostic algorithms.

  • On the coverage effects in wireless sensor networks based prognostic and health management
    International Journal of Sensor Networks (IJSN), 2018
    Co-Authors: Ahmad Farhat, Ali Jaber, Christophe Guyeux, Abdallah Makhoul, Rami Tawil
    Abstract:

    In this article, we used wireless sensor network (WSN) techniques to monitor an area under consideration, in order to diagnose its state. What differentiates this type of network from traditional computer networks is that it is composed of a large number of sensor nodes with very limited and almost nonrenewable energy. Maintaining the quality of service (QoS) of a wireless sensor network over a long period is therefore very important to ensure accurate Data, which in turn makes diagnosing the state of the area more accurate. One of the most important indexes of QoS in a WSN is its coverage, which is usually interpreted as how well the network can observe a given area. Many studies have addressed the problem of detecting and eliminating redundant sensors to improve energy efficiency while preserving the network’s coverage. In this article, however, we discuss the coverage problem in wireless sensor networks and its relation to prognostics and health management; the aim is to study the impact of coverage issues in the network on these processes. We emphasize several issues and study various parameters related to the coverage problem that have a very important impact on the network, and therefore on Data Diagnostics over time, such as scheduling mechanisms, sensor density, sensor deployment, and energy consumption. To reach this goal, evaluating prognostics and health management under the coverage issues in a WSN, we used four diagnostic algorithms.