Process Data

The experts below are selected from a list of 360 experts worldwide, ranked by the ideXlab platform.

Zhiliang Ying - One of the best experts on this subject based on the ideXlab platform.

  • An exploratory analysis of the latent structure of process data via action sequence autoencoders
    British Journal of Mathematical and Statistical Psychology, 2021
    Co-Authors: Xueying Tang, Zhi Wang, Jingchen Liu, Zhiliang Ying
    Abstract:

    Computer simulations have become a popular tool for assessing complex skills such as problem-solving. Log files of computer-based items record the human-computer interactive processes for each respondent in full. The response processes are very diverse, noisy, and of non-standard formats. Few generic methods have been developed to exploit the information contained in process data. In this paper, we propose a method to extract latent variables from process data. The method utilizes a sequence-to-sequence autoencoder to compress response processes into standard numerical vectors. It does not require prior knowledge of the specific items and human-computer interaction patterns. The proposed method is applied to both simulated and real process data to demonstrate that the resulting latent variables extract useful information from the response processes.
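    To make the approach concrete, here is a minimal sketch of a sequence-to-sequence (GRU) autoencoder that compresses integer-coded action sequences into fixed-length latent vectors. The layer sizes, vocabulary size, and training step are illustrative assumptions, not the authors' implementation.

    ```python
    # Minimal sketch (not the authors' code): a GRU sequence autoencoder whose encoder
    # state serves as the fixed-length latent representation of an action sequence.
    import torch
    import torch.nn as nn

    class SeqAutoencoder(nn.Module):
        def __init__(self, n_actions, emb_dim=32, latent_dim=16):
            super().__init__()
            self.embed = nn.Embedding(n_actions, emb_dim)
            self.encoder = nn.GRU(emb_dim, latent_dim, batch_first=True)
            self.decoder = nn.GRU(emb_dim, latent_dim, batch_first=True)
            self.out = nn.Linear(latent_dim, n_actions)

        def forward(self, seq):                           # seq: (batch, T) integer action IDs
            emb = self.embed(seq)
            _, h = self.encoder(emb)                      # h: (1, batch, latent_dim)
            dec_in = torch.cat([torch.zeros_like(emb[:, :1]), emb[:, :-1]], dim=1)  # shifted input
            dec_out, _ = self.decoder(dec_in, h)          # reconstruct the sequence from the latent state
            return self.out(dec_out), h.squeeze(0)        # reconstruction logits, latent vectors

    model = SeqAutoencoder(n_actions=50)
    seq = torch.randint(0, 50, (8, 20))                   # toy batch: 8 respondents, 20 actions each
    logits, latent = model(seq)                           # latent: (8, 16) numerical feature vectors
    loss = nn.CrossEntropyLoss()(logits.reshape(-1, 50), seq.reshape(-1))
    loss.backward()                                       # an optimizer.step() would complete one training step
    ```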

  • An exploratory analysis of the latent structure of process data via action sequence autoencoder
    arXiv: Machine Learning, 2019
    Co-Authors: Xueying Tang, Zhi Wang, Jingchen Liu, Zhiliang Ying
    Abstract:

    Computer simulations have become a popular tool for assessing complex skills such as problem-solving. Log files of computer-based items record the entire human-computer interactive processes for each respondent. The response processes are very diverse, noisy, and of nonstandard formats. Few generic methods have been developed for exploiting the information contained in process data. In this article, we propose a method to extract latent variables from process data. The method utilizes a sequence-to-sequence autoencoder to compress response processes into standard numerical vectors. It does not require prior knowledge of the specific items and human-computer interaction patterns. The proposed method is applied to both simulated and real process data to demonstrate that the resulting latent variables extract useful information from the response processes.

  • Statistical analysis of complex problem-solving process data: an event history analysis approach
    Frontiers in Psychology, 2019
    Co-Authors: Yunxiao Chen, Xiaoou Li, Zhiliang Ying
    Abstract:

    Complex problem-solving (CPS) ability has been recognized as a central 21st-century skill. Individuals' processes of solving crucial complex problems may contain substantial information about their CPS ability. In this paper, we consider the prediction of the duration and the final outcome (i.e., success/failure) of solving a complex problem during the task completion process, making use of process data recorded in computer log files. Solving this problem may help answer questions such as "How much information about an individual's CPS ability is contained in the process data?", "What CPS patterns yield a higher chance of success?", and "What CPS patterns predict the remaining time for task completion?". We propose an event history analysis model for this prediction problem. The trained prediction model may provide a better understanding of individuals' problem-solving patterns, which may eventually lead to better design of automated interventions (e.g., providing hints) for the training of CPS ability. A real data example from the 2012 Programme for International Student Assessment (PISA) is provided for illustration.
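    The paper's specific event history model is not reproduced here; as a rough stand-in, the sketch below relates hypothetical features extracted from log files to time-to-completion and success using a Cox proportional hazards model. The feature names, simulated data, and the lifelines dependency are all assumptions.

    ```python
    # Hedged stand-in (not the paper's exact model): survival-style prediction of task
    # duration and success from simple process features extracted from log files.
    import numpy as np
    import pandas as pd
    from lifelines import CoxPHFitter

    rng = np.random.default_rng(0)
    n = 200
    n_actions = rng.poisson(20, n)                        # number of logged actions per respondent
    n_resets = rng.poisson(1, n)                          # times the respondent restarted the item
    duration_s = rng.exponential(120, n) + 10 * n_resets  # seconds until the attempt ended
    success = rng.binomial(1, 0.6, n)                     # 1 = solved (event), 0 = not solved (censored)

    df = pd.DataFrame({"n_actions": n_actions, "n_resets": n_resets,
                       "duration_s": duration_s, "success": success})
    cph = CoxPHFitter()
    cph.fit(df, duration_col="duration_s", event_col="success")
    cph.print_summary()                                   # hazard ratios for each process feature
    print(cph.predict_partial_hazard(df.head(3)))         # relative "speed to success" for new respondents
    ```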

John F MacGregor - One of the best experts on this subject based on the ideXlab platform.

  • Comparing alternative approaches for multivariate statistical analysis of batch process data
    Journal of Chemometrics, 1999
    Co-Authors: Johan A. Westerhuis, Theodora Kourti, John F MacGregor
    Abstract:

    Batch process data can be arranged in a three-way matrix (batch × variable × time). This paper provides a critical discussion of various aspects of the treatment of these multiway data. First, several methods that have been proposed for decomposing three-way data matrices are discussed in the context of batch process data analysis and monitoring. These methods are multiway principal component analysis (MPCA, also called Tucker1), parallel factor analysis (PARAFAC), and Tucker3. Secondly, different ways of unfolding, mean-centering, and scaling the three-way matrix are compared and discussed with respect to their effects on the analysis of batch data. Finally, the role of the time variable in batch process data is considered, and methods suggested to predict the per cent completion of batch runs with unequal duration are discussed. Copyright © 1999 John Wiley & Sons, Ltd.
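    To make batch-wise unfolding concrete, the sketch below rearranges a toy I × J × K array into an I × JK matrix, mean-centers and scales each column, and applies PCA via the SVD (i.e., MPCA/Tucker1). The dimensions and random data are purely illustrative.

    ```python
    # Illustrative sketch: batch-wise unfolding of an I x J x K array followed by PCA (MPCA/Tucker1).
    import numpy as np

    rng = np.random.default_rng(0)
    I, J, K = 30, 5, 50                      # batches x variables x time points
    X = rng.standard_normal((I, J, K))       # toy batch process data

    Xu = X.reshape(I, J * K)                 # batch-wise unfolding: each row concatenates the
                                             # K-point trajectory of every variable for one batch
    Xu = Xu - Xu.mean(axis=0)                # column mean-centering removes the mean trajectory
    Xu = Xu / Xu.std(axis=0, ddof=1)         # unit-variance scaling of each variable-time column

    U, s, Vt = np.linalg.svd(Xu, full_matrices=False)
    R = 3                                    # number of principal components retained
    scores = U[:, :R] * s[:R]                # batch scores (I x R)
    loadings = Vt[:R].T                      # loadings (JK x R), reshapeable back to J x K per component
    explained = s[:R] ** 2 / np.sum(s ** 2)  # fraction of variance captured by each component
    print(scores.shape, loadings.shape, explained)
    ```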

  • Product design through multivariate statistical analysis of process data
    Aiche Journal, 1998
    Co-Authors: Christiane M Jaeckle, John F MacGregor
    Abstract:

    A methodology is developed for finding a window of operating conditions within which one should be able to produce a product having a specified set of quality characteristics. The only information assumed to be available is that contained within historical data on the process obtained during the production of a range of existing product grades. Multivariate statistical methods are used to build and to invert either linear or nonlinear empirical latent variable models of the existing plant operations to obtain a window of operating conditions that are capable of yielding the desired product and that are still consistent with past operating procedures and constraints. The methods and concepts are illustrated using a simulated high-pressure tubular reactor process for producing low-density polyethylene.
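    The linear case can be sketched as follows: fit a PLS model from historical operating conditions X to quality variables Y, then invert it through the latent space to propose operating conditions for a desired quality target. The data, dimensions, and use of scikit-learn's PLSRegression are assumptions, and the paper's null-space and window analysis is not reproduced.

    ```python
    # Hedged sketch of latent variable model inversion: desired quality -> scores -> conditions.
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression

    rng = np.random.default_rng(1)
    X = rng.standard_normal((100, 6))                          # historical operating conditions
    Y = X @ rng.standard_normal((6, 2)) + 0.1 * rng.standard_normal((100, 2))  # quality variables

    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    pls = PLSRegression(n_components=3, scale=False).fit(X - x_mean, Y - y_mean)

    y_target = np.array([[1.0, -0.5]])                         # desired product quality (hypothetical)
    t_des = (y_target - y_mean) @ np.linalg.pinv(pls.y_loadings_.T)  # scores that reproduce the target
    x_des = t_des @ pls.x_loadings_.T + x_mean                 # candidate operating conditions
    print(x_des)

    # Sanity check: running the candidate conditions back through the model should
    # land (approximately) on the desired quality target.
    print(pls.predict(x_des - x_mean) + y_mean)
    ```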

  • Product design through multivariate statistical analysis of process data
    Computers & Chemical Engineering, 1996
    Co-Authors: Christiane M Jaeckle, John F MacGregor
    Abstract:

    A methodology is developed for finding a window of operating conditions within which one should be able to produce a product having a specified set of quality characteristics. The only information that is assumed to be available is that contained within historical data on the process obtained during the production of a range of existing product grades. Multivariate statistical methods are used to build and to invert an empirical model of the existing plant operations to obtain a window of operating conditions that are capable of yielding the desired product, and that are still consistent with past operating procedures and constraints. The methods and concepts are illustrated using a simulated high-pressure tubular reactor process for producing low-density polyethylene.

Zhiqiang Ge - One of the best experts on this subject based on the ideXlab platform.

  • Distributed parallel deep learning of hierarchical extreme learning machine for multimode quality prediction with big process data
    Engineering Applications of Artificial Intelligence, 2019
    Co-Authors: Zhiqiang Ge
    Abstract:

    In this work, the distributed and parallel extreme learning machine (dp-ELM) and hierarchical extreme learning machine (dp-HELM) are proposed for multimode process quality prediction with big data. The efficient ELM algorithm is transformed into a distributed and parallel modeling form according to the MapReduce framework. Since the deep network structure of HELM is more accurate than the single layer of ELM in feature representation, dp-HELM is further developed by decomposing the ELM-based autoencoders (ELM-AE) of the deep hidden layers into a loop of MapReduce jobs. Additionally, the multimode issue is handled through a "divide and rule" strategy: distributed and parallel K-means (dp-K-means) is used to divide the process modes, which are then trained in a synchronous parallel way by dp-ELM and dp-HELM. Finally, the Bayesian model fusion technique is used to integrate the local models for online prediction. The proposed algorithms are deployed on a Hadoop MapReduce computing cluster, and their feasibility and efficiency are illustrated by building a real industrial quality prediction model with big process data.
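    The building block is simple enough to sketch: an extreme learning machine uses a random, untrained hidden layer and solves its output weights in closed form. The single-machine sketch below shows only this block; the MapReduce distribution, HELM stacking of ELM autoencoders, dp-K-means mode partitioning, and Bayesian model fusion described above are not reproduced, and the data are simulated.

    ```python
    # Minimal ELM sketch (single machine, single mode): randomized hidden layer plus
    # a ridge-regularized least-squares solve for the output weights.
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.standard_normal((500, 10))                 # process variables
    y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + 0.05 * rng.standard_normal(500)  # quality variable

    L, lam = 100, 1e-2                                 # hidden nodes, ridge penalty
    W = rng.standard_normal((10, L))                   # random input weights (never trained)
    b = rng.standard_normal(L)
    H = np.tanh(X @ W + b)                             # hidden-layer output matrix

    # Output weights: beta = (H'H + lam*I)^-1 H'y  -- the only "learning" step in ELM.
    beta = np.linalg.solve(H.T @ H + lam * np.eye(L), H.T @ y)

    X_new = rng.standard_normal((5, 10))
    y_hat = np.tanh(X_new @ W + b) @ beta              # prediction for new process data
    print(y_hat)
    ```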

  • Process data analytics via probabilistic latent variable models: a tutorial review
    Industrial & Engineering Chemistry Research, 2018
    Co-Authors: Zhiqiang Ge
    Abstract:

    Dimensionality reduction is important given the high-dimensional nature of data in the process industry, which has made latent variable modeling methods popular in recent years. By projecting high-dimensional data into a lower-dimensional space, latent variable models are able to extract key information from process data while simultaneously improving the efficiency of data analytics. From a probabilistic viewpoint, this paper carries out a tutorial review of probabilistic latent variable models for process data analytics. Detailed illustrations of different kinds of basic probabilistic latent variable models (PLVMs) are provided, along with the status of research on each. Additionally, further counterparts of these basic PLVMs are introduced and discussed for process data analytics. Several perspectives are highlighted for future research on this topic.
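    As a small concrete example of the basic PLVM family the review covers, the sketch below fits probabilistic PCA and factor analysis to toy correlated "process" data with scikit-learn; each returns a low-dimensional latent representation together with an explicit noise model and likelihood, which is what separates them from a plain deterministic projection. The data and dimensions are illustrative.

    ```python
    # Toy example of two basic probabilistic latent variable models on correlated data.
    import numpy as np
    from sklearn.decomposition import PCA, FactorAnalysis

    rng = np.random.default_rng(0)
    Z = rng.standard_normal((300, 3))                    # 3 latent driving forces
    W = rng.standard_normal((3, 20))
    X = Z @ W + 0.2 * rng.standard_normal((300, 20))     # 20 correlated measured variables

    ppca = PCA(n_components=3)                           # sklearn PCA exposes the probabilistic-PCA
    T_ppca = ppca.fit_transform(X)                       # likelihood and isotropic noise estimate
    fa = FactorAnalysis(n_components=3)                  # factor analysis: per-variable noise
    T_fa = fa.fit_transform(X)

    print(ppca.noise_variance_)                          # single residual variance (PPCA view)
    print(fa.noise_variance_.round(3))                   # one residual variance per variable (FA)
    print(ppca.score(X), fa.score(X))                    # average log-likelihood under each model
    ```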

  • Deep learning of semisupervised process data with hierarchical extreme learning machine and soft sensor application
    IEEE Transactions on Industrial Electronics, 2018
    Co-Authors: Zhiqiang Ge
    Abstract:

    Data-driven soft sensors have been widely used in industrial processes to estimate critical quality variables that are difficult to measure online directly through physical devices. Due to the low sampling rate of quality variables, most soft sensors are developed on a small number of labeled samples, while the large amount of unlabeled process data is discarded. This loss of information greatly limits the improvement of quality prediction accuracy. A central issue for data-driven soft sensors is therefore to fully exploit the information contained in all available process data. This paper proposes a semisupervised deep learning model for soft sensor development based on the hierarchical extreme learning machine (HELM). First, a deep network of autoencoders is used for unsupervised feature extraction from all process samples. Then, an extreme learning machine is used for regression by appending the quality variable. Meanwhile, the manifold regularization method is introduced for semisupervised model training. The new method can not only extract deep information contained in the data, but also learn from the extra unlabeled samples. The proposed semisupervised HELM method is applied to a high-low transformer to estimate the carbon monoxide content, showing a significant improvement in prediction accuracy compared to traditional methods.
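    The two-stage structure (unsupervised feature learning on all samples, then supervised regression on the few labeled ones) can be sketched as below, with a single ELM-style autoencoder and ridge regression as stand-ins; the hierarchical stacking and manifold regularization term of the paper are omitted, and reusing the learned decoder weights (transposed) as the encoder is one common ELM-AE convention assumed here.

    ```python
    # Hedged sketch of the two-stage idea: an ELM-style autoencoder learns features from
    # ALL process samples, then ridge regression maps features to quality on the labeled few.
    import numpy as np

    rng = np.random.default_rng(0)
    n_all, n_labeled, d = 2000, 50, 12
    X = rng.standard_normal((n_all, d))                        # all process samples (mostly unlabeled)
    y = X[:, 0] ** 2 + X[:, 1] + 0.1 * rng.standard_normal(n_all)
    idx = rng.choice(n_all, n_labeled, replace=False)          # the rare labeled (quality) samples

    # Stage 1: ELM autoencoder on all samples -- random hidden layer, closed-form decoder.
    L, lam = 60, 1e-2
    A, b = rng.standard_normal((d, L)), rng.standard_normal(L)
    H = np.tanh(X @ A + b)
    beta = np.linalg.solve(H.T @ H + lam * np.eye(L), H.T @ X)   # decoder: hidden -> input
    F = np.tanh(X @ beta.T)                                      # encoded features for every sample

    # Stage 2: supervised ridge regression using only the labeled subset of features.
    w = np.linalg.solve(F[idx].T @ F[idx] + lam * np.eye(L), F[idx].T @ y[idx])
    y_hat = F @ w                                                # soft-sensor prediction for all samples
    print(np.sqrt(np.mean((y_hat - y) ** 2)))                    # rough in-sample RMSE
    ```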

Leo H Chiang - One of the best experts on this subject based on the ideXlab platform.

  • Advances and opportunities in machine learning for process data analytics
    Computers & Chemical Engineering, 2019
    Co-Authors: S Joe Qin, Leo H Chiang
    Abstract:

    In this paper we introduce the current thrust of development in machine learning and artificial intelligence, fueled by advances in statistical learning theory over the last 20 years and by the commercial successes of leading big data companies. We then discuss the characteristics of process manufacturing systems and briefly review data analytics research and development over the last three decades. We give three attributes that process data analytics should possess for machine learning techniques to be applicable in the process industries. Next, we provide a perspective on currently active topics in machine learning that could be opportunities for process data analytics research and development. Finally, we address the importance of a data analytics culture. The issues discussed range from technology development to workforce education, and from government initiatives to curriculum enhancement.

  • Industrial experiences with multivariate statistical analysis of batch process data
    Chemometrics and Intelligent Laboratory Systems, 2006
    Co-Authors: Leo H Chiang, Riccardo Leardi, Randy J Pell, Mary Beth Seasholtz
    Abstract:

    The data collected from a batch process over time from multiple sensors can be arranged in a matrix of J variables × K time points. Data collected on multiple batches can be arranged in a cube of I batches × J variables × K time points. The analysis of a cube of data can be performed by unfolding it in two different ways, batch unfolding giving an I × JK data matrix or observation unfolding giving an IK × J data matrix, followed by PCA. The data can also be analyzed directly using three-way methods such as PARAFAC or Tucker3. In the literature there is no clear agreement as to the most effective approach for the analysis of batch data. This paper makes detailed comparisons between the two unfolding approaches and the Tucker3 method. Batch data from a fermentation process at The Dow Chemical Company's San Diego facility are used for this study. The three methods were found to be complementary, and a well-trained chemometrician/practitioner will find all three useful for batch data analysis. The batch unfolding MPCA is more sensitive to overall batch variation, while the observation unfolding MPLS is more sensitive to localized batch variation. The Tucker3 method strikes a good balance in detecting both types of variation.

Anne Buu - One of the best experts on this subject based on the ideXlab platform.

  • Statistical methods for evaluating the correlation between timeline follow-back data and daily process data with applications to research on alcohol and marijuana use
    Addictive Behaviors, 2019
    Co-Authors: Wanjun Liu, Marc A Zimmerman, Maureen A Walton, Rebecca M Cunningham, Anne Buu
    Abstract:

    Background: Retrospective timeline follow-back (TLFB) data and prospective daily process data have frequently been collected in addiction research to characterize behavioral patterns. Although previous validity studies have demonstrated high correlations between these two types of data, the conventional method adopted in those studies was based on summary measures, which may lose critical information, and on Pearson's correlation coefficient, which has an undesirable property. This study proposes the functional concordance correlation coefficient to address these issues. Methods: We use real data collected from a randomized experiment to demonstrate applications of the proposed method and compare its analytical results with those of the conventional method. We also conduct a simulation study based on the real data to evaluate the level of overestimation associated with the conventional method. Results: The real-data example indicates that the correlation between the two types of data varies across substances (alcohol vs. marijuana) and assessment schedules (daily vs. weekly). Additionally, the correlations estimated by the conventional method tend to be higher than those estimated by the proposed method. The simulation results further show that the overestimation associated with the conventional method is greatest when the true correlation is medium. Conclusions: The findings of the real-data example imply that daily assessments are particularly beneficial for characterizing more variable behaviors such as alcohol use, whereas weekly assessments may be sufficient for low-variation behaviors such as marijuana use. The proposed method is a better approach for evaluating the validity of TLFB data.
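    For contrast with the conventional Pearson-on-summaries approach, the sketch below compares Pearson's r with Lin's concordance correlation coefficient (CCC), which penalizes location and scale shifts as well as scatter, on hypothetical per-person summary scores. The paper's functional CCC, defined on whole trajectories rather than summaries, is not implemented here, and the simulated counts are assumptions.

    ```python
    # Sketch: Pearson's r vs. Lin's CCC on hypothetical per-person summaries
    # (e.g. 30-day drinking counts from TLFB vs. aggregated daily reports).
    import numpy as np

    def lins_ccc(x, y):
        """Lin's CCC: 2*cov(x,y) / (var(x) + var(y) + (mean(x) - mean(y))**2)."""
        mx, my = x.mean(), y.mean()
        vx, vy = x.var(), y.var()
        cov = ((x - mx) * (y - my)).mean()
        return 2 * cov / (vx + vy + (mx - my) ** 2)

    rng = np.random.default_rng(0)
    tlfb = rng.poisson(10, 80).astype(float)             # hypothetical TLFB counts for 80 people
    daily = 0.8 * tlfb + 2 + rng.normal(0, 1.5, 80)      # daily-process counts: correlated but shifted

    pearson = np.corrcoef(tlfb, daily)[0, 1]
    print(f"Pearson r = {pearson:.3f}, Lin's CCC = {lins_ccc(tlfb, daily):.3f}")
    # Pearson ignores the systematic shift/scaling, so it overstates agreement relative to CCC.
    ```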

  • Two-stage model for time-varying effects of discrete longitudinal covariates with applications in analysis of daily process data
    Statistics in Medicine, 2015
    Co-Authors: Hanyu Yang, James A Cranford, Anne Buu
    Abstract:

    This study proposes a generalized time-varying effect model that can be used to characterize a discrete longitudinal covariate process and its time-varying effect on a later outcome that may be discrete. The proposed method can be applied to examine two important research questions for daily process data: measurement reactivity and predictive validity. We demonstrate these applications using health risk behavior data collected from alcoholic couples through an interactive voice response (IVR) system. The statistical analysis results show that the effect of measurement reactivity may only be evident in the first week of IVR assessment. Moreover, the level of urge to drink before measurement reactivity takes effect may be more predictive of a later depression outcome. Our simulation study shows that the performance of the proposed method improves with larger sample sizes, more time points, and smaller proportions of zeros in the binary longitudinal covariate.
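    One simple way to realize a time-varying effect is to expand study time in a spline basis and interact it with the longitudinal covariate inside a logistic regression, so the covariate's coefficient becomes a smooth function of time. The sketch below does this with scikit-learn on simulated daily data; it is only a rough stand-in for the generalized two-stage model proposed in the paper, and all variable names and data are assumptions.

    ```python
    # Rough stand-in for a time-varying effect: the coefficient of a binary daily covariate
    # (e.g. "urge to drink") on a binary outcome is modeled as a spline function of study day.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.preprocessing import SplineTransformer

    rng = np.random.default_rng(0)
    n = 3000
    day = rng.integers(0, 90, n).astype(float)            # study day for each daily report
    urge = rng.binomial(1, 0.4, n)                        # binary daily covariate
    true_beta = 1.5 * np.exp(-day / 30)                   # effect that decays over time
    p = 1 / (1 + np.exp(-(-0.5 + true_beta * urge)))
    y = rng.binomial(1, p)                                # binary outcome (e.g. heavy-drinking day)

    spline = SplineTransformer(n_knots=6, degree=3, include_bias=True)
    B = spline.fit_transform(day[:, None])                # spline basis over time
    design = np.hstack([B, B * urge[:, None]])            # baseline(t) + beta(t) * urge
    clf = LogisticRegression(C=1e6, fit_intercept=False, max_iter=2000).fit(design, y)

    k = B.shape[1]
    beta_t = B @ clf.coef_[0, k:]                         # estimated time-varying effect beta(day)
    print(np.round(beta_t[np.argsort(day)][::500], 2))    # should roughly decay, tracking 1.5*exp(-day/30)
    ```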