Preprocessing

14,000,000 Leading Edge Experts on the ideXlab platform

The experts below are selected from a list of 104,655 experts worldwide, ranked by the ideXlab platform

Bruno Clerckx - One of the best experts on this subject based on the ideXlab platform.

  • Multi-user linear precoding for multi-polarized massive MIMO systems under imperfect CSIT
    IEEE Transactions on Wireless Communications, 2015
    Co-Authors: Jaehyun Park, Bruno Clerckx
    Abstract:

    Space limitations and channel acquisition prevent massive MIMO from being easily deployed in a practical setup. Motivated by current deployments of LTE-Advanced, the use of multi-polarized antenna elements can be an efficient solution to the space constraint. Furthermore, dual-structured precoding, in which a preprocessing stage based on the spatial correlation is concatenated with a subsequent linear precoder based on short-term channel state information at the transmitter (CSIT), can efficiently reduce the feedback overhead. By grouping and preprocessing spatially correlated mobile stations (MSs), both the dimension of the precoding signal space and the corresponding short-term CSIT dimension are reduced. In this paper, to reduce the feedback overhead further, we propose a dual-structured multi-user linear precoding scheme in which a subgrouping method based on co-polarization is additionally applied to the spatially grouped MSs in the preprocessing stage. Furthermore, under imperfect CSIT, the proposed scheme is analyzed asymptotically using random matrix theory. By investigating the behavior of the asymptotic performance, we also propose a new dual-structured precoding scheme in which the precoding mode is switched between two dual-structured strategies: 1) preprocessing based only on the spatial correlation, and 2) preprocessing based on both the spatial correlation and the polarization. Finally, we extend the scheme to 3D dual-structured precoding.
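
    The dual-structured idea (a long-term projection derived from spatial correlation, concatenated with a short-term precoder on the reduced channel) can be sketched numerically. This is a toy illustration with made-up dimensions and a zero-forcing short-term stage, not the paper's exact scheme:

```python
import numpy as np

rng = np.random.default_rng(1)
M, r, K = 64, 8, 4   # BS antennas, reduced dimension, users (made up)

# Long-term stage: project onto the r dominant eigenvectors of a
# (toy) transmit spatial correlation matrix shared by a user group.
A = rng.standard_normal((M, M))
R = A @ A.T / M                      # symmetric PSD stand-in for correlation
eigval, eigvec = np.linalg.eigh(R)   # eigenvalues in ascending order
B = eigvec[:, -r:]                   # M x r long-term preprocessor

# Short-term stage: zero-forcing on the reduced r-dim effective channel,
# so each user feeds back r coefficients instead of M.
H_eff = rng.standard_normal((K, r)) + 1j * rng.standard_normal((K, r))
P = H_eff.conj().T @ np.linalg.inv(H_eff @ H_eff.conj().T)  # r x K ZF
W = B @ P                            # full M x K dual-structured precoder
```

    The zero-forcing stage inverts the effective channel, so `H_eff @ P` is (numerically) the identity; only the r-dimensional short-term CSIT is needed to compute it.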

  • Multi-user linear precoding for multi-polarized massive MIMO systems under imperfect CSIT
    arXiv: Information Theory, 2014
    Co-Authors: Jaehyun Park, Bruno Clerckx
    Abstract:

    Space limitations and channel acquisition prevent massive MIMO from being easily deployed in a practical setup. Motivated by current deployments of LTE-Advanced, the use of multi-polarized antennas can be an efficient solution to the space constraint. Furthermore, dual-structured precoding, in which a preprocessing stage based on the spatial correlation is concatenated with a subsequent linear precoder based on short-term channel state information at the transmitter (CSIT), can efficiently reduce the feedback overhead. By grouping and preprocessing spatially correlated mobile stations (MSs), both the dimension of the precoding signal space and the corresponding short-term CSIT dimension are reduced. In this paper, to reduce the feedback overhead further, we propose a dual-structured multi-user linear precoding scheme in which a subgrouping method based on co-polarization is additionally applied to the spatially grouped MSs in the preprocessing stage. Furthermore, under imperfect CSIT, the proposed scheme is analyzed asymptotically using random matrix theory. By investigating the behavior of the asymptotic performance, we also propose a new dual-structured precoding scheme in which the precoding mode is switched between two dual-structured strategies: 1) preprocessing based only on the spatial correlation, and 2) preprocessing based on both the spatial correlation and the polarization. Finally, we extend the scheme to 3D dual-structured precoding.

Alireza Atri - One of the best experts on this subject based on the ideXlab platform.

  • Inter-rater reliability of preprocessing EEG data: impact of subjective artifact removal on associative memory task ERP results
    Frontiers in Neuroscience, 2017
    Co-Authors: Steven D. Shirk, Donald G. Mclaren, Jessica S. Bloomfield, Alex Powers, Alec Duffy, Meghan B. Mitchell, Ali Ezzati, Brandon A. Ally, Alireza Atri
    Abstract:

    The processing of EEG data routinely involves subjective removal of artifacts during a preprocessing stage. Preprocessing inter-rater reliability (IRR), and how differences in preprocessing may affect the outcomes of primary event-related potential (ERP) analyses, has not previously been assessed. Three raters independently preprocessed EEG data from 16 cognitively healthy adult participants (ages 18-39 years) who performed a memory task. Using intraclass correlations (ICCs), IRR was assessed for Early-frontal, Late-frontal, and Parietal Old/new memory-effect contrasts across eight regions of interest (ROIs). IRR was good to excellent for all ROIs; 22 of 26 ICCs were above 0.80. Raters were highly consistent in preprocessing across ROIs, although the frontal pole ROI (ICC range 0.60-0.90) showed less consistency. Old/new parietal effects had the highest ICCs with the lowest variability. Rater preprocessing differences did not alter primary ERP results. IRR for EEG preprocessing was good to excellent, and subjective rater removal of EEG artifacts did not alter primary memory-task ERP results. These findings provide preliminary support for the robustness of cognitive/memory task-related ERP results against inter-rater preprocessing variability, and suggest that EEG can reliably assess cognitive-neurophysiological processes when multiple preprocessors are involved.
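
    The intraclass correlation behind such inter-rater comparisons can be computed directly from a subjects-by-raters matrix. A minimal sketch of the two-way random-effects, single-rater form ICC(2,1), with made-up data (the abstract does not state which ICC variant was used, so this is an assumption):

```python
import numpy as np

def icc_2_1(X):
    """ICC(2,1): two-way random-effects, absolute-agreement,
    single-rater intraclass correlation.
    X: (n_subjects, k_raters) matrix of ratings."""
    n, k = X.shape
    grand = X.mean()
    ss_total = ((X - grand) ** 2).sum()
    ss_subj = k * ((X.mean(axis=1) - grand) ** 2).sum()
    ss_rater = n * ((X.mean(axis=0) - grand) ** 2).sum()
    ss_err = ss_total - ss_subj - ss_rater
    msr = ss_subj / (n - 1)                 # between-subject mean square
    msc = ss_rater / (k - 1)                # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))      # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Three perfectly agreeing "raters" give ICC = 1; adding rater-specific
# noise pulls the estimate down.
rng = np.random.default_rng(0)
scores = rng.normal(size=(16, 1))           # 16 subjects, as in the study
perfect = np.tile(scores, (1, 3))
noisy = perfect + rng.normal(scale=0.5, size=perfect.shape)
```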

Jaehyun Park - One of the best experts on this subject based on the ideXlab platform.

  • Multi-user linear precoding for multi-polarized massive MIMO systems under imperfect CSIT
    IEEE Transactions on Wireless Communications, 2015
    Co-Authors: Jaehyun Park, Bruno Clerckx
    Abstract:

    Space limitations and channel acquisition prevent massive MIMO from being easily deployed in a practical setup. Motivated by current deployments of LTE-Advanced, the use of multi-polarized antenna elements can be an efficient solution to the space constraint. Furthermore, dual-structured precoding, in which a preprocessing stage based on the spatial correlation is concatenated with a subsequent linear precoder based on short-term channel state information at the transmitter (CSIT), can efficiently reduce the feedback overhead. By grouping and preprocessing spatially correlated mobile stations (MSs), both the dimension of the precoding signal space and the corresponding short-term CSIT dimension are reduced. In this paper, to reduce the feedback overhead further, we propose a dual-structured multi-user linear precoding scheme in which a subgrouping method based on co-polarization is additionally applied to the spatially grouped MSs in the preprocessing stage. Furthermore, under imperfect CSIT, the proposed scheme is analyzed asymptotically using random matrix theory. By investigating the behavior of the asymptotic performance, we also propose a new dual-structured precoding scheme in which the precoding mode is switched between two dual-structured strategies: 1) preprocessing based only on the spatial correlation, and 2) preprocessing based on both the spatial correlation and the polarization. Finally, we extend the scheme to 3D dual-structured precoding.

  • Multi-user linear precoding for multi-polarized massive MIMO systems under imperfect CSIT
    arXiv: Information Theory, 2014
    Co-Authors: Jaehyun Park, Bruno Clerckx
    Abstract:

    Space limitations and channel acquisition prevent massive MIMO from being easily deployed in a practical setup. Motivated by current deployments of LTE-Advanced, the use of multi-polarized antennas can be an efficient solution to the space constraint. Furthermore, dual-structured precoding, in which a preprocessing stage based on the spatial correlation is concatenated with a subsequent linear precoder based on short-term channel state information at the transmitter (CSIT), can efficiently reduce the feedback overhead. By grouping and preprocessing spatially correlated mobile stations (MSs), both the dimension of the precoding signal space and the corresponding short-term CSIT dimension are reduced. In this paper, to reduce the feedback overhead further, we propose a dual-structured multi-user linear precoding scheme in which a subgrouping method based on co-polarization is additionally applied to the spatially grouped MSs in the preprocessing stage. Furthermore, under imperfect CSIT, the proposed scheme is analyzed asymptotically using random matrix theory. By investigating the behavior of the asymptotic performance, we also propose a new dual-structured precoding scheme in which the precoding mode is switched between two dual-structured strategies: 1) preprocessing based only on the spatial correlation, and 2) preprocessing based on both the spatial correlation and the polarization. Finally, we extend the scheme to 3D dual-structured precoding.

Francisco Herrera - One of the best experts on this subject based on the ideXlab platform.

  • DPASF: a Flink library for streaming data preprocessing
    Big Data Analytics, 2019
    Co-Authors: Alejandro Alcalde-Barros, Diego García-Gil, Salvador García, Francisco Herrera
    Abstract:

    Background: Data preprocessing techniques are devoted to correcting or alleviating errors in data. Discretization and feature selection are two of the most widely used data preprocessing techniques. Although there are many proposals for static Big Data preprocessing, little research addresses the continuous, streaming Big Data problem. Apache Flink is a recent Big Data framework, following the MapReduce paradigm, focused on distributed stream and batch data processing. In this paper, we propose a data stream library for Big Data preprocessing, named DPASF, built on Apache Flink. The library is composed of six of the most popular and widely used data preprocessing algorithms: three for discretization and three for feature selection. Results: The algorithms have been tested using two Big Data datasets. Experimental results show that preprocessing can not only reduce the size of the data but also maintain or even improve the original accuracy in a short period of time. Conclusion: DPASF contains algorithms that are useful when dealing with Big Data streams. The preprocessing algorithms included in the library are able to tackle big datasets efficiently and to correct imperfections in the data.
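
    The flavor of a streaming discretizer (one of the two algorithm families DPASF covers) can be shown with a minimal, library-free sketch: bin edges adapt as the stream's observed range grows. This illustrates the general idea only, not DPASF's actual Flink API:

```python
class StreamingDiscretizer:
    """Equal-width discretizer for a data stream: bin edges are derived
    from the running min/max, so they adapt as new elements arrive."""

    def __init__(self, n_bins=5):
        self.n_bins = n_bins
        self.lo = float("inf")
        self.hi = float("-inf")

    def update(self, x):
        # Widen the observed range with each new stream element.
        self.lo = min(self.lo, x)
        self.hi = max(self.hi, x)

    def transform(self, x):
        # Map a value to a bin index under the current edges.
        if self.hi <= self.lo:
            return 0
        width = (self.hi - self.lo) / self.n_bins
        return min(int((x - self.lo) / width), self.n_bins - 1)
```

    In a real Flink job the `update` logic would live in a stateful operator and `transform` in a downstream map; here both are plain methods for clarity.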

  • A survey on data preprocessing for data stream mining
    Neurocomputing, 2017
    Co-Authors: Sergio Ramírez-Gallego, Salvador García, Bartosz Krawczyk, Michał Woźniak, Francisco Herrera
    Abstract:

    Data preprocessing and reduction have become essential techniques in current knowledge discovery scenarios, dominated by increasingly large datasets. These methods aim at reducing the complexity inherent in real-world datasets so that they can be easily processed by current data mining solutions. Advantages of such approaches include, among others, a faster and more precise learning process and a more understandable structure in the raw data. However, data preprocessing techniques for data streams still have a long road ahead of them, even though online learning is growing in importance thanks to the development of the Internet and technologies for massive data collection. Throughout this survey, we summarize, categorize, and analyze the contributions on data preprocessing that cope with streaming data. This work also takes into account the existing relationships between the different families of methods (feature and instance selection, and discretization). To enrich our study, we conduct thorough experiments using the most relevant contributions and present an analysis of their predictive performance, reduction rates, computational time, and memory usage. Finally, we offer general advice about existing data stream preprocessing algorithms and discuss emerging future challenges in the domain of data stream preprocessing.

  • Tutorial on practical tips of the most influential data preprocessing algorithms in data mining
    Knowledge Based Systems, 2016
    Co-Authors: Julián Luengo, Francisco Herrera, Salvador García
    Abstract:

    Data preprocessing is a major and essential stage whose main goal is to obtain final data sets that can be considered correct and useful for further data mining algorithms. This paper summarizes the most influential data preprocessing algorithms according to their usage, popularity, and extensions proposed in the specialized literature. For each algorithm, we provide a description, a discussion of its impact, and a review of current and further research on it. These influential algorithms cover missing values imputation, noise filtering, dimensionality reduction (including feature selection and space transformations), instance reduction (including selection and generation), discretization, and the treatment of imbalanced data. Together they constitute some of the most important topics in data preprocessing research and development. This paper emphasizes the best-known preprocessing methods and their practical study, selected following a recent general book on data preprocessing that does not examine them in depth. The manuscript also presents an illustrative study, in two sections with different data sets, that provides useful tips for the use of preprocessing algorithms. First, we graphically present the effects of the preprocessing methods on two benchmark data sets; the reader may find useful insights into the different characteristics and outcomes they generate. Second, we use a real-world problem from the ECBDL'2014 Big Data competition to provide a thorough analysis of the application of some preprocessing techniques, their combination, and their performance. As a result, five different cases are analyzed, providing tips that may be useful for readers.
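
    To give a flavor of the simplest of these algorithm families, missing-value imputation by the column mean takes a few lines of NumPy. This is an illustrative baseline only, not any specific method from the paper:

```python
import numpy as np

def mean_impute(X):
    """Replace each NaN with the observed mean of its column."""
    X = np.asarray(X, dtype=float).copy()
    col_means = np.nanmean(X, axis=0)        # per-column mean, ignoring NaNs
    rows, cols = np.where(np.isnan(X))       # positions of missing entries
    X[rows, cols] = col_means[cols]
    return X
```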

  • Data Preprocessing in Data Mining
    2014
    Co-Authors: Salvador García, Julián Luengo, Francisco Herrera
    Abstract:

    Data Preprocessing in Data Mining addresses one of the most important issues within the well-known knowledge discovery from data process. Data taken directly from the source will likely have inconsistencies and errors and, most importantly, will not be ready for a data mining process. Furthermore, the increasing volume of data in recent science, industry, and business applications calls for more complex tools to analyze it. Thanks to data preprocessing, it is possible to convert the impossible into the possible, adapting the data to fulfill the input requirements of each data mining algorithm. Data preprocessing includes data reduction techniques, which aim to reduce the complexity of the data by detecting or removing irrelevant and noisy elements. This book reviews the tasks that fill the gap between data acquisition from the source and the data mining process. It takes a comprehensive, practical look at the field, covering basic concepts and surveying the techniques proposed in the specialized literature. Each chapter is a stand-alone guide to a particular data preprocessing topic, from basic concepts and detailed descriptions of classical algorithms to an exhaustive catalog of recent developments. The in-depth technical descriptions make this book suitable for technical professionals, researchers, and senior undergraduate and graduate students in data science, computer science, and engineering.

Stephen C Strother - One of the best experts on this subject based on the ideXlab platform.

  • Optimization of preprocessing strategies in positron emission tomography (PET) neuroimaging: a [11C]DASB PET study
    NeuroImage, 2019
    Co-Authors: Martin Norgaard, Melanie Ganz, Claus Svarer, Vibe G Frokjaer, Douglas N Greve, Stephen C Strother, Gitte M Knudsen
    Abstract:

    Positron emission tomography (PET) is an important neuroimaging tool for quantifying the distribution of specific molecules in the brain. The quantification is based on a series of individually designed data preprocessing steps (a pipeline), and an optimal preprocessing strategy is by definition associated with less noise and improved statistical power, potentially allowing more valid neurobiological interpretations. In spite of this, it is currently unclear how to design the best preprocessing pipeline and to what extent the choice of each preprocessing step in the pipeline minimizes subject-specific errors. To evaluate the impact of various preprocessing strategies, we systematically examined 384 different pipeline strategies in data from 30 healthy participants scanned twice with the serotonin transporter (5-HTT) radioligand [11C]DASB. Five commonly used preprocessing steps with two to four options each were investigated: (1) motion correction (MC), (2) co-registration, (3) delineation of volumes of interest (VOIs), (4) partial volume correction (PVC), and (5) kinetic modeling. To quantitatively compare and evaluate the impact of the strategies, we used the performance metrics test-retest bias, within- and between-subject variability, the intraclass correlation coefficient, and global signal-to-noise ratio. We also performed a power analysis to estimate the sample size required to detect either a 5% or a 10% difference in 5-HTT binding as a function of the preprocessing pipeline. The results showed a complex downstream dependency of the performance metrics on the various preprocessing steps. The choice of MC had the most profound effect on 5-HTT binding, ahead of the effects caused by PVC and kinetic modeling, and the effects differed across VOIs. Notably, we observed a negative bias in 5-HTT binding across test and retest in 98% of pipelines, ranging from 0 to 6% depending on the pipeline. Optimization of the performance metrics revealed a trade-off between within- and between-subject variability at the group level, with opposite effects (i.e., minimizing within-subject variability increased between-subject variability and vice versa). The sample size required to detect a given effect size was also compromised by the preprocessing strategy, with up to 80% increases in the sample size needed to detect a 5% difference in 5-HTT binding. This is the first study to systematically investigate and demonstrate the effect of choosing different preprocessing strategies on the outcome of dynamic PET studies. We provide a framework showing how optimal and maximally powered neuroimaging results can be obtained by choosing appropriate preprocessing strategies, and we provide recommendations depending on the study design. In addition, the results contribute to a better understanding of methodological uncertainty and variability in preprocessing decisions for future group and/or longitudinal PET studies.
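
    The sample-size point can be reproduced in miniature with a normal-approximation power calculation for a paired (test-retest) design: the required n grows with the square of the within-subject variability, so noisier pipelines inflate it quickly. A sketch (the formula choice is illustrative, not the authors' exact power analysis):

```python
from math import ceil
from statistics import NormalDist

def paired_n(delta, sd_diff, alpha=0.05, power=0.80):
    """Normal-approximation sample size for a paired design to detect
    a mean difference `delta`, given the SD of within-subject differences."""
    z = NormalDist()
    z_a = z.inv_cdf(1 - alpha / 2)   # two-sided type-I error threshold
    z_b = z.inv_cdf(power)           # target power
    return ceil(((z_a + z_b) * sd_diff / delta) ** 2)
```

    Doubling the difference SD quadruples the required n, which is the mechanism behind large pipeline-dependent swings in sample size.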

  • The Impact of Preprocessing Pipeline Choice in Univariate and Multivariate Analyses of PET Data
    2018 International Workshop on Pattern Recognition in Neuroimaging (PRNI), 2018
    Co-Authors: Martin Norgaard, Claus Svarer, Douglas N Greve, Stephen C Strother, Gitte M Knudsen, Melanie Ganz
    Abstract:

    It has long been recognized that the data preprocessing chain is a critical part of a neuroimaging experiment. In this work we evaluate the impact of preprocessing choices in univariate and multivariate analyses of positron emission tomography (PET) data. Thirty healthy participants were scanned twice in a High-Resolution Research Tomograph PET scanner with the serotonin transporter (5-HTT) radioligand [11C]DASB. Binding potentials (BPND) from 14 brain regions were quantified with 384 different preprocessing choices. A univariate paired t-test was applied to each region for each preprocessing choice, corrected for multiple comparisons using FDR within each pipeline. Additionally, a multivariate linear discriminant analysis (LDA) model was used to discriminate test and retest BPND, and model performance was evaluated using a repeated cross-validation framework with permutations. The univariate analysis revealed several significant differences in 5-HTT BPND across brain regions, depending on the preprocessing choice. The classification accuracy of the multivariate LDA model varied from 37% to 70% depending on the choice of preprocessing, and could reasonably be modeled with a normal distribution centered at 51% accuracy. In spite of correcting for multiple comparisons, the univariate model with varying preprocessing choices is more likely to generate false-positive results than a simple multivariate analysis model evaluated with cross-validation and permutations.
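
    The permutation logic behind such an evaluation can be sketched with a deliberately simple classifier: fit on one half of the data, test on the other, and compare the true-label accuracy against a null distribution built from shuffled labels. A toy sketch (a nearest-centroid classifier stands in for LDA; all data are made up):

```python
import numpy as np

def split_half_acc(X, y):
    """Train a nearest-centroid classifier on the first half of the
    data and report accuracy on the second half."""
    h = len(y) // 2
    Xtr, ytr, Xte, yte = X[:h], y[:h], X[h:], y[h:]
    cents = np.stack([Xtr[ytr == c].mean(axis=0) for c in (0, 1)])
    d = ((Xte[:, None, :] - cents[None, :, :]) ** 2).sum(axis=-1)
    return (d.argmin(axis=1) == yte).mean()

def permutation_p(X, y, n_perm=200, seed=0):
    """P-value: how often shuffled labels do at least as well as the
    true labels (with the +1 correction for the observed statistic)."""
    rng = np.random.default_rng(seed)
    obs = split_half_acc(X, y)
    hits = sum(split_half_acc(X, rng.permutation(y)) >= obs
               for _ in range(n_perm))
    return (hits + 1) / (n_perm + 1)

# Made-up, well-separated two-class data; each half holds both classes.
rng = np.random.default_rng(7)
block = lambda mu, n: mu + rng.standard_normal((n, 2))
X = np.concatenate([block(0, 10), block(3, 10), block(0, 10), block(3, 10)])
y = np.array([0] * 10 + [1] * 10 + [0] * 10 + [1] * 10)
```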

  • Optimizing preprocessing and analysis pipelines for single-subject fMRI: 2. Interactions with ICA, PCA, task contrast and inter-subject heterogeneity
    PLOS ONE, 2012
    Co-Authors: Nathan W Churchill, Grigori Yourganov, Anita Oder, Fred Tam, Simon J Graham, Stephen C Strother
    Abstract:

    A variety of preprocessing techniques are available to correct subject-dependent artifacts in fMRI caused by head motion and physiological noise. Although it has been established that the chosen preprocessing steps (the "pipeline") may significantly affect fMRI results, it is not well understood how preprocessing choices interact with other parts of the fMRI experimental design. In this study, we examine how two experimental factors interact with preprocessing: between-subject heterogeneity and strength of task contrast. Two levels of cognitive contrast were examined in an fMRI adaptation of the Trail-Making Test, with data from young, healthy adults. The importance of standard preprocessing with motion correction, physiological noise correction, motion parameter regression, and temporal detrending was examined for the two task contrasts. We also tested subspace estimation using principal component analysis (PCA) and independent component analysis (ICA). Results were obtained with penalized discriminant analysis, and model performance was quantified with reproducibility (R) and prediction (P) metrics. Simulation methods were also used to test for potential biases from individual-subject optimization. Our results demonstrate that (1) individual pipeline optimization is not significantly more biased than fixed preprocessing; (2) when applying a fixed pipeline across all subjects, the task contrast significantly affects pipeline performance; in particular, the effects of PCA and ICA models vary with contrast and are not by themselves optimal preprocessing steps; and (3) selecting the optimal pipeline for each subject improves within-subject (P, R) and between-subject overlap, with the weaker cognitive contrast being more sensitive to pipeline optimization. These results demonstrate that the sensitivity of fMRI results is influenced not only by preprocessing choices but also by their interactions with other experimental design factors. This paper outlines a quantitative procedure to denoise data that would otherwise be discarded due to artifact, which is particularly relevant for weak signal contrasts in single-subject, small-sample, and clinical datasets.
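
    The reproducibility metric (R) referenced above is commonly computed as the correlation between activation maps estimated from independent split-halves of the data. A minimal sketch, assuming a Pearson-correlation definition:

```python
import numpy as np

def reproducibility(map_a, map_b):
    """Split-half reproducibility: Pearson correlation between two
    activation maps estimated from independent halves of the data."""
    a = map_a - map_a.mean()     # center each map
    b = map_b - map_b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))
```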