The Experts below are selected from a list of 117387 Experts worldwide ranked by ideXlab platform
Dominik Heider - One of the best experts on this subject based on the ideXlab platform.
-
SEDE-GPS: socio-economic data enrichment based on GPS information
BMC Bioinformatics, 2018. Co-Authors: Theodor Sperlea, Stefan Füser, Jens Boenigk, Dominik Heider. Abstract: Background: Microbes are essential components of all ecosystems because they drive many biochemical processes and act as primary producers. In freshwater ecosystems, the biodiversity and composition of microbial communities can be used as indicators of environmental quality. Recently, some environmental features have been identified that influence microbial ecosystems. However, the impact of human action on lake microbiomes is not well understood. This is, in part, because environmental data, although theoretically accessible, are not easily available. Results: In this work, we present SEDE-GPS, a tool that gathers data relevant to the environment of a user-provided GPS coordinate. To this end, it accesses a list of public and corporate databases and aggregates the information in a single file, which can be used for further analysis. To showcase the use of SEDE-GPS, we enriched a lake microbial ecology sequencing dataset with around 18,000 socio-economic, climate, and geographic features. The sources of SEDE-GPS are public databases such as Eurostat, the Climate Data Center, and OpenStreetMap, as well as corporate sources such as Twitter. Using machine learning and feature selection methods, we were able to identify features in the data provided by SEDE-GPS that can be used to predict lake microbiome alpha diversity. Conclusion: The results presented in this study show that SEDE-GPS is a handy and easy-to-use tool for comprehensive data enrichment for studies of ecology and other processes that are affected by environmental features. Furthermore, we present lists of environmental, socio-economic, and climate features that are predictive of microbial biodiversity in lake ecosystems. These lists indicate that human action has a major impact on lake microbiomes. SEDE-GPS and its source code are available for download at http://SEDE-GPS.heiderlab.de
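The aggregation step described in this abstract (collecting features from several sources for one GPS coordinate and writing them to a single file) can be illustrated with a minimal sketch. It is not the SEDE-GPS implementation: the in-memory `SOURCES` tables, the nearest-record matching rule, and all names and values below are assumptions made for illustration only.

```python
import csv
import io
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two coordinates in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def enrich(sample, sources):
    """Merge the nearest record of every source into one flat feature row."""
    row = {"sample_id": sample["id"], "lat": sample["lat"], "lon": sample["lon"]}
    for name, records in sources.items():
        nearest = min(
            records,
            key=lambda rec: haversine_km(sample["lat"], sample["lon"], rec["lat"], rec["lon"]),
        )
        # Prefix each feature with its source name to avoid column clashes.
        for key, value in nearest["features"].items():
            row[f"{name}.{key}"] = value
    return row


def to_csv(rows):
    """Serialise the enriched rows as one CSV table."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical stand-ins for external databases (values are made up).
SOURCES = {
    "climate": [
        {"lat": 51.5, "lon": 7.0, "features": {"mean_temp_c": 10.1}},
        {"lat": 48.1, "lon": 11.6, "features": {"mean_temp_c": 9.2}},
    ],
    "socio": [
        {"lat": 51.4, "lon": 7.1, "features": {"pop_density": 2100}},
        {"lat": 48.0, "lon": 11.5, "features": {"pop_density": 4700}},
    ],
}
SAMPLES = [{"id": "lake_A", "lat": 51.45, "lon": 7.01}]
ROWS = [enrich(sample, SOURCES) for sample in SAMPLES]

if __name__ == "__main__":
    print(to_csv(ROWS))
```

A real tool would query remote services rather than local lists, and matching by nearest record is only one of several plausible ways to associate a coordinate with regional statistics.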
Ela Hunt - One of the best experts on this subject based on the ideXlab platform.
-
Improved data retrieval from TreeBASE via taxonomic and linguistic data enrichment
BMC Evolutionary Biology, 2009. Co-Authors: Nadia Anwar, Ela Hunt. Abstract: Background: TreeBASE, the only data repository for phylogenetic studies, is not being used effectively since it does not meet the taxonomic data retrieval requirements of the systematics community. We show, through an examination of the queries performed on TreeBASE, that data retrieval using taxon names is unsatisfactory. Results: We report on a new wrapper supporting taxon queries on TreeBASE by utilising a Taxonomy and Classification Database (TCl-Db) we created. TCl-Db holds merged and consolidated taxonomic names from multiple data sources and can be used to translate hierarchical, vernacular and synonym queries into specific query terms in TreeBASE. The query expansion supported by TCl-Db shows a very significant improvement in information retrieval quality. The wrapper can be accessed at the URL http://spira.zoology.gla.ac.uk/app/tbasewrapper.php. The methodology we developed is scalable and can be applied to new data as they become available in the future. Conclusion: Significantly improved data retrieval quality is shown for all queries, and additional flexibility is achieved via user-driven taxonomy selection.
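The idea of translating hierarchical, vernacular and synonym queries into specific query terms can be sketched as follows. This is an illustrative toy, not the TCl-Db schema or the wrapper's code: the three lookup tables and the function names are hypothetical, and the expansion rule (replace a higher taxon by its leaf taxa) is an assumption about how such an expansion might work.

```python
# Hypothetical toy taxonomy; a real system would load these from a database.
CHILDREN = {
    "Felidae": ["Panthera", "Felis"],
    "Panthera": ["Panthera leo", "Panthera tigris"],
    "Felis": ["Felis catus"],
}
SYNONYMS = {"Felis leo": "Panthera leo"}  # outdated name -> accepted name
VERNACULAR = {"lion": "Panthera leo", "cats": "Felidae"}  # common name -> taxon


def canonical(term):
    """Map a vernacular name or synonym to its accepted taxon name."""
    key = term.strip()
    if key.lower() in VERNACULAR:
        return VERNACULAR[key.lower()]
    return SYNONYMS.get(key, key)


def expand_query(term):
    """Expand a query term into the list of leaf taxa it covers."""
    terms, stack = [], [canonical(term)]
    while stack:
        name = stack.pop()
        kids = CHILDREN.get(name, [])
        if kids:
            # Push children in reverse so they are visited in listed order.
            stack.extend(reversed(kids))
        else:
            terms.append(name)
    return terms


if __name__ == "__main__":
    print(expand_query("cats"))
    print(expand_query("Felis leo"))
```

Terms that are unknown to the toy taxonomy pass through unchanged, so the expansion never returns fewer search terms than the original query.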
Sean O'riain - One of the best experts on this subject based on the ideXlab platform.
-
Toward situation awareness for the semantic sensor web: complex event processing with dynamic linked data enrichment
SSN, CEUR Workshop Proceedings, 2011. Co-Authors: Souleiman Hasan, Mauricio Banduk, Edward Curry, Sean O'riain. Abstract: Over the past few years there has been a proliferation in the use of sensors within different applications. The increase in the quantity of sensor data makes it difficult for end users to understand situations within the environments where the sensors are deployed. Thus, there is a need for situation assessment mechanisms on top of the sensor networks to help users interpret sensor data when making decisions. However, one of the challenges in realizing such a mechanism is the need to integrate real-time sensor readings with contextual data sources from legacy systems. This paper tackles the data enrichment problem for sensor data. It builds upon Linked Data principles as a valid basis for a unified enrichment infrastructure and proposes a dynamic enrichment approach that sees enrichment as a process driven by situations of interest. The approach is demonstrated through examples and a proof-of-concept prototype based on an energy management use case.
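A rough sketch of what "enrichment driven by situations of interest" could look like: an event is enriched with contextual data only when it matches a registered situation, and the situation itself states which links to follow. The triple list stands in for a Linked Data source, and every identifier, predicate and threshold below is a hypothetical example rather than part of the paper's prototype.

```python
# Hypothetical context graph as (subject, predicate, object) triples.
TRIPLES = [
    ("sensor:42", "locatedIn", "room:101"),
    ("room:101", "assignedTo", "dept:finance"),
    ("sensor:7", "locatedIn", "room:202"),
]


def lookup(subject, predicate):
    """Return the first object linked from subject via predicate, if any."""
    for s, p, o in TRIPLES:
        if s == subject and p == predicate:
            return o
    return None


# Each situation names a trigger condition and the link path to follow.
SITUATIONS = {
    "high_power": {
        "matches": lambda event: event["watts"] > 1000,
        "path": ["locatedIn", "assignedTo"],
    }
}


def enrich_event(event):
    """Attach context to an event only for the situations it triggers."""
    enriched = dict(event)
    for name, situation in SITUATIONS.items():
        if not situation["matches"](event):
            continue  # no situation of interest, so no lookup cost
        node, context = event["sensor"], {}
        for predicate in situation["path"]:
            node = lookup(node, predicate)
            if node is None:
                break  # context graph has no further link
            context[predicate] = node
        enriched.setdefault("context", {})[name] = context
    return enriched


if __name__ == "__main__":
    print(enrich_event({"sensor": "sensor:42", "watts": 1500}))
    print(enrich_event({"sensor": "sensor:42", "watts": 200}))
```

The design point illustrated here is that events outside any situation of interest pass through untouched, so contextual sources are queried only when the result is needed.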
Giuseppe Carenini - One of the best experts on this subject based on the ideXlab platform.
-
Training Data Enrichment for Infrequent Discourse Relations
International Conference on Computational Linguistics (COLING), 2016. Co-Authors: Kailang Jiang, Giuseppe Carenini. Abstract: Discourse parsing is a popular technique widely used in text understanding, sentiment analysis and other NLP tasks. However, for most discourse parsers, the performance varies significantly across different discourse relations. In this paper, we first validate the underfitting hypothesis, i.e., the less frequent a relation is in the training data, the poorer the performance on that relation. We then explore how to increase the number of positive training instances without resorting to manually creating additional labeled data. We propose a training data enrichment framework that relies on co-training of two different discourse parsers on unlabeled documents. Importantly, we show that co-training alone is not sufficient. The framework requires a filtering step to ensure that only “good quality” unlabeled documents can be used for enrichment and re-training. We propose and evaluate two ways to perform the filtering. The first is to use an agreement score between the two parsers. The second is to use only the confidence score of the faster parser. Our empirical results show that the agreement score can help to boost the performance on infrequent relations, and that the confidence score is a viable approximation of the agreement score for infrequent relations.
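The agreement-based filtering step can be sketched in a few lines. This is a simplification under stated assumptions: each parser's output for a document is reduced to an aligned list of relation labels (real discourse parsers produce trees), the agreement score is taken to be the fraction of matching labels, and the threshold of 0.75 is an arbitrary illustrative value, none of which comes from the paper.

```python
def agreement_score(labels_a, labels_b):
    """Fraction of aligned positions where both parsers predict the same relation."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must be aligned")
    if not labels_a:
        return 0.0
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def select_for_retraining(predictions, threshold=0.75):
    """Keep only documents on which the two parsers agree often enough."""
    return [
        doc_id
        for doc_id, (labels_a, labels_b) in predictions.items()
        if agreement_score(labels_a, labels_b) >= threshold
    ]


# Hypothetical outputs of two parsers on two unlabeled documents.
PREDICTIONS = {
    "doc1": (
        ["Elaboration", "Contrast", "Cause", "Elaboration"],
        ["Elaboration", "Contrast", "Cause", "Elaboration"],
    ),
    "doc2": (
        ["Elaboration", "Contrast", "Cause", "Elaboration"],
        ["Elaboration", "Cause", "Contrast", "Joint"],
    ),
}

if __name__ == "__main__":
    print(select_for_retraining(PREDICTIONS))
```

Documents that pass the filter would then be added, with their predicted labels, to the training set for the next co-training round.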
Theodor Sperlea - One of the best experts on this subject based on the ideXlab platform.
-
SEDE-GPS: socio-economic data enrichment based on GPS information
BMC Bioinformatics, 2018. Co-Authors: Theodor Sperlea, Stefan Füser, Jens Boenigk, Dominik Heider. Abstract: Background: Microbes are essential components of all ecosystems because they drive many biochemical processes and act as primary producers. In freshwater ecosystems, the biodiversity and composition of microbial communities can be used as indicators of environmental quality. Recently, some environmental features have been identified that influence microbial ecosystems. However, the impact of human action on lake microbiomes is not well understood. This is, in part, because environmental data, although theoretically accessible, are not easily available. Results: In this work, we present SEDE-GPS, a tool that gathers data relevant to the environment of a user-provided GPS coordinate. To this end, it accesses a list of public and corporate databases and aggregates the information in a single file, which can be used for further analysis. To showcase the use of SEDE-GPS, we enriched a lake microbial ecology sequencing dataset with around 18,000 socio-economic, climate, and geographic features. The sources of SEDE-GPS are public databases such as Eurostat, the Climate Data Center, and OpenStreetMap, as well as corporate sources such as Twitter. Using machine learning and feature selection methods, we were able to identify features in the data provided by SEDE-GPS that can be used to predict lake microbiome alpha diversity. Conclusion: The results presented in this study show that SEDE-GPS is a handy and easy-to-use tool for comprehensive data enrichment for studies of ecology and other processes that are affected by environmental features. Furthermore, we present lists of environmental, socio-economic, and climate features that are predictive of microbial biodiversity in lake ecosystems. These lists indicate that human action has a major impact on lake microbiomes. SEDE-GPS and its source code are available for download at http://SEDE-GPS.heiderlab.de