The Experts below are selected from a list of 143655 Experts worldwide ranked by ideXlab platform
Jonathan K Pritchard - One of the best experts on this subject based on the ideXlab platform.
-
Inference of population splits and mixtures from genome-wide allele frequency data
PLOS Genetics, 2012. Co-Authors: Joseph Pickrell, Jonathan K Pritchard. Abstract: Many aspects of the historical relationships between populations in a species are reflected in genetic data. Inferring these relationships from genetic data, however, remains a challenging task. In this paper, we present a statistical model for inferring the patterns of population splits and mixtures in multiple populations. In our model, the sampled populations in a species are related to their common ancestor through a graph of ancestral populations. Using genome-wide allele frequency data and a Gaussian approximation to genetic drift, we infer the structure of this graph. We applied this method to a set of 55 human populations and a set of 82 dog breeds and wild canids. In both species, we show that a simple bifurcating tree does not fully describe the data; in contrast, we infer many migration events. While some of the migration events that we find have been detected previously, many have not. For example, in the human data, we infer that Cambodians trace approximately 16% of their ancestry to a population ancestral to other extant East Asian populations. In the dog data, we infer that both the boxer and basenji trace a considerable fraction of their ancestry (9% and 25%, respectively) to wolves subsequent to domestication and that East Asian toy breeds (the Shih Tzu and the Pekingese) result from admixture between modern toy breeds and “ancient” Asian breeds. Software implementing the model described here, called TreeMix, is available at http://treemix.googlecode.com.
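The abstract describes fitting a population graph to genome-wide allele frequencies under a Gaussian approximation to drift. As a rough illustration of the kind of summary such a model can be fitted to, the sketch below computes a covariance matrix of allele frequencies between populations, centered on the per-SNP mean across populations. This is an assumption-laden sketch and not the TreeMix implementation: the function name `drift_covariance` and the input layout are hypothetical, and no sample-size correction is applied.

```python
def drift_covariance(freqs):
    """Covariance of allele frequencies between populations.

    freqs: dict mapping population name -> list of allele frequencies,
           one entry per SNP (hypothetical input layout).
    Returns a dict keyed by (pop_a, pop_b).
    """
    pops = sorted(freqs)
    n_snps = len(freqs[pops[0]])
    if any(len(freqs[p]) != n_snps for p in pops):
        raise ValueError("every population needs one frequency per SNP")

    # Mean frequency of each SNP across populations.
    means = [sum(freqs[p][k] for p in pops) / len(pops) for k in range(n_snps)]

    # Center each population's frequencies on the per-SNP mean.
    centered = {p: [freqs[p][k] - means[k] for k in range(n_snps)] for p in pops}

    # Average the products of centered frequencies over SNPs.
    cov = {}
    for a in pops:
        for b in pops:
            cov[(a, b)] = sum(x * y for x, y in zip(centered[a], centered[b])) / n_snps
    return cov
```

Populations that share drift history would be expected to show positive covariance; a tree or graph model is then judged by how well it reproduces this matrix.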
Takaho A Endo
-
Quality control method for RNA-seq using single nucleotide polymorphism allele frequency
Genes to Cells, 2014. Co-Authors: Takaho A Endo. Abstract: RNA sequencing (RNA-seq) provides information not only about the level of expression of individual genes but also about genomic sequences of host cells. When we use transcriptome data with whole-genome single nucleotide polymorphism (SNP) variant information, the allele frequency can show the genetic composition of the cell population and/or chromosomal aberrations. Here, I show how SNPs in mRNAs can be used to evaluate RNA-seq experiments by focusing on RNA-seq data based on a recently retracted paper on stimulus-triggered acquisition of pluripotency (STAP) cells. The analysis indicated that different types of cells and chromosomal abnormalities might have been erroneously included in the dataset. This re-evaluation showed that observing allele frequencies could help in assessing the quality of samples during a study and with retrospective evaluation of experimental quality.
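The quality-control idea in this abstract rests on per-SNP allele frequencies read off RNA-seq alignments. A minimal sketch of that calculation follows, assuming reference and alternate read counts have already been tallied at each SNP. The function names, the expected fraction of 0.5 (appropriate only where a site is expected to be heterozygous), and the depth and tolerance thresholds are all illustrative assumptions, not values from the paper.

```python
def alt_allele_fraction(ref_reads, alt_reads):
    """Fraction of reads carrying the alternate allele, or None at zero depth."""
    depth = ref_reads + alt_reads
    if depth == 0:
        return None
    return alt_reads / depth


def flag_skewed_sites(sites, expected=0.5, tolerance=0.1, min_depth=20):
    """Return names of SNPs whose allele fraction departs from `expected`.

    sites: iterable of (name, ref_reads, alt_reads) tuples.
    Sites below `min_depth` are skipped because their fractions are too noisy.
    All thresholds are hypothetical defaults chosen for illustration.
    """
    flagged = []
    for name, ref_reads, alt_reads in sites:
        if ref_reads + alt_reads < min_depth:
            continue
        fraction = alt_allele_fraction(ref_reads, alt_reads)
        if abs(fraction - expected) > tolerance:
            flagged.append(name)
    return flagged
```

A consistent shift across a whole chromosome (for example, fractions clustering near 1/3 and 2/3) would point toward a copy-number change, while a genome-wide shift would be more consistent with a mixed or mislabeled cell population.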
Rasmus Nielsen
-
An approximate full-likelihood method for inferring selection and allele frequency trajectories from DNA sequence data
PLOS Genetics, 2019. Co-Authors: Aaron J Stern, Peter R Wilton, Rasmus Nielsen. Abstract: Most current methods for detecting natural selection from DNA sequence data are limited in that they are either based on summary statistics or a composite likelihood, and as a consequence, do not make full use of the information available in DNA sequence data. We here present a new importance sampling approach for approximating the full likelihood function for the selection coefficient. Our method CLUES treats the ancestral recombination graph (ARG) as a latent variable that is integrated out using previously published Markov Chain Monte Carlo (MCMC) methods. The method can be used for detecting selection, estimating selection coefficients, testing models of changes in the strength of selection, estimating the time of the start of a selective sweep, and for inferring the allele frequency trajectory of a selected or neutral allele. We perform extensive simulations to evaluate the method and show that it uniformly improves power to detect selection compared to current popular methods such as nSL and SDS, and can provide reliable inferences of allele frequency trajectories under many conditions. We also explore the potential of our method to detect extremely recent changes in the strength of selection. We use the method to infer the past allele frequency trajectory for a lactase persistence SNP (MCM6) in Europeans. We also infer the trajectory of a SNP (EDAR) in Han Chinese, finding evidence that this allele's age is much older than previously claimed. We also study a set of 11 pigmentation-associated variants. Several genes show evidence of strong selection particularly within the last 5,000 years, including ASIP, KITLG, and TYR. However, selection on OCA2/HERC2 seems to be much older and, in contrast to previous claims, we find no evidence of selection on TYRP1.
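To make the notion of an allele frequency trajectory under selection concrete, the sketch below traces the deterministic forward-time path of an allele with selection coefficient `s` under a simple genic (haploid) model, where the frequency updates as p' = p(1 + s) / (1 + sp) each generation. This is a textbook simplification chosen for illustration and is not the CLUES method, which infers trajectories from sequence data by importance sampling; the function name `selection_trajectory` is hypothetical and drift is ignored.

```python
def selection_trajectory(p0, s, generations):
    """Deterministic allele frequency path under genic selection.

    p0: starting frequency of the selected allele (between 0 and 1).
    s: selection coefficient (relative fitness 1 + s versus 1).
    generations: number of generations to iterate forward.
    Returns a list of length generations + 1, beginning with p0.
    """
    if not 0.0 <= p0 <= 1.0:
        raise ValueError("p0 must lie in [0, 1]")
    path = [p0]
    p = p0
    for _ in range(generations):
        # Carriers leave (1 + s) offspring; renormalize by mean fitness.
        p = p * (1.0 + s) / (1.0 + s * p)
        path.append(p)
    return path
```

With `s = 0` the path stays flat, and with `s > 0` it rises along a logistic-like curve, which is the qualitative shape one would look for in an inferred trajectory of a positively selected allele.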
-
An approximate full-likelihood method for inferring selection and allele frequency trajectories from DNA sequence data
bioRxiv, 2019. Co-Authors: Aaron J Stern, Peter R Wilton, Rasmus Nielsen. Abstract: Most current methods for detecting natural selection from DNA sequence data are limited in that they are either based on summary statistics or a composite likelihood, and as a consequence, do not make full use of the information available in DNA sequence data. We here present a new importance sampling approach for approximating the full likelihood function for the selection coefficient. The method treats the ancestral recombination graph (ARG) as a latent variable that is integrated out using previously published Markov Chain Monte Carlo (MCMC) methods. The method can be used for detecting selection, estimating selection coefficients, testing models of changes in the strength of selection, estimating the time of the start of a selective sweep, and for inferring the allele frequency trajectory of a selected or neutral allele. We perform extensive simulations to evaluate the method and show that it uniformly improves power to detect selection compared to current popular methods such as nSL and SDS, under various demographic models and can provide reliable inferences of allele frequency trajectories under many conditions. We also explore the potential of our method to detect extremely recent changes in the strength of selection. We use the method to infer the past allele frequency trajectory for a lactase persistence SNP (MCM6) in Europeans. We also study a set of 11 pigmentation-associated variants. Several genes show evidence of strong selection particularly within the last 5,000 years, including ASIP, KITLG, and TYR. However, selection on OCA2/HERC2 seems to be much older and, in contrast to previous claims, we find no evidence of selection on TYRP1.
Author summary: Current methods to study natural selection using modern population genomic data are limited in their power and flexibility. Here, we present a new method to infer natural selection that builds on recent methodological advances in estimating genome-wide genealogies. By using importance sampling we are able to efficiently estimate the likelihood function of the selection coefficient. We show our method improves power to test for selection over competing methods across a diverse range of scenarios, and also accurately infers the selection coefficient. We also demonstrate a novel capability of our model, using it to infer the allele's frequency over time. We validate these results with a study of a lactase persistence SNP in Europeans, and also study a set of 11 pigmentation-associated variants.
-
SNP calling, genotype calling, and sample allele frequency estimation from new-generation sequencing data
PLOS ONE, 2012. Co-Authors: Rasmus Nielsen, Thorfinn Sand Korneliussen, Anders Albrechtsen, Jun Wang. Abstract: We present a statistical framework for estimation and application of sample allele frequency spectra from New-Generation Sequencing (NGS) data. In this method, we first estimate the allele frequency spectrum using maximum likelihood. In contrast to previous methods, the likelihood function is calculated using a dynamic programming algorithm and numerically optimized using analytical derivatives. We then use a Bayesian method for estimating the sample allele frequency in a single site, and show how the method can be used for genotype calling and SNP calling. We also show how the method can be extended to various other cases including cases with deviations from Hardy-Weinberg equilibrium. We evaluate the statistical properties of the methods using simulations and by application to a real data set.
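The abstract mentions a dynamic programming algorithm for the likelihood and a Bayesian estimate of the sample allele frequency at a single site. The sketch below shows one common way such a recursion can be written for diploid samples: individuals are added one at a time, and each step updates the likelihood of every possible count of the alternate allele from that individual's three genotype likelihoods. It is a sketch in the spirit of the described approach, written under the assumption that per-individual genotype likelihoods are already available; the function names are hypothetical and the details of the published algorithm may differ.

```python
from math import comb


def sample_af_likelihoods(genotype_likelihoods):
    """Likelihood of each alternate-allele count j = 0..2n at one site.

    genotype_likelihoods: list of (g0, g1, g2) tuples, one per diploid
    individual, giving the likelihood of 0, 1, or 2 alternate alleles.
    """
    h = [1.0]  # likelihood table for zero individuals
    for g0, g1, g2 in genotype_likelihoods:
        new_h = [0.0] * (len(h) + 2)
        for j, value in enumerate(h):
            new_h[j] += value * g0            # individual adds 0 alleles
            new_h[j + 1] += value * 2.0 * g1  # heterozygote, two orderings
            new_h[j + 2] += value * g2        # individual adds 2 alleles
        h = new_h
    total = len(h) - 1  # 2n chromosomes
    # Divide by the number of ways to place j alleles on 2n chromosomes.
    return [h[j] / comb(total, j) for j in range(total + 1)]


def posterior_allele_count(genotype_likelihoods, prior):
    """Posterior over the alternate-allele count, given a prior of length 2n + 1."""
    likelihoods = sample_af_likelihoods(genotype_likelihoods)
    if len(prior) != len(likelihoods):
        raise ValueError("prior must have 2n + 1 entries")
    weighted = [l * p for l, p in zip(likelihoods, prior)]
    norm = sum(weighted)
    if norm == 0.0:
        raise ValueError("posterior is undefined: all weights are zero")
    return [w / norm for w in weighted]
```

The recursion costs on the order of n squared operations per site instead of enumerating all 3^n genotype configurations. In practice the prior would be an estimated allele frequency spectrum; a flat prior is used in the checks only to keep the example small.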
Johanna K Wolford
-
Estimation of single nucleotide polymorphism allele frequency in DNA pools by using Pyrosequencing
Human Genetics, 2002. Co-Authors: Jonathan D Gruber, Peter B Colligan, Johanna K Wolford. Abstract: Positional cloning of genes underlying complex diseases, such as type 2 diabetes mellitus (T2DM), typically follows a two-tiered process in which a chromosomal region is first identified by genome-wide linkage scanning, followed by association analyses using densely spaced single nucleotide polymorphic markers to identify the causal variant(s). The success of genome-wide single nucleotide polymorphism (SNP) detection has resulted in a vast number of potential markers available for use in the construction of such dense SNP maps. However, the cost of genotyping large numbers of SNPs in appropriately sized samples is nearly prohibitive. We have explored pooled DNA genotyping as a means of identifying differences in allele frequency between pools of individuals with T2DM and unaffected controls by using Pyrosequencing technology. We found that allele frequencies in pooled DNA were strongly correlated with those in individuals (r=0.99, P<0.0001) across a wide range of allele frequencies (0.02–0.50). We further investigated the sensitivity of this method to detect allele frequency differences between contrived pools, also over a wide range of allele frequencies. We found that Pyrosequencing was able to detect an allele frequency difference of less than 2% between pools, indicating that this method may be sensitive enough for use in association studies involving complex diseases where a small difference in allele frequency between cases and controls is expected.
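The comparison reported in this abstract, pooled estimates against frequencies from individual genotyping, can be sketched in two steps: convert the two allele signals from a pool into a frequency, then correlate pooled and individual frequencies across SNPs. The sketch assumes the pooled frequency is taken as the simple ratio of one allele's peak signal to the sum of both, which is an illustrative assumption; the function names are hypothetical and no correction for unequal allele signal is applied.

```python
from math import sqrt


def pooled_allele_frequency(peak_a, peak_b):
    """Estimate allele A frequency in a pool from the two allele signals."""
    total = peak_a + peak_b
    if total <= 0:
        raise ValueError("peak signals must sum to a positive value")
    return peak_a / total


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)
```

Applied to a case pool and a control pool, the difference between the two `pooled_allele_frequency` values at a SNP is the quantity whose detectability the abstract reports.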
Koichiro Tamura
-
POPTREEW: web version of POPTREE for constructing population trees from allele frequency data and computing some other quantities
Molecular Biology and Evolution, 2014. Co-Authors: Naoko Takezaki, Masatoshi Nei, Koichiro Tamura. Abstract: POPTREE software, including the command line (POPTREE) and the Windows (POPTREE2) versions, is available to perform evolutionary analyses of allele frequency data, computing distance measures for constructing population trees and average heterozygosity (H) (measure of genetic diversity within populations) and G(ST) (measure of genetic differentiation among subdivided populations). We have now developed a web version POPTREEW (http://www.med.kagawa-u.ac.jp/~genomelb/takezaki/poptreew/) to provide cross-platform access to all POPTREE functions including interactive tree editing. Furthermore, new POPTREE software (POPTREE, POPTREE2, and POPTREEW) computes standardized G(ST) and Jost's D, which may be appropriate for data with high variability, and accepts genotype data in GENEPOP format as an input.
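The quantities named in this abstract can be illustrated for a single locus. The sketch below computes average within-population heterozygosity (H_S), total heterozygosity (H_T), G_ST = (H_T - H_S) / H_T, and Jost's D = ((H_T - H_S) / (1 - H_S)) * k / (k - 1) for k populations. It assumes equal population weights and uses raw allele frequencies with no sample-size bias correction, so it is not a reproduction of what POPTREE computes; the function names are hypothetical.

```python
def heterozygosity(freqs):
    """Expected heterozygosity 1 - sum(p_i^2) for one set of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def differentiation(pop_freqs):
    """Single-locus H_S, H_T, G_ST and Jost's D.

    pop_freqs: list of per-population allele frequency lists, all listing
    the same alleles in the same order (equal population weights assumed).
    """
    k = len(pop_freqs)
    if k < 2:
        raise ValueError("need at least two populations")
    n_alleles = len(pop_freqs[0])

    # Mean heterozygosity within populations.
    h_s = sum(heterozygosity(f) for f in pop_freqs) / k

    # Heterozygosity of the pooled (averaged) allele frequencies.
    mean_freqs = [sum(f[a] for f in pop_freqs) / k for a in range(n_alleles)]
    h_t = heterozygosity(mean_freqs)

    g_st = (h_t - h_s) / h_t if h_t > 0.0 else 0.0
    jost_d = ((h_t - h_s) / (1.0 - h_s)) * (k / (k - 1)) if h_s < 1.0 else 0.0
    return {"H_S": h_s, "H_T": h_t, "G_ST": g_st, "D": jost_d}
```

Because G_ST is bounded above by a value that shrinks as within-population heterozygosity grows, highly variable markers can give small G_ST even for strongly differentiated populations, which is the motivation the abstract gives for also reporting standardized G_ST and Jost's D.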
-
POPTREE2: software for constructing population trees from allele frequency data and computing other population statistics with Windows interface
Molecular Biology and Evolution, 2010. Co-Authors: Naoko Takezaki, Masatoshi Nei, Koichiro Tamura. Abstract: Currently, there is a demand for software to analyze polymorphism data such as microsatellite DNA and single nucleotide polymorphism with easily accessible interface in many fields of research. In this article, we would like to make an announcement of POPTREE2, a computer program package, that can perform evolutionary analyses of allele frequency data. The original version (POPTREE) was a command-line program that runs on the Command Prompt of Windows and Unix. In POPTREE2 genetic distances (measures of the extent of genetic differentiation between populations) for constructing phylogenetic trees, average heterozygosities (H) (a measure of genetic variation within populations) and GST (a measure of genetic differentiation of subdivided populations) are computed through a simple and intuitive Windows interface. It will facilitate statistical analyses of polymorphism data for researchers in many different fields. POPTREE2 is available at http://www.med.kagawa-u.ac.jp/~genomelb/takezaki/poptree2/index.html.