Statistical Measures

The Experts below are selected from a list of 228765 Experts worldwide ranked by ideXlab platform

Nelson Areal - One of the best experts on this subject based on the ideXlab platform.

  • Stock market sentiment lexicon acquisition using microblogging data and Statistical Measures
    Decision Support Systems, 2016
    Co-Authors: Nuno Oliveira, Paulo Cortez, Nelson Areal
    Abstract:

    Lexicon acquisition is a key issue for sentiment analysis. This paper presents a novel and fast approach for creating stock market lexicons. The approach is based on statistical measures applied over a vast set of labeled messages from StockTwits, which is a specialized stock market microblog. We compare three adaptations of statistical measures, such as Pointwise Mutual Information (PMI), two new complementary statistics and the use of sentiment scores for affirmative and negated contexts. Using StockTwits, we show that the new lexicons are competitive for measuring investor sentiment when compared with six popular lexicons. We also applied a lexicon to easily produce Twitter investor sentiment indicators and analyzed their correlation with survey sentiment indexes. The new microblogging indicators have a moderate correlation with popular Investors Intelligence (II) and American Association of Individual Investors (AAII) indicators. Thus, the new microblogging approach can be used alternatively to traditional survey indicators with advantages (e.g., cheaper creation, higher frequencies).

    Proposal of an automatic procedure for the creation of stock market lexicons. The procedure uses diverse statistical measures on StockTwits labeled messages. The new lexicons obtain better investor sentiment indicators than general lexicons. The new Twitter sentiment indicators correlate with survey sentiment indicators.
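
As a rough illustration of how a PMI-based lexicon score can be derived from labeled messages, the sketch below scores each word by PMI(word, bullish) minus PMI(word, bearish), which simplifies to the log-ratio of the word's per-class message frequencies. This is the textbook PMI score, not the paper's adaptations; the function name, the "bullish"/"bearish" label strings, the whitespace tokenizer, and the add-alpha smoothing are all assumptions made for the example.

```python
import math
from collections import Counter


def pmi_sentiment_scores(messages, alpha=1.0):
    """Score words from (text, label) pairs, label in {"bullish", "bearish"}.

    score(w) = PMI(w, bullish) - PMI(w, bearish)
             = log2( P(w | bullish) / P(w | bearish) )
    P(w | label) is the smoothed fraction of messages with that label
    that contain w (hypothetical add-alpha smoothing avoids log(0)).
    """
    label_total = Counter()   # number of messages per label
    word_label = Counter()    # number of messages per (word, label)
    vocabulary = set()

    for text, label in messages:
        label_total[label] += 1
        # Count each word once per message (presence, not frequency).
        for word in set(text.lower().split()):
            word_label[(word, label)] += 1
            vocabulary.add(word)

    scores = {}
    for word in vocabulary:
        p_bull = (word_label[(word, "bullish")] + alpha) / (
            label_total["bullish"] + 2 * alpha
        )
        p_bear = (word_label[(word, "bearish")] + alpha) / (
            label_total["bearish"] + 2 * alpha
        )
        scores[word] = math.log2(p_bull / p_bear)
    return scores


# Toy data, invented purely for illustration.
toy_messages = [
    ("buy long rally", "bullish"),
    ("long breakout", "bullish"),
    ("sell short crash", "bearish"),
    ("short dump", "bearish"),
]
toy_scores = pmi_sentiment_scores(toy_messages)
```

Positive scores mark words that lean bullish and negative scores mark words that lean bearish; a message-level indicator could then be formed by summing the scores of its words.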

Karl Aberer - One of the best experts on this subject based on the ideXlab platform.

  • AFFINITY: Efficiently querying statistical measures on time-series data
    2013 IEEE 29th International Conference on Data Engineering (ICDE), 2013
    Co-Authors: Saket Sathe, Karl Aberer
    Abstract:

    Computing statistical measures for large databases of time series is a fundamental primitive for querying and mining time-series data [1]-[6]. This primitive is gaining importance with the increasing number and rapid growth of time-series databases. In this paper, we introduce a framework for efficient computation of statistical measures by exploiting the concept of affine relationships. Affine relationships can be used to infer statistical measures for time series from other related time series, instead of computing them directly, thus reducing the overall computational cost significantly. The resulting methods exhibit at least one order of magnitude improvement over the best known methods. To the best of our knowledge, this is the first work that presents a unified approach for computing and querying several statistical measures at once. Our approach exploits affine relationships using three key components. First, the AFCLST algorithm clusters the time-series data, such that high-quality affine relationships can be easily found. Second, the SYMEX algorithm uses the clustered time series and efficiently computes the desired affine relationships. Third, the SCAPE index structure produces a many-fold improvement in the performance of processing several statistical queries by seamlessly indexing the affine relationships. Finally, we establish the effectiveness of our approaches by performing a comprehensive experimental evaluation on real datasets.
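
The idea of inferring a measure from a related series rather than recomputing it can be shown with a deliberately simplified case. The sketch below assumes a scalar affine relationship y = a*x + b between two series, under which the mean and variance of y follow directly from those of x. It does not reproduce the AFCLST, SYMEX, or SCAPE components named in the abstract; the function name and the toy data are hypothetical.

```python
from statistics import fmean, pvariance


def infer_stats_from_affine(mean_x, var_x, a, b):
    """Infer mean and variance of y = a*x + b from the statistics of x.

    mean(y) = a * mean(x) + b
    var(y)  = a**2 * var(x)
    The cost is constant, independent of the length of the series.
    """
    return a * mean_x + b, a * a * var_x


# Toy series, invented for illustration; y is an exact affine image of x.
x = [1.0, 2.0, 4.0, 7.0]
a, b = 3.0, -2.0
y = [a * value + b for value in x]

# Statistics of x are computed once (or read from an index).
inferred_mean, inferred_var = infer_stats_from_affine(fmean(x), pvariance(x), a, b)

# Direct computation over y, shown only to check the inferred values.
direct_mean, direct_var = fmean(y), pvariance(y)
```

When many series are related to a small set of reference series in this way, only the reference statistics and the coefficients need to be stored, which is the intuition behind the cost reduction the abstract describes.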

Tianming Wang - One of the best experts on this subject based on the ideXlab platform.

  • Comparison study on k-word Statistical Measures for protein: From sequence to 'sequence space'
    BMC bioinformatics, 2008
    Co-Authors: Qi Dai, Tianming Wang
    Abstract:

    Background: Many proposed statistical measures can efficiently compare protein sequences to further infer protein structure, function and evolutionary information. They share the same idea of using k-word frequencies of protein sequences. Until now, however, the information on the sequences related to a given protein has not been used for protein sequence comparison. This paper proposed a scheme to construct a protein 'sequence space' associated with the protein sequences related to the given protein, and compared the performance of statistical measures with and without the information from the protein 'sequence space'. This paper also presented two statistical measures for protein: gre.k (generalized relative entropy) and gsm.k (gapped similarity measure).
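
To make the k-word idea concrete, the sketch below counts overlapping k-words in two sequences and compares the resulting frequency distributions with the ordinary relative entropy (Kullback-Leibler divergence). It is not the paper's gre.k or gsm.k measure and does not use a 'sequence space'; the function names, the pseudocount smoothing, and the example sequences are assumptions for illustration.

```python
import math
from collections import Counter


def kword_counts(sequence, k):
    """Count all overlapping words of length k in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def relative_entropy(seq_p, seq_q, k=2, pseudocount=1.0):
    """Relative entropy D(P || Q), in bits, between k-word distributions.

    A pseudocount is added to every k-word seen in either sequence so
    that a word missing from one sequence does not produce log(0).
    """
    counts_p = kword_counts(seq_p, k)
    counts_q = kword_counts(seq_q, k)
    vocabulary = set(counts_p) | set(counts_q)

    total_p = sum(counts_p.values()) + pseudocount * len(vocabulary)
    total_q = sum(counts_q.values()) + pseudocount * len(vocabulary)

    divergence = 0.0
    for word in vocabulary:
        p = (counts_p[word] + pseudocount) / total_p
        q = (counts_q[word] + pseudocount) / total_q
        divergence += p * math.log2(p / q)
    return divergence


# Made-up amino-acid strings, for illustration only.
same = relative_entropy("MKVLAAGIV", "MKVLAAGIV", k=2)
different = relative_entropy("MKVLAAGIV", "GGGGGGGGG", k=2)
```

Identical sequences give a divergence of zero, and the value grows as the k-word profiles diverge. Note that relative entropy is asymmetric, so D(P || Q) generally differs from D(Q || P).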

  • Use of Statistical Measures for analyzing RNA secondary structures
    Journal of computational chemistry, 2008
    Co-Authors: Qi Dai, Tianming Wang
    Abstract:

    As more and more RNA secondary structures accumulate, the need to compare different RNA secondary structures often arises in function prediction and evolutionary analysis. Numerous efficient algorithms have been developed for comparing RNA secondary structures, but challenges remain. In this article, a new statistical measure, which extends the notion of relative entropy and is based on the proposed stochastic model, is evaluated for RNA secondary structures. The results obtained from several experiments on real datasets show the effectiveness of the proposed approach. Moreover, the time complexity of our method compares favorably with that of existing methods that solve similar problems.
