Document Retrieval

The experts below are selected from a list of 12,393 experts worldwide ranked by the ideXlab platform.

Hsiao-wuen Hon - One of the best experts on this subject based on the ideXlab platform.

  • Adapting Ranking SVM to Document Retrieval
    Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '06, 2006
    Co-Authors: Yunbo Cao, Yalou Huang, Tie-yan Liu, Jun Xu, Hang Li, Hsiao-wuen Hon
    Abstract:

    The paper is concerned with applying learning to rank to document retrieval. Ranking SVM is a typical learning-to-rank method. We point out two factors that one must consider when applying Ranking SVM, or more generally any learning-to-rank method, to document retrieval. First, correctly ranking documents at the top of the result list is crucial for an information retrieval system, so training must be conducted in a way that makes these top-ranked results accurate. Second, the number of relevant documents can vary from query to query, so one must avoid training a model biased toward queries with a large number of relevant documents. Previously, when existing methods, including Ranking SVM, were applied to document retrieval, neither of the two factors was taken into consideration. We show that it is possible to modify conventional Ranking SVM so that it is better suited to document retrieval. Specifically, we modify the hinge loss function in Ranking SVM to deal with the problems described above. We employ two methods to optimize the loss function: gradient descent and quadratic programming. Experimental results show that our method, referred to as Ranking SVM for IR, can outperform conventional Ranking SVM and other existing methods for document retrieval on two datasets.
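
The idea in the abstract above, a pairwise hinge loss adjusted for top-of-list accuracy and for per-query imbalance, then optimized by gradient descent, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact form of the modification, so the pair weight `tau` (here simply the relevance grade of the preferred document), the per-query normalizer `mu` (one over the number of pairs in the query), and all function names are assumptions made for illustration only.

```python
import numpy as np

def build_pairs(queries):
    """Turn per-query (features, labels) into weighted preference pairs.

    Each pair is (x_preferred, x_other, tau, mu). Both weights are
    hypothetical stand-ins, not the weighting used in the paper.
    """
    pairs = []
    for X, y in queries:
        idx = [(i, j) for i in range(len(y)) for j in range(len(y)) if y[i] > y[j]]
        if not idx:
            continue
        mu = 1.0 / len(idx)  # assumed normalizer: every query contributes equally
        for i, j in idx:
            tau = float(y[i])  # assumed: pairs with a more relevant document weigh more
            pairs.append((X[i], X[j], tau, mu))
    return pairs

def weighted_pairwise_hinge(w, pairs, lam=0.01):
    """Return (loss, subgradient) of the weighted pairwise hinge loss for a linear scorer."""
    loss = lam * float(w @ w)
    grad = 2.0 * lam * w
    for x_pos, x_neg, tau, mu in pairs:
        diff = x_pos - x_neg
        margin = 1.0 - float(w @ diff)
        if margin > 0.0:  # the pair is misordered or inside the margin
            loss += tau * mu * margin
            grad = grad - tau * mu * diff
    return loss, grad

def train(pairs, dim, lr=0.1, steps=200, lam=0.01):
    """Plain (sub)gradient descent from a zero weight vector."""
    w = np.zeros(dim)
    for _ in range(steps):
        _, grad = weighted_pairwise_hinge(w, pairs, lam)
        w = w - lr * grad
    return w
```

With `mu` set this way, a query with many relevant documents produces many pairs but the same total weight as a query with few, which is one simple way to address the bias the abstract describes. The quadratic programming route mentioned in the abstract is not shown here.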

Dragutin Petkovic - One of the best experts on this subject based on the ideXlab platform.

  • Phonetic confusion matrix based spoken Document Retrieval
    Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '00, 2000
    Co-Authors: Savitha Srinivasan, Dragutin Petkovic
    Abstract:

    Combined word-based and phonetic indexes have been used to improve the performance of spoken document retrieval systems, primarily by addressing the out-of-vocabulary retrieval problem. However, a known problem with phonetic recognition is its limited accuracy in comparison with word-level recognition. We propose a novel method for phonetic retrieval in the CueVideo system based on a probabilistic formulation of term weighting using phone confusion data in a Bayesian framework. We evaluate this method of spoken document retrieval against word-based retrieval for the search levels identified in a realistic video-based distributed learning setting. Using our test data, we achieved an average recall of 0.88 with an average precision of 0.69 for retrieval of out-of-vocabulary words on phonetic transcripts with a 35% word error rate. For in-vocabulary words, we achieved a 17% improvement in recall over word-based retrieval with a 17% loss in precision for word error rates ranging from 35% to 65%.
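
To make the notion of scoring with phone confusion data concrete, here is a rough sketch of matching a query's phone sequence against a recognized phone transcript. It is not the CueVideo method, whose formulation the abstract does not spell out. The confusion probabilities, the `FLOOR` value for unseen phone pairs, the function names, and the simplification that query and transcript align phone by phone (no insertions or deletions) are all assumptions.

```python
import math

# Hypothetical confusion table: (spoken, recognized) -> P(recognized | spoken).
# The values are invented for illustration, not measured data.
CONFUSION = {
    ("b", "b"): 0.8, ("b", "p"): 0.2,
    ("p", "p"): 0.7, ("p", "b"): 0.3,
    ("ae", "ae"): 0.9, ("ae", "eh"): 0.1,
    ("eh", "eh"): 0.85, ("eh", "ae"): 0.15,
    ("t", "t"): 0.9, ("t", "d"): 0.1,
    ("d", "d"): 0.8, ("d", "t"): 0.2,
}
FLOOR = 1e-4  # assumed probability for phone pairs missing from the table

def window_log_likelihood(query, window):
    """Log-probability that the spoken query phones were recognized as `window`."""
    return sum(math.log(CONFUSION.get((q, o), FLOOR)) for q, o in zip(query, window))

def best_match(query, transcript):
    """Slide the query over the transcript and return (start, log_likelihood)
    of the best-scoring window, or None if the transcript is too short."""
    n = len(query)
    best = None
    for start in range(len(transcript) - n + 1):
        score = window_log_likelihood(query, transcript[start:start + n])
        if best is None or score > best[1]:
            best = (start, score)
    return best
```

Under this sketch, a query such as `["b", "ae", "t"]` can still match a transcript segment recognized as `["p", "ae", "t"]`, because the table assigns a nonzero probability to "b" being recognized as "p". That tolerance for recognition errors is what allows out-of-vocabulary words to be retrieved from phonetic transcripts.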

Yunbo Cao - One of the best experts on this subject based on the ideXlab platform.

  • Adapting Ranking SVM to Document Retrieval
    Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '06, 2006
    Co-Authors: Yunbo Cao, Yalou Huang, Tie-yan Liu, Jun Xu, Hang Li, Hsiao-wuen Hon
    Abstract:

    The paper is concerned with applying learning to rank to document retrieval. Ranking SVM is a typical learning-to-rank method. We point out two factors that one must consider when applying Ranking SVM, or more generally any learning-to-rank method, to document retrieval. First, correctly ranking documents at the top of the result list is crucial for an information retrieval system, so training must be conducted in a way that makes these top-ranked results accurate. Second, the number of relevant documents can vary from query to query, so one must avoid training a model biased toward queries with a large number of relevant documents. Previously, when existing methods, including Ranking SVM, were applied to document retrieval, neither of the two factors was taken into consideration. We show that it is possible to modify conventional Ranking SVM so that it is better suited to document retrieval. Specifically, we modify the hinge loss function in Ranking SVM to deal with the problems described above. We employ two methods to optimize the loss function: gradient descent and quadratic programming. Experimental results show that our method, referred to as Ranking SVM for IR, can outperform conventional Ranking SVM and other existing methods for document retrieval on two datasets.

Savitha Srinivasan - One of the best experts on this subject based on the ideXlab platform.

  • Phonetic confusion matrix based spoken Document Retrieval
    Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '00, 2000
    Co-Authors: Savitha Srinivasan, Dragutin Petkovic
    Abstract:

    Combined word-based and phonetic indexes have been used to improve the performance of spoken document retrieval systems, primarily by addressing the out-of-vocabulary retrieval problem. However, a known problem with phonetic recognition is its limited accuracy in comparison with word-level recognition. We propose a novel method for phonetic retrieval in the CueVideo system based on a probabilistic formulation of term weighting using phone confusion data in a Bayesian framework. We evaluate this method of spoken document retrieval against word-based retrieval for the search levels identified in a realistic video-based distributed learning setting. Using our test data, we achieved an average recall of 0.88 with an average precision of 0.69 for retrieval of out-of-vocabulary words on phonetic transcripts with a 35% word error rate. For in-vocabulary words, we achieved a 17% improvement in recall over word-based retrieval with a 17% loss in precision for word error rates ranging from 35% to 65%.

Tie-yan Liu - One of the best experts on this subject based on the ideXlab platform.

  • Adapting Ranking SVM to Document Retrieval
    Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '06, 2006
    Co-Authors: Yunbo Cao, Yalou Huang, Tie-yan Liu, Jun Xu, Hang Li, Hsiao-wuen Hon
    Abstract:

    The paper is concerned with applying learning to rank to document retrieval. Ranking SVM is a typical learning-to-rank method. We point out two factors that one must consider when applying Ranking SVM, or more generally any learning-to-rank method, to document retrieval. First, correctly ranking documents at the top of the result list is crucial for an information retrieval system, so training must be conducted in a way that makes these top-ranked results accurate. Second, the number of relevant documents can vary from query to query, so one must avoid training a model biased toward queries with a large number of relevant documents. Previously, when existing methods, including Ranking SVM, were applied to document retrieval, neither of the two factors was taken into consideration. We show that it is possible to modify conventional Ranking SVM so that it is better suited to document retrieval. Specifically, we modify the hinge loss function in Ranking SVM to deal with the problems described above. We employ two methods to optimize the loss function: gradient descent and quadratic programming. Experimental results show that our method, referred to as Ranking SVM for IR, can outperform conventional Ranking SVM and other existing methods for document retrieval on two datasets.