Online Evaluation

The Experts below are selected from a list of 128,439 Experts worldwide, ranked by the ideXlab platform.

Pavel Serdyukov - One of the best experts on this subject based on the ideXlab platform.

  • SIGIR - Effective Online Evaluation for Web Search
    Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019
    Co-Authors: Alexey Drutsa, Pavel Serdyukov, Gleb Gusev, Eugene Kharitonov, Denis Kulemyakin, Igor Yashkov
    Abstract:

    We present a program that balances an overview of academic achievements in the field of Online Evaluation with unique industrial experience shared by leading researchers and engineers from global Internet companies. First, we cover the necessary background in mathematical statistics. This is followed by the foundations of the main Evaluation methods: A/B testing, interleaving, and observational studies. Then, we share rich industrial experience in constructing an experimentation pipeline and Evaluation metrics, emphasizing best practices and common pitfalls. A large part of our tutorial is devoted to modern, state-of-the-art techniques (including ones based on machine learning) that allow Online experimentation to be conducted efficiently. We invite software engineers, designers, analysts, and managers of web services and software products, as well as beginners, advanced specialists, and researchers, to learn how to make web service development effectively data-driven.
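
    As a minimal illustration of the statistical foundations this tutorial covers (a generic sketch, not material from the tutorial itself), the two-proportion z-test below is one common way to decide whether a binary metric such as click-through rate differs between the control and treatment arms of an A/B test. All numbers and names are hypothetical.

```python
import math

def two_proportion_ztest(clicks_a, users_a, clicks_b, users_b):
    """Two-sided z-test for a difference in click rates between
    A/B groups. Returns (z statistic, approximate p-value)."""
    p_a = clicks_a / users_a
    p_b = clicks_b / users_b
    # Pooled rate under the null hypothesis of no difference.
    p = (clicks_a + clicks_b) / (users_a + users_b)
    se = math.sqrt(p * (1 - p) * (1 / users_a + 1 / users_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical experiment: 10,000 users per arm, 520 vs. 570 clicks.
z, p = two_proportion_ztest(520, 10_000, 570, 10_000)
print(f"z = {z:.3f}, p = {p:.4f}")
```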

  • Online Evaluation for Effective Web Service Development
    arXiv: Human-Computer Interaction, 2018
    Co-Authors: Roman Budylin, Pavel Serdyukov, Alexey Drutsa, Gleb Gusev, Igor Yashkov
    Abstract:

    Today, the development of most leading web services and software products is guided by data-driven decisions based on Evaluation, ensuring a steady stream of updates in terms of both quality and quantity. Large Internet companies use Online Evaluation day to day and at large scale, and the number of smaller companies using A/B testing in their development cycle is also growing. Web development across the board strongly depends on the quality of experimentation platforms. In this tutorial, we give an overview of the state-of-the-art methods underlying everyday Evaluation pipelines at some of the leading Internet companies. Software engineers, designers, analysts, and service or product managers, whether beginners, advanced specialists, or researchers, can learn how to make web service development data-driven and how to do so effectively.
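
    Alongside A/B testing, these tutorials repeatedly mention interleaving as a core Online Evaluation method. The sketch below is a self-contained illustration of the well-known team-draft interleaving scheme, not code from the tutorial: two rankings are merged, each document is credited to the ranker that contributed it, and clicks on the merged list later decide which ranker wins.

```python
import random

def team_draft_interleave(ranking_a, ranking_b):
    """Team-draft interleaving: in each round a coin flip decides which
    ranker drafts first; each ranker then adds its highest-ranked document
    not already in the interleaved list. Returns the merged list and a map
    from document to the team ('a' or 'b') that contributed it."""
    interleaved, team = [], {}
    while True:
        rest_a = [d for d in ranking_a if d not in team]
        rest_b = [d for d in ranking_b if d not in team]
        if not rest_a and not rest_b:
            break
        order = [("a", rest_a), ("b", rest_b)]
        random.shuffle(order)  # coin flip: who drafts first this round
        for name, rest in order:
            # Recompute in case the other team just took our top document.
            rest = [d for d in rest if d not in team]
            if rest:
                team[rest[0]] = name
                interleaved.append(rest[0])
    return interleaved, team

def winner(team, clicked_docs):
    """Credit each click to the contributing team; more credits wins.
    Ties are broken arbitrarily in this sketch."""
    credits = {"a": 0, "b": 0}
    for doc in clicked_docs:
        if doc in team:
            credits[team[doc]] += 1
    return max(credits, key=credits.get)

merged, team = team_draft_interleave(["d1", "d2", "d3"], ["d2", "d1", "d4"])
print(merged, "->", winner(team, ["d2", "d4"]))
```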

  • SIGIR - Challenges and Opportunities in Online Evaluation of Search Engines
    Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2015
    Co-Authors: Pavel Serdyukov
    Abstract:

    Yandex is one of the largest Internet companies in Europe, operating Russia's most popular search engine and generating 58.6% of all search traffic in Russia (as of April 2015). Like all modern search engines, Yandex increasingly relies on Online Evaluation methods such as A/B tests and interleaving. These Online Evaluation methods test changes to the search engine by analyzing changes in the character of its interactions with users. There are several grand challenges in Online Evaluation, including the choice of an appropriate Online metric and the need to deal with the limited number of user interactions available to a search engine for experimentation. In my talk, I will give an overview of our latest research on improving the sensitivity of well-known Online metrics, on discovering more sensitive and robust Online metrics, and on scheduling and early stopping of Online experiments.
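
    The talk mentions early stopping of Online experiments. One classical mechanism for this, shown here as a generic sketch and not as Yandex's actual method, is Wald's sequential probability ratio test: it monitors a binary metric observation by observation and stops the experiment as soon as the evidence crosses a decision boundary.

```python
import math
import random

def sprt_bernoulli(observations, p0, p1, alpha=0.05, beta=0.05):
    """Wald's sequential probability ratio test for a Bernoulli rate,
    testing H0: p = p0 against H1: p = p1. Consumes observations one at
    a time and stops as soon as a decision boundary is crossed."""
    upper = math.log((1 - beta) / alpha)   # crossing here accepts H1
    lower = math.log(beta / (1 - alpha))   # crossing here accepts H0
    llr = 0.0  # running log-likelihood ratio
    n = 0
    for x in observations:
        n += 1
        llr += x * math.log(p1 / p0) + (1 - x) * math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "accept H1", n
        if llr <= lower:
            return "accept H0", n
    return "no decision", n

# Simulated click stream whose true rate is 4% (so H1 is true here).
random.seed(0)
stream = (1 if random.random() < 0.04 else 0 for _ in range(50_000))
decision, n = sprt_bernoulli(stream, p0=0.03, p1=0.04)
print(decision, "after", n, "observations")
```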

Amr Huber - One of the best experts on this subject based on the ideXlab platform.

  • Offline and Online Evaluation of news recommender systems at swissinfo.ch
    Conference on Recommender Systems, 2014
    Co-Authors: Florent Garcin, Boi Faltings, Olivier Donatsch, Ayar Alazzawi, Christophe Bruttin, Amr Huber
    Abstract:

    We report on the live Evaluation of various news recommender systems conducted on the website swissinfo.ch. We demonstrate that there is a major difference between offline and Online accuracy Evaluations. In an offline setting, recommending the most popular stories is the best strategy, while in a live environment this strategy is the poorest. In the Online setting, context-tree recommender systems, which profile users in real time, improve the click-through rate by up to 35%; the visit length also increases by a factor of 2.5. Our experience holds important lessons for the Evaluation of recommender systems with offline data, as well as for the use of the click-through rate as a performance indicator.
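
    Since the paper's headline results are expressed as click-through-rate uplift, the short sketch below shows how such figures are computed. The click and impression counts here are invented for illustration and are not the paper's data.

```python
def ctr(clicks, impressions):
    """Click-through rate: fraction of shown recommendations clicked."""
    return clicks / impressions

def relative_uplift(ctr_baseline, ctr_variant):
    """Relative improvement of the variant over the baseline,
    e.g. 0.35 for the 'up to 35%' reported in the paper."""
    return (ctr_variant - ctr_baseline) / ctr_baseline

# Hypothetical numbers for illustration only.
base, variant = ctr(300, 100_000), ctr(405, 100_000)
print(f"baseline CTR={base:.4%}, variant CTR={variant:.4%}, "
      f"uplift={relative_uplift(base, variant):+.1%}")
```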

  • RecSys - Offline and Online Evaluation of news recommender systems at swissinfo.ch
    Proceedings of the 8th ACM Conference on Recommender systems - RecSys '14, 2014
    Co-Authors: Florent Garcin, Boi Faltings, Olivier Donatsch, Ayar Alazzawi, Christophe Bruttin, Amr Huber
    Abstract:

    We report on the live Evaluation of various news recommender systems conducted on the website swissinfo.ch. We demonstrate that there is a major difference between offline and Online accuracy Evaluations. In an offline setting, recommending the most popular stories is the best strategy, while in a live environment this strategy is the poorest. In the Online setting, context-tree recommender systems, which profile users in real time, improve the click-through rate by up to 35%; the visit length also increases by a factor of 2.5. Our experience holds important lessons for the Evaluation of recommender systems with offline data, as well as for the use of the click-through rate as a performance indicator.

Igor Yashkov - One of the best experts on this subject based on the ideXlab platform.

  • SIGIR - Effective Online Evaluation for Web Search
    Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019
    Co-Authors: Alexey Drutsa, Pavel Serdyukov, Gleb Gusev, Eugene Kharitonov, Denis Kulemyakin, Igor Yashkov
    Abstract:

    We present a program that balances an overview of academic achievements in the field of Online Evaluation with unique industrial experience shared by leading researchers and engineers from global Internet companies. First, we cover the necessary background in mathematical statistics. This is followed by the foundations of the main Evaluation methods: A/B testing, interleaving, and observational studies. Then, we share rich industrial experience in constructing an experimentation pipeline and Evaluation metrics, emphasizing best practices and common pitfalls. A large part of our tutorial is devoted to modern, state-of-the-art techniques (including ones based on machine learning) that allow Online experimentation to be conducted efficiently. We invite software engineers, designers, analysts, and managers of web services and software products, as well as beginners, advanced specialists, and researchers, to learn how to make web service development effectively data-driven.

  • Online Evaluation for Effective Web Service Development
    arXiv: Human-Computer Interaction, 2018
    Co-Authors: Roman Budylin, Pavel Serdyukov, Alexey Drutsa, Gleb Gusev, Igor Yashkov
    Abstract:

    Today, the development of most leading web services and software products is guided by data-driven decisions based on Evaluation, ensuring a steady stream of updates in terms of both quality and quantity. Large Internet companies use Online Evaluation day to day and at large scale, and the number of smaller companies using A/B testing in their development cycle is also growing. Web development across the board strongly depends on the quality of experimentation platforms. In this tutorial, we give an overview of the state-of-the-art methods underlying everyday Evaluation pipelines at some of the leading Internet companies. Software engineers, designers, analysts, and service or product managers, whether beginners, advanced specialists, or researchers, can learn how to make web service development data-driven and how to do so effectively.

Florent Garcin - One of the best experts on this subject based on the ideXlab platform.

  • Offline and Online Evaluation of news recommender systems at swissinfo.ch
    Conference on Recommender Systems, 2014
    Co-Authors: Florent Garcin, Boi Faltings, Olivier Donatsch, Ayar Alazzawi, Christophe Bruttin, Amr Huber
    Abstract:

    We report on the live Evaluation of various news recommender systems conducted on the website swissinfo.ch. We demonstrate that there is a major difference between offline and Online accuracy Evaluations. In an offline setting, recommending the most popular stories is the best strategy, while in a live environment this strategy is the poorest. In the Online setting, context-tree recommender systems, which profile users in real time, improve the click-through rate by up to 35%; the visit length also increases by a factor of 2.5. Our experience holds important lessons for the Evaluation of recommender systems with offline data, as well as for the use of the click-through rate as a performance indicator.

  • RecSys - Offline and Online Evaluation of news recommender systems at swissinfo.ch
    Proceedings of the 8th ACM Conference on Recommender systems - RecSys '14, 2014
    Co-Authors: Florent Garcin, Boi Faltings, Olivier Donatsch, Ayar Alazzawi, Christophe Bruttin, Amr Huber
    Abstract:

    We report on the live Evaluation of various news recommender systems conducted on the website swissinfo.ch. We demonstrate that there is a major difference between offline and Online accuracy Evaluations. In an offline setting, recommending the most popular stories is the best strategy, while in a live environment this strategy is the poorest. In the Online setting, context-tree recommender systems, which profile users in real time, improve the click-through rate by up to 35%; the visit length also increases by a factor of 2.5. Our experience holds important lessons for the Evaluation of recommender systems with offline data, as well as for the use of the click-through rate as a performance indicator.

Jöran Beel - One of the best experts on this subject based on the ideXlab platform.

  • Document Embeddings vs. Keyphrases vs. Terms: An Online Evaluation in Digital Library Recommender Systems.
    arXiv: Information Retrieval, 2019
    Co-Authors: Andrew Collins, Jöran Beel
    Abstract:

    Many recommendation algorithms are available to operators of digital library recommender systems, but their effectiveness in Online Evaluations is largely unreported. We compare a standard term-based recommendation approach to two promising approaches for related-article recommendation in digital libraries: document embeddings and keyphrases. We evaluate the consistency of their performance across multiple scenarios. Through our recommender-as-a-service Mr. DLib, we delivered 33.5M recommendations to users of Sowiport and Jabref over the course of 19 months, from March 2017 to October 2018. The effectiveness of the algorithms differs significantly between Sowiport and Jabref (Wilcoxon rank-sum test; p < 0.05). There is a ~400% difference in effectiveness between the best and worst algorithm in each scenario. The best-performing algorithm in Sowiport (terms) is the worst-performing in Jabref; the best-performing algorithm in Jabref (keyphrases) performs 70% worse in Sowiport than Sowiport's best algorithm (click-through rates: 0.1% for terms, 0.03% for keyphrases).
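
    The significance claim rests on a Wilcoxon rank-sum test. Below is a minimal sketch of that comparison using SciPy; the per-day CTR samples are fabricated for illustration and do not reproduce the study's data.

```python
from scipy.stats import mannwhitneyu

# Hypothetical per-day CTRs (%) for two algorithms in one scenario;
# the real study compared effectiveness across Sowiport and Jabref.
ctr_terms      = [0.11, 0.09, 0.10, 0.12, 0.10, 0.11, 0.09]
ctr_keyphrases = [0.03, 0.04, 0.03, 0.02, 0.03, 0.04, 0.03]

# Mann-Whitney U is the two-sample Wilcoxon rank-sum test.
stat, p = mannwhitneyu(ctr_terms, ctr_keyphrases, alternative="two-sided")
print(f"U = {stat}, p = {p:.4f}")  # p < 0.05 -> significant difference
```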

  • JCDL - Document Embeddings vs. Keyphrases vs. Terms for Recommender Systems: A Large-Scale Online Evaluation
    2019 ACM IEEE Joint Conference on Digital Libraries (JCDL), 2019
    Co-Authors: Andrew Collins, Jöran Beel
    Abstract:

    Many recommendation algorithms are available to operators of digital library recommender systems, but their effectiveness in Online Evaluations is largely unreported. We compare a standard term-based recommendation approach to two promising approaches for related-article recommendation in digital libraries: document embeddings and keyphrases. We evaluate the consistency of their performance across multiple scenarios. Through our recommender-as-a-service Mr. DLib, we delivered 33.5M recommendations to users of Sowiport and Jabref over the course of 19 months, from March 2017 to October 2018. The effectiveness of the algorithms differs significantly between Sowiport and Jabref (Wilcoxon rank-sum test; p < 0.05). There is a ~400% difference in effectiveness between the best and worst algorithm in each scenario. The best-performing algorithm in Sowiport (terms) is the worst-performing in Jabref; the best-performing algorithm in Jabref (keyphrases) performs 70% worse in Sowiport than Sowiport's best algorithm (click-through rates: 0.1% for terms, 0.03% for keyphrases).