Candidate Table

The experts below are selected from a list of 11,592 experts worldwide, as ranked by the ideXlab platform.

Soumen Chakrabarti - One of the best experts on this subject based on the ideXlab platform.

Open-domain quantity queries on web tables: annotation, response, and consensus models

Knowledge Discovery and Data Mining, 2014

Co-Authors: Sunita Sarawagi, Soumen Chakrabarti

Abstract:

Over 40% of columns in hundreds of millions of Web tables contain numeric quantities. Tables are a richer source of structured knowledge than free text. We harness Web tables to answer queries whose target is a quantity with natural variation, such as net worth of zuckerburg, battery life of ipad, half life of plutonium, and calories in pizza. Our goal is to respond to such queries with a ranked list of quantity distributions, suitably represented. Apart from the challenges of informal schema and noisy extractions, which have been known since tables were used for non-quantity information extraction, we face additional problems of noisy number formats, as well as unit specifications that are often contextual and ambiguous. Early "hardening" of extraction decisions at a table level leads to poor accuracy. Instead, we use a probabilistic context-free grammar (PCFG) based unit extractor on the tables, and retain several top-scoring extractions of quantity and numerals. Then we inject these into a new collective inference framework that makes global decisions about the relevance of candidate table snippets, the interpretation of the query's target quantity type, the value distributions to be ranked and presented, and the degree of consensus that can be built to support the proposed quantity distributions. Experiments with over 25 million Web tables and 350 diverse queries show robust, large benefits from our quantity catalog, unit extractor, and collective inference.
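
The contrast in the abstract between early "hardening" and retaining several top-scoring extractions can be illustrated with a toy example. The sketch below is not the paper's collective inference model: it swaps in a simple score-weighted vote, and the unit catalog `TO_MINUTES`, the scores, the tolerance `rel_tol`, and the sample snippets are all hypothetical values chosen for illustration.

```python
# Hypothetical unit catalog: conversion factors to a canonical unit (minutes).
TO_MINUTES = {"hour": 60, "minute": 1}


def canonical(value, unit):
    """Convert a (value, unit) interpretation to minutes."""
    return value * TO_MINUTES[unit]


def harden_early(snippets):
    """Commit to the single top-scoring interpretation of each snippet."""
    picks = (max(cands, key=lambda c: c[2]) for cands in snippets)
    return [canonical(v, u) for v, u, _ in picks]


def consensus(snippets, rel_tol=0.05):
    """Keep all interpretations and return the (value, support) pair
    with the highest total support across snippets.

    A snippet supports a target value with the best score among its
    interpretations that fall within rel_tol of that value.
    """
    best_value, best_support = None, 0.0
    for cands in snippets:
        for v, u, _ in cands:
            target = canonical(v, u)
            support = 0.0
            for other in snippets:
                scores = [s for ov, ou, s in other
                          if abs(canonical(ov, ou) - target) <= rel_tol * abs(target)]
                support += max(scores, default=0.0)
            if support > best_support:
                best_value, best_support = target, support
    return best_value, best_support


# Each snippet: candidate (value, unit, score) interpretations (made-up data).
snippets = [
    [(10, "hour", 0.9)],
    [(600, "hour", 0.6), (600, "minute", 0.4)],   # ambiguous unit
    [(10, "minute", 0.55), (10, "hour", 0.45)],   # ambiguous unit
]
hardened = harden_early(snippets)
value, support = consensus(snippets)
```

With these made-up inputs, hardening each snippet on its own yields three mutually inconsistent values, while the vote over all retained interpretations settles on 600 minutes, because the lower-scored readings of the two ambiguous snippets agree with the unambiguous one.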

KDD - Open-domain quantity queries on web Tables: annotation, response, and consensus models

Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, 2014


Sunita Sarawagi - One of the best experts on this subject based on the ideXlab platform.

Open-domain quantity queries on web tables: annotation, response, and consensus models

Knowledge Discovery and Data Mining, 2014

Co-Authors: Sunita Sarawagi, Soumen Chakrabarti


J. Karvo - One of the best experts on this subject based on the ideXlab platform.

LCN - Notes on the per-flow packet count flow classifier

Proceedings LCN 2001. 26th Annual IEEE Conference on Local Computer Networks, 1

Co-Authors: M. Ilvesmaki, J. Karvo

Abstract:

To realize a packet count classifier, a candidate table, which holds information on flow candidates, is needed in addition to the active flow table. We observe the temporal behavior of the sizes of both the active flow table and the flow candidate table using actual traffic traces. The results indicate that the performance bottleneck in a packet count classifier lies in candidate table management. Also, the candidate table size changes much faster than the active flow table size. Therefore, fast methods for creating and deleting candidate table entries are needed.
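
The interplay between the two tables described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class name `PacketCountClassifier`, the promotion `threshold`, the idle `timeout`, and the use of a dictionary and a set as tables are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class PacketCountClassifier:
    """Toy per-flow packet count classifier with a candidate table."""
    threshold: int = 3      # packets needed for promotion (assumed value)
    timeout: float = 2.0    # idle seconds before a candidate expires (assumed)
    candidates: dict = field(default_factory=dict)  # flow_id -> (count, last_seen)
    active: set = field(default_factory=set)        # promoted flows

    def observe(self, flow_id, now):
        """Process one packet; return True if the flow is active afterwards."""
        if flow_id in self.active:
            return True
        count, last_seen = self.candidates.get(flow_id, (0, now))
        if now - last_seen > self.timeout:
            count = 0  # stale candidate: restart counting
        count += 1
        if count >= self.threshold:
            # Promotion: delete from the candidate table, add to the active table.
            self.candidates.pop(flow_id, None)
            self.active.add(flow_id)
            return True
        self.candidates[flow_id] = (count, now)
        return False

    def expire(self, now):
        """Delete candidates idle for longer than the timeout; return how many."""
        stale = [f for f, (_, seen) in self.candidates.items()
                 if now - seen > self.timeout]
        for f in stale:
            del self.candidates[f]
        return len(stale)
```

In this sketch every packet of a not-yet-active flow touches the candidate table, and short flows enter and leave it without ever being promoted, which is consistent with the abstract's observation that the candidate table changes size much faster than the active flow table.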


Koji Kurokawa - One of the best experts on this subject based on the ideXlab platform.

Fast precise preclassification method using a Candidate Table designed by projections of feature regions

Sixth International Workshop on Digital Image Processing and Computer Graphics: Applications in Humanities and Natural Sciences, 1998

Co-Authors: Katsuhito Fujimoto, Hiroshi Kamada, Koji Kurokawa

Abstract:

For recognition tasks with many categories, such as Japanese character recognition, fast matching algorithms are necessary because the matching process accounts for most of the recognition time. In addition, improving recognition accuracy requires the matching process to use more complicated discrimination functions or a higher-dimensional feature space, which raises computational costs. Pre-classification is therefore used: it outputs a set of candidate categories, reducing the number of times the complicated discrimination functions must be computed. Conventional pre-classification uses simpler discrimination functions in a compressed feature space, but compressing the feature space enough to gain speed sacrifices some matching accuracy because information is lost. We therefore propose a fast, precise pre-classification method using a candidate table. The table is designed from projections of each category's feature region, so it preserves information useful for matching from the structure of the original feature space, and candidate categories are output by a quick table lookup on a small number of reference features of the input pattern. Applied to Japanese character recognition, the proposed method accelerates the matching process by a factor of 3 over the conventional method with no decrease in matching accuracy, which confirms its effectiveness.
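
One way to picture a candidate table built from projections is sketched below. It is a simplified illustration under stated assumptions, not the method from the paper: each category's feature region is assumed to be an axis-aligned box, each reference feature axis is cut into equal-width bins, and the class name `CandidateTable` and all numeric values are hypothetical.

```python
class CandidateTable:
    """Lookup table from binned reference features to candidate categories."""

    def __init__(self, regions, n_bins=10, lo=0.0, hi=100.0):
        # regions: {category: [(min, max) for each reference feature]}
        self.n_bins = n_bins
        self.lo = lo
        self.width = (hi - lo) / n_bins
        n_features = len(next(iter(regions.values())))
        # table[f][b] = categories whose projection on feature f overlaps bin b
        self.table = [[set() for _ in range(n_bins)] for _ in range(n_features)]
        for category, intervals in regions.items():
            for f, (fmin, fmax) in enumerate(intervals):
                for b in range(self._bin(fmin), self._bin(fmax) + 1):
                    self.table[f][b].add(category)

    def _bin(self, x):
        """Map a feature value to a bin index, clamped to the table range."""
        b = int((x - self.lo) / self.width)
        return min(self.n_bins - 1, max(0, b))

    def candidates(self, features):
        """Intersect the per-feature category sets for the input pattern."""
        sets = [self.table[f][self._bin(x)] for f, x in enumerate(features)]
        return set.intersection(*sets)


# Hypothetical feature regions for three categories over two reference features.
regions = {
    "A": [(0, 30), (0, 30)],
    "B": [(20, 60), (50, 90)],
    "C": [(70, 100), (0, 40)],
}
table = CandidateTable(regions)
shortlist = table.candidates([25, 10])
```

Under the box assumption, any category whose region contains the input pattern is always in the returned set, because binning is monotonic; the expensive discrimination functions then only need to run on the shortlisted categories. Coarser bins make the table smaller but admit more false candidates.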


M. Ilvesmaki - One of the best experts on this subject based on the ideXlab platform.

LCN - Notes on the per-flow packet count flow classifier

Proceedings LCN 2001. 26th Annual IEEE Conference on Local Computer Networks, 1

Co-Authors: M. Ilvesmaki, J. Karvo

