Online Dictionary

14,000,000 Leading Edge Experts on the ideXlab platform

The Experts below are selected from a list of 9429 Experts worldwide ranked by ideXlab platform

Chang Wen Chen - One of the best experts on this subject based on the ideXlab platform.

  • Sparse Representation With Spatio-Temporal Online Dictionary Learning for Promising Video Coding
    IEEE Transactions on Image Processing, 2016
    Co-Authors: Wenrui Dai, Xin Tang, Hongkai Xiong, Yangmei Shen, Junni Zou, Chang Wen Chen
    Abstract:

    Classical dictionary learning methods for video coding suffer from high computational complexity, and their coding efficiency is impaired because they disregard the underlying distribution of the training data. This paper proposes a spatio-temporal online dictionary learning (STOL) algorithm that speeds up the convergence of dictionary learning while guaranteeing the approximation error. The proposed algorithm uses stochastic gradient descent to form a dictionary of pairs of 3D low-frequency and high-frequency spatio-temporal volumes. In each iteration of the learning process, it randomly selects one sample volume and updates the dictionary atoms by minimizing the expected cost, rather than optimizing the empirical cost over the complete training data as batch learning methods such as K-SVD do. Since the selected volumes are assumed to be independent and identically distributed samples from the underlying distribution, the decomposition coefficients obtained from the trained dictionary are well suited for sparse representation. Theoretically, the proposed STOL is proved to achieve a better approximation for sparse representation than K-SVD while maintaining both structured sparsity and hierarchical sparsity. It is shown to outperform batch gradient descent methods (K-SVD) in convergence speed and computational complexity, and its upper bound on the prediction error is asymptotically equal to the training error. Extensive experiments validate that, with lower computational complexity, the STOL-based coding scheme achieves improvements over H.264/AVC, High Efficiency Video Coding, and existing super-resolution-based methods in rate-distortion performance and visual quality.
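The core update loop described above (sample one training example, sparse-code it against the current dictionary, take a stochastic gradient step on the selected atoms) can be sketched in a few lines. This is a generic online dictionary learning toy on 1D signals with a single-atom matching-pursuit code, not the paper's STOL algorithm, which learns paired 3D low-/high-frequency spatio-temporal volumes; the function name and parameters are illustrative.

```python
import numpy as np

def online_dictionary_learning(signals, n_atoms=8, n_iters=500, lr=0.1, seed=0):
    """Toy online dictionary learning: one random i.i.d. sample per
    iteration (not a full batch as in K-SVD), a single-atom matching-pursuit
    sparse code, and a stochastic gradient step on the chosen atom."""
    rng = np.random.default_rng(seed)
    dim = signals.shape[1]
    D = rng.standard_normal((dim, n_atoms))
    D /= np.linalg.norm(D, axis=0)                # unit-norm atoms
    for _ in range(n_iters):
        x = signals[rng.integers(len(signals))]   # one random sample volume
        scores = D.T @ x
        k = int(np.argmax(np.abs(scores)))        # best-matching atom
        coef = scores[k]
        residual = x - coef * D[:, k]             # reconstruction error
        # Descend the expected cost: grad of 0.5*||x - coef*d_k||^2 wrt d_k
        D[:, k] += lr * coef * residual
        D[:, k] /= np.linalg.norm(D[:, k])        # keep the atom unit-norm
    return D
```

Because each step touches only the one sampled signal and one atom, the per-iteration cost stays constant in the training-set size, which is the source of the convergence-speed advantage over batch methods claimed above.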

Tsvi Kopelowitz - One of the best experts on this subject based on the ideXlab platform.

  • Mind the Gap: Essentially Optimal Algorithms for Online Dictionary Matching with One Gap
    International Symposium on Algorithms and Computation, 2016
    Co-Authors: Tsvi Kopelowitz, Ely Porat, Amihood Amir, Avivit Levy, Seth Pettie, Riva B Shalom
    Abstract:

    We examine the complexity of the Online Dictionary Matching with One Gap problem (DMOG): preprocess a dictionary D of d patterns, where each pattern contains a special gap symbol that can match any string, so that given a text that arrives online, one character at a time, we can report all patterns from D that are suffixes of the text seen so far before the next character arrives. In more general versions, the gap symbols are associated with bounds determining the possible lengths of the matching strings. Online DMOG captures the difficulty in a bottleneck procedure for cyber-security, as many digital signatures of viruses manifest themselves as patterns with a single gap. In this paper, we demonstrate that the difficulty of obtaining efficient solutions for the DMOG problem, even in the offline setting, can be traced back to the infamous 3SUM conjecture. We show a conditional lower bound of Omega(delta(G_D) + op) time per text character, where G_D is a bipartite graph that captures the structure of D, delta(G_D) is the degeneracy of this graph, and op is the output size. Moreover, we show a conditional lower bound in terms of the magnitude of the gaps for the bounded case, thereby showing that some known offline upper bounds are essentially optimal. We also provide matching upper bounds (up to sub-polynomial factors), in terms of the degeneracy, for the online DMOG problem. In particular, we introduce algorithms whose time cost depends linearly on delta(G_D). Our algorithms make use of graph orientations, together with some additional techniques. They are of practical interest because, although delta(G_D) can be as large as sqrt(d), and even larger if G_D is a multi-graph, it is typically a very small constant in practice. Finally, when delta(G_D) is large, we are able to obtain even more efficient solutions.
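The quantity delta(G_D) above is the graph's degeneracy: the smallest k such that every subgraph contains a vertex of degree at most k. It can be computed by repeatedly peeling off a minimum-degree vertex (the classic Matula-Beck procedure); the sketch below assumes an adjacency-set representation and uses a naive O(n) minimum search per step for clarity. It illustrates the parameter only and is not part of the paper's algorithms.

```python
def degeneracy(adj):
    """Degeneracy of an undirected graph given as {vertex: set_of_neighbors}.
    Repeatedly remove a minimum-degree vertex; the answer is the largest
    degree seen at the moment of removal (Matula & Beck peeling order)."""
    deg = {v: len(ns) for v, ns in adj.items()}
    remaining = set(adj)
    k = 0
    while remaining:
        v = min(remaining, key=deg.get)   # peel a current minimum-degree vertex
        k = max(k, deg[v])                # degeneracy = max degree at removal
        remaining.remove(v)
        for u in adj[v]:                  # removing v lowers neighbors' degrees
            if u in remaining:
                deg[u] -= 1
    return k
```

For example, a star graph has degeneracy 1 however many leaves it has, matching the abstract's point that delta(G_D) is typically a very small constant even when vertex degrees are large.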

  • Succinct Online Dictionary Matching with Improved Worst-Case Guarantees
    Combinatorial Pattern Matching, 2016
    Co-Authors: Tsvi Kopelowitz, Ely Porat, Yaron Rozen
    Abstract:

    In the online dictionary matching problem, the goal is to preprocess a set of patterns D = {P_1, ..., P_d} over an alphabet Sigma, so that given an online text (one character at a time) we report all occurrences of patterns that are a suffix of the current text before the following character arrives. We introduce a succinct Aho-Corasick-like data structure for the online dictionary matching problem. Our solution uses a new succinct representation for multi-labeled trees, in which each node has a set of labels drawn from a universe of size lambda. We consider lowest labeled ancestor (LLA) queries on multi-labeled trees: given a node and a label, return the lowest proper ancestor of the node that has the queried label. In this paper we introduce a succinct representation of multi-labeled trees for lambda = omega(1) that supports LLA queries in O(log(log(lambda))) time. Using this representation of multi-labeled trees, we introduce a succinct data structure for the online dictionary matching problem when sigma = omega(1). In this solution the worst-case cost per character is O(log(log(sigma)) + occ) time, where occ is the size of the current output. Moreover, the amortized cost per character is O(1 + occ) time.
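For context, the online interface described above (consume one character, then report every dictionary pattern that is a suffix of the text so far) is exactly what a classical pointer-based Aho-Corasick automaton provides; the paper's contribution is supporting the same queries in succinct space. A minimal non-succinct baseline sketch:

```python
from collections import deque

class AhoCorasick:
    """Plain (pointer-based) Aho-Corasick automaton for online dictionary
    matching: feed one character at a time; after each character, get every
    pattern that is a suffix of the text read so far. Illustrative baseline
    only -- not the succinct representation the paper constructs."""

    def __init__(self, patterns):
        self.goto = [{}]        # per-state transition maps (trie edges)
        self.fail = [0]         # failure links
        self.out = [[]]         # patterns ending at each state
        for p in patterns:      # build the trie of the dictionary
            s = 0
            for c in p:
                if c not in self.goto[s]:
                    self.goto[s][c] = len(self.goto)
                    self.goto.append({})
                    self.fail.append(0)
                    self.out.append([])
                s = self.goto[s][c]
            self.out[s].append(p)
        q = deque(self.goto[0].values())   # BFS to compute failure links
        while q:
            s = q.popleft()
            for c, t in self.goto[s].items():
                q.append(t)
                f = self.fail[s]
                while f and c not in self.goto[f]:
                    f = self.fail[f]
                self.fail[t] = self.goto[f].get(c, 0)
                self.out[t] += self.out[self.fail[t]]  # inherit shorter matches
        self.state = 0

    def feed(self, c):
        """Consume one text character; return all patterns ending here."""
        while self.state and c not in self.goto[self.state]:
            self.state = self.fail[self.state]
        self.state = self.goto[self.state].get(c, 0)
        return self.out[self.state]
```

On the classic example, feeding the text "ushers" against the dictionary {"he", "she", "his", "hers"} reports "she" and "he" after the 'e' and "hers" after the final 's', each before the next character is read.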

Yaron Rozen - One of the best experts on this subject based on the ideXlab platform.

  • Succinct Online Dictionary Matching with Improved Worst-Case Guarantees
    Combinatorial Pattern Matching, 2016
    Co-Authors: Tsvi Kopelowitz, Ely Porat, Yaron Rozen
    Abstract:

    In the online dictionary matching problem, the goal is to preprocess a set of patterns D = {P_1, ..., P_d} over an alphabet Sigma, so that given an online text (one character at a time) we report all occurrences of patterns that are a suffix of the current text before the following character arrives. We introduce a succinct Aho-Corasick-like data structure for the online dictionary matching problem. Our solution uses a new succinct representation for multi-labeled trees, in which each node has a set of labels drawn from a universe of size lambda. We consider lowest labeled ancestor (LLA) queries on multi-labeled trees: given a node and a label, return the lowest proper ancestor of the node that has the queried label. In this paper we introduce a succinct representation of multi-labeled trees for lambda = omega(1) that supports LLA queries in O(log(log(lambda))) time. Using this representation of multi-labeled trees, we introduce a succinct data structure for the online dictionary matching problem when sigma = omega(1). In this solution the worst-case cost per character is O(log(log(sigma)) + occ) time, where occ is the size of the current output. Moreover, the amortized cost per character is O(1 + occ) time.

Taha Yasseri - One of the best experts on this subject based on the ideXlab platform.

  • Emo, Love and God: Making Sense of Urban Dictionary, a Crowd-Sourced Online Dictionary
    Royal Society Open Science, 2018
    Co-Authors: Dong Nguyen, Barbara Mcgillivray, Taha Yasseri
    Abstract:

    The Internet facilitates large-scale collaborative projects, and the emergence of Web 2.0 platforms, where producers and consumers of content unify, has drastically changed the information market. On the one hand, the promise of the 'wisdom of the crowd' has inspired successful projects such as Wikipedia, which has become the primary source of crowd-based information in many languages. On the other hand, the decentralized and often unmonitored environment of such projects may make them susceptible to low-quality content. In this work, we focus on Urban Dictionary, a crowd-sourced online dictionary. We combine computational methods with qualitative annotation to shed light on the overall features of Urban Dictionary in terms of growth, coverage, and types of content. We measure a high presence of opinion-focused entries, as opposed to the meaning-focused entries we expect from traditional dictionaries. Furthermore, Urban Dictionary covers many informal, unfamiliar words as well as proper nouns. Urban Dictionary also contains offensive content, but highly offensive content tends to receive lower scores through the dictionary's voting system. The low threshold for including new material in Urban Dictionary enables quick recording of new words and new meanings, but the resulting heterogeneous content can pose challenges when using Urban Dictionary as a source to study language innovation.

  • Emo, Love and God: Making Sense of Urban Dictionary, a Crowd-Sourced Online Dictionary
    arXiv: Computation and Language, 2017
    Co-Authors: Dong Nguyen, Barbara Mcgillivray, Taha Yasseri
    Abstract:

    The Internet facilitates large-scale collaborative projects. The emergence of Web 2.0 platforms, where producers and consumers of content unify, has drastically changed the information market. On the one hand, the promise of the "wisdom of the crowd" has inspired successful projects such as Wikipedia, which has become the primary source of crowd-based information in many languages. On the other hand, the decentralized and often unmonitored environment of such projects may make them susceptible to systematic malfunction and misbehavior. In this work, we focus on Urban Dictionary, a crowd-sourced online dictionary. We combine computational methods with qualitative annotation to shed light on the overall features of Urban Dictionary in terms of growth, coverage, and types of content. We measure a high presence of opinion-focused entries, as opposed to the meaning-focused entries we expect from traditional dictionaries. Furthermore, Urban Dictionary covers many informal, unfamiliar words as well as proper nouns. There is also a high presence of offensive content, but highly offensive content tends to receive lower scores through the voting system. Our study highlights that Urban Dictionary has higher content heterogeneity than traditional dictionaries, which poses challenges for processing but also offers opportunities to analyze and track language innovation.

Wenrui Dai - One of the best experts on this subject based on the ideXlab platform.

  • Sparse Representation With Spatio-Temporal Online Dictionary Learning for Promising Video Coding
    IEEE Transactions on Image Processing, 2016
    Co-Authors: Wenrui Dai, Xin Tang, Hongkai Xiong, Yangmei Shen, Junni Zou, Chang Wen Chen
    Abstract:

    Classical dictionary learning methods for video coding suffer from high computational complexity, and their coding efficiency is impaired because they disregard the underlying distribution of the training data. This paper proposes a spatio-temporal online dictionary learning (STOL) algorithm that speeds up the convergence of dictionary learning while guaranteeing the approximation error. The proposed algorithm uses stochastic gradient descent to form a dictionary of pairs of 3D low-frequency and high-frequency spatio-temporal volumes. In each iteration of the learning process, it randomly selects one sample volume and updates the dictionary atoms by minimizing the expected cost, rather than optimizing the empirical cost over the complete training data as batch learning methods such as K-SVD do. Since the selected volumes are assumed to be independent and identically distributed samples from the underlying distribution, the decomposition coefficients obtained from the trained dictionary are well suited for sparse representation. Theoretically, the proposed STOL is proved to achieve a better approximation for sparse representation than K-SVD while maintaining both structured sparsity and hierarchical sparsity. It is shown to outperform batch gradient descent methods (K-SVD) in convergence speed and computational complexity, and its upper bound on the prediction error is asymptotically equal to the training error. Extensive experiments validate that, with lower computational complexity, the STOL-based coding scheme achieves improvements over H.264/AVC, High Efficiency Video Coding, and existing super-resolution-based methods in rate-distortion performance and visual quality.