Nonterminals

The experts below are selected from a list of 1,278 experts worldwide ranked by the ideXlab platform.

Susumi Hatakeyama - One of the best experts on this subject based on the ideXlab platform.

Kosaburo Hashiguchi - One of the best experts on this subject based on the ideXlab platform.

Chihiro Shibata - One of the best experts on this subject based on the ideXlab platform.

  • Learning (k, l)-context-sensitive probabilistic grammars with nonparametric Bayesian approach
    Machine Learning, 2021
    Co-Authors: Chihiro Shibata
    Abstract:

    Inferring formal grammars with a nonparametric Bayesian approach is one of the most powerful approaches for achieving high accuracy from unsupervised data. In this paper, mildly context-sensitive probabilities, called (k, l)-context-sensitive probabilities, are defined on context-free grammars (CFGs). Inferring CFGs whose rule probabilities are identified from contexts can be seen as a kind of dual approach to distributional learning, in which contexts characterize substrings. The data sparsity of the context-sensitive probabilities is handled by the smoothing effect of hierarchical nonparametric Bayesian models such as Pitman–Yor processes (PYPs). We define the hierarchy of PYPs naturally by augmenting the infinite PCFGs. Blocked Gibbs sampling is known to be effective for inferring PCFGs. We show that, by modifying the inside probabilities, blocked Gibbs sampling can be applied to (k, l)-context-sensitive probabilistic grammars (a minimal sketch of the plain inside computation appears after this list). At the same time, we show that the time complexity for the (k, l)-context-sensitive probabilities of a CFG is $O(|V|^{l+3}|w|^3)$ for each sentence w, where V is the set of nonterminals. Since it is computationally too expensive to iterate the sampling sufficiently many times, especially when |V| is not small, alternative sampling algorithms are required. Therefore, we propose a new sampling method called composite sampling, in which the sampling procedure is separated into sub-procedures for nonterminals and for derivation trees. Finally, we demonstrate that the inferred (k, 0)-context-sensitive probabilistic grammars achieve lower perplexities than other probabilistic language models such as PCFGs, n-grams, and HMMs.

  • Inferring (k, l)-context-sensitive probabilistic context-free grammars using hierarchical Pitman-Yor processes
    2015
    Co-Authors: Chihiro Shibata, Makoto Kanazawa, Alexander Clark, Ryo Yoshinaka
    Abstract:

    Motivated by the idea of applying nonparametric Bayesian models to dual approaches for distributional learning, we define (k, l)-context-sensitive probabilistic context-free grammars (PCFGs) using hierarchical Pitman-Yor processes (PYPs). The data sparseness problem that occurs when inferring context-sensitive probabilities for rules is handled by the smoothing effect of hierarchical PYPs. Many possible definitions or constructions of PYP hierarchies can be used to represent the context sensitivity of derivations of CFGs in Chomsky normal form. In this study, we use a definition that we consider the most natural extension of the infinite PCFGs defined in previous studies. A Markov chain Monte Carlo method called blocked Metropolis-Hastings (MH) sampling is known to be effective for inferring PCFGs from unsupervised sentences. Blocked MH sampling is applicable to (k, l)-context-sensitive PCFGs by modifying their so-called inside probabilities. We show that the computational cost of blocked MH sampling for (k, l)-context-sensitive PCFGs is $O(|V|^{l+3}|s|^3)$ for each sentence s, where V is the set of nonterminals. This cost is too high to iterate the sampling sufficiently many times, especially when $l \neq 0$, so we propose an alternative sampling method that separates the sampling procedure into pointwise sampling for nonterminals and blocked sampling for rules. The computational cost of this sampling method is $O(\min\{|s|^l, |V|^l\}(|V||s|^2 + |s|^3))$.
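
Both abstracts above build on the inside algorithm for PCFGs in Chomsky normal form, with rule probabilities additionally conditioned on a (k, l)-context. As a rough illustration of where the $O(|V|^{l+3}|w|^3)$ bound comes from, here is a minimal sketch of the plain inside computation in Python; the toy grammar, its probabilities, and the function name are assumptions made for illustration only, and the (k, l)-context conditioning and Pitman-Yor smoothing described in the papers are merely noted in the comments.

```python
# Minimal sketch: inside probabilities for a PCFG in Chomsky normal form.
# The toy grammar, rule probabilities, and function name are illustrative
# assumptions, not taken from the papers above.
from collections import defaultdict

# Binary rules (A, B, C) -> P(A -> B C); lexical rules (A, word) -> P(A -> word).
binary_rules = {
    ("S", "NP", "VP"): 1.0,
    ("NP", "Det", "N"): 1.0,
    ("VP", "V", "NP"): 1.0,
}
lexical_rules = {
    ("Det", "the"): 1.0,
    ("N", "dog"): 0.5,
    ("N", "cat"): 0.5,
    ("V", "saw"): 1.0,
}

def inside_probabilities(words):
    """CKY-style inside algorithm: inside[(i, j)][A] = P(A =>* words[i:j]).

    The plain version below costs O(|V|^3 |w|^3). Conditioning each rule
    probability on a (k, l)-context (as in the papers) multiplies the
    nonterminal factor, which is where the O(|V|^{l+3} |w|^3) bound comes from.
    """
    n = len(words)
    inside = defaultdict(lambda: defaultdict(float))
    for i, w in enumerate(words):                      # spans of length 1
        for (A, word), p in lexical_rules.items():
            if word == w:
                inside[(i, i + 1)][A] += p
    for length in range(2, n + 1):                     # longer spans, bottom-up
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):                  # split point
                for (A, B, C), p in binary_rules.items():
                    left = inside[(i, k)].get(B, 0.0)
                    right = inside[(k, j)].get(C, 0.0)
                    if left and right:
                        inside[(i, j)][A] += p * left * right
    return inside

sentence = "the dog saw the cat".split()
chart = inside_probabilities(sentence)
print(chart[(0, len(sentence))]["S"])   # probability of the sentence under S
```

Conditioning each rule probability on up to l nonterminals of surrounding context multiplies the work per rule and split point by at most $|V|^l$, which is roughly where the extra factor in the abstracts' complexity bounds comes from.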

Gyorgy Vaszil - One of the best experts on this subject based on the ideXlab platform.

Min Zhang - One of the best experts on this subject based on the ideXlab platform.

  • Learning semantic representations for nonterminals in hierarchical phrase-based translation
    Empirical Methods in Natural Language Processing, 2015
    Co-Authors: Xing Wang, Deyi Xiong, Min Zhang
    Abstract:

    In hierarchical phrase-based translation, coarse-grained nonterminal Xs may generate inappropriate translations due to the lack of sufficient information for phrasal substitution. In this paper we propose a framework that refines the nonterminals in hierarchical translation rules with real-valued semantic representations. The semantic representations are learned via a weighted mean value and a minimum distance method, using phrase vector representations obtained from a large-scale monolingual corpus. Based on the learned semantic vectors, we build a semantic nonterminal refinement model to measure semantic similarities between phrasal substitutions and nonterminal Xs in translation rules. Experimental results on Chinese-English translation show that the proposed model significantly improves translation quality on NIST test sets.
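
As a rough illustration of the kind of scoring this abstract describes, the sketch below computes a semantic vector for a nonterminal slot as a weighted mean of phrase vectors and scores a candidate phrasal substitution by cosine similarity. The toy vectors, weights, and function names are assumptions made for illustration only; the actual weighted-mean and minimum-distance estimators and the refinement model are as described in the paper.

```python
# Minimal sketch: scoring a phrasal substitution against a nonterminal slot
# by cosine similarity of semantic vectors. Vectors, weights, and names below
# are illustrative assumptions, not the paper's data or exact method.
import numpy as np

def weighted_mean_vector(phrase_vectors, weights):
    """Semantic vector for a nonterminal slot: weighted mean of the vectors
    of phrases observed to fill that slot (weights, e.g., corpus frequencies)."""
    vecs = np.asarray(phrase_vectors, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return (w[:, None] * vecs).sum(axis=0)

def cosine_similarity(u, v):
    """Similarity score between a candidate substitution and the slot vector."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy vectors of phrases that previously filled a nonterminal X slot in a rule.
observed_fillers = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1], [0.7, 0.3, 0.0]]
filler_counts = [5, 3, 2]

x_vector = weighted_mean_vector(observed_fillers, filler_counts)

candidate = np.array([0.85, 0.15, 0.05])       # phrase proposed to substitute for X
print(cosine_similarity(x_vector, candidate))  # higher = semantically closer fit
```

In a decoder, a score of this kind would be added as one feature among the usual translation and language model features, so that substitutions semantically closer to the slot's learned representation are preferred.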