The experts below are selected from a list of 34,086 experts worldwide, ranked by the ideXlab platform.
Kyunghyun Cho - One of the best experts on this subject based on the ideXlab platform.
-
iterative refinement in the continuous space for non autoregressive neural machine translation
Empirical Methods in Natural Language Processing, 2020
Co-Authors: Jason Lee, Raphael Shu, Kyunghyun Cho
Abstract: We propose an efficient inference procedure for non-autoregressive machine translation that iteratively refines translations purely in the continuous space. Given a continuous latent variable model for machine translation (Shu et al., 2020), we train an inference network to approximate the gradient of the marginal log probability of the target sentence, using only the latent variable as input. This allows us to use gradient-based optimization to find the target sentence at inference time that approximately maximizes its marginal probability. As each refinement step involves computation only in the latent space of low dimensionality (we use 8 in our experiments), we avoid the computational overhead incurred by existing non-autoregressive inference procedures that often refine in token space. We compare our approach to a recently proposed EM-like inference procedure (Shu et al., 2020) that optimizes in a hybrid space consisting of both discrete and continuous variables. We evaluate our approach on WMT'14 En→De, WMT'16 Ro→En and IWSLT'16 De→En, and observe two advantages over the EM-like inference: (1) it is computationally efficient, i.e. each refinement step is twice as fast, and (2) it is more effective, resulting in higher marginal probabilities and BLEU scores with the same number of refinement steps. On WMT'14 En→De, for instance, our approach decodes 6.2 times faster than the autoregressive model with minimal degradation in translation quality (0.9 BLEU).
-
iterative refinement in the continuous space for non autoregressive neural machine translation
arXiv: Computation and Language, 2020
Co-Authors: Jason Lee, Raphael Shu, Kyunghyun Cho
Abstract: We propose an efficient inference procedure for non-autoregressive machine translation that iteratively refines translations purely in the continuous space. Given a continuous latent variable model for machine translation (Shu et al., 2020), we train an inference network to approximate the gradient of the marginal log probability of the target sentence, using only the latent variable as input. This allows us to use gradient-based optimization to find the target sentence at inference time that approximately maximizes its marginal probability. As each refinement step involves computation only in the latent space of low dimensionality (we use 8 in our experiments), we avoid the computational overhead incurred by existing non-autoregressive inference procedures that often refine in token space. We compare our approach to a recently proposed EM-like inference procedure (Shu et al., 2020) that optimizes in a hybrid space consisting of both discrete and continuous variables. We evaluate our approach on WMT'14 En→De, WMT'16 Ro→En and IWSLT'16 De→En, and observe two advantages over the EM-like inference: (1) it is computationally efficient, i.e. each refinement step is twice as fast, and (2) it is more effective, resulting in higher marginal probabilities and BLEU scores with the same number of refinement steps. On WMT'14 En→De, for instance, our approach decodes 6.2 times faster than the autoregressive model with minimal degradation in translation quality (0.9 BLEU).
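The refinement loop described above can be sketched in a few lines. This is a toy stand-in, not the paper's system: the learned inference network is replaced by the analytic gradient of a Gaussian log-density, so the dynamics can be checked exactly. All names (`predicted_grad`, `refine_latent`, `z_star`) are hypothetical.

```python
import numpy as np

# Toy stand-in for the learned inference network: it predicts the gradient
# of the marginal log-probability with respect to the latent z. Here that
# gradient is the analytic gradient of -0.5 * ||z - z_star||^2.
def predicted_grad(z, z_star):
    return -(z - z_star)

def refine_latent(z0, z_star, steps=20, lr=0.5):
    """Iteratively refine the latent purely in the continuous space
    by gradient ascent on the (approximate) marginal log-probability."""
    z = z0.copy()
    for _ in range(steps):
        z = z + lr * predicted_grad(z, z_star)
    return z

rng = np.random.default_rng(0)
z_star = rng.normal(size=8)   # 8-dimensional latent, as in the paper
z0 = rng.normal(size=8)       # initial latent guess
z = refine_latent(z0, z_star)
print(np.linalg.norm(z - z_star))  # small: the refined latent is near the optimum
```

Because each step touches only an 8-dimensional vector, the cost per refinement is negligible next to decoding in token space, which is the efficiency argument the abstract makes.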
Michael R Kosorok - One of the best experts on this subject based on the ideXlab platform.
-
likelihood based Inference for current status data on a grid a boundary phenomenon and an adaptive Inference Procedure
arXiv: Statistics Theory, 2012
Co-Authors: Runlong Tang, Moulinath Banerjee, Michael R Kosorok
Abstract: In this paper, we study the nonparametric maximum likelihood estimator for an event time distribution function at a point in the current status model with observation times supported on a grid of potentially unknown sparsity and with multiple subjects sharing the same observation time. This is of interest since observation time ties occur frequently with current status data. The grid resolution is specified as $cn^{-\gamma}$ with $c>0$ a scaling constant and $\gamma>0$ regulating the sparsity of the grid relative to $n$, the number of subjects. The asymptotic behavior falls into three cases depending on $\gamma$: regular Gaussian-type asymptotics obtain for $\gamma<1/3$, non-standard asymptotics prevail for $\gamma>1/3$, and $\gamma=1/3$ serves as a boundary at which the transition happens. The limit distribution at the boundary is different from either of the previous cases and converges weakly to those obtained with $\gamma\in(0,1/3)$ and $\gamma\in(1/3,\infty)$ as $c$ goes to $\infty$ and 0, respectively. This weak convergence allows us to develop an adaptive procedure to construct confidence intervals for the value of the event time distribution at a point of interest without needing to know or estimate $\gamma$, which is of enormous advantage from the perspective of inference. A simulation study of the adaptive procedure is presented.
-
likelihood based Inference for current status data on a grid a boundary phenomenon and an adaptive Inference Procedure
Annals of Statistics, 2012
Co-Authors: Runlong Tang, Moulinath Banerjee, Michael R Kosorok
Abstract: In this paper, we study the nonparametric maximum likelihood estimator (NPMLE) for an event time distribution function at a point in the current status model with observation times supported on a grid of potentially unknown sparsity and with multiple subjects sharing the same observation time. This is of interest since observation time ties occur frequently with current status data. The grid resolution is specified as $cn^{-\gamma}$ with $c>0$ a scaling constant and $\gamma>0$ regulating the sparsity of the grid relative to the number of subjects ($n$). The asymptotic behavior falls into three cases depending on $\gamma$: regular 'normal-type' asymptotics obtain for $\gamma<1/3$, non-standard asymptotics prevail for $\gamma>1/3$, and $\gamma=1/3$ serves as a boundary at which the transition happens. The limit distribution at the boundary is different from either of the previous cases and converges weakly to those obtained with $\gamma\in(0,1/3)$ and $\gamma\in(1/3,\infty)$ as $c$ goes to $\infty$ and 0, respectively. This weak convergence allows us to develop an adaptive procedure to construct confidence intervals for the value of the event time distribution at a point of interest without needing to know or estimate $\gamma$, which is of enormous advantage from the perspective of inference. A simulation study of the adaptive procedure is presented.
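The NPMLE studied in these papers is, at the distinct observation times, the isotonic regression of the event indicators weighted by the number of tied subjects, computable by the pool-adjacent-violators algorithm. A minimal sketch of that standard estimator follows (it is not the authors' adaptive confidence interval procedure; function names are illustrative):

```python
from collections import defaultdict

def pava(y, w):
    """Weighted pool-adjacent-violators: non-decreasing fit to values y with weights w."""
    means, weights, counts = [], [], []
    for yi, wi in zip(y, w):
        means.append(float(yi)); weights.append(float(wi)); counts.append(1)
        # merge backwards while monotonicity is violated
        while len(means) > 1 and means[-2] > means[-1]:
            m2, w2, c2 = means.pop(), weights.pop(), counts.pop()
            m1, w1, c1 = means.pop(), weights.pop(), counts.pop()
            means.append((w1 * m1 + w2 * m2) / (w1 + w2))
            weights.append(w1 + w2); counts.append(c1 + c2)
    fit = []
    for m, c in zip(means, counts):
        fit.extend([m] * c)
    return fit

def current_status_npmle(obs_times, events):
    """NPMLE of the event time distribution F at the distinct (sorted)
    observation times: isotonic regression of the event indicators,
    weighted by the number of subjects tied at each grid point."""
    agg = defaultdict(lambda: [0.0, 0])
    for t, d in zip(obs_times, events):
        agg[t][0] += d
        agg[t][1] += 1
    grid = sorted(agg)
    y = [agg[t][0] / agg[t][1] for t in grid]
    w = [agg[t][1] for t in grid]
    return grid, pava(y, w)

# Six subjects on a grid of three observation times, with ties:
grid, F = current_status_npmle([1, 1, 2, 2, 2, 3], [1, 0, 0, 0, 1, 1])
print(grid, F)  # F is monotone non-decreasing
```

The ties at each grid point are exactly what the papers exploit: when many subjects share an observation time, the raw indicator means are already informative, which is why the coarse-grid regime behaves like a parametric (normal-type) problem.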
Wouter M Kouw - One of the best experts on this subject based on the ideXlab platform.
-
online system identification in a duffing oscillator by free energy minimisation
European Conference on Machine Learning, 2020
Co-Authors: Wouter M Kouw
Abstract: Online system identification is the estimation of parameters of a dynamical system, such as mass or friction coefficients, as each measurement of the input and output signals arrives. Here, the nonlinear stochastic differential equation of a Duffing oscillator is cast as a generative model, and the dynamical parameters are inferred using variational message passing on a factor graph of the model. The approach is validated with an experiment on data from an electronic implementation of a Duffing oscillator. The proposed inference procedure performs as well as offline prediction error minimisation in a state-of-the-art nonlinear model.
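For concreteness, the generative model being identified is the Duffing equation x'' + δx' + αx + βx³ = γ cos(ωt) plus process noise. A minimal Euler-Maruyama simulation of such an oscillator is sketched below; it only generates the kind of data the paper's inference procedure consumes (the variational message passing itself is not reproduced here), and the parameter values are illustrative, not those of the paper's electronic implementation.

```python
import numpy as np

def simulate_duffing(alpha=1.0, beta=5.0, delta=0.1, gamma=0.5, omega=1.2,
                     dt=1e-3, steps=20000, noise_std=0.01, seed=0):
    """Euler-Maruyama simulation of a stochastically driven Duffing oscillator:
        x'' + delta*x' + alpha*x + beta*x**3 = gamma*cos(omega*t) + noise
    Returns the position and velocity trajectories."""
    rng = np.random.default_rng(seed)
    x, v = np.empty(steps), np.empty(steps)
    x[0], v[0] = 1.0, 0.0
    for k in range(steps - 1):
        t = k * dt
        accel = (gamma * np.cos(omega * t)
                 - delta * v[k] - alpha * x[k] - beta * x[k] ** 3)
        x[k + 1] = x[k] + dt * v[k]
        v[k + 1] = v[k] + dt * accel + noise_std * np.sqrt(dt) * rng.normal()
    return x, v

x, v = simulate_duffing()
print(x.shape, float(np.abs(x).max()))
```

An online identifier would process `x` one sample at a time, updating its posterior over (alpha, beta, delta) after each step, which is the setting the abstract describes.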
Dav Zimak - One of the best experts on this subject based on the ideXlab platform.
-
semantic role labeling via integer linear programming Inference
International Conference on Computational Linguistics, 2004
Co-Authors: Vasin Punyakanok, Dan Roth, Wentau Yih, Dav Zimak
Abstract: We present a system for the semantic role labeling task. The system combines a machine learning technique with an inference procedure based on integer linear programming that supports the incorporation of linguistic and structural constraints into the decision process. The system is tested on the data provided in the CoNLL-2004 shared task on semantic role labeling and achieves very competitive results.
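The core idea, combining per-argument classifier scores with global structural constraints, can be illustrated without an ILP solver. The sketch below brute-forces the same objective an ILP would encode: pick one label per candidate argument to maximise the summed scores, subject to each core role appearing at most once. The scores, labels, and constraint are invented for illustration; the paper's actual system uses a real integer linear program over its own constraint set.

```python
from itertools import product

def constrained_decode(scores, core_roles=("A0", "A1")):
    """scores: one dict per candidate argument mapping label -> classifier score.
    Returns the highest-scoring assignment in which each core role appears
    at most once (the structural constraint an ILP would encode)."""
    labels_per_arg = [list(s) for s in scores]
    best, best_val = None, float("-inf")
    for assignment in product(*labels_per_arg):
        if any(assignment.count(r) > 1 for r in core_roles):
            continue  # violates the role-uniqueness constraint
        val = sum(s[label] for s, label in zip(scores, assignment))
        if val > best_val:
            best, best_val = assignment, val
    return best

scores = [
    {"A0": 0.9, "A1": 0.3, "O": 0.1},
    {"A0": 0.8, "A1": 0.6, "O": 0.2},
]
print(constrained_decode(scores))  # → ('A0', 'A1')
```

Note that taking the per-argument argmax would label both arguments A0; the constraint forces the second argument to its next-best role, which is exactly the correction global inference buys over independent classification.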
Jason Lee - One of the best experts on this subject based on the ideXlab platform.
-
iterative refinement in the continuous space for non autoregressive neural machine translation
Empirical Methods in Natural Language Processing, 2020
Co-Authors: Jason Lee, Raphael Shu, Kyunghyun Cho
Abstract: We propose an efficient inference procedure for non-autoregressive machine translation that iteratively refines translations purely in the continuous space. Given a continuous latent variable model for machine translation (Shu et al., 2020), we train an inference network to approximate the gradient of the marginal log probability of the target sentence, using only the latent variable as input. This allows us to use gradient-based optimization to find the target sentence at inference time that approximately maximizes its marginal probability. As each refinement step involves computation only in the latent space of low dimensionality (we use 8 in our experiments), we avoid the computational overhead incurred by existing non-autoregressive inference procedures that often refine in token space. We compare our approach to a recently proposed EM-like inference procedure (Shu et al., 2020) that optimizes in a hybrid space consisting of both discrete and continuous variables. We evaluate our approach on WMT'14 En→De, WMT'16 Ro→En and IWSLT'16 De→En, and observe two advantages over the EM-like inference: (1) it is computationally efficient, i.e. each refinement step is twice as fast, and (2) it is more effective, resulting in higher marginal probabilities and BLEU scores with the same number of refinement steps. On WMT'14 En→De, for instance, our approach decodes 6.2 times faster than the autoregressive model with minimal degradation in translation quality (0.9 BLEU).
-
iterative refinement in the continuous space for non autoregressive neural machine translation
arXiv: Computation and Language, 2020
Co-Authors: Jason Lee, Raphael Shu, Kyunghyun Cho
Abstract: We propose an efficient inference procedure for non-autoregressive machine translation that iteratively refines translations purely in the continuous space. Given a continuous latent variable model for machine translation (Shu et al., 2020), we train an inference network to approximate the gradient of the marginal log probability of the target sentence, using only the latent variable as input. This allows us to use gradient-based optimization to find the target sentence at inference time that approximately maximizes its marginal probability. As each refinement step involves computation only in the latent space of low dimensionality (we use 8 in our experiments), we avoid the computational overhead incurred by existing non-autoregressive inference procedures that often refine in token space. We compare our approach to a recently proposed EM-like inference procedure (Shu et al., 2020) that optimizes in a hybrid space consisting of both discrete and continuous variables. We evaluate our approach on WMT'14 En→De, WMT'16 Ro→En and IWSLT'16 De→En, and observe two advantages over the EM-like inference: (1) it is computationally efficient, i.e. each refinement step is twice as fast, and (2) it is more effective, resulting in higher marginal probabilities and BLEU scores with the same number of refinement steps. On WMT'14 En→De, for instance, our approach decodes 6.2 times faster than the autoregressive model with minimal degradation in translation quality (0.9 BLEU).