Simplification - Explore the Science & Experts

The Experts below are selected from a list of 1113003 Experts worldwide ranked by ideXlab platform

Mirella Lapata - One of the best experts on this subject based on the ideXlab platform.

Sentence Simplification with Deep Reinforcement Learning

arXiv: Computation and Language, 2017

Co-Authors: Xingxing Zhang, Mirella Lapata

Abstract:

Sentence Simplification aims to make sentences easier to read and understand. Most recent approaches draw on insights from machine translation to learn Simplification rewrites from monolingual corpora of complex and simple sentences. We address the Simplification problem with an encoder-decoder model coupled with a deep reinforcement learning framework. Our model, which we call DRESS (as shorthand for Deep REinforcement Sentence Simplification), explores the space of possible Simplifications while learning to optimize a reward function that encourages outputs which are simple, fluent, and preserve the meaning of the input. Experiments on three datasets demonstrate that our model outperforms competitive Simplification systems.
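The reward described in this abstract combines three criteria: simplicity, fluency, and meaning preservation. The sketch below shows one way such a reward could be combined and fed into a policy-gradient update. It is only an illustration: the abstract does not specify the weights, the component scorers, or the exact training objective, so `RewardWeights`, `combined_reward`, and the REINFORCE-style `reinforce_loss` are assumptions introduced here, with each component score assumed to lie in [0, 1].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical weights; the abstract does not give any values.
    simplicity: float = 1.0
    fluency: float = 1.0
    meaning: float = 1.0


def combined_reward(simplicity: float, fluency: float, meaning: float,
                    weights: RewardWeights = RewardWeights()) -> float:
    """Weighted average of three component scores, each assumed to be in [0, 1]."""
    total = weights.simplicity + weights.fluency + weights.meaning
    return (weights.simplicity * simplicity
            + weights.fluency * fluency
            + weights.meaning * meaning) / total


def reinforce_loss(sample_log_prob: float, reward: float,
                   baseline: float = 0.0) -> float:
    """REINFORCE-style surrogate loss for one sampled output.

    Minimizing -(reward - baseline) * log p(y | x) raises the probability of
    samples whose reward exceeds the baseline (an assumed training signal).
    """
    return -(reward - baseline) * sample_log_prob
```

In a real system the three scores would come from learned or corpus-based scorers, and `sample_log_prob` from the decoder; here they are plain floats so the arithmetic is easy to follow.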

Learning to Simplify Sentences with Quasi-Synchronous Grammar and Integer Programming

Empirical Methods in Natural Language Processing, 2011

Co-Authors: Kristian Woodsend, Mirella Lapata

Abstract:

Text Simplification aims to rewrite text into simpler versions, and thus make information accessible to a broader audience. Most previous work simplifies sentences using handcrafted rules aimed at splitting long sentences, or substitutes difficult words using a predefined dictionary. This paper presents a data-driven model based on quasi-synchronous grammar, a formalism that can naturally capture structural mismatches and complex rewrite operations. We describe how such a grammar can be induced from Wikipedia and propose an integer linear programming model for selecting the most appropriate Simplification from the space of possible rewrites generated by the grammar. We show experimentally that our method creates Simplifications that significantly reduce the reading difficulty of the input, while maintaining grammaticality and preserving its meaning.


Chris Callison-Burch - One of the best experts on this subject based on the ideXlab platform.

Complexity-Weighted Loss and Diverse Reranking for Sentence Simplification

arXiv: Computation and Language, 2019

Co-Authors: Reno Kriz, Joao Sedoc, Marianna Apidianaki, Carolina Zheng, Gaurav Kumar, Eleni Miltsakaki, Chris Callison-Burch

Abstract:

Sentence Simplification is the task of rewriting texts so they are easier to understand. Recent research has applied sequence-to-sequence (Seq2Seq) models to this task, focusing largely on training-time improvements via reinforcement learning and memory augmentation. One of the main problems with applying generic Seq2Seq models for Simplification is that these models tend to copy directly from the original sentence, resulting in outputs that are relatively long and complex. We aim to alleviate this issue through the use of two main techniques. First, we incorporate content word complexities, as predicted with a leveled word complexity model, into our loss function during training. Second, we generate a large set of diverse candidate Simplifications at test time, and rerank these to promote fluency, adequacy, and simplicity. Here, we measure simplicity through a novel sentence complexity model. These extensions allow our models to perform competitively with state-of-the-art systems while generating simpler sentences. We report standard automatic and human evaluation metrics.
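The two techniques named in this abstract can be sketched in a few lines. The first function scales each target token's negative log-likelihood by a factor that grows with the token's predicted complexity; the second reranks candidate outputs by a weighted sum of fluency, adequacy, and simplicity scores. The abstract gives neither the weighting formula nor the scoring models, so the `(1 + alpha * complexity)` factor, the equal default weights, and the scorer callables are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

Scorer = Callable[[str], float]


def complexity_weighted_nll(token_probs: Sequence[float],
                            token_complexities: Sequence[float],
                            alpha: float = 1.0) -> float:
    """Negative log-likelihood with a per-token complexity weight.

    Each term -log(p) is multiplied by (1 + alpha * c), where c is the
    token's complexity, assumed to lie in [0, 1] (hypothetical scheme).
    """
    if len(token_probs) != len(token_complexities):
        raise ValueError("token_probs and token_complexities must align")
    return sum(-(1.0 + alpha * c) * math.log(p)
               for p, c in zip(token_probs, token_complexities))


def rerank(candidates: Sequence[str],
           fluency: Scorer, adequacy: Scorer, simplicity: Scorer,
           weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3)
           ) -> List[str]:
    """Sort candidates from best to worst by a weighted sum of three scores."""
    w_f, w_a, w_s = weights

    def score(text: str) -> float:
        return w_f * fluency(text) + w_a * adequacy(text) + w_s * simplicity(text)

    return sorted(candidates, key=score, reverse=True)
```

As a usage example, passing constant fluency and adequacy scorers together with a simplicity scorer such as `lambda s: 1.0 / (1 + len(s.split()))` makes `rerank` prefer the shortest candidate, which is a crude stand-in for a learned sentence complexity model.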

Simple PPDB: A Paraphrase Database for Simplification

Meeting of the Association for Computational Linguistics, 2016

Co-Authors: Ellie Pavlick, Chris Callison-Burch

Abstract:

We release the Simple Paraphrase Database, a subset of the Paraphrase Database (PPDB) adapted for the task of text Simplification. We train a supervised model to associate Simplification scores with each phrase pair, producing rankings competitive with state-of-the-art lexical Simplification models. Our new Simplification database contains 4.5 million paraphrase rules, making it the largest available resource for lexical Simplification.

Optimizing Statistical Machine Translation for Text Simplification

Transactions of the Association for Computational Linguistics, 2016

Co-Authors: Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch

Abstract:

Most recent sentence Simplification systems use basic machine translation models to learn lexical and syntactic paraphrases from a manually simplified parallel corpus. These methods are limited by the quality and quantity of manually simplified corpora, which are expensive to build. In this paper, we conduct an in-depth adaptation of statistical machine translation to perform text Simplification, taking advantage of large-scale paraphrases learned from bilingual texts and a small amount of manual Simplifications with multiple references. Our work is the first to design automatic metrics that are effective for tuning and evaluating Simplification systems, which will facilitate iterative development for this task.

Problems in Current Text Simplification Research: New Data Can Help

Transactions of the Association for Computational Linguistics, 2015

Co-Authors: Chris Callison-Burch, Courtney Napoles

Abstract:

Simple Wikipedia has dominated Simplification research in the past 5 years. In this opinion paper, we argue that focusing on Wikipedia limits Simplification research. We back up our arguments with corpus analysis and by highlighting statements that other researchers have made in the Simplification literature. We introduce a new Simplification dataset that is a significant improvement over Simple Wikipedia, and present a novel quantitative-comparative approach to study the quality of Simplification data resources.


Horacio Saggion - One of the best experts on this subject based on the ideXlab platform.

Text Simplification

The Oxford Handbook of Computational Linguistics 2nd edition, 2018

Co-Authors: Horacio Saggion

Abstract:

Over the past decades, information has been made available to a broad audience thanks to the availability of texts on the Web. However, understanding the wealth of information contained in texts can pose difficulties for a number of people including those with poor literacy, cognitive or linguistic impairment, or those with limited knowledge of the language of the text. Text Simplification was initially conceived as a technology to simplify sentences so that they would be easier to process by natural-language processing components such as parsers. However, nowadays automatic text Simplification is conceived as a technology to transform a text into an equivalent which is easier to read and to understand by a target user. Text Simplification concerns both the modification of the vocabulary of the text (lexical Simplification) and the modification of the structure of the sentences (syntactic Simplification). In this chapter, after briefly introducing the topic of text readability, we give an overview of past and recent methods to address these two problems. We also describe Simplification applications and full systems, and outline language resources and evaluation approaches.

Automatic text Simplification for Spanish: comparative evaluation of various Simplification strategies

RANLP, 2015

Co-Authors: Sanja Štajner, Iacer Calixto, Horacio Saggion

Abstract:

In this paper, we explore statistical machine translation (SMT) approaches to automatic text Simplification (ATS) for Spanish. First, we compare the performances of the standard phrase-based (PB) and hierarchical (HIERO) SMT models in this specific task. In both cases, we build two models, one using the TS corpus with “light” Simplifications and the other using the TS corpus with “heavy” Simplifications. Next, we compare the two best systems with the state-of-the-art text Simplification system for Spanish (Simplext). Our results, based on an extensive human evaluation, show that the SMT-based systems perform as well as, or better than, Simplext, despite the very small datasets used for training and tuning.

Making It Simplext: Implementation and Evaluation of a Text Simplification System for Spanish

ACM Transactions on Accessible Computing, 2015

Co-Authors: Horacio Saggion, Sanja Štajner, Stefan Bott, Simon Mille, Luz Rello, Biljana Drndarevic

Abstract:

The way in which a text is written can be a barrier for many people. Automatic text Simplification is a natural language processing technology that, when mature, could be used to produce texts that are adapted to the specific needs of particular users. Most research in the area of automatic text Simplification has dealt with the English language. In this article, we present results from the Simplext project, which is dedicated to automatic text Simplification for Spanish. We present a modular system with dedicated procedures for syntactic and lexical Simplification that are grounded on the analysis of a corpus manually simplified for people with special needs. We carried out an automatic evaluation of the system’s output, taking into account the interaction between three different modules dedicated to different Simplification aspects. One evaluation is based on readability metrics for Spanish and shows that the system is able to reduce the lexical and syntactic complexity of the texts. We also show, by means of a human evaluation, that sentence meaning is preserved in most cases. Our results, even if our work represents the first automatic text Simplification system for Spanish that addresses different linguistic aspects, are comparable to the state of the art in English Automatic Text Simplification.

Text Simplification resources for Spanish

Language Resources and Evaluation, 2014

Co-Authors: Stefan Bott, Horacio Saggion

Abstract:

In this paper we present the development of a text Simplification system for Spanish. Text Simplification is the adaptation of a text for the special needs of certain groups of readers, such as language learners, people with cognitive difficulties, and elderly people, among others. There is a clear need for simplified texts, but manual production and adaptation of existing text is labour-intensive and costly. Automatic Simplification is a field which attracts growing attention in Natural Language Processing, but, to the best of our knowledge, there are no existing Simplification tools for Spanish. We present a corpus study which aims to identify the operations a text Simplification system needs to carry out in order to produce an output similar to what human editors produce when they simplify news texts. We also present a first prototype for automatic Simplification, which shows that the most important Simplification operations can be successfully treated.

Text Simplification Tools for Spanish

LREC, 2012

Co-Authors: Stefan Bott, Horacio Saggion, Simon Mille

Abstract:

In this paper we describe the development of a text Simplification system for Spanish. Text Simplification is the adaptation of a text to the special needs of certain groups of readers, such as language learners, people with cognitive difficulties and elderly people, among others. There is a clear need for simplified texts, but manual production and adaptation of existing texts is labour intensive and costly. Automatic Simplification is a field which attracts growing attention in Natural Language Processing, but, to the best of our knowledge, there are no Simplification tools for Spanish. We present a prototype for automatic Simplification, which shows that the most important structural Simplification operations can be successfully treated with an approach based on rules which can potentially be improved by statistical methods. For the development of this prototype we carried out a corpus study which aims at identifying the operations a text Simplification system needs to carry out in order to produce an output similar to what human editors produce when they simplify texts.


Wim Reddingius - One of the best experts on this subject based on the ideXlab platform.

Progressive Simplification of polygonal curves

Computational Geometry, 2020

Co-Authors: Kevin Buchin, Maximilian Konzack, Wim Reddingius

Abstract:

Simplifying polygonal curves at different levels of detail is an important problem with many applications. Existing geometric optimization algorithms are only capable of minimizing the complexity of a simplified curve for a single level of detail. We present an $O(n^3m)$-time algorithm that takes a polygonal curve of n vertices and produces a set of consistent Simplifications for m scales while minimizing the cumulative Simplification complexity. This algorithm is compatible with distance measures such as the Hausdorff, the Fréchet and area-based distances, and enables Simplification for continuous scaling in $O(n^5)$ time. To speed up this algorithm in practice, a technique is presented for efficiently constructing many so-called shortcut graphs under the Hausdorff distance, as well as a representation of the shortcut graph that enables us to find shortest paths in anticipated $O(n \log n)$ time on spatial data, improving over $O(n^2)$ time using existing algorithms. Experimental evaluation of these techniques on geospatial data reveals a significant improvement of using shortcut graphs for progressive and non-progressive curve Simplification, both in terms of running time and memory usage.
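To make the notion of a shortcut graph concrete, here is a minimal single-scale sketch. A shortcut (i, j) is kept when every skipped vertex lies within `eps` of the segment from vertex i to vertex j, and a breadth-first search then finds a simplification with the fewest vertices. This is a naive cubic-time construction for one level of detail, not the progressive algorithm or the efficient construction described in the abstract; the vertex-to-segment test is a simple stand-in for the Hausdorff criterion, and all function names are hypothetical.

```python
import math
from collections import deque
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def point_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Euclidean distance from point p to the closed segment ab."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Project p onto the line through a and b, clamped to the segment.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def build_shortcut_graph(curve: Sequence[Point], eps: float) -> List[List[int]]:
    """Adjacency lists of valid shortcuts i -> j (i < j), built naively."""
    n = len(curve)
    graph: List[List[int]] = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Consecutive vertices always form a valid shortcut (nothing skipped).
            if all(point_segment_distance(curve[k], curve[i], curve[j]) <= eps
                   for k in range(i + 1, j)):
                graph[i].append(j)
    return graph


def simplify(curve: Sequence[Point], eps: float) -> List[int]:
    """Indices of a minimum-vertex simplification from the first to the last vertex."""
    n = len(curve)
    if n == 0:
        return []
    graph = build_shortcut_graph(curve, eps)
    parent = [-1] * n
    seen = [False] * n
    seen[0] = True
    queue = deque([0])
    while queue:  # BFS: every shortcut costs one hop
        i = queue.popleft()
        if i == n - 1:
            break
        for j in graph[i]:
            if not seen[j]:
                seen[j] = True
                parent[j] = i
                queue.append(j)
    path = [n - 1]
    while path[-1] != 0:
        path.append(parent[path[-1]])
    path.reverse()
    return path
```

For the curve `[(0, 0), (1, 0.05), (2, 0), (3, 1), (4, 0)]`, a threshold of `0.1` drops only the slightly offset second vertex, while a very large threshold keeps just the two endpoints.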

Progressive Simplification of Polygonal Curves

arXiv: Computational Geometry, 2018

Co-Authors: Kevin Buchin, Maximilian Konzack, Wim Reddingius

Abstract:

Simplifying polygonal curves at different levels of detail is an important problem with many applications. Existing geometric optimization algorithms are only capable of minimizing the complexity of a simplified curve for a single level of detail. We present an $O(n^3m)$-time algorithm that takes a polygonal curve of n vertices and produces a set of consistent Simplifications for m scales while minimizing the cumulative Simplification complexity. This algorithm is compatible with distance measures such as the Hausdorff, the Fréchet and area-based distances, and enables Simplification for continuous scaling in $O(n^5)$ time. To speed up this algorithm in practice, we present new techniques for constructing and representing so-called shortcut graphs. Experimental evaluation of these techniques on trajectory data reveals a significant improvement of using shortcut graphs for progressive and non-progressive curve Simplification, both in terms of running time and memory usage.


David Coeurjolly - One of the best experts on this subject based on the ideXlab platform.

A generic and parallel algorithm for 2D digital curve polygonal approximation

Journal of Real-Time Image Processing, 2011

Co-Authors: Guillaume Damiand, David Coeurjolly

Abstract:

In this paper, we present a generic topological and geometrical framework that makes it possible to define and control several parallel algorithms for 2D digital curve approximation. The proposed technique is based on combinatorial map Simplifications guided by geometrical criteria. We illustrate the genericity of the framework by defining three contour Simplification methods: a polygonal approximation method based on an area deviation computation; a digital straight segment reconstruction method that guarantees a lossless representation; and a moment-preserving Simplification method that simplifies the contours while preserving the geometrical moments of the image regions. Thanks to a complete experimental evaluation, we demonstrate that the proposed methods can be efficiently implemented in a multi-thread environment to simplify labeled image contours.
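As a rough illustration of an area-based simplification criterion, the sketch below repeatedly removes the interior vertex whose triangle with its two neighbours has the smallest area, stopping once that area exceeds a threshold. This is a sequential, Visvalingam-Whyatt-style procedure on an open polyline, swapped in for clarity; it is not the combinatorial-map framework or the parallel algorithm of the paper, and the function names and the `max_area` parameter are assumptions.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def triangle_area(a: Point, b: Point, c: Point) -> float:
    """Unsigned area of triangle abc via the 2D cross product."""
    return abs((b[0] - a[0]) * (c[1] - a[1])
               - (c[0] - a[0]) * (b[1] - a[1])) / 2.0


def simplify_by_area(points: Sequence[Point], max_area: float) -> List[Point]:
    """Greedily drop interior vertices whose removal changes the area least.

    Endpoints are always kept. Each pass recomputes all areas, so this
    simple version runs in quadratic time.
    """
    pts = list(points)
    while len(pts) > 2:
        areas = [triangle_area(pts[i - 1], pts[i], pts[i + 1])
                 for i in range(1, len(pts) - 1)]
        smallest = min(range(len(areas)), key=areas.__getitem__)
        if areas[smallest] > max_area:
            break
        del pts[smallest + 1]  # areas[k] belongs to vertex k + 1
    return pts
```

The threshold plays the role of the allowed area deviation: a larger `max_area` removes more vertices, and a value of zero removes only vertices that are exactly collinear with their neighbours.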

