Fabio Chiusano

Two minutes NLP — Quick intro to Text Simplification

Applications, Extractive methods, Abstractive approaches, Lexical Simplification and Novel Text Generation

Photo by Kelly Sikkema on Unsplash

Text Simplification aims to reduce the linguistic complexity of content to make it easier to understand, while still retaining the original information and meaning.

For example, the complex sentence “The ominous clouds engulfed the hill” may be transformed into its simpler form “The gloomy clouds covered the hill”.

Over time, approaches to Text Simplification have moved from manual, hand-crafted rules to automated simplification. Current research relies mostly on deep learning techniques, with a specific focus on coping with the scarcity of data available for simplification.

Applications

Among the most prominent target audiences for Text Simplification are foreign language learners, for whom the focus is often on lexical but also sentence-level simplification. Text Simplification is also of interest to people with dyslexia or aphasia, for whom particularly long words and sentences, but also certain surface forms such as specific character combinations, may pose difficulties.

Approaches

Most of the early work in the field involved extractive methods of summarization, i.e. extracting the sentences of a document that convey the most meaning. Research in simplification has since shifted towards abstractive approaches, which actually generate text.
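To make the extractive idea concrete, here is a minimal sketch. It assumes a very simple scoring heuristic (average in-document word frequency) as a stand-in for "conveying the most meaning"; the function name `extract_top_sentences` and the heuristic itself are illustrative choices, not something the original work prescribes.

```python
import re
from collections import Counter


def extract_top_sentences(text, k=1):
    """Return the k highest-scoring sentences, kept in document order.

    Assumption: a sentence's score is the average document frequency of
    its words. Real extractive systems use richer scoring functions.
    """
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    best = sorted(range(len(sentences)),
                  key=lambda i: score(sentences[i]), reverse=True)[:k]
    return [sentences[i] for i in sorted(best)]


print(extract_top_sentences("Cats sleep. Cats like cats. Dogs bark.", k=1))
```

Nothing is rewritten here: the output is always a subset of the input sentences, which is exactly what separates extractive from abstractive methods.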

Abstractive Approach

Initially, abstractive simplification worked at the sentence level through lexical (word-based or phrase-based) selection and substitution, for example by using a paraphrase database.

In recent years, text simplification has evolved towards the actual generation of new text, thanks to the advent of neural networks, especially Recurrent Neural Networks, Long Short-Term Memory networks, and Transformers, which allow sequence-to-sequence modeling.

Abstractive Approach with Lexical Simplification

These approaches are essentially pipelines: complex words in a sentence are identified and then replaced with substitutes that have been filtered and ranked by relevance.

Example pipeline for Lexical Simplification. Image from https://arxiv.org/pdf/2008.08612.pdf

In a sample Lexical Simplification pipeline:

  • Complex Word Identification is usually done with frequency indexes such as Zipf word frequencies: rare words are treated as complex.
  • Substitution Generation is often done by looking for synonyms in lexical databases like WordNet.
  • Substitution Selection is done by performing Word Sense Disambiguation to keep only the candidates with the correct meaning.
  • Substitution Ranking can be done by preferring the simpler candidates, again using Zipf word frequencies.
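The four steps above can be sketched in a few lines of Python. This is a toy illustration under loud assumptions: the `ZIPF` scores and the `SYNONYMS` table are made-up values standing in for a real frequency index and for WordNet, the `COMPLEX_THRESHOLD` cut-off is arbitrary, and Word Sense Disambiguation is skipped by assuming the synonym table already contains only sense-appropriate candidates.

```python
# Toy Zipf-style frequency scores (higher = more common).
# Hypothetical values for illustration, not taken from a real corpus.
ZIPF = {
    "the": 7.7, "clouds": 4.5, "hill": 4.8,
    "ominous": 3.3, "gloomy": 3.6, "menacing": 3.2,
    "engulfed": 3.0, "covered": 4.9, "swallowed": 3.9,
}

# Toy synonym table standing in for a lexical database such as WordNet.
# Assumed to be already filtered for the correct word sense.
SYNONYMS = {
    "ominous": ["gloomy", "menacing"],
    "engulfed": ["covered", "swallowed"],
}

COMPLEX_THRESHOLD = 3.5  # assumed cut-off below which a word counts as complex


def zipf(word):
    """Frequency score of a word; unknown words score 0.0."""
    return ZIPF.get(word.lower(), 0.0)


def is_complex(word):
    """Complex Word Identification: rare words are considered complex."""
    return zipf(word) < COMPLEX_THRESHOLD


def simplify(sentence):
    simplified = []
    for token in sentence.split():
        if is_complex(token):
            # Substitution Generation: look up candidate synonyms.
            candidates = SYNONYMS.get(token.lower(), [])
            # Keep only candidates that are more frequent than the original.
            simpler = [c for c in candidates if zipf(c) > zipf(token)]
            # Substitution Ranking: pick the most frequent candidate.
            if simpler:
                token = max(simpler, key=zipf)
        simplified.append(token)
    return " ".join(simplified)


print(simplify("The ominous clouds engulfed the hill"))
```

With these made-up scores, the sketch reproduces the example from the introduction. A word that is flagged as complex but has no simpler candidate is left unchanged, which is a conservative design choice: a wrong substitution usually hurts meaning more than an unsimplified word does.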

This article (https://medium.com/@armandj.olivares/how-to-use-bert-for-lexical-simplification-6edbf5a4d15e) is a practical example of how to perform Lexical Simplification with BERT.

Abstractive Approach with Novel Text Generation

As opposed to Lexical Simplification, which aims to reduce the complexity of a text by simplifying its vocabulary, syntactic simplification seeks to identify grammatically complex text and rewrite it so that it is easier to comprehend. This may involve splitting long sentences into shorter, more digestible chunks, changing passive voice to active, and resolving ambiguities and anaphora. The new text is generated with sequence-to-sequence models.
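To show what one of these operations, sentence splitting, looks like on actual text, here is a deliberately naive sketch. It uses a hand-written rule (split on ", and" or ", but") rather than a sequence-to-sequence model, so it only illustrates the kind of transformation such a model is expected to learn; the function name `split_coordinated` is an illustrative choice.

```python
import re


def split_coordinated(sentence):
    """Split a sentence on ', and' / ', but' into shorter sentences.

    A hand-written rule used purely for illustration; it is not a
    sequence-to-sequence model and does not parse the sentence.
    """
    body = sentence.strip().rstrip(".")
    parts = re.split(r",\s+(?:and|but)\s+", body)
    # Capitalize each clause and close it with a full stop.
    return [p[0].upper() + p[1:] + "." for p in parts if p]


print(split_coordinated(
    "The committee approved the budget, and the mayor signed it."
))
```

A rule like this breaks quickly: it would wrongly split an enumeration such as "apples, pears, and plums". That brittleness is one reason the field moved to learned models that rewrite the whole sentence.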

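See this article (https://towardsdatascience.com/text-simplification-for-the-democratization-of-knowledge-5b3647e4a52) for an example of a sequence-to-sequence model trained for text simplification.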

Thank you for reading! If you are interested in learning more about NLP, remember to follow NLPlanet on Medium, LinkedIn, and Twitter!
