Farhad Malik



NLP: Text Part Of Speech Tagging

How Part Of Speech Tagging Works And Can Be Implemented In Python

Part Of Speech (PoS) tagging is a useful technique in NLP projects. This article provides an overview of PoS tagging and shows how we can implement it in Python.

Photo by Samuel Pereira on Unsplash

What Is Part Of Speech (PoS)?

Each language is made up of a number of parts of speech such as verbs, nouns, adverbs, adjectives and so on.

PoS tagging is all about assigning a language-specific part of speech to each word (token) in a text.
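
To make the idea concrete, here is a toy sketch of what a tagger returns. The tiny lexicon and the toy_tag helper are made up for illustration; real taggers learn their decisions from annotated corpora rather than from a hand-written dictionary.

```python
# Toy illustration only: a hand-written lexicon standing in for a trained model.
TOY_LEXICON = {
    "the": "DT",     # determiner
    "dog": "NN",     # noun
    "barks": "VBZ",  # verb, 3rd person singular present
    "loudly": "RB",  # adverb
}


def toy_tag(tokens, default="NN"):
    """Pair every token with a tag; unknown words fall back to `default`."""
    return [(token, TOY_LEXICON.get(token.lower(), default)) for token in tokens]


tagged = toy_tag("The dog barks loudly".split())
print(tagged)
# [('The', 'DT'), ('dog', 'NN'), ('barks', 'VBZ'), ('loudly', 'RB')]
```

The output is a list of (word, tag) pairs, which is the same shape that a real tagger produces.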

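NLTK is a fantastic library to support your NLP project. It provides a number of tagging models. In older NLTK releases the default tagging model was the maxent_treebank_pos_tagger; since NLTK 3.1, nltk.pos_tag uses an averaged perceptron tagger by default. Both models use the tag set of the Penn Treebank corpus. In that annotation scheme, a simple sentence (S) is essentially composed of a noun phrase (NP), a verb phrase (VP) and the full stop.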

There are a large number of PoS taggers available such as: maxent_treebank_pos_tagger, HiddenMarkovModelTagger, PerceptronTagger and StanfordPOSTagger.

This example illustrates how we can use the PoS functionality:

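import nltk

# One-time model downloads. Resource names vary between NLTK versions,
# so adjust them if NLTK reports a missing resource.
# nltk.download('punkt')
# nltk.download('averaged_perceptron_tagger')

text = 'where are you going'
words = nltk.word_tokenize(text)
tags = nltk.pos_tag(words)  # list of (word, tag) tuples
print(tags)
# The tags are Penn Treebank codes. Typically:
# where -> WRB (wh-adverb)
# going -> VBG (verb, gerund or present participle)
# you -> PRP (personal pronoun)
# ..etc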

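When the tags are returned, we can use the following command to find more information about a tag (it needs the 'tagsets' resource, which can be fetched with nltk.download):

nltk.help.upenn_tagset('NN.*') # prints the descriptions of the noun tags (NN, NNS, NNP, NNPS)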

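The common tag families are:

Tags starting with J (such as JJ) are adjectives, tags starting with N (such as NN) are nouns, tags starting with V (such as VB) are verbs and tags starting with RB are adverbs.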
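
Because the first letters of a tag identify its family, a few lines of Python are enough to group tagged words. The sketch below assumes input in the (word, tag) format that nltk.pos_tag returns. The group_by_family helper and the sample list are hypothetical, written by hand rather than produced by a tagger.

```python
from collections import defaultdict

# Two-letter prefixes for the four families discussed above. Adverbs use "RB"
# rather than "R" so that RP (particle) is not counted as an adverb.
FAMILIES = {"JJ": "adjective", "NN": "noun", "VB": "verb", "RB": "adverb"}


def group_by_family(tagged):
    """Group words by the family of their Penn Treebank tag."""
    groups = defaultdict(list)
    for word, tag in tagged:
        family = FAMILIES.get(tag[:2], "other")
        groups[family].append(word)
    return dict(groups)


# Hand-written sample in the (word, tag) format.
sample = [("quick", "JJ"), ("foxes", "NNS"), ("jump", "VBP"),
          ("quickly", "RB"), ("over", "IN")]
groups = group_by_family(sample)
print(groups["noun"])  # ['foxes']
print(groups["verb"])  # ['jump']
```

Matching on the first two letters keeps plural nouns (NNS) and inflected verbs (VBP, VBG and so on) in the right family, while everything else falls into "other".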

Summary

Part Of Speech (PoS) tagging is a useful technique in NLP projects. This article provided an overview of PoS tagging and how we can use it in Python.

We can combine it with Lemmatisation and Stemming to help process the text better.
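
One common way to combine the two is to pass the PoS tag to the lemmatiser, so that a word such as 'going' is reduced as a verb rather than as a noun. The sketch below is one possible approach, not code from this article: penn_to_wordnet and lemmatize_tagged are hypothetical helper names, and the lemmatising step assumes NLTK is installed and the WordNet data has been downloaded with nltk.download('wordnet').

```python
def penn_to_wordnet(tag):
    """Map a Penn Treebank tag to the single-letter PoS code WordNet uses."""
    if tag.startswith("JJ"):
        return "a"  # adjective
    if tag.startswith("VB"):
        return "v"  # verb
    if tag.startswith("RB"):
        return "r"  # adverb
    return "n"      # noun, also used as the fallback


def lemmatize_tagged(tagged):
    """Lemmatise (word, tag) pairs, using each tag to choose the PoS."""
    # Imported lazily so the mapping above can be used without NLTK.
    from nltk.stem import WordNetLemmatizer

    lemmatizer = WordNetLemmatizer()
    return [lemmatizer.lemmatize(word, pos=penn_to_wordnet(tag))
            for word, tag in tagged]


print(penn_to_wordnet("VBG"))  # v
```

With the models in place, lemmatize_tagged(nltk.pos_tag(words)) would lemmatise every token of a sentence using its own tag. Falling back to the noun code mirrors the lemmatiser's default behaviour when no PoS is given.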
