Farhad Malik

NLP: How To Evaluate The Model Performance

The Key Measures to Measure Accuracy Of The NLP Project

Once we have trained an NLP model, we need to evaluate its performance. This article demonstrates how we can assess the accuracy of the model.

The article provides an overview of four measures: cosine similarity, Jaccard similarity, perplexity and word error rate.

Evaluating Performance

It is vital to understand that the concept of similarity is highly dependent on the domain and environment of the application. We can choose from the following measures to assess the performance:

1. Cosine Similarity:

Cosine similarity is a useful measure when repeated words should count in the comparison, because it works on term-frequency vectors rather than on sets of unique words.

We can compute the cosine of the angle between the two document vectors to estimate how similar the documents are. The key point is that the smaller the angle, the larger the cosine value and the more similar the two documents.

The words can be converted into non-zero vectors by using a text mining algorithm such as TF-IDF or Bag of Words. Have a look at this article to understand the algorithms:

The cosine similarity equation will result in a value between 0 and 1 because the term frequencies are never negative.

  • Doc1 and Doc2 are the two vectors.
  • i represents the vector component
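In the notation of the bullets above, the standard definition of cosine similarity can be written as follows (the symbol θ for the angle between the vectors is a label added here):

```latex
\[
\cos(\theta) =
\frac{\sum_{i} \mathrm{Doc1}_i \, \mathrm{Doc2}_i}
     {\sqrt{\sum_{i} \mathrm{Doc1}_i^{2}} \; \sqrt{\sum_{i} \mathrm{Doc2}_i^{2}}}
\]
```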

We can use the scikit-learn library in Python to implement it:

from sklearn.metrics.pairwise import cosine_similarity
# df_document1 and df_document2 are 2D feature arrays (each row is a document vector)
print(cosine_similarity(df_document1, df_document2))
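To make the calculation itself visible, here is a minimal sketch that uses only the Python standard library. It assumes plain word counts (Bag of Words) and naive whitespace tokenisation. The function name `cosine_similarity_counts` and the sample sentences are made up for illustration and are not part of scikit-learn.

```python
import math
from collections import Counter


def cosine_similarity_counts(doc1: str, doc2: str) -> float:
    """Cosine similarity between the word-count vectors of two texts."""
    # Assumption: lowercase and split on whitespace is good enough here.
    vec1 = Counter(doc1.lower().split())
    vec2 = Counter(doc2.lower().split())

    # Dot product: only words present in both documents contribute.
    dot = sum(vec1[word] * vec2[word] for word in vec1.keys() & vec2.keys())

    # Euclidean length of each count vector.
    norm1 = math.sqrt(sum(count * count for count in vec1.values()))
    norm2 = math.sqrt(sum(count * count for count in vec2.values()))

    if norm1 == 0 or norm2 == 0:
        return 0.0  # convention chosen here for empty documents

    return dot / (norm1 * norm2)


same = cosine_similarity_counts("the cat sat", "the cat sat")
partial = cosine_similarity_counts("the cat sat", "the cat ran")
disjoint = cosine_similarity_counts("the cat sat", "a dog ran")
print(same, partial, disjoint)
```

Identical texts score close to 1, texts with no words in common score 0, and partially overlapping texts fall in between.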

2. Jaccard Similarity:

Jaccard similarity measures how much two data sets have in common by looking at their intersection. We can compute the Jaccard similarity coefficient score.

It is computed by finding the intersection between two sets and then dividing the size of the intersection by the size of the union of the two sets.

  1. We can find the intersection of two documents by using intersection = doc1.intersection(doc2), as long as both are sets.
  2. We can find the union of two documents by using union = doc1.union(doc2), as long as both are sets.

The formula is:
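In set notation, with doc1 and doc2 as the two sets, the standard definition reads:

```latex
\[
J(\mathrm{doc1}, \mathrm{doc2}) =
\frac{|\mathrm{doc1} \cap \mathrm{doc2}|}{|\mathrm{doc1} \cup \mathrm{doc2}|}
\]
```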

Additionally, the score can be computed by using the scikit-learn library in Python: sklearn.metrics.jaccard_score(actual, prediction). Note that this function compares arrays of labels rather than raw Python sets of words.
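The set-based computation described above can be sketched in plain Python. This is a minimal illustration that assumes each document has already been turned into a set of words. The function name `jaccard_similarity` and the sample sentences are hypothetical.

```python
def jaccard_similarity(doc1: set, doc2: set) -> float:
    """Size of the intersection divided by the size of the union."""
    union = doc1.union(doc2)
    if not union:
        return 0.0  # convention chosen here when both sets are empty
    intersection = doc1.intersection(doc2)
    return len(intersection) / len(union)


# Building sets drops duplicate words, so "the" counts only once per document.
tokens1 = set("the cat sat on the mat".split())
tokens2 = set("the cat lay on the rug".split())

score = jaccard_similarity(tokens1, tokens2)
print(score)  # 3 shared words out of 7 distinct words
```

Because the inputs are sets, repeated words do not affect the score, which is the main practical difference from cosine similarity on term frequencies.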

3. Perplexity:

We can rely on the perplexity measure to assess and evaluate an NLP model. Perplexity is a numerical value that is computed per word. It relies on the underlying probability distribution of the words in the sentences to find how accurate the NLP model is.

We can compute the perplexity of the words using the formula:
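One common way to write it, assuming a text W made up of N words w_1, ..., w_N (symbols chosen here for illustration), is:

```latex
\[
PP(W) = P(w_1, w_2, \ldots, w_N)^{-\frac{1}{N}}
\]
```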

The lower the score, the better the model.

We can also compute the weighted perplexity of the sentences if required.
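As a rough sketch of the per-word calculation, the snippet below assumes that the model has already assigned a probability to each word in a sentence. The function name `perplexity` and the probability values are invented for illustration and do not come from a real model.

```python
import math


def perplexity(word_probabilities):
    """Perplexity from the probabilities a model assigned to each word."""
    n = len(word_probabilities)
    if n == 0:
        raise ValueError("at least one word probability is required")
    # Summing logs avoids numerical underflow on long sentences.
    log_prob_sum = sum(math.log(p) for p in word_probabilities)
    # exp(-(1/N) * sum(log p)) equals the inverse probability
    # of the sentence raised to the power 1/N.
    return math.exp(-log_prob_sum / n)


# A model that spreads its belief evenly over 4 choices for every word.
uniform = perplexity([0.25, 0.25, 0.25, 0.25])

# A model that is fairly sure about every word (hypothetical values).
confident = perplexity([0.9, 0.8, 0.9])

print(uniform, confident)
```

The evenly spread model ends up with a perplexity of about 4, while the confident model scores much lower, which matches the rule that a lower score indicates a better model.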

4. Word Error Rate:

Lastly, I wanted to outline word error rate (WER). It can be used to compare two documents, and the measure depends on the number of substitutions, deletions and insertions between the two documents.

Let’s understand it with an example.

Suppose we implemented an NLP model and we expect it to print the text:

“FinTechExplained Is A Publication”

The predicted text could have any of the following:

1. Deletion:

The predicted text did not contain all of the words:

2. Insertion:

The model has predicted additional words that are not in the expected text:

3. Substitution:

Some words have been replaced with different ones:

4. Or any mixture of deletion, substitution and insertion

  1. The word error rate is the number of words that have been deleted, inserted and/or substituted, divided by the total number of expected words. The algorithm compares the expected and predicted text word by word, sentence by sentence, and increments the error value by 1 every time it encounters a deletion, an insertion or a substitution.
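The counting described above can be sketched with a word-level edit distance. This is a minimal illustration rather than a reference implementation. It assumes whitespace tokenisation, and it uses the smallest number of edits that turns the expected text into the predicted text. The function name `word_error_rate` and the three predicted sentences are hypothetical examples.

```python
def word_error_rate(expected: str, predicted: str) -> float:
    """Minimum number of word edits divided by the number of expected words."""
    ref = expected.split()
    hyp = predicted.split()
    if not ref:
        raise ValueError("expected text must contain at least one word")

    # prev[j] = fewest edits to turn the ref words seen so far
    # into the first j predicted words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # expected word missing from prediction
            insertion = curr[j - 1] + 1  # extra word in the prediction
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


expected = "FinTechExplained Is A Publication"

# Hypothetical predictions, one per error type.
deletion_wer = word_error_rate(expected, "FinTechExplained Is Publication")
insertion_wer = word_error_rate(expected, "FinTechExplained Is A Great Publication")
substitution_wer = word_error_rate(expected, "FinTechExplained Is A Blog")

print(deletion_wer, insertion_wer, substitution_wer)
```

Each of the three predictions contains exactly one error against four expected words, so each scores 0.25. A prediction that matches the expected text exactly scores 0.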

Summary

This article demonstrated how we can evaluate the performance of the NLP model.

It provided an overview of four measures:

Cosine Similarity, Jaccard Similarity, Perplexity and Word Error Rate.

Hope it helps.
