Benson Ruan

Image to Text OCR with Tesseract.js

Extract text from images using JavaScript

Photo by Franck V. on Unsplash

Are you looking to extract text from images or photos? Did you just take a picture of your lecture notes and want to convert it into text? Then you’ll need an application that can recognize text via OCR (Optical Character Recognition).

Today, I am going to fulfill that long-awaited wish by building an image-to-text converter with the powerful JavaScript library Tesseract.js.

Try it yourself in the link below:

Implementation

Does it feel like you just discovered treasure? We can take a scanned image of a book, use OCR to read it, and output the text in a format a machine can work with. This can drastically improve productivity, and it avoids duplicate manual entry.

In this tutorial, I’ll show you how to use Tesseract.js to build an OCR web application. Let’s jump straight into the code.

# Step 1: Include tesseract.js

First of all, we need to include the JavaScript library tesseract.js. The easiest way to include Tesseract.js in your HTML5 page is to use a CDN. So, add the following to the <head> of your webpage.

<html>
  <head>
    <script src='https://unpkg.com/[email protected]/dist/tesseract.min.js'></script>
  </head>

If you are using npm, you can also install it by running the command below:

npm install tesseract.js@next

At the end of the <body>, include the main JavaScript file tesseract-ocr.js

    <script src="js/tesseract-ocr.js"></script>
  </body>
</html>

# Step 2: Set up HTML elements

Next, add the following HTML elements to the page:

  • Language selector
  • Image File selector
  • Thumbnail preview of image selected
  • Placeholder of results after processing
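As a rough sketch of how these four elements might be laid out, the snippet below builds the markup as a JavaScript template string. The function name buildOcrMarkup, the ids (lang-selector, file-selector, thumbnail, ocr-result, ocr-app) and the chosen language list are assumptions made for illustration, not names taken from the demo. The option values follow the usual Tesseract language codes.

```javascript
// Hypothetical helper: returns the HTML for the four elements listed above.
// All ids and the language list are illustrative assumptions.
function buildOcrMarkup(languages) {
  const options = languages
    .map(({ code, label }) => `  <option value="${code}">${label}</option>`)
    .join('\n');

  return [
    '<!-- Language selector -->',
    '<select id="lang-selector">',
    options,
    '</select>',
    '<!-- Image file selector -->',
    '<input type="file" id="file-selector" accept="image/*">',
    '<!-- Thumbnail preview of the selected image -->',
    '<img id="thumbnail" alt="Selected image preview">',
    '<!-- Placeholder for the results after processing -->',
    '<div id="ocr-result"></div>',
  ].join('\n');
}

const markup = buildOcrMarkup([
  { code: 'eng', label: 'English' },
  { code: 'fra', label: 'French' },
  { code: 'chi_sim', label: 'Chinese (Simplified)' },
]);

// In a browser, the string could be injected into a container element.
// The container id "ocr-app" is an assumption as well.
if (typeof document !== 'undefined') {
  const container = document.getElementById('ocr-app');
  if (container) container.innerHTML = markup;
}
```

Writing the same markup directly in the HTML page works just as well; generating it from a list only makes it easier to add or remove languages.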

# Step 3: Initialize And Run Tesseract

Next, we initialize a TesseractWorker and call its recognize function. This function runs asynchronously and returns a TesseractJob object.

You can get the text result inside a callback function, which can be added using the then() method. Additionally, add a callback using the progress() method to monitor the status and progress of the OCR operation.
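The flow described above can be sketched as follows. This is a minimal sketch rather than the demo's actual code: startOcr, the element ids and the logging callbacks are hypothetical, and it assumes the Tesseract global exposes TesseractWorker with the chainable recognize().progress().then() interface described in this step. That interface belongs to the version this tutorial targets, so other releases may differ.

```javascript
// Hypothetical helper: starts recognition and attaches both callbacks.
// `worker` is expected to behave like the TesseractWorker described above.
function startOcr(worker, image, lang, onProgress, onResult) {
  return worker
    .recognize(image, lang) // asynchronous; returns a job object
    .progress(onProgress)   // receives status and progress updates
    .then(onResult);        // receives the final result
}

// Browser-only wiring; skipped when there is no DOM or Tesseract global.
if (typeof document !== 'undefined' && typeof Tesseract !== 'undefined') {
  const worker = new Tesseract.TesseractWorker();
  const fileInput = document.getElementById('file-selector');  // assumed id
  const langSelect = document.getElementById('lang-selector'); // assumed id

  if (fileInput && langSelect) {
    fileInput.addEventListener('change', () => {
      const file = fileInput.files[0];
      if (!file) return; // nothing was selected

      startOcr(
        worker,
        file,
        langSelect.value,
        (packet) => console.log(packet.status, packet.progress),
        (result) => console.log(result.text)
      );
    });
  }
}
```

Passing the worker in as an argument keeps the helper independent of how the library was loaded, which also makes it easy to try out with a stand-in object.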

# Step 4: Display progress and result

Finally, let’s explore the TesseractJob object that gets returned, and use it to display the results.

The returned result contains a confidence level and the text extracted from the image. Its array of words also includes the location of each word inside the image. We use a function named progressUpdate to display all of this to the user.
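A minimal sketch of such a display step is shown below. It is a stand-in written for illustration, not the demo's own progressUpdate. It assumes progress packets look like { status, progress } with progress between 0 and 1, and that the result exposes text, confidence and a words array whose entries carry text and a bbox. These shapes, the helper names formatProgress and summarizeResult, and the statusEl parameter are assumptions.

```javascript
// Formats a progress packet, e.g. { status: 'recognizing text', progress: 0.5 }.
function formatProgress(packet) {
  const percent = Math.round((packet.progress || 0) * 100);
  return `${packet.status}: ${percent}%`;
}

// Reduces a result object to the fields discussed above:
// the extracted text, the confidence level, and each word's location.
function summarizeResult(result) {
  return {
    text: result.text,
    confidence: result.confidence,
    words: (result.words || []).map((word) => ({
      text: word.text,
      bbox: word.bbox, // assumed to hold the word's position in the image
    })),
  };
}

// Hypothetical stand-in for progressUpdate: writes the formatted status
// into an element (or any object with a textContent property).
function progressUpdate(packet, statusEl) {
  statusEl.textContent = formatProgress(packet);
}
```

In a browser, statusEl would typically be an element looked up with document.getElementById, and the summary returned by summarizeResult could be written into the result placeholder from Step 2.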

That’s pretty much it for the code! Choose your own images with some text in them, and watch the results roll in!

GitHub repository

You can download the complete code of the above demo in the link below:

Photo by Temple Cerulean on Unsplash

Conclusion

After experimenting with different images, I found some pros and cons of Tesseract.js.

Pros:

  • It supports multiple languages; check out the complete list of supported languages here.
  • The accuracy is pretty high with normal fonts and clear background

Cons:

  • It didn’t work very well with noisy backgrounds
  • It gets confused by some custom fonts

But still, I think it is a great JavaScript library. It brings the power of OCR to the browser and opens the door to new opportunities for developers.

I recently published a new article that introduces another JavaScript OCR library, Ocrad.js, and compares it with Tesseract.js. Feel free to read through it before deciding which one is more suitable for your project.

Thank you for reading. If you like this article, feel free to share it on social media. Let me know in the comments if you have any questions. Follow me on Medium, GitHub and LinkedIn. Support me on Ko-fi.
