Kevin Meneses González


Extracting Text from Images Using Python: A Guide to OCR with EasyOCR

Introduction

Have you ever found yourself needing to extract text from an image — perhaps a street sign, a receipt, or a scanned document — but didn’t want to manually retype everything? Optical Character Recognition (OCR) is the solution to this problem, allowing you to convert images containing text into machine-readable formats. While Tesseract is a popular OCR tool, there are other powerful alternatives that can sometimes yield better results for more complex tasks.

In this article, we’ll explore how to extract text from images using EasyOCR, a Python-based OCR library that supports over 80 languages. EasyOCR is simpler to set up than Tesseract and performs better in some cases, particularly with images containing irregular fonts or complex layouts.

Let’s dive into what OCR is, the advantages of using EasyOCR, and how to implement it in Python.

What is OCR?

OCR, or Optical Character Recognition, is the process of identifying and extracting text from images. It’s widely used in various applications, such as:

  • Digitizing paper documents like invoices, receipts, or books.
  • Extracting text from street signs or license plates for autonomous vehicles.
  • Automating data entry for scanned forms or documents.

Advantages of Using EasyOCR

EasyOCR offers several benefits compared to traditional OCR tools like Tesseract:

  1. Supports over 80 languages: This includes complex scripts like Chinese, Japanese, Korean, and Arabic.
  2. Better accuracy: Especially when dealing with distorted or handwritten text.
  3. Easy to set up: Unlike Tesseract, which requires you to install external binaries, EasyOCR can be installed and run directly via Python.
  4. Handles complex layouts: EasyOCR can recognize text in various fonts, sizes, and orientations, making it ideal for documents with mixed formats.

Getting Started with EasyOCR in Python

To start using EasyOCR, you’ll first need to install the library. Here’s how to set it up:

Step 1: Install EasyOCR and Other Dependencies

You can install EasyOCR using pip:

pip install easyocr

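EasyOCR is built on top of PyTorch. pip normally pulls in torch and torchvision as dependencies of easyocr, but you can also install them explicitly, for example to choose a specific CPU or GPU build: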

pip install torch torchvision
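The script in the next step also imports cv2 (OpenCV) and matplotlib, so it can help to confirm that everything is importable before going further. The following is a minimal sketch using only the standard library; the helper name missing_packages is made up for this example and is not part of EasyOCR.

```python
import importlib.util


def missing_packages(module_names):
    """Return the module names that cannot be found in this environment."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


# Modules imported by the example script in this article.
required = ["easyocr", "torch", "cv2", "matplotlib"]
missing = missing_packages(required)
print("Missing packages:", ", ".join(missing) if missing else "none")
```

If cv2 or matplotlib shows up as missing, they are usually installed with pip install opencv-python matplotlib.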

Step 2: Writing Python Code to Extract Text from Images

Let’s write a simple Python script to load an image and extract text using EasyOCR.

import easyocr
import cv2
import matplotlib.pyplot as plt

# Initialize the EasyOCR reader
reader = easyocr.Reader(['es'])  # Change the language code if needed

# Path to the image
image_path = r'C:\Users\kevin\OneDrive\Desktop\youtube_scripts\Copia de Curso Inversion en Bolsa (1).png'

# Load the image (used below for drawing the annotations)
image = cv2.imread(image_path)
if image is None:
    raise FileNotFoundError(f"Could not read image: {image_path}")

# Run text detection and recognition on the image
results = reader.readtext(image_path)

# Print the results to the terminal
for (bbox, text, prob) in results:
    print(f"Detected text: {text} with confidence {prob:.4f}")

# Annotate the image with bounding boxes
for (bbox, text, prob) in results:
    # Unpack two opposite corners of the bounding box
    top_left = tuple(int(val) for val in bbox[0])
    bottom_right = tuple(int(val) for val in bbox[2])

    # Draw the rectangle
    cv2.rectangle(image, top_left, bottom_right, (0, 255, 0), 2)

    # Write the recognized text just above the box
    # (Hershey fonts cover ASCII only, so accented characters may show as '?')
    cv2.putText(image, text, (top_left[0], top_left[1] - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)

# Display the annotated image (OpenCV uses BGR, matplotlib expects RGB)
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()

Explanation of the Code

  1. EasyOCR Reader: We initialize the EasyOCR reader by loading the Spanish model ('es'). Use 'en' for English, or specify multiple languages (e.g., ['en', 'fr', 'es']).
  2. Loading the Image: We load the image with OpenCV (cv2) so that we can draw on it later. The OCR call itself receives the file path.
  3. Text Detection: The reader.readtext() function processes the image and returns a list of (bounding box, text, confidence) tuples, one per detected piece of text. Each bounding box is a list of four corner points.
  4. Annotating the Image: We loop through the results and draw bounding boxes around the detected text. We also annotate the image with the extracted text.
  5. Displaying the Image: Using matplotlib, we visualize the image with the detected text marked.
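One detail in the annotation step is worth a closer look. The script draws each rectangle from bbox[0] and bbox[2], which works well when the text is horizontal. For tilted text the four corners no longer line up with the image axes, so a safer option is to take the minimum and maximum over all four points. Here is a small sketch of that idea; bbox_to_rect is a hypothetical helper and the sample coordinates are invented for illustration.

```python
def bbox_to_rect(bbox):
    """Convert a four-point box into (x_min, y_min, x_max, y_max) integers."""
    xs = [point[0] for point in bbox]
    ys = [point[1] for point in bbox]
    return int(min(xs)), int(min(ys)), int(max(xs)), int(max(ys))


# Invented box for slightly tilted text, as a list of four [x, y] corners.
sample_bbox = [[12.0, 30.5], [110.2, 24.0], [112.0, 58.9], [14.1, 64.3]]

x_min, y_min, x_max, y_max = bbox_to_rect(sample_bbox)
print((x_min, y_min), (x_max, y_max))  # (12, 24) (112, 64)
```

The two resulting corner tuples can be passed to cv2.rectangle in place of top_left and bottom_right.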

Example Output

Here’s what the output looks like after running the code:

  • The terminal will display the detected text and the confidence score for each piece of text in the image.
  • The image will be displayed with bounding boxes drawn around the text, making it easy to visualize what the OCR tool has recognized.
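Often you want the recognized text as one string rather than a list of separate pieces. The sketch below groups results into lines by their vertical position and joins them from left to right. It is a rough heuristic, not something EasyOCR does for you in this form: results_to_text and its line_tolerance parameter (in pixels) are assumptions made for this example, and the sample results are invented.

```python
def results_to_text(results, line_tolerance=15):
    """Join (bbox, text, confidence) tuples into text in rough reading order."""

    def top(item):
        return min(point[1] for point in item[0])

    def left(item):
        return min(point[0] for point in item[0])

    lines = []
    for item in sorted(results, key=top):
        # Same line if the top edge is close to the first item of the current line.
        if lines and abs(top(item) - top(lines[-1][0])) <= line_tolerance:
            lines[-1].append(item)
        else:
            lines.append([item])

    return "\n".join(
        " ".join(text for _, text, _ in sorted(line, key=left))
        for line in lines
    )


# Invented results in the same shape that reader.readtext() returns.
sample_results = [
    ([[120, 10], [200, 10], [200, 30], [120, 30]], "world", 0.91),
    ([[10, 12], [100, 12], [100, 32], [10, 32]], "Hello", 0.98),
    ([[10, 60], [150, 60], [150, 80], [10, 80]], "Second line", 0.87),
]

print(results_to_text(sample_results))
```

For simple cases, readtext also accepts detail=0 to return only the text strings and paragraph=True to merge nearby boxes, which may be enough without any custom grouping.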

Customizing EasyOCR for Better Results

EasyOCR can be customized in a few different ways to improve results depending on your specific use case.

  1. Multiple Languages: You can pass multiple languages to the reader, such as ['en', 'es'], to handle multilingual text. For example, if your images contain both English and Spanish text, EasyOCR can recognize both simultaneously.

reader = easyocr.Reader(['en', 'es'])

  2. Confidence Threshold: If you’re only interested in highly confident predictions, you can filter out results based on the confidence score.

for (bbox, text, prob) in results:
    if prob > 0.7:  # Only display results with confidence greater than 70%
        print(f"Detected text: {text} with confidence {prob:.4f}")

  3. Improving Accuracy with Image Preprocessing: Like other OCR tools, EasyOCR benefits from clean, high-contrast images. You can preprocess your images (e.g., by converting to grayscale or increasing contrast) to improve the accuracy of the OCR results.

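# Convert image to grayscale
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Apply thresholding to make the text stand out
_, binary_image = cv2.threshold(gray_image, 150, 255, cv2.THRESH_BINARY)

# Run OCR on the preprocessed image (readtext also accepts NumPy arrays)
results = reader.readtext(binary_image)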
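A fixed threshold of 150 suits some images but not others, for example photos that are darker or brighter overall. Otsu's method picks the threshold automatically by finding the gray level that best separates dark pixels from bright ones. OpenCV offers this through the cv2.THRESH_OTSU flag; the pure-Python version below is only a sketch to show the idea, and otsu_threshold is a name chosen for this example.

```python
def otsu_threshold(histogram):
    """Return the gray level (0-255) that best separates dark and bright pixels.

    `histogram` is a sequence of 256 pixel counts, one per gray level.
    """
    total = sum(histogram)
    weighted_total = sum(level * count for level, count in enumerate(histogram))

    best_threshold = 0
    best_variance = -1.0
    background_count = 0
    background_sum = 0

    for level, count in enumerate(histogram):
        background_count += count
        background_sum += level * count
        foreground_count = total - background_count
        if background_count == 0 or foreground_count == 0:
            continue

        mean_background = background_sum / background_count
        mean_foreground = (weighted_total - background_sum) / foreground_count

        # Between-class variance (up to a constant factor).
        variance = (
            background_count
            * foreground_count
            * (mean_background - mean_foreground) ** 2
        )
        if variance > best_variance:
            best_variance = variance
            best_threshold = level

    return best_threshold


# Invented histogram: dark text pixels near 50, bright background near 200.
histogram = [0] * 256
histogram[50] = 100
histogram[200] = 100

threshold = otsu_threshold(histogram)
print("Chosen threshold:", threshold)
```

With a grayscale NumPy array, np.bincount(gray_image.ravel(), minlength=256) would produce such a histogram, and the returned value could replace 150 in the cv2.threshold call.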

Handling Challenges with OCR

OCR is a powerful tool, but it’s not without its challenges:

  1. Varied Image Formats: Text in different fonts, orientations, and styles can make it harder for OCR engines to detect accurately.
  2. Complex Layouts: Documents that mix text with images, tables, or graphs can confuse OCR engines, leading to lower accuracy.
  3. Image Quality: Poor lighting, noise, or low resolution in images can hinder OCR performance.

EasyOCR generally performs well with text in multiple fonts and complex layouts, but preprocessing steps such as binarization or image sharpening can further improve the results in difficult cases.
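Since it is hard to know in advance whether preprocessing will help a particular image, one practical approach is to run OCR on several versions of the image and keep the one with the highest average confidence. The sketch below illustrates that approach under some assumptions: best_variant and mean_confidence are hypothetical helpers, and a stand-in function replaces reader.readtext so the example runs without a model.

```python
def mean_confidence(results):
    """Average confidence of (bbox, text, confidence) tuples; 0.0 if empty."""
    if not results:
        return 0.0
    return sum(prob for _, _, prob in results) / len(results)


def best_variant(read_fn, variants):
    """Run read_fn on each named image and return (name, results) of the best."""
    outputs = {name: read_fn(image) for name, image in variants.items()}
    best_name = max(outputs, key=lambda name: mean_confidence(outputs[name]))
    return best_name, outputs[best_name]


# Stand-in for reader.readtext, returning invented results for each "image".
def fake_readtext(image):
    canned = {
        "original": [(None, "He1lo", 0.55), (None, "wor1d", 0.61)],
        "binary": [(None, "Hello", 0.93), (None, "world", 0.90)],
    }
    return canned[image]


name, results = best_variant(
    fake_readtext, {"original": "original", "binary": "binary"}
)
print(name, [text for _, text, _ in results])
```

In a real script you would call best_variant(reader.readtext, {"original": image, "binary": binary_image}) instead. Average confidence is only a rough proxy for quality, so it is worth spot-checking the chosen output.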

Conclusion

EasyOCR is a simple yet powerful tool for extracting text from images in Python. With its ability to handle multiple languages and complex layouts, it provides an excellent alternative to more traditional OCR tools like Tesseract. By integrating it into your workflow, you can automate the process of text extraction from images, saving both time and effort.

Whether you’re working on digitizing paper records, analyzing street signs, or extracting text from screenshots, EasyOCR can help you get the job done. Try it out today and see how it can transform the way you handle image-based text!

Follow me on LinkedIn: https://www.linkedin.com/in/kevin-meneses-897a28127/

Subscribe to the Data Pulse Newsletter: https://www.linkedin.com/newsletters/datapulse-python-finance-7208914833608478720

Join my Patreon Community https://patreon.com/user?u=29567141&utm_medium=unknown&utm_source=join_link&utm_campaign=creatorshare_creator&utm_content=copyLink
