Shakes Vision

Setting Up OCR for Windows and Linux: A Comprehensive Guide

Automating repetitive tasks like extracting text from images can save valuable time (unless you don't value it, in which case it will save some worthless time). This process, known as Optical Character Recognition (OCR), converts text in images into an editable format. Here, I’ll walk you through setting up a custom OCR solution on both Windows (which we all have) and Linux (mostly used in offices), complete with keyboard shortcuts for seamless integration.

I primarily use OCR for Urdu in my personal work, but professionally I also need it for English. Google Lens is a fine option, unless you dislike repeating the same clicks and key presses just to copy text from an image. And to be honest, I kind of feel bad even for giant corps like Google when I unnecessarily use their 'precious' resources.

Why not use a browser extension, you ask? Well, because it's limited to the browser, and we need text from other apps as well. You could argue that one can take a screenshot of the app, then switch to the browser and run OCR there, but if you have already opened a browser and can afford to take a screenshot just for that, why not run Google Lens instead of an extension? You get the point.

Background

OCR technology is invaluable for tasks such as digitizing printed documents, extracting text from screenshots, or processing scanned images. By automating OCR, you can:

Instantly access extracted text.
Improve productivity.
Simplify your workflow.

This guide provides a step-by-step walkthrough for setting up OCR on Windows and Linux, ensuring a smooth and user-friendly experience.

Introduction

Why Automate OCR?

Manual text extraction is time-consuming and error-prone. Automating the process ensures:

Faster access to text data.
Minimal effort for repetitive tasks.
A consistent and reliable workflow.

How It Works

We’ll create scripts for Windows and Linux that:

Capture an image or utilize an existing one.
Perform OCR using Tesseract (an open-source OCR engine).
Copy the extracted text directly to the clipboard.
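
The steps above can be sketched as a pair of small shell functions for the "existing image" case. This is only a minimal sketch, assuming Tesseract is installed on a Linux system; the names join_langs and ocr_file are hypothetical helpers made up for illustration, not part of Tesseract.

```shell
#!/bin/bash
# Hypothetical helpers (illustrative names, not part of Tesseract or xclip).

# Join language codes with "+", the form Tesseract's -l option expects.
# join_langs eng ara  ->  eng+ara
join_langs() {
    local IFS='+'
    printf '%s\n' "$*"
}

# OCR an existing image file and print the recognized text to stdout.
# Usage: ocr_file IMAGE [LANG...]   (defaults to eng)
ocr_file() {
    local image=$1
    shift
    # Fall back to English when no language codes are given.
    if [ "$#" -eq 0 ]; then
        set -- eng
    fi
    local langs
    langs=$(join_langs "$@")
    # "stdout" as the output base makes Tesseract print the text
    # instead of writing a .txt file.
    tesseract "$image" stdout -l "$langs"
}
```

For example, ocr_file screenshot.png ara eng | xclip -selection clipboard would cover all three steps for an image that is already on disk.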

Setup

Prerequisites

Before getting started, ensure you have the following:

Tesseract OCR
- Download and install from Tesseract’s official page.
- Install the language data you need (e.g., eng for English, ara for Arabic, urd for Urdu). At run time, select languages with the -l option (e.g., -l eng, or -l ara+eng for Arabic and English).
Clipboard Utilities
- Windows: Use nircmd to save the clipboard image to a file; the built-in clip command copies the extracted text back to the clipboard.
- Linux: Install xclip for clipboard management.
Screenshot Tools
- Windows: Use built-in snipping tools or third-party software.
- Linux: Install flameshot for advanced screenshot functionality.
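
Before wiring up a shortcut on Linux, it may help to confirm the prerequisites are actually in place. The following is a rough sketch: missing_tools and has_lang are hypothetical helper names, and the only real commands relied on are command -v, grep, and tesseract --list-langs.

```shell
#!/bin/bash
# Hypothetical helpers; only `command -v`, `grep`, and
# `tesseract --list-langs` are real commands here.

# Print each of the given commands that is not found on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Succeed if Tesseract lists the given language code as installed.
# stderr is merged in because the list may be printed there,
# depending on the Tesseract version (assumption).
has_lang() {
    tesseract --list-langs 2>&1 | grep -qx -- "$1"
}
```

A typical check would be missing_tools tesseract xclip flameshot, followed by has_lang urd || echo "Urdu data not installed".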

Procedure

For Windows

1. Create the OCR Script

Create a batch file named sstoocr.bat and save it in a convenient location. The script expects nircmd.exe in a nircmd subfolder next to it:

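@echo off
:: Work in the script's own folder so relative paths resolve when launched via a shortcut
cd /d "%~dp0"

:: Save the clipboard image to a file; /wait makes sure the file exists before OCR starts
start "" /wait nircmd\nircmd.exe clipboard saveimage screenshot.png

:: Run Tesseract OCR on the image (writes output.txt)
tesseract screenshot.png output -l ara+eng

:: Copy the extracted text to the clipboard
type output.txt | clip

:: Optionally, clean up
:: del screenshot.png
:: del output.txt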

2. Assign a Shortcut

Place the script on your desktop.
Right-click the script and select Create Shortcut.
Right-click the shortcut, go to Properties, and under the Shortcut tab, assign Ctrl + Alt + O as the shortcut key.

3. Use the Script

Copy an image to the clipboard, for example by taking a screenshot with Win + Shift + S.
Press Ctrl + Alt + O.
The extracted text will automatically be copied to your clipboard.

For Linux

1. Create the OCR Script

Create a shell script named flameshot_ocr.sh:

#!/bin/bash
flameshot gui --raw | tesseract -l eng stdin stdout | xclip -selection clipboard

Make the script executable:

chmod +x flameshot_ocr.sh
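
The one-line pipeline overwrites the clipboard even when the capture is cancelled or OCR finds nothing. A slightly more defensive variant might look like the sketch below. It assumes a cancelled Flameshot capture writes no image data to stdout, and ocr_region is a hypothetical function name chosen for illustration.

```shell
#!/bin/bash
# Sketch of a more defensive variant; ocr_region is a hypothetical name.
# Usage: ocr_region [LANGS]   e.g. ocr_region ara+eng   (defaults to eng)
ocr_region() {
    local langs=${1:-eng}
    local shot text
    shot=$(mktemp) || return 1

    # Save the selected region to a temp file.
    # Assumption: the file stays empty if the capture is cancelled.
    flameshot gui --raw > "$shot"
    if [ ! -s "$shot" ]; then
        rm -f "$shot"
        return 1
    fi

    # Run OCR, then copy only if some text was recognized,
    # so a failed run does not wipe the clipboard.
    text=$(tesseract "$shot" stdout -l "$langs" 2>/dev/null)
    rm -f "$shot"
    [ -n "$text" ] || return 1
    printf '%s' "$text" | xclip -selection clipboard
}
```

To use it from a shortcut, add a final line ocr_region "$@" to the file. On Wayland sessions, wl-copy from the wl-clipboard package is the usual substitute for xclip.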

2. Assign a Shortcut

Open your desktop environment’s keyboard settings.
Add a custom shortcut:
- Command: /path/to/flameshot_ocr.sh
- Shortcut: Ctrl + Shift + O

3. Use the Script

Press Ctrl + Shift + O to open the Flameshot GUI.
Select the area to capture.
The text will be extracted and copied to your clipboard.

Conclusion

By following this guide, you can set up a streamlined OCR solution for both Windows and Linux. With a simple keyboard shortcut, you’ll have quick access to extracted text directly on your clipboard, saving time and effort.

Feel free to customize these scripts to better suit your needs. Happy automating, and may your workflows become ever more efficient!
