Automating repetitive tasks like extracting text from images can save valuable time (unless you don't value your time, in which case it will save some invaluable time). This process, known as Optical Character Recognition (OCR), converts text in images into an editable format. Here, I’ll walk you through setting up custom OCR solutions for both Windows (which most of us have) and Linux (which I mostly use at the office), complete with keyboard shortcuts for seamless integration.
Most of my personal OCR use is for Urdu, but professionally I need it for English as well. Google Lens is a fine option, except that you come to hate repeating those clicks and key presses just to copy some text from an image. And to be honest, I kind of feel bad even for giant corps like Google when I unnecessarily use their 'precious' resources.
Why not use a browser extension, you ask? Because it's limited to the browser, and we need text from other apps as well. You could argue that one can take a screenshot of that app and then run OCR on it in the browser, but if you have already opened a browser and taken a screenshot just for that, you might as well use Google Lens instead of an extension. You get the point.
Background
OCR technology is invaluable for tasks such as digitizing printed documents, extracting text from screenshots, or processing scanned images. By automating OCR, you can:
- Instantly access extracted text.
- Improve productivity.
- Simplify your workflow.
This guide provides a step-by-step walkthrough for setting up OCR on Windows and Linux, ensuring a smooth and user-friendly experience.
Introduction
Why Automate OCR?
Manual text extraction is time-consuming and error-prone. Automating the process ensures:
- Faster access to text data.
- Minimal effort for repetitive tasks.
- A consistent and reliable workflow.
How It Works
We’ll create scripts for Windows and Linux that:
- Capture an image or utilize an existing one.
- Perform OCR using Tesseract (an open-source OCR engine).
- Copy the extracted text directly to the clipboard.
Setup
Prerequisites
Before getting started, ensure you have the following:
Tesseract OCR
- Download and install from Tesseract’s official page.
- Install the necessary language packs (e.g., eng for English, or ara and eng for Arabic and English). You select them at run time with -l eng or -l ara+eng.
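Since the -l value is just language codes joined with a plus sign, it is easy to build it in a script and to check that each pack is actually installed. The following is a minimal sketch, not part of Tesseract itself: join_langs and missing_langs are helper names I made up, and the commented example assumes a Tesseract version that prints one header line followed by one language code per line for --list-langs.

```shell
#!/bin/bash
# Join language codes with "+" to form the value passed to tesseract's -l option.
join_langs() {
    local IFS='+'
    printf '%s\n' "$*"
}

# Print every requested language code that is absent from the installed list.
# $1: newline-separated installed codes; remaining arguments: requested codes.
missing_langs() {
    local installed="$1"
    shift
    local lang
    for lang in "$@"; do
        printf '%s\n' "$installed" | grep -qxF -- "$lang" || printf '%s\n' "$lang"
    done
}

# Example (needs tesseract installed; the header-skipping is an assumption):
#   installed="$(tesseract --list-langs 2>/dev/null | tail -n +2)"
#   missing_langs "$installed" ara eng
#   tesseract screenshot.png output -l "$(join_langs ara eng)"
```

If missing_langs prints nothing, every requested pack is available and the joined value can be passed straight to -l.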
Clipboard Utilities
- Windows: Use nircmd for clipboard operations.
- Linux: Install xclip for clipboard management.
Screenshot Tools
- Windows: Use built-in snipping tools or third-party software.
- Linux: Install flameshot for advanced screenshot functionality.
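Before wiring anything to a shortcut, it helps to confirm the tools are actually on your PATH, because a script launched from a hotkey fails silently. Here is a small sketch for the Linux side; missing_tools is a helper name of my own, not a standard command.

```shell
#!/bin/bash
# Print the name of every listed command that cannot be found on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example for the Linux prerequisites listed above:
#   missing_tools tesseract xclip flameshot
```

An empty result means all three prerequisites are installed; any name that is printed still needs to be installed through your package manager.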
Procedure
For Windows
1. Create the OCR Script
Create a batch file named sstoocr.bat and save it in a convenient location:
@echo off
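:: Save the clipboard image to a file (expects the nircmd folder next to this script)
:: /wait blocks until nircmd has finished, so Tesseract never reads a missing or stale file
start /wait "" nircmd\nircmd.exe clipboard saveimage screenshot.png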
:: Run Tesseract OCR on the image
tesseract screenshot.png output -l ara+eng
:: Copy extracted text to clipboard
type output.txt | clip
:: Optionally, clean up
:: del screenshot.png
:: del output.txt
2. Assign a Shortcut
- Place the script on your desktop.
- Right-click the script and select Create Shortcut.
- Right-click the shortcut, go to Properties, and under the Shortcut tab, assign Ctrl + Alt + O as the shortcut key.
3. Use the Script
- Copy an image to the clipboard or take a screenshot.
- Press Ctrl + Alt + O.
- The extracted text will automatically be copied to your clipboard.
For Linux
1. Create the OCR Script
Create a shell script named flameshot_ocr.sh:
#!/bin/bash
flameshot gui --raw | tesseract -l eng stdin stdout | xclip -selection clipboard
Make the script executable:
chmod +x flameshot_ocr.sh
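The one-liner works, but it has two rough edges: if you cancel the Flameshot selection, an empty result may still overwrite your clipboard, and recent Tesseract versions append a form-feed page separator to the text. Here is a slightly more defensive sketch of the same pipeline. The names clean_ocr_text and ocr_to_clipboard and the OCR_LANGS variable are my own inventions for illustration, not part of any of the tools.

```shell
#!/bin/bash
# Remove the form-feed page separator that Tesseract may append to its text output.
clean_ocr_text() {
    tr -d '\f'
}

# Capture a region, OCR it, and copy the result only if some text was recognized.
# OCR_LANGS is a hypothetical environment variable; it defaults to "eng".
ocr_to_clipboard() {
    local text
    text="$(flameshot gui --raw | tesseract stdin stdout -l "${OCR_LANGS:-eng}" 2>/dev/null | clean_ocr_text)"
    if [ -z "$text" ]; then
        echo "No text recognized; clipboard left unchanged." >&2
        return 1
    fi
    printf '%s' "$text" | xclip -selection clipboard
}

# To use this as flameshot_ocr.sh, call the function on the last line:
#   ocr_to_clipboard
```

Storing the text in a variable first also drops trailing newlines, so pasted text does not end with a stray blank line.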
2. Assign a Shortcut
- Open your desktop environment’s keyboard settings.
- Add a custom shortcut:
  - Command: /path/to/flameshot_ocr.sh
  - Shortcut: Ctrl + Shift + O
3. Use the Script
- Press Ctrl + Shift + O to open the Flameshot GUI.
- Select the area to capture.
- The text will be extracted and copied to your clipboard.
Conclusion
By following this guide, you can set up a streamlined OCR solution for both Windows and Linux. With a simple keyboard shortcut, you’ll have quick access to extracted text directly on your clipboard, saving time and effort.
Feel free to customize these scripts to better suit your needs. Happy automating, and may your workflows become ever more efficient!