Optical Character Recognition (OCR): PyTesseract vs. EasyOCR

Text extraction from images is becoming one of the most common applications of artificial intelligence. You meet it at the very start of your deep learning coding career, when you confront the MNIST dataset and read about convolutional neural networks (CNNs). As you advance, you will need to apply these basic concepts to more sophisticated solutions, such as a script that automatically reads the plate number of a car [1]. But you don’t always need to build a CNN from scratch. These functions are offered as services by the main cloud providers, and there are free open-source libraries that you can include in a Python script.
When I received the assignment of detecting text and digits to extract expenses from a bill, I tested two popular OCR libraries that run flawlessly in a Google Colab notebook. I downloaded a sample of 200 receipt images to the Colab virtual machine using the commands
!wget https://expressexpense.com/large-receipt-image-dataset-SRD.zip  # download from source
!unzip large-receipt-image-dataset-SRD.zip -d /content/invoice_data/  # extract to a custom subfolder "invoice_data"
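Once the archive is unpacked, the image files still have to be located before anything can read them. The following is a minimal sketch of one way to do that, not the code used for this article: the helper name `list_receipt_images` and the set of file extensions are assumptions, and the search is recursive because the folder layout inside the zip is not described here.

```python
from pathlib import Path

# Assumption: the receipts are stored in common raster formats; adjust as needed.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_receipt_images(root="/content/invoice_data"):
    """Return a sorted list of image paths found anywhere under `root`."""
    root = Path(root)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
```

Each returned path can later be handed to an image loader, for example OpenCV's `cv2.imread(str(path), cv2.IMREAD_GRAYSCALE)`, which gives the single-channel array that Otsu thresholding expects.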
Then I imported OpenCV for loading and preprocessing the image,
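import cv2
# `image` is assumed to be a grayscale (single-channel, 8-bit) array, as Otsu thresholding requires
_, img_bin = cv2.threshold(image, 20, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)  # with THRESH_OTSU the 20 is ignored; the first return value is the chosen threshold, not an image
thresh = cv2.bitwise_not(img_bin)  # invert black and white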
to be analyzed by Pytesseract. The results weren’t always accurate, so I looked for a similar Python library, EasyOCR, to check whether they could be improved. EasyOCR is built on the PyTorch library, and having a GPU speeds up the whole detection process. This is not an issue, as a GPU runtime can be used for free in Google Colab. As shown below in Figure 1, more characters were detected, and with higher accuracy. In this case, image preprocessing was not necessary since it is done automatically, but a language has to be specified. You can select several among 58, and I simply chose English [2].
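For readers who want to see what the two calls look like side by side, here is a minimal sketch. It is an illustration under assumptions rather than the exact code from the notebook: the function names and the `use_gpu` parameter are made up for this example, and the imports sit inside the functions only so that each library is needed just when its function is called. `pytesseract.image_to_string`, `easyocr.Reader` and `Reader.readtext` are the libraries' own entry points.

```python
def read_with_pytesseract(binary_image):
    """OCR a preprocessed (e.g. thresholded) image array with Tesseract."""
    import pytesseract  # also needs the Tesseract binary installed on the machine
    return pytesseract.image_to_string(binary_image)


def read_with_easyocr(image_path, languages=("en",), use_gpu=True):
    """OCR an image file with EasyOCR.

    Returns a list of (bounding_box, text, confidence) tuples.
    """
    import easyocr  # creating a Reader downloads model weights on first use
    reader = easyocr.Reader(list(languages), gpu=use_gpu)
    return reader.readtext(image_path)


def easyocr_text(results):
    """Join the text field of EasyOCR results, one detection per line."""
    return "\n".join(text for _box, text, _confidence in results)
```

When processing many receipts, it is better to create the `easyocr.Reader` once and reuse it, since loading the models is the slow part; the sketch builds it inside the function only to stay short.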

Bottom line: EasyOCR was the winner for today, with the minor downside of needing a GPU-accelerated machine by default.
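A verdict like this is easier to check with a number. The sketch below is one possible way to score each library, assuming you have typed out the true text of a receipt by hand to serve as a reference. The helper names are hypothetical, and the ratio from `difflib.SequenceMatcher` is only a rough similarity measure, not a formal character error rate.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so layout differences are not penalized."""
    return " ".join(text.lower().split())


def similarity(ocr_text, reference_text):
    """Return a similarity ratio in [0, 1] between OCR output and a reference."""
    matcher = SequenceMatcher(None, normalize(ocr_text), normalize(reference_text))
    return matcher.ratio()
```

Calling `similarity` on the output of each library against the same reference text gives two scores that can be compared directly, receipt by receipt.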
References
[1] https://jideilori.medium.com/ocr-with-machine-learning-55c7d082fe78
The full code can be found at https://github.com/opsabarsec/Receipts-OCR-on-colabs