CloudPhy OCR Pipeline

This is a Python script for Optical Character Recognition (OCR) of images that are captured by a camera. The script first imports the required libraries, installs the necessary packages, and mounts the Google Drive. It then defines the functions for segmenting, cropping, classifying, object detection, and OCR tasks.

Importing Libraries

The script starts by importing the following libraries:

tesseract-ocr and pytesseract for OCR.
PIL, matplotlib, cv2, and numpy for image processing.
os and google.colab to mount the Google Drive.
roboflow for image segmentation.

Segmenting Images

The segment function is used for image segmentation, which is done using Roboflow. The function first imports the required libraries, including base64, cv2, matplotlib, and numpy. It then reads the input image, resizes it, and passes it to the Roboflow model for segmentation. The function then extracts the segmentation mask from the resulting JSON and converts it to a NumPy array. Finally, it applies the mask to the original image and returns the segmented image.

Cropping Images

The cropping function is used to crop the segmented image. The function takes as input the segmented image and the points for the region of interest (ROI). It first imports cv2 and matplotlib, and then performs perspective transformation to warp the image and extract the ROI. The function returns the cropped image.

Classifying Images

The classify function is used for image classification. The function first imports tensorflow, numpy, and keras.utils. It then loads the pre-trained model for image classification from the Google Drive, preprocesses the input image, and passes it to the model for prediction. The function returns the class label.

Object Detection

The inference_obj function is used for object detection. The function first imports the required libraries, including glob, matplotlib, cv2, requests, and numpy. It then performs object detection on the input images using the pre-trained YOLOv5 model. The function returns the directory where the inference results are stored.

OCR Task

The ocr_task function is used for OCR. The function takes as input an array of cropped images, reads each image, applies Gaussian blur and sharpening filters, and passes it to the OCR engine for text recognition. The function then returns the OCR result for each image.

Google Drive Integration

The script mounts the Google Drive at the beginning using the drive.mount() function to store the pre-trained models and other necessary files.

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
static		static
INTER_IIT.pptx		INTER_IIT.pptx
Inference.ipynb		Inference.ipynb
Readme.md		Readme.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

CloudPhy OCR Pipeline

Importing Libraries

Segmenting Images

Cropping Images

Classifying Images

Object Detection

OCR Task

Google Drive Integration

About

Releases

Packages

Languages

alankritkadian/Inter-IIT-CloudPhy

Folders and files

Latest commit

History

Repository files navigation

CloudPhy OCR Pipeline

Importing Libraries

Segmenting Images

Cropping Images

Classifying Images

Object Detection

OCR Task

Google Drive Integration

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages