This repo holds the winning 1st place notebook from the SuperAI Season 3 Image Processing Hackathon!
This Image Processing Hackathon is one of the six hackathons used to qualify for the Super AI Season 3 Program. It is open to the public, so people of all ages and education levels from across the nation take part.
The goal of this hackathon is to classify digits (0-9) from images which could be written, printed, or from anywhere in the real world. The dataset is very diverse with the digits varying in size, color, and contrast.
My approach combines two things:
- Image processing techniques
- A state-of-the-art Vision Transformer model
Because the dataset is so diverse, I preprocessed every image to make the images as similar as possible:
- Resize every image to fit within 224x224 pixels (keeping the aspect ratio) and pad the remaining space.
- Apply auto contrast to increase the contrast between the digit and the background.
- Convert the image to grayscale to emphasize the contrast and to discourage classification based on digit color.
- Save the image.
- Invert the grayscale image and save it as a second image (effectively augmenting the dataset) so that both dark-on-light and light-on-dark digits are covered.
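
The steps above can be sketched in a few lines of Python. This is a minimal illustration and not the notebook's actual code: it assumes Pillow as the imaging library, and the function name `preprocess`, the black padding color, the centered placement, and the default resampling filter are all assumptions made for the example.

```python
from PIL import Image, ImageOps

TARGET = 224  # side length in pixels, taken from the steps above


def preprocess(img, size=TARGET, fill=0):
    """Return (grayscale, inverted) copies of img, each size x size pixels.

    Hypothetical helper: the padding color `fill` and the centered
    placement are assumptions, not details from the notebook.
    """
    img = img.convert("RGB")

    # Scale so the longer side equals `size`, keeping the aspect ratio.
    scale = size / max(img.size)
    new_w = min(size, max(1, round(img.width * scale)))
    new_h = min(size, max(1, round(img.height * scale)))
    resized = img.resize((new_w, new_h))

    # Pad: paste the resized image onto the center of a square canvas.
    canvas = Image.new("RGB", (size, size), (fill, fill, fill))
    canvas.paste(resized, ((size - new_w) // 2, (size - new_h) // 2))

    # Stretch the intensity range, then drop the color information.
    contrasted = ImageOps.autocontrast(canvas)
    gray = ImageOps.grayscale(contrasted)

    # The inverted copy serves as the augmented second image.
    inverted = ImageOps.invert(gray)
    return gray, inverted
```

Calling `save()` on each returned image, with output paths of your choice, covers the two save steps.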
The final step was choosing a SOTA model. After reading a few papers, I settled on a Vision Transformer due to the nature of its architecture.
You can find the code in the Image_Processing_Hackathon.ipynb notebook.
All in all, this was a very informative experience and I have to thank AI Builders for providing me with the knowledge to do all of these things!