
Fine tune PaliGemma to Recognize Issue Dates on Cheques Goal #320

Open
Gautam-Rajeev opened this issue Jun 11, 2024 · 2 comments

Gautam-Rajeev (Collaborator) commented Jun 11, 2024

Goal

Fine-tune the PaliGemma model to accurately recognize and extract issue dates from images of cheques using entity recognition techniques.

Description

The objective is to test the PaliGemma model's entity-recognition capabilities on the specific task of extracting issue dates from cheque images. A practical use case will be implemented by fine-tuning the model on a publicly available cheque dataset, focusing exclusively on the issue-date entity.

Implementation Details

This will include the following:

  • Download and preprocess the cheque dataset.
  • Fine-tune the PaliGemma model for NER on issue dates.
  • Evaluate performance on train-test splits.
  • Set up a pipeline to synthetically create such datasets if required in the future (using only cheque images and human+GPT-4 annotation), and keep fine-tuning the model.
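For the evaluation step above, predicted and ground-truth dates need to be compared in a single canonical form, since cheques print dates in many layouts. A minimal sketch of such a normalizer and an exact-match metric, assuming a small set of candidate formats (the format list and function names are illustrative, not from this issue):

```python
from datetime import datetime
from typing import Optional

# Hypothetical list of date layouts seen on cheques; extend as the
# dataset dictates.
CANDIDATE_FORMATS = [
    "%d/%m/%Y",   # 23/06/2024
    "%d-%m-%Y",   # 23-06-2024
    "%d %b %Y",   # 23 Jun 2024
    "%d %B %Y",   # 23 June 2024
    "%Y-%m-%d",   # 2024-06-23 (already canonical)
]

def normalize_issue_date(raw: str) -> Optional[str]:
    """Return the date in ISO YYYY-MM-DD form, or None if unparseable."""
    raw = raw.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(raw, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None

def exact_match_accuracy(predictions, references) -> float:
    """Exact-match score after normalizing both model output and label."""
    pairs = [
        (normalize_issue_date(p), normalize_issue_date(r))
        for p, r in zip(predictions, references)
    ]
    return sum(p == r and p is not None for p, r in pairs) / len(pairs)
```

Normalizing both sides before comparison keeps the metric from penalizing the model for formatting differences rather than recognition errors.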

Organization Name

SamagraX

Domain

Document Processing

Tech Skills Needed

Python, Machine Learning, Entity Recognition, Image Processing, Deep Learning

Category

Feature

Entity Recognition

Mentor(s)

@kartikbhtt7

Complexity

Simple

@Saswatsusmoy

Hi,

I've been working on this issue and recently completed the preprocessing part. I'm now planning to use "PaliGemma-3b-pt-224" for fine-tuning. Since I don't have much GPU access personally, is there a way I can work on this with at least 8-12 GB of VRAM?

@kartikbhtt7

Hey @Saswatsusmoy,
You can fine-tune the '224' model in a Google Colab or Kaggle notebook just by freezing some of the parameters.
Refer to this notebook -> https://colab.research.google.com/github/google/generative-ai-docs/blob/main/site/en/gemma/docs/paligemma/fine-tuning-paligemma.ipynb#scrollTo=rv7w-cGuLj5o
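The freezing idea mentioned above can be sketched in PyTorch on a toy two-layer model (a stand-in only; the actual PaliGemma weights are loaded via the linked notebook, and the layer names here are placeholders, not the real model's). Freezing the backbone means no gradients or optimizer state are stored for it, which is what makes fine-tuning fit in a low-VRAM setup:

```python
import torch
from torch import nn

# Toy model: the first layer stands in for the frozen backbone, the
# second for the small head that is actually fine-tuned.
model = nn.Sequential(
    nn.Linear(16, 16),  # "backbone" (frozen)
    nn.Linear(16, 4),   # "head" (trainable)
)

# Freeze everything first, then unfreeze only the final layer.
for p in model.parameters():
    p.requires_grad = False
for p in model[-1].parameters():
    p.requires_grad = True

# The optimizer is given only the trainable subset, so it allocates
# state (e.g. AdamW moments) for those parameters alone.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-4)
```

The same pattern applies to the real model: iterate over its named parameters and set `requires_grad = False` on everything except the components you want to adapt.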
