Addition of local vlm folder support #1051
base: main
Conversation
Hi, this PR lets the user point `repo_id` at a local repository or a model already downloaded to a path on disk. I have run this locally:

```
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import (
    PdfPipelineOptions,
    PictureDescriptionVlmOptions,
)
from docling.document_converter import DocumentConverter, PdfFormatOption

pipeline_options = PdfPipelineOptions()
pipeline_options.do_picture_description = True
pipeline_options.picture_description_options = PictureDescriptionVlmOptions(
    repo_id="/opt/dlami/nvme/Qwen/Qwen2.5-VL-7B-Instruct",  # <-- local path (or Hugging Face repo_id) of your favorite VLM
    prompt="Extract the text from the images; if it is a table, extract the table format. If there is no text give 'No Image Text' response",
)
pipeline_options.images_scale = 2.0
pipeline_options.generate_picture_images = True

converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(
            pipeline_options=pipeline_options,
        )
    }
)
```

Signed-off-by: Navanit Dubey <[email protected]>
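For reference, the change being proposed boils down to a short-circuit before the download step: if `repo_id` already points to an existing directory on disk, use it as-is instead of downloading from Hugging Face. A minimal sketch of that idea, not the exact diff; `resolve_model_path` is a hypothetical helper name, and the real change lives inside docling's model-loading code:

```
from pathlib import Path

from huggingface_hub import snapshot_download


def resolve_model_path(repo_id: str) -> str:
    """Return a local model folder for repo_id.

    If repo_id is already a directory on disk, use it directly;
    otherwise fall back to downloading the snapshot from Hugging Face.
    """
    local = Path(repo_id)
    if local.is_dir():
        return str(local)
    return snapshot_download(repo_id=repo_id)
```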
Merge Protections: Your pull request matches the following merge protections and will not be merged until they are valid. 🔴 Enforce conventional commit: this rule is failing. Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/
Signed-off-by: Navanit Dubey <[email protected]>
@dolfim-ibm any review?
Could you please see if #1057 would be enough for your use case? We just started the approach of using …
I don't think this will resolve the error. The local model will be downloaded again.
@dolfim-ibm #1057 doesn't solve it through the artifact path, since it then wants to download EasyOCR and the other models into that path. So the artifact path only sets a single default location, whereas this PR makes the path dynamic for every model via its repo_id. cc @cau-git
@Navanit-git From what I can see, your proposed change would not change much:
We established the CLI model downloader, and an analogous model download API, to make it easy to pre-download models. However, if you want to work with pre-downloaded models and provide an …

Could you please explain what functionality you are missing, given this update?
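For context, the pre-downloaded-models workflow referred to above looks roughly like the sketch below. This assumes the `docling-tools models download` CLI and the `artifacts_path` option on `PdfPipelineOptions`, which is how docling exposes offline model usage; the folder path is illustrative:

```
# Pre-download the models once from the shell, e.g.:
#   docling-tools models download
# then point the pipeline at that folder via artifacts_path.
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.document_converter import DocumentConverter, PdfFormatOption

# Illustrative location; use wherever the downloader placed the models.
pipeline_options = PdfPipelineOptions(artifacts_path="/path/to/docling/models")

converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)
    }
)
```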
Hey, thank you for the review. Basically, I have a PDF on which I want to run OCR using a VLM model, so I followed the steps in the link below. I had downloaded the VLM model earlier to a local path which is not the cache folder, so when I pass the repo_id of that VLM model it gives a Hugging Face error because it starts downloading. To support this, I added one line: if the repo_id path already exists, don't download, just use that path.

Yes, I know this is a minor change, but I have a project to deliver where I have to OCR a PDF with images into a markdown/text file, and I thought this simple change could help. If it's redundant, kindly close this PR.

Also one small request: when can we expect the image description in the markdown export? I was working on it, so that instead of the image placeholder we can get the image description. I think I am very close to getting that by changing your docling_core library, but there is a time crunch for me on my project.
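On the markdown export point, while waiting for native support, one workaround is to read the generated descriptions off the converted document and splice them into the markdown yourself. A rough sketch, assuming the picture descriptions are stored as `PictureDescriptionData` annotations on `doc.pictures` (as in docling's picture-description example); `picture_descriptions` is just an illustrative helper:

```
from docling_core.types.doc.document import PictureDescriptionData


def picture_descriptions(doc):
    """Collect the generated text description of each picture, if any.

    `doc` is the DoclingDocument returned by converter.convert(...).document.
    """
    texts = []
    for picture in doc.pictures:
        for annotation in picture.annotations:
            if isinstance(annotation, PictureDescriptionData):
                texts.append(annotation.text)
    return texts
```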
@dolfim-ibm @cau-git ... |
this is coming very soon. |
Regarding the overall PR, my opinion is in a few bullets.
Thank you @dolfim-ibm. For now I have applied the above PR changes to the docling library in my local setup and it's working fine. But we never know if we will hit an error in the future; let's see if we can add patches for it or something.