Imports inside init causing extremely slow load times #8704

Open
1 task done
lohit8846 opened this issue Jan 10, 2025 · 0 comments · May be fixed by #8706

lohit8846 commented Jan 10, 2025

Describe the bug
Importing any single component causes unrelated packages to be loaded as well, which increases load times significantly.

The __init__.py files in the component directories are not optimized: importing a single component brings in all the others, even when the full path to the file is used. Users of the framework have no control over this.

Here's a simple example.

from haystack.components.routers.conditional_router import ConditionalRouter

The line above still imports every other component defined in the routers __init__.py, including TransformersZeroShotTextRouter, which in turn loads the ML libraries transformers and sklearn. This follows from how Python imports work: before a submodule is loaded, the __init__.py of each parent package is executed in full.
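To illustrate the mechanism, here is a minimal sketch that builds a throwaway package on disk and imports one submodule by its full path. The package and class names (demo_routers, Light, Heavy) are hypothetical stand-ins, not Haystack code; Heavy plays the role of a component with expensive dependencies.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk. All names are hypothetical.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_routers"
pkg.mkdir()

# The package __init__.py eagerly re-exports every component.
(pkg / "__init__.py").write_text(
    "from demo_routers.light import Light\n"
    "from demo_routers.heavy import Heavy\n"
)
(pkg / "light.py").write_text("class Light: pass\n")
(pkg / "heavy.py").write_text("class Heavy: pass\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Import only the light submodule, using its full path.
from demo_routers.light import Light

# The parent __init__.py ran first, so the heavy submodule was loaded too.
heavy_loaded = "demo_routers.heavy" in sys.modules
print(heavy_loaded)  # True
```

The full-path import does not help because Python has to initialize demo_routers before it can resolve demo_routers.light.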

This is a problem because load times grow with components and packages that are never used. In a complete RAG pipeline, I have seen load times reach 5-7 seconds, with several ML libraries such as torch being loaded from /utils/__init__.py, which on its own takes a few seconds.
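One way to check numbers like these is to time an import and list the modules it pulls in. The sketch below is an assumption about how one might measure this, not the method used for the figures above; timed_import is a hypothetical helper, and json is only a placeholder module so the snippet runs anywhere.

```python
import importlib
import sys
import time


def timed_import(module_name):
    """Import a module; return (seconds elapsed, newly loaded module names)."""
    before = set(sys.modules)
    start = time.perf_counter()
    importlib.import_module(module_name)
    elapsed = time.perf_counter() - start
    # Anything that appeared in sys.modules was loaded as a side effect.
    new_modules = sorted(set(sys.modules) - before)
    return elapsed, new_modules


# Placeholder target; substitute the component path you want to inspect.
elapsed, new_modules = timed_import("json")
print(f"{elapsed:.4f}s, {len(new_modules)} new modules")
```

Run this in a fresh interpreter, since modules that are already cached in sys.modules are not imported again. CPython's `python -X importtime` flag gives a per-module breakdown without any extra code.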

Expected behavior
Imports should be lazy, loaded only when they are accessed, and __init__.py should not automatically load everything. This would significantly improve load times.

I have put together a pull request with suggested changes that import lazily and efficiently while still maintaining type checking for IDEs. Please prioritize reviewing it, as performance is an issue for our users.
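For readers unfamiliar with the idea, here is one common way to get lazy imports while keeping type checking: a module-level `__getattr__` (PEP 562) combined with a `typing.TYPE_CHECKING` block. This is a generic sketch and not necessarily what the pull request does; the package lazy_routers and its classes are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package whose __init__.py resolves names lazily.
root = Path(tempfile.mkdtemp())
pkg = root / "lazy_routers"
pkg.mkdir()
(pkg / "light.py").write_text("class Light: pass\n")
(pkg / "heavy.py").write_text("class Heavy: pass\n")
(pkg / "__init__.py").write_text('''
import importlib
from typing import TYPE_CHECKING

# Public name -> submodule that defines it.
_LAZY = {"Light": "light", "Heavy": "heavy"}
__all__ = list(_LAZY)

if TYPE_CHECKING:
    # Seen by type checkers and IDEs only; never executed at runtime.
    from .heavy import Heavy
    from .light import Light


def __getattr__(name):
    # PEP 562: called only when normal module attribute lookup fails.
    if name in _LAZY:
        module = importlib.import_module(f".{_LAZY[name]}", __name__)
        value = getattr(module, name)
        globals()[name] = value  # cache so later lookups skip __getattr__
        return value
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
''')

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# A full-path import no longer drags in the heavy submodule.
from lazy_routers.light import Light
heavy_before = "lazy_routers.heavy" in sys.modules

# The heavy submodule is loaded only when its name is first requested.
from lazy_routers import Heavy
heavy_after = "lazy_routers.heavy" in sys.modules

print(heavy_before, heavy_after)  # False True
```

The package-level import style (`from lazy_routers import Heavy`) keeps working, so existing user code would not need to change under this approach.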

Other related issues/PRS

#8650
#8706
#8655

FAQ Check
