Cycle detection removes same edge multiple times #8657
Comments
…ops, due to the cycle detection removing the same edge multiple times (ref deepset-ai#8657)
I submitted a proposed fix and a unit test that reproduces the problem in PR #8677. See my comments there for details.
Thanks @Willenbrink and @etirelli for reporting the issue and working on a fix! We noticed other potential issues in the way we handle pipelines with cycles, and we would like to consider different use cases in depth before merging a fix. We'll proceed with collecting more (e2e) test cases to build a comprehensive test suite for realistic use cases, and then work on a fix that runs pipelines with cycles robustly and deterministically.
@etirelli your test case already helps with building the test suite. We'd appreciate any other test cases for complex cyclic pipelines that you might have come across when working on your use cases.
@Willenbrink you mentioned that you have a few very complex pipelines that you used to test your PR. Can you share these pipelines (even just conceptually) so that we can add them to our test suite for realistic use cases?
Describe the bug
_break_supported_cycles_in_graph might remove the same edge multiple times.
Error message
'NoneType' object has no attribute 'keys' in https://github.com/deepset-ai/haystack/blob/ea3602643aa52c27f3bea7bf5bc90b97f568dcdc/haystack/core/pipeline/base.py#L1218C1-L1218C94
Expected behavior
If one edge is part of two cycles, I expect the algorithm to only break the edge once. After checking the second cycle, it shouldn't attempt to break the edge.
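To make the expected behavior concrete, here is a minimal sketch in plain Python. It is not Haystack's code: `break_cycles_once`, the edge tuples, and the `breakable` set are hypothetical names chosen for illustration. The idea is simply to remember which edges were already removed and to skip any cycle that one of those removals has already broken.

```python
from typing import List, Set, Tuple

Edge = Tuple[str, str]


def break_cycles_once(
    edges: Set[Edge], cycles: List[List[str]], breakable: Set[Edge]
) -> Tuple[Set[Edge], Set[Edge]]:
    """Remove at most one breakable edge per cycle, never the same edge twice.

    Hypothetical helper: each cycle is a list of node names, and the edge
    from the last node back to the first closes the cycle.
    """
    remaining = set(edges)
    removed: Set[Edge] = set()
    for cycle in cycles:
        # Consecutive node pairs, wrapping around to close the cycle.
        cycle_edges = list(zip(cycle, cycle[1:] + cycle[:1]))
        # If an earlier removal already hit this cycle, it is already broken.
        if any(edge in removed for edge in cycle_edges):
            continue
        for edge in cycle_edges:
            if edge in breakable and edge in remaining:
                remaining.discard(edge)
                removed.add(edge)
                break
    return remaining, removed


# Two cycles, A -> B -> C -> A and A -> B -> D -> A, share the edge (A, B).
edges = {("A", "B"), ("B", "C"), ("C", "A"), ("B", "D"), ("D", "A")}
cycles = [["A", "B", "C"], ["A", "B", "D"]]
remaining, removed = break_cycles_once(edges, cycles, breakable={("A", "B")})
```

In this example the shared edge is removed while handling the first cycle, and the second cycle is skipped instead of triggering a second removal attempt.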
Additional context
I believe there are two other bugs occurring in my project, possibly related:
pipeline.draw() does not show user-provided value to variadic input #8656. This may interfere with the topological sort of the graph, though I'm not sure whether that would affect the cycle detection.
As the cycle handling seems quite complicated, I'm wondering why Haystack even does that. Why is the pipeline not based on a queue of components that have all their inputs, executing them one at a time and adding their connected components once they've got their inputs? Something like:
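One way to picture such a queue-based runner is the sketch below. It is illustrative only: `run_queue`, the `(required_inputs, function)` component tuples, and the connection mapping are hypothetical and are not part of Haystack's API. A component is queued once all of its required inputs have arrived, it consumes them when it runs, and its outputs are forwarded to connected components.

```python
from collections import deque


def run_queue(components, connections, initial_inputs, max_steps=100):
    """Run components from a FIFO queue as soon as their inputs are complete.

    components:  name -> (list of required input names, callable returning a dict)
    connections: (sender, output name) -> list of (receiver, input name)
    max_steps guards against cycles that never terminate.
    """
    inbox = {name: dict(initial_inputs.get(name, {})) for name in components}

    def ready(name):
        required, _ = components[name]
        return all(key in inbox[name] for key in required)

    queue = deque(name for name in components if ready(name))
    order, results = [], {}
    while queue and len(order) < max_steps:
        name = queue.popleft()
        _, func = components[name]
        inputs, inbox[name] = inbox[name], {}  # consume the inputs
        outputs = func(**inputs)
        order.append(name)
        results[name] = outputs
        # Forward each output and queue receivers that became ready.
        for out_name, value in outputs.items():
            for receiver, in_name in connections.get((name, out_name), []):
                inbox[receiver][in_name] = value
                if ready(receiver) and receiver not in queue:
                    queue.append(receiver)
    return order, results


def increment(value):
    return {"value": value + 1}


def check(value):
    # Loop back until the value reaches 3, then emit a final output.
    return {"again": value} if value < 3 else {"done": value}


components = {"increment": (["value"], increment), "check": (["value"], check)}
connections = {
    ("increment", "value"): [("check", "value")],
    ("check", "again"): [("increment", "value")],
}
order, results = run_queue(components, connections, {"increment": {"value": 0}})
```

Here the loop between `increment` and `check` runs without any cycle breaking, because a component only executes when its inbox is complete. A real pipeline would also need to handle optional and variadic inputs, which this sketch leaves out.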
To Reproduce
Probably .get_edge_data fails for the second cycle because the edge no longer exists.
I can reliably reproduce the issue in my project.