Cycle detection removes same edge multiple times #8657

Willenbrink · 2024-12-18T11:57:30Z

Describe the bug
_break_supported_cycles_in_graph might remove the same edge multiple times.

Error message
'NoneType' object has no attribute 'keys' in https://github.com/deepset-ai/haystack/blob/ea3602643aa52c27f3bea7bf5bc90b97f568dcdc/haystack/core/pipeline/base.py#L1218C1-L1218C94

Expected behavior
If one edge is part of two cycles, I expect the algorithm to break the edge only once. When it processes the second cycle, it shouldn't attempt to break the edge again.
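
A minimal sketch of that expectation, not Haystack's actual implementation: the graph is a plain dict standing in for the real pipeline graph, and `break_cycles`, `breakable`, and the node names are hypothetical. It records which edges were already removed and skips any later cycle that contains one of them.

```python
from typing import Dict, List, Optional, Set, Tuple

Edge = Tuple[str, str]


def get_edge_data(edges: Dict[Edge, dict], edge: Edge) -> Optional[dict]:
    """Return the edge's data, or None if the edge no longer exists."""
    return edges.get(edge)


def break_cycles(
    edges: Dict[Edge, dict], cycles: List[List[str]], breakable: Set[Edge]
) -> Set[Edge]:
    """Remove at most one breakable edge per cycle, never the same edge twice."""
    removed: Set[Edge] = set()
    for cycle in cycles:
        # Pair each node with its successor, wrapping around to close the cycle.
        cycle_edges = list(zip(cycle, cycle[1:] + cycle[:1]))
        if any(edge in removed for edge in cycle_edges):
            continue  # an earlier removal already broke this cycle
        for edge in cycle_edges:
            if edge in breakable and get_edge_data(edges, edge) is not None:
                del edges[edge]
                removed.add(edge)
                break
    return removed


# Hypothetical graph: edge ("a", "b") is shared by a->b->c->a and a->b->d->a.
edges = {("a", "b"): {}, ("b", "c"): {}, ("c", "a"): {}, ("b", "d"): {}, ("d", "a"): {}}
cycles = [["a", "b", "c"], ["a", "b", "d"]]
removed = break_cycles(edges, cycles, breakable={("a", "b")})
```

Here the shared edge is removed while handling the first cycle, and the second cycle is skipped because it is already broken.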

Additional context
I believe there are two other bugs occurring in my project, possibly related:

A cycle is deleted and not executed at all, so the pipeline terminates too early. I haven't been able to determine whether this really is a bug or what the cause is.
pipeline.draw() does not show user-provided value to variadic input (#8656). This may interfere with the topological sort of the graph, though I'm not sure whether that would affect the cycle detection.

As the cycle handling seems quite complicated, I'm wondering why Haystack does it at all. Why is the pipeline not based on a queue of components that have all their inputs, executing them one at a time and adding their connected components once those have received their inputs? Something like:

for component in ready_comps:
  component.run()
  for connected_comp in component.connected_components:
    if connected_comp.has_all_inputs():
      ready_comps.append(connected_comp)
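
The idea above could be fleshed out roughly as follows. This is only a sketch under stated assumptions: the `Component` class, its input counting, `run_pipeline`, and the `max_runs` guard are all hypothetical and do not correspond to Haystack's API.

```python
from collections import deque
from typing import Deque, List


class Component:
    """Hypothetical component: runnable once every expected input has arrived."""

    def __init__(self, name: str, expected_inputs: int) -> None:
        self.name = name
        self.expected_inputs = expected_inputs
        self.received = 0
        self.connected_components: List["Component"] = []

    def has_all_inputs(self) -> bool:
        return self.received >= self.expected_inputs

    def run(self) -> None:
        # Consume own inputs, then deliver one output to each connected component.
        self.received = 0
        for downstream in self.connected_components:
            downstream.received += 1


def run_pipeline(start: List[Component], max_runs: int = 100) -> List[str]:
    """Run components in FIFO order; max_runs guards against endless loops."""
    ready: Deque[Component] = deque(start)
    order: List[str] = []
    while ready and len(order) < max_runs:
        component = ready.popleft()
        component.run()
        order.append(component.name)
        for downstream in component.connected_components:
            # Enqueue a component once all inputs are present, but only once.
            if downstream.has_all_inputs() and downstream not in ready:
                ready.append(downstream)
    return order


# Diamond-shaped example: a feeds b and c, which both feed d.
a, b, c, d = Component("a", 0), Component("b", 1), Component("c", 1), Component("d", 2)
a.connected_components = [b, c]
b.connected_components = [d]
c.connected_components = [d]
order = run_pipeline([a])
```

In this sketch `d` is only enqueued after both `b` and `c` have run, so no explicit cycle breaking or topological sort is needed; a loop would simply re-enqueue a component whenever its inputs arrive again.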

To Reproduce
Probably:

Have one edge as part of two cycles.
Run the pipeline
Observe the edge being removed first
Observe .get_edge_data failing for the second cycle as the edge no longer exists.

I can reliably reproduce the issue in my project.
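
The failure mode described in the steps above can be illustrated without Haystack. The snippet below is a simplified stand-in, assuming an edge lookup that returns None for a missing edge (as .get_edge_data appears to do here, given the error message); the `edges` dict, `naive_break`, and the node and key names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Edge = Tuple[str, str]

# Hypothetical stand-in for a multigraph: edge -> {edge_key: attributes}.
edges: Dict[Edge, Dict[str, dict]] = {
    ("a", "b"): {"out/in": {}},
    ("b", "c"): {"out/in": {}},
    ("c", "a"): {"out/in": {}},
    ("b", "d"): {"out/in": {}},
    ("d", "a"): {"out/in": {}},
}


def get_edge_data(edge: Edge) -> Optional[Dict[str, dict]]:
    # Assumption: a missing edge yields None instead of raising.
    return edges.get(edge)


def naive_break(cycle: List[str]) -> List[str]:
    """Break the cycle at its first edge without checking the edge still exists."""
    edge = (cycle[0], cycle[1])
    keys = list(get_edge_data(edge).keys())  # fails if the edge is already gone
    edges.pop(edge, None)
    return keys


# Both cycles share the edge ("a", "b").
naive_break(["a", "b", "c"])
try:
    naive_break(["a", "b", "d"])
    error = None
except AttributeError as exc:
    error = str(exc)
```

The second call fails because the shared edge was already removed while breaking the first cycle.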

FAQ Check

Have you had a look at our new FAQ page?

System:

OS: Linux
Haystack version (commit or version number): 2.7.0

etirelli · 2024-12-29T17:59:28Z

I submitted a proposal for the fix and a unit test to reproduce the problem: PR: #8677 . See my comments in there for details.

mathislucka · 2025-01-06T13:12:50Z

Thanks @Willenbrink and @etirelli for reporting the issue and working on a fix!

We noticed other potential issues in the way we handle pipelines with cycles and we would like to consider different use cases in-depth before merging a fix.

We'll proceed with collecting more (e2e) test cases to have a comprehensive test suite for realistic use cases and then work on a fix that runs pipelines with cycles robustly and deterministically.

@etirelli your test case already helps with building the test suite. We'd appreciate any other test cases for complex cyclic pipelines that you might have come across when working on your use cases.

@Willenbrink you mentioned that you have a few very complex pipelines that you used to test your PR. Can you share these pipelines (even just conceptually) so that we can add them to our test suite for realistic use cases?

Willenbrink · 2025-01-09T15:37:34Z

Here is an example of a pipeline with some elements redacted. I hope it is helpful anyway. In short: I have some non-LLM steps; then, if an error occurs, I parse it into JSON via an LLM and use this for another LLM call. If no error occurs, I expect the pipeline to terminate early (see the star).

julian-risch added the P1 High priority, add to the next sprint label Dec 20, 2024

etirelli added a commit to etirelli/haystack that referenced this issue Dec 29, 2024

fix: prevents exception when the pipeline contains multiple nested loops, due to the cycle detection removing the same edge multiple times (ref deepset-ai#8657)

f81ab09

etirelli linked a pull request Dec 29, 2024 that will close this issue

fix: prevents exception when the pipeline contains multiple nested loops #8677

Open

6 tasks

etirelli added a commit to etirelli/haystack that referenced this issue Dec 29, 2024

fix: prevents exception when the pipeline contains multiple nested loops, due to the cycle detection removing the same edge multiple times (ref deepset-ai#8657)

9fb6509

Willenbrink linked a pull request Jan 2, 2025 that will close this issue

fix: Remove cycle handling #8679

Open

julian-risch assigned mathislucka and Amnah199 Jan 6, 2025

mathislucka mentioned this issue Jan 9, 2025

Test: Show current pipeline run issues (DO NOT MERGE) #8695

Open

mathislucka linked a pull request Jan 11, 2025 that will close this issue

Fix: Pipeline.run logic #8707

Draft
