fix: recreate document id if certain attributes are changed #8694

Closed
wants to merge 4 commits into from

julian-risch
Member

@julian-risch julian-risch commented Jan 9, 2025

Related Issues

Proposed Changes:

  • Re-create the document id whenever one of the document's attributes is updated or when meta is set for the first time.
  • Added a test to check that a change to any document attribute except score results in an updated document.id
  • Added a test to check that changes to document.score leave the document.id unchanged
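
As a rough illustration of the behavior described above, here is a minimal sketch. It is not Haystack's actual `Document` implementation: the class name `SketchDocument`, the `_ID_FIELDS` tuple, the `_create_id` helper, and the choice of SHA-256 over `content` and `meta` are all assumptions made for the example.

```python
import hashlib
from typing import Any, Dict, Optional


class SketchDocument:
    """Hypothetical stand-in for a document class (not Haystack's implementation)."""

    # Assumption: only these attributes feed the id; score is deliberately excluded.
    _ID_FIELDS = ("content", "meta")

    def __init__(
        self,
        content: Optional[str] = None,
        meta: Optional[Dict[str, Any]] = None,
        score: Optional[float] = None,
    ) -> None:
        self.content = content
        self.meta = meta if meta is not None else {}  # first id is computed here
        self.score = score

    def _create_id(self) -> str:
        # Sort meta items so the id does not depend on insertion order
        # (assumes string keys).
        data = f"{self.content}{sorted(self.meta.items())}"
        return hashlib.sha256(data.encode("utf-8")).hexdigest()

    def __setattr__(self, name: str, value: Any) -> None:
        super().__setattr__(name, value)
        # Recompute the id once every id-relevant field exists and one of them is set.
        if name in self._ID_FIELDS and all(hasattr(self, f) for f in self._ID_FIELDS):
            super().__setattr__("id", self._create_id())


doc = SketchDocument(content="hello")
id_before = doc.id
doc.score = 0.9          # score does not feed the id
assert doc.id == id_before
doc.meta = {"lang": "en"}  # reassigning meta triggers a new id
assert doc.id != id_before
```

Note that this sketch only reacts to attribute assignment. An in-place mutation such as `doc.meta["lang"] = "de"` would not pass through `__setattr__` and so would leave the id stale.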

How did you test it?

New unit tests

Notes for the reviewer

Our documentation explicitly states that the id will not be updated automatically: "since Haystack uses the document’s contents to create an ID, two identical documents might have identical IDs. Keep it in mind as you update your documents, as the ID will not be updated automatically."
https://docs.haystack.deepset.ai/docs/document-store#work-with-documents

Users might rely on ids not being auto-updated, so this would be a breaking change.

There is an alternative, more complex approach we could implement: if a document is initialized with a custom id, we keep that id regardless of any future changes to other attributes.
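
The alternative could look roughly like the following sketch. Again, this is an illustration under assumptions rather than code from this PR or from Haystack: `PinnedIdDocument`, the `_custom_id` flag, and `_derive_id` are hypothetical names.

```python
import hashlib
from typing import Any, Dict, Optional


class PinnedIdDocument:
    """Hypothetical sketch: a caller-supplied id is kept, a derived id tracks the attributes."""

    _ID_FIELDS = ("content", "meta")  # assumption: the fields that feed a derived id

    def __init__(
        self,
        content: Optional[str] = None,
        meta: Optional[Dict[str, Any]] = None,
        id: Optional[str] = None,
    ) -> None:
        # Record up front whether the caller supplied an id.
        self._custom_id = id is not None
        self.content = content
        self.meta = meta if meta is not None else {}
        self.id = id if id is not None else self._derive_id()

    def _derive_id(self) -> str:
        data = f"{self.content}{sorted(self.meta.items())}"
        return hashlib.sha256(data.encode("utf-8")).hexdigest()

    def __setattr__(self, name: str, value: Any) -> None:
        super().__setattr__(name, value)
        # Only derived ids are refreshed; a custom id stays pinned.
        if name in self._ID_FIELDS and hasattr(self, "id") and not self._custom_id:
            super().__setattr__("id", self._derive_id())


pinned = PinnedIdDocument(content="a", id="my-id")
pinned.content = "b"
assert pinned.id == "my-id"  # custom id survives the change

derived = PinnedIdDocument(content="a")
old_id = derived.id
derived.content = "b"
assert derived.id != old_id  # derived id follows the content
```

The extra state (`_custom_id`) is what makes this variant more complex: the object has to remember how its id came about, and that flag would also need to survive serialization.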

Checklist

  • I have read the contributors guidelines and the code of conduct
  • I have updated the related issue with new insights and changes
  • I added unit tests and updated the docstrings
  • I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test: and added ! in case the PR includes breaking changes.
  • I documented my code
  • I ran pre-commit hooks and fixed any issue

@julian-risch julian-risch changed the title from "recreate document id if certain attributes are changed" to "fix: recreate document id if certain attributes are changed" Jan 9, 2025
@julian-risch
Member Author

#8708 and #8698 were merged instead.

Development

Successfully merging this pull request may close these issues.

Document ID doesn't updated upon metadata update