Add report of duplicates resources ids #3247

Merged
ThibaudDauce merged 9 commits into master from add_report_of_duplicates_resources_ids
Jan 29, 2025

ThibaudDauce
Contributor

No description provided.

@ThibaudDauce ThibaudDauce requested a review from maudetes January 14, 2025 08:33
Contributor

@maudetes maudetes left a comment

Locally, I have 168 resources with duplicated IDs, counting only duplicates inside a single dataset.
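For illustration, here is a minimal sketch of how duplicates "inside datasets only" could be counted. It is not the code from this PR: the `duplicated_ids` helper and the sample data are hypothetical, and datasets are modelled as plain lists of resource IDs rather than udata models.

```python
from collections import Counter


def duplicated_ids(resource_ids):
    """Return the IDs that appear more than once within one dataset."""
    counts = Counter(resource_ids)
    return [rid for rid, n in counts.items() if n > 1]


# Hypothetical data: dataset name -> resource IDs in that dataset.
datasets = {
    "dataset-a": ["r1", "r2", "r1"],
    "dataset-b": ["r3", "r4"],
    "dataset-c": ["r5", "r5", "r5", "r1"],
}

# Keep only the datasets that actually contain a duplicated ID.
report = {}
for name, ids in datasets.items():
    dups = duplicated_ids(ids)
    if dups:
        report[name] = dups
```

Note that `r1` also appears in `dataset-c`, but only once there, so it is not reported for that dataset: an ID shared across two datasets is a different case from a duplicate inside one dataset.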

udata/commands/db.py
Comment on lines 509 to 516
for r in dataset.resources:
    # If it's the duplicated resource we're interested in and
    # that ID was already added to the new_resources (so we are
    # on the second resource), do not add it.
    if r.id == id and id in [r.id for r in new_resources]:
        continue

    new_resources.append(r)
Contributor

Can't we directly add resource1?

Suggested change
- for r in dataset.resources:
-     # If it's the duplicated resource we're interested in and
-     # that ID was already added to the new_resources (so we are
-     # on the second resource), do not add it.
-     if r.id == id and id in [r.id for r in new_resources]:
-         continue
-     new_resources.append(r)
+ new_resources.append(resource1)

Contributor Author

No, because we also add back all of the dataset's original resources. But I simplified the code a lot by using r == resource2 in dd1a108.
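To make the point concrete, here is a rough sketch of the idea, not the code from dd1a108. The `Resource` class and `drop_duplicate` helper are hypothetical stand-ins. The sketch compares by identity (`is`), an assumption made so that two resources sharing an ID remain distinguishable; the actual change uses `r == resource2`.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # eq=False: instances compare by identity, not by field values
class Resource:
    id: str
    title: str


def drop_duplicate(resources, duplicate):
    """Rebuild the list with every original resource except `duplicate`."""
    return [r for r in resources if r is not duplicate]


resource1 = Resource("a", "first")
other = Resource("b", "unrelated")
resource2 = Resource("a", "second copy")  # same ID as resource1

resources = [resource1, other, resource2]
new_resources = drop_duplicate(resources, resource2)
```

Appending only `resource1` would lose `other`. Rebuilding the list keeps every original resource and skips just the second copy.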

@ThibaudDauce ThibaudDauce merged commit cf762db into master Jan 29, 2025
1 check passed
@ThibaudDauce ThibaudDauce deleted the add_report_of_duplicates_resources_ids branch January 29, 2025 13:21