The avtable_import_set command can create duplicate sets #61

Open
JoshEvans33 opened this issue Aug 3, 2022 · 4 comments

@JoshEvans33

Hi,

It was recently brought to our attention that the avtable_import_set command will create duplicate entities within a data set in Terra if the same code is run twice. It appears that the code doesn't check whether the entities already exist and adds them again.

For example, suppose I create a small data frame of "sample1", "sample2", and "sample3", then use avtable_import_set to add those entities to a data set in my Terra workspace. If I run that command again, I get 6 entities; the system neither checks whether the data changed nor ignores the duplicates.
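
To make the behavior concrete, here is a minimal Python sketch that only models what I'm describing. It does not call the AnVIL package or any Terra API; the function names, the dictionary standing in for the workspace table, and the set name "my_set" are all hypothetical.

```python
def import_set_append(table, set_name, members):
    """Model of the reported behavior: members are appended with no check."""
    table.setdefault(set_name, []).extend(members)


def import_set_idempotent(table, set_name, members):
    """Hypothetical alternative: skip members the set already contains."""
    existing = table.setdefault(set_name, [])
    seen = set(existing)  # needs the full existing membership up front
    for member in members:
        if member not in seen:
            existing.append(member)
            seen.add(member)


samples = ["sample1", "sample2", "sample3"]

# Stand-in for the workspace table; running the same import twice
# doubles the membership.
appended = {}
import_set_append(appended, "my_set", samples)
import_set_append(appended, "my_set", samples)

# The same two calls with a membership check leave the set unchanged.
deduplicated = {}
import_set_idempotent(deduplicated, "my_set", samples)
import_set_idempotent(deduplicated, "my_set", samples)

print(len(appended["my_set"]))      # 6
print(len(deduplicated["my_set"]))  # 3
```

Note that the second version has to read the whole existing set before it can decide what to skip.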

@mtmorgan
Collaborator

Sorry to be slow @JoshEvans33 -- I'm not sure that this is a problem or unexpected behavior with the AnVIL package per se? For instance I can easily use the Terra user interface to edit the members of an entity set to contain duplicate entries. I'm also concerned that this could be quite expensive on a large set -- the entire set would have to be downloaded for verification. I can see how checking for duplicates would be a valuable feature...

Internally, the AnVIL package uses this entry point. I'll see if I can get some guidance from the Terra technical support team.

@mtmorgan
Collaborator

Terra support tells me the issue comes from https://support.terra.bio/hc/en-us/community/posts/5077709219355-Bug-report-duplicate-entities-appearing-in-sets (@smgogarten); I'll investigate a bit more...

@mtmorgan
Collaborator

Some follow-up. Duplicate entities within a set can be created through the UI (either using the 'edit' button, or by re-importing the same tsv file, provided the elements are of type 'list'), so this is an issue with the Terra environment. I understand that there is an open ticket, and I will hold off on modifying the AnVIL package until the upstream steps have been taken.

@JoshEvans33
Author

Hi, sorry for the delayed response. I'm actually a member of Terra Support, and we were able to determine that there is indeed a bug in the Terra UI, which this package interfaces with, that causes this behavior. Our engineers are aware of the issue and we'll look into resolving it on our end. Thanks for taking a look at this!
