specimen: days_to_collection field #41

Open
pnrobinson opened this issue Nov 28, 2023 · 3 comments

@pnrobinson
Member

This should be mapped to the time_of_collection field of the GA4GH Biosample message.

Check what data is coming in for this field. We can probably use the same days-to-ISO 8601 conversion function as for disease and individual.

@msierk msierk self-assigned this Dec 4, 2023
@msierk
Collaborator

msierk commented Dec 7, 2023

Here's the definition of days_to_collection from GDC:

"The number of days from the index date to the date a sample was collected for a specific study or project."

The Biosample element time_of_collection is "Age of the proband at the time the sample was taken. RECOMMENDED."

So we need to add the two values: age of the subject at the index date + days from the index date to collection = age of the subject at the time the sample was taken.

I'm not sure how to get the age of the subject (line 58 of cda_biosample_factory.py)
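The arithmetic above can be sketched as follows. This is only an illustration, not the project's code: the function names `days_to_iso8601` and `age_at_collection` are hypothetical, the subject's age at the index date is assumed to be available in days, and the conversion assumes 365-day years and 30-day months, so the result is approximate.

```python
def days_to_iso8601(days: int) -> str:
    """Convert a non-negative day count to an approximate ISO 8601 duration.

    Assumption (not from the project): 365-day years and 30-day months.
    Months are capped at 11, so any leftover days stay in the day component.
    """
    if days < 0:
        raise ValueError(f"day count must be non-negative, got {days}")
    years, remainder = divmod(days, 365)
    months = min(remainder // 30, 11)
    remaining_days = remainder - 30 * months

    parts = []
    if years:
        parts.append(f"{years}Y")
    if months:
        parts.append(f"{months}M")
    if remaining_days:
        parts.append(f"{remaining_days}D")
    # An empty duration is written as zero days.
    return "P" + "".join(parts) if parts else "P0D"


def age_at_collection(age_at_index_days: int, days_to_collection: int) -> str:
    """Age at index date plus days from index date to sample collection.

    days_to_collection may be negative if the sample predates the index date;
    only the sum has to be non-negative.
    """
    return days_to_iso8601(age_at_index_days + days_to_collection)


print(age_at_collection(9125, 92))  # P25Y3M2D
```

The open question from this comment remains: the sketch takes the age at the index date as a parameter and does not show where that value would come from.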

@msierk
Collaborator

msierk commented Jan 22, 2024

@pnrobinson @ielis I have a question about the Phenopacket age element. The documentation says it should be age: iso8601duration: "P25Y3M2D", but in op_individual the parameter is described as ":param iso8601duration: age represented as an ISO 8601 Period". Is the element 'Age' or 'iso8601duration'?

https://phenopacket-schema.readthedocs.io/en/latest/age.html

@ielis
Member

ielis commented Jan 24, 2024

Hi @msierk

OpIndividual is a convenience wrapper for constructing instances of the Individual component of the Phenopacket Schema. The purpose of the wrapper is to simplify creation of the Protobuf object. The Python bindings generated by the Protobuf "compiler" are sometimes hard to work with, and OpIndividual simplifies this.

However, with the benefit of hindsight, I think OpIndividual does not help too much. So, I think I'll remove it from the code base.

To create a well-formatted Age, whether for time_at_last_encounter in Individual, time_of_collection in Biosample, or elsewhere, we must first have a well-formatted ISO 8601 duration string (such as P25Y3M2D in your example), and then assign it to the time_element.age.iso8601duration field. In other words, Age is the message and iso8601duration is the field inside it.
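To make the nesting concrete, here is a minimal sketch that builds the JSON form of a Biosample with plain dictionaries instead of the Protobuf bindings. The helper name `biosample_with_collection_age` and the sample id are hypothetical, and the regular expression is only a loose check for date-only durations (years, months, days), not a full ISO 8601 validator.

```python
import json
import re

# Loose check for date-only durations such as P25Y3M2D (an assumption for this
# sketch; week and time components are deliberately not handled).
ISO8601_DATE_DURATION = re.compile(r"P(?=\d)(\d+Y)?(\d+M)?(\d+D)?")


def biosample_with_collection_age(biosample_id: str, iso8601duration: str) -> dict:
    """Return a Biosample in Phenopacket JSON form with timeOfCollection set.

    In JSON, the time_of_collection field is written as timeOfCollection.
    The duration string sits in the iso8601duration field of the age element.
    """
    if not ISO8601_DATE_DURATION.fullmatch(iso8601duration):
        raise ValueError(f"not a date-only ISO 8601 duration: {iso8601duration!r}")
    return {
        "id": biosample_id,
        "timeOfCollection": {"age": {"iso8601duration": iso8601duration}},
    }


biosample = biosample_with_collection_age("sample-1", "P25Y3M2D")
print(json.dumps(biosample, indent=2))
```

With the Protobuf bindings the same three levels apply (time element, then age, then iso8601duration); only the construction syntax differs.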

You may need to fight the Protobuf bindings to get it done, though. 😕

In the long term, I would like to create a Python library to simplify this. We have such a library in Java (Phenopacket tools, in particular the builder package), but Python needs one too.

Finally, I agree that we should compute the age at collection per your suggestion, and it is not yet clear to me how to make the subject's age available in the function. We'll have to work something out. Let's keep this issue open until we can resolve it.

@ielis ielis mentioned this issue Feb 2, 2024
@ielis ielis linked a pull request Feb 2, 2024 that will close this issue
@ielis ielis removed a link to a pull request Feb 2, 2024