Update digest serialization rules in docs #410
Comments
@ahwagner sorry to need reminding (again), but why is the ordering important to the genotypeMembers array?
@larrybabb the To date, we have only had identifiable objects in arrays, and for those we would compute the digests of the array objects and then sort the array lexicographically. In this new case,
@ahwagner got it. thank you for re-explaining that. It's just that an if the Thoughts?
Do I understand correctly that the proposal is to add a non-standard "ordered" attribute to the members property of the message?
I think so. My questions above are for @ahwagner and offer some other options possibly. Let's see how he responds.
@reece and @larrybabb there are two concerns here. Tagging @andreasprlic because this is an important technical implementation discussion he should also weigh in on.
Concern 1. we need to sort some JSON arrays and not sort others for digest serialization
JSON Schema does not differentiate between arrays and sets, but VRS does. We represent all sets in VRS (e.g. Later, however, we added As an aside, one potential decision that would sidestep this issue (for now) is to make
Concern 2. define a mechanism that allows us to uniformly indicate sort behavior across classes
During the GKS-Pilot work we opted to use the There are many approaches we may take to address this concern, including some previously suggested ones:
Schema-based approaches (not in message)
Message-based approaches (defined in schema and explicit in message)
Documentation approaches (implementation concern only; not in message or schema)
I ask that we keep the discussion on Concern 1: array sorting proposal in this thread and discuss Concern 2: indicating sort behavior (which is dependent on resolving the first concern) in a separate issue (#411).
@ahwagner I get what you are saying. Thank you (again) for taking the time and effort to lay out the details above. It seems like the 2 issues are
The CSE case should probably be set up as a linked list or nesting construct to make sure that the chain/hierarchy is semantically included in the data. Maybe all ordered lists should use a designed construct to preserve those semantics? For any data that has no ordering we would only need to solve the issue of digesting un-identifiable components in lists. For this we should assume that any items in a Value object array would meet the requirement of being a value object too, and we should simply digest these elements and sort them by their value object digest, even though we never persist or preserve these non-identifiable elements' ids. Again, I apologize for re-surfacing these issues. I think the idea of having an
Separating out these distinct concerns between this thread and #411. Addressing here:
I think this approach is very similar to the proposal laid out above. The array content needs to be digest serialized, then digested. I had proposed we sort on the digest serialized outputs to save on the extra compute of creating digests (since these are not persisted anyways). Is there a reason we want to take the extra step to create digests for these objects, e.g. consistency with the approach for identifiable objects? I'm okay with that, but just want to acknowledge the extra compute expense associated with this decision.
On 11/7 leads call, @larrybabb @andreasprlic and @ahwagner agreed that sorting on digests (even for non-identifiable objects) is the more consistent approach. Proposed Resolution: digest ALL JSON Objects in arrays for sorting during digest serialization, UNLESS the array order is meaningful (indicated as described in resolution of #411).
Implemented in #409
@ahwagner I did not update the docs with this change. Did you want me to do this in a separate PR?
Yes, good catch. Reopening this issue until the documentation is updated.
This issue was marked stale due to inactivity.
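For readers following the resolution, here is a minimal Python sketch of digest-based array sorting. It is an illustration under stated assumptions, not the code merged in #409: the helper names (`serialize`, `sort_members`) and the `ordered` keyword are hypothetical, the serialization is approximated as compact JSON with sorted keys, and the digest is the GA4GH sha512t24u truncated digest.

```python
import base64
import hashlib
import json


def sha512t24u(blob: bytes) -> str:
    """GA4GH truncated digest: first 24 bytes of SHA-512, base64url-encoded."""
    digest = hashlib.sha512(blob).digest()
    return base64.urlsafe_b64encode(digest[:24]).decode("ascii")


def serialize(obj) -> bytes:
    """Assumed stand-in for digest serialization: compact JSON, sorted keys, UTF-8."""
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def sort_members(members, ordered=False):
    """Return array members in the order used for digest serialization.

    If the array order is meaningful (ordered=True), keep it unchanged.
    Otherwise digest every member, identifiable or not, and sort on the digest.
    The digests of non-identifiable members are used only as sort keys.
    """
    if ordered:
        return list(members)
    return sorted(members, key=lambda member: sha512t24u(serialize(member)))
```

With this sketch, `sort_members([a, b])` and `sort_members([b, a])` return the same list, so the enclosing object's serialization does not depend on input order. Sorting on the serialized bytes instead of their digests would skip one hashing pass per member, which is the compute trade-off raised earlier in the thread.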
The digest serialization docs do not explicitly say how to handle arrays of objects that can't be serialized (i.e., Genotype.members) but still have the property ordered=False. Below is a proposed example from @ahwagner where we serialize each GenotypeMember and sort Genotype.members based on the serialized strings.
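The original example is not reproduced in this thread. As a rough stand-in, the following Python sketch shows the idea of serializing each member and sorting on the serialized strings. Everything in it is an assumption for illustration: the `serialize` and `sort_by_serialization` helpers are hypothetical, the member fields and placeholder variation identifiers are made up, and compact sorted-key JSON is used as an approximation of digest serialization.

```python
import json


def serialize(obj) -> str:
    """Assumed stand-in for digest serialization: compact JSON with sorted keys."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))


def sort_by_serialization(members):
    """Sort unordered array members on their serialized strings."""
    return sorted(members, key=serialize)


# Illustrative members only; the identifiers are placeholders, not real digests.
members = [
    {
        "type": "GenotypeMember",
        "variation": "ga4gh:VA.placeholder_b",
        "count": {"type": "Number", "value": 1},
    },
    {
        "type": "GenotypeMember",
        "variation": "ga4gh:VA.placeholder_a",
        "count": {"type": "Number", "value": 1},
    },
]

sorted_members = sort_by_serialization(members)
```

Because the sort key is derived from content alone, reversing the input list yields the same `sorted_members`, which is what makes the enclosing object's digest stable.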