You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
If FrozenSourmashSignature#1610 goes in, we'll have a de facto documentation of everywhere in the code base where we modify signatures and sketches. This touches on a bunch of things that @luizirber and I have discussed in oblique ways - mostly when it's causing trouble for him in the Rust layer - and I think it might be time to double down and think more holistically on how to think about the codebase as a whole.
Related topics -
it's not been entirely clear how to think about signatures, minhash objects, and databases in the context of long running processes (batch processes, client/server, Web backends, etc.).
who "owns" a signature when it was loaded from a database? and when can the rust code assume that it has ownership of the backend signature, and is allowed to change it?
which parts of the code are "allowed" to make binding changes to a signature or sketch that should be reflected in a database?
which parts of the code are allowed to make changes to databases? and how do we enforce this?
related: what do the APIs return - original or changed signatures? per revisit LinearIndex.select behavior for scaled (and num?) #1433 and many of the recent changes, I've doubled down on returning original signatures from the search APIs, so that all the flattening and downsampling is done dynamically during search. I still think this is a good idea.
Based on #1610 and #1508, I do think we've done an OK job - it turns out that SourmashSignature and MinHash objects are mostly adjusted or changed in the top-level command-line functions, which limits the footprint of things we need to pay attention to. We could make this a policy, though shrug.
We could also make a clearer distinction between immutable database types (load & use, but don't change) and mutable database types (so far, mostly in memory ones, or when building on-disk ones). I'm not quite sure how to do this best, but strengthening the disconnect between building databases and using databases seems like a good idea.
Related: #1494 and discussions therein. Also see #1577 cc @gustavo-salazar, who might have the "honor" of further exploring some of the consequences of long-running sourmash processes :).
The text was updated successfully, but these errors were encountered:
(the larger goal is to have sourmash continue to support nice CLI UX, but also be a robust and performant library that can be used for long-running processes and/or underpin other libraries and workflows)
If
FrozenSourmashSignature
#1610 goes in, we'll have a de facto documentation of everywhere in the code base where we modify signatures and sketches. This touches on a bunch of things that @luizirber and I have discussed in oblique ways - mostly when it's causing trouble for him in the Rust layer - and I think it might be time to double down and think more holistically on how to think about the codebase as a whole.Related topics -
LinearIndex.select
behavior for scaled (and num?) #1433 and many of the recent changes, I've doubled down on returning original signatures from the search APIs, so that all the flattening and downsampling is done dynamically during search. I still think this is a good idea.Based on #1610 and #1508, I do think we've done an OK job - it turns out that
SourmashSignature
andMinHash
objects are mostly adjusted or changed in the top-level command-line functions, which limits the footprint of things we need to pay attention to. We could make this a policy, though shrug.We could also make a clearer distinction between immutable database types (load & use, but don't change) and mutable database types (so far, mostly in memory ones, or when building on-disk ones). I'm not quite sure how to do this best, but strengthening the disconnect between building databases and using databases seems like a good idea.
Related: #1494 and discussions therein. Also see #1577 cc @gustavo-salazar, who might have the "honor" of further exploring some of the consequences of long-running sourmash processes :).
The text was updated successfully, but these errors were encountered: