WIP: Unified annotation versioning #7917

Draft
wants to merge 184 commits into
base: master

fm3
Member

@fm3 fm3 commented Jul 8, 2024

URL of deployed dev instance (used for testing):

  • https://___.webknossos.xyz

Steps to test:

  • abc

TODOs:

  • Mechanism to Revert Editable Mappings
    • Iterators for SegmentToAgglomerate, AgglomerateToGraph
    • How to encode Reverted chunks, agglomerates? → Single Zero-Byte?
      • Check a single zero byte is not a valid proto message
    • Iterate over current version, fetch old version, rewrite
    • Integrate this in Updater or flush Updater before this happens
    • Test
    • Cleanup: Generic Reversion-Aware Iterator? Build on top of VersionedFossilDbIterator?
    • Save perf by skipping fetching content when only version is needed in iterator (different from volume, because here we don’t need to update the segment index)
      • that might need new fossildb api (ListKeysWithVersions)
    • Make the iterator async?
  • Annotation proto object
  • Design Annotation-wide update actions
    • add layer
      • becomes update action rather than route
      • after applying updates, a summary is sent to wk if postgres-cached properties change
        • could this lead to annotationId lookups before postgres knows about the new layer?
      • assert no duplicate names?
      • assert no more than one skeleton?
      • test add layer as very first update action
    • delete layer
    • update layer metadata
    • update annotation metadata (name+description)
    • iron out reversion folds + layer deletions in merge + duplicate?
  • Test sandbox annotation
  • Adapt Task creation (save annotationProto object)
  • Duplicate
    • Use in duplicate route (“copy to my account”)
    • Use in task creation from base annotation
    • Use in task assignment
    • What to do with task resetToBase? implement as revert action?
    • actionTracingIds need to be remapped in duplicate (or use layer names after all?)
    • duplicate history?
      • duplicate update actions (needed for merging editable mappings)
      • duplicate v0 in addition to current version?
      • Also in fromTask case? How to mark earliest accessible version? We don’t want users to revert too far, right?
      • Should we also copy intermediate materialized versions? Or just 0 and current?
      • What about intermediate bucket versions? They are not in the updates
      • perf: duplicate api for fossilDB?
  • Unified versioning over layers
    • Route
    • Store Updates
    • Create Annotation
    • Updates need Layer Identifier
    • Apply Updates
      • updates that mutate annotation object
      • updates that mutate tracing objects
      • set individual targetVersions for updater/buffers? or is this already done?
      • updates that mutate other stuff
        • volume buckets
        • proofreading
      • Special updates
        • AddSegmentIndex → functionality removed. only sets bool now.
        • ImportVolumeData
        • makeMappingEditable
        • Merge into current? → does not exist, only ImportVolumeData
        • Downsample? (Maybe remove feature)
        • RevertToVersion
          • Revert volume buckets
          • Revert skeletons
          • Revert annotation-level properties
          • Revert proofreading fields
          • Handle gone layers (either in previous or target version)
          • What if two of those come in the same update batch?
          • split updates batches by RevertToVersion? What are the intermediate versions?
    • Replace or fix MergedFromIds (used for compound, mergeTwo, always creates new annotation)
      • merge skeletons
      • merge volume data
      • merge history?
      • merge editable mappings
      • use persist bool in all cases or assert non-supported don’t happen
      • test with compound
    • Replace or fix MergedFromContents (only used during upload, so everything has v0)
    • report applied updates to postgres only when requesting newest
    • Version assertions
    • lazy apply for volumes and editable mappings?
      • can volume data still be written directly? can there be conflicts?
      • or ditch lazy apply completely? (might be ok with distributed skeletons etc)
      • When to materialize which layer?
        • store materialized layer only sometimes? count its update actions?
    • Ensure no parallel update applying on the same object (async cache?)
      • But parallel update applying should also not be a problem, except for perf (was it only a problem because version wasn’t specified when loading volume buckets during revert?)
      • Perf: Reduce cache mem overhead by removing older versions when newer are requested (could be done by nested LRU cache, one with small capacity per annotationId)
      • Perf: When newer is requested, get old one from cache, then apply updates?
  • Search for // TODO
  • Tests
  • CI
  • Frontend
    • Linearized update actions
    • Use new update action route
    • update actions now need actionTracingId
    • importVolumeData seems to send outdated version, does not reload on success?
    • Some update actions must now distinguish between skeleton and volume layers, so they are split and each needs its own updateName:
      • updateUserBoundingBoxes → updateUserBoundingBoxesInSkeletonTracing, updateUserBoundingBoxesInVolumeTracing
      • ~~updateUserBoundingBoxVisibility → updateUserBoundingBoxVisibilityInSkeletonTracing, updateUserBoundingBoxVisibilityInVolumeTracing~~ is used nowhere in the frontend → remove in backend?
      • updateTracing → updateVolumeTracing, updateSkeletonTracing
    • Some routes are now update actions (add layer, delete layer). makeHybrid was removed.
      • They need to first send an update action to the store and then e.g. reload
    • Version Restore View (see the linked comment on this PR, #7917)
      • For volume actions show which volume layer was edited (if the layer was deleted, show "unknown layer" / layer id)
      • Seems to break if layer set has changed (e.g. if a layer was added since the old version)
    • enforce revert actions to come in separate update group (has its own version number)
    • same for add layer / delete layer / initialize editable mapping
    • why is newestVersion sometimes called with an empty-string annotationId?
    • Load the annotation proto object to get correct layer set, name, description for requested version (postgres only knows latest for the dashboard)
    • Respect earliestAccessibleVersion (show version history only down to that one)
    • remove downsample feature for volume annotation layers
  • Migration
    • Linearized Updates
      • How to interleave updates? is by timestamp alone fine?
      • How to deal with reverts? In the new code, a revert can’t be for just one layer
      • How to deal with addEditableMapping?
      • How to deal with addSegmentIndex?
      • How to detect when to add which layer? Add all layers already to v0?
    • changed update actions
      • several actions were renamed, e.g. updateTracing → updateVolumeTracing / updateSkeletonTracing
      • update actions now need actionTracingId
    • Find annotation ids
    • Can we run a first part in the background?
      • on a second run, (re)do only those annotations whose modified date in postgres is newer than the start of the first run
    • editable mapping update actions and arrays used to be stored by mappingName, now annotationId
    • while we’re migrating everything, could we also address “Fix or Remove Morton Order in Volume Data Fossil Keys” (#3546)?
      • how does it interact with ND data?
  • migration guide
    • addSegmentIndex is removed. If you want to add segment indices to existing annotations, first update to an earlier version, run the migration route there, then upgrade to this version
    • need to run fossildb migration
  • changelog
    • removed downsample button (?)
    • unified annotation versioning
  • parallelize (distribute using semaphore)
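
Rough sketch for the “Linearized Updates” migration item above, assuming that interleaving by timestamp alone is fine (still an open question) and that each layer’s own history is already in order. All type and function names here are hypothetical and only illustrate the idea; the actual migration would work on the stored update groups in FossilDB.

```typescript
// Hypothetical shapes for illustration; the real update action types differ.
type UpdateAction = { name: string; value: unknown };
type LayerUpdateGroup = { timestamp: number; actions: UpdateAction[] };
type LayerHistory = { tracingId: string; groups: LayerUpdateGroup[] };
type LinearizedGroup = {
  version: number; // annotation-wide version after migration
  timestamp: number;
  actions: (UpdateAction & { actionTracingId: string })[];
};

// Merge the per-layer histories into one annotation-wide sequence.
// Assumption: ordering by timestamp is sufficient; ties fall back to the
// order in which the groups were collected (layer order, then in-layer order).
function linearizeUpdates(layers: LayerHistory[]): LinearizedGroup[] {
  const tagged: { tracingId: string; group: LayerUpdateGroup; order: number }[] = [];
  for (const layer of layers) {
    for (const group of layer.groups) {
      tagged.push({ tracingId: layer.tracingId, group, order: tagged.length });
    }
  }
  tagged.sort((a, b) => a.group.timestamp - b.group.timestamp || a.order - b.order);
  return tagged.map((entry, index) => ({
    // Assumption: v0 is the initial state, so migrated updates start at v1.
    version: index + 1,
    timestamp: entry.group.timestamp,
    // Every action now records which layer it belongs to.
    actions: entry.group.actions.map((action) => ({
      ...action,
      actionTracingId: entry.tracingId,
    })),
  }));
}
```

This does not cover the harder cases listed above (per-layer reverts, addEditableMapping, addSegmentIndex, deciding when a layer gets added); those would need special handling before or after the merge.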

Issues:


(Please delete unneeded items, merge only when none are left open)

@fm3 fm3 self-assigned this Jul 8, 2024
@fm3 fm3 changed the title from “Unified annotation versioning” to “WIP: Unified annotation versioning” Jul 8, 2024
@fm3
Member Author

fm3 commented Oct 29, 2024

note to self, open questions:

  • remove downsample feature? → yes
  • duplicate history?
    • duplicate update actions (needed for merging editable mappings)
    • duplicate v0 in addition to current version?
    • Also in task assignment case? How to mark earliest accessible version? We don’t want users to revert too far, right?
    • Should we also copy intermediate materialized versions? Or just 0 and current?
    • What about intermediate bucket versions? They are not in the updates
    • perf: duplicate api for fossilDB?
  • resetToBase as update action?
  • during apply, what to load in memory? is it all or nothing? → load all, as usually we request everything
  • during apply, what to flush to fossil? is it all or nothing? → flush all layers that were changed (updates for the layer exist)
  • avoid duplicate update applying (async cache?)
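
For the last point, one way an “async cache” could avoid applying the same updates twice is to share the in-flight result between concurrent requests for the same annotation and version. The sketch below is illustrative only: the class name, the key format, and the loader are assumptions, and the real code would live in the Scala backend rather than in TypeScript.

```typescript
// Illustrative only; names are hypothetical and not part of the codebase.
class InFlightCache<T> {
  private pending = new Map<string, Promise<T>>();

  // Returns the pending promise for `key` if one exists, otherwise starts `load`.
  // Concurrent callers with the same key therefore share a single application run.
  getOrLoad(key: string, load: () => Promise<T>): Promise<T> {
    const existing = this.pending.get(key);
    if (existing !== undefined) return existing;
    const started = load();
    this.pending.set(key, started);
    // Drop the entry once settled (success or failure), so that later requests
    // start fresh instead of reusing a stale or failed result.
    const cleanup = () => {
      if (this.pending.get(key) === started) this.pending.delete(key);
    };
    started.then(cleanup, cleanup);
    return started;
  }
}
```

A key such as `${annotationId}/${version}` (an assumption) would deduplicate per requested version. This sketch only covers concurrent requests; keeping finished results around, e.g. in the nested LRU cache mentioned in the TODOs, would be a separate layer on top.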

3 participants