We will need to dedup at some point in this process: during writing, during import, or in a step in between, which could be something like an external sort.
An external-sort preprocessing step before db import could also allow dedup across the results of multiple jobs, maybe:
wsyntree-collector file-merge [paths to job output dirs] -o [path to combined output dir]
wsyntree-collector file-import [paths to job or merged output dirs]
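A rough sketch of what the merge step could do internally, assuming each job has already written its nodes one per line and sorted each file by plain string order. The function name `merge_dedup` is hypothetical and is not existing wsyntree code. It streams a k-way merge over the inputs, so duplicates end up adjacent and can be dropped without loading everything into memory:

```python
import heapq
from contextlib import ExitStack


def merge_dedup(sorted_inputs, output_path):
    """Merge already-sorted line files into one sorted file, dropping duplicate lines.

    Assumes every input file is sorted with Python's default string ordering.
    Returns the number of unique lines written.
    """
    written = 0
    with ExitStack() as stack:
        files = [
            stack.enter_context(open(path, "r", encoding="utf-8"))
            for path in sorted_inputs
        ]
        out = stack.enter_context(open(output_path, "w", encoding="utf-8"))
        # Strip newlines so a final line without one still compares equal.
        streams = [(line.rstrip("\n") for line in f) for f in files]
        previous = None
        # heapq.merge reads lazily, keeping only one pending line per input.
        for line in heapq.merge(*streams):
            if line != previous:
                out.write(line + "\n")
                written += 1
                previous = line
    return written
```

If the per-job files are not sorted yet, each one would need its own sort pass first (in memory, or chunked to disk for large outputs) before this merge applies.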
The flatfile storage format would likely be a single folder containing one text file per collection, where each node is one line (one entry per line, CSV or escaped style).
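One possible shape for the "escaped style", as a sketch only: serialize each node as a single JSON line. JSON is a swapped-in choice here, not something decided above, and `write_collection` and the `.jsonl` extension are hypothetical names. `json.dumps` escapes embedded newlines, so every node stays on exactly one line, and `sort_keys=True` makes identical nodes serialize to identical lines, which keeps line-based dedup simple:

```python
import json
from pathlib import Path


def write_collection(out_dir, collection, nodes):
    """Write one text file for a collection, one JSON-encoded node per line."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{collection}.jsonl"
    with open(path, "w", encoding="utf-8") as f:
        for node in nodes:
            # Newlines inside values are escaped, so each node is one line.
            f.write(json.dumps(node, sort_keys=True) + "\n")
    return path
```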
Could also get fancy and write intermediary storage using the treetops "hash composes path" style, where records are divided among files by the _key/_id property.
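A minimal sketch of one way to read "hash composes path", assuming it means that leading characters of a hash of the key pick the directory and file a record lands in. This is a guess at the idea, not the actual treetops layout; `shard_path`, the SHA-256 choice, and the `levels`/`width` parameters are all hypothetical:

```python
import hashlib
from pathlib import Path


def shard_path(root, collection, key, levels=2, width=2):
    """Map a node's _key to a shard file path derived from a hash of the key.

    With the defaults, the first two hex characters of the digest name a
    directory and the next two name the file inside it.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return Path(root, collection, *parts[:-1], parts[-1] + ".txt")
```

Because the same key always maps to the same file, duplicates from different jobs would land in the same shard, so dedup could run shard by shard on small files.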
Output can be imported into the db separately after the run, without a direct connection.