Write analysis data to file per collection #18

robobenklein · 2021-05-13T18:33:07Z

The output can then be imported into the db separately after the run, so the run itself does not need a direct db connection.

robobenklein · 2021-05-28T14:58:04Z

Example of a compatible file format:

"_key","name","city","state","country","lat","long","vip"
"00M","Thigpen ","Bay Springs","MS","USA",31.95376472,-89.23450472,false
"00R","Livingston Municipal","Livingston","TX","USA",30.68586111,-95.01792778,false
"00V","Meadow Lake","Colorado Springs","CO","USA",38.94574889,-104.5698933,false
"01G","Perry-Warsaw","Perry","NY","USA",42.74134667,-78.05208056,false
"01J","Hilliard Airpark","Hilliard","FL","USA",30.6880125,-81.90594389,false
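
A minimal sketch of a writer that produces lines in the same shape as the sample above: strings quoted, numbers bare, booleans as lowercase `true`/`false`. The names `format_field`, `format_row`, and `write_collection` are hypothetical and not part of wsyntree. The quote-doubling escape and the empty field for `None` are assumptions that the sample does not confirm.

```python
def format_field(value):
    """Render one value the way the sample rows do."""
    if value is None:
        return ""  # assumption: a missing value becomes an empty field
    if isinstance(value, bool):
        # Checked before numbers because bool is a subclass of int.
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    # Strings are always quoted; embedded quotes are doubled (assumed rule).
    return '"' + str(value).replace('"', '""') + '"'


def format_row(values):
    """Join formatted fields into a single comma-separated line."""
    return ",".join(format_field(v) for v in values)


def write_collection(path, header, rows):
    """Write one collection file: a header line, then one line per node."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write(format_row(header) + "\n")
        for row in rows:
            f.write(format_row(row) + "\n")
```

Strings containing raw newlines would break the one-node-per-line property, so a real writer would need an escaping rule for them as well.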

robobenklein · 2021-05-31T23:47:46Z

We will need to dedup at some point in this process: during writing, during import, or in a step in between, which could be something like external sorting.

An external sorting preprocess step before db import could allow dedup between multiple job results as well, maybe:

wsyntree-collector file-merge [paths to job output dirs] -o [path to combined output dir]
wsyntree-collector file-import [paths to job or merged output dirs]
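
A rough sketch of what the merge step could do internally, assuming each job's output file for a collection has already been sorted line by line (for example by an external sort) and has had its header line removed. `merge_dedup` and `merge_files` are hypothetical names, not existing wsyntree functions. Only exact duplicate lines are dropped; two nodes with the same `_key` but different content would both survive.

```python
import heapq
from contextlib import ExitStack


def merge_dedup(sorted_inputs):
    """Merge already-sorted iterables of lines, dropping exact duplicates."""
    previous = None
    # heapq.merge streams the inputs, so memory use stays small.
    for line in heapq.merge(*sorted_inputs):
        if line != previous:
            yield line
        previous = line


def merge_files(in_paths, out_path):
    """Merge sorted text files into one sorted, deduplicated file."""
    with ExitStack() as stack:
        inputs = [stack.enter_context(open(p, encoding="utf-8")) for p in in_paths]
        out = stack.enter_context(open(out_path, "w", encoding="utf-8"))
        out.writelines(merge_dedup(inputs))
```

Because duplicates end up adjacent after sorting, a single pass comparing each line with the previous one is enough. Every line, including the last in each file, needs a trailing newline for the comparison to work.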

The flatfile storage format would likely be a single folder containing one text file per collection, with one node (entry) per line, in CSV or an escaped style.

Could also get fancy and write intermediary storage using the treetops "hash composes path" style where division amongst files occurs by the _key/_id property.
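
One way the "hash composes path" idea could look, as a sketch only: hash the `_key` and use leading chunks of the hex digest as directory and file names, so each node lands in a predictable shard. The function `shard_path`, the choice of SHA-256, and the `depth`/`width` parameters are all assumptions for illustration. The actual treetops layout may differ.

```python
import hashlib
from pathlib import PurePosixPath


def shard_path(key, depth=2, width=2):
    """Map a node key to a relative shard file path derived from its hash."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    # Split the front of the digest into `depth` chunks of `width` hex chars.
    parts = [digest[i * width:(i + 1) * width] for i in range(depth)]
    # All chunks but the last become directories; the last names the file.
    return PurePosixPath(*parts[:-1]) / (parts[-1] + ".csv")
```

With the defaults this gives paths of the form `xx/yy.csv`, so identical keys from different jobs always land in the same file and dedup can be done shard by shard.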

robobenklein self-assigned this May 14, 2021

robobenklein added the performance and enhancement labels May 14, 2021
