Feat/use dynamic harvest field #2762
We could also store it as isoformat (it was previously the case for last_update). However, we'll need to manipulate dates a bit more with additional created_at and last_modified fields.
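To make the trade-off concrete, here is a minimal sketch of the isoformat option, using only the standard library. The helper names `to_stored` and `from_stored` are hypothetical and not part of udata; the point is that every comparison on `created_at` or `last_modified` would first need a parse step.

```python
from datetime import datetime, timezone


def to_stored(value: datetime) -> str:
    """Serialize a datetime to an ISO 8601 string for storage (hypothetical helper)."""
    return value.isoformat()


def from_stored(value: str) -> datetime:
    """Parse a stored ISO 8601 string back into a datetime (hypothetical helper)."""
    return datetime.fromisoformat(value)


created_at = datetime(2022, 9, 1, 12, 30, tzinfo=timezone.utc)
stored = to_stored(created_at)   # string form, as it would sit in the document
restored = from_stored(stored)   # must be parsed again before any date arithmetic
```

Storing native datetimes avoids the round trip, at the cost of a different storage format than the previous last_update value.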
Fixture order did not work as I expected: it was non-deterministic.
Looks nice, big work 👏 Code is much easier to read.
Really not sure about that dct_identifier thing...
NB: empty file here https://github.com/opendatateam/udata/pull/2762/files#diff-1a11729151284ff74f2c7d1cbec38f0e8960a87758b4c755bf3f9d612722cfce, maybe remove?
The entire job now fails at initialization time if any of the datasets is missing a DCT.identifier.
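One possible way to avoid failing the whole job, sketched here on plain dictionaries rather than on the actual harvester code: collect the identifiers that are present and report the datasets that lack one. The function name and the `dct_identifier` and `uri` keys are assumptions made for illustration only.

```python
def collect_identifiers(datasets):
    """Split datasets into usable identifiers and skipped entries.

    `datasets` is assumed to be an iterable of dicts; the "dct_identifier"
    and "uri" keys are hypothetical stand-ins for the harvested values.
    """
    identifiers, skipped = [], []
    for dataset in datasets:
        identifier = dataset.get("dct_identifier")
        if identifier:
            identifiers.append(identifier)
        else:
            # Record the dataset instead of raising, so the job can continue.
            skipped.append(dataset.get("uri", "<unknown>"))
    return identifiers, skipped


found, missing = collect_identifiers([
    {"uri": "http://example.org/a", "dct_identifier": "a"},
    {"uri": "http://example.org/b"},
])
```

Whether skipped datasets should be logged, marked as failed items, or fall back to another identifier is left open here.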
Migration question
Fix datagouv/data.gouv.fr#818, alternative to #2750
Uses a separate harvest dynamic document to store harvest information. The core fields are defined in /dataset/models.py.
Any other entry can be added freely, but without validation.
Adds a migration (around 15 min locally) to move harvest metadata.
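As a rough illustration of what such a migration does, the sketch below moves prefixed extras into a separate harvest mapping on a plain dictionary. It is not the migration shipped in this PR: the `harvest:` prefix and the function name are assumptions, and the real migration operates on MongoDB documents.

```python
HARVEST_PREFIX = "harvest:"  # assumed prefix of harvest-related extras


def move_harvest_extras(dataset: dict) -> dict:
    """Return a copy of `dataset` with harvest extras moved under "harvest"."""
    extras = dataset.get("extras", {})
    harvest = dict(dataset.get("harvest") or {})
    remaining = {}
    for key, value in extras.items():
        if key.startswith(HARVEST_PREFIX):
            # Strip the prefix: "harvest:remote_id" becomes "remote_id".
            harvest[key[len(HARVEST_PREFIX):]] = value
        else:
            remaining[key] = value
    migrated = dict(dataset, extras=remaining)
    if harvest:
        migrated["harvest"] = harvest
    return migrated


before = {"extras": {"harvest:remote_id": "abc", "custom": 1}}
after = move_harvest_extras(before)
```

The input is left untouched, which makes the function easy to test before applying the same logic in bulk.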
An explicit API field definition is needed to expose core or additional fields through the API. See https://github.com/maudetes/udata/blob/67dfdda59750eb4224c07b1da4c792f41147a26f/udata/core/dataset/api_fields.py#L22 for the fields defined in udata core.
Other entries would be added by modifying this field definition, for example in udata-ods:
Harvest dates are now stored in the harvest metadata and don't override the object dates.
We should therefore iterate on how to return the correct dates on the frontend (e.g. the max of the mongo object date and the harvest metadata date?).
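The "max" idea could look like the following minimal sketch. The function name and its arguments are hypothetical; it only shows the selection rule, not how udata would wire it into the API or the frontend.

```python
from datetime import datetime
from typing import Optional


def effective_last_modified(object_date: datetime,
                            harvest_date: Optional[datetime]) -> datetime:
    """Pick the most recent of the object date and the harvest metadata date.

    Falls back to the object date when the dataset was not harvested
    (no harvest date available).
    """
    if harvest_date is None:
        return object_date
    return max(object_date, harvest_date)


local = datetime(2022, 8, 1)
remote = datetime(2022, 9, 15)
shown = effective_last_modified(local, remote)
```

Both dates need to be either naive or timezone-aware for the comparison to work.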
The exhaustive list of dataset extras that have been migrated to harvest metadata is:
(… ods- and ckan-prefixed extras)
TODO
Could some of the defined fields be replaced or merged? E.g. dct_identifier is the same as remote_id for DCAT-harvested datasets. Are these values needed? Made a first attempt at removing those: maudetes@a5a6d97.
-> We keep these for now, see Feat/use dynamic harvest field #2762 (comment)