Using the remote resource id as our own resource id can be problematic if the id is not unique on the remote portal. This should not happen on the CKAN side, but it will break in at least the following case:
1. a dataset with a given resource id is harvested, then deleted on the remote side
2. the dataset is automatically archived (but not deleted) on udata's side
3. a new dataset appears on the remote side with the same resource id
4. a new dataset and resource are created on udata's side with the same resource id as the previous resource, which still exists there
➡️ this leads to an id conflict: for example, the stable resource URL will point to the obsolete resource
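The scenario above can be reproduced with a minimal in-memory sketch. It does not use udata's models or API: `Dataset`, `Resource`, `find_by_resource_id` and the `remote.example` URLs are hypothetical stand-ins, and the lookup simply returns the first match, which is an assumption about how a stable-URL lookup by id might behave.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    id: str
    url: str


@dataclass
class Dataset:
    title: str
    archived: bool = False
    resources: list = field(default_factory=list)


def find_by_resource_id(datasets, resource_id):
    """Return the first (dataset, resource) pair whose resource id matches.

    Hypothetical stand-in for a stable-URL lookup keyed on the resource id.
    """
    for dataset in datasets:
        for resource in dataset.resources:
            if resource.id == resource_id:
                return dataset, resource
    return None


# Step 1: a remote dataset is harvested; the remote resource id "abc"
# is reused as the local resource id.
old = Dataset("old", resources=[Resource("abc", "https://remote.example/old.csv")])
datasets = [old]

# Step 2: the dataset is deleted remotely, so it is archived locally
# but its resource is still stored.
old.archived = True

# Steps 3 and 4: a new remote dataset reuses the id "abc" and is harvested.
new = Dataset("new", resources=[Resource("abc", "https://remote.example/new.csv")])
datasets.append(new)

# Two local resources now share the id; the lookup hits the archived one.
dataset, resource = find_by_resource_id(datasets, "abc")
print(dataset.title, dataset.archived, resource.url)
```

In this sketch the lookup resolves to the archived dataset's resource, which mirrors the stable-URL problem described above.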
Possible solutions:
- stop relying on the remote resource id altogether and instead use a new attribute, resource.extras.harvest:remote_id, to map the remote resource to the local one; the local resource will have an auto-generated resource id, which should be unique ➡️ this is nice, but it needs quite a few code changes and a migration
- protect the harvesting process against conflicting ids: raise an error for a given dataset if it contains an existing resource id ➡️ easier to implement, but requires a manual action (dataset deletion) to fix the situation
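Both options can be sketched on plain dictionaries to compare their behaviour. This is an illustration under stated assumptions, not the harvester's actual code: `upsert_by_remote_id`, `check_no_conflict` and `ResourceIdConflict` are hypothetical names, the dataset and resource shapes are simplified, and only the `harvest:remote_id` extras key comes from the proposal above.

```python
import uuid

REMOTE_ID_KEY = "harvest:remote_id"  # extras key proposed in the first option


class ResourceIdConflict(Exception):
    """Hypothetical error for the second option."""


def upsert_by_remote_id(dataset, remote_resource):
    """First option: match on extras, never on the local resource id."""
    for resource in dataset["resources"]:
        if resource["extras"].get(REMOTE_ID_KEY) == remote_resource["id"]:
            resource["url"] = remote_resource["url"]  # update in place
            return resource
    resource = {
        "id": str(uuid.uuid4()),  # auto-generated local id
        "url": remote_resource["url"],
        "extras": {REMOTE_ID_KEY: remote_resource["id"]},
    }
    dataset["resources"].append(resource)
    return resource


def check_no_conflict(dataset, remote_resources, all_datasets):
    """Second option: refuse ids already used by another local dataset."""
    taken = {
        resource["id"]
        for other in all_datasets
        if other is not dataset
        for resource in other["resources"]
    }
    clashes = sorted(r["id"] for r in remote_resources if r["id"] in taken)
    if clashes:
        raise ResourceIdConflict(f"resource id(s) already in use: {clashes}")


archived = {
    "title": "old",
    "resources": [{"id": "abc", "url": "https://remote.example/old.csv", "extras": {}}],
}
fresh = {"title": "new", "resources": []}
remote = {"id": "abc", "url": "https://remote.example/new.csv"}

# First option: the new local resource gets its own id, so nothing clashes.
created = upsert_by_remote_id(fresh, remote)
again = upsert_by_remote_id(fresh, remote)  # second harvest reuses it

# Second option: harvesting the dataset is rejected until the archived
# dataset is deleted manually.
conflict_raised = False
try:
    check_no_conflict(fresh, [remote], [archived, fresh])
except ResourceIdConflict as error:
    conflict_raised = True
    print(error)
```

The first function also hints at why a migration would be needed: resources harvested before the change have no `harvest:remote_id` in their extras, so they would not be matched until it is backfilled.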
We're currently using the remote resource id for our own resource id: udata-ckan/udata_ckan/harvesters.py, line 250 in a7c5f4e. The first possible solution also references udata-ckan/udata_ckan/harvesters.py, line 245 in a7c5f4e.