Data management need a new data quality assessment implemented to detect when data providers have removed records from an endpoint, so that we can reliably identify when entities need to be end-dated.
This is roughly the inverse of the existing unknown entity issue:
unknown entity: there are records in a new resource which can’t be mapped to existing entities
deleted entity: there are existing lookups which can’t be mapped to records in a new resource
For each provision (e.g. Buckinghamshire article-4-direction-area), one possible approach is to compare the reference values of all existing entities with the reference values in all currently active resources. Entities whose reference values don’t exist in any active resource are flagged as entities which should probably be end-dated.
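The comparison described above can be sketched as a per-provision set difference. This is only an illustrative sketch, not the notebook's code: the function name `find_deleted_entities`, the field names (`organisation`, `dataset`, `reference`, `entity`) and the sample values are assumptions made for the example, and a provision is assumed to be an organisation and dataset pair.

```python
from collections import defaultdict


def find_deleted_entities(entities, active_records):
    """Return entities whose reference is missing from all active resources.

    Both arguments are iterables of dicts. The keys used here are
    hypothetical and chosen only for illustration.
    """
    # Collect the references seen in active resources, per provision.
    active_refs = defaultdict(set)
    for record in active_records:
        provision = (record["organisation"], record["dataset"])
        active_refs[provision].add(record["reference"])

    # Flag entities whose reference is absent for their provision.
    flagged = []
    for entity in entities:
        provision = (entity["organisation"], entity["dataset"])
        if entity["reference"] not in active_refs.get(provision, set()):
            flagged.append(entity)
    return flagged


# Made-up sample data for a single provision.
entities = [
    {"entity": 1, "organisation": "buckinghamshire",
     "dataset": "article-4-direction-area", "reference": "A4D-1"},
    {"entity": 2, "organisation": "buckinghamshire",
     "dataset": "article-4-direction-area", "reference": "A4D-2"},
]
active_records = [
    {"organisation": "buckinghamshire",
     "dataset": "article-4-direction-area", "reference": "A4D-1"},
]

flagged = find_deleted_entities(entities, active_records)
print([e["entity"] for e in flagged])
```

Note that in this sketch a provision with no active resources at all would have every one of its entities flagged, so a real check may want to treat that case separately.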
Dependencies
Identifying deleted entities is one thing; actually end-dating or retiring them requires agreeing with the Data Design team how to represent this in the data model, i.e. whether we just give entities an end-date and, if so, what date we use as the end date.
Tech Approach
To be completed by dev.
Acceptance Criteria/Tests
Issues to be recorded in the issues or expectations issues tables.
Resourcing
Are there any tickets that need to be completed before this one can be? Are there any limitations as to who in the team can complete this ticket?
Overview
Demo code for one possible approach is in this notebook: https://github.com/digital-land/jupyter-analysis/blob/gs/deleted-entities/analysis/2024-10_deleted_entities/deleted_entities.ipynb