diff --git a/README.md b/README.md index 0fbd5c1..4f5bc5d 100644 --- a/README.md +++ b/README.md @@ -1,2 +1,31 @@ -# definitions_library -Definitions from SWA's projects suitable for use with the dataset assembly tool. +# IDI definitions library +Definitions from the Social Investment Agency (proeviously Social Wellbeing Agency) projects in a single repository for ease of discovery. These definitions are intended to be compatible with the Dataset Assembly Tool. + +## Overview +The Dataset Assembly Tool encourages research projects to structure their input data into population and measures. The population is specific to a specific study and includes who is being studied and over what time period. The measures are not specific to a study. This means they can be reused across research projects. + +When creating a new measure, researchers are encouraged to construct the best definition of the measure that they can. By creating and sharing high quality definitions, the quality of research will improve. + +## Dependencies +Because these definitions are based on data in the IDI, it is necessary to have an IDI project if you wish to run the code. Visit the Stats NZ website for more information about this. + +The definitions in this library were developed using specific versions/refreshes of the IDI. As the IDI is further developed, new refreshes will be released. When reusing these definitions, it is the researcher's responsibility to updated them if the refresh being used in the project does not match the original refresh of the definition. Trying to use different refreshes in the same project will likely result in errors and links between data sources can change between refreshes. + +In addition to updating the refresh, researchers will need to update the project schema before running these definitions. Unless you have permission to create tables and views in the chosen schema, you will receive an error when you run a definition. + +Any dependencies for an individual definition should be noted in the header at the top of the file. + +## Folder descriptions +This repository contains definitions from other projects and the definitions are stored in different domain folders. + +## Disclaimer +The definitions provided in this library were determined by the Agency to be suitable in the context of a specific project. Whether or not these definitions are suitable for other projects depends on the context of those projects. Researchers using definitions from this library will need to determine for themselves to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +## Acknowledgements +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of a number of definitions from the vaccination uptake project. + +## Citation +Social Investment Agency. Definitions library. Source code. https://github.com/nz-social-investment-agency/definitions_library + +## Getting Help +If you have any questions email info@sia.govt.nz diff --git a/SIA IDI Definitions Library Catalogue.xlsx b/SIA IDI Definitions Library Catalogue.xlsx new file mode 100644 index 0000000..9f2e999 Binary files /dev/null and b/SIA IDI Definitions Library Catalogue.xlsx differ diff --git a/border/first_arrival_in_NZ b/border/first_arrival_in_NZ new file mode 100644 index 0000000..ab0a77b --- /dev/null +++ b/border/first_arrival_in_NZ @@ -0,0 +1,118 @@ +/************************************************************************************************** +Title: First arrival in New Zealand +Author: Luke Scullion +Reviewer: Shaan Badenhorst + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Best estimate of first arrival in New Zealand across sources. + +Intended purpose: +When a person first entered NZ, how long a person has been in NZ. + +Inputs & Dependencies: +- [IDI_Clean].[cen_clean].[census_individual_2018] +- [IDI_Clean].[cen_clean].[census_individual_2013] +- [IDI_Clean].[data].[person_overseas_spell] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[vacc_Cen2018_Occupation] + +Notes: +1) The definition synthesizes across Census 2018, Census 2013, and Overseas spells data. +2) May not accurately identify beginning of residence in NZ if a person visited + New Zealand before migrating here. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = vacc_ + Project schema = DL-MAA20XX-YY + +Issues: + +History (reverse order): +2021-11-10 SB review +2021-10-31 LC +**************************************************************************************************/ + +/***************************** +create table of all possible entries +*****************************/ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_NZ_arrival] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_NZ_arrival] ( + snz_uid INT, + arrival_year INT, + arrival_month INT, +); +GO + +/* Census 2018 */ +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_NZ_arrival] (snz_uid, arrival_year, arrival_month) +SELECT snz_uid + ,CAST([cen_ind_arrv_in_nz_year_code] AS INT) AS arrival_year + ,CAST([cen_ind_arrv_in_nz_month_code] AS INT) AS arrival_month +FROM [IDI_Clean_YYYYMM].[cen_clean].[census_individual_2018] +WHERE snz_uid IS NOT NULL + +/* Census 2013 */ +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_NZ_arrival] (snz_uid, arrival_year, arrival_month) +SELECT snz_uid + ,CAST([cen_ind_arrival_in_nz_yr_code] AS INT) AS arrival_year + ,CAST([cen_ind_arrival_in_nz_mnth_code] AS INT) AS arrival_month +FROM [IDI_Clean_YYYYMM].[cen_clean].[census_individual_2013] +WHERE snz_uid IS NOT NULL + +/* Overseas spells */ +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_NZ_arrival] (snz_uid, arrival_year, arrival_month) +SELECT snz_uid + ,YEAR([pos_ceased_date]) AS arrival_year + ,MONTH([pos_ceased_date]) AS arrival_month +FROM [IDI_Clean_YYYYMM].[data].[person_overseas_spell] +WHERE [pos_first_arrival_ind] = 'y' +AND snz_uid IS NOT NULL + +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_NZ_arrival] (snz_uid); +GO + +/***************************** +filter and conclude +*****************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_arrival_in_NZ]; +GO + +WITH tidied_variables AS ( + + SELECT snz_uid + ,IIF(arrival_year >= 2025, NULL, arrival_year) AS arrival_year + ,IIF(arrival_month NOT IN (1,2,3,4,5,6,7,8,9,10,11,12), NULL, arrival_month) AS arrival_month + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_NZ_arrival] + +) +SELECT snz_uid + ,MIN(arrival_year) AS arrival_year + ,MIN(arrival_month) AS arrival_month +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_arrival_in_NZ] +FROM tidied_variables +GROUP BY snz_uid +GO + +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_arrival_in_NZ] (snz_uid); +GO + + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_NZ_arrival] +GO diff --git a/border/migration to NZ date.sql b/border/migration to NZ date.sql new file mode 100644 index 0000000..50321b8 --- /dev/null +++ b/border/migration to NZ date.sql @@ -0,0 +1,215 @@ +/************************************************************************************************** +Title: Migration date to New Zealand +Author: Simon Anastasiadis + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Date of migration to New Zealand based on compolation of different sources keeping highest qulaity source. + +Intended purpose: +1. Creating indicators of when a person migrated to New Zealand. +2. Creating indicators of how recently a person migrated to NZ (for example, within the last 10 years). +3. Calculating age when migrated to New Zealand. + +Inputs & Dependencies: +- [IDI_Clean].[cen_clean].[census_individual_2018] +- [IDI_Clean].[cen_clean].[census_individual_2013] +- [IDI_Clean].[data].[person_overseas_spell] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[defn_migration_date] + +Notes: +1) Aim is to create a starting point for examining when people migrate to New Zealand + Useful for understanding whether people are new migrants and at what age they migrated + (with implications for first-generation, second-generation, etc. migrant measures). + +2) We have used Census 2018 and 2013 as the highest quality sources. These sources are + self-report. For old dates these may vary in quality (for example, people may round to + nearest 5 years when self-reporting). + +3) For Overseas spells, we examined duration of spells overseas and compared against + self-reported values in Census 2018. + - While people are non-resident, their overseas spells are much more likely to + be greater than 180 days. + - Once peole become resident, their overseas spells are much more likely to + be less than 180 days. + Hence, we tested "migration date = end date of last overseas spells with duration + 180+ days". This definition has strong consistency with self-reported year. + +4) Note that we have no control for 'is the person resident' or 'was the person resident'. + Some people who migrate to New Zealand will have since emmigrated from NZ. + It is recommended that you first use the Estimated Residnetial Population (ERP) by Stats + to determine who is resident. And where people who are resident were born overseas, + then use this code as a starting point to determine when they arrived in NZ. + + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +Issues: +1) We do not have sufficient expertise with immigration records in IDI to use this source. + Adding this source is the obvious and essential next step. +2) Run time 7 minutes + +History (reverse order): +2022-10-12 SA v1 +**************************************************************************************************/ + +/******************************************************** +TABLES TO APPEND TO +********************************************************/ + +/* Diagnosis or treatment only indicates dysthymia */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_migration_date] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_migration_date] ( + snz_uid INT, + event_date DATE, + event_year INT, + event_month INT, + origin VARCHAR(12) +) + +/******************************************************** +Census 2018 +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_migration_date] (snz_uid, event_date, event_year, event_month, origin) +SELECT [snz_uid] + ,CASE -- required as transformation applied before filtering (even if using nested queries) + WHEN [cen_ind_arrv_in_nz_year_code] NOT IN ('7777', '9999') + AND [cen_ind_arrv_in_nz_month_code] NOT IN ('77', '99') + THEN DATEFROMPARTS([cen_ind_arrv_in_nz_year_code], [cen_ind_arrv_in_nz_month_code], 15) + END AS event_date + ,CAST([cen_ind_arrv_in_nz_year_code] AS INT) AS event_year + ,CAST([cen_ind_arrv_in_nz_month_code] AS INT) AS event_month + ,'cen18' AS origin +FROM [IDI_Clean_YYYYMM].[cen_clean].[census_individual_2018] +WHERE [cen_ind_birth_country_code] NOT IN ('1201', '9999', '0000') -- Exclude NZ and non-response +AND [cen_ind_birth_country_impt_ind] IN ('11', '21') -- Census 2018 response +AND [cen_ind_arrv_in_nz_year_code] NOT IN ('7777', '9999') -- Some year required +AND [cen_ind_arrv_in_nz_month_code] NOT IN ('77', '99') -- Some month required +GO + +/******************************************************** +Census 2013 +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_migration_date] (snz_uid, event_date, event_year, event_month, origin) +SELECT [snz_uid] + ,CASE + WHEN [cen_ind_arrival_in_nz_yr_code] NOT IN ('7777', '9999') + AND [cen_ind_arrival_in_nz_mnth_code] NOT IN ('77', '99') + THEN DATEFROMPARTS([cen_ind_arrival_in_nz_yr_code], [cen_ind_arrival_in_nz_mnth_code], 15) + END AS event_date + ,CAST([cen_ind_arrival_in_nz_yr_code] AS INT) AS event_year + ,CAST([cen_ind_arrival_in_nz_mnth_code] AS INT) AS event_month + ,'cen13' AS origin +FROM [IDI_Clean_YYYYMM].[cen_clean].[census_individual_2013] +WHERE [cen_ind_birth_country_code] NOT IN ('1201', '9999', '0000') -- Exclude NZ and non-response +AND [cen_ind_arrival_in_nz_yr_code] NOT IN ('7777', '9999') -- Some year required +AND [cen_ind_arrival_in_nz_mnth_code] NOT IN ('77', '99') -- Some month required +GO + +/******************************************************** +Overseas spells + +Find people with a first date in NZ. +Find the earliest date such that any overseas spells of length 180+ days +are all prior to this date. +********************************************************/ + +WITH +people_who_have_a_first_arrival AS ( + SELECT snz_uid + FROM [IDI_Clean_YYYYMM].[data].[person_overseas_spell] + WHERE pos_first_arrival_ind = 'y' -- people born in NZ should not have a first arrival +), +last_day_of_long_overseas_spell AS ( + SELECT [snz_uid] + ,MAX([pos_ceased_date]) AS max_ceased_date + FROM [IDI_Clean_YYYYMM].[data].[person_overseas_spell] + WHERE [pos_day_span_nbr] > 180 + AND pos_last_departure_ind <> 'y' -- not last departure from NZ + GROUP BY snz_uid +) +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_migration_date] (snz_uid, event_date, event_year, event_month, origin) +SELECT snz_uid + ,max_ceased_date AS event_date + ,YEAR(max_ceased_date) AS event_year + ,MONTH(max_ceased_date) AS event_month + ,'spells' AS origin +FROM last_day_of_long_overseas_spell AS ld +WHERE max_ceased_date < GETDATE() +AND EXISTS ( + SELECT 1 + FROM people_who_have_a_first_arrival AS pw + WHERE pw.snz_uid = ld.snz_uid +) +GO + +/******************************************************** +Keep highest ranked source +********************************************************/ + +/* index for performance */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_migration_date] (snz_uid); +GO + +/* delete before creation */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_migration_date] +GO + +WITH +ranked AS ( + SELECT * + ,CASE + WHEN origin = 'cen18' THEN 1 + WHEN origin = 'cen13' THEN 2 + WHEN origin = 'spells' THEN 3 + END AS ranked + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_migration_date] +), +source_ranked AS ( + SELECT * + ,ROW_NUMBER() OVER (PARTITION BY snz_uid ORDER BY ranked, event_date) AS source_ranked + FROM ranked +) +SELECT snz_uid + ,event_date + ,event_year + ,event_month + --,origin +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_migration_date] +FROM source_ranked +WHERE source_ranked = 1 +GO + +/******************************************************** +Tidy +********************************************************/ + +/* index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[defn_migration_date] (snz_uid); +GO +/* compress */ +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[defn_migration_date] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE) +GO +/* remove temp tables */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_migration_date] +GO diff --git a/border/overseas_spells.sql b/border/overseas_spells.sql new file mode 100644 index 0000000..e9819b0 --- /dev/null +++ b/border/overseas_spells.sql @@ -0,0 +1,74 @@ +/************************************************************************************************** +Title: Overseas spells +Author: Joel Bancolita, Marianna Pekar + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Inputs & Dependencies: +- [IDI_Clean].[data].[person_overseas_spell] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[vacc_overseas_spell] +- [IDI_UserCode].[DL-MAA20XX-YY].[vacc_currently_overseas] + +Description: +Spells overseas. + +Intended purpose: +1. Identifying when a person is overseas. +2. Counting days overseas. +3. Determining who has left country and is yet to return. + +Notes: + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = vacc_ + Project schema = DL-MAA20XX-YY + +Issues: + +History (reverse order): +2021-09-01: MP revised measures and parameters, extract measures relevant to vaccine rollout analysis +2020-08-20: JB additional revised measures +2020-08-08: JB additional revised measures +2020-07-08: JB additional revised measures +2020-06-24: JB revised initial measures +2020-06-09: JB initialise +**************************************************************************************************/ + +USE IDI_UserCode +GO + +-- Spell overseas +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[vacc_overseas]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[vacc_overseas_spell] AS +SELECT [snz_uid] + ,CAST([pos_applied_date] AS DATE) AS [start_date] + ,CAST([pos_ceased_date] AS DATE) AS [end_date] +FROM [IDI_Clean_YYYYMM].[data].[person_overseas_spell] +GO + +-- Indicator of currently overseas. +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[vacc_OSSpells]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[vacc_currently_overseas] AS +SELECT [snz_uid] + ,CAST([pos_applied_date] AS DATE) AS [start_date] + ,CAST([pos_ceased_date] AS DATE) AS [end_date] +FROM [IDI_Clean_YYYYMM].[data].[person_overseas_spell] +WHERE year([pos_ceased_date]) = 9999 +GO diff --git a/border/recent_migrant.sql b/border/recent_migrant.sql new file mode 100644 index 0000000..f1a358a --- /dev/null +++ b/border/recent_migrant.sql @@ -0,0 +1,65 @@ +/************************************************************************************************** +Title: Recent migrant to New Zealand +Author: Simon Anastasiadis +Reviewer: Freya Li + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Given a number of years within which a person is considered a recent migrant (see parameters) provides a spell during which people are considered recent migrants. + +Intended purpose: +1. Identifying when people enter New Zealand for the first time. +2. Identifying whether a person is a recent migrant. + +Inputs & Dependencies: +- [IDI_Clean].[data].[person_overseas_spell] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[d2g_recent_migrants] + +Notes: +1) Overseas spells are stored as date-time with non-trivial time. + For consistency with other date, we remove their time component. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = tmp_ + Project schema = [DL-MAA20XX-YY] + Years 'recent' must be within = 2 + +Issues: + +History (reverse order): +2020-11-20 FL QA +2020-03-02 SA v1 +**************************************************************************************************/ + +/* Establish database for writing views */ +USE IDI_UserCode +GO + +/* Remove view */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[d2g_recent_migrants]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[d2g_recent_migrants] AS +SELECT [snz_uid] + ,CAST([pos_ceased_date] AS DATE) AS [start_date] + ,IIF(YEAR([pos_ceased_date]) = 9999, CAST([pos_ceased_date] AS DATE), DATEADD(YEAR, 2, CAST([pos_ceased_date] AS DATE))) AS [end_date] + ,[pos_first_arrival_ind] + ,[pos_last_departure_ind] + ,[pos_source_code] +FROM [IDI_Clean_YYYYMM].[data].[person_overseas_spell] +WHERE pos_first_arrival_ind = 'y'; +GO diff --git a/census_survey/census2018_dwelling.sql b/census_survey/census2018_dwelling.sql new file mode 100644 index 0000000..1e1912b --- /dev/null +++ b/census_survey/census2018_dwelling.sql @@ -0,0 +1,121 @@ +/************************************************************************************************** +Title: Census 2018 dwelling conditions +Author: Penny Mok +Reviewer: Manjusha Radhakrishnan + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Describe the dwelling condition of household. + +Intended purpose: +A proxy for dwelling condition using census data. + +Inputs & Dependencies: +- [IDI_Clean].[cen_clean].[census_dwelling_2018] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[defn_cen18_dwelling] + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +Issues: + +History (reverse order): +2022-07-14 VW Add in household composition +2022-07-04 VW Format, use a user code table to store for use with data assembly tool, drop unsued code, + re-work amenities definition to classify all missing >= 1/7 amenities as 0. +2022-06-23 PM Definition creation for MSD seniors project +**************************************************************************************************/ +/* Set database for writing views */ +USE IDI_UserCode; +GO + +/* Clear existing view */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[defn_cen18_dwelling]; +GO + +/* Create view */ +CREATE VIEW [DL-MAA20XX-YY].[defn_cen18_dwelling] AS ( +SELECT snz_uid + ,a.[snz_cen_dwell_uid] + ,[snz_idi_address_register_uid] + ,heat= case + when [cen_dwl_heating_app_code] like '%00%' then 0 --no heating + when [cen_dwl_heating_app_code] like '%77%' or [cen_dwl_heating_app_code] like '%88%' or [cen_dwl_heating_app_code] like '%99%' then NULL + else 1 end + ,cen_dwl_heating_app_code + ,occupy= case + when [cen_dwl_dwell_stus_code]=11 then 1 --occupied + when [cen_dwl_dwell_stus_code] between 21 and 31 then 0 --non occupant + else NULL end + ,private= case + when [cen_dwl_record_type_code]=1 then 1 --private + when [cen_dwl_record_type_code] =2 then 0 --non private + else NULL end + ,own = case + when [cen_dwl_tenure_code] between 10 and 12 then 1 --own (inc. with mortgage) + when [cen_dwl_tenure_code] between 20 and 22 or [cen_dwl_tenure_code] between 30 and 32 then 0 --not own (inc. family trust) + else NULL end + ,mortgage_rent = case + when [cen_dwl_tenure_code] IN (11, 31) then 'mortgage' + when [cen_dwl_tenure_code] IN (10, 12, 20, 22, 30, 32) then 'no mortgage/rent' -- own/trust and mortgage payments not made/not further defined + when [cen_dwl_tenure_code] = 21 then 'rent' + else NULL end -- use NULL for R handling later + ,amenity= case + when [cen_dwl_amenities_code] = '01;02;03;04;05;06;07' then 1 -- have all basic amenity e.g. shower, cooking, toilet,... + when [cen_dwl_amenities_code] IN ('77','99','NULL') then NULL + else 0 end + ,miss_amen= case + when [cen_dwl_amenity_cnt_code] between 0 and 6 then 1 -- missing at least 1 + when [cen_dwl_amenity_cnt_code] IN (77, 99) then NULL + else 0 end + ,damp= case + when [cen_dwl_damp_code] between 1 and 2 then 1 --yes + when [cen_dwl_damp_code]=3 then 0 --no + else NULL end + ,mould= case + when [cen_dwl_mould_code] between 1 and 2 then 1--yes + when [cen_dwl_mould_code]=3 then 0 + else NULL end + ,motor= case + when [cen_dwl_motor_vehicle_cnt_code]=00 then 0 --no + when [cen_dwl_motor_vehicle_cnt_code] between 01 and 05 then 1 --yes + else NULL end + ,hhld_composition = case + when [cen_hhd_composn_code] IN (111,121,122) then 'couple' + when [cen_hhd_composn_code] IN (131,141,142) then 'couple with children' + when [cen_hhd_composn_code] IN (151,161,162) then 'single parent' + when [cen_hhd_composn_code] IN (511) then 'single' + when [cen_hhd_composn_code] = 611 then NULL + else 'other' end + , crowding= case + when [cen_hhd_can_crowding_code] IN (1,2) then 1 --overcrowd + when [cen_hhd_can_crowding_code] between 3 and 5 then 0 --no + else NULL end + +FROM [IDI_Clean_YYYYMM].[cen_clean].[census_dwelling_2018] a +INNER JOIN [IDI_Clean_YYYYMM].[cen_clean].[census_household_2018] b +ON a.snz_cen_dwell_uid=b.snz_cen_dwell_uid +INNER JOIN [IDI_Clean_YYYYMM].[cen_clean].[census_individual_2018] c +ON b.snz_cen_hhld_uid=c.snz_cen_hhld_uid +) +GO + + + + + diff --git a/census_survey/census2018_individual.sql b/census_survey/census2018_individual.sql new file mode 100644 index 0000000..20bc7a0 --- /dev/null +++ b/census_survey/census2018_individual.sql @@ -0,0 +1,73 @@ +/************************************************************************************************** +Title: Census 2018 individual details +Author: Penny Mok +Reviewer: Manjusha Radhakrishnan + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Establish certain characteristics of individual. + +Intended purpose: +To obtain individual characteristics from the 2018 Census: + - owner (home ownership) + - interact (unpaid activities) + - alone1 (live alone) + - cur_rltp (partnered status) + +Inputs & Dependencies: +- [IDI_Clean].[cen_clean].[census_individual_2018] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[defn_cen18v1] + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +Issues: + +History (reverse order): +2022-07-06 MR Remove resident population join +2022-07-06 VW Remove unused parts of code in preperation for QA +2022-07-04 VW Format, use a user code table to store for use with data assembly tool +2022-06-22 PM Definition creation for MSD seniors project +**************************************************************************************************/ +/* Set database for writing views */ +USE IDI_UserCode +GO + +/* Clear existing view */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[defn_cen18v1]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[defn_cen18v1] AS ( +SELECT [snz_uid] + ,owner= case when [cen_ind_home_ownsp_code] =2 then 1 + when [cen_ind_home_ownsp_code]=1 or [cen_ind_home_ownsp_code]=3 then 0 + else NULL end + ,interact=case when (cen_ind_unpaid_activities_code) like '%06%'then 2 --volunteer + when (cen_ind_unpaid_activities_code) like '%02%' or (cen_ind_unpaid_activities_code) like '%03%'or (cen_ind_unpaid_activities_code) like '%04%'or (cen_ind_unpaid_activities_code) like '%05%'then 1 --family interact + when substring(cen_ind_unpaid_activities_code,1,2)= 00 or substring(cen_ind_unpaid_activities_code,1,2)=01 then 0 --no interaction + else NULL end + ,alone1=case when (cen_ind_living_arrangmnts_code) like '%0111%' then 1 --live alone + when substring(cen_ind_living_arrangmnts_code,1,4) between 7777 and 9999 or cen_ind_living_arrangmnts_code is null then NULL + else 0 end + ,cur_rltp= case when [cen_ind_social_mrit_stus_recode] between 10 and 13 then 1 --partnered + when [cen_ind_social_mrit_stus_recode] between 20 and 25 then 0 --not partnered + else NULL end + FROM [IDI_Clean_YYYYMM].[cen_clean].[census_individual_2018] +) +GO + diff --git a/census_survey/census2018_occupation.sql b/census_survey/census2018_occupation.sql new file mode 100644 index 0000000..aace921 --- /dev/null +++ b/census_survey/census2018_occupation.sql @@ -0,0 +1,53 @@ +/************************************************************************************************** +Title: Census 2018 occupation details +Author: Shaan Badenhorst + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Inputs & Dependencies: +- [IDI_Clean].[cen_clean].[census_individual_2018] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[vacc_Cen2018_Occupation] + +Description: +Occupation as reported in Census 2018. + +Intended purpose: +Identifying occupation of individuals at Census 2018, or the broad type of work / skills beyond Census 2018. + +Notes: + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = vacc_ + Project schema = DL-MAA20XX-YY + +Issues: + +History (reverse order): +2021-11-25 SA review and tidy +**************************************************************************************************/ + +/*embedded in user code*/ +USE IDI_UserCode +GO + +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[vacc_Cen2018_Occupation]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[vacc_Cen2018_Occupation] AS +SELECT snz_uid + ,CAST(cen_ind_occupation_code AS INTEGER) AS cen_ind_occupation_code +FROM [IDI_Clean_YYYYMM].[cen_clean].[census_individual_2018]; +GO diff --git a/census_survey/gss2016_wellbeing.sql b/census_survey/gss2016_wellbeing.sql new file mode 100644 index 0000000..89fb0c2 --- /dev/null +++ b/census_survey/gss2016_wellbeing.sql @@ -0,0 +1,74 @@ +/************************************************************************************************** +Title: Wellbeing GSS 2016 +Author: Simon Anastasiadis +Reviewer: Marianna Pekar + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Summary of health and wellbeing questions from GSS 2016. + +Intended purpose: +Provide indicators of general/overall wellbeing, health and material wellbeing. + +Inputs & Dependencies: +- [IDI_Clean].[gss_clean].[gss_person] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[d2g_gss2016] + +Notes: +1) Output values for survey questions are response codes. These include non-response codes + such as Don't Know, Refused to answer, Response Unidentifiable, and Response Out of Scope. + Non-response codes have not been filtered out in this definition. +2) This definition is applicable to GSS 2016. Due to changes between GSS waves, care is advised + when combining across waves that the questions are equivalent, and that the response scales + are coded the same way. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = d2g_ + Project schema = [DL-MAA20XX-YY] + +Issues: + +History (reverse order): +2020-07-22 MP QA +2020-03-02 SA v1 +**************************************************************************************************/ + +/* Establish database for writing views */ +USE IDI_UserCode +GO + +/* Clear view */ +/* GSS 2016 variables - health and financial hardship */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[d2g_gss2016]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[d2g_gss2016] AS +SELECT [snz_uid] + ,[gss_pq_PQinterview_date] AS [event_date] + ,CAST([gss_pq_feel_life_code] AS NUMERIC) AS [life_satisfaction] + ,CAST([gss_pq_life_worthwhile_code] AS NUMERIC) AS [life_worthwhile] + ,CAST([gss_pq_ment_health_code] AS NUMERIC) AS [SF12_mental_health] + ,CAST([gss_pq_phys_health_code] AS NUMERIC) AS [SF12_physical_health] + ,CAST([gss_pq_item_300_limit_code] AS NUMERIC) AS [gss_pq_item_300_limit_code] + ,CAST([gss_pq_not_pay_bills_time_code] AS NUMERIC) AS [gss_pq_not_pay_bills_time_code] + ,CAST([gss_pq_enough_inc_code] AS NUMERIC) AS [gss_pq_enough_inc_code] + ,CAST([gss_pq_material_wellbeing_code] AS NUMERIC) AS [material_wellbeing_index] +FROM [IDI_Clean_YYYYMM].[gss_clean].[gss_person] +WHERE [gss_pq_collection_code] = 'GSS2016'; +GO + + diff --git a/census_survey/gss2018_wellbeing.sql b/census_survey/gss2018_wellbeing.sql new file mode 100644 index 0000000..a0809e9 --- /dev/null +++ b/census_survey/gss2018_wellbeing.sql @@ -0,0 +1,80 @@ +/************************************************************************************************** +Title: Wellbeing GSS 2018 +Author: Simon Anastasiadis +Re-edite: Freya Li +Reviewer: Simon Anastasiadis + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Summary of health and wellbeing questions from GSS 2018. + +Intended purpose: +Provide indicators of general/overall wellbeing, health and material wellbeing. + +Inputs & Dependencies: +- [IDI_Clean].[gss_clean].[gss_person] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[d2gP2_gss2018] + + +Notes: +1) Output values for survey questions are response codes. These include non-response codes + such as Don't Know, Refused to answer, Response Unidentifiable, and Response Out of Scope. + Non-response codes have not been filtered out in this definition. +2) This definition is applicable to GSS 2018. Due to changes between GSS waves, care is advised + when combining across waves that the questions are equivalent, and that the response scales + are coded the same way. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = d2gP2_ + Project schema = [DL-MAA20XX-YY] + +History (reverse order): +2021-02-02 FL V3 Updated to gss 2018 +2021-01-26 SA QA +2021-01-08 FL v2 (Change prefix and update the table to the latest refresh) +2020-07-22 MP QA +2020-03-02 SA v1 +**************************************************************************************************/ +/* Establish database for writing views */ +USE IDI_UserCode +GO + +/* Clear view */ +/* GSS 2018 variables - health and financial hardship */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[d2gP2_gss2018]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[d2gP2_gss2018] AS +SELECT [snz_uid] + ,[gss_pq_PQinterview_date] AS [event_date] + ,CAST([gss_pq_feel_life_code] AS NUMERIC) AS [life_satisfaction] + ,CAST([gss_pq_life_worthwhile_code] AS NUMERIC) AS [life_worthwhile] + --,CAST([gss_pq_ment_health_code] AS NUMERIC) AS [SF12_mental_health] + ,CAST([gss_pq_health_dvwho5_code] AS NUMERIC) AS [WHO5_mental_wellbeing] + --,CAST([gss_pq_phys_health_code] AS NUMERIC) AS [SF12_physical_health] + ,CAST([gss_pq_item_300_limit_code] AS NUMERIC) AS [gss_pq_item_300_limit_code] + ,CAST([gss_pq_not_pay_bills_time_code] AS NUMERIC) AS [gss_pq_not_pay_bills_time_code] + ,CAST([gss_pq_enough_inc_code] AS NUMERIC) AS [gss_pq_enough_inc_code] + ,CAST([gss_pq_material_wellbeing_code] AS NUMERIC) AS [material_wellbeing_index] + ,CAST([gss_pq_fam_wellbeing_code] AS NUMERIC) AS [family_wellbeing] +FROM [IDI_Clean_YYYYMM].[gss_clean].[gss_person] +WHERE [gss_pq_collection_code] = 'GSS2018'; +GO + + + + diff --git a/census_survey/household_census2018.sql b/census_survey/household_census2018.sql new file mode 100644 index 0000000..a14b421 --- /dev/null +++ b/census_survey/household_census2018.sql @@ -0,0 +1,71 @@ +/************************************************************************************************** +Title: Census 2018 household +Author: Freya Li +Reviewer: Simon Anastasiadis + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +List of snz_uid, dwell_uid, and child indicator for census 2018. + +Intended purpose: +Producing summary statistics for household. + +Inputs & Dependencies: +- [IDI_Clean].[cen_clean].[census_individual_2018] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[d2gP2_household] + +Notes: +1. [cen_ind_record_type_code] records the individual record type for census night. + This variable gives the information about whether the individaul is adult or child. + 3 – NZ Adult + 4 – NZ child + 5 – Overseas Adult + 6 – Overseas Child + 7 – Absentee elsewhere in NZ or away < 12 months Adult + 8 – Absentee elsewhere in NZ or away < 12 months Child + 9 – Absentee away >= 12 months + +Issues: + + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = d2gP2_ + Project schema = [DL-MAA20XX-YY] + Custom 'as-at' date = '2020-09-30' + +History (reverse order): +2021_02-02 FL v2 Notes added, replace [cn_snz_cen_dwell_uid] to [ur_snz_cen_dwell_uid] +2021-01-26 SA QA +2021-01-21 FL v1 +**************************************************************************************************/ + +/* Establish database for writing views */ +USE IDI_UserCode +GO + +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[d2gP2_household]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[d2gP2_household] AS +SELECT snz_uid + ,ur_snz_cen_dwell_uid -- Dwelling ID for census night + ,IIF(cen_ind_record_type_code = 4 OR cen_ind_record_type_code = 6 OR cen_ind_record_type_code = 8 ,1, 0) AS child_ind_census_night -- child indicator on census night + -- 4 – NZ child, 6 – Overseas Child, 8 – Absentee elsewhere in NZ or away < 12 months (Child) + ,IIF(DATEDIFF(MONTH, DATEFROMPARTS(cen_ind_birth_year_nbr, cen_ind_birth_month_nbr,1), '2020-09-30') <= 12*18, 1, 0) AS child_ind_custom_date -- child indicator at 2020-09-30 +FROM [IDI_Clean_YYYYMM].[cen_clean].[census_individual_2018] +WHERE ur_snz_cen_dwell_uid IS NOT NULL +GO diff --git a/education/any_education_spells.sql b/education/any_education_spells.sql new file mode 100644 index 0000000..ac96830 --- /dev/null +++ b/education/any_education_spells.sql @@ -0,0 +1,161 @@ +/************************************************************************************************** +Title: Spell enrolled in any education +Author: Simon Anastasiadis +Re-edit: Freya Li + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +A spell with any enrollment in education, regardless of type, level or source. + +Intended purpose: +Creating indicators of when/whether a person has studied. +Identifying spells when a person is studying. +Counting the number of days a person spends studying. +Use when all types of study (university, industry training, etc.) are treated the same. + +Inputs & Dependencies: +- [IDI_Clean].[moe_clean].[enrolment] +- [IDI_Clean].[moe_clean].[targeted_training] +- [IDI_Clean].[moe_clean].[tec_it_learner] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_enrolled_education] + + +Notes: +1) Condensing is necessary to avoid double counting where different enrollments overlap. +2) Condensing can be slow. But speed improvements arise from pre-filtering the input tables + to narrower dates of interest. +3) Writing a staging table (rather than a staging view) is faster as we can add an index. +4) [moe_clean].[enrolment] does not include cancellations/withdrawls. Hence it may overcount. + Some withdrawl dates from courses can be retrieved from [moe_clean].[course] where this + is important. Withdrawls from industry training are not available. +5) For about 15% of tertiary enrollments, the year of enrollment (moe_enr_year_nbr) does not + match the year the course starts. In these cases, the year of enrollment is 1-2 years after + the YEAR(moe_enr_prog_start_date). + We have used course start and end dates as our indication of studying. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = d2gP2_ + Project schema = [DL-MAA20XX-YY] + Earliest start date = '2006-01-01' + Latest end date = '2026-12-31' + +Issues: +1) Industry training duration of enrollment can differ widely from expected duration of course. We are yet to determine how best to reconcile this difference. At present we consider + only enrollment. +2) Secondary and primary enrollment are not included. These can be found in the [IDI_Clean].[moe_clean].[student_enrol] table. + +History (reverse order): +2021-01-26 SA QA +2021-01-08 FL v2 (Change prefix and update the table to the latest refresh) +2020-07-16 MP QA +2020-03-02 SA v1 +**************************************************************************************************/ +/* Clear staging table */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_enrolled_education_staging]; +GO + + +/* Create staging table */ +/*Enrolment in tertiary education*/ +SELECT snz_uid + ,'enrollment' AS [source] + ,[moe_enr_prog_start_date] as [start_date] + ,[moe_enr_prog_end_date] as [end_date] +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_enrolled_education_staging] +FROM [IDI_Clean_YYYYMM].[moe_clean].[enrolment] +WHERE [moe_enr_prog_start_date] <= '2026-12-31' +AND '2006-01-01' <= [moe_enr_prog_end_date] + +UNION ALL + +/*Enrolment in targeted training*/ +SELECT snz_uid + ,'targeted_training' AS [source] + ,[moe_ttr_placement_start_date] as [start_date] + ,[moe_ttr_placement_end_date] as [end_date] +FROM [IDI_Clean_YYYYMM].[moe_clean].[targeted_training] +WHERE [moe_ttr_placement_start_date] <= '2026-12-31' +AND '2006-01-01' <= [moe_ttr_placement_end_date] + +UNION ALL + +/*Enrolment in industry training*/ +SELECT [snz_uid] + ,'tec_it_learner' AS [source] + ,[moe_itl_start_date] as [start_date] + ,[moe_itl_end_date] as end_date +FROM [IDI_Clean_YYYYMM].[moe_clean].[tec_it_learner] +WHERE [moe_itl_end_date] IS NOT NULL +AND [moe_itl_start_date] <= '2026-12-31' +AND '2006-01-01' <= [moe_itl_end_date]; +GO + +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_enrolled_education_staging] (snz_uid) +GO + +/* Condensed spells */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_enrolled_education]; +GO + +/* create table with condensed spells */ +WITH +/* exclude start dates that are within another spell */ +spell_starts AS ( + SELECT [snz_uid], [start_date] + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_enrolled_education_staging] s1 + WHERE [start_date] <= [end_date] + AND NOT EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_enrolled_education_staging] s2 + WHERE s1.snz_uid = s2.snz_uid + AND DATEADD(DAY, -1, s1.[start_date]) BETWEEN s2.[start_date] AND s2.[end_date] + ) +), +/* exclude end dates that are within another spell */ +spell_ends AS ( + SELECT [snz_uid], [end_date] + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_enrolled_education_staging] t1 + WHERE [start_date] <= [end_date] + AND NOT EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_enrolled_education_staging] t2 + WHERE t2.snz_uid = t1.snz_uid + AND YEAR(t1.[end_date]) <> 9999 + AND DATEADD(DAY, 1, t1.[end_date]) BETWEEN t2.[start_date] AND t2.[end_date] + --AND IIF(YEAR(t1.[end_date]) = 9999, t1.[end_date], DATEADD(DAY, 1, t1.[end_date])) BETWEEN t2.[start_date] AND t2.[end_date] + ) +) +SELECT s.snz_uid, s.[start_date], MIN(e.[end_date]) as [end_date] +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_enrolled_education] +FROM spell_starts s +INNER JOIN spell_ends e +ON s.snz_uid = e.snz_uid +AND s.[start_date] <= e.[end_date] +GROUP BY s.snz_uid, s.[start_date] +GO + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_enrolled_education] (snz_uid); +GO +/* Compress final table to save space */ +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_enrolled_education] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE); +GO + +/* Clear staging table */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_enrolled_education_staging]; +GO + diff --git a/education/ece_attendance.sql b/education/ece_attendance.sql new file mode 100644 index 0000000..10d7067 Binary files /dev/null and b/education/ece_attendance.sql differ diff --git a/education/education_industry_workforce.sql b/education/education_industry_workforce.sql new file mode 100644 index 0000000..56be73f --- /dev/null +++ b/education/education_industry_workforce.sql @@ -0,0 +1,72 @@ +/************************************************************************************************** +Title: Education industry employees +Author: Shaan Badenhorst + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Employed by an education providing organisations. + +Intended purpose: +Proxy for identifying teachers, instructors, and teaching assistants. + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Inputs & Dependencies: +- [IDI_Clean].[ir_clean].[ird_ems] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[vacc_education_workforce] + +Notes: +1) Will also capture other employees of education organisations (e.g. administrators, janitors, security). + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = vacc_ + Project schema = DL-MAA20XX-YY + Year of employment = YYYY + +Issues: + +History (reverse order): +2021-11-20 SB +**************************************************************************************************/ + +/* Set database for writing views */ +USE IDI_UserCode +GO + +/* Clear existing view */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[vacc_education_workforce]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[vacc_education_workforce] AS +SELECT snz_uid + ,[snz_employer_ird_uid] + ,[ir_ems_gross_earnings_amt] + ,[ir_ems_return_period_date] + ,[ir_ems_enterprise_nbr] + ,[ir_ems_pbn_nbr] + ,LEFT(COALESCE([ir_ems_pbn_anzsic06_code], [ir_ems_ent_anzsic06_code]), 3) AS anzsic06_3char + ,COALESCE([ir_ems_pbn_anzsic06_code], [ir_ems_ent_anzsic06_code]) AS anzsic06 +FROM [IDI_Clean_YYYYMM].[ir_clean].[ird_ems] +WHERE [ir_ems_income_source_code] IN ('W&S', 'WHP') +AND [snz_ird_uid] <> 0 -- exclude placeholder person without IRD number +AND YEAR([ir_ems_return_period_date]) = YYYY +AND ( + LEFT(COALESCE([ir_ems_pbn_anzsic06_code], [ir_ems_ent_anzsic06_code]), 1) = 'P' + OR COALESCE([ir_ems_pbn_anzsic06_code], [ir_ems_ent_anzsic06_code]) = 'Q871000' +) +GO diff --git a/education/latest_study_enrollments.sql b/education/latest_study_enrollments.sql new file mode 100644 index 0000000..9bad010 --- /dev/null +++ b/education/latest_study_enrollments.sql @@ -0,0 +1,163 @@ +/************************************************************************************************** +Title: Recent enrolment in study +Author: Craig Wright + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Inputs & Dependencies: +- [IDI_Clean].[cen_clean].[census_individual_2018] +- [IDI_Clean].[moe_clean].[student_enrol] +- [IDI_Clean].[moe_clean].[targeted_training] +- [IDI_Clean].[moe_clean].[tec_it_learner] +- [IDI_Clean].[moe_clean].[course] +- [IDI_Clean].[moe_clean].[student_interventions] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_education_enrollments] + +Description: +Enrolment in study when enrolment date is post 1 July 2020 + +Intended purpose: +Determining who is enrolled in the most recently available data. + +Notes: +1) Enrolment type + 1 = primary or secondary schooling + 2 = targeted training + 3 = tec or IT training + 4 = tertiary + 5 = interventions ORRS Ongoing and Reviewable Resourcing Scheme + 6 = interventions HHN Children that are at risk because of serious Health problems + + +Parameters & Present values: +Current refresh = YYYYMM +Prefix = vacc_ +Project schema = DL-MAA20XX-YY +Latest enrolment = '2020-07-01' + +Issues: + +History (reverse order): +2021-10-31 CW +**************************************************************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_education_enrollments] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_education_enrollments] ( + snz_uid INT, + snz_moe_uid INT, + enroll_type INT, + provider_code INT, +) +GO + +/******************************* +1. Primary and secondary school +*******************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_education_enrollments] (snz_uid, snz_moe_uid, enroll_type, provider_code) +SELECT [snz_uid] + ,snz_moe_uid + ,1 AS enroll_type + ,MAX([moe_esi_provider_code]) AS provider_code +FROM [IDI_Clean_YYYYMM].[moe_clean].[student_enrol] +WHERE [moe_esi_end_date] IS NULL --the spell is open and there is no leave reason +AND [moe_esi_leave_rsn_code] IS NULL +GROUP BY snz_uid, [snz_moe_uid] +GO + +/******************************* +2. Targeted training +*******************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_education_enrollments] (snz_uid, snz_moe_uid, enroll_type, provider_code) +SELECT DISTINCT [snz_uid] + ,snz_moe_uid + ,2 AS enroll_type + ,[moe_ttr_moe_prov_code] AS provider_code +FROM [IDI_Clean_YYYYMM].[moe_clean].[targeted_training] +WHERE [moe_ttr_placement_end_date] IS NULL +OR [moe_ttr_placement_end_date] >= '2020-07-01' +GO + +/******************************* +3. TEC or industry training +*******************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_education_enrollments] (snz_uid, snz_moe_uid, enroll_type, provider_code) +SELECT [snz_uid] + ,snz_moe_uid + ,3 AS enroll_type + ,MAX([moe_itl_edumis_2016_code]) AS provider_code +FROM [IDI_Clean_YYYYMM].[moe_clean].[tec_it_learner] +WHERE [moe_itl_year_nbr] = 2020 +--WHERE [moe_itl_end_date] IS NULL OR [moe_itl_end_date] >= '2020-07-01' +GROUP BY snz_uid, [snz_moe_uid] +GO + +/******************************* +4. Tertiary qualification enrollment +*******************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_education_enrollments] (snz_uid, snz_moe_uid, enroll_type, provider_code) +SELECT [snz_uid] + ,[snz_moe_uid] + ,4 AS enroll_type + ,MAX([moe_crs_provider_code]) AS provider_code + --,MAX([moe_crs_is_domestic_ind]) AS domestic_student +FROM [IDI_Clean_YYYYMM].[moe_clean].[course] +WHERE [moe_crs_year_nbr] = 2020 +GROUP BY snz_uid, [snz_moe_uid] +GO + +/******************************* +5. Student interventions +*******************************/ +-- [IDI_Adhoc].[clean_read_MOE].[school_intervention_codes] +--Code Int_Short Int_Long IntType +--25 ORRS Ongoing and Reviewable Resourcing Scheme Other +--27 High Health Children that are at risk because of serious Health problems SpecialEd + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_education_enrollments] (snz_uid, snz_moe_uid, enroll_type, provider_code) +SELECT snz_uid + ,snz_moe_uid + ,5 AS enroll_type + ,NULL AS provider_code +FROM [IDI_Clean_YYYYMM].[moe_clean].[student_interventions] +WHERE [moe_inv_intrvtn_code] = 25 -- ORRS Ongoing and Reviewable Resourcing Scheme +--AND '2020-07-01' BETWEEN moe_inv_start_date AND moe_inv_end_date +GROUP BY snz_uid, snz_moe_uid +GO + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_education_enrollments] (snz_uid, snz_moe_uid, enroll_type, provider_code) +SELECT snz_uid + ,snz_moe_uid + ,6 AS enroll_type + ,NULL AS provider_code +FROM [IDI_Clean_YYYYMM].[moe_clean].[student_interventions] +WHERE [moe_inv_intrvtn_code] = 27 -- Children that are at risk because of serious Health problems +--AND '2020-07-01' BETWEEN moe_inv_start_date AND moe_inv_end_date +GROUP BY snz_uid, snz_moe_uid +GO + +/******************************* +Index and compress +*******************************/ + +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_education_enrollments] (snz_uid); +GO +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_education_enrollments] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE) +GO diff --git a/education/qualifications.sql b/education/qualifications.sql new file mode 100644 index 0000000..60a55fd --- /dev/null +++ b/education/qualifications.sql @@ -0,0 +1,266 @@ +/************************************************************************************************** +Title: Attainment of qualification +Author: Simon Anastasiadis +Re-edit: Freya Li + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Attainment of qualification (or our best approximation of). + +Intended purpose: +1. Identifying people's highest qualification at a point in time. +2. Identifying when people have been awared qualifications (requires removal of Census data). + +Inputs & Dependencies: +- [IDI_Clean].[cen_clean].[census_individual_2013] +- [IDI_Clean].[cen_clean].[census_individual_2018] +- [IDI_Clean].[moe_clean].[student_qualification] +- [IDI_Clean].[moe_clean].[completion] +- [IDI_Clean].[moe_clean].[tec_it_learner] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[defn_qualification_awards] + + +Notes: +1) Where only year is available assumed qualification awarded 1st December (approx, end of calendar year). + +2) Code guided by Population Explorer Highest Qualification code in SNZ Population Explorer by Peter Elis + github.com/StatisticsNZ/population-explorer/blob/master/build-db/01-int-tables/18-qualificiations.sql + +3) Qualification data is collected from a range of places + - Tertiary qualifications are reported to TEC as part of funding and oversight + - Secondary school qualifications are reported for school leavers + - All qualifications covered by NZQF (including secondary and tertiary) are reported to NZQA + Hence a single qualification event can appear multiple times within our data. We have aimed for + breadth of coverage and have not attempted to resolve duplicates. + +4) Qualifications reported from Census 2013 & Census 2018 have been added, as without only qualifications + earned recently within New Zealander are reported which results in an under count. As Census does not + report date of award/qualification we use December in 18th year of life as proxy for award date of + secondary school degrees, and date of Census 2013 as proxy for aware of post-secondary school degrees. + +5) School leavers qualifications are reported for the highest qualification earned prior to leaving school. + Much more detail is available than only reporting NQF levels. These have been grouped to comparitive + NCEA levels. For example: + 1-13 credits at level 3 --> highest completed qualification is level 2 --> level 2 + International Baccalaureate Year 12 --> level 2 as level 2 is typically completed in year 12 + +6) Numeric values are NZQA levels: + 1 = Certificate or NCEA level 1 + 2 = Certificate or NCEA level 2 + 3 = Certificate or NCEA level 3 + 4 = Certificate level 4 + 5 = Certificate of diploma level 5 + 6 = Certificate or diploma level 6 + 7 = Bachelors degree, graduate diploma or certificate level 7 + 8 = Bachelors honours degree or postgraduate diploma or certificate level 8 + 9 = Masters degree + 10 = Doctoral degree + International qualifications are often classified as 11. However, where we can not identify their + equivalent NZQA level, these have been excluded. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +Issues: + +History (reverse order): +2022-10-11 SA update with school leavers +2021-01-26 SA QA +2021-01-08 FL v2 (Change prefix and update the table to the latest refresh) +2020-07-22 JB QA +2020-03-02 SA v1 +**************************************************************************************************/ + +/* Establish database for writing views */ +USE IDI_UserCode +GO + +/* Clear view */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[defn_qualification_awards]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[defn_qualification_awards] AS + +-- Census 2018 secondary school qualification +SELECT [snz_uid] + ,DATEFROMPARTS(cen_ind_birth_year_nbr + 18, 12, 1) AS [event_date] + ,[cen_ind_scdry_scl_qual_code] AS [qualification_level] + ,'cen2018' AS [source] +FROM [IDI_Clean_YYYYMM].[cen_clean].[census_individual_2018] +WHERE [cen_ind_scdry_scl_qual_code] IN ('01', '02', '03') + +UNION ALL + +-- Census 2018 highest qualification +SELECT [snz_uid] + ,'2018-03-08' AS [event_date] + ,[cen_ind_standard_hst_qual_code] AS [qualification_level] + ,'cen2018' AS [source] +FROM [IDI_Clean_YYYYMM].[cen_clean].[census_individual_2018] +WHERE [cen_ind_standard_hst_qual_code] IN ('01', '02', '03', '04', '05', '06', '07', '08', '09', '10') +AND [cen_ind_standard_hst_qual_code] <> [cen_ind_scdry_scl_qual_code] + +UNION ALL + +-- Census 2013 secondary school qualification +SELECT [snz_uid] + ,DATEFROMPARTS([cen_ind_birth_year_nbr] + 18, 12, 1) AS [event_date] + ,cen_ind_sndry_scl_qual_code AS [qualification_level] + ,'cen2013' AS [source] +FROM [IDI_Clean_YYYYMM].[cen_clean].[census_individual_2013] +WHERE cen_ind_sndry_scl_qual_code IN ('01', '02', '03') + +UNION ALL + +-- Census 2013 highest qualification +SELECT [snz_uid] + ,'2013-03-05' AS [event_date] + ,cen_ind_std_highest_qual_code AS [qualification_level] + ,'cen2013' AS [source] +FROM [IDI_Clean_YYYYMM].[cen_clean].[census_individual_2013] +WHERE cen_ind_std_highest_qual_code IN ('01', '02', '03', '04', '05', '06', '07', '08', '09', '10') +AND cen_ind_std_highest_qual_code <> cen_ind_sndry_scl_qual_code + +UNION ALL + +-- NZQA (Covers Secondary and Tertiary) +SELECT snz_uid + ,DATEFROMPARTS(moe_sql_attained_year_nbr,12,1) AS [event_date] + ,moe_sql_nqf_level_code AS [qualification_level] + ,'nzqa' AS [source] +FROM [IDI_Clean_YYYYMM].[moe_clean].[student_qualification] +WHERE moe_sql_nqf_level_code IS NOT NULL +AND moe_sql_nqf_level_code IN (1,2,3,4,5,6,7,8,9,10) -- limit to 10 levels of NZQF + +UNION ALL + +-- Secondary via school leavers +SELECT [snz_uid] + ,DATEFROMPARTS([moe_sl_leaver_year], 12, 1) AS [event_date] + --,[moe_sl_leaver_year] + --,[moe_sl_leaving_yr_lvl] + --,[moe_sl_leaving_reason_code] + --,[moe_sl_highest_attain_code] + ,CASE + WHEN [moe_sl_highest_attain_code] IN (13, 14, 15, 16, 17, 20, 55, 60, 70, 80, 90) THEN 1 + WHEN [moe_sl_highest_attain_code] IN ( 4, 24, 25, 26, 27, 30, 56, 61, 71, 81, 91) THEN 2 + WHEN [moe_sl_highest_attain_code] IN (33, 34, 35, 36, 37, 40, 43, 62, 72, 82, 92) THEN 3 + ELSE NULL END AS [qualification_level] + ,'leavers' AS [source] +FROM [IDI_Clean_YYYYMM].[moe_clean].[student_leavers] +WHERE [moe_sl_eligibility_code] = 'DOMESTIC' +AND [moe_sl_leaving_yr_lvl] BETWEEN 12 AND 16 +AND ( + [moe_sl_highest_attain_code] IN (13, 14, 15, 16, 17, 20, 55, 60, 70, 80, 90) + OR [moe_sl_highest_attain_code] IN ( 4, 24, 25, 26, 27, 30, 56, 61, 71, 81, 91) + OR [moe_sl_highest_attain_code] IN (33, 34, 35, 36, 37, 40, 43, 62, 72, 82, 92) +) + +UNION ALL + +-- Tertiary qualification +SELECT snz_uid + ,DATEFROMPARTS(moe_com_year_nbr,12,1) AS [event_date] + ,moe_com_qual_level_code AS [qualification_level] + ,'tertiary' AS [source] +FROM [IDI_Clean_YYYYMM].[moe_clean].[completion] +WHERE moe_com_qual_level_code IS NOT NULL +AND moe_com_qual_level_code IN (1,2,3,4,5,6,7,8,9,10) -- limit to 10 levels of NZQF + +UNION ALL + +-- Industry training qualifications +SELECT snz_uid + ,moe_itl_end_date AS [event_date] + ,1 AS [qualification_level] + ,'industry' AS [source] +FROM [IDI_Clean_YYYYMM].moe_clean.tec_it_learner +WHERE moe_itl_end_date IS NOT NULL +AND moe_itl_level1_qual_awarded_nbr > 0 + +UNION ALL + +SELECT snz_uid + ,moe_itl_end_date AS [event_date] + ,2 AS [qualification_level] + ,'industry' AS [source] +FROM [IDI_Clean_YYYYMM].moe_clean.tec_it_learner +WHERE moe_itl_end_date IS NOT NULL +AND moe_itl_level2_qual_awarded_nbr > 0 + +UNION ALL + +SELECT snz_uid + ,moe_itl_end_date AS [event_date] + ,3 AS [qualification_level] + ,'industry' AS [source] +FROM [IDI_Clean_YYYYMM].moe_clean.tec_it_learner +WHERE moe_itl_end_date IS NOT NULL +AND moe_itl_level3_qual_awarded_nbr > 0 + +UNION ALL + +SELECT snz_uid + ,moe_itl_end_date AS [event_date] + ,4 AS [qualification_level] + ,'industry' AS [source] +FROM [IDI_Clean_YYYYMM].moe_clean.tec_it_learner +WHERE moe_itl_end_date IS NOT NULL +AND moe_itl_level4_qual_awarded_nbr > 0 + +UNION ALL + +SELECT snz_uid + ,moe_itl_end_date AS [event_date] + ,5 AS [qualification_level] + ,'industry' AS [source] +FROM [IDI_Clean_YYYYMM].moe_clean.tec_it_learner +WHERE moe_itl_end_date IS NOT NULL +AND moe_itl_level5_qual_awarded_nbr > 0 + +UNION ALL + +SELECT snz_uid + ,moe_itl_end_date AS [event_date] + ,6 AS [qualification_level] + ,'industry' AS [source] +FROM [IDI_Clean_YYYYMM].moe_clean.tec_it_learner +WHERE moe_itl_end_date IS NOT NULL +AND moe_itl_level6_qual_awarded_nbr > 0 + +UNION ALL + +SELECT snz_uid + ,moe_itl_end_date AS [event_date] + ,7 AS [qualification_level] + ,'industry' AS [source] +FROM [IDI_Clean_YYYYMM].moe_clean.tec_it_learner +WHERE moe_itl_end_date IS NOT NULL +AND moe_itl_level7_qual_awarded_nbr > 0 + +UNION ALL + +SELECT snz_uid + ,moe_itl_end_date AS [event_date] + ,8 AS [qualification_level] + ,'industry' AS [source] +FROM [IDI_Clean_YYYYMM].moe_clean.tec_it_learner +WHERE moe_itl_end_date IS NOT NULL +AND moe_itl_level8_qual_awarded_nbr > 0; +GO + diff --git a/education/school_enrollment.sql b/education/school_enrollment.sql new file mode 100644 index 0000000..5461f63 --- /dev/null +++ b/education/school_enrollment.sql @@ -0,0 +1,186 @@ +/************************************************************************************************** +Title: MOE school events +Author: C Wright, C Maccormick and V Benny + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Periods of enrolment in education. + +Intended purpose: +Identify the periods of enrolment in education. + +Inputs & Dependencies: +- [IDI_Clean].[moe_clean].[student_enrol] +- [IDI_Clean].[data].[personal_detail] +- [IDI_Clean].[moe_clean].[enrolment] +- [IDI_Clean].[moe_clean].[targeted_training] +- [IDI_Clean].[moe_clean].[tec_it_learner] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_moe_tidy_dates] +- [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_education_period]; + + +Notes: + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = vacc_ + Project schema = DL-MAA20XX-YY + Date of interest = '2020-12-31' + +Issues: + +History (reverse order): +2021-11-25 SA review and tidy +2021-09-03: MP limited to certain dates +2020-06-10: JB limited to population scope only +**************************************************************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_moe_tidy_dates]; +GO + +WITH +initial_setup AS ( + + SELECT enr.snz_uid + ,enr.snz_moe_uid + ,enr.moe_esi_entry_year_lvl_nbr + ,enr.moe_esi_start_date + ,enr.moe_esi_end_date + ,LEAD(enr.moe_esi_start_date) OVER (PARTITION BY enr.snz_uid ORDER BY enr.moe_esi_start_date) AS next_start_date + ,enr.moe_esi_extrtn_date + ,enr.moe_esi_provider_code /* school number */ + ,DATEFROMPARTS(per.snz_birth_year_nbr, per.snz_birth_month_nbr, 15) AS date_of_birth + FROM [IDI_Clean_YYYYMM].[moe_clean].[student_enrol] enr + LEFT JOIN [IDI_Clean_YYYYMM].[data].[personal_detail] per + ON enr.snz_uid = per.snz_uid + +), +improve_end_date AS ( + + SELECT snz_uid + ,snz_moe_uid + ,moe_esi_entry_year_lvl_nbr + ,moe_esi_start_date + ,moe_esi_end_date + ,next_start_date + /* Improved end date: + 1. the day before next start if next start is within current spell + 2. the end date if it exists + 3. 18th birthday if within the extraction date + 4. the extraction date otherwise*/ + ,CASE + WHEN next_start_date IS NOT NULL + AND next_start_date < moe_esi_end_date + THEN DATEADD(DAY, - 1, next_start_date) + WHEN moe_esi_end_date IS NOT NULL + THEN moe_esi_end_date + WHEN moe_esi_end_date IS NULL + AND DATEADD(YEAR, 19, date_of_birth) < moe_esi_extrtn_date + THEN DATEADD(YEAR, 18, date_of_birth) + ELSE moe_esi_extrtn_date + END AS improved_end_date + ,moe_esi_extrtn_date + ,moe_esi_provider_code /* school number */ + ,date_of_birth + FROM initial_setup + +) +SELECT DISTINCT snz_uid + ,snz_moe_uid + ,moe_esi_entry_year_lvl_nbr + ,moe_esi_start_date + ,improved_end_date + ,moe_esi_extrtn_date + ,moe_esi_provider_code /* school number */ +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_moe_tidy_dates] +FROM improve_end_date +/* To avoid unusual spells, require that years enrolled <= 14 */ +WHERE DATEDIFF(DAY, moe_esi_start_date, improved_end_date) / 365.25 <= 14 +/* and enrollment age is between 4 and 24 */ +AND DATEDIFF(DAY, date_of_birth, moe_esi_start_date) / 365.25 BETWEEN 4 AND 24 +/* filter to specific date */ +AND '2020-12-31' BETWEEN moe_esi_start_date AND improved_end_date +OR moe_esi_start_date >= '2020-12-31' +GO + +--put index +CREATE INDEX individx ON [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_moe_tidy_dates] (snz_uid); +GO +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_moe_tidy_dates] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE) +GO + +/**************************************** +enrollments +****************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_education_period]; +GO + +/* +Enrolment in secondary education. Less straightforward than the rest due to null values. +NB: records 2007- only. +*/ +SELECT * +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_education_period] +FROM ( + +SELECT snz_uid + ,[moe_esi_start_date] AS [start_date] + ,[improved_end_date] AS [end_date] + --,[moe_esi_entry_year_lvl_nbr] + ,IIF([moe_esi_entry_year_lvl_nbr] BETWEEN 0 AND 13 ,'Year ' + cast([moe_esi_entry_year_lvl_nbr] AS VARCHAR) ,'Unclassified') AS [description] +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_moe_tidy_dates] + +UNION ALL + +/*Enrolment in upper secondary or tertiary education*/ +SELECT snz_uid + ,[moe_enr_prog_start_date] AS [start_date] + ,[moe_enr_prog_end_date] AS [end_date] + ,CASE + WHEN [moe_enr_isced_level_code] LIKE '3_' THEN 'Upper secondary' + WHEN [moe_enr_isced_level_code] LIKE '4_' THEN 'Post-secondary, non-tertiary' + WHEN [moe_enr_isced_level_code] LIKE '5_' THEN '1st stage tertiary' + WHEN [moe_enr_isced_level_code] = '6' THEN '2nd stage tertiary' + ELSE 'Enrollment = Unknown' END AS [description] +FROM [IDI_Clean_YYYYMM].[moe_clean].[enrolment] + +UNION ALL + +/*Enrolment in targeted training*/ +SELECT snz_uid + ,[moe_ttr_placement_start_date] AS [start_date] + ,[moe_ttr_placement_end_date] AS [end_date] + ,'targeted training enroll' AS [description] +FROM [IDI_Clean_YYYYMM].[moe_clean].[targeted_training] + +UNION ALL + +/*Enrolment in industry training*/ +SELECT [snz_uid] + ,[moe_itl_start_date] AS [start_date] + ,[moe_itl_end_date] AS end_date + ,CONCAT ('indust train = ',SUBSTRING([moe_itl_nzsced_narrow_text], 1, 20)) AS [description] +FROM [IDI_Clean_YYYYMM].[moe_clean].[tec_it_learner] +WHERE [moe_itl_end_date] IS NOT NULL + +); +GO + +CREATE NONCLUSTERED INDEX my_index ON [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_education_period] (snz_uid) +GO +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_education_period] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE) +GO diff --git a/education/truancy.sql b/education/truancy.sql new file mode 100644 index 0000000..a372aa8 Binary files /dev/null and b/education/truancy.sql differ diff --git a/employment/employment_spells.sql b/employment/employment_spells.sql new file mode 100644 index 0000000..e963c37 --- /dev/null +++ b/employment/employment_spells.sql @@ -0,0 +1,164 @@ +/************************************************************************************************** +Title: Spell of employment +Author: Simon Anastasiadis +Re-edit: Freya Li + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +A spell where wages or salaries are reported to IRD as evidence of employment. + +Intended purpose: +Creating indicators of when/whether a person was employed. +Identifying spells when a person is employed. +Counting the number of days a person spends employed. + +Inputs & Dependencies: +- [IDI_Clean].[ir_clean].[ird_ems] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_employed_spell] + +Notes: +1) Employer Monthly Summaries (EMS) records provide an indication that a person was employed + during a specific month. Where start or end dates are provided, these are used so long + as they are close to the month of the EMS record. For all other records (the vast majority) + the first and last day of the month is used. +2) Employment includes receipt of Wages & Salaries (WAS or W&S), and receipt of Schedular Payments + from which Withholding Payments are deducted (WHP). People receiving Wages & Salaries are + employees. People receiving Schedular Payments may be contractors. Neither of these definitions + perfectly capture self employment (sole trader, partnership, or company) some of which can + only be observed at a (tax) year resolution. + However, comparing EMS to tax year summaries suggests that W&S and WHP captures at least + some self-employment. And that this is more likely to be observed as WHP rather than W&S. +3) The EMS record includes start and end dates for employment. Where these are populated they + may provide a more accurate indication of start and end date than the beginning and end + of the return month (note the 'return period date' is the end of the month). + - If the start date is provided and is within two months of the return date, then we use + it as the best indication of start date. While employers must submit EMS monthly, an + employee who joins once month and receivies their first pay the next month, will only + appear in the EMS record for the second month. + - If the end date is provided and is within the month for which the return relates, + then we use it as the best indication of end date. + - Inspection of the data suggests that when end dates are outside (after) the month of + the return this is employers reporting future end dates of contracts. The employee + continues to have EMS records for each month of their employment. As future end dates can + change (be renegotiated) and their use would result in double counting of the same period + for the same job (different from the double counting where people work multiple jobs), + we only use provided end date if it is within the month of the EMS return. +4) Condensing is necessary to avoid double counting where people work multiple jobs. + It also merges adjacent spells (e.g. 1-7 Jan and 8-14 Jan becomes 1-14 Jan). + Condensing can be slow. But speed improvements arise from pre-filtering the input tables + to narrower dates of interest. +5) A placeholder identity exists where the encrypted IRD number [snz_ird_uid] is equal to + zero. Documentation states these are people who can not be linked. We exclude this identity. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = d2gP2_ + Project schema = [DL-MAA20XX-YY] + Earliest start date = '2014-01-01' + Latest end date = '2020-12-31' + +Issues: +- Does not capture all self-employment. Use of annual tax year data is recommended for analyses at + annual resolution, or where it is important to capture self-employment. +- Slow. For years 2014-2020 runtime more than 2 hours + +History (reverse order): +2021-01-26 SA QA +2021-01-11 FL v2 (Change prefix and update the table to the latest refresh) +2020-07-22 JB QA +2020-07-01 SA expanded definition of employment to include WHP as this covers some self-employment +2020-03-02 SA v1 +**************************************************************************************************/ + +/* Set database for writing views */ +USE IDI_UserCode +GO + +/* Clear existing view */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[d2gP2_employed_spell_staging]; +GO + +/* Create staging */ +CREATE VIEW [DL-MAA20XX-YY].[d2gP2_employed_spell_staging] AS +SELECT snz_uid + ,CASE WHEN [ir_ems_employee_start_date] IS NOT NULL + AND [ir_ems_employee_start_date] < [ir_ems_return_period_date] + AND DATEDIFF(DAY, [ir_ems_employee_start_date], [ir_ems_return_period_date]) < 60 -- employee started in the last two months + THEN [ir_ems_employee_start_date] + ELSE DATEFROMPARTS(YEAR([ir_ems_return_period_date]), MONTH([ir_ems_return_period_date]),1) END AS [start_date] + ,CASE WHEN [ir_ems_employee_end_date] IS NOT NULL + AND [ir_ems_employee_end_date] < [ir_ems_return_period_date] + AND ([ir_ems_employee_start_date] IS NULL OR [ir_ems_employee_start_date] < [ir_ems_employee_end_date]) + AND DATEDIFF(DAY, [ir_ems_employee_end_date], [ir_ems_return_period_date]) < 27 -- employee finished in the last month + THEN [ir_ems_employee_end_date] + ELSE [ir_ems_return_period_date] END AS [end_date] +FROM [IDI_Clean_YYYYMM].[ir_clean].[ird_ems] +WHERE [ir_ems_income_source_code] IN ('W&S', 'WHP') +AND [snz_ird_uid] <> 0 -- exclude placeholder person without IRD number +AND [ir_ems_return_period_date] BETWEEN '2014-01-01' AND '2020-12-31'; +GO + +/* Condensed spells */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_employed_spell]; +GO + +WITH +/* exclude start dates that are within another spell */ +spell_starts AS ( + SELECT [snz_uid], [start_date] + FROM [IDI_UserCode].[DL-MAA20XX-YY].[d2gP2_employed_spell_staging] s1 + WHERE [start_date] <= [end_date] + AND NOT EXISTS ( + SELECT 1 + FROM [IDI_UserCode].[DL-MAA20XX-YY].[d2gP2_employed_spell_staging] s2 + WHERE s1.snz_uid = s2.snz_uid + AND DATEADD(DAY, -1, s1.[start_date]) BETWEEN s2.[start_date] AND s2.[end_date] + ) +), +/* exclude end dates that are within another spell */ +spell_ends AS ( + SELECT [snz_uid], [end_date] + FROM [IDI_UserCode].[DL-MAA20XX-YY].[d2gP2_employed_spell_staging] t1 + WHERE [start_date] <= [end_date] + AND NOT EXISTS ( + SELECT 1 + FROM [IDI_UserCode].[DL-MAA20XX-YY].[d2gP2_employed_spell_staging] t2 + WHERE t2.snz_uid = t1.snz_uid + AND YEAR(t1.[end_date]) <> 9999 + AND DATEADD(DAY, 1, t1.[end_date]) BETWEEN t2.[start_date] AND t2.[end_date] + --AND IIF(YEAR(t1.[end_date]) = 9999, t1.[end_date], DATEADD(DAY, 1, t1.[end_date])) BETWEEN t2.[start_date] AND t2.[end_date] + ) +) +SELECT s.snz_uid, s.[start_date], MIN(e.[end_date]) as [end_date] +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_employed_spell] +FROM spell_starts s +INNER JOIN spell_ends e +ON s.snz_uid = e.snz_uid +AND s.[start_date] <= e.[end_date] +GROUP BY s.snz_uid, s.[start_date] +GO + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_employed_spell] (snz_uid); +GO +/* Compress final table to save space */ +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_employed_spell] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE); +GO + +/* Clear staging view */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[d2gP2_employed_spell_staging]; +GO + diff --git a/employment/end_of_employment.sql b/employment/end_of_employment.sql new file mode 100644 index 0000000..493756a --- /dev/null +++ b/employment/end_of_employment.sql @@ -0,0 +1,119 @@ +/************************************************************************************************** +Title: End of employment +Author: Simon Anastasiadis +Re-edit: Freya Li + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +An event were a peron's employment ends followed by a gap in employment. +The event is the end date of employment where: +- they are never subsequently employed/paid by the same employer + (this controls for periodic / occassional work) +- there is no income from wages & salaries in the next month + (suggesting a person finished employment without a new role lined up) + +Intended purpose: +As a proxy for job loss. + +Inputs & Dependencies: +- [IDI_Clean].[ir_clean].[ird_ems] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_end_of_employment] + + +Notes: +1) Employer Monthly Summaries (EMS) records provide an indication that a person was employed + during a specific month. These are not used - only the last day of the month is used. +2) Self employment does not appear in this definition. +3) We exclude the last date of every person's employment/EMS history as this is likely to + reflect an event we are not intersted in (e.g. end of dataset, never works again). +4) No control for non-job loss explanations for stopping work, such as retirement, + parental/caring responsibilities, travel overseas. +5) A placeholder identity exists where the encrypted IRD number [snz_ird_uid] is equal to + zero. Checking across refreshes suggests this is people without an IRD number. We exclude + this identity. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = d2gP2_ + Project schema = [DL-MAA20XX-YY] + +Issues: + +History (reverse order): +2020-01-26 SA QA +2020_01-11 FL v2 (Change prefix, update the table to the latest refresh) +2020-07-22 JB QA +2020-03-05 SA v1 +**************************************************************************************************/ + + +/* Clear existing table */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_end_of_employment]; +GO + +WITH possible_ends AS ( + +SELECT a.[snz_uid] + ,MAX(a.[ir_ems_return_period_date]) AS [event_date] + ,a.[snz_employer_ird_uid] +FROM [IDI_Clean_YYYYMM].[ir_clean].[ird_ems] a +WHERE [ir_ems_income_source_code] = 'W&S' +AND a.[snz_ird_uid] <> 0 -- exclude placeholder person without IRD number +GROUP BY a.[snz_uid], a.[snz_employer_ird_uid] + +), +final_ends AS ( + +SELECT snz_uid + ,MAX([ir_ems_return_period_date]) AS [final_date] +FROM [IDI_Clean_YYYYMM].[ir_clean].[ird_ems] +WHERE [ir_ems_income_source_code] = 'W&S' +GROUP BY [snz_uid] + +), +staging AS ( + +SELECT a.[snz_uid] + ,a.[event_date] + ,a.[snz_employer_ird_uid] + ,b.[final_date] + ,c.[snz_uid] AS has_income_next_month +FROM possible_ends a +INNER JOIN final_ends b +ON a.snz_uid = b.snz_uid +LEFT JOIN [IDI_Clean_YYYYMM].[ir_clean].[ird_ems] c +ON a.snz_uid = c.snz_uid +AND DATEDIFF(MONTH, a.[event_date], c.[ir_ems_return_period_date]) = 1 +AND c.[ir_ems_income_source_code] = 'W&S' + +) +SELECT DISTINCT [snz_uid] + ,[event_date] + ,[snz_employer_ird_uid] +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_end_of_employment] +FROM staging +WHERE [event_date] <> [final_date] +AND has_income_next_month IS NULL +GO + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_end_of_employment] (snz_uid); +GO +/* Compress final table to save space */ +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_end_of_employment] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE); +GO + + diff --git a/employment/industry_of_employment.sql b/employment/industry_of_employment.sql new file mode 100644 index 0000000..f2332d0 --- /dev/null +++ b/employment/industry_of_employment.sql @@ -0,0 +1,84 @@ +/************************************************************************************************** +Title: Industry of employment +Author: Simon Anastasiadis + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Industry of employer. Requires people to be employeers (have wages or salaries > 0) + +Intended purpose: +Determining the mixture of businesses in an area. + +Inputs & Dependencies: +- [IDI_Clean].[ir_clean].[ird_ems] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[CEN_ANZSIC06] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[defn_industry] + +Notes: +1) Industry as reported in monthly summary to IRD. Coded according to level 2 + of ANZSIC 2006 codes (87 different values). +2) There are two sources from which industry type can be drawn: + PBN = Permanent Business Bumber + ENT = The Entity + We prioritise the PBN over the ENT. +3) Note that this is not perfect identification of role/responsibilities due to lack of + distinction between business industry and personal industry. For example the + manager of a retirement home is likely to have ANZSIC code for personal care, not + for management. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +Issues: + +History (reverse order): +2020-05-20 SA v1 +**************************************************************************************************/ + +/* Set database for writing views */ +USE IDI_UserCode +GO + +/* Clear existing view */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[defn_industry]; +GO + +/* Create view */ +CREATE VIEW [DL-MAA20XX-YY].[defn_industry] AS +SELECT [snz_uid] + ,[ir_ems_return_period_date] + ,[ir_ems_pbn_anzsic06_code] + ,k.[ir_ems_ent_anzsic06_code] + ,k.anzsic06 + ,[descriptor_text] +FROM ( + +SELECT [snz_uid] + ,[ir_ems_return_period_date] + ,[ir_ems_pbn_anzsic06_code] + ,[ir_ems_ent_anzsic06_code] + ,LEFT(COALESCE([ir_ems_pbn_anzsic06_code], [ir_ems_ent_anzsic06_code]), 1) AS anzsic06 +FROM [IDI_Clean_YYYYMM].[ir_clean].[ird_ems] +WHERE [ir_ems_gross_earnings_amt] > 0 +AND [ir_ems_income_source_code] = 'W&S' + +) k +INNER JOIN [IDI_Metadata].[clean_read_CLASSIFICATIONS].[CEN_ANZSIC06] b +ON k.anzsic06 = b.[cat_code] +WHERE anzsic06 IS NOT NULL; +GO diff --git a/family_whanau/dependent_children.sql b/family_whanau/dependent_children.sql new file mode 100644 index 0000000..7949932 --- /dev/null +++ b/family_whanau/dependent_children.sql @@ -0,0 +1,97 @@ +/************************************************************************************************** +Title: Has dependent children +Author: Simon Anastasiadis +Re-edit: Freya Li + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +The time period during which a person has living children under the age of 18. +So age 0-17 is dependent. + +Intended purpose: +Counting the number of dependent children that a person has at any point in time. + +Inputs & Dependencies: +- [IDI_Clean].[data].[personal_detail] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[d2gP2_dependent_children] + +Notes: +1) There is no control for whether a person lives with (their) children or is + involved in their care. Only considers parents by birth so legal guardians + can not be identified this way. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = d2gP2_ + Project schema = [DL-MAA20XX-YY] + age no longer dependent = 18 + +History (reverse order): +2021-06-09 SA refactor & replace age with year of birth +2021-06-08 FL add the age of child +2021-01-26 SA QA +2020-01-07 FL v2 (Change prefix and update the table to the latest refresh) +2020-07-22 JB QA +2020-07-21 MP QA +2020-03-04 SA v1 + **************************************************************************************************/ + +USE [IDI_UserCode] +GO + +/* Clear before creation */ +IF OBJECT_ID('[DL-MAA20XX-YY].[d2gP2_dependent_children]','V') IS NOT NULL +DROP VIEW [DL-MAA20XX-YY].[d2gP2_dependent_children]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[d2gP2_dependent_children] AS +WITH end_dependency AS ( + + SELECT [snz_parent1_uid] + ,[snz_parent2_uid] + ,[snz_birth_date_proxy] + /* if child dies before 18th birthday, use date of death, otherwise use 18th birthday */ + ,IIF( EOMONTH(DATEFROMPARTS([snz_birth_year_nbr] + 18, [snz_birth_month_nbr], 15)) -- [18th_birthday] + > EOMONTH(DATEFROMPARTS([snz_deceased_year_nbr], [snz_deceased_month_nbr], 28)), -- [death_day] + EOMONTH(DATEFROMPARTS([snz_deceased_year_nbr], [snz_deceased_month_nbr], 28)), + EOMONTH(DATEFROMPARTS([snz_birth_year_nbr] + 18, [snz_birth_month_nbr], 15)) ) AS [end_dependency_date] + FROM [IDI_Clean_YYYYMM].[data].[personal_detail] + WHERE [snz_uid] IS NOT NULL + AND [snz_birth_year_nbr] IS NOT NULL + AND [snz_birth_month_nbr] IS NOT NULL + and [snz_birth_year_nbr] <>9999 +) +SELECT * +FROM ( + SELECT [snz_parent1_uid] AS [snz_uid] + ,[snz_birth_date_proxy] AS [start_date] + ,[end_dependency_date] AS [end_date] + ,YEAR([snz_birth_date_proxy]) AS child_year_of_birth + FROM end_dependency + WHERE [snz_parent1_uid] IS NOT NULL + + UNION ALL + + SELECT [snz_parent2_uid] AS [snz_uid] + ,[snz_birth_date_proxy] AS [start_date] + ,[end_dependency_date] AS [end_date] + ,YEAR([snz_birth_date_proxy]) AS child_year_of_birth + FROM end_dependency + WHERE [snz_parent2_uid] IS NOT NULL + AND [snz_parent1_uid] <> [snz_parent2_uid] -- parents are different +) k +WHERE [start_date] <= [end_date] +GO diff --git a/family_whanau/family_death.sql b/family_whanau/family_death.sql new file mode 100644 index 0000000..c2b53cb --- /dev/null +++ b/family_whanau/family_death.sql @@ -0,0 +1,303 @@ +/************************************************************************************************** +Title: Death of an immediate family member +Author: Simon Anastasiadis +Reviewer: Marianna Pekar + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +The date and relationship to [snz_uid] of the death of an immediate family member. +Included relationships: children, parents, siblings (full or half), spouse +(by marriage or civil-union), co-parent (partner with whom you had a child). + +Intended purpose: +Identifying dates/periods where people are dealing with a death in their close family. + +Inputs & Dependencies: +- [IDI_Clean].[data].[personal_detail] +- [IDI_Clean].[dia_clean].[marriages] +- [IDI_Clean].[dia_clean].[civil_unions] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[d2g_family_death] + +Notes: +1) There are many types of relationships that can not be identified using the IDI. + This view is only those relationships we can observe with the available data. +2) Siblings are identified by people having at least one parent in common via DIA + birth records. These may not go back far enough to identify siblings for older + members of the population. +3) The marriage records contain a small number of records with the same partner snz_uid's + (though other values of the record, such as date may differ). This includes couples who + get married more than once on different dates. We have not filtered these out as they are + a trivial proportion. +4) The relationship types are non-exclusive. So the same death can result in multiple + records of different types (for example a single death can result in a person loosing + both their spouse and co-partner if they were married and had children together). +5) The death of a child includes stillborn babies. +6) For simplicity we have NOT added a check that the person is alive when their family + member dies. For example both spouses will have a record for each other's death, + even if one dies several years before the other. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = d2g_ + Project schema = [DL-MAA20XX-YY] + +Issues: + +History (reverse order): +2020-07-21 MP QA +2020-03-04 SA v1 +**************************************************************************************************/ + +USE [IDI_UserCode] +GO + +/* Clear before creation */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[d2g_family_death]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[d2g_family_death] AS + +/*********** Children ***********/ + +SELECT [snz_parent1_uid] AS [snz_uid] + ,DATEFROMPARTS([snz_deceased_year_nbr], [snz_deceased_month_nbr], 28) AS [event_date] + ,'child' AS [relationship] + ,[snz_uid] AS [deceased_snz_uid] +FROM [IDI_Clean_YYYYMM].[data].[personal_detail] +WHERE [snz_person_ind] = 1 +AND [snz_deceased_year_nbr] IS NOT NULL +AND [snz_deceased_month_nbr] IS NOT NULL +AND [snz_parent1_uid] IS NOT NULL + +UNION ALL + +SELECT [snz_parent2_uid] AS [snz_uid] + ,DATEFROMPARTS([snz_deceased_year_nbr], [snz_deceased_month_nbr], 28) AS [event_date] + ,'child' AS [relationship] + ,[snz_uid] AS [deceased_snz_uid] +FROM [IDI_Clean_YYYYMM].[data].[personal_detail] +WHERE [snz_person_ind] = 1 +AND [snz_deceased_year_nbr] IS NOT NULL +AND [snz_deceased_month_nbr] IS NOT NULL +AND [snz_parent2_uid] IS NOT NULL -- parent 2 exists +AND [snz_parent1_uid] <> [snz_parent2_uid] -- and is different from parent 1 +AND [snz_parent2_uid] IS NOT NULL + +UNION ALL + +/*********** Parents ***********/ + +SELECT p.[snz_uid] + ,DATEFROMPARTS(d.[snz_deceased_year_nbr], d.[snz_deceased_month_nbr], 28) AS [event_date] + ,'parent' AS [relationship] + ,d.[snz_uid] AS [deceased_snz_uid] +FROM [IDI_Clean_YYYYMM].[data].[personal_detail] p -- p for person who attends funeral +INNER JOIN [IDI_Clean_YYYYMM].[data].[personal_detail] d -- d for deceased person +ON d.[snz_uid] = p.[snz_parent1_uid] +WHERE d.[snz_person_ind] = 1 +AND d.[snz_deceased_year_nbr] IS NOT NULL +AND d.[snz_deceased_month_nbr] IS NOT NULL + +UNION ALL + +SELECT p.[snz_uid] + ,DATEFROMPARTS(d.[snz_deceased_year_nbr], d.[snz_deceased_month_nbr], 28) AS [event_date] + ,'parent' AS [relationship] + ,d.[snz_uid] AS [deceased_snz_uid] +FROM [IDI_Clean_YYYYMM].[data].[personal_detail] p -- p for person who attends funeral +INNER JOIN [IDI_Clean_YYYYMM].[data].[personal_detail] d -- d for deceased person +ON d.[snz_uid] = p.[snz_parent2_uid] +WHERE d.[snz_person_ind] = 1 +AND d.[snz_deceased_year_nbr] IS NOT NULL +AND d.[snz_deceased_month_nbr] IS NOT NULL +AND p.[snz_parent2_uid] IS NOT NULL -- parent 2 exists +AND p.[snz_parent1_uid] <> p.[snz_parent2_uid] -- and is different from parent 1 + +UNION ALL + +/*********** Siblings (half or full) ***********/ + +SELECT p.[snz_uid] + ,DATEFROMPARTS(d.[snz_deceased_year_nbr], d.[snz_deceased_month_nbr], 28) AS [event_date] + ,'sibling' AS [relationship] + ,d.[snz_uid] AS [deceased_snz_uid] +FROM [IDI_Clean_YYYYMM].[data].[personal_detail] p -- p for person who attends funeral +INNER JOIN [IDI_Clean_YYYYMM].[data].[personal_detail] d -- d for deceased person +ON d.[snz_parent1_uid] = p.[snz_parent1_uid] -- both share same parent_1 +AND d.[snz_uid] <> p.[snz_uid] +WHERE d.[snz_person_ind] = 1 +AND d.[snz_deceased_year_nbr] IS NOT NULL +AND d.[snz_deceased_month_nbr] IS NOT NULL +AND p.[snz_parent1_uid] IS NOT NULL -- parent 1 exists +AND d.[snz_parent1_uid] IS NOT NULL -- parent 1 exists + +UNION ALL + +SELECT p.[snz_uid] + ,DATEFROMPARTS(d.[snz_deceased_year_nbr], d.[snz_deceased_month_nbr], 28) AS [event_date] + ,'sibling' AS [relationship] + ,d.[snz_uid] AS [deceased_snz_uid] +FROM [IDI_Clean_YYYYMM].[data].[personal_detail] p -- p for person who attends funeral +INNER JOIN [IDI_Clean_YYYYMM].[data].[personal_detail] d -- d for deceased person +ON d.[snz_parent1_uid] = p.[snz_parent2_uid] -- parent_1 = parent_2 +AND d.[snz_uid] <> p.[snz_uid] +WHERE d.[snz_person_ind] = 1 +AND d.[snz_deceased_year_nbr] IS NOT NULL +AND d.[snz_deceased_month_nbr] IS NOT NULL +AND d.[snz_parent1_uid] IS NOT NULL -- parent 1 exists +AND p.[snz_parent2_uid] IS NOT NULL -- parent 2 exists +AND p.[snz_parent1_uid] <> p.[snz_parent2_uid] -- and is different from parent 1 + +UNION ALL + +SELECT p.[snz_uid] + ,DATEFROMPARTS(d.[snz_deceased_year_nbr], d.[snz_deceased_month_nbr], 28) AS [event_date] + ,'sibling' AS [relationship] + ,d.[snz_uid] AS [deceased_snz_uid] +FROM [IDI_Clean_YYYYMM].[data].[personal_detail] p -- p for person who attends funeral +INNER JOIN [IDI_Clean_YYYYMM].[data].[personal_detail] d -- d for deceased person +ON d.[snz_parent2_uid] = p.[snz_parent1_uid] -- parent_2 = parent_1 +AND d.[snz_uid] <> p.[snz_uid] +WHERE d.[snz_person_ind] = 1 +AND d.[snz_deceased_year_nbr] IS NOT NULL +AND d.[snz_deceased_month_nbr] IS NOT NULL +AND p.[snz_parent1_uid] IS NOT NULL -- parent 1 exists +AND d.[snz_parent2_uid] IS NOT NULL -- parent 2 exists +AND d.[snz_parent1_uid] <> d.[snz_parent2_uid] -- and is different from parent 1 + +UNION ALL + +SELECT p.[snz_uid] + ,DATEFROMPARTS(d.[snz_deceased_year_nbr], d.[snz_deceased_month_nbr], 28) AS [event_date] + ,'sibling' AS [relationship] + ,d.[snz_uid] AS [deceased_snz_uid] +FROM [IDI_Clean_YYYYMM].[data].[personal_detail] p -- p for person who attends funeral +INNER JOIN [IDI_Clean_YYYYMM].[data].[personal_detail] d -- d for deceased person +ON d.[snz_parent2_uid] = p.[snz_parent2_uid] -- both share same parent_2 +AND d.[snz_uid] <> p.[snz_uid] +WHERE d.[snz_person_ind] = 1 +AND d.[snz_deceased_year_nbr] IS NOT NULL +AND d.[snz_deceased_month_nbr] IS NOT NULL +AND p.[snz_parent2_uid] IS NOT NULL -- parent 2 exists +AND p.[snz_parent1_uid] <> p.[snz_parent2_uid] -- and is different from parent 1 +AND d.[snz_parent2_uid] IS NOT NULL -- parent 2 exists +AND d.[snz_parent1_uid] <> d.[snz_parent2_uid] -- and is different from parent 1 + +UNION ALL + +/*********** Spouse (by marriage or civil-union) ***********/ + +SELECT p.[partnr1_snz_uid] AS [snz_uid] -- first partner in marriage + ,DATEFROMPARTS(d.[snz_deceased_year_nbr], d.[snz_deceased_month_nbr], 28) AS [event_date] + ,'spouse' AS [relationship] + ,d.[snz_uid] AS [deceased_snz_uid] +FROM [IDI_Clean_YYYYMM].[dia_clean].[marriages] p -- p for person who attends funeral +INNER JOIN [IDI_Clean_YYYYMM].[data].[personal_detail] d -- d for deceased person +ON p.[partnr2_snz_uid] = d.[snz_uid] +WHERE [dia_mar_marriage_date] IS NOT NULL +AND p.[partnr1_snz_uid] <> p.[partnr2_snz_uid] +AND d.[snz_person_ind] = 1 +AND d.[snz_deceased_year_nbr] IS NOT NULL +AND d.[snz_deceased_month_nbr] IS NOT NULL +AND p.[partnr1_snz_uid] IS NOT NULL +AND p.[partnr2_snz_uid] IS NOT NULL + +UNION ALL + +SELECT p.[partnr2_snz_uid] AS [snz_uid] -- second partner in marriage + ,DATEFROMPARTS(d.[snz_deceased_year_nbr], d.[snz_deceased_month_nbr], 28) AS [event_date] + ,'spouse' AS [relationship] + ,d.[snz_uid] AS [deceased_snz_uid] +FROM [IDI_Clean_YYYYMM].[dia_clean].[marriages] p -- p for person who attends funeral +INNER JOIN [IDI_Clean_YYYYMM].[data].[personal_detail] d -- d for deceased person +ON p.[partnr1_snz_uid] = d.[snz_uid] +WHERE [dia_mar_marriage_date] IS NOT NULL +AND p.[partnr1_snz_uid] <> p.[partnr2_snz_uid] +AND d.[snz_person_ind] = 1 +AND d.[snz_deceased_year_nbr] IS NOT NULL +AND d.[snz_deceased_month_nbr] IS NOT NULL +AND p.[partnr1_snz_uid] IS NOT NULL +AND p.[partnr2_snz_uid] IS NOT NULL + +UNION ALL + +SELECT p.[partnr1_snz_uid] AS [snz_uid] -- first partner in civil union + ,DATEFROMPARTS(d.[snz_deceased_year_nbr], d.[snz_deceased_month_nbr], 28) AS [event_date] + ,'spouse' AS [relationship] + ,d.[snz_uid] AS [deceased_snz_uid] +FROM [IDI_Clean_YYYYMM].[dia_clean].[civil_unions] p -- p for person who attends funeral +INNER JOIN [IDI_Clean_YYYYMM].[data].[personal_detail] d -- d for deceased person +ON p.[partnr2_snz_uid] = d.[snz_uid] +WHERE [dia_civ_civil_union_date] IS NOT NULL +AND p.[partnr1_snz_uid] <> p.[partnr2_snz_uid] +AND d.[snz_person_ind] = 1 +AND d.[snz_deceased_year_nbr] IS NOT NULL +AND d.[snz_deceased_month_nbr] IS NOT NULL +AND p.[partnr1_snz_uid] IS NOT NULL +AND p.[partnr2_snz_uid] IS NOT NULL + +UNION ALL + +SELECT p.[partnr2_snz_uid] AS [snz_uid] -- second partner in civil union + ,DATEFROMPARTS(d.[snz_deceased_year_nbr], d.[snz_deceased_month_nbr], 28) AS [event_date] + ,'spouse' AS [relationship] + ,d.[snz_uid] AS [deceased_snz_uid] +FROM [IDI_Clean_YYYYMM].[dia_clean].[civil_unions] p -- p for person who attends funeral +INNER JOIN [IDI_Clean_YYYYMM].[data].[personal_detail] d -- d for deceased person +ON p.[partnr1_snz_uid] = d.[snz_uid] +WHERE [dia_civ_civil_union_date] IS NOT NULL +AND p.[partnr1_snz_uid] <> p.[partnr2_snz_uid] +AND d.[snz_person_ind] = 1 +AND d.[snz_deceased_year_nbr] IS NOT NULL +AND d.[snz_deceased_month_nbr] IS NOT NULL +AND p.[partnr1_snz_uid] IS NOT NULL +AND p.[partnr2_snz_uid] IS NOT NULL + +UNION ALL + +/*********** Co-parent (partner with whom you had a child) ***********/ + +SELECT DISTINCT p.[snz_parent2_uid] AS [snz_uid] -- distinct is required to avoid duplicates where people have multiple children + ,DATEFROMPARTS(d.[snz_deceased_year_nbr], d.[snz_deceased_month_nbr], 28) AS [event_date] + ,'coparent' AS [relationship] + ,d.[snz_uid] AS [deceased_snz_uid] +FROM [IDI_Clean_YYYYMM].[data].[personal_detail] p -- p for person who attends funeral +INNER JOIN [IDI_Clean_YYYYMM].[data].[personal_detail] d -- d for deceased person +ON d.[snz_uid] = p.[snz_parent1_uid] +WHERE d.[snz_person_ind] = 1 +AND d.[snz_deceased_year_nbr] IS NOT NULL +AND d.[snz_deceased_month_nbr] IS NOT NULL +AND p.[snz_parent1_uid] IS NOT NULL -- parent 1 exists +AND p.[snz_parent2_uid] IS NOT NULL -- parent 2 exists +AND p.[snz_parent1_uid] <> p.[snz_parent2_uid] -- and is different from parent 1 + +UNION ALL + +SELECT DISTINCT p.[snz_parent1_uid] AS [snz_uid] -- distinct is required to avoid duplicates where people have multiple children + ,DATEFROMPARTS(d.[snz_deceased_year_nbr], d.[snz_deceased_month_nbr], 28) AS [event_date] + ,'coparent' AS [relationship] + ,d.[snz_uid] AS [deceased_snz_uid] +FROM [IDI_Clean_YYYYMM].[data].[personal_detail] p -- p for person who attends funeral +INNER JOIN [IDI_Clean_YYYYMM].[data].[personal_detail] d -- d for deceased person +ON d.[snz_uid] = p.[snz_parent2_uid] +WHERE d.[snz_person_ind] = 1 +AND d.[snz_deceased_year_nbr] IS NOT NULL +AND d.[snz_deceased_month_nbr] IS NOT NULL +AND p.[snz_parent1_uid] IS NOT NULL -- parent 1 exists +AND p.[snz_parent2_uid] IS NOT NULL -- parent 2 exists +AND p.[snz_parent1_uid] <> p.[snz_parent2_uid] -- and is different from parent 1 + +GO diff --git a/family_whanau/first_birth.sql b/family_whanau/first_birth.sql new file mode 100644 index 0000000..e9b1446 --- /dev/null +++ b/family_whanau/first_birth.sql @@ -0,0 +1,71 @@ +/************************************************************************************************** +Title: Mother's age at first birth +Author: Simon Anastasiadis + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Mother's age at the time of their first birth. Excludes still births. + +Intended purpose: +Identifying when a woman first became a mother. +Determining a women's age at first (live) birth. + +Inputs & Dependencies: +- [IDI_Clean].[moh_clean].[maternity_mother] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[defn_first_birth] + +Notes: +1) Only year and month of birth are available in the IDI. Day of birth is considered + identifying. Hence all births happen on the 15th of the month. +2) Only births observed in New Zealand are recorded. Mothers who have children outside + New Zealand (e.g. migrants) will not appear. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +Issues: + +History (reverse order): +2020-05-20 SA v1 +**************************************************************************************************/ + +/* Set database for writing views */ +USE IDI_UserCode +GO + +/* Clear existing view */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[defn_first_birth]; +GO + +/* Create view */ +CREATE VIEW [DL-MAA20XX-YY].[defn_first_birth] AS +SELECT snz_uid + ,MIN([event_date]) AS [event_date] + ,MIN(moh_matm_mother_age_nbr) AS moh_matm_mother_age_nbr +FROM ( + SELECT snz_uid + ,DATEFROMPARTS([moh_matm_delivery_year_nbr], [moh_matm_delivery_month_nbr], 15) as [event_date] + ,moh_matm_mother_age_nbr + FROM [IDI_Clean_YYYYMM].[moh_clean].[maternity_mother] + WHERE [moh_matm_live_births_count_nbr] IN ('1','2','3','4') --must be a live birth +) k +GROUP BY snz_uid; +GO + + + diff --git a/family_whanau/marriage_civil_union.sql b/family_whanau/marriage_civil_union.sql new file mode 100644 index 0000000..b34ad02 --- /dev/null +++ b/family_whanau/marriage_civil_union.sql @@ -0,0 +1,143 @@ +/************************************************************************************************** +Title: Official relationship - marriage or civil union +Author: Simon Anastasiadis +Reviewer: Freya Li + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Period that a person is married or in a civil union for. + +Intended purpose: +Determining whether a person is married or solemenised (civily unioned). +Identifying dates of marriage, divorce, union, or disolution. + +Inputs & Dependencies: +- [IDI_Clean_202203].[dia_clean].[marriages] +- [IDI_Clean_202203].[dia_clean].[civil_unions] +- [IDI_Clean_202203].[data].[personal_detail] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[defn_official_relationship] + +Notes: +1) Each partner is recorded separately. +2) Max start date is 2021-12-30 + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +Issues: +- Contains a small number of records where [start_date] > [end_date] +- Contains a small number of duplicate records + +History (reverse order): +2022-31-05 VW Point to DL-MAA20XX-YY (seniors project) and 202203 refresh +2020-11-19 FL QA +2020-03-03 SA v1 +**************************************************************************************************/ + +/* Establish database for writing views */ +USE IDI_UserCode +GO + +/* Clear view */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[defn_official_relationship]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[defn_official_relationship] AS +WITH +marriage AS ( + SELECT [partnr1_snz_uid] + ,[dia_mar_partnr1_sex_snz_code] + ,[partnr2_snz_uid] + ,[dia_mar_partnr2_sex_snz_code] + ,[dia_mar_marriage_date] + ,[dia_mar_disolv_order_date] + ,DATEFROMPARTS(b.[snz_deceased_year_nbr], b.[snz_deceased_month_nbr], 28) AS [partnr1_deceased_date] + ,DATEFROMPARTS(c.[snz_deceased_year_nbr], c.[snz_deceased_month_nbr], 28) AS [partnr2_deceased_date] + FROM [IDI_Clean_YYYYMM].[dia_clean].[marriages] a + INNER JOIN [IDI_Clean_YYYYMM].[data].[personal_detail] b + ON a.[partnr1_snz_uid] = b.snz_uid + INNER JOIN [IDI_Clean_YYYYMM].[data].[personal_detail] c + ON a.[partnr2_snz_uid] = c.snz_uid + WHERE [dia_mar_marriage_date] IS NOT NULL + AND [partnr1_snz_uid] <> [partnr2_snz_uid] +), +civil_union AS ( + SELECT [partnr1_snz_uid] + ,[dia_civ_partnr1_sex_snz_code] + ,[partnr2_snz_uid] + ,[dia_civ_partnr2_sex_snz_code] + ,[dia_civ_civil_union_date] + ,[dia_civ_dissolution_type_text] + ,[dia_civ_disolv_order_date] + ,DATEFROMPARTS(b.[snz_deceased_year_nbr], b.[snz_deceased_month_nbr], 28) AS [partnr1_deceased_date] + ,DATEFROMPARTS(c.[snz_deceased_year_nbr], c.[snz_deceased_month_nbr], 28) AS [partnr2_deceased_date] + FROM [IDI_Clean_YYYYMM].[dia_clean].[civil_unions] a + INNER JOIN [IDI_Clean_YYYYMM].[data].[personal_detail] b + ON a.[partnr1_snz_uid] = b.snz_uid + INNER JOIN [IDI_Clean_YYYYMM].[data].[personal_detail] c + ON a.[partnr2_snz_uid] = c.snz_uid + WHERE [dia_civ_civil_union_date] IS NOT NULL + AND [partnr1_snz_uid] <> [partnr2_snz_uid] + +) +/* Partner 1, marriage */ +SELECT [partnr1_snz_uid] as snz_uid + ,[dia_mar_marriage_date] AS [start_date] + ,COALESCE([dia_mar_disolv_order_date], + IIF([partnr1_deceased_date] < [partnr2_deceased_date], [partnr1_deceased_date], [partnr2_deceased_date]), + [partnr1_deceased_date], + [partnr2_deceased_date], + '9999-12-31') AS [end_date] +FROM marriage + +UNION ALL + +/* Partner 2, marriage */ +SELECT [partnr2_snz_uid] as snz_uid + ,[dia_mar_marriage_date] AS [start_date] + ,COALESCE([dia_mar_disolv_order_date], + IIF([partnr1_deceased_date] < [partnr2_deceased_date], [partnr1_deceased_date], [partnr2_deceased_date]), + [partnr1_deceased_date], + [partnr2_deceased_date], + '9999-12-31') AS [end_date] +FROM marriage + +UNION ALL + +/* Partner 1, civil union */ +SELECT [partnr1_snz_uid] AS [snz_uid] + ,[dia_civ_civil_union_date] AS [start_date] + ,COALESCE([dia_civ_disolv_order_date], + IIF([partnr1_deceased_date] < [partnr2_deceased_date], [partnr1_deceased_date], [partnr2_deceased_date]), + [partnr1_deceased_date], + [partnr2_deceased_date], + '9999-12-31') AS [end_date] +FROM civil_union + +UNION ALL + +/* Partner 2, civil union */ +SELECT [partnr2_snz_uid] AS [snz_uid] + ,[dia_civ_civil_union_date] AS [start_date] + ,COALESCE([dia_civ_disolv_order_date], + IIF([partnr1_deceased_date] < [partnr2_deceased_date], [partnr1_deceased_date], [partnr2_deceased_date]), + [partnr1_deceased_date], + [partnr2_deceased_date], + '9999-12-31') AS [end_date] +FROM civil_union; +GO diff --git a/family_whanau/parent-char-stage.sql b/family_whanau/parent-char-stage.sql new file mode 100644 index 0000000..cb1663e Binary files /dev/null and b/family_whanau/parent-char-stage.sql differ diff --git a/family_whanau/parent_child_under_5.sql b/family_whanau/parent_child_under_5.sql new file mode 100644 index 0000000..3d4e431 --- /dev/null +++ b/family_whanau/parent_child_under_5.sql @@ -0,0 +1,226 @@ +/************************************************************************************************** +Title: Parent of a child under 5 years old +Author: Simon Anastasiadis + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Identifies years in which a person is a parent of a child under the age of 5 and the person has at least one address in common with the child. + +Intended purpose: +Identifying whether a person is actively parenting a child under the age of 5. + +Inputs & Dependencies: +- [IDI_Clean].[dia_clean].[births] +- [IDI_Clean].[data].[address_notification] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[defn_parent_under5_share_address] + +Notes: +1) We use the age of 5 as until this age children do not have to attend school regularly. +2) We require that the parent has at least one address in common with the child + as a proxy for whether the parent is living with, and involved in caring for, + their child. + Where there is a sole parent caring for the child, but both parents are named on the + birth record, this should help reduce counting of the non-caring parent. But it + is only an approximation. +3) Because babies may not have an address immediately after they are born, + for the first year of birth, we require an address match in the first two years. +4) Script returns the message: + "Warning: Null value is eliminated by an aggregate or other SET operation." + This is not a cause for concern. We deliberately use aggregation to eliminate nulls. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + Min birth year = 2010 + Max age of child = 5 {this parameter can not be updated via find & replace} + +Issues: +1) The crudeness with which we have filtered out parents who do not share an address + with their child means that this definition is not suited to identifying specific + individuals who are actively parenting. We recommend it only be used in aggregate + and with clear caveats. + +History (reverse order): +2020-05-25 SA v1 +**************************************************************************************************/ + +/*********** all parents and spells when child is aged under 5 ***********/ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_parent_under5]; +GO + +SELECT [snz_uid] + ,[dob] + ,[child_snz_uid] +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_parent_under5] +FROM ( + +SELECT [parent1_snz_uid] AS [snz_uid] + ,[dia_bir_birth_month_nbr] + ,[dia_bir_birth_year_nbr] + ,DATEFROMPARTS([dia_bir_birth_year_nbr], [dia_bir_birth_month_nbr], 15) AS dob + ,[dia_bir_still_birth_code] + ,[snz_uid] AS child_snz_uid +FROM [IDI_Clean_YYYYMM].[dia_clean].[births] +WHERE [parent1_snz_uid] IS NOT NULL + +UNION ALL + +SELECT [parent2_snz_uid] AS [snz_uid] + ,[dia_bir_birth_month_nbr] + ,[dia_bir_birth_year_nbr] + ,DATEFROMPARTS([dia_bir_birth_year_nbr], [dia_bir_birth_month_nbr], 15) AS dob + ,[dia_bir_still_birth_code] + ,[snz_uid] AS child_snz_uid +FROM [IDI_Clean_YYYYMM].[dia_clean].[births] +WHERE [parent2_snz_uid] IS NOT NULL +AND ([parent1_snz_uid] IS NULL OR [parent1_snz_uid] <> [parent2_snz_uid]) + +) k +WHERE [dia_bir_still_birth_code] IS NULL +AND [dia_bir_birth_year_nbr] >= 2010; + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_parent_under5] (snz_uid); +GO + +/*********** Must share address ***********/ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_parent_under5_share_address]; +GO + +WITH +year1 AS ( + SELECT pc.[snz_uid] + ,DATEADD(YEAR, 0, pc.[dob]) AS [start_date] + ,DATEADD(YEAR, 1, pc.[dob]) AS [end_date] + ,pc.[child_snz_uid] + ,COUNT(ca.[snz_idi_address_register_uid]) AS num_address_matches + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_parent_under5] pc /* parent & child */ + LEFT JOIN [IDI_Clean_YYYYMM].[data].[address_notification] pa /* parent address */ + ON pc.[snz_uid] = pa.[snz_uid] + AND pa.[ant_notification_date] <= DATEADD(YEAR, 2, pc.[dob]) /* extended to 2 years after in case babies don't have address */ + AND DATEADD(YEAR, 0, pc.[dob]) <= pa.[ant_replacement_date] + LEFT JOIN [IDI_Clean_YYYYMM].[data].[address_notification] ca /* child address */ + ON pc.[child_snz_uid] = ca.[snz_uid] + AND pa.[ant_notification_date] <= DATEADD(YEAR, 2, pc.[dob]) /* extended to 2 years after in case babies don't have address */ + AND DATEADD(YEAR, 0, pc.[dob]) <= pa.[ant_replacement_date] + AND pa.[snz_idi_address_register_uid] = ca.[snz_idi_address_register_uid] + GROUP BY pc.[snz_uid] + ,pc.[dob] + ,pc.[child_snz_uid] +), +year2 AS ( + SELECT pc.[snz_uid] + ,DATEADD(YEAR, 1, pc.[dob]) AS [start_date] + ,DATEADD(YEAR, 2, pc.[dob]) AS [end_date] + ,pc.[child_snz_uid] + ,COUNT(ca.[snz_idi_address_register_uid]) AS num_address_matches + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_parent_under5] pc /* parent & child */ + LEFT JOIN [IDI_Clean_YYYYMM].[data].[address_notification] pa /* parent address */ + ON pc.[snz_uid] = pa.[snz_uid] + AND pa.[ant_notification_date] <= DATEADD(YEAR, 2, pc.[dob]) + AND DATEADD(YEAR, 1, pc.[dob]) <= pa.[ant_replacement_date] + LEFT JOIN [IDI_Clean_YYYYMM].[data].[address_notification] ca /* child address */ + ON pc.[child_snz_uid] = ca.[snz_uid] + AND pa.[ant_notification_date] <= DATEADD(YEAR, 2, pc.[dob]) + AND DATEADD(YEAR, 1, pc.[dob]) <= pa.[ant_replacement_date] + AND pa.[snz_idi_address_register_uid] = ca.[snz_idi_address_register_uid] + GROUP BY pc.[snz_uid] + ,pc.[dob] + ,pc.[child_snz_uid] +), +year3 AS ( + SELECT pc.[snz_uid] + ,DATEADD(YEAR, 2, pc.[dob]) AS [start_date] + ,DATEADD(YEAR, 3, pc.[dob]) AS [end_date] + ,pc.[child_snz_uid] + ,COUNT(ca.[snz_idi_address_register_uid]) AS num_address_matches + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_parent_under5] pc /* parent & child */ + LEFT JOIN [IDI_Clean_YYYYMM].[data].[address_notification] pa /* parent address */ + ON pc.[snz_uid] = pa.[snz_uid] + AND pa.[ant_notification_date] <= DATEADD(YEAR, 3, pc.[dob]) + AND DATEADD(YEAR, 2, pc.[dob]) <= pa.[ant_replacement_date] + LEFT JOIN [IDI_Clean_YYYYMM].[data].[address_notification] ca /* child address */ + ON pc.[child_snz_uid] = ca.[snz_uid] + AND pa.[ant_notification_date] <= DATEADD(YEAR, 3, pc.[dob]) + AND DATEADD(YEAR, 2, pc.[dob]) <= pa.[ant_replacement_date] + AND pa.[snz_idi_address_register_uid] = ca.[snz_idi_address_register_uid] + GROUP BY pc.[snz_uid] + ,pc.[dob] + ,pc.[child_snz_uid] +), +year4 AS ( + SELECT pc.[snz_uid] + ,DATEADD(YEAR, 3, pc.[dob]) AS [start_date] + ,DATEADD(YEAR, 4, pc.[dob]) AS [end_date] + ,pc.[child_snz_uid] + ,COUNT(ca.[snz_idi_address_register_uid]) AS num_address_matches + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_parent_under5] pc /* parent & child */ + LEFT JOIN [IDI_Clean_YYYYMM].[data].[address_notification] pa /* parent address */ + ON pc.[snz_uid] = pa.[snz_uid] + AND pa.[ant_notification_date] <= DATEADD(YEAR, 4, pc.[dob]) + AND DATEADD(YEAR, 3, pc.[dob]) <= pa.[ant_replacement_date] + LEFT JOIN [IDI_Clean_YYYYMM].[data].[address_notification] ca /* child address */ + ON pc.[child_snz_uid] = ca.[snz_uid] + AND pa.[ant_notification_date] <= DATEADD(YEAR, 4, pc.[dob]) + AND DATEADD(YEAR, 3, pc.[dob]) <= pa.[ant_replacement_date] + AND pa.[snz_idi_address_register_uid] = ca.[snz_idi_address_register_uid] + GROUP BY pc.[snz_uid] + ,pc.[dob] + ,pc.[child_snz_uid] +), +year5 AS ( + SELECT pc.[snz_uid] + ,DATEADD(YEAR, 4, pc.[dob]) AS [start_date] + ,DATEADD(YEAR, 5, pc.[dob]) AS [end_date] + ,pc.[child_snz_uid] + ,COUNT(ca.[snz_idi_address_register_uid]) AS num_address_matches + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_parent_under5] pc /* parent & child */ + LEFT JOIN [IDI_Clean_YYYYMM].[data].[address_notification] pa /* parent address */ + ON pc.[snz_uid] = pa.[snz_uid] + AND pa.[ant_notification_date] <= DATEADD(YEAR, 5, pc.[dob]) + AND DATEADD(YEAR, 4, pc.[dob]) <= pa.[ant_replacement_date] + LEFT JOIN [IDI_Clean_YYYYMM].[data].[address_notification] ca /* child address */ + ON pc.[child_snz_uid] = ca.[snz_uid] + AND pa.[ant_notification_date] <= DATEADD(YEAR, 5, pc.[dob]) + AND DATEADD(YEAR, 4, pc.[dob]) <= pa.[ant_replacement_date] + AND pa.[snz_idi_address_register_uid] = ca.[snz_idi_address_register_uid] + GROUP BY pc.[snz_uid] + ,pc.[dob] + ,pc.[child_snz_uid] +) +SELECT [snz_uid], [start_date], [end_date], [child_snz_uid], [num_address_matches] +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_parent_under5_share_address] +FROM ( + SELECT * FROM year1 UNION ALL + SELECT * FROM year2 UNION ALL + SELECT * FROM year3 UNION ALL + SELECT * FROM year4 UNION ALL + SELECT * FROM year5 +) k +WHERE [num_address_matches] > 0 +GO + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[defn_parent_under5_share_address] (snz_uid); +GO +/* Compress final table to save space */ +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[defn_parent_under5_share_address] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE); +GO + +/* remove staging table */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_parent_under5]; +GO diff --git a/family_whanau/parenthood.sql b/family_whanau/parenthood.sql new file mode 100644 index 0000000..2525c7d --- /dev/null +++ b/family_whanau/parenthood.sql @@ -0,0 +1,81 @@ +/************************************************************************************************** +Title: Parenthood +Author: Joel Bancolita, Marianna Pekar + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Identfying whether someone is a biological parent + +Intended purpose: +When a person became a parent. +Identifying the birth of a child. + +Inputs & Dependencies: +- [IDI_Clean].[dia_clean].[births] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[vacc_biol_parent] + + +Notes: +1) No controls are included for whether of not a parent has any role + in their child's life. Biological parents may not be the caregivers of a child. + Parental involvement might be proxied by checking how much the child and the + parent(s) share an address. +2) Captures births in NZ. Unlikely to capture births that occurred overseas. + Some family information is included in migrant records and could supplement this. +3) How/whether adoption interacts with birth records should be checked for researchers + considering this are advised to investigate further. It is possible that adoption + masks the biological birth record with the adoptive record for ease of administration. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = vacc_ + Project schema = DL-MAA20XX-YY + +Issues: + +History (reverse order): +2021-08-31 MP +2020-08-24 MP +2020-06-10 JB +**************************************************************************************************/ + +USE IDI_UserCode +GO + +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[vacc_biol_parent]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[vacc_biol_parent] AS +/* parent 1 */ +SELECT [snz_uid] AS child_snz_uid + ,[parent1_snz_uid] AS parent_snz_uid + ,DATEFROMPARTS(dia_bir_birth_year_nbr, dia_bir_birth_month_nbr, 15) AS birth_date_proxy +FROM [IDI_Clean_YYYYMM].[dia_clean].[births] +WHERE dia_bir_still_birth_code IS NULL +AND [parent1_snz_uid] IS NOT NULL +AND [snz_uid] IS NOT NULL + +UNION ALL + +/* parent 2 */ +SELECT [snz_uid] AS child_snz_uid + ,[parent2_snz_uid] AS parent_snz_uid + ,DATEFROMPARTS(dia_bir_birth_year_nbr, dia_bir_birth_month_nbr, 15) AS birth_date_proxy +FROM [IDI_Clean_YYYYMM].[dia_clean].[births] +WHERE dia_bir_still_birth_code IS NULL +AND [parent2_snz_uid] IS NOT NULL +AND [snz_uid] IS NOT NULL +GO diff --git a/health/ACC_serious_injury.sql b/health/ACC_serious_injury.sql new file mode 100644 index 0000000..699a014 --- /dev/null +++ b/health/ACC_serious_injury.sql @@ -0,0 +1,80 @@ +/************************************************************************************************** +Title: ACC serious injury under management +Author: Craig Wright + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Indicator for ACC serious injury with management still open. + +Intended purpose: +Identifying where people have had a serious injury where the resolution +of the injury is ongoing with ACC. + +Inputs & Dependencies: +- [IDI_Clean].[acc_clean].[claims] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[vacc_ACC_serious] + +Notes: +1) First medical certification type: + - FUF = Fully unfit + - FFSW = Fit for selected work. + - FATAL + First incapacity covers the first sequential period of incapacity, across multiple medical certificates. + + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = vacc_ + Project schema = DL-MAA20XX-YY + +Issues: + +History (reverse order): +2021-10-31 CW +**************************************************************************************************/ + +USE IDI_UserCode +GO + +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[vacc_ACC_serious] +GO + +CREATE VIEW [DL-MAA20XX-YY].[vacc_ACC_serious] AS +SELECT [snz_uid] + --,[snz_acc_claim_form_45_uid] + --,[snz_acc_claim_uid] + --,[acc_cla_accident_date] + --,[acc_cla_case_management_end_date] + --,[acc_cla_serious_injury_ind] + --,year([acc_cla_accident_date]) as year + --,[acc_cla_multiple_injury_ind] + --,[acc_cla_read_code] + --,[acc_cla_ICD10_code] + --,[acc_cla_activity_prior_text] + --,[acc_cla_external_agency_text] + --,[acc_cla_cause_desc] + --,[acc_cla_contact_desc] + --,[acc_cla_primry_diagnos_grp_text] + --,[acc_cla_primary_injury_site_text] + --,[acc_cla_fatal_ind] + --,[acc_cla_first_incapacity] + --,[acc_cla_first_incapacity_type] + --,[acc_cla_first_incapacity_days] +FROM [IDI_Clean_YYYYMM].[acc_clean].[claims] +WHERE ([acc_cla_case_management_end_date] IS NULL OR [acc_cla_case_management_end_date] = '9999-12-31') +AND [acc_cla_serious_injury_ind] = 'Y' +AND YEAR([acc_cla_accident_date]) IN (2020,2021) -- claims between 2020 and 2021 +GO diff --git a/health/COPD_register.sql b/health/COPD_register.sql new file mode 100644 index 0000000..d5436d7 --- /dev/null +++ b/health/COPD_register.sql @@ -0,0 +1,108 @@ +/************************************************************************************************** +Title: COPD register +Author: MOH +Re-edit: Manjusha Radhakrishnan +Reviewer: + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +People in the COPD register. + +Intended purpose: +Create register of people who have been hospitalised for or received medicine for COPD in the past years + +Inputs & Dependencies: +- [IDI_Clean].[moh_clean].[priv_fund_hosp_discharges_diag] +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_event] +- [IDI_Clean].[moh_clean].[pharmaceutical] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[def_copd] + +Notes: +- COPD codes used: + ICD codes: J41-J44 + Pharms codes: 4043, 4047, 4057, 4058, 4059, 4060 + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +Issues: +1. Duplicates can be found; this is because a person may receive multiple treatments for the same condition + +History (reverse order): +2022-07-20 MR v1 + +**************************************************************************************************/ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[def_copd] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[def_copd] ( + snz_uid INT + , event_date DATE + , source VARCHAR(255) +); +GO + + +/** PUBLIC HOSPITALS **/ +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[def_copd] (snz_uid, event_date, source) +SELECT snz_uid, + nmd.moh_evt_even_date AS event_date, + 'PUB HOSP' AS source +FROM [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_event] as nmd +INNER JOIN ( + SELECT moh_dia_event_id_nbr, + moh_dia_diagnosis_type_code, + moh_dia_clinical_code, + moh_dia_clinical_sys_code + FROM [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_diag] + WHERE moh_dia_clinical_sys_code IN ('11','12','13','14','15') + AND (substring(moh_dia_clinical_code,1,3) IN ('J41','J42','J43','J44'))) as fndp +ON nmd.moh_evt_event_id_nbr = fndp.moh_dia_event_id_nbr +GO + +/** PRIVATE HOSPITALS **/ +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[def_copd] (snz_uid, event_date, source) +SELECT snz_uid, + moh_pri_evt_end_date AS event_date, + 'PRIV HOSP' AS source +FROM [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_event] as nmd +INNER JOIN ( + SELECT moh_pri_diag_event_id_nbr, + moh_pri_diag_diag_type_code, + moh_pri_diag_clinic_code, + moh_pri_diag_clinic_sys_code + FROM [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_diag] + WHERE moh_pri_diag_clinic_sys_code IN ('11','12','13','14','15') + AND (substring(moh_pri_diag_clinic_code,1,3) IN ('J41','J42','J43','J44')) +) as fndp +ON nmd.moh_pri_evt_event_id_nbr = fndp.moh_pri_diag_event_id_nbr +GO + +/** PHARMACEUTICAL **/ +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[def_copd] (snz_uid, event_date, source) +SELECT snz_uid, + moh_pha_dispensed_date AS event_date, + 'PHARM' AS source +FROM [IDI_Clean_YYYYMM].[moh_clean].[pharmaceutical] as pharm +INNER JOIN [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_dim_form_pack_subsidy_code] as form +ON pharm.moh_pha_dim_form_pack_code = form.dim_form_pack_subsidy_key +WHERE pharm.moh_pha_order_type_code IN (1,7) + AND pharm.moh_pha_patent_category_code != 'W' + AND pharm.moh_pha_admin_record_ind = '0' + AND chemical_id IN ('4043','4047','4057','4058','4060','4059') +GO diff --git a/health/DAP_register.sql b/health/DAP_register.sql new file mode 100644 index 0000000..9b6b0c1 --- /dev/null +++ b/health/DAP_register.sql @@ -0,0 +1,119 @@ +/************************************************************************************************** +Title: Dementia, Alzheimer's and Parkinson's register +Author: MOH +Re-edit: Manjusha Radhakrishnan +Reviewer: + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +People in the DAP register. + +Intended purpose: +Create register of people who have been hospitalised for or received medicine for Dementia, Alzheimer's and Parkinson's in the past years + +Inputs & Dependencies: +- [IDI_Clean].[moh_clean].[priv_fund_hosp_discharges_diag] +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_event] +- [IDI_Clean].[moh_clean].[pharmaceutical] +- [IDI_Clean].[moh_clean].[interrai] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[defn_dap] + + +Notes: +- Dementia, Alzheimer's and Parkinson's codes used: + ICD 10 codes: F00-F02.0, F02.8-F03.91, F06.2, G30-G31.1, G31.8-G32.89 + Pharms codes: 3923, 4037 + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +Issues: +1. Duplicates can be found; this is because a person may receive multiple treatments for the same condition + +History (reverse order): +2022-07-20 MR v1 + +**************************************************************************************************/ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_dap] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[defn_dap] ( + snz_uid INT + , event_date DATE + , source VARCHAR(255) +); +GO + + +/** PUBLIC HOSPITALS **/ +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_dap] (snz_uid, event_date, source) +SELECT snz_uid, + nmd.moh_evt_even_date AS event_date, + 'PUB HOSP' AS source +FROM [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_event] as nmd +INNER JOIN ( + SELECT moh_dia_event_id_nbr, + moh_dia_diagnosis_type_code, + moh_dia_clinical_code, + moh_dia_clinical_sys_code + FROM [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_diag] + WHERE moh_dia_clinical_sys_code IN ('11','12','13','14','15') + AND (substring(moh_dia_clinical_code,1,3) IN ('F01','F02','F03','F06.2','G30','G310','G311','G318','G32'))) as fndp +ON nmd.moh_evt_event_id_nbr = fndp.moh_dia_event_id_nbr +GO + +/** PRIVATE HOSPITALS **/ +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_dap] (snz_uid, event_date, source) +SELECT snz_uid, + moh_pri_evt_end_date AS event_date, + 'PRIV HOSP' AS source +FROM [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_event] as nmd +INNER JOIN ( + SELECT moh_pri_diag_event_id_nbr, + moh_pri_diag_diag_type_code, + moh_pri_diag_clinic_code, + moh_pri_diag_clinic_sys_code + FROM [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_diag] + WHERE moh_pri_diag_clinic_sys_code IN ('11','12','13','14','15') + AND (substring(moh_pri_diag_clinic_code,1,3) IN ('F01','F02','F03','F06.2','G30','G310','G311','G318','G32')) +) as fndp +ON nmd.moh_pri_evt_event_id_nbr = fndp.moh_pri_diag_event_id_nbr +GO + +/** PHARMACEUTICAL **/ +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_dap] (snz_uid, event_date, source) +SELECT snz_uid, + moh_pha_dispensed_date AS event_date, + 'PHARM' AS source +FROM [IDI_Clean_YYYYMM].[moh_clean].[pharmaceutical] as pharm +INNER JOIN [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_dim_form_pack_subsidy_code] as form +ON pharm.moh_pha_dim_form_pack_code = form.dim_form_pack_subsidy_key +WHERE pharm.moh_pha_order_type_code IN (1,7) + AND pharm.moh_pha_patent_category_code != 'W' + AND pharm.moh_pha_admin_record_ind = '0' + AND chemical_id IN ('3923', '4037') +GO + +/** INTERAI **/ +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_dap] (snz_uid, event_date, source) +SELECT snz_uid, + moh_irai_assessment_date AS event_date, + 'INTERAI' AS source +FROM [IDI_Clean_YYYYMM].[moh_clean].[interrai] +WHERE moh_irai_alzheimers_code = 1 OR moh_irai_dementia_code = 1 OR moh_irai_parkinsons_code = 1 +GO diff --git a/health/GP_contacts.sql b/health/GP_contacts.sql new file mode 100644 index 0000000..6ac05fb --- /dev/null +++ b/health/GP_contacts.sql @@ -0,0 +1,207 @@ +/************************************************************************************************** +Title: GP contacts +Author: Craig Wright + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +GP visit proxy indicator +The General Practice contacts indicator is a proxy indicator of frequency +of GP contact at the individual level by date of visit. + +Intended purpose: +Estimate primary care consults for each snz_uid by date. + +Inputs & Dependencies: +- [IDI_Clean].[acc_clean].[payments] +- [IDI_Clean].[acc_clean].[claims] +- [IDI_Clean].[moh_clean].[lab_claims] +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_event] +- [IDI_Clean].[moh_clean].[nnpac] +- [IDI_Clean].[moh_clean].[nes_enrolment] +- [IDI_Clean].[moh_clean].[pop_cohort_demographics] +- [IDI_Clean].[moh_clean].[pho_enrolment] +- [IDI_Clean].[moh_clean].[gms_claims] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[defn_GP_contacts] + + +Notes: +1) The indicator relies primarily on this: + - The PHO quarterly reporting of last date of GP contact (up to four visits a year). + - The General Medical Subsidy (GMS) claims for all primary care contacts outside of + when a person is enrolled GP (all visits). + - ACC funded primary care contacts. + It also relies on laboratory test data and the date of visit that initiated the test request. + Approximately 20% of lab tests are requested in the secondary care setting to this is likely + to drive a slight over count in estimates. +2) Being based on quarterly PHO contacts and GMS events we expect the indicator to be accurate for + people having primary care outside their PHO and people visiting the GP up to four times a year. + Inclusion of ACC and lab tests likely improves the quality for people with more than four visits + in a year. While the indicator will be accurate for most people, a small group will have more + visits than the indicator suggests. +3) Comparisons of population total annual volumes of GP visits collected by the MOH from PHOs + compare favourably to the indicator, within a few percent. Total volume comparisons by age, + gender, and ethnic group are also similar but differ by about 5-10%. +4) We exclude multiple events on the same day, but for some people these are likely to exist. +5) Final table: Dec 2014-Jan 2021. +6) In the 20210720 refresh the data is complete from 2014-2018, partial for 2019-2020 (80-90%) + and has very little for 2021 except for January. +7) Using Lab visits presents a challenge. ~20% of labs data is from hospitals and not GP visits. + There is no indicator to identify this. Our approach is to drop lab visits on days where the + snz_uid is in hospital or a non-admitted patient (NAP). This reduced lab tests by ~3.5% + Investigation suggested that lab tests reported the day after a hospital visit behave similarly + to lab tests during a hospital visit, hence these were also excluded. This increased the + reduction in lab tests from ~3.5% to ~5.5% +8) Results are consistent with expectations that ~81% of the population have 1 or more contacts + with their GP each year. +9) A small proportion of people will have multiple visits to a GP on the same day. For some datasets + this can be observed via duplicate records. But this can not be reliably detected across datasets. + We have not attempted to capture multiple visits on the same day. +10) The lab tests data includes provider codes. This should include the ID of the referring entity. + This definition might be further refined by investigation & use of these codes. + For example, if we observe providers whose lab tests never coincide with ACC, GMS, or GP enrolments, + then this would suggest these providers are operating in a different function from GPs. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = vacc_ + Project schema = DL-MAA20XX-YY + +Issues: + +History (reverse order): +2022-06-01 VW Point to DL-MAA20XX-YY, update to latest refresh, remove vaccination summary view +2021-11-29 SA review and tidy +2021-10-31 CW +**************************************************************************************************/ + +/* create table of all possible contacts */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_GP_contacts_list] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_GP_contacts_list] ( + snz_uid INT, + visit_date DATE, +); +GO + +/******************************************************** +ACC GP visits +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_GP_contacts_list] (snz_uid, visit_date) +SELECT DISTINCT b.[snz_uid] + ,[acc_pay_first_service_date] AS visit_date +FROM [IDI_Clean_YYYYMM].[acc_clean].[payments] AS a +INNER JOIN [IDI_Clean_YYYYMM].[acc_clean].[claims] AS b +ON a.[snz_acc_claim_uid] = b.[snz_acc_claim_uid] +WHERE [acc_pay_gl_account_text] = 'GENERAL PRACTITIONERS' +AND [acc_pay_focus_group_text] = 'MEDICAL TREATMENT' +AND [acc_pay_first_service_date] = [acc_pay_last_service_date] --ensure a single visit on same day +GO + +/******************************************************** +Labs +********************************************************/ + +WITH adding_hospital_indicators_to_lab_tests AS ( + +SELECT DISTINCT c.[snz_uid] + ,[moh_lab_visit_date] AS visit_date + ,IIF(b.snz_uid IS NOT NULL, 1, 0) AS public_hospital_discharge + ,IIF(n.snz_uid IS NOT NULL, 1, 0) AS non_admitted_patient + ,IIF(b.snz_uid IS NOT NULL OR n.snz_uid IS NOT NULL, 1, 0) AS hospital + + ,IIF(c.[moh_lab_visit_date] = DATEADD(DAY, 1, b.[moh_evt_even_date]), 1, 0) AS public_hospital_discharge_extra + ,IIF(c.[moh_lab_visit_date] = DATEADD(DAY, 1, n.[moh_nnp_service_date]), 1, 0) AS non_admitted_patient_extra +FROM [IDI_Clean_YYYYMM].[moh_clean].[lab_claims] AS c +LEFT JOIN [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_event] AS b +ON c.snz_uid = b.snz_uid +AND c.[moh_lab_visit_date] BETWEEN b.[moh_evt_evst_date] AND DATEADD(DAY, 1, b.[moh_evt_even_date]) -- include one day later +LEFT JOIN [IDI_Clean_YYYYMM].[moh_clean].[nnpac] AS n +ON c.snz_uid = n.snz_uid +AND c.[moh_lab_visit_date] BETWEEN n.[moh_nnp_service_date] AND DATEADD(DAY, 1, n.[moh_nnp_service_date]) -- include one day later +WHERE YEAR([moh_lab_visit_date]) BETWEEN 2014 AND 2021 + +) +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_GP_contacts_list] (snz_uid, visit_date) +SELECT snz_uid + ,visit_date +FROM adding_hospital_indicators_to_lab_tests +GO + +/******************************************************** +NES - contact events for the National enrolment service data +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_GP_contacts_list] (snz_uid, visit_date) +SELECT DISTINCT b.snz_uid + ,[moh_nes_last_consult_date] AS visit_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[nes_enrolment] AS a +LEFT JOIN [IDI_Clean_YYYYMM].[moh_clean].[pop_cohort_demographics] AS b +ON a.snz_moh_uid = b.snz_moh_uid +WHERE YEAR([moh_nes_last_consult_date]) IN (2014,2015,2016,2017,2018,2019,2020,2021) +GO + +/******************************************************** +PHO - contact events for the PHO enrolment service data +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_GP_contacts_list] (snz_uid, visit_date) +SELECT DISTINCT [snz_uid] + ,[moh_pho_last_consul_date] AS visit_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[pho_enrolment] +WHERE YEAR([moh_pho_last_consul_date]) IN (2014,2015,2016,2017,2018,2019,2020,2021) -- range of start year and end year +GO + +/******************************************************** +GMS - contact events for GP visits outside of PHO enrolment +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_GP_contacts_list] (snz_uid, visit_date) +SELECT [snz_uid] --not distinct as all discrete events + ,[moh_gms_visit_date] AS visit_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[gms_claims] +WHERE YEAR([moh_gms_visit_date]) IN (2014,2015,2016,2017,2018,2019,2020,2021) -- range of start year and end year +GO + +/******************************************************** +Combine +********************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_GP_contacts] +GO + +SELECT DISTINCT snz_uid + ,visit_date +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_GP_contacts] +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_GP_contacts_list] +WHERE snz_uid IS NOT NULL +AND visit_date IS NOT NULL +AND YEAR(visit_date) >= 2014 -- start year +AND YEAR(visit_date) <= 2021 -- end year +GO + +CREATE NONCLUSTERED INDEX my_index ON [IDI_Sandpit].[DL-MAA20XX-YY].[defn_GP_contacts] (snz_uid) +GO +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[defn_GP_contacts] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE) +GO + +/******************************************************** +Tidy +********************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_GP_contacts_list] +GO diff --git a/health/IHD_register.sql b/health/IHD_register.sql new file mode 100644 index 0000000..be6e04f --- /dev/null +++ b/health/IHD_register.sql @@ -0,0 +1,94 @@ +/************************************************************************************************** +Title: Ischaemic heart disease register +Author: MOH +Re-edit: Manjusha Radhakrishnan +Reviewer: + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + + +Description: +People in the IHD register. + +Intended purpose: +Create register of people who have been hospitalised for IHD in the past years. + +Inputs & Dependencies: +- [IDI_Clean].[moh_clean].[priv_fund_hosp_discharges_diag] +- [IDI_Clean].[[moh_clean].[pub_fund_hosp_discharges_event] + +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[defn_ihd] + +Notes: +- Ischaemic heart disease codes used: + I20-I26, Z82.4 + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +Issues: +1. Duplicates can be found; this is because a person may receive multiple treatments for the same condition + +History (reverse order): +2022-07-20 MR v1 +**************************************************************************************************/ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_ihd] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[defn_ihd] ( + snz_uid INT + , event_date DATE + , source VARCHAR(255) +); +GO + + +/** PUBLIC HOSPITALS **/ +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_ihd] (snz_uid, event_date, source) +SELECT snz_uid, + nmd.moh_evt_even_date AS event_date, + 'PUB HOSP' AS source +FROM [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_event] as nmd +INNER JOIN ( + SELECT moh_dia_event_id_nbr, + moh_dia_diagnosis_type_code, + moh_dia_clinical_code, + moh_dia_clinical_sys_code + FROM [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_diag] + WHERE moh_dia_clinical_sys_code IN ('11','12','13','14','15') + AND (substring(moh_dia_clinical_code,1,3) IN ('I20','I210','I211','I212','I213','I214','I219','I22','I23','I24','I25','Z824'))) as fndp +ON nmd.moh_evt_event_id_nbr = fndp.moh_dia_event_id_nbr +GO + +/** PRIVATE HOSPITALS **/ +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_ihd] (snz_uid, event_date, source) +SELECT snz_uid, + moh_pri_evt_end_date AS event_date, + 'PRIV HOSP' AS source +FROM [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_event] as nmd +INNER JOIN ( + SELECT moh_pri_diag_event_id_nbr, + moh_pri_diag_diag_type_code, + moh_pri_diag_clinic_code, + moh_pri_diag_clinic_sys_code + FROM [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_diag] + WHERE moh_pri_diag_clinic_sys_code IN ('11','12','13','14','15') + AND (substring(moh_pri_diag_clinic_code,1,3) IN ('I20','I210','I211','I212','I213','I214','I219','I22','I23','I24','I25','Z824')) +) as fndp +ON nmd.moh_pri_evt_event_id_nbr = fndp.moh_pri_diag_event_id_nbr +GO + diff --git a/health/PHO_enrollment.sql b/health/PHO_enrollment.sql new file mode 100644 index 0000000..3e0517d --- /dev/null +++ b/health/PHO_enrollment.sql @@ -0,0 +1,63 @@ +/************************************************************************************************** +Title: PHO enrolment +Author: Craig Wright + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Enrolment with Primary Health Organisation (PHO). + +Intended purpose: +Create variable reporting pho enrolment by month of enrolment pho_enrolment =(0/1) +based on monthly enrolment + +Inputs & Dependencies: +- [IDI_Clean].[moh_clean].[nes_enrolment] +- [IDI_Clean].[moh_clean].[pop_cohort_demographics] + +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_pho_enrollment_2021] + + +Notes: + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = vacc_ + Project schema = DL-MAA20XX-YY + Snapshot month = 'YYYYMMDD' + +Issues: + +History (reverse order): +2021-11-25 SA tidy +2021-10-12 CW +**************************************************************************************************/ + +/* remove */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_pho_enrollment_2021]; +GO + +/* create */ +SELECT b.snz_uid + ,a.[snz_moh_uid] + ,CAST([moh_nes_snapshot_month_date] AS DATE) AS enrollment_date +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_pho_enrollment_2021] +FROM [IDI_Clean_YYYYMM].[moh_clean].[nes_enrolment] AS a + ,[IDI_Clean_YYYYMM].[moh_clean].[pop_cohort_demographics] as b +WHERE a.snz_moh_uid = b.snz_moh_uid +AND [moh_nes_snapshot_month_date] >= 'YYYYMMDD' + +/* index */ +CREATE NONCLUSTERED INDEX my_index ON [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_pho_enrollment_2021] (snz_uid) diff --git a/health/Principal Disability Mapping.xlsx b/health/Principal Disability Mapping.xlsx new file mode 100644 index 0000000..8c999e4 Binary files /dev/null and b/health/Principal Disability Mapping.xlsx differ diff --git a/health/alcohol_abuse_dependence.sql b/health/alcohol_abuse_dependence.sql new file mode 100644 index 0000000..df2c703 --- /dev/null +++ b/health/alcohol_abuse_dependence.sql @@ -0,0 +1,573 @@ +/************************************************************************************************** +Title: Alcohol abuse or dependence +Author: Craig Wright +Re-work: Manjusha Radhakrishnan, Simon Anastasiadis +Reviewer: + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Any indication of chronic or acute alcohol abuse or dependence +For incidence and prevalence - see notes. + +Intended purpose: +Indication of chronic or acute alcohol abuse or dependence. + +Inputs & Dependencies: +- [IDI_Clean].[security].[concordance] +- [IDI_Clean].[moh_clean].[mortality_diagnosis] +- [IDI_Clean].[moh_clean].[mortality_registrations] +- [IDI_Clean].[moh_clean].[pharmaceutical] +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_diag] +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_event] +- [IDI_Clean].[moh_clean].[priv_fund_hosp_discharges_diag] +- [IDI_Clean].[moh_clean].[priv_fund_hosp_discharges_event] +- [IDI_Clean].[moh_clean].[PRIMHD] +- [IDI_Clean].[moh_clean].[interrai] +- [IDI_Clean].[moh_clean].[pop_cohort_demographics] +- [IDI_Clean].[msd_clean].[msd_incapacity] +- [IDI_Clean].[moj_clean].[charges] +- [IDI_Clean].[pol_clean].[nia_links] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_dim_form_pack_subsidy_code] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moj_charge_outcome_type_code] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moj_offence] +- [IDI_Adhoc].[clean_read_MOH_PRIMHD].[moh_primhd_mhinc] +- [IDI_Adhoc].[clean_read_MOH_PRIMHD].[primhd_diagnoses] +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_disability] +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_needs_assessment] +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_referral] +- [IDI_Adhoc].[clean_read_MOJ].[moj_alcohol_drv_disq] + +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_alcohol_abuse_or_dependence] +- [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] + + +Notes: +1) Definition based on three factors: + 1. any diagnosis or treatment for serious alcohol abuse or depedence + 2. any information for diangosis or treatment for serious alcohol abuse or depedence + WITH conditions including Alcohol and Other Drugs (AOD) abuse or depedence + 3. any information on acute intoxication - marks misuse of alcohol but not abuse or depedence + Precursors data tables for creating the final table: + 1. Alternative Level of Care (ALC) = diagnosis or treatment that is specific to abuse and depedence; this is used to indicate serious alcohol problems + 2. multiple = evidence of AOD - so servies that are for a mix of alcohol and / or other drugs be not specifically identifying which. + Includes: alcohol abuse/depedence and 100% alcohol attributable chronic conditions + Excludes: Fetal Alcohol Syndrome (FAS) and fetus affected by maternal alcohol use + +2) Incidence and prevalence + Incidence = development of a condition during a particular time period (new only). + Prevalance = affected by the condition during a particular time period (new + existing). + Plots of incidence (from this definition) against deprivation show a non-monotinic pattern. + Incidence increases from deprivation 1-8 but deprivation 9 and 10 show lower incidence. + This is likely because of barriers to access (cost, transport, availability etc.). + + So we can not use point in time measures to accurately access prevalance. + Instead, when using this definition, we recommend considering every person's history. + For example, if we consider any alcohol abuse indication in the last 10 years, then we observe + a more likely relationship between alcohol abuse and deprivation. This suggests that people in high + deprivation do get some treatment, but much less frequently than those who are less deprived. + Such an approach is reasonable, as this condition is likely to persist for at least 10 years. + +3) Some treatments are used to treat this condition and other conditions. This is most common + with pharmaceuticals. For these treatments, we do the following: + - Gather the treatments that serve multiple purposes together separately. + - Where a person only has records that also treat other conditions, discard those records + - Where a person has other evidence of alcohol abuse, add the treatments for multiple conditions + to the table. + For example, in 2010 a person is diagnosed in hospital with alcohol abuse, in 2008 they received drugs + that are used to treat alcohol abuse or depression. Then we include the 2008 date as an earlier probable indicator. + +4) Certain medial events are coded using the ICD9, ICD10, or DSM codes - most commonly hospital diagnoses. + There are mappings between the different codings in the diagnoses table. The mappings help researchers who are + familiar with only one coding system to locate records from a different coding system. + Most records (at least 80%) are stored in two versions/rows (the submitted code system, and an alternative they + have been mapped to), so researchers could use either version. + However, the mappings are imperfect. In some cases a more specific code we do want is mapped to a more general code + that we do not want. Hence, to ensure the most robust results, we have limited ourselves to only those records where + the diagnostic code is stored in the same system it was submitted. + This may exclude some records from our output definition. Researchers needing the broadest possible definition are + advised to review this constraint. + +5) We have decided to exclude SOCRATES as a source. There are several conditions recorded in SOCRATES that point + to previous alcohol abuse. However, they may develop 10+ years after such alcohol abuse. Hence, these may be of + interest to some researchers, but are not included here. + - 1106 Foetal alcohol syndrome (FAS) + - 1301 Alcohol / drug related disorder (excluding Korsakov's syndrome) + - 1403 Korsakov's syndrome / alcohol-related dementia + +6) We considered, but have excluded driver disqualifications. + The table [IDI_Adhoc].[clean_read_MOJ].[moj_alcohol_drv_disq] contains indication for conviction, whether blood or breath + alcohol level was high or low and what was the measurement. + However, the table has no information beyond 2013. Hence it is of limited use for most recent applications. + +Issues: +1) Because all our MHA tables use the same lookup/reference table, and all definitions load + this table into the database, you can not run the definitions in parallel. Because, each + definition will delete the reference table when it starts running and this will interfere + with the definitions that are already running. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +History (reverse order): +2022-09-12 SA Prep for library +2022-07-19 MR Tidy-up +2022-06-10 CW Definition creation +*************************************************************************************************************************/ + +/* Download the diagnosis lookup table from Github folder and upload onto datalab */ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] ( + diagnosis VARCHAR(30), + code_type VARCHAR(30), + code VARCHAR(10), + aux VARCHAR(30), + explanation VARCHAR(255), +) + +BULK INSERT [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] +FROM '\\your project folder\diagnosis_codes.csv' +WITH ( + FIRSTROW = 2, + FIELDTERMINATOR = ',', + ROWTERMINATOR = '\n', + TABLOCK +) + +/******************************************************** +TABLES TO APPEND TO +********************************************************/ + +/* Diagnosis or treatment only indicates bipolar */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_alcohol_solo] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_alcohol_solo] ( + snz_uid INT, + event_date DATE, +) + +/* Diagnosis or treatment used for bipolar and other conditions */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_alcohol_multi] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_alcohol_multi] ( + snz_uid INT, + event_date DATE, +) + +/******************************************************** +MORTALITY + +Note that people who died with this diagnosis will likely +have had bipolar for a while before death +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_alcohol_solo] (snz_uid, event_date) +SELECT b.snz_uid + ,EOMONTH(DATEFROMPARTS([moh_mor_death_year_nbr],[moh_mor_death_month_nbr],1)) AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[mortality_diagnosis] AS a +INNER JOIN [IDI_Clean_YYYYMM].[moh_clean].[mortality_registrations] AS b +ON a.[snz_dia_death_reg_uid] = b.snz_dia_death_reg_uid +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_mort_diag_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'alcohol_abuse' + AND r.code_type = 'ICD10' + AND [moh_mort_diag_clinic_sys_code] >= '10' + AND [moh_mort_diag_clinic_type_code] IN ('A','B','V') +) +OR EXISTS ( +SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_mort_diag_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'alcohol_abuse' + AND r.code_type = 'ICD9' + AND [moh_mort_diag_clinic_sys_code] IN ('06','6') + AND [moh_mort_diag_clinic_type_code] IN ('A','B','V') +) +OR EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_mort_diag_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'alcohol_abuse' + AND r.code_type = 'external_ICD10' + AND [moh_mort_diag_clinic_sys_code] >= '10' + AND [moh_mort_diag_clinic_type_code] IN ('E') +) +OR EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_mort_diag_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'alcohol_abuse' + AND r.code_type = 'external_ICD9' + AND [moh_mort_diag_clinic_sys_code] IN ('06','6') + AND [moh_mort_diag_clinic_type_code] IN ('E') +) +GO + +/******************************************************** +PHARMACEUTICALS +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_alcohol_solo] (snz_uid, event_date) +SELECT a.[snz_uid] + ,[moh_pha_dispensed_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[pharmaceutical] AS a +INNER JOIN [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_dim_form_pack_subsidy_code] AS b +ON a.[moh_pha_dim_form_pack_code] = b.[DIM_FORM_PACK_SUBSIDY_KEY] +WHERE EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(CAST(CHEMICAL_ID AS VARCHAR), 1, LEN(r.code)) = r.code + AND r.diagnosis = 'alcohol_abuse' + AND r.code_type = 'pharm_chemical' + AND r.aux = 'sole' +) +GO + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_alcohol_multi] (snz_uid, event_date) +SELECT a.[snz_uid] + ,[moh_pha_dispensed_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[pharmaceutical] AS a +INNER JOIN [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_dim_form_pack_subsidy_code] AS b +ON a.[moh_pha_dim_form_pack_code] = b.[DIM_FORM_PACK_SUBSIDY_KEY] +WHERE EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(CAST(CHEMICAL_ID AS VARCHAR), 1, LEN(r.code)) = r.code + AND r.diagnosis = 'alcohol_abuse' + AND r.code_type = 'pharm_chemical' + AND r.aux = 'multiple' +) +GO + +/******************************************************** +PRIVATE HOSPITAL DISCHARGE +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_alcohol_solo] (snz_uid, event_date) +SELECT a.[snz_uid] + ,CAST([moh_pri_evt_start_date] AS DATE) AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_event] AS a +INNER JOIN [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_diag] AS b +ON a.[moh_pri_evt_event_id_nbr] = b.[moh_pri_diag_event_id_nbr] +AND [moh_pri_diag_sub_sys_code] = [moh_pri_diag_clinic_sys_code] +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING([moh_pri_diag_clinic_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'alcohol_abuse' + AND r.code_type = 'ICD10' + AND [moh_pri_diag_sub_sys_code] >= '10' + AND [moh_pri_diag_diag_type_code] IN ('A','B','V') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING([moh_pri_diag_clinic_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'alcohol_abuse' + AND r.code_type = 'ICD9' + AND [moh_pri_diag_sub_sys_code] IN ('06','6') + AND [moh_pri_diag_diag_type_code] IN ('A','B','V') +) +GO + +/******************************************************** +PUBLIC HOSPITAL DISCHARGE +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_alcohol_solo] (snz_uid, event_date) +SELECT b.[snz_uid] + ,[moh_evt_evst_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_diag] AS a +INNER JOIN [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_event] AS b +ON [moh_dia_clinical_sys_code] = [moh_dia_submitted_system_code] +AND [moh_evt_event_id_nbr]=[moh_dia_event_id_nbr] +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_dia_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'alcohol_abuse' + AND r.code_type = 'ICD10' + AND [moh_dia_submitted_system_code] >= '10' + AND [moh_dia_diagnosis_type_code] IN ('A','B','V') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_dia_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'alcohol_abuse' + AND r.code_type = 'ICD9' + AND [moh_dia_submitted_system_code] IN ('06','6') + AND [moh_dia_diagnosis_type_code] IN ('A','B','V') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_dia_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'alcohol_abuse' + AND r.code_type = 'external_ICD10' + AND [moh_dia_submitted_system_code] >= '10' + AND [moh_dia_diagnosis_type_code] IN ('E') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_dia_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'alcohol_abuse' + AND r.code_type = 'external_ICD9' + AND [moh_dia_submitted_system_code] IN ('06','6') + AND [moh_dia_diagnosis_type_code] IN ('E') +) +GO + +/******************************************************** +PUBLIC HOSPITAL DISCHARGE + +SUBSTANCE ABUSE HEALTH SPECIALITY +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_alcohol_multi] (snz_uid, event_date) +SELECT snz_uid, [moh_evt_evst_date] as event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_event] +WHERE moh_evt_hlth_spec_code in ('Y40','Y41','Y42','Y43','Y44','Y45','Y46','Y47','Y48','Y49') +GO + +/******************************************************** +PRIMHD AND MHINC +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_alcohol_solo] (snz_uid, event_date) +SELECT b.snz_uid + ,[classification_start] AS event_date +FROM [IDI_Adhoc].[clean_read_MOH_PRIMHD].[moh_primhd_mhinc] AS a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b +ON a.snz_moh_uid = b.snz_moh_uid +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'alcohol_abuse' + AND r.code_type = 'ICD10' + AND [clinical_coding_system_id] >= '10' + AND diagnosis_type in ('A','B','V','P') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'alcohol_abuse' + AND r.code_type = 'DSM' + AND [clinical_coding_system_id] IN ('07','7') + AND diagnosis_type in ('A','B','V','P') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'alcohol_abuse' + AND r.code_type = 'ICD9' + AND [clinical_coding_system_id] IN ('06','6') + AND diagnosis_type in ('A','B','V','P') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'alcohol_abuse' + AND r.code_type = 'external_ICD10' + AND [clinical_coding_system_id] >= '10' + AND diagnosis_type in ('E') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'alcohol_abuse' + AND r.code_type = 'external_DSM' + AND [clinical_coding_system_id] IN ('07','7') + AND diagnosis_type in ('E') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'alcohol_abuse' + AND r.code_type = 'external_ICD9' + AND [clinical_coding_system_id] IN ('06','6') + AND diagnosis_type in ('E') +) +GO + +/******************************************************** +PRIMHD TEAM +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_alcohol_multi] (snz_uid, event_date) +SELECT [snz_uid] + ,[moh_mhd_activity_start_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[PRIMHD] +WHERE [moh_mhd_team_type_code] = 3 +OR [moh_mhd_team_code] IN (7874,14808,13481,13541,7086,7102,7114,7115,7238,7119,7122,7142,7152,7153,7077) + +/******************************************************** +PRIMHD DIAGNOSIS +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_alcohol_solo] (snz_uid, event_date) +SELECT snz_uid + ,DATEFROMPARTS(SUBSTRING([CLASSIFICATION_START_DATE],7,4),SUBSTRING([CLASSIFICATION_START_DATE],4,2),SUBSTRING([CLASSIFICATION_START_DATE],1,2)) AS event_date +FROM [IDI_Adhoc].[clean_read_MOH_PRIMHD].[primhd_diagnoses] AS a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b +ON a.snz_moh_uid = b.snz_moh_uid + +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'alcohol_abuse' + AND r.code_type = 'ICD10' + AND [clinical_coding_system_id] >= '10' + AND [DIAGNOSIS_TYPE] in ('A','B','V','P') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'alcohol_abuse' + AND r.code_type = 'DSM' + AND [clinical_coding_system_id] IN ('07','7') + AND [DIAGNOSIS_TYPE] in ('A','B','V','P') +) +GO + +/******************************************************** +INTERRAI + +Alcohol- Highest number of drinks in any single sitting in LAST 14 DAYS / 5 or more +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_alcohol_multi] (snz_uid, event_date) +SELECT [snz_uid] + ,[moh_irai_assessment_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[interrai] +WHERE moh_irai_alcohol_one_settng_code = 3 /* 5+ drinks */ +GO + +/******************************************************** +MSD INCAPACITATION +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_alcohol_solo] (snz_uid, event_date) +SELECT [snz_uid] + ,[msd_incp_incp_from_date] AS event_date +FROM [IDI_Clean_YYYYMM].[msd_clean].[msd_incapacity] +WHERE [msd_incp_incrsn_code] IN ('007','170') +OR [msd_incp_incrsn95_1_code] IN ('007','170') +OR [msd_incp_incrsn95_2_code] IN ('007','170') +OR [msd_incp_incrsn95_3_code] IN ('007','170') +OR [msd_incp_incrsn95_4_code] IN ('007','170') +OR [msd_incp_incapacity_code] IN ('007','170') +GO + +/******************************************************** +DRIVING DISQUALIFICATION +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_alcohol_multi] (snz_uid, event_date) +SELECT snz_uid, [moj_chg_offence_from_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moj_clean].[charges] AS a +WHERE EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(moj_chg_offence_code, 1, LEN(r.code)) = r.code + AND r.diagnosis = 'alcohol_abuse' + AND r.code_type = 'offences' +) +GO + +/******************************************************** +POLICE NIA LINKS +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_alcohol_multi] (snz_uid, event_date) +SELECT snz_uid, [nia_links_rec_date] AS event_date +FROM [IDI_Clean_YYYYMM].[pol_clean].[nia_links] AS a +WHERE EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING([nia_links_latest_inc_off_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'alcohol_abuse' + AND r.code_type = 'offences' +) +GO + +/**************************************************************************************************************** +FINAL TABLE CREATION +****************************************************************************************************************/ + +/* Add indexes */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_alcohol_solo] (snz_uid); +GO +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_alcohol_multi] (snz_uid); +GO + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_alcohol_abuse_or_dependence] +GO + +WITH multi_to_add AS ( + SELECT snz_uid, event_date + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_alcohol_multi] AS m + WHERE EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_alcohol_solo] AS s + WHERE m.snz_uid = s.snz_uid + ) +) +SELECT DISTINCT snz_uid, event_date +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_alcohol_abuse_or_dependence] +FROM ( + SELECT snz_uid, event_date + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_alcohol_solo] + + UNION ALL + + SELECT snz_uid, event_date + FROM multi_to_add +) AS k +GO + +/******************************************************** +TIDY UP +********************************************************/ + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_alcohol_abuse_or_dependence] (snz_uid); +GO +/* Compress final table to save space */ +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_alcohol_abuse_or_dependence] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE); +GO + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_alcohol_solo] +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_alcohol_multi] +GO + diff --git a/health/avoidable hospitalisations ASH_PAH conditions.xlsx b/health/avoidable hospitalisations ASH_PAH conditions.xlsx new file mode 100644 index 0000000..5cc2260 Binary files /dev/null and b/health/avoidable hospitalisations ASH_PAH conditions.xlsx differ diff --git a/health/avoidable_hospitalisations_ASH_PAH.sql b/health/avoidable_hospitalisations_ASH_PAH.sql new file mode 100644 index 0000000..36c71f8 --- /dev/null +++ b/health/avoidable_hospitalisations_ASH_PAH.sql @@ -0,0 +1,252 @@ +/************************************************************************************************** +Title: ASH and child ASH/PAH +Author: HZ + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Ambulatory sensitive hospitalisations (ASH) and Child ASH & Potentially Avoidable Hospitalisation (PAH) + +Intended purpose: +Defines spells where clients have had a publicly funded hospital event which is considered +to be an Ambulatory Sensitive Care Hospitalisation (ASH) condition, child-related +Ambulatory Sensitive Care and/or Potentially avoidable hospitalisation (PAH). + +This includes the diagnosis (ICD10 code) and the relevant ages for which it is considered an ASH/PAH event. + +Inputs & Dependencies: +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_diag] +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_event] +Outputs: +- [IDI_UserCode].[$(PROJSCH)].[$(TBLPREF)ash] + + +Notes: +1) Ambulatory Sensitive Hospitalisations (ASH) are the most acute admissions that are + considered potentially reducible through prophylactic or therapeutic interventions + deliverable in a primary care setting. +2) Potentially Avoidable Hospitalisations (PAH) is an indicator of health-related outcomes + under the Child and Youth Wellbeing Strategy, and a Child Poverty Related Indicator + required by the Child Poverty Reduction Act 2018. +3) The main ASH definition has been constructed from the description provided by Craig. + This includes a list of diagnosis and procedure codes and the recommended age bracket for each condition. +4) The child ASH/PAH definitions have been constructed using the codes provided in Table 2 and 3 of + The New Zealand Medical Journal article called "Developing a tool to monitor potentially avoidable + and ambulatory care sensitive hospitalisations in New Zealand children". +5) The [end_date] in this table is the end of the hospital visit when diagnosis took place, + NOT the date that the chronic condition ended. +6) Most Child ASH and PAH codes correspond to children aged 1 month to 14 years. These are the upper and + lower age bounds unless otherwise specified. +(pg. 28, www.journal.nzma.org.nz/journal-articles/developing-a-tool-to-monitor-potentially-avoidable-and-ambulatory-care-sensitive-hosptilisations-in-new-zealand-children). + +Parameters & Present values: + Current refresh = $(IDIREF) + Prefix = $(TBLPREF) + Project schema = [$(PROJSCH)] + +Issues: + +History (reverse order): + + +**************************************************************************************************/ +--PARAMETERS################################################################################################## +--SQLCMD only (Activate by clicking Query->SQLCMD Mode) +--Already in master.sql; Uncomment when running individually +:setvar TBLPREF "SWA" +:setvar IDIREF "IDI_Clean_YYYYMM" +:setvar PROJSCH "DL-MAA20XX-YY" +GO + +--############################################################################################################## + +/* Clear before creation */ +DROP TABLE IF EXISTS [IDI_Sandpit].[$(PROJSCH)].[$(TBLPREF)_ash_event]; +GO + +/************************************ publically funded hospital discharges ************************************/ +SELECT [moh_dia_event_id_nbr] + ,[moh_dia_clinical_code] +INTO [IDI_Sandpit].[$(PROJSCH)].[$(TBLPREF)_ash_event] +FROM [$(IDIREF)].[moh_clean].[pub_fund_hosp_discharges_diag] +WHERE [moh_dia_submitted_system_code] = [moh_dia_clinical_sys_code] /* higher accuracy when systems match */ +AND ( +/* diagnosis in ICD10 */ +[moh_dia_diagnosis_type_code] IN ('A', 'B') /*"A" is "Principle diagnosis" and "B" is "Other relevant diagnosis" */ +AND [moh_dia_clinical_sys_code] IN ('10', '11', '12', '13', '14') /* ICD-10-AM - First, second, third, sixth, eighth edition*/ +AND ( SUBSTRING([moh_dia_clinical_code], 1, 3) IN ('A00', 'A01', 'A02', 'A03', 'A04', 'A05', 'A06', 'A07', 'A08', 'A09', 'A15', + 'A16', 'A17', 'A18', 'A19', 'A33', 'A34', 'A35', 'A36', 'A37', 'A39', 'A50', + 'A51', 'A52', 'A53', 'A54', 'A55', 'A56', 'A57', 'A58', 'A59', 'A60', 'A63', + 'A64', 'A80', 'A87', 'B05', 'B06', 'B16', 'B18', 'B26', 'B34', 'C53', 'D50', + 'D51', 'D52', 'D53', 'E10', 'E11', 'E13', 'E14', 'E40', 'E41', 'E42', 'E43', + 'E44', 'E45', 'E46', 'E47', 'E48', 'E49', 'E50', 'E51', 'E52', 'E53', 'E54', + 'E55', 'E56', 'E57', 'E58', 'E59', 'E60', 'E61', 'E62', 'E63', 'E64', 'G00', + 'G01', 'G02', 'G03', 'G40', 'G41', 'H65', 'H66', 'H67', 'I00', 'I01', 'I02', + 'I05', 'I06', 'I07', 'I08', 'I09', 'I10', 'I11', 'I12', 'I13', 'I15', 'I20', + 'I21', 'I22', 'I23', 'I25', 'I50', 'I61', 'I63', 'I64', 'I65', 'I66', 'J00', + 'J01', 'J02', 'J03', 'J04', 'J06', 'J12', 'J13', 'J14', 'J15', 'J16', 'J18', + 'J21', 'J22', 'J44', 'J45', 'J46', 'J47', 'J81', 'K02', 'K04', 'K05', 'K21', + 'K25', 'K26', 'K27', 'K28', 'L00', 'L01', 'L02', 'L03', 'L04', 'L05', 'L08', + 'L20', 'L21', 'L22', 'L23', 'L24', 'L25', 'L26', 'L27', 'L28', 'L29', 'L30', + 'M86', 'N10', 'N12', 'O15', 'R11') + OR SUBSTRING([moh_dia_clinical_code], 1, 4) IN ('A403', 'B180', 'B181', 'E162', 'H000', 'H010', 'I240', 'I241', 'I248', 'I249', + 'I674', 'J050', 'J100', 'J110', 'J340', 'K529', 'K590', 'L980', 'M014', 'M023', + 'M833', 'N136', 'N300', 'N309', 'N341', 'N390', 'P350', 'R062', 'R072', 'R073', + 'R074', 'R560', 'R568') + ) /*Include only all main ASH, child ASH, and PAH ICD10 codes*/ +) + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[$(PROJSCH)].[$(TBLPREF)_ash_event] ([moh_dia_event_id_nbr]); +GO + +/* Clear before creation */ +DROP TABLE IF EXISTS [IDI_Sandpit].[$(PROJSCH)].[$(TBLPREF)_ash]; +GO + +/************************************ combined final table ************************************/ +SELECT [snz_uid] + ,[moh_dia_event_id_nbr] + ,[start_date] + ,[end_date] + --,[moh_dia_clinical_code] + --,[moh_evt_birth_month_nbr] + --,[moh_evt_birth_year_nbr] + --,age_mnths + ,main_ASH + ,chld_ASH + ,chld_PAH +INTO [IDI_Sandpit].[$(PROJSCH)].[$(TBLPREF)_ash] +FROM ( +SELECT * +/*Group the ICD10 codes as child PAH/ASH condition or Main ASH*/ +------------------------------------------------------------------------------- +/*Child PAH*/ + ,CASE WHEN SUBSTRING([moh_dia_clinical_code], 1, 3) IN ('A00', 'A01', 'A02', 'A03', 'A04', 'A05', 'A06', 'A07', 'A08', 'A09', 'A15', + 'A16', 'A17', 'A18', 'A19', 'A33', 'A34', 'A35', 'A36', 'A37', 'A39', 'A80', + 'A87', 'B05', 'B06', 'B16', 'B26', 'B34', 'D50', 'D51', 'D52', 'D53', 'E40', + 'E41', 'E42', 'E43', 'E44', 'E45', 'E46', 'E47', 'E48', 'E49', 'E50', 'E51', + 'E52', 'E53', 'E54', 'E55', 'E56', 'E57', 'E58', 'E59', 'E60', 'E61', 'E62', + 'E63', 'E64', 'G00', 'G01', 'G02', 'G03', 'H65', 'H66', 'H67', 'I00', 'I01', + 'I02', 'I05', 'I06', 'I07', 'I08', 'I09', 'J00', 'J01', 'J02', 'J03', 'J04', + 'J06', 'J12', 'J13', 'J14', 'J15', 'J16', 'J18', 'J21', 'J45', 'J46', 'J47', + 'K02', 'K04', 'K05', 'K21', 'L00', 'L01', 'L02', 'L03', 'L04', 'L05', 'L08', + 'L20', 'L21', 'L22', 'L23', 'L24', 'L25', 'L26', 'L27', 'L28', 'L29', 'L30', + 'M86', 'R11') + AND age_mnths >= 1 AND age_mnths <= 168 THEN 1 /*3 substring - events that apply to ages greater than 1 month and less than 14 years*/ + WHEN SUBSTRING([moh_dia_clinical_code], 1, 4) IN ('B180', 'B181', 'H000', 'H010', 'J050', 'J100', 'J110', 'J340', 'K529', 'K590', 'L980', + 'M014', 'P350', 'R560') + AND age_mnths >= 1 AND age_mnths <= 168 THEN 1 /*4 substring - events that apply to ages greater than 1 month and less than 14 years*/ + WHEN SUBSTRING([moh_dia_clinical_code], 1, 3) IN ('N10', 'N12') + AND age_mnths >= 60 and age_mnths <= 168 THEN 1 /*3 substring - events that apply to ages older than 5 years*/ + WHEN SUBSTRING([moh_dia_clinical_code], 1, 4) IN ('N300', 'N390', 'N136', 'N309') + AND age_mnths >= 60 and age_mnths <= 168 THEN 1 /*4 substring - events that apply to ages older than 5 years*/ + ELSE 0 END AS chld_PAH + +------------------------------------------------------------------------------- +/*Child ASH*/ + ,CASE WHEN SUBSTRING([moh_dia_clinical_code], 1, 3) IN ('A02', 'A03', 'A04', 'A05', 'A06', 'A07', 'A08', 'A09', 'A33', 'A34', 'D50', + 'D51', 'D52', 'D53', 'E40', 'E41', 'E42', 'E43', 'E44', 'E45', 'E46', 'E47', + 'E48', 'E49', 'E50', 'E51', 'E52', 'E53', 'E54', 'E55', 'E56', 'E57', 'E58', + 'E59', 'E60', 'E61', 'E62', 'E63', 'E64', 'H65', 'H66', 'H67', 'I00', 'I01', + 'I02', 'I05', 'I06', 'I07', 'I08', 'I09', 'J00', 'J01', 'J02', 'J03', 'J06', + 'J13', 'J14', 'J15', 'J16', 'J18', 'J45', 'J46', 'J47', 'K02', 'K04', 'K05', + 'K21', 'L00', 'L01', 'L02', 'L03', 'L04', 'L05', 'L08', 'L20', 'L21', 'L22', + 'L23', 'L24', 'L25', 'L26', 'L27', 'L28', 'L29', 'L30', 'R11') + AND age_mnths >= 1 AND age_mnths <= 168 THEN 1 /*3 substring - events that apply to ages greater than 1 month and less than 14 years*/ + WHEN SUBSTRING([moh_dia_clinical_code], 1, 4) IN ('K590', 'K529', 'L980', 'J340', 'H010', 'H000', 'P350') + AND age_mnths >= 1 AND age_mnths <= 168 THEN 1 /*4 substring - events that apply to ages greater than 1 month and less than 14 years*/ + WHEN SUBSTRING([moh_dia_clinical_code], 1, 3) IN ('N10', 'N12') + AND age_mnths >= 60 and age_mnths <= 168 THEN 1 /*3 substring - events that apply to ages older than 5 years*/ + WHEN SUBSTRING([moh_dia_clinical_code], 1, 4) IN ('N300', 'N390', 'N136', 'N309') + AND age_mnths >= 60 and age_mnths <= 168 THEN 1 /*4 substring - events that apply to ages older than 5 years*/ + WHEN SUBSTRING([moh_dia_clinical_code], 1, 3) IN ('B05', 'B06', 'B26') + AND age_mnths > 16 and age_mnths <= 168 THEN 1 /*3 substring - events that apply to ages older than 16 months*/ + WHEN SUBSTRING([moh_dia_clinical_code], 1, 4) IN ('M014') + AND age_mnths > 16 and age_mnths <= 168 THEN 1 /*4 substring - events that apply to ages older than 16 months*/ + WHEN SUBSTRING([moh_dia_clinical_code], 1, 3) IN ('A35', 'A36', 'A37', 'A80', 'B16') + AND age_mnths > 6 and age_mnths <= 168 THEN 1 /*3 substring - events that apply to ages older than 6 months*/ + WHEN SUBSTRING([moh_dia_clinical_code], 1, 4) IN ('B180', 'B181') + AND age_mnths > 6 and age_mnths <= 168 THEN 1 /*4 substring - events that apply to ages older than 6 months*/ + ELSE 0 END AS chld_ASH + +------------------------------------------------------------------------------- +/*Main ASH*/ + ,CASE WHEN SUBSTRING([moh_dia_clinical_code], 1, 3) IN ('J22') + AND age_mnths >= 0 AND age_mnths <= 48 THEN 1 /*3 substring - events that apply to ages greater than 0 month and less than 4 years*/ + WHEN SUBSTRING([moh_dia_clinical_code], 1, 4) IN ('R062') + AND age_mnths >= 0 AND age_mnths <= 48 THEN 1 /*4 substring - events that apply to ages greater than 0 month and less than 4 years*/ + WHEN SUBSTRING([moh_dia_clinical_code], 1, 3) IN ('B05', 'B06', 'B26') + AND age_mnths >= 15 AND age_mnths <= 168 THEN 1 /*3 substring - events that apply to ages greater than 15 month and less than 14 years*/ + WHEN SUBSTRING([moh_dia_clinical_code], 1, 4) IN ('P350') + AND age_mnths >= 15 AND age_mnths <= 168 THEN 1 /*4 substring - events that apply to ages greater than 15 month and less than 14 years*/ + WHEN SUBSTRING([moh_dia_clinical_code], 1, 3) IN ('A50', 'A51', 'A52', 'A53', 'A54', 'A55', 'A56', 'A57', 'A58', 'A59', 'A60', + 'A63', 'A64', 'C53', 'E10', 'E11', 'E13', 'E14', 'G40', 'G41', 'I10', 'I11', + 'I12', 'I13', 'I15', 'I20', 'I21', 'I22', 'I23', 'I25', 'I50', 'I61', 'I63', + 'I64', 'I65', 'I66', 'J44', 'J47', 'J81', 'K25', 'K26', 'K27', 'K28', 'O15') + AND age_mnths >= 180 THEN 1 /*3 substring - events that apply to ages greater than 15 years*/ + WHEN SUBSTRING([moh_dia_clinical_code], 1, 4) IN ('E162', 'I240', 'I241', 'I248', 'I249', 'I674', 'M023', 'M833', 'N341', + 'R072', 'R073', 'R074', 'R560', 'R568') + AND age_mnths >= 180 THEN 1 /*4 substring - events that apply to ages greater than 15 years*/ + WHEN SUBSTRING([moh_dia_clinical_code], 1, 3) IN ('N10', 'N12') + AND age_mnths >= 60 THEN 1 /*3 substring - events that apply to ages greater than 5 years*/ + WHEN SUBSTRING([moh_dia_clinical_code], 1, 4) IN ('N136', 'N309', 'N390') + AND age_mnths >= 60 THEN 1 /*4 substring - events that apply to ages greater than 5 years*/ + WHEN SUBSTRING([moh_dia_clinical_code], 1, 3) IN ('A33', 'A34', 'A35', 'A36', 'A37', 'A80', 'B16', 'B18') + AND age_mnths >= 6 AND age_mnths <= 168 THEN 1 /*3 substring - events that apply to ages greater than 6 month and less than 14 years*/ + WHEN SUBSTRING([moh_dia_clinical_code], 1, 4) IN ('A403') + AND age_mnths >= 6 AND age_mnths <= 168 THEN 1 /*4 substring - events that apply to ages greater than 6 month and less than 14 years*/ + WHEN SUBSTRING([moh_dia_clinical_code], 1, 3) IN ('A02', 'A03', 'A04', 'A05', 'A06', 'A07', 'A08', 'A09', 'D50', 'D51', 'D52', + 'D53', 'E40', 'E41', 'E42', 'E43', 'E44', 'E45', 'E46', 'E50', 'E51', 'E52', + 'E53', 'E54', 'E55', 'E56', 'E58', 'E59', 'E60', 'E61', 'E63', 'H65', 'H66', + 'H67', 'I00', 'I01', 'I02', 'I05', 'I06', 'I07', 'I08', 'I09', 'J00', 'J01', + 'J02', 'J03', 'J04', 'J06', 'J13', 'J14', 'J15', 'J16', 'J18', 'J45', 'J46', + 'K02', 'K04', 'K05', 'K21', 'L01', 'L02', 'L03', 'L04', 'L08', 'L20', 'L21', + 'L22', 'L23', 'L24', 'L25', 'L26', 'L27', 'L28', 'L29', 'L30', 'R11') + AND age_mnths >= 0 THEN 1 /*3 substring - events that apply to all ages*/ + WHEN SUBSTRING([moh_dia_clinical_code], 1, 4) IN ('H000', 'H010', 'J340', 'K529', 'K590', 'L980') + AND age_mnths >= 0 THEN 1 /*4 substring - events that apply to all ages*/ + ELSE 0 END AS main_ASH + +FROM ( +/* public */ +SELECT [snz_uid] + ,[moh_dia_event_id_nbr] + ,[moh_evt_evst_date] AS [start_date] + ,[moh_evt_even_date] AS [end_date] + ,a.[moh_dia_clinical_code] + ,[moh_evt_birth_month_nbr] + ,[moh_evt_birth_year_nbr] + /*Determine the age (in months) of the individual at time of event*/ + ,DATEDIFF(month, (DATEFROMPARTS([moh_evt_birth_year_nbr], [moh_evt_birth_month_nbr], 15)), [moh_evt_evst_date]) AS age_mnths +FROM [IDI_Sandpit].[$(PROJSCH)].[$(TBLPREF)_ash_event] a +INNER JOIN [$(IDIREF)].[moh_clean].[pub_fund_hosp_discharges_event] b +ON a.[moh_dia_event_id_nbr] = b.[moh_evt_event_id_nbr] +) k +) m +WHERE chld_PAH!=0 or chld_ASH!=0 or main_ASH!=0 /*Remove spells for individuals whose ages do not correspond to the age bands for ASH or child ASH/PAH*/ + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[$(PROJSCH)].[$(TBLPREF)_ash] ([snz_uid]); +GO +/* Compress final table to save space */ +ALTER TABLE [IDI_Sandpit].[$(PROJSCH)].[$(TBLPREF)_ash] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE); +GO + +/************************************ tidy temporary tables away ************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[$(PROJSCH)].[$(TBLPREF)_ash_event]; +GO + + diff --git a/health/cancer_register.sql b/health/cancer_register.sql new file mode 100644 index 0000000..b34a23c --- /dev/null +++ b/health/cancer_register.sql @@ -0,0 +1,89 @@ +/************************************************************************************************** +Title: Cancer register +Author: MOH +Re-edit: Manjusha Radhakrishnan +Reviewer: + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +People in the cancer register. + +Intended purpose: +Create register of people who have been added to the cancer register or received treatment for cancer in the past years + +Inputs & Dependencies: +- [IDI_Clean].[moh_clean].[cancer_registrations] +- [IDI_Clean].[moh_clean].[nnpac] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[def_cancer] + + +Notes: +1. Cancer codes used: + - ICD 10 codes: C00-C96, D00-D48, N60, N84, N87, Z08-Z09, Z12, Z80, Z85, Z86 + - NNPAC codes: M5007-M50025, M54002-M54004, MS02009, M30020-M30021 + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +Issues: +1. Duplicates can be found; this is because a person may receive multiple treatments for the same condition + +History (reverse order): +2022-07-20 MR v1 +**************************************************************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[def_cancer] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[def_cancer] ( + snz_uid INT + , event_date DATE + , source VARCHAR(255) +); +GO + +/** NMDS query - to access hospitalisation data **/ +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[def_cancer] (snz_uid, event_date, source) +SELECT snz_uid + ,moh_can_diagnosis_date AS event_date + , 'CANCER REG' AS source +FROM [IDI_Clean_YYYYMM].[moh_clean].[cancer_registrations] +WHERE moh_can_extent_of_disease_code NOT IN ('0','A') + AND (SUBSTRING(moh_can_site_code,1,3) IN ('C00','C01','C02','C03','C04','C05','C06','C07','C08','C09', + 'C10','C11','C12','C13','C14','C15','C16','C17','C18','C19','C20', + 'C20','C21','C22','C23','C24','C25','C26','C27','C28','C29','C30','C31','C32','C33','C34','C35','C36', + 'C37','C38','C39','C40','C41','C42','C43','C44','C45','C46','C47','C48','C49', + 'C50','C51','C52','C53','C54','C55','C56','C57','C58','C59','C60','C61','C62','C63','C64','C65','C66', + 'C67','C68','C69','C70','C71','C72','C73','C74','C75','C76','C77','C78','C79', + 'C80','C81','C82','C83','C84','C85','C86','C87','C88','C89','C90','C91','C92','C93','C94','C95','C96', + 'D00','D01','D02','D03','D04','D05','D06','D07','D08','D09', + 'D10','D11','D12','D13','D14','D15','D16','D17','D18','D19','D20', + 'D20','D21','D22','D23','D24','D25','D26','D27','D28','D29','D30','D31','D32','D33','D34','D35','D36', + 'D37','D38','D39','D40','D41','D42','D43','D44','D45','D46','D47','D48', + 'N60','N84','N87','Z08','Z09','Z12','Z80','Z85','Z86')) +GO + +/** NNPAC data **/ +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[def_cancer] (snz_uid, event_date, source) +SELECT snz_uid + ,moh_nnp_service_date AS event_date + ,'NNPAC' AS source + FROM [IDI_Clean_YYYYMM].[moh_clean].[nnpac] + WHERE moh_nnp_purchase_unit_code IN ('M5007','M5008','M5009', 'M50010', 'M50011','M50012','M50013','M50014','M50015','M50016','M50017','M50018','M50019','M50020','M50021','M50022', + 'M50023','M50024','M50025','M54002','M54003','M54004','MS02009','M30020','M30021') +GO diff --git a/health/chronic_coronary_heart_disease.sql b/health/chronic_coronary_heart_disease.sql new file mode 100644 index 0000000..99f1ab5 Binary files /dev/null and b/health/chronic_coronary_heart_disease.sql differ diff --git a/health/chronic_diabetes.sql b/health/chronic_diabetes.sql new file mode 100644 index 0000000..a20dc9d --- /dev/null +++ b/health/chronic_diabetes.sql @@ -0,0 +1,272 @@ +/************************************************************************************************** +Title: Diabetes +Author: Simon Anastasiadis +Modified: Penny Mok + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Diagnosis at a hospital with Diabetes or dispensing of diabetes drugs. + +Intended purpose: +Determine who has been diagnosed with the chronic condition diabetes. +And when they were diagnosed. +Inputs & Dependencies: +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_diag] +- [IDI_Clean].[moh_clean].[priv_fund_hosp_discharges_diag] +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_event] +- [IDI_Clean].[moh_clean].[priv_fund_hosp_discharges_event] +- [IDI_Clean].[moh_clean].[nnpac] +- [IDI_Clean].[moh_clean].[pharmaceutical] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_dim_form_pack_subsidy_code] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[defn_chronic_coronary_heart_disease] + + +Notes: +1) In the September 2018 refresh notes: + "The data contained in the [moh.clean].[chronic_condition] table has changed due to some data + sources being too outdated to provide value for researchers. COPD and CHD are no longer included + in this table, and alternatives should be used to identify these conditions. The remaining + conditions have been updated. Diabetes now uses data from the updated Virtual Diabetes + Register (VDR) methodology (v686) and contains data from the VDR 2017." + IDI wiki Source: + wprdtfs05/sites/DefaultProjectCollection/IDI/IDIwiki/UserWiki/Documents/September%202018%20IDI%20Refresh%20Updates.pdf + However, we do not have access to the VDR within the IDI. +2) We have constructed this definition from the description given in the MoH IDI Data dictionary. + This includes a list of diagnosis and proceedure codes, as well as a list of pharmaceuticals. +3) Testing against Chronic condition table in the 2018-07-20 refresh suggests high consistency. +4) To reduce the amount of data written/copied during the construction of these tables, we have + commented out non-critical fields (lines starting with "--"). Uncommenting these lines is + recommended is validating the construction/definition. +5) The [end_date] in this table is the end of the hospital visit when diagnosis took place, + NOT the date that the chronic condition ended. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +Issues: + +History (reverse order): +2020-05-26 SA v1 + +**************************************************************************************************/ + +/* Clear before creation */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_pfhd_chronic_diags]; +GO + +/************************************ publically funded hospital discharages ************************************/ +SELECT [moh_dia_event_id_nbr] AS [event_id] + --,[moh_dia_clinical_sys_code] + --,[moh_dia_submitted_system_code] + --,[moh_dia_diagnosis_type_code] + --,[moh_dia_clinical_code] + ,'pub_diab' AS [source] +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_pfhd_chronic_diags] +FROM [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_diag] +WHERE [moh_dia_submitted_system_code] = [moh_dia_clinical_sys_code] /* higher accuracy when systems match */ +AND [moh_dia_diagnosis_type_code] IN ('A', 'B') /* diagnosies */ +AND [moh_dia_clinical_sys_code] IN ('10', '11', '12', '13', '14') /* ICD-10-AM */ +AND ( + SUBSTRING([moh_dia_clinical_code], 1, 3) IN ( + 'E10' /* Type 1 DM */ + ,'E11' /* TYPE 2 DM */ + ,'E13' /* Other specified DM */ + ,'E14' /* Unspecified DM */ + ) + OR SUBSTRING([moh_dia_clinical_code], 1, 4) IN ('O240', 'O241', 'O242', 'O243') /* pre-existing diabetes in pregnancy */ +) + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_pfhd_chronic_diags] ([event_id]); +GO + +/* Clear before creation */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_vfhd_chronic_diags]; +GO + +/************************************ privately funded hospital discharages ************************************/ +SELECT [moh_pri_diag_event_id_nbr] AS [event_id] + --,[moh_pri_diag_clinic_sys_code] + --,[moh_pri_diag_sub_sys_code] + --,[moh_pri_diag_diag_type_code] + --,[moh_pri_diag_clinic_code] + ,'priv_diab' AS [source] +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_vfhd_chronic_diags] +FROM [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_diag] +WHERE [moh_pri_diag_sub_sys_code] = [moh_pri_diag_clinic_sys_code] /* higher accuracy when systems match */ +AND [moh_pri_diag_diag_type_code] IN ('A', 'B') /* diagnosies */ +AND [moh_pri_diag_clinic_sys_code] IN ('10', '11', '12', '13','14') /* ICD-10-AM */ +AND ( + SUBSTRING([moh_pri_diag_clinic_code], 1, 3) IN ( + 'E10' /* Type 1 DM */ + ,'E11' /* TYPE 2 DM */ + ,'E13' /* Other specified DM */ + ,'E14' /* Unspecified DM */ + ) + OR SUBSTRING([moh_pri_diag_clinic_code], 1, 4) IN ('O240', 'O241', 'O242', 'O243') /* pre-existing diabetes in pregnancy */ +) + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_vfhd_chronic_diags] ([event_id]); +GO + +/* Clear before creation */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_pharm_chronic_diags]; +GO + +/************************************ pharmaceuticals ************************************/ +/* Ignores one chemical IDs: +1794 - Metformin hydrochloride +The chronic table also includes this chemical, but excludes women aged 12-45 who may have only been dispensed Metformin +AND do not meet any of the other criteria. +Note: This is intended to exclude women age 12-45 whom may have polycystic ovary syndrome treated with metformin. +*/ + +SELECT [snz_uid] + ,MIN([moh_pha_dispensed_date]) AS [moh_pha_dispensed_date] + --,[moh_pha_dim_form_pack_code] + --,[DIM_FORM_PACK_SUBSIDY_KEY] + --,[CHEMICAL_ID] + --,[CHEMICAL_NAME] + --,[FORMULATION_ID] + --,[FORMULATION_NAME] +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_pharm_chronic_diags] +FROM [IDI_Clean_YYYYMM].[moh_clean].[pharmaceutical] a +LEFT JOIN [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_dim_form_pack_subsidy_code] b +ON a.[moh_pha_dim_form_pack_code] = b.[DIM_FORM_PACK_SUBSIDY_KEY] +WHERE snz_uid <> -1 /* remove non-personal identities */ +AND [CHEMICAL_ID] IN ( + 1192 /* Insulin lispro */ + ,1570 /* Glucagon hydrochloride */ + ,1648 /* Insulin Neutral */ + ,1649 /* Insulin isophane */ + ,1655 /* Insulin zinc suspension */ + ,3783 /* Insulin aspart */ + ,3857 /* Insulin glargine */ + ,6300 /* Insulin isophane with insulin neutral */ + ,1068 /* Chlorpropamide */ + ,1247 /* Acarbose */ + ,1567 /* Glibenclamide */ + ,1568 /* Gliclazide */ + ,1569 /* Glipizide */ + ,2276 /* Tolazamide */ + ,2277 /* Tolbutamide */ + ,3739 /* Rosiglitazone */ + ,3800 /* Pioglitazone */ + + ,1794 /* Metformin hydrochloride --added based on advise by MOH but won't have much effect on Seniors project*/ + ,3882 /* Insulin lispro with insulin lispro protamine */ + ,3908 /* Insulin glulisine */ + ,4103 /* Vildagliptin */ + ,4104 /* Vildagliptin with metformin hydrochloride */ + ,4137 /* Empagliflozin */ + ,4138 /* Empagliflozin with metformin hydrochloride */ + +) +GROUP BY [snz_uid] + --,[moh_pha_dim_form_pack_code] + --,[DIM_FORM_PACK_SUBSIDY_KEY] + --,[CHEMICAL_ID] + --,[CHEMICAL_NAME] + --,[FORMULATION_ID] + --,[FORMULATION_NAME] + + +/* Clear before creation */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_nnpac_chronic_diags]; +GO + +/************************************ National Non-Admitted Patient Collection ************************************/ +SELECT [snz_uid] + ,MIN([moh_nnp_service_date]) AS [moh_nnp_service_date] +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_nnpac_chronic_diags] +FROM [IDI_Clean_YYYYMM].[moh_clean].[nnpac] +WHERE [moh_nnp_purchase_unit_code] IN ('M20006', 'M20007') +AND [moh_nnp_attendence_code] = 'ATT' /* attended */ +GROUP BY [snz_uid] + +/* Clear before creation */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_chronic_diabetes]; +GO + +/************************************ combined final table ************************************/ + +SELECT * +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_chronic_diabetes] +FROM ( +/* public */ +SELECT [snz_uid] + --,[event_id] + ,[source] + ,[moh_evt_evst_date] AS [start_date] + ,[moh_evt_even_date] AS [end_date] +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_pfhd_chronic_diags] a +INNER JOIN [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_event] b +ON a.[event_id] = b.[moh_evt_event_id_nbr] + +UNION ALL + +/* private */ +SELECT [snz_uid] + --,[event_id] + ,[source] + ,[moh_pri_evt_start_date] AS [start_date] + ,[moh_pri_evt_end_date] AS [end_date] +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_vfhd_chronic_diags] a +INNER JOIN [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_event] b +ON a.[event_id] = b.[moh_pri_evt_event_id_nbr] + +UNION ALL + +/* pharmaceuticals */ +SELECT [snz_uid] + --,NULL AS [event_id] + ,'pha_diab' AS [source] + ,[moh_pha_dispensed_date] AS [start_date] + ,[moh_pha_dispensed_date] AS [end_date] +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_pharm_chronic_diags] + +UNION ALL + + /* Non-admittted hospital patients */ + SELECT [snz_uid] + --,NULL AS [event_id] + ,'out_diab' AS [source] + ,[moh_nnp_service_date] AS [start_date] + ,[moh_nnp_service_date] AS [end_date] +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_nnpac_chronic_diags] + +) k + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[defn_chronic_diabetes] ([snz_uid]); +GO +/* Compress final table to save space */ +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[defn_chronic_diabetes] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE); +GO + +/************************************ tidy tempoary tables away ************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_pfhd_chronic_diags]; +GO +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_vfhd_chronic_diags]; +GO +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_pharm_chronic_diags]; +GO +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_tmp_nnpac_chronic_diags]; +GO diff --git a/health/chronic_obstructive_pulmonary_disease.sql b/health/chronic_obstructive_pulmonary_disease.sql new file mode 100644 index 0000000..c1e940a Binary files /dev/null and b/health/chronic_obstructive_pulmonary_disease.sql differ diff --git a/health/covid_immunisation_register.sql b/health/covid_immunisation_register.sql new file mode 100644 index 0000000..3199737 --- /dev/null +++ b/health/covid_immunisation_register.sql @@ -0,0 +1,149 @@ +/************************************************************************************************** +Title: Covid-19 Immunisation Register +Author: Shaan Badenhorst + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions +Steven Johnston provided comments on the definition. + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Immunisation Register for NZ Covid-19 vaccination programme. + +Intended purpose: +Determining who has been vaccinated, how many vaccinations they have received, +and when & where their vaccinations took place. + +Inputs & Dependencies: +- [IDI_Adhoc].[clean_read_MOH_CIR].[moh_CIR_vaccination_activity20211123] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_clean_moh_CIR_vaccination] + +Notes: +1) Overwhelming majority of doses are Pfizer BioNTech. + <20 records are for NULL or other vaccine type. + +2) When dose numbers checked 12/10/2021 no one had more than 2 doses recorded. + +3) When link rates checked 12/10/2021: + - 3.4 million identities linked to snz_uid values + - 50,000 identities did not link to snz_uid + Note that MOH are improving quality and link rates are improving over time. + +4) There are multiple version of the CIR. Input table requires renaming as new versions + are loaded. Custom naming of output table may also be advised in case of multiple + versions. + +5) The concept of vaccination status is likely to evolve over time as immunity wears off + and boosters are introduced. Hence it is important to keep in touch with what the + vaccination programme are doing. + +6) Since the creation of this definition, Stats NZ have done addition work to improve the link + rate for the CIR. This script only picks up links from [snz_moh_uid] to [snz_uid] but recent + CIR tables include additional links direct to [snz_uid]. Users of the CIR are advised to + use the improved links by Stats NZ. These come from: + - linking of NHIs that appeared for the first time in the vaccination data, + - finding links for NHIs in the CIR data that had not previously been linked. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = vacc_ + Project schema = DL-MAA20XX-YY + +Issues: +1) Dose numbers in the CIR data can contain errors. For example, a person might have no dose 1 + and two dose 2, or one dose 1 and one dose 3 but no dose 2. It is now recommended to ignore dose + number and instead count the number of doses and turn this into vaccination status: + 1 dose => partially vaccinated, 2 or more doses => fully vaccinated. + +History (reverse order): +2021-10-12 SB + +**************************************************************************************************/ + +-- Pivot to create two columns for dose 1 and dose 2 +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_cir_pivot] +GO + +SELECT snz_moh_uid + ,MIN(IIF(dose_number = 1, activity_date, NULL)) as Dose_1_date + ,MIN(IIF(dose_number = 2, activity_date, NULL)) as Dose_2_date + ,MIN(IIF(dose_number = 3, activity_date, NULL)) as Dose_3_date + ,MIN(IIF(dose_number = 1, dhbofservice, NULL)) as Dose_1_DHB + ,MIN(IIF(dose_number = 2, dhbofservice, NULL)) as Dose_2_DHB + ,MIN(IIF(dose_number = 3, dhbofservice, NULL)) as Dose_3_DHB + ,[sequence_group] +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_cir_pivot] +FROM [IDI_Adhoc].[clean_read_MOH_CIR].[moh_CIR_vaccination_activity20211123] +GROUP BY snz_moh_uid, [sequence_group] +GO + +CREATE NONCLUSTERED INDEX my_index ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_cir_pivot] (snz_moh_uid) +GO + +-- Merge on snz_uids +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_cir_join_uids] +GO + +SELECT COALESCE(b.snz_uid,c.snz_uid) AS snz_uid + ,a.snz_moh_uid + ,Dose_1_date + ,Dose_2_date + ,Dose_3_date + ,Dose_1_DHB + ,Dose_2_DHB + ,Dose_3_DHB + ,[sequence_group] +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_cir_join_uids] +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_cir_pivot] AS a +LEFT JOIN [IDI_Clean_YYYYMM].[moh_clean].[pop_cohort_demographics] AS b +ON a.snz_moh_uid = b.snz_moh_uid +LEFT JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS c +ON a.snz_moh_uid = c.snz_moh_uid + +-- Present as final +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_clean_moh_CIR_vaccination]; +GO + +SELECT snz_uid + ,snz_moh_uid + ,Dose_1_date + ,Dose_2_date + ,Dose_3_date + ,Dose_1_DHB + ,Dose_2_DHB + ,Dose_3_DHB + ,[sequence_group] + ,CAST(DAY(Dose_1_date) AS INT) AS Dose_1_Day + ,CAST(MONTH(Dose_1_date) AS INT) AS Dose_1_Month + ,CAST(YEAR(Dose_1_date) AS INT) AS Dose_1_Year + ,CAST(DAY(Dose_2_date) AS INT) AS Dose_2_Day + ,CAST(MONTH(Dose_2_date) AS INT) AS Dose_2_Month + ,CAST(YEAR(Dose_2_date) AS INT) AS Dose_2_Year + ,CAST(DAY(Dose_3_date) AS INT) AS Dose_3_Day + ,CAST(MONTH(Dose_3_date) AS INT) AS Dose_3_Month + ,CAST(YEAR(Dose_3_date) AS INT) AS Dose_3_Year + ,IIF(snz_uid IS NOT NULL, 1,0) as snz_uid_chk + ,IIF(Dose_1_date IS NULL, 1,0) as Dose_1_NULL_chk +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_clean_moh_CIR_vaccination] +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_cir_join_uids] + +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_clean_moh_CIR_vaccination] (snz_uid); +GO +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_clean_moh_CIR_vaccination] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE) +GO + +-- Remove temporary tables +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_cir_pivot] +GO +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_cir_join_uids] +GO diff --git a/health/diagnosis_codes.csv b/health/diagnosis_codes.csv new file mode 100644 index 0000000..3c0f6a5 --- /dev/null +++ b/health/diagnosis_codes.csv @@ -0,0 +1,468 @@ +diagnosis,code_type,code,aux,explanation +alcohol_abuse,offence,6121,,Trespass Alcoholism And Drug Addiction Act +alcohol_abuse,offence,8110,,Driving Under Influence +alcohol_abuse,offence,8111,,Driving Causing Injury Through Drink +alcohol_abuse,offence,8113,,Driving Causing Injury Through Drink And Drug +alcohol_abuse,offence,8114,,Driving Causing Death Through Drink +alcohol_abuse,offence,8116,,Driving Causing Death Through Drink And Drug +alcohol_abuse,offence,8117,,Causing Injury Through Excess Breath/Alcohol +alcohol_abuse,offence,8118,,Causing Death Through Excess Breath/Alcohol +alcohol_abuse,offence,8120,,Driving Under Influence +alcohol_abuse,offence,8121,,Causing Injury Through Excess Blood/Alcohol +alcohol_abuse,offence,8122,,Causing Death Through Excess Blood/Alcohol +alcohol_abuse,offence,8123,,Careless Use Causing Injury Through Drink +alcohol_abuse,offence,8125,,Careless Use Causing Death Through Drink +alcohol_abuse,offence,8127,,Drove With Excess Breath/Alcohol +alcohol_abuse,offence,8128,,Drove With Excess Blood Alcohol +alcohol_abuse,offence,8130,,Driving Under Influence +alcohol_abuse,offence,8131,,Driving Under The Influence Of Drink +alcohol_abuse,offence,8133,,Driving Under The Influence Of Drink And Drug +alcohol_abuse,offence,8134,,In Charge Motor Vehicle Under Influence Drink +alcohol_abuse,offence,8136,,In Charge Motor Vehicle Under Influence Drink & Drug +alcohol_abuse,offence,8190,,Driving Under Influence Drink/Drug +alcohol_abuse,offence,8191,,Licensed Person Driving Excess Blood Alcohol Level +alcohol_abuse,offence,8192,,Unlicensed Person Driving Excess Blood Alcoh Level +alcohol_abuse,offence,8193,,Licensed Person Driving Excess Breath Alcohol Level +alcohol_abuse,offence,8194,,Unlicensed Person Drive Excess Breath Alcoh Level +alcohol_abuse,offence,8910,,Driving Under Influence +alcohol_abuse,offence,8911,,Cause Bodily Injury Through Drink +alcohol_abuse,offence,8912,,Cause Death Through Drink +alcohol_abuse,offence,8915,,Drive Under The Influence Of Drink +alcohol_abuse,offence,8917,,Drive With Excess Blood Alcohol +alcohol_abuse,offence,8918,,Drive With Excess Breath Alcohol +alcohol_abuse,offence,8919,,Other Drive Under Influence +alcohol_abuse,offence,8920,,Attempts To Drive Under Influence +alcohol_abuse,offence,8921,,Attempts To Drive Und Infldrink +alcohol_abuse,offence,8923,,Attempts To Drive With Ex Bl Alcohol +alcohol_abuse,offence,8924,,Attempt Drive Excess Breath Alcohol +alcohol_abuse,offence,8929,,Other Attempts To Drive Und Influence +alcohol_abuse,offence,8930,,In Charge Under Influence +alcohol_abuse,offence,8931,,In Charge Under Influence Drink +alcohol_abuse,offence,8939,,Other In Charge Under Influence +alcohol_abuse,offence,A100,,Driving Under The Influence Of Drink Or Drug +alcohol_abuse,offence,A101,,Driving Under The Influence Of Drink +alcohol_abuse,offence,A103,,Cause Injury While Under Influence Of Drink +alcohol_abuse,offence,A105,,Cause Death While Under Influence Of Drink +alcohol_abuse,offence,A107,,Excess Breath Alcohol Causing Injury +alcohol_abuse,offence,A108,,Excess Breath Alcohol Causing Death +alcohol_abuse,offence,A109,,Driving Under The Influence Of Drink/Drug Or Both +alcohol_abuse,offence,A110,,Cause Injury While Under Influence Of Drink/Drug +alcohol_abuse,offence,A111,,Cause Death While Under Infl Of Drink/Drug +alcohol_abuse,offence,A113,,Careless driving while under influence of drink or drug causing death +alcohol_abuse,offence,A114,,Careless driving while under influence of drink or drug causing injury +alcohol_abuse,offence,A130,,Drove Under Influence Drink Or Drugs - 3Rd Or Subsequent +alcohol_abuse,offence,A202,,Alcohol Interlock Licence Holder Offended In Relation To Interlock Device +alcohol_abuse,offence,A204,,Alcohol Interlock Licensee's Breath Contained Alcohol - Not Over 400 mcgs +alcohol_abuse,offence,A205,,Zero Alcohol Licensee's Breath Contained Alcohol - Not Over 400 mcgs +alcohol_abuse,offence,A206,,Alcohol Interlock Licensee's Blood Contained Alcohol - Not Over 80 mcgs +alcohol_abuse,offence,A207,,Zero Alcohol Licensee's Blood Contained Alcohol - Not Over 80 mcgs +alcohol_abuse,offence,A208,,Alcohol Interlock Licensee's Breath Contained Alcohol - Over 400 mcgs +alcohol_abuse,offence,A209,,Zero Alcohol Licensee's Breath Contained Alcohol - Over 400 mcgs +alcohol_abuse,offence,A210,,Alcohol Interlock Licensee's Blood Contained Alcohol - Over 80 mcgs +alcohol_abuse,offence,A211,,Zero Alcohol Licensee's Blood Contained Alcohol - Over 80 mgms +alcohol_abuse,offence,A212,,Alcohol Interlock Licensee's Breath Contained Alcohol - not over 250mcgs +alcohol_abuse,offence,A213,,Zero Alcohol Licensee's Breath Contained Alcohol - not over 250mcgs +alcohol_abuse,offence,A214,,Alcohol Interlock Licensee's Blood Contained Alcohol - not over 50mgms +alcohol_abuse,offence,A215,,Zero Alcohol Licensee's Blood Contained Alcohol - not over 50mgms +alcohol_abuse,offence,A216,,Alcohol Interlock Licensee's Breath Contained Alcohol - over 250mcgs +alcohol_abuse,offence,A217,,Zero Alcohol Licensee's Breath Contained Alcohol - over 250mcgs +alcohol_abuse,offence,A218,,Alcohol Interlock Licensee's Blood Contained Alcohol - over 50mgms +alcohol_abuse,offence,A219,,Zero Alcohol Licensee's Blood Contained Alcohol - over 50mgms +alcohol_abuse,offence,A300,,Blood Alcohol Offences +alcohol_abuse,offence,A301,,Driving With Excess Blood Alcohol Level +alcohol_abuse,offence,A302,,Driving With Excess Blood Alcohol Concentration +alcohol_abuse,offence,A303,,Cause Injury Driving Excess Blood Alcohol +alcohol_abuse,offence,A304,,Causing Death Driving Excess Blood Alcohol +alcohol_abuse,offence,A314,,Driving With Excess Blood Alcohol Level Exceeds 200 +alcohol_abuse,offence,A320,,Licensed Person With Excess Blood Alcohol Level +alcohol_abuse,offence,A322,,Unlicensed Person Driving Excess Blood Alc +alcohol_abuse,offence,A323,,Driving With Excess Blood Alcohol Content +alcohol_abuse,offence,A324,,Person Und 20 Yr Exceeded Blood Alcohol Limit +alcohol_abuse,offence,A326,,Person Under 20's Blood Contained Alcohol - Over 30 mgm +alcohol_abuse,offence,A328,,Person Under 20's Blood Contained Alcohol - 30mgm or Less +alcohol_abuse,offence,A330,,Drove With Excess Blood Alcohol 3Rd Or Subsequent +alcohol_abuse,offence,A334,,Excess Blood Alcohol Causing Injury - 3Rd Or Subsequent +alcohol_abuse,offence,A335,,Excess Blood Alcohol Causing Death - 3Rd Or Subsequent +alcohol_abuse,offence,A336,,Blood Alcohol Level Exceeded 50mgm but not more than 80mgm +alcohol_abuse,offence,A338,,Blood Alcohol Level Exceeded 50mgm But No More Than 80mgm - Refused EBT +alcohol_abuse,offence,A340,,Excess blood alcohol causing death - 3rd or subsequent +alcohol_abuse,offence,A501,,Driving With Excess Breath Alcohol Level +alcohol_abuse,offence,A507,,Unlicensed Person Driving Excess Breath Alc +alcohol_abuse,offence,A514,,Driving With Excess Breath Alcohol Level Exceeds 1000 +alcohol_abuse,offence,A515,,Licensed Person Driving With Excess Breath Alcohol Level +alcohol_abuse,offence,A518,,Breath alcohol level over 400 mcgs per litre of breath +alcohol_abuse,offence,A519,,Person Und 20 Yr Exceed Breath Alcohol Limit +alcohol_abuse,offence,A521,,Person Under 20's Blood Contained Alcohol - Over 150mcg +alcohol_abuse,offence,A523,,Person Under 20's Blood Contained Alcohol - 150 mcg Or Less +alcohol_abuse,offence,A525,,Breath Alcohol Level Exceeded 250mcgs But Not More Than 400mcgs +alcohol_abuse,offence,A530,,Drove With Excess Breath Alcohol 3Rd Or Subsequent +alcohol_abuse,offence,A531,,Excess Breath Alcohol Causing Injury - 3Rd Or Subsequent +alcohol_abuse,offence,A532,,Excess Breath Alcohol Causing Death - 3Rd Or Subsequent +alcohol_abuse,offence,A533,,Excess breath alcohol causing death - 3rd or subsequent +alcohol_abuse,offence,A601,,Transport Service Driver Under The Influence Of Drink +alcohol_abuse,offence,A603,,In Charge Transport Service Vehicle Under Influence Of Drink Causing Injury +alcohol_abuse,offence,A605,,In Charge Transport Service Vehicle Under Influence Of Drink Causing Death +alcohol_abuse,offence,A607,,Excess Breath Alcohol Causing Injury - In Charge Transport Service Vehicle +alcohol_abuse,offence,A608,,Excess Breath Alcohol Causing Death - In Charge Transport Service Vehicle +alcohol_abuse,offence,A609,,Drove In Transport Service Under The Influence Of Drink/Drug Or Both +alcohol_abuse,offence,A610,,In Charge Transport Service Vehicle Under Influence Drink/Drug Causing Injury +alcohol_abuse,offence,A611,,In Charge Transport Service Vehicle Under Influence Drink/Drug Causing Death +alcohol_abuse,offence,A613,,Drove In Transport Service Under Influence Drink Or Drug - 3Rd Or Subsequent +alcohol_abuse,offence,A614,,Excess Blood Alcohol Causing Injury - In Charge Transport Service Vehicle +alcohol_abuse,offence,A615,,Excess Blood Alcohol Causing Death - In Charge Transport Service Vehicle +alcohol_abuse,offence,A627,,Drove In Transport Service With Excess Blood Alcohol Content +alcohol_abuse,offence,A628,,Transport Service Driver Under 20 Exceeded Blood Alcohol Limit +alcohol_abuse,offence,A629,,Drove In Transport Service With Excess Blood Alcohol - 3Rd Or Subsequent +alcohol_abuse,offence,A637,,Transport Service Driver Breath Alcohol Level Over 400 Mcgs Per Litre +alcohol_abuse,offence,A638,,Transport Service Driver Under 20 Exceeded Breath Alcohol Limit +alcohol_abuse,offence,A640,,Drove In Transport Service With Excess Breath Alcohol - 3Rd Or Subsequent +alcohol_abuse,offence,A641,,Transport Service Driver Excess Breath Alcohol Causing Injury - 3Rd Or Subs +alcohol_abuse,offence,A642,,Transport Service Driver Excess Breath Alcohol Causing Death - 3Rd Or Subs +alcohol_abuse,offence,A644,,Transport Service Driver Excess Blood Alcohol Causing Injury - 3Rd Or Sub +alcohol_abuse,offence,A645,,Transport Service Driver Excess Blood Alcohol Causing Death - 3Rd Or Sub +alcohol_abuse,offence,A660,,Transport Driver Under 20's Blood Contained Alcohol - Over 30 mcg +alcohol_abuse,offence,A661,,Transport Driver Under-20's Blood Contained Alcohol - 30mgm Or Less +alcohol_abuse,offence,A662,,Transport Driver Under 20 Breath Contained Alcohol - Over 150 mcg +alcohol_abuse,offence,A663,,Transport Driver Under 20 Breath Contained Alcohol - 150 mcg Or Less +alcohol_abuse,offence,A668,,Transport service driver excess breath alcohol causing death - 3rd or sub +alcohol_abuse,offence,A669,,Careless under influence drink or drug caused death transport service +alcohol_abuse,offence,A670,,Careless under influence drink or drug caused injury transport service +alcohol_abuse,offence,A671,,Transport service driver - excess blood alcohol caused death - 3rd or sub +alcohol_abuse,offence,D507,,Careless Driving Alcohol Involved Causing Injury +alcohol_abuse,offence,D508,,Careless Driving Alcohol Involved Causing Death +alcohol_abuse,offence,D513,,Careless Driving Alcohol Inv Cause Injury +alcohol_abuse,offence,D515,,Careless Driving Drink/Drug Inv Cause Injury +alcohol_abuse,offence,D516,,Careless Driving Alcohol Inv Cause Death +alcohol_abuse,offence,D518,,Careless Driving Drink/Drug Inv Cause Death +alcohol_abuse,offence,L208,,Drove Contrary To An Alcohol Interlock License +alcohol_abuse,offence,L209,,Drove Contrary To A Zero Alcohol License +alcohol_abuse,offence,L233,,Drove contrary to terms of alcohol interlock licence - 3rd or subsequent +alcohol_abuse,offence,L234,,Drove contrary to terms of zero alcohol licence - 3rd or subsequent +alcohol_abuse,offence,L502,,Learner Driver With Breath Alcohol Level Exceeding 150 +alcohol_abuse,offence,L503,,Learner Driver With Blood Alcohol Level Exceeding 30 +alcohol_abuse,offence,L521,,Restricted Driver With Breath Alcohol Level Exceeding 150 +alcohol_abuse,offence,L522,,Restricted Driver With Blood Alcohol Level Exceeding 30 +alcohol_abuse,offence,W606,,Person Under Purchase Age Bought Alcohol +alcohol_abuse,offence,W656,,Brought Alcohol Into An Alcohol Banned Area +alcohol_abuse,offence,Y998,,Causing Injury Driving With Excess Blood Alcohol Level +alcohol_abuse,DSM,291,,"Alcohol induced disorder (delirium, amnestic, dementia, etc.)" +alcohol_abuse,DSM,3030,,Alcohol Intoxication +alcohol_abuse,DSM,3039,,Alcohol Dependence +alcohol_abuse,DSM,3050,,Alcohol Abuse +alcohol_abuse,DSM,3575,,Alcoholic polyneuropathy +alcohol_abuse,DSM,4255,,Alcoholic cardiomyopathy +alcohol_abuse,DSM,5353,,Alcoholic gastritis +alcohol_abuse,DSM,5710,,Alcoholic fatty liver +alcohol_abuse,DSM,5711,,Acute alcoholic hepatitis +alcohol_abuse,DSM,5712,,Alcoholic cirrhosis of liver +alcohol_abuse,DSM,5713,,Alcoholic liver damage +alcohol_abuse,DSM,9800,,Toxic effect of ethyl alcohol +alcohol_abuse,DSM,9801,,Undetermined poisoning by barbiturates +alcohol_abuse,external_DSM,8600,,Accidental poisoning by alcoholic beverages +alcohol_abuse,external_DSM,8601,,Accid poisoning by other/unspec ethyl alcohol products +alcohol_abuse,external_DSM,8602,,Accidental poisoning by methyl alcohol +alcohol_abuse,external_DSM,8609,,Accidental poisoning by unspecified alcohol +alcohol_abuse,external_ICD10,X45,,Accidental poisoning by and exposure to alcohol +alcohol_abuse,external_ICD10,Y15,,"Poisoning by and exposure to alcohol, undetermined intent" +alcohol_abuse,external_ICD9,8600,,Accidental poisoning by alcoholic beverages +alcohol_abuse,external_ICD9,8601,,Accid poisoning by other/unspec ethyl alcohol products +alcohol_abuse,external_ICD9,8602,,Accidental poisoning by methyl alcohol +alcohol_abuse,external_ICD9,8609,,Accidental poisoning by unspecified alcohol +alcohol_abuse,ICD10,F10,,Mental and behavioural disorders due to use of alcohol +alcohol_abuse,ICD10,G312,,Degeneration of nervous system due to alcohol +alcohol_abuse,ICD10,G621,,Alcoholic polyneuropathy +alcohol_abuse,ICD10,G721,,Alcoholic myopathy +alcohol_abuse,ICD10,I426,,Alcoholic cardiomyopathy +alcohol_abuse,ICD10,K292,,Alcoholic gastritis +alcohol_abuse,ICD10,K700,,Alcoholic fatty liver +alcohol_abuse,ICD10,K701,,Alcoholic hepatitis +alcohol_abuse,ICD10,K702,,Alcoholic fibrosis and sclerosis of liver +alcohol_abuse,ICD10,K703,,Alcoholic cirrhosis of liver +alcohol_abuse,ICD10,K704,,Alcoholic hepatic failure +alcohol_abuse,ICD10,K709,,"Alcoholic liver disease, unspecified" +alcohol_abuse,ICD10,K852,,Alcohol-induced acute pancreatitis +alcohol_abuse,ICD10,K860,,"Alcoholic liver disease, unspecified" +alcohol_abuse,ICD10,X45,,Accidental poisoning by and exposure to alcohol +alcohol_abuse,ICD10,Y15,,"Poisoning by and exposure to alcohol, undetermined intent" +alcohol_abuse,ICD9,291,,"Alcohol (delirium, amnestic, dementia, etc)" +alcohol_abuse,ICD9,3030,,Acute alcoholic intoxication +alcohol_abuse,ICD9,3039,,Other and unspecified alcohol dependence +alcohol_abuse,ICD9,3050,,Alcohol use disorder +alcohol_abuse,ICD9,3575,,Alcoholic polyneuropathy +alcohol_abuse,ICD9,4255,,Alcoholic cardiomyopathy +alcohol_abuse,ICD9,5353,,Alcoholic gastritis +alcohol_abuse,ICD9,5710,,Alcoholic fatty liver +alcohol_abuse,ICD9,5711,,Acute alcoholic hepatitis +alcohol_abuse,ICD9,5712,,Alcoholic cirrhosis of liver +alcohol_abuse,ICD9,5713,,"Alcoholic liver damage, unspecified" +alcohol_abuse,ICD9,9800,,Toxic effect of ethyl alcohol +alcohol_abuse,ICD9,9801,,Undetermined poisoning by barbiturates +alcohol_abuse,incapacity,7,,"Mental disorders, substance abuse, alcohol" +alcohol_abuse,incapacity,170,,"Mental disorders, substance abuse, alcohol" +alcohol_abuse,pharm_chemical,1252,multiple,Naltrexone hydrochloride +alcohol_abuse,pharm_chemical,1432,sole,Disulfiram +alcohol_abuse,pharm_chemical,3793,multiple,Naltrexone hydrochloride +anxiety,DSM,30002,,Generalized anxiety disorder +anxiety,ICD10,F411,,Generalized anxiety disorder +anxiety,ICD9,30002,,Generalized anxiety disorder +anxiety,incapacity,165,,"Mental disorders, other psychological conditions" +anxiety,incapacity,009,,"Mental disorders, other psychological conditions" +anxiety,pharm_chemical,1911,multiple,Oxazepam +anxiety,pharm_chemical,2632,multiple,Alprazolam +anxiety,pharm_chemical,6006,sole,Buspirone hydrochloride +anxiety,socrates,1302,,Anxiety disorder +bipolar,DSM,2960,,"Bipolar disorder, single manic episode" +bipolar,DSM,2961,,"Manic affective disorder, recurrent episode" +bipolar,DSM,2964,,"Bipolar affective disorder, manic" +bipolar,DSM,2965,,"Bipolar affective disorder, depressed" +bipolar,DSM,2966,,"Bipolar affective disorder, mixed" +bipolar,DSM,2967,,"Bipolar affective disorder, unspecified" +bipolar,DSM,29680,,"Manic-depressive psychosis, unspecified" +bipolar,DSM,29681,,Atypical manic disorder +bipolar,DSM,29689,,"Manic-depressive psychosis, other" +bipolar,DSM,30113,,Cyclothymic disorder +bipolar,ICD10,F30,,Manic episode +bipolar,ICD10,F31,,Bipolar disorder +bipolar,ICD10,F340,,Cyclothymic disorder +bipolar,ICD9,2960,,"Bipolar disorder, single manic episode" +bipolar,ICD9,2961,,"Manic affective disorder, recurrent episode" +bipolar,ICD9,2964,,"Bipolar affective disorder, manic" +bipolar,ICD9,2965,,"Bipolar affective disorder, depressed" +bipolar,ICD9,2966,,"Bipolar affective disorder, mixed" +bipolar,ICD9,2967,,"Bipolar affective disorder, unspecified" +bipolar,ICD9,29680,,"Manic-depressive psychosis, unspecified" +bipolar,ICD9,29681,,Atypical manic disorder +bipolar,ICD9,29689,,"Manic-depressive psychosis, other" +bipolar,ICD9,30113,,Cyclothymic disorder +bipolar,incapacity,162,,Bipolar disorder +bipolar,lab_test,BM2,,Lithium +bipolar,pharm_chemical,1011,multiple,Risperidone +bipolar,pharm_chemical,1140,multiple,Olanzapine +bipolar,pharm_chemical,1183,multiple,Quetiapine +bipolar,pharm_chemical,2466,sole,Lithium carbonate +bipolar,pharm_chemical,3873,multiple,Ziprasidone +bipolar,pharm_chemical,3878,multiple,Aripiprazole +bipolar,pharm_chemical,3940,multiple,Olanzapine pamoate monohydrate +bipolar,pharm_formulation,131601,multiple,Clonazepam +bipolar,pharm_formulation,131602,multiple,Clonazepam +bipolar,socrates,1303,,Bipolar disorder (manic depression) +depressive,DSM,311,,"Depressive disorder, not elsewhere classified" +depressive,DSM,2962,,"Major depressive affective disorder, single episode" +depressive,DSM,2963,,"Major depressive affective disorder, recurrent episode" +depressive,ICD10,F32,,Depressive episode +depressive,ICD10,F33,,"Major depressive disorder, recurrent" +depressive,ICD9,311,,"Depressive disorder, not elsewhere classified" +depressive,ICD9,2962,,"Major depressive affective disorder, single episode" +depressive,ICD9,2963,,"Major depressive affective disorder, recurrent episode" +depressive,incapacity,161,,Depression +depressive,pharm_chemical,1180,multiple,Venlafaxine +depressive,pharm_chemical,1190,sole,Citalopram hydrobromide (Celapram) +depressive,pharm_chemical,1193,sole,Citalopram hydrobromide +depressive,pharm_chemical,2636,multiple,Fluoxetine hydrochloride +depressive,pharm_chemical,3785,multiple,Venlafaxine +depressive,pharm_chemical,3892,multiple,Bupropion hydrochloride +depressive,pharm_chemical,3901,multiple,Mirtazapine +depressive,pharm_chemical,3927,multiple,Sertraline +depressive,socrates,1304,,Depression +drug_abuse,offence,3160,,Consume/Smoke/Use Drugs (Not Cannabis) +drug_abuse,offence,3161,,Consume/Smoke/Use Cocaine +drug_abuse,offence,3162,,Consume/Smoke/Use Heroin +drug_abuse,offence,3163,,Consume/Smoke/Use Lsd +drug_abuse,offence,3164,,Consume/Smoke/Use Morphine +drug_abuse,offence,3165,,Consume/Smoke/Use Opium +drug_abuse,offence,3166,,Consume/Smoke/Use Other Opiates +drug_abuse,offence,3167,,Consume/Smoke/Use Stimulants And Depressants +drug_abuse,offence,3169,,Consume/Smoke/Use Other Drugs (Not Cannabis) +drug_abuse,offence,3260,,Consume/Smoke/Use Cannabis +drug_abuse,offence,3261,,Consume/Smoke/Use Cannabis Seed +drug_abuse,offence,3262,,Consume/Smoke/Use Cannabis Plant +drug_abuse,offence,3263,,Consume/Smoke/Use Cannabis Resin +drug_abuse,offence,3264,,Consume/Smoke/Use Cannabis Oil +drug_abuse,offence,3269,,Consume/Smoke/Use Cannabis Other Form +drug_abuse,offence,5960,,Consume/Smoke/Use Drugs (New Drugs) +drug_abuse,offence,5961,,Consume/Smoke/Use Methamphetamine And Amphetamine +drug_abuse,offence,5962,,Consume/Smoke/Use Ecstasy +drug_abuse,offence,5963,,Consume/Smoke/Use Fantasy Type Substances +drug_abuse,offence,5964,,Consume/Smoke/Use Bzp Type Substances +drug_abuse,offence,5965,,Consume/Smoke/Use 5F-ADB/AMB-FUBINACA +drug_abuse,offence,6121,,Trespass Alcoholism And Drug Addiction Act +drug_abuse,offence,8112,,Driving Causing Injury Through Drug +drug_abuse,offence,8113,,Driving Causing Injury Through Drink And Drug +drug_abuse,offence,8115,,Driving Causing Death Through Drug +drug_abuse,offence,8116,,Driving Causing Death Through Drink And Drug +drug_abuse,offence,8124,,Careless Use Causing Injury Through Drug +drug_abuse,offence,8126,,Careless Use Causing Death Through Drug +drug_abuse,offence,8132,,Driving Under The Influence Of Drug +drug_abuse,offence,8133,,Driving Under The Influence Of Drink And Drug +drug_abuse,offence,8135,,In Charge Motor Vehicle Under Influence Drug +drug_abuse,offence,8136,,In Charge Motor Vehicle Under Influence Drink & Drug +drug_abuse,offence,8190,,Driving Under Influence Drink/Drug +drug_abuse,offence,8913,,Cause Bodily Injury Through Drugs +drug_abuse,offence,8914,,Cause Death Through Drugs +drug_abuse,offence,8916,,Drive Under Influence Of Drugs +drug_abuse,offence,8922,,Attempts To Drive Und Infldrug +drug_abuse,offence,8932,,In Charge Under Influence Drug +drug_abuse,offence,A100,,Driving Under The Influence Of Drink Or Drug +drug_abuse,offence,A102,,Driving Under The Influence Of A Drug +drug_abuse,offence,A104,,Cause Injury While Under Influence Of A Drug +drug_abuse,offence,A106,,Cause Death While Under Influence Of Drug +drug_abuse,offence,A109,,Driving Under The Influence Of Drink/Drug Or Both +drug_abuse,offence,A110,,Cause Injury While Under Influence Of Drink/Drug +drug_abuse,offence,A111,,Cause Death While Under Infl Of Drink/Drug +drug_abuse,offence,A113,,Careless driving while under influence of drink or drug causing death +drug_abuse,offence,A114,,Careless driving while under influence of drink or drug causing injury +drug_abuse,offence,A130,,Drove Under Influence Drink Or Drugs - 3Rd Or Subsequent +drug_abuse,offence,A401,,Drove Impaired - Blood Contained Evidence Of The Use Of A Qualifying Drug +drug_abuse,offence,A402,,Drove With Blood Containing Evidence Of Controlled Drug +drug_abuse,offence,A406,,Impaired Driver Caused Injury - Blood Contained Qualifying Drug +drug_abuse,offence,A407,,Impaired Driver Caused Death - Blood Contained Qualifying Drug +drug_abuse,offence,A430,,Drove Impaired - Blood Contained Evidence Of Qualifying Drug - 3rd Or Sub +drug_abuse,offence,A432,,In Charge With Blood Containing Controlled Drug - Caused Injury - 3rd Or Sub +drug_abuse,offence,A433,,In Charge With Blood Containing Controlled Drug - Caused Death - 3rd Or Sub +drug_abuse,offence,A434,,Driver's Blood Contained Evidence Of Use Of Controlled Drug - 3rd Or Sub +drug_abuse,offence,A602,,Transport Service Driver Under The Influence Of A Drug +drug_abuse,offence,A604,,In Charge Transport Service Vehicle Under Influence Of A Drug Causing Injury +drug_abuse,offence,A606,,In Charge Transport Service Vehicle Under Influence Of Drug Causing Death +drug_abuse,offence,A609,,Drove In Transport Service Under The Influence Of Drink/Drug Or Both +drug_abuse,offence,A610,,In Charge Transport Service Vehicle Under Influence Drink/Drug Causing Injury +drug_abuse,offence,A611,,In Charge Transport Service Vehicle Under Influence Drink/Drug Causing Death +drug_abuse,offence,A613,,Drove In Transport Service Under Influence Drink Or Drug - 3Rd Or Subsequent +drug_abuse,offence,A646,,Transport Driver Impaired - Blood Contained Evidence Of Qualifying Drug +drug_abuse,offence,A647,,Transport Driver Impaired - Qualifying Drug in Blood - 3rd Or Sub +drug_abuse,offence,A648,,Transport Drive's Blood Contained Evidence Of Use Of controlled Drug +drug_abuse,offence,A653,,Transport Driver Caused Injury - Qualifying Drug In Blood +drug_abuse,offence,A654,,Transport Driver Caused Death - Qualifying Drug In Blood +drug_abuse,offence,A655,,Transport Driver's Blood Had Evidence Of Controlled Drug - Caused Injury +drug_abuse,offence,A656,,Transport Driver's Blood Had Evidence Of Controlled Drug - Caused Death +drug_abuse,offence,A657,,Transport Driver's Blood Had Controlled Drug - Caused Injury - 3rd Or Sub +drug_abuse,offence,A658,,Transport Driver's Blood Had Controlled Drug - Caused Death - 3rd Or Sub +drug_abuse,offence,A659,,Transport Drivers Blood Contained Evidence Of Controlled Drug - 3rd Or Sub +drug_abuse,offence,A669,,Careless under influence drink or drug caused death transport service +drug_abuse,offence,A670,,Careless under influence drink or drug caused injury transport service +drug_abuse,offence,D514,,Careless Driving Drug Inv Cause Injury +drug_abuse,offence,D515,,Careless Driving Drink/Drug Inv Cause Injury +drug_abuse,offence,D517,,Careless Driving Drug Inv Cause Death +drug_abuse,offence,D518,,Careless Driving Drink/Drug Inv Cause Death +drug_abuse,DSM,291,,"Alcohol (withdraw, amnestic, jealousy, psychosis, etc)" +drug_abuse,DSM,2920,,Drug withdrawal syndrome +drug_abuse,DSM,2921,,Drug-induced hallucinosis and delusions +drug_abuse,DSM,2922,,Pathological drug intoxication +drug_abuse,DSM,2923,,?? +drug_abuse,DSM,2924,,?? +drug_abuse,DSM,2925,,?? +drug_abuse,DSM,2926,,?? +drug_abuse,DSM,2927,,?? +drug_abuse,DSM,2928,,"Drug-induced (delirium, dementia, amnestic etc)" +drug_abuse,DSM,2929,,Unspecified drug-induced mental disorder +drug_abuse,DSM,3040,,Opioid type dependence +drug_abuse,DSM,3041,,Barbiturate/similarly acting sedative/hypnotic dependence +drug_abuse,DSM,3042,,Cocaine dependence +drug_abuse,DSM,3043,,Cannabis dependence +drug_abuse,DSM,3044,,Amphetamine and other psychostimulant dependence +drug_abuse,DSM,3045,,Hallucinogen dependence +drug_abuse,DSM,3046,,Other specified drug dependence +drug_abuse,DSM,3047,,Comb. of opioid type drug with any other drug dependence +drug_abuse,DSM,3048,,Combinations of drug dependence excluding opioid type drug +drug_abuse,DSM,3049,,Unspecified drug dependence +drug_abuse,DSM,3051,,Tobacco use disorder +drug_abuse,DSM,3052,,Cannabis use disorder +drug_abuse,DSM,3053,,Hallucinogen use disorder +drug_abuse,DSM,3054,,Barbiturate/similarly acting sedative or hypnotic use disord +drug_abuse,DSM,3055,,Opioid use disorder +drug_abuse,DSM,3056,,Cocaine use disorder +drug_abuse,DSM,3057,,Amphetamine or related acting sympathomimetic use disorder +drug_abuse,DSM,3058,,Antidepressant type use disorder +drug_abuse,DSM,3059,,"Other, mixed, or unspecified drug use disorder" +drug_abuse,DSM,V1581,,Personal history of tobacco use +drug_abuse,external_ICD10,X42,,Accidental poisoning by and exposure to narcotics and hallucinogens +drug_abuse,external_ICD10,X62,,Intentional self-poisoning by and exposure to narcotics and hallucinogens +drug_abuse,external_ICD10,Y12,,Poisoning by and exposure to narcotics and hallucinogens +drug_abuse,external_ICD9,8500,,Accidental poisoning by heroin +drug_abuse,external_ICD9,8501,,Accidental poisoning by methadone +drug_abuse,external_ICD9,8502,,Accidental poisoning by other opiates/narcotics +drug_abuse,ICD10,F11,,Opiod related disorders +drug_abuse,ICD10,F12,,Cannabis related disorders +drug_abuse,ICD10,F13,,"Sedative, hypnotic, or anxiolytic related disorders" +drug_abuse,ICD10,F14,,Cocaine related disorders +drug_abuse,ICD10,F15,,Other stimulant related disorders +drug_abuse,ICD10,F16,,Hallucinogen related disorders +drug_abuse,ICD10,F17,,Nicotine dependence +drug_abuse,ICD10,F18,,Inhalant related disorders +drug_abuse,ICD10,F19,,Other psychoactive substance related disorders +drug_abuse,ICD10,P961,,Neonatal withdrawal symptoms from maternal use of drugs of addiction +drug_abuse,ICD10,R781,,Finding of opiate in blood +drug_abuse,ICD10,R782,,Finding of cocaine on blood +drug_abuse,ICD10,R783,,Finding of hallucinogen in blood +drug_abuse,ICD10,R784,,Finding of other drugs of addictive potential in blood +drug_abuse,ICD10,R785,,Finding of other psychotropic drug in blood +drug_abuse,ICD10,R786,,Finding of steroid agent in blood +drug_abuse,ICD10,R787,,Finding of abnormal level of heavy metals in blood +drug_abuse,ICD10,R788,,"Finding of other specified substances, not normally found in blood" +drug_abuse,ICD10,R789,,"Finding of unspecified substance, not normally found in blood" +drug_abuse,ICD10,Z812,,Family history of tobacco abuse and dependence +drug_abuse,ICD10,Z813,,Family history of other psychoactive substance abuse and dependence +drug_abuse,ICD10,Z814,,Family history of othe substance abuse and dependence +drug_abuse,ICD9,291,,"Alcohol (withdraw, amnestic, jealousy, psychosis, etc)" +drug_abuse,ICD9,2920,,Drug withdrawal syndrome +drug_abuse,ICD9,2921,,Drug-induced hallucinosis and delusions +drug_abuse,ICD9,2922,,Pathological drug intoxication +drug_abuse,ICD9,2923,,?? +drug_abuse,ICD9,2924,,?? +drug_abuse,ICD9,2925,,?? +drug_abuse,ICD9,2926,,?? +drug_abuse,ICD9,2927,,?? +drug_abuse,ICD9,2928,,"Drug-induced (delirium, dementia, amnestic etc)" +drug_abuse,ICD9,2929,,Unspecified drug-induced mental disorder +drug_abuse,ICD9,3040,,Opioid type dependence +drug_abuse,ICD9,3041,,Barbiturate/similarly acting sedative/hypnotic dependence +drug_abuse,ICD9,3042,,Cocaine dependence +drug_abuse,ICD9,3043,,Cannabis dependence +drug_abuse,ICD9,3044,,Amphetamine and other psychostimulant dependence +drug_abuse,ICD9,3045,,Hallucinogen dependence +drug_abuse,ICD9,3046,,Other specified drug dependence +drug_abuse,ICD9,3047,,Comb. of opioid type drug with any other drug dependence +drug_abuse,ICD9,3048,,Combinations of drug dependence excluding opioid type drug +drug_abuse,ICD9,3049,,Unspecified drug dependence +drug_abuse,ICD9,3051,,Tobacco use disorder +drug_abuse,ICD9,3052,,Cannabis use disorder +drug_abuse,ICD9,3053,,Hallucinogen use disorder +drug_abuse,ICD9,3054,,Barbiturate/similarly acting sedative or hypnotic use disord +drug_abuse,ICD9,3055,,Opioid use disorder +drug_abuse,ICD9,3056,,Cocaine use disorder +drug_abuse,ICD9,3057,,Amphetamine or related acting sympathomimetic use disorder +drug_abuse,ICD9,3058,,Antidepressant type use disorder +drug_abuse,ICD9,3059,,"Other, mixed, or unspecified drug use disorder" +drug_abuse,ICD9,V1581,,Personal history of tobacco use +drug_abuse,incapacity,6,,"Mental disorders, substance abuse, other drugs" +drug_abuse,incapacity,171,,"Mental disorders, substance abuse, drugs" +drug_abuse,incapacity,172,,"Mental disorders, substance abuse, other substance abuse" +drug_abuse,pharm_chemical,1252,multiple,Naltrexone hydrochloride +drug_abuse,pharm_chemical,1432,sole,Disulfiram +drug_abuse,pharm_chemical,3793,multiple,Naltrexone hydrochloride +drug_abuse,socrates,1301,,Alcohol / drug related disorder (excluding Korsakov's syndrome) +dysthymia,DSM,3004,,Neurotic depression +dysthymia,ICD10,F341,,Dysthymic disorder +dysthymia,ICD9,3004,,Neurotic depression +dysthymia,socrates,1304,,Depression +schizophrenia,DSM,295,,Schizophrenic psychoses +schizophrenia,ICD10,F20,,Schizophrenia +schizophrenia,ICD10,F21,,Schizotypal disorder +schizophrenia,ICD10,F22,,Delusional disorders +schizophrenia,ICD10,F23,,Brief psychotic disorders +schizophrenia,ICD10,F24,,Shared psychotic disorder +schizophrenia,ICD10,F25,,Schizoaffective disorders +schizophrenia,ICD9,295,,Schizophrenic psychoses +schizophrenia,incapacity,163,,Schizophrenia +schizophrenia,pharm_chemical,1011,multiple,Risperidone +schizophrenia,pharm_chemical,1078,sole,Clozapine +schizophrenia,pharm_chemical,1140,multiple,Olanzapine +schizophrenia,pharm_chemical,1183,multiple,Quetiapine +schizophrenia,pharm_chemical,1532,multiple,Flupenthixol decanoate +schizophrenia,pharm_chemical,1533,multiple,Fluphenazine decanoate +schizophrenia,pharm_chemical,1535,multiple,Fluphenazine hydrochloride +schizophrenia,pharm_chemical,3873,multiple,Ziprasidone +schizophrenia,pharm_chemical,3878,multiple,Aripiprazole +schizophrenia,pharm_chemical,3884,sole,Amisulpride +schizophrenia,pharm_chemical,3940,multiple,Olanzapine pamoate monohydrate +schizophrenia,pharm_chemical,4025,sole,Paliperidone +schizophrenia,socrates,1306,,Schizophrenia diff --git a/health/drug_abuse_dependance.sql b/health/drug_abuse_dependance.sql new file mode 100644 index 0000000..bc7c64a --- /dev/null +++ b/health/drug_abuse_dependance.sql @@ -0,0 +1,579 @@ +/************************************************************************************************** +Title: Drug abuse or dependence +Author: Craig Wright +Re-work: Manjusha Radhakrishnan, Simon Anastasiadis +Reviewer: + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Indication of chronic or acute drug abuse or dependence. + +Intended purpose: +All instances that point to a person having drug abuse or dependance +For incidence and prevalence - see notes. + +Inputs & Dependencies: +- [IDI_Clean].[security].[concordance] +- [IDI_Clean].[moh_clean].[mortality_diagnosis] +- [IDI_Clean].[moh_clean].[mortality_registrations] +- [IDI_Clean].[moh_clean].[pharmaceutical] +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_diag] +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_event] +- [IDI_Clean].[moh_clean].[priv_fund_hosp_discharges_diag] +- [IDI_Clean].[moh_clean].[priv_fund_hosp_discharges_event] +- [IDI_Clean].[moh_clean].[PRIMHD] +- [IDI_Clean].[moh_clean].[interrai] +- [IDI_Clean].[moh_clean].[pop_cohort_demographics] +- [IDI_Clean].[msd_clean].[msd_incapacity] +- [IDI_Clean].[moj_clean].[charges] +- [IDI_Clean].[pol_clean].[nia_links] +- [IDI_Clean].[moh_clean].[PRIMHD] +- [IDI_Clean].[moh_clean].[nnpac] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS_CLIN_DIAG_CODES].[clinical_codes] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_dim_form_pack_subsidy_code] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_interrai_question_lookup] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_interrai_answer_lookup] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[acc_ICD10_Code] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_nmds_health_specialty] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[DepIndex2018_MB2018] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[meshblock_all] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moj_offence] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[msd_incapacity_reason_code_3] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[msd_incapacity_reason_code_4] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_pharmaceutical_lookup] +- [IDI_Adhoc].[clean_read_MOH_PRIMHD].[moh_primhd_mhinc] +- [IDI_Adhoc].[clean_read_MOH_PRIMHD].[primhd_diagnoses] +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_disability] +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_needs_assessment] +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_referral] +- [IDI_Adhoc].[clean_read_MOH_PHARMACEUTICAL].[moh_PHARMACEUTICAL_LOOKUP_TABLE] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_drug_abuse_or_dependence] +- [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] + + +Notes: +1) Definition based on three factors: + 1. any drug abuse or depedence + 2. any acute drug intoxication/abuse + 3. any chronic conditions derived from drugs + For chronic drug abuse or dependence, look for a period of more than two months. + +2) Incidence and prevalence + Incidence = development of a condition during a particular time period (new only). + Prevalance = affected by the condition during a particular time period (new + existing). + Plots of incidence (from this definition) against deprivation show a non-monotinic pattern. + Incidence increases from deprivation 1-8 but deprivation 9 and 10 show lower incidence. + This is likely because of barriers to access (cost, transport, availability etc.). + + So we can not use point in time measures to accurately access prevalance. + Instead, when using this definition, we recommend considering every person's history. + For example, if we consider any drug abuse indication in the last 10 years, then we observe + a more likely relationship between drug abuse and deprivation. This suggests that people in high + deprivation do get some treatment, but much less frequently than those who are less deprived. + Such an approach is reasonable, as this condition is likely to persist for at least 10 years. + +3) Some treatments are used to treat this condition and other conditions. This is most common + with pharmaceuticals. For these treatments, we do the following: + - Gather the treatments that serve multiple purposes together separately. + - Where a person only has records that also treat other conditions, discard those records + - Where a person has other evidence of drug abuse, add the treatments for multiple conditions + to the table. + For example, in 2010 a person is diagnosed in hospital with drug abuse, in 2008 they received drugs + that are used to treat drug abuse or depression. Then we include the 2008 date as an earlier probable indicator. + +4) Certain medial events are coded using the ICD9, ICD10, or DSM codes - most commonly hospital diagnoses. + There are mappings between the different codings in the diagnoses table. The mappings help researchers who are + familiar with only one coding system to locate records from a different coding system. + Most records (at least 80%) are stored in two versions/rows (the submitted code system, and an alternative they + have been mapped to), so researchers could use either version. + However, the mappings are imperfect. In some cases a more specific code we do want is mapped to a more general code + that we do not want. Hence, to ensure the most robust results, we have limited ourselves to only those records where + the diagnostic code is stored in the same system it was submitted. + This may exclude some records from our output definition. Researchers needing the broadest possible definition are + advised to review this constraint. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +Issues: +1) Because all our MHA tables use the same lookup/reference table, and all definitions load + this table into the database, you can not run the definitions in parallel. Because, each + definition will delete the reference table when it starts running and this will interfere + with the definitions that are already running. + +History (reverse order): +2022-09-12 SA Prep for library +2022-07-19 MR Tidy-up +2022-06-10 CW Definition creation +*************************************************************************************************************************/ + +/* Download the diagnosis lookup table from Github folder and upload onto datalab */ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] ( + diagnosis VARCHAR(30), + code_type VARCHAR(30), + code VARCHAR(10), + aux VARCHAR(30), + explanation VARCHAR(255), +) + +BULK INSERT [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] +FROM '\\your project folder\diagnosis_codes.csv' +WITH ( + FIRSTROW = 2, + FIELDTERMINATOR = ',', + ROWTERMINATOR = '\n', + TABLOCK +) + +/******************************************************** +TABLES TO APPEND TO +********************************************************/ + +/* Diagnosis or treatment only indicates bipolar */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_drug_solo] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_drug_solo] ( + snz_uid INT, + event_date DATE, +) + +/* Diagnosis or treatment used for bipolar and other conditions */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_drug_multi] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_drug_multi] ( + snz_uid INT, + event_date DATE, +) + +/******************************************************** +MORTALITY + +Note that people who died with this diagnosis will likely +have had bipolar for a while before death +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_drug_solo] (snz_uid, event_date) +SELECT b.snz_uid + ,EOMONTH(DATEFROMPARTS([moh_mor_death_year_nbr],[moh_mor_death_month_nbr],1)) AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[mortality_diagnosis] AS a +INNER JOIN [IDI_Clean_YYYYMM].[moh_clean].[mortality_registrations] AS b +ON a.[snz_dia_death_reg_uid] = b.snz_dia_death_reg_uid +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_mort_diag_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'drug_abuse' + AND r.code_type = 'ICD10' + AND [moh_mort_diag_clinic_sys_code] >= 10 + AND [moh_mort_diag_clinic_type_code] IN ('A','B','V') +) +OR EXISTS ( +SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_mort_diag_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'drug_abuse' + AND r.code_type = 'ICD9' + AND [moh_mort_diag_clinic_sys_code] IN ('06','6') + AND [moh_mort_diag_clinic_type_code] IN ('A','B','V') +) +OR EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_mort_diag_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'drug_abuse' + AND r.code_type = 'external_cause' + AND [moh_mort_diag_clinic_sys_code] IN ('06','6') + AND [moh_mort_diag_clinic_type_code] IN ('E') +) +GO + +/******************************************************** +PHARMACEUTICALS +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_drug_solo] (snz_uid, event_date) +SELECT a.[snz_uid] + ,[moh_pha_dispensed_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[pharmaceutical] AS a +INNER JOIN [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_dim_form_pack_subsidy_code] AS b +ON a.[moh_pha_dim_form_pack_code] = b.[DIM_FORM_PACK_SUBSIDY_KEY] +WHERE EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(CAST(CHEMICAL_ID AS VARCHAR), 1, LEN(r.code)) = r.code + AND r.diagnosis = 'drug_abuse' + AND r.code_type = 'pharm_chemical' + AND r.aux = 'sole' +) +GO + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_drug_multi] (snz_uid, event_date) +SELECT a.[snz_uid] + ,[moh_pha_dispensed_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[pharmaceutical] AS a +INNER JOIN [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_dim_form_pack_subsidy_code] AS b +ON a.[moh_pha_dim_form_pack_code] = b.[DIM_FORM_PACK_SUBSIDY_KEY] +WHERE EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(CAST(CHEMICAL_ID AS VARCHAR), 1, LEN(r.code)) = r.code + AND r.diagnosis = 'drug_abuse' + AND r.code_type = 'pharm_chemical' + AND r.aux = 'multiple' +) +GO + +/******************************************************** +PRIVATE HOSPITAL DISCHARGE + +history of tobacco use ,'V1582','V1583' excluded - people smoking +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_drug_solo] (snz_uid, event_date) +SELECT a.[snz_uid] + ,CAST([moh_pri_evt_start_date] AS DATE) AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_event] AS a +INNER JOIN [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_diag] AS b +ON a.[moh_pri_evt_event_id_nbr] = b.[moh_pri_diag_event_id_nbr] +AND [moh_pri_diag_sub_sys_code] = [moh_pri_diag_clinic_sys_code] +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING([moh_pri_diag_clinic_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'drug_abuse' + AND r.code_type = 'ICD10' + AND [moh_pri_diag_sub_sys_code] >= '10' + AND [moh_pri_diag_diag_type_code] IN ('A','B','V') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING([moh_pri_diag_clinic_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'drug_abuse' + AND r.code_type = 'ICD9' + AND [moh_pri_diag_sub_sys_code] IN ('06','6') + AND [moh_pri_diag_diag_type_code] IN ('A','B','V') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING([moh_pri_diag_clinic_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'drug_abuse' + AND r.code_type = 'external_cause' + AND [moh_pri_diag_sub_sys_code] IN ('06','6') + AND [moh_pri_diag_diag_type_code] IN ('A','B','V') +) +GO + +/******************************************************** +PUBLIC HOSPITAL DISCHARGE + +history of tobacco use ,'V1582','V1583' excluded - people smoking +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_drug_solo] (snz_uid, event_date) +SELECT b.[snz_uid] + ,[moh_evt_evst_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_diag] AS a +INNER JOIN [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_event] AS b +ON [moh_dia_clinical_sys_code] = [moh_dia_submitted_system_code] +AND [moh_evt_event_id_nbr]=[moh_dia_event_id_nbr] +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_dia_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'drug_abuse' + AND r.code_type = 'ICD10' + AND [moh_dia_submitted_system_code] >= '10' + AND [moh_dia_diagnosis_type_code] IN ('A','B','V') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_dia_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'drug_abuse' + AND r.code_type = 'ICD9' + AND [moh_dia_submitted_system_code] IN ('06','6') + AND [moh_dia_diagnosis_type_code] IN ('A','B','V') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_dia_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'drug_abuse' + AND r.code_type = 'external_cause' + AND [moh_dia_submitted_system_code] IN ('06','6') + AND [moh_dia_diagnosis_type_code] IN ('E') +) +GO + +/******************************************************** +PUBLIC HOSPITAL DISCHARGE + +SUBSTANCE ABUSE HEALTH SPECIALITY +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_drug_multi] (snz_uid, event_date) +SELECT snz_uid, [moh_evt_evst_date] as event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_event] +WHERE moh_evt_hlth_spec_code in ('Y40','Y41','Y42','Y43','Y44','Y45','Y46','Y47','Y48','Y49') +GO + +/******************************************************** +PRIMHD AND MHINC +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_drug_solo] (snz_uid, event_date) +SELECT b.snz_uid + ,[classification_start] AS event_date +FROM [IDI_Adhoc].[clean_read_MOH_PRIMHD].[moh_primhd_mhinc] AS a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b +ON a.snz_moh_uid = b.snz_moh_uid +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'drug_abuse' + AND r.code_type = 'ICD10' + AND [clinical_coding_system_id] >= '10' + AND diagnosis_type in ('A','B','V','P') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'drug_abuse' + AND r.code_type = 'DSM' + AND [clinical_coding_system_id] IN ('07','7') + AND diagnosis_type in ('A','B','V','P') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'drug_abuse' + AND r.code_type = 'ICD9' + AND [clinical_coding_system_id] IN ('06','6') + AND diagnosis_type in ('A','B','V','P') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'drug_abuse' + AND r.code_type = 'external_cause' + AND [clinical_coding_system_id] IN ('06','6') + AND diagnosis_type in ('E') +) +GO + +/******************************************************** +PRIMHD TEAM +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_drug_multi] (snz_uid, event_date) +SELECT [snz_uid] + ,[moh_mhd_activity_start_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[PRIMHD] +WHERE [moh_mhd_team_type_code] = 3 +OR [moh_mhd_team_code] IN (7874,14808,13481,13541,7086,7102,7114,7115,7238,7119,7122,7142,7152,7153,7077) + +/******************************************************** +PRIMHD DIAGNOSIS +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_drug_solo] (snz_uid, event_date) +SELECT snz_uid + ,DATEFROMPARTS(SUBSTRING([CLASSIFICATION_START_DATE],7,4),SUBSTRING([CLASSIFICATION_START_DATE],4,2),SUBSTRING([CLASSIFICATION_START_DATE],1,2)) AS event_date +FROM [IDI_Adhoc].[clean_read_MOH_PRIMHD].[primhd_diagnoses] AS a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b +ON a.snz_moh_uid = b.snz_moh_uid + +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'drug_abuse' + AND r.code_type = 'ICD10' + AND [clinical_coding_system_id] >= '10' + AND [DIAGNOSIS_TYPE] in ('A','B','V','P') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'drug_abuse' + AND r.code_type = 'DSM' + AND [clinical_coding_system_id] IN ('07','7') + AND [DIAGNOSIS_TYPE] in ('A','B','V','P') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'drug_abuse' + AND r.code_type = 'ICD9' + AND [clinical_coding_system_id] IN ('06','6') + AND [DIAGNOSIS_TYPE] in ('A','B','V','P') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'drug_abuse' + AND r.code_type = 'DSM' + AND [clinical_coding_system_id] IN ('07','7') + AND [DIAGNOSIS_TYPE] in ('E') +) +GO + +/******************************************************** +INTERRAI + +Alcohol- Highest number of drinks in any single sitting in LAST 14 DAYS / 5 or more +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_drug_solo] (snz_uid, event_date) +SELECT [snz_uid] + ,[moh_irai_assessment_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[interrai] +WHERE moh_irai_alcohol_one_settng_code = 3 /* 5+ drinks */ +GO + +/******************************************************** +SOCRATES +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_drug_solo] (snz_uid, event_date) +SELECT b.snz_uid + ,COALESCE( + CAST(SUBSTRING([FirstContactDate], 1, 7) AS DATE), + CAST(SUBSTRING([ReferralDate],1,7) AS DATE) + ) AS event_date +FROM [IDI_Adhoc].[clean_read_MOH_SOCRATES].moh_disability AS a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b +ON a.snz_moh_uid = b.snz_moh_uid +INNER JOIN [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_needs_assessment] AS c +ON a.snz_moh_uid = c.snz_moh_uid +LEFT JOIN [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_referral] AS e +ON a.snz_moh_uid = e.snz_moh_uid +WHERE a.[Code] = '1301' + +/******************************************************** +MSD INCAPACITATION +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_drug_solo] (snz_uid, event_date) +SELECT [snz_uid] + ,[msd_incp_incp_from_date] AS event_date +FROM [IDI_Clean_YYYYMM].[msd_clean].[msd_incapacity] +WHERE [msd_incp_incrsn_code] IN ('006','171','172') +OR [msd_incp_incrsn95_1_code] IN ('006','171','172') +OR [msd_incp_incrsn95_2_code] IN ('006','171','172') +OR [msd_incp_incrsn95_3_code] IN ('006','171','172') +OR [msd_incp_incrsn95_4_code] IN ('006','171','172') +OR [msd_incp_incapacity_code] IN ('006','171','172') +GO + +/******************************************************** +DRIVING DISQUALIFICATION +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_drug_solo] (snz_uid, event_date) +SELECT snz_uid, [moj_chg_offence_from_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moj_clean].[charges] AS a +WHERE EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(moj_chg_offence_code, 1, LEN(r.code)) = r.code + AND r.diagnosis = 'drug_abuse' + AND r.code_type = 'offences' +) +GO + +/******************************************************** +POLICE NIA LINKS +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_drug_solo] (snz_uid, event_date) +SELECT snz_uid, [nia_links_rec_date] AS event_date +FROM [IDI_Clean_YYYYMM].[pol_clean].[nia_links] AS a +WHERE EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING([nia_links_latest_inc_off_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'drug_abuse' + AND r.code_type = 'offences' +) +GO + +/**************************************************************************************************************** +FINAL TABLE CREATION +****************************************************************************************************************/ + +/* Add indexes */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_drug_solo] (snz_uid); +GO +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_drug_multi] (snz_uid); +GO + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_drug_abuse_or_dependence] +GO + +WITH multi_to_add AS ( + SELECT snz_uid, event_date + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_drug_multi] AS m + WHERE EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_drug_solo] AS s + WHERE m.snz_uid = s.snz_uid + ) +) +SELECT DISTINCT snz_uid, event_date +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_drug_abuse_or_dependence] +FROM ( + SELECT snz_uid, event_date + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_drug_solo] + + UNION ALL + + SELECT snz_uid, event_date + FROM multi_to_add +) AS k +GO + +/******************************************************** +TIDY UP +********************************************************/ + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_drug_abuse_or_dependence] (snz_uid); +GO +/* Compress final table to save space */ +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_drug_abuse_or_dependence] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE); +GO + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_drug_solo] +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_drug_multi] +GO diff --git a/health/earliest_b4sc.sql b/health/earliest_b4sc.sql new file mode 100644 index 0000000..aa6f402 --- /dev/null +++ b/health/earliest_b4sc.sql @@ -0,0 +1,142 @@ +/************************************************************************************************** +Title: Earliest date B4SC +Author: Simon Anastasiadis + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Earliest date that a child receives some part of their before (B4) school check (B4SC). +Excludes declined checks. + +Intended purpose: +Identifying when B4SC begin. +Determining which children are receiving B4SC. + +Inputs & Dependencies: +- [IDI_Clean].[moh_clean].[b4sc] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[defn_earliest_b4sc] + + +Notes: +1) Linking errors are possible so some people older that 4 or 5 may be recorded as having + before school checks. Such records should be discarded. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +Issues: + +History (reverse order): +2020-05-25 SA v1 +**************************************************************************************************/ + +/* Set database for writing views */ +USE IDI_UserCode +GO + +/* Clear existing view */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[defn_earliest_b4sc]; +GO + +/* Create view */ +CREATE VIEW [DL-MAA20XX-YY].[defn_earliest_b4sc] AS + +WITH +strengths_difficulty_questionnaire_teacher AS ( + SELECT [snz_uid], [moh_bsc_sdqt_date] AS the_date + FROM [IDI_Clean_YYYYMM].[moh_clean].[b4sc] + WHERE [moh_bsc_sdqt_outcome_text] IS NOT NULL + AND [moh_bsc_sdqt_outcome_text] <> 'Declined' + AND [moh_bsc_sdqt_date] IS NOT NULL +), +strengths_difficulty_questionnaire_parent AS ( + SELECT [snz_uid], [moh_bsc_sdqp_date] AS the_date + FROM [IDI_Clean_YYYYMM].[moh_clean].[b4sc] + WHERE [moh_bsc_sdqp_outcome_text] IS NOT NULL + AND [moh_bsc_sdqp_outcome_text] <> 'Declined' + AND [moh_bsc_sdqp_date] IS NOT NULL +), +pediatrics AS ( + SELECT [snz_uid], [moh_bsc_peds_date] AS the_date + FROM [IDI_Clean_YYYYMM].[moh_clean].[b4sc] + WHERE [moh_bsc_peds_outcome_text] IS NOT NULL + AND [moh_bsc_peds_outcome_text] <> 'Declined' + AND [moh_bsc_peds_date] IS NOT NULL +), +immunisations AS ( + SELECT [snz_uid], [moh_bsc_imms_date] AS the_date + FROM [IDI_Clean_YYYYMM].[moh_clean].[b4sc] + WHERE [moh_bsc_imms_outcome_text] IS NOT NULL + AND [moh_bsc_imms_outcome_text] <> 'Declined' + AND [moh_bsc_imms_date] IS NOT NULL +), +dental AS ( + SELECT [snz_uid], [moh_bsc_dental_date] AS the_date + FROM [IDI_Clean_YYYYMM].[moh_clean].[b4sc] + WHERE [moh_bsc_dental_outcome_text] IS NOT NULL + AND [moh_bsc_dental_outcome_text] <> 'Declined' + AND [moh_bsc_dental_date] IS NOT NULL +), +growth AS ( + SELECT [snz_uid], [moh_bsc_growth_date] AS the_date + FROM [IDI_Clean_YYYYMM].[moh_clean].[b4sc] + WHERE [moh_bsc_growth_outcome_text] IS NOT NULL + AND [moh_bsc_growth_outcome_text] <> 'Declined' + AND [moh_bsc_growth_date] IS NOT NULL +), +hearing AS ( + SELECT [snz_uid], [moh_bsc_hearing_date] AS the_date + FROM [IDI_Clean_YYYYMM].[moh_clean].[b4sc] + WHERE [moh_bsc_hearing_outcome_text] IS NOT NULL + AND [moh_bsc_hearing_outcome_text] <> 'Declined' + AND [moh_bsc_hearing_date] IS NOT NULL +), +vision AS ( + SELECT [snz_uid], [moh_bsc_vision_date] AS the_date + FROM [IDI_Clean_YYYYMM].[moh_clean].[b4sc] + WHERE [moh_bsc_vision_outcome_text] IS NOT NULL + AND [moh_bsc_vision_outcome_text] <> 'Declined' + AND [moh_bsc_vision_date] IS NOT NULL +), +general_checkup AS ( + SELECT [snz_uid], [moh_bsc_general_date] AS the_date + FROM [IDI_Clean_YYYYMM].[moh_clean].[b4sc] + WHERE [moh_bsc_general_outcome_text] IS NOT NULL + AND [moh_bsc_general_outcome_text] <> 'Declined' + AND [moh_bsc_general_date] IS NOT NULL +), +overall_status AS ( + SELECT [snz_uid], [moh_bsc_check_date] AS the_date + FROM [IDI_Clean_YYYYMM].[moh_clean].[b4sc] + WHERE [moh_bsc_check_status_text] IS NOT NULL + AND [moh_bsc_check_status_text] <> 'Declined' + AND [moh_bsc_check_date] IS NOT NULL +) +SELECT [snz_uid], MIN(the_date) AS earliest_date +FROM ( + SELECT * FROM strengths_difficulty_questionnaire_teacher UNION ALL + SELECT * FROM strengths_difficulty_questionnaire_parent UNION ALL + SELECT * FROM pediatrics UNION ALL + SELECT * FROM immunisations UNION ALL + SELECT * FROM dental UNION ALL + SELECT * FROM growth UNION ALL + SELECT * FROM hearing UNION ALL + SELECT * FROM vision UNION ALL + SELECT * FROM general_checkup UNION ALL + SELECT * FROM overall_status +) k +GROUP BY [snz_uid] diff --git a/health/emergency_department_visits.sql b/health/emergency_department_visits.sql new file mode 100644 index 0000000..1441c72 --- /dev/null +++ b/health/emergency_department_visits.sql @@ -0,0 +1,121 @@ +/************************************************************************************************** +Title: Emergency department visit +Author: Hubert Zal + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +ED visits as recorded in the National Non-admitted Patient Collection. + +Intended use: +Identify ED visit events + +Inputs & Dependencies: +- [IDI_Clean].[moh_clean].[nnpac] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[SWA_emergency_department] + + +Notes: + +In New Zealand, emergency departments (EDs) provide care and treatment for patients with real or perceived, serious injuries or illness. +We use ED visits as recorded in the National Non-admitted Patient Collection (NNPAC). + +Inclusion Criteria + +The Ministry of Health has defined an inclusion criteria which defines an emergency department visit +which is recorded with the National Non-admitted Patient Collection (NNPAC). +Inclusion criteria as described in: www.health.govt.nz/publication/emergency-department-use-2014-15 +See page 34. +• had one of the ED codes specified as the purchase unit code (ED02001-ED06001A) +• were completed (ie, excludes events where the patient did not wait to complete) +• do not include follow-up appointments + +The following ED purchase unit codes have been included: +ED02001, ED02001A, ED03001, ED03001A, +ED04001, ED04001A, ED05001, ED05001A, +ED06001, ED06001A + +As per Craig Wright's advice: +because we are only interested in counting events, we do not need to +combine with admitted patient ED events. +Events where the person Did Not Attend (DNA) and Did Not Wait (DNW) are excluded. + +------------------------------------------------------------------------------------------------------- +Variable definitions + +moh_nnp_purchase_unit_code: A purchase unit code is part of a classification system used to consistently measure, quantify and value a service. +The definition for each purchase unit code can be found in the Purchase unit dictionary at: +www.nsfl.health.govt.nz/purchase-units/purchase-unit-data-dictionary-202223 + +moh_nnp_attendence_code: Attendance code for the outpatient event. Notes: +ATT (Attended) - An attendance is where the healthcare user is assessed by a registered medical +practitioner or nurse practitioner. The healthcare user received treatment, therapy, advice, +diagnostic or investigatory procedures. +DNA (Did Not Attend) - Where healthcare user did not arrive, this is classed as did not attend. +DNW (Did Not Wait) - Used for ED where the healthcare user did not wait. Also for use where healthcare +user arrives but does not wait to receive service. + +moh_nnp_event_type_code: Code identifying the type of outpatient event. Notes: From 1 Jul 2008 to 31 June 2010, +the event type was determined from the submitted purchase unit code. However, from July 2010 +it became mandatory to report the event type directly. + +moh_nnp_service_date: The date and time that the triaged patient's treatment starts by a suitable ED medical professional +(could be the same time as the datetime of service if treatment begins immediately). + + +Parameters & Present values: + Current refresh = $(IDIREF) + Prefix = $(TBLPREF) + Project schema = [$(PROJSCH)] + +Issues: + +History (reverse order): +Simon Anastasiadis: 2019-01-08 + +**************************************************************************************************/ +--PARAMETERS################################################################################################## +--SQLCMD only (Activate by clicking Query->SQLCMD Mode) +--Already in master.sql; Uncomment when running individually +:setvar TBLPREF "SWA_" +:setvar IDIREF "IDI_Clean_YYYYMM" +:setvar PROJSCH "DL-MAA20XX-YY" +GO + +USE IDI_UserCode; + +DROP VIEW IF EXISTS [$(PROJSCH)].[$(TBLPREF)emergency_department]; +GO + +CREATE VIEW [$(PROJSCH)].[$(TBLPREF)emergency_department] AS + +SELECT DISTINCT [snz_uid] + ,[moh_nnp_service_datetime] AS [start_date] + ,[moh_nnp_service_datetime] AS [end_date] + -- ,[moh_nnp_purchase_unit_code] + ,'ED visit' AS [description] + ,'moh nnpac' as [source] +FROM [$(IDIREF)].[moh_clean].[nnpac] +WHERE [moh_nnp_event_type_code] = 'ED' +AND [moh_nnp_purchase_unit_code] IN ('ED02001', 'ED02001A', 'ED03001', 'ED03001A', + 'ED04001', 'ED04001A', 'ED05001', 'ED05001A', + 'ED06001', 'ED06001A') +AND [moh_nnp_service_date] IS NOT NULL +AND [moh_nnp_service_type_code] <> 'FU' /*do not include "follow-up" (FU) appointments. +--See 'inclusion criteria' on page 34 of: www.health.govt.nz/publication/emergency-department-use-2014-15*/ +AND [moh_nnp_attendence_code] <> 'DNA' /*Remove cases when health care user "Did not attend"*/ +AND [moh_nnp_attendence_code] <> 'DNW'; /*Remove cases when health care user arrived but "did not wait" to use service. +--See 'inclusion criteria' on page 34 of: www.health.govt.nz/publication/emergency-department-use-2014-15*/ +GO diff --git a/health/falls_injury_register.sql b/health/falls_injury_register.sql new file mode 100644 index 0000000..cd9c421 --- /dev/null +++ b/health/falls_injury_register.sql @@ -0,0 +1,93 @@ +/************************************************************************************************** +Title: Falls (injury) register +Author: MOH +Re-edit: Manjusha Radhakrishnan +Reviewer: + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +People in the falls (injury) register. + +Intended purpose: +Create register of people who have been hospitalised for a fall in the past years + +Inputs & Dependencies: +- [IDI_Clean].[moh_clean].[priv_fund_hosp_discharges_diag] +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_event] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[defn_falls_injury] + +Notes: +- Falls/Injury codes used: + 'W00','W01','W02','W03','W04','W05','W06','W07','W08','W09','W10','W11','W12','W13','W14','W15','W16','W17','W18','W19' + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +Issues: +1. Duplicates can be found; this is because a person may receive multiple treatments for the same condition + +History (reverse order): +2022-07-20 MR v1 +**************************************************************************************************/ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_falls_injury] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[defn_falls_injury] ( + snz_uid INT + , event_date DATE + , source VARCHAR(255) +); +GO + + +/** PUBLIC HOSPITALS **/ +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_falls_injury] (snz_uid, event_date, source) +SELECT snz_uid, + nmd.moh_evt_even_date AS event_date, + 'PUB HOSP' AS source +FROM [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_event] as nmd +INNER JOIN ( + SELECT moh_dia_event_id_nbr, + moh_dia_diagnosis_type_code, + moh_dia_clinical_code, + moh_dia_clinical_sys_code + FROM [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_diag] + WHERE moh_dia_clinical_sys_code IN ('11','12','13','14','15') + AND (substring(moh_dia_clinical_code,1,3) IN ('W00','W01','W02','W03','W04','W05','W06','W07','W08','W09','W10','W11', + 'W12','W13','W14','W15','W16','W17','W18','W19'))) as fndp +ON nmd.moh_evt_event_id_nbr = fndp.moh_dia_event_id_nbr +GO + +/** PRIVATE HOSPITALS **/ +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_falls_injury] (snz_uid, event_date, source) +SELECT snz_uid, + moh_pri_evt_end_date AS event_date, + 'PRIV HOSP' AS source +FROM [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_event] as nmd +INNER JOIN ( + SELECT moh_pri_diag_event_id_nbr, + moh_pri_diag_diag_type_code, + moh_pri_diag_clinic_code, + moh_pri_diag_clinic_sys_code + FROM [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_diag] + WHERE moh_pri_diag_clinic_sys_code IN ('11','12','13','14','15') + AND (substring(moh_pri_diag_clinic_code,1,3) IN ('W00','W01','W02','W03','W04','W05','W06','W07','W08','W09','W10','W11', + 'W12','W13','W14','W15','W16','W17','W18','W19')) +) as fndp +ON nmd.moh_pri_evt_event_id_nbr = fndp.moh_pri_diag_event_id_nbr +GO diff --git a/health/funded_moh_disability.sql b/health/funded_moh_disability.sql new file mode 100644 index 0000000..f3b672d --- /dev/null +++ b/health/funded_moh_disability.sql @@ -0,0 +1,71 @@ +/************************************************************************************************** +Title: Recent funded MOH disability client +Author: Craig Wright + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Recent funded MOH disability client in SOCRATES + +Intended purpose: +Identifying people receiving funding via SOCRATES (MOH) for a disability. + +Inputs & Dependencies: +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_needs_assessment_202110] +- [IDI_Clean].[security].[concordance] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[moh_disability_funded] + +Notes: + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = vacc_ + Project schema = DL-MAA20XX-YY + +Issues: + +History (reverse order): +2021-10-31 CW v1 +**************************************************************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[moh_disability_funded] +GO + +SELECT b.snz_uid + ,1 as moh_disability_funded + --,a.[snz_moh_uid] + --,[snz_moh_soc_client_uid] + --,[ReferralID] + --,[NeedsAssessmentID] + --,[NASCCode] + --,[AssessmentType] + --,[AssessmentLocation] + --,[ReassessmentRequestDate] + --,[FirstContactDate] + --,[DateAssessmentCompleted] + --,[AssessmentOutcome] + --,[CurrentNA] +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[moh_disability_funded] +FROM [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_needs_assessment_202110] as a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] as b +ON a.snz_moh_uid = b.snz_moh_uid +WHERE [CurrentNA] = 1 +AND assessmentoutcome = 'Requires Service Coordination' +GO + +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[moh_disability_funded] (snz_uid); +GO +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[moh_disability_funded] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE) +GO diff --git a/health/health_cards.sql b/health/health_cards.sql new file mode 100644 index 0000000..aebc3ee --- /dev/null +++ b/health/health_cards.sql @@ -0,0 +1,111 @@ +/************************************************************************************************** +Title: Health cards +Author: Simon Anastasiadis + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Community Services Card and High Use Health Card use +for perscriptions or claimed via general medical subsidies. + +Intended purpose: +Determining who has a community services card and/or a high use health card. + +Inputs & Dependencies: +- [IDI_Clean].[moh_clean].[pharmaceutical] +- [IDI_Clean].[moh_clean].[gms_claims] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[defn_health_cs_card] +- [IDI_UserCode].[DL-MAA20XX-YY].[defn_health_huh_card] + +Notes: +1) While the GMS table codes both health cards as Y(es) or N(o), + the pharmaceutical table uses a different coding of health cards: + HUHC in (NULL, U, Z) + CSC in (NULL, 1, 3, 4) + Based on the proportion of the population in each category, we has assumed + HUHC = Z and CSC = 1 or 3 are equivalent to Yes, and the others are equivalent to No. +2) As this tables observes use of health cards, we require some assumptions as to + when people have health cards. We shall assume if you used a health card then you + have an active card for the month previous and two months after. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +Issues: + +History (reverse order): +2020-05-20 SA v1 +**************************************************************************************************/ + +/* Set database for writing views */ +USE IDI_UserCode +GO + +/* Clear existing view */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[defn_health_cs_card]; +GO + +/* Create view */ +CREATE VIEW [DL-MAA20XX-YY].[defn_health_cs_card] AS +SELECT [snz_uid] + ,[moh_pha_dispensed_date] AS [event_date] + ,DATEADD(MONTH, -1, [moh_pha_dispensed_date]) AS [start_date] + ,DATEADD(MONTH, 2, [moh_pha_dispensed_date]) AS [end_date] + ,[moh_pha_csc_holder_code] AS [csc] + ,[moh_pha_huhc_holder_code] AS [huhc] +FROM [IDI_Clean_YYYYMM].[moh_clean].[pharmaceutical] +WHERE [moh_pha_csc_holder_code] IN ('1', '3') + +UNION ALL + +SELECT [snz_uid] + ,[moh_gms_visit_date] AS [event_date] + ,DATEADD(MONTH, -1, [moh_gms_visit_date]) AS [start_date] + ,DATEADD(MONTH, 2, [moh_gms_visit_date]) AS [end_date] + ,[moh_gms_csc_code] AS [csc] + ,[moh_gms_huhc_code] AS [huhc] +FROM [IDI_Clean_YYYYMM].[moh_clean].[gms_claims] +WHERE [moh_gms_csc_code] = 'Y' + +GO + +/* Clear existing view */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[defn_health_huh_card]; +GO + +/* Create view */ +CREATE VIEW [DL-MAA20XX-YY].[defn_health_huh_card] AS +SELECT [snz_uid] + ,[moh_pha_dispensed_date] AS [event_date] + ,DATEADD(MONTH, -1, [moh_pha_dispensed_date]) AS [start_date] + ,DATEADD(MONTH, 2, [moh_pha_dispensed_date]) AS [end_date] + ,[moh_pha_csc_holder_code] AS [csc] + ,[moh_pha_huhc_holder_code] AS [huhc] +FROM [IDI_Clean_YYYYMM].[moh_clean].[pharmaceutical] +WHERE [moh_pha_huhc_holder_code] IN ('Z') + +UNION ALL + +SELECT [snz_uid] + ,[moh_gms_visit_date] AS [event_date] + ,DATEADD(MONTH, -1, [moh_gms_visit_date]) AS [start_date] + ,DATEADD(MONTH, 2, [moh_gms_visit_date]) AS [end_date] + ,[moh_gms_csc_code] AS [csc] + ,[moh_gms_huhc_code] AS [huhc] +FROM [IDI_Clean_YYYYMM].[moh_clean].[gms_claims] +WHERE [moh_gms_huhc_code] = 'Y' +GO diff --git a/health/immunisations.sql b/health/immunisations.sql new file mode 100644 index 0000000..5e9cb3b Binary files /dev/null and b/health/immunisations.sql differ diff --git a/health/mha_bipolar.sql b/health/mha_bipolar.sql new file mode 100644 index 0000000..9ffce6d --- /dev/null +++ b/health/mha_bipolar.sql @@ -0,0 +1,455 @@ +/************************************************************************************************** +Title: Bipolar disorder +Author: Craig Wright +Re-work: Manjusha Radhakrishnan, Simon Anastasiadis +Reviewer: + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Indication of Bipolar. + +Intended purpose: +All instances that point to a person having Bipolar +For incidence and prevalence - see notes. + +Inputs & Dependencies: +- [IDI_Clean].[security].[concordance] +- [IDI_Clean].[data].[personal_detail] +- [IDI_Clean].[moh_clean].[mortality_diagnosis] +- [IDI_Clean].[moh_clean].[mortality_registrations] +- [IDI_Clean].[moh_clean].[pharmaceutical] +- [IDI_Clean].[moh_clean].[interrai] +- [IDI_Clean].[msd_clean].[msd_incapacity] +- [IDI_Clean].[moh_clean].[lab_claims] +- [IDI_Clean].[moh_clean].[priv_fund_hosp_discharges_event] +- [IDI_Clean].[moh_clean].[priv_fund_hosp_discharges_diag] +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_diag] +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_event] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_labs_lab_test] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_dim_form_pack_subsidy_code] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_dim_form_pack_subsidy_code] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS_CLIN_DIAG_CODES].[clinical_codes] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_interrai_question_lookup] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_interrai_answer_lookup] +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_disability] +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_needs_assessment] +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_referral] +- [IDI_Adhoc].[clean_read_MOH_PRIMHD].[moh_primhd_mhinc] +- [IDI_Adhoc].[clean_read_MOH_PRIMHD].[primhd_diagnoses] + +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_bipolar] +- [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] + + +Notes: +1) Estimates for prevalance: + NZ 2.4% of the population Te Rau Hinengaro: The NZ Mental Health Survey 2006 + USA 2.8% of adults + +2) Incidence and prevalence + Incidence = development of a condition during a particular time period (new only). + Prevalance = affected by the condition during a particular time period (new + existing). + Plots of incidence (from this definition) against deprivation show a non-monotinic pattern. + Incidence increases from deprivation 1-8 but deprivation 9 and 10 show lower incidence. + This is likely because of barriers to access (cost, transport, availability etc.). + + So we can not use point in time measures to accurately access prevalance. + Instead, when using this definition, we recommend considering every person's history. + For example, if we consider any bipolar indication in the last 10 years, then we observe + a more likely relationship between bipolar and deprivation. This suggests that people in high + deprivation do get some treatment, but much less frequently than those who are less deprived. + Such an approach is reasonable, as this condition is likely to persist for at least 10 years. + +3) Some treatments are used to treat this condition and other conditions. This is most common + with pharmaceuticals. For these treatments, we do the following: + - Gather the treatments that serve multiple purposes together separately. + - Where a person only has records that also treat other conditions, discard those records + - Where a person has other evidence of bipolar, add the treatments for multiple conditions + to the table. + +4) Certain medial events are coded using the ICD9, ICD10, or DSM codes - most commonly hospital diagnoses. + There are mappings between the different codings in the diagnoses table. The mappings help researchers who are + familiar with only one coding system to locate records from a different coding system. + Most records (at least 80%) are stored in two versions/rows (the submitted code system, and an alternative they + have been mapped to), so researchers could use either version. + However, the mappings are imperfect. In some cases a more specific code we do want is mapped to a more general code + that we do not want. Hence, to ensure the most robust results, we have limited ourselves to only those records where + the diagnostic code is stored in the same system it was submitted. + This may exclude some records from our output definition. Researchers needing the broadest possible definition are + advised to review this constraint. + +Issues: +1) Because all our MHA tables use the same lookup/reference table, and all definitions load + this table into the database, you can not run the definitions in parallel. Because, each + definition will delete the reference table when it starts running and this will interfere + with the definitions that are already running. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +History (reverse order): +2022-09-12 SA Prep for library +2022-07-19 MR Tidy-up +2022-06-10 CW Definition creation +*************************************************************************************************************************/ + +/* Download the diagnosis lookup table from Github folder and upload onto datalab */ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] ( + diagnosis VARCHAR(30), + code_type VARCHAR(30), + code VARCHAR(10), + aux VARCHAR(30), + explanation VARCHAR(255), +) + +BULK INSERT [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] +FROM '\\your project folder\diagnosis_codes.csv' +WITH ( + FIRSTROW = 2, + FIELDTERMINATOR = ',', + ROWTERMINATOR = '\n', + TABLOCK +) + +/******************************************************** +TABLES TO APPEND TO +********************************************************/ + +/* Diagnosis or treatment only indicates bipolar */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_bipolar_solo] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_bipolar_solo] ( + snz_uid INT, + event_date DATE, +) + +/* Diagnosis or treatment used for bipolar and other conditions */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_bipolar_multi] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_bipolar_multi] ( + snz_uid INT, + event_date DATE, +) + +/******************************************************** +MORTALITY + +Note that people who died with this diagnosis will likely +have had bipolar for a while before death +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_bipolar_solo] (snz_uid, event_date) + +SELECT b.snz_uid + ,EOMONTH(DATEFROMPARTS([moh_mor_death_year_nbr],[moh_mor_death_month_nbr],1)) AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[mortality_diagnosis] AS a +INNER JOIN [IDI_Clean_YYYYMM].[moh_clean].[mortality_registrations] AS b +ON a.[snz_dia_death_reg_uid] = b.snz_dia_death_reg_uid +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_mort_diag_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'bipolar' + AND r.code_type = 'ICD10' + AND [moh_mort_diag_clinic_sys_code] >= 10 + AND [moh_mort_diag_clinic_type_code] in ('A','B','V') +) +AND [moh_mort_diag_clinic_sys_code] >= 10 +GO + +/******************************************************** +PHARMACEUTICALS +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_bipolar_solo] (snz_uid, event_date) +SELECT a.[snz_uid] + ,[moh_pha_dispensed_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[pharmaceutical] AS a +INNER JOIN [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_dim_form_pack_subsidy_code] AS b +ON a.[moh_pha_dim_form_pack_code] = b.[DIM_FORM_PACK_SUBSIDY_KEY] +WHERE EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(CAST(CHEMICAL_ID AS VARCHAR), 1, LEN(r.code)) = r.code + AND r.diagnosis = 'bipolar' + AND r.code_type = 'pharm_chemical' + AND r.aux = 'sole' +) +GO + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_bipolar_multi] (snz_uid, event_date) +SELECT a.[snz_uid] + ,[moh_pha_dispensed_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[pharmaceutical] AS a +INNER JOIN [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_dim_form_pack_subsidy_code] AS b +ON a.[moh_pha_dim_form_pack_code] = b.[DIM_FORM_PACK_SUBSIDY_KEY] +WHERE EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(CAST(CHEMICAL_ID AS VARCHAR), 1, LEN(r.code)) = r.code + AND r.diagnosis = 'bipolar' + AND r.code_type = 'pharm_chemical' + AND r.aux = 'multiple' +) +OR EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(CAST(formulation_id AS VARCHAR), 1, LEN(r.code)) = r.code + AND r.diagnosis = 'bipolar' + AND r.code_type = 'pharm_formulation' + AND r.aux = 'multiple' +) +GO + +/******************************************************** +INTERRAI +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_bipolar_solo] (snz_uid, event_date) +SELECT [snz_uid] + ,[moh_irai_assessment_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[interrai] +WHERE moh_irai_bipolar_code > 0 +GO + +/******************************************************** +SOCRATES +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_bipolar_solo] (snz_uid, event_date) +SELECT b.snz_uid + ,COALESCE( + CAST(SUBSTRING([FirstContactDate], 1, 7) AS DATE), + CAST(SUBSTRING([ReferralDate],1,7) AS DATE) + ) AS event_date +FROM [IDI_Adhoc].[clean_read_MOH_SOCRATES].moh_disability AS a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b +ON a.snz_moh_uid = b.snz_moh_uid +INNER JOIN [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_needs_assessment] AS c +ON a.snz_moh_uid = c.snz_moh_uid +LEFT JOIN [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_referral] AS e +ON a.snz_moh_uid = e.snz_moh_uid +WHERE a.[Code] = '1303' + +/******************************************************** +MSD INCAPACITATION +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_bipolar_solo] (snz_uid, event_date) +SELECT [snz_uid] + ,[msd_incp_incp_from_date] AS event_date +FROM [IDI_Clean_YYYYMM].[msd_clean].[msd_incapacity] +WHERE [msd_incp_incrsn_code] = '162' +OR [msd_incp_incrsn95_1_code] = '162' +OR [msd_incp_incrsn95_2_code] = '162' +OR [msd_incp_incrsn95_3_code] = '162' +OR [msd_incp_incrsn95_4_code] = '162' +OR [msd_incp_incapacity_code] = '162' +GO + +/******************************************************** +LAB TESTS +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_bipolar_solo] (snz_uid, event_date) +SELECT [snz_uid] + ,[moh_lab_visit_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[lab_claims] +WHERE EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING([moh_lab_test_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'bipolar' + AND r.code_type = 'lab_test' +) +GO + +/******************************************************** +PRIVATE HOSPITAL DISCHARGE +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_bipolar_solo] (snz_uid, event_date) +SELECT a.[snz_uid] + ,CAST([moh_pri_evt_start_date] AS DATE) AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_event] AS a +INNER JOIN [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_diag] AS b +ON a.[moh_pri_evt_event_id_nbr] = b.[moh_pri_diag_event_id_nbr] +AND [moh_pri_diag_sub_sys_code] = [moh_pri_diag_clinic_sys_code] +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING([moh_pri_diag_clinic_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'bipolar' + AND r.code_type = 'ICD10' + AND [moh_pri_diag_sub_sys_code] >= '10' + AND [moh_pri_diag_diag_type_code] IN ('A','B','V') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING([moh_pri_diag_clinic_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'bipolar' + AND r.code_type = 'ICD9' + AND [moh_pri_diag_sub_sys_code] IN ('06','6') + AND [moh_pri_diag_diag_type_code] IN ('A','B','V') +) +GO + +/******************************************************** +PUBLIC HOSPITAL DISCHARGE +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_bipolar_solo] (snz_uid, event_date) +SELECT b.[snz_uid] + ,[moh_evt_evst_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_diag] AS a +INNER JOIN [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_event] AS b +ON [moh_dia_clinical_sys_code] = [moh_dia_submitted_system_code] +AND [moh_evt_event_id_nbr]=[moh_dia_event_id_nbr] +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_dia_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'bipolar' + AND r.code_type = 'ICD10' + AND [moh_dia_submitted_system_code] >= '10' + AND [moh_dia_diagnosis_type_code] IN ('A','B','V') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_dia_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'bipolar' + AND r.code_type = 'ICD9' + AND [moh_dia_submitted_system_code] IN ('06','6') + AND [moh_dia_diagnosis_type_code] IN ('A','B','V') +) +GO + +/******************************************************** +PRIMHD AND MHINC +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_bipolar_solo] (snz_uid, event_date) +SELECT b.snz_uid + ,[classification_start] AS event_date +FROM [IDI_Adhoc].[clean_read_MOH_PRIMHD].[moh_primhd_mhinc] AS a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b +ON a.snz_moh_uid = b.snz_moh_uid +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'bipolar' + AND r.code_type = 'ICD10' + AND [clinical_coding_system_id] >= '10' +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'bipolar' + AND r.code_type = 'DSM' + AND [clinical_coding_system_id] IN ('07','7') +) +GO + +/******************************************************** +PRIMHD DIAGNOSIS +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_bipolar_solo] (snz_uid, event_date) +SELECT snz_uid + ,DATEFROMPARTS(SUBSTRING([CLASSIFICATION_START_DATE],7,4),SUBSTRING([CLASSIFICATION_START_DATE],4,2),SUBSTRING([CLASSIFICATION_START_DATE],1,2)) AS event_date +FROM [IDI_Adhoc].[clean_read_MOH_PRIMHD].[primhd_diagnoses] AS a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b +ON a.snz_moh_uid = b.snz_moh_uid + +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'bipolar' + AND r.code_type = 'ICD10' + AND [clinical_coding_system_id] >= '10' +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'bipolar' + AND r.code_type = 'DSM' + AND [clinical_coding_system_id] IN ('07','7') +) +GO + +/**************************************************************************************************************** +FINAL TABLE CREATION +****************************************************************************************************************/ + +/* Add indexes */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_bipolar_solo] (snz_uid); +GO +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_bipolar_multi] (snz_uid); +GO + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_bipolar] +GO + +WITH multi_to_add AS ( + SELECT snz_uid, event_date + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_bipolar_multi] AS m + WHERE EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_bipolar_solo] AS s + WHERE m.snz_uid = s.snz_uid + ) +) +SELECT DISTINCT snz_uid, event_date +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_bipolar] +FROM ( + SELECT snz_uid, event_date + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_bipolar_solo] + + UNION ALL + + SELECT snz_uid, event_date + FROM multi_to_add +) AS k +GO + +/******************************************************** +TIDY UP +********************************************************/ + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_bipolar] (snz_uid); +GO +/* Compress final table to save space */ +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_bipolar] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE); +GO + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_bipolar_solo] +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_bipolar_multi] +GO diff --git a/health/mha_dysthymia.sql b/health/mha_dysthymia.sql new file mode 100644 index 0000000..36e1e69 --- /dev/null +++ b/health/mha_dysthymia.sql @@ -0,0 +1,292 @@ +/************************************************************************************************** +Title: Dysthymia incidence and prevalence +Author: Craig Wright +Re-work: Manjusha Radhakrishnan, Simon Anastasiadis +Reviewer: + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Indication of dysthymia. + +Intended purpose: +All instances that point to a person having dysthymia. +Also known as Persistent depressive disorder. +For incidence and prevalence - see notes. + +Inputs & Dependencies: +- [IDI_Clean].[security].[concordance] +- [IDI_Clean].[moh_clean].[priv_fund_hosp_discharges_event] +- [IDI_Clean].[moh_clean].[priv_fund_hosp_discharges_diag] +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_diag] +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_event] +- [IDI_Adhoc].[clean_read_MOH_PRIMHD].[moh_primhd_mhinc] +- [IDI_Adhoc].[clean_read_MOH_PRIMHD].[primhd_diagnoses] +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_disability] +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_needs_assessment] +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_referral] + +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_dysthymia] +- [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] + + +Notes: +1) Estimates for prevalance: + NZ 1-1.3% of the population Te Rau Hinengaro: The NZ Mental Health Survey 2006 + USA 6% at any time + USA 3% in the last 12 months + +2) Incidence and prevalence + Incidence = development of a condition during a particular time period (new only). + Prevalance = affected by the condition during a particular time period (new + existing). + Plots of incidence (from this definition) against deprivation show a non-monotinic pattern. + Incidence increases from deprivation 1-8 but deprivation 9 and 10 show lower incidence. + This is likely because of barriers to access (cost, transport, availability etc.). + + So we can not use point in time measures to accurately access prevalance. + Instead, when using this definition, we recommend considering every person's history. + For example, if we consider any dysthymia indication in the last 10 years, then we observe + a more likely relationship between dysthymia and deprivation. This suggests that people in high + deprivation do get some treatment, but much less frequently than those who are less deprived. + Such an approach is reasonable, as this condition is likely to persist for at least 10 years. + +3) Certain medial events are coded using the ICD9, ICD10, or DSM codes - most commonly hospital diagnoses. + There are mappings between the different codings in the diagnoses table. The mappings help researchers who are + familiar with only one coding system to locate records from a different coding system. + Most records (at least 80%) are stored in two versions/rows (the submitted code system, and an alternative they + have been mapped to), so researchers could use either version. + However, the mappings are imperfect. In some cases a more specific code we do want is mapped to a more general code + that we do not want. Hence, to ensure the most robust results, we have limited ourselves to only those records where + the diagnostic code is stored in the same system it was submitted. + This may exclude some records from our output definition. Researchers needing the broadest possible definition are + advised to review this constraint. + +Issues: +1) Because all our MHA tables use the same lookup/reference table, and all definitions load + this table into the database, you can not run the definitions in parallel. Because, each + definition will delete the reference table when it starts running and this will interfere + with the definitions that are already running. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +History (reverse order): +2022-09-12 SA Prep for library +2022-07-19 MR Tidy-up +2022-06-10 CW Definition creation +*************************************************************************************************************************/ + + +/* Download the diagnosis lookup table from Github folder and upload onto datalab */ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] ( + diagnosis VARCHAR(30), + code_type VARCHAR(30), + code VARCHAR(10), + aux VARCHAR(30), + explanation VARCHAR(255), +) + +BULK INSERT [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] +FROM '\\your project folder\diagnosis_codes.csv' +WITH ( + FIRSTROW = 2, + FIELDTERMINATOR = ',', + ROWTERMINATOR = '\n', + TABLOCK +) + +/******************************************************** +TABLES TO APPEND TO +********************************************************/ + +/* Diagnosis or treatment only indicates dysthymia */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_dysthymia_solo] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_dysthymia_solo] ( + snz_uid INT, + event_date DATE, +) + +/******************************************************** +PRIVATE HOSPITAL DISCHARGE +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_dysthymia_solo] (snz_uid, event_date) +SELECT a.[snz_uid] + ,CAST([moh_pri_evt_start_date] AS DATE) AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_event] AS a +INNER JOIN [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_diag] AS b +ON a.[moh_pri_evt_event_id_nbr] = b.[moh_pri_diag_event_id_nbr] +AND [moh_pri_diag_sub_sys_code] = [moh_pri_diag_clinic_sys_code] +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING([moh_pri_diag_clinic_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'dysthymia' + AND r.code_type = 'ICD10' + AND [moh_pri_diag_sub_sys_code] >= '10' + AND [moh_pri_diag_diag_type_code] IN ('A','B','V') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING([moh_pri_diag_clinic_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'dysthymia' + AND r.code_type = 'ICD9' + AND [moh_pri_diag_sub_sys_code] IN ('06','6') + AND [moh_pri_diag_diag_type_code] IN ('A','B','V') +) +GO + +/******************************************************** +PUBLIC HOSPITAL DISCHARGE +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_dysthymia_solo] (snz_uid, event_date) +SELECT b.[snz_uid] + ,[moh_evt_evst_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_diag] AS a +INNER JOIN [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_event] AS b +ON [moh_dia_clinical_sys_code] = [moh_dia_submitted_system_code] +AND [moh_evt_event_id_nbr]=[moh_dia_event_id_nbr] +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_dia_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'dysthymia' + AND r.code_type = 'ICD10' + AND [moh_dia_submitted_system_code] >= '10' + AND [moh_dia_diagnosis_type_code] IN ('A','B','V') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_dia_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'dysthymia' + AND r.code_type = 'ICD9' + AND [moh_dia_submitted_system_code] IN ('06','6') + AND [moh_dia_diagnosis_type_code] IN ('A','B','V') +) +GO + +/******************************************************** +PRIMHD AND MHINC +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_dysthymia_solo] (snz_uid, event_date) +SELECT b.snz_uid + ,[classification_start] AS event_date +FROM [IDI_Adhoc].[clean_read_MOH_PRIMHD].[moh_primhd_mhinc] AS a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b +ON a.snz_moh_uid = b.snz_moh_uid +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'dysthymia' + AND r.code_type = 'ICD10' + AND [clinical_coding_system_id] >= '10' +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'dysthymia' + AND r.code_type = 'DSM' + AND [clinical_coding_system_id] IN ('07','7') +) +GO + +/******************************************************** +PRIMHD DIAGNOSES +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_dysthymia_solo] (snz_uid, event_date) +SELECT b.snz_uid + ,CONVERT(DATE, CLASSIFICATION_START_DATE, 103) AS event_date +FROM [IDI_Adhoc].[clean_read_MOH_PRIMHD].[primhd_diagnoses] AS a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b +ON a.snz_moh_uid = b.snz_moh_uid +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'dysthymia' + AND r.code_type = 'ICD10' + AND [clinical_coding_system_id] >= '10' +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'dysthymia' + AND r.code_type = 'DSM' + AND [clinical_coding_system_id] IN ('07','7') +) +GO + +/******************************************************** +SOCRATES + +We use the depression code from socrates as there is not a dysthymia specific code. +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_dysthymia_multi] (snz_uid, event_date) +SELECT b.snz_uid + ,COALESCE( + CAST(SUBSTRING([FirstContactDate], 1, 7) AS DATE), + CAST(SUBSTRING([ReferralDate],1,7) AS DATE) + ) AS event_date +FROM [IDI_Adhoc].[clean_read_MOH_SOCRATES].moh_disability AS a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b +ON a.snz_moh_uid = b.snz_moh_uid +INNER JOIN [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_needs_assessment] AS c +ON a.snz_moh_uid = c.snz_moh_uid +LEFT JOIN [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_referral] AS e +ON a.snz_moh_uid = e.snz_moh_uid +WHERE a.[Code] = '1304' + +/**************************************************************************************************************** +FINAL TABLE CREATION +****************************************************************************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_dysthymia] +GO + +SELECT DISTINCT snz_uid, event_date +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_dysthymia] +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_dysthymia_solo] +GO + +/******************************************************** +TIDY UP +********************************************************/ + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_dysthymia] (snz_uid); +GO +/* Compress final table to save space */ +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_dysthymia] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE); +GO + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_dysthymia_solo] +GO diff --git a/health/mha_generalised_anxiety_disorder.sql b/health/mha_generalised_anxiety_disorder.sql new file mode 100644 index 0000000..c252a42 --- /dev/null +++ b/health/mha_generalised_anxiety_disorder.sql @@ -0,0 +1,409 @@ +/************************************************************************************************** +Title: Generalised anxiety disorder (GAD) +Author: Craig Wright +Re-work: Manjusha Radhakrishnan, Simon Anastasiadis +Reviewer: + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Indication of Generalised anxiety disorder (GAD). + +Intended purpose: +All instances that point to a person having Generalised Anxiety Disorder (GEN) +For incidence and prevalence - see notes. + +Inputs & Dependencies: +- [IDI_Clean].[security].[concordance] +- [IDI_Clean].[moh_clean].[interrai] +- [IDI_Clean].[msd_clean].[msd_incapacity] +- [IDI_Clean].[moh_clean].[pharmaceutical] +- [IDI_Clean].[moh_clean].[priv_fund_hosp_discharges_event] +- [IDI_Clean].[moh_clean].[priv_fund_hosp_discharges_diag] +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_diag] +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_event] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_interrai_question_lookup] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_interrai_answer_lookup] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_dim_form_pack_subsidy_code] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_dim_form_pack_subsidy_code] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS_CLIN_DIAG_CODES].[clinical_codes] +- [IDI_Adhoc].[clean_read_MOH_PRIMHD].[moh_primhd_mhinc] +- [IDI_Adhoc].[clean_read_MOH_PRIMHD].[primhd_diagnoses] +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_disability] +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_needs_assessment] +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_referral] + +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_generalised_anxiety_disorder] +- [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] + +Notes: +1) Estimates for prevalance: + GAD prevalence NZ 1.6-2.8% depending on age group + Te Rau Hinengaro: The NZ Mental Health Survey 2006, Author S Wells + +2) Incidence and prevalence + Incidence = development of a condition during a particular time period (new only). + Prevalance = affected by the condition during a particular time period (new + existing). + Plots of incidence (from this definition) against deprivation show a non-monotinic pattern. + Incidence increases from deprivation 1-8 but deprivation 9 and 10 show lower incidence. + This is likely because of barriers to access (cost, transport, availability etc.). + + So we can not use point in time measures to accurately access prevalance. + Instead, when using this definition, we recommend considering every person's history. + For example, if we consider any GAD indication in the last 10 years, then we observe + a more likely relationship between GAD and deprivation. This suggests that people in high + deprivation do get some treatment, but much less frequently than those who are less deprived. + Such an approach is reasonable, as this condition is likely to persist for at least 10 years. + +3) Some treatments are used to treat this condition and other conditions. This is most common + with pharmaceuticals. For these treatments, we do the following: + - Gather the treatments that serve multiple purposes together separately. + - Where a person only has records that also treat other conditions, discard those records + - Where a person has other evidence of GAD, add the treatments for multiple conditions + to the table. + For example, in 2010 a person is diagnosed in hospital with GAD, in 2008 they received drugs + that are used to treat GAD or depression. Then we include the 2008 date as an earlier probable indicator. + +4) Certain medial events are coded using the ICD9, ICD10, or DSM codes - most commonly hospital diagnoses. + There are mappings between the different codings in the diagnoses table. The mappings help researchers who are + familiar with only one coding system to locate records from a different coding system. + Most records (at least 80%) are stored in two versions/rows (the submitted code system, and an alternative they + have been mapped to), so researchers could use either version. + However, the mappings are imperfect. In some cases a more specific code we do want is mapped to a more general code + that we do not want. Hence, to ensure the most robust results, we have limited ourselves to only those records where + the diagnostic code is stored in the same system it was submitted. + This may exclude some records from our output definition. Researchers needing the broadest possible definition are + advised to review this constraint. + +5) Aim is to have a date at which the person transitions from being susceptible for GAD to having + been diagnosed. + +6) MSD incapacitation, SOCRATES, and INTERRAI are only used as multi-type indicators. These tables + record anxiety, but do not distinguish GAD from anxiety. Hence, we can not use these as a primary + source of diagnosis, only as a potential time period. + +Issues: +1) Because all our MHA tables use the same lookup/reference table, and all definitions load + this table into the database, you can not run the definitions in parallel. Because, each + definition will delete the reference table when it starts running and this will interfere + with the definitions that are already running. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +History (reverse order): +2022-09-12 SA Prep for library +2022-07-19 MR Tidy-up +2022-06-10 CW Definition creation +*************************************************************************************************************************/ + +/* Download the diagnosis lookup table from Github folder and upload onto datalab */ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] ( + diagnosis VARCHAR(30), + code_type VARCHAR(30), + code VARCHAR(10), + aux VARCHAR(30), + explanation VARCHAR(255), +) + +BULK INSERT [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] +FROM '\\your project folder\diagnosis_codes.csv' +WITH ( + FIRSTROW = 2, + FIELDTERMINATOR = ',', + ROWTERMINATOR = '\n', + TABLOCK +) + +/******************************************************** +TABLES TO APPEND TO +********************************************************/ + +/* Diagnosis or treatment only indicates Generalised Anxiety Disorder */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_anxiety_solo] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_anxiety_solo] ( + snz_uid INT, + event_date DATE, +) + +/* Diagnosis or treatment used for GAD and other conditions */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_anxiety_multi] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_anxiety_multi] ( + snz_uid INT, + event_date DATE, +) + +/******************************************************** +INTERRAI +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_anxiety_multi] (snz_uid, event_date) +SELECT [snz_uid] + ,[moh_irai_assessment_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[interrai] +WHERE moh_irai_anxiety_code > 0 +GO + +/******************************************************** +MSD INCAPACITATION +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_anxiety_multi] (snz_uid, event_date) +SELECT [snz_uid] + ,[msd_incp_incp_from_date] AS event_date +FROM [IDI_Clean_YYYYMM].[msd_clean].[msd_incapacity] +WHERE [msd_incp_incrsn_code] IN ('009', '165') +OR [msd_incp_incrsn95_1_code] IN ('009', '165') +OR [msd_incp_incrsn95_2_code] IN ('009', '165') +OR [msd_incp_incrsn95_3_code] IN ('009', '165') +OR [msd_incp_incrsn95_4_code] IN ('009', '165') +OR [msd_incp_incapacity_code] IN ('009', '165') +GO + + +/******************************************************** +PHARMACEUTICALS +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_anxiety_solo] (snz_uid, event_date) +SELECT a.[snz_uid] + ,[moh_pha_dispensed_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[pharmaceutical] AS a +INNER JOIN [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_dim_form_pack_subsidy_code] AS b +ON a.[moh_pha_dim_form_pack_code] = b.[DIM_FORM_PACK_SUBSIDY_KEY] +WHERE EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(CAST(CHEMICAL_ID AS VARCHAR), 1, LEN(r.code)) = r.code + AND r.diagnosis = 'anxiety' + AND r.code_type = 'pharm_chemical' + AND r.aux = 'sole' +) +GO + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_anxiety_multi] (snz_uid, event_date) +SELECT a.[snz_uid] + ,[moh_pha_dispensed_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[pharmaceutical] AS a +INNER JOIN [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_dim_form_pack_subsidy_code] AS b +ON a.[moh_pha_dim_form_pack_code] = b.[DIM_FORM_PACK_SUBSIDY_KEY] +WHERE EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(CAST(CHEMICAL_ID AS VARCHAR), 1, LEN(r.code)) = r.code + AND r.diagnosis = 'anxiety' + AND r.code_type = 'pharm_chemical' + AND r.aux = 'multiple' +) +GO + +/******************************************************** +PRIVATE HOSPITAL DISCHARGE +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_anxiety_solo] (snz_uid, event_date) +SELECT a.[snz_uid] + ,CAST([moh_pri_evt_start_date] AS DATE) AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_event] AS a +INNER JOIN [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_diag] AS b +ON a.[moh_pri_evt_event_id_nbr] = b.[moh_pri_diag_event_id_nbr] +AND [moh_pri_diag_sub_sys_code] = [moh_pri_diag_clinic_sys_code] +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING([moh_pri_diag_clinic_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'anxiety' + AND r.code_type = 'ICD10' + AND [moh_pri_diag_sub_sys_code] >= '10' + AND [moh_pri_diag_diag_type_code] IN ('A','B','V') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING([moh_pri_diag_clinic_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'anxiety' + AND r.code_type = 'ICD9' + AND [moh_pri_diag_sub_sys_code] IN ('06','6') + AND [moh_pri_diag_diag_type_code] IN ('A','B','V') +) +GO + +/******************************************************** +PUBLIC HOSPITAL DISCHARGE +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_anxiety_solo] (snz_uid, event_date) +SELECT b.[snz_uid] + ,[moh_evt_evst_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_diag] AS a +INNER JOIN [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_event] AS b +ON [moh_dia_clinical_sys_code] = [moh_dia_submitted_system_code] +AND [moh_evt_event_id_nbr]=[moh_dia_event_id_nbr] +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_dia_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'anxiety' + AND r.code_type = 'ICD10' + AND [moh_dia_submitted_system_code] >= '10' + AND [moh_dia_diagnosis_type_code] IN ('A','B','V') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_dia_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'anxiety' + AND r.code_type = 'ICD9' + AND [moh_dia_submitted_system_code] IN ('06','6') + AND [moh_dia_diagnosis_type_code] IN ('A','B','V') +) +GO + +/******************************************************** +PRIMHD AND MHINC +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_anxiety_solo] (snz_uid, event_date) +SELECT b.snz_uid + ,[classification_start] AS event_date +FROM [IDI_Adhoc].[clean_read_MOH_PRIMHD].[moh_primhd_mhinc] AS a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b +ON a.snz_moh_uid = b.snz_moh_uid +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'anxiety' + AND r.code_type = 'ICD10' + AND [clinical_coding_system_id] >= '10' +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'anxiety' + AND r.code_type = 'DSM' + AND [clinical_coding_system_id] IN ('07','7') +) +GO + +/******************************************************** +PRIMHD DIAGNOSIS +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_anxiety_solo] (snz_uid, event_date) +SELECT snz_uid + ,DATEFROMPARTS(SUBSTRING([CLASSIFICATION_START_DATE],7,4),SUBSTRING([CLASSIFICATION_START_DATE],4,2),SUBSTRING([CLASSIFICATION_START_DATE],1,2)) AS event_date +FROM [IDI_Adhoc].[clean_read_MOH_PRIMHD].[primhd_diagnoses] AS a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b +ON a.snz_moh_uid = b.snz_moh_uid + +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'anxiety' + AND r.code_type = 'ICD10' + AND [clinical_coding_system_id] >= '10' +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'anxiety' + AND r.code_type = 'DSM' + AND [clinical_coding_system_id] IN ('07','7') +) +GO + +/******************************************************** +SOCRATES +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_anxiety_multi] (snz_uid, event_date) +SELECT b.snz_uid + ,COALESCE( + CAST(SUBSTRING([FirstContactDate], 1, 7) AS DATE), + CAST(SUBSTRING([ReferralDate],1,7) AS DATE) + ) AS event_date +FROM [IDI_Adhoc].[clean_read_MOH_SOCRATES].moh_disability AS a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b +ON a.snz_moh_uid = b.snz_moh_uid +INNER JOIN [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_needs_assessment] AS c +ON a.snz_moh_uid = c.snz_moh_uid +LEFT JOIN [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_referral] AS e +ON a.snz_moh_uid = e.snz_moh_uid +WHERE a.[Code] = '1302' + + +/**************************************************************************************************************** +FINAL TABLE CREATION +****************************************************************************************************************/ + +/* Add indexes */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_anxiety_solo] (snz_uid); +GO +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_anxiety_multi] (snz_uid); +GO + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_generalised_anxiety_disorder] +GO + +WITH multi_to_add AS ( + SELECT snz_uid, event_date + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_anxiety_multi] AS m + WHERE EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_anxiety_solo] AS s + WHERE m.snz_uid = s.snz_uid + ) +) +SELECT DISTINCT snz_uid, event_date +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_generalised_anxiety_disorder] +FROM ( + SELECT snz_uid, event_date + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_anxiety_solo] + + UNION ALL + + SELECT snz_uid, event_date + FROM multi_to_add +) AS k +GO + +/******************************************************** +TIDY UP +********************************************************/ + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_generalised_anxiety_disorder] (snz_uid); +GO +/* Compress final table to save space */ +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_generalised_anxiety_disorder] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE); +GO + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_anxiety_solo] +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_anxiety_multi] +GO diff --git a/health/mha_major_depressive_disorder.sql b/health/mha_major_depressive_disorder.sql new file mode 100644 index 0000000..ab9543c --- /dev/null +++ b/health/mha_major_depressive_disorder.sql @@ -0,0 +1,404 @@ +/************************************************************************************************** +Title: Major Depressive Disorder (MDD) +Author: Craig Wright +Re-work: Manjusha Radhakrishnan, Simon Anastasiadis +Reviewer: + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Indication of Major Depressive Disorder (MDD). + +Intended purpose: +All instances that point to a person having MDD +For incidence and prevalence - see notes. + +Inputs & Dependencies: +- [IDI_Clean].[security].[concordance] +- [IDI_Clean].[msd_clean].[msd_incapacity] +- [IDI_Clean].[moh_clean].[pharmaceutical] +- [IDI_Clean].[moh_clean].[interrai] +- [IDI_Clean].[moh_clean].[priv_fund_hosp_discharges_diag] +- [IDI_Clean].[moh_clean].[priv_fund_hosp_discharges_event] +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_diag] +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_event] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_dim_form_pack_subsidy_code] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_dim_form_pack_subsidy_code] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS_CLIN_DIAG_CODES].[clinical_codes] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_interrai_question_lookup] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_interrai_answer_lookup] +- [IDI_Adhoc].[clean_read_MOH_PRIMHD].[moh_primhd_mhinc] +- [IDI_Adhoc].[clean_read_MOH_PRIMHD].[primhd_diagnoses] +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_disability] +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_needs_assessment] +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_referral] + +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_major_depressive_disorder] +- [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] + +Notes: +1) Estimates for prevalance: + NZ 2.4% of the population Te Rau Hinengaro: The NZ Mental Health Survey 2006 + USA 2.8% of adults + +2) Incidence and prevalence + Incidence = development of a condition during a particular time period (new only). + Prevalance = affected by the condition during a particular time period (new + existing). + Plots of incidence (from this definition) against deprivation show a non-monotinic pattern. + Incidence increases from deprivation 1-8 but deprivation 9 and 10 show lower incidence. + This is likely because of barriers to access (cost, transport, availability etc.). + + So we can not use point in time measures to accurately access prevalance. + Instead, when using this definition, we recommend considering every person's history. + For example, if we consider any MDD indication in the last 10 years, then we observe + a more likely relationship between MDD and deprivation. This suggests that people in high + deprivation do get some treatment, but much less frequently than those who are less deprived. + Such an approach is reasonable, as this condition is likely to persist for at least 10 years. + +3) Some treatments are used to treat this condition and other conditions. This is most common + with pharmaceuticals. For these treatments, we do the following: + - Gather the treatments that serve multiple purposes together separately. + - Where a person only has records that also treat other conditions, discard those records + - Where a person has other evidence of MDD, add the treatments for multiple conditions + to the table. + For example, in 2010 a person is diagnosed in hospital with MDD, in 2008 they received drugs + that are used to treat MDD or depression. Then we include the 2008 date as an earlier probable indicator. + +4) Certain medial events are coded using the ICD9, ICD10, or DSM codes - most commonly hospital diagnoses. + There are mappings between the different codings in the diagnoses table. The mappings help researchers who are + familiar with only one coding system to locate records from a different coding system. + Most records (at least 80%) are stored in two versions/rows (the submitted code system, and an alternative they + have been mapped to), so researchers could use either version. + However, the mappings are imperfect. In some cases a more specific code we do want is mapped to a more general code + that we do not want. Hence, to ensure the most robust results, we have limited ourselves to only those records where + the diagnostic code is stored in the same system it was submitted. + This may exclude some records from our output definition. Researchers needing the broadest possible definition are + advised to review this constraint. + +5) MSD incapacitation, SOCRATES, and INTERRAI are only used as multi-type indicators. These tables + record depression, but do not distinguish MDD from anxiety. Hence, we can not use these as a primary + source of diagnosis, only as a potential time period. + +Issues: +1) Because all our MHA tables use the same lookup/reference table, and all definitions load + this table into the database, you can not run the definitions in parallel. Because, each + definition will delete the reference table when it starts running and this will interfere + with the definitions that are already running. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +History (reverse order): +2022-09-12 SA Prep for library +2022-07-19 MR Tidy-up +2022-06-10 CW Definition creation +*************************************************************************************************************************/ + +/* Download the diagnosis lookup table from Github folder and upload onto datalab */ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] ( + diagnosis VARCHAR(30), + code_type VARCHAR(30), + code VARCHAR(10), + aux VARCHAR(30), + explanation VARCHAR(255), +) + +BULK INSERT [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] +FROM '\\your project folder\diagnosis_codes.csv' +WITH ( + FIRSTROW = 2, + FIELDTERMINATOR = ',', + ROWTERMINATOR = '\n', + TABLOCK +) + +/******************************************************** +TABLES TO APPEND TO +********************************************************/ + +/* Diagnosis or treatment only indicates bipolar */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_depressive_solo] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_depressive_solo] ( + snz_uid INT, + event_date DATE, +) + +/* Diagnosis or treatment used for bipolar and other conditions */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_depressive_multi] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_depressive_multi] ( + snz_uid INT, + event_date DATE, +) + +/******************************************************** +MSD INCAPACITATION +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_depressive_multi] (snz_uid, event_date) +SELECT [snz_uid] + ,[msd_incp_incp_from_date] AS event_date +FROM [IDI_Clean_YYYYMM].[msd_clean].[msd_incapacity] +WHERE [msd_incp_incrsn_code] = '161' +OR [msd_incp_incrsn95_1_code] = '161' +OR [msd_incp_incrsn95_2_code] = '161' +OR [msd_incp_incrsn95_3_code] = '161' +OR [msd_incp_incrsn95_4_code] = '161' +OR [msd_incp_incapacity_code] = '161' +GO + +/******************************************************** +PHARMACEUTICALS +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_depressive_solo] (snz_uid, event_date) +SELECT a.[snz_uid] + ,[moh_pha_dispensed_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[pharmaceutical] AS a +INNER JOIN [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_dim_form_pack_subsidy_code] AS b +ON a.[moh_pha_dim_form_pack_code] = b.[DIM_FORM_PACK_SUBSIDY_KEY] +WHERE EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(CAST(CHEMICAL_ID AS VARCHAR), 1, LEN(r.code)) = r.code + AND r.diagnosis = 'depressive' + AND r.code_type = 'pharm_chemical' + AND r.aux = 'sole' +) +GO + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_depressive_multi] (snz_uid, event_date) +SELECT a.[snz_uid] + ,[moh_pha_dispensed_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[pharmaceutical] AS a +INNER JOIN [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_dim_form_pack_subsidy_code] AS b +ON a.[moh_pha_dim_form_pack_code] = b.[DIM_FORM_PACK_SUBSIDY_KEY] +WHERE EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(CAST(CHEMICAL_ID AS VARCHAR), 1, LEN(r.code)) = r.code + AND r.diagnosis = 'depressive' + AND r.code_type = 'pharm_chemical' + AND r.aux = 'multiple' +) +GO + +/******************************************************** +INTERRAI +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_depressive_multi] (snz_uid, event_date) +SELECT [snz_uid] + ,[moh_irai_assessment_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[interrai] +WHERE moh_irai_depression_code > 0 +GO + +/******************************************************** +PRIVATE HOSPITAL DISCHARGE +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_depressive_solo] (snz_uid, event_date) +SELECT a.[snz_uid] + ,CAST([moh_pri_evt_start_date] AS DATE) AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_event] AS a +INNER JOIN [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_diag] AS b +ON a.[moh_pri_evt_event_id_nbr] = b.[moh_pri_diag_event_id_nbr] +AND [moh_pri_diag_sub_sys_code] = [moh_pri_diag_clinic_sys_code] +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING([moh_pri_diag_clinic_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'depressive' + AND r.code_type = 'ICD10' + AND [moh_pri_diag_sub_sys_code] >= '10' + AND [moh_pri_diag_diag_type_code] IN ('A','B','V') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING([moh_pri_diag_clinic_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'depressive' + AND r.code_type = 'ICD9' + AND [moh_pri_diag_sub_sys_code] IN ('06','6') + AND [moh_pri_diag_diag_type_code] IN ('A','B','V') +) +GO + +/******************************************************** +PUBLIC HOSPITAL DISCHARGE +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_depressive_solo] (snz_uid, event_date) +SELECT b.[snz_uid] + ,[moh_evt_evst_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_diag] AS a +INNER JOIN [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_event] AS b +ON [moh_dia_clinical_sys_code] = [moh_dia_submitted_system_code] +AND [moh_evt_event_id_nbr]=[moh_dia_event_id_nbr] +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_dia_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'depressive' + AND r.code_type = 'ICD10' + AND [moh_dia_submitted_system_code] >= '10' + AND [moh_dia_diagnosis_type_code] IN ('A','B','V') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_dia_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'depressive' + AND r.code_type = 'ICD9' + AND [moh_dia_submitted_system_code] IN ('06','6') + AND [moh_dia_diagnosis_type_code] IN ('A','B','V') +) +GO + +/******************************************************** +PRIMHD AND MHINC +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_depressive_solo] (snz_uid, event_date) +SELECT b.snz_uid + ,[classification_start] AS event_date +FROM [IDI_Adhoc].[clean_read_MOH_PRIMHD].[moh_primhd_mhinc] AS a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b +ON a.snz_moh_uid = b.snz_moh_uid +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'depressive' + AND r.code_type = 'ICD10' + AND [clinical_coding_system_id] >= '10' +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'depressive' + AND r.code_type = 'DSM' + AND [clinical_coding_system_id] IN ('07','7') +) +GO + +/******************************************************** +PRIMHD DIAGNOSIS +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_depressive_solo] (snz_uid, event_date) +SELECT snz_uid + ,DATEFROMPARTS(SUBSTRING([CLASSIFICATION_START_DATE],7,4),SUBSTRING([CLASSIFICATION_START_DATE],4,2),SUBSTRING([CLASSIFICATION_START_DATE],1,2)) AS event_date +FROM [IDI_Adhoc].[clean_read_MOH_PRIMHD].[primhd_diagnoses] AS a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b +ON a.snz_moh_uid = b.snz_moh_uid + +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'depressive' + AND r.code_type = 'ICD10' + AND [clinical_coding_system_id] >= '10' +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'depressive' + AND r.code_type = 'DSM' + AND [clinical_coding_system_id] IN ('07','7') +) +GO + +/******************************************************** +SOCRATES +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_depressive_multi] (snz_uid, event_date) +SELECT b.snz_uid + ,COALESCE( + CAST(SUBSTRING([FirstContactDate], 1, 7) AS DATE), + CAST(SUBSTRING([ReferralDate],1,7) AS DATE) + ) AS event_date +FROM [IDI_Adhoc].[clean_read_MOH_SOCRATES].moh_disability AS a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b +ON a.snz_moh_uid = b.snz_moh_uid +INNER JOIN [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_needs_assessment] AS c +ON a.snz_moh_uid = c.snz_moh_uid +LEFT JOIN [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_referral] AS e +ON a.snz_moh_uid = e.snz_moh_uid +WHERE a.[Code] = '1304' + +/**************************************************************************************************************** +FINAL TABLE CREATION +****************************************************************************************************************/ + +/* Add indexes */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_depressive_solo] (snz_uid); +GO +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_depressive_multi] (snz_uid); +GO + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_major_depressive_disorder] +GO + +WITH multi_to_add AS ( + SELECT snz_uid, event_date + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_depressive_multi] AS m + WHERE EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_depressive_solo] AS s + WHERE m.snz_uid = s.snz_uid + ) +) +SELECT DISTINCT snz_uid, event_date +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_major_depressive_disorder] +FROM ( + SELECT snz_uid, event_date + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_depressive_solo] + + UNION ALL + + SELECT snz_uid, event_date + FROM multi_to_add +) AS k +GO + +/******************************************************** +TIDY UP +********************************************************/ + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_major_depressive_disorder] (snz_uid); +GO +/* Compress final table to save space */ +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_major_depressive_disorder] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE); +GO + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_depressive_solo] +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_depressive_multi] +GO diff --git a/health/mha_schizophrenia.sql b/health/mha_schizophrenia.sql new file mode 100644 index 0000000..017467f --- /dev/null +++ b/health/mha_schizophrenia.sql @@ -0,0 +1,438 @@ +/************************************************************************************************** +Title: Schizophrenia spectrum +Author: Craig Wright +Re-work: Manjusha Radhakrishnan, Simon Anastasiadis +Reviewer: + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Indication of Schizophrenia spectrum. + +Intended purpose: +All instances that point to a person having schizophrenia +For incidence and prevalence - see notes. + +Inputs & Dependencies: +- [IDI_Clean].[security].[concordance] +- [IDI_Clean].[moh_clean].[mortality_diagnosis] +- [IDI_Clean].[moh_clean].[mortality_registrations] +- [IDI_Clean].[msd_clean].[msd_incapacity] +- [IDI_Clean].[moh_clean].[pharmaceutical] +- [IDI_Clean].[moh_clean].[interrai] +- [IDI_Clean].[moh_clean].[priv_fund_hosp_discharges_event] +- [IDI_Clean].[moh_clean].[priv_fund_hosp_discharges_diag] +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_diag] +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_event] +- [IDI_Clean].[moh_clean].[pharmaceutical] +- [IDI_Clean].[moh_clean].[lab_claims] +- [IDI_Clean].[acc_clean].[claims] +- [IDI_Clean].[acc_clean].[medical_codes] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_dim_form_pack_subsidy_code] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS_CLIN_DIAG_CODES].[clinical_codes] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_interrai_question_lookup] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_interrai_answer_lookup] +- [IDI_Adhoc].[clean_read_MOH_PRIMHD].[moh_primhd_mhinc] +- [IDI_Adhoc].[clean_read_MOH_PRIMHD].[primhd_diagnoses] +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_disability] +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_needs_assessment] +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_referral] + +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_schizophrenia] +- [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] + +Notes: +1) Estimates for prevalance: + 2015 estimate of 6.7 schizophrenics per 1000 + +2) Incidence and prevalence + Incidence = development of a condition during a particular time period (new only). + Prevalance = affected by the condition during a particular time period (new + existing). + Plots of incidence (from this definition) against deprivation show a non-monotinic pattern. + Incidence increases from deprivation 1-8 but deprivation 9 and 10 show lower incidence. + This is likely because of barriers to access (cost, transport, availability etc.). + + So we can not use point in time measures to accurately access prevalance. + Instead, when using this definition, we recommend considering every person's history. + For example, if we consider any schizophrenia indication in the last 10 years, then we observe + a more likely relationship between schizophrenia and deprivation. This suggests that people in high + deprivation do get some treatment, but much less frequently than those who are less deprived. + Such an approach is reasonable, as this condition is likely to persist for at least 10 years. + +3) Some treatments are used to treat this condition and other conditions. This is most common + with pharmaceuticals. For these treatments, we do the following: + - Gather the treatments that serve multiple purposes together separately. + - Where a person only has records that also treat other conditions, discard those records + - Where a person has other evidence of schizophrenia, add the treatments for multiple conditions + to the table. + For example, in 2010 a person is diagnosed in hospital with schizophrenia, in 2008 they received drugs + that are used to treat schizophrenia or depression. Then we include the 2008 date as an earlier probable indicator. + +4) Certain medial events are coded using the ICD9, ICD10, or DSM codes - most commonly hospital diagnoses. + There are mappings between the different codings in the diagnoses table. The mappings help researchers who are + familiar with only one coding system to locate records from a different coding system. + Most records (at least 80%) are stored in two versions/rows (the submitted code system, and an alternative they + have been mapped to), so researchers could use either version. + However, the mappings are imperfect. In some cases a more specific code we do want is mapped to a more general code + that we do not want. Hence, to ensure the most robust results, we have limited ourselves to only those records where + the diagnostic code is stored in the same system it was submitted. + This may exclude some records from our output definition. Researchers needing the broadest possible definition are + advised to review this constraint. + +Issues: +1) Because all our MHA tables use the same lookup/reference table, and all definitions load + this table into the database, you can not run the definitions in parallel. Because, each + definition will delete the reference table when it starts running and this will interfere + with the definitions that are already running. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +History (reverse order): +2022-09-12 SA Prep for library +2022-07-19 MR Tidy-up +2022-06-10 CW Definition creation +*************************************************************************************************************************/ + +/* Download the diagnosis lookup table from Github folder and upload onto datalab */ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] ( + diagnosis VARCHAR(30), + code_type VARCHAR(30), + code VARCHAR(10), + aux VARCHAR(30), + explanation VARCHAR(255), +) + +BULK INSERT [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] +FROM '\\your project folder\diagnosis_codes.csv' +WITH ( + FIRSTROW = 2, + FIELDTERMINATOR = ',', + ROWTERMINATOR = '\n', + TABLOCK +) + +/******************************************************** +TABLES TO APPEND TO +********************************************************/ + +/* Diagnosis or treatment only indicates schizophrenia */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_schizophrenia_solo] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_schizophrenia_solo] ( + snz_uid INT, + event_date DATE, +) + +/* Diagnosis or treatment used for schizophrenia and other conditions */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_schizophrenia_multi] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_schizophrenia_multi] ( + snz_uid INT, + event_date DATE, +) + +/******************************************************** +MORTALITY + +Note that people who died with this diagnosis will likely +have had bipolar for a while before death +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_schizophrenia_solo] (snz_uid, event_date) + +SELECT b.snz_uid + ,EOMONTH(DATEFROMPARTS([moh_mor_death_year_nbr],[moh_mor_death_month_nbr],1)) AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[mortality_diagnosis] AS a +INNER JOIN [IDI_Clean_YYYYMM].[moh_clean].[mortality_registrations] AS b +ON a.[snz_dia_death_reg_uid] = b.snz_dia_death_reg_uid +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_mort_diag_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'schizophrenia' + AND r.code_type = 'ICD10' + AND [moh_mort_diag_clinic_sys_code] >= 10 + AND [moh_mort_diag_clinic_type_code] in ('A','B','V') +) +OR EXISTS ( +SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_mort_diag_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'schizophrenia' + AND r.code_type = 'ICD9' + AND [moh_mort_diag_clinic_sys_code] IN ('06','6') + AND [moh_mort_diag_clinic_type_code] in ('A','B','V') +) +GO + +/******************************************************** +MSD INCAPACITATION +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_schizophrenia_solo] (snz_uid, event_date) +SELECT [snz_uid] + ,[msd_incp_incp_from_date] AS event_date +FROM [IDI_Clean_YYYYMM].[msd_clean].[msd_incapacity] +WHERE [msd_incp_incrsn_code] = '163' +OR [msd_incp_incrsn95_1_code] = '163' +OR [msd_incp_incrsn95_2_code] = '163' +OR [msd_incp_incrsn95_3_code] = '163' +OR [msd_incp_incrsn95_4_code] = '163' +OR [msd_incp_incapacity_code] = '163' +GO + +/******************************************************** +PHARMACEUTICALS +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_schizophrenia_solo] (snz_uid, event_date) +SELECT a.[snz_uid] + ,[moh_pha_dispensed_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[pharmaceutical] AS a +INNER JOIN [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_dim_form_pack_subsidy_code] AS b +ON a.[moh_pha_dim_form_pack_code] = b.[DIM_FORM_PACK_SUBSIDY_KEY] +WHERE EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(CAST(CHEMICAL_ID AS VARCHAR), 1, LEN(r.code)) = r.code + AND r.diagnosis = 'schizophrenia' + AND r.code_type = 'pharm_chemical' + AND r.aux = 'sole' +) +GO + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_schizophrenia_multi] (snz_uid, event_date) +SELECT a.[snz_uid] + ,[moh_pha_dispensed_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[pharmaceutical] AS a +INNER JOIN [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_dim_form_pack_subsidy_code] AS b +ON a.[moh_pha_dim_form_pack_code] = b.[DIM_FORM_PACK_SUBSIDY_KEY] +WHERE EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(CAST(CHEMICAL_ID AS VARCHAR), 1, LEN(r.code)) = r.code + AND r.diagnosis = 'schizophrenia' + AND r.code_type = 'pharm_chemical' + AND r.aux = 'multiple' +) +GO + +/******************************************************** +INTERRAI +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_schizophrenia_solo] (snz_uid, event_date) +SELECT [snz_uid] + ,[moh_irai_assessment_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[interrai] +WHERE moh_irai_schizophrenia_code > 0 +GO + +/******************************************************** +PRIVATE HOSPITAL DISCHARGE +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_schizophrenia_solo] (snz_uid, event_date) +SELECT a.[snz_uid] + ,CAST([moh_pri_evt_start_date] AS DATE) AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_event] AS a +INNER JOIN [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_diag] AS b +ON a.[moh_pri_evt_event_id_nbr] = b.[moh_pri_diag_event_id_nbr] +AND [moh_pri_diag_sub_sys_code] = [moh_pri_diag_clinic_sys_code] +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING([moh_pri_diag_clinic_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'schizophrenia' + AND r.code_type = 'ICD10' + AND [moh_pri_diag_sub_sys_code] >= '10' + AND [moh_pri_diag_diag_type_code] IN ('A','B','V') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING([moh_pri_diag_clinic_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'schizophrenia' + AND r.code_type = 'ICD9' + AND [moh_pri_diag_sub_sys_code] IN ('06','6') + AND [moh_pri_diag_diag_type_code] IN ('A','B','V') +) +GO + +/******************************************************** +PUBLIC HOSPITAL DISCHARGE +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_schizophrenia_solo] (snz_uid, event_date) +SELECT b.[snz_uid] + ,[moh_evt_evst_date] AS event_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_diag] AS a +INNER JOIN [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_event] AS b +ON [moh_dia_clinical_sys_code] = [moh_dia_submitted_system_code] +AND [moh_evt_event_id_nbr]=[moh_dia_event_id_nbr] +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_dia_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'schizophrenia' + AND r.code_type = 'ICD10' + AND [moh_dia_submitted_system_code] >= '10' + AND [moh_dia_diagnosis_type_code] IN ('A','B','V') +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[moh_dia_clinical_code], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'schizophrenia' + AND r.code_type = 'ICD9' + AND [moh_dia_submitted_system_code] IN ('06','6') + AND [moh_dia_diagnosis_type_code] IN ('A','B','V') +) +GO + +/******************************************************** +PRIMHD AND MHINC +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_schizophrenia_solo] (snz_uid, event_date) +SELECT b.snz_uid + ,[classification_start] AS event_date +FROM [IDI_Adhoc].[clean_read_MOH_PRIMHD].[moh_primhd_mhinc] AS a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b +ON a.snz_moh_uid = b.snz_moh_uid +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'schizophrenia' + AND r.code_type = 'ICD10' + AND [clinical_coding_system_id] >= '10' +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'schizophrenia' + AND r.code_type = 'DSM' + AND [clinical_coding_system_id] IN ('07','7') +) +GO + +/******************************************************** +PRIMHD DIAGNOSIS +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_schizophrenia_solo] (snz_uid, event_date) +SELECT snz_uid + ,DATEFROMPARTS(SUBSTRING([CLASSIFICATION_START_DATE],7,4),SUBSTRING([CLASSIFICATION_START_DATE],4,2),SUBSTRING([CLASSIFICATION_START_DATE],1,2)) AS event_date +FROM [IDI_Adhoc].[clean_read_MOH_PRIMHD].[primhd_diagnoses] AS a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b +ON a.snz_moh_uid = b.snz_moh_uid + +WHERE EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'schizophrenia' + AND r.code_type = 'ICD10' + AND [clinical_coding_system_id] >= '10' +) +OR EXISTS( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ref_diagnosis] AS r + WHERE SUBSTRING(a.[CLINICAL_CODE], 1, LEN(r.code)) = r.code + AND r.diagnosis = 'schizophrenia' + AND r.code_type = 'DSM' + AND [clinical_coding_system_id] IN ('07','7') +) +GO + +/******************************************************** +SOCRATES +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_schizophrenia_solo] (snz_uid, event_date) +SELECT b.snz_uid + ,COALESCE( + CAST(SUBSTRING([FirstContactDate], 1, 7) AS DATE), + CAST(SUBSTRING([ReferralDate],1,7) AS DATE) + ) AS event_date +FROM [IDI_Adhoc].[clean_read_MOH_SOCRATES].moh_disability AS a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b +ON a.snz_moh_uid = b.snz_moh_uid +INNER JOIN [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_needs_assessment] AS c +ON a.snz_moh_uid = c.snz_moh_uid +LEFT JOIN [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_referral] AS e +ON a.snz_moh_uid = e.snz_moh_uid +WHERE a.[Code] = '1306' + +/**************************************************************************************************************** +FINAL TABLE CREATION +****************************************************************************************************************/ + +/* Add indexes */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_schizophrenia_solo] (snz_uid); +GO +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_schizophrenia_multi] (snz_uid); +GO + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_schizophrenia] +GO + +WITH multi_to_add AS ( + SELECT snz_uid, event_date + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_schizophrenia_multi] AS m + WHERE EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_schizophrenia_solo] AS s + WHERE m.snz_uid = s.snz_uid + ) +) +SELECT DISTINCT snz_uid, event_date +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_schizophrenia] +FROM ( + SELECT snz_uid, event_date + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_schizophrenia_solo] + + UNION ALL + + SELECT snz_uid, event_date + FROM multi_to_add +) AS k +GO + +/******************************************************** +TIDY UP +********************************************************/ + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_schizophrenia] (snz_uid); +GO +/* Compress final table to save space */ +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[defn_mha_schizophrenia] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE); +GO + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_schizophrenia_solo] +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_schizophrenia_multi] +GO diff --git a/health/mha_service.sql b/health/mha_service.sql new file mode 100644 index 0000000..7cc177f Binary files /dev/null and b/health/mha_service.sql differ diff --git a/health/multi-source indicator of functional disability.sql b/health/multi-source indicator of functional disability.sql new file mode 100644 index 0000000..ad17173 --- /dev/null +++ b/health/multi-source indicator of functional disability.sql @@ -0,0 +1,509 @@ +/*************************************************************************************************************************** +Title: Functional disability +Author: Craig Wright +Reviewer: Simon Anastasiadis + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Multi-source indicator of functional disability + +Intended purpose: +Measuring disability is a complex and still evolving issue. There is no perfect way to classify people into categories and no full agreement amongst the +disabled community on the language to be used. However, it is important for the disabled community, researchers, and policy makers +to have some standard way to describe varying levels of participation as citizens, clients, or consumers of services. + + +## Key concepts + +Disability is a social construct that arises through the combination of two things: +1. The limitations some people have in completing some activities. +2. The barriers that exist in a person’s environment (or society in general) that limit participation. + +Information about the barriers or accommodations people experience is currently not available in administrative data. However, there are some sources data +that include functional limitations that some people have. Hence, one approach is to use functional limitations to identify groups who are more likely to be disabled. + + +## Practical notes + +The indicator is based on the Washington Group Short Set (WGSS) questions on functioning. The WGSS is a series of six questions about difficulties people might encounter doing everyday things: +• Walking +• Seeing +• Hearing +• Remembering +• Washing +• Communication + +However, not everyone has answered these questions, and some people might have acquired impairments after the questions were asked. We have supplemented these questions with data from some Ministry of Health collections: SOCRATES, which captures functional assessments for disability clients; and InterRAI, which captures similar information for older people. + +In collaboration with experts in the disabled community, the Ministry of Health and Office for Disability Issues, we have aligned each of these data sources, resulting in a three-level indicator for each of the six functional activities: +• 0 = No limitation: This group does not report any limitations in undertaking everyday tasks. They are unlikely to be disabled. +• 1 = Low functional limitation: This group reports some difficulty with everyday tasks. They are somewhat likely to be disabled. +• 2 = High functional limitation: This group reports a lot of difficulty with everyday tasks or cannot do them at all. They are very likely to be disabled. + +Following guidance from the Washington Group (2020), we have also produced one overall indicator for disability status. This takes the value of ‘disabled’ if the person had high functional limitation in at least one activity, and ‘not disabled’ otherwise. (This overall indicator is consistent with how many agencies, such as Stats NZ, are already using the WGSS to report on outcomes for disabled people.) + + +## How to use this indicator + +• Involve disabled people in your research, including when designing your approach and interpreting your results. +• Use it to compare outcomes, not to estimate the size of groups. This indicator is not suitable for counting the disabled population. The official measure of the disabled population in New Zealand is derived from the Disability Survey undertaken by Stats NZ. Use of this definition should be constrained to comparisons between groups, rather than discussing the size of those groups. +• Adjust for age in your analysis. There is a very strong relationship between age and functional impairment. To account for this, we recommend adjusting for age when comparing between functional groups (at minimum, reporting separately those below and above the age of 65). + + +## Limitations of this indicator + +• The indicator is not as accurate and comprehensive as the measure of disability in the Disability Survey. It is not a replacement for the official measure from the Disability Survey. +• The Washington Group Short Set (WGSS) is not a fully comprehensive measure of functional limitations. +• For children under 12, the WGSS is less sensitive at moderate levels of functional limitation. The WGSS is also not collected for children under the age of five. +• The indicator may capture people who are disabled only temporarily. These people may have very different experiences to people who have permanent or ongoing impairments. + + +## Other notes + +• The indicator might be expanded in the future to use WGSS responses from the NZCVS and specific modules of the HES. However, the coverage of both sources is limited. +• We investigated using Hospital diagnoses and ACC records. But the steering group advised against their inclusion. + + +## References + +The Social Wellbeing Agency has published an accompanying guide. You can find it our our website: swa.govt.nz + + +## Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = DL-MAA20XX-YY + + +## Dependencies +The code relies on eight input tables: +• [IDI_Clean].[cen_clean].[census_individual_2018] +• [IDI_Clean].[security].[concordance] +• [IDI_Clean].[gss_clean].[gss_person] +• [IDI_Clean].[security].[concordance] +• [IDI_Clean].[moh_clean].[interrai] +• [IDI_Adhoc].[clean_read_HLFS].[hlfs_disability] +• [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_support_needs_2022] +• [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_needs_assessment_2022] + +Linking between the two is done on snz_moh_uid. No rows are lost via this linking. + +## Outputs +Table: [IDI_Sandpit].[DL-MAA20XX-YY].[defn_functional_disability] + +## Variable Descriptions + +--------------------------------------------------------------------------------------------------------------------------- +Column Description +name +------------------------------ -------------------------------------------------------------------------------------------- +snz_uid The unique STATSNZ person identifier for the person + +record_source The database from which the record originates + +event_date The date the disability information was recorded, survey/census date for Stats NZ sources, assessment date for MoH sources. + +dv_hearing Indicator of hearing limitations + +dv_seeing Indicator of seeing limitations + +dv_walking Indicator of walking limitations + +dv_remembering Indicator of remembering limitations + +dv_washing Indicator of washing limitations + +dv_communication Indicator of hearing limitations + +overall_dv_indication Indicator of any high limitation + + +## Version and change history + +2022-09-19 SA ensure consistency with definition documentation +2022-09-19 SA check against definition description by Andrew (from consultation with DDEWG) +2021-12-02 SA review and tidy +2021-11-20 CW v1 + +## Code + +***************************************************************************************************************************/ + +/* create table for all records */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_functional_disability_list] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_functional_disability_list] ( + snz_uid INT, + record_source VARCHAR(9), + event_date DATE, + dv_hearing INT, + dv_seeing INT, + dv_walking INT, + dv_remembering INT, + dv_washing INT, + dv_communication INT, +); +GO + +/*************************************************************************************************************** +append records from each source into the table +***************************************************************************************************************/ + +/********************************* +1. Stats NZ surveys - Census 2018 + +The question in the census is: + +22. This question is about difficulties you may have doing certain activities because of a health problem. + + Do you have difficulty with any of the following: + + - seeing, even if wearing glasses? ([cen_ind_dffcl_seeing_code]) + + Answers: + - no difficulty (1) + - some difficulty (2) + - a lot of difficulty (3) + - cannot do at all (4) + + Other answers in the dataset + - Response unidentifiable (7) + - Not Stated (9) + + The same answers apply to these questions: + - hearing, even if using a hearing aid? ([cen_ind_dffcl_hearing_code]) + - walking or climbing steps? ([cen_ind_dffcl_walking_code]) + - remembering or concentrating? ([cen_ind_dffcl_remembering_code]) + - washing all over or dressing? ([cen_ind_dffcl_washing_code]) + - communicating using your usual language, for example understanding or being understood by others? ([cen_ind_dffcl_comt_code]) +*********************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_functional_disability_list] + (snz_uid, record_source, event_date, dv_hearing, dv_seeing, dv_walking, dv_remembering, dv_washing, dv_communication) +SELECT [snz_uid] + ,'CEN2018' AS record_source + ,'2018-03-05' AS event_date + --,[cen_ind_dsblty_ind_code] as dv_disability + ,CASE + WHEN [cen_ind_dffcl_hearing_code] = 1 THEN 0 + WHEN [cen_ind_dffcl_hearing_code] = 2 THEN 1 + WHEN [cen_ind_dffcl_hearing_code] IN (3,4) THEN 2 + ELSE 0 END AS dv_hearing + ,CASE + WHEN [cen_ind_dffcl_seeing_code] = 1 THEN 0 + WHEN [cen_ind_dffcl_seeing_code] = 2 THEN 1 + WHEN [cen_ind_dffcl_seeing_code] IN (3,4) THEN 2 + ELSE 0 END AS dv_seeing + ,CASE + WHEN [cen_ind_dffcl_walking_code] = 1 THEN 0 + WHEN [cen_ind_dffcl_walking_code] = 2 THEN 1 + WHEN [cen_ind_dffcl_walking_code] IN (3,4) THEN 2 + ELSE 0 END AS dv_walking + ,CASE + WHEN [cen_ind_dffcl_remembering_code] = 1 THEN 0 + WHEN [cen_ind_dffcl_remembering_code] = 2 THEN 1 + WHEN [cen_ind_dffcl_remembering_code] IN (3,4) THEN 2 + ELSE 0 END AS dv_remembering + ,CASE + WHEN [cen_ind_dffcl_washing_code] = 1 THEN 0 + WHEN [cen_ind_dffcl_washing_code] = 2 THEN 1 + WHEN [cen_ind_dffcl_washing_code] IN (3,4) THEN 2 + ELSE 0 END AS dv_washing + ,CASE + WHEN [cen_ind_dffcl_comt_code] = 1 THEN 0 + WHEN [cen_ind_dffcl_comt_code] = 2 THEN 1 + WHEN [cen_ind_dffcl_comt_code] IN (3,4) THEN 2 + ELSE 0 END AS dv_communication +FROM [IDI_Clean_YYYYMM].[cen_clean].[census_individual_2018] +/* WGSS not collected for children under 5 years of age for exclude */ +WHERE cen_ind_age_code NOT IN ('000', '001', '002', '003', '004') +GO + +/********************************* +2. Stats NZ surveys - HLFS +*********************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-MM].[tmp_functional_disability_list] + (snz_uid, record_source, event_date, dv_hearing, dv_seeing, dv_walking, dv_remembering, dv_washing, dv_communication) +SELECT b.snz_uid + ,'HLFS' AS record_source + ,[quarter_date] + ,CASE + WHEN [diff_hearing_code]='11' then 0 + WHEN [diff_hearing_code]='12' THEN 1 + WHEN [diff_hearing_code] IN ('13','14') THEN 2 + ELSE 0 END AS dv_hearing + ,CASE + WHEN [diff_seeing_code]='11' then 0 + WHEN [diff_seeing_code]='12' THEN 1 + WHEN [diff_seeing_code] IN ('13','14') THEN 2 + ELSE 0 END AS dv_seeing + ,CASE + WHEN [diff_walking_code]='11' then 0 + WHEN [diff_walking_code]='12' THEN 1 + WHEN [diff_walking_code] IN ('13','14') THEN 2 + ELSE 0 END AS dv_walking + ,CASE + WHEN [diff_memory_code]='11' then 0 + WHEN [diff_memory_code]='12' THEN 1 + WHEN [diff_memory_code] IN ('13','14') THEN 2 + ELSE 0 END AS dv_remembering + ,CASE + WHEN [diff_dressing_code]='11' then 0 + WHEN [diff_dressing_code]='12' THEN 1 + WHEN [diff_dressing_code] IN ('13','14') THEN 2 + ELSE 0 END AS dv_washing + ,CASE + WHEN [diff_communicating_code]='11' then 0 + WHEN [diff_communicating_code]='12' THEN 1 + WHEN [diff_communicating_code] IN ('13','14') THEN 2 + ELSE 0 END AS dv_communication +FROM [IDI_Adhoc].[clean_read_HLFS].[hlfs_disability] as a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] as b +ON a.snz_hlfs_uid = b.snz_hlfs_uid +GO + +/********************************* +3. Stats NZ surveys - GSS 2016 or 2018 +*********************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_functional_disability_list] + (snz_uid, record_source, event_date, dv_hearing, dv_seeing, dv_walking, dv_remembering, dv_washing, dv_communication) +SELECT [snz_uid] + ,'GSS' AS record_source + ,[gss_pq_HQinterview_date] AS event_date + ,CASE + WHEN [gss_pq_disability_hear_code] ='11' THEN 0 + WHEN [gss_pq_disability_hear_code] ='12' THEN 1 + WHEN [gss_pq_disability_hear_code] IN ('13','14') THEN 2 + ELSE 0 END AS dv_hearing + ,CASE + WHEN [gss_pq_disability_see_code] ='11' THEN 0 + WHEN [gss_pq_disability_see_code] ='12' THEN 1 + WHEN [gss_pq_disability_see_code] IN ('13','14') THEN 2 + ELSE 0 END AS dv_seeing + ,CASE + WHEN [gss_pq_disability_walk_code] ='11' THEN 0 + WHEN [gss_pq_disability_walk_code] ='12' THEN 1 + WHEN [gss_pq_disability_walk_code] IN ('13','14') THEN 2 + ELSE 0 END AS dv_walking + ,CASE + WHEN [gss_pq_disability_remem_code] ='11' THEN 0 + WHEN [gss_pq_disability_remem_code] ='12' THEN 1 + WHEN [gss_pq_disability_remem_code] IN ('13','14') THEN 2 + ELSE 0 END AS dv_remembering + ,CASE + WHEN [gss_pq_disability_wash_code] ='11' THEN 0 + WHEN [gss_pq_disability_wash_code] ='12' THEN 1 + WHEN [gss_pq_disability_wash_code] IN ('13','14') THEN 2 + ELSE 0 END AS dv_washing + ,CASE + WHEN [gss_pq_disability_comm_code] ='11' THEN 0 + WHEN [gss_pq_disability_comm_code] ='12' THEN 1 + WHEN [gss_pq_disability_comm_code] IN ('13','14') THEN 2 + ELSE 0 END AS dv_communication +FROM [IDI_Clean_YYYYMM].[gss_clean].[gss_person] +GO + +/********************************* +4. MoH - SOCRATES + +The SOCRATES data is for MoH funded disability services. +These will tend to be at the more severe end of the scale. +So we expect to see that there will be fewer people in this +data and that their measures will tend to be more severe. + +Code list: + Hearing + --1003 Hearing impaired + --1004 Deaf or nearly deaf + Seeing + --1001 Vision impaired + --1002 Blind or nearly blind + Walking + --1111 Wheelchair user (inside / outside of home) + --1101 Moving around inside home + --1102 Moving around outside home + --1103 Moving around in the community + Remembering + --1299 Other difficulties with memory / cognition / behaviour (specify) + --1203 Learning ability, i.e. acquiring skills of reading, writing, language, calculating, copying, etc. + --1202 Intellectual ability, i.e. thinking, understanding + --1208 Attention, e.g. concentration + --1201 Memory + Washing + --1403 Dressing and / or undressing + --1405 Toileting, using toilet facilities + --1402 Bathing, showering, washing self + --1404 Grooming and caring for body parts, e.g. feet, teeth, hair, nails, etc + Communication + --1803 Non verbal + --1801 Ability to express core needs + --1006 Mute or nearly mute + --1005 Speech impaired +*********************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_functional_disability_list] + (snz_uid, record_source, event_date, dv_hearing, dv_seeing, dv_walking, dv_remembering, dv_washing, dv_communication) + +SELECT [snz_uid] + ,'SOCRATES' AS record_source + ,CAST(SUBSTRING([DateAssessmentCompleted],1,7) AS DATE) AS event_date + --hearing + ,IIF(code IN (1003, 1004), 2, 0) AS dv_hearing + --seeing + ,IIF(code IN (1001, 1002), 2, 0) AS dv_seeing + --walk + ,IIF(code IN (1111, 1101, 1102, 1103), 2, 0) AS dv_walking + --memory/learning + ,IIF(code IN (1201,1202,1203,1208,1299), 2, 0) AS dv_remembering + --wash + ,IIF(code IN (1402,1403,1404,1405), 2, 0) AS dv_washing + --communicating + ,IIF(code IN (1005,1006,1801,1803), 2, 0) AS dv_communication + --,s.Code + --,s.Description +FROM [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_support_needs_2022] AS s +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] as c +ON s.snz_moh_uid = c.snz_moh_uid +INNER JOIN [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_needs_assessment_2022] AS n +ON s.[snz_moh_uid] = n.[snz_moh_uid] +AND s.[NeedsAssessmentID] = n.[NeedsAssessmentID] +AND s.[snz_moh_soc_client_uid] = n.[snz_moh_soc_client_uid] +GO + +/********************************* +5. MoH - IRAI + +Excludes the CA assessment type due to absence of impairment/disability type questions +*********************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-MM].[tmp_functional_disability_list] + (snz_uid, record_source, event_date, dv_hearing, dv_seeing, dv_walking, dv_remembering, dv_washing, dv_communication) + +SELECT [snz_uid] + ,'IRAI' AS record_source + --,[snz_moh_uid] + --,[moh_irai_assessment_type_text] + --,[moh_irai_assess_version_text] + ,[moh_irai_assessment_date] AS event_date + --hearing + ,CASE + WHEN [moh_irai_hearing_code] BETWEEN 2 AND 4 THEN 2 + WHEN [moh_irai_hearing_code] = 1 THEN 1 + ELSE 0 END AS dv_hearing + --vision + ,CASE + WHEN [moh_irai_vision_light_code] BETWEEN 2 AND 4 THEN 2 + WHEN [moh_irai_vision_light_code] = 1 THEN 1 + ELSE 0 END AS dv_seeing + --walking + ,CASE + WHEN [moh_irai_adl_walking_code] BETWEEN 2 AND 6 + OR moh_irai_stairs_perform_code BETWEEN 2 AND 6 + OR moh_irai_stairs_capacity_code BETWEEN 2 AND 6 THEN 2 + WHEN [moh_irai_adl_walking_code] = 1 + OR moh_irai_stairs_perform_code = 1 + OR moh_irai_stairs_capacity_code = 1 THEN 1 + ELSE 0 END AS dv_walking + --memory/learning + ,CASE + WHEN [moh_irai_short_term_mem_ind] = 1 + OR [moh_irai_procedural_mem_ind] = 1 + OR [moh_irai_situational_mem_ind] = 1 + OR [moh_irai_long_term_mem_ind] = 1 + OR [moh_irai_res_hist_intellect_ind] = 1 + OR [moh_irai_easily_distracted_code] = 2 THEN 2 + WHEN [moh_irai_easily_distracted_code] = 1 THEN 1 + ELSE 0 END AS dv_remembering + --washing + ,CASE + WHEN moh_irai_adl_bathing_code BETWEEN 2 AND 6 THEN 2 + WHEN moh_irai_adl_bathing_code = 1 THEN 1 + ELSE 0 END AS dv_washing + --communication + ,CASE + WHEN moh_irai_scale_comm_code BETWEEN 3 AND 8 THEN 2 + WHEN moh_irai_scale_comm_code BETWEEN 1 AND 2 THEN 1 + ELSE 0 END AS dv_comt +FROM [IDI_Clean_YYYYMM].[moh_clean].[interrai] +WHERE [moh_irai_assessment_type_text] !='CA' + +GO + +/*************************************************************************************************************** +Create final table +***************************************************************************************************************/ + +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_functional_disability_list] (snz_uid); +GO + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_functional_disability] +GO + + +WITH + +/* Where multiple records per person for the same date & source, keep the highest */ +single_person_source_date_record AS ( + SELECT snz_uid + ,record_source + ,event_date + ,MAX(dv_hearing) AS dv_hearing + ,MAX(dv_seeing) AS dv_seeing + ,MAX(dv_walking) AS dv_walking + ,MAX(dv_remembering) AS dv_remembering + ,MAX(dv_washing) AS dv_washing + ,MAX(dv_communication) AS dv_communication + FROM [IDI_Sandpit].[DL-MAA2021-49].[tmp_functional_disability_list] + GROUP BY snz_uid, record_source, event_date +), + +/* Create indicator for most recent record for each person */ +most_recent_date AS ( + SELECT * + ,ROW_NUMBER() OVER (PARTITION BY snz_uid ORDER BY event_date DESC, record_source) AS ranking + FROM single_person_source_date_record +) + +SELECT snz_uid + ,record_source + ,event_date + ,dv_hearing + ,dv_seeing + ,dv_walking + ,dv_remembering + ,dv_washing + ,dv_communication + ,IIF(dv_hearing = 2 + OR dv_seeing = 2 + OR dv_walking = 2 + OR dv_remembering = 2 + OR dv_washing = 2 + OR dv_communication = 2, 1, 0) AS overall_dv_indication +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_functional_disability] +FROM most_recent_date +WHERE ranking = 1 +GO + +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[defn_functional_disability] (snz_uid); +GO +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[defn_functional_disability] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE) +GO + +/* remove raw list table */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_functional_disability_list] +GO \ No newline at end of file diff --git a/health/out_patients.sql b/health/out_patients.sql new file mode 100644 index 0000000..6f7a346 --- /dev/null +++ b/health/out_patients.sql @@ -0,0 +1,71 @@ +/************************************************************************************************** +Title: Hospital non-admitted patient events +Author: Simon Anastasiadis + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Hospital non-admitted events (including emergency department, +out patients, community visits) where the patient attended. + +Intended purpose: +Counting the number of non-admitted patient events. +Determining who had a non-admitted patient event. + +Inputs & Dependencies: +- [IDI_Clean].[moh_clean].[nnpac] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[defn_non_admit_patient] + +Notes: +1) Three types of non-admitted patient events are included: + ED = Emergency department + OP = Out patient + CR = Community visit +2) We use ED visits as recorded in out patients. As per Craig's advice: + because we are only interested in counting events, we do not need to + combine with admitted patient ED events. +3) Craig advised that community visits by hospital based practitioner + have only been consistently recorded recently. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +Issues: + +History (reverse order): +2020-05-20 SA v1 +**************************************************************************************************/ + +/* Set database for writing views */ +USE IDI_UserCode +GO + +/* Clear existing view */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[defn_non_admit_patient]; +GO + +/* Create view */ +CREATE VIEW [DL-MAA20XX-YY].[defn_non_admit_patient] AS +SELECT [snz_uid] + ,[moh_nnp_service_date] + ,[moh_nnp_event_type_code] + ,[moh_nnp_attendence_code] +FROM [IDI_Clean_YYYYMM].[moh_clean].[nnpac] +WHERE [moh_nnp_event_type_code] IN ('ED', 'OP', 'CR') +AND [moh_nnp_service_date] IS NOT NULL +AND [moh_nnp_attendence_code] <> 'DNA'; --Exclude 'did not attend' +GO diff --git a/health/population_health_service_users.sql b/health/population_health_service_users.sql new file mode 100644 index 0000000..26f5ee3 --- /dev/null +++ b/health/population_health_service_users.sql @@ -0,0 +1,201 @@ +/************************************************************************************************** +Title: Health Service Users Population (proxy) +Author: Craig Wright + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Proxy Health Service User (HSU) population from 2020-01-01 + +Intended purpose: +Proxy for population using health services. +This population is constructed by MOH for their own analysis outside the IDI. + +Inputs & Dependencies: +- [IDI_Clean].[moh_clean].[interrai] +- [IDI_Clean].[moh_clean].[lab_claims] +- [IDI_Clean].[moh_clean].[nes_enrolment] +- [IDI_Clean].[moh_clean].[pop_cohort_demographics] +- [IDI_Clean].[moh_clean].[nnpac] +- [IDI_Clean].[moh_clean].[pharmaceutical] +- [IDI_Clean].[moh_clean].[PRIMHD] +- [IDI_Clean].[moh_clean].[priv_fund_hosp_discharges_event] +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_event] +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_service_hist_202110] +- [IDI_Clean].[moh_clean].[pop_cohort_demographics] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_hsu_proxy] + +Notes: +1) By construction this is a multi-data source method. We draw interactions from + a wide range of MOH tables: + 1. interrai + 2. labs claims + 3. NES enrolment + 4. nnpac + 5. PHARMS + 6. PRIMHD + 7. private hospital + 8. public hospital + 9. socrates + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = vacc_ + Project schema = DL-MAA20XX-YY + Health events since = 'YYYY-MM-DD' + +Issues: + +History (reverse order): +2021-11-25 SA review and tidy +2021-09-30 CW +**************************************************************************************************/ + +/* create table of all identities */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_hsu_list] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_hsu_list] ( + snz_uid INT, + record_source VARCHAR(5), +); +GO + +/*************************************************************************************************************** +append records from each source into the table +***************************************************************************************************************/ + +/******************************************************** +1. interrai - NONE +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_hsu_list] (snz_uid, record_source) +SELECT DISTINCT [snz_uid] + ,'IRAI' AS record_source +FROM [IDI_Clean_YYYYMM].[moh_clean].[interrai] +WHERE [moh_irai_assessment_date] >= 'YYYY-MM-DD' +GO + +/******************************************************** +2. labs claims - NONE +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_hsu_list] (snz_uid, record_source) +SELECT DISTINCT [snz_uid] + ,'labs' AS record_source +FROM [IDI_Clean_YYYYMM].[moh_clean].[lab_claims] +WHERE [moh_lab_visit_date] >= 'YYYY-MM-DD' +GO + +/******************************************************** +3. NES enrolment - many months +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_hsu_list] (snz_uid, record_source) +SELECT DISTINCT b.snz_uid + ,'NES' AS record_source +FROM [IDI_Clean_YYYYMM].[moh_clean].[nes_enrolment] AS a +INNER JOIN [IDI_Clean_YYYYMM].[moh_clean].[pop_cohort_demographics] as b +ON a.snz_moh_uid=b.snz_moh_uid +WHERE CAST([moh_nes_snapshot_month_date] AS DATE) >= 'YYYY-MM-DD' +GO + +/******************************************************** +4. nnpac - NONE +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_hsu_list] (snz_uid, record_source) +SELECT DISTINCT [snz_uid] + ,'nnpac' AS record_source +FROM [IDI_Clean_YYYYMM].[moh_clean].[nnpac] +WHERE [moh_nnp_service_date] >= 'YYYY-MM-DD' +GO + +/******************************************************** +5. PHARMS +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_hsu_list] (snz_uid, record_source) +SELECT DISTINCT [snz_uid] + ,'pharm' AS record_source +FROM [IDI_Clean_YYYYMM].[moh_clean].[pharmaceutical] +WHERE [moh_pha_dispensed_date] >= 'YYYY-MM-DD' +GO + +/******************************************************** +6. PRIMHD +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_hsu_list] (snz_uid, record_source) +SELECT DISTINCT [snz_uid] + ,'prim' AS record_source +FROM [IDI_Clean_YYYYMM].[moh_clean].[PRIMHD] +WHERE [moh_mhd_referral_end_date] >= 'YYYY-MM-DD' +OR [moh_mhd_referral_end_date] IS NULL +GO + +/******************************************************** +7. private hospital +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_hsu_list] (snz_uid, record_source) +SELECT DISTINCT [snz_uid] + ,'priv' AS record_source +FROM [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_event] +WHERE [moh_pri_evt_end_date] >= 'YYYY-MM-DD' +GO + +/******************************************************** +8. public hospital +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_hsu_list] (snz_uid, record_source) +SELECT DISTINCT [snz_uid] + ,'pubh' AS record_source +FROM [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_event] +WHERE [moh_evt_even_date] >= 'YYYY-MM-DD' +GO + +/******************************************************** +9. socrates +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_hsu_list] (snz_uid, record_source) +SELECT DISTINCT b.snz_uid + ,'socr' AS record_source +FROM [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_service_hist_202110] AS a +INNER JOIN [IDI_Clean_YYYYMM].[moh_clean].[pop_cohort_demographics] AS b +ON a.snz_moh_uid = b.snz_moh_uid +WHERE [EndDate_Value] >= 'YYYY-MM-DD' +GO + +/*************************************************************************************************************** +combine to a single list +***************************************************************************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_hsu_proxy] +GO + +SELECT DISTINCT snz_uid +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_hsu_proxy] +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_hsu_list] + +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_hsu_proxy] (snz_uid); +GO +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_hsu_proxy] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE) +GO + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_hsu_list] +GO diff --git a/health/serious_mental_health_events.sql b/health/serious_mental_health_events.sql new file mode 100644 index 0000000..0e1ff66 --- /dev/null +++ b/health/serious_mental_health_events.sql @@ -0,0 +1,296 @@ +/************************************************************************************************** +Title: Any indication of serious mental health +Author: Craig Wright + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions +Staff at MoH provided comments on this definition. + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Multi-source, any indicator of serious mental health event or diagnosis. + +Intended purpose: +Has a person ever been diagnosed with a serious mental health condition. +Has a person had recent serious mental health diagnosis or event. + +Inputs & Dependencies: +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_diag] +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_event] +- [IDI_Clean].[moh_clean].priv_fund_hosp_discharges_diag] +- [IDI_Clean].[moh_clean].priv_fund_hosp_discharges_event +- [IDI_Clean].[moh_clean].[PRIMHD] +- [IDI_Adhoc].[clean_read_MOH_PRIMHD].[primhd_diagnoses] +- [IDI_Clean].[security].[concordance] +- [IDI_Clean].[moh_clean].[interrai] +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_disability] +- [IDI_Clean].[moh_clean].[pop_cohort_demographics] +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_needs_assessment] +- [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_referral] +- [IDI_Clean].[msd_clean].[msd_incapacity] +- [IDI_Clean].[moh_clean].[nnpac] + +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_serious_Mental_health_list] +- [IDI_UserCode].[DL-MAA20XX-YY].[vacc_recent_serious_Mental_health] + + + +Notes: +1) Two types of events included: + - any serious diagnosis - schizophrenia, Bi polar, major depressive disorder, schizoaffective disorder + - any PRIMHD/MHINC service started in the reference period + +2) Multiple sources included in definition: + 1. Y public hospital discharge diagnosis (ICD10) x 6 3 digit codes + 2. Y private hospital discharge diagnosis (ICD10) x 6 3 digit codes + 3. N MHINC service -- no as too early + 4.a Y PRIMHD service by referral period by date -- NB BASED ON RECENT SERVICE DATE + 4.b Y PRIMHD diagnosis codes + 5. Y InterRAI diagnosis by question x 2 questions + 6. Y SOCRATES by diagnosis x 2 codes + 7. Y MSD incapacitation + +3) MoH advises that in PRIMHD, we consider a current referral to be any referral "open" within the period. + So, the referral may have started prior or during the study period and the end date will be either during + or after the end of the study period. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = vacc_ + Project schema = DL-MAA20XX-YY + +Issues: +1) PRIMHD section only uses the referral start date. This means it only captures referrals that started during + the study period. To capture all open referrals code needs to be modified to consider referral end dates. + +2) Some MHA referrals go on have no activity (e.g. referral declined). Depending on the purpose of the research + it would be more effective to consider referrals that resulted in some kind of specialist MHA service. + +History (reverse order): +2022-02-28 SA incorporate comments from MoH staff +2021-12-01 SA tidy +2021-11-02 CW +**************************************************************************************************/ + +/* create table of all possible events */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_serious_Mental_health_list] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_serious_Mental_health_list] ( + snz_uid INT, + code VARCHAR(12), + descript VARCHAR(25), + record_source VARCHAR(5), + event_date DATE, +); +GO + +/******************************************************** +Public hospital events +********************************************************/ +--ICD10 +--F33 Major depressive disorder +--F30 Manic +--F31 Bipolar +--F20 Schizophrenia +--F21 Schizotypal +--F25 Schizaffective + +--[IDI_Metadata].[clean_read_CLASSIFICATIONS].[acc_ICD10_Code] +--WHERE SUBSTRING ([Code],1,1)='F' + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_serious_Mental_health_list] (snz_uid, code, descript, record_source, event_date) +SELECT b.snz_uid + ,[moh_dia_clinical_code] + ,NULL AS descript + ,'PUB' as record_source + ,[moh_evt_evst_date] + --,[moh_dia_event_id_nbr] +FROM [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_diag] as a +INNER JOIN [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_event] as b +ON a.[moh_dia_event_id_nbr] = b.[moh_evt_event_id_nbr] +WHERE SUBSTRING([moh_dia_clinical_code],1,3) in ('F30','F31','F33','F20','F21','F25') +AND [moh_dia_clinical_sys_code]=[moh_dia_submitted_system_code] +GO + +/******************************************************** +Private hospital events +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_serious_Mental_health_list] (snz_uid, code, descript, record_source, event_date) +SELECT b.snz_uid + ,moh_pri_diag_clinic_code + ,NULL AS descript + ,'PRI' as record_source + ,CAST(moh_pri_evt_start_date AS DATE) AS [date] +FROM [IDI_Clean_YYYYMM].[moh_clean].priv_fund_hosp_discharges_diag as a +INNER JOIN [IDI_Clean_YYYYMM].[moh_clean].priv_fund_hosp_discharges_event as b +ON a.moh_pri_diag_event_id_nbr = b.moh_pri_evt_event_id_nbr +WHERE SUBSTRING(moh_pri_diag_clinic_code,1,3) in ('F30','F31','F33','F20','F21','F25') +AND moh_pri_diag_clinic_sys_code = moh_pri_diag_sub_sys_code +GO + +/******************************************************** +PRIHMD events +********************************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_serious_Mental_health_list] (snz_uid, code, descript, record_source, event_date) +SELECT DISTINCT [snz_uid] + ,NULL AS code + ,NULL AS descript + ,'PRM' AS record_source + ,[moh_mhd_referral_start_date] +FROM [IDI_Clean_YYYYMM].[moh_clean].[PRIMHD] +GO + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_serious_Mental_health_list] (snz_uid, code, descript, record_source, event_date) +SELECT b.[snz_uid] + ,[CLINICAL_CODE] AS code + ,NULL AS descript + ,'PRC' AS record_source + ,[CLASSIFICATION_START_DATE] +FROM [IDI_Adhoc].[clean_read_MOH_PRIMHD].[primhd_diagnoses] AS a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b +ON a.[snz_moh_uid] = b.[snz_moh_uid] +WHERE ([CLINICAL_CODING_SYSTEM_ID] >= 10 AND SUBSTRING([CLINICAL_CODE],1,3) in ('F30','F31','F33','F20','F21','F25')) +/* MOH staff recommend = 7 instead of >= 7 as this is the only code associated with DSM-IV */ +OR ([CLINICAL_CODING_SYSTEM_ID] = 7 AND SUBSTRING([CLINICAL_CODE],1,4) in ('2960','2962','2963','2964','2965','2966','2967','2968')) +OR ([CLINICAL_CODING_SYSTEM_ID] = 7 AND SUBSTRING([CLINICAL_CODE],1,3) in ('295')) +GO + +/******************************************************** +INTERRAI - health of older people assessment +********************************************************/ +-- [moh_irai_depression_code], [moh_irai_schizophrenia_code], [moh_irai_bipolar_code] +--0 Not present +--1 Primary diagnosis / diagnoses for current stay +--2 Diagnosis present, receiving active treatment +--3 Diagnosis present, monitored but no active treatment + +--FROM [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_interrai_question_lookup] +--WHERE [IDI Variable Name] like'%schi%' +--FROM [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_interrai_answer_lookup] +--WHERE [IDI Variable Name] like'%bipolar%' + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_serious_Mental_health_list] (snz_uid, code, descript, record_source, event_date) +SELECT [snz_uid] + ,NULL AS code + ,NULL AS descript + ,'IRA' AS record_source + ,[moh_irai_assessment_date] +FROM [IDI_Clean_YYYYMM].[moh_clean].[interrai] +WHERE [moh_irai_schizophrenia_code] in (1,2,3) +OR [moh_irai_bipolar_code] in (1,2,3) +GO + +/******************************************************** +SOCRATES funded disability +********************************************************/ +--1306 schizophrenia +--1303 Bipolar + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_serious_Mental_health_list] (snz_uid, code, descript, record_source, event_date) +SELECT b.snz_uid + ,CAST([Code] AS VARCHAR) AS code + ,[Description] + ,'SOC' AS record_source + ,COALESCE([FirstContactDate], [ReferralDate]) As event_date +FROM [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_disability] AS a +LEFT JOIN [IDI_Clean_YYYYMM].[moh_clean].[pop_cohort_demographics] AS b +ON a.snz_moh_uid = b.snz_moh_uid +LEFT JOIN ( + SELECT DISTINCT snz_moh_uid + ,[FirstContactDate] + ,CAST(SUBSTRING([FirstContactDate], 1, 7) AS DATE) AS event_date + FROM [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_needs_assessment] + WHERE [FirstContactDate] IS NOT NULL +) AS c +ON a.snz_moh_uid = c.snz_moh_uid +LEFT JOIN ( + SELECT DISTINCT snz_moh_uid + ,[ReferralDate] + ,CAST(SUBSTRING([ReferralDate], 1, 7) AS DATE) AS event_date + FROM [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_referral] + WHERE [ReferralDate] IS NOT NULL +) AS e +ON a.snz_moh_uid = e.snz_moh_uid +WHERE code IN ('1306','1303') +GO + +/******************************************************** +MSD - medical certificates and incapacitation +********************************************************/ +--161 Depression +--162 Bipolar disorder +--163 Schizophrenia + +-- Look up metadata: +-- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[msd_incapacity_reason_code_3] +-- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[msd_incapacity_reason_code_4] +--WHERE [classification] LIKE '%mental%' + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_serious_Mental_health_list] (snz_uid, code, descript, record_source, event_date) +SELECT snz_uid + ,[msd_incp_incapacity_code] AS code + ,NULL AS descript + ,'INCP' AS record_source + ,[msd_incp_incp_from_date] +FROM [IDI_Clean_YYYYMM].[msd_clean].[msd_incapacity] +WHERE [msd_incp_incrsn95_1_code] IN ('162', '163') +OR [msd_incp_incrsn95_2_code] IN ('162', '163') +OR [msd_incp_incrsn95_3_code] IN ('162', '163') +OR [msd_incp_incrsn95_4_code] IN ('162', '163') +OR [msd_incp_incapacity_code] IN ('162', '163') +GO + +/******************************************************** +NNPAC +********************************************************/ +--COOC0058 Mental Health Worker +--HOP235 AT & R (Assessment Treatment & Rehabilitation) Inpatient - Mental Health service(s) for Elderly + +-- Look up metadata: +-- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[moh_nnpac_purchase_unit] +--WHERE [PU_DESCRIPTION] like '%mental%' + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_serious_Mental_health_list] (snz_uid, code, descript, record_source, event_date) +SELECT [snz_uid] + ,[moh_nnp_purchase_unit_code] AS code + ,NULL AS descript + ,'NAP' AS record_source + ,[moh_nnp_service_date] +FROM [IDI_Clean_YYYYMM].[moh_clean].[nnpac] +WHERE [moh_nnp_purchase_unit_code] IN ('COOC0058','HOP235') +GO + +/******************************************************** +Conclude +********************************************************/ + +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_serious_Mental_health_list] (snz_uid); +GO +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_serious_Mental_health_list] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE) +GO + +/* View: any serious diagnosis or recent acute mental health service use */ +USE IDI_UserCode +GO + +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[vacc_recent_serious_Mental_health] +GO + +CREATE VIEW [DL-MAA20XX-YY].[vacc_recent_serious_Mental_health] AS +SELECT DISTINCT snz_uid +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_serious_Mental_health_list] +WHERE record_source IN ('PRI','PUB','SOC','IRA','PRC','INCP') +OR (record_source = 'PRM' AND event_date >= '2019-01-01') diff --git a/health/stroke_register.sql b/health/stroke_register.sql new file mode 100644 index 0000000..92c6a24 --- /dev/null +++ b/health/stroke_register.sql @@ -0,0 +1,94 @@ +/************************************************************************************************** +Title: Stroke register +Author: MOH +Re-edit: Manjusha Radhakrishnan +Reviewer: + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +People in the Stroke register. + +Intended purpose: +Create register of people who have had a stroke in the past years + +Inputs & Dependencies: +- [IDI_Clean].[moh_clean].[priv_fund_hosp_discharges_diag] +- [IDI_Clean].[moh_clean].[pub_fund_hosp_discharges_event] + +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[defn_stroke] + +Notes: +1. Cancer codes used: + - I60, I61, I62, I63, I65, I66, I67, I69, G45, G46 + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +Issues: +1. Duplicates can be found; this is because a person may receive multiple treatments for the same condition + +History (reverse order): +2022-07-20 MR v1 +**************************************************************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_stroke] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[defn_stroke] ( + snz_uid INT + , event_date DATE + , source VARCHAR(255) +); +GO + + +/** PUBLIC HOSPITALS **/ +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_stroke] (snz_uid, event_date, source) +SELECT snz_uid, + nmd.moh_evt_even_date AS event_date, + 'PUB HOSP' AS source +FROM [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_event] as nmd +INNER JOIN ( + SELECT moh_dia_event_id_nbr, + moh_dia_diagnosis_type_code, + moh_dia_clinical_code, + moh_dia_clinical_sys_code + FROM [IDI_Clean_YYYYMM].[moh_clean].[pub_fund_hosp_discharges_diag] + WHERE moh_dia_clinical_sys_code IN ('11','12','13','14','15') + AND (substring(moh_dia_clinical_code,1,3) IN ('I60','I61','I62','I63','I65','I66','I67','I69','G45','G46'))) as fndp +ON nmd.moh_evt_event_id_nbr = fndp.moh_dia_event_id_nbr +GO + +/** PRIVATE HOSPITALS **/ +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_stroke] (snz_uid, event_date, source) +SELECT snz_uid, + moh_pri_evt_end_date AS event_date, + 'PRIV HOSP' AS source +FROM [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_event] as nmd +INNER JOIN ( + SELECT moh_pri_diag_event_id_nbr, + moh_pri_diag_diag_type_code, + moh_pri_diag_clinic_code, + moh_pri_diag_clinic_sys_code + FROM [IDI_Clean_YYYYMM].[moh_clean].[priv_fund_hosp_discharges_diag] + WHERE moh_pri_diag_clinic_sys_code IN ('11','12','13','14','15') + AND (substring(moh_pri_diag_clinic_code,1,3) IN ('I60','I61','I62','I63','I65','I66','I67','I69','G45','G46')) +) as fndp +ON nmd.moh_pri_evt_event_id_nbr = fndp.moh_pri_diag_event_id_nbr +GO + diff --git a/housing/InterRAI loneliness and home type.sql b/housing/InterRAI loneliness and home type.sql new file mode 100644 index 0000000..8ec33d5 --- /dev/null +++ b/housing/InterRAI loneliness and home type.sql @@ -0,0 +1,78 @@ +/************************************************************************************************** +Title: InterRAI home type and loneliness +Author: Penny Mok +Reviewer: Manjusha Radhakrishnan + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Latest loneliness indicator and residencial type for a person who had InterRAI assessment between two dates. + +Intended purpose: +Loneliness indicator (0/1) for those who had InterRAI assessment in 1 Apr 2017-31 Mar 2018 (2018 tax year) +Also location indicator - whether live at home, in ARC (aged residential care) or other (hospital/PC facility). Same time period as above. + +Inputs & Dependencies: +- [IDI_Clean_YYYYMM].[moh_clean].[interrai] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[defn_interrai_loneliness] + + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +History (reverse order): +2022-07-15 MR QA +2022-07-13 VW Cast loneliness indicator as INT +2022-07-12 VW Formatting, Sandpit table created, change to use WITH query instead of temporary tables +2022-07-11 PM Definition creation +**************************************************************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_interrai_loneliness]; +GO + +WITH info AS ( +select [snz_uid] + ,[moh_irai_assessment_date] AS date + ,CAST([moh_irai_lonely_ind] AS INTEGER) AS lonely + ,[moh_irai_location_text] + ,CASE + WHEN [moh_irai_location_text] = 'HOME' THEN 'home' + WHEN [moh_irai_location_text] = 'ARC FACILITY' THEN 'ARC' + ELSE 'other' -- other = hospital + END AS location +FROM [IDI_Clean_YYYYMM].[moh_clean].[interrai] +WHERE [moh_irai_assessment_date] BETWEEN 'YYYY-MM-DD' AND 'YYYY-MM-DD' -- enter the start date and end date for the assessment +AND [moh_irai_lonely_ind] IS NOT NULL +), + +most_recent AS ( +SELECT snz_uid + ,MAX(date) AS most_recent_date -- several duplicates for individual, get the latest date +FROM info +GROUP BY snz_uid +) + +SELECT DISTINCT b.snz_uid + ,a.lonely + ,a.date as most_recent_date + ,a.location -- home, ARC, or other +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_interrai_loneliness] +FROM info AS a + ,most_recent AS b +WHERE a.snz_uid = b.snz_uid +AND a.date = b.most_recent_date + diff --git a/housing/community_housing_spell.sql b/housing/community_housing_spell.sql new file mode 100644 index 0000000..80c938f --- /dev/null +++ b/housing/community_housing_spell.sql @@ -0,0 +1,164 @@ +/************************************************************************************************** +Title: Spell living in community housing +Author: Hubert Zal + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +A spell for a person living in community housing. + +Intended purpose: +Creating indicators of when/whether a person has lived in community housing. +Identifying spells when a person is living in community housing. +Counting the number of days a person spends in community housing. + +Inputs & Dependencies: +- [IDI_Clean].[hnz_clean].[tenancy_household_snapshot] +- [IDI_Clean].[hnz_clean].[tenancy_snapshot] + +Outputs: +- [IDI_Sandpit].[$(PROJSCH)].[tenancy_community_housing] + +Notes: +1) The snapshot table identifies who was in a house at given points of time. Where the + same person appears in consecutive snapshots we infer they are in the house during the + intervening time. +2) Condensing is used to avoid double counting where different tenancies overlap. + If condensing is slow, pre-filtering the input tables may improve speed. + +Parameters & Present values: +1. [$(PROJSCH)] = Project schema. "DL-MAA20XX-YY" +2. [$(IDIREF)] = Current refresh. "IDI_Clean_YYYYMM" +3. [$(TBLPREF)] = Prefix. "tmp" + + +Issues: + +History (reverse order): +**************************************************************************************************/ +--PARAMETERS################################################################################################## +--SQLCMD only (Activate by clicking Query->SQLCMD Mode) +--Already in master.sql; Uncomment when running individually +:setvar TBLPREF "tmp" +:setvar IDIREF "IDI_Clean_YYYYMM" +:setvar PROJSCH "DL-MAA20XX-YY" +GO + +--############################################################################################################## + USE IDI_Sandpit; + +/* Include only tenancies which are Community Housing Providers (CHP) and where the primary has an associated address.*/ +/* Condensed spells */ +/*Link primaries from [tenancy_snapshot] to remaining household occupants in [tenancy_household_snaphot].*/ +DROP TABLE IF EXISTS [IDI_Sandpit].[$(PROJSCH)].[$(TBLPREF)_tenancy_staging1]; +GO + +SELECT + a.[snz_uid] + ,a.[snz_household_uid] + ,a.[hnz_ths_snapshot_date] + ,b.[msd_provider_name_text] + ,b.[snz_idi_address_register_uid] + ,a.[hnz_ths_app_relship_text] +INTO [IDI_Sandpit].[$(PROJSCH)].[$(TBLPREF)_tenancy_staging1] +FROM [$(IDIREF)].[hnz_clean].[tenancy_household_snapshot] a +INNER JOIN [$(IDIREF)].[hnz_clean].[tenancy_snapshot] b +ON a.[snz_household_uid] = b.[snz_household_uid] +AND a.[hnz_ths_snapshot_date] = b.[hnz_ts_snapshot_date] +WHERE b.[msd_provider_name_text] = 'CHP' -- Community Housing Providers +AND b.[snz_idi_address_register_uid] IS NOT NULL + +/* Condensed spells - Spells are monthly snapshots. +If snapshot dates are approx. a month apart then condense snapshot dates into start and end dates. */ +DROP TABLE IF EXISTS [IDI_Sandpit].[$(PROJSCH)].[$(TBLPREF)_tenancy_staging2]; +GO + +/* Create staging table */ +SELECT + a.[snz_uid] + ,a.[hnz_ths_snapshot_date] AS [start_date] + ,b.[hnz_ths_snapshot_date] AS [end_date] + ,a.[snz_idi_address_register_uid] + ,a.snz_household_uid + ,a.[hnz_ths_app_relship_text] +INTO [IDI_Sandpit].[$(PROJSCH)].[$(TBLPREF)_tenancy_staging2] +FROM [IDI_Sandpit].[$(PROJSCH)].[$(TBLPREF)_tenancy_staging1] a +INNER JOIN [IDI_Sandpit].[$(PROJSCH)].[$(TBLPREF)_tenancy_staging1] b +ON a.snz_uid = b.snz_uid +WHERE DATEDIFF(DAY, a.[hnz_ths_snapshot_date], b.[hnz_ths_snapshot_date]) BETWEEN 20 AND 40 -- adjacent months +AND a.[snz_idi_address_register_uid] = b.[snz_idi_address_register_uid] + +/* Condensed spells - merge all the spells to determine true end date */ +DROP TABLE IF EXISTS [IDI_Sandpit].[$(PROJSCH)].[tenancy_community_housing]; +GO + +WITH +/* exclude start dates that are within another spell */ +spell_starts AS ( + SELECT + [snz_uid] + ,[start_date] + ,[snz_idi_address_register_uid] + ,[hnz_ths_app_relship_text] + ,[snz_household_uid] + FROM [IDI_Sandpit].[$(PROJSCH)].[$(TBLPREF)_tenancy_staging2] s1 + WHERE [start_date] <= [end_date] --This does not result in any records being excluded, however it is included in case a start_date ever occurs after the end_date in future updates. + AND NOT EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[$(PROJSCH)].[$(TBLPREF)_tenancy_staging2] s2 + WHERE s1.[snz_uid] = s2.[snz_uid] + AND DATEADD(DAY, -1, s1.[start_date]) BETWEEN s2.[start_date] AND s2.[end_date] + ) +), +/* exclude end dates that are within another spell */ +spell_ends AS ( + SELECT + [snz_uid] + ,[end_date] + ,[snz_idi_address_register_uid] + FROM [IDI_Sandpit].[$(PROJSCH)].[$(TBLPREF)_tenancy_staging2] t1 + WHERE [start_date] <= [end_date] + AND NOT EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[$(PROJSCH)].[$(TBLPREF)_tenancy_staging2] t2 + WHERE t2.snz_uid = t1.snz_uid + AND IIF(YEAR(t1.[end_date]) = 9999, t1.[end_date], DATEADD(DAY, 1, t1.[end_date])) BETWEEN t2.[start_date] AND t2.[end_date] + ) +) +SELECT + s.[snz_uid] + ,s.[start_date] + ,MIN(e.[end_date]) AS [end_date] + ,s.[snz_idi_address_register_uid] + --,s.[hnz_ths_app_relship_text] --used for testing purposes + --,s.[snz_household_uid] --used for testing purposes +INTO [IDI_Sandpit].[$(PROJSCH)].[tenancy_community_housing] +FROM spell_starts s +INNER JOIN spell_ends e +ON s.[snz_uid] = e.[snz_uid] +AND s.[start_date] <= e.[end_date] +GROUP BY s.snz_uid, s.[start_date], s.[snz_idi_address_register_uid], s.[hnz_ths_app_relship_text], s.[snz_household_uid] + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[$(PROJSCH)].[tenancy_community_housing] (snz_uid); +GO +/* Compress final table to save space */ +ALTER TABLE [IDI_Sandpit].[$(PROJSCH)].[tenancy_community_housing] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE); +GO + +/* Clear staging tables */ +DROP TABLE IF EXISTS [IDI_Sandpit].[$(PROJSCH)].[$(TBLPREF)_tenancy_CHP_primary]; +DROP TABLE IF EXISTS [IDI_Sandpit].[$(PROJSCH)].[$(TBLPREF)_tenancy_staging1]; +DROP TABLE IF EXISTS [IDI_Sandpit].[$(PROJSCH)].[$(TBLPREF)_tenancy_staging2]; +GO diff --git a/housing/emergency_housing_spells.sql b/housing/emergency_housing_spells.sql new file mode 100644 index 0000000..cdfac7e --- /dev/null +++ b/housing/emergency_housing_spells.sql @@ -0,0 +1,165 @@ +/*************************************************************************************************************************** +Title: Emergency Housing Spells +Author: Craig Wright + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +A spell for a person living in emergency housing. + +Intended purpose: +1. Extract MSD third tier expenditure data related to emergency housing payments made to the recipients. +2. Join the one-off payments data with each other to form spells with start dates and end dates. + +Inputs & Dependencies: +- [IDI_Clean].[msd_clean].[msd_third_tier_expenditure] + +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[defn_emergency_housing] + +Key rules: +1. 'IDI Adhoc Payments to Beneficiaries (Third Tier Expenditure)' table is located in [msd_clean].[msd_third_tier_expenditure] table in the [IDI_Clean] database. +2. The dataset contains individuals from [msd_clean].[msd_third_tier_expenditure] table with the pay reason code of '855', which corresponds to 'Emergency Housing', based on the information obtained from [IDI_Metadata].[clean_read_CLASSIFICATIONS].[msd_income_support_pay_reason]. +3. Any rows with duplicate snz_uid (STATSNZ unique identifier for individuals in the IDI), snz_msd_uid (MSD unique identifier for individuals) and msd_tte_app_date (the application date for the emergency housing) are removed. +4. If an individual has multiple application dates close to each other, the data is merged together where the earliest application date is the 'start date' and the latest application date is the '[end_date]'. The threshold for merging the application dates is '23' days after analysing the time gaps between applications for individuals that have made multiple applications. In the analysis, Simon Anastasiadis and John Park found that a significant number of individuals that have made multiple applications have made subsequent applications 1 week, 2 weeks and 3 weeks after their first emergency housing application. The code for compacting the different rows into a single spell is derived from Vinay Benny's code from Social Investment Data Foundation (SIDF) library. + +Notes: +The dataset was tested on data from March 2019 to March 2021. +Any inconsistencies and error with the MSD third tier expenditure data beyond those points have not been tested, +nor any inconsistencies in the MSD third tier expenditure data through time beyond those points. In addition, +the data for emergency housing goes from 11 September 2016 to 31 December 2021 for the March 2022 IDI refresh. + +The expected business key for the output dataset is one row per distinct snz_uid and start_date. +In other words, one row per person (snz_uid) plus the date when they recived the adhoc payment (start_date). + +From [Work and Income]( https://workandincome.govt.nz/housing/nowhere-to-stay/emergency-housing.html ) +Emergency housing is a form accommodation subsided by MSD for people that have no alternative housing options to +stay for the night. If an individual meets the criteria for an emergency housing, then MSD will cover for the first +7 nights at the emergency housing as long as +- it is the recipient's first time at an emergency housing, OR +- enough duration has passed since the last time recipient has been in an emergency housing and there's a + new reason for the recipient to need emergency housing + +After 7 nights, the recipient needs to pay 25% of their income for the accommodation costs. If they have a partner, +they will also need to pay 25% of their income. Income is any money coming in for the recipient, e.g., main benefit, Family Tax Credit. + +From [Ministry of Housing and Urban Development's (HUD's) Quarterly Reports]( https://hud.govt.nz/asserts/News-and-Resources/Statistics-and-Research/Public-housing-reports/Quarterly-reports/Public-housing-Quarterly-Report-March-2021.pdf ) +HUD further clarifies that emergency housing is a form of Special Needs Grants (SNGs) provided by MSD. + +1. [MSD MSD Benefit Dynamics Data and Income Support Expenditure Data Dictionary]( http://wprdtfs05/sites/DefaultProjectCollection/IDI/IDIwiki/UserWiki/Wiki%20Pages/02%20About%20the%20Data/IDI/Metadata/MSD/IDI%20MSD%20Benefit%20Dynamics%20Data%20and%20Income%20Support%20Expenditure%20data.aspx ) are available within the IDI. Most of these include one sentence descriptions of the IDI variables. Note that this link only works inside the IDI. Data on emergency housing is stored in the table 'IDI Adhoc Payments to Beneficiaries (Third Tier Expenditure)', which stores data on one-off payments made to beneficiaries, i.e., SNGs. +2. As stated in the MSD benefits data dictionary, "the IDI Adhoc Payments to Beneficiaries table are extracted from SWIFTT, UCV and CURAM Replica tables." +3. Identifying the reason for the one-off payments can be located in the Metadata database in the IDI under [IDI_Metadata].[clean_read_CLASSIFICATIONS].[msd_income_support_pay_reason]. The reason code for a beneficiary receiving emergency housing is '855'. This was first identified by Craig Wright from Social Wellbeing Agency (SWA). Although the code '857' is emergency housing contributions, there no individuals with that specific code in the data. +4. [HUD's Quarterly Reports]( https://hud.govt.nz/asserts/News-and-Resources/Statistics-and-Research/Public-housing-reports/Quarterly-reports/Public-housing-Quarterly-Report-March-2021.pdf ) provides an external reference for the number of people that have received Emergency Housing Special Needs Grants. The number of distinct individuals that have applied for emergency housing in the table 'IDI Adhoc Payments to Beneficiaries (Third Tier Expenditure)' gives numbers very close to the numbers reported by HUD. + +References & Contacts: +1. [Work and Income]( https://workandincome.govt.nz/housing/nowhere-to-stay/emergency-housing.html ) provides a brief overview on emergency housing. +2. [Ministry of Housing and Urban Development's (HUD's) Quarterly Reports]( https://hud.govt.nz/asserts/News-and-Resources/Statistics-and-Research/Public-housing-reports/Quarterly-reports/Public-housing-Quarterly-Report-March-2021.pdf ) provides an external point of reference for number of people on emergency housing. +2. Craig Wright at SWA + + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +--------------------------------------------------------------------------------------------------------------------------- +Column Description +name +------------------------------ -------------------------------------------------------------------------------------------- +snz_uid The unique STATSNZ person identifier for the individual in an emergency housing + +data_source A tag signifying a source dataset description (hard-coded to "MSD") + +snz_msd_uid The unique MSD person identifier for the individual in an emergency housing + +start_date The start date for the emergency housing, i.e., the first day that the person applied for a + third tier benefit from MSD to stay at an emergency housing + +end_date The latest date for when the person last consecutively applied for emergency housing benefit + from MSD for the current spell. + +---------------------------------------------------------------------------------------------------------------------------- + +History (reverse order): +31 May 2022 Initial version of the emergency housing code. +12 Oct 2021 Original code by Craig Wright +***************************************************************************************************************************/ + +/* Assign the target database to which all the components need to be created in. */ +USE IDI_UserCode +GO + +/* Delete the database object if it already exists */ +DROP VIEW IF EXISTS [IDI_UserCode].[DL-MAA20XX-YY].[defn_emergency_housing] +GO + +/* Create the database object from the temporary view by merging the any close consecutive applications together */ +CREATE VIEW [IDI_UserCode].[DL-MAA20XX-YY].[defn_emergency_housing] AS + /* Trim out the duplicate rows from the MSD data after filtering for people that have applied for emergency housing */ + WITH + emergency_housing AS ( + SELECT [snz_uid] + ,[snz_msd_uid] + ,[msd_tte_app_date] AS [start_date] + ,dateadd(day, 23, [msd_tte_app_date]) AS [end_date] /* Adds a 23 day threshold between each application so that any consecutive applications made within 23 day thresholds can be joined up together into a single spell. */ + FROM {idicleanversion}.[msd_clean].[msd_third_tier_expenditure] + WHERE [msd_tte_pmt_rsn_type_code] IN ('855') /* Emergency housing code */ + GROUP BY snz_uid, snz_msd_uid, msd_tte_app_date + ), + /* Exclude start dates that are within another spell */ + spell_starts AS ( + SELECT [snz_uid] + ,[snz_msd_uid] + ,[start_date] + ,[end_date] + FROM emergency_housing a + WHERE NOT EXISTS ( + SELECT 1 + FROM emergency_housing b + WHERE a.[snz_uid] = b.[snz_uid] + AND a.[snz_msd_uid] = b.[snz_msd_uid] + AND a.[start_date] > b.[start_date] + AND a.[start_date] <= b.[end_date] + ) + ), + /* Exclude end dates that are within another spell */ + spell_ends AS ( + SELECT [snz_uid] + ,[snz_msd_uid] + ,[start_date] + ,[end_date] + FROM emergency_housing a + WHERE NOT EXISTS ( + SELECT 1 + FROM emergency_housing b + WHERE a.[snz_uid] = b.[snz_uid] + AND a.[snz_msd_uid] = b.[snz_msd_uid] + AND a.[end_date] >= b.[start_date] + AND a.[end_date] < b.[end_date] + ) + ) + /* Compact the emergency housing event dates into spells for people with multiple emergency housing applications */ + SELECT + s.snz_uid + ,'MSD' AS data_source + ,s.snz_msd_uid + ,s.[start_date] /* The first application made by the person */ + ,dateadd(day, -23, min(e.[end_date])) AS [end_date] /* Remove the 23 days added to the application date after the joins have been made */ + FROM spell_starts s + INNER JOIN spell_ends e + ON s.[snz_uid] = e.[snz_uid] + AND s.[snz_msd_uid] = e.[snz_msd_uid] + AND s.[start_date] <= e.[end_date] + GROUP BY s.[snz_uid], s.[snz_msd_uid], s.[start_date] +GO diff --git a/housing/local_authority_housing_final.sql b/housing/local_authority_housing_final.sql new file mode 100644 index 0000000..8537e93 --- /dev/null +++ b/housing/local_authority_housing_final.sql @@ -0,0 +1,299 @@ + +/********************************************************************************** +Title: Local authority housing spells +Author: Todd Nicholson, John Park, and Hubert Zal + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +This code defines spells where clients have lived in local authority (city council) housing. + +Intended purpose: +It works out which addresses look reliably like local authority housing and over which date range. +This includes the individual’s unique identifier, the spell’s (tenancy) duration and the local authority property ID. +It is based on tenancy bond lodgements with MBIE. + +Inputs & Dependencies: +- [IDI_Clean].[dbh_clean].[bond_lodgement] +- [IDI_Clean].[data].[address_notification] + +Outputs: +- [$(PROJSCH)].[loc_aut_housing] + +Output variables: +1. snz_uid: A global unique identifier created by Statistics NZ. There is a snz_uid for each distinct identity in the IDI. +This identifier is changed and reassigned each refresh. +2. ant_notification_date: The start date of the spell (“move-in date”). +The year and month of an address notification entered into [IDI_Clean].[data].[address_notification] +3. ant_replacement_date: The end date of the spell (“move-out date”). +The year and month the address notification was superseded (if the date is 9999-12 it is still current). +4. snz_idi_address_register_uid: The encrypted identifier for this address. + + +Notes: +The following steps are taken to derive the spells where clients have lived in local authority housing: +1. Local authority housing is determined based on the [dbh_bond_landlord_group_code]. +This is an indicator for whether the landlord is a private landlord, or a government entity. +'LOC' signifies Local Authority. We start by getting all the properties that could be Local Authority Housing at some point in the past. +2. We work out when the property appears to have a spell of being used for Local Authority Housing. +This time range will then be used to work out when an individual’s residence spell. +3. Next the address notification of the Local Authority Houses are determined by linking to [data].[address_notification] on [snz_idi_address_register_uid]. +Since the address notification data and bond lodgement date are not always consistent a buffer of 7 days was given when limiting to only the spell ranges. +4. Determine if less than 70% of bond lodgements for each address are considered local authority (‘LOC’). +If yes, then do not count as local authority housing. +5. Determine the number of people per bond lodgement. If more than 12 people do not count as local authority housing. +6. Apply the local authority housing ranges and rules to the address notification. + +Additional notes: +[IDI_Clean].[dbh_clean].[bond_lodgement]: Data on bonds lodged with MBIE since January 2000. +The [dbh_bond_landlord_group_code] is an indicator for whether the landlord is a private landlord, or a government entity. 'LOC' signifies Local Authority. +The start and end dates of a tenancy are defined by [dbh_bond_tenancy_start_date] and [dbh_bond_tenancy_end_date]. If there is no end data an arbitrary date +has been set (9999-01-01). + +Ad Hoc applied based on the distribution: +1. If there are more than 12 people per bond lodgement then do not count as local authority housing. +2. If less than 70% of bond lodgements are local authority then do not count as local authority housing. +From beginning to end of bond LOC lodgement, house must remain LOC for 70% of this time. + + +Parameters & Present values: + Current refresh = $(IDIREF) + Prefix = $(TBLPREF) + Project schema = [$(PROJSCH)] + +Issues: + +History (reverse order): + + +**********************************************************************************/ +--PARAMETERS################################################################################################## +--SQLCMD only (Activate by clicking Query->SQLCMD Mode) +--Already in master.sql; Uncomment when running individually +:setvar TBLPREF "tmp" +:setvar IDIREF "IDI_Clean_YYYYMM" +:setvar PROJSCH "DL-MAA20XX-YY" +GO + +--############################################################################################################## + USE IDI_Sandpit; +-- USE {targetdb}; + +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_loc_addresses]; +-- DROP TABLE IF EXISTS {targetschema}.{projprefix}_moh_dis_assess_elig; +GO + +-- Start be getting all the properties that could be Local Authority Housing at some point in the past. +-- If there is no end date then set the end date to 9999-01-01 +SELECT snz_dbh_bond_uid + ,snz_dbh_property_uid + ,dbh_bond_bedroom_count_code + ,dbh_bond_bond_tenant_count + ,snz_idi_address_register_uid + ,dbh_bond_property_type_text + ,[dbh_bond_bond_lodged_date] + ,[dbh_bond_tenancy_start_date] AS [start_date] + ,ISNULL([dbh_bond_tenancy_end_date], '9999-01-01') AS [end_date] + ,[dbh_bond_landlord_group_code] + ,ISNULL([dbh_bond_bond_closed_date], '9999-01-01') AS [dbh_bond_bond_closed_date] + ,dbh_bond_ta_code +INTO [$(PROJSCH)].[$(TBLPREF)_loc_addresses] +FROM [$(IDIREF)].[dbh_clean].[bond_lodgement] +WHERE snz_idi_address_register_uid IN (SELECT DISTINCT snz_idi_address_register_uid FROM [$(IDIREF)].[dbh_clean].[bond_lodgement] WHERE [dbh_bond_landlord_group_code] = 'LOC') +--ORDER BY snz_idi_address_register_uid, [dbh_bond_tenancy_start_date] + +-- Work out when the property appears to be used for Local Authority Housing. This time range will be used to work out +-- when the counts should be calculated over. +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_pot_loc_range]; + +SELECT snz_idi_address_register_uid, MIN([start_date]) AS first_loc_date, MAX([end_date]) AS last_loc_date +INTO [$(PROJSCH)].[$(TBLPREF)_pot_loc_range] +FROM [$(PROJSCH)].[$(TBLPREF)_loc_addresses] +WHERE [dbh_bond_landlord_group_code] = 'LOC' +GROUP BY snz_idi_address_register_uid + +-- Add those dates to the original dataset +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_loc_add_one]; + +SELECT a.* + ,b.first_loc_date + ,b.last_loc_date +INTO [$(PROJSCH)].[$(TBLPREF)_loc_add_one] +FROM [$(PROJSCH)].[$(TBLPREF)_loc_addresses] AS a +LEFT JOIN [$(PROJSCH)].[$(TBLPREF)_pot_loc_range] AS b +ON a.snz_idi_address_register_uid = b.snz_idi_address_register_uid + +-- From that dataset grab the Local Authority Housing episodes in the appropriate range +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_loc_add_loc]; + +SELECT * +INTO [$(PROJSCH)].[$(TBLPREF)_loc_add_loc] +FROM [$(PROJSCH)].[$(TBLPREF)_loc_add_one] +WHERE [dbh_bond_landlord_group_code] = 'LOC' AND [start_date] >= [first_loc_date] AND [end_date] <= [last_loc_date] + +-- From that dataset grab the non-Local Authority Housing episodes in the appropriate range +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_loc_add_excl]; + +SELECT * +INTO [$(PROJSCH)].[$(TBLPREF)_loc_add_excl] +FROM [$(PROJSCH)].[$(TBLPREF)_loc_add_one] +WHERE [dbh_bond_landlord_group_code] != 'LOC' AND [start_date] >= [first_loc_date] AND [end_date] <= [last_loc_date] + +-- Grab the address notifications for the Local Authority Housing +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_add_notific_loc]; + +SELECT [snz_uid] + ,[ant_notification_date] + ,[ant_replacement_date] + ,[snz_idi_address_register_uid] +INTO [$(PROJSCH)].[$(TBLPREF)_add_notific_loc] +FROM [$(IDIREF)].[data].[address_notification] +WHERE snz_idi_address_register_uid IN (SELECT DISTINCT snz_idi_address_register_uid FROM [$(IDIREF)].[dbh_clean].[bond_lodgement] WHERE [dbh_bond_landlord_group_code] = 'LOC') +ORDER BY snz_idi_address_register_uid, [ant_notification_date] + +-- Add the time ranges to the dataset +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_add_notific_loc_1]; + +SELECT a.* + ,b.first_loc_date + ,b.last_loc_date +INTO [$(PROJSCH)].[$(TBLPREF)_add_notific_loc_1] +FROM [$(PROJSCH)].[$(TBLPREF)_add_notific_loc] AS a +LEFT JOIN [$(PROJSCH)].[$(TBLPREF)_pot_loc_range] AS b +ON a.snz_idi_address_register_uid = b.snz_idi_address_register_uid + +-- And limit to only that range but allow a little bit of additional range to account for timing issues +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_add_notific_loc_lim]; + +SELECT * +INTO [$(PROJSCH)].[$(TBLPREF)_add_notific_loc_lim] +FROM [$(PROJSCH)].[$(TBLPREF)_add_notific_loc_1] +WHERE [ant_notification_date] >= DATEADD(DAY , -7, [first_loc_date]) AND [ant_notification_date] <= [last_loc_date] + +-- Calculate the proportion of Local Authority Housing versus non-Local Authority Housing ones +-- count number of Local Authority Housing bonds at each address +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_loc_counts]; + +SELECT snz_idi_address_register_uid, + [first_loc_date], + [last_loc_date], + COUNT(snz_idi_address_register_uid) AS count_loc +INTO [$(PROJSCH)].[$(TBLPREF)_loc_counts] +FROM [$(PROJSCH)].[$(TBLPREF)_loc_add_loc] +GROUP BY snz_idi_address_register_uid, [first_loc_date], [last_loc_date] + +-- count number of non-Local Authority Housing bonds at each address +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_excl_counts]; + +SELECT snz_idi_address_register_uid, + COUNT(snz_idi_address_register_uid) AS count_excl +INTO [$(PROJSCH)].[$(TBLPREF)_excl_counts] +FROM [$(PROJSCH)].[$(TBLPREF)_loc_add_excl] +GROUP BY snz_idi_address_register_uid + +-- count number of address notifications +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_add_not_counts] + +SELECT snz_idi_address_register_uid, + COUNT(snz_idi_address_register_uid) AS count_add_nots +INTO [$(PROJSCH)].[$(TBLPREF)_add_not_counts] +FROM [$(PROJSCH)].[$(TBLPREF)_add_notific_loc_lim] +GROUP BY snz_idi_address_register_uid + +-- Pull all the counts together +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_count_sum_1]; + +SELECT a.snz_idi_address_register_uid + , a.[first_loc_date] + , a.[last_loc_date] + ,CASE + WHEN a.count_loc IS NULL THEN 0 + ELSE a.count_loc + END AS count_loc + ,CASE + WHEN b.count_excl IS NULL THEN 0 + ELSE b.count_excl + END AS count_excl +INTO [$(PROJSCH)].[$(TBLPREF)_count_sum_1] +FROM [$(PROJSCH)].[$(TBLPREF)_loc_counts] AS a +LEFT JOIN [$(PROJSCH)].[$(TBLPREF)_excl_counts] AS b +ON a.snz_idi_address_register_uid = b.snz_idi_address_register_uid + +/*determine if less than 70% of bond lodgments for each address are considered local authority (‘LOC’). +If yes, then do not count as local authority housing. +Next determine the number of people per bond lodgment. If more than 12 people do not count as local authority housing.*/ +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_count_sum_2]; + +SELECT a.* + ,CASE + WHEN b.count_add_nots IS NULL THEN 0 + ELSE b.count_add_nots + END AS count_add_nots + , CONVERT(DECIMAL(8, 2), ((count_loc*1.)/(count_loc+count_excl))) AS loc_ratio + , CONVERT(DECIMAL(8, 2), ((count_add_nots*1.)/(count_loc+count_excl))) AS notif_ratio + INTO [$(PROJSCH)].[$(TBLPREF)_count_sum_2] +FROM [$(PROJSCH)].[$(TBLPREF)_count_sum_1] AS a +LEFT JOIN [$(PROJSCH)].[$(TBLPREF)_add_not_counts] AS b +ON a.snz_idi_address_register_uid = b.snz_idi_address_register_uid + +-- And apply business rules +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_com_houses]; + +SELECT snz_idi_address_register_uid, + [first_loc_date], + [last_loc_date] +INTO [$(PROJSCH)].[$(TBLPREF)_com_houses] +FROM [$(PROJSCH)].[$(TBLPREF)_count_sum_2] +WHERE loc_ratio > 0.7 AND notif_ratio < 12.0 + +/********************************************************************************************* +Now apply those local authority housing ranges to the address notifications +*********************************************************************************************/ +DROP TABLE IF EXISTS [$(PROJSCH)].[loc_aut_housing]; + +SELECT a.[snz_uid] + ,a.[ant_notification_date] + ,a.[ant_replacement_date] + ,a.[snz_idi_address_register_uid] + -- ,a.[ant_ta_code] + -- ,b.[first_loc_date] + -- ,b.[last_loc_date] +INTO [$(PROJSCH)].[loc_aut_housing] +FROM [$(IDIREF)].[data].[address_notification] AS a +INNER JOIN [$(PROJSCH)].[$(TBLPREF)_com_houses] AS b +ON a.[snz_idi_address_register_uid] = b.[snz_idi_address_register_uid] +WHERE a.[ant_notification_date] >= DATEADD(DAY , 0, b.[first_loc_date]) AND a.[ant_notification_date] < DATEADD(DAY , 0, b.[last_loc_date]) + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[$(PROJSCH)].[loc_aut_housing] (snz_uid); +GO +/* Compress final table to save space */ +ALTER TABLE [IDI_Sandpit].[$(PROJSCH)].[loc_aut_housing] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE); +GO + + +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_loc_addresses]; +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_pot_loc_range]; +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_loc_add_one]; +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_loc_add_loc]; +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_loc_add_excl]; +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_add_notific_loc]; +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_add_notific_loc_1]; +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_add_notific_loc_lim]; +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_loc_counts]; +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_excl_counts]; +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_add_not_counts] +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_count_sum_1]; +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_count_sum_2]; +DROP TABLE IF EXISTS [$(PROJSCH)].[$(TBLPREF)_com_houses]; diff --git a/housing/residential_care.sql b/housing/residential_care.sql new file mode 100644 index 0000000..1f5e45e Binary files /dev/null and b/housing/residential_care.sql differ diff --git a/housing/social_housing_by_snapshot.sql b/housing/social_housing_by_snapshot.sql new file mode 100644 index 0000000..6137503 --- /dev/null +++ b/housing/social_housing_by_snapshot.sql @@ -0,0 +1,135 @@ +/************************************************************************************************** +Title: Spell living in social housing +Author: Simon Anastasiadis +Re-edit: Freya Li + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +A spell for a person living in social housing provided by central government. + +Intended purpose: +Creating indicators of when/whether a person has lived in social housing. +Identifying spells when a person is living in social housing. +Counting the number of days a person spends in social housing. + +Inputs & Dependencies: +- [IDI_Clean_YYYYMM].[hnz_clean].[tenancy_household_snapshot] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[defn_hnz_tenancy] + + +Notes: +1) The snapshot table identifies who was in a house at given points of time. Where the + same person appears in consecutive snapshots we infer they are in the house during the + intervening time. +2) Condensing is used to avoid double counting where different tenancies overlap. + If condensing is slow, pre-filtering the input tables may improve speed. +3) Latest start date is 2021-11-30 + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + Earliest start date = 'YYYY-MM-DD' + Latest end date = 'YYYY-MM-DD' + +Issues: +- Slow. Runtime > 17 minutes. +- Only uses occupancy snapshots. No connection to applications. + Hence limited accurcy as to when people moved into social housing. + +History (reverse order): +2020-06-14 FL update the referesh, table name and correct condensing step +2020-11-23 FL QA +2020-03-03 SA v1 +**************************************************************************************************/ + +/* Condensed spells */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[hnz_tenancy_staging]; +GO + +/* Create staging table */ +SELECT a.[snz_uid] + ,a.[hnz_ths_snapshot_date] AS [start_date] + ,b.[hnz_ths_snapshot_date] AS [end_date] + ,a.[hnz_ths_app_relship_text] + ,a.[hnz_ths_signatory_flg_ind] +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[hnz_tenancy_staging] +FROM [IDI_Clean_YYYYMM].[hnz_clean].[tenancy_household_snapshot] a +INNER JOIN [IDI_Clean_YYYYMM].[hnz_clean].[tenancy_household_snapshot] b +ON a.snz_uid = b.snz_uid +WHERE DATEDIFF(DAY, a.[hnz_ths_snapshot_date], b.[hnz_ths_snapshot_date]) BETWEEN 20 AND 40 -- adjacent months +AND (a.[snz_household_uid] = b.[snz_household_uid] +OR a.[snz_legacy_household_uid] = b.[snz_legacy_household_uid]) +AND a.[hnz_ths_snapshot_date] BETWEEN 'YYYY-MM-DD' AND 'YYYY-MM-DD' +AND b.[hnz_ths_snapshot_date] BETWEEN 'YYYY-MM-DD' AND 'YYYY-MM-DD'; +GO + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[hnz_tenancy_staging] (snz_uid); +GO + +/* Condensed spells */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_hnz_tenancy]; +GO + +/* create table with condensed spells */ +WITH +/* exclude start dates that are within another spell */ +spell_starts AS ( + SELECT [snz_uid], [start_date] + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[hnz_tenancy_staging] s1 + WHERE [start_date] <= [end_date] + AND NOT EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[hnz_tenancy_staging] s2 + WHERE s1.snz_uid = s2.snz_uid + AND DATEADD(DAY, -1, s1.[start_date]) BETWEEN s2.[start_date] AND s2.[end_date] + ) +), +/* exclude end dates that are within another spell */ +spell_ends AS ( + SELECT [snz_uid], [end_date] + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[hnz_tenancy_staging] t1 + WHERE [start_date] <= [end_date] + AND NOT EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[hnz_tenancy_staging] t2 + WHERE t2.snz_uid = t1.snz_uid + AND YEAR(t1.[end_date]) <> 9999 + AND DATEADD(DAY, 1, t1.[end_date]) BETWEEN t2.[start_date] AND t2.[end_date] + --AND IIF(YEAR(t1.[end_date]) = 9999, t1.[end_date], DATEADD(DAY, 1, t1.[end_date])) BETWEEN t2.[start_date] AND t2.[end_date] + ) +) +SELECT s.snz_uid, s.[start_date], MIN(e.[end_date]) as [end_date] +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_hnz_tenancy] +FROM spell_starts s +INNER JOIN spell_ends e +ON s.snz_uid = e.snz_uid +AND s.[start_date] <= e.[end_date] +GROUP BY s.snz_uid, s.[start_date] +GO + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[defn_hnz_tenancy] (snz_uid); +GO +/* Compress final table to save space */ +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[defn_hnz_tenancy] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE); +GO + +/* Clear staging table */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[hnz_tenancy_staging]; +GO + diff --git a/housing/social_housing_by_tenancy.sql b/housing/social_housing_by_tenancy.sql new file mode 100644 index 0000000..c38067c --- /dev/null +++ b/housing/social_housing_by_tenancy.sql @@ -0,0 +1,135 @@ +/************************************************************************************************** +Title: Social housing +Author: Simon Anastasiadis +Reviewer: Akilesh Chokkanathapuram + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Government provided social housing - application and residence. + +Intended purpose: +Identify social housing applications +Identify social housing tenancy + +Inputs & Dependencies: +- [IDI_Clean].[hnz_clean].[new_applications] +- [IDI_Clean].[hnz_clean].[new_applications_household] +- [IDI_Clean].[hnz_clean].[tenancy_household_snapshot] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[vacc_hnz_apply] +- [IDI_UserCode].[DL-MAA20XX-YY].[vacc_hnz_tenancy] +- [IDI_UserCode].[DL-MAA20XX-YY].[vacc_current_hnz_tenancy] + +Notes: +1) The social housing application tables can join on any of three different IDs. + - The oldest is snz_legacy_application_uid + - Next is snz_application_uid + - The latest is snz_msd_application_uid + These different IDs were phased in progressively, so there is an overlap in time + periods using each type of IDs, and some records have two different IDs. + +2) Similar patterns are observed for the household identities. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = vacc_ + Project schema = DL-MAA20XX-YY + +Issues: +1) Performance may be poor using all three of these as Views. Converting to indexed Tables + may improve performance. +2) Captures application dates and move in dates well. As occupants may change uring a tenancy + may not provide accurate measure of who lives in household (esp. during long or volatile + tenancies). + +History (reverse order): +2021-08-31 MP Parameterise for COVID-19 vaccination modelling +2020-08-18 MP Parameterise for Nga Tapuae +2019-04-23 AK Reviewed +2019-04-01 SA Initiated +**************************************************************************************************/ + +/*embedded in user code*/ +USE IDI_UserCode +GO + +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[vacc_hnz_apply]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[vacc_hnz_apply] AS +SELECT snz_uid + ,[hnz_na_date_of_application_date] + ,'apply social housing' AS [description] +FROM ( + -- join by MSD application ID (latest) + SELECT b.snz_uid + ,a.[hnz_na_date_of_application_date] + FROM [IDI_Clean_YYYYMM].[hnz_clean].[new_applications] a + INNER JOIN [IDI_Clean_YYYYMM].[hnz_clean].[new_applications_household] b + ON a.[snz_msd_application_uid] = b.[snz_msd_application_uid] + + UNION ALL + + -- join by application ID (a little old) + SELECT b.snz_uid + ,a.[hnz_na_date_of_application_date] + FROM [IDI_Clean_YYYYMM].[hnz_clean].[new_applications] a + INNER JOIN [IDI_Clean_YYYYMM].[hnz_clean].[new_applications_household] b + ON a.[snz_application_uid] = b.[snz_application_uid] + WHERE a.[snz_msd_application_uid] IS NULL OR b.[snz_msd_application_uid] IS NULL -- MSD application ID unavailable + + UNION ALL + + -- join by legacy application ID (quite old) + SELECT b.snz_uid + ,a.[hnz_na_date_of_application_date] + FROM [IDI_Clean_YYYYMM].[hnz_clean].[new_applications] a + INNER JOIN [IDI_Clean_YYYYMM].[hnz_clean].[new_applications_household] b + ON a.[snz_legacy_application_uid] = b.[snz_legacy_application_uid] + WHERE (a.[snz_msd_application_uid] IS NULL OR b.[snz_msd_application_uid] IS NULL) -- MSD application ID unavailable + AND (a.[snz_application_uid] IS NULL OR b.[snz_application_uid] IS NULL) -- application ID unavailable + +) k +GO + +/* Social housing tenancy */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[vacc_hnz_tenancy]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[vacc_hnz_tenancy] AS +SELECT a.[snz_uid] + ,a.[hnz_ths_snapshot_date] AS [start_date] + ,b.[hnz_ths_snapshot_date] AS [end_date] + ,'HNZ tenant' AS [description] +FROM [IDI_Clean_YYYYMM].[hnz_clean].[tenancy_household_snapshot] a +INNER JOIN [IDI_Clean_YYYYMM].[hnz_clean].[tenancy_household_snapshot] b +ON a.snz_uid = b.snz_uid +WHERE DATEDIFF(DAY, a.[hnz_ths_snapshot_date], b.[hnz_ths_snapshot_date]) >= 20 -- snapshots are 20-40 days apart +AND DATEDIFF(DAY, a.[hnz_ths_snapshot_date], b.[hnz_ths_snapshot_date]) <= 40 +AND (a.[snz_household_uid] = b.[snz_household_uid] -- same household +OR a.[snz_legacy_household_uid] = b.[snz_legacy_household_uid]) +GO + +/* Current social housing tenancy */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[vacc_current_hnz_tenancy] +GO + +CREATE VIEW [DL-MAA20XX-YY].[vacc_current_hnz_tenancy] AS +SELECT DISTINCT [snz_uid] + ,'Current_HNZ_tenant_June21' AS [description] +FROM [IDI_Clean_YYYYMM].[hnz_clean].[tenancy_household_snapshot] +WHERE MONTH([hnz_ths_snapshot_date]) = 6 +AND YEAR ([hnz_ths_snapshot_date]) = 2021 +GO diff --git a/housing/social_housing_waitlist.sql b/housing/social_housing_waitlist.sql new file mode 100644 index 0000000..bff5a68 --- /dev/null +++ b/housing/social_housing_waitlist.sql @@ -0,0 +1,154 @@ +/************************************************************************************************** +Title: Spell waitlisted for social housing +Author: Simon Anastasiadis + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +A spell when a person is waitlisted for social housing provided by central government. + +Intended purpose: +Creating indicators of when/whether a person is waiting for social housing. +Identifying spells when a person is waiting for social housing. + +Inputs & Dependencies: +- [IDI_Clean].[hnz_clean].[new_applications_household] +- [IDI_Clean].[hnz_clean].[new_applications] +- [IDI_Clean].[hnz_clean].[register_exit] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[defn_hnz_waitlist] + +Notes: +1) HNZ tenancy data includes three different identifiers for linking separate tables. + These IDs have some complications: + - Some records with only one ID (e.g. a legacy ID) link to records with two IDs + (e.g. a legacy ID and an MSD ID). + - The same number can appear in different ID columns. Hence legacy IDs can be + incorrectly linked to MSD IDs. + We use a simplified approach: link using each of the three IDs and keep the records + that link. This means the resulting data does include duplicates where a record can + link on more than one ID. +2) We have only included new applications for people not in social housing. This means we + have excluded transfer applications: Social housing tenants requesting a movement to + another address (e.g. because an additional bedroom is required). +3) The [hnz_na_hshd_size_nbr] column of [new_applications] may contain the number of + people on the application. However the number that we can find on the household record + [new_applications_household] is approximately three-quarters of this amount. Though + the gap is smaller in recent years. + The cause of this difference is unknown. +4) An alternative way to approach this calculation would be to use the registry snapshots + in a similar way to how we used the tenancy snapshots. A comparison of these two ways + remains to be done. +5) Requires spell condensing in order to count the number of days a person spends waiting + for social housing. Spell condensing should also deduplicate. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +Issues: + +History (reverse order): +2020-05-19 SA v1 +**************************************************************************************************/ + +/* Staging */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_hnz_waitlist]; +GO + +WITH +via_application_uid AS ( + SELECT h.[snz_uid] + ,h.[hnz_nah_app_relship_text] + ,h.[hnz_nah_signatory_flg_ind] + ,a.[hnz_na_date_of_application_date] + ,a.[hnz_na_hshd_size_nbr] + ,e.[hnz_re_exit_date] + ,e.[hnz_re_exit_status_text] + ,e.[hnz_re_exit_reason_text] + ,h.[snz_application_uid] AS ID + FROM [IDI_Clean_YYYYMM].[hnz_clean].[new_applications_household] h + INNER JOIN [IDI_Clean_YYYYMM].[hnz_clean].[new_applications] a + ON h.[snz_application_uid] = a.[snz_application_uid] + LEFT JOIN [IDI_Clean_YYYYMM].[hnz_clean].[register_exit] e + ON h.[snz_application_uid] = e.[snz_application_uid] + WHERE h.[snz_application_uid] IS NOT NULL +), +via_legacy_application_uid AS ( + SELECT h.[snz_uid] + ,h.[hnz_nah_app_relship_text] + ,h.[hnz_nah_signatory_flg_ind] + ,a.[hnz_na_date_of_application_date] + ,a.[hnz_na_hshd_size_nbr] + ,e.[hnz_re_exit_date] + ,e.[hnz_re_exit_status_text] + ,e.[hnz_re_exit_reason_text] + ,h.[snz_legacy_application_uid] AS ID + FROM [IDI_Clean_YYYYMM].[hnz_clean].[new_applications_household] h + INNER JOIN [IDI_Clean_YYYYMM].[hnz_clean].[new_applications] a + ON h.[snz_legacy_application_uid] = a.[snz_legacy_application_uid] + LEFT JOIN [IDI_Clean_YYYYMM].[hnz_clean].[register_exit] e + ON h.[snz_legacy_application_uid] = e.[snz_legacy_application_uid] + WHERE h.[snz_legacy_application_uid] IS NOT NULL +), +via_msd_application_uid AS ( + SELECT h.[snz_uid] + ,h.[hnz_nah_app_relship_text] + ,h.[hnz_nah_signatory_flg_ind] + ,a.[hnz_na_date_of_application_date] + ,a.[hnz_na_hshd_size_nbr] + ,e.[hnz_re_exit_date] + ,e.[hnz_re_exit_status_text] + ,e.[hnz_re_exit_reason_text] + ,h.[snz_msd_application_uid] AS ID + FROM [IDI_Clean_YYYYMM].[hnz_clean].[new_applications_household] h + INNER JOIN [IDI_Clean_YYYYMM].[hnz_clean].[new_applications] a + ON h.[snz_msd_application_uid] = a.[snz_msd_application_uid] + LEFT JOIN [IDI_Clean_YYYYMM].[hnz_clean].[register_exit] e + ON h.[snz_msd_application_uid] = e.[snz_msd_application_uid] + WHERE h.[snz_msd_application_uid] IS NOT NULL +) +SELECT [snz_uid] + ,[hnz_nah_app_relship_text] + ,[hnz_nah_signatory_flg_ind] + ,[hnz_na_date_of_application_date] + ,[hnz_na_hshd_size_nbr] + ,COALESCE([hnz_re_exit_date], DATEADD(YEAR, 3, [hnz_na_date_of_application_date])) AS [hnz_re_exit_date] + ,[hnz_re_exit_status_text] + ,[hnz_re_exit_reason_text] + ,ID +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_hnz_waitlist] +FROM ( + SELECT * + FROM via_application_uid + UNION ALL + SELECT * + FROM via_legacy_application_uid + UNION ALL + SELECT * + FROM via_msd_application_uid +) k +WHERE [hnz_na_date_of_application_date] <= [hnz_re_exit_date] +OR [hnz_re_exit_date] IS NULL +GO + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[defn_hnz_waitlist] (snz_uid); +GO + +/* Compress final table to save space */ +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[defn_hnz_waitlist] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE); +GO + diff --git a/income/annual_income_incl_benefit.sql b/income/annual_income_incl_benefit.sql new file mode 100644 index 0000000..f87b2a0 --- /dev/null +++ b/income/annual_income_incl_benefit.sql @@ -0,0 +1,341 @@ +/*************************************************************************************************************************************** +Title: Tax year income summary (including benefits) +Author: Freya Li +Reviewer: Simon Anastasiadis + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Summary of total income (including non-taxible income) for each tax year. + +Intended purpose: +1. Calculating annual income (including 2nd tier benefit , 3rd tier benefit and WWF) from different sources and in grand total. +2. Identifying whether a person received income from any source (taxible and non-taxible). + + +Inputs & Dependencies: +- [IDI_Clean].[data].[income_tax_yr_summary] +- [IDI_Clean].[msd_clean].[msd_second_tier_expenditure] +- [IDI_Clean].[msd_clean].[msd_third_tier_expenditure] +- [IDI_Clean].[wff_clean].[fam_return_dtls] + +Output: +- [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_income_bnt_wff_tax_year] + +Notes: +1. Not all income is taxible. 2nd tier benefits, 3rd tier benefits, and WfF tax credits + are non-taxible and hence are not reported to IRD. This income is not part of annual + total taxible income but is part of annual total income. + +2. Full OUTER JOIN been considered because some people only have benefit or salaries or wwf as the only source of income + There are around 3% of data in second tier benefit couldn't link to ir data every year. + There are around 2% of data in third tier benefit couldn't link to ir data every year. + There are around 10% of data in wff couldn't link to ir data every year. + Potential reason is that IR records only include the "active items that have non-zero partnership, self-employment, + or shareholder salary income". + Although WFF spells table do not have any linking issue, and it seems to be a alternative, + the records only available for (2003--2013), thus we won't consider this table. + + +3. Table reference period + [msd_second_tier_expenditure] April 1990 � July 2020 + [msd_third_tier_expenditure] November 2009 � July 2020 + [wff_clean].[fam_return_dtls] March 2000 � March 2020 + [income_tax_yr_summary] April 1995 � March 2020 + + (the above dates take from data dictionaries) + We have added the condition(:where year>=2010) for all analysis in this script to speed up the performance and save memory. + +4. [inc_tax_yr_sum_year_nbr] + The year in which income was derived. The tax year of 2001 would run from 01 April 2000 � 31 March 2001. + The calendar year of 2001 would run from 01 Jan � 31 Dec 2001. + +5. In the table [IDI_Clean_20201020].[msd_clean].[msd_third_tier_expenditure], we only consider [msd_tte_recoverable_ind] = 'N', + as [msd_tte_recoverable_ind] = 'Y' means the participants have to pay back the money. + + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = d2gP2_ + Project schema = [DL-MAA20XX-YY] + earliest_year = 2010 + + +Issues: + +1. There are around 3.5% negative values in the column [wff_pmt] (which is the work for family payment). + More than 95% of thes negative values are smaller than -50. + + 22% of the population have ever got negative wff payment. + 6% of the population got overall negative wff payment if we sum up the wff payment from all the years + + To soleve the issue, we add the negative wff payment back, and deduct the abs(negative) from previous year. + The adjustment decrease the percentage of negative payment to 2%. + There are still 16% people who have ever got negative wff payment. + This is because if two successive years continiously have negative payment, then there will be negative values + after adjustment. Or if the payment from the previous year is smaller than the abs(negative payment) from current year. + The third reason is that the overall payment is negative. + year1,year2,year3, year4,year5, year6 + eg. the wff payment of $600, $200, $-400, $300, $-100, $-200 + after ajustment would be $600, $-200, $0 , $200,$-200, $0 + + It seems unlikely that people would keep overpaying having already overpaid, or they get overall negative payment. + It may caused by data missing from the table. + + +2. runtime for [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_benefit_tier2]: 18 mins + + +History (reverse order): +2021-06-10 SA QA +2021-02-03 FL v3 +2021-01-26 SA QA +2020-12-10 FL v2 include tier 2, tier 3 and wff tax credit into the income +2020-07-16 MP QA +2020-03-02 SA v1 +***************************************************************************************************************************************/ + + +/* Establish database for writing views */ +USE IDI_UserCode +GO + + +/* Wages and salaries by tax year */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[temp_income_tax_year]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[temp_income_tax_year] AS +SELECT * + ,DATEFROMPARTS([inc_tax_yr_sum_year_nbr], 3, 31) AS [event_date] + ,DATEFROMPARTS([inc_tax_yr_sum_year_nbr], 1, 1) AS [start_date] + ,DATEFROMPARTS([inc_tax_yr_sum_year_nbr], 12, 31) AS [end_date] +FROM [IDI_Clean_YYYYMM].[data].[income_tax_yr_summary] +WHERE [inc_tax_yr_sum_year_nbr]>=2010 +GO + +/********************************************************************* +second tier benefits by tax year +*********************************************************************/ + +-- Creating a temporary dates table +GO + +CREATE TABLE #tax_year +(yr_tax INT, +tax_year_start DATE, +tax_year_end DATE +) + +DECLARE @y_min INT = 2010 +DECLARE @y_max INT = 2020 + +WHILE @y_min <= @y_max +BEGIN + INSERT INTO #tax_year + SELECT @y_min+1 AS yr_tax + ,DATEFROMPARTS(@y_min, 4, 1) as tax_year_start + ,DATEFROMPARTS(@y_min+1, 3, 31) as tax_year_end + SET @y_min = @y_min + 1 +END + + +/*Break down second tier benefit into tax years*/ + +-- drop table before re-creating +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_benefit_tier2]; +GO +WITH dates_breakdown AS ( +SELECT t2.* + ,t1.[snz_uid] + ,t1.[msd_ste_start_date] + ,t1.[msd_ste_end_date] + ,t1.[msd_ste_daily_gross_amt] + ,t1.[msd_ste_period_nbr] + ,IIF([msd_ste_start_date] > [tax_year_start], [msd_ste_start_date], [tax_year_start]) AS [start_date_tax] + ,IIF(msd_ste_end_date < tax_year_end, msd_ste_end_date, tax_year_end) AS [end_date_tax] +FROM [IDI_Clean_YYYYMM].[msd_clean].[msd_second_tier_expenditure] t1 +INNER JOIN #tax_year t2 +ON msd_ste_start_date < tax_year_end AND tax_year_start < msd_ste_end_date +WHERE YEAR([msd_ste_start_date])>=2010 +), + +payment_breakdown AS ( +SELECT + [snz_uid] + ,[yr_tax] + ,[start_date_tax] + ,[end_date_tax] + ,[msd_ste_start_date] + ,[msd_ste_end_date] + ,(DATEDIFF(DAY, [start_date_tax], [end_date_tax])+1) AS [period_num] + ,(DATEDIFF(DAY, [start_date_tax], [end_date_tax])+1) * [msd_ste_daily_gross_amt] AS [gross_payment] +FROM dates_breakdown +) +SELECT [snz_uid] + ,DATEFROMPARTS([yr_tax], 1, 1) AS [start_date] + ,DATEFROMPARTS([yr_tax], 12, 31) AS [end_date] + ,SUM([gross_payment]) AS [bet_pmt_tier2] +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_benefit_tier2] +FROM payment_breakdown +GROUP BY [snz_uid], [yr_tax] + +CREATE INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_benefit_tier2] ([snz_uid]); +GO + + /********************************************************************* + third tier benefits by tax year + *********************************************************************/ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[tmp_benefit_tier3]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[tmp_benefit_tier3] AS +SELECT [snz_uid] + ,DATEFROMPARTS([yr_tax], 1, 1) AS [start_date] + ,DATEFROMPARTS([yr_tax], 12, 31) AS [end_date] + ,SUM([msd_tte_pmt_amt]) as [bet_pmt_tier3] +FROM( + SELECT [snz_uid] + ,IIF(MONTH([msd_tte_decision_date])<4, YEAR([msd_tte_decision_date]), YEAR([msd_tte_decision_date]) + 1) AS [yr_tax] + ,[msd_tte_pmt_amt] + FROM [IDI_Clean_YYYYMM].[msd_clean].[msd_third_tier_expenditure] + WHERE [msd_tte_recoverable_ind] = 'N' + AND YEAR([msd_tte_decision_date]) >= 2010 + ) tier3 + GROUP BY [snz_uid], [yr_tax] +GO + + + + /********************************************************************* + Work for family payment + *********************************************************************/ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[tmp_wff]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[tmp_wff] AS +SELECT [snz_uid] + ,[tax_year] + ,DATEFROMPARTS([tax_year], 1, 1) AS [start_date] + ,DATEFROMPARTS([tax_year], 12, 31) AS [end_date] + ,SUM([wff_pmt_prt]) AS [wff_pmt] +FROM ( + SELECT [snz_uid] + ,IIF(MONTH([wff_frd_return_period_date])<4, YEAR([wff_frd_return_period_date]), YEAR([wff_frd_return_period_date])+1) AS [tax_year] + ,(COALESCE([wff_frd_fam_paid_amt], 0) - COALESCE([wff_frd_winz_paid_amt],0) - COALESCE([wff_frd_final_dr_cr_amt], 0)) AS [wff_pmt_prt] + FROM [IDI_Clean_YYYYMM].[wff_clean].[fam_return_dtls] + WHERE YEAR([wff_frd_return_period_date])>=2010 + ) a +GROUP BY [snz_uid], [tax_year] +GO + +-- add the negative wff payment back, and deduct it from previous wff payment + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_wff_neg_adj]; +GO + + SELECT * + ,wff_pmt + neg_adj + COALESCE(lead_neg_deduct,0) AS wff_pmt_neg_adj +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_wff_neg_adj] +FROM( + SELECT * + ,IIF(wff_pmt<0,-wff_pmt,0) AS neg_adj + ,LEAD(IIF(wff_pmt<0, wff_pmt,0)) OVER(PARTITION BY snz_uid ORDER BY start_date) AS lead_neg_deduct + FROM [DL-MAA20XX-YY].[tmp_wff] + ) a +GO + +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_wff_neg_adj] (snz_uid); +GO + +/***********************************************************/ + +--join bnt_wff table with ir income table + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_income_bnt_wff_tax_year]; +GO + + +SELECT COALESCE(ir.[snz_uid], tier2.[snz_uid],tier3.[snz_uid], wff.[snz_uid]) AS [snz_uid] + ,COALESCE(ir.[start_date],tier2.[start_date], tier3.[start_date], wff.[start_date]) AS [start_date] + ,COALESCE(ir.[end_date], tier2.[end_date],tier3.[end_date],wff.[end_date]) AS[end_date] + ,ir.[event_date] + ,ir.[inc_tax_yr_sum_WAS_tot_amt] /* wages & salaries */ + ,ir.[inc_tax_yr_sum_WHP_tot_amt] /* withholding payments (schedular payments with withholding taxes) */ + ,ir.[inc_tax_yr_sum_BEN_tot_amt] /* benefits */ + ,ir.[inc_tax_yr_sum_ACC_tot_amt] /* ACC claimants compensation */ + ,ir.[inc_tax_yr_sum_PEN_tot_amt] /* pensions (superannuation) */ + ,ir.[inc_tax_yr_sum_PPL_tot_amt] /* Paid parental leave */ + ,ir.[inc_tax_yr_sum_STU_tot_amt] /* Student allowance */ + ,ir.[inc_tax_yr_sum_C00_tot_amt] /* Company director/shareholder income from IR4S */ + ,ir.[inc_tax_yr_sum_C01_tot_amt] /* Comapny director/shareholder receiving PAYE deducted income */ + ,ir.[inc_tax_yr_sum_C02_tot_amt] /* Company director/shareholder receiving WHT deducted income */ + ,ir.[inc_tax_yr_sum_P00_tot_amt] /* Partnership income from IR20 */ + ,ir.[inc_tax_yr_sum_P01_tot_amt] /* Partner receiving PAYE deducted income */ + ,ir.[inc_tax_yr_sum_P02_tot_amt] /* Partner receiving withholding tax deducted income */ + ,ir.[inc_tax_yr_sum_S00_tot_amt] /* Sole trader income from IR3 */ + ,ir.[inc_tax_yr_sum_S01_tot_amt] /* Sole Trader receiving PAYE deducted income */ + ,ir.[inc_tax_yr_sum_S02_tot_amt] /* Sole trader receiving withholding tax deducted income */ + ,ir.[inc_tax_yr_sum_S03_tot_amt] /* Rental income from IR3 */ + ,ir.[inc_tax_yr_sum_all_srces_tot_amt] /*The total earnings for the individual for the tax or calendar year in year_nbr sourced from wages and salaries.*/ + ,tier2.[bet_pmt_tier2] + ,tier3.[bet_pmt_tier3] + ,wff.[wff_pmt_neg_adj] + ,COALESCE(ir.[inc_tax_yr_sum_all_srces_tot_amt], 0) + + COALESCE(tier2.[bet_pmt_tier2], 0) + + COALESCE(tier3.[bet_pmt_tier3], 0) + + COALESCE(wff.[wff_pmt_neg_adj], 0) AS [inc_tax_yr_inc_bnt] /*income including the benefit of the tax year*/ + ,COALESCE(ir.[inc_tax_yr_sum_BEN_tot_amt], 0) + + COALESCE(tier2.[bet_pmt_tier2], 0) + + COALESCE(tier3.[bet_pmt_tier3], 0) AS [inc_tax_yr_all_bnt] /*all benefit from the tax year*/ +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_income_bnt_wff_tax_year] +FROM [DL-MAA20XX-YY].[temp_income_tax_year] ir +FULL JOIN [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_benefit_tier2] tier2 +ON ir.[snz_uid] = tier2.[snz_uid] +AND ir.[start_date] = tier2.[start_date] +AND ir.[end_date] = tier2.[end_date] +FULL JOIN [DL-MAA20XX-YY].[tmp_benefit_tier3] tier3 +ON ir.[snz_uid] = tier3.[snz_uid] +AND ir.[start_date] = tier3.[start_date] +AND ir.[end_date] = tier3.[end_date] +FULL JOIN [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_wff_neg_adj] wff +ON ir.[snz_uid] = wff.[snz_uid] +AND ir.[start_date] = wff.[start_date] +AND ir.[end_date] = wff.[end_date] +GO +--running the code takes about 15 mins with index +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_income_bnt_wff_tax_year] (snz_uid, start_date, end_date); +GO + +/* Compress to save space (takes 12 minutes) */ +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_income_bnt_wff_tax_year] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE) + + + + +/******************************************************************************** +Tidy up and remove all temporary tables/views that have been created +********************************************************************************/ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[temp_income_tax_year]; +GO +GO +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_benefit_tier2]; +GO +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[tmp_benefit_tier3]; +GO +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[tmp_wff]; +GO +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[tmp_wff_neg_adj]; +GO + diff --git a/income/bankrupt_complete.sql b/income/bankrupt_complete.sql new file mode 100644 index 0000000..71208d1 --- /dev/null +++ b/income/bankrupt_complete.sql @@ -0,0 +1,77 @@ +/************************************************************************************************** +Title: Bankrupt/insolvent +Author: Simon Anastasiadis +Reviewer: Marianna Pekar + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +People who have become insolvent / completed bankruptcy proceedings. + +Intended purpose: +1. Identifying when/whether people become bankrupt or insolvent. +2. Indicator for have you ever been bankrupt. + +Inputs & Dependencies: +- [IDI_Clean].[ir_clean].[ird_cross_reference] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[d2g_bankruptcy] + +Notes: +1) Conversation with Joanne Butterfield (IRD) on 2020-01-08: + - IRD applies the 'B' flag in the customers table when any bankruptcy proceedings + are in progress. These do not always result in an insolvency (if people can + resume making repayments or if another process is used, such as a no-asset proceedure). + - New IRD numbers are issues when a person has completed bankruptcy/insolvency. + Therefore + - We use the customers table as an indication of 'trouble making repayments' + - We use the cross-reference table to identify people declared bankrupt. +2) The New Zealand Insolvency and Trustee Service (insolvency.govt.nz) is the government + office (part of MBIE) who handles insolvency. They distinguish between bankruptcy + (which applies to individuals) and liquidation & receivership (which apply to companies). + Sole traders are trated as individuals, because they are not distinct legal entities. +3) This definition is for people who have been declared bankrupt. It produces similar numbers + per year reported by the New Zealand Insovency and Trustee Service, the Official Assignee. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = d2g_ + Project schema = [DL-MAA20XX-YY] + +Issues: + +History (reverse order): +2020-07-16 MP QA +2020-03-02 SA v1 +**************************************************************************************************/ + +/* Establish database for writing views */ +USE IDI_UserCode +GO + +/* Clear view */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[d2g_bankruptcy]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[d2g_bankruptcy] AS +SELECT [snz_uid] + ,ir_xrf_applied_date AS [event_date] + ,ir_xrf_applied_date AS [start_date] + ,ir_xrf_ceased_date AS [end_date] -- should be '9999-12-31' as issue of new IRD number is irrevesable + ,ir_xrf_reference_type_code -- for confirming BAN code + ,ir_xrf_ird_timestamp_date +FROM [IDI_Clean_YYYYMM].[ir_clean].[ird_cross_reference] +WHERE ir_xrf_reference_type_code = 'BAN' +AND ir_xrf_applied_date <= ir_xrf_ceased_date; +GO diff --git a/income/bankruptcy_proceeding.sql b/income/bankruptcy_proceeding.sql new file mode 100644 index 0000000..b96749b --- /dev/null +++ b/income/bankruptcy_proceeding.sql @@ -0,0 +1,74 @@ +/************************************************************************************************** +Title: Bankrupt proceedings +Author: Simon Anastasiadis +Reviewer: Marianna Pekar + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Any bankruptcy proceedings against a person, irrespective of whether they result in the person being declared bankrupt/insolvent. + +Intended purpose: +Identifying when/whether people are having trouble making repayments on their debt. + +Inputs & Dependencies: +- [IDI_Clean].[ir_clean].[ird_customers] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[d2g_bankruptcy_proceeding] + +Notes: +1) Conversation with Joanne Butterfield (IRD) on 2020-01-08: + - IRD applies the 'B' flag in the customers table when any bankruptcy proceedings + are in progress. These do not always result in an insolvency (if people can + resume making repayments or if another process is used, such as a no-asset proceedure). + - New IRD numbers are issues when a person has completed bankruptcy/insolvency. + Therefore + - We use the customers table as an indication of 'trouble making repayments' + - We use the cross-reference table to identify people declared bankrupt. +2) The New Zealand Insolvency and Trustee Service (insolvency.govt.nz) is the government + office (part of MBIE) who handles insolvency. They distinguish between bankruptcy + (which applies to individuals) and liquidation & receivership (which apply to companies). + Sole traders are trated as individuals, because they are not distinct legal entities. +3) This definition is for people who have any bankruptcy proceeding against them. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = d2g_ + Project schema = [DL-MAA20XX-YY] + +Issues: + +History (reverse order): +2020-07-17 MP QA +2020-03-02 SA v1 +**************************************************************************************************/ + +/* Establish database for writing views */ +USE IDI_UserCode +GO + +/* Clear view */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[d2g_bankruptcy_proceeding]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[d2g_bankruptcy_proceeding] AS +SELECT [snz_uid] + ,[ir_cus_applied_date] AS [start_date] + ,[ir_cus_ceased_date] AS [end_date] + ,[ir_cus_client_status_code] +FROM [IDI_Clean_YYYYMM].[ir_clean].[ird_customers] +WHERE [ir_cus_client_status_code] IN ('B', 'U', 'R', 'L') -- Bankrupt or Undischarged bankrupt, or Receivership or Liquidator (though R & L should not apply to individuals) +AND [ir_cus_entity_type_code] = 'I' -- individual +AND [ir_cus_applied_date] <= [ir_cus_ceased_date]; +GO diff --git a/income/classifying MSD debt using T3 data.sql b/income/classifying MSD debt using T3 data.sql new file mode 100644 index 0000000..c432652 --- /dev/null +++ b/income/classifying MSD debt using T3 data.sql @@ -0,0 +1,251 @@ +Title: Classified Debt to MSD +Author: Simon Anastasiadis +Reviwer: Verity Warn + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + + +Description: +Debt to MSD categorised by type. + +Intended purpose: +1. Intermediate table for detailed MSD debt to be extracted from. +2. Identifying debtors, type of debt, and changes in debt. + +Inputs & Dependencies: +- [IDI_Adhoc].[clean_read_DEBT].[msd_debt_30sep] +- [IDI_Clean].[security].[concordance] +- [IDI_Clean].[msd_clean].[msd_third_tier_expenditure] + +Outputs: +- [IDI_Sandpit].[DL-MAA2020-01].[d2gP2_classified_msd_debt] + +Notes: +1. Date range for table [msd_debt_30sep] is 2009-01-01 to 2020-09-01. Existing balances + are created as new principle (amount incurred) on the opening date. +2. Numbers represent changes in debt owing. So principle is positive, repayments and + write-offs are negative. A small number of repayments and write-offs are positive + we assume these are reversals - and they increase debt in the same way as principle. +3. MSD debt data available at time of analysis does not classify MSD debt by type. We + use the following method to classify MSD debt into recoverable assistance and + overpayment: + A) Fetch all T3 debt, filter to recoverable payments + B) Split payments into 'within' month and 'at boundary of month' + with 'boundary' defined as the last four days in the month. + C) Group by identity and month and sum both dollar amounts. + D) Join T3 payments to MSD debt + E) Classify debt incurred into T3 (recoverable assistance) using the following order: + - T3 payments 'within' a month + - T3 payments on the border of the month + - T3 payments on the border of the previous month that do not fit in previous month + F) Any debt principle amounts not classified as due to T3 (recoverable) are classified + as due to other, non-T3 causes (overpayment). +4. Consideration of 'within' vs. 'border' is necessary because recoverable assistance + payments approved/made near the end of a month are sometimes recorded in the month + and other times recorded in the next month. +5. Quality of this classification was assessed and found to be good. Consistent with other known + sources of error in the IDI or better: + - Of 8.9 million records, only 104,000 pass principle on the border to the next record. + Of these we at most 7,700 could have error in timing (the wrong amount of border + principle passed to the next month). But we estimate this is likely for only 1,600 records. + - For 276,000 records (~3%) the amount of recoverable assistance from the T3 table is greater + than the amount of debt recorded. Some of this will be due to differences in approvals + and payment (e.g. $100 approved but only $99.97 needed). + - Most debt classified as overpayment occurs within a benefit spell or at the end of a benefit spell + as expected. +6. We do not attempt to distinguish between innocent overpayment debt and fraud. For two key reasons: + A) No robust method for distinguishing the two in the available data. + B) There are fewer than 600 fraud prosecutions each year but about 250,000 clients with an overpayment + debt each year. So the percent of fraud cases is too small for approximate methods to work well + (number from documents released by MSD under OIA). + The best proxy we tested was to treat all principle amounts over $50,000 as fraud. +7. Performance can be slow: 14 minutes + + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = d2gP2_ + Project schema = [DL-MAA20XX-YY] + Earliest year of data = 2009 + Number of days considered border of month = 4 + +History (reverse order): +2021-01-18 SA v1 +2021-01-12 SA work begun +2020-12-01 VW initial exploration of MSD records +**************************************************************************************************/ + +/********************************************************************** +add snz_uid to msd_debt table in the sandpit +**********************************************************************/ +IF OBJECT_ID('[IDI_Sandpit].[DL-MAA20XX-YY].[tmp_msd_debt_principle]','U') IS NOT NULL DROP TABLE [IDI_Sandpit].[DL-MAA2020-01].[tmp_msd_debt_principle]; +GO + +SELECT b.snz_uid + ,a.[snz_msd_uid] + ,a.[debt_as_at_date] + ,SUM(ISNULL(a.[amount_incurred], 0)) AS [amount_incurred] --some identities have more than one record per month, hence collapse + ,SUM(ISNULL(a.[amount_repaid], 0)) AS [amount_repaid] + ,SUM(ISNULL(a.[amount_written_off], 0)) AS [amount_written_off] +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_msd_debt_principle] +FROM [IDI_Adhoc].[clean_read_DEBT].[msd_debt_30sep] AS a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b -- adding snz_uid +ON a.snz_msd_uid = b.snz_msd_uid +GROUP BY b.snz_uid, a.snz_msd_uid, a.debt_as_at_date + +/* Index by snz_uid (improve efficiency) */ +CREATE CLUSTERED INDEX snz_uid_index ON [IDI_Sandpit].[DL-MAA2020-01].[tmp_msd_debt_principle] (snz_uid); + +/********************************************************************** +convert T3 to monthly +**********************************************************************/ +IF OBJECT_ID('[IDI_Sandpit].[DL-MAA20XX-YY].[tmp_recoverable_t3]','U') IS NOT NULL DROP TABLE [IDI_Sandpit].[DL-MAA2020-01].[tmp_recoverable_t3]; +GO + +WITH prep_t3 AS ( + /* filter and add some additional columns to MSD T3 expenditure */ + SELECT [snz_uid] + ,[snz_msd_uid] + ,[msd_tte_decision_date] + ,DATEFROMPARTS(YEAR([msd_tte_decision_date]), MONTH([msd_tte_decision_date]), 1) AS decision_month + ,DATEDIFF(DAY, [msd_tte_decision_date], EOMONTH([msd_tte_decision_date])) AS days_til_end_of_month + ,[msd_tte_pmt_amt] + FROM [IDI_Clean_YYYYMM].[msd_clean].[msd_third_tier_expenditure] + WHERE [msd_tte_recoverable_ind] = 'Y' + AND YEAR([msd_tte_decision_date]) >= 2009 -- debt data starts in 2009 +) +SELECT [snz_uid] + ,[snz_msd_uid] + ,[decision_month] + ,SUM(IIF(days_til_end_of_month <= 4, [msd_tte_pmt_amt], 0)) AS border_month_amount + ,SUM(IIF(days_til_end_of_month <= 4, 0, [msd_tte_pmt_amt])) AS within_month_amount + ,SUM([msd_tte_pmt_amt]) AS total_t3_payment_amount + ,COUNT(*) AS num_payments -- used for auditing +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_recoverable_t3] +FROM prep_t3 +GROUP BY [snz_uid], [snz_msd_uid], [decision_month] + +/* Index by snz_uid (improve efficiency) */ +CREATE CLUSTERED INDEX snz_uid_index ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_recoverable_t3] (snz_uid); + +/********************************************************************** +attach T3 to MSD debt +**********************************************************************/ +IF OBJECT_ID('[IDI_Sandpit].[DL-MAA20XX-YY].[tmp_augmented_msd_debt]','U') IS NOT NULL DROP TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_augmented_msd_debt]; +GO + +WITH +step_1_join AS ( + +SELECT pr.snz_uid + ,pr.snz_msd_uid + ,debt_as_at_date + ,amount_incurred + ,ISNULL(within_month_amount, 0) AS within_month_amount + ,ISNULL(border_month_amount, 0) AS border_month_amount +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_msd_debt_principle] AS pr +LEFT JOIN [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_recoverable_t3] AS t3 +ON pr.snz_uid = t3.snz_uid +AND pr.[debt_as_at_date] = t3.decision_month +WHERE [amount_incurred] IS NOT NULL -- only interested in where debt has occured +AND [amount_incurred] > 0 + +), +step_2_assign_within AS ( + +SELECT snz_uid + ,snz_msd_uid + ,amount_incurred + ,debt_as_at_date + ,within_month_amount + ,border_month_amount + ,IIF(amount_incurred <= within_month_amount + border_month_amount, amount_incurred, within_month_amount + border_month_amount) AS recoverable_source + ,IIF(amount_incurred <= within_month_amount + border_month_amount, 0, amount_incurred - within_month_amount - border_month_amount) AS other_source + ,IIF(border_month_amount >= IIF(amount_incurred - within_month_amount > 0, amount_incurred - within_month_amount, 0), + border_month_amount - IIF(amount_incurred - within_month_amount > 0, amount_incurred - within_month_amount, 0), + 0) AS border_remaining +FROM step_1_join + +), +step_3_pass_across AS ( + +SELECT * + ,LAG(border_remaining, 1, 0) OVER (PARTITION BY snz_uid ORDER BY debt_as_at_date) AS passed_over_border + ,LAG(debt_as_at_date, 1, '1000-01-01') OVER (PARTITION BY snz_uid ORDER BY debt_as_at_date) AS passed_from_date +FROM step_2_assign_within + +), +step_4_assign_across AS ( + +SELECT * + ,IIF(DATEDIFF(MONTH, passed_from_date, debt_as_at_date) = 1, -- was valid amount passed from previous + recoverable_source + IIF(passed_over_border > other_source, other_source, passed_over_border), -- yes: increase recoverable source by passed amount + recoverable_source -- no: then existing amount remains correct + ) AS recoverable_source_2 + ,IIF(DATEDIFF(MONTH, passed_from_date, debt_as_at_date) = 1, -- was valid amount passed from previous + IIF(passed_over_border > other_source, 0, other_source - passed_over_border), -- yes: reduce other source by passed amount + other_source -- no: then existing amount remains correct + ) AS other_source_2 +FROM step_3_pass_across + +) +SELECT snz_uid + ,snz_msd_uid + ,amount_incurred + ,debt_as_at_date + ,recoverable_source_2 AS amount_incurred_t3_sources + ,other_source_2 AS amount_incurred_non_t3_sources +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_augmented_msd_debt] +FROM step_4_assign_across + +/* Index by snz_uid (improve efficiency) */ +CREATE CLUSTERED INDEX snz_uid_index ON [IDI_Sandpit].[DL-MAA2020-01].[tmp_augmented_msd_debt] (snz_uid); + +/********************************************************************** +attach categorised debt to full MSD debt data +**********************************************************************/ +IF OBJECT_ID('[IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_classified_msd_debt]','U') IS NOT NULL DROP TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_classified_msd_debt]; +GO + +SELECT a.snz_uid + ,a.[snz_msd_uid] + ,a.[debt_as_at_date] + ,a.[amount_incurred] + ,[amount_repaid] + ,[amount_written_off] + ,amount_incurred_t3_sources AS amount_incurred_assistance + ,amount_incurred_non_t3_sources AS amount_incurred_overpayment +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_classified_msd_debt] +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_msd_debt_principle] AS a +LEFT JOIN [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_augmented_msd_debt] AS b +ON a.snz_uid = b.snz_uid +AND a.debt_as_at_date = b.debt_as_at_date +/* the following conditions must be true by construction, checked & commented out for speed */ +--AND a.snz_msd_uid = b.snz_msd_uid +--AND a.amount_incurred = b.amount_incurred + +/* Index by snz_uid (improve efficiency) */ +CREATE CLUSTERED INDEX snz_uid_index ON [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_classified_msd_debt] (snz_uid); +/* Compact table to save space */ +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_classified_msd_debt] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE) + +/********************************************************************** +remove temporary tables +**********************************************************************/ +IF OBJECT_ID('[IDI_Sandpit].[DL-MAA20XX-YY].[tmp_msd_debt_principle]','U') IS NOT NULL DROP TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_msd_debt_principle]; +GO +IF OBJECT_ID('[IDI_Sandpit].[DL-MAA20XX-YY].[tmp_recoverable_t3]','U') IS NOT NULL DROP TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_recoverable_t3]; +GO +IF OBJECT_ID('[IDI_Sandpit].[DL-MAA20XX-YY].[tmp_augmented_msd_debt]','U') IS NOT NULL DROP TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_augmented_msd_debt]; +GO \ No newline at end of file diff --git a/income/debt_to_ird.sql b/income/debt_to_ird.sql new file mode 100644 index 0000000..c83b990 --- /dev/null +++ b/income/debt_to_ird.sql @@ -0,0 +1,505 @@ +/************************************************************************************************** +Title: Debt to IRD +Author: Freya Li +Reviewer: Simon Anastasiadis + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Debt, debt balances, and repayment for debtors owing money to IRD. + +Intended purpose: +1. Identifying debtors. +2. Calculating number of debts and total value of debts. +3. Calculating change in debts - due to borrowing or repayment. + +Inputs & Dependencies: +- [IDI_Clean].[security].[concordance] +- [IDI_Adhoc].[clean_read_DEBT].[ir_debt_transactions] +- [IDI_Adhoc].[clean_read_DEBT].[ir_debt_collection_cases] +- [IDI_Adhoc].[clean_read_DEBT].[ir_debt_transactions_student_202105] + +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_ird_transactions_ready] +- [IDI_UserCode].[DL-MAA20XX-YY].[ird_labels_balance] +- [IDI_UserCode].[DL-MAA20XX-YY].[ird_labels_transactions] +- [IDI_UserCode].[DL-MAA20XX-YY].[ird_labels_repayments] +- [IDI_UserCode].[DL-MAA20XX-YY].[ird_labels_persist] + +Notes: +1. As tables are adhoc they need to be linked to IDI_Clean via agency specific uid's. + Both IRD tables exhibit excellent correspondence to one-another and both link + very well to the spine. +2. delta is the sum of assess, penalty, interest, account_maintenance, payment, remiss, the first record of each tax type should including + the opening balance +3. Some of the different debt types have different starting points. + This is because of changes in IR's computer system between 2018 and 2020. + It is not due to data missing from the table. + Jordan's (IRD) response about this: There have been several releases as Inland Revenue are migrating completely over to our new tax system, + START. One of these key release dates was in April-2020 and specifically that was when the Income Tax type product was converted over. + The collection case begin dates are correct if they were opened from before Jan-2019 but any specific historical transaction data is + essentially rolled up into a single converted transaction line item dated 20th April 2019. This is why it appears to have data missing + and only start from then. + tax_type_group start date end date + Donation Tax Credits 2019-04-30 2020-11-30 + Employment Activities 2019-01-31 2020-11-30 + Families 2019-04-30 2020-11-30 + GST 2019-01-31 2020-11-30 + Income Tax 2019-04-30 2020-11-30 + Liable Parent 2019-01-31 2020-11-30 + Other 2019-01-31 2020-11-30 + Receiving Carer 2019-01-31 2020-11-30 + Student Loans 2020-04-30 2020-11-30 + +4. Date range for dataset is 2019-01-31 to 2020-11-30. About 11% of debt cases arise before 2019. + If tax case started earlier than 2019 and unclosed until 2019-01-31, then there is a difference + between [running_balance_tax_type] and [delta] at beginning of 2019. +5. Numbers represent changes in debt owing. So principle is positive, repayments are + negative, interest is positive when charges and negative if reversed/waived. + Around 2% assesses are negative. delta is the sum of the components +6. Each identity can have multiple cases, of different tax types, and case numbers are + reused between tax types and individuals. +7. Outliers: + Very large positive and negative balances and delta values are observed in the data. + Defining 'large values' as magnitudes in excess of $1 million, then a tiny fraction + of people <0.1% have very large balances at any given time +8. 2% of the records has negative balance, only 0.4% of debt cases have negative debt balances + of at least $100. +9. 46% of the records from [IDI_Adhoc].[clean_read_DEBT].[ir_debt_collection_cases] has NULL date_end + Around 86% the records with NULL date_end has date_begin after 2018 (year 2018 excluded) +10. We consider a debt case is closed if the balance for a snz_uid within a snz_case_number and tax_type_group turns to 0. +11. Opening date for a debt case in [ir_debt_collection_cases] is not consistent with the table [ir_debt_transactions], + Recalculation of the open and close date for each debt case based on table [ir_debt_transactions] required. However, + the calculation is only an approximation, for two reasons: + (1) Date range for dataset is 2019-01-31 to 2020-11-30. The open date for any case arise before 2019 will be extracted + from table [ir_debt_collection_cases]. However, [ir_debt_collection_cases] consider a case number as a debt case, + we consider each tax_type under a case number for a snz_uid as a debt case. + (2) All the records in [ir_debt_transactions] are recored as the last date of the month rather than the exact date of + the transaction. +12. We observe student loan debt provided by IR in the table [ir_debt_transactions] to contain errors. + - Double the amount of debt as reported outside the IDI + - Double the amount of debt as observable in sla_clean.ird_overdue_debt + - About half of all people with overdue student loan debt have multiple debt cases + (This is much higher than for all the other debt types). + However, the number of people with debt is approximately consistent. + + Cause - due to a processing error IR has included both overdue AND non-overdue balance for some debtors. + Initial solution - Join with SLA data and remove the debt cases most obviously wrong. Reduced error by two thirds. + Fix - IR provided fresh overdue student loan information in table [ir_debt_transactions_student_202105]. + We replace student loan debt in the original table with the debt in this table. + Original code moved to data exploration/unused definitions folder. + Outstanding concerns - fix address majority of problem but still some differences from numbers reported outside the IDI. +13. There are around 3000 cases where the same identity has two (or more) records for the same month, tax type, and case number. + This primariy affects Liable Parent debt. Our approach keeps all records, which is equivalent to assuming that the case + numbers should be different. + +Issues: +1. It seems that [running_balance_case] from table [ir_debt_transactions] is not correctly calculated, it doesn't sum up the + running balance across the different tax type group with in a debt case. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = d2gP2_ + Project schema = [DL-MAA20XX-YY] + Earliest debt date = '2019-01-01' + Latest debt date = '2020-12-31' + +History (reverse order): +2021-07-13 SA insert records when there are no transactions +2021-06-18 FL comment out active cases part as it is out of scope; add repayment indicator, debt persistence indicator +2021-06-11 SA update notes on student loan debt and faster loading/setup +2021-05-26 FL IR reload the student loan debt, replace the student loan records with the new table +2021-04-13 FL removing wrong student loan debt +2021-02-02 FL v3 +2021-01-26 SA QA +2021-01-13 FL v2 Create monthly time series +2021-01-06 FL (update to the latest refresh, make necessary changes to the code) +2020-11-25 FL QA +2020-07-02 SA update following input from SME +2020-06-23 SA v1 +**************************************************************************************************/ + +/************************************************************************************************** +set location for views once at start +**************************************************************************************************/ +USE [IDI_UserCode] +GO + +/************************************************************************************************** +Prior to use, copy to sandpit and index +(runtime 2 minutes) +**************************************************************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_ir_debt_transactions]; +GO + +SELECT * +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_ir_debt_transactions] +FROM ( + + /* All debt in base table excluding faulty student loan records */ + SELECT * + FROM [IDI_Adhoc].[clean_read_DEBT].[ir_debt_transactions] + WHERE tax_type_group <> 'Student Loans' + + UNION ALL + + /* Corrected student loan debt records */ + SELECT * + FROM [IDI_Adhoc].[clean_read_DEBT].[ir_debt_transactions_student_202105] +) k + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_ir_debt_transactions] ([snz_ird_uid], [snz_case_number], [tax_type_group]); +GO + +/************************************************************************************************** +Standardise columns +**************************************************************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[ir_debt_transactions_corrected]; +GO + +WITH standardised AS ( + SELECT * + ,REPLACE(tax_type_group, ' ', '_') AS tax_type_label + ,CAST(REPLACE(REPLACE([running_balance_tax_type],'$',''), ',' , '') AS NUMERIC(10,2)) AS [running_balance_tax_type_num] -- convert $text to numeric + ,CAST(REPLACE(REPLACE([running_balance_case],'$',''), ',' , '') AS NUMERIC(10,2)) AS [running_balance_case_num] -- convert $text to numeric + ,MIN([month_end]) OVER(PARTITION BY [snz_ird_uid], [snz_case_number], [tax_type_group] ORDER BY [month_end]) AS [min_date] + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_ir_debt_transactions] +) +SELECT * + ,IIF([month_end] = [min_date], [running_balance_tax_type_num], [delta]) as [delta_updated] + ,IIF([account_maintenance] > 0, [account_maintenance], 0) AS [maintenance_pos] + ,IIF([account_maintenance] < 0, [account_maintenance], 0) AS [maintenance_neg] + ,IIF([month_end] = [min_date] AND [month_end] <= '2020-09-30' AND [running_balance_tax_type_num] <> [delta], [running_balance_tax_type_num] - [delta], 0) as [pre_2019] + --around 300 identities' first records in the debt table is after 2020-09-30, however, they have balance before 2019,without consider these records will keep the debt balance consistent with the sum of component + --If tax case started earlier than 2019 and unclosed until 2019-01-31, then set [running_balance_tax_type_num] + [delta] as delta_updated at the first date of the record +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[ir_debt_transactions_corrected] +FROM standardised + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[ir_debt_transactions_corrected] ([snz_ird_uid],[tax_type_group]); +GO +/************************************************************************************************** +fill in records where balance is non-zero but transactions are zero +**************************************************************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_ird_non_transactions]; +GO + +WITH +/* list of 100 numbers 1:100 - spt_values is an admin table chosen as it is at least 100 rows long */ +n AS ( + SELECT TOP 100 ROW_NUMBER() OVER (ORDER BY type) AS x + FROM master.dbo.spt_values +), +/* list of dates, constructed by adding list of numbers to initial date */ +my_dates AS ( + SELECT TOP (DATEDIFF(MONTH, '2019-01-01', '2020-12-31') + 1) /* number of dates required */ + EOMONTH(DATEADD(MONTH, x-1, '2019-01-01')) AS my_dates + FROM n + ORDER BY x +), +/* get the next date for each record */ +debt_source AS ( + SELECT * + ,LEAD(month_end, 1, '9999-01-01') OVER (PARTITION BY snz_ird_uid, tax_type_label, snz_case_number ORDER BY month_end) AS lead_month_end + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ir_debt_transactions_corrected] +), +/* join where dates from list are between current and next date --> hence dates from list are missing */ +joined AS ( + SELECT * + FROM debt_source + INNER JOIN my_dates + ON month_end < my_dates + AND my_dates < lead_month_end +) +/* combine original and additional records into same table */ +SELECT * +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_ird_non_transactions] +FROM ( + /* original records */ + SELECT snz_ird_uid + ,snz_case_number + ,month_end + ,tax_type_label + ,pre_2019 + ,assess + ,penalty + ,interest + ,maintenance_pos + ,maintenance_neg + ,payment + ,remiss + ,delta_updated + ,running_balance_tax_type_num + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ir_debt_transactions_corrected] + + UNION ALL + + /* additional records */ + SELECT snz_ird_uid + ,snz_case_number + ,my_dates AS month_end + ,tax_type_label + ,NULL AS pre_2019 + ,NULL AS assess + ,NULL AS penalty + ,NULL AS interest + ,NULL AS maintenance_pos + ,NULL AS maintenance_neg + ,NULL AS payment + ,NULL AS remiss + ,0 AS delta_updated + ,running_balance_tax_type_num + FROM joined + WHERE NOT running_balance_tax_type_num BETWEEN -10 AND 0 -- exclude small negative balances +) k + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_ird_non_transactions] ([snz_ird_uid]); +GO + +/************************************************************************************************** +join on snz_uid +**************************************************************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_ird_transactions_ready]; +GO + +SELECT b.snz_uid + ,a.* +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_ird_transactions_ready] +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_ird_non_transactions] AS a +LEFT JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b +ON a.snz_ird_uid = b.snz_ird_uid + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_ird_transactions_ready] (snz_uid); +GO + +/************************************************************************************************** +Views for balance labels +**************************************************************************************************/ + +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[ird_labels_balance]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[ird_labels_balance] AS +SELECT snz_uid + ,month_end + ,tax_type_label + ,running_balance_tax_type_num + /* balance labels */ + ,CONCAT('ird_Y', YEAR(month_end), 'M', MONTH(month_end), '_', tax_type_label) AS balance_label + ,CONCAT('ird_Y', YEAR(month_end), 'M', MONTH(month_end)) AS balance_label_all_types +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_ird_transactions_ready] +WHERE month_end BETWEEN '2019-01-01' AND '2020-09-30' +GO + +/************************************************************************************************** +Views for transaction labels +**************************************************************************************************/ + +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[ird_labels_transactions]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[ird_labels_transactions] AS +SELECT snz_uid + ,month_end + ,tax_type_label + ,pre_2019 + ,assess + ,penalty + ,interest + ,maintenance_pos + ,maintenance_neg + ,payment + ,remiss + /* pre_2019 */ + ,CONCAT('ird_', 'pre_2019', '_', tax_type_label) AS transaction_labels_pre_2019_by_type + ,CONCAT('ird_', 'pre_2019') AS transaction_labels_pre_2019_all_types + /* assess */ + ,CONCAT('ird_', 'assess', '_', YEAR(month_end), '_', tax_type_label) AS transaction_labels_assess_by_type + ,CONCAT('ird_', 'assess', '_', YEAR(month_end)) AS transaction_labels_assess_all_types + /* penalty */ + ,CONCAT('ird_', 'penalty', '_', YEAR(month_end), '_', tax_type_label) AS transaction_labels_penalty_by_type + ,CONCAT('ird_', 'penalty', '_', YEAR(month_end)) AS transaction_labels_penalty_all_types + /* interest */ + ,CONCAT('ird_', 'interest', '_', YEAR(month_end), '_', tax_type_label) AS transaction_labels_interest_by_type + ,CONCAT('ird_', 'interest', '_', YEAR(month_end)) AS transaction_labels_interest_all_types + /* maintenance_pos */ + ,CONCAT('ird_', 'maintenance_pos', '_', YEAR(month_end), '_', tax_type_label) AS transaction_labels_maintenance_pos_by_type + ,CONCAT('ird_', 'maintenance_pos', '_', YEAR(month_end)) AS transaction_labels_maintenance_pos_all_types + /* maintenance_neg */ + ,CONCAT('ird_', 'maintenance_neg', '_', YEAR(month_end), '_', tax_type_label) AS transaction_labels_maintenance_neg_by_type + ,CONCAT('ird_', 'maintenance_neg', '_', YEAR(month_end)) AS transaction_labels_maintenance_neg_all_types + /* payment */ + ,CONCAT('ird_', 'payment', '_', YEAR(month_end), '_', tax_type_label) AS transaction_labels_payment_by_type + ,CONCAT('ird_', 'payment', '_', YEAR(month_end)) AS transaction_labels_payment_all_types + /* remiss */ + ,CONCAT('ird_', 'remiss', '_', YEAR(month_end), '_', tax_type_label) AS transaction_labels_remiss_by_type + ,CONCAT('ird_', 'remiss', '_', YEAR(month_end)) AS transaction_labels_remiss_all_types +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_ird_transactions_ready] +WHERE month_end BETWEEN '2019-01-01' AND '2020-09-30' +GO + +/************************************************************************************************** +Views for repayments +**************************************************************************************************/ + +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[ird_labels_repayments]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[ird_labels_repayments] AS + +SELECT snz_uid + ,snz_case_number + ,month_end + ,tax_type_label + ,payment + /* repayment labels by type */ + ,IIF(month_end BETWEEN '2020-07-01' AND '2020-09-30', CONCAT('ird_payment_03mth_', tax_type_label), NULL) AS payment_label_by_type_03 + ,IIF(month_end BETWEEN '2020-04-01' AND '2020-09-30', CONCAT('ird_payment_06mth_', tax_type_label), NULL) AS payment_label_by_type_06 + ,IIF(month_end BETWEEN '2020-01-01' AND '2020-09-30', CONCAT('ird_payment_09mth_', tax_type_label), NULL) AS payment_label_by_type_09 + ,IIF(month_end BETWEEN '2019-10-01' AND '2020-09-30', CONCAT('ird_payment_12mth_', tax_type_label), NULL) AS payment_label_by_type_12 + /* repayment labels all types */ + ,IIF(month_end BETWEEN '2020-07-01' AND '2020-09-30', 'ird_payment_03mth', NULL) AS payment_label_all_types_03 + ,IIF(month_end BETWEEN '2020-04-01' AND '2020-09-30', 'ird_payment_06mth', NULL) AS payment_label_all_types_06 + ,IIF(month_end BETWEEN '2020-01-01' AND '2020-09-30', 'ird_payment_09mth', NULL) AS payment_label_all_types_09 + ,IIF(month_end BETWEEN '2019-10-01' AND '2020-09-30', 'ird_payment_12mth', NULL) AS payment_label_all_types_12 +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_ird_transactions_ready] +WHERE payment < -1 +AND month_end BETWEEN '2019-10-01' AND '2020-09-30' +GO + +/************************************************************************************************** +Views for persistence + +To determine whether a person has persistent debt we count the number of distinct dates where +the label is non-null during assembly. After assembly, we create the indicator by checking +whether ird_persistence_XXmth = XX. +- If ird_persistence_XXmth = XX then in the last XX months there were XX months where the person + had debt hence they had debt in every month. +- If ird_persistence_XXmth < XX then in the last XX months there were some months where the person + did not have debt. +**************************************************************************************************/ + +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[ird_labels_persist]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[ird_labels_persist] AS +SELECT snz_uid + ,snz_case_number + ,month_end + ,tax_type_label + ,running_balance_tax_type_num + /* persistence labels by type */ + ,IIF(month_end BETWEEN '2020-07-01' AND '2020-09-30', CONCAT('ird_persistence_03mth_', tax_type_label), NULL) AS persistence_label_by_type_03 + ,IIF(month_end BETWEEN '2020-04-01' AND '2020-09-30', CONCAT('ird_persistence_06mth_', tax_type_label), NULL) AS persistence_label_by_type_06 + ,IIF(month_end BETWEEN '2020-01-01' AND '2020-09-30', CONCAT('ird_persistence_09mth_', tax_type_label), NULL) AS persistence_label_by_type_09 + ,IIF(month_end BETWEEN '2019-10-01' AND '2020-09-30', CONCAT('ird_persistence_12mth_', tax_type_label), NULL) AS persistence_label_by_type_12 + ,IIF(month_end BETWEEN '2019-07-01' AND '2020-09-30', CONCAT('ird_persistence_15mth_', tax_type_label), NULL) AS persistence_label_by_type_15 + ,IIF(month_end BETWEEN '2019-04-01' AND '2020-09-30', CONCAT('ird_persistence_18mth_', tax_type_label), NULL) AS persistence_label_by_type_18 + ,IIF(month_end BETWEEN '2019-01-01' AND '2020-09-30', CONCAT('ird_persistence_21mth_', tax_type_label), NULL) AS persistence_label_by_type_21 + /* persistence labels all types */ + ,IIF(month_end BETWEEN '2020-07-01' AND '2020-09-30', 'ird_persistence_03mth', NULL) AS persistence_label_all_types_03 + ,IIF(month_end BETWEEN '2020-04-01' AND '2020-09-30', 'ird_persistence_06mth', NULL) AS persistence_label_all_types_06 + ,IIF(month_end BETWEEN '2020-01-01' AND '2020-09-30', 'ird_persistence_09mth', NULL) AS persistence_label_all_types_09 + ,IIF(month_end BETWEEN '2019-10-01' AND '2020-09-30', 'ird_persistence_12mth', NULL) AS persistence_label_all_types_12 + ,IIF(month_end BETWEEN '2019-07-01' AND '2020-09-30', 'ird_persistence_15mth', NULL) AS persistence_label_all_types_15 + ,IIF(month_end BETWEEN '2019-04-01' AND '2020-09-30', 'ird_persistence_18mth', NULL) AS persistence_label_all_types_18 + ,IIF(month_end BETWEEN '2019-01-01' AND '2020-09-30', 'ird_persistence_21mth', NULL) AS persistence_label_all_types_21 +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_ird_transactions_ready] +WHERE running_balance_tax_type_num IS NOT NULL +AND running_balance_tax_type_num > 0 +AND month_end BETWEEN '2019-01-01' AND '2020-09-30' +GO + + +/************************************************************************************************** +Generate group labels for rows you want the sum of during assembly + +This approach not taken as storing a lot of text is very memory heavy +And there is lots of duplication in this text, so better to create it dynamically +at run time via views. +**************************************************************************************************/ + +/* + +SELECT TOP 100 snz_ird_uid + ,snz_case_number + ,month_end + ,tax_type_label + ,assess + ,penalty + ,interest + ,maintenance_pos + ,maintenance_neg + ,payment + ,remiss + --,delta + ,delta_updated + ,running_balance_tax_type_num + --,running_balance_case_num + ,min_date + ,pre_2019 + /* balance labels */ + ,IIF(month_end BETWEEN '2019-01-01' AND '2020-09-30', CONCAT('ird_Y', YEAR(month_end), 'M', MONTH(month_end), '_', tax_type_label), NULL) AS balance_label + ,IIF(month_end BETWEEN '2019-01-01' AND '2020-09-30', CONCAT('ird_Y', YEAR(month_end), 'M', MONTH(month_end)), NULL) AS balance_label_all_types + /* transaction labels */ + ,IIF(month_end BETWEEN '2019-01-01' AND '2020-09-30', CONCAT('ird_', '', '_', YEAR(month_end), '_', tax_type_label), NULL) AS transaction_labels_by_type + ,IIF(month_end BETWEEN '2019-01-01' AND '2020-09-30', CONCAT('ird_', '', '_', YEAR(month_end)), NULL) AS transaction_labels_all_types + /* persistence labels by type */ + ,IIF(running_balance_tax_type_num > 1 AND month_end BETWEEN '2020-07-01' AND '2020-09-30', CONCAT('ird_persistence_03mth_', tax_type_label), NULL) AS persistence_label_by_type_03 + ,IIF(running_balance_tax_type_num > 1 AND month_end BETWEEN '2020-04-01' AND '2020-09-30', CONCAT('ird_persistence_06mth_', tax_type_label), NULL) AS persistence_label_by_type_06 + ,IIF(running_balance_tax_type_num > 1 AND month_end BETWEEN '2020-01-01' AND '2020-09-30', CONCAT('ird_persistence_09mth_', tax_type_label), NULL) AS persistence_label_by_type_09 + ,IIF(running_balance_tax_type_num > 1 AND month_end BETWEEN '2019-10-01' AND '2020-09-30', CONCAT('ird_persistence_12mth_', tax_type_label), NULL) AS persistence_label_by_type_12 + ,IIF(running_balance_tax_type_num > 1 AND month_end BETWEEN '2019-07-01' AND '2020-09-30', CONCAT('ird_persistence_15mth_', tax_type_label), NULL) AS persistence_label_by_type_15 + ,IIF(running_balance_tax_type_num > 1 AND month_end BETWEEN '2019-04-01' AND '2020-09-30', CONCAT('ird_persistence_18mth_', tax_type_label), NULL) AS persistence_label_by_type_18 + ,IIF(running_balance_tax_type_num > 1 AND month_end BETWEEN '2019-01-01' AND '2020-09-30', CONCAT('ird_persistence_21mth_', tax_type_label), NULL) AS persistence_label_by_type_21 + /* persistence labels all types */ + ,IIF(running_balance_tax_type_num > 1 AND month_end BETWEEN '2020-07-01' AND '2020-09-30', 'ird_persistence_03mth', NULL) AS persistence_label_all_types_03 + ,IIF(running_balance_tax_type_num > 1 AND month_end BETWEEN '2020-04-01' AND '2020-09-30', 'ird_persistence_06mth', NULL) AS persistence_label_all_types_06 + ,IIF(running_balance_tax_type_num > 1 AND month_end BETWEEN '2020-01-01' AND '2020-09-30', 'ird_persistence_09mth', NULL) AS persistence_label_all_types_09 + ,IIF(running_balance_tax_type_num > 1 AND month_end BETWEEN '2019-10-01' AND '2020-09-30', 'ird_persistence_12mth', NULL) AS persistence_label_all_types_12 + ,IIF(running_balance_tax_type_num > 1 AND month_end BETWEEN '2019-07-01' AND '2020-09-30', 'ird_persistence_15mth', NULL) AS persistence_label_all_types_15 + ,IIF(running_balance_tax_type_num > 1 AND month_end BETWEEN '2019-04-01' AND '2020-09-30', 'ird_persistence_18mth', NULL) AS persistence_label_all_types_18 + ,IIF(running_balance_tax_type_num > 1 AND month_end BETWEEN '2019-01-01' AND '2020-09-30', 'ird_persistence_21mth', NULL) AS persistence_label_all_types_21 + /* repayment labels by type */ + ,IIF(payment > 1 AND month_end BETWEEN '2020-07-01' AND '2020-09-30', CONCAT('ird_payment_03mth_', tax_type_label), NULL) AS payment_label_by_type_03 + ,IIF(payment > 1 AND month_end BETWEEN '2020-04-01' AND '2020-09-30', CONCAT('ird_payment_06mth_', tax_type_label), NULL) AS payment_label_by_type_06 + ,IIF(payment > 1 AND month_end BETWEEN '2020-01-01' AND '2020-09-30', CONCAT('ird_payment_09mth_', tax_type_label), NULL) AS payment_label_by_type_09 + ,IIF(payment > 1 AND month_end BETWEEN '2019-10-01' AND '2020-09-30', CONCAT('ird_payment_12mth_', tax_type_label), NULL) AS payment_label_by_type_12 + /* repayment labels all types */ + ,IIF(payment > 1 AND month_end BETWEEN '2020-07-01' AND '2020-09-30', 'ird_payment_03mth', NULL) AS payment_label_by_type_03 + ,IIF(payment > 1 AND month_end BETWEEN '2020-04-01' AND '2020-09-30', 'ird_payment_06mth', NULL) AS payment_label_by_type_06 + ,IIF(payment > 1 AND month_end BETWEEN '2020-01-01' AND '2020-09-30', 'ird_payment_09mth', NULL) AS payment_label_by_type_09 + ,IIF(payment > 1 AND month_end BETWEEN '2019-10-01' AND '2020-09-30', 'ird_payment_12mth', NULL) AS payment_label_by_type_12 +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[ir_debt_transactions_corrected] + +*/ + + +/************************************************************************************************** +remove temporary tables +**************************************************************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_ir_debt_transactions]; +GO +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[ir_debt_transactions_corrected]; +GO +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_ird_non_transactions]; +GO diff --git a/income/debt_to_moj_fcco.sql b/income/debt_to_moj_fcco.sql new file mode 100644 index 0000000..2a8c392 --- /dev/null +++ b/income/debt_to_moj_fcco.sql @@ -0,0 +1,216 @@ +Title: Debt to MOJ FCCO +Author: Freya Li & Simon Anastasiadis + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Debt, debt balances, and repayment for debtors owing money to MOJ Family Court Cost Contribution Order (FCCO). + +Intended purpose: +1. Identifying debtors. +2. Calculating number of debts and total value of debts. +3. Calculating change in debts - due to borrowing or repayment. + +Description: +Debt, debt balances, and repayment for debtors owing money to MOJ Family Court Cost Contribution Order (FCCO). + +Intended purpose: +1. Identifying debtors. +2. Calculating number of debts and total value of debts. +3. Calculating change in debts - due to borrowing or repayment. + +Inputs & Dependencies: +- [IDI_Adhoc].[clean_read_DEBT].[moj_debt_fcco_monthly_balances] +- [IDI_Adhoc].[clean_read_DEBT].[moj_debt_identity_fml_and_adhoc] +- [IDI_Adhoc].[clean_read_DEBT].[moj_debt_data_link] +- [IDI_Clean].[security].[concordance] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_fcco_transactions_ready] + +Notes: +1. Date range for table [d2gP2_moj_debt_fcco_monthly_balances] is Jul2014 to Jan2021 + Family Court Contribution Orders were introduced in the 2014 family justice reforms. + Hence there is no FCCO debt prior to 2014. +2. The first record closing_balance for each identity is equal to the sum of all the conponent: + closing_balance[1] = new_debt_established[1] + repayments[1] + write_offs[1], the rest of the + closing_balance (except first record) is the running balnce (which means that: + closing_balance[i+1] = closing_balance[i] + new_debt_established[i+1] + repayments[i+1] + write_offs[i+1] + where i = 2, 3,...). +3. 12% of the fcco_file_nbr couldn't be link to the corresponding snz_uid in table [d2gP2_moj_fcco_debt_cases] + 14% of the fcco_file_nbr couldn't be link to the corresponding snz_uid in table [d2gP2_moj_fcco_debt_by_month] +4. A single person may have multiple fcco_fiel_nbrs for multiple debts or debt roles. + After link the monthly debt to spine, only keep one record for each snz_uid (which means combine those rows with + same snz_uid but different fcco_file_nbr) + For debt case data, we would keep it as it is, as different fcco_file_nbr refer to deffierent debt or debt roles. +5. FML for the MoJ debt data is specific to the 20201020 refresh. +6. Fast match loader: + Run_key is a number that is specific to the linking of a specific dataset. + Lhs_nbr is the let hand side identifier, also known as the snz_spine _uid. + Rhs_nbr is the right_hand side identifier, also known as node. This is the uid that was created in the primary + series table related to the primary series key. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = d2gP2_ + Project schema = [DL-MAA20XX-YY] + +History (reverse order): +2021-09-12 FL restructure +2021-06-18 FL comment out active debt deration period as it's out of scope; repayment indicator; persistence indicator +2021-05-07 FL including monthly incured debt and repayment +2021-03-08 FL work begun +***************************************************************************************************/ + +/************************************************************************************************** +Prior to use, copy to sandpit and index +**************************************************************************************************/ +IF OBJECT_ID('[IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_debt_fcco_monthly_balances]','U') IS NOT NULL +DROP TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_debt_fcco_monthly_balances]; +GO + +With debt_source AS( +SELECT * + ,EOMONTH(DATEFROMPARTS(calendar_year, RIGHT(month_nbr,2), 1)) AS month_date + ,COALESCE(new_debt_established, 0) + COALESCE(repayments, 0) + COALESCE(write_offs, 0) AS delta +FROM [IDI_Adhoc].[clean_read_DEBT].[moj_debt_fcco_monthly_balances] +) +SELECT * + ,SUM(delta) OVER (PARTITION BY fcco_file_nbr ORDER BY month_date) AS balance_correct +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_debt_fcco_monthly_balances] +FROM debt_source +WHERE month_date <= '2020-12-31' +GO +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_debt_fcco_monthly_balances] ([fcco_file_nbr]); +GO + +/************************************************************************************************** +fill in records where balance is non-zero but transactions are zero +**************************************************************************************************/ +IF OBJECT_ID('[IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_fcco_non_transactions]','U') IS NOT NULL +DROP TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_fcco_non_transactions]; +GO + +WITH +/* list of 100 numbers of 1:100 - spt_values is a n admin table chosen as it is at least 100 row long */ +n AS ( + SELECT TOP 100 ROW_NUMBER() OVER (ORDER BY type) AS x + FROM master.dbo.spt_values +), +/* list of dates, constructed by adding list of numbers to initial date */ +my_dates AS ( + SELECT TOP (DATEDIFF(MONTH, '2014-07-01', '2020-12-31') + 1) /* number of dates required */ + EOMONTH(DATEADD(MONTH, x-1, '2014-07-01')) AS my_dates + FROM n + ORDER BY x +), +/* get the date for each record */ +debt_source AS( + SELECT * + ,LEAD(month_date, 1, '9999-01-01') OVER (PARTITION BY fcco_file_nbr ORDER BY month_date) AS lead_date_cor + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_debt_fcco_monthly_balances] +), +joined AS ( + SELECT * + FROM debt_source + INNER JOIN my_dates + ON month_date < my_dates + AND my_dates < lead_date_cor +) +/* combine oringinal and additional records into same table */ +SELECT * +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_fcco_non_transactions] +FROM ( + /* original records */ + SELECT fcco_file_nbr + ,month_date + ,new_debt_established + ,repayments + ,write_offs + ,balance_correct + ,delta + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_debt_fcco_monthly_balances] + + UNION ALL + + /* additional records */ + SELECT fcco_file_nbr + ,my_dates AS month_date + ,NULL AS new_debt_established + ,NULL AS repaymets + ,NULL AS write_offs + ,balance_correct + ,0 AS delta + FROM joined + WHERE NOT balance_correct BETWEEN -1 AND 0 -- exclude small negative balances +) k + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_fcco_non_transactions] ([fcco_file_nbr]); +GO + +/************************************************************************* +Join MoJ FCCO data to spine +*************************************************************************/ +--Linking MoJ debt data with fast match loader +IF OBJECT_ID('[IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_fcco_debt_id]','U') IS NOT NULL +DROP TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_fcco_debt_id]; +GO + +SELECT fml.[fcco_file_nbr] -- uid on debt input table + ,fml.[snz_fml_8_uid] -- links to dl.rhs_nbr + ,dl.[rhs_nbr] -- links to fml.snz_fml_8_uid + ,dl.[lhs_nbr] -- links to sc.snz_spine_uid + ,sc.snz_spine_uid -- links to dl.lhr_nbr + ,sc.snz_uid -- desired output uid +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_fcco_debt_id] +FROM [IDI_Adhoc].[clean_read_DEBT].[moj_debt_fcco_identities] AS fml +LEFT JOIN [IDI_Adhoc].[clean_read_DEBT].[moj_debt_data_link] AS dl +ON fml.snz_fml_8_uid = dl.rhs_nbr +AND (dl.near_exact_ind = 1 + OR dl.weight_nbr > 17) -- exclude only low weight, non-exact links +LEFT JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS sc +ON dl.lhs_nbr = sc.snz_spine_uid +WHERE dl.run_key = 943 -- there are two run_keys as FML used twice for MoJ data +-- runkey = 941 for fines & charges, runkey = 943 for FCCO + +IF OBJECT_ID('[IDI_Sandpit].[DL-MAA20XX-YY].[[d2gP2_fcco_transactions_ready]]','U') IS NOT NULL +DROP TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_fcco_transactions_ready]; +GO + +SELECT b.snz_uid + ,a.* +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_fcco_transactions_ready] +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_fcco_non_transactions] a +LEFT JOIN [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_fcco_debt_id] b +ON a.fcco_file_nbr = b.fcco_file_nbr + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_fcco_transactions_ready] (snz_uid); +GO + +/***************************************************************************** +Remove temporary tables +*****************************************************************************/ +IF OBJECT_ID('[IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_debt_fcco_monthly_balances]','U') IS NOT NULL +DROP TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_debt_fcco_monthly_balances]; +GO + +IF OBJECT_ID('[IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_fcco_non_transactions]','U') IS NOT NULL +DROP TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_fcco_non_transactions]; +GO + +IF OBJECT_ID('[IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_fcco_debt_id]','U') IS NOT NULL +DROP TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_fcco_debt_id]; +GO \ No newline at end of file diff --git a/income/debt_to_moj_fines.sql b/income/debt_to_moj_fines.sql new file mode 100644 index 0000000..be5e25b --- /dev/null +++ b/income/debt_to_moj_fines.sql @@ -0,0 +1,269 @@ +/************************************************************************************************** +Title: Debt to MOJ +Author: Freya Li and Simon Anastasiadis + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Debt, debt balances, and repayment for debtors owing money to MOJ. + +Intended purpose: +1. Identifying debtors. +2. Calculating number of debtors and total value of debts. +3. Calculating change in debts - due to borrowing or repayment. + +Inputs & Dependencies: +- [IDI_Adhoc].[clean_read_DEBT].[moj_debt_full_summary_trimmed] +- [IDI_Adhoc].[clean_read_DEBT].[moj_debt_identity_fml_and_adhoc] +- [IDI_Adhoc].[clean_read_DEBT].[moj_debt_data_link] +- [IDI_Clean].[security].[concordance] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_transactions_ready] + +Notes: +1. Date range for table [moj_debt_full_summary_trimmed] is Dec2010 - Dec2020. + The balance in 2011 has no change for every month, we discard all data pre-2012. + +2. Foe orginal data, numbers represent the amount of the money, they are all positive. + Delta = penalty + impositions - payment - remittals + payment_reversal +remittal_reversal + delta is calculated correctly. + To keep consistency, in the final transaction views payment and remittals are switched as + negative values. + +3. If the first outstanding_balance for an identity may not equal to the delta, which means that + the outstanding_balance include the debt before 2012 + +4. When we produce the table [d2gP2_tmp_moj_debt_cases], only debt incurred after 2012 will be + considered, because if a debt incurred before 2012, we don't know when exactly the debt + incurred. Thus, only 85% of the identities from [moj_debt_full_summary_trimmed] inculded in + [d2gP2_tmp_moj_debt_cases]. + +5. A single person may have multiple PPNs for multiple debts or debt roles. + We combined the debt for the same snz_uid with different moj_ppn (for monthly debt table), + For debt case data, we keep it as it is, as different fcco_file_nbr refer to deffierent debt + or debt roles. + +6. The debt arise before 2012 most likely been paid off by end of 2018, it won't infulence too much on the + analysis for 2019 and 2020 data; the total imposition debt is about 7 times more than penalty debt, + thus we treat all the debt arise before 2012 as imposition on the date of the first record an individual has. + An attempt of splitting debt repayment to repaid to debt arise before 2012, and to debt arise after 2012 in + the file MoJ debt date investigation.sql. However, running the code takes extra time, and won't improve our + analysis significantly, that version has been abandoned. + +7. The case of total repaid is more than total incurred has been observed. + +8. This table contains debts from fines and infringements which have been imposed. These are not charges, + which are more courts and tribunals related. (Email Stephanie Simpson, MoJ data SME, 2021-04-08) + +9. FML for the MoJ debt data is specific to the 20201020 refresh. To use the data on a different refresh, + researchers have to first join it to the 20201020 refresh and then use snz_jus_uid to link to other refreshes. + +10. Fast match loader: + Run_key is a number that is specific to the linking of a specific dataset. + Lhs_nbr is the let hand side identifier, also known as the snz_spine _uid. + Rhs_nbr is the right_hand side identifier, also known as node. This is the uid that was created in the primary + series table related to the primary series key. + + +Issue: +1. Data that has been linked with the Fast Match Loader (FML) requires some additional steps + to connect to the spine. + 20% of the moj_ppn couldn't be link to the corresponding snz_uid in table [d2gP2_moj_debt_cases] + 30% of the moj_ppn couldn't be link to the corresponding snz_uid in table [d2gP2_moj_debt_by_month] + +2. There are cases that individuals have outstanding balance from pre-2012, but the first transaction record + is not Janaury 2012. In this case, instead of record their outstanding balance before 2012 at January 2012, + we record it at the first date those individuals have a transaction record. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = d2gP2_ + Project schema = [DL-MAA20XX-YY] + +History (reverse order): +2021-10-04 SA review +2021-08-13 FL insert records when there are no transactions +2021-06-17 FL comment out the active debt duration as it's out of scope; add repayment indicator; debt persisitence +2021-05-07 FL including monthly incurred debt and repayment +2021-03-03 FL work begun +**************************************************************************************************/ + +/************************************************************************************************** +Prior to use, copy to sandpit and index +(runtime 2 minutes) +**************************************************************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_debt_full_summary_trimmed]; +GO + +SELECT * + ,EOMONTH(DATEFROMPARTS(year_nbr, month_of_year_nbr, 1)) AS month_date +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_debt_full_summary_trimmed] +FROM [IDI_Adhoc].[clean_read_DEBT].[moj_debt_full_summary_trimmed] +WHERE year_nbr > 2011 +GO + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_debt_full_summary_trimmed] ([moj_ppn]); +GO + +/************************************************************************************************** +Data preparation +Not all the outstanding_balance provided by MoJ are correct. Recalculation is required +**************************************************************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_tmp_moj_debt_prep]; +GO + +SELECT * + ,SUM(balance_before_2012 + delta) OVER (PARTITION BY moj_ppn ORDER BY month_date) AS balance_correct +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_tmp_moj_debt_prep] +FROM( + SELECT * + ,IIF(first_record_ind = 1, COALESCE(outstanding_balance, 0) - delta, 0) AS balance_before_2012 + FROM( + SELECT * + ,IIF(month_date = MIN(month_date) OVER (PARTITION BY moj_ppn), 1, 0) AS first_record_ind + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_debt_full_summary_trimmed] + )a +)b + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_tmp_moj_debt_prep] ([moj_ppn]); +GO + +/************************************************************************************************** +fill in records where balance is non-zero but transactions are zero +**************************************************************************************************/ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_non_transactions]; +GO + +WITH +/* list of 120 numbers of 1:120 - spt_values is a n admin table chosen as it is at least 120 row long */ +n AS ( + SELECT TOP 120 ROW_NUMBER() OVER (ORDER BY type) AS x + FROM master.dbo.spt_values +), +/* list of dates, constructed by adding list of numbers to initial date */ +my_dates AS ( + SELECT TOP (DATEDIFF(MONTH, '2012-01-01', '2020-12-31') + 1) /* number of dates required */ + EOMONTH(DATEADD(MONTH, x-1, '2012-01-01')) AS my_dates + FROM n + ORDER BY x +), +/* get the next date for each record */ +debt_source AS( +SELECT * + ,LEAD(month_date, 1, '9999-01-01') OVER (PARTITION BY moj_ppn ORDER BY month_date) AS lead_date_cor +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_tmp_moj_debt_prep] +), +joined AS ( + SELECT * + FROM debt_source + INNER JOIN my_dates + ON month_date < my_dates + AND my_dates < lead_date_cor +) +/* combine oringinal and additional records into same table */ +SELECT * +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_non_transactions] +FROM ( + /* original records */ + SELECT moj_ppn + ,month_date + ,impositions + ,penalty + ,payment + ,remittals + ,payment_reversal + ,remittal_reversal + ,balance_correct + ,delta + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_tmp_moj_debt_prep] + + UNION ALL + + /* additional records */ + SELECT moj_ppn + ,my_dates AS month_date + ,NULL AS impositions + ,NULL AS penalty + ,NULL AS payment + ,NULL AS remittals + ,NULL AS payment_reversal + ,NULL AS remittal_reversal + ,balance_correct + ,0 AS delta + FROM joined + WHERE NOT balance_correct BETWEEN - 1 AND 0 --exclude small netative balances +) k + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_non_transactions] ([moj_ppn]); +GO + +/************************************************************************* +Join MoJ data to spine +*************************************************************************/ +--Linking MoJ debt data with fast match loader +--FML for the MoJ debt data is specific to the 20201020 refresh. +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_debt_id]; +GO + +SELECT fml.[snz_jus_uid] + ,fml.[moj_ppn] -- uid on debt input table + ,fml.[snz_fml_7_uid] -- links to dl.rhs_nbr + ,dl.[rhs_nbr] -- links to fml.snz_fml_8_uid + ,dl.[lhs_nbr] -- links to sc.snz_spine_uid + ,sc.snz_spine_uid -- links to dl.lhr_nbr + ,sc.snz_uid -- desired output uid +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_debt_id] +FROM [IDI_Adhoc].[clean_read_DEBT].[moj_debt_identity_fml_and_adhoc] AS fml +LEFT JOIN [IDI_Adhoc].[clean_read_DEBT].[moj_debt_data_link] AS dl +ON fml.snz_fml_7_uid = dl.rhs_nbr +AND (dl.near_exact_ind = 1 + OR dl.weight_nbr > 17) -- exclude only low weight, non-exact links +LEFT JOIN [IDI_Clean_20201020].[security].[concordance] AS sc +ON dl.lhs_nbr = sc.snz_spine_uid +WHERE dl.run_key = 941 -- there are two run_keys as FML used twice for MoJ data +-- runkey = 941 for fines & charges, runkey = 943 for FCCO + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_transactions_ready]; +GO + +SELECT b.snz_uid + ,a.* +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_transactions_ready] +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_non_transactions] a +LEFT JOIN [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_debt_id] b +ON a.moj_ppn = b.moj_ppn + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_transactions_ready] (snz_uid); +GO + +/***************************************************************************** +Remove temporary tables +*****************************************************************************/ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_debt_full_summary_trimmed]; +GO + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_tmp_moj_debt_prep]; +GO + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_non_transactions]; +GO + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_moj_debt_id]; +GO diff --git a/income/debt_to_msd.sql b/income/debt_to_msd.sql new file mode 100644 index 0000000..2b5fc7e --- /dev/null +++ b/income/debt_to_msd.sql @@ -0,0 +1,580 @@ +/************************************************************************************************** +Title: Debt to MSD +Authors: Freya Li and Simon Anastasiadis + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Debt, debt balances, and repayment for debtors owing money to MSD. +Considering debt by Overpayment and Recoverable Assistance. + +Intended purpose: +1. Identifying debtors. +2. Calculating number of debts and total value of debts. +3. Calculating change in debts - due to borrowing or repayment. + +Inputs & Dependencies: +- "classifying MSD debt using T3 data.sql" --> [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_classified_msd_debt] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_msd_non_transactions] +- [IDI_UserCode].[DL-MAA20XX-YY].[d2gP2_msd_labels_balance] +- [IDI_UserCode].[DL-MAA20XX-YY].[d2gP2_msd_labels_transactions_part1] +- [IDI_UserCode].[DL-MAA20XX-YY].[d2gP2_msd_labels_transactions_part2] +- [IDI_UserCode].[DL-MAA20XX-YY].[d2gP2_msd_labels_repayments_part1] +- [IDI_UserCode].[DL-MAA20XX-YY].[d2gP2_msd_labels_repayments_part2] +- [IDI_UserCode].[DL-MAA20XX-YY].[d2gP2_msd_labels_persist] + +Notes: +1. This file builds on previous work that splits debt incurred into overpayment and recoverable assistance. This file extends + this work by splitting repayments into overpayment and recoverable assistance. This is done in accordance with guidance + from MSD that: + (1) The earliest debt is paid off first + (2) benefit advances are paid off in parallel with other repayments ("separately at a rate of $1 to $5 per week"). + Based on this advice we assign repayments to overpayment and recoverable assistance besed on the oldest outstanding debt. + We do not fully address point (2), but where overpayment and recoverable assistance debt is incurred in the same month we + assign repayment first to recoverable assistance. + +2. Detailed notes on the assumptions underlying our splitting of repayments into overpayment and recoverable assistance + can be found in supporting scripts. Key points from these can be summarised as follows: + - A significant proportion of recoverable assistance payments (~80%) are done via benefit advances. + This means that ignoring point (2) above is a strong assumption. And so analysis that depends on the exact timing + that individuals gain or repay debt of a specific type will be less reliable. + - Less than 5% of records where a debt is incurred has both overpayment and recoverable assistance incurred in the + same month. As recoverable assistance is mainly requested by people on benefit, and overpayment is more likely + to happen when a person comes off benefit, assuming repayment of recoverable assistance before overpayment in this + case is not a strong assumption and will introduce minimal error into our results. + - While repayments are allocated oldest-to-newest, we do not know how debt write-offs are allocated. But as write-offs + are much less common than repayments (of every amount) treating write-offs as repayments seems reasonable. + +3. Openning balances are much harder to separate into overpayment and recoverable assistance. This will have flow-on + consequences for splitting repayments. More than 90% of openning balances (in the dataset) have been repaid by the + end of 2018 (the earliest period for our study). As these unpaid balances will mostly be classified as overpayment, + and overpayment debts are more likely to be large, this isunlikely to have significant impact on our results. + However, researchers wanting to look at pre-2018 debt patterns may wish to refine our techniques. + +4. When classifying repayments and write-offs to overpayment and recoverable assistance, we identify payments that are + recoverable assistance. All other amounts are attributed to overpayment. Reverse payments and write-offs (wrong-signs) + are handled via 'all other amounts' and hence are treated as overpayment. + Less than 0.1% of all records contain a wrong sign (31000 of 44637000). Almost all wrong signs are write offs, and + about 12% (25000 of 198000) of all write offs are wrong sign (positive). + +5. We couldn't split writeoff and repayment seperately for recoverable assistance and overpayment debt. The month of + repayment for subtypes of MSD debt are estimations, as the writeoff is included in the repament. + +6. Outlier values + Some principle amounts > $10k, which is unlikely to be recoverable assistance or overpayment. + Large transactions (>$10k) make up a tiny proportion of transactions (less than 0.1%) and + effect a small number of people (less than 3%) but are a significant amount of total amounts incurred (22%). + Current hypothesis is that such amounts are fraud (which is part of the dataset) or + receipt of more than one form of recoverable assistance (e.g. one for clothes, another for heating). + Conversation with MSD data experts suggests these amounts are most likely related to fraud. + +7. Values approx 0 that should be zero is of minimal concern. + As values are dollars, all numbers should be rounded to 2 decimal places. + Less than 0.5% of people ever have an absolute debt balance of 1-5 cents. + +8. Recoverable assistance - Looking at third tier expenditure that is recoverable: + - Amounts less than $2k are common + - Amounts of $3-5k are uncommon + - Amounts exceeding $5k occur but are rare + So if we are concerned about spikes, $5k is a reasonable threshold for identifying spikes + because people could plausably borrow and repay $2000 in recoverable assistance in a single month. + +9. Spikes - Yes there are occurrences where people's balances change in a spike pattern + (suddent, large change, immediately reversed) and the value of the change exceeds $5k. + There are less than 6,000 instances of this in the dataset, about 0.01% of records + The total amount of money involved is less than 0.2% of total debt. + Hence this pattern is not of concern. + +10. We would expect that a debtor's balance is always non-negative. So if we sum amounts incurred, repayments, and write offs, + then debt balances should be >=0. However, some identities have dates on which their net amount owing is negative. About + 6000 people have negative amounts owing at some point (negative been considered as the cut-off). Inspection of the data + suggests that this happens when repayments exceed debt, and rather than withdraw the excess, the amount is of debt is left + negative until the individual borrows again (either overpayment or recoverable assistance). It is common to observe + negative balances that later on become zero balances. We would not observe this if the negative balances implied that some + debt incurred was not recorded in the dataset. + +11. Notes in this file supported by investigations recorded in: + - testing msd debt data.sql + - splitting_msd_debt_repayments - drafting ideas.sql + - splitting_msd_debt_repayments - investigating complications.sql + +Issues: +1. Runtime is long even with indexing >15 minutes (much slower at peak times). + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = d2gP2_ + Project schema = [DL-MAA20XX-YY] + +History (reverse order): +2021-10-06 SA merge relevant part of debt_to_msd_p2_split.sql in to reduce duplication +2021-09-08 FL restructure MSD debt data begun +**************************************************************************************************/ + +USE IDI_UserCode +GO + +/* +Reference, input dataset + +Notes: +- joined to spine +- indexed by snz_uid +- [amount_incurred], [amount_repaid], and [amount_written_off] contain zeros instead of NULL +- [amount_incurred_assistance], and [amount_incurred_overpayment] contains NULL if no amount incurred +- there are no duplicate months per individual +*/ + +/************************************************************************************************** +Non-negative repayments/write-offs are treated as overpayment debt +**************************************************************************************************/ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_tmp_signs_handled]; +GO + +SELECT snz_uid + ,snz_msd_uid + ,debt_as_at_date + ,amount_incurred + + IIF(amount_repaid > 0, amount_repaid, 0) + + IIF(amount_written_off > 0, amount_written_off, 0) + AS amount_incurred + ,IIF(amount_repaid <= 0, amount_repaid, 0) AS amount_repaid + ,IIF(amount_written_off <= 0, amount_written_off, 0) AS amount_written_off + ,ISNULL(amount_incurred_assistance, 0) AS amount_incurred_assistance + ,ISNULL(amount_incurred_overpayment, 0) + + IIF(amount_repaid > 0, amount_repaid, 0) + + IIF(amount_written_off > 0, amount_written_off, 0) + AS amount_incurred_overpayment +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_tmp_signs_handled] +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_classified_msd_debt] + +/************************************************************************************************** +Split repayments and write-offs into overpayment and recoverable assistance +**************************************************************************************************/ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_tmp_classified_msd_repayment]; +GO + +/* +Concept for approach: + +There is a technique we use for counting days in overlapping date ranges: +Given date1 - date2 and date3 - date4 as "the zone", count the number of days in dateA - dateB that are "in zone". +We can do the same thing for repayments - count dollar overlap ranges: +Given amount1 - amount2 and amount3 - amount4 are "recoverable assistance", count the number of dollars in amountA - amountB that are "recoverable assistance". + +First setup dollar ranges of debt incurred. +For example: +First debt $100 overpayment, second debt $70 recoverable assistance, third debt $220 recoverable assistance +Gives dollar ranges: +( 0,100] dollars owed for overpayment +(100,170] dollars owed for recoverable assistance +(170,390] dollars owed for recoverable assistance + +Second setup dollar ranges for repayments. +For example: +Repayment of $200 in January, followed by $90 in February. +Gives dollar ranges: +( 0,200] dollars paid in Jan +(200,290] dollars paid in Feb + +Overlapping these dollar ranges: +( 0,200] dollars paid in Jan overlaps ( 0,100] dollars owed for overpayment so $100 overpayment repaid in Jan +( 0,200] dollars paid in Jan overlaps (100,170] dollars owed for recoverable assistance so $70 recoverable assistance repaid in Jan +( 0,200] dollars paid in Jan overlaps (170,390] dollars owed for recoverable assistance so $30 recoverable assistance repaid in Jan +(200,290] dollars paid in Feb overlaps (170,390] dollars owed for recoverable assistance so $90 recoverable assistance repaid in Feb +No overlap for the last $100 owed, so outstanding balance = $100 owed for recoverable assistance +*/ + +WITH +recoverable_assistance_range AS ( + + SELECT snz_uid + ,snz_msd_uid + ,debt_as_at_date + ,amount_incurred + ,amount_incurred_assistance + ,amount_incurred_overpayment + /* total debt incurred up-to-and-including current row - debt incurred this row = debt up-to-but-excluding current row */ + ,SUM(amount_incurred) OVER (PARTITION BY snz_uid ORDER BY debt_as_at_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) + - amount_incurred AS assistance_start + /* total debt incurred up-to-and-including current row - overpayment debt incurred this row = debt up-to-and-including recoverable assistance debt in current row */ + ,SUM(amount_incurred) OVER (PARTITION BY snz_uid ORDER BY debt_as_at_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) + - amount_incurred_overpayment AS assistance_end + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_tmp_signs_handled] + WHERE amount_incurred > 0 + +), +repayment_range AS ( + + SELECT snz_uid + ,snz_msd_uid + ,debt_as_at_date + ,amount_incurred + ,amount_repaid + ,amount_written_off + /* total repayment up-to-and-including current row - repayments this row = repayment up-to-but-excluding current row */ + ,-1.0 * (SUM(amount_repaid + amount_written_off) OVER (PARTITION BY snz_uid ORDER BY debt_as_at_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) + - (amount_repaid + amount_written_off)) AS repayment_start + /* total repayment up-to-and-including current row */ + ,-1.0 * SUM(amount_repaid + amount_written_off) OVER (PARTITION BY snz_uid ORDER BY debt_as_at_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) + AS repayment_end + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_tmp_signs_handled] + WHERE amount_repaid <> 0 + OR amount_written_off <> 0 + +), +split AS ( + + SELECT r.snz_uid + ,r.debt_as_at_date + ,a.assistance_start + ,a.assistance_end + ,r.repayment_start AS repayment_start + ,r.repayment_end AS repayment_end + ,IIF(a.assistance_start < r.repayment_start, r.repayment_start, a.assistance_start) AS latest_start + ,IIF(a.assistance_end < r.repayment_end, a.assistance_end, r.repayment_end) AS earliest_end + ,IIF(a.assistance_end < r.repayment_end, a.assistance_end, r.repayment_end) + - IIF(a.assistance_start < r.repayment_start, r.repayment_start, a.assistance_start) + AS overlap_is_repayment_of_recoverable_assistance + FROM repayment_range AS r + INNER JOIN recoverable_assistance_range AS a + ON r.snz_uid = a.snz_uid + AND a.assistance_start <= r.repayment_end + AND r.repayment_start <= a.assistance_end + +) +SELECT snz_uid + ,debt_as_at_date + ,SUM(overlap_is_repayment_of_recoverable_assistance) AS recoverable_assistance_repaid +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_tmp_classified_msd_repayment] +FROM split +GROUP BY snz_uid, debt_as_at_date + +/* Index by snz_uid (improve efficiency) */ +CREATE NONCLUSTERED INDEX snz_uid_index ON [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_tmp_classified_msd_repayment] (snz_uid); +GO + +/************************************************************************************************** +Join repayment split back to base table +**************************************************************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_classified_msd_debt_and_repayment]; +GO + +SELECT c.snz_uid + ,c.[snz_msd_uid] + ,c.[debt_as_at_date] + ,[amount_incurred] + ,[amount_repaid] + ,[amount_written_off] + ,amount_incurred_assistance + ,amount_incurred_overpayment + ,-1.0 * ROUND(ISNULL(r.recoverable_assistance_repaid, 0), 2) AS amount_repaid_assistance + ,-1.0 * ROUND((-[amount_repaid]) + (-[amount_written_off]) - ISNULL(r.recoverable_assistance_repaid,0), 2) AS amount_repaid_overpayment +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_classified_msd_debt_and_repayment] +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_tmp_signs_handled] AS c +LEFT JOIN [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_tmp_classified_msd_repayment] AS r +ON c.snz_uid = r.snz_uid +AND c.debt_as_at_date = r.debt_as_at_date + +/* Index by snz_uid (improve efficiency) */ +CREATE NONCLUSTERED INDEX snz_uid_index ON [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_classified_msd_debt_and_repayment] (snz_uid); +GO +/*********** compress tables ***********/ +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_classified_msd_debt_and_repayment] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE) + +/************************************************************************************************** +Data preparation +Calculate running balances +**************************************************************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_tmp_msd_debt_prep]; +GO + +SELECT * + ,SUM(delta) OVER (PARTITION BY snz_uid ORDER BY debt_as_at_date) AS balance + ,SUM(delta_assistance) OVER (PARTITION BY snz_uid ORDER BY debt_as_at_date) AS balance_assistance + ,SUM(delta_overpayment) OVER (PARTITION BY snz_uid ORDER BY debt_as_at_date) AS balance_overpayment +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_tmp_msd_debt_prep] +FROM( + SELECT * + ,ISNULL(amount_incurred, 0) + ISNULL(amount_repaid, 0) + ISNULL(amount_written_off, 0) AS delta + ,ISNULL(amount_incurred_assistance, 0) + ISNULL(amount_repaid_assistance, 0) AS delta_assistance + ,ISNULL(amount_incurred_overpayment, 0) + ISNULL(amount_repaid_overpayment, 0) AS delta_overpayment + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_classified_msd_debt_and_repayment] +)b + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_tmp_msd_debt_prep] (snz_uid); +GO + +/************************************************************************************************** +fill in records where balance is non-zero but transactions are zero +**************************************************************************************************/ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_msd_non_transactions]; +GO + +WITH +/* list of 1000 numbers 1:1000 - spt_values is an admin table chosen as it is at least 1000 row long */ +n AS ( + SELECT TOP 1000 ROW_NUMBER() OVER (ORDER BY type) AS x + FROM master.dbo.spt_values +), +/* list of dates, constructed by adding list of numbers to initial date */ +my_dates AS ( + SELECT TOP (DATEDIFF(MONTH, '2009-01-01', '2020-09-01') + 1) /* number of dates required */ + EOMONTH(DATEADD(MONTH, x-1, '2009-01-01')) AS my_dates + FROM n + ORDER BY x +), +/* get the next date for each record */ +debt_source AS ( + SELECT * + ,LEAD(debt_as_at_date, 1, '9999-01-01') OVER (PARTITION BY snz_uid ORDER BY debt_as_at_date) AS lead_date + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_tmp_msd_debt_prep] +), +/* join where dates from list are between current and next date --> hence dates from list are missing */ +joined AS ( + SELECT * + FROM debt_source + INNER JOIN my_dates + ON EOMONTH(debt_as_at_date) < my_dates + AND my_dates < EOMONTH(lead_date) +) +/* combine original and additional records into same table */ +SELECT * +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_msd_non_transactions] +FROM ( + /* original records */ + SELECT snz_uid + ,snz_msd_uid + ,debt_as_at_date + ,ROUND(amount_incurred, 2) AS amount_incurred + ,ROUND(amount_repaid, 2) AS amount_repaid + ,ROUND(amount_written_off, 2) AS amount_written_off + ,ROUND(amount_incurred_assistance, 2) AS amount_incurred_assistance + ,ROUND(amount_incurred_overpayment, 2) AS amount_incurred_overpayment + ,ROUND(amount_repaid_assistance, 2) AS amount_repaid_assistance + ,ROUND(amount_repaid_overpayment, 2) AS amount_repaid_overpayment + ,ROUND(delta, 2) AS delta + ,ROUND(delta_assistance, 2) AS delta_assistance + ,ROUND(delta_overpayment, 2) AS delta_overpayment + ,ROUND(balance, 2) AS balance + ,ROUND(balance_assistance, 2) AS balance_assistance + ,ROUND(balance_overpayment, 2) AS balance_overpayment + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_tmp_msd_debt_prep] + + UNION ALL + + /* additional records */ + SELECT snz_uid + ,snz_msd_uid + ,my_dates AS debt_as_at_date + ,NULL AS amount_incurred + ,NULL AS amount_repaid + ,NULL AS amount_written_off + ,NULL AS amount_incurred_assistance + ,NULL AS amount_incurred_overpayment + ,NULL AS amount_repaid_assistance + ,NULL AS amount_repaid_overpayment + ,0 AS delta + ,0 AS delta_assistance + ,0 AS delta_overpayment + ,ROUND(balance, 2) AS balance + ,ROUND(balance_assistance, 2) AS balance_assistance + ,ROUND(balance_overpayment, 2) AS balance_overpayment + FROM joined + WHERE NOT ( -- exclude small negative balances, the BETWEEN operator is inclusive + balance BETWEEN -10 AND 2 + AND balance_assistance BETWEEN -10 AND 2 + AND balance_overpayment BETWEEN -10 AND 2 + ) +) k + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_msd_non_transactions] ([snz_uid]); +GO + +/************************************************************************************************** +Views for balance labels +**************************************************************************************************/ + +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[d2gP2_msd_labels_balance]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[d2gP2_msd_labels_balance] AS +SELECT snz_uid + ,debt_as_at_date + ,balance + ,balance_overpayment + ,balance_assistance + /* balance labels */ + ,CONCAT('msd_Y', YEAR(debt_as_at_date), 'M', MONTH(debt_as_at_date), '_', 'overpayment') AS balance_label_overpayment + ,CONCAT('msd_Y', YEAR(debt_as_at_date), 'M', MONTH(debt_as_at_date), '_', 'assistance') AS balance_label_assistance + ,CONCAT('msd_Y', YEAR(debt_as_at_date), 'M', MONTH(debt_as_at_date)) AS balance_label_all_types +FROM[IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_msd_non_transactions] +GO + +/************************************************************************************************** +Views for transaction labels + +Views for transaction labels contains two parts. Part1 (eg. [d2gP2_msd_labels_transactions_part1]) +contains repayment & write off for subtype MSD debt. Part2 ([d2gP2_msd_labels_transactions_part2]) contains repayment and +write off seperately for MSD debt. +**************************************************************************************************/ + +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[d2gP2_msd_labels_transactions]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[d2gP2_msd_labels_transactions] AS +SELECT snz_uid + ,snz_msd_uid + ,debt_as_at_date + ,amount_incurred + ,amount_repaid + ,amount_written_off + ,amount_incurred_assistance + ,amount_incurred_overpayment + ,amount_repaid_assistance + ,amount_repaid_overpayment + /* incurred */ + ,CONCAT('msd_', 'amount_incurred', '_', YEAR(debt_as_at_date), '_', 'overpayment') AS transaction_labels_incurred_overpayment + ,CONCAT('msd_', 'amount_incurred', '_', YEAR(debt_as_at_date), '_', 'assistance') AS transaction_labels_incurred_assistance + ,CONCAT('msd_', 'amount_incurred', '_', YEAR(debt_as_at_date)) AS transaction_labels_incurred_all_types + /* repaid or write-off */ + ,CONCAT('msd_', 'amount_repaid', '_', YEAR(debt_as_at_date), '_', 'overpayment') AS transaction_labels_repaid_writeoff_overpayment + ,CONCAT('msd_', 'amount_repaid', '_', YEAR(debt_as_at_date), '_', 'assistance') AS transaction_labels_repaid_writeoff_assistance + ,CONCAT('msd_', 'amount_repaid', '_', YEAR(debt_as_at_date)) AS transaction_labels_repaid_all_types + ,CONCAT('msd_', 'amount_writeoff', '_', YEAR(debt_as_at_date)) AS transaction_labels_writeoff_all_types +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_msd_non_transactions] +GO + +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[msd_labels_pre2019]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[msd_labels_pre2019] AS +SELECT snz_uid + ,DATEADD(MONTH, 1, debt_as_at_date) AS debt_as_at_date + ,balance + ,balance_overpayment + ,balance_assistance + /*pre_2019*/ + ,CONCAT('msd_', 'pre_2019', '_', 'overpayment') AS transaction_labels_pre_2019_overpayment + ,CONCAT('msd_', 'pre_2019', '_', 'assistance') AS transaction_labels_pre_2019_assistance + ,CONCAT('msd_', 'pre_2019') AS transaction_labels_pre_2019_all_types +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_msd_non_transactions] +WHERE debt_as_at_date BETWEEN '2018-12-01' AND '2018-12-31' +GO + +/************************************************************************************************** +Views for repayments + +Views for repayments labels contains two parts. Part1 (eg. [d2gP2_msd_labels_repayments_part1]) +contains repayment & write off for subtype MSD debt. Part2 ([d2gP2_msd_labels_repayments_part2]) contains +repayment and write off seperately for MSD debt. +**************************************************************************************************/ + +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[d2gP2_msd_labels_repayments]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[d2gP2_msd_labels_repayments] AS +SELECT snz_uid + ,snz_msd_uid + ,debt_as_at_date + ,amount_repaid + ,amount_repaid_assistance + ,amount_repaid_overpayment + /* repayment labels by type */ + ,IIF(debt_as_at_date BETWEEN '2020-07-01' AND '2020-09-30', CONCAT('msd_payment_03mth_', 'assistance'), NULL) AS payment_label_assistance_03 + ,IIF(debt_as_at_date BETWEEN '2020-04-01' AND '2020-09-30', CONCAT('msd_payment_06mth_', 'assistance'), NULL) AS payment_label_assistance_06 + ,IIF(debt_as_at_date BETWEEN '2020-01-01' AND '2020-09-30', CONCAT('msd_payment_09mth_', 'assistance'), NULL) AS payment_label_assistance_09 + ,IIF(debt_as_at_date BETWEEN '2019-10-01' AND '2020-09-30', CONCAT('msd_payment_12mth_', 'assistance'), NULL) AS payment_label_assistance_12 + ,IIF(debt_as_at_date BETWEEN '2020-07-01' AND '2020-09-30', CONCAT('msd_payment_03mth_', 'overpayment'), NULL) AS payment_label_overpayment_03 + ,IIF(debt_as_at_date BETWEEN '2020-04-01' AND '2020-09-30', CONCAT('msd_payment_06mth_', 'overpayment'), NULL) AS payment_label_overpayment_06 + ,IIF(debt_as_at_date BETWEEN '2020-01-01' AND '2020-09-30', CONCAT('msd_payment_09mth_', 'overpayment'), NULL) AS payment_label_overpayment_09 + ,IIF(debt_as_at_date BETWEEN '2019-10-01' AND '2020-09-30', CONCAT('msd_payment_12mth_', 'overpayment'), NULL) AS payment_label_overpayment_12 + /* repayment labels all types */ + ,IIF(debt_as_at_date BETWEEN '2020-07-01' AND '2020-09-30', 'msd_payment_03mth', NULL) AS payment_label_all_types_03 + ,IIF(debt_as_at_date BETWEEN '2020-04-01' AND '2020-09-30', 'msd_payment_06mth', NULL) AS payment_label_all_types_06 + ,IIF(debt_as_at_date BETWEEN '2020-01-01' AND '2020-09-30', 'msd_payment_09mth', NULL) AS payment_label_all_types_09 + ,IIF(debt_as_at_date BETWEEN '2019-10-01' AND '2020-09-30', 'msd_payment_12mth', NULL) AS payment_label_all_types_12 +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_msd_non_transactions] +WHERE amount_repaid < -1 +GO + +/************************************************************************************************** +Views for persistence + +To determine whether a person has persistent debt we count the number of distinct dates where +the label is non-null during assembly. After assembly, we create the indicator by checking +whether msd_persistence_XXmth = XX. +- If msd_persistence_XXmth = XX then in the last XX months there were XX months where the person + had debt hence they had debt in every month. +- If msd_persistence_XXmth < XX then in the last XX months there were some months where the person + did not have debt. +**************************************************************************************************/ + +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[msd_labels_persist]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[d2gP2_msd_labels_persist] AS +SELECT snz_uid + ,snz_msd_uid + ,debt_as_at_date + ,balance + ,balance_overpayment + ,balance_assistance + /* persistence labels assistance */ + ,IIF(balance_assistance > 0 AND debt_as_at_date BETWEEN '2020-07-01' AND '2020-09-30', CONCAT('msd_persistence_03mth_', 'assistance'), NULL) AS persistence_label_assistance_03 + ,IIF(balance_assistance > 0 AND debt_as_at_date BETWEEN '2020-04-01' AND '2020-09-30', CONCAT('msd_persistence_06mth_', 'assistance'), NULL) AS persistence_label_assistance_06 + ,IIF(balance_assistance > 0 AND debt_as_at_date BETWEEN '2020-01-01' AND '2020-09-30', CONCAT('msd_persistence_09mth_', 'assistance'), NULL) AS persistence_label_assistance_09 + ,IIF(balance_assistance > 0 AND debt_as_at_date BETWEEN '2019-10-01' AND '2020-09-30', CONCAT('msd_persistence_12mth_', 'assistance'), NULL) AS persistence_label_assistance_12 + ,IIF(balance_assistance > 0 AND debt_as_at_date BETWEEN '2019-07-01' AND '2020-09-30', CONCAT('msd_persistence_15mth_', 'assistance'), NULL) AS persistence_label_assistance_15 + ,IIF(balance_assistance > 0 AND debt_as_at_date BETWEEN '2019-04-01' AND '2020-09-30', CONCAT('msd_persistence_18mth_', 'assistance'), NULL) AS persistence_label_assistance_18 + ,IIF(balance_assistance > 0 AND debt_as_at_date BETWEEN '2019-01-01' AND '2020-09-30', CONCAT('msd_persistence_21mth_', 'assistance'), NULL) AS persistence_label_assistance_21 + /* persistence labels overpayment */ + ,IIF(balance_overpayment > 0 AND debt_as_at_date BETWEEN '2020-07-01' AND '2020-09-30', CONCAT('msd_persistence_03mth_', 'overpayment'), NULL) AS persistence_label_overpayment_03 + ,IIF(balance_overpayment > 0 AND debt_as_at_date BETWEEN '2020-04-01' AND '2020-09-30', CONCAT('msd_persistence_06mth_', 'overpayment'), NULL) AS persistence_label_overpayment_06 + ,IIF(balance_overpayment > 0 AND debt_as_at_date BETWEEN '2020-01-01' AND '2020-09-30', CONCAT('msd_persistence_09mth_', 'overpayment'), NULL) AS persistence_label_overpayment_09 + ,IIF(balance_overpayment > 0 AND debt_as_at_date BETWEEN '2019-10-01' AND '2020-09-30', CONCAT('msd_persistence_12mth_', 'overpayment'), NULL) AS persistence_label_overpayment_12 + ,IIF(balance_overpayment > 0 AND debt_as_at_date BETWEEN '2019-07-01' AND '2020-09-30', CONCAT('msd_persistence_15mth_', 'overpayment'), NULL) AS persistence_label_overpayment_15 + ,IIF(balance_overpayment > 0 AND debt_as_at_date BETWEEN '2019-04-01' AND '2020-09-30', CONCAT('msd_persistence_18mth_', 'overpayment'), NULL) AS persistence_label_overpayment_18 + ,IIF(balance_overpayment > 0 AND debt_as_at_date BETWEEN '2019-01-01' AND '2020-09-30', CONCAT('msd_persistence_21mth_', 'overpayment'), NULL) AS persistence_label_overpayment_21 + /* persistence labels all types */ + ,IIF(balance > 0 AND debt_as_at_date BETWEEN '2020-07-01' AND '2020-09-30', 'msd_persistence_03mth', NULL) AS persistence_label_all_types_03 + ,IIF(balance > 0 AND debt_as_at_date BETWEEN '2020-04-01' AND '2020-09-30', 'msd_persistence_06mth', NULL) AS persistence_label_all_types_06 + ,IIF(balance > 0 AND debt_as_at_date BETWEEN '2020-01-01' AND '2020-09-30', 'msd_persistence_09mth', NULL) AS persistence_label_all_types_09 + ,IIF(balance > 0 AND debt_as_at_date BETWEEN '2019-10-01' AND '2020-09-30', 'msd_persistence_12mth', NULL) AS persistence_label_all_types_12 + ,IIF(balance > 0 AND debt_as_at_date BETWEEN '2019-07-01' AND '2020-09-30', 'msd_persistence_15mth', NULL) AS persistence_label_all_types_15 + ,IIF(balance > 0 AND debt_as_at_date BETWEEN '2019-04-01' AND '2020-09-30', 'msd_persistence_18mth', NULL) AS persistence_label_all_types_18 + ,IIF(balance > 0 AND debt_as_at_date BETWEEN '2019-01-01' AND '2020-09-30', 'msd_persistence_21mth', NULL) AS persistence_label_all_types_21 +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_msd_non_transactions] +GO + +/************************************************************************************************** +remove temporary tables +**************************************************************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_tmp_signs_handled]; +GO + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_tmp_classified_msd_repayment]; +GO + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_classified_msd_debt_and_repayment]; +GO + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_tmp_msd_debt_prep]; +GO diff --git a/justice/any_corrections_spells.sql b/justice/any_corrections_spells.sql new file mode 100644 index 0000000..9a3ce6f --- /dev/null +++ b/justice/any_corrections_spells.sql @@ -0,0 +1,108 @@ +/************************************************************************************************** +Title: Spell managed by Corrections +Author: Simon Anastasiadis +Reviewer: Marianna Pekar, Joel Bancolita + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +A spell for a person in New Zealand with any management by Corrections. + +Intended purpose: +1. Creating indicators of when/whether a person has been managed by corrections. +2. Identifying spells when a person is under Corrections management. +3. Counting the number of days a person spends under Corrections management. + +Inputs & Dependencies: +- [IDI_Clean].[cor_clean].[ov_major_mgmt_periods] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[d2gP2_corrections_any] + +Notes: +1) Corrections management includes prison sentences (PRISON), remanded in custody (REMAND), + supervision (ESO, INT_SUPER, SUPER), home detention (HD_REL, HD_SENT), conditions + (PAROLE, ROC, PDC, PERIODIC), and community sentences (COM_DET, CW, COM_PROG, COM_SERV, OTH_COM) +2) Corrections management excludes not managed (ALIVE), deceased, deported or over 90 (AGED_OUT) + not applicate (NA), or errors (ERROR). +3) This data set includes only major management periods, of which Prison is one type. + Where a person has multiple management/sentence types this dataset only records the + most severe. See introduction of Corrections documentation (2016). +4) A small but meaningful number of snz_uid codes (between 1% and 5%) have some form of duplicate + records. These people can be identified by having more than one [cor_mmp_max_period_nbr] value. + To avoid double counting, we keep only the records that are part of the longest sequence. + This requires the inner join. + An alternative approach would be to keep the sequence with the longest duration. + The assumption behind keeping the more complex sequences is that the increased detail makes them + more likely to be true. Contrast for example: 5 year prison sentence, vs. 2 years in prison, + 1 year home detention, 6 months supervision. In this case the first sequence, while longer, + may not have been updated as the person's conditions changed with the updates appearing on + the second sequence. +5) A tiny number of snz_uid codes have duplicate records of equal length that can not be + resolved using [cor_mmp_max_period_nbr]. Best estimates for the size of this group is <0.1% + of the population. We have left these duplicate records in place. +6) One day is subtracted from the end date to ensure periods are non-overlapping. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = d2g_ + Project schema = [DL-MAA20XX-YY] + +Issues: +1) Starting with the March 2022 refresh, the input table [ov_major_mgmt_periods] is no longer + being updated and new tables have been introduced. Definition will need updating to use + these new tables for research studying the most recent time periods. + +History (reverse order): +2021-06-04 FL update the input table to the latest reference +2020-07-22 JB QA +2020-07-16 MP QA +2020-02-28 SA v1 +**************************************************************************************************/ + +/* Set database for writing views */ +USE IDI_UserCode +GO + +/* Clear existing view */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[d2gP2_corrections_any]; +GO + +/* Create view */ +CREATE VIEW [DL-MAA20XX-YY].[d2gP2_corrections_any] AS +SELECT a.snz_uid + ,a.cor_mmp_prev_mmc_code + ,a.cor_mmp_mmc_code + ,a.cor_mmp_next_mmc_code + ,a.cor_mmp_index_offence_code + ,a.cor_mmp_imposed_sentence_length_nbr + ,a.cor_mmp_sentence_location_text + ,a.cor_mmp_period_start_date AS [start_date] + ,DATEADD(DAY, -1, a.cor_mmp_period_end_date) AS [end_date] +FROM [IDI_Clean_YYYYMM].[cor_clean].[ov_major_mgmt_periods] a +INNER JOIN ( + SELECT snz_uid, MAX(cor_mmp_max_period_nbr) AS cor_mmp_max_period_nbr + FROM [IDI_Clean_YYYYMM].[cor_clean].[ov_major_mgmt_periods] + GROUP BY snz_uid +) b +ON a.snz_uid = b.snz_uid +AND a.cor_mmp_max_period_nbr = b.cor_mmp_max_period_nbr +WHERE [cor_mmp_mmc_code] NOT IN ('AGED_OUT', 'ALIVE', 'ERROR', 'NA') +AND cor_mmp_period_start_date IS NOT NULL +AND cor_mmp_period_end_date IS NOT NULL +AND cor_mmp_period_start_date <= cor_mmp_period_end_date; +GO + + + + diff --git a/justice/crime_offender_victim.sql b/justice/crime_offender_victim.sql new file mode 100644 index 0000000..b45038e --- /dev/null +++ b/justice/crime_offender_victim.sql @@ -0,0 +1,101 @@ +/************************************************************************************************** +Title: Crime - offenders and victims +Author: Simon Anastasiadis + +cknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Offenders and victims of crime. + +Intended purpose: +1. Determining who has been a victim of crime or an offender. +2. Counting the number of occurrence of offence or victimisation. + +Inputs & Dependencies: +- [IDI_Clean].[pol_clean].[post_count_offenders] +- [IDI_Clean].[pol_clean].[post_count_victimisations] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[defn_crime_offender] +- [IDI_UserCode].[DL-MAA20XX-YY].[defn_crime_victim] + + +Notes: +1) Multiple charges can arise from a single occurrence. + We use the post_count tables for both offender and victim. + This means that only the most serious offence/charge from each + occurrence is used. +2) Not every crime/offence has a person as its victim. + E.g. drink driving. + The offender for some victimisations is unknown. + E.g. Burglary while victim was out. + Hence the number of offenders and victims will not match. +3) Only captures reported crime to police. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +Issues: + +History (reverse order): +2020-05-20 SA v1 +**************************************************************************************************/ + +/* Set database for writing views */ +USE IDI_UserCode +GO + +/* Clear existing view */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[defn_crime_offender]; +GO + +/* Create view */ +CREATE VIEW [DL-MAA20XX-YY].[defn_crime_offender] AS +SELECT [snz_uid] + ,[pol_poo_occurrence_inv_ind] + ,[pol_poo_offence_inv_ind] + ,[pol_poo_proceeding_date] + ,[pol_poo_offence_code] + ,[pol_poo_proceeding_code] + ,[snz_person_ind] + ,[pol_poo_earliest_occ_start_date] + ,[pol_poo_latest_poss_occ_date] +FROM [IDI_Clean_YYYYMM].[pol_clean].[post_count_offenders] +WHERE [snz_person_ind] = 1 --offender is a person +AND snz_uid > 0 --meaningful snz_uid code +AND [pol_poo_occurrence_inv_ind] = 1 --occurrence was investigated +AND [pol_poo_proceeding_code] NOT IN ('300', '999') --exclude not proceeded with and unknown status +GO + +/* Clear existing view */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[defn_crime_victim]; +GO + +/* Create view */ +CREATE VIEW [DL-MAA20XX-YY].[defn_crime_victim] AS +SELECT [snz_uid] + ,[pol_pov_occurrence_inv_ind] + ,[pol_pov_offence_inv_ind] + ,[pol_pov_reported_date] + ,[pol_pov_offence_code] + ,[pol_pov_rov_code] + ,[snz_person_ind] + ,[pol_pov_earliest_occ_start_date] + ,[pol_pov_latest_poss_occ_date] +FROM [IDI_Clean_YYYYMM].[pol_clean].[post_count_victimisations] +WHERE [snz_person_ind] = 1 --victim is a person +AND snz_uid > 0 --meaningful snz_uid code +AND [pol_pov_occurrence_inv_ind] = 1 --occurrence was investigated +GO diff --git a/justice/family_violence_observed.sql b/justice/family_violence_observed.sql new file mode 100644 index 0000000..e59ed0f --- /dev/null +++ b/justice/family_violence_observed.sql @@ -0,0 +1,266 @@ +/************************************************************************************************** +Title: Family Violence +Author: Marianna Pekar + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_libraryDescription: +Family, sexual, and intimate partner violence. + +Intended purpose: +Identifying people who have been exposed to Family Violence. + +Inputs & Dependencies: +- [IDI_Clean].[pol_clean].[pre_count_offenders] +- [IDI_Clean].[pol_clean].[post_count_offenders] +- [IDI_Clean].[pol_clean].[pre_count_victimisations] +- [IDI_Clean].[pol_clean].[post_count_victimisations] +- [IDI_Clean].[acc_clean].[claims] +- [IDI_Clean].[moj_clean].[charges] + +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[offence_codes_mapping] +- [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_family_violence] + +Notes: +1) Based on information received from Jo Fink, Ministry of Justice, Senior Analyst, Analysis and Modelling. + Our thanks to Jo Fink for their help. +2) Much of the information that is used operationally by the Justice sector agencies for official statistics + are not available in the IDI (e.g. Police family harm investigations, calls for service, Police Safety Orders, + family violence flag, Protection Order applications, referral to non-violence programmes etc.). +3) This definition is likely to be an under count, both due to under-reporting and differences in data sources. + For comparison, consider that the police family violence unit is called out aroun 120k times per year. +4) Sources used (administrative): + - Police: victimisation from July 2014 onwards, proceedings from 2009 onwards + - NIA links: offence information + - Justice: charges from 1992 onwards + - ACC: sensitive claims and injury claims based on keywords - awaiting SME review + - Oranga Tamariki: reports of concern - currently not incorporated + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = vacc_ + Project schema = DL-MAA20XX-YY + +Issues: + +History (reverse order): +2021-08-31 modifying parameters, MP +2020-08-05 initiated MP +**************************************************************************************************/ + +/* Manual loading of offence codes of interest */ +DROP Table IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[offence_codes_mapping]; +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[offence_codes_mapping] ( + [offence_code] INT, + [offence_description] NVARCHAR(MAX) +); + + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[offence_codes_mapping] ([offence_code], [offence_description]) +VALUES + (1541,'Male Assaults Female (Firearm)'), -- not strictly FV, but more than 90% of such offences are FV (Jo Fink) + (1542,'Male Assaults Female (Other Weapon)'), -- not strictly FV, but more than 90% of such offences are FV (Jo Fink) + (1543,'Male Assaults Female (Manually)'), -- not strictly FV, but more than 90% of such offences are FV (Jo Fink) + (1544,'Male Assaults Female (Stabbing/Cutting Weapon)'), -- not strictly FV, but more than 90% of such offences are FV (Jo Fink) + (1545,'Assault on person in family relationship'), + (1581,'Common Assault(Domestic)Crimes Act(Firearm)'), + (1582,'Common Assault(Domestic)Crimes Act(Other Weapon)'), + (1583,'Common Assault(Domestic)Crimes Act(Manually)'), + (1587,'Common Assault (Domestic) (Stabbing/Cutting Weapon)'), + (1641,'Common Assault (Domestic) (Firearm)'), + (1642,'Common Assault (Domestic) (Other Weapon)'), + (1643,'Common Assault (Domestic) (Manually)'), + (1647,'Common Assault (Domestic) (Stabbing/Cutting Weapon)'), + (2153,'Husband Rapes Wife (With Weapon)'), + (2154,'Husband Rapes Wife (No Weapon)'), + (2157,'Unlawful Sexual Connection With Spouse (Weapon)'), + (2158,'Unlawful Sexual Connection With Spouse (No Weapon)'), + (2163,'Attempts Sexual Violation Spouse (Weapon)'), + (2164,'Attempts Sexual Violation Spouse (No Weapon)'), + (2167,'Assaults W Intent T Commit Sexual Violation Spouse (Weapon)'), + (2168,'Assaults Intent Commit Sexual Violation Spouse (No Weapon)'), + (2311,'Father Incest Daughter'), + (2312,'Brother Incest Sister'), + (2313,'Other Incest Other Relative'), + (2319,'Other Incest'), + (2654,'Husband Rapes Wife'), + (2658,'Unlawful Sexual Connection With Spouse'), + (2664,'Attempt To Rape Spouse'), + (2668,'Attempted Unlawful Sexual Connection Spouse'), + (2674,'Assault With Intent To Commit Rape Spouse'), + (2678,'Assault Intent Commit Sex Connect-Spouse'), + (2711,'Parent Incest Child Under 12'), + (2712,'Parent Incest Child 12-16'), + (2713,'Parent Incest Child Over 16'), + (2714,'Brother Incest Sister Under 12'), + (2715,'Brother Incest Sister 12-16'), + (2716,'Brother Incest Sister Over 16'), + (2719,'Other Incest'), + (2731,'Sexual Connection Dependent Family Member'), + (2732,'Attempt Sex Connection Dependent Family Member'), + (2733,'Indecent Act On Dependent Family Member'), + (3851,'Contravenes Protection Order (Firearm)'), + (3852,'Contravenes Protection Order (No Firearm)'), + (3853,'Fails To Comply With Conditions Of Order (Firearm)'), + (3854,'Fails To Comply With Conditions Of Order (No Firearm)'), + (3855,'Fails To Attend Program'), + (3856,'Breach Publications Restrictions'), + (3858,'Detention by Constable - Failure or Refusal to Remain'), + (3859,'Other Breaches Of Domestic Violence Act'), + (3871,'Contravenes Protection Order - Family Violence'), + (3872,'Contravenes Protection Order - Unauthorised Contact'), + (3873,'Contravenes Protection Order - Encourages a Person to Engage in Behaviour'), + (3874,'Contravenes Protection Order - Dowry-Related Violence'), + (3875,'Contravenes Protection Order - Breach of Special Condition'), + (3876,'Fail to Comply w/Cond Protection Order - Fail/Refuse to Surrender Weapon'), + (3877,'Fail to Comply w/Cond Protection Order - Fail/Ref Surrender Firearms Lic'), + (3878,'Fail to comply with conditions of protection order - Possess Weapon'), + (3879,'Fail to comply w/conditions of protection order - Held Firearms Licence'), + (3881,'Contravenes Protection Order - Occupation Order'), + (3882,'Contravenes Protection Order - Tenancy Order'), + (3883,'Contravenes Protection Order - Ancillary Furniture Order'), + (3884,'Contravenes Protection Order - Furniture Order'), + (3885,'Fails to Attend Programme - Family Violence Act 2018'), + (3886,'Breach Publication Restrictions - Family Violence Act 2018'), + (7224,'Coerced person into marriage/civil union'), -- obsolete + (2173, 'Induce Sexual Intercourse Pretence Of Marriage'), -- added + (2620, 'Abduction For Marriage Or Sex'), -- added + (2621, 'Abduction For Marriage Girl Under 12'), -- added + (2622, 'Abduction For Marriage Girl 12-16'), --added + (2623, 'Abduction For Marriage Female Over 16'),--added + (2627, 'Abduction For Marriage - Male'), --added + (2629, 'Other Abduction For Marriage Or Sex'), --added + (2641, 'Inducing Sexual Intercourse Pretence Of Marriage'), --added + (3723,'Breach Of Nonmolestation Order'), --DOMESTIC PROTECTION ACT 1982 SECTION 16,18. Repealed 1 July 1996 by s 129(a) Domestic Violence Act 1995 + (3741,'Offences Against Domestic Protection Act'), --Domestic Protection Act 1982 Where Not Specifically Covered Elsewhere. Repealed & Offence Obsoleted From 1 Jul, 1996 + (3749,'Other Miscellaneous Family Offences'), --Children, Young Persons, & Their Families Act 1989, Guardianship Act 1968, Matrimonial Property Act 1976, Property (Relationships) Act 1976, Domestic Violence Act 1995, Family Proceedings Act 1980, Where Not Otherwise Specified. + (3857,'Failure To Comply With Police Safety Order (Not a criminal prosecution)'), --Domestic Violence (Enhancing Safety) Bill s 124E 1(a-i) + (6122,'Trespass Family Proceed Act'); --Family Proceedings Act 1980 Sec 176(3). Repealed By Domestic Protection Act 1982 + --(7200, 'Births, Deaths And Marriages') + --(7220, 'Offences Re Marriage') + --(7222, 'Feigned Marriage') + --(7223, 'Breaches Marriage Act') + --(7229, 'Other Offences Re Marriage') +GO + +/*************************************** +combined table of family violence events +***************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_family_violence]; +GO + +SELECT * +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_family_violence] +FROM ( + +-- New Zealand Police Data +-- The Pre-Count table includes all criminal incidents within the reporting period where the offence type (ANZSOC group) is within scope. +SELECT snz_uid + ,[pol_pro_earliest_occ_start_date] AS [start_date] + ,[pol_pro_latest_poss_occ_date] AS [end_date] +FROM [IDI_Clean_YYYYMM].[pol_clean].[pre_count_offenders] +WHERE pol_pro_offence_code IN ( + SELECT CAST([offence_code] AS VARCHAR(5)) + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[offence_codes_mapping] +) +AND snz_uid > 0 + +UNION ALL + +-- NZ Police data +-- The Post-count table counts a person once on each day they are proceeded against by police in the reference period, whether by court or non-court action +SELECT snz_uid + ,[pol_poo_earliest_occ_start_date] AS [start_date] + ,[pol_poo_latest_poss_occ_date] AS [end_date] +FROM [IDI_Clean_YYYYMM].[pol_clean].[post_count_offenders] +WHERE pol_poo_offence_code IN ( + SELECT CAST([offence_code] AS VARCHAR(5)) + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[offence_codes_mapping] +) +AND snz_uid > 0 + +UNION ALL + +-- Pre-count victimisation +-- Recorded Crime Victims Statistics (RCVS), available from July 2014 onwards +SELECT snz_uid + ,[pol_prv_earliest_occ_start_date] AS [start_date] + ,[pol_prv_latest_poss_occ_date] AS [end_date] +FROM [IDI_Clean_YYYYMM].[pol_clean].[pre_count_victimisations] +WHERE pol_prv_offence_code IN ( + SELECT CAST([offence_code] as VARCHAR(5)) + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[offence_codes_mapping] +) +AND snz_uid > 0 + +UNION ALL + +-- Post-count victimisation +SELECT snz_uid + ,[pol_pov_earliest_occ_start_date] AS [start_date] + ,[pol_pov_latest_poss_occ_date] AS [end_date] +FROM [IDI_Clean_YYYYMM].[pol_clean].[post_count_victimisations] +WHERE pol_pov_offence_code IN ( + SELECT CAST([offence_code] AS VARCHAR(5)) + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[offence_codes_mapping] +) +AND snz_uid > 0 + +UNION ALL + +-- NIA Links +SELECT [snz_uid] + ,[nia_links_rec_date] AS [start_date] + ,[nia_links_rec_date] AS [end_date] +FROM [IDI_Clean_YYYYMM].[pol_clean].[nia_links] +WHERE [nia_links_latest_inc_off_code] IN ( + SELECT CAST([offence_code] AS VARCHAR(5)) + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[offence_codes_mapping] +) +AND [nia_links_rec_date] IS NOT NULL +AND snz_uid > 0 + +UNION ALL + +-- Ministry of Justice, Charges (1992 onwards) +SELECT snz_uid + ,moj_chg_offence_from_date AS [start_date] + ,moj_chg_offence_from_date AS [end_date] +FROM [IDI_Clean_YYYYMM].[moj_clean].[charges] AS a +WHERE moj_chg_offence_code IN ( + SELECT CAST([offence_code] AS VARCHAR(5)) + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[offence_codes_mapping] +) +AND snz_uid > 0 + +UNION ALL + +-- ACC, acc_cla_read_codes indicating family member is identified as a perpetrator of maltreatment or neglect +SELECT snz_uid + ,acc_cla_accident_date AS [start_date] + ,acc_cla_accident_date AS [end_date] +FROM [IDI_Clean_YYYYMM].[acc_clean].[claims] +-- where acc_cla_read_code description contains malteratment, abuse of family members. List awaiting QA +WHERE acc_cla_read_code IN ('SN563','SN564','SN57.','SN55.','SN550','SN55z','TL70.','TL7y.','TL7z.','TE401','SN553','SN552','TLx40') + +) k; +GO + +/* index and compress */ +CREATE NONCLUSTERED INDEX my_index ON [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_family_violence] (snz_uid) +GO +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_family_violence] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE) +GO diff --git a/justice/police_interaction.sql b/justice/police_interaction.sql new file mode 100644 index 0000000..03e29a4 --- /dev/null +++ b/justice/police_interaction.sql @@ -0,0 +1,97 @@ +/************************************************************************************************** +Title: Recent police interaction +Author: Craig Wright + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Flag for recent police interaction as victim or offender. + +Intended purpose: +Identifying whether people have interacted with police in (2020,2021) and if so as a victim or as an offender. + +Inputs & Dependencies: +- [IDI_Clean].[pol_clean].[post_count_victimisations] +- [IDI_Clean].[pol_clean].[post_count_offenders] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[vacc_police_interaction] + +Notes: + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = vacc_ + Project schema = DL-MAA20XX-YY + +Issues: + +History (reverse order): +2021-10-31 CW +**************************************************************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_police_interaction] +GO + +/* create */ +WITH +pol_off AS ( + + SELECT DISTINCT [snz_uid] + ,[snz_jus_uid] + ,1 AS the_type + ,[pol_pov_year_nbr] AS the_year + FROM [IDI_Clean_YYYYMM].[pol_clean].[post_count_victimisations] + WHERE [pol_pov_year_nbr] in (2020, 2021) -- year(s) of interest + +), +pol_vic AS ( + + SELECT DISTINCT [snz_uid] + ,[snz_jus_uid] + ,2 AS the_type + ,[pol_poo_year_nbr] AS the_year + FROM [IDI_Clean_YYYYMM].[pol_clean].[post_count_offenders] + WHERE [pol_poo_year_nbr] in (2020, 2021) -- year(s) of interest + +), +all_ids AS ( + + SELECT DISTINCT snz_uid + FROM ( + SELECT snz_uid FROM pol_off + UNION ALL + SELECT snz_uid FROM pol_vic + ) AS k + +) +SELECT a.snz_uid + ,CASE WHEN b.the_type IS NULL THEN 0 ELSE 1 END AS off_2020 + ,CASE WHEN c.the_type IS NULL THEN 0 ELSE 1 END AS off_2021 + ,CASE WHEN d.the_type IS NULL THEN 0 ELSE 1 END AS vic_2020 + ,CASE WHEN e.the_type IS NULL THEN 0 ELSE 1 END AS vic_2021 +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_police_interaction] +FROM all_ids AS a +LEFT JOIN pol_off AS b +ON a.snz_uid = b.snz_uid AND b.the_year=2020 +LEFT JOIN pol_off AS c +ON a.snz_uid = c.snz_uid AND c.the_year=2021 +LEFT JOIN pol_vic AS d +ON a.snz_uid = d.snz_uid AND d.the_year=2020 +LEFT JOIN pol_vic AS e +ON a.snz_uid = e.snz_uid AND e.the_year=2021 +GO + +/* index */ +CREATE NONCLUSTERED INDEX mu_index ON [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_police_interaction] (snz_uid) +GO diff --git a/justice/prison_spells.sql b/justice/prison_spells.sql new file mode 100644 index 0000000..18b6432 --- /dev/null +++ b/justice/prison_spells.sql @@ -0,0 +1,95 @@ +/************************************************************************************************** +Title: Spell in Prison +Author: Simon Anastasiadis +re-edit: Freya Li + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +A spell for a person in a New Zealand prison managed by Corrections. + +Intended purpose: +1. Creating indicators of when/whether a person has been imprisoned. +2. Identifying spells when a person is in prison. +3. Counting the number of days a person spends in prison. + +Inputs & Dependencies: +- [IDI_Clean].[cor_clean].[ov_major_mgmt_periods] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[d2gP2_corrections_prison] + +Notes: +1) This data set includes only major management periods, of which Prison is one type. + Where a person has multiple management/sentence types this dataset only records the + most severe. See introduction of Corrections documentation (2016). +2) A small but meaningful number of snz_uid codes (between 1% and 5%) have some form of duplicate + records. These people can be identified by having more than one [cor_mmp_max_period_nbr] value. + To avoid double counting, we keep only the records that are part of the longest sequence. + This requires the inner join. +3) A tiny number of snz_uid codes have duplicate records of equal length that can not be + resolved using [cor_mmp_max_period_nbr]. Best estimates for the size of this group is <0.1% + of the population. We have left these duplicate records in place. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = d2gP2_ + Project schema = [DL-MAA20XX-YY] + +Issues: +- Trivial number of records have start_date > end_date +- Starting with the March 2022 refresh, the input table [ov_major_mgmt_periods] is no longer + being updated and new tables have been introduced. Definition will need updating to use + these new tables for research studying the most recent time periods. + +History (reverse order): +2021-06-09 SA QA +2021-06-04 FL update the input data to the latest reference +2020-11-19 FL QA +2020-02-28 SA v1 +**************************************************************************************************/ + +/* Set database for writing views */ +USE IDI_UserCode +GO + +/* Clear existing view */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[d2gP2_corrections_prison]; +GO + +/* Create view */ +CREATE VIEW [DL-MAA20XX-YY].[d2gP2_corrections_prison] AS +SELECT a.snz_uid + ,a.cor_mmp_prev_mmc_code + ,a.cor_mmp_mmc_code + ,a.cor_mmp_next_mmc_code + ,a.cor_mmp_index_offence_code + ,a.cor_mmp_imposed_sentence_length_nbr + ,a.cor_mmp_sentence_location_text + ,a.cor_mmp_period_start_date AS [start_date] + ,a.cor_mmp_period_end_date AS [end_date] +FROM [IDI_Clean_YYYYMM].[cor_clean].[ov_major_mgmt_periods] a +INNER JOIN ( + SELECT snz_uid, MAX(cor_mmp_max_period_nbr) AS cor_mmp_max_period_nbr + FROM [IDI_Clean_YYYYMM].[cor_clean].[ov_major_mgmt_periods] + GROUP BY snz_uid +) b +ON a.snz_uid = b.snz_uid +AND a.cor_mmp_max_period_nbr = b.cor_mmp_max_period_nbr +WHERE cor_mmp_mmc_code = 'PRISON' +AND cor_mmp_period_start_date IS NOT NULL +AND cor_mmp_period_end_date IS NOT NULL +AND cor_mmp_period_start_date <= cor_mmp_period_end_date; +GO + + diff --git a/people/Principal Disability Mapping.xlsx b/people/Principal Disability Mapping.xlsx new file mode 100644 index 0000000..8c999e4 Binary files /dev/null and b/people/Principal Disability Mapping.xlsx differ diff --git a/people/alive.sql b/people/alive.sql new file mode 100644 index 0000000..b8eb004 --- /dev/null +++ b/people/alive.sql @@ -0,0 +1,86 @@ +/************************************************************************************************** +Title: Period a person is alive +Author: Simon Anastasiadis +Re-edit: Freya Li + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +The period we can be confident that a person is alive. This is birth to death where both birth and death are available. If death is not available, we assume people do not live beyond 130 years. + +Intended purpose: +1. Suitable for creating indicators for when/whether a person is alive. +2. Expected to use to filter datasets for whether people are alive. + +Inputs & Dependencies: +- [IDI_Clean].[data].[personal_detail] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[defn_alive] + +Notes: +1) Only year and month of death are available in the IDI. Day of birth and day of death + are considered identifying. Hence all births happen on the 15th of the month and + all deaths happen on the 28th. +2) Future births and deaths are excluded. This is applied via a filter on the year of + birth and year of death. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + Limit on future births & deaths = 2021 + +Issues: + +History (reverse order): +2022-10-20 SA merge w deaths defn +2022-05-07 VW Point to DL-MAA20XX-YY, update to latest refresh (202203) +2022-02-21 VW Point to DL-MAA20XX-YY +2021-11-22 MR Update latest refresh (20211020) +2021-01-26 SA QA +2021-01-09 FL v2 (Change prefix and update the table to the latest refresh) +2020-07-22 JB QA +2020-07-16 MP QA +2020-02-28 SA v1 +**************************************************************************************************/ + +/* Set database for writing views */ +USE IDI_UserCode +GO + +/* Clear existing view */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[defn_alive]; +GO + +/* Create view */ +CREATE VIEW [DL-MAA20XX-YY].[defn_alive] AS +SELECT * +FROM ( + SELECT snz_uid + ,EOMONTH([snz_birth_date_proxy]) AS [start_date] + ,IIF( [snz_deceased_year_nbr] IS NOT NULL AND [snz_deceased_month_nbr] IS NOT NULL, + EOMONTH(DATEFROMPARTS([snz_deceased_year_nbr], [snz_deceased_month_nbr], 15)), + EOMONTH(DATEFROMPARTS([snz_birth_year_nbr] + 130, [snz_birth_month_nbr], 15))) AS [end_date] + ,[snz_birth_year_nbr] + ,[snz_birth_month_nbr] + ,[snz_deceased_year_nbr] + ,[snz_deceased_month_nbr] + FROM [IDI_Clean_YYYYMM].[data].[personal_detail] + WHERE [snz_person_ind] = 1 + AND [snz_birth_year_nbr] IS NOT NULL + AND [snz_birth_month_nbr] IS NOT NULL + AND [snz_birth_year_nbr] <= YEAR(GETDATE()) +) k +WHERE [start_date] <= [end_date]; +GO diff --git a/people/country_of_birth.sql b/people/country_of_birth.sql new file mode 100644 index 0000000..dfd9c72 --- /dev/null +++ b/people/country_of_birth.sql @@ -0,0 +1,1646 @@ +/************************************************************************************************** +Title: Country of birth +Author: Craig Wright + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Country of birth (or citizenship where available COB not available). + +Intended purpose: +1. Supplement ethnicity and identity information by including Country of Birth. +2. Where country of birth (COB) is not available we instead use country of citizenship (COC). + +Inputs & Dependencies: +- [IDI_Clean].[cen_clean].[census_individual_2018] +- [IDI_Clean].[cen_clean].[census_individual_2013] +- [IDI_Clean].[dia_clean].[births] +- [IDI_Clean].[cus_clean].[journey] +- [IDI_Clean].[nzta_clean].[dlr_historic] +- [IDI_Clean].[nzta_clean].[drivers_licence_register] +- [IDI_Clean].[dol_clean].[movement_identities] +- [IDI_Clean].[moe_clean].[enrolment] +- [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2019] +- [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2018] +- [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2017] +- [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2016] +- [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2015] +- [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2014] +- [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2013] +- [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2012] +- [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2011] +- [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2010] +- [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2009] +- [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2008] +- [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2007] +- [IDI_Clean].[moe_clean].[nsi] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_country_of_birth] + +Notes: +1) Multiple sources contain COB / COC information. + Consistent with how SNZ makes the personal details table, the different sources + are ranked and the highest quality source is kept. + +2) The ranking of the sources are as follows (1 = best): + 1. census 2018 + 2. census 2013 + 3. DIA births - NZ birth + 4. CUS customs + 5. NZTA drivers license + 6. DOL + 7. MOE enrolment - country of citizenships - take first COC by date + 8. MOE school enrolment - country of citizenships - take first COC by date + +3) This file is very long because it contains complete lists of countries and the + (manual) recoding of their labels to a standard set of country codes. + A single length of country codes is approx 5-6x the PgDn button. This means that + you can push the PgDn button 5 or 6 times to go from the start to the end of a country list. + +4) The output country encoding should be consistent with Census country codes. + +Issues: +1) The long CASE WHEN statements can be slow to execute. + This can be improved by loading a metadata table and using a JOIN. + A suitable metadata table is saved alongside this definition, but has not been + integrated into this definition. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = vacc_ + Project schema = DL-MAA20XX-YY + +History (reverse order): +2021-11-26 SA restructure and tidy +2021-10-31 CW +**************************************************************************************************/ + +/* create table of all COB */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_COB_list] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_COB_list] ( + snz_uid INT, + c_type VARCHAR(5), + code VARCHAR(5), + source_rank INT, +); +GO + +/*************************************************************************************************************** +append records from each source into the table +***************************************************************************************************************/ + +/******************************************************** +Census 2018 +'V14.1' as code_sys +********************************************************/ +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_COB_list] (snz_uid, c_type, code, source_rank) +SELECT [snz_uid] + ,'COB' AS c_type + ,[cen_ind_birth_country_code] as code + ,1 AS source_rank +FROM [IDI_Clean_YYYYMM].[cen_clean].[census_individual_2018] +WHERE [cen_ind_birth_country_impt_ind] in ('11','12') +GO + +/******************************************************** +Census 2013 +'V14.1' as code_sys +********************************************************/ +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_COB_list] (snz_uid, c_type, code, source_rank) +SELECT [snz_uid] + ,'COB' as c_type + ,[cen_ind_birth_country_code] as code + ,2 AS source_rank +FROM [IDI_Clean_YYYYMM].[cen_clean].[census_individual_2013] +GO + +/******************************************************** +DIA births - NZ birth +'1999 4N V14.0.0' as code_sys +********************************************************/ +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_COB_list] (snz_uid, c_type, code, source_rank) +SELECT DISTINCT snz_uid + ,'COB' as c_type + ,'1201' as code + ,3 AS source_rank +FROM [IDI_Clean_YYYYMM].[dia_clean].[births] +GO + +/******************************************************** +CUS customs +'1999 4A V15.0.0' as raw_code_sys +'1999 4N V14.0.0' as code_sys +********************************************************/ +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_COB_list] (snz_uid, c_type, code, source_rank) +SELECT [snz_uid] + ,'COB' AS c_type + ,CASE [cus_jou_country_of_birth_code] + WHEN '01' then NULL + WHEN '02' then NULL + WHEN '03' then NULL + WHEN 'AD' then '3101' + WHEN 'AE' then '4216' + WHEN 'AF' then '7201' + WHEN 'AG' then '8402' + WHEN 'AI' then '8401' + WHEN 'AL' then '3201' + WHEN 'AM' then '7202' + WHEN 'AN' then NULL + WHEN 'AO' then '9201' + WHEN 'AQ' then '1601' + WHEN 'AR' then '8201' + WHEN 'AS' then '1506' + WHEN 'AT' then '2301' + WHEN 'AU' then '1101' + WHEN 'AW' then '8403' + WHEN 'AZ' then '7203' + WHEN 'BA' then '3202' + WHEN 'BB' then '8405' + WHEN 'BD' then '7101' + WHEN 'BE' then '2302' + WHEN 'BF' then '9102' + WHEN 'BG' then '3203' + WHEN 'BH' then '4201' + WHEN 'BI' then '9203' + WHEN 'BJ' then '9101' + WHEN 'BM' then '8101' + WHEN 'BN' then '5201' + WHEN 'BO' then '8202' + WHEN 'BR' then '8203' + WHEN 'BS' then '8404' + WHEN 'BT' then '7102' + WHEN 'BU' then '5101' + WHEN 'BW' then '9202' + WHEN 'BY' then '3301' + WHEN 'BZ' then '8301' + WHEN 'CA' then '8102' + WHEN 'CC' then '1101' + WHEN 'CD' then '9108' + WHEN 'CF' then '9105' + WHEN 'CG' then '9107' + WHEN 'CH' then '2311' + WHEN 'CI' then '9111' + WHEN 'CK' then '1501' + WHEN 'CL' then '8204' + WHEN 'CM' then '9103' + WHEN 'CN' then '6101' + WHEN 'CO' then '8205' + WHEN 'CR' then '8302' + WHEN 'CS' then '3215' + WHEN 'CT' then '1402' + WHEN 'CU' then '8407' + WHEN 'CV' then '9104' + WHEN 'CW' then '8433' + WHEN 'CX' then '1101' + WHEN 'CY' then '3205' + WHEN 'CZ' then '3302' + WHEN 'DD' then '2304' + WHEN 'DE' then '2304' + WHEN 'DJ' then '9205' + WHEN 'DK' then '2401' + WHEN 'DM' then '8408' + WHEN 'DO' then '8411' + WHEN 'DZ' then '4101' + WHEN 'EC' then '8206' + WHEN 'EE' then '3303' + WHEN 'EG' then '4102' + WHEN 'EH' then '4107' + WHEN 'EN' then '2102' + WHEN 'ER' then '9206' + WHEN 'ES' then '3108' + WHEN 'ET' then '9207' + WHEN 'EU' then NULL + WHEN 'FI' then '2403' + WHEN 'FJ' then '1502' + WHEN 'FK' then '8207' + WHEN 'FM' then '1404' + WHEN 'FO' then '2402' + WHEN 'FR' then '2303' + WHEN 'GA' then '9113' + WHEN 'GB' then '2100' + WHEN 'GD' then '8412' + WHEN 'GE' then '7204' + WHEN 'GF' then '8208' + WHEN 'GH' then '9115' + WHEN 'GI' then '3102' + WHEN 'GJ' then '8412' + WHEN 'GL' then '2404' + WHEN 'GM' then '9114' + WHEN 'GN' then '9116' + WHEN 'GP' then '8413' + WHEN 'GQ' then '9112' + WHEN 'GR' then '3207' + WHEN 'GS' then NULL + WHEN 'GT' then '8304' + WHEN 'GU' then '1401' + WHEN 'GW' then '9117' + WHEN 'GY' then '8211' + WHEN 'HG' then NULL + WHEN 'HI' then '2105' + WHEN 'HK' then '6102' + WHEN 'HN' then '8305' + WHEN 'HR' then '3204' + WHEN 'HT' then '8414' + WHEN 'HU' then '3304' + WHEN 'ID' then '5202' + WHEN 'IE' then '2201' + WHEN 'IL' then '4205' + WHEN 'IM' then '2103' + WHEN 'IN' then '7103' + WHEN 'IO' then '9299' + WHEN 'IQ' then '4204' + WHEN 'IR' then '4203' + WHEN 'IS' then '2405' + WHEN 'IT' then '3104' + WHEN 'JM' then '8415' + WHEN 'JO' then '4206' + WHEN 'JP' then '6103' + WHEN 'KE' then '9208' + WHEN 'KG' then '7206' + WHEN 'KH' then '5102' + WHEN 'KI' then '1402' + WHEN 'KM' then '9204' + WHEN 'KN' then '8422' + WHEN 'KP' then '6104' + WHEN 'KR' then '6105' + WHEN 'KW' then '4207' + WHEN 'KX' then '3216' + WHEN 'KY' then '8406' + WHEN 'KZ' then '7205' + WHEN 'LA' then '5103' + WHEN 'LB' then '4208' + WHEN 'LC' then '8423' + WHEN 'LI' then '2305' + WHEN 'LK' then '7107' + WHEN 'LR' then '9118' + WHEN 'LS' then '9211' + WHEN 'LT' then '3306' + WHEN 'LU' then '2306' + WHEN 'LV' then '3305' + WHEN 'LY' then '4103' + WHEN 'MA' then '4104' + WHEN 'MC' then '2307' + WHEN 'MD' then '3208' + WHEN 'ME' then '3214' + WHEN 'MG' then '9212' + WHEN 'MH' then '1403' + WHEN 'MK' then '3206' + WHEN 'ML' then '9121' + WHEN 'MM' then '5101' + WHEN 'MN' then '6107' + WHEN 'MO' then '6106' + WHEN 'MP' then '1406' + WHEN 'MQ' then '8416' + WHEN 'MR' then '9122' + WHEN 'MS' then '8417' + WHEN 'MT' then '3105' + WHEN 'MU' then '9214' + WHEN 'MV' then '7104' + WHEN 'MW' then '9213' + WHEN 'MX' then '8306' + WHEN 'MY' then '5203' + WHEN 'MZ' then '9216' + WHEN 'NA' then '9217' + WHEN 'NC' then '1301' + WHEN 'ND' then '2104' + WHEN 'NE' then '9123' + WHEN 'NF' then '1102' + WHEN 'NG' then '9124' + WHEN 'NI' then '8307' + WHEN 'NL' then '2308' + WHEN 'NO' then '2406' + WHEN 'NP' then '7105' + WHEN 'NR' then '1405' + WHEN 'NT' then NULL + WHEN 'NU' then '1504' + WHEN 'NZ' then '1201' + WHEN 'OM' then '4211' + WHEN 'PA' then '8308' + WHEN 'PC' then NULL + WHEN 'PE' then '8213' + WHEN 'PF' then '1503' + WHEN 'PG' then '1302' + WHEN 'PH' then '5204' + WHEN 'PK' then '7106' + WHEN 'PL' then '3307' + WHEN 'PM' then '8103' + WHEN 'PN' then '1513' + WHEN 'PR' then '8421' + WHEN 'PS' then '4202' + WHEN 'PT' then '3106' + WHEN 'PU' then NULL + WHEN 'PW' then '1407' + WHEN 'PX' then '4202' + WHEN 'PY' then '8212' + WHEN 'QA' then '4212' + WHEN 'RE' then '9218' + WHEN 'RK' then '3216' + WHEN 'RO' then '3211' + WHEN 'RS' then '3215' + WHEN 'RU' then '3308' + WHEN 'RW' then '9221' + WHEN 'SA' then '4213' + WHEN 'SB' then '1303' + WHEN 'SC' then '9223' + WHEN 'SD' then '4105' + WHEN 'SE' then '2407' + WHEN 'SG' then '5205' + WHEN 'SH' then '9222' + WHEN 'SI' then '3212' + WHEN 'SK' then '3311' + WHEN 'SL' then '9127' + WHEN 'SM' then '3107' + WHEN 'SN' then '9126' + WHEN 'SO' then '9224' + WHEN 'SP' then NULL + WHEN 'SQ' then NULL + WHEN 'SR' then '8214' + WHEN 'SS' then '4111' + WHEN 'ST' then '9125' + WHEN 'SU' then NULL + WHEN 'SV' then '8303' + WHEN 'SX' then '3215' + WHEN 'SY' then '4214' + WHEN 'SZ' then '9226' + WHEN 'TC' then '8426' + WHEN 'TD' then '9106' + WHEN 'TF' then '1601' + WHEN 'TG' then '9128' + WHEN 'TH' then '5104' + WHEN 'TJ' then '7207' + WHEN 'TK' then '1507' + WHEN 'TL' then '5206' + WHEN 'TM' then '7208' + WHEN 'TN' then '4106' + WHEN 'TO' then '1508' + WHEN 'TP' then '5206' + WHEN 'TR' then '4215' + WHEN 'TT' then '8425' + WHEN 'TV' then '1511' + WHEN 'TW' then '6108' + WHEN 'TZ' then '9227' + WHEN 'UA' then '3312' + WHEN 'UG' then '9228' + WHEN 'UK' then '3216' + WHEN 'UM' then '8104' + WHEN 'UN' then NULL + WHEN 'US' then '8104' + WHEN 'UY' then '8215' + WHEN 'UZ' then '7211' + WHEN 'VA' then '3103' + WHEN 'VC' then '8424' + WHEN 'VE' then '8216' + WHEN 'VG' then '8427' + WHEN 'VI' then '8428' + WHEN 'VN' then '5105' + WHEN 'VU' then '1304' + WHEN 'WA' then '2106' + WHEN 'WF' then '1512' + WHEN 'WS' then '1505' + WHEN 'XX' then NULL + WHEN 'YD' then '4217' + WHEN 'YE' then '4217' + WHEN 'YM' then '3206' + WHEN 'YT' then '9215' + WHEN 'YU' then NULL + WHEN 'ZA' then '9225' + WHEN 'ZM' then '9231' + WHEN 'ZR' then '9108' + WHEN 'ZW' then '9232' + WHEN 'ZZ' then NULL + ELSE NULL END AS code + ,4 AS source_rank +FROM [IDI_Clean_YYYYMM].[cus_clean].[journey] +WHERE [cus_jou_country_of_birth_code] IS NOT NULL +AND [cus_jou_country_of_birth_code] != 'XX' +GO + +/******************************************************** +NZTA drivers license +'1999 4N V14.0.0' as code_sys +********************************************************/ +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_COB_list] (snz_uid, c_type, code, source_rank) +SELECT DISTINCT snz_uid + ,'COB' AS c_type + ,CASE raw_text + WHEN 'AUSTRALIA' then '1101' + WHEN 'NORFOLK ISLAND' then '1102' + WHEN 'NORFOLK ISLANDS' then '1102' + WHEN 'NEW ZEALAND' then '1201' + WHEN 'NEW CALEDONIA' then '1301' + WHEN 'PAPUA NEW GUINEA' then '1302' + WHEN 'SOLOMON ISLANDS' then '1303' + WHEN 'BRITISH SOLOMON ISLANDS' then '1303' + WHEN 'VANUATU' then '1304' + WHEN 'GUAM' then '1401' + WHEN 'KIRIBATI' then '1402' + WHEN 'CANTON & ENDERBURY' then '1402' + WHEN 'MARSHALL ISLANDS' then '1403' + WHEN 'FEDERATED STATES OF MICRONESIA' then '1404' + WHEN 'NAURU' then '1405' + WHEN 'NORTHERN MARIANA ISLANDS' then '1406' + WHEN 'PALAU' then '1407' + WHEN 'COOK ISLANDS (RAROTONGA)' then '1501' + WHEN 'FIJI' then '1502' + WHEN 'FRENCH POLYNESIA' then '1503' + WHEN 'NIUE ISLANDS' then '1504' + WHEN 'SAMOA' then '1505' + WHEN 'WESTERN SAMOA' then '1505' + WHEN 'SAMOA AMERICAN' then '1506' + WHEN 'AMERICAN SAMOA' then '1506' + WHEN 'TOKELAU' then '1507' + WHEN 'TONGA' then '1508' + WHEN 'TUVALU' then '1511' + WHEN 'WALLIS AND FUTUNA' then '1512' + WHEN 'PITCAIRN' then '1513' + WHEN 'UNITED KINGDOM' then '2100' + WHEN 'CHANNEL ISLANDS' then '2101' + WHEN 'ENGLAND' then '2102' + WHEN 'ISLE OF MAN' then '2103' + WHEN 'NORTHERN IRELAND' then '2104' + WHEN 'SCOTLAND' then '2105' + WHEN 'WALES' then '2106' + WHEN 'IRELAND (EIRE)' then '2201' + WHEN 'REPUBLIC OF IRELAND' then '2300' + WHEN 'AUSTRIA' then '2301' + WHEN 'BELGIUM' then '2302' + WHEN 'FRANCE' then '2303' + WHEN 'GERMANY' then '2304' + WHEN 'LIECHTENSTEIN' then '2305' + WHEN 'LUXEMBOURG' then '2306' + WHEN 'MONACO' then '2307' + WHEN 'NETHERLANDS (HOLLAND)' then '2308' + WHEN 'SWITZERLAND' then '2311' + WHEN 'DENMARK' then '2401' + WHEN 'FAEROE ISLANDS' then '2402' + WHEN 'FINLAND' then '2403' + WHEN 'GREENLAND' then '2404' + WHEN 'ICELAND' then '2405' + WHEN 'NORWAY' then '2406' + WHEN 'SWEDEN' then '2407' + WHEN 'ANDORRA' then '3101' + WHEN 'GIBRALTAR' then '3102' + WHEN 'VATICAN CITY STATE' then '3103' + WHEN 'ITALY' then '3104' + WHEN 'MALTA' then '3105' + WHEN 'PORTUGAL' then '3106' + WHEN 'SAN MARINO' then '3107' + WHEN 'SPAIN' then '3108' + WHEN 'ALBANIA' then '3201' + WHEN 'BOSNIA-HERZEGOVINA' then '3202' + WHEN 'BULGARIA' then '3203' + WHEN 'CROATIA' then '3204' + WHEN 'CYPRUS' then '3205' + WHEN 'GREECE' then '3207' + WHEN 'MOLDOVA' then '3208' + WHEN 'ROMANIA' then '3211' + WHEN 'SLOVENIA' then '3212' + WHEN 'MONTENEGRO' then '3214' + WHEN 'SERBIA' then '3215' + WHEN 'KOSOVO' then '3216' + WHEN 'YUGOSLAVIA' then '3300' + WHEN 'MACEDONIA' then '3300' + WHEN 'BELARUS' then '3301' + WHEN 'CZECH REPUBLIC' then '3302' + WHEN 'CZECHOSLOVAKIA' then '3302' + WHEN 'YUGOSLAVIA (SERBIA & MONTENEGRO)' then '3302' + WHEN 'ESTONIA' then '3303' + WHEN 'HUNGARY' then '3304' + WHEN 'LATVIA' then '3305' + WHEN 'LITHUANIA' then '3306' + WHEN 'POLAND' then '3307' + WHEN 'RUSSIA' then '3308' + WHEN 'SLOVAKIA' then '3311' + WHEN 'UKRAINE' then '3312' + WHEN 'ALGERIA' then '4101' + WHEN 'EGYPT' then '4102' + WHEN 'MOROCCO' then '4104' + WHEN 'SUDAN' then '4105' + WHEN 'TUNISIA' then '4106' + WHEN 'WESTERN SAHARA' then '4107' + WHEN 'SOUTH SUDAN' then '4111' + WHEN 'BAHRAIN' then '4201' + WHEN 'PALESTINE' then '4202' + WHEN 'GAZA STRIP' then '4202' + WHEN 'WEST BANK' then '4202' + WHEN 'IRAN' then '4203' + WHEN 'IRAQ' then '4204' + WHEN 'ISRAEL' then '4205' + WHEN 'JORDAN' then '4206' + WHEN 'KUWAIT' then '4207' + WHEN 'LEBANON' then '4208' + WHEN 'OMAN' then '4211' + WHEN 'QATAR' then '4212' + WHEN 'SAUDI ARABIA' then '4213' + WHEN 'SYRIA' then '4214' + WHEN 'TURKEY' then '4215' + WHEN 'UNITED ARAB EMIRATES' then '4216' + WHEN 'YEMEN' then '4217' + WHEN 'MYANMAR' then '5101' + WHEN 'BURMA' then '5101' + WHEN 'CAMBODIA' then '5102' + WHEN 'KAMPUCHEA' then '5102' + WHEN 'LAOS' then '5103' + WHEN 'THAILAND' then '5104' + WHEN 'VIETNAM' then '5105' + WHEN 'BRUNEI DARUSSALAM' then '5201' + WHEN 'INDONESIA' then '5202' + WHEN 'MALAYSIA' then '5203' + WHEN 'PHILIPPINES' then '5204' + WHEN 'SINGAPORE' then '5205' + WHEN 'TIMOR (PORTUGESE)' then '5206' + WHEN 'CHINA PEOPLE''S REPUBLIC OF' then '6101' + WHEN 'HONG KONG' then '6102' + WHEN 'JAPAN' then '6103' + WHEN 'KOREA (NORTH) DPR' then '6104' + WHEN 'KOREA (SOUTH) RPBC' then '6105' + WHEN 'MACAU' then '6106' + WHEN 'MONGOLIA' then '6107' + WHEN 'TAIWAN' then '6108' + WHEN 'BANGLADESH' then '7101' + WHEN 'BHUTAN' then '7102' + WHEN 'INDIA' then '7103' + WHEN 'MALDIVES' then '7104' + WHEN 'NEPAL' then '7105' + WHEN 'PAKISTAN' then '7106' + WHEN 'SRI LANKA' then '7107' + WHEN 'AFGHANISTAN' then '7201' + WHEN 'ARMENIA' then '7202' + WHEN 'AZERBAIJAN' then '7203' + WHEN 'GEORGIA' then '7204' + WHEN 'KAZAKHSTAN' then '7205' + WHEN 'KYRGYZSTAN' then '7206' + WHEN 'TAJIKISTAN' then '7207' + WHEN 'TURKMENISTAN' then '7208' + WHEN 'UZBEKISTAN' then '7211' + WHEN 'AMERICA UNDEFINED' then '8000' + WHEN 'BERMUDA' then '8101' + WHEN 'CANADA' then '8102' + WHEN 'ST PIERRE AND MIQUELON' then '8103' + WHEN 'UNITED STATES OF AMERICA' then '8104' + WHEN 'ARGENTINA' then '8201' + WHEN 'BOLIVIA' then '8202' + WHEN 'BRAZIL' then '8203' + WHEN 'CHILE' then '8204' + WHEN 'COLOMBIA' then '8205' + WHEN 'ECUADOR' then '8206' + WHEN 'FALKLAND ISLANDS' then '8207' + WHEN 'FRENCH GUIANA' then '8208' + WHEN 'GUYANA' then '8211' + WHEN 'PARAGUAY' then '8212' + WHEN 'PERU' then '8213' + WHEN 'SURINAME' then '8214' + WHEN 'URUGUAY' then '8215' + WHEN 'VENEZUELA' then '8216' + WHEN 'BELIZE' then '8301' + WHEN 'COSTA RICA' then '8302' + WHEN 'EL SALVADOR' then '8303' + WHEN 'GUATEMALA' then '8304' + WHEN 'HONDURAS' then '8305' + WHEN 'MEXICO' then '8306' + WHEN 'NICARAGUA' then '8307' + WHEN 'PANAMA' then '8308' + WHEN 'CARIBBEAN UNSPECIFIED' then '8400' + WHEN 'ANGUILLA' then '8401' + WHEN 'ANTIGUA & BARBUDA' then '8402' + WHEN 'ARUBA' then '8403' + WHEN 'BAHAMAS' then '8404' + WHEN 'BARBADOS' then '8405' + WHEN 'CAYMAN ISLANDS' then '8406' + WHEN 'CUBA' then '8407' + WHEN 'DOMINICA' then '8408' + WHEN 'DOMINICAN REPUBLIC' then '8411' + WHEN 'GRENADA' then '8412' + WHEN 'GUADELOUPE' then '8413' + WHEN 'HAITI' then '8414' + WHEN 'JAMAICA' then '8415' + WHEN 'MARTINIQUE' then '8416' + WHEN 'MONTSERRAT' then '8417' + WHEN 'PUERTO RICO' then '8421' + WHEN 'ST KITT-NEVIS' then '8422' + WHEN 'ST LUCIA' then '8423' + WHEN 'ST VINCENT AND THE GRENADINES' then '8424' + WHEN 'TRINIDAD AND TOBAGO' then '8425' + WHEN 'TURKS AND CAICOS ISLANDS' then '8426' + WHEN 'VIRGIN ISLANDS BRITISH' then '8427' + WHEN 'VIRGIN ISLANDS UNITED STATES' then '8428' + WHEN 'BENIN' then '9101' + WHEN 'BURKINA FASO' then '9102' + WHEN 'CAMEROON REPUBLIC OF' then '9103' + WHEN 'CENTRAL AFRICAN REPUBLIC' then '9105' + WHEN 'CHAD' then '9106' + WHEN 'CONGO' then '9107' + WHEN 'REPUBLIC OF CONGO' then '9108' + WHEN 'DEM REPUBLIC OF THE CONGO' then '9108' + WHEN 'COTE D''IVOIRE' then '9111' + WHEN 'EQUATORIAL GUINEA' then '9112' + WHEN 'GABON' then '9113' + WHEN 'GAMBIA' then '9114' + WHEN 'GHANA' then '9115' + WHEN 'GUINEA' then '9116' + WHEN 'GUINEA-BISSAU' then '9117' + WHEN 'LIBERIA' then '9118' + WHEN 'MALI' then '9121' + WHEN 'MAURITANIA' then '9122' + WHEN 'NIGER' then '9123' + WHEN 'NIGERIA' then '9124' + WHEN 'SAO TOME AND PRINCIPE' then '9125' + WHEN 'SENEGAL' then '9126' + WHEN 'SIERRA LEONE' then '9127' + WHEN 'TOGO' then '9128' + WHEN 'ANGOLA' then '9201' + WHEN 'BOTSWANA' then '9202' + WHEN 'BURUNDI' then '9203' + WHEN 'COMOROS' then '9204' + WHEN 'DJIBOUTI' then '9205' + WHEN 'ERITREA' then '9206' + WHEN 'ETHIOPIA' then '9207' + WHEN 'KENYA' then '9208' + WHEN 'LESOTHO' then '9211' + WHEN 'MADAGASCAR' then '9212' + WHEN 'MALAWI' then '9213' + WHEN 'MAURITIUS' then '9214' + WHEN 'MAYOTTE' then '9215' + WHEN 'MOZAMBIQUE' then '9216' + WHEN 'NAMIBIA' then '9217' + WHEN 'REUNION' then '9218' + WHEN 'RWANDA' then '9221' + WHEN 'ST HELENA' then '9222' + WHEN 'SEYCHELLES' then '9223' + WHEN 'SOMALIA' then '9224' + WHEN 'SOUTH AFRICA' then '9225' + WHEN 'TANZANIA' then '9227' + WHEN 'UGANDA' then '9228' + WHEN 'ZAMBIA' then '9231' + WHEN 'ZIMBABWE' then '9232' + WHEN 'UNKNOWN' then '9999' + WHEN 'LIBYAN ARAB REPUBLIC' then NULL + WHEN 'SWAZILAND' then NULL + WHEN 'PACIFIC ISLANDS (UNSPECIFIED)' then NULL + WHEN 'NETHERLANDS ANTILLES' then NULL + WHEN 'US MISC PACIFIC ISLANDS' then NULL + WHEN 'COCOS ISLANDS' then NULL + WHEN 'IVORY COAST' then NULL + WHEN 'LAO REPUBLIC' then NULL + WHEN 'SOUTHERN RHODESIA' then NULL + WHEN 'CHILEAN ANTARCTIC TERRITORY' then NULL + WHEN 'NEUTRAL ZONE' then NULL + WHEN 'KHMER REPUBLIC' then NULL + WHEN 'ZAIRE' then NULL + WHEN 'OTHER NORTH AFRICA' then NULL + WHEN 'BRITISH ANTARCTIC TERRITORY' then NULL + WHEN 'BRIT INDIAN OCEAN' then NULL + WHEN 'AFARS & ISSAS' then NULL + WHEN 'BYELORUSSIAN SSR' then NULL + WHEN 'CAPE VERDE' then NULL + WHEN 'OTHER POLYNESIA (EXCL HAWAII)' then NULL + WHEN 'ADELIE LAND (FRANCE)' then NULL + WHEN 'ARGENTINIAN ANTARCTIC' then NULL + WHEN 'AUSTRALIAN ANTARCTIC TERRITORY' then NULL + WHEN 'OTHER SOUTHERN & EAST AFRICA' then NULL + WHEN 'UPPER VOLTA' then NULL + ELSE NULL END AS code + ,5 AS source_rank +FROM ( + SELECT snz_uid, nzta_hist_birth_country_text AS raw_text + FROM [IDI_Clean_YYYYMM].[nzta_clean].[dlr_historic] + + UNION ALL + + SELECT snz_uid, nzta_dlr_birth_country_text AS raw_text + FROM [IDI_Clean_YYYYMM].[nzta_clean].[drivers_licence_register] +) AS a +GO + +/******************************************************** +DOL +'1999 4A V15.0.0' as raw_code_sys +'1999 4N V14.0.0' as code_sys +********************************************************/ +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_COB_list] (snz_uid, c_type, code, source_rank) +SELECT DISTINCT [snz_uid] + ,'COB' as c_type + ,CASE [dol_mid_birth_country_code] + WHEN 'AU' then '1101' + WHEN 'NF' then '1102' + WHEN 'NZ' then '1201' + WHEN 'NC' then '1301' + WHEN 'PG' then '1302' + WHEN 'SB' then '1303' + WHEN 'VU' then '1304' + WHEN 'GU' then '1401' + WHEN 'KI' then '1402' + WHEN 'MH' then '1403' + WHEN 'FM' then '1404' + WHEN 'NR' then '1405' + WHEN 'MP' then '1406' + WHEN 'PW' then '1407' + WHEN 'CK' then '1501' + WHEN 'FJ' then '1502' + WHEN 'PF' then '1503' + WHEN 'NU' then '1504' + WHEN 'WS' then '1505' + WHEN 'AS' then '1506' + WHEN 'TK' then '1507' + WHEN 'TO' then '1508' + WHEN 'TV' then '1511' + WHEN 'WF' then '1512' + WHEN 'PN' then '1513' + WHEN 'AQ' then '1601' + WHEN 'TF' then '1601' + WHEN 'GB' then '2100' + WHEN 'IM' then '2103' + WHEN 'ND' then '2104' + WHEN 'HI' then '2105' + WHEN 'WA' then '2106' + WHEN 'IE' then '2201' + WHEN 'AT' then '2301' + WHEN 'BE' then '2302' + WHEN 'FR' then '2303' + WHEN 'DE' then '2304' + WHEN 'LI' then '2305' + WHEN 'LU' then '2306' + WHEN 'MC' then '2307' + WHEN 'NL' then '2308' + WHEN 'CH' then '2311' + WHEN 'DK' then '2401' + WHEN 'FO' then '2402' + WHEN 'FI' then '2403' + WHEN 'GL' then '2404' + WHEN 'IS' then '2405' + WHEN 'NO' then '2406' + WHEN 'SE' then '2407' + WHEN 'AD' then '3101' + WHEN 'GI' then '3102' + WHEN 'VA' then '3103' + WHEN 'IT' then '3104' + WHEN 'MT' then '3105' + WHEN 'PT' then '3106' + WHEN 'SM' then '3107' + WHEN 'ES' then '3108' + WHEN 'AL' then '3201' + WHEN 'BA' then '3202' + WHEN 'BG' then '3203' + WHEN 'HR' then '3204' + WHEN 'CY' then '3205' + WHEN 'GR' then '3207' + WHEN 'MD' then '3208' + WHEN 'RO' then '3211' + WHEN 'SI' then '3212' + WHEN 'ME' then '3214' + WHEN 'RS' then '3215' + WHEN 'KX' then '3216' + WHEN 'BY' then '3301' + WHEN 'CZ' then '3302' + WHEN 'EE' then '3303' + WHEN 'EN' then '3303' + WHEN 'HU' then '3304' + WHEN 'LV' then '3305' + WHEN 'LT' then '3306' + WHEN 'PL' then '3307' + WHEN 'RU' then '3308' + WHEN 'SK' then '3311' + WHEN 'UA' then '3312' + WHEN 'DZ' then '4101' + WHEN 'EG' then '4102' + WHEN 'LY' then '4103' + WHEN 'MA' then '4104' + WHEN 'SD' then '4105' + WHEN 'TN' then '4106' + WHEN 'EH' then '4107' + WHEN 'SS' then '4111' + WHEN 'BH' then '4201' + WHEN 'IR' then '4203' + WHEN 'IQ' then '4204' + WHEN 'IL' then '4205' + WHEN 'JO' then '4206' + WHEN 'KW' then '4207' + WHEN 'LB' then '4208' + WHEN 'OM' then '4211' + WHEN 'QA' then '4212' + WHEN 'SA' then '4213' + WHEN 'SY' then '4214' + WHEN 'TR' then '4215' + WHEN 'AE' then '4216' + WHEN 'YE' then '4217' + WHEN 'MM' then '5101' + WHEN 'KH' then '5102' + WHEN 'LA' then '5103' + WHEN 'TH' then '5104' + WHEN 'VN' then '5105' + WHEN 'BN' then '5201' + WHEN 'ID' then '5202' + WHEN 'MY' then '5203' + WHEN 'PH' then '5204' + WHEN 'SG' then '5205' + WHEN 'TL' then '5206' + WHEN 'CN' then '6101' + WHEN 'HK' then '6102' + WHEN 'JP' then '6103' + WHEN 'KP' then '6104' + WHEN 'KR' then '6105' + WHEN 'MO' then '6106' + WHEN 'MN' then '6107' + WHEN 'TW' then '6108' + WHEN 'BD' then '7101' + WHEN 'BT' then '7102' + WHEN 'IN' then '7103' + WHEN 'MV' then '7104' + WHEN 'NP' then '7105' + WHEN 'PK' then '7106' + WHEN 'LK' then '7107' + WHEN 'AF' then '7201' + WHEN 'AM' then '7202' + WHEN 'AZ' then '7203' + WHEN 'GE' then '7204' + WHEN 'KZ' then '7205' + WHEN 'KG' then '7206' + WHEN 'TJ' then '7207' + WHEN 'TM' then '7208' + WHEN 'UZ' then '7211' + WHEN 'BM' then '8101' + WHEN 'CA' then '8102' + WHEN 'PM' then '8103' + WHEN 'US' then '8104' + WHEN 'AR' then '8201' + WHEN 'BO' then '8202' + WHEN 'BR' then '8203' + WHEN 'CL' then '8204' + WHEN 'CO' then '8205' + WHEN 'EC' then '8206' + WHEN 'FK' then '8207' + WHEN 'GF' then '8208' + WHEN 'GY' then '8211' + WHEN 'PY' then '8212' + WHEN 'PE' then '8213' + WHEN 'SR' then '8214' + WHEN 'UY' then '8215' + WHEN 'VE' then '8216' + WHEN 'BZ' then '8301' + WHEN 'CR' then '8302' + WHEN 'SV' then '8303' + WHEN 'GT' then '8304' + WHEN 'HN' then '8305' + WHEN 'MX' then '8306' + WHEN 'NI' then '8307' + WHEN 'PA' then '8308' + WHEN 'AI' then '8401' + WHEN 'AG' then '8402' + WHEN 'AW' then '8403' + WHEN 'BS' then '8404' + WHEN 'BB' then '8405' + WHEN 'KY' then '8406' + WHEN 'CU' then '8407' + WHEN 'DM' then '8408' + WHEN 'DO' then '8411' + WHEN 'GD' then '8412' + WHEN 'GJ' then '8412' + WHEN 'GP' then '8413' + WHEN 'HT' then '8414' + WHEN 'JM' then '8415' + WHEN 'MQ' then '8416' + WHEN 'MS' then '8417' + WHEN 'PR' then '8421' + WHEN 'KN' then '8422' + WHEN 'LC' then '8423' + WHEN 'VC' then '8424' + WHEN 'TT' then '8425' + WHEN 'TC' then '8426' + WHEN 'VG' then '8427' + WHEN 'VI' then '8428' + WHEN 'CW' then '8433' + WHEN 'BJ' then '9101' + WHEN 'BF' then '9102' + WHEN 'CM' then '9103' + WHEN 'CV' then '9104' + WHEN 'CF' then '9105' + WHEN 'TD' then '9106' + WHEN 'CG' then '9107' + WHEN 'CD' then '9108' + WHEN 'ZR' then '9108' + WHEN 'CI' then '9111' + WHEN 'GQ' then '9112' + WHEN 'GA' then '9113' + WHEN 'GM' then '9114' + WHEN 'GH' then '9115' + WHEN 'GN' then '9116' + WHEN 'GW' then '9117' + WHEN 'LR' then '9118' + WHEN 'ML' then '9121' + WHEN 'MR' then '9122' + WHEN 'NE' then '9123' + WHEN 'NG' then '9124' + WHEN 'ST' then '9125' + WHEN 'SN' then '9126' + WHEN 'SL' then '9127' + WHEN 'TG' then '9128' + WHEN 'AO' then '9201' + WHEN 'BW' then '9202' + WHEN 'BI' then '9203' + WHEN 'KM' then '9204' + WHEN 'DJ' then '9205' + WHEN 'ER' then '9206' + WHEN 'ET' then '9207' + WHEN 'KE' then '9208' + WHEN 'LS' then '9211' + WHEN 'MG' then '9212' + WHEN 'MW' then '9213' + WHEN 'MU' then '9214' + WHEN 'YT' then '9215' + WHEN 'MZ' then '9216' + WHEN 'NA' then '9217' + WHEN 'RE' then '9218' + WHEN 'RW' then '9221' + WHEN 'SH' then '9222' + WHEN 'SC' then '9223' + WHEN 'SO' then '9224' + WHEN 'ZA' then '9225' + WHEN 'TZ' then '9227' + WHEN 'UG' then '9228' + WHEN 'ZM' then '9231' + WHEN 'ZW' then '9232' + WHEN '01' then '9999' + WHEN 'NULL' then '9999' + WHEN '02' then '9999' + WHEN '03' then '9999' + WHEN 'BQ' then '9999' + WHEN 'BL' then '9999' + WHEN 'NT' then '9999' + WHEN 'XX' then '9999' + WHEN 'SU' then '3300' + WHEN 'YU' then '3200' + WHEN 'MK' then '3200' + WHEN 'UN' then '9999' + WHEN 'PS' then '4202' + WHEN 'DD' then '2304' + WHEN 'ZZ' then '3302' + WHEN 'RK' then '3200' + WHEN 'CS' then '3200' + WHEN 'AN' then '2308' + WHEN 'TP' then '5206' + WHEN 'CX' then '1402' + WHEN 'BU' then '5101' + WHEN 'UK' then '9999' + WHEN 'SP' then '9999' + WHEN 'PC' then '9999' + WHEN 'SQ' then '9999' + WHEN 'CC' then '1101' + WHEN 'EU' then '9999' + WHEN 'SX' then '3200' + WHEN 'GS' then '2102' + WHEN 'UM' then '8104' + WHEN 'PU' then '8104' + WHEN 'CT' then '9999' + WHEN 'HG' then '9999' + WHEN 'PX' then '4202' + WHEN 'YD' then '4217' + WHEN 'IO' then '9999' + WHEN 'SZ' then '9226' + WHEN 'YM' then '3200' + ELSE NULL END AS code + ,6 AS source_rank +FROM [IDI_Clean_YYYYMM].[dol_clean].[movement_identities] +GO + +/******************************************************** +MOE enrollment - first COC by date +'XXXX' as raw_code_sys +********************************************************/ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_moe_enrollment] +GO + +SELECT DISTINCT [snz_uid] + ,moe_enr_prog_start_date + ,[moe_enr_country_code] +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_moe_enrollment] +FROM [IDI_Clean_YYYYMM].[moe_clean].[enrolment] +WHERE [moe_enr_country_code] != '999' +AND [moe_enr_country_code] NOT IN ('XXX','999','000') +AND [moe_enr_country_code] IS NOT NULL +GO + +CREATE NONCLUSTERED INDEX my_index ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_moe_enrollment] (snz_uid) +GO + +WITH date_ranked AS ( + SELECT * + ,RANK() OVER (PARTITION BY snz_uid ORDER BY moe_enr_prog_start_date) AS date_rank + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_moe_enrollment] +) +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_COB_list] (snz_uid, c_type, code, source_rank) +SELECT snz_uid + ,'COC' AS c_type + ,CASE [moe_enr_country_code] + WHEN '0' then '9999' + WHEN '102' then '9999' + WHEN '110' then '9999' + WHEN '999' then '9999' + WHEN 'ABW' then '2308' + WHEN 'AFG' then '7201' + WHEN 'AGO' then '9201' + WHEN 'ALB' then '3201' + WHEN 'AME' then '9999' + WHEN 'AND' then '3101' + WHEN 'ANT' then '2308' + WHEN 'ARE' then '4216' + WHEN 'ARG' then '8201' + WHEN 'ARM' then '7202' + WHEN 'ASM' then '1506' + WHEN 'ATG' then '8402' + WHEN 'AUL' then '9999' + WHEN 'AUS' then '1101' + WHEN 'AUT' then '2301' + WHEN 'AZE' then '7203' + WHEN 'BDI' then '9203' + WHEN 'BEL' then '2302' + WHEN 'BEN' then '9101' + WHEN 'BFA' then '9102' + WHEN 'BGD' then '7101' + WHEN 'BGR' then '3203' + WHEN 'BHR' then '4201' + WHEN 'BHS' then '8404' + WHEN 'BIH' then '3202' + WHEN 'BLR' then '3301' + WHEN 'BLZ' then '8301' + WHEN 'BOL' then '8202' + WHEN 'BOZ' then '3200' + WHEN 'BRA' then '8203' + WHEN 'BRB' then '8405' + WHEN 'BRN' then '5201' + WHEN 'BTN' then '7102' + WHEN 'BUL' then '9999' + WHEN 'BWA' then '9202' + WHEN 'CAF' then '9105' + WHEN 'CAN' then '8102' + WHEN 'CHE' then '2311' + WHEN 'CHI' then '9999' + WHEN 'CHL' then '8204' + WHEN 'CHN' then '6101' + WHEN 'CIV' then '9111' + WHEN 'CMR' then '9103' + WHEN 'COD' then '9108' + WHEN 'COG' then '9107' + WHEN 'COK' then '1501' + WHEN 'COL' then '8205' + WHEN 'COM' then '9204' + WHEN 'CRI' then '8302' + WHEN 'CSK' then '3302' + WHEN 'CUB' then '8407' + WHEN 'CYP' then '3205' + WHEN 'CZE' then '3302' + WHEN 'DEU' then '2304' + WHEN 'DJI' then '9205' + WHEN 'DMA' then '8408' + WHEN 'DNK' then '2401' + WHEN 'DOM' then '8411' + WHEN 'DZA' then '4101' + WHEN 'ECU' then '8206' + WHEN 'EGY' then '4102' + WHEN 'ENG' then '2102' + WHEN 'ERI' then '9206' + WHEN 'ESH' then '4107' + WHEN 'ESP' then '3108' + WHEN 'EST' then '3303' + WHEN 'ETH' then '9207' + WHEN 'FIN' then '2403' + WHEN 'FJI' then '1502' + WHEN 'FRA' then '2303' + WHEN 'FSM' then '1404' + WHEN 'GAB' then '9113' + WHEN 'GBR' then '2100' + WHEN 'GEO' then '7204' + WHEN 'GHA' then '9115' + WHEN 'GIN' then '9116' + WHEN 'GJS' then '2101' + WHEN 'GMB' then '9114' + WHEN 'GNQ' then '9112' + WHEN 'GRC' then '3207' + WHEN 'GRD' then '8412' + WHEN 'GRE' then '9999' + WHEN 'GTM' then '8304' + WHEN 'GUY' then '8211' + WHEN 'HIL' then '2105' + WHEN 'HKG' then '6102' + WHEN 'HND' then '8305' + WHEN 'HOL' then '9999' + WHEN 'HRV' then '3204' + WHEN 'HTI' then '8414' + WHEN 'HUN' then '3304' + WHEN 'IDN' then '5202' + WHEN 'IND' then '7103' + WHEN 'IRL' then '2201' + WHEN 'IRN' then '4203' + WHEN 'IRQ' then '4204' + WHEN 'ISL' then '2405' + WHEN 'ISR' then '4205' + WHEN 'ITA' then '3104' + WHEN 'JAM' then '8415' + WHEN 'JAP' then '9999' + WHEN 'JOR' then '4206' + WHEN 'JPN' then '6103' + WHEN 'KAZ' then '7205' + WHEN 'KEN' then '9208' + WHEN 'KGZ' then '7206' + WHEN 'KHM' then '5102' + WHEN 'KIR' then '1402' + WHEN 'KNA' then '8422' + WHEN 'KOR' then '6105' + WHEN 'KUR' then '9999' + WHEN 'KWT' then '4207' + WHEN 'LAO' then '5103' + WHEN 'LBN' then '4208' + WHEN 'LBR' then '9118' + WHEN 'LBY' then '4103' + WHEN 'LCA' then '8423' + WHEN 'LIE' then '2305' + WHEN 'LKA' then '7107' + WHEN 'LSO' then '9211' + WHEN 'LTU' then '3306' + WHEN 'LUX' then '2306' + WHEN 'LVA' then '3305' + WHEN 'MAC' then '6106' + WHEN 'MAL' then '9999' + WHEN 'MAR' then '4104' + WHEN 'MCD' then '3200' + WHEN 'MCO' then '2307' + WHEN 'MDA' then '3208' + WHEN 'MDG' then '9212' + WHEN 'MDV' then '7104' + WHEN 'MEX' then '8306' + WHEN 'MHL' then '1403' + WHEN 'MKD' then '3200' + WHEN 'MLI' then '9121' + WHEN 'MLT' then '3105' + WHEN 'MMR' then '5101' + WHEN 'MNE' then '3214' + WHEN 'MNG' then '6107' + WHEN 'MOZ' then '9216' + WHEN 'MRT' then '9122' + WHEN 'MSR' then '2100' + WHEN 'MUS' then '9214' + WHEN 'MWI' then '9213' + WHEN 'MYS' then '5203' + WHEN 'MZL' then '9999' + WHEN 'NAM' then '9217' + WHEN 'NCL' then '1301' + WHEN 'NER' then '9123' + WHEN 'NET' then '9999' + WHEN 'NEW' then '9999' + WHEN 'NGA' then '9124' + WHEN 'NIC' then '8307' + WHEN 'NID' then '2104' + WHEN 'NIU' then '1504' + WHEN 'NLD' then '2308' + WHEN 'NOR' then '2406' + WHEN 'NPL' then '7105' + WHEN 'NRU' then '1405' + WHEN 'NULL' then '9999' + WHEN 'NZL' then '1201' + WHEN 'OMN' then '4211' + WHEN 'PAK' then '7106' + WHEN 'PAN' then '8308' + WHEN 'PER' then '8213' + WHEN 'PHI' then '9999' + WHEN 'PHL' then '5204' + WHEN 'PLW' then '1407' + WHEN 'PNG' then '1302' + WHEN 'POL' then '3307' + WHEN 'PRK' then '6104' + WHEN 'PRT' then '3106' + WHEN 'PRY' then '8212' + WHEN 'PSE' then '4202' + WHEN 'PYF' then '1503' + WHEN 'QAT' then '4212' + WHEN 'ROM' then '3211' + WHEN 'ROU' then '3211' + WHEN 'RUS' then '3308' + WHEN 'RWA' then '9221' + WHEN 'SA' then '9999' + WHEN 'SAU' then '4213' + WHEN 'SCG' then '3200' + WHEN 'SDN' then '4105' + WHEN 'SEN' then '9126' + WHEN 'SGP' then '5205' + WHEN 'SIN' then '9999' + WHEN 'SLB' then '1303' + WHEN 'SLE' then '9127' + WHEN 'SLV' then '8303' + WHEN 'SMR' then '3107' + WHEN 'SOM' then '9224' + WHEN 'SOU' then '9999' + WHEN 'SRB' then '3215' + WHEN 'SSD' then '4111' + WHEN 'STP' then '9125' + WHEN 'SUR' then '8214' + WHEN 'SVK' then '3311' + WHEN 'SVN' then '3212' + WHEN 'SWE' then '2407' + WHEN 'SWI' then '9999' + WHEN 'SWZ' then '9226' + WHEN 'SYC' then '9223' + WHEN 'SYR' then '4214' + WHEN 'TCD' then '9106' + WHEN 'TGO' then '9128' + WHEN 'THA' then '5104' + WHEN 'TJK' then '7207' + WHEN 'TKL' then '1507' + WHEN 'TKM' then '7208' + WHEN 'TLS' then '5206' + WHEN 'TON' then '1508' + WHEN 'TTO' then '8425' + WHEN 'TUN' then '4106' + WHEN 'TUR' then '4215' + WHEN 'TUV' then '1511' + WHEN 'TWN' then '6108' + WHEN 'TZA' then '9227' + WHEN 'UGA' then '9228' + WHEN 'UK' then '9999' + WHEN 'UKR' then '3312' + WHEN 'UMI' then '8104' + WHEN 'UNI' then '9999' + WHEN 'UNR' then '9999' + WHEN 'URY' then '8215' + WHEN 'USA' then '8104' + WHEN 'UZB' then '7211' + WHEN 'VAT' then '3103' + WHEN 'VCT' then '8424' + WHEN 'VEN' then '8216' + WHEN 'VNM' then '5105' + WHEN 'VUT' then '1304' + WHEN 'WAL' then '2106' + WHEN 'WSM' then '1505' + WHEN 'XHI' then '9999' + WHEN 'XXX' then '9999' + WHEN 'YEM' then '4217' + WHEN 'YUG' then '3200' + WHEN 'ZAF' then '9225' + WHEN 'ZMB' then '9231' + WHEN 'ZWE' then '9232' + ELSE NULL END AS code + ,7 AS source_rank +FROM date_ranked +WHERE date_rank = 1 +GO + +/******************************************************** +MOE school enrolment - first COC by date +'XXXX' as raw_code_sys +********************************************************/ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_moe_school_roll] +GO + +SELECT DISTINCT [snz_moe_uid] + ,[CollectionDate] + ,[CountryOfCitizenship] +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_moe_school_roll] +FROM ( + + SELECT DISTINCT [snz_moe_uid],[CollectionDate],[CountryOfCitizenship] FROM [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2019] + UNION ALL + SELECT DISTINCT [snz_moe_uid],[CollectionDate],[CountryOfCitizenship] FROM [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2018] + UNION ALL + SELECT DISTINCT [snz_moe_uid],[CollectionDate],[CountryOfCitizenship] FROM [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2017] + UNION ALL + SELECT DISTINCT [snz_moe_uid],[CollectionDate],[CountryOfCitizenship] FROM [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2016] + UNION ALL + SELECT DISTINCT [snz_moe_uid],[CollectionDate],[CountryOfCitizenship] FROM [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2015] + UNION ALL + SELECT DISTINCT [snz_moe_uid],[CollectionDate],[CountryOfCitizenship] FROM [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2014] + UNION ALL + SELECT DISTINCT [snz_moe_uid],[CollectionDate],[CountryOfCitizenship] FROM [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2013] + UNION ALL + SELECT DISTINCT [snz_moe_uid],[CollectionDate],[CountryOfCitizenship] FROM [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2012] + UNION ALL + SELECT DISTINCT [snz_moe_uid],[CollectionDate],[CountryOfCitizenship] FROM [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2011] + UNION ALL + SELECT DISTINCT [snz_moe_uid],[CollectionDate],[CountryOfCitizenship] FROM [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2010] + UNION ALL + SELECT DISTINCT [snz_moe_uid],[CollectionDate],[CountryOfCitizenship] FROM [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2009] + UNION ALL + SELECT DISTINCT [snz_moe_uid],[CollectionDate],[CountryOfCitizenship] FROM [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2008] + UNION ALL + SELECT DISTINCT [snz_moe_uid],[CollectionDate],[CountryOfCitizenship] FROM [IDI_Adhoc].[clean_read_MOE].[School_Roll_Return_2007] + +) AS k +WHERE [CountryOfCitizenship] NOT IN ('XXX','999','000') +AND [CountryOfCitizenship] IS NOT NULL +GO + +CREATE NONCLUSTERED INDEX my_index ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_moe_school_roll] ([snz_moe_uid]) +GO + +WITH date_ranked AS ( + SELECT * + ,RANK() OVER (PARTITION BY [snz_moe_uid] ORDER BY [CollectionDate]) AS date_rank + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_moe_school_roll] +) +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_COB_list] (snz_uid, c_type, code, source_rank) +SELECT snz_uid + ,'COC' AS c_type + ,CASE [CountryOfCitizenship] + WHEN '0' then '9999' + WHEN '102' then '9999' + WHEN '110' then '9999' + WHEN '999' then '9999' + WHEN 'ABW' then '2308' + WHEN 'AFG' then '7201' + WHEN 'AGO' then '9201' + WHEN 'ALB' then '3201' + WHEN 'AME' then '9999' + WHEN 'AND' then '3101' + WHEN 'ANT' then '2308' + WHEN 'ARE' then '4216' + WHEN 'ARG' then '8201' + WHEN 'ARM' then '7202' + WHEN 'ASM' then '1506' + WHEN 'ATG' then '8402' + WHEN 'AUL' then '9999' + WHEN 'AUS' then '1101' + WHEN 'AUT' then '2301' + WHEN 'AZE' then '7203' + WHEN 'BDI' then '9203' + WHEN 'BEL' then '2302' + WHEN 'BEN' then '9101' + WHEN 'BFA' then '9102' + WHEN 'BGD' then '7101' + WHEN 'BGR' then '3203' + WHEN 'BHR' then '4201' + WHEN 'BHS' then '8404' + WHEN 'BIH' then '3202' + WHEN 'BLR' then '3301' + WHEN 'BLZ' then '8301' + WHEN 'BOL' then '8202' + WHEN 'BOZ' then '3200' + WHEN 'BRA' then '8203' + WHEN 'BRB' then '8405' + WHEN 'BRN' then '5201' + WHEN 'BTN' then '7102' + WHEN 'BUL' then '9999' + WHEN 'BWA' then '9202' + WHEN 'CAF' then '9105' + WHEN 'CAN' then '8102' + WHEN 'CHE' then '2311' + WHEN 'CHI' then '9999' + WHEN 'CHL' then '8204' + WHEN 'CHN' then '6101' + WHEN 'CIV' then '9111' + WHEN 'CMR' then '9103' + WHEN 'COD' then '9108' + WHEN 'COG' then '9107' + WHEN 'COK' then '1501' + WHEN 'COL' then '8205' + WHEN 'COM' then '9204' + WHEN 'CRI' then '8302' + WHEN 'CSK' then '3302' + WHEN 'CUB' then '8407' + WHEN 'CYP' then '3205' + WHEN 'CZE' then '3302' + WHEN 'DEU' then '2304' + WHEN 'DJI' then '9205' + WHEN 'DMA' then '8408' + WHEN 'DNK' then '2401' + WHEN 'DOM' then '8411' + WHEN 'DZA' then '4101' + WHEN 'ECU' then '8206' + WHEN 'EGY' then '4102' + WHEN 'ENG' then '2102' + WHEN 'ERI' then '9206' + WHEN 'ESH' then '4107' + WHEN 'ESP' then '3108' + WHEN 'EST' then '3303' + WHEN 'ETH' then '9207' + WHEN 'FIN' then '2403' + WHEN 'FJI' then '1502' + WHEN 'FRA' then '2303' + WHEN 'FSM' then '1404' + WHEN 'GAB' then '9113' + WHEN 'GBR' then '2100' + WHEN 'GEO' then '7204' + WHEN 'GHA' then '9115' + WHEN 'GIN' then '9116' + WHEN 'GJS' then '2101' + WHEN 'GMB' then '9114' + WHEN 'GNQ' then '9112' + WHEN 'GRC' then '3207' + WHEN 'GRD' then '8412' + WHEN 'GRE' then '9999' + WHEN 'GTM' then '8304' + WHEN 'GUY' then '8211' + WHEN 'HIL' then '2105' + WHEN 'HKG' then '6102' + WHEN 'HND' then '8305' + WHEN 'HOL' then '9999' + WHEN 'HRV' then '3204' + WHEN 'HTI' then '8414' + WHEN 'HUN' then '3304' + WHEN 'IDN' then '5202' + WHEN 'IND' then '7103' + WHEN 'IRL' then '2201' + WHEN 'IRN' then '4203' + WHEN 'IRQ' then '4204' + WHEN 'ISL' then '2405' + WHEN 'ISR' then '4205' + WHEN 'ITA' then '3104' + WHEN 'JAM' then '8415' + WHEN 'JAP' then '9999' + WHEN 'JOR' then '4206' + WHEN 'JPN' then '6103' + WHEN 'KAZ' then '7205' + WHEN 'KEN' then '9208' + WHEN 'KGZ' then '7206' + WHEN 'KHM' then '5102' + WHEN 'KIR' then '1402' + WHEN 'KNA' then '8422' + WHEN 'KOR' then '6105' + WHEN 'KUR' then '9999' + WHEN 'KWT' then '4207' + WHEN 'LAO' then '5103' + WHEN 'LBN' then '4208' + WHEN 'LBR' then '9118' + WHEN 'LBY' then '4103' + WHEN 'LCA' then '8423' + WHEN 'LIE' then '2305' + WHEN 'LKA' then '7107' + WHEN 'LSO' then '9211' + WHEN 'LTU' then '3306' + WHEN 'LUX' then '2306' + WHEN 'LVA' then '3305' + WHEN 'MAC' then '6106' + WHEN 'MAL' then '9999' + WHEN 'MAR' then '4104' + WHEN 'MCD' then '3200' + WHEN 'MCO' then '2307' + WHEN 'MDA' then '3208' + WHEN 'MDG' then '9212' + WHEN 'MDV' then '7104' + WHEN 'MEX' then '8306' + WHEN 'MHL' then '1403' + WHEN 'MKD' then '3200' + WHEN 'MLI' then '9121' + WHEN 'MLT' then '3105' + WHEN 'MMR' then '5101' + WHEN 'MNE' then '3214' + WHEN 'MNG' then '6107' + WHEN 'MOZ' then '9216' + WHEN 'MRT' then '9122' + WHEN 'MSR' then '2100' + WHEN 'MUS' then '9214' + WHEN 'MWI' then '9213' + WHEN 'MYS' then '5203' + WHEN 'MZL' then '9999' + WHEN 'NAM' then '9217' + WHEN 'NCL' then '1301' + WHEN 'NER' then '9123' + WHEN 'NET' then '9999' + WHEN 'NEW' then '9999' + WHEN 'NGA' then '9124' + WHEN 'NIC' then '8307' + WHEN 'NID' then '2104' + WHEN 'NIU' then '1504' + WHEN 'NLD' then '2308' + WHEN 'NOR' then '2406' + WHEN 'NPL' then '7105' + WHEN 'NRU' then '1405' + WHEN 'NULL' then '9999' + WHEN 'NZL' then '1201' + WHEN 'OMN' then '4211' + WHEN 'PAK' then '7106' + WHEN 'PAN' then '8308' + WHEN 'PER' then '8213' + WHEN 'PHI' then '9999' + WHEN 'PHL' then '5204' + WHEN 'PLW' then '1407' + WHEN 'PNG' then '1302' + WHEN 'POL' then '3307' + WHEN 'PRK' then '6104' + WHEN 'PRT' then '3106' + WHEN 'PRY' then '8212' + WHEN 'PSE' then '4202' + WHEN 'PYF' then '1503' + WHEN 'QAT' then '4212' + WHEN 'ROM' then '3211' + WHEN 'ROU' then '3211' + WHEN 'RUS' then '3308' + WHEN 'RWA' then '9221' + WHEN 'SA' then '9999' + WHEN 'SAU' then '4213' + WHEN 'SCG' then '3200' + WHEN 'SDN' then '4105' + WHEN 'SEN' then '9126' + WHEN 'SGP' then '5205' + WHEN 'SIN' then '9999' + WHEN 'SLB' then '1303' + WHEN 'SLE' then '9127' + WHEN 'SLV' then '8303' + WHEN 'SMR' then '3107' + WHEN 'SOM' then '9224' + WHEN 'SOU' then '9999' + WHEN 'SRB' then '3215' + WHEN 'SSD' then '4111' + WHEN 'STP' then '9125' + WHEN 'SUR' then '8214' + WHEN 'SVK' then '3311' + WHEN 'SVN' then '3212' + WHEN 'SWE' then '2407' + WHEN 'SWI' then '9999' + WHEN 'SWZ' then '9226' + WHEN 'SYC' then '9223' + WHEN 'SYR' then '4214' + WHEN 'TCD' then '9106' + WHEN 'TGO' then '9128' + WHEN 'THA' then '5104' + WHEN 'TJK' then '7207' + WHEN 'TKL' then '1507' + WHEN 'TKM' then '7208' + WHEN 'TLS' then '5206' + WHEN 'TON' then '1508' + WHEN 'TTO' then '8425' + WHEN 'TUN' then '4106' + WHEN 'TUR' then '4215' + WHEN 'TUV' then '1511' + WHEN 'TWN' then '6108' + WHEN 'TZA' then '9227' + WHEN 'UGA' then '9228' + WHEN 'UK' then '9999' + WHEN 'UKR' then '3312' + WHEN 'UMI' then '8104' + WHEN 'UNI' then '9999' + WHEN 'UNR' then '9999' + WHEN 'URY' then '8215' + WHEN 'USA' then '8104' + WHEN 'UZB' then '7211' + WHEN 'VAT' then '3103' + WHEN 'VCT' then '8424' + WHEN 'VEN' then '8216' + WHEN 'VNM' then '5105' + WHEN 'VUT' then '1304' + WHEN 'WAL' then '2106' + WHEN 'WSM' then '1505' + WHEN 'XHI' then '9999' + WHEN 'XXX' then '9999' + WHEN 'YEM' then '4217' + WHEN 'YUG' then '3200' + WHEN 'ZAF' then '9225' + WHEN 'ZMB' then '9231' + WHEN 'ZWE' then '9232' + ELSE NULL END AS code + ,8 AS source_rank +FROM date_ranked AS a +INNER JOIN [IDI_Clean_YYYYMM].[moe_clean].[nsi] AS b +ON a.snz_moe_uid = b.snz_moe_uid +WHERE date_rank = 1 +GO + +/*************************************************************************************************************** +Keep best rank for each person +***************************************************************************************************************/ + +CREATE NONCLUSTERED INDEX my_index ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_COB_list] (snz_uid) +GO + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_country_of_birth] +GO + +WITH source_ranked AS ( + SELECT * + ,RANK() OVER (PARTITION BY [snz_uid] ORDER BY source_rank) AS ranked + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_COB_list] +) +SELECT snz_uid + , c_type + , CAST(code AS INT) AS code + , source_rank +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_country_of_birth] +FROM source_ranked +WHERE ranked = 1 +AND code IS NOT NULL +AND code NOT IN ('9999', '0000') +GO + +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_country_of_birth] (snz_uid); +GO +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_country_of_birth] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE) +GO + +/*************************************************************************************************************** +Delete templorary tables +***************************************************************************************************************/ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_COB_list] +GO +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_moe_enrollment] +GO +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_moe_school_roll] +GO diff --git a/people/country_of_birth_lookup.xlsx b/people/country_of_birth_lookup.xlsx new file mode 100644 index 0000000..a40e285 Binary files /dev/null and b/people/country_of_birth_lookup.xlsx differ diff --git a/people/evidence_population_in_nz.sql b/people/evidence_population_in_nz.sql new file mode 100644 index 0000000..6dbf432 --- /dev/null +++ b/people/evidence_population_in_nz.sql @@ -0,0 +1,205 @@ +/************************************************************************************************** +Title: Evidence of people in NZ +Author: Craig Wright + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Evidence of people in NZ using multi-source. + +Intended purpose: +1. Constructing this population is to capture anyone and everyone who is in New Zealand at present (November 2021). +2. Combine data from multiple sources where we have evidence a person is in the country. + +Inputs & Dependencies: +- [IDI_Clean].[data].[snz_res_pop] +- [IDI_Adhoc].[clean_read_MSD].[msd_sben_202107] +- [IDI_Clean].[security].[concordance] +- population_health_service_users.sql --> [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_hsu_proxy] +- [IDI_Clean].[data].[person_overseas_spell] +- [IDI_Adhoc].[clean_read_MOH_CIR].[moh_CIR_vaccination_activity20211123] +- [IDI_Clean].[security].[concordance] +- [IDI_Clean].[data].[personal_detail] +Output +- [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_most_complete_population] + + +Notes: +1) This definition follows the same underlying business rules, but differs in implementation + from the population definition used for the COVID-19 Vaccination study population. +2) Sources used include: + 1. Estimated residential population by Stats NZ + 2. MSD high frequency load data spells for benefits ending in 2021 (anyone receiving a benefit) + 3. Health Service Users (HSU) + 4. Anybody with boarder spells putting them in NZ (entered NZ without an exit recorded) + 5. COVID immunisation register (received vaccination in NZ) + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = vacc_ + Project schema = DL-MAA20XX-YY + +Issues: + +History (reverse order): +2021-10-29 CW +**************************************************************************************************/ + +/*************************************************************************************************************** +Gather all identities that appear in NZ +***************************************************************************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_all_identities_list] +GO + +SELECT DISTINCT snz_uid +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_all_identities_list] +FROM ( + --1. Estimated residential population + SELECT snz_uid + FROM [IDI_Clean_YYYYMM].[data].[snz_res_pop] + WHERE YEAR ([srp_ref_date]) = 2021 + + UNION ALL + + --2. MSD high frequecty load data spells for benefits ending in 2021 + SELECT b.snz_uid + FROM [IDI_Adhoc].[clean_read_MSD].[msd_sben_202107] as a + INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] as b + ON a.[snz_msd_uid] = b.[snz_msd_uid] + WHERE YEAR([end_date]) =2021 + + UNION ALL + + --3. Health Service Users (HSU) + SELECT DISTINCT snz_uid + ,'hsu' AS record_source + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_hsu_proxy] + + UNION ALL + + --4. Anybody with boarder spells putting them in NZ + SELECT snz_uid + FROM [IDI_Clean_YYYYMM].[data].[person_overseas_spell] AS a + WHERE NOT EXISTS ( + SELECT 1 + FROM [IDI_Clean_YYYYMM].[data].[person_overseas_spell] AS b + WHERE a.snz_uid = b.snz_uid + AND b.pos_last_departure_ind = 'y' + ) + + UNION ALL + + --5. COVID immunisation register + SELECT b.snz_uid + FROM [IDI_Adhoc].[clean_read_MOH_CIR].[moh_CIR_vaccination_activity20211123] AS a + INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b + ON a.snz_moh_uid = b.snz_moh_uid +) AS k +GO + +/*************************************************************************************************************** +Gather conditions for exclusion +***************************************************************************************************************/ + +-- deceased +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_pop_deceased] +GO + +SELECT snz_uid +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_pop_deceased] +FROM [IDI_Clean_YYYYMM].[data].[personal_detail] +WHERE snz_uid IS NOT NULL +AND ([snz_deceased_year_nbr] IS NOT NULL OR [snz_deceased_month_nbr] IS NOT NULL) + + +-- left NZ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_pop_departed] +GO + +SELECT DISTINCT snz_uid +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_pop_departed] +FROM [IDI_Clean_YYYYMM].[data].[person_overseas_spell] +WHERE YEAR([pos_ceased_date]) = 9999 +AND snz_uid IS NOT NULL +AND pos_last_departure_ind = 'y' + +-- spine +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_pop_no_spine] +GO + +SELECT snz_uid +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_pop_no_spine] +FROM [IDI_Clean_YYYYMM].[security].[concordance] +WHERE snz_spine_uid IS NULL + +/*************************************************************************************************************** +Index all for comparison +***************************************************************************************************************/ + +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_all_identities_list] (snz_uid); +GO + +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_pop_deceased] (snz_uid); +GO + +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_pop_departed] (snz_uid); +GO + +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_pop_no_spine] (snz_uid); +GO + +/*************************************************************************************************************** +Population table +***************************************************************************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_most_complete_population] +GO + +SELECT snz_uid +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_most_complete_population] +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_all_identities_list] AS a +WHERE NOT EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_pop_deceased] AS b + WHERE a.snz_uid = b.snz_uid +) +AND NOT EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_pop_departed] AS c + WHERE a.snz_uid = c.snz_uid +) +AND NOT EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_pop_no_spine] AS d + WHERE a.snz_uid = d.snz_uid +) + +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_most_complete_population] (snz_uid); +GO +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_most_complete_population] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE) +GO + +/*************************************************************************************************************** +Tidy up +***************************************************************************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_all_identities_list] +GO +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_pop_deceased] +GO +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_pop_departed] +GO +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_pop_no_spine] +GO diff --git a/people/functional_disability.sql b/people/functional_disability.sql new file mode 100644 index 0000000..6077b72 --- /dev/null +++ b/people/functional_disability.sql @@ -0,0 +1,494 @@ +/*************************************************************************************************************************** + +# Functional disability + +Author: Craig Wright +Refinement: Simon Anastasiadis + + +# Multi-source indicator of functional disability + +Measuring disability is a complex and still evolving issue. There is no perfect way to classify people into categories and no full agreement amongst the disabled community on the language to be used. However, it is important for the disabled community, researchers, and policy makers to have some standard way to describe varying levels of participation as citizens, clients, or consumers of services. + + +## Key concepts + +Disability is a social construct that arises through the combination of two things: +1. The limitations some people have in completing some activities. +2. The barriers that exist in a person’s environment (or society in general) that limit participation. + +Information about the barriers or accommodations people experience is currently not available in administrative data. However, there are some sources data that include functional limitations that some people have. Hence, one approach is to use functional limitations to identify groups who are more likely to be disabled. + + +## Practical notes + +The indicator is based on the Washington Group Short Set (WGSS) questions on functioning. The WGSS is a series of six questions about difficulties people might encounter doing everyday things: +• Walking +• Seeing +• Hearing +• Remembering +• Washing +• Communication + +However, not everyone has answered these questions, and some people might have acquired impairments after the questions were asked. We have supplemented these questions with data from some Ministry of Health collections: SOCRATES, which captures functional assessments for disability clients; and InterRAI, which captures similar information for older people. + +In collaboration with experts in the disabled community, the Ministry of Health and Office for Disability Issues, we have aligned each of these data sources, resulting in a three-level indicator for each of the six functional activities: +• 0 = No limitation: This group does not report any limitations in undertaking everyday tasks. They are unlikely to be disabled. +• 1 = Low functional limitation: This group reports some difficulty with everyday tasks. They are somewhat likely to be disabled. +• 2 = High functional limitation: This group reports a lot of difficulty with everyday tasks or cannot do them at all. They are very likely to be disabled. + +Following guidance from the Washington Group (2020), we have also produced one overall indicator for disability status. This takes the value of ‘disabled’ if the person had high functional limitation in at least one activity, and ‘not disabled’ otherwise. (This overall indicator is consistent with how many agencies, such as Stats NZ, are already using the WGSS to report on outcomes for disabled people.) + + +## How to use this indicator + +• Involve disabled people in your research, including when designing your approach and interpreting your results. +• Use it to compare outcomes, not to estimate the size of groups. This indicator is not suitable for counting the disabled population. The official measure of the disabled population in New Zealand is derived from the Disability Survey undertaken by Stats NZ. Use of this definition should be constrained to comparisons between groups, rather than discussing the size of those groups. +• Adjust for age in your analysis. There is a very strong relationship between age and functional impairment. To account for this, we recommend adjusting for age when comparing between functional groups (at minimum, reporting separately those below and above the age of 65). + + +## Limitations of this indicator + +• The indicator is not as accurate and comprehensive as the measure of disability in the Disability Survey. It is not a replacement for the official measure from the Disability Survey. +• The Washington Group Short Set (WGSS) is not a fully comprehensive measure of functional limitations. +• For children under 12, the WGSS is less sensitive at moderate levels of functional limitation. The WGSS is also not collected for children under the age of five. +• The indicator may capture people who are disabled only temporarily. These people may have very different experiences to people who have permanent or ongoing impairments. + + +## Other notes + +• The indicator might be expanded in the future to use WGSS responses from the NZCVS and specific modules of the HES. However, the coverage of both sources is limited. +• We investigated using Hospital diagnoses and ACC records. But the steering group advised against their inclusion. + + +## References + +The Social Wellbeing Agency has published an accompanying guide. You can find it our our website: swa.govt.nz + + +## Parameters & Present values: + Current refresh = 20211020 + Prefix = defn_ + Project schema = DL-MAA20XX-YY + + +## Dependencies +The code relies on eight input tables: +• [IDI_Clean].[cen_clean].[census_individual_2018] +• [IDI_Clean].[security].[concordance] +• [IDI_Clean].[gss_clean].[gss_person] +• [IDI_Clean].[security].[concordance] +• [IDI_Clean].[moh_clean].[interrai] +• [IDI_Adhoc].[clean_read_HLFS].[hlfs_disability] +• [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_support_needs_2022] +• [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_needs_assessment_2022] + +Linking between the two is done on snz_moh_uid. No rows are lost via this linking. + +## Outputs +Table: [IDI_Sandpit].[DL-MAA20XX-YY].[defn_functional_disability] + +## Variable Descriptions + +--------------------------------------------------------------------------------------------------------------------------- +Column Description +name +------------------------------ -------------------------------------------------------------------------------------------- +snz_uid The unique STATSNZ person identifier for the person + +record_source The database from which the record originates + +event_date The date the disability information was recorded, survey/census date for Stats NZ sources, assessment date for MoH sources. + +dv_hearing Indicator of hearing limitations + +dv_seeing Indicator of seeing limitations + +dv_walking Indicator of walking limitations + +dv_remembering Indicator of remembering limitations + +dv_washing Indicator of washing limitations + +dv_communication Indicator of hearing limitations + +overall_dv_indication Indicator of any high limitation + + +## Version and change history + +2022-09-19 SA ensure consistency with definition documentation +2022-09-19 SA check against definition description by Andrew (from consultation with DDEWG) +2021-12-02 SA review and tidy +2021-11-20 CW v1 + +## Code + +***************************************************************************************************************************/ + +/* create table for all records */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_functional_disability_list] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_functional_disability_list] ( + snz_uid INT, + record_source VARCHAR(9), + event_date DATE, + dv_hearing INT, + dv_seeing INT, + dv_walking INT, + dv_remembering INT, + dv_washing INT, + dv_communication INT, +); +GO + +/*************************************************************************************************************** +append records from each source into the table +***************************************************************************************************************/ + +/********************************* +1. Stats NZ surveys - Census 2018 + +The question in the census is: + +22. This question is about difficulties you may have doing certain activities because of a health problem. + + Do you have difficulty with any of the following: + + - seeing, even if wearing glasses? ([cen_ind_dffcl_seeing_code]) + + Answers: + - no difficulty (1) + - some difficulty (2) + - a lot of difficulty (3) + - cannot do at all (4) + + Other answers in the dataset + - Response unidentifiable (7) + - Not Stated (9) + + The same answers apply to these questions: + - hearing, even if using a hearing aid? ([cen_ind_dffcl_hearing_code]) + - walking or climbing steps? ([cen_ind_dffcl_walking_code]) + - remembering or concentrating? ([cen_ind_dffcl_remembering_code]) + - washing all over or dressing? ([cen_ind_dffcl_washing_code]) + - communicating using your usual language, for example understanding or being understood by others? ([cen_ind_dffcl_comt_code]) +*********************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_functional_disability_list] + (snz_uid, record_source, event_date, dv_hearing, dv_seeing, dv_walking, dv_remembering, dv_washing, dv_communication) +SELECT [snz_uid] + ,'CEN2018' AS record_source + ,'2018-03-05' AS event_date + --,[cen_ind_dsblty_ind_code] as dv_disability + ,CASE + WHEN [cen_ind_dffcl_hearing_code] = 1 THEN 0 + WHEN [cen_ind_dffcl_hearing_code] = 2 THEN 1 + WHEN [cen_ind_dffcl_hearing_code] IN (3,4) THEN 2 + ELSE 0 END AS dv_hearing + ,CASE + WHEN [cen_ind_dffcl_seeing_code] = 1 THEN 0 + WHEN [cen_ind_dffcl_seeing_code] = 2 THEN 1 + WHEN [cen_ind_dffcl_seeing_code] IN (3,4) THEN 2 + ELSE 0 END AS dv_seeing + ,CASE + WHEN [cen_ind_dffcl_walking_code] = 1 THEN 0 + WHEN [cen_ind_dffcl_walking_code] = 2 THEN 1 + WHEN [cen_ind_dffcl_walking_code] IN (3,4) THEN 2 + ELSE 0 END AS dv_walking + ,CASE + WHEN [cen_ind_dffcl_remembering_code] = 1 THEN 0 + WHEN [cen_ind_dffcl_remembering_code] = 2 THEN 1 + WHEN [cen_ind_dffcl_remembering_code] IN (3,4) THEN 2 + ELSE 0 END AS dv_remembering + ,CASE + WHEN [cen_ind_dffcl_washing_code] = 1 THEN 0 + WHEN [cen_ind_dffcl_washing_code] = 2 THEN 1 + WHEN [cen_ind_dffcl_washing_code] IN (3,4) THEN 2 + ELSE 0 END AS dv_washing + ,CASE + WHEN [cen_ind_dffcl_comt_code] = 1 THEN 0 + WHEN [cen_ind_dffcl_comt_code] = 2 THEN 1 + WHEN [cen_ind_dffcl_comt_code] IN (3,4) THEN 2 + ELSE 0 END AS dv_communication +FROM [IDI_Clean_20211020].[cen_clean].[census_individual_2018] +/* WGSS not collected for children under 5 years of age for exclude */ +WHERE cen_ind_age_code NOT IN ('000', '001', '002', '003', '004') +GO + +/********************************* +2. Stats NZ surveys - HLFS +*********************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_functional_disability_list] + (snz_uid, record_source, event_date, dv_hearing, dv_seeing, dv_walking, dv_remembering, dv_washing, dv_communication) +SELECT b.snz_uid + ,'HLFS' AS record_source + ,[quarter_date] + ,CASE + WHEN [diff_hearing_code]='11' then 0 + WHEN [diff_hearing_code]='12' THEN 1 + WHEN [diff_hearing_code] IN ('13','14') THEN 2 + ELSE 0 END AS dv_hearing + ,CASE + WHEN [diff_seeing_code]='11' then 0 + WHEN [diff_seeing_code]='12' THEN 1 + WHEN [diff_seeing_code] IN ('13','14') THEN 2 + ELSE 0 END AS dv_seeing + ,CASE + WHEN [diff_walking_code]='11' then 0 + WHEN [diff_walking_code]='12' THEN 1 + WHEN [diff_walking_code] IN ('13','14') THEN 2 + ELSE 0 END AS dv_walking + ,CASE + WHEN [diff_memory_code]='11' then 0 + WHEN [diff_memory_code]='12' THEN 1 + WHEN [diff_memory_code] IN ('13','14') THEN 2 + ELSE 0 END AS dv_remembering + ,CASE + WHEN [diff_dressing_code]='11' then 0 + WHEN [diff_dressing_code]='12' THEN 1 + WHEN [diff_dressing_code] IN ('13','14') THEN 2 + ELSE 0 END AS dv_washing + ,CASE + WHEN [diff_communicating_code]='11' then 0 + WHEN [diff_communicating_code]='12' THEN 1 + WHEN [diff_communicating_code] IN ('13','14') THEN 2 + ELSE 0 END AS dv_communication +FROM [IDI_Adhoc].[clean_read_HLFS].[hlfs_disability] as a +INNER JOIN [IDI_Clean_20211020].[security].[concordance] as b +ON a.snz_hlfs_uid = b.snz_hlfs_uid +GO + +/********************************* +3. Stats NZ surveys - GSS 2016 or 2018 +*********************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_functional_disability_list] + (snz_uid, record_source, event_date, dv_hearing, dv_seeing, dv_walking, dv_remembering, dv_washing, dv_communication) +SELECT [snz_uid] + ,'GSS' AS record_source + ,[gss_pq_HQinterview_date] AS event_date + ,CASE + WHEN [gss_pq_disability_hear_code] ='11' THEN 0 + WHEN [gss_pq_disability_hear_code] ='12' THEN 1 + WHEN [gss_pq_disability_hear_code] IN ('13','14') THEN 2 + ELSE 0 END AS dv_hearing + ,CASE + WHEN [gss_pq_disability_see_code] ='11' THEN 0 + WHEN [gss_pq_disability_see_code] ='12' THEN 1 + WHEN [gss_pq_disability_see_code] IN ('13','14') THEN 2 + ELSE 0 END AS dv_seeing + ,CASE + WHEN [gss_pq_disability_walk_code] ='11' THEN 0 + WHEN [gss_pq_disability_walk_code] ='12' THEN 1 + WHEN [gss_pq_disability_walk_code] IN ('13','14') THEN 2 + ELSE 0 END AS dv_walking + ,CASE + WHEN [gss_pq_disability_remem_code] ='11' THEN 0 + WHEN [gss_pq_disability_remem_code] ='12' THEN 1 + WHEN [gss_pq_disability_remem_code] IN ('13','14') THEN 2 + ELSE 0 END AS dv_remembering + ,CASE + WHEN [gss_pq_disability_wash_code] ='11' THEN 0 + WHEN [gss_pq_disability_wash_code] ='12' THEN 1 + WHEN [gss_pq_disability_wash_code] IN ('13','14') THEN 2 + ELSE 0 END AS dv_washing + ,CASE + WHEN [gss_pq_disability_comm_code] ='11' THEN 0 + WHEN [gss_pq_disability_comm_code] ='12' THEN 1 + WHEN [gss_pq_disability_comm_code] IN ('13','14') THEN 2 + ELSE 0 END AS dv_communication +FROM [IDI_Clean_20211020].[gss_clean].[gss_person] +GO + +/********************************* +4. MoH - SOCRATES + +The SOCRATES data is for MoH funded disability services. +These will tend to be at the more severe end of the scale. +So we expect to see that there will be fewer people in this +data and that their measures will tend to be more severe. + +Code list: + Hearing + --1003 Hearing impaired + --1004 Deaf or nearly deaf + Seeing + --1001 Vision impaired + --1002 Blind or nearly blind + Walking + --1111 Wheelchair user (inside / outside of home) + --1101 Moving around inside home + --1102 Moving around outside home + --1103 Moving around in the community + Remembering + --1299 Other difficulties with memory / cognition / behaviour (specify) + --1203 Learning ability, i.e. acquiring skills of reading, writing, language, calculating, copying, etc. + --1202 Intellectual ability, i.e. thinking, understanding + --1208 Attention, e.g. concentration + --1201 Memory + Washing + --1403 Dressing and / or undressing + --1405 Toileting, using toilet facilities + --1402 Bathing, showering, washing self + --1404 Grooming and caring for body parts, e.g. feet, teeth, hair, nails, etc + Communication + --1803 Non verbal + --1801 Ability to express core needs + --1006 Mute or nearly mute + --1005 Speech impaired +*********************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_functional_disability_list] + (snz_uid, record_source, event_date, dv_hearing, dv_seeing, dv_walking, dv_remembering, dv_washing, dv_communication) + +SELECT [snz_uid] + ,'SOCRATES' AS record_source + ,CAST(SUBSTRING([DateAssessmentCompleted],1,7) AS DATE) AS event_date + --hearing + ,IIF(code IN (1003, 1004), 2, 0) AS dv_hearing + --seeing + ,IIF(code IN (1001, 1002), 2, 0) AS dv_seeing + --walk + ,IIF(code IN (1111, 1101, 1102, 1103), 2, 0) AS dv_walking + --memory/learning + ,IIF(code IN (1201,1202,1203,1208,1299), 2, 0) AS dv_remembering + --wash + ,IIF(code IN (1402,1403,1404,1405), 2, 0) AS dv_washing + --communicating + ,IIF(code IN (1005,1006,1801,1803), 2, 0) AS dv_communication + --,s.Code + --,s.Description +FROM [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_support_needs_2022] AS s +INNER JOIN [IDI_Clean_20211020].[security].[concordance] as c +ON s.snz_moh_uid = c.snz_moh_uid +INNER JOIN [IDI_Adhoc].[clean_read_MOH_SOCRATES].[moh_needs_assessment_2022] AS n +ON s.[snz_moh_uid] = n.[snz_moh_uid] +AND s.[NeedsAssessmentID] = n.[NeedsAssessmentID] +AND s.[snz_moh_soc_client_uid] = n.[snz_moh_soc_client_uid] +GO + +/********************************* +5. MoH - IRAI + +Excludes the CA assessment type due to absence of impairment/disability type questions +*********************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_functional_disability_list] + (snz_uid, record_source, event_date, dv_hearing, dv_seeing, dv_walking, dv_remembering, dv_washing, dv_communication) + +SELECT [snz_uid] + ,'IRAI' AS record_source + --,[snz_moh_uid] + --,[moh_irai_assessment_type_text] + --,[moh_irai_assess_version_text] + ,[moh_irai_assessment_date] AS event_date + --hearing + ,CASE + WHEN [moh_irai_hearing_code] BETWEEN 2 AND 4 THEN 2 + WHEN [moh_irai_hearing_code] = 1 THEN 1 + ELSE 0 END AS dv_hearing + --vision + ,CASE + WHEN [moh_irai_vision_light_code] BETWEEN 2 AND 4 THEN 2 + WHEN [moh_irai_vision_light_code] = 1 THEN 1 + ELSE 0 END AS dv_seeing + --walking + ,CASE + WHEN [moh_irai_adl_walking_code] BETWEEN 2 AND 6 + OR moh_irai_stairs_perform_code BETWEEN 2 AND 6 + OR moh_irai_stairs_capacity_code BETWEEN 2 AND 6 THEN 2 + WHEN [moh_irai_adl_walking_code] = 1 + OR moh_irai_stairs_perform_code = 1 + OR moh_irai_stairs_capacity_code = 1 THEN 1 + ELSE 0 END AS dv_walking + --memory/learning + ,CASE + WHEN [moh_irai_short_term_mem_ind] = 1 + OR [moh_irai_procedural_mem_ind] = 1 + OR [moh_irai_situational_mem_ind] = 1 + OR [moh_irai_long_term_mem_ind] = 1 + OR [moh_irai_res_hist_intellect_ind] = 1 + OR [moh_irai_easily_distracted_code] = 2 THEN 2 + WHEN [moh_irai_easily_distracted_code] = 1 THEN 1 + ELSE 0 END AS dv_remembering + --washing + ,CASE + WHEN moh_irai_adl_bathing_code BETWEEN 2 AND 6 THEN 2 + WHEN moh_irai_adl_bathing_code = 1 THEN 1 + ELSE 0 END AS dv_washing + --communication + ,CASE + WHEN moh_irai_scale_comm_code BETWEEN 3 AND 8 THEN 2 + WHEN moh_irai_scale_comm_code BETWEEN 1 AND 2 THEN 1 + ELSE 0 END AS dv_comt +FROM [IDI_Clean_20211020].[moh_clean].[interrai] +WHERE [moh_irai_assessment_type_text] !='CA' + +GO + +/*************************************************************************************************************** +Create final table +***************************************************************************************************************/ + +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_functional_disability_list] (snz_uid); +GO + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_functional_disability] +GO + + +WITH + +/* Where multiple records per person for the same date & source, keep the highest */ +single_person_source_date_record AS ( + SELECT snz_uid + ,record_source + ,event_date + ,MAX(dv_hearing) AS dv_hearing + ,MAX(dv_seeing) AS dv_seeing + ,MAX(dv_walking) AS dv_walking + ,MAX(dv_remembering) AS dv_remembering + ,MAX(dv_washing) AS dv_washing + ,MAX(dv_communication) AS dv_communication + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_functional_disability_list] + GROUP BY snz_uid, record_source, event_date +), + +/* Create indicator for most recent record for each person */ +most_recent_date AS ( + SELECT * + ,ROW_NUMBER() OVER (PARTITION BY snz_uid ORDER BY event_date DESC, record_source) AS ranking + FROM single_person_source_date_record +) + +SELECT snz_uid + ,record_source + ,event_date + ,dv_hearing + ,dv_seeing + ,dv_walking + ,dv_remembering + ,dv_washing + ,dv_communication + ,IIF(dv_hearing = 2 + OR dv_seeing = 2 + OR dv_walking = 2 + OR dv_remembering = 2 + OR dv_washing = 2 + OR dv_communication = 2, 1, 0) AS overall_dv_indication +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_functional_disability] +FROM most_recent_date +WHERE ranking = 1 +GO + +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[defn_functional_disability] (snz_uid); +GO +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[defn_functional_disability] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE) +GO + +/* remove raw list table */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_functional_disability_list] +GO diff --git a/people/neighbourhood_descriptors.sql b/people/neighbourhood_descriptors.sql new file mode 100644 index 0000000..25dfa62 --- /dev/null +++ b/people/neighbourhood_descriptors.sql @@ -0,0 +1,173 @@ +/************************************************************************************************** +Title: Neighbourhood descriptors +Author: Simon Anastasiadis +Re-edit: Freya Li + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Summary description of a person's neighbourhood including: region, deprivation, urban/rural, and whether a person lives in a household with dependent children. + +Intended purpose: +Identifying the region, urban/rural-ness, and other characteristics of where a person lives at a specific point in time. + +Inputs & Dependencies: +- [IDI_Clean].[data].[personal_detail] +- [IDI_Clean].[data].[snz_res_pop] +- [IDI_Clean].[data].[address_notification] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[meshblock_concordance_2019] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[meshblock_higher_geography_2020_V1_00] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[DepIndex2013] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[defn_address_descriptors_yearly] + + +Notes: +1) Address information in the IDI is not of sufficient quality to determine who shares an + address. We would also be cautious about claiming that a person lives at a specific + address on a specific date. However, we are confident using address information for the + purpose of "this location has the characteristics of the place this person lives", and + "this person has the characteristics of the people who live in this location". +2) Despite the limitations of address, it is the best source for determining whether a person + lives in a household with dependent children. Hence we use it for this purpose. However + we note that this is a low quality measure. +3) The year of the meshblock codes used for the address notification could not be found in + data documentation. A quality of range of different years/joins were tried the final + choice represents the best join available at time of creation. + Another cause for this join being imperfect is not every meshblock contains residential + addresses (e.g. some CBD areas may contain hotels but not residential addresses, and + some meshblocks are uninhabited - such as mountains or ocean areas). + Re-assessment of which meshblock code to use for joining to address_notifications + is recommended each refresh. +4) This version extends the previous single-date-snapshot to a yearly single-date-snapshot. For + example to identify someones address each year from 2013-2016, this code can produce address as at + 2013-03-31, 2014-03-31, 2015-03-31, and 2016-03-31 [i.e. YYYY-03-31 format]. + This is achieved by 'trimming' the [ant_notification_date] (address start date) and [ant_replacement_date] + (address end date) to an appopriate YYYY-03-31 date. + + The logic stems from the following: + + - A constructed (03/31) start date and constructed (03/31) end date pair MUST fall inside the spell: + 1. Either YYYY-03-31 or [YYYY+1]-03-31 (the two possible trimmed start dates) should occur within the address spell. + We go forward as it is possible that the spell starts at the end of one year (not overlapping with this year's 03-31) and spills into the + next year's 03-31, meaning the address should be attributed to the following year's 03-31 and a different spell will be used for this year's 03-31. + For example, an adddress 1 Lambton Quay starting in November 2021 and continuing to May 2022. The start date 2021-03-31 does not fall inside the spell but + a 2022-03-31 start date does, so the 2022-03-31 address is 1 Lambton Quay (and 2021-03-31 will be another address, whose spell overlaps with 2021-03-31). + We don't go backward to the previous 03-31 as this would take us before the address spell, in which case a different address spell should be used for that + YYYY-03-31 address. + AND + 2. Either YYYY-03-31 or [YYYY-1]-03-31 (the two possible trimmed end dates) should occur within the address spell. + We go backward as it is possible that the spell starts at the beginning of one year (overlapping this year's 03/31 start) and continues into the future, + to the start of a year, not quite reaching the next 03-31 date - so we shorten the spell to end at the last 03-31 date. + For example, an address 2 Lambton Quay starting in February 2021 and continuing to January 2023. The end date 2023-03-31 does not fall inside the spell + but a 2022-03-31 end date does, so the 2022-03-31 address is 2 Lambton Quay (and 2023-03-31 will be another address, whose spell overlaps with 2023-03-31). + We don't go forward to the next 03-31 as this would take us after the address spell, in which case a different address spell should be used for that + YYYY-03-31 address. + + - Given a constructed start and end date both fall within the address spell, the following determines which start and end dates are given: + - Start dates: + If the YYYY-03-31 start date constructed from year(ant_notification_date) occurs within the address spell: + trim_start is the YYYY-03-31 start date constructed from the year of the ant_notification_date + Else: + trim_start is set to [YYYY+1]-03-31 + - End dates: + If the YYYY-03-31 end date constructed from year(ant_replacement_date) occurs within the address spell: + trim_end is the YYYY-03-31 end date constructed from the year of the ant_replacement_date + Else: + trim_end is set to [YYYY-1]-03-31 + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + Current 'as-at' date = YYYY-03-31 -- can easily change to a different MM-DD by editing trim_start_end table (WITH statement) + +Issues: + +History (reverse order): +2022-06-07 VW Add name of region code (using meshblock higher geography 2020) +2022-05-31 SA-VW Create version that enables a yearly snapshot of address (see note and comments) +2022-05-17 VW Point to MSD seniors project (DL-MAA20XX-YY), update to latest refresh, add NZDep2018 for use with different years (will need to edit code below) +2022-02-18 VW Point to HTR project, DL-MAA20XX-YY +2021-11-30 MR (Update latest refresh, link latest meshblock higher geography) +2021-06-21 FL repalce age of child with the birth year +2021-06-10 SA QA +2021-06-08 FL v3 Add age of child +2021-01-26 SA QA +2021-01-11 FL v2 (Change prefix, update the table to the latest refresh, update the date) +2020-07-15 MP QA +2020-03-03 SA v1 +**************************************************************************************************/ + +/* Remove table */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_address_descriptors_yearly]; -- rename +GO + +/* Create all possible start and end date options, centred around YYYY-03-31 */ +WITH trim_start_end AS ( +SELECT * + ,DATEFROMPARTS(YEAR([ant_notification_date]), 03, 31) AS trim_start_0 + ,DATEFROMPARTS(YEAR([ant_notification_date]) + 1, 03, 31) AS trim_start_1 + -- add one so that if [ant_notification_date] (start date) falls after 03/31 of the ant_notification_date YEAR the date moves forward to 03/31 of the next year + ,DATEFROMPARTS(YEAR([ant_replacement_date]), 03, 31) AS trim_end_0 + ,DATEFROMPARTS(YEAR([ant_replacement_date]) - 1, 03, 31) AS trim_end_1 + -- subtract one so that if [ant_replacement_date] (end date) falls before 03/31 of the [ant_replacement_date] YEAR it is brought back to the previous year +FROM [IDI_Clean_YYYYMM].[data].[address_notification] +) + +/* Trim start and end dates */ +SELECT a.[snz_uid] + ,a.[ant_notification_date] + ,a.[ant_replacement_date] + ,a.[snz_idi_address_register_uid] + ,CAST(a.[ant_region_code] AS INT) AS [ant_region_code] + ,b.[REGC2020_V1_00_NAME] -- name of region + ,b.[TA2020_V1_00] -- TA number + ,b.[TA2020_V1_00_NAME] -- TA name + ,b.[IUR2020_V1_00] -- urban/rural classification + ,b.[IUR2020_V1_00_NAME] + ,CAST(b.[SA22020_V1_00] AS INT) AS [SA22020_V1_00] -- Statistical Area 2 (neighbourhood) + ,b.[SA22020_V1_00_NAME] + ,c.[DepIndex2013] + ,CAST(d.[NZDep2018] AS INT) AS [NZDep2018] + /* Choose which trimmed start and end dates to use */ + ,IIF(trim_start_0 BETWEEN [ant_notification_date] AND [ant_replacement_date], trim_start_0, trim_start_1) AS trim_start + -- if 03/31 of the ant_notification_date YEAR is within the address spell, start date is 03/31 of the ant_notification_date year, else start date is 03/31 of the next year + ,IIF(trim_end_0 BETWEEN [ant_notification_date] AND [ant_replacement_date], trim_end_0, trim_end_1) AS trim_end + -- if 03/31 of the ant_replacement_date YEAR (i.e. trim_end_0) is within the address spell, end date is 03/31 of the current year, else end date is 03/31 of the previous year +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_address_descriptors_yearly] +FROM trim_start_end AS a +INNER JOIN [IDI_Metadata].[clean_read_CLASSIFICATIONS].[meshblock_concordance_2019] AS conc +ON conc.[MB2019_code] = a.[ant_meshblock_code] +LEFT JOIN [IDI_Metadata].[clean_read_CLASSIFICATIONS].[meshblock_higher_geography_2020_V1_00] AS b +ON conc.[MB2019_code] = b.[MB2020_V1_00] +LEFT JOIN [IDI_Metadata].[clean_read_CLASSIFICATIONS].[DepIndex2013] AS c +ON conc.[MB2013_code] = c.[Meshblock2013] +LEFT JOIN [IDI_Metadata].[clean_read_CLASSIFICATIONS].[DepIndex2018_MB2018] AS d +ON conc.[MB2018_code] = d.[MB2018_code] +WHERE a.[ant_meshblock_code] IS NOT NULL +AND (trim_start_0 BETWEEN [ant_notification_date] AND [ant_replacement_date] OR trim_start_1 BETWEEN [ant_notification_date] AND [ant_replacement_date]) + -- 1 of the start dates must be valid (i.e. within the address spell) +AND (trim_end_0 BETWEEN [ant_notification_date] AND [ant_replacement_date] OR trim_end_1 BETWEEN [ant_notification_date] AND [ant_replacement_date]) + -- 1 of the end dates must be valid (i.e. within the address spell) + + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[defn_address_descriptors_yearly] (snz_uid); +GO +/* Compress final table to save space */ +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[defn_address_descriptors_yearly] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE); +GO + + diff --git a/people/prioritised_level2_ethnicity.sql b/people/prioritised_level2_ethnicity.sql new file mode 100644 index 0000000..1c8e8c2 --- /dev/null +++ b/people/prioritised_level2_ethnicity.sql @@ -0,0 +1,522 @@ +/************************************************************************************************** +Title: Level 2 prioritised ethnicity +Author: Craig Wright + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Level 2 source prioritised ethnic codes as per personal detail table source ranking. + +Intended purpose: +More detailed ethnicity information for people than top 6 categories. + +Inputs & Dependencies: +- [IDI_Clean].[cen_clean].[census_individual_2018] +- [IDI_Clean].[cen_clean].[census_individual_2013] +- [IDI_Clean].[dia_clean].[births] +- [IDI_Clean].[dia_clean].[births] +- [IDI_Clean].[dia_clean].[births] +- [IDI_Clean].[moh_clean].[nes_enrolment] +- [IDI_Clean].[security].[concordance] +- [IDI_Clean].[moh_clean].[pop_cohort_demographics] +- [IDI_Adhoc].[clean_read_MOH_CIR].[moh_cir_nhi20211026] +- [IDI_Clean].[security].[concordance] +- [IDI_Clean].[moe_clean].[student_per] +- [IDI_Clean].[acc_clean].[clients] +- [IDI_Clean].[msd_clean].[msd_swn] +- [IDI_Clean].[sla_clean].[msd_borrowing] +- [IDI_Clean].[sofie_clean].[person_waves] +- [IDI_Clean].[sofie_clean].[hq_id] +- [IDI_Clean].[hlfs_clean].[data] +- [IDI_Clean].[hlfs_clean].[nzis] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_ethnicity_level_2] +- [IDI_UserCode].[DL-MAA20XX-YY].[vacc_ethnicity_level_1_and_2] + + +Notes: +1) This approach follows the same broad concept AS Stats NZ's approach for the person_details + table: Data from all sources is collated. For each person the final value is drawn from + the highest quality source available for that person. +2) Multiple sources are used. The sources and their rankings are (1 is highest rank): + 1. census 2018 + 2. census 2013 + 3. DIA - only births data appears useful + 4. MOH NES + 5. MOH NHI + 6. CIR ethnicity + 7. MOE + 8. ACC + 9. MSD + 10. SLA MSD called SLM + 11. HES + 12. SOFIE AS SOF + 13. LINZ Migrant Survey - not included + 14. HLFS + 15. ACM Auckland city mission - not included AS one digit codes + 16. GSS +3) Codes for non-response or response unidentifiable are given using ranks 98 and 99. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = vacc_ + Project schema = DL-MAA20XX-YY + +Issues: + +History (reverse order): +2021-10-31 CW +**************************************************************************************************/ + +/* create table of all ethnicities */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_ethnicity_list] +GO + +CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_ethnicity_list] ( + snz_uid INT, + ethnic_code VARCHAR(20), + source_rank INT, + record_date DATE, +); +GO + +/******************************************************************************************* +append records from each source into the table +*******************************************************************************************/ + +/************************************ +Census 2018 +************************************/ +--Impute rank description +--11 1 2018 Census form +--12 1 2018 Census (missing from individual form) +--21 2 2013 Census +--31 97 Admin data +--41 99 Within household donor +--42 99 Donor's 2018 Census form +--43 99 Donor's 2018 Census (missing from individual form) +--44 99 Donor's response sourced from 2013 Census +--45 99 Donor's response sourced from admin data +--46 99 Donor's response sourced from within household +--51 99 No information + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_ethnicity_list] (snz_uid, ethnic_code, source_rank, record_date) +SELECT [snz_uid] + ,[cen_ind_eth_output_level2] AS ethnic_code + ,CASE + --if ethnic code is 94-98 THEN set rank to 98 + WHEN SUBSTRING([cen_ind_eth_output_level2],1,2) in ('94','95','96','97','98') THEN 98 + WHEN [cen_ind_ethgr_impt_ind] in ('11','12') THEN 1 + WHEN [cen_ind_ethgr_impt_ind] in ('21') THEN 2 + WHEN [cen_ind_ethgr_impt_ind] in ('31') THEN 97 + WHEN [cen_ind_ethgr_impt_ind] in ('41','42','43','44','45','46','51') THEN 99 + WHEN [cen_ind_ethgr_impt_ind] is null THEN 99 + end AS source_rank + ,CAST('2018-03-05' AS record_date) AS record_date +FROM [IDI_Clean_YYYYMM].[cen_clean].[census_individual_2018] +WHERE snz_uid IS NOT NULL +GO + +/************************************ +Census 2013 +************************************/ +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_ethnicity_list] (snz_uid, ethnic_code, source_rank, record_date) +SELECT [snz_uid] + ,CONCAT( + SUBSTRING([cen_ind_eth_rand6_grp1_code],1,2),';', + SUBSTRING([cen_ind_eth_rand6_grp2_code],1,2),';', + SUBSTRING([cen_ind_eth_rand6_grp3_code],1,2),';', + SUBSTRING([cen_ind_eth_rand6_grp4_code],1,2),';', + SUBSTRING([cen_ind_eth_rand6_grp5_code],1,2),';', + SUBSTRING([cen_ind_eth_rand6_grp6_code],1,2)) AS ethnic_code + ,CASE + WHEN SUBSTRING([cen_ind_eth_rand6_grp1_code],1,2) in ('94','95','96','97','98') THEN 98 + WHEN SUBSTRING([cen_ind_eth_rand6_grp1_code],1,2) in ('99') THEN 99 + else 2 END AS source_rank + ,CAST('2013-03-05' AS DATE) AS record_date +FROM [IDI_Clean_YYYYMM].[cen_clean].[census_individual_2013] +WHERE snz_uid IS NOT NULL +GO + +/************************************ +DIA births/deaths/marriages/civil unions +************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_ethnicity_list] (snz_uid, ethnic_code, source_rank, record_date) +SELECT [snz_uid] + ,CONCAT( + SUBSTRING([dia_bir_ethnic1_snz_code],1,2),';', + SUBSTRING([dia_bir_ethnic2_snz_code],1,2),';', + SUBSTRING([dia_bir_ethnic3_snz_code],1,2),';', + SUBSTRING([dia_bir_ethnic4_snz_code],1,2),';', + SUBSTRING([dia_bir_ethnic5_snz_code],1,2),';', + SUBSTRING([dia_bir_ethnic6_snz_code],1,2)) AS ethnic_code + ,CASE + WHEN SUBSTRING([dia_bir_ethnic1_snz_code],1,2) in ('94','95','96','97','98') THEN 98 + WHEN SUBSTRING([dia_bir_ethnic1_snz_code],1,2) in ('99') THEN 99 + else 3 END AS source_rank + ,DATEFROMPARTS([dia_bir_birth_year_nbr],[dia_bir_birth_month_nbr],1) AS record_date +FROM [IDI_Clean_YYYYMM].[dia_clean].[births] +WHERE [dia_bir_ethnic1_snz_code] IS NOT NULL +AND snz_uid IS NOT NULL +GO + +--parent 1 +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_ethnicity_list] (snz_uid, ethnic_code, source_rank, record_date) +SELECT parent1_snz_uid AS snz_uid + ,CONCAT( + SUBSTRING([dia_bir_parent1_ethnic1_snz_code],1,2),';', + SUBSTRING([dia_bir_parent1_ethnic2_snz_code],1,2),';', + SUBSTRING([dia_bir_parent1_ethnic3_snz_code],1,2),';', + SUBSTRING([dia_bir_parent1_ethnic4_snz_code],1,2),';', + SUBSTRING([dia_bir_parent1_ethnic5_snz_code],1,2),';', + SUBSTRING([dia_bir_parent1_ethnic6_snz_code],1,2)) AS ethnic_code + ,CASE + WHEN SUBSTRING([dia_bir_parent1_ethnic1_snz_code],1,2) in ('94','95','96','97','98') THEN 98 + WHEN SUBSTRING([dia_bir_parent1_ethnic1_snz_code],1,2) in ('99') THEN 99 + else 3 END AS source_rank + ,datefromparts([dia_bir_birth_year_nbr],[dia_bir_birth_month_nbr],1) AS record_date +FROM [IDI_Clean_YYYYMM].[dia_clean].[births] +WHERE [dia_bir_parent1_ethnic1_snz_code] IS NOT NULL +AND parent1_snz_uid IS NOT NULL +GO + +--parent 2 +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_ethnicity_list] (snz_uid, ethnic_code, source_rank, record_date) +SELECT parent2_snz_uid AS snz_uid + ,CONCAT( + SUBSTRING([dia_bir_parent2_ethnic1_snz_code],1,2),';', + SUBSTRING([dia_bir_parent2_ethnic2_snz_code],1,2),';', + SUBSTRING([dia_bir_parent2_ethnic3_snz_code],1,2),';', + SUBSTRING([dia_bir_parent2_ethnic4_snz_code],1,2),';', + SUBSTRING([dia_bir_parent2_ethnic5_snz_code],1,2),';', + SUBSTRING([dia_bir_parent2_ethnic6_snz_code],1,2)) AS ethnic_code + ,CASE + WHEN SUBSTRING([dia_bir_parent2_ethnic1_snz_code],1,2) in ('94','95','96','97','98') THEN 98 + WHEN SUBSTRING([dia_bir_parent2_ethnic1_snz_code],1,2) in ('99') THEN 99 + else 3 END AS source_rank + ,datefromparts([dia_bir_birth_year_nbr],[dia_bir_birth_month_nbr],1) AS record_date +FROM [IDI_Clean_YYYYMM].[dia_clean].[births] +WHERE [dia_bir_parent2_ethnic1_snz_code] IS NOT NULL +AND parent2_snz_uid IS NOT NULL +GO + +/************************************ +MOH PHO +************************************/ +--keep latest rank 4 ethnic record + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_ethnicity_list] (snz_uid, ethnic_code, source_rank, record_date) +SELECT b.snz_uid + ,CONCAT( + SUBSTRING([moh_nes_ethnic1_snz_code],1,2),';', + SUBSTRING([moh_nes_ethnic2_snz_code],1,2),';', + SUBSTRING([moh_nes_ethnic3_snz_code],1,2)) AS ethnic_code + ,CASE + WHEN [moh_nes_ethnic1_snz_code] in ('94','95','96','97','98') THEN 98 + WHEN [moh_nes_ethnic1_snz_code]='99' THEN 99 + else 4 END AS source_rank + ,cast([moh_nes_snapshot_month_date] AS record_date) AS record_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[nes_enrolment] AS a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b +ON a.snz_moh_uid = b.snz_moh_uid +WHERE [moh_nes_ethnic1_snz_code] IS NOT NULL +AND b.snz_uid IS NOT NULL +GO + +/************************************ +MOH NHI +************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_ethnicity_list] (snz_uid, ethnic_code, source_rank, record_date) +SELECT [snz_uid] + ,CONCAT( + SUBSTRING([moh_pop_ethnic1_snz_code],1,2),';', + SUBSTRING([moh_pop_ethnic2_snz_code],1,2),';', + SUBSTRING([moh_pop_ethnic3_snz_code],1,2)) AS ethnic_code + ,CASE + WHEN [moh_pop_ethnic1_snz_code] in ('94','95','96','97','98') THEN 98 + WHEN [moh_pop_ethnic1_snz_code]='99' THEN 99 + else 5 END AS source_rank + ,[moh_pop_last_updated_date] AS record_date +FROM [IDI_Clean_YYYYMM].[moh_clean].[pop_cohort_demographics] +WHERE snz_uid IS NOT NULL +GO + +/************************************ +CIR ethnicity +************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_ethnicity_list] (snz_uid, ethnic_code, source_rank, record_date) +SELECT b.snz_uid + ,CONCAT( + SUBSTRING(cast([ethnic_code_1] AS varchar),1,2),';', + SUBSTRING(cast([ethnic_code_2] AS varchar),1,2),';', + SUBSTRING(cast([ethnic_code_3] AS varchar),1,2)) AS ethnic_code + ,CASE + when cast([ethnic_code_1] AS varchar)in ('94','95','96','97','98') THEN 98 + when cast([ethnic_code_1] AS varchar)='99' THEN 99 + else 6 END AS source_rank + ,CAST('YYYY-MM-DD' AS DATE) AS record_date +FROM [IDI_Adhoc].[clean_read_MOH_CIR].[moh_cir_nhi20211026] AS a +INNER JOIN [IDI_Clean_YYYYMM].[security].[concordance] AS b +ON a.snz_moh_uid = b.snz_moh_uid +GO + +/************************************ +MOE +************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_ethnicity_list] (snz_uid, ethnic_code, source_rank, record_date) +SELECT [snz_uid] + ,CONCAT( + SUBSTRING([moe_spi_eth1_text],1,2),';', + SUBSTRING([moe_spi_eth2_text],1,2),';', + SUBSTRING([moe_spi_eth3_text],1,2)) AS ethnic_code + ,CASE + WHEN SUBSTRING([moe_spi_eth1_text],1,2) in ('94','95','96','97','98') THEN 98 + WHEN SUBSTRING([moe_spi_eth1_text],1,2)='99' THEN 99 + else 7 END AS source_rank + ,[moe_spi_mod_address_date] AS record_date +FROM [IDI_Clean_YYYYMM].[moe_clean].[student_per] +WHERE [moe_spi_eth1_text] IS NOT NULL +AND snz_uid IS NOT NULL +GO + +/************************************ +ACC +************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_ethnicity_list] (snz_uid, ethnic_code, source_rank, record_date) +SELECT [snz_uid] + ,CONCAT( + SUBSTRING([acc_cli_ethnic1_snz_code],1,2),';', + SUBSTRING([acc_cli_ethnic2_snz_code],1,2),';', + SUBSTRING([acc_cli_ethnic3_snz_code],1,2)) AS ethnic_code + ,CASE + WHEN SUBSTRING([acc_cli_ethnic1_snz_code],1,2) in ('94','95','96','97','98') THEN 98 + WHEN SUBSTRING([acc_cli_ethnic1_snz_code],1,2)='99' THEN 99 + else 8 END AS source_rank + ,CAST('YYYY-MM-DD' AS DATE) AS record_date +FROM [IDI_Clean_YYYYMM].[acc_clean].[clients] +WHERE snz_uid IS NOT NULL +GO + +/************************************ +MSD +************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_ethnicity_list] (snz_uid, ethnic_code, source_rank, record_date) +SELECT [snz_uid] + ,CONCAT( + SUBSTRING([msd_swn_ucvii_ethnic1_snz_code],1,2),';', + SUBSTRING([msd_swn_ucvii_ethnic2_snz_code],1,2),';', + SUBSTRING([msd_swn_ucvii_ethnic3_snz_code],1,2)) AS ethnic_code + ,CASE + WHEN SUBSTRING([msd_swn_ucvii_ethnic1_snz_code],1,2) in ('94','95','96','97','98') THEN 98 + WHEN SUBSTRING([msd_swn_ucvii_ethnic1_snz_code],1,2)='99' THEN 99 + else 9 END AS source_rank + ,CAST('YYYY-MM-DD' AS DATE) AS record_date +FROM [IDI_Clean_YYYYMM].[msd_clean].[msd_swn] +WHERE snz_uid IS NOT NULL +GO + +/************************************ +Student Loans and Allowances MSD (SLM) +************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_ethnicity_list] (snz_uid, ethnic_code, source_rank, record_date) +SELECT [snz_uid] + ,CONCAT( + SUBSTRING([msd_sla_ethnic1_code],1,2),';', + SUBSTRING([msd_sla_ethnic2_code],1,2),';', + SUBSTRING([msd_sla_ethnic3_code],1,2)) AS ethnic_code + ,CASE + WHEN SUBSTRING([msd_sla_ethnic1_code],1,2) in ('94','95','96','97','98') THEN 98 + WHEN SUBSTRING([msd_sla_ethnic1_code],1,2)='99' THEN 99 + else 10 END AS source_rank + ,DATEFROMPARTS([msd_sla_year_nbr],7,1) AS record_date +FROM [IDI_Clean_YYYYMM.[sla_clean].[msd_borrowing] +WHERE snz_uid IS NOT NULL +GO + +/************************************ +HES -Household Economic Survey +************************************/ +--not included AS only 1-digit ethnic codes + +/************************************ +SOFIE +************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_ethnicity_list] (snz_uid, ethnic_code, source_rank, record_date) +SELECT a.[snz_uid] + ,CONCAT( + SUBSTRING([sofie_wav_ethnic1_snz_code],1,2),';', + SUBSTRING([sofie_wav_ethnic2_snz_code],1,2),';', + SUBSTRING([sofie_wav_ethnic3_snz_code],1,2),';', + SUBSTRING([sofie_wav_ethnic4_snz_code],1,2)) AS ethnic_code + ,CASE + WHEN SUBSTRING([sofie_wav_ethnic1_snz_code],1,2) in ('94','95','96','97','98') THEN 98 + WHEN SUBSTRING([sofie_wav_ethnic1_snz_code],1,2)='99' THEN 99 + else 12 END AS source_rank + ,COALESCE(b.[sofie_id_start_intervw_period_date], b.[sofie_id_start_ann_period_date]) AS record_date +FROM [IDI_Clean_YYYYMM].[sofie_clean].[person_waves] AS a +INNER JOIN [IDI_Clean_YYYYMM].[sofie_clean].[hq_id] AS b +ON a.snz_uid = b.snz_uid +AND a.sofie_wav_wave_nbr = b.sofie_id_wave_nbr +WHERE [sofie_wav_ethnic1_snz_code] IS NOT NULL +AND a.snz_uid IS NOT NULL +GO + +/************************************ +LINZ migrant survey +************************************/ +--not included + +/************************************ +HLFS +************************************/ + +INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_ethnicity_list] (snz_uid, ethnic_code, source_rank, record_date) +SELECT a.[snz_uid] + ,CONCAT( + SUBSTRING([hlfs_urd_ethnic_1_code],1,2),';', + SUBSTRING([hlfs_urd_ethnic_2_code],1,2),';', + SUBSTRING([hlfs_urd_ethnic_3_code],1,2),';', + SUBSTRING([hlfs_urd_ethnic_4_code],1,2),';', + SUBSTRING([hlfs_urd_ethnic_5_code],1,2),';', + SUBSTRING([hlfs_urd_ethnic_6_code],1,2)) AS ethnic_code + ,CASE + WHEN SUBSTRING([hlfs_urd_ethnic_1_code],1,2) in ('94','95','96','97','98') THEN 98 + WHEN SUBSTRING([hlfs_urd_ethnic_1_code],1,2)='99' THEN 99 + else 14 END AS source_rank + ,[nzis_is_quarter_date] AS record_date +FROM [IDI_Clean_YYYYMM].[hlfs_clean].[data] AS a +INNER JOIN [IDI_Clean_YYYYMM].[hlfs_clean].[nzis] AS b +ON a.snz_uid = b.snz_uid +AND a.[hlfs_urd_quarter_nbr] = b.[nzis_is_quarter_nbr] +AND a.[snz_hlfs_hhld_uid] = b.[snz_hlfs_hhld_uid] +WHERE a.snz_uid IS NOT NULL +GO + +/************************************ +ACM aucland city mission +************************************/ +--not included as one digit codes + +/************************************ +GSS +************************************/ +--a strange coding of ethnicity - not sure how to include + +/*************************************************************************************************************** +Keep best rank for each person +***************************************************************************************************************/ + +CREATE NONCLUSTERED INDEX my_index ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_ethnicity_list] (snz_uid) +GO + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_ethnicity_level_2] +GO + +WITH source_ranked AS ( + SELECT * + ,RANK() OVER (PARTITION BY [snz_uid] ORDER BY source_rank, record_date) AS ranked + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_ethnicity_list] +) +SELECT snz_uid + ,ethnic_code + ,source_rank +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_ethnicity_level_2] +FROM source_ranked +WHERE ranked = 1 +AND snz_uid IS NOT NULL +GO + +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_ethnicity_level_2] (snz_uid); +GO +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_ethnicity_level_2] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE) +GO + +/* remove raw list table */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_ethnicity_list] +GO + +/*************************************************************************************************************** +Indicator view +***************************************************************************************************************/ + +USE IDI_UserCode +GO + +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[vacc_ethnicity_level_1_and_2] +GO + +CREATE VIEW [DL-MAA20XX-YY].[vacc_ethnicity_level_1_and_2] AS +SELECT snz_uid + ,ethnic_code + ,source_rank + /* level 1 ethnicities */ + ,IIF(ethnic_code like '%10%' + OR ethnic_code like '%11%' + OR ethnic_code like '%12%', 1, 0) AS lvl1_eth_1 -- European + ,IIF(ethnic_code like '%21%', 1, 0) AS lvl1_eth_2 -- Maori + ,IIF(ethnic_code like '%30%' + OR ethnic_code like '%31%' + OR ethnic_code like '%32%' + OR ethnic_code like '%33%' + OR ethnic_code like '%34%' + OR ethnic_code like '%35%' + OR ethnic_code like '%36%' + OR ethnic_code like '%37%', 1, 0) AS lvl1_eth_3 -- Pacific + ,IIF(ethnic_code like '%40%' + OR ethnic_code like '%41%' + OR ethnic_code like '%42%' + OR ethnic_code like '%43%' + OR ethnic_code like '%44%', 1, 0) AS lvl1_eth_4 -- Asian + ,IIF(ethnic_code like '%51%' + OR ethnic_code like '%52%' + OR ethnic_code like '%53%', 1, 0) AS lvl1_eth_5 -- MELAA + ,IIF(ethnic_code like '%61%', 1, 0) AS lvl1_eth_6 -- other + /* level 2 ethnicities */ + ,IIF(ethnic_code LIKE '%10%', 1, 0) AS lvl2_eth_10 -- European nfd + ,IIF(ethnic_code LIKE '%11%', 1, 0) AS lvl2_eth_11 -- New Zealand European + ,IIF(ethnic_code LIKE '%12%', 1, 0) AS lvl2_eth_12 -- Other European + ,IIF(ethnic_code LIKE '%21%', 1, 0) AS lvl2_eth_21 -- Mäori + ,IIF(ethnic_code LIKE '%30%', 1, 0) AS lvl2_eth_30 -- Pacific Peoples nfd + ,IIF(ethnic_code LIKE '%31%', 1, 0) AS lvl2_eth_31 -- Samoan + ,IIF(ethnic_code LIKE '%32%', 1, 0) AS lvl2_eth_32 -- Cook Islands Maori + ,IIF(ethnic_code LIKE '%33%', 1, 0) AS lvl2_eth_33 -- Tongan + ,IIF(ethnic_code LIKE '%34%', 1, 0) AS lvl2_eth_34 -- Niuean + ,IIF(ethnic_code LIKE '%35%', 1, 0) AS lvl2_eth_35 -- Tokelauan + ,IIF(ethnic_code LIKE '%36%', 1, 0) AS lvl2_eth_36 -- Fijian + ,IIF(ethnic_code LIKE '%37%', 1, 0) AS lvl2_eth_37 -- Other Pacific Peoples + ,IIF(ethnic_code LIKE '%40%', 1, 0) AS lvl2_eth_40 -- Asian nfd + ,IIF(ethnic_code LIKE '%41%', 1, 0) AS lvl2_eth_41 -- Southeast Asian + ,IIF(ethnic_code LIKE '%42%', 1, 0) AS lvl2_eth_42 -- Chinese + ,IIF(ethnic_code LIKE '%43%', 1, 0) AS lvl2_eth_43 -- Indian + ,IIF(ethnic_code LIKE '%44%', 1, 0) AS lvl2_eth_44 -- Other Asian + ,IIF(ethnic_code LIKE '%51%', 1, 0) AS lvl2_eth_51 -- Middle Eastern + ,IIF(ethnic_code LIKE '%52%', 1, 0) AS lvl2_eth_52 -- Latin American + ,IIF(ethnic_code LIKE '%53%', 1, 0) AS lvl2_eth_53 -- African + ,IIF(ethnic_code LIKE '%61%', 1, 0) AS lvl2_eth_61 -- Other Ethnicity + /* non-responses */ + ,IIF(ethnic_code like '%94%' + OR ethnic_code like '%95%' + OR ethnic_code like '%97%' + OR ethnic_code like '%98%' + OR ethnic_code like '%99%', 1, 0) AS eth_non_response +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_ethnicity_level_2] +GO diff --git a/people/security_concordance.sql b/people/security_concordance.sql new file mode 100644 index 0000000..44144e4 --- /dev/null +++ b/people/security_concordance.sql @@ -0,0 +1,93 @@ +/************************************************************************************************** +Title: Security concordance indicators +Author: Craig Wright + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Security concordance indicators + +Intended purpose: +Identifying which datasets an identity is linked to. + +Inputs & Dependencies: +- [IDI_Clean].[security].[concordance] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[vacc_security_concordance] + +Notes: + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = vacc_ + Project schema = DL-MAA20XX-YY + +Issues: + +History (reverse order): +2021-10-31 CW +**************************************************************************************************/ + +USE IDI_UserCode +GO + +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[vacc_security_concordance] +GO + +CREATE VIEW [DL-MAA20XX-YY].[vacc_security_concordance] AS +SELECT [link_set_key] + ,[snz_uid] + ,[snz_spine_uid] + ,IIF([snz_spine_uid] IS NOT NULL, 1, 0) AS [spine] + ,IIF([snz_ird_uid] IS NOT NULL, 1, 0) AS [ird] + ,IIF([snz_moe_uid] IS NOT NULL, 1, 0) AS [moe] + ,IIF([snz_dol_uid] IS NOT NULL, 1, 0) AS [dol] + ,IIF([snz_hlfs_uid] IS NOT NULL, 1, 0) AS [hlfs] + ,IIF([snz_msd_uid] IS NOT NULL, 1, 0) AS [msd] + ,IIF([snz_sofie_uid] IS NOT NULL, 1, 0) AS [sofie] + ,IIF([snz_jus_uid] IS NOT NULL, 1, 0) AS [jus] + ,IIF([snz_acc_uid] IS NOT NULL, 1, 0) AS [acc] + ,IIF([snz_moh_uid] IS NOT NULL, 1, 0) AS [moh] + ,IIF([snz_dia_uid] IS NOT NULL, 1, 0) AS [dia] + ,IIF([snz_cen_uid] IS NOT NULL, 1, 0) AS [cen] + ,IIF([snz_hes_uid] IS NOT NULL, 1, 0) AS [hes] + ,IIF([snz_acm_uid] IS NOT NULL, 1, 0) AS [acm] + ,IIF([snz_nzta_uid] IS NOT NULL, 1, 0) AS [nzta] + ,IIF([snz_gss_uid] IS NOT NULL, 1, 0) AS [gss] + ,IIF([snz_otfs_uid] IS NOT NULL, 1, 0) AS [otfs] + ,IIF([snz_piaac_uid] IS NOT NULL, 1, 0) AS [piaac] + ,IIF([snz_esp_uid] IS NOT NULL, 1, 0) AS [esp] + ,IIF([snz_nzcvs_uid] IS NOT NULL, 1, 0) AS [nzcvs] + -- count linked UIDs + ,IIF([snz_ird_uid] IS NOT NULL, 1, 0) + + IIF([snz_moe_uid] IS NOT NULL, 1, 0) + + IIF([snz_dol_uid] IS NOT NULL, 1, 0) + + IIF([snz_hlfs_uid] IS NOT NULL, 1, 0) + + IIF([snz_msd_uid] IS NOT NULL, 1, 0) + + IIF([snz_sofie_uid] IS NOT NULL, 1, 0) + + IIF([snz_jus_uid] IS NOT NULL, 1, 0) + + IIF([snz_acc_uid] IS NOT NULL, 1, 0) + + IIF([snz_moh_uid] IS NOT NULL, 1, 0) + + IIF([snz_dia_uid] IS NOT NULL, 1, 0) + + IIF([snz_cen_uid] IS NOT NULL, 1, 0) + + IIF([snz_hes_uid] IS NOT NULL, 1, 0) + + IIF([snz_acm_uid] IS NOT NULL, 1, 0) + + IIF([snz_nzta_uid] IS NOT NULL, 1, 0) + + IIF([snz_gss_uid] IS NOT NULL, 1, 0) + + IIF([snz_otfs_uid] IS NOT NULL, 1, 0) + + IIF([snz_piaac_uid] IS NOT NULL, 1, 0) + + IIF([snz_esp_uid] IS NOT NULL, 1, 0) + + IIF([snz_nzcvs_uid] IS NOT NULL, 1, 0) AS [uids] +FROM [IDI_Clean_YYYYMM].[security].[concordance] +GO diff --git a/social_services/OT_placement.sql b/social_services/OT_placement.sql new file mode 100644 index 0000000..b9aecad --- /dev/null +++ b/social_services/OT_placement.sql @@ -0,0 +1,54 @@ +/************************************************************************************************** +Title: Recent OT placements +Author: Craig Wright + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Indicator of recent Oranga Tamariki placements for children. + +Intended purpose: +Identifying whether children have recently been in a placement arranged by Oranga Tamariki. + +Inputs & Dependencies: +- [IDI_Clean].[cen_clean].[census_individual_2018] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[vacc_OT_placement] + +Notes: + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = vacc_ + Project schema = DL-MAA20XX-YY + +Issues: + +History (reverse order): +2021-10-31 CW +**************************************************************************************************/ + +USE IDI_UserCode +GO + +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[vacc_OT_placement] +GO + +CREATE VIEW [DL-MAA20XX-YY].[vacc_OT_placement] AS +SELECT DISTINCT [snz_uid] + ,[snz_msd_uid] + ,1 as type +FROM [IDI_Clean_YYYYMM].[cyf_clean].[cyf_placements_event] +WHERE [cyf_ple_event_to_date_wid_date] >= '2021-07-01' +GO diff --git a/social_services/T2_benefit_supplementary_support.sql b/social_services/T2_benefit_supplementary_support.sql new file mode 100644 index 0000000..9b6964e --- /dev/null +++ b/social_services/T2_benefit_supplementary_support.sql @@ -0,0 +1,86 @@ +/************************************************************************************************** +Title: T2 benefit receipt by type +Author: Michael Hackney and Simon Anastasiadis, et. al. (HaBiSA project), +Reviewer: Simon Anastasiadis, AK + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Supplementary benefits received from MSD. + +Intended purpose: +Identify periods of Tier 2 benefit receipt, value of Tier 2 benefit received, +and to identify the types of T2 benefits. + +Inputs & Dependencies: +- [IDI_Clean].[msd_clean].[msd_second_tier_expenditure] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[msd_benefit_type_code] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[msd_benefit_type_code_4] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[supplementary_benefit_receipt] + +Notes: +1) Periods of Tier 2 benefit receipt with daily amount received. +2) The IDI metadata database contains multiple tables that translate benefit type codes into benefit + names/descriptions. The differences between these tables are not well explained. + As not every code appears in every table, for some applications we need to combine multiple metadata tables. + + +Parameters & Present values: + Current refresh = YYYYMM + Project schema = DL-MAA20XX-YY + +Issues: + +History (reverse order): +2021-09-03 MP modify for vaccination modelling +2019-04-09 AK QA, archived manual list, replaced with join to metadata, value change to total for period +2019-04-23 AK Changes applied, Code Index for codes 604, 605, 667 not available, Joining two tables, table meta data not available +2019-04-26 SA notes added above +2018-12-06 SA reviewed +2018-12-04 initiated +**************************************************************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[supplementary_benefit_receipt]; +GO + +WITH code_classifications AS ( + -- Code classifications + SELECT [Code], [classification] + FROM [IDI_Metadata].[clean_read_CLASSIFICATIONS].[msd_benefit_type_code] + + UNION ALL + + SELECT [Code], [classification] + FROM [IDI_Metadata].[clean_read_CLASSIFICATIONS].[msd_benefit_type_code_4] -- add three codes that do not appear in the first metadata table + WHERE Code IN (604, 605, 667) +) +SELECT DISTINCT snz_uid + ,[Code] + ,[classification] + ,msd_ste_start_date + ,msd_ste_end_date + ,[msd_ste_daily_gross_amt] AS daily_payment_amount + ,[msd_ste_daily_gross_amt] * (1 + DATEDIFF(DAY, msd_ste_start_date, msd_ste_end_date)) AS total_payment_amount +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_supplementary_benefit_receipt] +FROM [IDI_Clean_YYYYMM].[msd_clean].[msd_second_tier_expenditure] t2 +INNER JOIN code_classifications AS codes +ON t2.[msd_ste_supp_serv_code] = codes.Code +WHERE [msd_ste_supp_serv_code] IS NOT NULL +GO + +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[supplementary_benefit_receipt] (snz_uid); +GO +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[supplementary_benefit_receipt] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE) +GO diff --git a/social_services/T3_benefit_emergency_support.sql b/social_services/T3_benefit_emergency_support.sql new file mode 100644 index 0000000..85aa6a6 --- /dev/null +++ b/social_services/T3_benefit_emergency_support.sql @@ -0,0 +1,66 @@ +/************************************************************************************************** +Title: T3 benefit receipt +Author: Shaan Badenhorst + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Number of hardship and emergency payment received. + +Intended purpose: +Number of emergency benefits received. +Frequency of receipt. + +Inputs & Dependencies: +- [IDI_Clean].[msd_clean].[msd_third_tier_expenditure] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[emergency_benefit_receipt] + +Notes: +1) MSD classifies support payments into three tiers: + - T1 includes main benefits types that provide some replacement for income from employment. + - T2 includes supplementary benefits like accommodation supplement and winter energy payment. + - T3 includes hardship and emergency payments. + While both T2 and T3 are in addition of a person's main income (e.g. benefit, pension, wages) + one of the key differences is that T2 may be ongoing, while T3 are intended to be once-off + (e.g. food grant, car repairs, school uniforms). + +Parameters & Present values: + Current refresh = YYYYMM + Project schema = DL-MAA20XX-YY + Year of interest = 2020 + +Issues: + +History (reverse order): +2021-10-01 SB +**************************************************************************************************/ + +USE IDI_UserCode +GO + +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[emergency_benefit_receipt]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[emergency_benefit_receipt] AS +SELECT snz_uid + ,msd_tte_decision_date + ,msd_tte_pmt_amt AS amount_received + /* payment type codes */ + ,msd_tte_parent_svc_code + ,msd_tte_lump_sum_svc_code + ,msd_tte_pmt_rsn_type_code +FROM [IDI_Clean_YYYYMM].[msd_clean].[msd_third_tier_expenditure] +WHERE YEAR([msd_tte_decision_date]) >= 2020 +GO diff --git a/social_services/accommodation_supplement.sql b/social_services/accommodation_supplement.sql new file mode 100644 index 0000000..e4203ac --- /dev/null +++ b/social_services/accommodation_supplement.sql @@ -0,0 +1,63 @@ +/************************************************************************************* +Title: Accommodation supplement +Author: Verity Warn + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Accomodation supplement and accommodation benefit payments + +Intended purpose: +Identify number of people receiving accommodation supplement and accomodation benefit payments. + +Inputs & Dependencies: +- [IDI_Clean].[msd_clean].[msd_second_tier_expenditure] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[msd_income_support_serv_codes] +Output: +- [IDI_UserCode].[DL-MAA20XX-YY].[defn_accom_supp] + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +History +2022-06-14 VW Implement QA suggestions (rename table for reading ease, fix join to ensure benefit serv codes are relevant to period) +2022-06-14 SA QA +2022-05-26 VW Definition creation + +**************************************************************************************/ + +USE IDI_UserCode +GO + +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[defn_accom_supp]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[defn_accom_supp] AS +SELECT [snz_uid] + ,[msd_ste_start_date] + ,[msd_ste_end_date] + ,ROUND([msd_ste_period_nbr] * [msd_ste_daily_gross_amt], 2) AS [gross_payment] -- [msd_ste_period_nbr] equals number of days start-to-end inclusive +FROM [IDI_Clean_YYYYMM].[msd_clean].[msd_second_tier_expenditure] AS k +LEFT JOIN [IDI_Adhoc].[clean_read_MSD].[benefit_codes] AS code + ON k.msd_ste_supp_serv_code = code.serv +AND (k.msd_ste_supp_serv_code = code.additional_service_data + OR (code.additional_service_data IS NULL + AND (k.msd_ste_supp_serv_code ='null' OR k.msd_ste_supp_serv_code IS NULL) + )) +AND code.ValidFromtxt <= k.[msd_ste_start_date] +AND k.[msd_ste_start_date] <= code.ValidTotxt +WHERE msd_ste_supp_serv_code IN ('470', '471') -- accommodation support, accommodation benefit (there are others could include, check e.g. tenure allowance, income related rent subsidy HNZ) +GO diff --git a/social_services/beneficiary_on_IRD_EMS.sql b/social_services/beneficiary_on_IRD_EMS.sql new file mode 100644 index 0000000..df354a8 --- /dev/null +++ b/social_services/beneficiary_on_IRD_EMS.sql @@ -0,0 +1,61 @@ +/************************************************************************************************** +Title: Beneficiary indicator based on IRD EMS +Author: Freya Li +Reviewer: Simon Anastasiadis + + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Main benefit receipt indicator on IRD EMS. + +Intended purpose: +Creating indicators of individuals who receiving benefit and identifiable on IRD EMS data. + +Inputs & Dependencies: +- [IDI_Clean_YYYYMM].[ir_clean].[ird_ems] + +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[d2gP2_benefit_ind] + +Notes: + +Parameters & Present values: + Prefix = d2gP2_ + Project schema = [DL-MAA20XX-YY] + +Issues: + +History (reverse order): +2021-06-10 SA QA +2021-04-16 FL v1 +**************************************************************************************************/ + +/* Set database for writing views */ +USE IDI_UserCode +GO + +/* Clear existing view */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[d2gP2_benefit_ind]; +GO + +/* Create view */ +CREATE VIEW [DL-MAA20XX-YY].[d2gP2_benefit_ind] AS +SELECT snz_uid + ,[ir_ems_return_period_date] +FROM [IDI_Clean_YYYYMM].[ir_clean].[ird_ems] +WHERE ir_ems_income_source_code = 'BEN' -- income source is benefit +--AND YEAR([ir_ems_return_period_date]) = YYYY +--AND MONTH([ir_ems_return_period_date]) = MM -- has benefit record on specfic date +GO diff --git a/social_services/cfy_report_of_concern.sql b/social_services/cfy_report_of_concern.sql new file mode 100644 index 0000000..09aff94 --- /dev/null +++ b/social_services/cfy_report_of_concern.sql @@ -0,0 +1,64 @@ +/************************************************************************************************** +Title: Number of reports of concern to children +Author: Simon Anastasiadis + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Counting number of reports of concern to children that meets Sec 15 criteria. + +Intended purpose: +Counting number of reports of concern to children. + +Inputs & Dependencies: +- [IDI_Clean].[cyf_clean].[cyf_intakes_event] +- [IDI_Clean].[cyf_clean].[cyf_intakes_details] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[defn_cyf_report_of_concern] + + +Notes: +1) Oranga Tamariki (and CYF) have several layers of events including investigations, + and findings of abuse. Reports of Concern are the most common types of events that + meet a formal criteria. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +Issues: + +History (reverse order): +2020-05-20 SA v1 +**************************************************************************************************/ + +/* Set database for writing views */ +USE IDI_UserCode +GO + +/* Clear existing view */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[defn_cyf_report_of_concern]; +GO + +/* Create view */ +CREATE VIEW [DL-MAA20XX-YY].[defn_cyf_report_of_concern] AS +SELECT a.[snz_uid] + ,[cyf_ine_event_from_date_wid_date] + ,b.[cyf_ind_intake_type_code] +FROM [IDI_Clean_YYYYMM].[cyf_clean].[cyf_intakes_event] a +INNER JOIN [IDI_Clean_YYYYMM].[cyf_clean].[cyf_intakes_details] b +ON a.[snz_composite_event_uid] = b.[snz_composite_event_uid] +WHERE b.[cyf_ind_intake_type_code] = 'SEC15' +GO diff --git a/social_services/cyf_events.sql b/social_services/cyf_events.sql new file mode 100644 index 0000000..c6c062b --- /dev/null +++ b/social_services/cyf_events.sql @@ -0,0 +1,178 @@ +/********************************************************************************************************* +Title: CYF meets Sec 15 criterial +Author Simon Anastasiadis, Maui Brennan +Reviewer: AK, MP + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Descriptions: +Reporting concerning CYF that meets Section 15 criterial. + +Intended use: +- Section 15 - Identify concern for children and stress on parents +- Identifying all abuse events. + +Inputs & Dependencies: +- popdefn-stage.sql +- [IDI_Clean_YYYYMM].[cyf_clean].[cyf_abuse_event] +- [IDI_Clean_YYYYMM].[cyf_clean].[cyf_intakes_event] +- [IDI_Clean_YYYYMM].[cyf_clean].[cyf_placements_event] + +History (reverse order): +2020-08-29 reviewed, parameterised (MP) +2020-08-19 MB adding all abuse events +2019-04-23 reviewer (AK) +2019-04-01 Initiated +*/ +--PARAMETERS################################################################################################## +--SQLCMD only (Activate by clicking Query->SQLCMD Mode) +/* Already in master.sql; Uncomment when running individually +:setvar TBLPREF "swangt_" +:setvar IDIREF "IDI_Clean_YYYYMM" +:setvar PROJSCH "DL-MAA20XX-YY" +GO +*/ +--############################################################################################################## +/*embedded in user code*/ +USE IDI_UserCode +GO + +--############################################################################################################## +/* +Notes: +Section 15 - Identify concern for children and stress on parents +A care and protection client intake event occurs when a person who believes a child or young person (CYP) is being (or is likely to be) harmed, +ill-treated, abused, neglected, or deprived, reports the matter to CYF or the Police. +CYF also receive reports when there are concerns regarding a child or young person's behaviour, or insecurity of care. +A youth justice client intake event occurs when a child or young person is alleged to have committed an offence and the matter is referred by the Police +(or other enforcement agency), Youth Court, or Family Court. Where a child or young person appears before the court, they may also be placed in the custody of CYF following arrest. +The client intake event start date is the incident date of the notification to CYF and the end date is the end date of the client role in the intake phase. +*/ +-- Create a view for event + DROP VIEW IF EXISTS [$(PROJSCH)].[$(TBLPREF)cyf_intakes_event]; +GO + +CREATE VIEW [$(PROJSCH)].[$(TBLPREF)cyf_intakes_event] +AS +SELECT a.[snz_uid] + ,[cyf_ine_event_from_date_wid_date] AS [start_date] + ,[cyf_ine_event_from_date_wid_date] AS [end_date] + ,'SEC15' AS [description] + ,1 AS [value] + ,'cyf_intakes' AS [source] +FROM [$(IDIREF)].[cyf_clean].[cyf_intakes_event] a +INNER JOIN [$(IDIREF)].[cyf_clean].[cyf_intakes_details] b + ON a.[snz_composite_event_uid] = b.[snz_composite_event_uid] +WHERE b.[cyf_ind_intake_type_code] = 'SEC15'; +GO + +-- Create a view for indicator + DROP VIEW IF EXISTS [$(PROJSCH)].[$(TBLPREF)cyf_intakes_indicator]; +GO + +CREATE VIEW [$(PROJSCH)].[$(TBLPREF)cyf_intakes_indicator] +AS +SELECT a.[snz_uid] + ,'1900-01-01' AS "start_date" + ,'2100-01-01' AS "end_date" + ,'SEC15' AS [description] + ,1 AS [value] + ,'cyf_intakes' AS [source] +FROM [$(IDIREF)].[cyf_clean].[cyf_intakes_event] a +INNER JOIN [$(IDIREF)].[cyf_clean].[cyf_intakes_details] b + ON a.[snz_composite_event_uid] = b.[snz_composite_event_uid] +WHERE b.[cyf_ind_intake_type_code] = 'SEC15'; +GO + +-- Identifying all abuse events, broader definiton +/* +Notes: +An abuse finding event records the assessment that a social worker makes about whether or not a client has suffered abuse. +There is one event for every combination of client, perpetrator, and abuse type. +There will often be multiple abuse findings event records for a client because there may be multiple notifications for a client, each requiring an investigation. +The same client may have more than one type of abuse within the same period (eg physically and sexually abused). +Similarly, a client may have the same type of abuse more than once for the same notification, as a result of more than one perpetrator subjecting the client to the same abuse. +For example, a child is neglected by both parents. +*/ +-- Create a view for event + DROP VIEW IF EXISTS [$(PROJSCH)].[$(TBLPREF)cyf_abuse_event]; +GO + +CREATE VIEW [$(PROJSCH)].[$(TBLPREF)cyf_abuse_event] +AS +SELECT [snz_uid] + ,[cyf_abe_event_from_datetime] AS [start_date] + ,[cyf_abe_event_to_datetime] AS [end_date] + ,'Abuse Event Occurred' AS [description] + ,1 AS [value] + ,'cyf' AS [source] +FROM [$(IDIREF)].[cyf_clean].[cyf_abuse_event] +WHERE [cyf_abe_event_type_wid_nbr] IS NOT NULL; +GO + +-- Create a view for indicator + DROP VIEW IF EXISTS [$(PROJSCH)].[$(TBLPREF)cyf_abuse_indicator]; +GO + +CREATE VIEW [$(PROJSCH)].[$(TBLPREF)cyf_abuse_indicator] +AS +SELECT [snz_uid] + ,'1900-01-01' AS "start_date" + ,'2100-01-01' AS "end_date" + ,'Abuse Event Occurred' AS [description] + ,1 AS [value] + ,'cyf' AS [source] +FROM [$(IDIREF)].[cyf_clean].[cyf_abuse_event] +WHERE [cyf_abe_event_type_wid_nbr] IS NOT NULL; +GO + +-- Placements +/* +Notes: A placement event occurs when a placement record is created for a client. +Some placement records for a given client may overlap. An example of this is where a placement is in force but then a respite placement +(perhaps for a few days) occurs for the same client, who then returns to the original placement after the respite placement. +*/ +-- Create a view for event + DROP VIEW IF EXISTS [$(PROJSCH)].[$(TBLPREF)cyf_placement_event]; +GO + +CREATE VIEW [$(PROJSCH)].[$(TBLPREF)cyf_placement_event] +AS +SELECT [snz_uid] + ,cyf_ple_event_from_date_wid_date AS [start_date] + ,cyf_ple_event_to_date_wid_date AS [end_date] + ,'Placement Event Occurred' AS [description] + ,1 AS [value] + ,'cyf' AS [source] +FROM [$(IDIREF)].[cyf_clean].[cyf_placements_event] +WHERE [cyf_ple_event_type_wid_nbr] IS NOT NULL; +GO + +-- Create a view for indicator + DROP VIEW IF EXISTS [$(PROJSCH)].[$(TBLPREF)cyf_placement_indicator]; +GO + +CREATE VIEW [$(PROJSCH)].[$(TBLPREF)cyf_placement_indicator] +AS +SELECT [snz_uid] + ,'1900-01-01' AS "start_date" + ,'2100-01-01' AS "end_date" + ,'Placement Event Occurred' AS [description] + ,1 AS [value] + ,'cyf' AS [source] +FROM [$(IDIREF)].[cyf_clean].[cyf_placements_event] +WHERE [cyf_ple_event_type_wid_nbr] IS NOT NULL; +GO + + diff --git a/social_services/main_benefit_spells.sql b/social_services/main_benefit_spells.sql new file mode 100644 index 0000000..aa0489c --- /dev/null +++ b/social_services/main_benefit_spells.sql @@ -0,0 +1,111 @@ +/************************************************************************************************** +Title: Main benefit spell +Author: Simon Anastasiadis +Re-edit: Freya Li +Reviewer: Simon Anastasiadis + + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Main benefit receipt spells regardless of benefit type or beneficiary role. + +Intended purpose: +Creating indicators of when/whether a person was receiving a benefit. +Identifying spells when a person is receiving a benefit. +Counting the number of days a person spends receiving benefit. + +Inputs & Dependencies: +- main_benefits_by_type_and_partner_status.sql --> [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_abt_main_benefit_final] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_benefit_spell] +- [IDI_UserCode].[DL-MAA20XX-YY].[d2gP2_benefit_spell_with_ind] + + +Notes: +1) Input table already contains condensing and preparation. But when we discard the benefit + type and beneficiary role, further condensing is necessary to avoid double counting. +2) Condensing can be slow. But speed improvements arise from pre-filtering the input tables + to narrower dates of interest. + +Parameters & Present values: + Prefix = d2gP2_ + Project schema = [DL-MAA20XX-YY] + Earliest start date = '2006-01-01' + Latest end date = '2026-12-31' + +Issues: + +History (reverse order): +2021-06-10 SA QA and convert table to view for performance +2021-04-15 FL including those spells which end date is null +2021-01-26 SA QA +2021-01-11 FL v2 (Change prefix and update the table to the latest refresh) +2020-07-22 MP QA +2020-03-02 SA v1 +**************************************************************************************************/ + +/* Condensed spells */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_benefit_spell]; +GO + + +WITH +/* shared staging filter */ +staging_spells AS ( + SELECT [snz_uid], [start_date], [end_date] + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_abt_main_benefit_final] + WHERE [start_date] <= [end_date] + AND '2006-01-01' <= [end_date] + AND [start_date] <= '2026-12-31' +), +/* exclude start dates that are within another spell */ +spell_starts AS ( + SELECT [snz_uid], [start_date] + FROM staging_spells s1 + WHERE NOT EXISTS ( + SELECT 1 + FROM staging_spells s2 + WHERE s1.snz_uid = s2.snz_uid + AND DATEADD(DAY, -1, s1.[start_date]) BETWEEN s2.[start_date] AND s2.[end_date] + ) +), +/* exclude end dates that are within another spell */ +spell_ends AS ( + SELECT [snz_uid], [end_date] + FROM staging_spells t1 + WHERE NOT EXISTS ( + SELECT 1 + FROM staging_spells t2 + WHERE t2.snz_uid = t1.snz_uid + AND YEAR(t1.[end_date]) <> 9999 + AND DATEADD(DAY, 1, t1.[end_date]) BETWEEN t2.[start_date] AND t2.[end_date] + ) +) +SELECT s.snz_uid, s.[start_date], MIN(e.[end_date]) as [end_date] +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_benefit_spell] +FROM spell_starts s +INNER JOIN spell_ends e +ON s.snz_uid = e.snz_uid +AND s.[start_date] <= e.[end_date] +GROUP BY s.snz_uid, s.[start_date] +GO + +/* Add index */ +CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_benefit_spell] (snz_uid); +GO +/* Compress final table to save space */ +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_benefit_spell] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE); +GO + diff --git a/social_services/main_benefit_types.sql b/social_services/main_benefit_types.sql new file mode 100644 index 0000000..caf83d4 --- /dev/null +++ b/social_services/main_benefit_types.sql @@ -0,0 +1,97 @@ +/************************************************************************************************** +Title: Main benefit spell by type +Author: Simon Anastasiadis +Re-edit: Freya Li +Reviewer: Simon Anastasiadis + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Type of main benefit receipt spells where a person is the primary recipient or a partner. +Also includes NZ Superannuation as superannuatents interact with tier 2 and tier 3 +parts of the benefit system. + +Intended purpose: +Creating indicators of benefit type(s) a person is receiving. + +Inputs & Dependencies: +- main_benefits_by_type_and_partner_status.sql --> [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_abt_main_benefit_final] +- [IDI_Clean].[ir_clean].[ird_ems] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[d2gP2_main_benefit_types] + + +Notes: +1) NZ Super is distinct from main benefits. However, for the Debt to Government project + we add it on to this definition because similar debt recovery options exist for + people receiving Superannuation as for people receiving a main benefit. +2) This file is unnecessary in some refreshes as Benefit Dynamics Data includes Pensions + for some refreshes. The correspondence between Pensions from MSD BDD records + and NZSuper from the code below is very high (>95%). + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = d2gP2_ + Project schema = [DL-MAA20XX-YY] + earliest_year = 2014 + latest_year = 2020 + +Issues: + +History (reverse order): +2021-04-19 SA note added regarding Pensions in some refreshes +2021-01-26 SA QA +2021-01-11 FL v2 (Change prefix and update the table to the latest refresh) +2020-07-22 MP QA +2020-07-02 SA v1 +**************************************************************************************************/ + +USE IDI_UserCode +GO + +/* drop before re-creating */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[d2gP2_main_benefit_types]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[d2gP2_main_benefit_types] AS + +SELECT [snz_uid] + ,[start_date] + ,[end_date] + ,[level4] AS [ben_type] +FROM [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_abt_main_benefit_final] +WHERE YEAR([start_date]) <= 2020 +AND 2014 <= YEAR([end_date]) + +UNION ALL + +SELECT snz_uid + ,CASE WHEN [ir_ems_employee_start_date] IS NOT NULL + AND [ir_ems_employee_start_date] < [ir_ems_return_period_date] + AND DATEDIFF(DAY, [ir_ems_employee_start_date], [ir_ems_return_period_date]) < 60 -- employee started in the last two months + THEN [ir_ems_employee_start_date] + ELSE DATEFROMPARTS(YEAR([ir_ems_return_period_date]),MONTH([ir_ems_return_period_date]),1) END AS [start_date] + ,CASE WHEN [ir_ems_employee_end_date] IS NOT NULL + AND [ir_ems_employee_end_date] < [ir_ems_return_period_date] + AND ([ir_ems_employee_start_date] IS NULL OR [ir_ems_employee_start_date] < [ir_ems_employee_end_date]) + AND DATEDIFF(DAY, [ir_ems_employee_end_date], [ir_ems_return_period_date]) < 27 -- employee finished in the last month + THEN [ir_ems_employee_end_date] + ELSE [ir_ems_return_period_date] END AS [end_date] + ,'NZSuper' AS [ben_type] +FROM [IDI_Clean_YYYYMM].[ir_clean].[ird_ems] +WHERE [ir_ems_income_source_code] IN ('PEN') +AND [snz_ird_uid] <> 0 +AND YEAR([ir_ems_return_period_date]) BETWEEN 2014 AND 2020; -- exclude placeholder person without IRD number +GO + diff --git a/social_services/main_benefits_by_type_and_partner_status.sql b/social_services/main_benefits_by_type_and_partner_status.sql new file mode 100644 index 0000000..7b2cd5a --- /dev/null +++ b/social_services/main_benefits_by_type_and_partner_status.sql @@ -0,0 +1,490 @@ +/************************************************************************************************** +Title: Main benefit spell with type and role +Author: Simon Anastasiadis + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Main benefit receipt spells where a person is the primary recipient or a partner. +Beneficiary role (primary, partner, or single) and benefit type are provided. + +Intended purpose: +Creating indicators of when/whether a person was receiving a benefit. +Identifying spells when a person is receiving a benefit. +Counting the number of days a person spends receiving benefit. +Can be subsetted by type and role. + +Inputs & Dependencies: +- [IDI_Clean].[msd_clean].[msd_spell] +- [IDI_Clean].[msd_clean].[msd_partner] +- [IDI_Adhoc].[clean_read_MSD].[benefit_codes] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_abt_main_benefit_final] + + +Notes: +1) SAS code for the same purpose has been developed by Marc de Boer & the team at MSD. + We provide this equivalent approach in SQL. As it is more compact, and has only the + details necessary for the creation of this table. +2) As per the MSD data dictionary: Benefit tables in the IDI cover entitlements and + IRD EMS records dispensing. Differences between these tables can be due to changes + in entiltements, partial dispensing (e.g. due to automated deductions), and differences + in timing (weekly/daily vs. monthly). +3) The script is long as several intermediate tables must first be built. All intermediate + tables are deleted at the conclusion of the script. +4) Initial versions of this script used *_input views to create the final table instead of + *_condensed. This resulted in overlapping output spells for the same benefit type. + It was also inconsistent with the original methodology by Marc de Boer that does not reuse + the input tables once they have been condensed. +5) The benefit type output (msd_spel_servf_code, msd_spel_add_servf_code, level 1-4) for + people with [role] = 'partner' is not their benefit type but the benefit type of their + partner who is the one receiving a main benefit. There will be a corresponding record + for this person with [role] = 'primary'. +6) Note that in some refreshes Pensions (NZ Super) is one of the benefit types in the Benefit + Dynamics Data. This category is not commonly included when MSD reports on benefits + or beneficiaries. Hence researchers are advised to exclude this category in research + focused on MSD main benefits. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = tmp_ + Project schema = [DL-MAA20XX-YY] + +Issues: +- A very small number of records are duplicates (<0.1%). As de-duplicating is very slow + these have not been removed. +- Some identities have more than one main benefit at once. This results in overlapping spells. +- A very small number of partners (<0.5%) have main benefit spells at the same time as being + the partner of another person's main benefit spell. +- Very slow. Runtime > 18 minutes. + +History (reverse order): +2021-04-20 FL Filter out pension +2021-04-19 SA note added regarding Pensions in some refreshes +2021-01-19 SA corrected final query to use *_condensed instead of *_input, + added indexes to intermediate tables, + filter out 'Not On Main Benefit', + filter out people being their own partner, + 'partner' label created using INNER JOIN replacing LEFT JOIN. +2020-12-07 SA Location of [MSD_benefit_codes] changed +2020-11-23 FL QA +2020-07-22 MP QA +2020-03-02 SA updated header +2019-06-05 SA validated against previous table >99% match +**************************************************************************************************/ + +/******************************************************************************** +Interface + +Views to provide an easy point of correction if columns renamed +********************************************************************************/ +USE IDI_UserCode +GO + +/* drop before re-creating */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[tmp_main_benefit_spell_input]; +GO +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[tmp_main_benefit_partner_input]; +GO + +/* view for main recipient spells */ +CREATE VIEW [DL-MAA20XX-YY].[tmp_main_benefit_spell_input] AS +SELECT [snz_uid] + ,COALESCE([msd_spel_servf_code], 'null') AS [msd_spel_servf_code] + ,COALESCE([msd_spel_add_servf_code], 'null') AS [msd_spel_add_servf_code] -- must coalesce away missing values to join col to itself + ,[msd_spel_spell_start_date] AS [start_date] + ,COALESCE([msd_spel_spell_end_date], '9999-01-01') AS [end_date] +FROM [IDI_Clean_YYYYMM].[msd_clean].[msd_spell] +WHERE [msd_spel_spell_start_date] IS NOT NULL +AND ([msd_spel_spell_end_date] IS NULL + OR [msd_spel_spell_start_date] <= [msd_spel_spell_end_date]) +AND [msd_spel_servf_code] <> '839' -- code for "Not On Main Benefit" +AND [msd_spel_servf_code] <> '050' -- code for "Transitional Retirement Benefit" +AND [msd_spel_servf_code] <> '180' -- code for "New Zealand Superannuation" +AND [msd_spel_servf_code] <> '181' -- code for "Veteran's Pension" +GO + + +/* view for partner spells */ +CREATE VIEW [DL-MAA20XX-YY].[tmp_main_benefit_partner_input] AS +SELECT [snz_uid] + ,[partner_snz_uid] + ,[msd_ptnr_ptnr_from_date] AS [start_date] + ,COALESCE([msd_ptnr_ptnr_to_date], '9999-01-01') AS [end_date] +FROM [IDI_Clean_YYYYMM].[msd_clean].[msd_partner] +WHERE [msd_ptnr_ptnr_from_date] IS NOT NULL +AND ([msd_ptnr_ptnr_to_date] IS NULL + OR [msd_ptnr_ptnr_from_date] <= [msd_ptnr_ptnr_to_date]) +AND [snz_uid] <> [partner_snz_uid] +GO + +/******************************************************************************** +Condense Primary Benefit spells +(AKA packing date intervals OR merging overlapping spells) + +Where the same person has overlapping benefit spells, or a new spell +starts the same day/the day after an old spell ends, then merge the spells. + +E.g. +start_date end_date +2001-01-01 2001-01-05 +2001-01-06 2001-01-12 +2001-02-09 2001-02-14 +2001-02-12 2001-02-18 +2001-02-18 2001-02-29 +2010-10-10 2010-10-10 + +becomes +start_date end_date +2001-01-01 2001-01-12 +2001-02-09 2001-02-29 +2010-10-10 2010-10-10 +********************************************************************************/ + +/* drop table before re-creating */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_spell_condensed]; +GO + +/* create table with condensed spells */ +WITH +/* exclude start dates that are within another spell */ +spell_starts AS ( + SELECT [snz_uid] + ,[msd_spel_servf_code] + ,[msd_spel_add_servf_code] + ,[start_date] + ,[end_date] + FROM [IDI_UserCode].[DL-MAA20XX-YY].[tmp_main_benefit_spell_input] s1 + WHERE NOT EXISTS ( + SELECT 1 + FROM [IDI_UserCode].[DL-MAA20XX-YY].[tmp_main_benefit_spell_input] s2 + WHERE s1.snz_uid = s2.snz_uid + AND s1.[msd_spel_servf_code] = s2.[msd_spel_servf_code] + AND s1.[msd_spel_add_servf_code] = s2.[msd_spel_add_servf_code] + AND s2.[start_date] < s1.[start_date] + AND s1.[start_date] <= s2.[end_date] + ) +), +/* exclude end dates that are within another spell */ +spell_ends AS ( + SELECT [snz_uid] + ,[msd_spel_servf_code] + ,[msd_spel_add_servf_code] + ,[start_date] + ,[end_date] + FROM [IDI_UserCode].[DL-MAA20XX-YY].[tmp_main_benefit_spell_input] t1 + WHERE NOT EXISTS ( + SELECT 1 + FROM [IDI_UserCode].[DL-MAA20XX-YY].[tmp_main_benefit_spell_input] t2 + WHERE t2.snz_uid = t1.snz_uid + AND t1.[msd_spel_servf_code] = t2.[msd_spel_servf_code] + AND t1.[msd_spel_add_servf_code] = t2.[msd_spel_add_servf_code] + AND t2.[start_date] <= t1.[end_date] + AND t1.[end_date] < t2.[end_date] + ) +) +SELECT s.snz_uid + ,s.[msd_spel_servf_code] + ,s.[msd_spel_add_servf_code] + ,s.[start_date] + ,MIN(e.[end_date]) as [end_date] +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_spell_condensed] +FROM spell_starts s +INNER JOIN spell_ends e +ON s.snz_uid = e.snz_uid +AND s.[msd_spel_servf_code] = e.[msd_spel_servf_code] +AND s.[msd_spel_add_servf_code] = e.[msd_spel_add_servf_code] +AND s.[start_date] <= e.[end_date] +GROUP BY s.snz_uid, s.[start_date], s.[msd_spel_servf_code], s.[msd_spel_add_servf_code] +ORDER BY s.[start_date] +GO + +CREATE NONCLUSTERED INDEX my_index ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_spell_condensed] ([snz_uid]) + + + +/******************************************************************************** +Condense Partner Benefit spells +As per the same logic for primary benefit spells + +Note that we ignore the benefit type of the main beneficiary. +********************************************************************************/ + +/* drop table before re-creating */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_partner_condensed]; +GO + +/* create table with condensed spells */ +WITH +/* exclude start dates that are within another spell */ +spell_starts AS ( + SELECT [snz_uid] + ,[partner_snz_uid] + ,[start_date] + ,[end_date] + FROM [IDI_UserCode].[DL-MAA20XX-YY].[tmp_main_benefit_partner_input] s1 + WHERE NOT EXISTS ( + SELECT * + FROM [IDI_UserCode].[DL-MAA20XX-YY].[tmp_main_benefit_partner_input] s2 + WHERE s1.snz_uid = s2.snz_uid + AND s1.[partner_snz_uid] = s2.[partner_snz_uid] + AND s2.[start_date] < s1.[start_date] + AND s1.[start_date] <= s2.[end_date] + ) +), +/* exclude end dates that are within another spell */ +spell_ends AS ( + SELECT [snz_uid] + ,[partner_snz_uid] + ,[start_date] + ,[end_date] + FROM [IDI_UserCode].[DL-MAA20XX-YY].[tmp_main_benefit_partner_input] t1 + WHERE NOT EXISTS ( + SELECT * + FROM [IDI_UserCode].[DL-MAA20XX-YY].[tmp_main_benefit_partner_input] t2 + WHERE t2.snz_uid = t1.snz_uid + AND t1.[partner_snz_uid] = t2.[partner_snz_uid] + AND t2.[start_date] <= t1.[end_date] + AND t1.[end_date] < t2.[end_date] + ) +) +SELECT s.snz_uid + ,s.[partner_snz_uid] + ,s.[start_date] + ,MIN(e.[end_date]) as [end_date] +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_partner_condensed] +FROM spell_starts s +INNER JOIN spell_ends e +ON s.snz_uid = e.snz_uid +AND s.[partner_snz_uid] = e.[partner_snz_uid] +AND s.[start_date] <= e.[end_date] +GROUP BY s.snz_uid, s.[start_date], s.[partner_snz_uid] +ORDER BY s.[start_date] +GO + +CREATE NONCLUSTERED INDEX my_index ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_partner_condensed] ([snz_uid]) + + +/******************************************************************************** +Invert Primary benefit spells + +Return periods where the person does not have any spells in the input table. +Requires that input table has already been condensed + + +E.g. +start_date end_date +2001-01-01 2001-01-05 +2001-01-06 2001-01-12 +2001-02-12 2001-02-18 + +becomes +start_date end_date +1900-01-01 2000-12-31 +2001-01-13 2001-02-11 +2001-02-19 9999-12-31 + +********************************************************************************/ + +/* drop table before re-creating */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_spell_invert]; +GO + +/* create table with inverted spells */ +SELECT [snz_uid] + ,'non-benefit' AS [description] + ,[start_date] + ,[end_date] +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_spell_invert] +FROM ( + /* all forward looking spells */ + SELECT [snz_uid] + ,DATEADD(DAY, 1, [end_date]) AS [start_date] + ,LEAD(DATEADD(DAY, -1, [start_date]), 1, '9999-01-01') OVER ( + PARTITION BY [snz_uid] + ORDER BY [start_date] ) AS [end_date] + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_spell_condensed] + + UNION ALL + + /* back looking spell (to 'origin of time') created separately */ + SELECT [snz_uid] + ,'1900-01-01' AS [start_date] + ,DATEADD(DAY, -1, MIN([start_date])) AS [end_date] + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_spell_condensed] + GROUP BY [snz_uid] +) k +WHERE [start_date] <= [end_date] +AND '1900-01-01' <= [start_date] +AND [end_date] <= '9999-01-01' +GO + +CREATE NONCLUSTERED INDEX my_index ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_spell_invert] ([snz_uid]) + + +/******************************************************************************** +Invert Partner Benefit spells +As per the same logic for primary benefit spells +********************************************************************************/ + +/* drop table before re-creating */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_partner_invert]; +GO + +/* create table with inverted spells */ +SELECT [snz_uid] + ,NULL AS [partner_snz_uid] + ,[start_date] + ,[end_date] +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_partner_invert] +FROM ( + /* all forward looking spells */ + SELECT [snz_uid] + ,DATEADD(DAY, 1, [end_date]) AS [start_date] + ,LEAD(DATEADD(DAY, -1, [start_date]), 1, '9999-01-01') OVER ( + PARTITION BY [snz_uid] + ORDER BY [start_date] ) AS [end_date] + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_partner_condensed] + + UNION ALL + + /* back looking spell (to 'origin of time') created separately */ + SELECT [snz_uid] + ,'1900-01-01' AS [start_date] + ,DATEADD(DAY, -1, MIN([start_date])) AS [end_date] + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_partner_condensed] + GROUP BY [snz_uid] +) k +WHERE [start_date] <= [end_date] +AND '1900-01-01' <= [start_date] +AND [end_date] <= '9999-01-01' +GO + +CREATE NONCLUSTERED INDEX my_index ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_partner_invert] ([snz_uid]) + +/******************************************************************************** +Apply categorisation rules + +If in spell AND not in partner THEN 'single' +If in spell AND in partner THEN 'primary' +If partner in partner AND not in spell THEN 'partner' +ONLY 'single' and 'primary' have additional benefit details (like type & amount) +********************************************************************************/ + +/* drop table before re-creating */ +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_abt_main_benefit_final]; +GO + +SELECT k.* + ,code.level1 + ,code.level2 + ,code.level3 + ,code.level4 +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_abt_main_benefit_final] +FROM ( + +/* recipient where role = single as no partner during period */ + SELECT ys.[snz_uid] + ,'single' AS [role] + ,CASE WHEN ys.[start_date] <= np.[start_date] THEN np.[start_date] ELSE ys.[start_date] END AS [start_date] -- latest start date + ,CASE WHEN ys.[end_date] <= np.[end_date] THEN ys.[end_date] ELSE np.[end_date] END AS [end_date] -- earliest end date + ,ys.[msd_spel_servf_code] + ,ys.[msd_spel_add_servf_code] + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_spell_condensed] ys -- yes, spell + INNER JOIN [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_partner_invert] np -- no, partner + ON ys.snz_uid = np.snz_uid -- identity appears in both tables + AND ys.[start_date] <= np.[end_date] + AND np.[start_date] <= ys.[end_date] -- periods overlap + +UNION ALL + +/* recipient where role = single as never had partner */ + SELECT ys.[snz_uid] + ,'single' AS [role] + ,[start_date] + ,[end_date] + ,ys.[msd_spel_servf_code] + ,ys.[msd_spel_add_servf_code] + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_spell_condensed] ys -- yes, spell + WHERE NOT EXISTS ( + SELECT 1 + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_partner_invert] np -- no, partner + WHERE ys.snz_uid = np.snz_uid -- identity appears in both tables + ) + +UNION ALL + +/* recipient where role = primary */ + SELECT ys.[snz_uid] + ,'primary' AS [role] + ,CASE WHEN ys.[start_date] <= yp.[start_date] THEN yp.[start_date] ELSE ys.[start_date] END AS [start_date] -- latest start date + ,CASE WHEN ys.[end_date] <= yp.[end_date] THEN ys.[end_date] ELSE yp.[end_date] END AS [end_date] -- earliest end date + ,ys.[msd_spel_servf_code] + ,ys.[msd_spel_add_servf_code] + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_spell_condensed] ys -- yes, spell + INNER JOIN [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_partner_condensed] yp -- yes, partner + ON ys.snz_uid = yp.snz_uid -- identity appears in both tables + AND ys.[start_date] <= yp.[end_date] + AND yp.[start_date] < ys.[end_date] -- periods overlap + +UNION ALL + +/* receipt as role = partner */ + SELECT yp.[partner_snz_uid] AS [snz_uid] + ,'partner' AS [role] + ,CASE WHEN ps.[start_date] <= yp.[start_date] THEN yp.[start_date] ELSE ps.[start_date] END AS [start_date] -- latest start date + ,CASE WHEN ps.[end_date] <= yp.[end_date] THEN ps.[end_date] ELSE yp.[end_date] END AS [end_date] -- earliest end date + ,ps.[msd_spel_servf_code] + ,ps.[msd_spel_add_servf_code] + FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_partner_condensed] yp -- yes, partner + INNER JOIN [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_spell_condensed] ps -- (benefit receiving) partner spell + ON ps.snz_uid = yp.snz_uid + AND yp.[start_date] <= ps.[end_date] + AND ps.[start_date] < yp.[end_date] -- periods overlap + +) k +LEFT JOIN [IDI_Adhoc].clean_read_MSD.benefit_codes code +ON k.msd_spel_servf_code = code.serv +AND (k.msd_spel_add_servf_code = code.additional_service_data + OR (code.additional_service_data IS NULL + AND (k.msd_spel_add_servf_code ='null' OR k.msd_spel_add_servf_code IS NULL) + )) +AND code.ValidFromtxt <= k.[start_date] +AND k.[start_date] <= code.ValidTotxt +GO + + + +/******************************************************************************** +Tidy up and remove all temporary tables/views that have been created +********************************************************************************/ + + +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[tmp_main_benefit_spell_input]; +GO +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[tmp_main_benefit_partner_input]; +GO +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_spell_condensed]; +GO +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_partner_condensed]; +GO +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_spell_invert]; +GO +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_main_benefit_partner_invert]; +GO + +CREATE NONCLUSTERED INDEX my_index ON [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_abt_main_benefit_final] ([snz_uid]) + +ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[d2gP2_abt_main_benefit_final] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE) + diff --git a/social_services/special_needs_grant.sql b/social_services/special_needs_grant.sql new file mode 100644 index 0000000..8efea6b --- /dev/null +++ b/social_services/special_needs_grant.sql @@ -0,0 +1,72 @@ +/************************************************************************************************** +Title: Special Needs Grant - Food or Medical grants +Author: Verity Warn +Reviewer: Penny Mok + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Identify T3 benefit hardship assistance - Special Needs Food grants or Special Needs Medical grants. + +Intended Purpose: +Identify Special Needs Grants - particularly for Food and Medical And Associated Costs (SNG) + +Inputs & Dependencies: +- [IDI_Clean_202203].[msd_clean].[msd_third_tier_expenditure] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[msd_income_support_pay_reason] +- [IDI_Metadata].[clean_read_CLASSIFICATIONS].[msd_income_support_serv_codes] +Outputs: +- [IDI_Sandpit].[DL-MAA20XX-YY].[defn_gss1418] + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + +History (reverse order): +2022-07-19 VW Edit filters on final table (only identify SNG) +2022-07-15 VW Definition creation +**************************************************************************************************/ + +DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_t3_types]; + +/* Create temporary table with all required metadata */ +WITH t3_types AS( +SELECT + snz_uid + ,[msd_tte_lump_sum_svc_code] + ,benefit_name + ,[msd_tte_pmt_rsn_type_code] + ,b.payment_reason_lvl2 + ,b.payment_reason_lvl1 + ,msd_tte_decision_date + ,msd_tte_pmt_amt +FROM [IDI_Clean_YYYYMM].[msd_clean].[msd_third_tier_expenditure] a +LEFT JOIN [IDI_Metadata].[clean_read_CLASSIFICATIONS].[msd_income_support_pay_reason] b +ON a.[msd_tte_pmt_rsn_type_code] = b.[payrsn_code] +LEFT JOIN [IDI_Metadata].[clean_read_CLASSIFICATIONS].[msd_income_support_serv_codes] c +ON a.[msd_tte_lump_sum_svc_code] = c.serv_code +WHERE [msd_tte_recoverable_ind] = 'N' +) + +/* Create flags for special needs grant types MSD interested in */ +SELECT snz_uid + ,msd_tte_decision_date + ,IIF(payment_reason_lvl2 = 'Food' AND msd_tte_pmt_amt > 0, 1, 0) AS special_needs_grant_food + ,IIF(payment_reason_lvl2 = 'Medical And Associated Costs (SNG)' AND msd_tte_pmt_amt > 0, 1, 0) AS special_needs_grant_medical +INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_t3_types] +FROM t3_types +WHERE benefit_name = 'Special Needs Grant' -- therefore any snz_uid in this output table has a special needs grant + + diff --git a/social_services/temp_additional_supplement.sql b/social_services/temp_additional_supplement.sql new file mode 100644 index 0000000..d65b0a6 --- /dev/null +++ b/social_services/temp_additional_supplement.sql @@ -0,0 +1,65 @@ +/************************************************************************************* +Title: Temporary additional supplement +Author: Verity Warn + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Temporary accommodation supplement payments. + +Intended purpose: +- Identifying usage of temporary additional supplements in the 65+ population + +Inputs & Dependencies: +- [IDI_Clean].[msd_clean].[msd_second_tier_expenditure] +- [IDI_Adhoc].[clean_read_MSD].[benefit_codes] +Output: +- [IDI_UserCode].[DL-MAA20XX-YY].[defn_temp_add_supp] + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = defn_ + Project schema = [DL-MAA20XX-YY] + + +Notes +- Latest start date is 2021-12-31 + +History: +2022-06-14 VW Implement SA QA - rename table, add extra TAS codes, edit join to take only current benefit type codes +2022-06-14 SA QA +2022-05-30 VW Created definition +**************************************************************************************/ + +USE IDI_UserCode +GO + +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[defn_temp_add_supp]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[defn_temp_add_supp] AS +SELECT snz_uid + ,msd_ste_start_date + ,msd_ste_end_date + ,ROUND([msd_ste_period_nbr] * [msd_ste_daily_gross_amt], 2) AS tas_gross_payment -- [msd_ste_period_nbr] equals number of days start-to-end inclusive + --,level4 -- confirm TAS +FROM [IDI_Clean_YYYYMM].[msd_clean].[msd_second_tier_expenditure] AS k +LEFT JOIN [IDI_Adhoc].[clean_read_MSD].[benefit_codes] AS code +ON k.msd_ste_supp_serv_code = code.serv +AND code.ValidFromtxt <= k.msd_ste_start_date +AND k.msd_ste_start_date <= code.ValidTotxt +WHERE msd_ste_supp_serv_code IN ('450', '460', '473') -- TAS code +GO + + diff --git a/transport/nzta_license_and_registration.sql b/transport/nzta_license_and_registration.sql new file mode 100644 index 0000000..598df57 --- /dev/null +++ b/transport/nzta_license_and_registration.sql @@ -0,0 +1,82 @@ +/************************************************************************************************** +Title: Driver licensing and vehicle registration +Author: Joel Bancolita + +Acknowledgements: +Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions + +Disclaimer: +The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the +context of a specific project. Whether or not these definitions are suitable for other projects depends on the +context of those projects. Researchers using definitions from this library will need to determine for themselves +to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides +this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. + +Citation: +Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library + +Description: +Driver licensing and vehicle registration + +Intended purpose: +Indication of whether a person has an NZ driver's license or has a motor vehicle registered in their name. + +Inputs & Dependencies: +- [IDI_Clean].[nzta_clean].[drivers_licence_register] +- [IDI_Clean].[nzta_clean].[motor_vehicle_register] +Outputs: +- [IDI_UserCode].[DL-MAA20XX-YY].[vacc_driver_license] +- [IDI_UserCode].[DL-MAA20XX-YY].[vacc_vehicle_registration] + +Notes: +1) Recent driver's license status for regular vehicles. Special licenses (e.g. trucks, buses, etc.) + deliberately excluded as they require a regular license. + Not suited for identifying the time of license renewal/application. + +Parameters & Present values: + Current refresh = YYYYMM + Prefix = vacc_ + Project schema = DL-MAA20XX-YY + +Issues: + +History (reverse order): +2021-08-31 MP limiting only to full license and disregarding vehicle types +2020-07-20 JB cull +**************************************************************************************************/ + +/* Set database for writing views */ +USE IDI_UserCode +GO + +/* Recent driver's license issue */ +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[vacc_driver_license]; +GO + +CREATE VIEW [DL-MAA20XX-YY].[vacc_driver_license] AS +SELECT [snz_uid] + ,[nzta_dlr_licence_from_date] AS [start_date] + ,[nzta_dlr_licence_from_date] AS [end_date] +-- ,CONCAT('dlr status ',SUBSTRING([nzta_dlr_licence_stage_text], 1, 20)) AS [description] + , 'full or restricted license' AS [description] +FROM [IDI_Clean_YYYYMM].[nzta_clean].[drivers_licence_register] +WHERE [nzta_dlr_licence_class_text] = 'MOTOR CARS AND LIGHT MOTOR VEHICLES' +AND nzta_dlr_licence_stage_text in ('FULL', 'RESTRICTED') -- only full lincenses +GO + +/* NZTA Motor vehicle registration */ + +DROP VIEW IF EXISTS [DL-MAA20XX-YY].[vacc_vehicle_registration] +GO + +/* Create staging */ +CREATE VIEW [DL-MAA20XX-YY].[vacc_vehicle_registration] AS +SELECT snz_uid + ,nzta_mvr_reg_date AS [start_date] + ,nzta_mvr_end_date AS [end_date] +-- ,'Vehicle: ' + nzta_mvr_body_type_text as [description] + ,'any vehicle' AS [description] + FROM [IDI_Clean_YYYYMM].[nzta_clean].[motor_vehicle_register] + WHERE nzta_mvr_body_type_text NOT LIKE '%TRAILER%' + AND nzta_mvr_body_type_text NOT LIKE '%UNKNOWN%' +GO