-
Notifications
You must be signed in to change notification settings - Fork 1
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Simon Anastasiadis
authored and
Simon Anastasiadis
committed
Jun 25, 2024
1 parent
7435b53
commit e65c722
Showing
106 changed files
with
18,689 additions
and
2 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,31 @@ | ||
# definitions_library | ||
Definitions from SWA's projects suitable for use with the dataset assembly tool. | ||
# IDI definitions library | ||
Definitions from the Social Investment Agency (proeviously Social Wellbeing Agency) projects in a single repository for ease of discovery. These definitions are intended to be compatible with the Dataset Assembly Tool. | ||
|
||
## Overview | ||
The Dataset Assembly Tool encourages research projects to structure their input data into population and measures. The population is specific to a specific study and includes who is being studied and over what time period. The measures are not specific to a study. This means they can be reused across research projects. | ||
|
||
When creating a new measure, researchers are encouraged to construct the best definition of the measure that they can. By creating and sharing high quality definitions, the quality of research will improve. | ||
|
||
## Dependencies | ||
Because these definitions are based on data in the IDI, it is necessary to have an IDI project if you wish to run the code. Visit the Stats NZ website for more information about this. | ||
|
||
The definitions in this library were developed using specific versions/refreshes of the IDI. As the IDI is further developed, new refreshes will be released. When reusing these definitions, it is the researcher's responsibility to updated them if the refresh being used in the project does not match the original refresh of the definition. Trying to use different refreshes in the same project will likely result in errors and links between data sources can change between refreshes. | ||
|
||
In addition to updating the refresh, researchers will need to update the project schema before running these definitions. Unless you have permission to create tables and views in the chosen schema, you will receive an error when you run a definition. | ||
|
||
Any dependencies for an individual definition should be noted in the header at the top of the file. | ||
|
||
## Folder descriptions | ||
This repository contains definitions from other projects and the definitions are stored in different domain folders. | ||
|
||
## Disclaimer | ||
The definitions provided in this library were determined by the Agency to be suitable in the context of a specific project. Whether or not these definitions are suitable for other projects depends on the context of those projects. Researchers using definitions from this library will need to determine for themselves to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. | ||
|
||
## Acknowledgements | ||
Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of a number of definitions from the vaccination uptake project. | ||
|
||
## Citation | ||
Social Investment Agency. Definitions library. Source code. https://github.com/nz-social-investment-agency/definitions_library | ||
|
||
## Getting Help | ||
If you have any questions email [email protected] |
Binary file not shown.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,118 @@ | ||
/************************************************************************************************** | ||
Title: First arrival in New Zealand | ||
Author: Luke Scullion | ||
Reviewer: Shaan Badenhorst | ||
|
||
Acknowledgements: | ||
Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions | ||
|
||
Disclaimer: | ||
The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the | ||
context of a specific project. Whether or not these definitions are suitable for other projects depends on the | ||
context of those projects. Researchers using definitions from this library will need to determine for themselves | ||
to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides | ||
this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. | ||
|
||
Citation: | ||
Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library | ||
|
||
Description: | ||
Best estimate of first arrival in New Zealand across sources. | ||
|
||
Intended purpose: | ||
When a person first entered NZ, how long a person has been in NZ. | ||
|
||
Inputs & Dependencies: | ||
- [IDI_Clean].[cen_clean].[census_individual_2018] | ||
- [IDI_Clean].[cen_clean].[census_individual_2013] | ||
- [IDI_Clean].[data].[person_overseas_spell] | ||
Outputs: | ||
- [IDI_UserCode].[DL-MAA20XX-YY].[vacc_Cen2018_Occupation] | ||
|
||
Notes: | ||
1) The definition synthesizes across Census 2018, Census 2013, and Overseas spells data. | ||
2) May not accurately identify beginning of residence in NZ if a person visited | ||
New Zealand before migrating here. | ||
|
||
Parameters & Present values: | ||
Current refresh = YYYYMM | ||
Prefix = vacc_ | ||
Project schema = DL-MAA20XX-YY | ||
|
||
Issues: | ||
|
||
History (reverse order): | ||
2021-11-10 SB review | ||
2021-10-31 LC | ||
**************************************************************************************************/ | ||
|
||
/***************************** | ||
create table of all possible entries | ||
*****************************/ | ||
DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_NZ_arrival] | ||
GO | ||
|
||
CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_NZ_arrival] ( | ||
snz_uid INT, | ||
arrival_year INT, | ||
arrival_month INT, | ||
); | ||
GO | ||
|
||
/* Census 2018 */ | ||
INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_NZ_arrival] (snz_uid, arrival_year, arrival_month) | ||
SELECT snz_uid | ||
,CAST([cen_ind_arrv_in_nz_year_code] AS INT) AS arrival_year | ||
,CAST([cen_ind_arrv_in_nz_month_code] AS INT) AS arrival_month | ||
FROM [IDI_Clean_YYYYMM].[cen_clean].[census_individual_2018] | ||
WHERE snz_uid IS NOT NULL | ||
|
||
/* Census 2013 */ | ||
INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_NZ_arrival] (snz_uid, arrival_year, arrival_month) | ||
SELECT snz_uid | ||
,CAST([cen_ind_arrival_in_nz_yr_code] AS INT) AS arrival_year | ||
,CAST([cen_ind_arrival_in_nz_mnth_code] AS INT) AS arrival_month | ||
FROM [IDI_Clean_YYYYMM].[cen_clean].[census_individual_2013] | ||
WHERE snz_uid IS NOT NULL | ||
|
||
/* Overseas spells */ | ||
INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_NZ_arrival] (snz_uid, arrival_year, arrival_month) | ||
SELECT snz_uid | ||
,YEAR([pos_ceased_date]) AS arrival_year | ||
,MONTH([pos_ceased_date]) AS arrival_month | ||
FROM [IDI_Clean_YYYYMM].[data].[person_overseas_spell] | ||
WHERE [pos_first_arrival_ind] = 'y' | ||
AND snz_uid IS NOT NULL | ||
|
||
CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_NZ_arrival] (snz_uid); | ||
GO | ||
|
||
/***************************** | ||
filter and conclude | ||
*****************************/ | ||
|
||
DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_arrival_in_NZ]; | ||
GO | ||
|
||
WITH tidied_variables AS ( | ||
|
||
SELECT snz_uid | ||
,IIF(arrival_year >= 2025, NULL, arrival_year) AS arrival_year | ||
,IIF(arrival_month NOT IN (1,2,3,4,5,6,7,8,9,10,11,12), NULL, arrival_month) AS arrival_month | ||
FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_NZ_arrival] | ||
|
||
) | ||
SELECT snz_uid | ||
,MIN(arrival_year) AS arrival_year | ||
,MIN(arrival_month) AS arrival_month | ||
INTO [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_arrival_in_NZ] | ||
FROM tidied_variables | ||
GROUP BY snz_uid | ||
GO | ||
|
||
CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[vacc_arrival_in_NZ] (snz_uid); | ||
GO | ||
|
||
|
||
DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_NZ_arrival] | ||
GO |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,215 @@ | ||
/************************************************************************************************** | ||
Title: Migration date to New Zealand | ||
Author: Simon Anastasiadis | ||
Acknowledgements: | ||
Informatics for Social Services and Wellbeing (terourou.org) supported the publishing of these definitions | ||
Disclaimer: | ||
The definitions provided in this library were determined by the Social Wellbeing Agency to be suitable in the | ||
context of a specific project. Whether or not these definitions are suitable for other projects depends on the | ||
context of those projects. Researchers using definitions from this library will need to determine for themselves | ||
to what extent the definitions provided here are suitable for reuse in their projects. While the Agency provides | ||
this library as a resource to support IDI research, it provides no guarantee that these definitions are fit for reuse. | ||
Citation: | ||
Social Wellbeing Agency. Definitions library. Source code. https://github.com/nz-social-wellbeing-agency/definitions_library | ||
Description: | ||
Date of migration to New Zealand based on compolation of different sources keeping highest qulaity source. | ||
Intended purpose: | ||
1. Creating indicators of when a person migrated to New Zealand. | ||
2. Creating indicators of how recently a person migrated to NZ (for example, within the last 10 years). | ||
3. Calculating age when migrated to New Zealand. | ||
Inputs & Dependencies: | ||
- [IDI_Clean].[cen_clean].[census_individual_2018] | ||
- [IDI_Clean].[cen_clean].[census_individual_2013] | ||
- [IDI_Clean].[data].[person_overseas_spell] | ||
Outputs: | ||
- [IDI_Sandpit].[DL-MAA20XX-YY].[defn_migration_date] | ||
Notes: | ||
1) Aim is to create a starting point for examining when people migrate to New Zealand | ||
Useful for understanding whether people are new migrants and at what age they migrated | ||
(with implications for first-generation, second-generation, etc. migrant measures). | ||
2) We have used Census 2018 and 2013 as the highest quality sources. These sources are | ||
self-report. For old dates these may vary in quality (for example, people may round to | ||
nearest 5 years when self-reporting). | ||
3) For Overseas spells, we examined duration of spells overseas and compared against | ||
self-reported values in Census 2018. | ||
- While people are non-resident, their overseas spells are much more likely to | ||
be greater than 180 days. | ||
- Once peole become resident, their overseas spells are much more likely to | ||
be less than 180 days. | ||
Hence, we tested "migration date = end date of last overseas spells with duration | ||
180+ days". This definition has strong consistency with self-reported year. | ||
4) Note that we have no control for 'is the person resident' or 'was the person resident'. | ||
Some people who migrate to New Zealand will have since emmigrated from NZ. | ||
It is recommended that you first use the Estimated Residnetial Population (ERP) by Stats | ||
to determine who is resident. And where people who are resident were born overseas, | ||
then use this code as a starting point to determine when they arrived in NZ. | ||
Parameters & Present values: | ||
Current refresh = YYYYMM | ||
Prefix = defn_ | ||
Project schema = [DL-MAA20XX-YY] | ||
Issues: | ||
1) We do not have sufficient expertise with immigration records in IDI to use this source. | ||
Adding this source is the obvious and essential next step. | ||
2) Run time 7 minutes | ||
History (reverse order): | ||
2022-10-12 SA v1 | ||
**************************************************************************************************/ | ||
|
||
/******************************************************** | ||
TABLES TO APPEND TO | ||
********************************************************/ | ||
|
||
/* Diagnosis or treatment only indicates dysthymia */ | ||
DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_migration_date] | ||
GO | ||
|
||
CREATE TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_migration_date] ( | ||
snz_uid INT, | ||
event_date DATE, | ||
event_year INT, | ||
event_month INT, | ||
origin VARCHAR(12) | ||
) | ||
|
||
/******************************************************** | ||
Census 2018 | ||
********************************************************/ | ||
|
||
INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_migration_date] (snz_uid, event_date, event_year, event_month, origin) | ||
SELECT [snz_uid] | ||
,CASE -- required as transformation applied before filtering (even if using nested queries) | ||
WHEN [cen_ind_arrv_in_nz_year_code] NOT IN ('7777', '9999') | ||
AND [cen_ind_arrv_in_nz_month_code] NOT IN ('77', '99') | ||
THEN DATEFROMPARTS([cen_ind_arrv_in_nz_year_code], [cen_ind_arrv_in_nz_month_code], 15) | ||
END AS event_date | ||
,CAST([cen_ind_arrv_in_nz_year_code] AS INT) AS event_year | ||
,CAST([cen_ind_arrv_in_nz_month_code] AS INT) AS event_month | ||
,'cen18' AS origin | ||
FROM [IDI_Clean_YYYYMM].[cen_clean].[census_individual_2018] | ||
WHERE [cen_ind_birth_country_code] NOT IN ('1201', '9999', '0000') -- Exclude NZ and non-response | ||
AND [cen_ind_birth_country_impt_ind] IN ('11', '21') -- Census 2018 response | ||
AND [cen_ind_arrv_in_nz_year_code] NOT IN ('7777', '9999') -- Some year required | ||
AND [cen_ind_arrv_in_nz_month_code] NOT IN ('77', '99') -- Some month required | ||
GO | ||
|
||
/******************************************************** | ||
Census 2013 | ||
********************************************************/ | ||
|
||
INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_migration_date] (snz_uid, event_date, event_year, event_month, origin) | ||
SELECT [snz_uid] | ||
,CASE | ||
WHEN [cen_ind_arrival_in_nz_yr_code] NOT IN ('7777', '9999') | ||
AND [cen_ind_arrival_in_nz_mnth_code] NOT IN ('77', '99') | ||
THEN DATEFROMPARTS([cen_ind_arrival_in_nz_yr_code], [cen_ind_arrival_in_nz_mnth_code], 15) | ||
END AS event_date | ||
,CAST([cen_ind_arrival_in_nz_yr_code] AS INT) AS event_year | ||
,CAST([cen_ind_arrival_in_nz_mnth_code] AS INT) AS event_month | ||
,'cen13' AS origin | ||
FROM [IDI_Clean_YYYYMM].[cen_clean].[census_individual_2013] | ||
WHERE [cen_ind_birth_country_code] NOT IN ('1201', '9999', '0000') -- Exclude NZ and non-response | ||
AND [cen_ind_arrival_in_nz_yr_code] NOT IN ('7777', '9999') -- Some year required | ||
AND [cen_ind_arrival_in_nz_mnth_code] NOT IN ('77', '99') -- Some month required | ||
GO | ||
|
||
/******************************************************** | ||
Overseas spells | ||
Find people with a first date in NZ. | ||
Find the earliest date such that any overseas spells of length 180+ days | ||
are all prior to this date. | ||
********************************************************/ | ||
|
||
WITH | ||
people_who_have_a_first_arrival AS ( | ||
SELECT snz_uid | ||
FROM [IDI_Clean_YYYYMM].[data].[person_overseas_spell] | ||
WHERE pos_first_arrival_ind = 'y' -- people born in NZ should not have a first arrival | ||
), | ||
last_day_of_long_overseas_spell AS ( | ||
SELECT [snz_uid] | ||
,MAX([pos_ceased_date]) AS max_ceased_date | ||
FROM [IDI_Clean_YYYYMM].[data].[person_overseas_spell] | ||
WHERE [pos_day_span_nbr] > 180 | ||
AND pos_last_departure_ind <> 'y' -- not last departure from NZ | ||
GROUP BY snz_uid | ||
) | ||
INSERT INTO [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_migration_date] (snz_uid, event_date, event_year, event_month, origin) | ||
SELECT snz_uid | ||
,max_ceased_date AS event_date | ||
,YEAR(max_ceased_date) AS event_year | ||
,MONTH(max_ceased_date) AS event_month | ||
,'spells' AS origin | ||
FROM last_day_of_long_overseas_spell AS ld | ||
WHERE max_ceased_date < GETDATE() | ||
AND EXISTS ( | ||
SELECT 1 | ||
FROM people_who_have_a_first_arrival AS pw | ||
WHERE pw.snz_uid = ld.snz_uid | ||
) | ||
GO | ||
|
||
/******************************************************** | ||
Keep highest ranked source | ||
********************************************************/ | ||
|
||
/* index for performance */ | ||
CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_migration_date] (snz_uid); | ||
GO | ||
|
||
/* delete before creation */ | ||
DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[defn_migration_date] | ||
GO | ||
|
||
WITH | ||
ranked AS ( | ||
SELECT * | ||
,CASE | ||
WHEN origin = 'cen18' THEN 1 | ||
WHEN origin = 'cen13' THEN 2 | ||
WHEN origin = 'spells' THEN 3 | ||
END AS ranked | ||
FROM [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_migration_date] | ||
), | ||
source_ranked AS ( | ||
SELECT * | ||
,ROW_NUMBER() OVER (PARTITION BY snz_uid ORDER BY ranked, event_date) AS source_ranked | ||
FROM ranked | ||
) | ||
SELECT snz_uid | ||
,event_date | ||
,event_year | ||
,event_month | ||
--,origin | ||
INTO [IDI_Sandpit].[DL-MAA20XX-YY].[defn_migration_date] | ||
FROM source_ranked | ||
WHERE source_ranked = 1 | ||
GO | ||
|
||
/******************************************************** | ||
Tidy | ||
********************************************************/ | ||
|
||
/* index */ | ||
CREATE NONCLUSTERED INDEX my_index_name ON [IDI_Sandpit].[DL-MAA20XX-YY].[defn_migration_date] (snz_uid); | ||
GO | ||
/* compress */ | ||
ALTER TABLE [IDI_Sandpit].[DL-MAA20XX-YY].[defn_migration_date] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE) | ||
GO | ||
/* remove temp tables */ | ||
DROP TABLE IF EXISTS [IDI_Sandpit].[DL-MAA20XX-YY].[tmp_migration_date] | ||
GO |
Oops, something went wrong.