Stats NZ has recorded a series of videos as a learning resource for new data lab researchers to build confidence in the IDI.
This repository contains the code that was written as part of the training videos. It is made available so that researchers do not need to transcribe code from the videos and have the option to review the code separately from the videos.
We have made some modifications to the files since the recordings:
- Layout and documentation have been improved
- Some details that might be sensitive have been removed
The layout of this folder matches the four programming languages demonstrated in the training videos. Researchers interested in a specific language should focus on that sub-folder.
These files have all gone through output checking and have been reviewed to confirm they are safe for release. This means that the files contain no results. Because initial data exploration tends to include detailed notes or copies of results, we have not output the exploration files from the lab. We could have made a case to output the exploration code and notes, since the demonstration analysis uses only randomly generated teaching data rather than real data, but we have elected to omit these files to avoid confusion about their contents.
As written, the code requires a specific data lab research project, with randomly generated teaching data loaded, in order to run. As a result, you are unlikely to be able to execute the code without modifying it.
This is as intended. The code available here is a learning resource that can serve as a pattern for developing your own code. It cannot act as a substitute for writing your own code.
All information that leaves the data lab is checked to ensure it is safe and upholds confidentiality standards. Just as the training videos were carefully reviewed to ensure their publication was safe, the content of these code files has been reviewed to ensure they are safe for public release.
More information about integrated data can be found here:
Links to the training videos are available to approved researchers via the Data Lab Commons.
The contact point for the code in this repository is the same as the contact point for integrated data at Stats NZ: [email protected]