feat: update processing of AAR file/codes #1780

Merged
merged 3 commits into main from feature/241700/aar-code-updates
Jan 24, 2025


PsypherPunk
Collaborator

@PsypherPunk PsypherPunk commented Jan 17, 2025

Context

The codes used for AAR were updated some time ago: all future AAR files can be expected to adhere to the newer BAE-/BAI-prefixed codes (similarly BTE/BTI for AAR CS data).

AB#241700

Change proposed in this pull request

  • previous BNCH… codes have been replaced with BAI…/BAE… codes
  • update processing of AAR file to accommodate the above, in line with recent changes for input_schemas
  • update processing of AAR/CS file to accommodate the above, in line with recent changes for input_schemas
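
As a rough illustration of the kind of code translation described above, the sketch below renames legacy keys to their current equivalents. The mapping entries (`BNCH_OLD_A`, `BAI_NEW_A`, and so on) and the `normalise_codes` helper are hypothetical placeholders, not codes or functions taken from this repository; the real mappings live in the changed files.

```python
# Hypothetical placeholder mapping: the real BNCH…/BAI…/BAE… codes are not
# reproduced here and these names are invented purely for illustration.
LEGACY_TO_CURRENT = {
    "BNCH_OLD_A": "BAI_NEW_A",  # assumed income-style code
    "BNCH_OLD_B": "BAE_NEW_B",  # assumed expenditure-style code
}


def normalise_codes(row: dict) -> dict:
    """Return a copy of `row` with legacy keys renamed to current ones.

    Keys that are not in the mapping (identifiers, already-current codes)
    are passed through unchanged.
    """
    return {LEGACY_TO_CURRENT.get(code, code): value for code, value in row.items()}
```

Passing unmapped keys through unchanged means a file that already uses the newer codes would be left as-is under this approach.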

Guidance to review

Have verified that, for all years 2021 to 2023, the files produced by this updated code are identical to those in t01:

❯ md5sum --check ../../t01/2023/checksum.txt
aar.parquet: OK
academies.parquet: OK
all_schools.parquet: OK
bfr_metrics.parquet: OK
bfr.parquet: OK
cdc.parquet: OK
census.parquet: OK
central_services.parquet: OK
cfo.parquet: OK
ks2.parquet: OK
ks4.parquet: OK
maintained_schools.parquet: OK
schools.parquet: OK
sen.parquet: OK
trusts.parquet: OK
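
For reviewers without `md5sum` to hand, the same comparison could be approximated in Python. This is a minimal sketch, assuming the manifest uses the usual `md5sum` layout of one `<hash>  <filename>` pair per line; the `md5_of` and `check_manifest` names are illustrative and not part of this repository.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_manifest(manifest: Path, data_dir: Path) -> dict:
    """Map each file named in the manifest to True (match) or False (mismatch)."""
    results = {}
    for line in manifest.read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # md5sum prefixes binary-mode names with '*'
        results[name] = md5_of(data_dir / name) == expected
    return results
```

Calling `check_manifest(Path("../../t01/2023/checksum.txt"), Path("."))` would mirror the command above, with `True` corresponding to `OK`.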

I've also run the data for 2024 locally, with 2 documented discrepancies:

  • all Trust names are now upper-case (purely a cosmetic issue to be noted)
  • the CS data is missing some Trusts referenced in the AAR data (again, to be noted but the data are incomplete until later this month)
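
Both discrepancies are straightforward to check for mechanically. The sketch below assumes Trust identifiers and names have already been loaded into plain Python values; the function names and sample values are hypothetical and only illustrate the two checks.

```python
def trusts_missing_from_cs(aar_trusts: set, cs_trusts: set) -> set:
    """Return Trust identifiers present in the AAR data but absent from CS."""
    return aar_trusts - cs_trusts


def differs_only_by_case(old_name: str, new_name: str) -> bool:
    """True when two names are unequal but match once case is ignored."""
    return old_name != new_name and old_name.casefold() == new_name.casefold()


# Invented sample values, for illustration only.
missing = trusts_missing_from_cs({"T1", "T2", "T3"}, {"T1", "T3"})
cosmetic = differs_only_by_case("Example Trust", "EXAMPLE TRUST")
```

A name pair flagged by `differs_only_by_case` is the cosmetic upper-casing noted above, while anything returned by `trusts_missing_from_cs` points at the incomplete CS data.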

Checklist (add/remove as appropriate)

  • Work items have been linked (use AB#)
  • Your code builds clean without any errors or warnings
  • You have run all unit/integration tests and they pass
  • Your branch has been rebased onto main
  • You have tested by running locally
  • You have reviewed with UX/Design

@PsypherPunk PsypherPunk force-pushed the feature/241700/aar-code-updates branch 3 times, most recently from e5d8156 to fbfbc07 Compare January 20, 2025 16:01
- previous `BNCH…` codes have been updated with `BAI…` codes
- update processing of AAR file to accommodate the above, in line with
  recent changes for `input_schemas`
@PsypherPunk PsypherPunk force-pushed the feature/241700/aar-code-updates branch from fbfbc07 to 4d85ed4 Compare January 21, 2025 14:30
jrabbott
jrabbott previously approved these changes Jan 22, 2025
Collaborator

@jrabbott jrabbott left a comment

LGTM - would be good to get @benlav-50's eyes over the mappings.

Might also be worth adding some documentation on the schema changes?

- previous `BNCH…` codes have been updated with `BAI…` codes
- update processing of AAR/CS file to accommodate the above, in line with
  recent changes for `input_schemas`
@PsypherPunk PsypherPunk force-pushed the feature/241700/aar-code-updates branch from 24e131d to c057334 Compare January 22, 2025 15:51
@PsypherPunk PsypherPunk marked this pull request as ready for review January 22, 2025 15:59
@PsypherPunk PsypherPunk merged commit f67ea35 into main Jan 24, 2025
13 checks passed
@PsypherPunk PsypherPunk deleted the feature/241700/aar-code-updates branch January 24, 2025 13:48
3 participants