Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add API character limit warnings #146

Merged
merged 8 commits into from
Jan 10, 2025

Conversation

cjrace
Copy link
Collaborator

@cjrace cjrace commented Jan 10, 2025

Brief overview of changes

This adds 5 new tests to check for character limits in data sets according to the standards set for the API - https://dfe-analytical-services.github.io/analysts-guide/statistics-production/api-data-standards.html#character-limits-for-col_names-and-filter-items

Why are these changes being made?

To highlight to publishers where their current data may need to change and to allow us to spot any early issues with the limits.

Detailed description of changes

5 new tests added, including automated tests and test data for them, all as advisory warnings for now. At a later point they can be added to the API suite of tests (once that exists) as a failure.

Also added a .gitattributes file to force the repo to always be CRLF, which also helped me to then be able to run git add --renormalize . to undo what I'd done from my other laptop knocking files to just LF. In that I've ignored any binary files so that we don't inadvertently break them.

Additional information for reviewers

Nothing extra to add.

Issue ticket number/s and link

No issue on GitHub, but resolves this Trello card - https://trello.com/c/jenmz3hG/1708-add-in-warnings-for-character-limits-into-screener-thinking-ahead-to-api-data

@cjrace cjrace requested a review from rmbielby January 10, 2025 10:36
@cjrace cjrace marked this pull request as ready for review January 10, 2025 10:36
Copy link
Contributor

@rmbielby rmbielby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The obvious thing that jumps out is why not have a shared function that does the bulk of the work?

lengths_check <- function(entries, entry_type, length_limit){
    lengths_table <- data.frame(
    "entries" = entries
    "length" = unlist(lapply(entries, nchar), use.names = FALSE)
  )

  lengths_too_long <- lengths_table[lengths_table$length > length_limit, "entries"]

  if (length(lengths_too_long) == 0) {
    output <- list(
      "message" = paste("All location codes are ", length_limit," characters or fewer."),
      "result" = "PASS"
    )
  } else {
    if (length(lengths_too_long) == 1) {
      output <- list(
        "message" = paste0("The following ", entry_type," is over ", length_limit," characters, this will need shortening before this data can be published through the API: '", paste(lengths_too_long, collapse = "', '"), "'."),
        "result" = "ADVISORY"
      )
    }
  }
...
}

Then within each test you have a line like lengths_check(location_codes, "location codes", 30)

@cjrace
Copy link
Collaborator Author

cjrace commented Jan 10, 2025

@rmbielby you're absolutely right, though...

Writing it as it was broadly followed how the rest of the project was written and was super fast as GitHub copilot was spotted the patterns and giving me 90% of it as suggestions, so rewriting it like this will probably take me longer than the initial writing did to start with (not particularly long but still something).

That being said, I nearly went to do this just now, though I think it's probably not worth it when we can tackle this pretty soon separately when converting to the package, there's plenty of other code in this repo in the same kind of way, so I'm going to leave it for now and expect we can sweep it up with the other tidy up into a file of internal only functions when moving to a package.

@cjrace cjrace merged commit 38eb8e2 into main Jan 10, 2025
7 checks passed
@cjrace cjrace deleted the new-feature/api-character-limit-warnings branch January 10, 2025 15:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants