Add API character limit warnings #146

cjrace · 2025-01-10T09:00:30Z

Brief overview of changes

This adds 5 new tests to check for character limits in data sets according to the standards set for the API - https://dfe-analytical-services.github.io/analysts-guide/statistics-production/api-data-standards.html#character-limits-for-col_names-and-filter-items

Why are these changes being made?

To highlight to publishers where their current data may need to change and to allow us to spot any early issues with the limits.

Detailed description of changes

5 new tests added, including automated tests and test data for them, all as advisory warnings for now. At a later point they can be added to the API suite of tests (once that exists) as a failure.

Also added a .gitattributes file to force the repo to always use CRLF line endings. That also let me run git add --renormalize . to undo the damage from my other laptop, which had converted files to LF only. The .gitattributes file ignores binary files so that we don't inadvertently break them.

Additional information for reviewers

Nothing extra to add.

Issue ticket number/s and link

No issue on GitHub, but resolves this Trello card - https://trello.com/c/jenmz3hG/1708-add-in-warnings-for-character-limits-into-screener-thinking-ahead-to-api-data

rmbielby

The obvious thing that jumps out is why not have a shared function that does the bulk of the work?

lengths_check <- function(entries, entry_type, length_limit) {
  # Table of each entry alongside its character count
  lengths_table <- data.frame(
    "entries" = entries,
    "length" = unlist(lapply(entries, nchar), use.names = FALSE)
  )

  # Entries that exceed the limit
  lengths_too_long <- lengths_table[lengths_table$length > length_limit, "entries"]

  if (length(lengths_too_long) == 0) {
    output <- list(
      "message" = paste0("All ", entry_type, " are ", length_limit, " characters or fewer."),
      "result" = "PASS"
    )
  } else {
    if (length(lengths_too_long) == 1) {
      output <- list(
        "message" = paste0("The following ", entry_type, " is over ", length_limit, " characters, this will need shortening before this data can be published through the API: '", paste(lengths_too_long, collapse = "', '"), "'."),
        "result" = "ADVISORY"
      )
    }
  }
...
}

Then within each test you have a line like lengths_check(location_codes, "location codes", 30)
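
For reference, here is a minimal sketch of how the elided part of that function might be filled in so it runs end to end. Everything past the single-entry branch is an assumption rather than the reviewer's actual code: the "is"/"are" switch for multiple entries, the NA handling, and returning the list directly are illustrative choices only. The function name, arguments, message wording, and the "PASS" and "ADVISORY" labels are taken from the snippet above.

```r
# Sketch only: a completed version of the shared helper suggested above.
# The multi-entry wording and NA handling are assumptions, not project code.
lengths_check <- function(entries, entry_type, length_limit) {
  entries <- as.character(entries)

  # nchar() is vectorised, so it can be applied to the whole vector at once.
  # Missing values are skipped rather than reported as too long.
  too_long <- entries[!is.na(entries) & nchar(entries) > length_limit]

  if (length(too_long) == 0) {
    return(list(
      message = paste0("All ", entry_type, " are ", length_limit, " characters or fewer."),
      result = "PASS"
    ))
  }

  # Assumed: pick the verb from the number of offending entries
  verb <- if (length(too_long) == 1) "is" else "are"

  list(
    message = paste0(
      "The following ", entry_type, " ", verb, " over ", length_limit,
      " characters, this will need shortening before this data can be",
      " published through the API: '", paste(too_long, collapse = "', '"), "'."
    ),
    result = "ADVISORY"
  )
}

# Made-up example data: one ordinary code and one 31-character code
location_codes <- c("E92000001", strrep("X", 31))
check <- lengths_check(location_codes, "location codes", 30)
print(check$result)
print(check$message)
```

Because entry_type is pasted in as given, a plural label such as "location codes" reads slightly oddly when only one entry is too long. A separate singular label argument would fix that, at the cost of a longer call in each test.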

cjrace · 2025-01-10T15:00:05Z

@rmbielby you're absolutely right, though...

Writing it as it was broadly followed how the rest of the project is written, and it was super fast because GitHub Copilot was spotting the patterns and giving me 90% of it as suggestions, so rewriting it like this will probably take me longer than the initial writing did (not particularly long, but still something).

That being said, I nearly went to do this just now, but I think it's probably not worth it when we can tackle it separately pretty soon when converting to the package. There's plenty of other code in this repo written in the same kind of way, so I'm going to leave it for now and expect we can sweep it up with the rest of the tidy-up into a file of internal-only functions when moving to a package.

cjrace added 8 commits January 9, 2025 18:44

initial working

b1fca0b

finish off adding api character warnings

bbc574d

recomment debugging file and touch others to try to reset line endings

3df9d7e

touch files to reset line endings

4cc2eb3

touch the main tests file again after restart

3b094b4

add a git attributes file for the repo

34610c1

update gitattributes to ignore the www folder

20bb16e

renormalize files I accidentally lf-ed

81718d6

cjrace requested a review from rmbielby January 10, 2025 10:36

cjrace marked this pull request as ready for review January 10, 2025 10:36

rmbielby reviewed Jan 10, 2025

rmbielby approved these changes Jan 10, 2025

cjrace merged commit 38eb8e2 into main Jan 10, 2025
7 checks passed

cjrace deleted the new-feature/api-character-limit-warnings branch January 10, 2025 15:44
