Fix/41 48 download dataset preview (#50)
* Update notes around using download_dataset as a previewing function

* I don't know why integer checking isn't in base R

* update download to preview and ditch httr for httr2 combined with readr

* Increment version number to 0.3.1.9000

* response to PR comments
cjrace authored Oct 22, 2024
1 parent b51af5a commit c6635a8
Showing 12 changed files with 251 additions and 107 deletions.
6 changes: 4 additions & 2 deletions DESCRIPTION
@@ -1,6 +1,6 @@
 Package: eesyapi
 Title: EES-y API
-Version: 0.3.1
+Version: 0.3.1.9000
 Authors@R: c(
     person("Rich", "Bielby", , "[email protected]", role = c("aut", "cre"),
            comment = c(ORCID = "0000-0001-9070-9969")),
@@ -17,18 +17,20 @@ Imports:
     data.table,
     dplyr,
     httr,
+    httr2,
     jsonlite,
     magrittr,
+    readr,
     rlang,
     stringr
 Suggests:
     knitr,
-    readr,
     rmarkdown,
     testthat (>= 3.0.0)
 VignetteBuilder:
     knitr
 Config/testthat/edition: 3
 Encoding: UTF-8
+Language: en-GB
 Roxygen: list(markdown = TRUE)
 RoxygenNote: 7.3.2
2 changes: 1 addition & 1 deletion NAMESPACE
@@ -4,7 +4,6 @@ export(api_url)
 export(api_url_pages)
 export(api_url_query)
 export(convert_api_filter_type)
-export(download_dataset)
 export(example_data_raw)
 export(example_geography_query)
 export(example_id)
@@ -33,6 +32,7 @@ export(parse_tojson_params)
 export(parse_tojson_time_periods)
 export(parse_tourl_filter_in)
 export(post_dataset)
+export(preview_dataset)
 export(query_dataset)
 export(validate_ees_filter_type)
 export(validate_ees_id)
4 changes: 3 additions & 1 deletion NEWS.md
@@ -1,7 +1,9 @@
+# eesyapi (development version)
+
 # eesyapi 0.3.1

 * Added parsing of SQIDs in retrieved data to provide human readable content
-* Created function, `download_dataset()`, to connect to csv endpoint for downloading data set csv file
+* Created function, `preview_dataset()`, to connect to csv endpoint for downloading data set csv file
 * Added first draft of example workflow for querying a data set

 # eesyapi 0.3.0
66 changes: 42 additions & 24 deletions R/download_dataset.R → R/preview_dataset.R
@@ -1,38 +1,51 @@
-#' Download the raw CSV for an API data set
+#' Preview the raw CSV for an API data set
 #'
-#' This gives a super quick way to just fetch the whole file in a human
-#' readable format.
+#' This gives a super quick way to just fetch the file in a human readable
+#' format.
 #'
 #' @description
 #' This function is mostly designed for exploring the API, and is unlikely to
 #' be suitable for long term production use.
 #'
-#' There is no filtering down of the file so you will always get the whole file
-#' and in some instances this may be very large.
+#' You can set the number of rows to preview using the n_max parameter. This
+#' uses the n_max from `readr::read_csv()` under the hood.
 #'
 #' As there are no IDs involved, this is brittle and code relying on this
 #' function will likely break whenever there is renaming of variables or items
 #' in the data.
 #'
 #' It is recommended to take the time to set up custom queries using the
-#' `query_dataset()` function instead. If you are using this function for more
-#' than exploratory purposes, make sure you subscribe to the data set you're
-#' downloading and then keep track of any updates to the data.
+#' `query_dataset()` function instead.
+#'
+#' If you are using this function for more than exploratory purposes, make
+#' sure you subscribe to the data set you're downloading and then keep track
+#' of any updates to the data.
 #'
 #' @param dataset_id ID of data set
 #' @param dataset_version Version number of data set
 #' @param api_version EES API version
-#' @param verbose Run with additional contextual messaging, logical, default = FALSE
+#' @param n_max maximum number of rows to preview, 10 by default, Inf will get
+#' all available rows
+#' @param verbose Run with additional contextual messaging, logical,
+#' default = FALSE
 #'
 #' @return data.frame
 #' @export
 #'
 #' @examples
-#' download_dataset(example_id("dataset"))
-download_dataset <- function(
+#' # Preview first 10 rows
+#' preview_dataset(example_id("dataset"))
+#'
+#' # Get 2 rows
+#' preview_dataset(example_id("dataset"), n_max = 2)
+#'
+#' # Get all rows
+#' preview_dataset(example_id("dataset"), n_max = Inf)
+preview_dataset <- function(
     dataset_id,
     dataset_version = NULL,
     api_version = NULL,
+    n_max = 10,
     verbose = FALSE) {
   # Validation ----------------------------------------------------------------
   if (!is.null(dataset_version)) {
@@ -48,6 +61,12 @@ download_dataset <- function(
     stop("verbose must be a logical value, either TRUE or FALSE")
   }

+  if (n_max != Inf) {
+    if (!check_integer(n_max)) {
+      stop("n_max must be a positive integer value, e.g. 15, or Inf")
+    }
+  }
+
   eesyapi::validate_ees_id(dataset_id, level = "dataset")

   # Generate query ------------------------------------------------------------
@@ -57,25 +76,24 @@ download_dataset <- function(
     verbose = verbose
   )

+  # Check we can request successfully -----------------------------------------
   toggle_message("Requesting data...", verbose = verbose)

-  response <- httr::GET(query_url)
+  response <- query_url |>
+    httr2::request() |>
+    httr2::req_perform()

   eesyapi::http_request_error(response, verbose = verbose)

-  toggle_message("Parsing response...", verbose = verbose)
-
-  # Parse into data.frame -----------------------------------------------------
-  output <- httr::content(
-    response,
-
-    # All EES CSVs should be UTF-8 and are validated on import
-    encoding = "UTF-8",
+  # Read in the CSV -----------------------------------------------------------
+  toggle_message("Reading response...", verbose = verbose)

-    # httr uses read_csv() underneath, controlling read_csv() verbosity
-    show_col_types = verbose,
-    progress = verbose
-  ) |>
+  output <- query_url |>
+    readr::read_csv(
+      show_col_types = verbose,
+      progress = verbose,
+      n_max = n_max
+    ) |>
     as.data.frame()

   return(output)
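
The `n_max` argument is handed straight to `readr::read_csv()`. A minimal sketch of that behaviour follows, assuming readr 2.0 or later (for `I()` literal input) and using a hypothetical in-memory CSV in place of the API URL. The column names and values are made up for illustration and do not come from any EES data set.

```r
library(readr)

# Hypothetical CSV text standing in for the file served by the CSV endpoint;
# I() tells read_csv() to treat the string as literal data, not a file path
csv_text <- I(
  "time_period,geographic_level,value\n2022,National,1\n2023,National,2\n2024,National,3\n"
)

# Same read_csv() arguments as in the diff, with verbosity switched off
preview <- csv_text |>
  read_csv(show_col_types = FALSE, progress = FALSE, n_max = 2) |>
  as.data.frame()

# n_max = Inf reads every available row
all_rows <- csv_text |>
  read_csv(show_col_types = FALSE, progress = FALSE, n_max = Inf) |>
  as.data.frame()

nrow(preview)  # 2
nrow(all_rows) # 3
```

As written in the diff, `query_url` is passed both to `httr2::req_perform()` and to `readr::read_csv()`, so the endpoint appears to be requested twice: once to check the response and once to read the rows.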
23 changes: 23 additions & 0 deletions R/utils.R
@@ -16,3 +16,26 @@ toggle_message <- function(..., verbose) {
     message(...)
   }
 }
+
+#' Check if a value is an integer
+#'
+#' is.integer checks the object class, not the value, so credit to VitoshKa
+#' on stack overflow for the core of this function...
+#'
+#' https://stackoverflow.com/questions/3476782/check-if-the-number-is-integer
+#'
+#' looks like it's been adopted in installr too, avoiding needing that as a
+#' dependency by putting the code we need here.
+#'
+#' @param x a value to test
+#'
+#' @return logical, false if not an integer, true if an integer
+#' @keywords internal
+check_integer <- function(x) {
+  if (!is.double(x)) {
+    # Return early if wrapped in quotes
+    return(FALSE)
+  } else {
+    !grepl("[^[:digit:]]", format(x, digits = 20, scientific = FALSE))
+  }
+}
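
To see how `check_integer()` treats different inputs, the function body can be copied out of the diff and run on its own. The sketch below does that. The test values are illustrative and are not taken from the package's test suite.

```r
# Copied from the diff above so the example runs without the package installed
check_integer <- function(x) {
  if (!is.double(x)) {
    return(FALSE)
  } else {
    !grepl("[^[:digit:]]", format(x, digits = 20, scientific = FALSE))
  }
}

check_integer(15)   # TRUE: whole-number double
check_integer(2.5)  # FALSE: "." is a non-digit character
check_integer(-3)   # FALSE: "-" is a non-digit character
check_integer("15") # FALSE: character, not double
check_integer(15L)  # FALSE: integer type is not double
check_integer(0)    # TRUE: zero passes the digit check
```

Two consequences follow for the `n_max` validation in `preview_dataset()` as it appears in this commit: `n_max = 15L` is rejected even though it is an integer, and `n_max = 0` is accepted despite the "positive integer" wording of the error message. `Inf` needs its own `n_max != Inf` branch because `format(Inf)` is not made up of digits.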
2 changes: 1 addition & 1 deletion _pkgdown.yml
@@ -12,7 +12,7 @@ reference:
   - get_publications
   - get_data_catalogue
   - get_meta
-  - download_dataset
+  - preview_dataset
   - query_dataset

 - title: Support for generating API URLs and interpreting responses
25 changes: 25 additions & 0 deletions man/check_integer.Rd

Some generated files are not rendered by default.

48 changes: 0 additions & 48 deletions man/download_dataset.Rd

This file was deleted.

62 changes: 62 additions & 0 deletions man/preview_dataset.Rd

Some generated files are not rendered by default.

12 changes: 0 additions & 12 deletions tests/testthat/test-download_dataset.R

This file was deleted.
