Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhancement/geography handling #64

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

rmbielby
Copy link
Contributor

@rmbielby rmbielby commented Jan 14, 2025

Brief overview of changes

A lot of the feedback in user testing was around having more intuitive handling of geographies in the query handler, in particular having geographies, geographic_levels and locations parameters on query_dataset was confusing as people didn't know which one(s) to use.

Why are these changes being made?

To try and make things simpler for users to understand how to make geography-based queries.

Detailed description of changes

I've adapted the code such that query_dataset() only takes geographies instead of all three of geographies, geographic_levels and locations. The geographies parameter can itself incorporate both geographic_levels and locations and I basically want them in a single entity so we can handle a broader range of geography queries.

To cover handling this behind the scenes:

  • I've added a function (todf_geographies()) to convert anything relatively sensible passed to geographies (string, vector, data frame, list containing some combination of geographic_level and location) into a standardised data frame.
  • if query_dataset() is run with method="POST", then post_dataset() should happily work with the standardised form
  • if query_dataset() is run with method="GET", it will convert that standardised data frame into something that get_dataset() can understand.

It can seem a bit circular when taking get_dataset() in isolation with simple queries like "return all rows at regional geographic level", but having the single geographies parameter is really helpful for creating combined geographical criteria, i.e. find me all LAs in a given region and all national level data.

Additional information for reviewers

I've also made a start on switching a sub-set of functions to being internal, in particular post_dataset() and get_dataset() and related helper functions.

I've not added a geography vignette as yet and that may be a second PR. At the moment this is all just trying to make the front end input appear simpler for the user.

Issue ticket number/s and link

This removes locations and geographic_levels as parameters from query_dataset() so closes #56

@rmbielby rmbielby added enhancement Improvement to existing feature geographies labels Jan 14, 2025
@rmbielby rmbielby self-assigned this Jan 14, 2025
@rmbielby rmbielby requested a review from cjrace January 14, 2025 16:13
Copy link
Contributor

@cjrace cjrace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Raised #65 during testing this, not sure if related or not, though feel free to fix on this PR if you think it is.

Left a couple of small comments though does seem much nicer to use! Pretty neat (and ambitious) to accept all kinds of formats for the geography queries.

Should we have more tests around the new todf_geographies() function? (even if just through examples of query_dataset() accepting different geography approaches). Wary it could be easy to accidentally change expected behaviour if someone else (particularly me) edits it in the future.

The geography vignette in #43 will still be very important on top of this though for users who are less familiar than we are with our data structures

DESCRIPTION Outdated Show resolved Hide resolved
NAMESPACE Outdated Show resolved Hide resolved
R/api_url.R Show resolved Hide resolved
R/query_dataset.R Outdated Show resolved Hide resolved
_pkgdown.yml Outdated Show resolved Hide resolved
R/query_dataset.R Show resolved Hide resolved
R/query_dataset_utils.R Outdated Show resolved Hide resolved
Copy link
Contributor

@cjrace cjrace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All looking good and happy for you to merge.

Added a comment on #5 with some more examples, semi-related to this as it's around how we clean the time and geography data

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Improvement to existing feature geographies
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Examples or validation needed for locations argument in data set query
2 participants