Currently the contribution section on Readme.md doesn't state any of the limitations imposed by CRAN, namely that the entire package must be <5MB in size.
Agreed. We should probably write guidance for other ways to submit larger datasets. Emphasis on this being for training, not a general data warehouse. Hacktoberfest?
Here are some of my (highly opinionated) thoughts on this. I think we should aim to follow the Tidyverse Style Guide where possible.
datasets should be designed for teaching how to do things in R, so they should be easy to understand and relevant to a general audience
datasets need to be relatively small in size; CRAN has a limit of 5MB for the entire package, so each dataset should be no more than 500KB in size. You can check with object.size()
datasets should not contain any sensitive or disclosive information; they are being released publicly. The data ideally should be from a published source, or synthetic/generated data
datasets that come from other sources must be licensed under a suitable license for resharing, e.g. MIT, GPL, OGL, CC. Attribution to the source data must be included
datasets should be saved as a tibble - you can use as_tibble() to convert
datasets should be named using snake case (as in the Tidyverse Style Guide, e.g. ons_mortality), as should all columns within the dataset
datasets should be documented with roxygen2; this documentation should be a high level overview
datasets should have a vignette that describes the data in more detail than the documentation does, and that contains at least one useful example (ideally several) of how to use the data, demonstrating useful R functions
vignettes should use tidyverse functions and avoid base R and data.table; this is mainly because the introductory training NHS-R offers focuses on the tidyverse
vignettes should not require the use of too many extra packages. Any packages you use must be included in the Suggests section of DESCRIPTION
each dataset should have an example that does more than call glimpse(), which is already covered in the Get Started vignette. For instance, the ons_mortality example shows how to view the data in wide form, with each date as a column.
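To make a few of the points above concrete, here is a minimal sketch of how a contributor might prepare a dataset: convert it to a tibble, check its size against the suggested 500KB guideline, and document it with roxygen2. The dataset name example_admissions, its columns, and its values are hypothetical and made up purely for illustration; they are not part of the package. Note that object.size() reports the in-memory size, which may differ from the size of the compressed file that ends up in the package.

```r
library(tibble)

# Hypothetical, synthetic data: names and values are illustrative only
example_admissions <- as_tibble(data.frame(
  org_code   = c("RXX", "RYY", "RZZ"),
  period     = as.Date(c("2020-01-01", "2020-02-01", "2020-03-01")),
  admissions = c(120L, 98L, 143L)
))

# object.size() returns the in-memory size in bytes
size_bytes <- as.numeric(object.size(example_admissions))
size_limit <- 500 * 1024  # the 500KB per-dataset guideline suggested above

# Fail early if the object is not a tibble or is too large
stopifnot(is_tibble(example_admissions))
stopifnot(size_bytes <= size_limit)

# Human-readable size, handy to quote in a pull request
format(object.size(example_admissions), units = "KB")

# A roxygen2 skeleton for documenting the dataset (assumed to live in a
# file such as R/data.R); the wording is a placeholder

#' Example admissions (hypothetical)
#'
#' Synthetic monthly admission counts, for teaching purposes only.
#'
#' @format A tibble with 3 rows and 3 variables:
#' \describe{
#'   \item{org_code}{character, organisation code}
#'   \item{period}{date, first day of the month}
#'   \item{admissions}{integer, number of admissions}
#' }
#' @source Synthetic data generated for illustration.
"example_admissions"
```

If the size check fails, that is probably a sign the dataset should be trimmed or sampled before it is submitted.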