This repository was archived by the owner on Jul 18, 2023. It is now read-only.
---
title: "README"
author: "Ellen Esch"
date: "`r format(Sys.time(), '%d %B %Y')`"
output:
  github_document:
    toc: yes
always_allow_html: yes
urlcolor: blue
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, message = F, warning = F)
library(readxl)
library(tidyverse)
library(sf)
source("./data-raw/user_tracts_fxn.R")
```
## Overview
The Economic Values Atlas project arises from work done by Oregon Metro and the Brookings Institution.
## MetCouncil's contributions
We have tried to build an R Shiny application that is highly portable, well documented, and requires minimal coding, lowering the barrier for other regions that would like to implement this type of analysis.
At the most basic level, users may upload an Excel document containing two sheets. Users can also leverage R scripts to aggregate disparate data sources.
To understand what you will be creating, please view [this example for the Twin Cities region](https://metrotransitmn.shinyapps.io/eva_app_v1/) or [this example for the Portland region](https://metrotransitmn.shinyapps.io/eva_app_pdx/).
### Set user parameters
First, set the following parameters, which indicate the state(s) and county(ies) that the tract data come from. The code currently handles up to two states, although this can easily be expanded if needed. Also set the metro name and the format of the data inputs. This should be the only section of code that needs editing.
```{r user_state, echo=TRUE}
state_1 <- "OR"
county_1 <- c("Clackamas", "Columbia", "Multnomah", "Washington", "Yamhill")
state_2 <- "WA"
county_2 <- c("Clark", "Skamania")
metro <- "pdx"
dataformat <- "excel"
#####
# state_1 <- "MN"
# county_1 <- c("Anoka", "Carver", "Dakota", "Hennepin", "Ramsey", "Scott", "Washington")
#
# state_2 <- NA
# county_2 <- NA
#
# metro <- "msp"
# dataformat <- "rscript"
```
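Before moving on, it can help to sanity-check these parameters. The following is a minimal sketch, not part of the repository: `check_user_params()` is a hypothetical helper, and the allowed `dataformat` values are assumed to be only the two that appear in this README (`"excel"` and `"rscript"`).

```r
# Hypothetical helper (not part of this repository): checks the user parameters.
# Assumption: dataformat is either "excel" or "rscript", as used in this README.
check_user_params <- function(state_1, county_1, state_2 = NA, county_2 = NA,
                              metro, dataformat) {
  stopifnot(
    is.character(state_1), length(state_1) == 1,
    is.character(county_1), length(county_1) >= 1,
    is.character(metro), length(metro) == 1,
    is.character(dataformat), length(dataformat) == 1,
    dataformat %in% c("excel", "rscript")
  )

  # The second state is optional: either both values are supplied or both are NA.
  has_state_2 <- !all(is.na(state_2))
  has_county_2 <- !all(is.na(county_2))
  if (has_state_2 != has_county_2) {
    stop("state_2 and county_2 must both be set or both be NA")
  }

  invisible(TRUE)
}

# Portland settings from the chunk above
check_user_params(
  state_1 = "OR",
  county_1 = c("Clackamas", "Columbia", "Multnomah", "Washington", "Yamhill"),
  state_2 = "WA",
  county_2 = c("Clark", "Skamania"),
  metro = "pdx",
  dataformat = "excel"
)
```

The call returns invisibly when everything is consistent and stops with an error otherwise.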
### Read and process raw data
If the data is in Excel format, please ensure it has the following structure:

- `Sheet 1`: this sheet should be a "variable key" and contain the following columns:
    - `variable`: a short code corresponding to the tract-level data used in the EVA
    - `name`: a descriptive name corresponding to the tract-level data
    - `type`: indicates whether the variable corresponds to `people`, `place`, or `business`
    - `interpret_high_value`: use `high_opportunity` if a high value of the variable should correspond to a positive economic value. Use `low_opportunity` if a high value of the variable is not a desirable economic value.
- `Sheet 2`: this sheet should contain the raw data
    - `tract_string`: should be the tract identifiers
    - all other columns should be named according to the `variable`s in sheet 1.

The Excel file should be placed in the `data-raw` folder. It should be named after the `metro` parameter set above, following the convention `metro.xlsx` (for example, `msp.xlsx` when `metro <- "msp"`). (Notice that Portland Metro's data had a different format/structure, so I processed that data differently; the suggested format here should hopefully save you from Excel headaches!)
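The two-sheet layout described above can be checked once both sheets are read into data frames. This is a minimal sketch under stated assumptions: `check_eva_sheets()` is a hypothetical helper that is not part of the repository, and the variable codes, tract identifiers, and values in the toy data are made up for illustration.

```r
# Hypothetical helper (not part of this repository): validates the variable key
# (sheet 1) against the raw data (sheet 2) once both are data frames.
check_eva_sheets <- function(eva_vars, eva_wide) {
  required_key_cols <- c("variable", "name", "type", "interpret_high_value")
  missing_key_cols <- setdiff(required_key_cols, names(eva_vars))
  if (length(missing_key_cols) > 0) {
    stop("Sheet 1 is missing columns: ", paste(missing_key_cols, collapse = ", "))
  }

  bad_type <- setdiff(unique(eva_vars$type), c("people", "place", "business"))
  if (length(bad_type) > 0) {
    stop("Unexpected `type` values: ", paste(bad_type, collapse = ", "))
  }

  bad_interpret <- setdiff(unique(eva_vars$interpret_high_value),
                           c("high_opportunity", "low_opportunity"))
  if (length(bad_interpret) > 0) {
    stop("Unexpected `interpret_high_value` values: ",
         paste(bad_interpret, collapse = ", "))
  }

  if (!"tract_string" %in% names(eva_wide)) {
    stop("Sheet 2 is missing the `tract_string` column")
  }

  # Every data column needs a key entry, and every key entry needs a data column.
  data_cols <- setdiff(names(eva_wide), "tract_string")
  unmatched <- union(setdiff(data_cols, eva_vars$variable),
                     setdiff(eva_vars$variable, data_cols))
  if (length(unmatched) > 0) {
    stop("Variables not present in both sheets: ", paste(unmatched, collapse = ", "))
  }

  invisible(TRUE)
}

# Toy stand-ins for the two sheets (all names and values are invented)
eva_vars <- data.frame(
  variable = c("p_jobs", "p_vacancy"),
  name = c("Jobs per acre", "Commercial vacancy rate"),
  type = c("business", "place"),
  interpret_high_value = c("high_opportunity", "low_opportunity"),
  stringsAsFactors = FALSE
)
eva_wide <- data.frame(
  tract_string = c("41051000100", "41051000200"),
  p_jobs = c(120, 340),
  p_vacancy = c(0.08, 0.03),
  stringsAsFactors = FALSE
)

check_eva_sheets(eva_vars, eva_wide)
```

Running a check like this before the processing step surfaces misspelled column names early, rather than as a confusing join or `case_when()` result later.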
If an R script is being used to aggregate the data, you may find it useful to follow the example for the Twin Cities outlined in the `input_tract_data.R` script.
```{r process, include = F}
# The following code processes the data. No edits are needed for end users.

### Download tract geometries using tigris
eva_tract_geometry <- user_tracts_fxn(state_1, county_1,
                                      state_2, county_2)
usethis::use_data(eva_tract_geometry, overwrite = TRUE)

# Centroid of the whole region
map_centroid <- eva_tract_geometry %>%
  st_union() %>%
  st_centroid()
usethis::use_data(map_centroid, overwrite = TRUE)

### Process tract data
if (metro == "pdx") {
  source("./data-raw/user_process_pdx.R")
} else if (dataformat == "excel") {
  # Sheet 1: the variable key
  eva_vars <- read_xlsx(paste0("./data-raw/", metro, ".xlsx"),
                        sheet = 1)

  # Sheet 2: the raw tract-level data
  eva_data_main <- read_xlsx(paste0("./data-raw/", metro, ".xlsx"),
                             sheet = 2,
                             na = c("NA")) %>%
    # ensure this reads as a character so that we can join with tract geometries
    mutate(tract_string = as.character(tract_string)) %>%
    # reshape to long format: one row per tract and variable
    gather("variable", "raw_value", -tract_string) %>%
    group_by(variable) %>%
    mutate(MEAN = mean(raw_value, na.rm = T),
           SD = sd(raw_value, na.rm = T),
           MIN = min(raw_value, na.rm = T),
           MAX = max(raw_value, na.rm = T),
           COUNT = as.numeric(sum(!is.na(raw_value))),
           z_score = (raw_value - MEAN) / SD) %>%
    right_join(eva_vars, by = "variable") %>%
    # nominal weights: min-max rescaling to 0-10
    mutate(weights_nominal = case_when(
      interpret_high_value == "high_opportunity" ~ (raw_value - MIN) / (MAX - MIN) * 10,
      interpret_high_value == "low_opportunity" ~ 10 - (raw_value - MIN) / (MAX - MIN) * 10,
      TRUE ~ NA_real_)) %>%
    # standard-score weights: normal CDF of the z-score, scaled to 0-10
    mutate(weights_scaled = case_when(
      interpret_high_value == "high_opportunity" ~ pnorm(z_score) * 10,
      interpret_high_value == "low_opportunity" ~ (10 - pnorm(z_score) * 10),
      TRUE ~ NA_real_)) %>%
    # rank weights
    mutate(weights_rank = case_when(
      interpret_high_value == "high_opportunity" ~ min_rank((weights_nominal)) / COUNT * 10,
      interpret_high_value == "low_opportunity" ~ min_rank(desc(weights_nominal)) / COUNT * 10,
      TRUE ~ NA_real_)) %>%
    # overall rank
    mutate(overall_rank = case_when(
      interpret_high_value == "high_opportunity" ~ min_rank(as.numeric(weights_nominal)),
      interpret_high_value == "low_opportunity" ~ min_rank(desc(as.numeric(weights_nominal))))) %>%
    # clean dataframe
    select(-MEAN, -SD, -MIN, -MAX)
} else if (dataformat == "rscript") {
  source("./data-raw/input_tract_data.R")
}

usethis::use_data(eva_data_main, overwrite = TRUE)
```
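The weighting formulas used in the processing chunk can be illustrated on a toy vector. This is a minimal sketch in base R rather than the dplyr pipeline itself: `rank(..., ties.method = "min")` stands in for `dplyr::min_rank()`, and the five raw values are made up for illustration. It shows all three weights for a `high_opportunity` variable, plus the inverted nominal and standard-score weights for a `low_opportunity` variable.

```r
# Toy raw values for one variable across five tracts (invented numbers)
raw_value <- c(2, 4, 6, 8, 10)

MEAN <- mean(raw_value, na.rm = TRUE)
SD <- sd(raw_value, na.rm = TRUE)
MIN <- min(raw_value, na.rm = TRUE)
MAX <- max(raw_value, na.rm = TRUE)
COUNT <- sum(!is.na(raw_value))
z_score <- (raw_value - MEAN) / SD

# Nominal weights: min-max rescaling onto a 0-10 scale
weights_nominal <- (raw_value - MIN) / (MAX - MIN) * 10
weights_nominal
# → 0.0 2.5 5.0 7.5 10.0

# Standard-score weights: normal CDF of the z-score, scaled to 0-10.
# The tract sitting exactly at the mean receives a weight of 5.
weights_scaled <- pnorm(z_score) * 10

# Rank weights: minimum rank divided by the number of non-missing tracts
weights_rank <- rank(weights_nominal, ties.method = "min") / COUNT * 10
weights_rank
# → 2 4 6 8 10

# For a low_opportunity variable, the nominal and standard-score weights flip
weights_nominal_low <- 10 - weights_nominal
weights_scaled_low <- 10 - pnorm(z_score) * 10
weights_nominal_low
# → 10.0 7.5 5.0 2.5 0.0
```

Note that the nominal weight depends only on the minimum and maximum, so a single outlier compresses every other tract, while the rank weight ignores the spacing between values entirely.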
### Edit and add any region-specific language
There will likely be region-specific information that should be displayed alongside the data within the interactive application. A general shell is created here, but users should edit the `./R/mod_home.R` script to supply any introductory information that should be displayed. The `./R/app_ui.R` script can be edited as well.
Text and pictures may also be updated, and the interface can be styled with CSS. I'm not sure of the best way to make that portable. My inclination is to create a "Brookings" style for the generic app.
### Launch the app
The following code will launch your region's app (!!). Please run the two code chunks before this section first so that the app works properly. To deploy it on an R server, click the blue button at the top right of the app window that launches locally.
```{r deployapp, echo=T, include=F}
pkgload::load_all(export_all = FALSE, helpers = FALSE, attach_testthat = FALSE)
options("golem.app.prod" = TRUE)
eva.app::run_app() # add parameters here (if any)
```
## Future plans
What *are* our future plans?!?!
- How can we update the app (particularly the data inputs) to suit our region?
- Are there national sources of supplemental data which might be nice to include (transit, for instance)? If not, would some code for this still be helpful?
- Is z-score the best variable here? What are the other tabs in the Portland data?