-
Notifications
You must be signed in to change notification settings - Fork 4
/
Copy pathREADME.qmd
197 lines (157 loc) · 7.36 KB
/
README.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
---
# format: html
format: gfm
bibliography: vignettes/references.bib
execute:
cache: false
fig.path: "man/figures/README-"
message: false
---
```{r, include=FALSE}
knitr::opts_chunk$set(cache = TRUE,
fig.path = "man/figures/README-",
message = FALSE)
```
<!-- badges: start -->
[![R-CMD-check](https://github.com/robinlovelace/simodels/workflows/R-CMD-check/badge.svg)](https://github.com/robinlovelace/simodels/actions)
<!-- badges: end -->
The goal of {simodels} is to provide a simple, [tidy](https://www.tidyverse.org/), and flexible framework for developing spatial interaction models (SIMs).
SIMs estimate the amount of movement between spatial entities and can be used for many things, including to support evidence-based investment in sustainable transport infrastructure and prioritisation of location options for public services.
Unlike many software tools designed to support spatial interaction modelling, {simodels} does not define (or even encourage use of) any particular functional forms or modelling frameworks for predicting movement between origins and destinations.
Instead, it provides a framework enabling you to use model function forms or models of your choosing, ensuring flexibility and encouraging flexibility.
## Installation
Install the package as follows:
```r
install.packages("remotes") # if not already installed
```
```r
remotes::install_github("robinlovelace/simodels")
```
To get the develoment version do:
```{r}
devtools::load_all()
```
<!-- # Implementations in other languages -->
## simodel basics
Run a basic SIM as follows:
```{r distance}
library(simodels)
library(dplyr)
# prepare OD data
od = si_to_od(
origins = si_zones, # origin locations
destinations = si_zones, # destination locations
max_dist = 5000 # maximum distance between OD pairs
)
# specify a function
gravity_model = function(beta, d, m, n) {
m * n * exp(-beta * d / 1000)
}
# perform SIM
od_res = si_calculate(
od,
fun = gravity_model,
d = distance_euclidean,
m = origin_all,
n = destination_all,
constraint_production = origin_all,
beta = 0.9
)
# visualize the results
plot(od_res$distance_euclidean, od_res$interaction)
```
What just happened?
We created an 'OD data frame' with the function `si_to_od()` from geographic origins and destinations, and then estimated a simple 'production constrained' (with the `constraint_production` argument) gravity model based on the population in origin and destination zones and a custom distance decay function with `si_calculate()`.
As the example above shows, the package allows/encourages you to define and use your own functions to estimate the amount of interaction/movement between places.
The approach is also 'tidy', allowing use of {simodels} functions in {dplyr} pipelines:
```{r}
od_res = od %>%
si_calculate(fun = gravity_model,
m = origin_all,
n = destination_all,
d = distance_euclidean,
constraint_production = origin_all,
beta = 0.3)
od_res %>%
select(interaction)
```
The resulting estimates of interaction, returned in the column `interaction` and plotted with distance in the graphic above, resulted from our choice of spatial interaction model inputs, allowing a wide range of alternative approaches to be implemented.
This flexibility is a key aspect of the package, enabling small and easily modified functions to be implemented and tested.
The output of `si_calculate()` is a geographic object that can be plotted with `sf`'s plot method (or other geographic data visualisation packages):
```{r map}
plot(od_res["interaction"], logz = TRUE)
```
The `si_to_od()` function transforms geographic entities (typically polygons and points) into a data frame representing the full combination of origin-destination pairs that are less than `max_dist` meters apart.
A common saying in data science is that 80% of the effort goes into the pre-processing stage.
This is equally true for spatial interaction modelling as it is for other types of data intensive analysis/modelling work.
So what does this function return?
The function allows you to use any variable in the origin or destination data by joining all attributes onto the OD data frame, with column names appended with `origin` and `destination`.
The approach works equally well for 'bipartite' SIMs, in which origin and destination points are different (Hasova et al. [2022](https://lenkahas.com/files/preprint.pdf)).
The following example implements a bipartite SIM that estimates the number of trips from administrative zones to pubs in Leeds:
```{r pubs}
# Set n. trips to pubs, assuming that for every trip to the pub there are
# 50 trips to work (this would be validated/tested/modelled in empirical work)
zones_pubs = si_zones %>%
mutate(to_pubs = all / 50)
pubs_example = si_pubs %>%
filter(grepl(pattern = "Chemic|Nag", x = name))
pubs_example$size = c(100, 80)
od_to_pubs = si_to_od(zones_pubs, pubs_example)
od_to_pubs_result = od_to_pubs %>%
si_calculate(fun = gravity_model,
m = origin_to_pubs,
n = destination_size,
d = distance_euclidean,
beta = 0.5,
constraint_production = origin_to_pubs)
od_to_pubs_result %>%
select(O, D, destination_name, interaction)
```
We can plot the top 20 desire lines between zone centroids and the 2 pubs in the example dataset as follows:
```{r pubmap, eval=FALSE}
library(tmap)
tmap_mode("view")
od_to_pubs_result %>%
top_n(n = 50, wt = interaction) %>%
tm_shape() +
tm_lines(col = "interaction", palette = "viridis", scale = 2)
```
![](https://user-images.githubusercontent.com/1825120/168485930-1adf9970-7dc1-4417-891e-2d5a1ec5e170.png)
## Feedback
We would be interested to hear how the approach presented in this package compared with other implementations such as those presented in the links below.
If anyone would like to try the approach or implement it in another language feel free to get in touch via the issue tracker.
## Further reading
For details on what SIMs are and how they have been defined mathematically and in code from first principles, see the [`sims` vignette](https://robinlovelace.github.io/simodels/articles/sims-first-principles.html).
To dive straight into using {simodels} to develop SIMs, see the [Get started vignette](https://robinlovelace.github.io/simodels/articles/simodels.html).
For a detailed introduction to SIMs, support by reproducible R code, see Adam Dennett's [2018 paper](https://doi.org/10.37970/aps.v2i2.38).
## Other SIM packages
- The [`spflow` R package](https://github.com/LukeCe/spflow)
- The [`spint` Python package](https://spint.readthedocs.io/en/latest/)
- The [`gravity`](https://cran.r-project.org/package=gravity) R package
- The [`mobility`](https://covid-19-mobility-data-network.github.io/mobility/index.html) R package
- The [gravity functions in the scikit-mobility](https://scikit-mobility.github.io/scikit-mobility/reference/models.html#module-skmob.models.gravity) Python package
```{r}
#| echo: false
#| eval: false
# Benmarks
bench::mark(
{od_res = si_calculate(
od,
fun = gravity_model,
d = distance_euclidean,
m = origin_all,
n = destination_all,
constraint_production = origin_all,
beta = 0.9
)}
)
```
```{r}
#| echo: false
#| eval: false
# Add actions:
usethis::use_github_action("check-standard")
# Add pkgdown action:
usethis::use_github_action("pkgdown")
usethis::use_data_raw("si_oa_wpz")
```