-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathclhs.Rmd
275 lines (190 loc) · 8.36 KB
/
clhs.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
---
title: "cLHS - Conditioned Latin Hypercube Sampling"
author: "Dave White"
date: "February 8, 2018"
output:
html_document:
toc: TRUE
toc_float: TRUE
number_sections: TRUE
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, warning = FALSE)
knitr::opts_knit$set(root.dir = "C:/workspace2/clhs/")
```
# Introduction
This document is designed to show you a few ways to implement the conditioned Latin hypercube sampling (cLHS) package to generate sampling points. This is a stratified random sampling method using environmental covariate (raster) data. for more information see:
Minasny, B., and A.B. McBratney. 2006. A conditioned Latin hypercube method for sampling in the presence of ancillary information. Comput. Geosci. 32(9): 1378-1388. doi: 10.1016/j.cageo.2005.12.009.
Roudier, P. 2011. clhs: a R package for conditioned Latin hypercube sampling. R package version 0.5-1. R Found. Vienna Available HttpCRAN R-Proj. Orgpackage Clhs Accessed Febr. 2016.
Roudier, P., A.E. Hewitt, and D.E. Beaudette. 2012. A conditioned Latin hypercube sampling algorithm incorporating operational constraints.
# Setup
Load Required Libraries
```{r libraries}
library(rgdal)
library(raster)
library(clhs)
```
Set working directory and download data
```{r data}
# create a new directory to store the data
dir.create("C:/workspace2/clhs", recursive = TRUE)
# setting the working directory
setwd("C:/workspace2/clhs/")
# download data
download.file(url = "http://github.com/dave-white2/data/raw/master/clhs/clhs_data.zip", destfile = "clhs.zip", method = "auto")
# unzip data
unzip(zipfile = "C:/workspace2/clhs/clhs.zip", overwrite = TRUE)
```
# Load Raster Data
Data needs to be of the same extent and resolution
```{r loadraster}
# load raster data of same extent and resolution
r.claymin <- raster("claymin.tif")
r.mrrtf <- raster("mrrtf.tif")
r.mrvbf <- raster("mrvbf.tif")
r.ndvi <- raster("ndvi.tif")
r.sagawi <- raster("sagawi.tif")
r.cost <- raster("cost.tif")
```
Combine into a raster stack and assign names
```{r rasterstack}
r.stack <- stack(r.claymin, r.mrrtf, r.mrvbf, r.ndvi, r.sagawi)
names(r.stack) <- c('claymin', 'mrrtf', 'mrvbf', 'ndvi', 'sagawi')
# load raster stack into memory for faster sampling
r.stack <- readAll(r.stack)
```
# cLHS - Basic Method
Convert the raster stack to a `SpatialPointsDataFrame`
During this process every pixel becomes a row in the resulting `data.frame`. WARINING: if you have a very large dataset this might not be appropriate as it could take a long time or exceed available memory.
```{r rtop}
s <- rasterToPoints(r.stack, spatial=TRUE)
```
Apply cLHS to entire raster stack - this process takes a bit of time (size: number of sample points/bins) - result is as a `cLHS_result` object. To see a progress bar of how long it takes to run `cLHS`, set `progress = TRUE`.
```{r clhs}
s.clhs <- clhs(s, size = 10, progress = FALSE, iter = 1000, simple = FALSE)
```
Diagnostic plots
You can produce plots illustrating the result of the cLHS sampling procedure. For more info type `?plot.cLHS_result` into your console.
```{r plots}
plot(s.clhs, mode=c("obj", "box"))
```
Extract indices
```{r index}
subset.idx <- s.clhs$index_samples
```
Plot points to inspect and save to shp file
```{r points}
# check visually:
par(mar = c(1,1,1,1))
plot(r.sagawi, axes=FALSE)
points(s[subset.idx, ], bg = 'red', pch=21)
# save cLHS points to shp
writeOGR(s[subset.idx, ], dsn = 'C:/workspace2/clhs', layer='clhs_points', driver = 'ESRI Shapefile', overwrite_layer = TRUE)
```
# cLHS - Large Raster Datasets
Sometimes, you may have a large raster stack, both in number of rasters and in size. When this occurs the above method can be entirely too slow. We can utilize a technique where a regular grid of points is created over the raster stack. Then the values at this regular grid are used to create the cLHS object. This is created by regular sampling of the raster stack.
```{r regularsample}
# result is a SpatialPointsDataFrame
# size is the number of points to generate
s <- sampleRegular(r.stack, size = 500, sp = TRUE)
# plot s to see the grid of points
par(mar = c(1,1,1,1))
plot(r.sagawi, axes=FALSE)
plot(s, add=TRUE)
```
Apply cLHS to SpatialPixelsDataFrame (size: number of sample points/bins) - result is as a cLHS_result object
```{r clhs2}
s.clhs <- clhs(s, size = 10, progress = FALSE, iter = 1000, simple = FALSE)
```
Diagnostic plots
```{r plots2}
plot(s.clhs, mode=c("obj", "box"))
```
Extract indices
```{r index2}
subset.idx <- s.clhs$index_samples
```
Plot points to inspect and save to shp file
```{r points2}
# check visually:
par(mar = c(1,1,1,1))
plot(r.sagawi, axes=FALSE)
points(s[subset.idx, ], bg = 'red', pch=21)
# save cLHS points to shp
writeOGR(s[subset.idx, ], dsn = 'C:/workspace2/clhs', layer = 'clhs_points_rgrd', driver = 'ESRI Shapefile', overwrite_layer = TRUE)
```
## Investigate Alternatives
This requires the latest `clhs` package. Subject to change.
```{r eval=FALSE}
library(cluster)
# keep cLHS samples
s.sub <- s[subset.idx, ]
# make an ID that will always be sorted correctly
s.sub$ID <- sprintf("%06d", 1:nrow(s.sub))
## TODO: this should output a vector of file names
GS(rstack = r.stack, samps = s.sub, buffer = 200, fac = NA)
# locate resulting raster layers, one for each sample location
r <- list.files(path='.', pattern = 'SimilarityIndex*')
# init raster stack
rs <- stack(r)
# check
par(mar = c(1,1,1,1))
plot(rs, axes=FALSE)
# reduce stack to single raster via summation of Gowers distance
g <- stackApply(rs, indices=rep(1, times=nlayers(rs)), fun = sum)
# all samples
plot(g, axes=FALSE)
points(s.sub)
# zoom in and check
plot(rs[[1]], ext=extent(s.sub[1, ]), axes=FALSE)
points(s.sub[1, ])
```
# cLHS - Cost Implementation
cLHS has a built in cost function. Using cost, you can constrain the points to particular locations. For instance, you may not have access to certain private lands, or the terrain is too rugged. You can generate a cost raster where larger pixel values are "more expensive" to visit. Remember that the cost raster needs to be the same size and extent of the rasters in the raster stack.
This cost raster was created in SAGA using the Accumulated Cost (Anisotropic) function with slope as the cost grid, aspect as the direction grid, and a roads grid as the destination points. The accumulated cost raster created from this process is shown below.
```{r costRaster}
# plot the cost raster
par(mar = c(1,1,1,1))
plot(r.cost, axes=FALSE)
```
The cost raster needs to be combined with the raster stack for this to work
```{r stackCost}
r.stack <- stack(r.claymin, r.mrrtf, r.mrvbf, r.ndvi, r.sagawi, r.cost)
names(r.stack) <- c('claymin', 'mrrtf', 'mrvbf', 'ndvi', 'sagawi', 'cost')
r.stack <- readAll(r.stack)
```
Convert the raster stack to a `SpatialPointsDataFrame` by sampling along a regular grid.
```{r rtop3}
s <- sampleRegular(r.stack, size = 10000, sp=TRUE)
```
To implement the cost function in cLHS, there are a few changes to the code.
Apply cLHS to entire raster stack - this process takes a bit of time (size: number of sample points/bins) - result is as a cLHS_result object.
```{r clhscost}
# pending a fix in clhs package (https://github.com/pierreroudier/clhs/issues/3)
# NA cost values must be filtered from the samples
s <- s[which(!is.na(s$cost)), ]
# cost surface is activated via `cost` argument
s.clhs <- clhs(s, size = 10, progress = FALSE, iter = 1000, cost = 'cost', simple = FALSE)
```
Diagnostic plots
```{r plots3}
plot(s.clhs, mode = c("obj", "box"))
```
Extract indices
```{r index3}
subset.idx <- s.clhs$index_samples
```
Plot points to inspect and save to shp file
```{r points3}
# check visually on the sagawi raster
par(mar = c(1,1,1,1))
plot(r.sagawi, axes=FALSE)
contour(r.cost, nlevels=10, col='black', add=TRUE)
points(s[subset.idx, ], bg = 'red', pch=21)
# check visualy on the cost raster
par(mar = c(0,0,0,0))
plot(r.cost, axes=FALSE)
points(s[subset.idx, ], bg = 'red', pch=21)
# save cLHS points to shp
writeOGR(s[subset.idx, ], dsn = 'C:/workspace2/clhs', layer = 'clhs_points_cost', driver = 'ESRI Shapefile', overwrite_layer = TRUE)
```