---
title: "Example Analysis of Ames Housing Data"
author: "Axel R"
date: "2024-01-20"
output: html_document
---
# Details
These objects are the results of an analysis of the Ames housing data. A K-nearest neighbors model was used with a small predictor set that included natural spline transformations of the Longitude and Latitude predictors. The code used to generate these examples was:
```{r}
library(tidymodels)
library(tune)
library(AmesHousing)
library(kknn)

# ------------------------------------------------------------------------------
# Data: training split and cross-validation resamples

ames <- make_ames()

set.seed(4595)
data_split <- initial_split(ames, strata = "Sale_Price")
ames_train <- training(data_split)

set.seed(2453)
rs_splits <- vfold_cv(ames_train, strata = "Sale_Price")

# ------------------------------------------------------------------------------
# Recipe, model, and workflow

ames_rec <-
  recipe(Sale_Price ~ ., data = ames_train) %>%
  step_log(Sale_Price, base = 10) %>%
  step_YeoJohnson(Lot_Area, Gr_Liv_Area) %>%
  step_other(Neighborhood, threshold = .1) %>%
  step_dummy(all_nominal()) %>%
  step_zv(all_predictors()) %>%
  # Spline degrees of freedom are tuned separately for each coordinate
  step_ns(Longitude, deg_free = tune("lon")) %>%
  step_ns(Latitude, deg_free = tune("lat"))

knn_model <-
  nearest_neighbor(
    mode = "regression",
    neighbors = tune("K"),
    weight_func = tune(),
    dist_power = tune()
  ) %>%
  set_engine("kknn")

ames_wflow <-
  workflow() %>%
  add_recipe(ames_rec) %>%
  add_model(knn_model)

# Restrict the number of neighbors (tagged "K") to the range 1 to 50
ames_set <-
  extract_parameter_set_dials(ames_wflow) %>%
  update(K = neighbors(c(1, 50)))

# Initial grid search over a 10-point space-filling design
set.seed(7014)
ames_grid <-
  ames_set %>%
  grid_max_entropy(size = 10)

ames_grid_search <-
  tune_grid(
    ames_wflow,
    resamples = rs_splits,
    grid = ames_grid
  )

# Bayesian optimization, initialized with the grid search results
set.seed(2082)
ames_iter_search <-
  tune_bayes(
    ames_wflow,
    resamples = rs_splits,
    param_info = ames_set,
    initial = ames_grid_search,
    iter = 15
  )
```
Important note: Because the rsample split columns all contain a reference to the same data, saving them to disk can result in large object sizes when the object is later used. In essence, R replaces all of those references with the actual data. For this reason, we saved zero-row tibbles in their place. This doesn't affect how we use these objects in examples, but be advised that using some rsample functions on them will cause issues.
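The size effect described in this note can be illustrated with a minimal sketch. It assumes a hypothetical toy data set (`toy_data`) rather than the Ames data, and it uses `serialize()` as a rough stand-in for saving to disk. The helper `serialized_size()` is illustrative only and not part of rsample, and this is not necessarily how the saved objects were actually produced.

```{r}
library(rsample)
library(tibble)

# Hypothetical toy data standing in for the Ames training set
set.seed(1)
toy_data <- data.frame(x = rnorm(1000), y = rnorm(1000))
toy_folds <- vfold_cv(toy_data, v = 5)

# Illustrative helper: number of bytes an object occupies once serialized
serialized_size <- function(x) length(serialize(x, connection = NULL))

size_data <- serialized_size(toy_data)
size_splits <- serialized_size(toy_folds$splits)

# Swap every split for a zero-row tibble, as the note describes
empty_splits <- lapply(toy_folds$splits, function(split) tibble())
size_empty <- serialized_size(empty_splits)

# Compare the three sizes (in bytes)
c(data = size_data, splits = size_splits, empty = size_empty)
```

In this sketch the serialized splits should come out several times larger than the data itself, since each split carries its own copy once written out, while the zero-row placeholders take up very little space.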
# Examples
```{r}
library(tune)
ames_grid_search
ames_iter_search
```
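Beyond printing the objects, tuning results are usually summarized with `collect_metrics()` and `show_best()` from tune. The sketch below is a hedged illustration on a hypothetical toy problem (`mtcars`, three folds, two candidate values of `neighbors`), not the Ames analysis; all `toy_*` names are assumptions introduced for the example.

```{r}
library(tidymodels)
library(kknn)

# Hypothetical toy setup (not the Ames analysis): 3-fold CV on mtcars
set.seed(1)
toy_folds <- vfold_cv(mtcars, v = 3)

toy_knn <-
  nearest_neighbor(mode = "regression", neighbors = tune()) %>%
  set_engine("kknn")

# Evaluate two candidate values for the number of neighbors
toy_search <-
  tune_grid(
    toy_knn,
    mpg ~ .,
    resamples = toy_folds,
    grid = tibble(neighbors = c(3, 7))
  )

# One row per candidate and metric, averaged over resamples
toy_metrics <- collect_metrics(toy_search)
toy_metrics

# The single best candidate according to RMSE
toy_best <- show_best(toy_search, metric = "rmse", n = 1)
toy_best
```

The same two calls should apply to `ames_grid_search` and `ames_iter_search`, since they read the stored metrics rather than the split columns.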