---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
options(width = 999)
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
devtools::load_all()
```
# migrate <img src='man/figures/logo.png' align="right" height="200" />
<!-- badges: start -->
[![lifecycle](https://img.shields.io/badge/lifecycle-stable-green.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)
[![CRAN status](https://www.r-pkg.org/badges/version/migrate)](https://CRAN.R-project.org/package=migrate)
[![metacran downloads](https://cranlogs.r-pkg.org/badges/migrate)](https://cran.r-project.org/package=migrate)
[![R-CMD-check](https://github.com/ketchbrookanalytics/migrate/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/ketchbrookanalytics/migrate/actions/workflows/R-CMD-check.yaml)
<!-- badges: end -->
{migrate} gives users an easy-to-use set of tools for building *state transition matrices*.
<br>
![](man/figures/gt_tbl.png)
## Methodology
{migrate} provides an easy way to calculate absolute or percentage migration within a credit portfolio. The above image shows a typical credit migration matrix using the *absolute* approach; each cell in the grid represents the total balance in the portfolio at 2020-06-30 that started at the Risk Rating shown on the left-hand vertical axis and ended (at 2020-09-30) at the Risk Rating shown on the upper horizontal axis. For example, $6.58M moved from a Risk Rating of **AAA** at 2020-06-30 to a Risk Rating of **AA** at 2020-09-30.
While the *absolute* migration example above is typically a reporting function, the *percentage* (or probabilistic) methodology is more of a statistical modeling exercise, commonly used in credit portfolio risk management. Currently, this package only supports the simple "cohort" methodology, which estimates the probability of moving from state *i* to state *j* in a single time step, echoing a Markov process. We can visualize this in a matrix, for a credit portfolio with *N* unique, ordinal states:
![](man/figures/markov_matrix.png)
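As a rough illustration of the cohort idea, the usual textbook estimator can be written as follows. The symbol $n_{ij}$ is notation introduced here (it does not appear in the package): it stands for the amount (a count of IDs, or a summed metric such as balance) that starts in state *i* and ends in state *j*. Whether the package weights by count or by a metric depends on how `migrate()` is called, so treat this as a sketch rather than a statement of the exact implementation.

$$
\hat{p}_{ij} = \frac{n_{ij}}{\sum_{k=1}^{N} n_{ik}}, \qquad \sum_{j=1}^{N} \hat{p}_{ij} = 1 \quad \text{for each starting state } i
$$

Each row of the matrix is therefore a probability distribution over the ending states for one starting state.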
### Future Plans for {migrate}
Future development plans for this package include building functionality for the more complex **duration**/**hazard** methodology, including both the *time-homogeneous* and *non-homogeneous* implementations.
## Installation
You can install the released version of {migrate} from [CRAN](https://CRAN.R-project.org) with:
``` {r, eval = FALSE}
install.packages("migrate")
```
And the development version from [GitHub](https://github.com/) with:
``` {r, eval = FALSE}
# install.packages("devtools")
devtools::install_github("ketchbrookanalytics/migrate")
```
## Practical Usage
{migrate} currently handles transitions between exactly two (2) timepoints only. Under the hood, `migrate()` finds the earliest and latest dates in the given *time* variable, and filters out any observations where the *time* value does not match either of those two dates.
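The same two-timepoint restriction can also be applied explicitly before calling `migrate()`. The snippet below is a minimal base R sketch, not the package's internal code; the `toy` data frame and its column names are hypothetical and exist only for illustration.

```r
# Toy data with three distinct dates (hypothetical values, for illustration only)
toy <- data.frame(
  id    = c(1, 1, 1, 2, 2, 2),
  date  = as.Date(c("2020-06-30", "2020-07-31", "2020-09-30",
                    "2020-06-30", "2020-07-31", "2020-09-30")),
  state = c("AAA", "AA", "AA", "A", "A", "BBB"),
  stringsAsFactors = FALSE
)

# Identify the earliest and latest dates in the time variable
endpoints <- range(toy$date)

# Keep only the observations recorded at one of those two dates
toy_two_dates <- toy[toy$date %in% endpoints, ]

unique(toy_two_dates$date)
```

Filtering up front like this makes it explicit which two snapshots are being compared.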
If you are writing a SQL query to get data to be used with `migrate()`, the query would likely look something like this:
```{r, eval = FALSE}
# -- Get the *State* risk status and *Balance* dollar amount for each ID, at two distinct dates
# SELECT ID, Date, State, Balance
# FROM my_database
# WHERE Date IN ('2020-12-31', '2021-06-30')
```
By default, `migrate()` drops observations that belong to IDs found at only a single timepoint. However, users can define a *filler state* so that such IDs are not removed but are instead migrated from or to this *filler state*. This allows for more flexible handling of such data, ensuring that no information is lost during the migration process. See [Handle IDs with observations at a single timepoint](https://ketchbrookanalytics.github.io/migrate/articles/migrate.html#handle-ids-with-observations-at-a-single-timepoint) for more information.
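To make the *filler state* idea concrete, here is a conceptual base R sketch. It does not use the {migrate} API and is not how the package implements the feature; the data frames, column names, and the `"No Rating"` label are all hypothetical.

```r
# Hypothetical snapshots: ID 3 appears only at the start, ID 4 only at the end
start <- data.frame(
  id = c(1, 2, 3),
  state_start = c("AAA", "AA", "A"),
  stringsAsFactors = FALSE
)
end <- data.frame(
  id = c(1, 2, 4),
  state_end = c("AA", "AA", "BBB"),
  stringsAsFactors = FALSE
)

# A full outer join keeps IDs seen at only one timepoint (the missing side is NA)
both <- merge(start, end, by = "id", all = TRUE)

# Replace the missing side with an illustrative filler label
filler <- "No Rating"  # hypothetical label, chosen for this example
both$state_start[is.na(both$state_start)] <- filler
both$state_end[is.na(both$state_end)] <- filler

both
```

In this sketch, ID 3 migrates from `A` to the filler state and ID 4 migrates from the filler state to `BBB`, instead of both being dropped.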
## Example
First, load the package using `library()`:
```{r load, eval = FALSE}
library(migrate)
```
The package has a built-in mock dataset, which can be loaded into the environment like so:
```{r data, eval = FALSE}
data("mock_credit")
head(mock_credit[order(mock_credit$customer_id), ]) # sort by 'customer_id'
```
```{r data_tbl, echo = FALSE}
head(mock_credit[order(mock_credit$customer_id), ]) |>
  knitr::kable(row.names = FALSE)
```
An important feature of the `mock_credit` dataset is that there are exactly two (2) unique values in the `date` column; if the `time` argument passed to `migrate()` has more than two (2) unique values, the function will throw an error.
```{r dates}
unique(mock_credit$date)
```
To summarize the migration within the data, use the `migrate()` function:
```{r migrate}
migrated_df <- migrate(
  data = mock_credit,
  id = customer_id,
  time = date,
  state = risk_rating
)
head(migrated_df)
```
To create the state transition matrix, use the `build_matrix()` function:
```{r matrix}
build_matrix(migrated_df)
```
Or, to do it all in one shot, use the native pipe operator (`|>`):
```{r pipe}
mock_credit |>
  migrate(
    id = customer_id,
    time = date,
    state = risk_rating,
    metric = principal_balance,
    percent = FALSE,
    verbose = FALSE
  ) |>
  build_matrix(
    state_start = risk_rating_start,
    state_end = risk_rating_end,
    metric = principal_balance
  )
```
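For intuition about what an absolute and a percentage migration matrix contain, the following base R sketch builds both from a tiny hand-made table. It does not call {migrate}; the `pairs` data frame and its values are hypothetical, and the row-normalization shown is the generic cohort calculation rather than a confirmed description of the package internals.

```r
# Hypothetical start/end state pairs with a balance for each ID
pairs <- data.frame(
  state_start = c("AAA", "AAA", "AA", "AA"),
  state_end   = c("AAA", "AA",  "AA", "A"),
  balance     = c(100, 50, 30, 70),
  stringsAsFactors = FALSE
)

# Absolute migration: total balance for each (start, end) combination
abs_mat <- xtabs(balance ~ state_start + state_end, data = pairs)

# Percentage migration: divide each row by its row total, so each row sums to 1
pct_mat <- prop.table(abs_mat, margin = 1)

abs_mat
pct_mat
```

Note that `xtabs()` orders the states alphabetically; for ordinal ratings you would typically convert the state columns to factors with explicitly ordered levels first.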