Skip to content
/ zugr Public
forked from JBGruber/zugr

R package to wrap the Deutsche Bahn Fahrplan API 🚄

License

Notifications You must be signed in to change notification settings

Benrath/zugr

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

zugr

Lifecycle: experimental

Just a small package that wraps the Fahrplan API of the German train company db. The main idea is to be able to be able to cast your net a bit wider in cases when you are flexible to go on a different hour, day, or week.

Installation

You can install the development version of zugr like so (or with remotes, but the cool kids use pak now 😁):

# install.packages("pak)
pak::pak("JBGruber/zugr")

Example: When should I go to Amsterdam

A journey that I take often is from Wiesbaden to Amsterdam. There are really fast connections between these cities, which is awsome! What is less great is trying to decide if I go Monday or Tuesday, on the weekend or during the week, or if I should rather go in a different week? If you’ve used the interface of bahn.de, you know these decisions require a lot of useless click-work. Well not anymore…

The first step is to get the station ID for the main stations in Wiesbaden and Amsterdam:

library(zugr)
wi <- search_station("Wiesbaden")
wi
#> # A tibble: 10 × 2
#>    name                   id                                                    
#>    <chr>                  <chr>                                                 
#>  1 Wiesbaden Hbf          A=1@O=Wiesbaden Hbf@X=8243729@Y=50070788@U=80@L=80002…
#>  2 Wiesbaden-Mainz-Kast   A=1@O=Wiesbaden-Mainz-Kast@X=8283173@Y=50006578@U=80@…
#>  3 Wiesbaden-Auringen-M   A=1@O=Wiesbaden-Auringen-M@X=8332092@Y=50113604@U=80@…
#>  4 Wiesbaden-Erbenheim    A=1@O=Wiesbaden-Erbenheim@X=8295048@Y=50054410@U=80@L…
#>  5 Wiesbaden Ost          A=1@O=Wiesbaden Ost@X=8256295@Y=50041411@U=80@L=80064…
#>  6 Wiesbaden-Biebrich     A=1@O=Wiesbaden-Biebrich@X=8236420@Y=50048827@U=80@L=…
#>  7 Wiesbaden-Igstadt      A=1@O=Wiesbaden-Igstadt@X=8325638@Y=50082582@U=80@L=8…
#>  8 Wiesbaden-Schierstein  A=1@O=Wiesbaden-Schierstein@X=8194602@Y=50047587@U=80…
#>  9 Wiesbaden-Chausseehaus A=1@O=Wiesbaden-Chausseehaus@X=8168794@Y=50097765@U=8…
#> 10 Wiesbaden-Biebr. KD    A=1@O=Wiesbaden-Biebr. KD@X=8236600@Y=50036234@U=80@L…

The first one is what I need! Now Amsterdam. Here I don’t need as many options:

adam <- search_station("Amsterdam", n_res = 1)
adam
#> # A tibble: 1 × 2
#>   name                         id                                               
#>   <chr>                        <chr>                                            
#> 1 Amsterdam Airport (Schiphol) A=1@O=Amsterdam Airport (Schiphol)@X=4761487@Y=5…

Now I can make a search for next Tuesday Morning (as of writing this):

next_tuesday <- bahn_search(
  from = wi$id[1],
  to = adam$id[1],
  start = "2023-10-03T05:00:00",
  end =  "2023-10-03T19:00:00"
)
next_tuesday
#> # A tibble: 6 × 6
#>   id         duration             price changes start              
#>   <chr>      <Duration>           <dbl>   <int> <dttm>             
#> 1 b88e588e_3 22440s (~6.23 hours)  37.9       3 2023-10-03 05:30:00
#> 2 78b5d1d5_3 20760s (~5.77 hours)  NA         2 2023-10-03 06:13:00
#> 3 ef00cd9f_3 21360s (~5.93 hours)  88.9       4 2023-10-03 06:01:00
#> 4 f944674b_3 18300s (~5.08 hours) 109.        2 2023-10-03 07:13:00
#> 5 65959c56_3 23160s (~6.43 hours)  27.9       4 2023-10-03 07:01:00
#> 6 0893b3bf_3 21360s (~5.93 hours)  51.9       3 2023-10-03 07:58:00
#> # ℹ 1 more variable: end <dttm>

Note that the end parameter marks the time at which the train should arrive at the latest, while start marks the earliest departure that should be included in the results.

So what is the cheapest connection? Easy to see with some R commands :smirk::

library(tidyverse)
next_tuesday |> 
  slice_min(price, n = 1) |> 
  select(-id)
#> # A tibble: 1 × 5
#>   duration             price changes start               end                
#>   <Duration>           <dbl>   <int> <dttm>              <dttm>             
#> 1 23160s (~6.43 hours)  27.9       4 2023-10-03 07:01:00 2023-10-03 13:14:00

With 4 connections and more than 6 hours, this is a rather terrible connections. It is cheap though. Let’s see s quick visualisation instead:

next_tuesday |> 
  # some connections include trains from different companies, the prices for
  # these are not oncluded
  filter(!is.na(price)) |>
  ggplot(aes(x = start, y = price, label = changes, fill = duration)) +
  geom_col() +
  geom_text(aes(y = price / 2), colour = "white") +
  scale_y_continuous(labels = function(x) paste0(x, "")) +
  scale_fill_gradient(labels = function(x) paste0(round(x / 60 / 60, 1), "h")) +
  labs(x = "depature", y = NULL, 
       title = "Prices and number of changes for different connections") +
  theme_minimal()

With more data, this gets a little more interesting.

About

R package to wrap the Deutsche Bahn Fahrplan API 🚄

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • R 100.0%