Datasets used in ST-SSL

We provide several datasets used in the ST-SSL framework, which leverages self-supervised learning for traffic flow prediction.

The datasets are NYCBike1, NYCBike2, NYCTaxi, and BJTaxi.

Please use Git Large File Storage (LFS) to pull this repo to your computer.

You can also download the dataset at Beihang Cloud Drive or Google Drive.

Dataset Format

Each dataset is composed of 4 files, namely train.npz, val.npz, test.npz, and adj_mx.npz.

|----NYCBike1\
|    |----train.npz    # training data
|    |----adj_mx.npz   # predefined graph structure
|    |----test.npz     # test data
|    |----val.npz      # validation data

Train/Val/Test data is composed of 4 numpy.ndarray objects:

X: input data. It is a 4D tensor of shape (#samples, #lookback_window, #nodes, #flow_types), where # means "number of".
Y: data to be predicted. It is a 4D tensor of shape (#samples, #predict_horizon, #nodes, #flow_types). Note that X and Y are paired along the sample dimension. For instance, (X_i, Y_i) is the i-th data sample, with i indexing the sample dimension.
X_offset: a list indicating offsets of X's lookback window relative to the current time with offset 0.
Y_offset: a list indicating offsets of Y's prediction horizon relative to the current time with offset 0.
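To make the layout above concrete, here is a minimal sketch that builds a toy archive with the same four arrays and reads one paired sample back. The sizes and offset values are made up for illustration, and the key names are assumed to match the names used in this README; the real files may use different keys, so check data.files first.

```python
import io
import numpy as np

# Hypothetical sizes, chosen only for illustration (not taken from any dataset).
num_samples, lookback, horizon, num_nodes, num_flow_types = 5, 4, 1, 6, 2

rng = np.random.default_rng(0)
X = rng.random((num_samples, lookback, num_nodes, num_flow_types))
Y = rng.random((num_samples, horizon, num_nodes, num_flow_types))
X_offset = np.arange(-lookback + 1, 1)  # made-up contiguous offsets: [-3, -2, -1, 0]
Y_offset = np.arange(1, horizon + 1)    # made-up offset: [1]

# Write the toy archive to memory instead of disk, then load it like a real file.
buffer = io.BytesIO()
np.savez(buffer, X=X, Y=Y, X_offset=X_offset, Y_offset=Y_offset)
buffer.seek(0)
data = np.load(buffer)

# X and Y are paired along the first (sample) dimension.
assert data["X"].shape[0] == data["Y"].shape[0]
x_0, y_0 = data["X"][0], data["Y"][0]  # the first (X_i, Y_i) pair
print(x_0.shape, y_0.shape)
```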

For all datasets, the flows from the previous 2 hours and the flows from the previous 3 days around the predicted time are used to forecast the flows at the next time step.

adj_mx.npz stores the graph adjacency matrix, which describes the spatial relation between each pair of regions/nodes in the studied area.
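As a rough sanity check, the adjacency matrix should be square with one row per node in X. The sketch below uses a toy 4-node matrix rather than a real adj_mx.npz, because the key name inside that archive is not documented here. The helpers check_adjacency and neighbors are hypothetical names introduced for this example.

```python
import numpy as np

def check_adjacency(adj, x):
    """Return True if adj is square and has one row per node of x."""
    num_nodes = x.shape[2]  # x is (#samples, #lookback_window, #nodes, #flow_types)
    return adj.ndim == 2 and adj.shape == (num_nodes, num_nodes)

def neighbors(adj, node):
    """Return the indices of nodes with a nonzero entry in the given node's row."""
    return np.flatnonzero(adj[node]).tolist()

# Toy adjacency for a hypothetical 2x2 grid of regions (4 nodes).
adj = np.array(
    [[0, 1, 1, 0],
     [1, 0, 0, 1],
     [1, 0, 0, 1],
     [0, 1, 1, 0]],
    dtype=np.float32,
)
x = np.zeros((5, 4, 4, 2))  # toy input tensor with 4 nodes

print(check_adjacency(adj, x))
print(neighbors(adj, 0))
```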

Dataset Usage

You can use the following code to view the data:

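import numpy as np

# Load one split; an .npz archive behaves like a dict of named arrays.
data = np.load('./BJTaxi/train.npz')
for name in data.files:
    # Print each array's name and shape.
    print(name, data[name].shape)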

Raw Data

We preprocessed all datasets into sliding-window samples. The raw data of NYCBike1 and BJTaxi are collected from STResNet. The raw data of NYCBike2 and NYCTaxi are collected from STDN.
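The sliding-window idea can be sketched as follows, assuming the raw data is an array of shape (#time_steps, #nodes, #flow_types). This is not the preprocessing script used for these datasets. The function make_windows and the toy offsets are hypothetical: here a "day" is 4 steps long and the lookback mixes two daily offsets with two recent ones.

```python
import numpy as np

def make_windows(series, x_offsets, y_offsets):
    """Slice a (T, nodes, flow_types) series into paired X/Y samples."""
    x_offsets = np.asarray(x_offsets)
    y_offsets = np.asarray(y_offsets)
    # First "current time" t that has a full history behind it.
    min_t = max(0, -int(x_offsets.min()))
    # Exclusive upper bound so that t + max(y_offsets) stays inside the series.
    max_t = series.shape[0] - max(0, int(y_offsets.max()))
    X = np.stack([series[t + x_offsets] for t in range(min_t, max_t)])
    Y = np.stack([series[t + y_offsets] for t in range(min_t, max_t)])
    return X, Y

# Toy series: 20 time steps, 3 nodes, 2 flow types.
series = np.arange(20 * 3 * 2).reshape(20, 3, 2)
x_offsets = [-8, -4, -1, 0]  # made-up: two "daily" offsets plus two recent steps
y_offsets = [1]              # predict the next time step

X, Y = make_windows(series, x_offsets, y_offsets)
print(X.shape, Y.shape)
```

Each sample X[i] gathers the time steps t + x_offsets for one current time t, and Y[i] gathers t + y_offsets, so the two arrays stay paired along the sample dimension.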
