survstack is a Python implementation of the survival stacking method proposed in "Survival stacking: casting survival analysis as a classification problem" by Erin Craig, Chenyang Zhong, and Robert Tibshirani (2021) [1]. The package offers both an object-oriented and a functional interface.
The recommended entry point is the SurvivalStacker class. Survival data follows the format used by the scikit-survival package: a structured array whose first field is a boolean indicating whether an event was observed, and whose second field is the survival time.
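As a rough illustration of that data format, the sketch below builds a small structured array by hand with NumPy. The field names "event" and "time" and the values are made up for this example; the description above only fixes the order of the two fields.

```python
import numpy as np

# Field names "event" and "time" are illustrative assumptions; the format
# described above only fixes the order: boolean event indicator first,
# survival time second.
y_example = np.array(
    [(True, 12.0), (False, 30.5), (True, 7.25)],
    dtype=[("event", "?"), ("time", "<f8")],
)

# Recover the field names by position, as in the package example.
event_field, time_field = y_example.dtype.names
print(y_example[event_field].sum(), y_example[time_field].max())
# 2 30.5
```

Here two of the three samples have an observed event, and the second sample is censored at time 30.5.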
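import numpy as np
from survstack import SurvivalStacker
from sksurv.datasets import load_breast_cancer

# Load the data and keep only the float-valued feature columns
X, y = load_breast_cancer()
X = X.loc[:, X.dtypes == np.float64].values

# y is a structured array: (event indicator, survival time)
event_field, time_field = y.dtype.names
print(X.shape, y.shape, y[event_field].sum())
# (198, 78) (198,) 51

# Stack the survival data into a classification data set
ss = SurvivalStacker()
X_stacked, y_stacked = ss.fit_transform(X, y)
print(X_stacked.shape, y_stacked.shape)
# (8117, 129) (8117,)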
In the example above, the number of columns in X grows by the number of observed events (78 + 51 = 129), while y becomes a single column. The number of rows grows because each sample contributes one row for every event time at which it is still under observation.
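To make the change in shape concrete, here is a minimal NumPy sketch of the stacking idea on toy data. The helper `stack_survival` and the toy arrays are hypothetical and written only for illustration; this is not survstack's API or its actual implementation, and it assumes one indicator column per distinct event time.

```python
import numpy as np


def stack_survival(X, event, time):
    """Hypothetical helper: stack (X, event, time) into a classification set."""
    X = np.asarray(X, dtype=float)
    event = np.asarray(event, dtype=bool)
    time = np.asarray(time, dtype=float)

    # One risk set per distinct time at which an event was observed.
    event_times = np.unique(time[event])

    X_blocks, y_blocks = [], []
    for k, t in enumerate(event_times):
        # Samples still under observation at time t.
        at_risk = time >= t
        n_at_risk = int(at_risk.sum())

        # Indicator columns marking which risk set each row belongs to.
        indicators = np.zeros((n_at_risk, len(event_times)))
        indicators[:, k] = 1.0
        X_blocks.append(np.hstack([X[at_risk], indicators]))

        # Label is 1 only for samples whose event occurs exactly at t.
        y_blocks.append((event[at_risk] & (time[at_risk] == t)).astype(int))

    return np.vstack(X_blocks), np.concatenate(y_blocks)


# Toy data: 4 samples, 2 features; the third sample is censored at time 3.
X_toy = np.array([[0.1, 1.0], [0.2, 2.0], [0.3, 3.0], [0.4, 4.0]])
event_toy = np.array([True, True, False, True])
time_toy = np.array([2.0, 5.0, 3.0, 7.0])

X_s, y_s = stack_survival(X_toy, event_toy, time_toy)
print(X_s.shape, y_s.shape)
# (7, 5) (7,)
```

With three event times (2, 5, and 7), the risk sets contain 4, 2, and 1 samples, giving 7 rows, and the 2 original features gain 3 indicator columns.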
@article{Craig2021-or,
title={Survival stacking: casting survival analysis as a classification problem},
author={Craig, Erin and Zhong, Chenyang and Tibshirani, Robert},
journal={arXiv preprint arXiv:2107.13480},
year={2021}
}