recalibratePlpRefit returns a dataframe instead of an object of class runPlp #470
Comments
The documentation is definitely wrong. Thanks for reporting this. It appears to return the predictions for the new population from both the original model and the recalibrated model. It also adds attributes describing how the model needed to be adjusted, but these seem hidden. It does refit the model, so this could be edited to return a runPlp, but I need to check whether it is used inside other functions before editing.
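The hidden adjustment attributes mentioned above can be read back with `attr()`. The following is only a minimal sketch on a mocked data frame, not real PatientLevelPrediction output: the attribute name `recalibratePlpRefit` and its `adjust` and `newIntercept` fields are taken from the function body posted later in this thread, and every value is invented for illustration.

```r
# Mock of the data frame returned by recalibratePlpRefit (all values invented).
prediction <- data.frame(
  rowId = c(1, 2, 1, 2),
  value = c(0.10, 0.40, 0.12, 0.35),
  evaluationType = c(
    "validation", "validation",
    "recalibrationRefit", "recalibrationRefit"
  )
)
attr(prediction, "metaData") <- list(
  recalibratePlpRefit = list(
    adjust = data.frame(covariateId = 1002, covariateValue = 0.8),
    newIntercept = -1.5
  )
)

# print() on a data frame does not display these attributes,
# so they have to be read explicitly.
recalibration <- attr(prediction, "metaData")$recalibratePlpRefit
print(recalibration$newIntercept)
print(recalibration$adjust)

# Rows predicted by the refitted model only.
refit <- prediction[prediction$evaluationType == "recalibrationRefit", ]
```

The `evaluationType` column is what separates the original model's predictions ("validation") from the refitted ones ("recalibrationRefit") in the combined data frame.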
Could you provide some information on how to generate the runPlp object myself?
Hi @Volpym, I'm now looking at the recalibration code a bit in relation to a study I'm doing. First, if you want to return the model from the [...]
Modified recalibratePlpRefit
recalibratePlpRefit <- function(
    plpModel,
    newPopulation,
    newData,
    returnModel = FALSE) {
  checkNotNull(plpModel)
  checkNotNull(newPopulation)
  checkNotNull(newData)
  checkIsClass(plpModel, "plpModel")
  checkIsClass(newData, "plpData")
  checkBoolean(returnModel)
  # get selected covariates
  includeCovariateIds <- plpModel$covariateImportance %>%
    dplyr::filter(.data$covariateValue != 0) %>%
    dplyr::select("covariateId") %>%
    dplyr::pull()
  # check which covariates are included in new data
  containedIds <- newData$covariateData$covariateRef %>% dplyr::collect()
  noShrinkage <- intersect(includeCovariateIds, containedIds$covariateId)
  # add intercept
  noShrinkage <- append(noShrinkage, 0, 0)
  setLassoRefit <- setLassoLogisticRegression(
    includeCovariateIds = includeCovariateIds,
    noShrinkage = noShrinkage,
    maxIterations = 10000 # increasing this due to test code often not converging
  )
  newData$labels <- newPopulation
  newData$folds <- data.frame(
    rowId = newData$labels$rowId,
    index = sample(2, length(newData$labels$rowId), replace = TRUE)
  )
  # add dummy settings to fit model
  attr(newData, "metaData")$outcomeId <- attr(newPopulation, "metaData")$outcomeId
  attr(newData, "metaData")$targetId <- attr(newPopulation, "metaData")$targetId
  attr(newData, "metaData")$restrictPlpDataSettings <- attr(newPopulation, "metaData")$restrictPlpDataSettings
  attr(newData, "metaData")$covariateSettings <- newData$metaData$covariateSettings
  attr(newData, "metaData")$populationSettings <- attr(newPopulation, "metaData")$populationSettings
  attr(newData$covariateData, "metaData")$featureEngineeringSettings <- PatientLevelPrediction::createFeatureEngineeringSettings()
  attr(newData$covariateData, "metaData")$preprocessSettings <- PatientLevelPrediction::createPreprocessSettings()
  attr(newData, "metaData")$splitSettings <- PatientLevelPrediction::createDefaultSplitSetting()
  attr(newData, "metaData")$sampleSettings <- PatientLevelPrediction::createSampleSettings()
  newModel <- tryCatch(
    {
      fitPlp(
        trainData = newData,
        modelSettings = setLassoRefit,
        analysisId = "recalibrationRefit",
        analysisPath = NULL
      )
    },
    error = function(e) {
      ParallelLogger::logInfo(e)
      return(NULL)
    }
  )
  if (is.null(newModel)) {
    ParallelLogger::logInfo("Recalibration fit failed")
    return(NULL)
  }
  newModel$prediction$evaluationType <- "recalibrationRefit"
  oldPred <- predictPlp(
    plpModel = plpModel,
    plpData = newData,
    population = newPopulation,
    timepoint = 0
  )
  oldPred$evaluationType <- "validation"
  prediction <- rbind(
    oldPred,
    newModel$prediction[, colnames(oldPred)]
  )
  if (!is.null(newModel$covariateImportance)) {
    adjust <- newModel$covariateImportance %>%
      dplyr::filter(.data$covariateValue != 0) %>%
      dplyr::select(
        "covariateId",
        "covariateValue"
      )
  } else {
    adjust <- c()
  }
  newIntercept <- newModel$model$coefficients[names(newModel$model$coefficients) == "(Intercept)"]
  attr(prediction, "metaData")$recalibratePlpRefit <- list(adjust = adjust, newIntercept = newIntercept)
  if (returnModel) {
    return(model = newModel)
  } else {
    return(prediction)
  }
}
But this is not a [...]
But I've been thinking about our three types of recalibration, [...]
Could you tell me more about your use case and for example why you are looking at the [...]
Egill
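The covariate-selection step at the top of the modified function can be tried in isolation. This is a sketch in base R only, with invented covariate ids and coefficients standing in for `plpModel$covariateImportance` and the new data's `covariateRef`; it mirrors the filter, `intersect`, and `append` calls in the function but is not the package's own code.

```r
# Hypothetical covariate importance table from the original model.
covariateImportance <- data.frame(
  covariateId = c(1002, 2003, 3004, 4005),
  covariateValue = c(0.7, 0, -0.3, 1.1)
)
# Hypothetical covariate ids available in the new data.
containedIds <- c(1002, 4005, 5006)

# Keep only covariates with a non-zero coefficient in the original model.
includeCovariateIds <-
  covariateImportance$covariateId[covariateImportance$covariateValue != 0]

# Of those, the ones also present in the new data are refitted without shrinkage.
noShrinkage <- intersect(includeCovariateIds, containedIds)

# append(x, values, after = 0) inserts the intercept id (0) at the front.
noShrinkage <- append(noShrinkage, 0, 0)
print(noShrinkage)
```

Covariate 3004 was selected by the original model but is absent from the new data, so it stays in `includeCovariateIds` while being left out of `noShrinkage`.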
Describe the bug
Running recalibratePlpRefit doesn't return an object of class runPlp recalibrated on the new data object, as stated in the documentation, but an object of type data.frame. Am I missing something?
Set up (please run in R "sessionInfo()" and copy the output here):
To Reproduce
Context
I have divided the Eunomia dataset into three chunks. I trained a PLP model on the first chunk and now I am recalibrating it using the data from the second chunk. Next, I plan to recalibrate this model again using the third chunk of data. Finally, I will compare this final recalibrated model with the PLP model trained on the entire Eunomia dataset.
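A three-way split like the one described could be sketched as follows. This is an illustration only, not the reporter's code: the `population` data frame, its size, and the seed are assumptions, and a real study population would come from PatientLevelPrediction rather than being built by hand.

```r
set.seed(42) # arbitrary seed so the split is reproducible

# Hypothetical population: one row per subject.
population <- data.frame(rowId = 1:90)

# Assign every row to one of three equally sized chunks, in random order.
chunk <- sample(rep(1:3, length.out = nrow(population)))
chunks <- split(population, chunk)

# Number of subjects per chunk.
chunkSizes <- sapply(chunks, nrow)
print(chunkSizes)
```

Each element of `chunks` can then serve as the population for one step: the first for training, the second and third for the successive recalibrations.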