-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: cleanup old model revision files #238
base: main
Are you sure you want to change the base?
Conversation
{ | ||
_logger.LogInformation("Deleting expired model file: {FilePath}", revision.FilePath); | ||
StateUtils.DeleteFileAndDirectory(revision.FilePath, revision.IsInDirectory); | ||
revision.IsDeleted = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are you not deleting the entry from the state entirely? it would make the code much simpler
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
talked, figured that there will be issues in that case as well:
- once the model is removed from the state and the new run happens it would be downloaded to the temporary storage only (and removed after)
- to fix this properly we need to make the temp models be committed back to the main state after the use
if(model.IsDeleted){ | ||
var downloaded = await DownloadFileAsync(model, true).ConfigureAwait(false); | ||
if(downloaded){ | ||
model.IsDeleted = false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you remove the revision from the state (instead of only deleting the file) you should not need this
var revisions = _state.Values | ||
.Where(s => s.IsInDirectory | ||
&& !string.IsNullOrEmpty(s.FilePath) | ||
&& s.LastAccessTime.AddDays(_config.FilesTTL) < _dateTimeProvider.UtcNow) | ||
.OrderByDescending(s => s.CreatedTime); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Multiple revisions can reuse one file, so this is unsafe.
also, removing the latest revision for the model is most likely not a good idea.
this code from VerifyLocalModelState
could be taken as an example (or reused?)
var filesInUseMap = _state.Values
.Where(f => !string.IsNullOrEmpty(f.FilePath))
.ToDictionarySafe(f => f.FilePath, f => true);
if (statesToDelete.Any())
{
_logger.LogWarning("Removing {Num} model versions not found in CDF: {Versions}",
statesToDelete.Count,
string.Join(", ", statesToDelete.Select(s => s.ModelExternalId + " v" + s.Version)));
var statesToDeleteWithFile = statesToDelete
.Select(s => (s, !filesInUseMap.ContainsKey(s.FilePath)));
await _store
.RemoveFileStates(_config.FilesTable, statesToDeleteWithFile, token)
.ConfigureAwait(false);
}
@@ -196,6 +200,7 @@ await _store.RestoreExtractionState<U, T>( | |||
HashSet<string> idsToKeep = new HashSet<string>(_state.Select(s => s.Value.Id)); | |||
ldbStore.RemoveUnusedState(_config.FilesTable, idsToKeep); | |||
} | |||
CleanExpiredModelRevisionFiles(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only on restart? this may never happen if the connector is stable enough
This change introduces a new property,
LastAccessTime
, in the local state for simulator model revisions. Each time a revision is accessed, this property will be updated. Based on theFilesTTL
value defined in theconfig.yaml
, it will check whether the last usage of a file exceeds the specified retention period. If the file has not been accessed within the defined timeframe, it will be deleted.