StorageExceptionDisruption Error #684
Comments
We need more information to find out what happened and how to avoid this in the future. Is there a longer stack trace? The underlying error is missing from the output. Setting the log level to 'debug' should produce more useful output. The thread halt could be the culprit if the housekeeping thread was involved, since that is where the GC runs. If you can provide the affected storage and source code, we could investigate on our side. |
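A minimal sketch of raising the storage log level at runtime, assuming SLF4J with a Logback backend; the logger name "one.microstream" is an assumption, not something confirmed in this thread:

import org.slf4j.LoggerFactory;

import ch.qos.logback.classic.Level;
import ch.qos.logback.classic.Logger;

public class DebugLogging
{
    public static void enableStorageDebugLogging()
    {
        // Assumes Logback is the SLF4J backend; cast the SLF4J logger to the Logback type.
        // "one.microstream" is the assumed logger name prefix for the storage engine.
        Logger storageLogger = (Logger) LoggerFactory.getLogger("one.microstream");
        storageLogger.setLevel(Level.DEBUG);
    }
}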
@fh-ms I think we know what the problem is. If you look at the error when you restart the database:
... it looks like the file size is inconsistent. We are currently running our testing environment as a Docker container, and we copied the production database onto the staging server, which is mounted into the container. Maybe it's a permission problem? |
One possibility is that the storage files were still being updated while you copied them over. As a quick fix, you can delete the transaction files (transactions_*.sft) in each channel directory and restart the application; they are rebuilt automatically. I don't think it is a permission problem. If it were, the IO layer would throw an exception. |
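A minimal sketch of that quick fix, assuming the default storage layout with channel_0, channel_1, ... subdirectories; the storage path is an assumption, and taking a backup before deleting anything is advisable:

import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class DeleteTransactionFiles
{
    public static void main(String[] args) throws IOException
    {
        // Assumed storage root; adjust to the actual storage directory.
        Path storageRoot = Paths.get("storage");

        // Iterate over the channel directories (channel_0, channel_1, ...).
        try (DirectoryStream<Path> channels = Files.newDirectoryStream(storageRoot, "channel_*"))
        {
            for (Path channel : channels)
            {
                // Delete the transaction log files; they are rebuilt on the next startup.
                try (DirectoryStream<Path> txFiles = Files.newDirectoryStream(channel, "transactions_*.sft"))
                {
                    for (Path txFile : txFiles)
                    {
                        System.out.println("Deleting " + txFile);
                        Files.delete(txFile);
                    }
                }
            }
        }
    }
}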
@fh-ms We used the issueFullBackup method of the storageManager to create the copy. Does this make a difference, or can the storage still end up corrupted? Good to know about deleting the files; we will test this out. What are the transactions_*.sft files used for? |
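For reference, a minimal sketch of how a full backup is typically issued; the import package names and the exact issueFullBackup overload vary between MicroStream versions, so treat this as an assumption to be checked against the version in use:

import one.microstream.afs.nio.types.NioFileSystem;
import one.microstream.storage.embedded.types.EmbeddedStorage;
import one.microstream.storage.embedded.types.EmbeddedStorageManager;

public class FullBackupExample
{
    public static void main(String[] args)
    {
        EmbeddedStorageManager storageManager = EmbeddedStorage.start();

        // Writes a consistent snapshot of all storage files into the target directory.
        // "backup/full" is an assumed target path.
        storageManager.issueFullBackup(
            NioFileSystem.New().ensureDirectoryPath("backup", "full")
        );

        storageManager.shutdown();
    }
}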
@fh-ms Unfortunately, after deleting the transaction files we get the following error on startup:
|
Hello, I had a look at the problem too. From my point of view the root cause of the corrupted files is the first exception (StorageExceptionGarbageCollector). Unfortunately, the StorageExceptionDisruptingExceptions class only prints the message, not the cause, so it is not possible to see what really happened, except that the channel 0 thread was interrupted by some error in the storage GC during a store operation. All other exceptions after the restart are just consequences of this incomplete IO. Created #687 to track the StorageExceptionDisruptingExceptions issue. |
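Until #687 is resolved, one workaround is to log the full cause chain yourself wherever the storage exception is caught at application level; a minimal sketch (the class and method names here are illustrative, not part of the library):

public final class ExceptionChains
{
    private ExceptionChains()
    {
    }

    // Walk and print the complete cause chain of a throwable,
    // so nested storage exceptions are not lost behind a single message.
    public static void printCauseChain(Throwable throwable)
    {
        int depth = 0;
        for (Throwable current = throwable; current != null; current = current.getCause())
        {
            System.err.println("[" + depth++ + "] "
                + current.getClass().getName() + ": " + current.getMessage());
        }
    }
}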
@hg-ms Unfortunately we were not able to recreate the StorageExceptionDisruption error, but we have some other information that might be relevant:
We really hope there is a solution to this problem. |
Can you reproduce the problem without Docker? It's interesting that the error occurs after restarting the Docker container; this indicates that the copy was OK at the first start. Could it be that the Docker restart causes an incomplete write of storage files? The StorageExceptionConsistency: 2 Length 8388092 of file 116011 is inconsistent with the transactions entry's length of 8388596 also indicates that a storage file has not been completely written. The exception says that file No. 116011 of channel 2 is shorter than expected: the file size is 8388092 bytes, but the transaction log says it should be 8388596 bytes. The transaction log is always written after a storage file IO has succeeded. |
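A minimal diagnostic sketch for comparing the actual on-disk sizes against the lengths reported in such exceptions; the storage path and the assumption that data files match channel_*.dat inside the channel directories are mine, not confirmed from this thread:

import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ListDataFileSizes
{
    public static void main(String[] args) throws IOException
    {
        // Assumed storage root; adjust to the actual storage directory.
        Path storageRoot = Paths.get("storage");

        try (DirectoryStream<Path> channels = Files.newDirectoryStream(storageRoot, "channel_*"))
        {
            for (Path channel : channels)
            {
                try (DirectoryStream<Path> dataFiles = Files.newDirectoryStream(channel, "channel_*.dat"))
                {
                    for (Path dataFile : dataFiles)
                    {
                        // Print the actual size to compare against the transaction log entry.
                        System.out.println(dataFile + " -> " + Files.size(dataFile) + " bytes");
                    }
                }
            }
        }
    }
}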
@hg-ms
The same happens on my local Tomcat server. The storage service is defined as:

@Bean(destroyMethod = "shutdown")
public DiStorageService createStorageService() {
    return new StorageService();
}

Do you have any idea why this is happening? |
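For context, a minimal sketch of what such a wrapper might look like, with the storage manager shut down via the bean's destroyMethod; DiStorageService, StorageService, the storage directory, and the MicroStream package names are assumptions based on the snippet above, not the poster's actual code:

import java.nio.file.Paths;

import one.microstream.storage.embedded.types.EmbeddedStorage;
import one.microstream.storage.embedded.types.EmbeddedStorageManager;

// Hypothetical interface matching the bean's declared return type.
interface DiStorageService
{
    void shutdown();
}

// Hypothetical implementation that owns the embedded storage manager.
class StorageService implements DiStorageService
{
    private final EmbeddedStorageManager storageManager;

    StorageService()
    {
        // Assumed storage directory; adjust to the real configuration.
        this.storageManager = EmbeddedStorage.start(Paths.get("storage"));
    }

    @Override
    public void shutdown()
    {
        // Invoked by Spring via destroyMethod = "shutdown" when the context closes.
        this.storageManager.shutdown();
    }
}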
Environment Details
Describe the bug
We encountered the following exception during runtime:
After restarting the servlet we can no longer initialize MicroStream:
Caused by: org.springframework.context.ApplicationContextException: Could not initiate Microstream properly: Problem in channel #0
Is there a way to figure out:
To Reproduce
We have no idea how it happened, since it occurred in a production database at runtime without further logging that could help us. We suspect that a bug in the application (and a possible halt of a thread) caused the garbage collector to disrupt the DB.