Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EXPOSED-86 allowing blobs with streams of unknown size #1777

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

elektro-wolle
Copy link
Contributor

InputStreams for large blobs or blobs with unknown size fail, see description in

https://youtrack.jetbrains.com/issue/EXPOSED-86/JdbcPreparedStatementImpl-fails-on-input-stream-with-unknown-size-breaks-CipherStreams-in-blobs.

This fix also skips the side-effect in the getter of ExposedBlob.bytes, which silently read the full input-stream and replacing it with an ByteArrayInputStream.

}
} catch (e: SQLFeatureNotSupportedException) {
// fallback to bytes
statement.setBytes(index, inputStream.readAllBytes())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested it and it appears that this line works for all cases. Why not use just this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

readAllBytes() will fail on contents larger than 2GB (maximum size of the ByteArray) and allocates the whole data within RAM. The setBinaryStream function transfers just small chunks to the database until the stream is fully read (and allocates only one of those small chunks on the heap).

Copy link
Collaborator

@joc-a joc-a Jul 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying.

One thing I'm concerned about is having to check the type of InputStream and do something specific for each type. InputStream has a very large number of sub-classes and I wonder how sustainable/maintainable it is to have a separate approach for each one instead of having a generic solution that could potentially work for most, if not all of them, if that's possible.

Also, I tried testing it using a 3GB file created using RandomAccessFile and it failed for all databases for being too large.

        val fileSize = 3L * 1024L * 1024L * 1024L // 3GB
        val filePath = "path_to_large_file.txt"

        try {
            val randomAccessFile = RandomAccessFile(filePath, "rw")
            randomAccessFile.setLength(fileSize)
            println("Large file created: $filePath")
            randomAccessFile.close()
        } catch (e: IOException) {
            e.printStackTrace()
        }
        
        val longBlob = ExposedBlob(
              inputStream = FileInputStream(filePath)
        )

Is your solution supposed to cover this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, there is no interface within JDK to decide if the InputStream has a known size. But maybe, it's easier to skip the functions with the provided length.

I'll have to look into your 3GB test-case, it should have worked, maybe, it is necessary to use Connection.createBlob instead of InputStream. I hoped, this would be transparently handled by the JDBC-driver.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll split the PR into two separate PRs: one for the available() bug (with is more urgent for me) and the blob handling with Connection.createBlob() (which currently fails on columns with default values).

Copy link
Collaborator

@joc-a joc-a left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution. Left a comment.

@elektro-wolle elektro-wolle changed the title EXPOSED-86 allowing blobs with streams of unknown size or sizes > 2GB EXPOSED-86 allowing blobs with streams of unknown size Jul 22, 2023
@elektro-wolle
Copy link
Contributor Author

Rebased onto current main and corrected title.

# Conflicts:
#	exposed-jdbc/src/main/kotlin/org/jetbrains/exposed/sql/statements/jdbc/JdbcPreparedStatementImpl.kt
#	exposed-tests/src/test/kotlin/org/jetbrains/exposed/sql/tests/shared/DDLTests.kt
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants