Basic generic JDBC connection #12073
base: develop
Conversation
Looks like a great start; a few small changes, please. The Database SPI can be a follow-up PR.
distribution/lib/Standard/Database/0.0.0-dev/src/JDBC/Generic_JDBC_Connection.enso (3 outdated review threads, resolved)
I don't think we should be committing such large files to the repo. We already have logic for downloading much smaller assets, e.g. Examples.xls.
I think we should remove the commit adding it (rebase to avoid it polluting version history) and instead rely on a download. We could use #12017 to make this easier.
Fixed; this is now downloaded by the SBT rule.
test/Table_Tests/src/Database/JDBC/Generic_JDBC_Connection_Spec.enso (2 outdated review threads, resolved)
self.should_contain_and_not_contain_all contained []

add_specs suite_builder =
    data_dir = "test/Table_Tests/data/transient"
This may not work correctly if the working directory is unexpected.
Much better to use the more robust:
data_dir = "test/Table_Tests/data/transient" | |
data_dir = enso_project.data / "transient" |
Done
Overall the changes look good, but I think we need to address the binary file issue, and there are a few other suggestions.
I'll push my changes once I've rebased properly.
Force-pushed from 787cab1 to 3721d84.
build.sbt (outdated)
@@ -4652,7 +4654,14 @@ lazy val `enso-test-java-helpers` = project
      )
      secondaryLocations.foreach { target =>
        IO.copyFile(primaryLocation, target)
      }
    }
    val _ = StdBits
@hubertp I would like your thoughts on my terrible hack here.
AFAIK, this is just another way to declare that the enso-test-java-helpers/Compile/packageBin task depends on the StdBits.copyDependencies task. Nothing too hacky about it.
Please don't add any new JARs into Standard.Base, Standard.Table, or Standard.Database. It goes against all our efforts to compile these libraries with native image.
@@ -642,6 +642,7 @@ val jnaVersion = "5.14.0"
val googleProtobufVersion = "3.25.1"
val shapelessVersion = "2.3.10"
val postgresVersion = "42.4.0"
val h2Version = "2.3.232"
Are you trying to add another library into production? That's not a good idea:
- there is a failure in native image generation
- in general we have too many libraries in Standard.Base and Standard.Table
- if you really want another one, create a separate project

I don't think the majority of Enso customers need an H2 DB distributed in the Standard libraries.
I'd like to second @jtulach's opinion.
All I need here is to download this jar file when building, for the purpose of running Table_Tests -- this is not needed for production at all. This declaration is used in the rule enso-test-java-helpers.
Does my change cause the driver to be included in the build?
And if it does, is there a way to do this in build.sbt that only affects the running of library tests?
What does it take to ensure that a driver is native image friendly?
These tests do not depend on Table_Tests, so I am happy to move them to a separate project. @JaroslavTulach, do I need to do anything to exclude this new project from the native image builds?
We add those explicitly. So if you separate your test out of Table_Tests, you will be fine.
- once "CI should run Standard Library Tests in both JVM and Native modes" #12123 gets implemented
- it needs to know which projects to run in native mode
- and which only in --jvm mode

But that's something the owner of #12123 needs to worry about, not this PR.
Moved the test to its own project.
@hubertp Please take a look at my SBT changes.
with_database name action =
    database_root = data_dir / name
    url = "jdbc:h2:./" + database_root.path + "/" + database_name
Is database_root an absolute or relative path?
Absolute -- I forgot to remove the "./"
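Presumably the fix is just dropping that prefix from the line quoted above; a sketch, keeping the names from the original snippet:

    url = "jdbc:h2:" + database_root.path + "/" + database_name  # no "./" prefix; database_root.path is already absolute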
r.at "TABLE_CAT" . to_vector . should_equal ['TEST', 'TEST'] | ||
r.at "TABLE_SCHEM" . to_vector . should_equal ['PUBLIC', 'PUBLIC'] |
This looks like it may be backend dependent, no? But I think it is absolutely fine for this test to assume it is running on a particular database (e.g. H2).
Yes, it definitely is backend dependent. I am already working on updating the tests for postgres and changing these, but I'd rather wait to make the changes in that PR. The tests will get less specific with the next few drivers.
Ok
## TODO: implement this for a driver that supports multiple catalogs
group_builder.specify "should be able to get schemas by catalog" <|
Tbh I feel like getting the various schemas may be a bit too backend dependent; I'm not really sure this functionality is needed in the generic driver. Or was it a requirement to have it?
From my perspective, all that is needed is the ability to list available tables and read them into memory.
Possibly this implementation is supporting too much. But I think that only allowing the user to read an entire table is not enough -- users might need to access very large databases where it would be impossible to read entire tables.
group_builder.specify "should be able to create tables in multiple schemas" <|
    with_connection "multiple_schemas" conn->
        conn.execute "create table foo (a int, b text)"
        conn.execute "create schema other"
        conn.execute "set schema other"
        conn.execute "create table bar (c float, d timestamp)"
I feel like this functionality does not need to work on all backends for the generic JDBC driver to be deemed OK. But there is no harm in testing it, e.g. in the H2 case.
After discussion with @jdunkerley, I'll be removing these tests.
conn.get_tables schema_pattern='PUBLIC' . should_equal ['FOO']
conn.get_tables schema_pattern='OTHER' . should_equal ['BAR']

group_builder.specify "should be able to insert, update and query data" <|
Suggested change:
- group_builder.specify "should be able to insert, update and query data" <|
+ group_builder.specify "should be able to insert, update and query data using raw SQL" <|
Done
GROUP Standard.Base.Input
ICON data_input

Execute a raw SQL query, or an `SQL_Statement`.Generic_JDBC_Connection.
leftover?
Suggested change:
- Execute a raw SQL query, or an `SQL_Statement`.Generic_JDBC_Connection.
+ Execute a raw SQL query, or an `SQL_Statement`.
Done
connection = Generic_JDBC_Connection.connect "jdbc:h2:~/my_database"
connection.execute "select a, b, c from foo"
read self sql:(Text | SQL_Statement) -> Table =
Hmm, shouldn't the read signature be consistent with Connection.read? IMO it should.
I always assumed that part of the required functionality for the generic JDBC driver was not only reading custom raw SQL queries, but the simplest thing - reading a table by name, without requiring the user to dabble in SQL (even if that SQL is simply SELECT * FROM foo).
The hard part of this will be to figure out how to translate the user-provided table name into SQL, as the quote character may differ for some dialects. It would be good to test it with such a dialect, maybe MySQL (I think it uses backticks for names?). We should be able to rely on JDBC to provide us with the quote and quote-escape characters.
Do we consider Connection.read to be a user-friendly interface? It's public and has documentation, but it only supports SQL_Query, which does not permit the use of SQL_Builder. Perhaps we could add SQL_Statement as a third option to SQL_Query?
To be honest, I'm not 100% convinced we should be making SQL_Statement public.
Its only benefit is helping with interpolating Enso objects into the query. But that is dialect specific anyway... Various dialects have different ways of interpolating things like Dates, and we have specific Statement_Setter implementations for it. It was originally written to be an internal helper, not user-facing.
A generic one will very likely fail to interpolate more than basic Text or Integer in many backends (because handling of things like Date is very DB dependent). But its presence will suggest to users that it can do more.
IMO if the user is working on the generic backend and writing the query themselves, most of the time they will want to write the query string.
I do appreciate that SQL_Statement makes it a bit easier to interpolate variables into the queries and avoids simple concatenation (which can lead to SQL injection), but I'm wondering if users writing custom queries will:
- need it often enough
- know how to use it

So I'd suggest adding it only after we find out that users really need it.
Will remove.
conn.execute "create table ((((((()" . should_fail_with SQL_Error | ||
conn.execute "create table foo (a int)" . should_succeed | ||
conn.execute "create table foo (a int)" . should_fail_with SQL_Error | ||
conn.read "select * from nonexistent" . should_fail_with SQL_Error |
So many tests for raw SQL queries, but I always assumed that the fundamental part of the generic JDBC driver was the simple use case:
- I put my driver in polyglot/java
- I connect to the database
- I list tables and read one of them by name.

In all of this I shouldn't need to write any raw SQL; I only need to know the jdbc:... connection string for my DB.
Although I appreciate there may be challenges with how to escape the table name (as mentioned above). Is it indeed very problematic? Or is it maybe planned to be done in #12101?
If #12101 works, then doing table name escaping through that would be best, so I'd like to save this part for that task.
Implements a JDBC connection object which can be used to connect to any JDBC-compliant database, if the driver is available on the classpath.
This only provides basic methods for querying the database schema, and performing SQL queries and statements against the database.
This implementation is only tested against the H2 database; more JDBC drivers will be added to the tests. It is likely that the tests will have to be updated to make fewer assumptions about the behavior of JDBC connections across different backends.
Read a table by full SQL query:
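For example, a minimal sketch based on the connect and read calls quoted in the review comments above (the URL and query are placeholders):

    connection = Generic_JDBC_Connection.connect "jdbc:h2:~/my_database"
    table = connection.read "select a, b, c from foo"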
Read a table by name:
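A dedicated read-by-name call is not shown in this thread, so the sketch below stands in for it using get_tables plus a raw query; it assumes get_tables can be called without arguments, and the table name FOO is hypothetical:

    connection = Generic_JDBC_Connection.connect "jdbc:h2:~/my_database"
    connection.get_tables                        # e.g. ['FOO']; list the available table names
    table = connection.read "select * from FOO"  # read the named table via a raw query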
Checklist
Please ensure that the following checklist has been satisfied before submitting the PR:
- All code follows the Scala, Java, TypeScript, and Rust style guides. In case you are using a language not listed above, follow the Rust style guide.
- If meaningful changes were made to logic or tests affecting Enso Cloud integration in the libraries, or the Snowflake database integration, a run of the Extra Tests has been scheduled.