Skip to content

Latest commit

 

History

History
180 lines (119 loc) · 8.46 KB

exasol.md

File metadata and controls

180 lines (119 loc) · 8.46 KB

Exasol SQL Dialect

Connecting to an Exasol database is the simplest way to get started with Virtual Schemas. You don't have to install any JDBC driver, because it is already installed in the Exasol database and also included in the JAR of the JDBC adapter.

Installing the Adapter Script

Upload the latest available release of Exasol Virtual Schema to Bucket FS.

Then create a schema to hold the adapter script.

CREATE SCHEMA SCHEMA_FOR_VS_SCRIPT;

The SQL statement below creates the adapter script, defines the Java class that serves as entry point and tells the UDF framework where to find the libraries (JAR files) for Virtual Schema and database driver.

CREATE JAVA ADAPTER SCRIPT SCHEMA_FOR_VS_SCRIPT.ADAPTER_SCRIPT_EXASOL AS
    %scriptclass com.exasol.adapter.RequestDispatcher;
    %jar /buckets/<BFS service>/<bucket>/virtual-schema-dist-10.5.0-exasol-7.1.1.jar;
/

Defining a Named Connection

Define the connection to the other Exasol cluster as shown below:

CREATE CONNECTION JDBC_CONNECTION
TO 'jdbc:exa:<host>:<port>'
USER '<user>'
IDENTIFIED BY '<password>';

For connecting via TLS you can specify the certificate's fingerprint in the JDBC URL like this:

CREATE CONNECTION JDBC_CONNECTION
TO 'jdbc:exa:<host>/<fingerprint>:<port>'
USER '<user>'
IDENTIFIED BY '<password>';

You can learn more about defining named connections in the Exasol online handbook.

Choosing the Type of Connection

You have three options to pick from when connecting to an Exasol instance or cluster. The options are explained below.

Using IMPORT FROM EXA

Exasol provides the faster and parallel IMPORT FROM EXA command for loading data from another Exasol instance. You can tell the adapter to use this command instead of IMPORT FROM JDBC by setting the IMPORT_FROM_EXA property.

In this case you have to provide the additional EXA_CONNECTION which contains the name of the connection definition used for the internally used IMPORT FROM EXA command.

That means you will have two named connections: a JDBC connection and an EXA connection. The Virtual Schema adapter uses the JDBC connection for reading metadata. The EXA connection is used by the EXALoader that runs the IMPORT statement.

Please refer to the CREATE CONNECTION documentation for more details about how to define an EXA connection.

Creating a Virtual Schema With EXA Import

CREATE CONNECTION EXA_CONNECTION
TO '<host-or-list>:<port>'
USER '<user>'
IDENTIFIED BY '<password>'

With Exasol 7.1.0 and later you can specify the TLS certificate's fingerprint:

CREATE CONNECTION EXA_CONNECTION
TO '<host-or-list>/<fingerprint>:<port>'
USER '<user>'
IDENTIFIED BY '<password>'
CREATE VIRTUAL SCHEMA VIRTUAL_EXASOL 
USING SCHEMA_FOR_VS_SCRIPT.ADAPTER_SCRIPT_EXASOL WITH
    CONNECTION_NAME = 'JDBC_CONNECTION'
    SCHEMA_NAME     = '<schema name>'
    IMPORT_FROM_EXA = 'true'
    EXA_CONNECTION  = 'EXA_CONNECTION';

Using IMPORT FROM JDBC

You can alternatively use a regular JDBC connection for the IMPORT. Note that this option is slower because it lacks the parallelization the IMPORT FROM EXA variant.

Creating a Virtual Schema With JDBC Import

CREATE VIRTUAL SCHEMA <virtual schema name> 
USING SCHEMA_FOR_VS_SCRIPT.ADAPTER_SCRIPT_EXASOL WITH
    CONNECTION_NAME = 'JDBC_CONNECTION'
    SCHEMA_NAME     = '<schema name>';

Using IS_LOCAL

If the data source is the same Exasol instance or cluster Virtual Schemas runs on, then the best possible connection type is a so called "local" connection.

Add the following parameter to CREATE VIRTUAL SCHEMA:

IS_LOCAL = 'true'

The IS_LOCAL parameter provides an additional speed-up in this particular use case.

The way this works is that Virtual Schema generates a regular SELECT statement instead of an IMPORT statement.

And that SELECT can be directly executed by the core database, whereas the IMPORT statement takes a detour via the ExaLoader.

Important: Please note that since the generated SELECT command runs with the permissions of the owner of the Virtual Schema, that user must have privileges to access what you plan to select!

IMPORT statements use a connection definition which allows connecting with a different user account. Generated SELECT statements do not open additional connections (hence the "local" moniker) so they inherit the context of the Virtual Schema query they are executed in — including permissions.

Creating a Local Virtual Schema

CREATE VIRTUAL SCHEMA <virtual schema name> 
USING SCHEMA_FOR_VS_SCRIPT.ADAPTER_SCRIPT_EXASOL WITH
    CONNECTION_NAME = 'JDBC_CONNECTION'
    SCHEMA_NAME     = '<schema name>'
    IS_LOCAL        = 'true';

Note that you still need to provide a JDBC connection. This is used by the Virtual Schema adapter internally. It is not used for mass data transfer though. And that is where the performance gain comes from.

Transport Layer Security (TLS)

With version 7.1 Exasol introduced TLS encryption on the database port. Other ports were TLS capable before that already.

Using TLS Connections

To establish a TLS connection with an Exasol Virtual Schema, you must use Exasol Virtual Schema 5.0.4 or later and connect to a compatible Exasol server version (see section "Exasol Server Versions and TLS Support").

The reason why you need the Exasol Virtual Schema 5.0.4 or later is that with this version we built-in a JDBC driver that is TLS-capable.

Check the JDBC driver documentation in our online handbook for details.

Exasol Server Versions and TLS Support

The following Exasol Server Versions support TLS connections via JDBC:

  • 6.2.15 and later 6.2.x versions
  • 7.0.10 and later 7.0.x versions
  • All versions from 7.1.x on

Older versions (6.2.0 … 6.2.14 and 7.0.0 … 7.0.9) automatically fall back to legacy encryption. These versions are outdated anyway. If you still have one of them running, you should still disable TLS support explicitly by specifying parameter legacyencryption=1 in the JDBC URL (see details about supported driver properties). This is a visual reminder that TLS won't work with those versions.

If you are interested, you can find even more detailed information in our roadmap ticket "TLS for all Exasol Drivers".

Disabling TLS

If you want to connect to a cluster running Exasol 7.0.x or earlier with a recent Exasol Virtual Schema (5.0.4 or later), you should explicitly disable TLS on in the JDBC connection the Virtual Schema uses, otherwise the driver will attempt to establish a TLS connection to the server that does not support it and the connection attempt will fail.

The JDBC driver property legacyencryption switches between TLS and the encryption scheme of older Exasol versions. Set it to 1 to disable TLS.

Example:

CREATE CONNECTION LEGACY_JDBC_CONNECTION
TO 'jdbc:exa:<host>:<port>;legacyencryption=1'
USER '<user>'
IDENTIFIED BY '<password>';

Supported Capabilities

The Exasol SQL dialect supports all capabilities that are supported by the virtual schema framework.

Known limitations

  • Using literals and constant expressions with TIMESTAMP WITH LOCAL TIME ZONE data type in Virtual Schemas can produce an incorrect results.
    • We recommend using TIMESTAMP instead.
    • If you are willing to take the risk and want to use TIMESTAMP WITH LOCAL TIME ZONE anyway, please, create a Virtual Schema with the following additional property IGNORE_ERRORS = 'TIMESTAMP_WITH_LOCAL_TIME_ZONE_USAGE'.
    • We also recommend to set Exasol system time_zone to UTC while working with TIMESTAMP WITH LOCAL TIME ZONE.
  • When using an EXA connection, the outermost order of the imported result rows is not guaranteed. This is not a bug, but a deliberate speed optimization in the ExaLoader (the part that runs the IMPORT) caused by parallel import. If you need ordering, please wrap your query into an extra SELECT * FROM (<virtual-schema-query>) ORDER BY <criteria> [, ...] which will then be executed on the target Exasol database instead of the source.