Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UPSERT does not respect CHAR size for temporary tables #257

Open
florian-niefind opened this issue Sep 23, 2016 · 1 comment
Open

UPSERT does not respect CHAR size for temporary tables #257

florian-niefind opened this issue Sep 23, 2016 · 1 comment

Comments

@florian-niefind
Copy link

florian-niefind commented Sep 23, 2016

Used: dataduct 0.4.0

The auto-generated sql-query for the upsert step does not carry over the size of a CHAR column. It seems to only work with VARCHAR columns. Example:
My Table definitions for the upsert step looks like:

CREATE TABLE IF NOT EXISTS staging.example (col1 VARCHAR(20), col2_buggy CHAR(2));

CREATE TABLE IF NOT EXISTS public.example (col1 VARCHAR(20), col2_buggy CHAR(2));

And the auto-generated sql-query (which is fortunately printed to stdout) for the upsert will look like :

CREATE TEMPORARY TABLE example_temp (col1 VARCHAR(20), col2_buggy CHAR);
INSERT INTO user_activity_temp (SELECT * FROM staging.user_activity LIMIT 10);
and so on...

col2 in the temporary table is thus generated with the default size (1) and carryover fails. Pipeline will fail giving me:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/dataduct/steps/executors/runner.py", line 100, in sql_runner
    cursor.execute(sql_query)
psycopg2.InternalError: Value too long for character type
DETAIL:  
  -----------------------------------------------
  error:  Value too long for character type
  code:      8001
  context:   Value too long for type character(1)
  query:     8143
  location:  funcs_string.hpp:392
  process:   query0_25 [pid=31361]

Everything works fine if i change col2 ro VARCHAR as well.

EDIT: traced the problem here
https://github.com/coursera/dataduct/blob/develop/dataduct/database/parsers/utils.py#L26-L28

CHAR pattern definition does not include Word(alphanums)

@seguschin
Copy link

You not need to use upsert step as RedshiftCopyActivity supports upsert internally, use option insert_mode: OVERWRITE_EXISTING

sergei

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants