I'm not 100% sure whose bug it is, but I think it belongs here:
If I load a CSV with a column name containing uppercase characters (e.g. MyKeyCol) and want to use it as the key column, I run into a problem when I'm also using target-postgres as the loader (the datamill-co variant). Somewhere in the process the column name is lowercased, but the key property isn't, which results in the KeyError below.
If I change key_properties to be one of the columns with a lowercase name, it works fine.
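For illustration, here's a minimal sketch of what I think is going on (not the actual target-postgres code; the schema and names are made up): the schema's property names end up lowercased while key_properties keeps the original casing, so the dict lookup fails.

```python
# Illustrative sketch only, not target-postgres internals.
# The schema's property names have been lowercased somewhere upstream,
# but the key property still carries its original casing.
schema = {"properties": {"mykeycol": {"type": "string"}}}
key_properties = ["MyKeyCol"]

key_prop_schemas = {}
for key in key_properties:
    # Mirrors the lookup in denest.py and raises KeyError: 'MyKeyCol'
    key_prop_schemas[key] = schema["properties"][key]
```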
I looked at the extras in Meltano, but couldn't figure out a solution. It almost feels like key_properties should allow a value like MyKeyCol as mykeycol.
I'm guessing the field_names config value could be used to force the column names, but it doesn't look like I can skip the first row, so I'm back to unhappiness again :). This could actually be an extra setting, like skip_rows (integer, default 0), to make the tap ignore the first N rows of the file.
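Roughly what I have in mind, in plain Python (skip_rows doesn't exist today and the column names are invented, so this is just a sketch of the behaviour):

```python
import csv

# Hypothetical illustration of field_names plus a skip_rows setting:
# supply the column names explicitly and drop the first N rows (the header).
field_names = ["mykeycol", "othercol"]  # invented names
skip_rows = 1                           # hypothetical setting, default 0

with open("myfile.csv", newline="") as f:
    reader = csv.DictReader(f, fieldnames=field_names)
    for _ in range(skip_rows):
        next(reader, None)  # discard the original header row(s)
    rows = list(reader)
```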
Here are the logs I'm getting with an example that fails:
tap-spreadsheets-anywhere | INFO Using supplied catalog .meltano/run/elt/load_csv_base_data/3bce9632-9df0-4752-b19d-1705bab4d11d/tap.properties.json.
tap-spreadsheets-anywhere | INFO Syncing stream:us_sec_country_codes
tap-spreadsheets-anywhere | INFO Walking /static_data/.
tap-spreadsheets-anywhere | INFO Found 13 files.
tap-spreadsheets-anywhere | INFO Checking 9 resolved objects for any that match regular expression "myfile.*" and were modified since 2011-01-01 00:00:00+00:00
tap-spreadsheets-anywhere | INFO Processing 1 resolved objects that met our criteria. Enable debug verbosity logging for more details.
tap-spreadsheets-anywhere | INFO Syncing file "myfile.csv".
tap-spreadsheets-anywhere | INFO Wrote 309 records for stream "mystream".
target-postgres | DEBUG MillisLoggingConnection: 0 millis spent executing: b'BEGIN;'
target-postgres | DEBUG MillisLoggingConnection: 0 millis spent executing: b"\n SELECT c.relname, obj_description(c.oid, 'pg_class')\n FROM pg_namespace AS n\n INNER JOIN pg_class AS c ON n.oid = c.relnamespace\n WHERE n.nspname = 'meltano';\n "
target-postgres | INFO Mapping: mytable to ['mytable']
target-postgres | INFO Mapping: tp_mytable_mycol__sdc_sequence_idx to None
target-postgres | DEBUG MillisLoggingConnection: 7 millis spent executing: b"\n SELECT column_name, data_type, is_nullable FROM information_schema.columns\n WHERE table_schema = 'meltano' and table_name = 'mytable';\n "
target-postgres | DEBUG MillisLoggingConnection: 1 millis spent executing: b"\n SELECT EXISTS (\n SELECT 1 FROM pg_tables\n WHERE schemaname = 'meltano' AND\n tablename = 'mytable');\n "
target-postgres | INFO Stream mytable (mytable) with max_version None targetting None
target-postgres | INFO Root table name mytable
target-postgres | INFO Writing batch with 309 records for `mytable` with `key_properties`: `['MyKeyCol']`
target-postgres | INFO METRIC: {"type": "counter", "metric": "record_count", "value": 0, "tags": {"count_type": "batch_rows_persisted", "path": ["mytable"], "database": "mydb", "schema": "meltano"}}
target-postgres | INFO METRIC: {"type": "timer", "metric": "job_duration", "value": 0.0010831356048583984, "tags": {"job_type": "batch", "path": ["mytable"], "database": "mydb", "schema": "meltano", "status": "failed"}}
target-postgres | ERROR Exception writing records
target-postgres | Traceback (most recent call last):
target-postgres | File ".meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/target_postgres/postgres.py", line 295, in write_batch
target-postgres | written_batches_details = self.write_batch_helper(cur,
target-postgres | File ".meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/target_postgres/sql_base.py", line 808, in write_batch_helper
target-postgres | for table_batch in denest.to_table_batches(schema, key_properties, records):
target-postgres | File ".meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/target_postgres/denest.py", line 20, in to_table_batches
target-postgres | table_schemas = _get_streamed_table_schemas(schema,
target-postgres | File ".meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/target_postgres/denest.py", line 47, in _get_streamed_table_schemas
target-postgres | key_prop_schemas[key] = schema['properties'][key]
target-postgres | KeyError: 'MyKeyCol'
target-postgres | CRITICAL ('Exception writing records', KeyError('MyKeyCol'))
target-postgres | Traceback (most recent call last):
target-postgres | File ".meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/target_postgres/postgres.py", line 295, in write_batch
target-postgres | written_batches_details = self.write_batch_helper(cur,
target-postgres | File ".meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/target_postgres/sql_base.py", line 808, in write_batch_helper
target-postgres | for table_batch in denest.to_table_batches(schema, key_properties, records):
target-postgres | File ".meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/target_postgres/denest.py", line 20, in to_table_batches
target-postgres | table_schemas = _get_streamed_table_schemas(schema,
target-postgres | File ".meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/target_postgres/denest.py", line 47, in _get_streamed_table_schemas
target-postgres | key_prop_schemas[key] = schema['properties'][key]
target-postgres | KeyError: 'MyKeyCol'
target-postgres |
target-postgres | During handling of the above exception, another exception occurred:
target-postgres |
target-postgres | Traceback (most recent call last):
target-postgres | File ".meltano/loaders/target-postgres/venv/bin/target-postgres", line 8, in <module>
target-postgres | sys.exit(cli())
target-postgres | File ".meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/target_postgres/__init__.py", line 45, in cli
target-postgres | main(args.config)
target-postgres | File ".meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/target_postgres/__init__.py", line 39, in main
target-postgres | target_tools.main(postgres_target)
target-postgres | File ".meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/target_postgres/target_tools.py", line 28, in main
target-postgres | stream_to_target(input_stream, target, config=config)
target-postgres | File ".meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/target_postgres/target_tools.py", line 77, in stream_to_target
target-postgres | raise e
target-postgres | File ".meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/target_postgres/target_tools.py", line 70, in stream_to_target
target-postgres | state_tracker.flush_streams(force=True)
target-postgres | File ".meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/target_postgres/stream_tracker.py", line 47, in flush_streams
target-postgres | self._write_batch_and_update_watermarks(stream)
target-postgres | File ".meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/target_postgres/stream_tracker.py", line 67, in _write_batch_and_update_watermarks
target-postgres | self.target.write_batch(stream_buffer)
target-postgres | File ".meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/target_postgres/postgres.py", line 309, in write_batch
target-postgres | raise PostgresError(message, ex)
target-postgres | target_postgres.exceptions.PostgresError: ('Exception writing records', KeyError('MyKeyCol'))
meltano | Loading failed (1): target_postgres.exceptions.PostgresError: ('Exception writing records', KeyError('MyKeyCol'))
meltano | ELT could not be completed: Loader failed
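The traceback points at the key-property lookup in denest.py. One way around it for now, I suppose, would be to pre-process the file so the header itself is lowercase and then give key_properties the lowercased name; a quick sketch (file names are just the ones from my example, and this is not part of Meltano or target-postgres):

```python
import csv

# Sketch of a pre-processing workaround: lowercase the header row so the
# column name matches a lowercase key_properties entry.
with open("myfile.csv", newline="") as src, open("myfile_lower.csv", "w", newline="") as dst:
    reader = csv.reader(src)
    writer = csv.writer(dst)
    header = next(reader)
    writer.writerow([name.lower() for name in header])
    writer.writerows(reader)
```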