Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: start_date from meltano.yml used even if replication-key column is not date-time #1677

Open
1 task
BuzzCutNorman opened this issue May 5, 2023 · 3 comments · May be fixed by #1734
Open
1 task

bug: start_date from meltano.yml used even if replication-key column is not date-time #1677

BuzzCutNorman opened this issue May 5, 2023 · 3 comments · May be fixed by #1734
Labels
kind/Bug Something isn't working valuestream/SDK
Milestone

Comments

@BuzzCutNorman
Copy link
Contributor

Singer SDK Version

0.24.0

Is this a regression?

  • Yes

Python Version

3.9

Bug scope

Taps (catalog, state, etc.)

Operating System

Windows

Description

If you setup Incremental replication to use a non datetime column as a replcation-key and start_date is set in the meltano.yml then during the initial incremental run it will fail with a pyodbc conversion error.

      start_date: '2022-01-03'
      raw-tags:
        replication-method: INCREMENTAL
        replication-key: Id
{
	"type": "SCHEMA"
	, "stream": "raw-tags"
	, "schema": 
	{
		"properties": 
		{
			"Id": {"type": ["integer"]}
			, "TagName": {"type": ["string", "null"]}
			, "Count": {"type": ["integer", "null"]}
			, "ExcerptPostId": {"type": ["integer", "null"]}
			, "WikiPostId": {"type": ["integer", "null"]}
		}
		, "type": "object"
		, "required": ["Id"]
	}
	, "key_properties": ["Id"]
	, "bookmark_properties": ["Id"]
}

The issue seems to be in _write_starting_replication_value. The method doesn't currently check if the replication key coloumn is of type timestamp before assigning the start_date. I guess it could also check that the replication key is not of type numeric or integer.

Possible fix:

# Use start_date if it is more recent than the replication_key state
start_date_value: str | None = self.config.get("start_date")
if start_date_value and self.is_timestamp_replication_key:
	if not value:
		value = start_date_value
	else:
		value = self.compare_start_date(value, start_date_value)

Code

# Use start_date if it is more recent than the replication_key state
start_date_value: str | None = self.config.get("start_date")
if start_date_value:
	if not value:
		value = start_date_value
	else:
		value = self.compare_start_date(value, start_date_value)


Traceback (most recent call last):
  File "C:\Program Files\Python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\development\projects\my-test-stuff\.meltano\extractors\tap-mssql\venv\Scripts\tap-mssql.exe\__main__.py", line 7, in <module>
  File "C:\development\projects\my-test-stuff\.meltano\extractors\tap-mssql\venv\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "C:\development\projects\my-test-stuff\.meltano\extractors\tap-mssql\venv\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "C:\development\projects\my-test-stuff\.meltano\extractors\tap-mssql\venv\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\development\projects\my-test-stuff\.meltano\extractors\tap-mssql\venv\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "C:\development\projects\my-test-stuff\.meltano\extractors\tap-mssql\venv\lib\site-packages\singer_sdk\tap_base.py", line 542, in cli
    tap.sync_all()
  File "C:\development\projects\my-test-stuff\.meltano\extractors\tap-mssql\venv\lib\site-packages\singer_sdk\tap_base.py", line 413, in sync_all
    stream.sync()
  File "C:\development\projects\my-test-stuff\.meltano\extractors\tap-mssql\venv\lib\site-packages\singer_sdk\streams\core.py", line 1194, in sync
    for _ in self._sync_records(context=context):
  File "C:\development\projects\my-test-stuff\.meltano\extractors\tap-mssql\venv\lib\site-packages\singer_sdk\streams\core.py", line 1093, in _sync_records
    for record_result in self.get_records(current_context):
  File "C:\development\buzzcutnorman\tap-mssql\tap_mssql\client.py", line 542, in get_records
    for record in conn.execute(query):
  File "C:\development\projects\my-test-stuff\.meltano\extractors\tap-mssql\venv\lib\site-packages\sqlalchemy\future\engine.py", line 280, in execute
    return self._execute_20(
  File "C:\development\projects\my-test-stuff\.meltano\extractors\tap-mssql\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1710, in _execute_20
    return meth(self, args_10style, kwargs_10style, execution_options)
  File "C:\development\projects\my-test-stuff\.meltano\extractors\tap-mssql\venv\lib\site-packages\sqlalchemy\sql\elements.py", line 334, in _execute_on_connection
    return connection._execute_clauseelement(
  File "C:\development\projects\my-test-stuff\.meltano\extractors\tap-mssql\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1577, in _execute_clauseelement
    ret = self._execute_context(
  File "C:\development\projects\my-test-stuff\.meltano\extractors\tap-mssql\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1948, in _execute_context
    self._handle_dbapi_exception(
  File "C:\development\projects\my-test-stuff\.meltano\extractors\tap-mssql\venv\lib\site-packages\sqlalchemy\engine\base.py", line 2129, in _handle_dbapi_exception
    util.raise_(
  File "C:\development\projects\my-test-stuff\.meltano\extractors\tap-mssql\venv\lib\site-packages\sqlalchemy\util\compat.py", line 211, in raise_
    raise exception
  File "C:\development\projects\my-test-stuff\.meltano\extractors\tap-mssql\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1905, in _execute_context
    self.dialect.do_execute(
  File "C:\development\projects\my-test-stuff\.meltano\extractors\tap-mssql\venv\lib\site-packages\sqlalchemy\engine\default.py", line 736, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.DataError: (pyodbc.DataError) ('22018', "[22018] [Microsoft][ODBC Driver 18 for SQL Server][SQL Server]Conversion failed when converting the nvarchar value '2022-01-03' to data type int. (245) (SQLExecDirectW)")
[SQL: SELECT raw.tags.[Id], raw.tags.[TagName], raw.tags.[Count], raw.tags.[ExcerptPostId], raw.tags.[WikiPostId]
FROM raw.tags
WHERE raw.tags.[Id] >= ? ORDER BY raw.tags.[Id]]
[parameters: ('2022-01-03',)]
(Background on this error at: https://sqlalche.me/e/14/9h9h)
Copy link

stale bot commented May 4, 2024

This has been marked as stale because it is unassigned, and has not had recent activity. It will be closed after 21 days if no further activity occurs. If this should never go stale, please add the evergreen label, or request that it be added.

@stale stale bot added the stale label May 4, 2024
@edgarrmondragon
Copy link
Collaborator

Still relevant

@stale stale bot removed the stale label May 6, 2024
@edgarrmondragon edgarrmondragon added this to the v1.0 Release milestone Jul 31, 2024
@Iamcerba
Copy link

Iamcerba commented Sep 5, 2024

Still relevant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/Bug Something isn't working valuestream/SDK
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants