Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed ReplacingMergeTree EngineSpec parsing: is_deleted column presence caused error #357

Merged

Conversation

AlexTheKing
Copy link
Contributor

Summary

Column for marking rows as deleted (is_deleted column) of ReplacingMergeTree was not anyhow handled during EngineSpec parsing resulting in runtime exception like below:

java.lang.IllegalArgumentException: Expect list size 0 or 1, but got 2
at xenon.clickhouse.parse.ParseUtils$.seqToOption(ParseUtils.scala:32)
at xenon.clickhouse.parse.AstVisitor.visitEngineClause(AstVisitor.scala:78)
at xenon.clickhouse.parse.SQLParser.$anonfun$parseEngineClause$1(SQLParser.scala:30)
at xenon.clickhouse.parse.SQLParser.parse(SQLParser.scala:49)
at xenon.clickhouse.parse.SQLParser.parseEngineClause(SQLParser.scala:30)
at xenon.clickhouse.spec.TableEngineUtils$.resolveTableEngine(TableEngineUtils.scala:24)
at xenon.clickhouse.ClickHouseCatalog.loadTable(ClickHouseCatalog.scala:133)
at xenon.clickhouse.ClickHouseCatalog.loadTable(ClickHouseCatalog.scala:36)
at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:583)
at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:566)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
at java.lang.Thread.run(Thread.java:748)

Checklist

Delete items not relevant to your PR:

  • Unit tests covering the common scenario was added

@CLAassistant
Copy link

CLAassistant commented Aug 13, 2024

CLA assistant check
All committers have signed the CLA.

@AlexTheKing
Copy link
Contributor Author

I'm not sure whether I have something to do with the ClickHouse Cloud tests :)

@pan3793
Copy link
Collaborator

pan3793 commented Aug 14, 2024

@AlexTheKing the cloud test failure is irrelevant.

instead of dropping the is_deleted column info directly, I'm wondering if we can reserve such info in ReplacingMergeTreeEngineSpec, then we can use such info to filter/process the deleted rows on the spark side later if wanted.

@AlexTheKing
Copy link
Contributor Author

I can add changes to preserve it in ReplacingMergeTreeEngineSpec if needed

@AlexTheKing
Copy link
Contributor Author

Added one more commit: now preserving is_deleted column in spec, also added the same for ReplicatedReplacingMergeTree (it had the same issue)

@AlexTheKing AlexTheKing force-pushed the replacingmergetree-engine-spec-parsing-fix branch from e90942e to 545bf1a Compare August 14, 2024 07:39
@AlexTheKing AlexTheKing force-pushed the replacingmergetree-engine-spec-parsing-fix branch from 545bf1a to c820b7d Compare August 14, 2024 07:40
@AlexTheKing
Copy link
Contributor Author

I fixed test & style, should like OK now

@mshustov
Copy link
Member

@mzitnik PTAL

@mzitnik
Copy link
Collaborator

mzitnik commented Aug 18, 2024

Checking Cloud strange behavior, currently will merge PR

@mzitnik mzitnik merged commit 699c0ca into ClickHouse:main Aug 18, 2024
30 of 36 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants