Spark 3.5: Support micro timestamp #311

Merged: 1 commit into ClickHouse:master from timestamp_micro, May 10, 2024

Conversation

Veiasai (Contributor) commented May 10, 2024

Fixes #310 (Lose precision on read DateTime64).

Veiasai force-pushed the timestamp_micro branch from 59a158d to 6cbdf68 on May 10, 2024 05:48
pan3793 (Collaborator) commented May 10, 2024

Thanks for making this change. Do you have a chance to add a UT?

Veiasai (Contributor, Author) commented May 10, 2024

#310

pan3793 changed the title from "support micro timestamp" to "Spark 3.5: Support micro timestamp" on May 10, 2024
Veiasai (Contributor, Author) commented May 10, 2024

@pan3793 let me take a look.

By the way, how do I build the jar locally?
https://housepower.github.io/spark-clickhouse-connector/developers/01_build_and_test/
The docs say it should appear under build/, but I can't find it.

pan3793 (Collaborator) commented May 10, 2024

Oh, I forgot to update the docs when switching the default Spark version from 3.4 to 3.5; the jar should be at spark-3.5/clickhouse-spark-runtime/build/libs/:

➜  Projects cd spark-clickhouse-connector
(scc) ➜  spark-clickhouse-connector git:(master) ./gradlew clean build -x test
Starting a Gradle Daemon (subsequent builds will be faster)

> Task :clickhouse-core:compileScala
[Warn] : two feature warnings; re-run with -feature for details
one warning found

> Task :clickhouse-spark-3.5_2.12:compileScala
[Warn] /Users/chengpan/Projects/spark-clickhouse-connector/spark-3.5/clickhouse-spark/src/main/scala/org/apache/spark/sql/clickhouse/ExprUtils.scala:159:21: non-variable type argument Any in type pattern org.apache.spark.sql.connector.expressions.LiteralValue[Any] is unchecked since it is eliminated by erasure
one warning found

BUILD SUCCESSFUL in 29s
33 actionable tasks: 32 executed, 1 up-to-date
(scc) ➜  spark-clickhouse-connector git:(master) ll spark-3.5/clickhouse-spark-runtime/build/libs/
total 2688
-rw-r--r--  1 chengpan  staff   261B May 10 13:59 clickhouse-spark-runtime-3.5_2.12-0.8.0-SNAPSHOT-empty.jar
-rw-r--r--  1 chengpan  staff   261B May 10 13:59 clickhouse-spark-runtime-3.5_2.12-0.8.0-SNAPSHOT-javadoc.jar
-rw-r--r--  1 chengpan  staff   261B May 10 13:59 clickhouse-spark-runtime-3.5_2.12-0.8.0-SNAPSHOT-sources.jar
-rw-r--r--  1 chengpan  staff   1.3M May 10 13:59 clickhouse-spark-runtime-3.5_2.12-0.8.0-SNAPSHOT.jar

pan3793 (Collaborator) commented May 10, 2024

You can skip the changes for spark-3.4 and spark-3.3 and just focus on clickhouse-core and spark-3.5; there is a semi-automated script to backport the patch to older Spark versions.

Veiasai (Contributor, Author) commented May 10, 2024

There is no test suite that I can easily extend (and I am not familiar with Java/Scala). I'll probably do a system test with the generated jar instead.

pan3793 (Collaborator) commented May 10, 2024

Okay~

pan3793 (Collaborator) commented May 10, 2024

You can construct a simple case and leave it in the comments; I will find time to add it as a UT later.
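
[Editor's note] A minimal sketch of such a case, not taken from the thread: the table default.micro_ts_tbl and column ts are hypothetical, and the catalog class and option names follow the connector's documented setup for the 0.8.x line, so verify them against the actual version in use.

import java.sql.Timestamp
import org.apache.spark.sql.SparkSession

object MicroTimestampCheck {
  def main(args: Array[String]): Unit = {
    // Assumes a local ClickHouse reachable over HTTP and the locally built
    // clickhouse-spark-runtime-3.5 jar on the classpath.
    val spark = SparkSession.builder()
      .master("local[*]")
      .config("spark.sql.catalog.clickhouse", "xenon.clickhouse.ClickHouseCatalog")
      .config("spark.sql.catalog.clickhouse.host", "127.0.0.1")
      .config("spark.sql.catalog.clickhouse.protocol", "http")
      .config("spark.sql.catalog.clickhouse.http_port", "8123")
      .config("spark.sql.catalog.clickhouse.user", "default")
      .config("spark.sql.catalog.clickhouse.password", "")
      .getOrCreate()

    // Hypothetical table with a DateTime64(6) column `ts` holding a value with
    // non-zero microseconds, e.g. 2024-05-10 05:48:00.123456.
    val ts = spark
      .sql("SELECT ts FROM clickhouse.default.micro_ts_tbl LIMIT 1")
      .collect()
      .head
      .getAs[Timestamp]("ts")

    // Before the fix the sub-millisecond digits were lost on read (#310);
    // with the fix the microsecond part should survive the round trip.
    assert(ts.getNanos % 1000000 != 0, s"microseconds were truncated: $ts")

    spark.stop()
  }
}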

Veiasai (Contributor, Author) commented May 10, 2024

Previous (0.7.3 runtime from Maven):

.config(
    "spark.jars.packages",
    "com.github.housepower:clickhouse-spark-runtime-3.4_2.12:0.7.3,com.clickhouse:clickhouse-jdbc:0.4.6"
)

[screenshot: query result]

With this fix (locally built 0.8.0-SNAPSHOT jar):

.config(
    "spark.jars",
    "/home/ubuntu/spark-clickhouse-connector/spark-3.5/clickhouse-spark-runtime/build/libs/clickhouse-spark-runtime-3.5_2.12-0.8.0-SNAPSHOT.jar"
)

[screenshot: query result]
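
[Editor's note] A quick way to verify the precision difference instead of comparing screenshots; a sketch only: clickhouse.default.micro_ts_tbl and column ts are hypothetical names, and unix_micros is a built-in Spark SQL function available since Spark 3.3.1.

spark.sql("SELECT ts, unix_micros(ts) AS micros FROM clickhouse.default.micro_ts_tbl")
  .show(truncate = false)
// With the 0.7.3 runtime the last three digits of `micros` come back as 000 (millisecond
// precision); with the locally built 3.5 runtime the full microsecond value is preserved.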

pan3793 (Collaborator) commented May 10, 2024

Can you try both json and binary for spark.clickhouse.read.format?

Veiasai (Contributor, Author) commented May 10, 2024

.config("spark.clickhouse.read.format", "json")

in pyspark?

tested, same.
image

yea.. it works. since if I add "x" it failed.

xenon.clickhouse.exception.CHClientException:  [-1] Unsupported read format: x

pan3793 (Collaborator) commented May 10, 2024

The default value is "json"; if "binary" works too, it's good.
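
[Editor's note] The same check can be repeated under both read formats; a sketch only, reusing the hypothetical table from above. Whether spark.clickhouse.read.format can be toggled at runtime via spark.conf.set, rather than at session build time, is an assumption.

Seq("json", "binary").foreach { fmt =>
  spark.conf.set("spark.clickhouse.read.format", fmt)
  // Re-run the read and confirm microseconds are intact under this format.
  spark.sql("SELECT ts, unix_micros(ts) AS micros FROM clickhouse.default.micro_ts_tbl")
    .show(truncate = false)
}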

Veiasai (Contributor, Author) commented May 10, 2024

Yeah, I tried both.

pan3793 merged commit 0452f90 into ClickHouse:master on May 10, 2024 (29 checks passed)
pan3793 (Collaborator) commented May 10, 2024

Thanks, merged to master

Veiasai (Contributor, Author) commented May 10, 2024

When will this be released?

Veiasai deleted the timestamp_micro branch on May 10, 2024 10:02

Successfully merging this pull request may close: Lose precision on read DateTime64 (#310)