
[Bug] kyuubi 1.8 Enable KyuubiSparkSQLExtension NoClassDefFoundError org/apache/spark/sql/hive/execution/OptimizedCreateHiveTableAsSelectCommand #5724

Closed
LoneExplorer2023 opened this issue Nov 18, 2023 · 7 comments
Labels
kind:bug, priority:major

Comments


LoneExplorer2023 commented Nov 18, 2023


Search before asking

  • I have searched in the issues and found no similar issues.

Describe the bug

Kyuubi 1.8.0 with Spark spark-3.4.1-bin-hadoop3.

I have put the Kyuubi Spark SQL extension jar kyuubi-extension-spark-3-4_2.12-1.8.0.jar into $SPARK_HOME/jars.

In kyuubi-defaults.conf I set the following parameters:

spark.sql.extensions=org.apache.kyuubi.sql.KyuubiSparkSQLExtension
spark.sql.optimizer.insertRepartitionBeforeWrite.enabled=true
spark.sql.optimizer.insertRepartitionBeforeWriteIfNoShuffle.enabled=true
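
A quick sanity check that the setting is actually picked up (a minimal sketch, run in spark-shell launched from the same Spark installation; nothing here is Kyuubi-specific):

// spark.sql.extensions is a static conf; reading it back should echo the
// class configured in kyuubi-defaults.conf above.
println(spark.conf.get("spark.sql.extensions"))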

The Spark driver log shows:

ERROR SparkSQLEngine: Failed to instantiate SparkSession: org/apache/spark/sql/hive/execution/OptimizedCreateHiveTableAsSelectCommand
java.lang.NoClassDefFoundError: org/apache/spark/sql/hive/execution/OptimizedCreateHiveTableAsSelectCommand
	at org.apache.kyuubi.sql.RebalanceBeforeWritingHive$.apply(RebalanceBeforeWriting.scala:75)
	at org.apache.kyuubi.sql.RebalanceBeforeWritingHive$.apply(RebalanceBeforeWriting.scala:75)
	at org.apache.spark.sql.SparkSessionExtensions.$anonfun$buildPostHocResolutionRules$1(SparkSessionExtensions.scala:192)
	at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
	at scala.collection.TraversableLike.map(TraversableLike.scala:286)
	at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
	at scala.collection.AbstractTraversable.map(Traversable.scala:108)
	at org.apache.spark.sql.SparkSessionExtensions.buildPostHocResolutionRules(SparkSessionExtensions.scala:192)
	at org.apache.spark.sql.internal.BaseSessionStateBuilder.customPostHocResolutionRules(BaseSessionStateBuilder.scala:229)
	at org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1.<init>(HiveSessionStateBuilder.scala:107)
	at org.apache.spark.sql.hive.HiveSessionStateBuilder.analyzer(HiveSessionStateBuilder.scala:85)
	at org.apache.spark.sql.internal.BaseSessionStateBuilder.$anonfun$build$2(BaseSessionStateBuilder.scala:369)
	at org.apache.spark.sql.internal.SessionState.analyzer$lzycompute(SessionState.scala:89)
	at org.apache.spark.sql.internal.SessionState.analyzer(SessionState.scala:89)
	at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:76)
	at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
	at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:202)
	at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:526)
	at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:202)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:827)
	at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:201)
	at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:76)
	at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:74)
	at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:66)
	at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:98)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:827)
	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96)
	at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:640)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:827)
	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:630)
	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:671)
	at org.apache.kyuubi.engine.spark.KyuubiSparkUtil$.$anonfun$initializeSparkSession$1(KyuubiSparkUtil.scala:48)
	at org.apache.kyuubi.engine.spark.KyuubiSparkUtil$.$anonfun$initializeSparkSession$1$adapted(KyuubiSparkUtil.scala:41)
	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
	at org.apache.kyuubi.engine.spark.KyuubiSparkUtil$.initializeSparkSession(KyuubiSparkUtil.scala:41)
	at org.apache.kyuubi.engine.spark.SparkSQLEngine$.createSpark(SparkSQLEngine.scala:291)
	at org.apache.kyuubi.engine.spark.SparkSQLEngine$.main(SparkSQLEngine.scala:357)
	at org.apache.kyuubi.engine.spark.SparkSQLEngine.main(SparkSQLEngine.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1020)
	at org.apache.spark.deploy.SparkSubmit.$anonfun$submit$2(SparkSubmit.scala:169)
	at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:63)
	at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:62)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
	at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:62)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:169)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:215)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1111)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1120)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.hive.execution.OptimizedCreateHiveTableAsSelectCommand
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 62 more
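
The missing class can be probed directly (a minimal sketch, assuming spark-shell is started from the same $SPARK_HOME so the classpath matches the failing driver; in stock Spark 3.4 binaries this Hive CTAS command class is no longer present):

import scala.util.{Failure, Success, Try}

// Probe for the class that the extension rule references at analysis time.
Try(Class.forName("org.apache.spark.sql.hive.execution.OptimizedCreateHiveTableAsSelectCommand")) match {
  case Success(_)                         => println("class found on the driver classpath")
  case Failure(_: ClassNotFoundException) => println("class NOT found: some jar on the classpath was built against an older Spark")
  case Failure(e)                         => println(s"unexpected failure: $e")
}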

Affects Version(s)

1.8.0

Kyuubi Server Log Output

No response

Kyuubi Engine Log Output

No response

Kyuubi Server Configurations

No response

Kyuubi Engine Configurations

No response

Additional context

No response

Are you willing to submit PR?

  • Yes. I would be willing to submit a PR with guidance from the Kyuubi community to fix.
  • No. I cannot submit a PR at this time.
LoneExplorer2023 added the kind:bug and priority:major labels on Nov 18, 2023.

Hello @LoneExplorer2023,
Thanks for finding the time to report the issue!
We really appreciate the community's efforts to improve Apache Kyuubi.

pan3793 (Member) commented Nov 18, 2023

Verified locally using the officially published Spark/Kyuubi binary artifacts; cannot reproduce.

pan3793 (Member) commented Nov 18, 2023

Regarding the problem you encountered: I believe you have done some basic analysis and tried to filter out what looked like unrelated information, so as to provide only the key clues. In most cases, though, that approach works against problem-solving; the real cause may be exactly what you filtered out.

LoneExplorer2023 (Author) commented Nov 20, 2023

I had successfully deployed Kyuubi 1.7.2 with Spark 3.3.2 before. Is there a problem with the procedure above?

pan3793 (Member) commented Nov 20, 2023

Please provide reproducible steps or an environment (e.g. a Dockerfile).

AngersZhuuuu (Contributor) commented

This looks more like you didn't add the spark-hive jar to the running environment?
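
One way to test that guess (a minimal sketch; HiveSessionStateBuilder ships in Spark's spark-hive module, and the stack trace above already shows its frames, so in this report the Hive jars are evidently present):

// Resolves only if the spark-hive jars are on the classpath.
Class.forName("org.apache.spark.sql.hive.HiveSessionStateBuilder")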

pan3793 (Member) commented Nov 25, 2023

The error is caused by the reporter also putting kyuubi-extension-spark-common_2.12-1.8.0.jar into $SPARK_HOME/jars, which was not mentioned in the issue description.

Closing, as this is not a bug but incorrect usage.
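
A quick way to spot such a jar conflict (a minimal sketch; it relies on the system classloader being a URLClassLoader, which holds on JDK 8 as the sun.misc.Launcher frames in the trace indicate):

import java.net.URLClassLoader

// List every Kyuubi extension jar visible to the driver's classloader.
// Seeing kyuubi-extension-spark-common next to kyuubi-extension-spark-3-4
// reproduces the conflicting deployment described above.
ClassLoader.getSystemClassLoader match {
  case u: URLClassLoader =>
    u.getURLs.map(_.toString).filter(_.contains("kyuubi-extension")).foreach(println)
  case _ =>
    println("not a URLClassLoader (JDK 9+); list $SPARK_HOME/jars directly instead")
}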

pan3793 closed this as completed on Nov 25, 2023.