Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[SPARK-51027][SQL] Prevent
HiveClient.runSqlHive
invocation in non-…
…testing environment ### What changes were proposed in this pull request? This PR aims to prevent `HiveClient.runSqlHive` invocation in non-testing environment. ### Why are the changes needed? `HiveClient.runSqlHive` is a kind of testing utility which Apache Spark uses during unit testing in order to compare with Hive result. We had better guarantee that this is not used in non-testing environments. ``` $ git grep runSqlHive | grep test | wc -l 84 $ git grep runSqlHive | grep -v test sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClient.scala: def runSqlHive(sql: String): Seq[String] sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala: override def runSqlHive(sql: String): Seq[String] = { ``` ### Does this PR introduce _any_ user-facing change? No. This is a method of private class, `HiveClient`. https://github.com/apache/spark/blob/3badfd2c41d33fe5f1cbd1d7345c749b7af81574/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClient.scala#L42 ### How was this patch tested? In testing environment, it always succeed. So, we need to check manually. ``` $ build/sbt package -Phive -Phive-thriftserver $ bin/spark-shell WARNING: Using incubator modules: jdk.incubator.vector Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties Setting default log level to "WARN". To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel). Welcome to ____ __ / __/__ ___ _____/ /__ _\ \/ _ \/ _ `/ __/ '_/ /___/ .__/\_,_/_/ /_/\_\ version 4.1.0-SNAPSHOT /_/ Using Scala version 2.13.15 (OpenJDK 64-Bit Server VM, Java 17.0.14) Type in expressions to have them evaluated. Type :help for more information. 25/01/28 22:32:58 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable Spark context Web UI available at http://localhost:4040 Spark context available as 'sc' (master = local[*], app id = local-1738132378341). Spark session available as 'spark'. scala> :paste -raw // Entering paste mode (ctrl-D to finish) package org.apache.spark object myObject { val spark = org.apache.spark.sql.SparkSession.getActiveSession.get val client = spark.sharedState.externalCatalog.unwrapped.asInstanceOf[org.apache.spark.sql.hive.HiveExternalCatalog].client client.runSqlHive("SHOW TABLES") } // Exiting paste mode... now compiling with scalac. scala> org.apache.spark.myObject 25/01/28 22:33:23 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 2.3.0 25/01/28 22:33:23 WARN ObjectStore: setMetaStoreSchemaVersion called but recording version is disabled: version = 2.3.0, comment = Set by MetaStore dongjoon127.0.0.1 java.lang.AssertionError: assertion failed: spark.testing is not set to true at scala.Predef$.assert(Predef.scala:279) at org.apache.spark.sql.hive.client.HiveClientImpl.runSqlHive(HiveClientImpl.scala:865) ... 43 elided ``` ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#49722 from dongjoon-hyun/SPARK-51027. Authored-by: Dongjoon Hyun <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]>
- Loading branch information