This demo shows how to use Apache Spark with the Scala SDK in a simple application.
This demo uses Apache Spark 2.4.4 and sentry-java 1.7.27.
Spark requires Java 8. It is recommended that you use jenv to manage your Java versions.
Check your Java version with:
java -version
You should get something like this:
openjdk version "1.8.0_222"
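Because Spark 2.4.4 needs Java 8, a small shell check can catch a mismatched JDK before anything is built. The following is only a sketch and not part of the demo: `java_major_version` is a hypothetical helper that reduces a version string like the one above to its major number, treating the legacy `1.x` scheme as major version `x`.

```shell
#!/bin/sh
# Hypothetical helper (not part of the demo): print the major Java version
# for a version string such as "1.8.0_222" (-> 8) or "11.0.4" (-> 11).
java_major_version() {
  version="$1"
  case "$version" in
    1.*) version="${version#1.}" ;;   # legacy scheme: 1.8.0_222 -> 8.0_222
  esac
  echo "${version%%.*}"               # keep everything before the first dot
}

# Possible usage (java -version writes to stderr, hence the 2>&1):
#   current="$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')"
#   [ "$(java_major_version "$current")" = "8" ] || echo "Spark 2.4.4 expects Java 8" >&2
```

If the check fails, switching the active JDK with jenv and rerunning it is one way to confirm the fix.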
Install sbt with Homebrew:
brew install sbt
Download Apache Spark 2.4.4, pre-built for Hadoop 2.7, from https://spark.apache.org/downloads.html
Set your $SPARK_HOME environment variable to point to your Spark folder:
export SPARK_HOME=path/to/spark/spark-2.4.4-bin-hadoop2.7
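A wrong or empty $SPARK_HOME usually only shows up later as a "command not found" error from spark-submit. As a sketch, the hypothetical helper below (`spark_home_ok` is not part of the demo or of Spark) checks that a directory contains an executable `bin/spark-submit`, which is the only file the later commands rely on.

```shell
#!/bin/sh
# Hypothetical helper (not part of the demo): succeed only if the given
# directory is non-empty and contains an executable bin/spark-submit.
spark_home_ok() {
  [ -n "$1" ] && [ -x "$1/bin/spark-submit" ]
}

# Possible usage:
#   spark_home_ok "$SPARK_HOME" || echo "SPARK_HOME does not point to a Spark folder" >&2
```

The check deliberately looks at nothing else in the folder, so it does not verify the Spark or Hadoop version.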
Package your application jar:
sbt package
Run your applications with spark-submit, one command per application class:
$SPARK_HOME/bin/spark-submit \
--class "SimpleApp" \
--master "local[4]" \
--files "sentry.properties" \
--packages "io.sentry:sentry-spark_2.11:0.0.1-alpha04" \
target/scala-2.11/simple-project_2.11-1.0.jar
$SPARK_HOME/bin/spark-submit \
--class "SimpleQueryApp" \
--master "local[4]" \
--files "sentry.properties" \
--packages "io.sentry:sentry-spark_2.11:0.0.1-alpha04" \
target/scala-2.11/simple-project_2.11-1.0.jar
$SPARK_HOME/bin/spark-submit \
--class "SimpleStreamingQueryApp" \
--master "local[4]" \
--files "sentry.properties" \
--packages "io.sentry:sentry-spark_2.11:0.0.1-alpha04" \
target/scala-2.11/simple-project_2.11-1.0.jar
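The three commands above differ only in the value passed to --class, so they can be wrapped in a small script. This is a sketch under assumptions: `submit_app` and `submit_all` are hypothetical helper names, and `SPARK_SUBMIT` is a hypothetical override variable added here so the wrapper can be tried without a Spark install. All flags and values are copied from the commands above.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of the demo): submit one application class
# with the same flags as the commands above.
# SPARK_SUBMIT is an assumed override; by default $SPARK_HOME/bin/spark-submit is used.
submit_app() {
  "${SPARK_SUBMIT:-$SPARK_HOME/bin/spark-submit}" \
    --class "$1" \
    --master "local[4]" \
    --files "sentry.properties" \
    --packages "io.sentry:sentry-spark_2.11:0.0.1-alpha04" \
    target/scala-2.11/simple-project_2.11-1.0.jar
}

# Submit the three applications in the order listed above, stopping at the first failure.
submit_all() {
  for app in SimpleApp SimpleQueryApp SimpleStreamingQueryApp; do
    submit_app "$app" || return 1
  done
}
```

Calling `submit_all` from the project root runs the applications one after another. Setting `SPARK_SUBMIT=echo` first prints the arguments instead of launching Spark, which is a cheap way to check the command line.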