sparklyr is an awesome R interface to Apache Spark for parallel and distributed computing.
This repo aims to collect awesome sparklyr resources in one place.
- Fork this project.
- Add your resources via a pull request.
- en: Data Science in Spark with Sparklyr::CHEAT SHEET
- zh: Spark数据科学之 sparklyr 速查表 (Data Science in Spark: sparklyr cheat sheet)
- de: Data Science in Spark mit sparklyr Schummelzettel (Data Science in Spark with sparklyr cheat sheet)
- geospark: bring sf to spark in production
- sparkgeo: Sparklyr extension package providing geospatial analytics capabilities
Avro vs. Parquet: row-oriented vs. column-oriented storage. For a fuller introduction, see the discussions on Stack Overflow and the CERN blog.
- cloudera-parcel: customized cloudera-parcel
- mvndeps: Use a previously installed Maven from R to resolve Java dependencies
- clusterconf: Manage Hadoop cluster configurations
- rstudio: bigdataclass
- RStudio Webinars: Part 1 - Introducing an R interface for Apache Spark
- RStudio Webinars: Part 2 - Extending Spark using sparklyr
- RStudio Webinars: Part 3 - Advanced features of sparklyr
- RStudio Webinars: Part 4 - Understanding sparklyr deployment modes
- Microsoft Azure: R and Microsoft R Workflows for Data Science
- DataCamp: Introduction to Spark in R using sparklyr
- rspark-tutorial: Tutorial for learning rspark
- sparklyr 1.1: Foundations, Books, Lakes and Barriers
- sparklyr 1.0: Apache Arrow, XGBoost, Broom and TFRecords
- sparklyr 0.9: Streams and Kubernetes
- sparklyr 0.8: Production pipelines and graphs
- sparklyr 0.7: Spark Pipelines and Machine Learning
- sparklyr 0.6: Distributed R and external sources
- sparklyr 0.5: Livy and dplyr improvements
- Online retail data analysis using R, tidyverse, sparklyr and Spark
- Spark Joy - Saying Konmari to your event logs with grammar of data manipulation
- How to Distribute your R code with sparklyr and Cloudera Data Science Workbench
- Association rules using FPGrowth in Spark MLlib through SparklyR
- Visualizing taxi trips between NYC neighborhoods with Spark and Microsoft R Server
- SDSS2018workflows: Presentation slides for the Symposium on Data Science and Statistics, held May 16-19 in Reston, VA
- Microsoft RxSpark: Create Spark compute context, connect and disconnect a Spark application
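- sparklyr RStudio 활용 (Using sparklyr with RStudio) [Korean]
- S3のデータをRStudioとsparklyrで分析する (Analyzing S3 data with RStudio and sparklyr) [Japanese]
- RStudio + sparklyr on EMRでスケーラブル機械学習 (Scalable machine learning with RStudio + sparklyr on EMR) [Japanese]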
- RSparkling: The Best of R + H2O + Spark
- SPARK+AI SUMMIT 2019: Running R at Scale with Apache Arrow on Spark
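- R y Spark para la Ciencia de Datos (R and Spark for Data Science) [Spanish]
- Spark+R 大数据分析入门 (Introduction to big data analysis with Spark+R) [Chinese]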
- rstudio-conf-2018-sparklyr
- SPARK+AI SUMMIT 2018: From Prototyping to Deployment at Scale with R and sparklyr - Kevin Kuo (RStudio)
- rstudio: building-spark-ml-pipelines-with-sparklyr
- oreilly: Sparklyr: An R interface for Apache Spark
- DataScienceWarsaw25: sparklyr: R interface to Apache Spark machine learning algorithms with dplyr back-end
- Tensorflow and Sparklyr: Scaling Deep Learning and R to the Big Data ecosystem [Italian]
- Sparklyr: Big Data enabler for R users - Serena Signorelli, ICTEAM
- eRum 2018: Exploiting Spark for high-performance scalable data engineering and data-science on Microsoft Azure