Big Data Analysis with Scala and Spark

Notes on what I learned while taking Professor Heather Miller's Coursera course "Big Data Analysis with Scala and Spark".

Week 01

  1. Data-Parallel to Distributed Data-Parallel
  2. Latency
  3. RDDs, Spark's Distributed Collection
  4. RDDs: Transformation and Actions
  5. Evaluation in Spark: Unlike Scala Collections!
  6. Cluster Topology Matters!
  7. Weekly Summary

Week 02

  1. Reduction Operations
  2. Pair RDDs
  3. Transformations and Actions on Pair RDDs
  4. Joins
  5. Weekly Summary

Week 03

  1. Shuffling: What it is and why it's important
  2. Partitioning
  3. Optimizing with Partitioners
  4. Wide vs Narrow Dependencies
  5. Weekly Summary

Week 04

  1. Weekly Summary

Scala Programming

  1. (In progress) Scala basics
  2. Scala vs Python in Apache Spark
  3. Set up Scala

Set Up

  1. Scala and Spark Version
  2. Apache Zeppelin
  3. Use SBT
  4. Jupyter Notebook

Reference