nkafr/Spark-Streaming-Advanced-Examples


Spark-Streaming-Advanced-Examples

Examples of analyzing Tweets, csv files and Apache logs using Spark Streaming, Spark SQL and MLlib

The examples include:

LogAlarmer: Monitors a stream of Apache access logs on port 9999, and prints an alarm if an excessive ratio of errors is encountered.
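The alarm condition described for LogAlarmer can be sketched without Spark. The following is a minimal plain-Python illustration, not the repository's code: the function name `error_ratio_alarm`, the thresholds `max_ratio` and `min_requests`, and the choice to count 2xx as success and 5xx as error are all assumptions made for the example.

```python
import re

# Assumed Apache access-log layout: the status code follows the quoted request.
STATUS_RE = re.compile(r'"\s(\d{3})\s')

def error_ratio_alarm(lines, max_ratio=0.5, min_requests=10):
    """Return True when the share of 5xx responses exceeds max_ratio.

    max_ratio and min_requests are hypothetical defaults for illustration.
    """
    successes = 0
    errors = 0
    for line in lines:
        match = STATUS_RE.search(line)
        if match is None:
            continue  # skip lines that do not look like access-log entries
        status = int(match.group(1))
        if 200 <= status < 300:
            successes += 1
        elif 500 <= status < 600:
            errors += 1
    total = successes + errors
    if total < min_requests:
        return False  # too little traffic to judge
    return errors / total > max_ratio

def make_line(status):
    # Helper that builds a synthetic log line for the usage example.
    return f'127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /x HTTP/1.0" {status} 512'

if __name__ == "__main__":
    window = [make_line(500)] * 8 + [make_line(200)] * 4
    print("ALARM" if error_ratio_alarm(window) else "ok")
```

In a streaming job the same check would run over the contents of each window rather than over a fixed list.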

LogParser: Maintains the top URLs visited over a 5-minute window, from a stream of Apache access logs on port 9999.
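To show what a sliding-window count involves, here is a small plain-Python sketch. It is not how Spark Streaming implements windows and not the repository's code; the class `WindowedUrlCounter` and its methods are hypothetical names, and the sketch assumes events arrive in non-decreasing timestamp order.

```python
from collections import Counter, deque

class WindowedUrlCounter:
    """Counts URL hits seen within the last window_seconds (default 5 minutes)."""

    def __init__(self, window_seconds=300):
        self.window_seconds = window_seconds
        self.events = deque()    # (timestamp, url) pairs, oldest first
        self.counts = Counter()  # hits per URL inside the window

    def add(self, timestamp, url):
        self.events.append((timestamp, url))
        self.counts[url] += 1
        self._expire(timestamp)

    def _expire(self, now):
        # Drop events that have slid out of the window.
        cutoff = now - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            _, old_url = self.events.popleft()
            self.counts[old_url] -= 1
            if self.counts[old_url] == 0:
                del self.counts[old_url]

    def top(self, n=10):
        return self.counts.most_common(n)

if __name__ == "__main__":
    counter = WindowedUrlCounter()
    for ts, url in [(0, "/a"), (10, "/b"), (20, "/a"), (305, "/b")]:
        counter.add(ts, url)
    print(counter.top(3))
```

The same structure applies to counting hashtags over a window: only the key being counted changes.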

LogSQL: Uses Spark SQL with Spark Streaming to issue queries on Apache log data extracted from a stream on port 9999.

AverageTweetLength: Uses thread-safe counters to keep track of the average length of Tweets in a stream.
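The idea of a thread-safe running average can be sketched in plain Python as follows. This is only an illustration under assumptions: the class `TweetLengthStats` is a hypothetical name, and a lock-guarded pair of totals stands in for whatever counter type the actual example uses.

```python
import threading

class TweetLengthStats:
    """Running totals guarded by a lock so several threads can update them."""

    def __init__(self):
        self._lock = threading.Lock()
        self._total_chars = 0
        self._total_tweets = 0

    def add_batch(self, tweets):
        # Sum outside the lock, then update both totals atomically.
        chars = sum(len(text) for text in tweets)
        with self._lock:
            self._total_chars += chars
            self._total_tweets += len(tweets)

    def average(self):
        with self._lock:
            if self._total_tweets == 0:
                return 0.0
            return self._total_chars / self._total_tweets

if __name__ == "__main__":
    stats = TweetLengthStats()
    workers = [
        threading.Thread(target=lambda: [stats.add_batch(["abcd", "ab"]) for _ in range(100)])
        for _ in range(4)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    print(stats.average())
```

Updating both totals under one lock keeps the average consistent even when batches are processed concurrently.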

PopularHashtags: Listens to a stream of Tweets and keeps track of the most popular hashtags over a 5 minute window.

PrintTweets: Simple application to listen to a stream of Tweets and print them out.

SaveTweets: Listens to a stream of Tweets and saves them to disk.

Sessionizer: An example of using a State object to keep persistent state information across a stream. This example keeps track of clickstreams in sessions tied together by IP address.
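Keeping per-key state across batches can be illustrated with a plain dictionary. The sketch below is not Spark's State API and not the repository's code; `Session`, `update_sessions`, and the 30-minute `timeout` are assumptions chosen for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Session:
    last_seen: float
    urls: list = field(default_factory=list)

def update_sessions(sessions, events, timeout=1800):
    """Fold a batch of (ip, timestamp, url) events into per-IP session state.

    A gap longer than timeout seconds (a hypothetical value) starts a new session.
    """
    for ip, timestamp, url in events:
        session = sessions.get(ip)
        if session is None or timestamp - session.last_seen > timeout:
            session = Session(last_seen=timestamp)
            sessions[ip] = session
        session.urls.append(url)
        session.last_seen = timestamp
    return sessions

if __name__ == "__main__":
    state = {}
    update_sessions(state, [("1.1.1.1", 0, "/"), ("2.2.2.2", 5, "/x")])
    update_sessions(state, [("1.1.1.1", 60, "/cart")])
    print(state["1.1.1.1"].urls)
```

The dictionary plays the role of the state that persists between batches; each call corresponds to one batch of the stream.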

StreamingKMeansClass: Example of using streaming K-Means clustering to cluster people by income and age into 5 clusters.

StreamingRegression: Example of using streaming linear regression with stochastic gradient descent. Listens to port 9999 for data on page speed vs. amount spent, and creates a linear model to predict amount spent by page speed over time.
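The way a linear model can be refined batch by batch with gradient descent is sketched below in plain Python. It is an illustration only, not MLlib's implementation: `sgd_update`, `step_size`, and `iterations` are hypothetical names and values, and the data is synthetic.

```python
def sgd_update(weight, bias, batch, step_size=0.05, iterations=50):
    """Refine (weight, bias) on one mini-batch of (x, y) pairs.

    Minimizes mean squared error of the model y = weight * x + bias.
    """
    n = len(batch)
    if n == 0:
        return weight, bias  # nothing arrived in this batch
    for _ in range(iterations):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in batch:
            error = weight * x + bias - y
            grad_w += error * x
            grad_b += error
        weight -= step_size * grad_w / n
        bias -= step_size * grad_b / n
    return weight, bias

if __name__ == "__main__":
    # Synthetic stream: amount spent is exactly twice the page-speed value.
    weight, bias = 0.0, 0.0
    for _ in range(20):
        batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
        weight, bias = sgd_update(weight, bias, batch)
    print(round(weight, 2), round(bias, 2))
```

Carrying `weight` and `bias` from one batch to the next is what lets the model keep improving as more data streams in.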

StructuredStreaming: Example of Structured Streaming that aggregates the number of successful requests per hour.
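The hourly aggregation can be sketched in plain Python to show the grouping logic. This is not Structured Streaming code and not the repository's code; `successes_per_hour` is a hypothetical name, "successful" is assumed to mean a 2xx status, and the log layout is assumed to be the standard Apache access-log format.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed layout: [timestamp] "request" status size
LOG_RE = re.compile(r'\[([^\]]+)\] "[^"]*" (\d{3}) ')

def successes_per_hour(lines):
    """Count 2xx responses per hour bucket, keyed as 'YYYY-MM-DD HH:00'."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.search(line)
        if match is None:
            continue  # not an access-log entry
        status = int(match.group(2))
        if not 200 <= status < 300:
            continue
        timestamp = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
        counts[timestamp.strftime("%Y-%m-%d %H:00")] += 1
    return counts

if __name__ == "__main__":
    sample = [
        '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /a HTTP/1.0" 200 2326',
        '127.0.0.1 - - [10/Oct/2000:13:05:00 -0700] "GET /b HTTP/1.0" 200 10',
        '127.0.0.1 - - [10/Oct/2000:14:01:00 -0700] "GET /c HTTP/1.0" 500 10',
        '127.0.0.1 - - [10/Oct/2000:14:02:00 -0700] "GET /d HTTP/1.0" 201 10',
    ]
    for hour, count in sorted(successes_per_hour(sample).items()):
        print(hour, count)
```

The hour bucket is taken from the local time recorded in each log line, so entries with different UTC offsets are not normalized in this sketch.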
