Twitter has become an important communication channel in times of emergency. The ubiquity of smartphones enables people to announce an emergency they are observing in real time. Because of this, more agencies (e.g. disaster relief organizations and news agencies) are interested in programmatically monitoring Twitter.
This project builds a learning model that classifies Tweets as disaster or non-disaster.
- Data Collection and Cleaning
- Exploratory Data Analysis
- Pre Processing
- Modeling
- Inferential Visualizations
- Conclusions and Recommendation
This dataset was created by the company Figure Eight and originally shared on their ‘Data For Everyone’ website.
Tweet source: https://twitter.com/AnyOtherAnnaK/status/629195955506708480
This step involves the following (a sketch follows the list):
- Import and Read Data - reading the csv file
- Data Visualization - creating a histogram and a word cloud
- Baseline Accuracy - calculating the accuracy of always predicting the majority class
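A minimal sketch of these steps, assuming the training file is named `train.csv` and the tweet text and label live in `text` and `target` columns (the standard layout for this Kaggle dataset, but worth checking against the notebook):

```python
import pandas as pd
import matplotlib.pyplot as plt
from wordcloud import WordCloud

# Read the csv file
df = pd.read_csv('train.csv')

# Histogram of tweet lengths, split by class
df['tweet_length'] = df['text'].str.len()
df.hist(column='tweet_length', by='target', bins=30)
plt.show()

# Word cloud of all tweet text
cloud = WordCloud(width=800, height=400).generate(' '.join(df['text']))
plt.imshow(cloud, interpolation='bilinear')
plt.axis('off')
plt.show()

# Baseline accuracy: proportion of the majority class
baseline = df['target'].value_counts(normalize=True).max()
print(f'Baseline accuracy: {baseline:.3f}')
```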
This step involves the following methods (a sketch follows the list):
- Tokenizing - splitting the text into distinct chunks (tokens)
- Removing Stopwords - removing commonly used words, as they take up space and processing time
- Lemmatizing - returning the base/dictionary form of each word
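A sketch of this pre-processing using NLTK; the exact tokenizer and lemmatizer used in the notebook may differ:

```python
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize

nltk.download('punkt')
nltk.download('stopwords')
nltk.download('wordnet')

stop_words = set(stopwords.words('english'))
lemmatizer = WordNetLemmatizer()

def preprocess(text):
    # Tokenizing: split the tweet into distinct word chunks
    tokens = word_tokenize(text.lower())
    # Removing stopwords and non-alphabetic tokens
    tokens = [t for t in tokens if t.isalpha() and t not in stop_words]
    # Lemmatizing: reduce each word to its base/dictionary form
    return [lemmatizer.lemmatize(t) for t in tokens]

print(preprocess('Forest fires are spreading near the evacuation zone'))
# e.g. ['forest', 'fire', 'spreading', 'near', 'evacuation', 'zone']
```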
This step builds and compares three models (a modeling sketch follows the list):
- Logistic Regression Model
- Naive Bayes Model
- Decision Tree Model
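A sketch of the model comparison, continuing from the loading sketch above; the TF-IDF vectorizer, train/test split, and hyperparameters are assumptions rather than the notebook's exact settings:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    df['text'], df['target'], stratify=df['target'], random_state=42)

models = {
    'Logistic Regression': LogisticRegression(max_iter=1000),
    'Naive Bayes': MultinomialNB(),
    'Decision Tree': DecisionTreeClassifier(max_depth=10),
}

# Fit each model on vectorized text and report train/test accuracy
for name, clf in models.items():
    pipe = make_pipeline(TfidfVectorizer(stop_words='english'), clf)
    pipe.fit(X_train, y_train)
    print(name,
          'train:', round(pipe.score(X_train, y_train), 4),
          'test:', round(pipe.score(X_test, y_test), 4))
```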
Train and Test Scores:
Model | Train Score | Test Score
---|---|---
Logistic Regression Model | 0.8895 | 0.7978
Naive Bayes Model | 0.7837 | 0.7731
Confusion Matrix Results (a sketch of the computation follows the table):
Model | False Positives | False Negatives |
---|---|---|
Logistic Regression Model | 0 | 3 |
Naive Bayes Model | 8 | 0 |
Decision Tree Model | 20 | 15 |
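The false-positive and false-negative counts above can be read off a scikit-learn confusion matrix; a sketch for one fitted pipeline from the modeling step:

```python
from sklearn.metrics import confusion_matrix

# Predict on the held-out test set and unpack the 2x2 confusion matrix
y_pred = pipe.predict(X_test)
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
print('False positives:', fp, 'False negatives:', fn)
```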
- Creating a decision tree plot with labels
- Creating word clouds for disaster and non-disaster Tweets (a sketch follows the list)
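A sketch of these visualizations, reusing the fitted decision-tree pipeline and DataFrame from the earlier sketches; the step names and class labels shown are assumptions:

```python
from sklearn.tree import plot_tree
from wordcloud import WordCloud
import matplotlib.pyplot as plt

# Decision tree with feature and class labels (top levels only, for readability)
vec = pipe.named_steps['tfidfvectorizer']
tree = pipe.named_steps['decisiontreeclassifier']
plot_tree(tree, max_depth=2, feature_names=vec.get_feature_names_out(),
          class_names=['no disaster', 'disaster'], filled=True)
plt.show()

# Word clouds for each class
for label, title in [(1, 'Disaster tweets'), (0, 'Non-disaster tweets')]:
    text = ' '.join(df.loc[df['target'] == label, 'text'])
    plt.imshow(WordCloud(width=800, height=400).generate(text))
    plt.title(title)
    plt.axis('off')
    plt.show()
```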
A successful model was built and its score was submitted to Kaggle.