This set of online learning materials for undergraduate and graduate data mining classes is currently maintained by Zhaohu (Jonathan) Fan. Some of the materials are drawn from Dr. Yan Yu’s class notes. Thanks to the previous Ph.D. students at the Lindner College of Business for their contributions.
Contributors:
- Zhaohu (Jonathan) Fan, PhD in Business Analytics, [email protected]
- Harsh Singal, M.S. in Business Analytics (current position: Data Scientist - Product Analytics at Asurion)
- Saidat Sanni, PhD Candidate in Business Analytics
Section | Description |
---|---|
1.A | Introduction to Data Mining |
1.B | Introduction to Python |
1.C | Advanced techniques: function and loop |
1.D | Introduction to Markdown (optional) |
Section | Description |
---|---|
2.A | Explore and describe dataset |
2.B | Exploratory data analysis by visualization |
Section | Description |
---|---|
3.A | Linear regression and prediction |
3.B | Subset variable selection |
3.C | LASSO variable selection |
3.D | Monte Carlo simulation |
Section | Description |
---|---|
4.A | Logistic regression and prediction |
4.B | Logistic regression and variable selection |
4.C | Logistic regression for binary classification |
4.D | Logistic regression and ROC |
Section | Description |
---|---|
5.A | Cross validation |
5.B | Cross validation (Logit model) |
Section | Description |
---|---|
6.A | Regression Trees |
6.B | Classification Trees |
Section | Description |
---|---|
7.A | Bagging trees |
7.B | Random forests |
7.C | Boosting trees |
Section | Description |
---|---|
8.A | Univariate Nonparametric Smoothing |
8.B | Generalized additive model (GAM) |
Section | Description |
---|---|
9.A | Neural network models |
9.B | Neural network models (Handwritten Digits Case) |
9.C | Discriminant analysis (Optional) |
9.D | Support vector machine (SVM) (Optional) |
Section | Description |
---|---|
10.A | Clustering |
Section | Description |
---|---|
11.A | Association Rules |
Section | Description |
---|---|
12.A | Basic Text Mining |
Acknowledgments: I have drawn ideas or readings from the following texts:
- Ethan Swan, Python for Data Science
- And many more.