- Overview of Data Analytics and Key Concepts
- Data Science Project Process
- Machine Learning Methodology
- Machine Learning Modeling Example: PP Article Classification Model
- [Slide], [Video 1], [Video 2], [Video 3], [Video 4]
- Logistic Regression Formulation
- Logistic Regression Training: Gradient Descent
- Multinomial Logistic Regression
- [Slide], [Video 1], [Video 2], [Video 3], [Video 4]
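As a companion to the videos above, here is a minimal NumPy sketch of binary logistic regression trained with batch gradient descent. It is a toy illustration, not the course code; the dataset, learning rate, and iteration count are made up for the example.

```python
import numpy as np

def train_logistic_regression(X, y, lr=0.1, n_iter=2000):
    """Fit binary logistic regression by batch gradient descent on the log-loss."""
    X = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend a bias column
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))   # sigmoid predictions
        grad = X.T @ (p - y) / len(y)      # gradient of the mean log-loss
        w -= lr * grad
    return w

def predict(w, X):
    X = np.hstack([np.ones((X.shape[0], 1)), X])
    return (1.0 / (1.0 + np.exp(-X @ w)) >= 0.5).astype(int)

# Tiny linearly separable example
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])
w = train_logistic_regression(X, y)
```

The same gradient form `X.T @ (p - y)` extends to the multinomial case by replacing the sigmoid with a softmax over class scores.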
- Performance Evaluation of Regression Models: MAE, MAPE, MSE, RMSE
- Performance Evaluation of Classification Models: Simple Accuracy, Balanced Accuracy, F1-Score
- Overview of Deep Neural Networks
- Convolutional Neural Networks: the Convolution Operation, Representative CNN Architectures
- [Slide]
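The regression metrics (MAE, MAPE, MSE, RMSE) and classification metrics (simple accuracy, balanced accuracy, F1-score) listed above follow directly from their definitions. The NumPy sketch below is illustrative only, with made-up toy data:

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    err = y_true - y_pred
    mae = np.mean(np.abs(err))
    mape = np.mean(np.abs(err / y_true)) * 100  # assumes no zero targets
    mse = np.mean(err ** 2)
    return mae, mape, mse, np.sqrt(mse)

def classification_metrics(y_true, y_pred):
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    acc = (tp + tn) / len(y_true)                   # simple accuracy
    bacc = 0.5 * (tp / (tp + fn) + tn / (tn + fp))  # balanced accuracy
    prec, rec = tp / (tp + fp), tp / (tp + fn)
    f1 = 2 * prec * rec / (prec + rec)              # harmonic mean of P and R
    return acc, bacc, f1

mae, mape, mse, rmse = regression_metrics(np.array([100.0, 200.0]),
                                          np.array([110.0, 190.0]))
acc, bacc, f1 = classification_metrics(np.array([1, 1, 1, 1, 0, 0]),
                                       np.array([1, 1, 1, 0, 0, 1]))
```

Balanced accuracy averages the per-class recalls, which is why it is preferred over simple accuracy on imbalanced data.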
- Recurrent Neural Networks, LSTM, GRU
- Autoencoders
- [Slide]
- Background of Ensemble Learning
- 배깅 & 랜덤 포레스트
- AdaBoost & Gradient Boosting Machine
- [Slide], [Video 1], [Video 2], [Video 3], [Video 4], [Video 5]
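To make the boosting idea concrete, here is a compact NumPy sketch of AdaBoost with decision stumps, the setting usually used to introduce it. This is a toy illustration under simplified assumptions (exhaustive stump search, one feature loop), not the course implementation:

```python
import numpy as np

def stump_predict(X, feat, thresh, polarity):
    return np.where(polarity * X[:, feat] < polarity * thresh, 1, -1)

def fit_stump(X, y, w):
    """Exhaustively pick the stump minimizing the weighted error."""
    best = None
    for feat in range(X.shape[1]):
        for thresh in np.unique(X[:, feat]):
            for polarity in (1, -1):
                pred = stump_predict(X, feat, thresh, polarity)
                err = np.sum(w[pred != y])
                if best is None or err < best[0]:
                    best = (err, feat, thresh, polarity)
    return best

def adaboost(X, y, n_rounds=10):
    n = len(y)
    w = np.full(n, 1.0 / n)          # uniform initial sample weights
    ensemble = []
    for _ in range(n_rounds):
        err, feat, thresh, pol = fit_stump(X, y, w)
        err = max(err, 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)   # learner weight
        pred = stump_predict(X, feat, thresh, pol)
        w *= np.exp(-alpha * y * pred)          # up-weight the mistakes
        w /= w.sum()
        ensemble.append((alpha, feat, thresh, pol))
    return ensemble

def ensemble_predict(ensemble, X):
    score = sum(a * stump_predict(X, f, t, p) for a, f, t, p in ensemble)
    return np.sign(score)

# A 1-D "interval" problem no single stump can solve
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([-1, 1, 1, -1])
ensemble = adaboost(X, y, n_rounds=3)
```

Bagging differs only in how the ensemble is built: members are trained independently on bootstrap samples and combined by unweighted voting, while boosting reweights the data sequentially.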
- Anomaly Detection
- Density-based Anomaly Detection
- Model-based Anomaly Detection
- [Slide]
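A minimal density-based scoring sketch: a point's mean distance to its k nearest neighbors serves as an outlier score, so points in sparse regions (low local density) score high. This is a simplified cousin of methods like LOF, with made-up toy data:

```python
import numpy as np

def knn_outlier_scores(X, k=2):
    """Mean distance to the k nearest neighbors as an outlier score."""
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)       # exclude each point's self-distance
    knn = np.sort(dists, axis=1)[:, :k]   # k smallest distances per point
    return knn.mean(axis=1)

# Four points in a tight cluster plus one far-away outlier
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [5.0, 5.0]])
scores = knn_outlier_scores(X, k=2)
```

LOF refines this idea by normalizing each point's density against the densities of its neighbors, so clusters of different densities are handled fairly.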
- Overview of Clustering and Cluster Validity Indices
- K-Means Clustering
- Hierarchical Clustering
- Density-based Clustering: DBSCAN
- [Slide], [Video 1], [Video 2], [Video 3], [Video 4]
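The K-means loop covered above alternates two steps, assignment and centroid update, until the centers stop moving. A minimal NumPy sketch with made-up toy data (it assumes no cluster goes empty, which holds here because centers are initialized at data points):

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # init at data points
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # assignment step: nearest center for each point
        labels = np.argmin(np.linalg.norm(X[:, None] - centers[None], axis=-1),
                           axis=1)
        # update step: move each center to the mean of its cluster
        new_centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Two well-separated blobs
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
labels, centers = kmeans(X, k=2)
```

DBSCAN, by contrast, needs no k: it grows clusters from density-reachable core points and leaves sparse points unassigned as noise.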
Topic 1: Introduction to Text Analytics [Slide]
- Text Analytics: Background, Applications, & Challenges [Video]
- Text Analytics Process [Video]
Topic 2: Text Preprocessing [Slide]
- Introduction to Natural Language Processing (NLP) [Video]
- Lexical analysis [Video]
- Syntax analysis & Other topics in NLP [Video]
- Reading materials
- Cambria, E., & White, B. (2014). Jumping NLP curves: A review of natural language processing research. IEEE Computational Intelligence Magazine, 9(2), 48-57. (PDF)
- Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., & Kuksa, P. (2011). Natural language processing (almost) from scratch. Journal of Machine Learning Research, 12(Aug), 2493-2537. (PDF)
- Young, T., Hazarika, D., Poria, S., & Cambria, E. (2017). Recent trends in deep learning based natural language processing. arXiv preprint arXiv:1708.02709. (PDF)
- NLP Year in Review - 2019 (Medium Post)
Topic 3: Text Representation I: Classic Methods [Slide]
- Bag of words, Word weighting, N-grams [Video]
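The three classic representations named above (bag of words, TF-IDF word weighting, n-grams) fit in a few lines of standard-library Python. A toy sketch with a made-up corpus, using the common `tf * log(N/df)` weighting (one of several TF-IDF variants):

```python
import math
from collections import Counter

docs = ["the cat sat on the mat",
        "the dog sat on the log",
        "cats and dogs"]

def bow(doc):
    """Bag of words: unordered term counts."""
    return Counter(doc.split())

def tf_idf(docs):
    """Weight each term count by log(N / document frequency)."""
    N = len(docs)
    counts = [bow(d) for d in docs]
    df = Counter()
    for c in counts:
        df.update(c.keys())  # number of documents containing each term
    return [{t: tf * math.log(N / df[t]) for t, tf in c.items()}
            for c in counts]

def ngrams(doc, n=2):
    """Contiguous token n-grams, preserving local word order."""
    toks = doc.split()
    return [tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)]

weights = tf_idf(docs)
```

Note that "the", appearing in two of three documents, gets a low weight per occurrence, while "cat", unique to one document, gets the maximum weight log(3).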
Topic 5: Text Representation II: Distributed Representation [Slide]
- Neural Network Language Model (NNLM) [Video]
- Word2Vec [Video]
- GloVe [Video]
- FastText, Doc2Vec, and Other Embeddings [Video]
- Reading materials
- Bengio, Y., Ducharme, R., Vincent, P., & Jauvin, C. (2003). A neural probabilistic language model. Journal of Machine Learning Research, 3(Feb), 1137-1155. (PDF)
- Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781. (PDF)
- Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems (pp. 3111-3119). (PDF)
- Pennington, J., Socher, R., & Manning, C. (2014). GloVe: Global vectors for word representation. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP) (pp. 1532-1543). (PDF)
- Bojanowski, P., Grave, E., Joulin, A., & Mikolov, T. (2016). Enriching word vectors with subword information. arXiv preprint arXiv:1607.04606. (PDF)
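To make the skip-gram training loop from the Word2Vec papers concrete, here is a pure-NumPy sketch of skip-gram with one negative sample per positive pair. It is a toy illustration only: the corpus, dimensionality, and hyperparameters are made up, and for brevity the negative sampler draws uniformly and does not exclude the positive word.

```python
import numpy as np

# Toy corpus; real Word2Vec is trained on millions of sentences
corpus = ["the cat sat on the mat".split(),
          "the dog sat on the rug".split()]
vocab = sorted({w for s in corpus for w in s})
idx = {w: i for i, w in enumerate(vocab)}
V, D = len(vocab), 8

rng = np.random.default_rng(0)
W_in = rng.normal(0.0, 0.1, (V, D))   # center-word ("input") vectors
W_out = rng.normal(0.0, 0.1, (V, D))  # context-word ("output") vectors

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

lr, window = 0.05, 2
for _ in range(100):
    for sent in corpus:
        for i, center in enumerate(sent):
            c = idx[center]
            for j in range(max(0, i - window), min(len(sent), i + window + 1)):
                if j == i:
                    continue
                o = idx[sent[j]]
                n = rng.integers(V)  # one uniform negative sample
                # gradients of the negative-sampling objective
                g_pos = sigmoid(W_in[c] @ W_out[o]) - 1.0
                g_neg = sigmoid(W_in[c] @ W_out[n])
                grad_c = g_pos * W_out[o] + g_neg * W_out[n]
                W_out[o] -= lr * g_pos * W_in[c]
                W_out[n] -= lr * g_neg * W_in[c]
                W_in[c] -= lr * grad_c
```

After training, `W_in` holds the word embeddings; words sharing contexts (here "cat"/"dog", "mat"/"rug") are pushed toward similar vectors.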
Topic 6: Dimensionality Reduction [Slide]
- Dimensionality Reduction Overview, Supervised Feature Selection [Video]
- Unsupervised Feature Extraction [Video]
- Reading materials
- Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6), 391-407. (PDF)
- Landauer, T. K., Foltz, P. W., & Laham, D. (1998). An introduction to latent semantic analysis. Discourse Processes, 25(2-3), 259-284. (PDF)
- van der Maaten, L., & Hinton, G. (2008). Visualizing data using t-SNE. Journal of Machine Learning Research, 9(Nov), 2579-2605. (PDF) (Homepage)
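Latent semantic analysis, the unsupervised feature-extraction method in the readings above, is a truncated SVD of the term-document matrix. A NumPy sketch with a made-up 5-term, 4-document matrix (two "animal" documents, one mixed, one "finance"):

```python
import numpy as np

# Toy term-document matrix: rows = terms, columns = documents
terms = ["cat", "dog", "pet", "stock", "market"]
A = np.array([[2., 0., 1., 0.],
              [1., 1., 0., 0.],
              [1., 2., 1., 0.],
              [0., 0., 2., 1.],
              [0., 0., 1., 2.]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2                                    # number of latent "concepts" kept
doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T   # documents in the latent space

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
```

Comparing documents in the k-dimensional latent space rather than the raw term space lets LSA match documents that share concepts but few literal terms.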
- Sequence-to-Sequence Learning [Video]
- Transformer [Video]
- ELMo: Embeddings from Language Models [Video]
- GPT: Generative Pre-Training of a Language Model [Video]
- BERT: Bidirectional Encoder Representations from Transformer [Video]
- GPT-2: Language Models are Unsupervised Multitask Learners [Video]
- Transformer to T5 [Slide], [Video], Presented by Yukyoung Lee.
- Reading materials
- Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to sequence learning with neural networks. In Advances in neural information processing systems (pp. 3104-3112). (PDF)
- Bahdanau, D., Cho, K., & Bengio, Y. (2014). Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473. (PDF)
- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. In Advances in neural information processing systems (pp. 5998-6008). (PDF)
- Peters, M. E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., & Zettlemoyer, L. (2018). Deep contextualized word representations. arXiv preprint arXiv:1802.05365. (PDF)
- Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018). Improving language understanding by generative pre-training. (PDF)
- Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805. (PDF)
- Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language models are unsupervised multitask learners. OpenAI Blog, 1(8), 9. (PDF)
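The core computation shared by all the Transformer-based models above is scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V from Vaswani et al. (2017). A single-head NumPy sketch with random toy inputs and no masking:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V, single head, no mask."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # pairwise query-key similarities
    # numerically stable row-wise softmax
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# 3 tokens with d_k = 4
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out, w = scaled_dot_product_attention(Q, K, V)
```

Each output row is a convex combination of the value vectors; multi-head attention simply runs several such maps in parallel on learned projections and concatenates the results.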
- Topic Modeling Overview, Latent Semantic Analysis (LSA), & Probabilistic Latent Semantic Analysis (pLSA) [Video]
- LDA: Document Generation Process [Video]
- LDA Inference: Collapsed Gibbs Sampling, LDA Evaluation [Video]
- Reading materials
- Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6), 391-407. (PDF)
- Dumais, S. T. (2004). Latent semantic analysis. Annual Review of Information Science and Technology, 38(1), 188-230.
- Hofmann, T. (1999, July). Probabilistic latent semantic analysis. In Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence (pp. 289-296). Morgan Kaufmann Publishers Inc. (PDF)
- Hofmann, T. (2017, August). Probabilistic latent semantic indexing. In ACM SIGIR Forum (Vol. 51, No. 2, pp. 211-218). ACM.
- Blei, D. M. (2012). Probabilistic topic models. Communications of the ACM, 55(4), 77-84. (PDF)
- Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3(Jan), 993-1022. (PDF)
- Recommended video lectures
- LDA by D. Blei (Lecture Video)
- Variational Inference for LDA by D. Blei (Lecture Video)
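The collapsed Gibbs sampler covered above resamples each token's topic from p(z = k | rest) ∝ (n_dk + α)(n_kw + β)/(n_k + Vβ), where the counts exclude the token itself. A minimal NumPy sketch on a made-up integer-coded corpus; hyperparameters and iteration count are arbitrary:

```python
import numpy as np

def lda_gibbs(docs, V, K, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on integer-coded documents."""
    rng = np.random.default_rng(seed)
    D = len(docs)
    n_dk = np.zeros((D, K))  # topic counts per document
    n_kw = np.zeros((K, V))  # word counts per topic
    n_k = np.zeros(K)        # total tokens per topic
    z = [rng.integers(K, size=len(d)) for d in docs]  # random init
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]  # remove this token's current assignment
                n_dk[d, k] -= 1; n_kw[k, w] -= 1; n_k[k] -= 1
                # full conditional over topics for this token
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + V * beta)
                k = rng.choice(K, p=p / p.sum())
                z[d][i] = k
                n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1
    # posterior mean estimates of topic-word and document-topic distributions
    phi = (n_kw + beta) / (n_kw.sum(axis=1, keepdims=True) + V * beta)
    theta = (n_dk + alpha) / (n_dk.sum(axis=1, keepdims=True) + K * alpha)
    return phi, theta

# Vocabulary of 4 word ids; words {0,1} and {2,3} form two latent topics
docs = [[0, 1, 0, 1, 0], [2, 3, 2, 3, 2], [0, 1, 1, 0], [3, 2, 3, 2]]
phi, theta = lda_gibbs(docs, V=4, K=2)
```

With enough sweeps on data this clean, the sampler typically concentrates words {0,1} and {2,3} into separate topics; evaluation in practice uses held-out perplexity or topic coherence.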