
Machine Learning Basics @BflySoft

Lecture 1: Introduction to Data Analytics

  • Overview of data analytics and key concepts
  • The data science project process
  • Machine learning methodologies
  • A machine learning modeling example: the PP article classification model
  • [Slide], [Video 1], [Video 2], [Video 3], [Video 4]

Lecture 2: Multiple Linear Regression

  • MLR Formulation
  • Learning MLR: Ordinary Least Squares (a minimal sketch follows below)
  • Interpreting MLR Results
  • [Slide], [Video 1], [Video 2]
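
As a quick illustration of Ordinary Least Squares, here is a minimal NumPy sketch (not part of the course materials) that solves the normal equations beta_hat = (X'X)^(-1) X'y on made-up data:

```python
import numpy as np

# Toy data: 100 samples, 3 predictors with known true coefficients
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.7]) + rng.normal(scale=0.1, size=100)

# Prepend an intercept column, then solve the least-squares problem;
# lstsq is the numerically stable way to compute (X'X)^(-1) X'y
X1 = np.column_stack([np.ones(len(X)), X])
beta_hat, *_ = np.linalg.lstsq(X1, y, rcond=None)
print(beta_hat.round(2))  # [intercept, b1, b2, b3] ~ [0, 1.5, -2.0, 0.7]
```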

Lecture 3: Logistic Regression

Lecture 4: Performance Evaluation

  • Evaluating regression models: MAE, MAPE, MSE, RMSE
  • Evaluating classification models: simple accuracy, balanced accuracy, F1 score (both sets of metrics are sketched below)
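
A minimal NumPy sketch of the metrics listed above (illustrative only; the MAPE line assumes no zero targets, and the classification metrics assume binary 0/1 labels):

```python
import numpy as np

def regression_metrics(y, p):
    e = y - p
    return {"MAE": np.mean(np.abs(e)),
            "MAPE": np.mean(np.abs(e / y)) * 100,  # undefined if any y == 0
            "MSE": np.mean(e ** 2),
            "RMSE": np.sqrt(np.mean(e ** 2))}

def classification_metrics(y, p):
    tp = np.sum((y == 1) & (p == 1)); tn = np.sum((y == 0) & (p == 0))
    fp = np.sum((y == 0) & (p == 1)); fn = np.sum((y == 1) & (p == 0))
    recall, precision = tp / (tp + fn), tp / (tp + fp)
    specificity = tn / (tn + fp)
    return {"accuracy": (tp + tn) / len(y),
            "balanced_accuracy": (recall + specificity) / 2,
            "F1": 2 * precision * recall / (precision + recall)}
```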

Lecture 5: Decision Tree

  • Classification Tree: recursive partitioning, pruning (a split-search sketch follows below)
  • Regression Tree
  • [Slide], [Video]
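
The split-search step of recursive partitioning fits in a few lines; the sketch below (not from the lecture) scans one feature for the threshold that minimizes the weighted Gini impurity of the two children:

```python
import numpy as np

def gini(y):
    # Gini impurity of a label vector
    _, counts = np.unique(y, return_counts=True)
    p = counts / len(y)
    return 1.0 - np.sum(p ** 2)

def best_split(x, y):
    # Scan candidate thresholds on a single feature and return the one
    # minimizing the weighted impurity of the two children -- the step
    # that recursive partitioning repeats on each resulting node.
    best_t, best_score = None, np.inf
    for t in np.unique(x)[:-1]:
        left, right = y[x <= t], y[x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([0, 0, 0, 1, 1, 1])
print(best_split(x, y))  # (3.0, 0.0): a perfect split between 3 and 4
```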

Lecture 6: Artificial Neural Network

Lecture 7: Deep Neural Network & Convolutional Neural Network

  • Overview of deep neural networks
  • Convolutional neural networks: the convolution operation (sketched below), representative CNN architectures
  • [Slide]
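
For intuition on the convolution operation, a minimal sketch (not from the slides) of valid-mode 2D cross-correlation, which is what CNN layers actually compute:

```python
import numpy as np

def conv2d(image, kernel):
    # Valid-mode 2D cross-correlation: slide the kernel over the image
    # and take the elementwise product-sum at each position
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

edge = np.array([[1.0, 0.0, -1.0]] * 3)          # a simple vertical-edge filter
print(conv2d(np.random.rand(5, 5), edge).shape)  # (3, 3)
```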

Lecture 8: Recurrent Neural Network & Auto Encoder

  • Recurrent neural networks (a minimal forward-pass sketch follows below), LSTM, GRU
  • Autoencoders
  • [Slide]
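
As a companion to the RNN item above, a minimal forward pass of a vanilla recurrent cell, h_t = tanh(W_x x_t + W_h h_{t-1} + b); the dimensions and weights below are made up:

```python
import numpy as np

def rnn_forward(x_seq, W_x, W_h, b):
    # Vanilla RNN unrolled over time: each step mixes the current input
    # with the previous hidden state through a tanh nonlinearity
    h = np.zeros(W_h.shape[0])
    hs = []
    for x_t in x_seq:
        h = np.tanh(W_x @ x_t + W_h @ h + b)
        hs.append(h)
    return np.stack(hs)

d_in, d_h = 4, 8
rng = np.random.default_rng(0)
hs = rnn_forward(rng.normal(size=(10, d_in)),        # 10 time steps
                 rng.normal(size=(d_h, d_in)) * 0.1,
                 rng.normal(size=(d_h, d_h)) * 0.1,
                 np.zeros(d_h))
print(hs.shape)  # (10, 8): one hidden state per time step
```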

Lecture 9: Ensemble Learning

Lecture 10: Anomaly Detection

  • Anomaly detection overview
  • Density-based anomaly detection (a minimal sketch follows below)
  • Model-based anomaly detection
  • [Slide]
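
A minimal illustration of the density-based idea (a stand-in, not the lecture's specific detectors): fit a diagonal Gaussian to normal data and flag low-density points:

```python
import numpy as np

def gaussian_anomaly_scores(X_train, X_test):
    # Fit a per-dimension Gaussian to normal data; points with low
    # log-density under that model get high anomaly scores
    mu, var = X_train.mean(axis=0), X_train.var(axis=0) + 1e-9
    log_pdf = -0.5 * (np.log(2 * np.pi * var) + (X_test - mu) ** 2 / var)
    return -log_pdf.sum(axis=1)  # higher score = more anomalous

rng = np.random.default_rng(0)
normal = rng.normal(size=(500, 2))
test = np.vstack([rng.normal(size=(5, 2)), [[6.0, 6.0]]])  # last point is an outlier
print(gaussian_anomaly_scores(normal, test).round(1))
```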

Lecture 11: Clustering

  • Clustering overview and validity indices
  • K-means clustering (sketched below)
  • Hierarchical clustering
  • Density-based clustering: DBSCAN
  • [Slide], [Video 1], [Video 2], [Video 3], [Video 4]
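
A minimal NumPy sketch of K-means (Lloyd's algorithm) as covered above; initialization and the empty-cluster case are deliberately left simple:

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    # Lloyd's algorithm: alternate cluster assignment and centroid update
    # (empty clusters are not handled -- this is a teaching sketch)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None, :] - centers) ** 2).sum(-1), axis=1)
        new_centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])
labels, centers = kmeans(X, k=2)
print(centers.round(2))  # two centroids near (0, 0) and (3, 3)
```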

Lecture 12: Recommendation Systems

  • Overview of recommendation systems
  • Item-based recommendation [Video]
  • Collaborative filtering-based recommendation
  • Matrix factorization-based recommendation (a minimal SGD sketch follows below)
  • [Slide]
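
To illustrate matrix factorization-based recommendation, a small SGD sketch (not the lecture's code) that fits R ≈ UVᵀ on the observed entries of a toy rating matrix:

```python
import numpy as np

def factorize(R, k=2, lr=0.01, reg=0.02, epochs=200, seed=0):
    # SGD matrix factorization over observed (non-NaN) ratings only:
    # for each rating, nudge the user and item factors along the
    # prediction-error gradient, with L2 regularization
    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.1, size=(R.shape[0], k))
    V = rng.normal(scale=0.1, size=(R.shape[1], k))
    rows, cols = np.nonzero(~np.isnan(R))
    for _ in range(epochs):
        for u, i in zip(rows, cols):
            err = R[u, i] - U[u] @ V[i]
            U[u] += lr * (err * V[i] - reg * U[u])
            V[i] += lr * (err * U[u] - reg * V[i])
    return U, V

R = np.array([[5, 3, np.nan], [4, np.nan, 1], [1, 1, 5.0]])
U, V = factorize(R)
print((U @ V.T).round(1))  # filled-in rating matrix, including the NaN cells
```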

Text Analytics @BflySoft

Topic 1: Introduction to Text Analytics [Slide]

  • Text Analytics: Backgrounds, Applications, Challenges, and Process [Video]
  • Text Analytics Process [Video]

Topic 2: Text Preprocessing [Slide]

  • Introduction to Natural Language Processing (NLP) [Video]
  • Lexical Analysis [Video] (a tokenization sketch follows the reading list below)
  • Syntax Analysis and Other Topics in NLP [Video]
  • Reading materials
    • Cambria, E., & White, B. (2014). Jumping NLP curves: A review of natural language processing research. IEEE Computational intelligence magazine, 9(2), 48-57. (PDF)
    • Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., & Kuksa, P. (2011). Natural language processing (almost) from scratch. Journal of Machine Learning Research, 12(Aug), 2493-2537. (PDF)
    • Young, T., Hazarika, D., Poria, S., & Cambria, E. (2017). Recent trends in deep learning based natural language processing. arXiv preprint arXiv:1708.02709. (PDF)
    • NLP Year in Review - 2019 (Medium Post)
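
Following the lexical-analysis item above, a small NLTK-based sketch of tokenization and POS tagging (assumes `pip install nltk`; the data-package names below may differ across NLTK versions):

```python
# Newer NLTK releases may instead require "punkt_tab" and
# "averaged_perceptron_tagger_eng" as the download names.
import nltk
nltk.download("punkt")
nltk.download("averaged_perceptron_tagger")

text = "Text preprocessing turns raw strings into analyzable tokens."
tokens = nltk.word_tokenize(text)  # lexical analysis: tokenization
print(nltk.pos_tag(tokens))        # part-of-speech tags for each token
```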

Topic 3: Text Representation I: Classic Methods [Slide]

  • Bag of words, Word weighting, N-grams [Video]
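
A minimal sketch of bag-of-words counting and TF-IDF word weighting on a toy corpus (not part of the slides):

```python
import numpy as np

docs = [["the", "cat", "sat"], ["the", "dog", "sat"], ["cats", "and", "dogs"]]
vocab = sorted({w for d in docs for w in d})
idx = {w: j for j, w in enumerate(vocab)}

# Bag of words: raw term counts per document
tf = np.zeros((len(docs), len(vocab)))
for i, d in enumerate(docs):
    for w in d:
        tf[i, idx[w]] += 1

# TF-IDF: down-weight terms that appear in many documents
df = (tf > 0).sum(axis=0)
tfidf = tf * np.log(len(docs) / df)
print(np.round(tfidf, 2))  # "the" and "sat" get zero weight (in 2 of 3 docs... "the" in all)
```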

Topic 5: Text Representation II: Distributed Representation [Slide]

  • Neural Network Language Model (NNLM) [Video]
  • Word2Vec [Video] (a usage sketch follows the reading list below)
  • GloVe [Video]
  • FastText, Doc2Vec, and Other Embeddings [Video]
  • Reading materials
    • Bengio, Y., Ducharme, R., Vincent, P., & Jauvin, C. (2003). A neural probabilistic language model. Journal of machine learning research, 3(Feb), 1137-1155. (PDF)
    • Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781. (PDF)
    • Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems (pp. 3111-3119). (PDF)
    • Pennington, J., Socher, R., & Manning, C. (2014). Glove: Global vectors for word representation. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP) (pp. 1532-1543). (PDF)
    • Bojanowski, P., Grave, E., Joulin, A., & Mikolov, T. (2016). Enriching word vectors with subword information. arXiv preprint arXiv:1607.04606. (PDF)
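
A minimal gensim usage sketch for the Word2Vec item above (assumes gensim >= 4 and `pip install gensim`; the toy sentences are made up, so the printed neighbors are not meaningful):

```python
from gensim.models import Word2Vec

sentences = [["king", "rules", "the", "kingdom"],
             ["queen", "rules", "the", "kingdom"],
             ["dog", "chases", "the", "cat"]]

# sg=1 selects the skip-gram architecture discussed in the lecture
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)
print(model.wv["king"][:5])           # the learned embedding vector
print(model.wv.most_similar("king"))  # nearest neighbors in embedding space
```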

Topic 6: Dimensionality Reduction [Slide]

  • Dimensionality Reduction Overview, Supervised Feature Selection [Video]
  • Unsupervised Feature Extraction [Video] (an LSA sketch follows the reading list below)
  • Reading materials
    • Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American society for information science, 41(6), 391-407. (PDF)
    • Landauer, T. K., Foltz, P. W., & Laham, D. (1998). An introduction to latent semantic analysis. Discourse processes, 25(2-3), 259-284. (PDF)
    • van der Maaten, L., & Hinton, G. (2008). Visualizing data using t-SNE. Journal of machine learning research, 9(Nov), 2579-2605. (PDF) (Homepage)
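
Unsupervised feature extraction via LSA reduces to a truncated SVD of the term-document matrix; a toy sketch (not from the slides):

```python
import numpy as np

# Toy term-document matrix (rows: terms, columns: documents)
A = np.array([[2.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# LSA: keep only the top-k singular triplets of the matrix
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T  # k-dimensional document representations
print(doc_vecs.round(2))
```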

Topic 7: Language Modeling & Pre-trained Models [Slide 1], [Slide 2]

  • Sequence-to-Sequence Learning [Video]
  • Transformer [Video] (an attention sketch follows the reading list below)
  • ELMo: Embeddings from Language Models [Video]
  • GPT: Generative Pre-Training of a Language Model [Video]
  • BERT: Bidirectional Encoder Representations from Transformer [Video]
  • GPT-2: Language Models are Unsupervised Multitask Learners [Video]
  • Transformer to T5 [Slide], [Video], presented by Yukyoung Lee
  • Reading Materials
    • Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to sequence learning with neural networks. In Advances in neural information processing systems (pp. 3104-3112). (PDF)
    • Bahdanau, D., Cho, K., & Bengio, Y. (2014). Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473. (PDF)
    • Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. In Advances in neural information processing systems (pp. 5998-6008). (PDF)
    • Peters, M. E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., & Zettlemoyer, L. (2018). Deep contextualized word representations. arXiv preprint arXiv:1802.05365. (PDF)
    • Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018). Improving language understanding by generative pre-training. (PDF)
    • Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805. (PDF)
    • Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language models are unsupervised multitask learners. OpenAI Blog, 1(8), 9. (PDF)
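
The core of the Transformer is scaled dot-product attention, softmax(QKᵀ/√d_k)V; a minimal NumPy sketch (not from the lecture materials):

```python
import numpy as np

def attention(Q, K, V):
    # Scaled dot-product attention from "Attention Is All You Need":
    # each query attends to all keys, and the softmax weights mix the values
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))  # 4 tokens, d_k = 8
print(attention(Q, K, V).shape)  # (4, 8)
```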

Topic 8: Topic Modeling as a Distributed Representation

  • Topic Modeling Overview, Latent Semantic Analysis (LSA), and Probabilistic Latent Semantic Analysis (pLSA) [Video]
  • LDA: Document Generation Process [Video]
  • LDA Inference: Collapsed Gibbs Sampling, LDA Evaluation [Video] (a minimal Gibbs sketch follows below)
  • Reading Materials
    • Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American society for information science, 41(6), 391. (PDF)
    • Dumais, S. T. (2004). Latent semantic analysis. Annual review of information science and technology, 38(1), 188-230.
    • Hofmann, T. (1999, July). Probabilistic latent semantic analysis. In Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence (pp. 289-296). Morgan Kaufmann Publishers Inc. (PDF)
    • Hofmann, T. (2017, August). Probabilistic latent semantic indexing. In ACM SIGIR Forum (Vol. 51, No. 2, pp. 211-218). ACM.
    • Blei, D. M. (2012). Probabilistic topic models. Communications of the ACM, 55(4), 77-84. (PDF)
    • Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent dirichlet allocation. Journal of machine Learning research, 3(Jan), 993-1022. (PDF)
  • Recommended video lectures
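
A compact, illustrative implementation of collapsed Gibbs sampling for LDA (not the lecture's code; the hyperparameters and toy corpus are made up):

```python
import numpy as np

def lda_gibbs(docs, V, K, iters=200, alpha=0.1, beta=0.01, seed=0):
    # Collapsed Gibbs sampling: resample each token's topic from
    # p(z = k | rest) ∝ (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta)
    rng = np.random.default_rng(seed)
    n_dk = np.zeros((len(docs), K)); n_kw = np.zeros((K, V)); n_k = np.zeros(K)
    z = [rng.integers(K, size=len(d)) for d in docs]
    for d, doc in enumerate(docs):            # initialize count tables
        for i, w in enumerate(doc):
            n_dk[d, z[d][i]] += 1; n_kw[z[d][i], w] += 1; n_k[z[d][i]] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]                   # remove this token's counts
                n_dk[d, k] -= 1; n_kw[k, w] -= 1; n_k[k] -= 1
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + V * beta)
                k = rng.choice(K, p=p / p.sum())
                z[d][i] = k                   # add back under the new topic
                n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1
    return n_kw  # normalize rows for the topic-word distributions

docs = [[0, 1, 2, 0], [2, 3, 3, 1], [0, 0, 1, 2]]  # word ids, vocab size 4
print(lda_gibbs(docs, V=4, K=2).round(1))
```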
