- Data types and variables
- Control structures (if-else, for loops, while loops)
- Functions and lambda expressions
- Exception handling
- List comprehensions
- Generators and iterators
- Decorators
- Context managers
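Three of the topics above (generators, decorators, context managers) fit in one short sketch. Everything here is illustrative stdlib Python; names like `counted` and `timer` are made up for the example:

```python
import contextlib
import time

# Decorator: wraps a function to count how many times it is called.
def counted(fn):
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return fn(*args, **kwargs)
    wrapper.calls = 0
    return wrapper

# Generator: yields squares lazily instead of building a list up front.
def squares(n):
    for i in range(n):
        yield i * i

# Context manager via contextlib: times the body of a with-block.
@contextlib.contextmanager
def timer():
    start = time.perf_counter()
    yield
    print(f"elapsed: {time.perf_counter() - start:.4f}s")

@counted
def double(x):
    return 2 * x

with timer():
    # List comprehension over a generator, with a filter condition.
    evens = [double(s) for s in squares(5) if s % 2 == 0]

print(evens)         # squares 0,1,4,9,16 -> even ones doubled: [0, 8, 32]
print(double.calls)  # 3
```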
- NumPy (arrays, matrix operations)
- pandas (dataframes, series, data manipulation)
- Matplotlib (basic plotting, figures, and axes)
- seaborn (statistical data visualization)
- Handling missing data
- Data type conversion
- Normalizing and scaling
- Descriptive statistics
- Correlation analysis
- Outlier detection
- Merging, joining, and concatenating data
- Grouping and aggregation
- Pivot tables and cross-tabulation
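Descriptive statistics and simple outlier detection (topics above) can be tried without pandas using the stdlib `statistics` module; the z-score threshold of 2 here is an arbitrary choice for the example:

```python
import statistics

# Toy sample with one obvious outlier.
data = [10, 12, 11, 13, 12, 11, 95]

mean = statistics.mean(data)
stdev = statistics.stdev(data)    # sample standard deviation
median = statistics.median(data)  # robust to the outlier, unlike the mean

# Flag points more than 2 sample standard deviations from the mean.
outliers = [x for x in data if abs(x - mean) / stdev > 2]

print(median, outliers)  # 12 [95]
```

Note how the median (12) is barely moved by the extreme value while the mean is pulled well above every typical point, which is why robust statistics matter before scaling or normalizing.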
- Supervised vs. unsupervised learning
- Overfitting and underfitting
- Bias-variance tradeoff
- Train-test split
- Cross-validation
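In practice `sklearn.model_selection.KFold` handles this, but the mechanics of k-fold cross-validation fit in a few lines of plain Python; `kfold_indices` is a hypothetical helper written for this sketch:

```python
import random

def kfold_indices(n, k, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)          # shuffle once, deterministically
    fold_size = n // k
    for i in range(k):
        # Each fold is the validation set exactly once; the last fold
        # absorbs any remainder when n is not divisible by k.
        val = idx[i * fold_size:(i + 1) * fold_size] if i < k - 1 else idx[i * fold_size:]
        val_set = set(val)
        train = [j for j in idx if j not in val_set]
        yield train, val

folds = list(kfold_indices(10, 5))
```

Every index appears in exactly one validation fold, so each data point is held out once across the k model fits.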
- Linear regression (Line fitting, Residual, Gradient Descent)
- Polynomial regression (Polynomial Order)
- Ridge, Lasso, and ElasticNet regression (what effect does each penalty term have on the learned coefficients?)
- Logistic regression (multiclass strategies: one-vs-rest [OvR], one-vs-one [OvO])
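The regression topics above (line fitting, residuals, gradient descent) can be demonstrated end to end in pure Python. A minimal sketch, fitting y = wx + b by gradient descent on mean squared error; the data and learning rate are made up for the example:

```python
# Data lying exactly on y = 2x + 1, so the optimum is w=2, b=1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = 0.0, 0.0
lr = 0.05
n = len(xs)

for _ in range(2000):
    # Residuals: prediction minus target for each point.
    residuals = [w * x + b - y for x, y in zip(xs, ys)]
    # Gradients of MSE = (1/n) * sum(r^2) with respect to w and b.
    grad_w = 2 / n * sum(r * x for r, x in zip(residuals, xs))
    grad_b = 2 / n * sum(residuals)
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)  # converges close to 2.0 and 1.0
```

Adding a penalty term to the loss (L2 for Ridge, L1 for Lasso) simply adds a corresponding term to `grad_w`, which is exactly the coefficient-shrinking effect the bullet above asks about.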
- Decision Trees (Gini impurity, entropy)
- Random Forests
- Support Vector Machines (SVM) (Max-margin Classifier, Kernel trick [RBF, Linear, Sigmoid])
- k-Nearest Neighbors (k-NN)
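k-NN is the easiest of the classifiers above to write from scratch, which makes the algorithm concrete. A stdlib-only sketch with a toy dataset invented for the example:

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority vote among the k nearest labeled points (Euclidean distance)."""
    neighbors = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Two well-separated clusters labeled "a" and "b".
train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]

print(knn_predict(train, (0.5, 0.5)))  # "a"
print(knn_predict(train, (5.5, 5.5)))  # "b"
```

Because prediction scans all training points, k-NN has no training phase but pays at query time, a useful contrast with the fitted models above.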
- Bagging
- Boosting (AdaBoost, Gradient Boosting, XGBoost, CatBoost, LightGBM)
- Stacking
- k-means clustering
- Hierarchical clustering
- Principal Component Analysis (PCA)
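k-means (first bullet of the clustering block) is just Lloyd's two alternating steps. A minimal pure-Python sketch; the points and starting centroids are invented for the example:

```python
import math

def kmeans(points, centroids, iters=10):
    """Lloyd's algorithm: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # New centroid = coordinate-wise mean; keep old one if cluster is empty.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers = kmeans(points, centroids=[(0, 0), (10, 10)])
```

Real k-means (e.g. scikit-learn's) adds smarter initialization (k-means++) and convergence checks; the two-step loop is the same.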
- Confusion matrix
- ROC-AUC
- Precision-Recall
- F1 Score
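The metrics above all derive from the confusion-matrix counts, which is worth computing by hand once. A sketch on made-up binary labels:

```python
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

# Confusion-matrix cells for the positive class.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # 3
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # 1
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # 1

precision = tp / (tp + fp)                           # 0.75: how many flagged are real
recall = tp / (tp + fn)                              # 0.75: how many real were flagged
f1 = 2 * precision * recall / (precision + recall)   # harmonic mean: 0.75
```

ROC-AUC differs in that it sweeps a decision threshold over predicted scores rather than scoring one fixed labeling.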
- Grid Search
- Optional: combining grid search with cross-validation
- Perceptrons
- Activation functions (ReLU, sigmoid, tanh)
- Feedforward neural networks
- Backpropagation and gradient descent
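The chain from perceptron through activation functions to gradient descent (the four bullets above) can be seen in a single sigmoid neuron trained on one example. Everything here is an illustrative sketch, not any framework's API:

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def relu(z):
    return max(0.0, z)

# One neuron: out = sigmoid(w*x + b), trained toward target by gradient descent.
x, target = 2.0, 1.0
w, b, lr = 0.0, 0.0, 1.0

for _ in range(100):
    z = w * x + b
    out = sigmoid(z)
    # Backpropagation for squared-error loss 0.5*(out - target)^2:
    # dL/dz = (out - target) * sigmoid'(z), and sigmoid'(z) = out*(1 - out).
    dz = (out - target) * out * (1 - out)
    w -= lr * dz * x   # dL/dw = dL/dz * x
    b -= lr * dz       # dL/db = dL/dz
```

The `out * (1 - out)` factor also shows where vanishing gradients come from: when the sigmoid saturates, that factor approaches zero, which is one reason ReLU is preferred in deep networks.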
- Convolutional Neural Networks (CNNs)
- LeNet, AlexNet, VGGNet, Inception (GoogLeNet), ResNet, EfficientNet
- Transfer Learning
- Recurrent Neural Networks (RNNs)
- Vanishing and Exploding Gradient
- Long Short-Term Memory networks (LSTMs)
- Autoencoders and Variational Autoencoders (VAEs)
- Generative Adversarial Networks (GANs) (DCGAN, Pix2Pix, CycleGAN)
- Transformers
- TensorFlow
- Keras
- PyTorch
- Regularization techniques
- Hyperparameter tuning (Grid search, Random search)
- Model deployment (Flask, Docker)
- Sequence to Sequence Models
- Attention Mechanism
- Self Attention
- Transformers, Positional Encoding
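Self-attention (the bullets above) is compact enough to write with plain lists: softmax(QKᵀ/√d)V. A minimal sketch with a made-up two-token input, using Q = K = V as in plain self-attention before learned projections:

```python
import math

def softmax(xs):
    m = max(xs)                             # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(Q[0])
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

X = [[1.0, 0.0], [0.0, 1.0]]  # two tokens, 2-dimensional embeddings
out = self_attention(X, X, X)
```

Each output row is a convex combination of the value rows, with each token attending most strongly to itself here; real Transformers add learned Q/K/V projections, multiple heads, and positional encodings on top of this core.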
- Encoder Only Architecture
- BERT (Bidirectional Encoder Representations from Transformers)
- DistilBERT (a distilled version of BERT that is lighter and faster)
- ALBERT (A Lite BERT for self-supervised learning of language representations)
- Decoder Only Architecture
- GPT (Generative Pre-trained Transformer)
- Encoder-Decoder Architecture
- Transformer (original model comprising both encoder and decoder)
- BART (Bidirectional and Auto-Regressive Transformers)
- T5 (Text-to-Text Transfer Transformer)
- Zero-shot, one-shot, and few-shot prompting
- RAG (Retrieval-Augmented Generation)
- OpenAI APIs
- Using self-hosted LLMs (e.g., via Hugging Face)
- Text preprocessing (tokenization, stemming, lemmatization)
- Word embeddings (Word2Vec, GloVe)
- Sentiment analysis
- Named Entity Recognition (NER)
- Image processing basics
- Object detection
- Image classification
- ARIMA models
- Seasonal decomposition
- Forecasting
Courses
- https://www.coursera.org/professional-certificates/ibm-data-science
- https://www.coursera.org/specializations/machine-learning-introduction
- https://www.coursera.org/specializations/deep-learning
- https://www.coursera.org/specializations/natural-language-processing
- https://www.coursera.org/professional-certificates/tensorflow-in-practice
- https://www.coursera.org/specializations/machine-learning-engineering-for-production-mlops
- For LLMs, cover the short courses by DeepLearning.AI - https://www.deeplearning.ai/short-courses/