From b3a0c115b7e4b793e78cf335efae84b49f802bd3 Mon Sep 17 00:00:00 2001 From: karishma-battina <67629745+karishma-battina@users.noreply.github.com> Date: Fri, 24 Jan 2025 00:17:38 +0000 Subject: [PATCH 01/15] Add sueprvised-learning.md file --- .../supervised-learning.md | 47 +++++++++++++++++++ 1 file changed, 47 insertions(+) create mode 100644 content/ai/concepts/supervised-learning/supervised-learning.md diff --git a/content/ai/concepts/supervised-learning/supervised-learning.md b/content/ai/concepts/supervised-learning/supervised-learning.md new file mode 100644 index 00000000000..d90741a9447 --- /dev/null +++ b/content/ai/concepts/supervised-learning/supervised-learning.md @@ -0,0 +1,47 @@ +--- +Title: 'Supervised Learning' +Description: 'Supervised learning is a type of machine learning where algorithms learn from labeled data to make predictions or decisions.' +Subjects: + - 'Machine Learning' + - 'Data Science' + - 'Artificial Intelligence' +Tags: + - 'AI' + - 'Deep Learning' + - 'Classification' + - 'Regression' + - 'Predictive Modeling' +CatalogContent: + - 'machine-learning' + - 'paths/data-science' +--- + +**Machine learning (ML)** is a type of machine learning where the algorithm learns from labeled data. It involves training a model on input-output pairs to generalize and predict outcomes for new, unseen data. This label acts as a "supervisor," guiding the learning process. + +Imagine teaching a child by showing them examples with correct answers. Similarly, the algorithm learns patterns from these examples and uses them to make predictions on new, unseen data. + +Examples: Identifying Handwritten Digits, predicting the prices of cars, spam emails detection. + +## Types of Supervised Learning + +### Classification + +In Classification, the algorithm learns from labeled training data, where each input is associated with a specific class, and then uses this knowledge to classify new, unseen data. + +- Example: In a spam filter, the classes are "spam" and "not spam." + - Linear regression: Plots the line or plane of "best fit" of optimal values for prediction tasks. + - Logistic regression: Classifies elements in a data set into discrete categories. +- Classification: Categorizes data points into discrete groups. + - Naïve-Bayes classifier: Uses Bayes' theorem of probability to perform classification of elements. + - Support vector machine (SVM): Margin classifiers that define hyperplanes to separate data points into discrete categories. + - Artificial Neural Networks (ANN): Classifiers modeled after biological neural networks with relatively high-performance capabilities in regression and classification tasks. + +### Regression + +Predicts numeric values. + +- Clustering: Recognize patterns and structures in unlabeled data by grouping them into clusters. + - K-Means: Categorizes data points into clusters based on their proximity to cluster centroids. + - Hierarchical Agglomerative Clustering: Groups data points into clusters based on various measures of similarity, such as the smallest average distance between all points, minimal variance between data points, or smallest maximum distance between data points. +- Dimensionality Reduction: Scale down the dimensions in the dataset from a high-dimensional space into a low-dimensional space while maintaining the maximum amount of relevant information. + - Principal Component Analysis (PCA): Reduces the dimensionality of a dataset to the 'n' number of principal dimensions that contain the most valuable information. 
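The concept entry added above describes the supervised-learning workflow — fit a model on labeled input-output pairs, then predict outcomes for new, unseen data — in prose only. A minimal sketch of that workflow for the handwritten-digit example it mentions, assuming Python with scikit-learn and its bundled toy datasets (neither of which the entry itself references), might look like this:

```python
# Hypothetical illustration: classification from labeled data (scikit-learn assumed).
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Labeled data: each 8x8 digit image (input) is paired with its digit 0-9 (output label).
X, y = load_digits(return_X_y=True)

# Hold out a test set so the model is evaluated on data it has never seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Model: logistic regression, one of the classification algorithms named in the entry.
clf = LogisticRegression(max_iter=5000)
clf.fit(X_train, y_train)        # learn patterns from the labeled examples
predicted = clf.predict(X_test)  # predict labels for unseen inputs

print("Test accuracy:", accuracy_score(y_test, predicted))
```

A regression task follows the same fit/predict pattern but targets a continuous value rather than a class. A comparable sketch, again assuming scikit-learn and using its diabetes toy dataset as a stand-in for a price- or value-prediction problem, could be:

```python
# Hypothetical illustration: regression on a continuous target (scikit-learn assumed).
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)  # y is a continuous disease-progression score
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

reg = LinearRegression()
reg.fit(X_train, y_train)  # learn from labeled examples
print("Test MSE:", mean_squared_error(y_test, reg.predict(X_test)))
```

Logistic and linear regression are used here only because they are the simplest representatives of the classification and regression families the entry describes; any of the other algorithms it lists could be swapped in behind the same fit/predict interface.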
From 3d60a5b3f39e46f048848371d4c1a78c184c0aa6 Mon Sep 17 00:00:00 2001 From: karishma-battina <67629745+karishma-battina@users.noreply.github.com> Date: Sat, 25 Jan 2025 05:34:53 +0000 Subject: [PATCH 02/15] Update description --- .../ai/concepts/supervised-learning/supervised-learning.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/ai/concepts/supervised-learning/supervised-learning.md b/content/ai/concepts/supervised-learning/supervised-learning.md index d90741a9447..101460459b4 100644 --- a/content/ai/concepts/supervised-learning/supervised-learning.md +++ b/content/ai/concepts/supervised-learning/supervised-learning.md @@ -1,6 +1,6 @@ --- Title: 'Supervised Learning' -Description: 'Supervised learning is a type of machine learning where algorithms learn from labeled data to make predictions or decisions.' +Description: 'Supervised learning is a machine learning technique where algorithms learn from labeled data to make predictions.' Subjects: - 'Machine Learning' - 'Data Science' @@ -16,7 +16,7 @@ CatalogContent: - 'paths/data-science' --- -**Machine learning (ML)** is a type of machine learning where the algorithm learns from labeled data. It involves training a model on input-output pairs to generalize and predict outcomes for new, unseen data. This label acts as a "supervisor," guiding the learning process. +**Supervised learning (ML)** is a type of machine learning where the algorithm learns from labeled data. It involves training a model on input-output pairs to generalize and predict outcomes for new, unseen data. This label acts as a "supervisor," guiding the learning process. Imagine teaching a child by showing them examples with correct answers. Similarly, the algorithm learns patterns from these examples and uses them to make predictions on new, unseen data. From 8cc63bcdd4062c2f7f10cb762d574991254e6876 Mon Sep 17 00:00:00 2001 From: karishma-battina <67629745+karishma-battina@users.noreply.github.com> Date: Sun, 26 Jan 2025 05:03:00 +0000 Subject: [PATCH 03/15] Add types and examples --- .../supervised-learning.md | 46 ++++++++++++++++++- 1 file changed, 45 insertions(+), 1 deletion(-) diff --git a/content/ai/concepts/supervised-learning/supervised-learning.md b/content/ai/concepts/supervised-learning/supervised-learning.md index 101460459b4..0e99c9aec99 100644 --- a/content/ai/concepts/supervised-learning/supervised-learning.md +++ b/content/ai/concepts/supervised-learning/supervised-learning.md @@ -28,6 +28,25 @@ Examples: Identifying Handwritten Digits, predicting the prices of cars, spam em In Classification, the algorithm learns from labeled training data, where each input is associated with a specific class, and then uses this knowledge to classify new, unseen data. +Examples: +Key Components: Spam detection, Handwritten Digit Recognition, Image Classification, Medical Diagnosis. + +Features (Input Variables): These are the measurable characteristics or attributes of the data points. They can be numerical or categorical +Labels (Output Variables/Classes/Categories): These are the predefined categories to which data points are assigned. They are discrete values, meaning they belong to a finite set. +Training Data: A dataset where each data point is paired with its correct label. This is what the model learns from. +Model: The algorithm or function that learns the mapping between the features and the labels. + +Types of Classification: +Binary Classification: The task of classifying data points into one of two classes. 
+Examples: Spam detection (spam or not spam), Medical diagnosis (disease present or absent) +Multi-class Classification: The task of classifying data points into one of more than two classes. +Examples: Image classification (cat, dog, bird, fish), Handwritten digit recognition (0, 1, 2, 3, 4, 5, 6, 7, 8, 9) +Multi-label Classification: The task of assigning multiple labels to each data point. This is different from multi-class classification, where each data point can only belong to one class. +Examples: Tagging a movie with multiple genres (action, adventure, comedy), Classifying a document with multiple topics (politics, economics, international relations) + +Common Classification Algorithms: Logistic Regression, Support Vector Machines (SVMs), Decision Trees, Random Forests, Naive Bayes, K-Nearest Neighbors (KNN) +// + - Example: In a spam filter, the classes are "spam" and "not spam." - Linear regression: Plots the line or plane of "best fit" of optimal values for prediction tasks. - Logistic regression: Classifies elements in a data set into discrete categories. @@ -35,13 +54,38 @@ In Classification, the algorithm learns from labeled training data, where each i - Naïve-Bayes classifier: Uses Bayes' theorem of probability to perform classification of elements. - Support vector machine (SVM): Margin classifiers that define hyperplanes to separate data points into discrete categories. - Artificial Neural Networks (ANN): Classifiers modeled after biological neural networks with relatively high-performance capabilities in regression and classification tasks. + // ### Regression -Predicts numeric values. +Regression, in the realm of machine learning and statistics, is a supervised learning task focused on predicting a continuous numerical output. + +Unlike classification, which assigns data points to categories, regression aims to estimate a value within a range. + +Key Components: + +Features (Input Variables): These are the independent variables used to make predictions. They can be numerical or categorical, but they ultimately influence the numerical output.   +Target Variable (Output Variable/Dependent Variable): This is the continuous numerical value we want to predict.   +Training Data: A dataset containing examples of input features and their corresponding target values. The model learns from these examples.   +Model: The learned function that maps the input features to the target variable. + +Types of Regression: + +Linear Regression: Assumes a linear relationship between the features and the target variable. It tries to find the best-fitting straight line (in simple linear regression with one feature) or hyperplane (in multiple linear regression with multiple features) that minimizes the difference between the predicted and actual values.   +Polynomial Regression: Used when the relationship between the features and the target variable is non-linear. It fits a polynomial curve to the data.   +Multiple Linear Regression: Used when there are multiple input features influencing the target variable.   +Support Vector Regression (SVR): Uses the principles of Support Vector Machines to perform regression. It tries to find a hyperplane that best fits the data within a certain margin of error.   +Decision Tree Regression: Uses a tree-like structure to make predictions. Each node in the tree represents a decision based on a feature, and the leaves represent the predicted values.   
+Random Forest Regression: An ensemble method that combines multiple decision trees to improve prediction accuracy and reduce overfitting.   +Neural Network Regression: Uses neural networks to learn complex non-linear relationships between features and the target variable. + +Examples: House Price Prediction, Stock Price Prediction, Temperature Forecasting, Sales Forecasting. + +// - Clustering: Recognize patterns and structures in unlabeled data by grouping them into clusters. - K-Means: Categorizes data points into clusters based on their proximity to cluster centroids. - Hierarchical Agglomerative Clustering: Groups data points into clusters based on various measures of similarity, such as the smallest average distance between all points, minimal variance between data points, or smallest maximum distance between data points. - Dimensionality Reduction: Scale down the dimensions in the dataset from a high-dimensional space into a low-dimensional space while maintaining the maximum amount of relevant information. - Principal Component Analysis (PCA): Reduces the dimensionality of a dataset to the 'n' number of principal dimensions that contain the most valuable information. + // From 05f6dcfaf4cc46a363b7f2dad8233a4c10b5e125 Mon Sep 17 00:00:00 2001 From: karishma-battina <67629745+karishma-battina@users.noreply.github.com> Date: Sun, 26 Jan 2025 14:39:08 +0000 Subject: [PATCH 04/15] Check formatting --- .../supervised-learning.md | 58 +++++++++---------- 1 file changed, 29 insertions(+), 29 deletions(-) diff --git a/content/ai/concepts/supervised-learning/supervised-learning.md b/content/ai/concepts/supervised-learning/supervised-learning.md index 0e99c9aec99..704f7e3ef64 100644 --- a/content/ai/concepts/supervised-learning/supervised-learning.md +++ b/content/ai/concepts/supervised-learning/supervised-learning.md @@ -22,39 +22,39 @@ Imagine teaching a child by showing them examples with correct answers. Similarl Examples: Identifying Handwritten Digits, predicting the prices of cars, spam emails detection. +Key Components +Training Data: A dataset of input-output pairs (e.g., emails labeled as "spam" or "not spam"). + +Model: The algorithm (e.g., decision tree, neural network) that learns from the data. + +Loss Function: Measures how far the model's predictions are from the true labels (e.g., mean squared error for regression, cross-entropy for classification). + +Optimization: Adjusting the model’s parameters (weights) to minimize the loss (e.g., using gradient descent). + ## Types of Supervised Learning ### Classification In Classification, the algorithm learns from labeled training data, where each input is associated with a specific class, and then uses this knowledge to classify new, unseen data. -Examples: -Key Components: Spam detection, Handwritten Digit Recognition, Image Classification, Medical Diagnosis. +- Examples: -Features (Input Variables): These are the measurable characteristics or attributes of the data points. They can be numerical or categorical -Labels (Output Variables/Classes/Categories): These are the predefined categories to which data points are assigned. They are discrete values, meaning they belong to a finite set. -Training Data: A dataset where each data point is paired with its correct label. This is what the model learns from. -Model: The algorithm or function that learns the mapping between the features and the labels. + - Spam detection, Handwritten Digit Recognition, Image Classification, Medical Diagnosis. 
-Types of Classification: -Binary Classification: The task of classifying data points into one of two classes. -Examples: Spam detection (spam or not spam), Medical diagnosis (disease present or absent) -Multi-class Classification: The task of classifying data points into one of more than two classes. -Examples: Image classification (cat, dog, bird, fish), Handwritten digit recognition (0, 1, 2, 3, 4, 5, 6, 7, 8, 9) -Multi-label Classification: The task of assigning multiple labels to each data point. This is different from multi-class classification, where each data point can only belong to one class. -Examples: Tagging a movie with multiple genres (action, adventure, comedy), Classifying a document with multiple topics (politics, economics, international relations) +- Key Components: -Common Classification Algorithms: Logistic Regression, Support Vector Machines (SVMs), Decision Trees, Random Forests, Naive Bayes, K-Nearest Neighbors (KNN) -// + - Features (Input Variables): These are the measurable characteristics or attributes of the data points. They can be numerical or categorical + - Labels (Output Variables/Classes/Categories): These are the predefined categories to which data points are assigned. They are discrete values, meaning they belong to a finite set. + - Training Data: A dataset where each data point is paired with its correct label. This is what the model learns from. + - Model: The algorithm or function that learns the mapping between the features and the labels. -- Example: In a spam filter, the classes are "spam" and "not spam." - - Linear regression: Plots the line or plane of "best fit" of optimal values for prediction tasks. - - Logistic regression: Classifies elements in a data set into discrete categories. -- Classification: Categorizes data points into discrete groups. - - Naïve-Bayes classifier: Uses Bayes' theorem of probability to perform classification of elements. - - Support vector machine (SVM): Margin classifiers that define hyperplanes to separate data points into discrete categories. - - Artificial Neural Networks (ANN): Classifiers modeled after biological neural networks with relatively high-performance capabilities in regression and classification tasks. - // +- Types of Classification: + + - Binary Classification: The task of classifying data points into one of two classes. + - Multi-class Classification: The task of classifying data points into one of more than two classes. + - Multi-label Classification: The task of assigning multiple labels to each data point. This is different from multi-class classification, where each data point can only belong to one class. + +- Common Classification Algorithms: Logistic Regression, Support Vector Machines (SVMs), Decision Trees, Random Forests, Naive Bayes, K-Nearest Neighbors (KNN) ### Regression @@ -62,12 +62,11 @@ Regression, in the realm of machine learning and statistics, is a supervised lea Unlike classification, which assigns data points to categories, regression aims to estimate a value within a range. -Key Components: - -Features (Input Variables): These are the independent variables used to make predictions. They can be numerical or categorical, but they ultimately influence the numerical output.   -Target Variable (Output Variable/Dependent Variable): This is the continuous numerical value we want to predict.   -Training Data: A dataset containing examples of input features and their corresponding target values. The model learns from these examples.   
-Model: The learned function that maps the input features to the target variable. +- Key Components: + - Features (Input Variables): These are the independent variables used to make predictions. They can be numerical or categorical, but they ultimately influence the numerical output. + - Target Variable (Output Variable/Dependent Variable): This is the continuous numerical value we want to predict. + - Training Data: A dataset containing examples of input features and their corresponding target values. The model learns from these examples. + - Model: The learned function that maps the input features to the target variable. Types of Regression: @@ -81,6 +80,7 @@ Neural Network Regression: Uses neural networks to learn complex non-linear rela Examples: House Price Prediction, Stock Price Prediction, Temperature Forecasting, Sales Forecasting. +Common Classification Algorithms: Linear Regression, Polynomial Regression, Support Vector Regression (SVR), Decision Tree Regression, Random Forest Regression, Neural Network Regression. // - Clustering: Recognize patterns and structures in unlabeled data by grouping them into clusters. From 4fb4a74c55bbb7e44046ec63e8b51bf9bd45e2c7 Mon Sep 17 00:00:00 2001 From: karishma-battina <67629745+karishma-battina@users.noreply.github.com> Date: Sun, 26 Jan 2025 15:14:36 +0000 Subject: [PATCH 05/15] Fix formatting --- .../supervised-learning.md | 49 +++++++------------ 1 file changed, 18 insertions(+), 31 deletions(-) diff --git a/content/ai/concepts/supervised-learning/supervised-learning.md b/content/ai/concepts/supervised-learning/supervised-learning.md index 704f7e3ef64..ca804ee75e5 100644 --- a/content/ai/concepts/supervised-learning/supervised-learning.md +++ b/content/ai/concepts/supervised-learning/supervised-learning.md @@ -22,14 +22,11 @@ Imagine teaching a child by showing them examples with correct answers. Similarl Examples: Identifying Handwritten Digits, predicting the prices of cars, spam emails detection. -Key Components -Training Data: A dataset of input-output pairs (e.g., emails labeled as "spam" or "not spam"). - -Model: The algorithm (e.g., decision tree, neural network) that learns from the data. - -Loss Function: Measures how far the model's predictions are from the true labels (e.g., mean squared error for regression, cross-entropy for classification). - -Optimization: Adjusting the model’s parameters (weights) to minimize the loss (e.g., using gradient descent). +- Key Components: + - Training Data: A dataset of input-output pairs (e.g., emails labeled as "spam" or "not spam"). + - Model: The algorithm (e.g., decision tree, neural network) that learns from the data. + - Loss Function: Measures how far the model's predictions are from the true labels (e.g., mean squared error for regression, cross-entropy for classification). + - Optimization: Adjusting the model’s parameters (weights) to minimize the loss (e.g., using gradient descent). ## Types of Supervised Learning @@ -37,9 +34,7 @@ Optimization: Adjusting the model’s parameters (weights) to minimize the loss In Classification, the algorithm learns from labeled training data, where each input is associated with a specific class, and then uses this knowledge to classify new, unseen data. -- Examples: - - - Spam detection, Handwritten Digit Recognition, Image Classification, Medical Diagnosis. +- Examples: Spam detection, Handwritten Digit Recognition, Image Classification, Medical Diagnosis. 
- Key Components: @@ -59,33 +54,25 @@ In Classification, the algorithm learns from labeled training data, where each i ### Regression Regression, in the realm of machine learning and statistics, is a supervised learning task focused on predicting a continuous numerical output. - Unlike classification, which assigns data points to categories, regression aims to estimate a value within a range. +- Examples: House Price Prediction, Stock Price Prediction, Temperature Forecasting, Sales Forecasting. + - Key Components: + - Features (Input Variables): These are the independent variables used to make predictions. They can be numerical or categorical, but they ultimately influence the numerical output. - Target Variable (Output Variable/Dependent Variable): This is the continuous numerical value we want to predict. - Training Data: A dataset containing examples of input features and their corresponding target values. The model learns from these examples. - Model: The learned function that maps the input features to the target variable. -Types of Regression: - -Linear Regression: Assumes a linear relationship between the features and the target variable. It tries to find the best-fitting straight line (in simple linear regression with one feature) or hyperplane (in multiple linear regression with multiple features) that minimizes the difference between the predicted and actual values.   -Polynomial Regression: Used when the relationship between the features and the target variable is non-linear. It fits a polynomial curve to the data.   -Multiple Linear Regression: Used when there are multiple input features influencing the target variable.   -Support Vector Regression (SVR): Uses the principles of Support Vector Machines to perform regression. It tries to find a hyperplane that best fits the data within a certain margin of error.   -Decision Tree Regression: Uses a tree-like structure to make predictions. Each node in the tree represents a decision based on a feature, and the leaves represent the predicted values.   -Random Forest Regression: An ensemble method that combines multiple decision trees to improve prediction accuracy and reduce overfitting.   -Neural Network Regression: Uses neural networks to learn complex non-linear relationships between features and the target variable. - -Examples: House Price Prediction, Stock Price Prediction, Temperature Forecasting, Sales Forecasting. +- Types of Regression: -Common Classification Algorithms: Linear Regression, Polynomial Regression, Support Vector Regression (SVR), Decision Tree Regression, Random Forest Regression, Neural Network Regression. -// + - Linear Regression: Models a linear relationship between inputs and a target variable by finding the line of best fit that minimizes the sum of squared errors. + - Polynomial Regression: Used when the relationship between the features and the target variable is non-linear. It fits a polynomial curve to the data. + - Multiple Linear Regression: Used when there are multiple input features influencing the target variable. + - Support Vector Regression (SVR): Uses SVM principles to find the best-fitting hyperplane within a margin of error. + - Decision Tree Regression: Uses a tree structure where nodes represent feature-based decisions, and leaves represent predicted values. + - Random Forest Regression: An ensemble method that combines multiple decision trees to improve prediction accuracy and reduce overfitting. 
+ - Neural Network Regression: Uses neural networks to learn complex non-linear relationships between features and the target variable. -- Clustering: Recognize patterns and structures in unlabeled data by grouping them into clusters. - - K-Means: Categorizes data points into clusters based on their proximity to cluster centroids. - - Hierarchical Agglomerative Clustering: Groups data points into clusters based on various measures of similarity, such as the smallest average distance between all points, minimal variance between data points, or smallest maximum distance between data points. -- Dimensionality Reduction: Scale down the dimensions in the dataset from a high-dimensional space into a low-dimensional space while maintaining the maximum amount of relevant information. - - Principal Component Analysis (PCA): Reduces the dimensionality of a dataset to the 'n' number of principal dimensions that contain the most valuable information. - // +- Common Classification Algorithms: Linear Regression, Polynomial Regression, Support Vector Regression (SVR), Decision Tree Regression, Random Forest Regression, Neural Network Regression. From 03f55a0607308ae696b301cbb13c622c3992dcc3 Mon Sep 17 00:00:00 2001 From: karishma-battina <67629745+karishma-battina@users.noreply.github.com> Date: Sun, 26 Jan 2025 15:18:18 +0000 Subject: [PATCH 06/15] Improve readability --- .../supervised-learning.md | 63 ++++++++++--------- 1 file changed, 32 insertions(+), 31 deletions(-) diff --git a/content/ai/concepts/supervised-learning/supervised-learning.md b/content/ai/concepts/supervised-learning/supervised-learning.md index ca804ee75e5..36a8f445818 100644 --- a/content/ai/concepts/supervised-learning/supervised-learning.md +++ b/content/ai/concepts/supervised-learning/supervised-learning.md @@ -20,13 +20,14 @@ CatalogContent: Imagine teaching a child by showing them examples with correct answers. Similarly, the algorithm learns patterns from these examples and uses them to make predictions on new, unseen data. -Examples: Identifying Handwritten Digits, predicting the prices of cars, spam emails detection. +**Examples:** Identifying Handwritten Digits, predicting the prices of cars, spam emails detection. -- Key Components: - - Training Data: A dataset of input-output pairs (e.g., emails labeled as "spam" or "not spam"). - - Model: The algorithm (e.g., decision tree, neural network) that learns from the data. - - Loss Function: Measures how far the model's predictions are from the true labels (e.g., mean squared error for regression, cross-entropy for classification). - - Optimization: Adjusting the model’s parameters (weights) to minimize the loss (e.g., using gradient descent). +**Key Components:** + +- Training Data: A dataset of input-output pairs (e.g., emails labeled as "spam" or "not spam"). +- Model: The algorithm (e.g., decision tree, neural network) that learns from the data. +- Loss Function: Measures how far the model's predictions are from the true labels (e.g., mean squared error for regression, cross-entropy for classification). +- Optimization: Adjusting the model’s parameters (weights) to minimize the loss (e.g., using gradient descent). ## Types of Supervised Learning @@ -34,20 +35,20 @@ Examples: Identifying Handwritten Digits, predicting the prices of cars, spam em In Classification, the algorithm learns from labeled training data, where each input is associated with a specific class, and then uses this knowledge to classify new, unseen data. 
-- Examples: Spam detection, Handwritten Digit Recognition, Image Classification, Medical Diagnosis. +**Examples:** Spam detection, Handwritten Digit Recognition, Image Classification, Medical Diagnosis. -- Key Components: +**Key Components:** - - Features (Input Variables): These are the measurable characteristics or attributes of the data points. They can be numerical or categorical - - Labels (Output Variables/Classes/Categories): These are the predefined categories to which data points are assigned. They are discrete values, meaning they belong to a finite set. - - Training Data: A dataset where each data point is paired with its correct label. This is what the model learns from. - - Model: The algorithm or function that learns the mapping between the features and the labels. +- Features (Input Variables): These are the measurable characteristics or attributes of the data points. They can be numerical or categorical +- Labels (Output Variables/Classes/Categories): These are the predefined categories to which data points are assigned. They are discrete values, meaning they belong to a finite set. +- Training Data: A dataset where each data point is paired with its correct label. This is what the model learns from. +- Model: The algorithm or function that learns the mapping between the features and the labels. -- Types of Classification: +**Types of Classification:** - - Binary Classification: The task of classifying data points into one of two classes. - - Multi-class Classification: The task of classifying data points into one of more than two classes. - - Multi-label Classification: The task of assigning multiple labels to each data point. This is different from multi-class classification, where each data point can only belong to one class. +- Binary Classification: The task of classifying data points into one of two classes. +- Multi-class Classification: The task of classifying data points into one of more than two classes. +- Multi-label Classification: The task of assigning multiple labels to each data point. This is different from multi-class classification, where each data point can only belong to one class. - Common Classification Algorithms: Logistic Regression, Support Vector Machines (SVMs), Decision Trees, Random Forests, Naive Bayes, K-Nearest Neighbors (KNN) @@ -56,23 +57,23 @@ In Classification, the algorithm learns from labeled training data, where each i Regression, in the realm of machine learning and statistics, is a supervised learning task focused on predicting a continuous numerical output. Unlike classification, which assigns data points to categories, regression aims to estimate a value within a range. -- Examples: House Price Prediction, Stock Price Prediction, Temperature Forecasting, Sales Forecasting. +**Examples:** House Price Prediction, Stock Price Prediction, Temperature Forecasting, Sales Forecasting. -- Key Components: +**Key Components:** - - Features (Input Variables): These are the independent variables used to make predictions. They can be numerical or categorical, but they ultimately influence the numerical output. - - Target Variable (Output Variable/Dependent Variable): This is the continuous numerical value we want to predict. - - Training Data: A dataset containing examples of input features and their corresponding target values. The model learns from these examples. - - Model: The learned function that maps the input features to the target variable. +- Features (Input Variables): These are the independent variables used to make predictions. 
They can be numerical or categorical, but they ultimately influence the numerical output. +- Target Variable (Output Variable/Dependent Variable): This is the continuous numerical value we want to predict. +- Training Data: A dataset containing examples of input features and their corresponding target values. The model learns from these examples. +- Model: The learned function that maps the input features to the target variable. -- Types of Regression: +**Types of Regression:** - - Linear Regression: Models a linear relationship between inputs and a target variable by finding the line of best fit that minimizes the sum of squared errors. - - Polynomial Regression: Used when the relationship between the features and the target variable is non-linear. It fits a polynomial curve to the data. - - Multiple Linear Regression: Used when there are multiple input features influencing the target variable. - - Support Vector Regression (SVR): Uses SVM principles to find the best-fitting hyperplane within a margin of error. - - Decision Tree Regression: Uses a tree structure where nodes represent feature-based decisions, and leaves represent predicted values. - - Random Forest Regression: An ensemble method that combines multiple decision trees to improve prediction accuracy and reduce overfitting. - - Neural Network Regression: Uses neural networks to learn complex non-linear relationships between features and the target variable. +- Linear Regression: Models a linear relationship between inputs and a target variable by finding the line of best fit that minimizes the sum of squared errors. +- Polynomial Regression: Used when the relationship between the features and the target variable is non-linear. It fits a polynomial curve to the data. +- Multiple Linear Regression: Used when there are multiple input features influencing the target variable. +- Support Vector Regression (SVR): Uses SVM principles to find the best-fitting hyperplane within a margin of error. +- Decision Tree Regression: Uses a tree structure where nodes represent feature-based decisions, and leaves represent predicted values. +- Random Forest Regression: An ensemble method that combines multiple decision trees to improve prediction accuracy and reduce overfitting. +- Neural Network Regression: Uses neural networks to learn complex non-linear relationships between features and the target variable. -- Common Classification Algorithms: Linear Regression, Polynomial Regression, Support Vector Regression (SVR), Decision Tree Regression, Random Forest Regression, Neural Network Regression. +**Common Classification Algorithms:** Linear Regression, Polynomial Regression, Support Vector Regression (SVR), Decision Tree Regression, Random Forest Regression, Neural Network Regression. From 594e4bc0c1f26345a605fe02709cbd22e330632c Mon Sep 17 00:00:00 2001 From: karishma-battina <67629745+karishma-battina@users.noreply.github.com> Date: Sun, 26 Jan 2025 15:33:41 +0000 Subject: [PATCH 07/15] Format fix2 --- .../supervised-learning.md | 46 +++++++++---------- 1 file changed, 23 insertions(+), 23 deletions(-) diff --git a/content/ai/concepts/supervised-learning/supervised-learning.md b/content/ai/concepts/supervised-learning/supervised-learning.md index 36a8f445818..a514f9fd9f0 100644 --- a/content/ai/concepts/supervised-learning/supervised-learning.md +++ b/content/ai/concepts/supervised-learning/supervised-learning.md @@ -24,10 +24,10 @@ Imagine teaching a child by showing them examples with correct answers. 
Similarl **Key Components:** -- Training Data: A dataset of input-output pairs (e.g., emails labeled as "spam" or "not spam"). -- Model: The algorithm (e.g., decision tree, neural network) that learns from the data. -- Loss Function: Measures how far the model's predictions are from the true labels (e.g., mean squared error for regression, cross-entropy for classification). -- Optimization: Adjusting the model’s parameters (weights) to minimize the loss (e.g., using gradient descent). +- _Training Data:_ A dataset of input-output pairs (e.g., emails labeled as "spam" or "not spam"). +- _Model:_ The algorithm (e.g., decision tree, neural network) that learns from the data. +- _Loss Function:_ Measures how far the model's predictions are from the true labels (e.g., mean squared error for regression, cross-entropy for classification). +- _Optimization:_ Adjusting the model’s parameters (weights) to minimize the loss (e.g., using gradient descent). ## Types of Supervised Learning @@ -39,18 +39,18 @@ In Classification, the algorithm learns from labeled training data, where each i **Key Components:** -- Features (Input Variables): These are the measurable characteristics or attributes of the data points. They can be numerical or categorical -- Labels (Output Variables/Classes/Categories): These are the predefined categories to which data points are assigned. They are discrete values, meaning they belong to a finite set. -- Training Data: A dataset where each data point is paired with its correct label. This is what the model learns from. -- Model: The algorithm or function that learns the mapping between the features and the labels. +- _Features (Input Variables):_ These are the measurable characteristics or attributes of the data points. They can be numerical or categorical +- _Labels (Output Variables/Classes/Categories):_ These are the predefined categories to which data points are assigned. They are discrete values, meaning they belong to a finite set. +- _Training Data:_ A dataset where each data point is paired with its correct label. This is what the model learns from. +- _Model:_ The algorithm or function that learns the mapping between the features and the labels. **Types of Classification:** -- Binary Classification: The task of classifying data points into one of two classes. -- Multi-class Classification: The task of classifying data points into one of more than two classes. -- Multi-label Classification: The task of assigning multiple labels to each data point. This is different from multi-class classification, where each data point can only belong to one class. +- _Binary Classification:_ The task of classifying data points into one of two classes. +- _Multi-class Classification:_ The task of classifying data points into one of more than two classes. +- _Multi-label Classification:_ The task of assigning multiple labels to each data point. This is different from multi-class classification, where each data point can only belong to one class. 
-- Common Classification Algorithms: Logistic Regression, Support Vector Machines (SVMs), Decision Trees, Random Forests, Naive Bayes, K-Nearest Neighbors (KNN) +- **Common Classification Algorithms:** Logistic Regression, Support Vector Machines (SVMs), Decision Trees, Random Forests, Naive Bayes, K-Nearest Neighbors (KNN) ### Regression @@ -61,19 +61,19 @@ Unlike classification, which assigns data points to categories, regression aims **Key Components:** -- Features (Input Variables): These are the independent variables used to make predictions. They can be numerical or categorical, but they ultimately influence the numerical output. -- Target Variable (Output Variable/Dependent Variable): This is the continuous numerical value we want to predict. -- Training Data: A dataset containing examples of input features and their corresponding target values. The model learns from these examples. -- Model: The learned function that maps the input features to the target variable. +- _Features (Input Variables):_ These are the independent variables used to make predictions. They can be numerical or categorical, but they ultimately influence the numerical output. +- _Target Variable (Output Variable/Dependent Variable):_ This is the continuous numerical value we want to predict. +- _Training Data:_ A dataset containing examples of input features and their corresponding target values. The model learns from these examples. +- _Model:_ The learned function that maps the input features to the target variable. **Types of Regression:** -- Linear Regression: Models a linear relationship between inputs and a target variable by finding the line of best fit that minimizes the sum of squared errors. -- Polynomial Regression: Used when the relationship between the features and the target variable is non-linear. It fits a polynomial curve to the data. -- Multiple Linear Regression: Used when there are multiple input features influencing the target variable. -- Support Vector Regression (SVR): Uses SVM principles to find the best-fitting hyperplane within a margin of error. -- Decision Tree Regression: Uses a tree structure where nodes represent feature-based decisions, and leaves represent predicted values. -- Random Forest Regression: An ensemble method that combines multiple decision trees to improve prediction accuracy and reduce overfitting. -- Neural Network Regression: Uses neural networks to learn complex non-linear relationships between features and the target variable. +- _Linear Regression:_ Models a linear relationship between inputs and a target variable by finding the line of best fit that minimizes the sum of squared errors. +- _Polynomial Regression:_ Used when the relationship between the features and the target variable is non-linear. It fits a polynomial curve to the data. +- _Multiple Linear Regression:_ Used when there are multiple input features influencing the target variable. +- _Support Vector Regression (SVR):_ Uses SVM principles to find the best-fitting hyperplane within a margin of error. +- _Decision Tree Regression:_ Uses a tree structure where nodes represent feature-based decisions, and leaves represent predicted values. +- _Random Forest Regression:_ An ensemble method that combines multiple decision trees to improve prediction accuracy and reduce overfitting. +- _Neural Network Regression:_ Uses neural networks to learn complex non-linear relationships between features and the target variable. 
**Common Classification Algorithms:** Linear Regression, Polynomial Regression, Support Vector Regression (SVR), Decision Tree Regression, Random Forest Regression, Neural Network Regression. From d284327c035ef8aa51654a8ca706b81d94f570fe Mon Sep 17 00:00:00 2001 From: karishma-battina <67629745+karishma-battina@users.noreply.github.com> Date: Sun, 26 Jan 2025 16:03:29 +0000 Subject: [PATCH 08/15] Format fix3 --- .../supervised-learning.md | 44 +++++++++---------- 1 file changed, 22 insertions(+), 22 deletions(-) diff --git a/content/ai/concepts/supervised-learning/supervised-learning.md b/content/ai/concepts/supervised-learning/supervised-learning.md index a514f9fd9f0..19aa14d3323 100644 --- a/content/ai/concepts/supervised-learning/supervised-learning.md +++ b/content/ai/concepts/supervised-learning/supervised-learning.md @@ -24,10 +24,10 @@ Imagine teaching a child by showing them examples with correct answers. Similarl **Key Components:** -- _Training Data:_ A dataset of input-output pairs (e.g., emails labeled as "spam" or "not spam"). -- _Model:_ The algorithm (e.g., decision tree, neural network) that learns from the data. -- _Loss Function:_ Measures how far the model's predictions are from the true labels (e.g., mean squared error for regression, cross-entropy for classification). -- _Optimization:_ Adjusting the model’s parameters (weights) to minimize the loss (e.g., using gradient descent). +- **_Training Data:_** A dataset of input-output pairs (e.g., emails labeled as "spam" or "not spam"). +- **_Model:_** The algorithm (e.g., decision tree, neural network) that learns from the data. +- **_Loss Function:_** Measures how far the model's predictions are from the true labels (e.g., mean squared error for regression, cross-entropy for classification). +- **_Optimization:_** Adjusting the model’s parameters (weights) to minimize the loss (e.g., using gradient descent). ## Types of Supervised Learning @@ -39,16 +39,16 @@ In Classification, the algorithm learns from labeled training data, where each i **Key Components:** -- _Features (Input Variables):_ These are the measurable characteristics or attributes of the data points. They can be numerical or categorical -- _Labels (Output Variables/Classes/Categories):_ These are the predefined categories to which data points are assigned. They are discrete values, meaning they belong to a finite set. -- _Training Data:_ A dataset where each data point is paired with its correct label. This is what the model learns from. -- _Model:_ The algorithm or function that learns the mapping between the features and the labels. +- **_Features (Input Variables):_** These are the measurable characteristics or attributes of the data points. They can be numerical or categorical +- **_Labels (Output Variables/Classes/Categories):_** These are the predefined categories to which data points are assigned. They are discrete values, meaning they belong to a finite set. +- **_Training Data:_** A dataset where each data point is paired with its correct label. This is what the model learns from. +- **_Model:_** The algorithm or function that learns the mapping between the features and the labels. **Types of Classification:** -- _Binary Classification:_ The task of classifying data points into one of two classes. -- _Multi-class Classification:_ The task of classifying data points into one of more than two classes. -- _Multi-label Classification:_ The task of assigning multiple labels to each data point. 
This is different from multi-class classification, where each data point can only belong to one class. +- **_Binary Classification:_** The task of classifying data points into one of two classes. +- **_Multi-class Classification:_** The task of classifying data points into one of more than two classes. +- **_Multi-label Classification:_** The task of assigning multiple labels to each data point. This is different from multi-class classification, where each data point can only belong to one class. - **Common Classification Algorithms:** Logistic Regression, Support Vector Machines (SVMs), Decision Trees, Random Forests, Naive Bayes, K-Nearest Neighbors (KNN) @@ -61,19 +61,19 @@ Unlike classification, which assigns data points to categories, regression aims **Key Components:** -- _Features (Input Variables):_ These are the independent variables used to make predictions. They can be numerical or categorical, but they ultimately influence the numerical output. -- _Target Variable (Output Variable/Dependent Variable):_ This is the continuous numerical value we want to predict. -- _Training Data:_ A dataset containing examples of input features and their corresponding target values. The model learns from these examples. -- _Model:_ The learned function that maps the input features to the target variable. +- **_Features (Input Variables):_** These are the independent variables used to make predictions. They can be numerical or categorical, but they ultimately influence the numerical output. +- **_Target Variable (Output Variable/Dependent Variable):_** This is the continuous numerical value we want to predict. +- **_Training Data:_** A dataset containing examples of input features and their corresponding target values. The model learns from these examples. +- **_Model:_** The learned function that maps the input features to the target variable. **Types of Regression:** -- _Linear Regression:_ Models a linear relationship between inputs and a target variable by finding the line of best fit that minimizes the sum of squared errors. -- _Polynomial Regression:_ Used when the relationship between the features and the target variable is non-linear. It fits a polynomial curve to the data. -- _Multiple Linear Regression:_ Used when there are multiple input features influencing the target variable. -- _Support Vector Regression (SVR):_ Uses SVM principles to find the best-fitting hyperplane within a margin of error. -- _Decision Tree Regression:_ Uses a tree structure where nodes represent feature-based decisions, and leaves represent predicted values. -- _Random Forest Regression:_ An ensemble method that combines multiple decision trees to improve prediction accuracy and reduce overfitting. -- _Neural Network Regression:_ Uses neural networks to learn complex non-linear relationships between features and the target variable. +- **_Linear Regression:_** Models a linear relationship between inputs and a target variable by finding the line of best fit that minimizes the sum of squared errors. +- **_Polynomial Regression:_** Used when the relationship between the features and the target variable is non-linear. It fits a polynomial curve to the data. +- **_Multiple Linear Regression:_** Used when there are multiple input features influencing the target variable. +- **_Support Vector Regression (SVR):_** Uses SVM principles to find the best-fitting hyperplane within a margin of error. 
+- **_Decision Tree Regression:_** Uses a tree structure where nodes represent feature-based decisions, and leaves represent predicted values. +- **_Random Forest Regression:_** An ensemble method that combines multiple decision trees to improve prediction accuracy and reduce overfitting. +- **_Neural Network Regression:_** Uses neural networks to learn complex non-linear relationships between features and the target variable. **Common Classification Algorithms:** Linear Regression, Polynomial Regression, Support Vector Regression (SVR), Decision Tree Regression, Random Forest Regression, Neural Network Regression. From ab856aca08b25f970e3c04782563e0934917da73 Mon Sep 17 00:00:00 2001 From: karishma-battina <67629745+karishma-battina@users.noreply.github.com> Date: Sun, 26 Jan 2025 16:07:15 +0000 Subject: [PATCH 09/15] Fix typo --- .../ai/concepts/supervised-learning/supervised-learning.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/ai/concepts/supervised-learning/supervised-learning.md b/content/ai/concepts/supervised-learning/supervised-learning.md index 19aa14d3323..99d82b236a7 100644 --- a/content/ai/concepts/supervised-learning/supervised-learning.md +++ b/content/ai/concepts/supervised-learning/supervised-learning.md @@ -20,7 +20,7 @@ CatalogContent: Imagine teaching a child by showing them examples with correct answers. Similarly, the algorithm learns patterns from these examples and uses them to make predictions on new, unseen data. -**Examples:** Identifying Handwritten Digits, predicting the prices of cars, spam emails detection. +**Examples:** Identifying Handwritten Digits, Predicting the prices of cars, Spam emails detection. **Key Components:** @@ -35,7 +35,7 @@ Imagine teaching a child by showing them examples with correct answers. Similarl In Classification, the algorithm learns from labeled training data, where each input is associated with a specific class, and then uses this knowledge to classify new, unseen data. -**Examples:** Spam detection, Handwritten Digit Recognition, Image Classification, Medical Diagnosis. +**Examples:** Spam Detection, Handwritten Digit Recognition, Image Classification, Medical Diagnosis. **Key Components:** @@ -50,7 +50,7 @@ In Classification, the algorithm learns from labeled training data, where each i - **_Multi-class Classification:_** The task of classifying data points into one of more than two classes. - **_Multi-label Classification:_** The task of assigning multiple labels to each data point. This is different from multi-class classification, where each data point can only belong to one class. 
-- **Common Classification Algorithms:** Logistic Regression, Support Vector Machines (SVMs), Decision Trees, Random Forests, Naive Bayes, K-Nearest Neighbors (KNN) +**Common Classification Algorithms:** Logistic Regression, Support Vector Machines (SVMs), Decision Trees, Random Forests, Naive Bayes, K-Nearest Neighbors (KNN) ### Regression From 3fa2e62a57133c4552805b701d7104047e1b7a8d Mon Sep 17 00:00:00 2001 From: karishma-battina <67629745+karishma-battina@users.noreply.github.com> Date: Sun, 26 Jan 2025 16:15:31 +0000 Subject: [PATCH 10/15] Remove duplicates --- .../supervised-learning/supervised-learning.md | 14 -------------- 1 file changed, 14 deletions(-) diff --git a/content/ai/concepts/supervised-learning/supervised-learning.md b/content/ai/concepts/supervised-learning/supervised-learning.md index 99d82b236a7..79c8894c0f0 100644 --- a/content/ai/concepts/supervised-learning/supervised-learning.md +++ b/content/ai/concepts/supervised-learning/supervised-learning.md @@ -37,13 +37,6 @@ In Classification, the algorithm learns from labeled training data, where each i **Examples:** Spam Detection, Handwritten Digit Recognition, Image Classification, Medical Diagnosis. -**Key Components:** - -- **_Features (Input Variables):_** These are the measurable characteristics or attributes of the data points. They can be numerical or categorical -- **_Labels (Output Variables/Classes/Categories):_** These are the predefined categories to which data points are assigned. They are discrete values, meaning they belong to a finite set. -- **_Training Data:_** A dataset where each data point is paired with its correct label. This is what the model learns from. -- **_Model:_** The algorithm or function that learns the mapping between the features and the labels. - **Types of Classification:** - **_Binary Classification:_** The task of classifying data points into one of two classes. @@ -59,13 +52,6 @@ Unlike classification, which assigns data points to categories, regression aims **Examples:** House Price Prediction, Stock Price Prediction, Temperature Forecasting, Sales Forecasting. -**Key Components:** - -- **_Features (Input Variables):_** These are the independent variables used to make predictions. They can be numerical or categorical, but they ultimately influence the numerical output. -- **_Target Variable (Output Variable/Dependent Variable):_** This is the continuous numerical value we want to predict. -- **_Training Data:_** A dataset containing examples of input features and their corresponding target values. The model learns from these examples. -- **_Model:_** The learned function that maps the input features to the target variable. - **Types of Regression:** - **_Linear Regression:_** Models a linear relationship between inputs and a target variable by finding the line of best fit that minimizes the sum of squared errors. 
From 999330b1646fd9dbb2a63aee1cd4bbc7ace34ee2 Mon Sep 17 00:00:00 2001 From: karishma-battina <67629745+karishma-battina@users.noreply.github.com> Date: Sun, 26 Jan 2025 16:34:56 +0000 Subject: [PATCH 11/15] Remove whitespaces From c0457037b039594d21806a7110a792f677204ce7 Mon Sep 17 00:00:00 2001 From: karishma-battina <67629745+karishma-battina@users.noreply.github.com> Date: Sun, 26 Jan 2025 16:36:51 +0000 Subject: [PATCH 12/15] Simplify the line --- content/ai/concepts/supervised-learning/supervised-learning.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/content/ai/concepts/supervised-learning/supervised-learning.md b/content/ai/concepts/supervised-learning/supervised-learning.md index 79c8894c0f0..21e842f306a 100644 --- a/content/ai/concepts/supervised-learning/supervised-learning.md +++ b/content/ai/concepts/supervised-learning/supervised-learning.md @@ -47,8 +47,7 @@ In Classification, the algorithm learns from labeled training data, where each i ### Regression -Regression, in the realm of machine learning and statistics, is a supervised learning task focused on predicting a continuous numerical output. -Unlike classification, which assigns data points to categories, regression aims to estimate a value within a range. +Regression is a supervised learning task focused on predicting a continuous numerical output. Unlike classification, which assigns data points to categories, regression aims to estimate a value within a range. **Examples:** House Price Prediction, Stock Price Prediction, Temperature Forecasting, Sales Forecasting. From 676d8b9e0d0c43648934afbca102c923afd9df57 Mon Sep 17 00:00:00 2001 From: Karishma Battina <67629745+karishma-battina@users.noreply.github.com> Date: Mon, 27 Jan 2025 17:17:08 +0000 Subject: [PATCH 13/15] Modify as per comments --- .../supervised-learning.md | 20 +++++++++---------- 1 file changed, 9 insertions(+), 11 deletions(-) diff --git a/content/ai/concepts/supervised-learning/supervised-learning.md b/content/ai/concepts/supervised-learning/supervised-learning.md index 21e842f306a..333084dd874 100644 --- a/content/ai/concepts/supervised-learning/supervised-learning.md +++ b/content/ai/concepts/supervised-learning/supervised-learning.md @@ -16,13 +16,11 @@ CatalogContent: - 'paths/data-science' --- -**Supervised learning (ML)** is a type of machine learning where the algorithm learns from labeled data. It involves training a model on input-output pairs to generalize and predict outcomes for new, unseen data. This label acts as a "supervisor," guiding the learning process. - -Imagine teaching a child by showing them examples with correct answers. Similarly, the algorithm learns patterns from these examples and uses them to make predictions on new, unseen data. +**Supervised learning (ML)** is a type of machine learning in which the algorithm learns from labeled data. It involves training a model using input-output pairs so that it can generalize and make predictions for new, unseen data. The labels serve as "supervisors," guiding the learning process. **Examples:** Identifying Handwritten Digits, Predicting the prices of cars, Spam emails detection. -**Key Components:** +## Key Components - **_Training Data:_** A dataset of input-output pairs (e.g., emails labeled as "spam" or "not spam"). - **_Model:_** The algorithm (e.g., decision tree, neural network) that learns from the data. 
@@ -37,7 +35,7 @@ In Classification, the algorithm learns from labeled training data, where each i **Examples:** Spam Detection, Handwritten Digit Recognition, Image Classification, Medical Diagnosis. -**Types of Classification:** +## Types of Classification - **_Binary Classification:_** The task of classifying data points into one of two classes. - **_Multi-class Classification:_** The task of classifying data points into one of more than two classes. @@ -51,14 +49,14 @@ Regression is a supervised learning task focused on predicting a continuous nume **Examples:** House Price Prediction, Stock Price Prediction, Temperature Forecasting, Sales Forecasting. -**Types of Regression:** +## Types of Regression -- **_Linear Regression:_** Models a linear relationship between inputs and a target variable by finding the line of best fit that minimizes the sum of squared errors. +- **_[Linear Regression:](https://www.codecademy.com/learn/linear-regression-mssp)_** Models a linear relationship between inputs and a target variable by finding the line of best fit that minimizes the sum of squared errors. - **_Polynomial Regression:_** Used when the relationship between the features and the target variable is non-linear. It fits a polynomial curve to the data. -- **_Multiple Linear Regression:_** Used when there are multiple input features influencing the target variable. -- **_Support Vector Regression (SVR):_** Uses SVM principles to find the best-fitting hyperplane within a margin of error. -- **_Decision Tree Regression:_** Uses a tree structure where nodes represent feature-based decisions, and leaves represent predicted values. -- **_Random Forest Regression:_** An ensemble method that combines multiple decision trees to improve prediction accuracy and reduce overfitting. +- **_[Multiple Linear Regression:](https://www.codecademy.com/learn/multiple-linear-regression-course)_** Used when there are multiple input features influencing the target variable. +- **_[Support Vector Regression (SVR):](https://www.codecademy.com/resources/docs/sklearn/support-vector-machines)_** Uses SVM principles to find the best-fitting hyperplane within a margin of error. +- **_[Decision Tree Regression:](https://www.codecademy.com/article/mlfun-decision-trees-article)_** Uses a tree structure where nodes represent feature-based decisions, and leaves represent predicted values. +- **_[Random Forest Regression:](https://www.codecademy.com/learn/machine-learning-random-forests-decision-trees)_** An ensemble method that combines multiple decision trees to improve prediction accuracy and reduce overfitting. - **_Neural Network Regression:_** Uses neural networks to learn complex non-linear relationships between features and the target variable. **Common Classification Algorithms:** Linear Regression, Polynomial Regression, Support Vector Regression (SVR), Decision Tree Regression, Random Forest Regression, Neural Network Regression. 
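A short sketch of multi-class classification with one of the algorithms listed above (K-Nearest Neighbors); it assumes scikit-learn is installed, and the measurements and breed names are invented for illustration.

```py
# Multi-class classification with K-Nearest Neighbors.
# Assumes scikit-learn is installed; measurements and class names are invented for illustration.
from sklearn.neighbors import KNeighborsClassifier

X_train = [[25, 4], [27, 5], [50, 20], [55, 22], [80, 40], [85, 45]]   # [height_cm, weight_kg]
y_train = ["terrier", "terrier", "bulldog", "bulldog", "mastiff", "mastiff"]

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

print(knn.predict([[52, 21]]))     # the three nearest neighbors vote on the class
```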
From a87384dea326c3dc781fb171ccc314117079a896 Mon Sep 17 00:00:00 2001 From: Karishma Battina <67629745+karishma-battina@users.noreply.github.com> Date: Mon, 27 Jan 2025 17:23:00 +0000 Subject: [PATCH 14/15] Improvise headings --- .../supervised-learning/supervised-learning.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/content/ai/concepts/supervised-learning/supervised-learning.md b/content/ai/concepts/supervised-learning/supervised-learning.md index 333084dd874..c52d57da1a2 100644 --- a/content/ai/concepts/supervised-learning/supervised-learning.md +++ b/content/ai/concepts/supervised-learning/supervised-learning.md @@ -20,7 +20,7 @@ CatalogContent: **Examples:** Identifying Handwritten Digits, Predicting the prices of cars, Spam emails detection. -## Key Components +### Key Components - **_Training Data:_** A dataset of input-output pairs (e.g., emails labeled as "spam" or "not spam"). - **_Model:_** The algorithm (e.g., decision tree, neural network) that learns from the data. @@ -29,13 +29,13 @@ CatalogContent: ## Types of Supervised Learning -### Classification +## Classification In Classification, the algorithm learns from labeled training data, where each input is associated with a specific class, and then uses this knowledge to classify new, unseen data. **Examples:** Spam Detection, Handwritten Digit Recognition, Image Classification, Medical Diagnosis. -## Types of Classification +### Types of Classification - **_Binary Classification:_** The task of classifying data points into one of two classes. - **_Multi-class Classification:_** The task of classifying data points into one of more than two classes. @@ -43,13 +43,13 @@ In Classification, the algorithm learns from labeled training data, where each i **Common Classification Algorithms:** Logistic Regression, Support Vector Machines (SVMs), Decision Trees, Random Forests, Naive Bayes, K-Nearest Neighbors (KNN) -### Regression +## Regression Regression is a supervised learning task focused on predicting a continuous numerical output. Unlike classification, which assigns data points to categories, regression aims to estimate a value within a range. **Examples:** House Price Prediction, Stock Price Prediction, Temperature Forecasting, Sales Forecasting. -## Types of Regression +### Types of Regression - **_[Linear Regression:](https://www.codecademy.com/learn/linear-regression-mssp)_** Models a linear relationship between inputs and a target variable by finding the line of best fit that minimizes the sum of squared errors. - **_Polynomial Regression:_** Used when the relationship between the features and the target variable is non-linear. It fits a polynomial curve to the data. From b438634edcfc994398136ca0c863f6575746305e Mon Sep 17 00:00:00 2001 From: Mamta Wardhani Date: Tue, 4 Feb 2025 12:44:01 +0530 Subject: [PATCH 15/15] Update supervised-learning.md minor fixes --- .../supervised-learning.md | 55 +++++++++---------- 1 file changed, 27 insertions(+), 28 deletions(-) diff --git a/content/ai/concepts/supervised-learning/supervised-learning.md b/content/ai/concepts/supervised-learning/supervised-learning.md index c52d57da1a2..8d051eb22a7 100644 --- a/content/ai/concepts/supervised-learning/supervised-learning.md +++ b/content/ai/concepts/supervised-learning/supervised-learning.md @@ -2,61 +2,60 @@ Title: 'Supervised Learning' Description: 'Supervised learning is a machine learning technique where algorithms learn from labeled data to make predictions.' 
Subjects:
-  - 'Machine Learning'
+  - 'AI'
   - 'Data Science'
-  - 'Artificial Intelligence'
+  - 'Machine Learning'
 Tags:
   - 'AI'
   - 'Deep Learning'
   - 'Classification'
   - 'Regression'
-  - 'Predictive Modeling'
 CatalogContent:
-  - 'machine-learning'
-  - 'paths/data-science'
+  - 'learn-python-3'
+  - 'paths/computer-science'
 ---

-**Supervised learning (ML)** is a type of machine learning in which the algorithm learns from labeled data. It involves training a model using input-output pairs so that it can generalize and make predictions for new, unseen data. The labels serve as "supervisors," guiding the learning process.
+**Supervised learning** is a type of machine learning where an algorithm learns from labeled data. It involves training a model using input-output pairs so it can generalize and make accurate predictions for new, unseen data. The labeled outputs act as a guide, helping the model learn the correct relationships.

-**Examples:** Identifying Handwritten Digits, Predicting the prices of cars, Spam emails detection.
+**Examples:** Identifying handwritten digits, predicting car prices based on features, detecting spam emails based on content and metadata.

### Key Components

-- **_Training Data:_** A dataset of input-output pairs (e.g., emails labeled as "spam" or "not spam").
-- **_Model:_** The algorithm (e.g., decision tree, neural network) that learns from the data.
-- **_Loss Function:_** Measures how far the model's predictions are from the true labels (e.g., mean squared error for regression, cross-entropy for classification).
-- **_Optimization:_** Adjusting the model’s parameters (weights) to minimize the loss (e.g., using gradient descent).
+- **Training Data:** A dataset containing input-output pairs (e.g., images labeled with digits or emails marked as spam/not spam).
+- **Model:** A machine learning algorithm (e.g., decision trees, neural networks) that learns patterns from the data.
+- **Loss Function:** A metric that measures how well the model’s predictions match the actual labels (e.g., Mean Squared Error for regression, Cross-Entropy Loss for classification).
+- **Optimization:** A process of adjusting model parameters to minimize the loss and improve accuracy, often using gradient descent or other optimization techniques.

## Types of Supervised Learning

### Classification

Classification involves training an algorithm on labeled data, where each input is associated with a specific category. The model then classifies new, unseen data based on learned patterns.

**Examples:** Spam detection, handwritten digit recognition, image classification, medical diagnosis.

#### Types of Classification

- **Binary Classification:** The task of classifying data points into one of two classes.
+- **Multi-class Classification:** The task of classifying data points into one of more than two classes.
- **Multi-label Classification:** The task of assigning multiple labels to each data point. This is different from multi-class classification, where each data point can only belong to one class.

**Common Classification Algorithms:** Logistic Regression, Support Vector Machines (SVMs), Decision Trees, Random Forests, Naive Bayes, K-Nearest Neighbors (KNN)

### Regression

Regression is a supervised learning task focused on predicting a continuous numerical output. Unlike classification, which assigns data points to categories, regression aims to estimate a value within a range.

**Examples:** House price prediction, stock price prediction, temperature forecasting, sales forecasting.

#### Types of Regression

- **[Linear Regression](https://www.codecademy.com/learn/linear-regression-mssp):** Models a linear relationship between inputs and a target variable by finding the line of best fit that minimizes the sum of squared errors.
- **Polynomial Regression:** Captures non-linear relationships by fitting a polynomial curve to the data.
- **[Multiple Linear Regression](https://www.codecademy.com/learn/multiple-linear-regression-course):** Used when there are multiple input features influencing the target variable.
- **[Support Vector Regression (SVR)](https://www.codecademy.com/resources/docs/sklearn/support-vector-machines):** Uses SVM principles to find the best-fitting hyperplane within a margin of error.
- **[Decision Tree Regression](https://www.codecademy.com/article/mlfun-decision-trees-article):** Uses a tree structure where nodes represent feature-based decisions, and leaves represent predicted values.
- **[Random Forest Regression](https://www.codecademy.com/learn/machine-learning-random-forests-decision-trees):** An ensemble method that combines multiple decision trees to improve prediction accuracy and reduce overfitting.
+- **Neural Network Regression:** Uses neural networks to learn complex non-linear relationships between features and the target variable.

**Common Regression Algorithms:** Linear Regression, Polynomial Regression, Support Vector Regression (SVR), Decision Tree Regression, Random Forest Regression, Neural Network Regression.
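As a closing sketch of ensemble regression, the following fits a random forest on a handful of points; it assumes scikit-learn is installed, and the car features and prices are invented for the example.

```py
# Random forest regression: an ensemble of decision trees whose predictions are averaged.
# Assumes scikit-learn is installed; the car features and prices are invented for illustration.
from sklearn.ensemble import RandomForestRegressor

X_train = [[1, 10], [3, 40], [5, 70], [8, 120], [10, 160]]   # [age_years, mileage_thousands]
y_train = [28000, 21000, 16000, 9000, 6000]                  # resale prices (continuous target)

forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)       # each tree fits a resampled subset; results are averaged

print(forest.predict([[4, 55]]))   # estimated price for an unseen car
```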