![](https://crypto4nerd.com/wp-content/uploads/2024/03/1-xi53L91s1oS6Ke5Gq6Bww-1024x576.jpeg)
- It is Supervised Machine Learning Algorithm. It works on the labelled data which is X and Y where Y is Discrete/Categorical in nature.
- It is suitable for Binary Classification.
Example: Y → 0/1, Y → Head/Tail, Y → True/False.
Example 1: Given the credit card transaction data, predict the transaction is Genuine or Fraudulent.
Example 2: Given the employee performance data, predict the employee ratings as 1/2/3/4/5.
In Logistic Regression, Sigmoid Function came in picture. Sigmoid function allows to convert any real valued data to the range of 0 to 1.
It allows you to implement regression to measure the relationship between the independent variable and dependent variable in order to predict the probability of every data point belong to either of the two classes.
Equation for Logistic Regression :-
- Binary Logistic Regression → when there are two classes in the variable then it is known as Binary Logistic Regression.
- Multinomial Logistic Regression → when there are more than two classes in the variable, it is termed as Multinomial Logistic Regression.
Training Phase :-
Input → X , Y (Historical data)
Output → Best Fit Line Equation (in the Range of 0 to 1)
Testing (Validation Sets) Phase :-
Input → X_test
Output → Y on the basis of Equation and Probability Matrix
In Evaluation metrics, it will generate Confusion Matrix, Accuracy Score, Classification Report (Recall, Precision, F1-score).
- Feature Selection
- Dedicated Approach → Adjustment of Threshold (Default Threshold — 0.5) → used to improve the logistic regression model.
We always go with lower threshold because we want to improve the class 1 without affecting to class 0.
- It is suitable for Clearly Seperable data.
- It is not suitable for Noisy data.
- It is an algorithm which works on the basis of probability.
- It uses MLE (Maximum Likelyhood Estimation) & not OLS.
- It relies on less assumption compare to linear regression.
In conclusion, logistic regression is a vital statistical method for binary classification tasks, using a logistic function to model relationships between variables and probabilities of outcomes. It remains a crucial tool for predictive modeling in various domains such as healthcare, finance, and marketing.