Özhan Atalı

Simple Linear Regression (SLR)

04 Nov 2024 - İstanbul


The linear regression model is a particular type of supervised learning model.

In statistics, simple linear regression (SLR) is a linear regression model with a single explanatory variable.

Linear regression is one example of a regression model, but there are other models for addressing regression problems too. The other most common type of supervised learning model, in contrast with regression, is the classification model. In classification there are only a small number of possible outputs, whereas in regression there are infinitely many possible numbers the model could output.

Terminology

x (feature) –> f (the model produced by the learning algorithm) –> ŷ (prediction)

Linear function / linear regression with one variable / univariate linear regression: f(x) = wx + b

Cost function / Squared error cost function / Mean Squared Error (MSE)

The cost function measures the difference between our model's predicted values and the actual values. The smaller this difference is, the closer our predictions are to reality. The formula below makes this precise:

\[J(w,b) = \frac{1}{2m} \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right)^2\]

where:

- m is the number of training examples,
- ŷ^(i) = f(x^(i)) is the model's prediction for the i-th training example,
- y^(i) is the actual target value of the i-th training example.
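
As a minimal sketch, here is how this cost could be computed with NumPy; the toy dataset and function names are illustrative, not from the original post:

```python
import numpy as np

def predict(x, w, b):
    """Univariate linear model f(x) = w*x + b."""
    return w * x + b

def compute_cost(x, y, w, b):
    """Squared error cost J(w, b) = (1 / 2m) * sum((y_hat - y)^2)."""
    m = len(x)
    y_hat = predict(x, w, b)
    return np.sum((y_hat - y) ** 2) / (2 * m)

# Toy data: y is exactly 2x + 1
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])
print(compute_cost(x, y, w=2.0, b=1.0))  # 0.0 -- a perfect fit
print(compute_cost(x, y, w=1.0, b=0.0))  # 6.75 -- larger cost for a worse fit
```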

Goal of linear regression

Our goal is to minimize the cost function:

\[\underset{w,b}{\text{minimize }} J(w,b)\]

Gradient descent

Gradient descent is a general algorithm that can minimize any differentiable function, no matter how many parameters it has; here we apply it to J(w,b).

Repeat until convergence (a local minimum, where the parameters w and b no longer change much with each additional step):

\[w = w - \alpha \frac{\partial}{\partial w} J(w,b)\]
\[b = b - \alpha \frac{\partial}{\partial b} J(w,b)\]

where α is the learning rate, which controls the size of each update step.

The calculations above for w and b should be performed simultaneously: compute both partial derivatives using the current values of w and b, then assign the new values together, because the updates affect each other.
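
A minimal NumPy sketch of this update loop, reusing the toy dataset from above (all names are illustrative):

```python
import numpy as np

def gradient_descent(x, y, alpha=0.01, n_iters=10000):
    """Batch gradient descent for f(x) = w*x + b with squared error cost."""
    m = len(x)
    w, b = 0.0, 0.0
    for _ in range(n_iters):
        y_hat = w * x + b
        # Partial derivatives of J(w, b), both computed from the old w and b
        dj_dw = np.sum((y_hat - y) * x) / m
        dj_db = np.sum(y_hat - y) / m
        # Simultaneous update: neither gradient sees the new parameter values
        w = w - alpha * dj_dw
        b = b - alpha * dj_db
    return w, b

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])
w, b = gradient_descent(x, y)
print(w, b)  # should approach w ≈ 2, b ≈ 1
```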

Types of Gradient descent

1. Batch gradient descent (BGD)

The term batch gradient descent refers to the fact that on every step of gradient descent, we're looking at all of the training examples, instead of just a subset of the training data. (+) stability, (+) accuracy, (-) performance

2. Stochastic Gradient Descent (SGD)

Stochastic Gradient Descent (SGD) is a simplified version of Gradient Descent (GD) that addresses some of its challenges. In SGD, the gradient is computed on a single randomly selected example from the shuffled dataset during each iteration, instead of using the entire dataset.

3. Mini-Batch Gradient Descent

The data is split into small batches, the loss is computed for each batch, and the weights are updated after every batch; see the sketch after this list.
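
As a rough sketch, the three variants differ only in how many examples each update uses. The helper below implements mini-batch updates; setting batch_size=1 reduces it to SGD, and batch_size=len(x) to batch gradient descent (the function name and data are illustrative):

```python
import numpy as np

def minibatch_gd(x, y, alpha=0.01, n_epochs=5000, batch_size=2, seed=0):
    """Mini-batch gradient descent; batch_size=1 is SGD, batch_size=len(x) is batch GD."""
    rng = np.random.default_rng(seed)
    m = len(x)
    w, b = 0.0, 0.0
    for _ in range(n_epochs):
        idx = rng.permutation(m)            # shuffle the data once per epoch
        for start in range(0, m, batch_size):
            batch = idx[start:start + batch_size]
            xb, yb = x[batch], y[batch]
            y_hat = w * xb + b
            dj_dw = np.sum((y_hat - yb) * xb) / len(batch)
            dj_db = np.sum(y_hat - yb) / len(batch)
            w -= alpha * dj_dw              # update after every mini-batch,
            b -= alpha * dj_db              # not once per pass over the data
    return w, b

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])
print(minibatch_gd(x, y))  # should approach (w ≈ 2, b ≈ 1)
```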


Choise of learning rate


Choise of learning rate
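
A quick, illustrative experiment with the same toy data (the helper name and the specific α values are assumptions, chosen to show the three regimes):

```python
import numpy as np

def cost_after(x, y, alpha, n_iters=100):
    """Run gradient descent with a given learning rate and report the final cost."""
    m = len(x)
    w, b = 0.0, 0.0
    for _ in range(n_iters):
        y_hat = w * x + b
        # simultaneous update of both parameters
        w, b = (w - alpha * np.sum((y_hat - y) * x) / m,
                b - alpha * np.sum(y_hat - y) / m)
    y_hat = w * x + b
    return np.sum((y_hat - y) ** 2) / (2 * m)

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])
print(cost_after(x, y, alpha=0.001))  # too small: cost is still high after 100 steps
print(cost_after(x, y, alpha=0.05))   # reasonable: cost drops close to zero
print(cost_after(x, y, alpha=0.5))    # too large: the cost explodes instead of shrinking
```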

Normal equation

The normal equation is an alternative to gradient descent that works only for linear regression. It solves for w and b directly in one step instead of iterating, but it becomes slow when the number of features is large.
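
Concretely, with a design matrix X whose first column is all ones (to absorb b), the closed-form solution is:

\[\begin{bmatrix} b \\ w \end{bmatrix} = (X^{\top} X)^{-1} X^{\top} y\]

A sketch in NumPy, reusing the toy data from the earlier examples:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])

X = np.column_stack([np.ones_like(x), x])  # design matrix: a column of ones, then x
b, w = np.linalg.solve(X.T @ X, X.T @ y)   # solve (X^T X) theta = X^T y directly
print(w, b)  # w = 2.0, b = 1.0 (up to floating-point error)
```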

Refs

Off topic

The following page gives a great summary of the MathJax library, which GitHub-flavored Markdown supports natively: MathJax Tutorial and Quick Ref







