This post discusses the linear regression model, covering both simple and multiple linear regression, and its implementation in Python using the scikit-learn library. It also serves as a basis for later discussions of more advanced linear regression models such as Bayesian linear regression.

## Introduction

Linear regression is one of the most frequently used techniques in statistics and machine learning. It fits the straight line (or, with several features, the hyperplane) between the feature variables \(X\) and the label variable \(y\) that best fits the dataset. In mathematical terms, it can be expressed as

\[ y=X\beta+\epsilon \tag{1.1} \]

where \(\beta\) is the parameter vector, which contains the constant intercept term and the coefficient (exposure) on each feature variable \(x\in X\), and \(\epsilon\) is the error term.

Ordinary Least Squares (OLS) provides a closed-form estimate of the coefficient vector \(\beta\), known as the normal equation:

\[ \hat{\beta} = (X^TX)^{-1}X^Ty \tag{1.2} \]

When the errors \(\epsilon\) are assumed to be independent and normally distributed with constant variance, this OLS estimate is also the Maximum Likelihood Estimate (MLE).
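As a brief sketch of why the two coincide, assume the errors are i.i.d. Gaussian with variance \(\sigma^2\) and that there are \(n\) observations (both symbols are introduced here for illustration). The log-likelihood is then

\[ \log L(\beta,\sigma^2) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}(y-X\beta)^T(y-X\beta) \]

Only the last term depends on \(\beta\), so maximizing the likelihood over \(\beta\) is the same as minimizing the sum of squared residuals \((y-X\beta)^T(y-X\beta)\). Setting its gradient to zero gives

\[ X^T(y-X\hat{\beta}) = 0 \quad\Longrightarrow\quad X^TX\hat{\beta} = X^Ty \]

which yields (1.2) whenever \(X^TX\) is invertible.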

If you have difficulty viewing the formulas, right-click on one and select Math Settings > Math Renderer to switch to another format.

There is plenty of material on this topic in textbooks and online, so I won't spell out more formulas. Let's look at an illustration in Python code.

## Simple Illustration

First, let's generate a sample dataset and then solve for the coefficients via the normal equation.

    import numpy as np

The coefficient and intercept used to generate the dataset are 2.0 and 1.0. When we back them out from the dataset, we get 2.0086851 and 0.6565181, respectively. Both true values are within the confidence intervals given by the regression statistics.
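The procedure described above can be sketched as follows. This is a minimal illustration, not the original code: the random seed, sample size, range of \(x\), and noise level are all assumptions, so the estimates it prints will differ from the numbers quoted above.

```python
import numpy as np

# Assumed setup: the seed, sample size, x-range and noise level are
# illustrative choices, not the values used for the numbers quoted in the post.
rng = np.random.default_rng(42)
n_samples = 100
true_slope, true_intercept = 2.0, 1.0

x = rng.uniform(0.0, 10.0, size=n_samples)
noise = rng.normal(0.0, 3.0, size=n_samples)
y = true_intercept + true_slope * x + noise

# Design matrix with a leading column of ones for the intercept term.
X = np.column_stack([np.ones(n_samples), x])

# Solve the normal equation X^T X beta = X^T y.
# np.linalg.solve avoids forming the explicit inverse that appears in (1.2).
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
intercept_hat, slope_hat = beta_hat

print(f"intercept: {intercept_hat:.4f}, slope: {slope_hat:.4f}")
```

Solving the linear system rather than inverting \(X^TX\) gives the same estimate as (1.2) but is usually the more numerically stable choice.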

Nevertheless, there is no need to solve for the coefficients by hand, as the recommended way to do so in Python is through the scikit-learn (`sklearn`) package.

    from sklearn.linear_model import LinearRegression
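A minimal sketch of the scikit-learn route might look like the following. The synthetic dataset (seed, sample size, noise level) is an assumption made for illustration; only the true slope of 2.0 and intercept of 1.0 come from the post.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Assumed synthetic data: true slope 2.0 and intercept 1.0 as in the post,
# with an illustrative seed, sample size and noise level.
rng = np.random.default_rng(42)
x = rng.uniform(0.0, 10.0, size=100)
y = 1.0 + 2.0 * x + rng.normal(0.0, 3.0, size=100)

# scikit-learn expects a 2-D feature matrix of shape (n_samples, n_features).
X = x.reshape(-1, 1)

model = LinearRegression()  # fits an intercept by default
model.fit(X, y)

slope_hat = model.coef_[0]
intercept_hat = model.intercept_
r_squared = model.score(X, y)  # R^2 on the data used for fitting

print(f"slope: {slope_hat:.4f}, intercept: {intercept_hat:.4f}, R^2: {r_squared:.4f}")
```

Note that `LinearRegression` adds the intercept itself, so no column of ones is needed in `X`.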

As you can see, the true line and the fitted line are hardly distinguishable given the scatter of the sample dataset. The R-squared indicates that 79.7% of the variability in \(y\) can be explained by \(x\). Furthermore, in simple linear regression, its square root equals the absolute value of the Pearson correlation coefficient between \(x\) and \(y\); since the fitted slope is positive, this gives a positive correlation of about 0.89.
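The link between R-squared and the Pearson correlation can be checked numerically. The sketch below uses an assumed synthetic dataset (seed, sample size and noise level are illustrative), so its values will not match the 79.7% quoted above exactly; the point is that the two quantities agree.

```python
import numpy as np

# Assumed synthetic data with true slope 2.0 and intercept 1.0.
rng = np.random.default_rng(42)
x = rng.uniform(0.0, 10.0, size=100)
y = 1.0 + 2.0 * x + rng.normal(0.0, 3.0, size=100)

# Least-squares fit of a straight line (degree-1 polynomial).
slope, intercept = np.polyfit(x, y, deg=1)
y_hat = intercept + slope * x

# R^2 = 1 - (residual sum of squares) / (total sum of squares).
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

# Pearson correlation coefficient between x and y.
pearson_r = np.corrcoef(x, y)[0, 1]

print(f"R^2: {r_squared:.4f}, r^2: {pearson_r ** 2:.4f}, r: {pearson_r:.4f}")
```

This identity holds for simple linear regression with an intercept; with several features, R-squared instead equals the squared correlation between \(y\) and the fitted values.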

This has been a quick introduction to linear regression, which serves as the starting point for more advanced topics in machine learning. Traditional topics such as multicollinearity, stepwise regression, generalized linear models, hierarchical linear models, and regularization methods such as lasso and ridge regression are not in my short-term plan. In the next post we'll look at Bayesian linear regression.

**DISCLAIMER: This post is for the purpose of research and backtest only. The author doesn't promise any future profits and doesn't take responsibility for any trading losses.**