Interpretable machine learning algorithm - Linear regression

June 21, 2017, 2:36 p.m. By: Vishakha Jha

Linear Regression

Linear Regression is one of the algorithms used to predict scalar values given some values. It is very simple and useful approach. The algorithm was developed in the field of statistics and is said to be a model for the understanding relationship between input and output numerical variable. The algorithm shows a relationship between two variable and how one variable affects other. It shows how a change in independent variable leads to change an independent variable which is also known as a predictor. We prefer this algorithm usually when we have a continuous outcome.

There are two types of linear regression- when there is a single input variable the method is known as a simple linear regression but when there is multiple input variable it often refers to multiple linear regression. The difference between simple and multiple linear regression is that later has more than one independent variable whereas simple linear regression has only one independent variable.

The algorithm is straightforward to understand and explain. The algorithm requires minimal tuning and said to work at a faster pace. It can be regularised to avoid over-fitting and can be easily updated with new data but it performs poorly when there are non-linear relations. It is also not flexible enough to capture complex pattern and adding right interacting terms can be tedious and time-consuming. It is sensitive to outliner and can terribly affect the regression line.

Linear Regression is widely used in business. It is applicable in the field of estimating sales by forecasting through monthly sale analysis. It also helps to assess the risk involved in insurance or financial domain. The algorithm is used in Data science libraries in Python to implement stats model and SciKit and in R to implement stats.

Image Source: