Random Forest – Supervised classification machine learning algorithm

June 15, 2017, 1:56 p.m. By: Vishakha Jha

Random Forest Algorithm

Random Forest is a go-to machine learning algorithm that uses the bagging approach to build many decision trees, each trained on a random subset of the data. It is considered one of the most effective algorithms for a wide range of prediction tasks, and it can be used for both classification and regression problems. It is a combination of tree predictors in which each tree depends on the values of a random vector sampled independently, with the same distribution for all trees in the forest.

The pseudocode for the random forest algorithm can be split into two stages. In the first stage, 'n' random trees are created; together they form the random forest. In the second stage, the outcomes that all the decision trees produce for the same test features are combined. The final prediction is then derived by assessing the results of each decision tree, or simply by taking the prediction that appears most often among the trees.
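The two stages can be sketched in plain Python. This is a minimal illustration rather than a full implementation: each "tree" here is a one-split stump on a single randomly chosen feature, standing in for a full decision tree, and the names `fit_stump`, `fit_forest` and `predict_forest` as well as the toy data are assumptions made for this example.

```python
import random
from collections import Counter


def fit_stump(rows, labels, rng):
    """Fit a one-split 'tree' on one randomly chosen feature."""
    feature = rng.randrange(len(rows[0]))
    values = sorted({r[feature] for r in rows})
    best = None
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [y for r, y in zip(rows, labels) if r[feature] <= threshold]
        right = [y for r, y in zip(rows, labels) if r[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0]
        errors = sum(y != left_label for y in left)
        errors += sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    if best is None:
        # The sample had a single distinct value: always predict the majority.
        majority = Counter(labels).most_common(1)[0][0]
        return (feature, float("inf"), majority, majority)
    return (feature,) + best[1:]


def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Stage 1: build n_trees trees, each on a bootstrap sample of the data."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap: draw len(rows) indices with replacement.
        idx = [rng.randrange(len(rows)) for _ in rows]
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        forest.append(fit_stump(sample_rows, sample_labels, rng))
    return forest


def predict_forest(forest, row):
    """Stage 2: collect one vote per tree and return the most common one."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


# Toy data: class "a" has small feature values, class "b" has large ones.
rows = [(1, 1), (2, 1), (1, 2), (2, 2), (8, 9), (9, 8), (8, 8), (9, 9)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]

forest = fit_forest(rows, labels)
print(predict_forest(forest, (1.5, 2.0)))
print(predict_forest(forest, (8.5, 9.0)))
```

Using an odd number of trees avoids ties in a two-class vote. A real random forest grows deeper trees and typically considers a random subset of features at every split, not just once per tree.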

The Random Forest algorithm maintains accuracy even when the data is inconsistent, and it is simple to use. It also estimates which variables are important for the classification. It runs efficiently on large databases while generating an internal unbiased estimate of the generalisation error. It also provides methods for balancing error in data sets with unbalanced class populations. However, random forests are difficult to analyse theoretically, and building a large number of trees can slow down prediction in real-time systems. Another drawback is that, when used for regression, a random forest does not predict beyond the range of the response values in the training data.
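The last drawback is easy to see in a small regression sketch. The example below is an illustration under simplifying assumptions, not a production implementation: each tree is a single-split regression stump, and the function names and the data (y = 2x for x from 0 to 9) are made up for the demonstration. Because every leaf predicts the mean of some training responses, the averaged forest output can never leave the range of the training targets.

```python
import random


def fit_regression_stump(xs, ys):
    """Best single split on x by squared error; each leaf predicts its mean."""
    values = sorted(set(xs))
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((y - left_mean) ** 2 for y in left)
        sse += sum((y - right_mean) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # Only one distinct x in the sample: predict the overall mean.
        mean = sum(ys) / len(ys)
        return (float("inf"), mean, mean)
    return best[1:]


def fit_forest(xs, ys, n_trees=50, seed=0):
    """Train each stump on a bootstrap sample of the data."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(xs)) for _ in xs]
        forest.append(
            fit_regression_stump([xs[i] for i in idx], [ys[i] for i in idx])
        )
    return forest


def predict(forest, x):
    """Average the leaf means that x falls into across all trees."""
    preds = [left if x <= t else right for t, left, right in forest]
    return sum(preds) / len(preds)


xs = list(range(10))
ys = [2.0 * x for x in xs]  # training responses span 0.0 to 18.0

forest = fit_forest(xs, ys)
# The true relation gives 200 at x = 100, but the forest stays within range.
print(predict(forest, 100), "<=", max(ys))
```

Moving the query point further out, for example from 100 to 1000, does not change the prediction, because both points fall into the same leaves.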

The random forest algorithm especially helps data scientists save data preparation time, because it needs little input preparation: numerical features can be used without scaling or transformation, and some implementations accept categorical features directly. Random forest is commonly known for its implementations in R packages and Python. It is used in a wide variety of applications, such as medicine, the stock market, e-commerce and the banking sector.
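As a rough sketch of what a Python implementation looks like in practice, the snippet below uses scikit-learn's `RandomForestClassifier`. It assumes scikit-learn is installed, and the synthetic data set from `make_classification` is only a placeholder for real data. It also shows two of the outputs mentioned earlier: the internal (out-of-bag) accuracy estimate and the per-variable importance scores.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic placeholder data: 500 rows, 6 unscaled numerical features.
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=3, random_state=0
)

# oob_score=True asks the forest to score each row using only the trees
# whose bootstrap sample did not contain that row.
model = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
model.fit(X, y)

print("Out-of-bag accuracy estimate:", model.oob_score_)
for i, importance in enumerate(model.feature_importances_):
    print(f"feature {i}: importance {importance:.3f}")
```

Note that scikit-learn expects numerical input, so categorical columns would need to be encoded first in this particular library.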

How Random Forest algorithm works

Video Source: Thales Sehn Korting