Decision trees are supervised learning algorithms that can be used for both classification and regression, although they are more often used for classification problems. A decision tree is a tree-structured model in which internal nodes represent the features of a dataset, branches represent the decision rules, and each leaf node represents an outcome.
A decision tree has two types of nodes: decision nodes and leaf nodes. Decision nodes test a condition and have multiple branches, whereas leaf nodes hold the output of those decisions and have no further branches.
A decision tree is a graphical representation of all the possible solutions to a problem or decision under given conditions, which makes it comparatively easy to interpret and understand.
Aim of a Decision Tree Algorithm
A decision tree aims to create a model that can predict the class or value of the target variable by learning simple decision rules inferred from the training data.
To predict a class label with a decision tree, we begin at the root node of the tree. We compare the value of the root attribute with the record’s attribute. Based on this comparison, we follow the branch corresponding to that value and move to the next node (as illustrated in the image above). We repeat this at each node until we reach a leaf.
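The traversal described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `Node` class, the `predict` function, and the hand-built weather tree are all hypothetical names and data introduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """Hypothetical tree node: either tests one attribute or holds a label."""
    attribute: Optional[str] = None   # attribute tested at a decision node
    children: Dict[str, "Node"] = field(default_factory=dict)  # branch value -> child
    label: Optional[str] = None       # class label stored at a leaf


def predict(root: Node, record: Dict[str, str]) -> str:
    """Walk from the root to a leaf, following the branch that matches the record."""
    node = root
    while node.label is None:           # keep going until we reach a leaf
        value = record[node.attribute]  # the record's value for the tested attribute
        node = node.children[value]     # follow the matching branch
    return node.label


# Hand-built toy tree (illustrative only, not learned from data).
tree = Node(
    attribute="outlook",
    children={
        "sunny": Node(
            attribute="humidity",
            children={"high": Node(label="no"), "normal": Node(label="yes")},
        ),
        "overcast": Node(label="yes"),
        "rainy": Node(label="no"),
    },
)

print(predict(tree, {"outlook": "sunny", "humidity": "normal"}))  # yes
```

The sketch assumes every attribute value in the record has a matching branch; a real implementation would need a fallback for unseen values.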
Important Terminology related to Decision Trees
- Root Node: Represents the entire population or sample, which is then divided into two or more homogeneous sets.
- Splitting: Process of dividing a node into two or more sub-nodes.
- Decision Node: A sub-node that splits into further sub-nodes is called a decision node.
- Leaf / Terminal Node: Nodes that do not split are called Leaf or Terminal nodes.
- Pruning: Removing the sub-nodes of a decision node is called pruning. It can be thought of as the opposite of splitting.
- Branch / Sub-Tree: A subsection of the entire tree is called a branch or sub-tree.
- Parent and Child Node: A node that is divided into sub-nodes is called the parent node of those sub-nodes, and the sub-nodes are the children of the parent node.
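To make the vocabulary above concrete, here is a small sketch in Python. The `TreeNode` class and its `split`, `prune`, and `is_leaf` methods are hypothetical names chosen for this illustration; they only model the structure of a tree and do not learn anything from data.

```python
class TreeNode:
    """Hypothetical node used only to illustrate the terms above."""

    def __init__(self, name):
        self.name = name
        self.children = []   # an empty list means this is a leaf / terminal node

    def is_leaf(self):
        return not self.children

    def split(self, *child_names):
        """Splitting: divide this node into two or more sub-nodes."""
        self.children = [TreeNode(n) for n in child_names]
        return self.children

    def prune(self):
        """Pruning: remove all sub-nodes, turning this node back into a leaf."""
        self.children = []


def count_leaves(node):
    """Count the leaf nodes in the sub-tree rooted at `node`."""
    if node.is_leaf():
        return 1
    return sum(count_leaves(child) for child in node.children)


root = TreeNode("root")                     # root node: the whole sample
left, right = root.split("left", "right")   # root is now the parent of two children
left.split("left-a", "left-b")              # `left` becomes a decision node

print(count_leaves(root))   # 3 (left-a, left-b, right)
left.prune()                # remove the sub-nodes of `left`
print(count_leaves(root))   # 2 (left, right)
```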
Decision trees classify the examples by sorting them down the tree from the root to some leaf/terminal node, with the leaf/terminal node providing the classification of the example.
Each node in the tree acts as a test case for some attribute, and each edge descending from the node corresponds to the possible answers to the test case. This process is recursive in nature and is repeated for every subtree rooted at the new node.
Assumptions while creating Decision Tree
Below are some of the assumptions we make when using a decision tree:
- In the beginning, the whole training set is considered as the root.
- Feature values are preferred to be categorical. If the values are continuous, they are discretized before the model is built.
- Records are distributed recursively on the basis of attribute values.
- The order in which attributes are placed as the root or internal nodes of the tree is determined using a statistical approach.
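The assumptions above map naturally onto a short recursive procedure. The sketch below is one possible illustration, assuming categorical features and using Gini impurity as the statistical criterion. The text does not name a specific measure, so that choice, the function names, and the toy data are all assumptions made here.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def weighted_gini(rows, attribute, target):
    """Average impurity of the groups produced by splitting on `attribute`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    return sum(len(g) / len(rows) * gini(g) for g in groups.values())


def build_tree(rows, attributes, target):
    """Recursively partition `rows`; returns a label (leaf) or a nested dict."""
    labels = [row[target] for row in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attributes:
        return majority   # leaf: the node is pure or no attributes are left
    # Statistical choice of the attribute to test at this node.
    best = min(attributes, key=lambda a: weighted_gini(rows, a, target))
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        branches[value] = build_tree(subset, remaining, target)  # recurse on the subset
    return {best: branches}


# Toy categorical data, made up for illustration.
data = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]

tree = build_tree(data, ["outlook", "windy"], "play")
print(tree)
```

The whole training set enters at the root, each call distributes its records by attribute value, and the recursion stops when a node is pure or no attributes remain.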
Decision trees follow the Sum of Products (SOP) representation, also known as Disjunctive Normal Form. For a given class, every branch from the root of the tree to a leaf node with that class is a conjunction (product) of values, and the different branches ending in that class form a disjunction (sum).
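As a rough illustration of the SOP idea, the sketch below collects every root-to-leaf path that ends in a given class and joins them into a single rule. The nested-dict tree format and the function names `paths_to_class` and `to_dnf` are assumptions made for this example.

```python
def paths_to_class(tree, target_class, conditions=()):
    """Collect every root-to-leaf path (a conjunction) ending in `target_class`."""
    if not isinstance(tree, dict):   # leaf: a plain class label
        return [conditions] if tree == target_class else []
    attribute, branches = next(iter(tree.items()))   # one attribute per node
    found = []
    for value, subtree in branches.items():
        step = conditions + ((attribute, value),)
        found += paths_to_class(subtree, target_class, step)
    return found


def to_dnf(paths):
    """AND the conditions within a path, then OR the paths together."""
    terms = [" AND ".join(f"{a}={v}" for a, v in path) for path in paths]
    return " OR ".join(f"({t})" for t in terms)


# Hand-built toy tree: {attribute: {value: subtree_or_label}}.
tree = {
    "outlook": {
        "sunny": {"humidity": {"high": "no", "normal": "yes"}},
        "overcast": "yes",
        "rainy": "no",
    }
}

print(to_dnf(paths_to_class(tree, "yes")))
# (outlook=sunny AND humidity=normal) OR (outlook=overcast)
```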
The primary challenge in implementing a decision tree is identifying which attribute to place at the root node and at each subsequent level. This is known as attribute selection. Different attribute selection measures exist to identify the attribute that should be tested at each level.
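One widely used attribute selection measure is information gain, which is based on entropy. The text does not name a particular measure, so treat the following as one illustrative option rather than the only choice. The function names and the toy data are assumptions made for this sketch.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the parent minus the weighted entropy of its children."""
    parent = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    children = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return parent - children


# Toy categorical data, made up for illustration.
data = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]

gains = {a: information_gain(data, a, "play") for a in ("outlook", "windy")}
root_attribute = max(gains, key=gains.get)   # attribute with the highest gain
print(gains)
print(root_attribute)   # outlook
```

On this data, splitting on `outlook` leaves two of the three groups pure, so it gains far more information than `windy` and would be placed at the root.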
Types of Decision Trees
The type of a decision tree is determined by the type of its target variable:
- Categorical Variable Decision Tree: Decision Tree with a categorical target variable.
- Continuous Variable Decision Tree: Decision Tree with a continuous target variable.
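In practice, the two types differ mainly in what a leaf predicts. A common convention, assumed here rather than stated in the text, is to predict the majority class for a categorical target and the mean for a continuous target. The function names are hypothetical.

```python
from collections import Counter
from statistics import mean


def classification_leaf(targets):
    """Categorical target: predict the most common class among the leaf's samples."""
    return Counter(targets).most_common(1)[0][0]


def regression_leaf(targets):
    """Continuous target: predict the mean of the leaf's samples."""
    return mean(targets)


print(classification_leaf(["pass", "fail", "pass"]))  # pass
print(regression_leaf([10.0, 12.0, 14.0]))            # 12.0
```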
A Real Life Example
In colleges and universities, the shortlisting of a student can be decided based on their merit scores, attendance, overall score, and so on. A decision tree can also be used to decide the overall promotional strategy of the faculties in a university.
Advantages of Decision Tree
- A decision tree model is very interpretable and can be easily presented to senior management and stakeholders.
- Preprocessing of data, such as normalization and scaling, is not required, which reduces the effort in building a model.
- A decision tree algorithm can handle both categorical and numeric data and is often efficient compared with other algorithms.
- Missing values in the data need not prevent a decision tree from being built, since some decision tree algorithms handle them directly, which is why it is considered a flexible algorithm.
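The point above about normalization and scaling can be checked with a small experiment. A threshold split depends only on the order of the feature values, so a rescaling that preserves order should produce the same partition of the samples. The sketch below assumes a single numeric feature with distinct values and picks the split with the fewest misclassified samples. The `best_split` function and the income data are hypothetical.

```python
def best_split(values, labels):
    """Find the threshold split of one numeric feature with the fewest errors.

    Returns the set of sample indices sent to the left branch.
    Assumes all feature values are distinct.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    best_left, best_errors = None, None
    for cut in range(1, len(order)):
        left, right = order[:cut], order[cut:]
        errors = 0
        for side in (left, right):
            side_labels = [labels[i] for i in side]
            majority = max(set(side_labels), key=side_labels.count)
            errors += sum(1 for lab in side_labels if lab != majority)
        if best_errors is None or errors < best_errors:
            best_left, best_errors = set(left), errors
    return best_left


# Made-up data: raw incomes and the same incomes min-max scaled to [0, 1].
incomes = [20_000, 35_000, 50_000, 80_000, 120_000]
labels = ["no", "no", "yes", "yes", "yes"]
low, high = min(incomes), max(incomes)
scaled = [(v - low) / (high - low) for v in incomes]

print(best_split(incomes, labels))                                # {0, 1}
print(best_split(incomes, labels) == best_split(scaled, labels))  # True
```

The threshold value itself changes after scaling, but the samples on each side of the split do not, so the resulting tree makes the same predictions.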
Disadvantages of Decision Tree
- A decision tree often performs poorly on regression tasks, especially when the data have too much variation.
- A decision tree can be unstable and unreliable: a small change in the data can produce a very different tree structure, which may affect the accuracy of the model.
- If the data are not properly discretized, a decision tree algorithm can give inaccurate results and perform badly compared with other algorithms.
- Calculations become more complex when the outcomes are linked, which can make training a model time-consuming.