A decision tree can be used to aid decision making. It models decisions and their consequences as a tree structure: each internal node represents an attribute, each branch from a node represents a possible value of that attribute, and each leaf node represents an outcome. As decision trees can be represented visually - with the attributes, values and outcomes all expressed in plain English - they are readily understandable by humans.
Decision Tree Learning
Decision trees can be generated using an algorithm which recursively partitions a training set into a tree structure. At every iteration of the process an attribute is selected to divide the current dataset into smaller subsets. The statistical property used for selecting the best attribute is called information gain. The information gain of an attribute is the expected reduction in entropy if the attribute were used to partition the current dataset. Entropy is a measure of the diversity of a dataset: the more evenly a dataset's items are spread across distinct outcomes, the higher the entropy, and a dataset whose items all share one outcome has an entropy of zero.
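The two measures above can be sketched in Python. This is a minimal illustration, not part of the example on this page; the function names `entropy` and `information_gain` and the data layout (rows of attribute values plus a parallel list of outcome labels) are chosen for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of outcome labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(rows, labels, attribute_index):
    """Expected reduction in entropy from splitting on one attribute."""
    total = len(labels)
    # Group the labels by the value each row has for the attribute.
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attribute_index], []).append(label)
    # Weighted average entropy of the subsets after the split.
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder
```

For example, a dataset split evenly between two outcomes has an entropy of 1 bit, and an attribute that separates the outcomes perfectly has an information gain equal to that full entropy.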
Pseudo code for a recursive decision tree learning algorithm is shown below:
PROCEDURE generateTree(dataset)
    IF all items in dataset have the same classification THEN
        RETURN a leaf node with that classification
    ENDIF
    Find best attribute (a) of dataset to split on
    Create array (s) of subsets of dataset split on a
    FOR EACH subset IN s
        generateTree(subset)
    ENDFOR
ENDPROCEDURE
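The pseudo code above can be fleshed out as a small Python sketch. This is one possible rendering under stated assumptions: attributes are column indices into rows of values, `entropy` and `information_gain` are helper functions defined here for the sketch, and the returned tree is a nested dictionary mapping an attribute to its values' subtrees, with plain labels as leaves.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of outcome labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Expected reduction in entropy from splitting on one attribute."""
    total = len(labels)
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attribute], []).append(label)
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder

def generate_tree(rows, labels, attributes):
    """Recursively build a decision tree.

    Returns a label (leaf) or {attribute: {value: subtree, ...}}.
    """
    # Base case: all items share the same classification.
    if len(set(labels)) == 1:
        return labels[0]
    # No attributes left to split on: return the majority classification.
    if not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Find the best attribute to split on (highest information gain).
    best = max(attributes, key=lambda a: information_gain(rows, labels, a))
    remaining = [a for a in attributes if a != best]
    # Partition the dataset on the chosen attribute and recurse.
    subsets = {}
    for row, label in zip(rows, labels):
        sub_rows, sub_labels = subsets.setdefault(row[best], ([], []))
        sub_rows.append(row)
        sub_labels.append(label)
    return {best: {value: generate_tree(sub_rows, sub_labels, remaining)
                   for value, (sub_rows, sub_labels) in subsets.items()}}
```

Unlike the pseudo code, this version returns the subtrees it builds rather than discarding them, and it adds a majority-vote fallback for when the attributes are exhausted before the labels agree.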
Interactive Example Of Decision Tree Learning
Below is an example of decision tree learning. It uses four attributes (age, income, credit history and savings) to determine the risk of offering someone a loan. Click on the 'Generate decision tree...' link to see a visual representation of the generated tree. Try altering the values in the table before clicking the link again to see how changing the input values affects the tree structure.