Decision Tree Example: What happens at each split?

The Root Node

The tree grid provides a numerical view of what the Decision Tree has done.  The grid can be viewed in a hierarchical drill-down mode (as shown below), or in a variety of other ways using the options in the top left of the panel.

In the image below, the focus node is the root node (highlighted in red in the tree grid), which represents all customers in the base selection.  The Node Details tab is also shown; it presents the same figures, but emphasises how they relate to each other:

  • Base Count:  There are 1,156,533 people in our base selection (all people).

  • % of [all] Base: This is 100%, confirming that the node contains the entire base selection (subsequent nodes will contain only a proportion of the full set of customers).

  • Analysis Count: There are 25,175 people in the analysis selection (people taking Swedish holidays).

  • % of [all] Analysis: This is 100%, i.e. the entire analysis selection.

  • Analysis %: Those people in the analysis selection make up 2.18% of the base.
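The relationships between these figures can be checked with a few lines of Python.  This is a minimal sketch rather than anything taken from the product: the function name node_stats and its parameter names are invented for illustration, and only the counts come from the example above.

```python
def node_stats(base_count, analysis_count, total_base, total_analysis):
    """Return the three percentages shown in the grid for one node.

    Hypothetical helper: the names are illustrative, not a product API.
    """
    return {
        # Share of the whole base selection that falls in this node.
        "pct_of_base": 100.0 * base_count / total_base,
        # Share of the whole analysis selection that falls in this node.
        "pct_of_analysis": 100.0 * analysis_count / total_analysis,
        # Proportion of this node's people who are in the analysis selection.
        "analysis_pct": 100.0 * analysis_count / base_count,
    }


# Root node: it contains the whole base and the whole analysis selection.
root = node_stats(1_156_533, 25_175, total_base=1_156_533, total_analysis=25_175)

for name, value in root.items():
    print(f"{name}: {value:.2f}%")
```

For the root node the first two percentages are 100% by definition, so only the Analysis % carries information; for every other node all three differ.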

At each split a single variable is used to partition the customers into two child nodes.  The main differences between different types of Decision Tree (e.g. PWE, CHAID, and CART) relate to how these splits are made.  However, the purpose is always the same: to identify rules for homing in on the customers in the analysis selection.

Splitting

Each split is made by finding a rule which separates people who are in the analysis selection from those who are not in the analysis selection.  The rule behind a split is based on a single dimension.
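As a rough illustration of what applying such a rule does, the sketch below partitions a handful of made-up records into two child nodes.  The records, the income band, and the function names are all assumptions chosen for illustration; the sketch does not show how any particular type of Decision Tree chooses its rule.

```python
# Toy records: (income in £k, is the person in the analysis selection?).
# These values are invented for illustration only.
people = [
    (15, False), (18, False), (25, True), (40, True),
    (55, False), (70, True), (85, False), (120, False),
]


def in_band(income_k):
    """Hypothetical split rule based on a single dimension (income)."""
    return 20 <= income_k < 80


def analysis_pct(node):
    """Percentage of the people in a node who are in the analysis selection."""
    return 100.0 * sum(1 for _, responder in node if responder) / len(node)


# Partition the parent node into two child nodes using the rule.
node_1 = [p for p in people if in_band(p[0])]
node_2 = [p for p in people if not in_band(p[0])]

print(f"parent: {analysis_pct(people):.1f}%")
print(f"node 1: {analysis_pct(node_1):.1f}%")
print(f"node 2: {analysis_pct(node_2):.1f}%")
```

A useful rule leaves one child with a higher Analysis % than the parent and the other with a lower one, as happens with this toy data.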

As this process is repeated, nodes are created which have a higher and higher Analysis % (coloured a more intense red in the graphical displays).  In this example, these are the people most likely to go on holiday to Sweden.  Nodes which have a low Analysis % (coloured a more intense blue) will also be created as a result of the splitting process.

The first split, shown above, successfully concentrates the responders (the analysis selection) into node 1.  This node contains 304,326 people (base count), of which 17,463 (analysis count) are responders.  The responders make up 5.74% (Analysis %) of the node.  Node 2 contains the other 852,227 people, of which only 7,712, or 0.90%, are responders.  The rule used is based on the income dimension.
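The degree of concentration can be made explicit by comparing each node's share of the base with its share of the responders.  The following is a minimal sketch using only the counts quoted above; the variable names are illustrative.

```python
ROOT_BASE, ROOT_ANALYSIS = 1_156_533, 25_175

# (base count, analysis count) for the two child nodes, as quoted above.
nodes = {"node 1": (304_326, 17_463), "node 2": (852_227, 7_712)}

summary = {}
for name, (base, analysis) in nodes.items():
    summary[name] = {
        # Proportion of the node's people who are responders.
        "analysis_pct": 100.0 * analysis / base,
        # Share of the root node's base selection held by this node.
        "share_of_base": 100.0 * base / ROOT_BASE,
        # Share of all responders held by this node.
        "share_of_analysis": 100.0 * analysis / ROOT_ANALYSIS,
    }

for name, stats in summary.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

On these figures node 1 holds roughly a quarter of the base but more than two thirds of the responders, which is what "concentrating" the responders means in practice.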

This can be seen graphically from the torus views of nodes 1 and 2, shown below:

Node 1: Income = £20k – 80k

  • The girth of the torus is smaller, indicating fewer people (compared to node 2).

  • The red band is wider, indicating a higher proportion of responders (Analysis %).

Node 2: Income = < £20k and £80k +

  • The girth of the torus is larger, indicating more people (compared to node 1).

  • The red band is narrower, indicating a lower proportion of responders (Analysis %).

The Gain measures the degree to which responders have been concentrated in a node.  It is calculated from the ratio of the Analysis % in the node to the Analysis % in the root node.

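Node 1 has a gain of 2.64 (≈ 5.74 / 2.18) and node 2 has a gain of 0.42 (≈ 0.90 / 2.18).  These gains correspond to the unrounded percentages, so dividing the rounded figures shown here gives very slightly different values.  A node which has a higher proportion of responders than the root node will have a gain larger than 1 (and is therefore coloured red in the graphical displays).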
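The gain calculation can be reproduced directly from the counts, which avoids any rounding of the intermediate percentages.  This is a sketch under the definition given above; the function name gain and its parameters are hypothetical, not part of the product.

```python
def gain(node_base, node_analysis, root_base, root_analysis):
    """Ratio of a node's Analysis % to the root node's Analysis %."""
    node_rate = node_analysis / node_base
    root_rate = root_analysis / root_base
    return node_rate / root_rate


ROOT_BASE, ROOT_ANALYSIS = 1_156_533, 25_175

gain_1 = gain(304_326, 17_463, ROOT_BASE, ROOT_ANALYSIS)
gain_2 = gain(852_227, 7_712, ROOT_BASE, ROOT_ANALYSIS)

print(f"node 1 gain: {gain_1:.2f}")  # above 1: responders concentrated
print(f"node 2 gain: {gain_2:.2f}")  # below 1: responders diluted
```

A gain above 1 marks a node that is richer in responders than the base selection as a whole; a gain below 1 marks one that is poorer.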

For more details on the different statistics, see Decision Tree: Statistical Terminology.