Decision Tree: How do I use the Gains Chart?
To make sure that you select the top nodes, the simplest approach is to use a Gains Chart.
Each point on the green curve represents a leaf node of the tree, and the points are ordered from left to right by descending gain.
The curve through the points bows upwards because the earlier nodes have a higher ratio of Analysis records to Base records.
To see the figures behind the chart, use the Tree Grid set to Cumulative.
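As an illustration of how cumulative figures like these could be built up, here is a minimal Python sketch. It is not the product's own calculation: the function name `gains_points`, the input layout (one `(analysis_count, base_count)` pair per leaf) and the choice of axes (cumulative share of all records against cumulative share of Analysis records) are assumptions made for the example, and the Analysis and Base counts in a leaf are assumed not to overlap.

```python
def gains_points(leaves):
    """Return cumulative gains points for a list of leaf nodes.

    leaves: list of (analysis_count, base_count) pairs, one per leaf.
    Assumption: every leaf holds at least one record, and the two counts
    in a leaf do not overlap.
    """
    # Order leaves by descending proportion of Analysis records.
    ordered = sorted(leaves, key=lambda leaf: leaf[0] / (leaf[0] + leaf[1]),
                     reverse=True)
    total_analysis = sum(a for a, _ in ordered)
    total_records = sum(a + b for a, b in ordered)

    points = [(0.0, 0.0)]  # the curve starts at the origin
    cum_analysis = 0
    cum_records = 0
    for a, b in ordered:
        cum_analysis += a
        cum_records += a + b
        # x: share of all records selected so far
        # y: share of Analysis records captured so far
        points.append((cum_records / total_records,
                       cum_analysis / total_analysis))
    return points


# Hypothetical counts for three leaves.
for x, y in gains_points([(10, 90), (80, 20), (10, 90)]):
    print(f"{x:.3f}  {y:.3f}")
```

Each printed row corresponds to one point on the curve, starting at the origin and ending where all records have been selected.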
Note that the Gains Chart displays only leaf nodes, that is, nodes at the end of a branch.
The better the Decision Tree is at isolating the Analysis Selection, the steeper the gains curve will be. The Hindsight line shows the best that can be achieved, as would result from a single node containing all the Analysis records and no others.
Power
The power of a Decision Tree model measures how good the Decision Tree is at identifying records in the Analysis Selection. It is calculated from the Gains Chart curves and is based on the distance between the Decision Tree line and the Random and Hindsight lines: the power is the ratio of (the area between the Decision Tree line and the Random line) to (the area between the Hindsight line and the Random line). It ranges from 0 to 1 (best):
- Power = 0: The Decision Tree is no better than random.
- Power = 1: The Decision Tree is as good as hindsight. Selecting the best nodes from the Decision Tree will enable you to select all of the Analysis Selection without picking up any of the Base Selection.
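The area ratio described above can be sketched in Python as follows. This is a hedged illustration, not the product's documented implementation: the function names, the use of the trapezium rule, and the axes (cumulative share of all records on x, cumulative share of Analysis records on y) are assumptions. Under those assumptions the Random line is the diagonal, and the Hindsight line rises straight to 100% at the point where the share of records selected equals the overall share of Analysis records.

```python
def area_under(points):
    """Trapezium-rule area under a piecewise-linear curve of (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def power(tree_points, analysis_fraction):
    """Ratio of (tree area above random) to (hindsight area above random).

    tree_points: gains curve from (0, 0) to (1, 1), as (x, y) pairs.
    analysis_fraction: Analysis records as a share of all records
    (assumed to be strictly between 0 and 1).
    """
    random_area = 0.5  # area under the diagonal Random line
    # Assumed shape of the Hindsight line: all Analysis records first.
    hindsight = [(0.0, 0.0), (analysis_fraction, 1.0), (1.0, 1.0)]
    return ((area_under(tree_points) - random_area)
            / (area_under(hindsight) - random_area))


# Hypothetical gains curve for a tree with three leaves, where one third
# of all records are Analysis records.
curve = [(0.0, 0.0), (1 / 3, 0.8), (2 / 3, 0.9), (1.0, 1.0)]
print(round(power(curve, 1 / 3), 3))
```

A curve lying on the diagonal gives a power of 0, and a curve that coincides with the Hindsight line gives a power of 1, matching the two cases listed above.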