Decision Tree Example: How do I modify the Decision Tree?

Starting and stopping the build

The tree built so far is extremely simple.  It stopped growing only because it reached 20 nodes, which is the default Maximum Nodes value in the Stopping Conditions.  See How do I set the Stopping Conditions? for more details.

To manually influence the build process, either wait for the tree to stop growing, or click the Pause button while it is still growing.

You can then use the manual control options to take full control of how the tree is built.  You can determine:

  • Which Node is split.

  • Which Dimension is used to create the child nodes.

  • Which Categories are grouped together to define a split.

For more examples and details, see How do I manually control the build process?

Alternatively, you can continue to use the automatic build process while exerting some manual influence over it.  For example:

  • Grow from a specific point in the tree.

  • Stop sections of the tree from growing and grow the rest.

  • Prune the tree to remove a branch that has been grown.

  • Update the various settings, for example the dimensions or stopping conditions.

For more examples and details, see How do I influence the automatic build process?

Note that you cannot modify the base or analysis selections without rebuilding the tree. 

Manually Controlling the Build

Split a particular node using a specific dimension

Right-click the node you want to split.  The available valid splits are listed under the "Use Valid Split" sub-menu.  These splits are numbered according to their rank, as defined by the "Split Selection" measure.  See How do I set the Algorithm Options?

You can force the use of an invalid split by selecting from the list of dimensions on the "Force Invalid Split" sub-menu.  These splits are not ranked.  In the image below, Newspapers has been added as a new dimension.  This produces an invalid split because the resulting nodes are not significant.  As the image shows, you could still force this split to be used.
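The relationship between the two sub-menus can be pictured with a small sketch. This is an illustration only, not the product's implementation: the CandidateSplit class, the build_menus function, the score values, and the assumption that a higher score means a better rank are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CandidateSplit:
    dimension: str
    score: float        # hypothetical value of the split selection measure
    significant: bool   # True if the split yields significant child nodes


def build_menus(candidates):
    """Return (ranked valid splits, unranked invalid splits)."""
    # Valid splits are ranked; this sketch assumes a higher score is better.
    valid = sorted(
        (c for c in candidates if c.significant),
        key=lambda c: c.score,
        reverse=True,
    )
    use_valid = [f"{rank}. {c.dimension}" for rank, c in enumerate(valid, start=1)]
    # Invalid splits are listed by dimension only, with no rank.
    force_invalid = [c.dimension for c in candidates if not c.significant]
    return use_valid, force_invalid


# Made-up scores, for illustration only.
candidates = [
    CandidateSplit("Income", 0.42, True),
    CandidateSplit("Newspapers", 0.03, False),
    CandidateSplit("Region", 0.57, True),
]
use_valid, force_invalid = build_menus(candidates)
print(use_valid)       # ranked list for "Use Valid Split"
print(force_invalid)   # unranked list for "Force Invalid Split"
```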

Defining your own branches

The Next Split tab allows us to see the statistics that informed the split.  In this example, Income was used.  Branch 1 included categories with both positive and negative PWEs.  By highlighting the categories with positive PWEs and right-clicking, we can define a branch that includes only those categories.

To create further groupings, highlight the categories you would like in the same branch, then right-click and select "Define Branch".
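The grouping described above can be sketched as follows. The category labels and PWE values are invented for illustration, and define_branches is a hypothetical helper that mimics highlighting the positive-PWE categories; it is not part of the product.

```python
def define_branches(pwe_by_category):
    """Group categories into one positive-PWE branch and one branch for the rest."""
    positive = [c for c, pwe in pwe_by_category.items() if pwe > 0]
    rest = [c for c, pwe in pwe_by_category.items() if pwe <= 0]
    # Drop an empty branch if every category falls on one side.
    return [branch for branch in (positive, rest) if branch]


# Invented Income bands and PWE values, for illustration only.
income_pwe = {
    "<10k": -0.8,
    "10-20k": -0.2,
    "20-30k": 0.1,
    "30-40k": 0.6,
    "40k+": 1.1,
}
branches = define_branches(income_pwe)
print(branches)
```

In the application the grouping is your choice, so the sign of the PWE is only one possible criterion.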

Influencing the Automatic Build

Growing sections of the tree

We may decide that the best nodes in the tree are already good enough to use in a selection, and that we don’t want to investigate the worst nodes further at this stage.  But perhaps we want to investigate the rest of the nodes, this time using the regional variable.

Individual leaves or groups of leaves can be stopped from growing by right-clicking and choosing “Stop leaf nodes from growing”.  In the image below, node 3 is being stopped, which prevents all leaves from that point on from growing.  Node 8 has also been stopped in this way (as indicated by the line across the node).

You can reverse this action by selecting "Allow leaf nodes to grow".  A single leaf node can be stopped, or allowed to grow, by applying the same options to that node on its own.
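As a rough model of this behaviour, stopping a node can be thought of as flagging every leaf beneath it, so that the automatic build skips those leaves. The Node class and the helper functions below are hypothetical and only illustrate the idea; the node numbering is arbitrary.

```python
class Node:
    def __init__(self, node_id, children=None):
        self.node_id = node_id
        self.children = children or []
        self.stopped = False  # True once the leaf is prevented from growing


def leaves(node):
    """Return all leaf nodes at or below the given node."""
    if not node.children:
        return [node]
    found = []
    for child in node.children:
        found.extend(leaves(child))
    return found


def set_growth(node, allow):
    """Stop (allow=False) or release (allow=True) every leaf under a node."""
    for leaf in leaves(node):
        leaf.stopped = not allow


def growable_leaves(root):
    """Ids of the leaves the automatic build may still split."""
    return [leaf.node_id for leaf in leaves(root) if not leaf.stopped]


# Toy tree: node 1 splits into 2 and 3; 2 into 4 and 5; 3 into 6 and 7.
root = Node(1, [Node(2, [Node(4), Node(5)]), Node(3, [Node(6), Node(7)])])
node3 = root.children[1]

set_growth(node3, allow=False)      # "Stop leaf nodes from growing" on node 3
after_stop = growable_leaves(root)

set_growth(node3, allow=True)       # "Allow leaf nodes to grow" on node 3
after_allow = growable_leaves(root)
```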

Now modify the dimensions: drag on the Region dimension, then remove the ticks in the Create Splits column for every row except Region.  The Decision Tree will then be built using only the Region dimension.

The Build Options allow you to influence how the Decision Tree builds.

Increasing the Maximum Nodes and/or Maximum Depth gives the potential to create a larger Decision Tree and possibly obtain further insight.  At times a branch may stop growing because the number of records in that node is too small; lowering the minimum Node Size to split can address this.  The algorithm can also be changed, for example by switching from PWE to CHAID.
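How these options interact can be sketched as a simple check applied to each leaf before it is split. This is an assumption about the general shape of such stopping rules, not the product's actual logic. Only the default of 20 Maximum Nodes comes from this topic; the BuildOptions class, the may_split function, and the other default values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BuildOptions:
    max_nodes: int = 20        # default Maximum Nodes mentioned in this topic
    max_depth: int = 5         # assumed value, for illustration only
    min_node_size: int = 100   # assumed value, for illustration only


def may_split(node_records, node_depth, total_nodes, options):
    """Return (allowed, reason) for one leaf under the toy stopping rules."""
    if total_nodes >= options.max_nodes:
        return False, "maximum nodes reached"
    if node_depth >= options.max_depth:
        return False, "maximum depth reached"
    if node_records < options.min_node_size:
        return False, "too few records to split"
    return True, "ok"


# A leaf with 60 records is blocked by the assumed minimum size of 100...
blocked = may_split(60, 2, 12, BuildOptions())
# ...but may be split once the minimum size is lowered to 50.
allowed = may_split(60, 2, 12, BuildOptions(min_node_size=50))
print(blocked, allowed)
```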