Profile: How do I interpret the Profile results?

Each variable in the Profile is summarised on a single line.

Clicking the + symbol shows the full Profile results for that variable.

Example Profile results for the variable Title. The Analysis set was Holiday Destination = Italy and the Base set was the Universe.

Not all options may be on display. To show further display information, right-click on one of the headings and select Column Chooser... from the pop-up menu. When the Column Chooser appears, tick the options to display and click OK.

Variable Display Options

  • Code displays the category code.

  • Description displays the name assigned to that particular variable category.

  • Analysis is the number of occurrences of this particular value in the Analysis set, e.g. there are 12,354 occurrences of Title 'Miss' in the Analysis set in the above example. In other words, out of the records that have a holiday destination of 'Italy', 12,354 records have a title of 'Miss'.

  • Base is the number of occurrences of this particular value in the Base set. In our example, there are 61,735 records in the Universe with a title of 'Miss'.

  • % of Analysis shows the percentage of the Total Analysis records. This is calculated as (Analysis / Total Analysis records) x 100.

  • Index is the ratio of the analysis percentage to the base percentage, multiplied by 100, i.e. (% of Analysis / % of Base) x 100, where % of Base is calculated as (Base / Total Base records) x 100. An Index of 100 means the category is equally represented in the Analysis and Base sets.

  • Z-Score displays a standardised measure of how confident we can be that the result presented (the difference between the analysis and base percentages) is a true characteristic of the data and not a quirk of the data sample used. For each category, the Z-Score measures the number of standard deviations the result (% of Analysis) is away from the expected result (% of Base).

  • Probability displays a figure based upon the probability of the z-score occurring in data of the same sample sizes but where the analysis selection is random.  This will be very low if the z-score is significant.  This is another way of looking at the z-score.

  • Confidence % displays a figure which is related to the z-score probability but transformed to give a measure of confidence that the z-score has occurred due to the trends in the data rather than the statistical sampling error.  This is a percentage and will be over 90% for significant z-scores.  This is just another way of looking at the z-score.

  • Ratio is only used for comparison profiles. This is the number of A records per B records.  One use of a comparison profile is to compare good and bad risk customers.

  • Zd Exp displays the result of an alternate Z-Score calculation. The Zd Exp (Z-Score of difference from expected) is used in Excel logging of profiles and when selecting categories by Z-Score threshold for inclusion in a PWE model.

  • PWE (Predictive Weight of Evidence) value is the amount of evidence that membership of that category gives to support a record being in the Analysis set. The evidence is calculated by considering the information gained on learning that a record falls in this particular category. The evidence algorithm is derived using information theory and probability.

  • Penetration displays a histogram with an Index value shown centred around 100 and shaded by the Z-Score. Histogram bars to the left of the 100 centre line show under-representation. Histogram bars to the right of the centre line show over-representation.

  • Shade displays a number which relates to the colour shading applied to the Penetration histogram as follows:

    Shade No   Colour          Z-Score        Confidence
    4          Red             > +/- 3.29     > 99.9%
    3          Red Orange      > +/- 2.576    > 99%
    2          Orange          > +/- 1.96     > 95%
    1          Yellow Orange   > +/- 1.65     > 90%
    0          Yellow          <= +/- 1.65    <= 90%

  • % of Analysis : % of Base displays the two percentages as parallel histograms.  The analysis percentage is shown in green in the upper bar.  The base percentage is shown in blue in the lower bar.

  • Market Potential shows a histogram of two parts. The total length of the histogram shows the total number of records in the Base set (blue line) having this variable value. The left-hand green line shows the proportion of those records that are in the Analysis set.
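
The % of Analysis and Index calculations described in the list above can be sketched as follows. This is a minimal illustration, not the product's own code. The function names are hypothetical, and the two totals are made-up figures chosen only to make the arithmetic visible, since the example does not state the real totals. The Analysis and Base counts for Title 'Miss' are taken from the example.

```python
def percent_of_total(count, total):
    """Return count as a percentage of total, e.g. (Analysis / Total Analysis records) x 100."""
    return count / total * 100


def profile_index(analysis, total_analysis, base, total_base):
    """Return (% of Analysis / % of Base) x 100."""
    pct_analysis = percent_of_total(analysis, total_analysis)
    pct_base = percent_of_total(base, total_base)
    return pct_analysis / pct_base * 100


# Counts for Title 'Miss' from the example above.
analysis_count = 12_354
base_count = 61_735

# ASSUMPTION: hypothetical totals, not taken from the example.
total_analysis_records = 100_000
total_base_records = 1_000_000

index_value = profile_index(
    analysis_count, total_analysis_records, base_count, total_base_records
)
print(f"% of Analysis: {percent_of_total(analysis_count, total_analysis_records):.2f}")
print(f"% of Base:     {percent_of_total(base_count, total_base_records):.2f}")
print(f"Index:         {index_value:.1f}")
```

With these assumed totals the Index comes out at roughly 200, which would mean the category is about twice as common in the Analysis set as in the Base set.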
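
The Shade table above can be read as a simple lookup on the size of the Z-Score, ignoring its sign. The sketch below encodes only the thresholds listed in that table. The function name is hypothetical and is not part of the product.

```python
def shade_for_z_score(z_score):
    """Return (shade number, colour) for a Z-Score, using the thresholds in the Shade table."""
    magnitude = abs(z_score)  # the table applies to both positive and negative Z-Scores
    if magnitude > 3.29:
        return 4, "Red"
    if magnitude > 2.576:
        return 3, "Red Orange"
    if magnitude > 1.96:
        return 2, "Orange"
    if magnitude > 1.65:
        return 1, "Yellow Orange"
    return 0, "Yellow"


for z in (-3.5, 2.0, 0.4):
    shade, colour = shade_for_z_score(z)
    print(f"Z-Score {z:+.2f} -> shade {shade} ({colour})")
```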
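
The relationship between Z-Score, Probability and Confidence % can be illustrated with the standard normal distribution. This sketch assumes a two-tailed test, which is consistent with the thresholds in the Shade table (for example, 1.96 corresponding to 95%), but the documentation does not confirm that the product computes these columns this way. The function names are hypothetical.

```python
import math


def two_tailed_probability(z_score):
    """Probability of a Z-Score at least this extreme arising by chance.

    ASSUMPTION: standard normal distribution, two-tailed.
    """
    return math.erfc(abs(z_score) / math.sqrt(2))


def confidence_percent(z_score):
    """Confidence that the result is not due to sampling error, as a percentage."""
    return (1 - two_tailed_probability(z_score)) * 100


for z in (1.65, 1.96, 2.576):
    print(
        f"Z-Score {z}: probability {two_tailed_probability(z):.4f}, "
        f"confidence {confidence_percent(z):.2f}%"
    )
```

Under this assumption a larger Z-Score gives a smaller Probability and a higher Confidence %, which matches the descriptions of those columns above.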
