
The TURF command in P-STAT

TURF stands for Total Unduplicated Reach and Frequency. It is most often used in market research applications.

For example, suppose a file holds the responses of 1,000 cases on 40 items. Each item is a TV program that the respondent did or did not watch.

You would like to know which group of 8 programs was reached by the largest number of respondents. A respondent is "reached" by a given group of programs if they watched at least one of the programs in the group.

To find the best group, TURF evaluates each of the 76.9 million combinations of 8 items taken from a pool of 40, and writes a file identifying the 100 best combinations. This takes about a minute on a 2.4 GHz PC.
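The size of this search follows from the binomial coefficient, and the search itself can be pictured as a brute-force loop over every group. The Python sketch below is illustrative only: it is not P-STAT code, the function name `best_reach` and the tiny dataset are hypothetical, and nothing here describes how TURF is actually implemented.

```python
from itertools import combinations
from math import comb

# Number of ways to choose 8 programs from a pool of 40.
print(comb(40, 8))  # 76904685, i.e. about 76.9 million

def best_reach(cases, size):
    """Return (reach, group) for the best group of `size` items.

    `cases` is a list of rows; each row holds one 0/1 response per item.
    A case is reached if it has at least one positive response among
    the items in the group. Ties keep the first group found.
    """
    n_items = len(cases[0])
    best = (-1, None)
    for group in combinations(range(n_items), size):
        reach = sum(1 for row in cases if any(row[i] for i in group))
        if reach > best[0]:
            best = (reach, group)
    return best

# Hypothetical data: 5 respondents, 4 programs.
cases = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
print(best_reach(cases, 2))  # (4, (0, 2))
```

Here programs 0 and 2 together reach four of the five respondents; the fifth watched nothing and cannot be reached by any group.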

Weighting can be applied to the cases, to the response values, or to the items. The reach threshold, which by default requires only one positive response for a case to count as reached, can be raised.

Adding options affects speed. The above one-minute run takes 12 minutes if case weights are used, and would take about 30 minutes if the other options are used.

REACH versus FREQUENCY

Reach and frequency are different measurements in TURF.

The REACH score for a combination is the number of cases that have at least N positive responses on the variables in that combination. N is the reach threshold in use, which has a default setting of one.

The FREQ score for a combination is, for the reached cases, normally the total number of positive responses on the variables in the combination. With the response.weights option, the FREQ score is instead the sum of the positive response values.

If a case watched all eight programs in a group, the frequency count for that group would increase by 8, but the reach count would increase by only 1.

If the input were hours watched rather than just watched or not, the response.weights option would sum the response values and thereby measure the impact of the combination.
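The difference between the two scores can be made concrete with a per-case calculation. The following Python sketch is not P-STAT code; the function `score_case` and its parameter names are hypothetical, and it assumes that the reach threshold counts positive responses and that response weighting changes only the FREQ score.

```python
def score_case(responses, group, threshold=1, response_weights=False):
    """Return (reach, freq) contributed by one case for one group.

    `responses` holds one value per item; `group` lists item positions.
    A case is reached when it has at least `threshold` positive
    responses in the group. Unreached cases contribute nothing.
    """
    values = [responses[i] for i in group]
    positives = sum(1 for v in values if v > 0)
    if positives < threshold:
        return (0, 0)
    if response_weights:
        # Assumed behavior: sum the response values themselves.
        freq = sum(v for v in values if v > 0)
    else:
        # Default: count the positive responses.
        freq = positives
    return (1, freq)

# A case that watched all eight programs in a group.
print(score_case([1] * 8, range(8)))                       # (1, 8)

# Hours watched instead of watched / did not watch.
hours = [2, 0, 3, 1]
print(score_case(hours, (0, 1, 2, 3)))                     # (1, 3)
print(score_case(hours, (0, 1, 2, 3), response_weights=True))  # (1, 6)

# A reach threshold of 2 leaves a one-program viewer unreached.
print(score_case([1, 0, 0, 0], (0, 1, 2, 3), threshold=2))  # (0, 0)
```

Summing these pairs over all cases gives the REACH and FREQ scores for the group.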

Features of the TURF command

Allows up to 210 items (i.e., variables). Allows combinations of items up to size 60. Allows several combination sizes to be done in one run. Allows tens of thousands of cases.

Allows weighting of cases. Allows weighting of items. Allows weighting of responses; this allows the intensity of a response to be utilized.

Allows setting a reach threshold of more than one. Allows forcing designated items into every combination. Allows limits on how many of a set of items can be placed in the combinations to be analyzed.
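Forcing items and limiting items both shrink the set of combinations to be analyzed. The Python sketch below shows one way to picture this; it is not P-STAT code, the generator `constrained_groups` and its parameter names are hypothetical rather than TURF identifiers, and it assumes a forced item that also belongs to the limited set counts toward the limit.

```python
from itertools import combinations

def constrained_groups(items, size, forced=(), limited=(), max_limited=None):
    """Yield groups of `size` items that contain every forced item and
    at most `max_limited` items from the `limited` set."""
    forced = tuple(forced)
    free = [i for i in items if i not in forced]
    # Only the remaining slots are filled from the non-forced items.
    for rest in combinations(free, size - len(forced)):
        group = forced + rest
        if max_limited is not None:
            if sum(1 for i in group if i in limited) > max_limited:
                continue
        yield group

# Five hypothetical items, groups of 3, item A always present,
# and no more than one of B and C in any group.
groups = list(constrained_groups("ABCDE", 3, forced=("A",),
                                 limited={"B", "C"}, max_limited=1))
for g in groups:
    print(g)
```

Without constraints there are 10 groups of 3 from 5 items; forcing A leaves 6, and the limit on B and C removes one more, leaving 5.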

Writes a result file containing the best combinations. The items within each combination are ordered by their importance. Writes a template file for use by the TURF.SCORES command. Writes (in TURF.SCORES) the reach score for each case.

Takes 3.1 seconds for 1,000 cases on 40 items, 6 at a time on a 2.4 GHz PC. This evaluates 3.838 million groups.

Runs so rapidly that billions of combinations can be done. Takes 18.5 minutes for 1,000 cases on 100 items, 6 at a time on a 2.4 GHz PC. This evaluates 1.192 billion groups. Note: using all options would take about 9 hours.
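The group counts quoted above are binomial coefficients, and dividing them by the reported run times gives a rough throughput. The Python sketch below simply redoes that arithmetic from the figures in the text; the list name `runs` is hypothetical, and the rates apply only to the quoted runs with no options on a 2.4 GHz PC.

```python
from math import comb

runs = [
    # (items, group size, reported run time in seconds)
    (40, 6, 3.1),
    (100, 6, 18.5 * 60),
]

for n_items, size, seconds in runs:
    groups = comb(n_items, size)
    rate = groups / seconds
    print(f"{n_items} items, {size} at a time: {groups:,} groups, "
          f"about {rate / 1e6:.2f} million groups per second")
```

This gives 3,838,380 groups for the first run and 1,192,052,400 for the second, both at a little over one million groups per second.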

Shows the percent of combinations already processed in a progress window. Writes a detailed report when the command finishes.

A typical final report produced by TURF

     ----------TURF analysis for file tin completed-----------
     | OPTIONS: none                                         |
     |                                                       |
     |       29 items were used in the analysis.             |
     |                                                       |
     |      550 cases were read and used.                    |
     |      213 cases had at least one positive response,    |
     |          making that the maximum possible reach.      |
     |                                                       |
     | SIZE   4 evaluated 23,751 combinations:               |
     |      212 was the best REACH, found in 2 combinations. |
     |      628 was the best FREQ in those 2 combinations.   |
     |      729 was the best FREQ in any size 4 combination. |
     |                                                       |
     | The FREQ score for a combination is the count         |
     | of the non-zero responses for that combination,       |
     | summed over the reached cases.                        |
     |                                                       |
     | REACH.RESULTS file work1 has the 100                  |
     | combinations with the highest reach scores.           |
     | The items are ordered by their REACH contribution.    |
     | Cumulative reach is shown.                            |
     |                                                       |
     | Time: .1 seconds.                                     |
     ---------------------------------------------------------

A typical REACH.RESULTS file produced by TURF

 TURF results ordered by REACH, from input file tin using 29,4
     
                              pct of                                
                        pct      max        item  item  item   item 
 size  rank  reach  reached    reach  freq  .1    .2    .3     .4   

    4     1    212   38.545   99.531   628  VAR3  VAR4  VAR5   VAR23
                                            186   208   210    212  
   
    4     2    212   38.545   99.531   602  VAR3  VAR4  VAR5   VAR28
                                            186   208   210    212  
    
    4     3    211   38.364   99.061   660  VAR3  VAR4  VAR5   VAR13
                                            186   208   210    211  
    
    4     4    211   38.364   99.061   658  VAR3  VAR4  VAR5   VAR15
                                            186   208   210    211  
    
    4     5    211   38.364   99.061   634  VAR3  VAR4  VAR5   VAR19
                                            186   208   210    211  
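The two percentage columns can be reproduced from figures in the final report for this run: 550 cases read and a maximum possible reach of 213. The Python sketch below is an inference from the printed numbers, not a statement of how TURF computes them; the function name `pct_columns` is hypothetical.

```python
def pct_columns(reach, cases_read, max_reach):
    """Return (pct reached, pct of max reach), rounded to 3 decimals."""
    pct_reached = round(100 * reach / cases_read, 3)
    pct_of_max = round(100 * reach / max_reach, 3)
    return (pct_reached, pct_of_max)

cases_read = 550   # cases read and used
max_reach = 213    # cases with at least one positive response

print(pct_columns(212, cases_read, max_reach))  # (38.545, 99.531)
print(pct_columns(211, cases_read, max_reach))  # (38.364, 99.061)
```

The numbers under each item name are the cumulative reach: in the first row, VAR3 alone reaches 186 cases, adding VAR4 brings the total to 208, VAR5 to 210, and VAR23 to 212.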

The final report from a longer run

     ---------TURF analysis for file work2 completed----------
     | OPTIONS: none                                         |
     |                                                       |
     |      100 items were used in the analysis.             |
     |                                                       |
     |    1,000 cases were read and used.                    |
     |      973 cases had at least one positive response,    |
     |          making that the maximum possible reach.      |
     |                                                       |
     | SIZE   6 evaluated 1,192,052,400 combinations:        |
     |      941 was the best REACH, found in 1 combination.  |
     |    1,956 was the FREQ value in that combination.      |
     |    1,983 was the best FREQ in any size 6 combination. |
     |                                                       |
     | The FREQ score for a combination is the count         |
     | of the non-zero responses for that combination,       |
     | summed over the reached cases.                        |
     |                                                       |
     | REACH.RESULTS file work101 has the 100                |
     | combinations with the highest reach scores.           |
     | The items are ordered by their REACH contribution.    |
     | Cumulative reach is shown.                            |
     |                                                       |
     | Time: 18 minutes, 35.5 seconds.                       |
     ---------------------------------------------------------

The other TURF help files

The following help files complete the online TURF documentation:

TURF.EXAMPLES This shows a small dataset and four TURF runs, each done with different options, and explains the output. The runs show the effects of item, response and case weighting, as well as the effect of a reach threshold greater than 1.

TURF.OPTIONS This describes all of the TURF identifiers.

TURF.RESULTS The principal output from most TURF runs is the REACH.RESULTS file and, to a lesser extent, the FREQ.RESULTS file. This helpfile describes these files, and covers some related options that affect their contents and format.

This helpfile also describes how the individual reach contribution of each item within a combination is calculated. Note that this calculation was substantially improved with version 2.23 release 6 in September 2006.

TURF.FILES This helpfile describes four additional output files that TURF can produce: REACH.SUMMARY, FREQ.SUMMARY, FULL.OUTPUT and TEMPLATE.

TURF.METHODS This shows how the reach and freq scores for a case are calculated.

TURF.TIMINGS This describes how the length of a TURF run can be estimated.

TURF.LIMITS This discusses what can and cannot reasonably be attempted.