TURF.OPTIONS

This helpfile describes the various identifiers within P-STAT's TURF command.

General identifiers

TURF xxx

This supplies the input filename. Except for an optional weight variable, all variables are treated as analysis items.

The values on the analysis items should be zeros or positive numbers. A positive value signifies a "hit".

Cases with any missing or negative values on the analysis items are ignored. The SET.MISS.TO.ZERO identifier, described below, sets missing analysis items to zeros.

When case weighting is being used, any cases with a missing, negative or zero value on the weight variable are also ignored.
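The screening rules above can be sketched in Python. This is only an illustration of the stated rules, not P-STAT's code: the function name `usable_cases`, the use of `None` for a missing value, and the list-based data layout are all assumptions made for the example.

```python
def usable_cases(rows, weights=None, set_miss_to_zero=False):
    """Return (item_values, case_weight) pairs for the cases that are kept.

    rows    : list of lists of item values; None stands in for "missing"
              (an assumption of this sketch).
    weights : optional list of case weights, one per row.
    """
    kept = []
    for i, row in enumerate(rows):
        if set_miss_to_zero:
            # SET.MISS.TO.ZERO: missing analysis values become zeros.
            row = [0 if v is None else v for v in row]
        # A case with any missing or negative analysis value is ignored.
        if any(v is None or v < 0 for v in row):
            continue
        w = 1.0 if weights is None else weights[i]
        # With case weighting, a missing, negative or zero weight drops the case.
        if w is None or w <= 0:
            continue
        kept.append((row, w))
    return kept
```

With the hypothetical rows `[[1, 0], [None, 1], [0, -1], [2, 0]]`, two cases are kept by default and three when `set_miss_to_zero=True`.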

SIZE 6,     SIZE 4 to 7,     SIZE 6 to 3,     SIZE 4 6 8,

Specifies which combination sizes to use. Required. One or more sizes can be done in one run. They are done in the order given; for SIZE 6 to 4, size 6 is done first.

The final report shows the result for each size separately. The output files show the best results from the first size, then the second, and so forth.

Many (20 or more) sizes can be done in a run; each size must be from 1 to 60, and there should not be any repeated sizes in a run.
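As an illustration of the SIZE forms shown above, the following Python sketch expands a size specification into the ordered list of sizes. It assumes that a "to" range steps by one in the direction given (so "6 to 3" means 6, 5, 4, 3); the function name `expand_sizes` is hypothetical and this is not how P-STAT itself parses the command.

```python
def expand_sizes(spec):
    """Expand text such as '4 to 7', '6 to 3' or '4 6 8' into a list of sizes."""
    tokens = spec.lower().split()
    sizes = []
    i = 0
    while i < len(tokens):
        start = int(tokens[i])
        if i + 2 < len(tokens) and tokens[i + 1] == "to":
            end = int(tokens[i + 2])
            step = 1 if end >= start else -1   # assumed: ranges step by one
            sizes.extend(range(start, end + step, step))
            i += 3
        else:
            sizes.append(start)
            i += 1
    # Rules stated in the text: each size is from 1 to 60, with no repeats.
    if any(s < 1 or s > 60 for s in sizes):
        raise ValueError("each size must be from 1 to 60")
    if len(set(sizes)) != len(sizes):
        raise ValueError("sizes must not be repeated")
    return sizes
```

For example, `expand_sizes("6 to 3")` gives `[6, 5, 4, 3]`, so size 6 is first in the list.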

Note: some sizes cannot be run in a reasonable amount of time. Consider 40 items. Depending on the number of cases and on the options used:

  Size  4 takes 91,390 iterations.        Seconds.
  Size  6 takes 3.8 million iterations.   Minutes.
  Size 10 takes 847 million iterations.   An hour.
  Size 15 takes 40 billion iterations.    A day.
  Size 20 takes 137 billion iterations.   A week.

          This command produced the above numbers.
          DO  #j = 1, 20;
          PUT #j (combinations( 40,#j));
          ENDDO $
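The same counts can be checked outside P-STAT. The short Python sketch below uses the standard `math.comb` function to count the ways of taking 40 items a given number at a time; it only reproduces the iteration counts, not the run times.

```python
from math import comb

# Number of combinations of 40 items taken k at a time, for the sizes discussed.
counts = {k: comb(40, k) for k in (4, 6, 10, 15, 20)}

for k, n in counts.items():
    print(f"size {k:2d}: {n:,} combinations")
```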

The F2 key can be used to cause a TURF command to abandon the current size being processed. It will produce the report and the output files for the sizes already completed.

REACH.THRESHOLD 2

Optional. Can be fractional. This permits the user to control what constitutes a successful "reach".

The default is one; if a case has a positive response on any of the items in a given combination, that case is added to the reach total for that set of items.

Using REACH.THRESHOLD 3, for example, means a case needs a reach score of 3 or more to have been reached on a given group. Having several responses increases a case's reach score; weighting of either items or responses can also affect the reach score.
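A minimal Python sketch of the threshold idea follows. It assumes that a case's reach score for a combination is the sum, over the items in the combination, of the response value times the item weight; the text above does not give the exact formula, so treat that as an assumption. The names `weighted_reach`, `cases` and `item_weights` are hypothetical.

```python
def weighted_reach(cases, combo, threshold=1.0, item_weights=None):
    """Total (case-weighted) reach of one combination of items.

    cases        : list of (responses, case_weight), where responses maps
                   an item name to its value (0 = no, positive = hit).
    combo        : the item names in the combination.
    item_weights : optional mapping of item name to weight (default 1).
    """
    item_weights = item_weights or {}
    total = 0.0
    for responses, case_weight in cases:
        # ASSUMPTION: score = sum of response * item weight over the combination.
        score = sum(responses[item] * item_weights.get(item, 1.0) for item in combo)
        if score >= threshold:
            total += case_weight
    return total
```

With the default threshold of 1 and 0/1 responses, a case counts as reached when it has a hit on any item in the combination. Raising the threshold to 2 in this sketch requires hits on at least two items, or a larger weight on one of them.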

PROGRESS 5

Optional. Controls how often the progress window or report line is updated. The default is 1, which means every million combinations. PROGRESS 0 turns it off.

SET.MISS.TO.ZERO

Optional. If used, missing analysis values in the input file are set to zeros. This saves having to write PPL to do the same thing as the file is read.

Identifiers that control the makeup of the combinations to be used

USE list-of-vars min max

Optional. This limits the makeup of the combinations to be tried.

Of the variables whose names (or ranges) follow USE, at least MIN of them and at most MAX of them should be in every combination that will be tried. The MIN value can be zero.

Up to 30 such USE phrases can be given. Combinations are used only if they pass the constraints in every one of the USE phrases.

Each use of USE is followed by:

  1. The names of the variables in the group. Ranges, like TOPPING.1 TO TOPPING.8, can be used.

  2. The smallest number of those variables that are required. Can be zero. A combination must have AT LEAST that many of the variables in the group.

  3. The largest number of those variables that may be used. A combination may have AT MOST that many of the variables in the group.

    All of the group could be used if the supplied number is equal to or larger than the size of the group. Therefore, using 999 is a vivid way of saying there is no upper limit for the group.

    For example:
       TURF xxx, size 8,
           use  aaa         bbb to ddd  1 999,
           use  eee to ggg  jjj to mmm  2 4,
           use  yyy         zzz         0 1 $
    
    In the above command, the only combinations that will be evaluated are those that have at least one variable from the first group, and at least two but no more than four variables from the second group, and no more than one variable from the third group.
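The filtering that USE phrases describe can be sketched in Python as follows. The item names, the group contents and the function name `passes_use` are made up for the illustration, and groups are written out explicitly rather than as ranges; this shows the min/max rule only, not P-STAT's implementation.

```python
from itertools import combinations

def passes_use(combo, use_phrases):
    """True if the combination satisfies every (group, min, max) phrase."""
    chosen = set(combo)
    return all(lo <= len(chosen & set(group)) <= hi
               for group, lo, hi in use_phrases)

# Hypothetical run: 7 items, size 3, two USE phrases.
items = ["aaa", "bbb", "ccc", "eee", "fff", "yyy", "zzz"]
use_phrases = [
    (["aaa", "bbb", "ccc"], 1, 999),   # at least one, no upper limit
    (["yyy", "zzz"], 0, 1),            # at most one
]

passing = [c for c in combinations(items, 3) if passes_use(c, use_phrases)]
print(len(passing), "combinations pass the USE constraints")
```

A combination is kept only when it passes every phrase, which matches the rule that combinations must satisfy the constraints in every USE phrase.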

FORCE vars

Optional. Names or ranges of items that should be part of every combination.

Suppose there are 30 items and size is 6; without FORCE, 593,775 combinations are done, because we take 30 items 6 at a time.

If 2 items are forced, only 20,475 combinations will be done because the run reduces to 28 items taken 4 at a time.

If size is 6 and 6 items are forced, just that one combination will be done.
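The arithmetic above can be checked with a short Python sketch. The generator `forced_combinations` is a hypothetical helper written for this example: it keeps the forced items in every combination and chooses only the remaining slots from the other items.

```python
from itertools import combinations
from math import comb

def forced_combinations(items, size, forced=()):
    """Yield every combination of the given size that contains all forced items."""
    free = [i for i in items if i not in forced]
    for rest in combinations(free, size - len(forced)):
        yield tuple(forced) + rest

print(comb(30, 6))   # 30 items taken 6 at a time, no forcing
print(comb(28, 4))   # 2 items forced: 28 remaining items taken 4 at a time
```

Counting the output of `forced_combinations(range(30), 6, (0, 1))` gives the same total as `comb(28, 4)`, and forcing all six slots yields exactly one combination.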

Identifiers for various kinds of weighting

CASE.WEIGHTS varname

Optional. The named variable will be used as a case weight, and not as an analysis item.

ITEM.WEIGHTS filename

The default is to treat all of the items the same, i.e., with weights of 1.

When ITEM.WEIGHTS is used, it should be followed by the name of a P-STAT system file which itself has exactly 2 variables.

In each record, the first variable has the name of an item being used for the TURF analysis, and the second is the weight to be used for that item. The first variable is therefore character, and the second is numeric.

The file is not required to have a record for every item. In other words, some items can be given changed weights; others can be left as is (i.e., still set to 1).

The file can have names and weights for items not used in the current run; if so, they are ignored.
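The lookup rules for the item weights file can be sketched in Python. Here the file is represented as a list of (name, weight) pairs, which is an assumption of the example, and `resolve_item_weights` is a hypothetical name.

```python
def resolve_item_weights(analysis_items, weight_records):
    """Return the weight to use for each analysis item.

    analysis_items : names of the items in the current run.
    weight_records : (item_name, weight) pairs, as in the two-variable file.
    """
    supplied = dict(weight_records)
    # Items without a record keep the default weight of 1;
    # records for items not in the run are simply never looked up.
    return {item: supplied.get(item, 1.0) for item in analysis_items}
```

For instance, with items `a`, `b`, `c` and records for `b` and an unused item `zzz`, only `b` gets a changed weight.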

RESPONSE.WEIGHTS

The default is to store the input data as zeros or ones, with one meaning a yes. This option leaves the input values intact; each value should be zero (no) or a positive value (not necessarily an integer) that shows the INTENSITY of a yes.
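The difference between the default storage and RESPONSE.WEIGHTS can be sketched in a few lines of Python. The function name `store_responses` is hypothetical, and the sketch assumes the values have already been screened so that none are missing or negative.

```python
def store_responses(values, response_weights=False):
    """Return the item values as they would be held for the analysis."""
    if response_weights:
        # RESPONSE.WEIGHTS: keep the intensities exactly as supplied.
        return list(values)
    # Default: any positive value is stored as 1 (a yes), zero stays 0.
    return [1 if v > 0 else 0 for v in values]
```

So the hypothetical values `[0, 2.5, 1]` are held as `[0, 1, 1]` by default and as `[0, 2.5, 1]` when response weighting is requested.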

REACH.RESULTS and FREQ.RESULTS

The following five identifiers are fully described in TURF results.

REACH.RESULTS rrr

Optional output P-STAT system file. This file holds the combinations with the best REACH values.

REACH.DETAILS cumulative.pct

This controls which lines are printed in the REACH.RESULTS file to show the importance of the items in a combination.

FREQ.RESULTS fff

Optional output P-STAT system file. This file holds the combinations with the best FREQ values.

FREQ.DETAILS cumulative.pct

This controls which lines are printed in the FREQ.RESULTS file to show the importance of the items in a combination.

OMIT size pct.of.max.reach

This provides a way to drop some of the statistics in the REACH.RESULTS and FREQ.RESULTS files.

Other optional output file identifiers

The following four identifiers are fully described in TURF files.

REACH.SUMMARY qqq

Optional output P-STAT system file. This file shows how many combinations had each of the reach values that were found.

FREQ.SUMMARY qqq

Optional output P-STAT system file. This file shows how many combinations had each of the freq values that were found.

FULL.OUTPUT fff

Optional output P-STAT system file. This has the results of ALL combinations in the order that they were processed. This should only be used in very small runs.

TEMPLATE ttt

Optional output P-STAT system file. This contains the names of the items that made up the best combination. It is intended for the TURF.SCORES command.