
General Support

For help with installation and programming problems, contact us.

Frequently Asked Questions

Why the name P-STAT?

The "P" stands for Princeton, where the program was developed. The "STAT" stands for statistics, which was the initial emphasis.


P-STAT still has an extensive suite of basic statistics but its strengths continue to be in its tables, its report writing, its language for manipulating data and its file management.

What machines does P-STAT run on?

Intel Chip personal computers running Windows XP, Windows 7 and 8, etc.
Intel Chip personal computers running Unix or Linux.
Most Unix workstations, including Sun SPARC.

How can I use the help file when I always run batch jobs?

Run a P-STAT job with the following two commands:

PR 'HELP.TXT' $
HELP, EVERYTHING $

The text file HELP.TXT can now be searched by your own text processor when you are building your batch job streams.

How many cases and variables can I have in a file?

The number of cases is limited only by the available disk space.

The number of variables depends on the implementation. A 6,000 variable size is supplied for most machines. If you need a larger size, Whopper 3 at 10,000 variables and Whopper 4 which handles up to 25,000 variables (fields) per case are available on most supported computers.

What about character (string) variables?

In P-STAT a single character variable can be up to 50,000 characters long. Many character functions are available to manipulate these variables.

How does P-STAT handle dates?

P-STAT provides 47 PPL functions for working with dates and times.

These functions accept date or date/time input and do things like add days to a date, or determine the number of hours between two date/time values.

These functions accept dates in character form like:
'Mon June 17, 2002 12:30:12.123'

The time field, which allows milliseconds, can be omitted. The year can be from 1753 to 2999.

Six logical operators (DATE.LT through DATE.GT) have been added. These are used to see if one date comes before or after another. For example:

[if birthday DATE.GT 'jan 1 1950', delete]
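
For readers who want to see the same kinds of operations outside P-STAT, the following is a rough Python sketch, not P-STAT code. It parses the sample string shown above, adds days to it, counts the hours between two values, and mimics the DATE.GT test. The format string, the 30-day offset and the sample birthdays are assumptions made for illustration, and the day and month names assume an English locale.

```python
from datetime import datetime, timedelta

# Format matching the sample string 'Mon June 17, 2002 12:30:12.123'
# (assumes an English locale for the day and month names).
FMT = "%a %B %d, %Y %H:%M:%S.%f"

start = datetime.strptime("Mon June 17, 2002 12:30:12.123", FMT)

# Add days to a date (30 is an arbitrary illustrative offset).
later = start + timedelta(days=30)

# Number of hours between two date/time values.
hours_between = (later - start).total_seconds() / 3600

# Analogue of: [if birthday DATE.GT 'jan 1 1950', delete]
# The birthdays below are made-up sample data.
birthdays = [datetime(1948, 5, 2), datetime(1962, 11, 30)]
cutoff = datetime(1950, 1, 1)
kept = [b for b in birthdays if not b > cutoff]

print(later, hours_between, kept)
```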

Programming Questions






Solutions

1:     Rename all the variables

The easiest way is to build a new file that contains nothing but variable names and then use the on-the-fly (+) feature to concatenate the files. For example:
      BUILD Names;
      VARS Age Income Education Occupation Salary;
      READ $
      MODIFY Names + Myfile, OUT Myfile $


The only constraint is that the data type (character or numeric) of the variables in the "names" file must match those in the data file.

The RENAME function can be used to rename either individual variables or lists of variables. The following example renames 2 variables.

     MODIFY Myfile [ RENAME Item12 TO Education;
                     RENAME Item19 TO Income ],  
        OUT Myfile $
A list of variables can be renamed by using a DO loop and supplying the RENAME function with rules. The following example renames every variable in the file except the first, using the prefix "Y95".

    MODIFY Myfile [ DO #J USING 2 .ON.;
           RENAME V(#J) TO ( 'Y95' & ) ], OUT Myfile95 $
By using patterns in the parentheses after the TO, new labels can be created which contain sequence numbers with a specified prefix and possibly a suffix.



2:     Computing Means

The answer depends on what means you want and whether you just want to see the means or whether you want to save them for later use.

The following solutions apply equally to computing totals, minimums, maximums, and standard deviations. To provide the answers we use the following trivial data set.

      Name    Test1  Test2  Test3

      Sam        85     78     95
      John       74     81     79
      Sally      89      -     95 (Sally did not take Test2)

If you want each student's mean across the three test variables, two functions are available. The MEAN function produces a mean only if all the values are non-missing. The MEAN.GOOD function computes the mean from whatever non-missing values are available. The command

    LIST Students [ GEN Grade = MEAN.GOOD ( Test1 TO Test3 ) ] $

produces the following listing:
 
      Name    Test1  Test2  Test3 Grade
 
      Sam        85     78     95    86
      John       74     81     79    78
      Sally      89      -     95    92
 
If the MODIFY command is used with an output file, the mean is available for use in subsequent steps. The following example saves the file. Because the MEAN function is used instead of MEAN.GOOD, Sally has a missing value for the new variable.

    MODIFY Students [ GEN Grade = MEAN ( Test1 TO Test3 ) ],
    OUT Students $
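
The difference between the two functions can be sketched in Python. This is an illustration of the behavior described above, not P-STAT code: the function names mean_strict and mean_good are hypothetical stand-ins for MEAN and MEAN.GOOD, None stands in for a missing value, and the scores are taken from the listing.

```python
def mean_strict(values):
    """Stand-in for MEAN: missing (None) unless every value is present."""
    if any(v is None for v in values):
        return None
    return sum(values) / len(values)


def mean_good(values):
    """Stand-in for MEAN.GOOD: mean of the non-missing values only."""
    good = [v for v in values if v is not None]
    return sum(good) / len(good) if good else None


# Scores as shown in the listing; None marks Sally's missing Test2.
students = {
    "Sam": [85, 78, 95],
    "John": [74, 81, 79],
    "Sally": [89, None, 95],
}

for name, scores in students.items():
    print(name, mean_strict(scores), mean_good(scores))
```

Only Sally's row differs between the two functions, because she is the only student with a missing score.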

There are several ways to get the average for each of the three tests. If you want to see the value in a listing, the LIST command can be used with the MEANS identifier.

    LIST Students, MEANS $

which produces the following listing:
 
       Name   Test1     Test2     Test3 
 
       Sam       85        78        95   
       John      74        81        79   
       Sally     89         -        95   
               =======   =======   =======
  Mean           82.67     79.50     89.67


If you want the cross-case means saved in a file, you can use either the COUNT command or the AGGREGATE command. The following examples illustrate the difference:
   COUNT Students, none, stats 1, out mstat $
   LIST mstat $

                      count or
    variable  value       stat

    Test1     mean    82.66667
    Test2     mean    79.50000
    Test3     mean    89.66667

   _____________________________________________

   AGGREGATE Students [ drop name ], means mstat $
   LIST mstat $

        Test1  Test2      Test3  statistic

     82.66667   79.5   89.66667  means  
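
As a cross-check on the figures above, here is a small Python sketch (not P-STAT code) that computes each test's mean over the non-missing scores. The data layout and variable names are assumptions made for illustration.

```python
# One row per student; None marks the missing Test2 score.
rows = [
    ("Sam", 85, 78, 95),
    ("John", 74, 81, 79),
    ("Sally", 89, None, 95),
]
tests = ["Test1", "Test2", "Test3"]

means = {}
for col, test in enumerate(tests, start=1):
    good = [row[col] for row in rows if row[col] is not None]
    means[test] = sum(good) / len(good)

for test in tests:
    print(f"{test}  mean  {means[test]:.5f}")
```

Test2 is averaged over two students rather than three, which is why its mean is exactly 79.5.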




3:     Rearranging the Order of Variables

It is very easy to rearrange the order of variables in a file.
   LIST Myfile [ KEEP Q1 Q5 Q7 Q2 to Q4 Q9 Q6 Q8 Q10 ] $
However, when there are many variables this can be tedious. There are a number of features in PPL (the P-STAT Programming Language) which can be used if the reordering follows a pattern or if the variable names provide help.

For the purposes of the following problems we begin with a file of 41 variables. The first variable is an id variable. It is followed by 20 pairs of variables. The first variable in each pair is the score on a test taken at the beginning of the semester. The second variable in the pair is the score on the same test taken at the end of the semester. The variables are named ID and Q1 to Q40.

The first task is to rearrange the file so that the scores for all the pre tests are in positions 2-21 of the file followed by the scores for the post tests. Brute force can do it:
  [ KEEP Id Q1 Q3 Q5 ... Q39 Q2 Q4 Q6 ... Q40 ]
Far easier and less error prone is the use of a mask.
  [ KEEP Id Q1 .on. ( MASK 10 ) Q1 .on. ( MASK 01 ) ]     
which is the same as:
  [ KEEP Id Q1 .on. ( MASK 10 ) Q2 .on. ( MASK 10 ) ]   
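
The effect of the two masks can be sketched in Python. This illustrates the selection pattern only and is not P-STAT code: apply_mask is a hypothetical helper, and it assumes the mask repeats along the variable list with 1 meaning keep and 0 meaning skip.

```python
def apply_mask(names, mask):
    """Keep names[i] wherever the repeating mask has '1' at position i."""
    return [n for i, n in enumerate(names) if mask[i % len(mask)] == "1"]


q = [f"Q{i}" for i in range(1, 41)]  # Q1 .. Q40

# Q1 .on. ( MASK 10 ) followed by Q1 .on. ( MASK 01 )
reordered = ["Id"] + apply_mask(q, "10") + apply_mask(q, "01")

# Starting one variable later with MASK 10 selects the same variables
# as MASK 01 applied from Q1.
same = apply_mask(q[1:], "10") == apply_mask(q, "01")

print(reordered[:5], reordered[21:24], same)
```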
If you choose to give your variables names such as pre1 to pre20 and post1 to post20 when you create the file or by using the RENAME command, you can use wildcard notation:
  [ KEEP pre? post? ]
Wildcards can prove a double-edged sword: they are elegant and parsimonious, but they must be used with care.

More difficult is the problem of taking this new file and putting it back in its original order. Again there are several solutions. Brute force, of course, will work:
  [ KEEP Id Q1 Q2 Q3 Q4 ..... Q39 Q40 ]
The simplest solution relies on a little-known attribute of the SPLIT function and on the realization that a case can be SPLIT into 1, which does not seem like much of a split.

The CYCLE attribute specifies a step size between the variables to be accessed. Thus CYCLE 20 means: take a variable and then cycle through the list of variables taking every 20th variable. The first time through cycle starts with the first variable. The second time through cycle starts with the second variable and so on until all the variables in the list have been selected.
   MODIFY Students
     [ SPLIT INTO 1, CARRY Id, USE ( Q1 .ON. ) CYCLE 20 ],
   OUT Students $
This solution works whatever the original variable names are, and it preserves those names.
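
The selection order that CYCLE 20 produces, as described above, can be sketched in Python. This is an illustration rather than P-STAT code; cycle_order is a hypothetical helper, and the input list assumes the file is in the rearranged order (all pre tests, then all post tests).

```python
def cycle_order(names, step):
    """Take names[0], names[step], ..., then names[1], names[1 + step], ..."""
    return [
        names[j]
        for start in range(step)
        for j in range(start, len(names), step)
    ]


pre = [f"Q{i}" for i in range(1, 41, 2)]   # Q1, Q3, ..., Q39
post = [f"Q{i}" for i in range(2, 41, 2)]  # Q2, Q4, ..., Q40

restored = cycle_order(pre + post, 20)
print(restored[:6])
```

With 40 variables and a step of 20, each pass picks up one pre test and its matching post test, so the pairs come back in their original order.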





4:     Printing the Output

How you print your output depends on where you want the printout to go.

Print files are usually directed to a disk file, a print queue or, on a PC with a printer attached, directly to the printer. The print queue and the direct attachment are merely special names. The PR keyword is used with the name of the print destination.

When PR is used by itself as a command, all subsequent output is directed to that destination until another PR keyword changes the destination.
    PR "listing.prt" $


When PR is used within a command as either an identifier or a subcommand, only the current output from that command is sent to the designated destination. When the command ends, all subsequent printout reverts to the original destination.
   LIST Myfile,  PR "listing.prt" $ 
Under Windows with an attached printer, the names PRN, LPT1, LPT2 and LPT3 are recognized as special.
   LIST Myfile,  PR PRN $
Printout that goes to the terminal and printout that goes to a disk file or print queue may have different attributes. The PRINTER.SETTINGS command permits you to specify the names and attributes for 8 or more print destinations. The output is automatically reformatted to conform to the attributes of the print destination associated with the current PR keyword.
  PRINTER.SETTINGS "Disk.fil",  PAGE.CHARACTER ' ',
    NO ECHO, NO UNDERLINE, LINES 100, OUTPUT.WIDTH 160 $

  PRINTER.SETTINGS "Queue.prt", PAGE.CHARACTER 10,
    ECHO,  LINES 56, OUTPUT.WIDTH 132 $




5:     A macro to create passwords

The following macro creates a password of a selected length that is easy to recreate only if the 3-character code is known. To make it more difficult for anyone else to break, run it twice: the first time to generate a new 3-character code, the second time to create a longer password.

RUN password ( abc, 3 ) $

produces the following characters for the password:

urM

RUN password ( urM, 8 ) $

produces:
   l   0   5   a   j   E   P   j

The Macro

MACRO password ( code, len ) $
 
/*  to use the macro, provide a 3 character code and the
    number of characters to be provided.

    RUN password ( 'cat', 6 ) $ 
*/  

/* build a dummy file; The number of variables will be the number
   of characters (&len) to be placed in the generated password */
 
BUILD work;
vars var1 to var&len;
read;
&len*1 $

/* generate seed variables for the random number generator */

GEN ##code:c3 = "&code" $
GEN ##s1 = ival( substring ( ##code, 1, 1 )) $
GEN ##s2 = ival( substring ( ##code, 2, 1 )) * 13 $
GEN ##s3 = ival( substring ( ##code, 3, 1 )) * 39 $

LIST work [
   DO #k = 1, &len;
   /* create &len character variables to hold password characters */;
   GENERATE ?:c1;
   ENDDO;

   /* use var1 to var&len for size of password */;
   DO #j using var1 to var&len;

   /* get a random number.  */;
   get: SET v(#j) = INT ( RANUNI ( -3, ##s1, ##s2, ##s3 ) * 1000 );

   /* if number does not map to ascii letter or number get another */;
   IF v(#j) notamong ( 49 to 57 65 to 90 97 to 122 ) go to get;

   /* convert ascii number to character */;
   SET v(#j+&len) = CVAL (v(#j));
   ENDDO;

   /* show just the new character variables */;
   KEEP .new. ], DATA.ONLY $
 
ENDMACRO $
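
The logic of the macro can be sketched in Python for readers who want to follow it step by step. This is a rough analogue under stated assumptions, not a port: make_password is a hypothetical name, Python's random module stands in for RANUNI, the way the three seed values are combined is an assumption, and ord() is assumed to correspond to ival. It will therefore not reproduce the passwords that P-STAT generates.

```python
import random

# ASCII codes accepted by the macro: 49-57, 65-90 and 97-122.
ALLOWED = set(range(49, 58)) | set(range(65, 91)) | set(range(97, 123))


def make_password(code, length):
    """Build a repeatable password from a 3-character code."""
    if len(code) != 3:
        raise ValueError("code must be exactly 3 characters")

    # Seed values derived from the code, mirroring ##s1, ##s2 and ##s3.
    s1 = ord(code[0])
    s2 = ord(code[1]) * 13
    s3 = ord(code[2]) * 39

    # Assumption: combine the three seeds into one string seed.
    rng = random.Random(f"{s1}-{s2}-{s3}")

    chars = []
    while len(chars) < length:
        n = int(rng.random() * 1000)  # integer in 0..999
        if n in ALLOWED:              # otherwise draw again
            chars.append(chr(n))
    return "".join(chars)


print(make_password("abc", 3))
print(make_password("abc", 8))
```

Because the generator is seeded only from the code, the same code and length always give the same password, which is the property the macro relies on.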