General Support
Support for installation and programming problems
can be obtained by contacting us.
Frequently Asked Questions
- Why the name P-STAT?
  - The "P" stands for Princeton, where the program was developed.
    The "STAT" stands for statistics, which was the initial
    emphasis.
  - P-STAT still has an extensive suite of basic statistics, but
    its strengths continue to be in its tables,
    its report writing, its language for
    manipulating data and its file management.
- What machines does P-STAT run on?
  - Intel chip personal computers running
    Windows 95/98, NT, XP, etc.
  - Intel chip personal computers running Unix or Linux.
  - Most Unix workstations, including Sun SPARC.
- How can I use the help file when I always run batch jobs?
  - Run a P-STAT job with the following two commands:
      PR 'HELP.TXT' $
      HELP, EVERYTHING $
    The text file HELP.TXT can now be searched by your own text
    processor when you are building your batch job streams.
- How many cases and variables can I have in a file?
  - The number of cases is limited only by the available disk space.
  - The number of variables depends on the implementation.
    A 6,000 variable size is supplied for most machines.
    If you need a larger size, Whopper 3, at 10,000 variables, and
    Whopper 4, which handles up to 25,000 variables (fields) per case,
    are available on most supported computers.
- What about character (string) variables?
  - In P-STAT a single character variable can be up to
    50,000 characters long.
    Many character functions are available to manipulate these
    variables.
- How does P-STAT handle dates?
  - 47 PPL date and time functions are available.
    These functions accept date or date/time input and do things
    like add days to a date, or determine the number of
    hours between two date/time values.
    These functions accept dates in character form like:
      'Mon June 17, 2002 12:30:12.123'
    The time field, which allows milliseconds, can be
    omitted. The year can be from 1753 to 2999.
    Six logical operators (DATE.LT through DATE.GT)
    have been added. These are used to see if one date
    comes before or after another. For example:
      [if birthday DATE.GT 'jan 1 1950', delete]
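For readers more at home in a general-purpose language, the kind of date
arithmetic described above can be sketched in Python. This is an analogy
only, not PPL: the standard-library datetime module stands in for the
P-STAT date functions, and the second timestamp and the birthday value
are hypothetical, chosen just for illustration.

```python
from datetime import datetime, timedelta

# Format matching the character form shown above, e.g.
# 'Mon June 17, 2002 12:30:12.123' (assumes English day and month names).
FMT = "%a %B %d, %Y %H:%M:%S.%f"

start = datetime.strptime("Mon June 17, 2002 12:30:12.123", FMT)

# Add days to a date.
later = start + timedelta(days=10)

# Number of hours between two date/time values (second value is hypothetical).
end = datetime.strptime("Tue June 18, 2002 18:30:12.123", FMT)
hours_between = (end - start).total_seconds() / 3600

# Rough analogue of [if birthday DATE.GT 'jan 1 1950', delete]:
# the case is dropped when the birthday falls after January 1, 1950.
birthday = datetime(1962, 3, 4)  # hypothetical value
delete_case = birthday > datetime(1950, 1, 1)

print(later.date(), hours_between, delete_case)
```

The comparison on the last lines plays the role of DATE.GT; the other five
operators would correspond to the remaining Python comparison operators.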
Programming Questions
Return to P-STAT's Home Page
Solutions
1:
Rename all the variables
The easiest way is to build a new file which has nothing
in it but variable names and then use the on-the-fly (+) feature to
concatenate the files. For example:
BUILD Names;
VARS Age Income Education Occupation Salary;
READ $
MODIFY Names + Myfile, OUT Myfile $
The only constraint is that the data type (character or
numeric) of the variables in the "names" file must match
those in the data file.
The RENAME function can be used to rename either individual
variables or lists of variables. The following example
renames 2 variables.
MODIFY Myfile [ RENAME Item12 TO Education;
RENAME Item19 TO Income ],
OUT Myfile $
A list of variables can be renamed by using a DO loop and
supplying the RENAME function with rules. The following
example gives the prefix "Y95" to every variable in the file
except the first.
MODIFY Myfile [ DO #J USING 2 .ON.;
RENAME V(#J) TO ( 'Y95' & ) ], OUT Myfile95 $
By using patterns in the parentheses after the TO, new labels
can be created which
contain sequence numbers with a specified prefix and possibly
a suffix.
2:
Computing Means
The answer depends on what means you want and
whether you just want to see the means or whether you want
to save them for later use.
The following solutions apply equally to computing totals,
minimums, maximums, and standard deviations.
To provide the answers we use the
following trivial data set.
Name    Test1   Test2   Test3
Sam        85      78      95
John       74      81      79
Sally      89       -      95    (Sally did not take Test2)
If the mean that is wanted is the mean of the three test variables,
there are two functions available. The MEAN
function produces the mean only if all the values are non-missing.
The MEAN.GOOD function computes the mean using the
available good values.
The command
LIST Students [ GEN Grade = MEAN.GOOD ( Test1 TO Test3 ) ] $
produces the following listing:
Name Test1 Test2 Test3 Grade
Sam 85 78 95 86
John 74 81 79 78
Sally 89 - 95 92
If the MODIFY command is used with an output file,
the mean is available for use in subsequent steps.
The following example saves the file. Because the MEAN function
is used instead of MEAN.GOOD, Sally has a missing value
for the new variable.
MODIFY Students [ GEN Grade = MEAN ( Test1 TO Test3 ) ]
OUT Students $
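The difference between the two functions can be sketched outside P-STAT.
The Python below is only an illustration of the behaviour described above,
not the P-STAT implementation; the names mean_strict and mean_good are
hypothetical, and None stands in for a missing value. The scores are the
ones shown in the listing.

```python
# Test scores from the example listing; None marks a missing value.
students = {
    "Sam":   [85, 78, 95],
    "John":  [74, 81, 79],
    "Sally": [89, None, 95],
}

def mean_strict(values):
    """Like MEAN as described above: missing if any value is missing."""
    if any(v is None for v in values):
        return None
    return sum(values) / len(values)

def mean_good(values):
    """Like MEAN.GOOD as described above: average the non-missing values."""
    good = [v for v in values if v is not None]
    return sum(good) / len(good) if good else None

for name, scores in students.items():
    print(name, mean_strict(scores), mean_good(scores))
```

For Sally the strict mean is missing, while the mean of her two good
values is 92, matching the Grade column in the listing.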
There are several ways to get the average for each of the
three tests. If you want to see the value in a listing,
the LIST command can be used with the MEANS identifier.
LIST Students, MEANS $
which produces the following listing:
Name Test1 Test2 Test3
Sam 85 78 95
John 74 81 79
Sally 89 - 95
======= ======= =======
Mean 82.67 79.50 89.67
If you wish to have the cross case means saved in a file,
either the COUNT command or the AGGREGATE command might be used.
The following examples illustrate the difference:
COUNT student, none, stats 1, out mstat $
LIST mstat $
count or
variable value stat
Test1 mean 82.66667
Test2 mean 79.50000
Test3 mean 89.66667
_____________________________________________
AGGREGATE student [ drop name ], means mstat $
LIST mstat $
Test1 Test2 Test3 statistic
82.66667 79.5 89.66667 means
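As a cross-check on the figures above, the cross case means can be
recomputed by hand. The Python below is a minimal sketch, not a
description of how COUNT or AGGREGATE work internally; the function name
column_means is hypothetical, and missing values (None) are simply left
out of each average.

```python
# Rows of the example data set: name followed by the three test scores.
students = [
    ("Sam", 85, 78, 95),
    ("John", 74, 81, 79),
    ("Sally", 89, None, 95),
]

def column_means(rows, names):
    """Mean of each score column, skipping missing (None) values."""
    means = {}
    for i, name in enumerate(names, start=1):
        good = [row[i] for row in rows if row[i] is not None]
        means[name] = sum(good) / len(good) if good else None
    return means

means = column_means(students, ["Test1", "Test2", "Test3"])
for test, value in means.items():
    print(f"{test} {value:.2f}")
```

Test2 averages only two scores because one is missing, which is why its
mean is 79.50.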
3:
Rearranging the Order of Variables
It is very easy to rearrange the order of variables in a file.
LIST Myfile [ KEEP Q1 Q5 Q7 Q2 to Q4 Q9 Q6 Q8 Q10 ] $
However, when there are many variables this can be tedious.
There are a number of features in PPL (the P-STAT Programming
Language) which can be used if the reordering follows a
pattern or if the variable names provide help.
For the purposes of the following problems we begin with
a file of 41 variables. The first variable is an id
variable. It is followed by 20 pairs of variables.
The first variable in each pair is the score on a test
taken at the beginning of the semester. The second
variable in the pair is the score on the same test taken
at the end of the semester.
The variables are named
ID and Q1 to Q40.
The first task is to rearrange the file so that the scores
for all the pre tests are in positions 2-21 of the file followed
by the scores for the post tests. Brute force can do it:
[ KEEP Id Q1 Q3 Q5 ... Q39 Q2 Q4 Q6 ... Q40 ]
Far easier and less error prone is the use of a mask.
[ KEEP Id Q1 .on. ( MASK 10 ) Q1 .on. ( MASK 01 ) ]
which is the same as:
[ KEEP Id Q1 .on. ( MASK 10 ) Q2 .on. ( MASK 10 ) ]
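The effect of the two masks can be illustrated with a short Python
sketch. It assumes, based on the equivalence shown above, that a mask is
repeated along the variable list and that a 1 keeps the variable while a
0 skips it; the helper apply_mask is hypothetical and is not part of PPL.

```python
from itertools import cycle

def apply_mask(names, mask):
    """Keep the names whose position lines up with a '1' in the repeating mask."""
    return [n for n, bit in zip(names, cycle(mask)) if bit == "1"]

q = [f"Q{i}" for i in range(1, 41)]   # Q1 .. Q40

pre = apply_mask(q, "10")             # Q1 .on. ( MASK 10 )
post = apply_mask(q, "01")            # Q1 .on. ( MASK 01 )
post_alt = apply_mask(q[1:], "10")    # Q2 .on. ( MASK 10 )

# New variable order: Id, the 20 pre tests, then the 20 post tests.
reordered = ["Id"] + pre + post
print(reordered)
```

Under that assumption, post and post_alt hold the same 20 names, which is
why the two KEEP phrases are interchangeable.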
If you choose to give your variables names such as
pre1 to pre20 and
post1 to post20 when you create the file or by using the
RENAME command, you can use wildcard notation:
[ KEEP pre? post? ]
Wildcards can be a double-edged sword: they are elegant and
parsimonious, but they must be used with care.
More difficult is the problem of taking this new file
and putting it back in its original order. Again there
are several solutions. Brute force, of course, will work:
[ KEEP Id Q1 Q2 Q3 Q4 ..... Q39 Q40 ]
The simplest solution requires the use of
a little-known attribute of the SPLIT function and the
realization that a case can be SPLIT into 1, which
does not seem like much of a split.
The CYCLE
attribute specifies a step size between the variables
to be accessed. Thus CYCLE 20 means: take a variable and then
cycle through the list of variables taking every 20th variable.
On the first pass, CYCLE starts with the first variable.
On the second pass, it starts with the second variable,
and so on until all the variables in the list have been selected.
MODIFY Students
[ SPLIT INTO 1, CARRY Id, USE ( Q1 .ON. ) CYCLE 20 ],
OUT Students $
This solution works whatever the original variable names are,
and those names are preserved.
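The selection order that CYCLE 20 produces can be traced with a small
Python sketch. It models only the ordering described above, not the SPLIT
function itself, and the helper name cycle_order is hypothetical.

```python
def cycle_order(names, step):
    """Take every step-th name starting at position 0, then 1, and so on."""
    ordered = []
    for start in range(step):
        ordered.extend(names[start::step])
    return ordered

# The rearranged file: 20 pre test scores followed by 20 post test scores.
rearranged = ([f"Q{i}" for i in range(1, 41, 2)]
              + [f"Q{i}" for i in range(2, 41, 2)])

restored = cycle_order(rearranged, 20)
print(restored)
```

The first pass picks positions 1 and 21 (Q1 and Q2), the second pass
positions 2 and 22 (Q3 and Q4), and so on, so the pairs end up side by
side in their original order.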
4:
Printing the Output
How you print your output depends on where you want the printout to
go.
Print files are usually directed to a disk file, a
print queue or, on a PC with a printer attached, directly
to the printer. The print queue and the direct attachment
are merely special names. The PR keyword is used with the
name of the print destination.
When PR is used by itself as a command, all subsequent
output is directed to that destination until another
PR keyword changes the destination.
PR "listing.prt" $
When the PR is used within a command as either an
identifier or a subcommand, only the current output from
that command is sent to the designated destination. When
the command ends, all subsequent printout reverts to the
original destination.
LIST Myfile, PR "listing.prt" $
Under Windows with an attached printer, the names PRN,
LPT1, LPT2 and LPT3 are recognized as special.
LIST Myfile, PR PRN $
Printout that goes to the terminal and printout that goes
to a disk file or print queue may have different attributes.
The PRINTER.SETTINGS command permits you to specify the
names and attributes for 8 or more print destinations.
The output is automatically reformatted to conform to
the attributes of the print destination associated with
the current PR keyword.
PRINTER.SETTINGS "Disk.fil", PAGE.CHARACTER ' ',
NO ECHO, NO UNDERLINE, LINES 100, OUTPUT.WIDTH 160 $
PRINTER.SETTINGS "Queue.prt", PAGE.CHARACTER 10,
ECHO, LINES 56, OUTPUT.WIDTH 132 $
5:
A macro to create passwords
The following macro is designed to create a password of a selected
length which will be easy to recreate only if the 3 character code
is known. To make it more difficult for anyone else to break,
run it twice: the first time to generate a new 3 character code;
the second time to create a longer password.
RUN password ( abc, 3 ) $
produces the following characters for the password:
urM
RUN password ( urM, 8 ) $
produces:
l 0 5 a j E P j
The Macro
MACRO password ( code, len ) $
/* to use the macro, provide a 3 character code and the
number of characters to be provided.
RUN password ( 'cat', 6 ) $
*/
/* build a dummy file; The number of variables will be the number
of characters (&len) to be placed in the generated password */
BUILD work;
vars var1 to var&len;
read;
&len*1 $
/* generate seed variables for the random number generator */
GEN ##code:c3 = "&code" $
GEN ##s1 = ival( substring ( ##code, 1, 1 )) $
GEN ##s2 = ival( substring ( ##code, 2, 1 )) * 13 $
GEN ##s3 = ival( substring ( ##code, 3, 1 )) * 39 $
LIST work [
DO #k = 1, &len;
/* create &len character variables to hold password characters */;
GENERATE ?:c1;
ENDDO;
/* use var1 to var&len for size of password */;
DO #j using var1 to var&len;
/* get a random number. */;
get: SET v(#j) = INT ( RANUNI ( -3, ##s1, ##s2, ##s3 ) * 1000 );
/* if number does not map to ascii letter or number get another */
IF v(#j) notamong ( 49 to 57 65 to 90 97 to 122 ) go to get;
/* convert ascii number to character */;
SET v(#j+&len) = CVAL (v(#j));
ENDDO;
/* show just the new character variables */;
KEEP .new. ], DATA.ONLY $
ENDMACRO $
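The core idea of the macro, a random generator seeded from the code whose
draws are rejected until they map to an ASCII letter or digit, can be
sketched in Python. This is an illustration under stated assumptions, not
a port: the function make_password and the way the three seed values are
folded into one seed are hypothetical, and Python's generator is not
RANUNI, so the passwords will not match the ones shown above.

```python
import random

# ASCII codes the macro accepts: 49 to 57, 65 to 90 and 97 to 122.
ALLOWED = set(range(49, 58)) | set(range(65, 91)) | set(range(97, 123))

def make_password(code, length):
    """Build a repeatable password of the given length from a 3 character code."""
    # Seed values weighted as in the macro (times 1, 13 and 39).
    s1, s2, s3 = (ord(c) * w for c, w in zip(code, (1, 13, 39)))
    # Assumption: fold the three values into a single integer seed.
    rng = random.Random((s1 * 10000 + s2) * 10000 + s3)
    chars = []
    while len(chars) < length:
        n = int(rng.random() * 1000)      # like INT ( RANUNI (...) * 1000 )
        if n in ALLOWED:                  # otherwise draw another number
            chars.append(chr(n))          # convert ASCII number to character
    return "".join(chars)

print(make_password("abc", 8))
```

Because the generator is seeded only from the code, the same code and
length always give the same password, which is the property the macro
relies on.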