means by yaofenji

```
SAS PROC MEANS

PROC MEANS
The procedure MEANS generates descriptive statistics about a data set. As
the name suggests, this procedure computes the mean of the nonmissing
numeric values. In addition to the mean, many other statistics can be
computed. By default, this procedure calculates the number of nonmissing
values, the mean, the sample standard deviation, the minimum value, and
the maximum value for each numeric variable. In general, the format of
this procedure is as follows:

PROC MEANS DATA=dataset <options> <statistic-keywords>;
BY variable_list;
CLASS variable_list;
VAR variable_list;
RUN;

The required keywords for this procedure are PROC MEANS. The SAS data
set is specified using DATA=dataset; if DATA= is omitted, SAS analyzes
the most recently created data set.

The BY, CLASS, and VAR statements are just a few of the optional
statements available for this procedure and are used to customize the
output.

BY               This statement assumes that the data set is sorted
                 according to the order of the variables listed in this
                 statement (see PROC SORT reference). If the data
                 set is appropriately sorted, PROC MEANS calculates
                 the requested statistics and creates a separate table
                 for each unique combination of values of the
                 variables listed in this statement.

CLASS            This statement also defines groups based on unique
                 combinations of the values of the variables listed in
                 this statement and calculates the requested statistics
                 for these groups. Unlike BY, the CLASS statement does
                 not create a new table for each group, and the data
                 set does not have to be pre-sorted.

VAR              This statement specifies which numeric variables are
                 to be analyzed. If no VAR statement is given, then the
                 requested statistics are calculated for all the
                 numeric variables in the data set.
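
The difference between BY and CLASS can be sketched as follows. This is a
minimal illustration, assuming the data set country and the variables cont,
gdp, and lifeexpm from the Example section of this handout; the choice of
analysis variables is arbitrary.

/* BY processing: the data set must first be sorted by the BY variable */
PROC SORT DATA=country;
   BY cont;
RUN;

/* One table of default statistics per value of cont */
PROC MEANS DATA=country;
   BY cont;
   VAR gdp lifeexpm;
RUN;

/* CLASS processing: no sort needed, all groups appear in one table */
PROC MEANS DATA=country;
   CLASS cont;
   VAR gdp lifeexpm;
RUN;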

There are many options for the MEANS procedure. Only three are identified
below.

ALPHA=value      This specifies the value of α for the 100(1-α)%
                 Confidence Intervals. The default is ALPHA=0.05.
                 Unless otherwise specified in the option(s) section,
                 this is the value used for the calculation of the
                 confidence intervals. The value for α must be a
                 number between 0 and 1.

MAXDEC=integer   This specifies the maximum number of decimal
                 places to be displayed for the statistics. Only
                 nonnegative integers (0 through 8) are allowable
                 values for this option.

VARDEF=DF        This is the default setting of VARDEF. VARDEF
                 specifies the divisor used in the computation of the
                 standard deviation (s) and sample variance (s²).
                 With VARDEF=DF the divisor is the degrees of
                 freedom, n - 1. When working with a sample this
                 does not need to be changed.

VARDEF=N         This setting is used when working with population
                 data. In order to compute the population standard
                 deviation (σ) and population variance (σ²) the divisor
                 needs to be the number of observations in the
                 population.
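
As a hedged illustration of these options, the sketch below assumes the
data set country and the variables lifeexpm and lifeexpf from the Example
section of this handout. The value ALPHA=0.10 and the choice of variables
are arbitrary.

/* 90% confidence limits for the mean (ALPHA=0.10), two decimal places */
PROC MEANS DATA=country ALPHA=0.10 MAXDEC=2 N MEAN STD CLM;
   VAR lifeexpm lifeexpf;
RUN;

/* Treat the data as a population: divisor is the number of observations */
PROC MEANS DATA=country VARDEF=N MAXDEC=2 N MEAN STD VAR;
   VAR lifeexpm lifeexpf;
RUN;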

The default statistics computed by PROC MEANS are the mean, number of
nonmissing observations, standard deviation, minimum value and maximum
value. Other computations can be made by specifying some of the keywords
below.

Descriptive Statistics:

CLM                 the 100(1-α)% Confidence Interval for the mean
CSS                 the corrected sum of squares
MAX                 the maximum value of the variable
MEAN                the arithmetic mean
MIN                 the minimum value of the variable
N                   the number of nonmissing observations for that
                    variable
NMISS               the number of missing observations for that variable
RANGE               the difference between the maximum and minimum
                    values
STDDEV or STD       the standard deviation s or σ
STDERR              the standard error of the mean
SUM                 the sum of all values of the variable
USS                 the uncorrected sum of squares
VAR                 the variance s² or σ²

Percentile and Related Statistics:

MEDIAN              the median or 50th percentile
P1, P5, P10,        these give the corresponding percentile; for example,
P90, P95, P99       P5 gives the fifth percentile
Q1                  the lower quartile or the 25th percentile
Q3                  the upper quartile or the 75th percentile
QRANGE              the interquartile range, calculated as the upper
                    quartile minus the lower quartile (Q3 - Q1)
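
A request for a quartile-based summary might look like the sketch below.
It assumes the data set country and the variable gdp from the Example
section of this handout; the keyword selection is only one reasonable
choice.

/* Quartile-based summary of gdp, one decimal place */
PROC MEANS DATA=country MAXDEC=1 N MIN Q1 MEDIAN Q3 MAX QRANGE;
   VAR gdp;
RUN;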

Hypothesis Testing Statistics:

T                   is the t statistic to test the null hypothesis that the
                    population mean is equal to μ0. For PROC MEANS, μ0
                    is the value zero and cannot be set to any other
                    value.
PROBT               is the two-tailed p-value for the Student's t statistic,
                    T above.
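
Because the hypothesized mean is fixed at zero, one common workaround for
testing against a nonzero value is to subtract that value in a DATA step
and run PROC MEANS on the difference. The sketch below assumes the data
set country and the variable lifeexpm from the Example section of this
handout. The hypothesized value 65 and the names shifted and lifeexpm_diff
are hypothetical, chosen only for illustration.

/* Shift the variable so that "mean = 65" becomes "mean of difference = 0" */
DATA shifted;
   SET country;
   lifeexpm_diff = lifeexpm - 65;   /* 65 is a hypothetical value */
RUN;

/* T is computed as mean / (s / sqrt(n)) for the shifted variable */
PROC MEANS DATA=shifted N MEAN T PROBT;
   VAR lifeexpm_diff;
RUN;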

Example

An example of PROC MEANS is shown below. The data file Country.dat is
used for this example. The data step to read the file is:

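DATA country;
   /* Tab-delimited file ('09'x is the tab character); reading starts at line 2 */
   INFILE '/u2/example/Country.dat' firstobs=2 dlm='09'x;
   /* cont and country are character variables; the rest are numeric */
   INPUT cont $ country $ pop92 urban gdp lifeexpm lifeexpf
         birthrat deathrat;
RUN;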

This example requests the mean, the number of nonmissing observations,
the t statistic for testing whether the mean is zero, and the p-value of
this test.

PROC MEANS DATA=country MEAN N T PROBT;
RUN;

The output for this procedure is displayed below:

The SAS System       11:35 Monday, May 31, 2004    2

The MEANS Procedure

Variable            Mean      N    t Value    Pr > |t|
------------------------------------------------------
pop92         40.7485574    122       3.34      0.0011
urban         48.7762295    122      21.88      <.0001
gdp              4157.71    122       7.51      <.0001
lifeexpm      61.9016393    122      68.68      <.0001
lifeexpf      66.3114754    122      64.90      <.0001
birthrat      31.2867769    121      26.02      <.0001
deathrat      10.4628099    121      24.72      <.0001
------------------------------------------------------

NOTE: Since no VAR statement was used in PROC MEANS, the
descriptive statistics were calculated for all the numeric
variables in the data set.
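
To restrict the same request to a subset of variables, a VAR statement can
be added. The sketch below limits the analysis to the two life expectancy
variables; this selection is arbitrary.

/* Same statistics as above, but only for the listed variables */
PROC MEANS DATA=country MEAN N T PROBT;
   VAR lifeexpm lifeexpf;
RUN;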

 End of Example

Last Updated: January 2, 2009

```