

SAS PROC MEANS

The procedure MEANS generates descriptive statistics about a data set. As
the name suggests, this procedure computes the mean of the nonmissing
numeric values. In addition to the mean, many other statistics can be
computed. By default, this procedure calculates the number of nonmissing
values, the mean, the sample standard deviation, the minimum value, and
the maximum value for each numeric variable. In general, the format of
this procedure is as follows:

PROC MEANS DATA=dataset <options> <statistic-keywords>;
     BY variable_list;
     CLASS variable_list;
     VAR variable_list;

The required keywords for this procedure are PROC MEANS. The SAS data
set is specified using DATA=dataset.

The BY, CLASS, and VAR statements are just a few of the optional
statements available for this procedure and are used to customize the
analysis:

      BY               This statement assumes that the data set is sorted
                       according to the order of the variables listed in this
                       statement (see PROC SORT reference). If the data
                       set is appropriately sorted, then the BY statement
                       calculates the appropriate statistics and creates a
                       separate table for each unique combination of values
                       of the variables listed in this statement.

      CLASS            This statement also defines groups based on unique
                       combinations of the values of the variables listed in
                       this statement and calculates the requested statistics
                       for these groups. The CLASS statement does not
                       create a new table for each group. It is easy to
                       use since the data set does not have to be
                       pre-sorted.

      VAR              This statement specifies which numeric variables are
                       to be analyzed. If no VAR statement is specified,
                       then the requested statistics are calculated for all the
                       numeric variables in the data set.
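
The difference between BY and CLASS may be easier to see side by side. The
following is a minimal sketch, assuming a small hypothetical data set named
scores with a grouping variable (group) and one numeric variable (score);
neither name comes from this handout.

```sas
/* Hypothetical data set: group and score are illustrative names only */
DATA scores;
     INPUT group $ score;
     DATALINES;
B 82
A 90
B 75
A 68
;
RUN;

/* BY requires the data to be sorted by the BY variables first */
PROC SORT DATA=scores;
     BY group;
RUN;

/* BY: a separate table is produced for each value of group */
PROC MEANS DATA=scores;
     BY group;
     VAR score;
RUN;

/* CLASS: one table with a row per group; no prior sort is required */
PROC MEANS DATA=scores;
     CLASS group;
     VAR score;
RUN;
```

In this sketch PROC SORT replaces scores with its sorted version, so the
CLASS step also sees sorted data, but it would run the same way on the
unsorted data.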

There are many options for the MEANS procedure. Only three are identified
here:

      ALPHA=value      This specifies the value of α for the 100(1-α)%
                       Confidence Intervals. The default is ALPHA=0.05.
                       Unless otherwise specified in the option(s) section,
                       this is the value used for the calculation of the
                       confidence intervals. The value for α must be a
                       number between 0 and 1.

      MAXDEC=integer   This specifies the maximum number of decimal
                       places to be displayed for the statistics. Only
                       nonnegative integers are allowable values for this
                       option.

      VARDEF=DF        This is the default setting of VARDEF. VARDEF sets
                       the divisor used in the computation of the standard
                       deviation (s) and sample variance (s²). With
                       VARDEF=DF the divisor is the degrees of freedom,
                       n - 1. When working with a sample this does not
                       need to be specified.

      VARDEF=N         This setting is used when working with population
                       data. In order to compute the population standard
                       deviation (σ) and population variance (σ²) the divisor
                       needs to be the number of observations in the
                       population.
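
As an illustration of how these options are placed on the PROC MEANS
statement, here is a minimal sketch. It assumes a small hypothetical data
set named scores with a numeric variable score; the data values are made up
for the example.

```sas
/* Hypothetical data set: score is an illustrative name only */
DATA scores;
     INPUT score;
     DATALINES;
82
90
75
68
71
;
RUN;

/* 90% confidence limits (ALPHA=0.10), at most 2 decimal places shown */
PROC MEANS DATA=scores ALPHA=0.10 MAXDEC=2 N MEAN STD CLM;
     VAR score;
RUN;

/* Treat the data as a population: divisor n instead of n - 1 */
PROC MEANS DATA=scores VARDEF=N MAXDEC=2 N MEAN STD VAR;
     VAR score;
RUN;
```

The second step leaves out CLM on purpose, since the confidence limits and
t statistics are computed with the default VARDEF=DF divisor.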

The default statistics computed by PROC MEANS are the mean, number of
nonmissing observations, standard deviation, minimum value, and maximum
value. Other computations can be made by specifying some of the keywords
listed below.

Descriptive Statistics:

      CLM                 the 100(1-α)% Confidence Interval for the mean

      CSS                 the corrected sum of squares

      MAX                 the maximum value of the variable
      MEAN                the arithmetic mean
      MIN                 the minimum value of the variable
      N                   the number of nonmissing observations for that
                          variable
      NMISS               the number of missing observations for that variable
      RANGE               the difference between the maximum and minimum
                          values
      STDDEV or STD       the standard deviation s or σ
      STDERR              the standard error of the mean
      SUM                 the sum of all values of the variable

      USS                 the uncorrected sum of squares

      VAR                 the variance s² or σ²

Percentile and Related Statistics:

      MEDIAN              the median or 50th percentile
      P1, P5, P10,        these give the appropriate percentile, e.g. P5 gives
      P90, P95, P99       the fifth percentile.
      Q1                  the lower quartile or the 25th percentile
      Q3                  the upper quartile or the 75th percentile
      QRANGE              the interquartile range calculated by upper quartile
                          less the lower quartile (Q3 - Q1)
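
Statistic keywords are listed on the PROC MEANS statement, and only the
requested statistics are printed. The following is a minimal sketch mixing
descriptive and percentile keywords; the data set scores and the variable
score are hypothetical and include one missing value so that N and NMISS
differ.

```sas
/* Hypothetical data set: one missing score (.) is included on purpose */
DATA scores;
     INPUT group $ score;
     DATALINES;
A 90
A 68
A .
B 82
B 75
B 71
;
RUN;

/* Request counts, the mean, and several percentile-based statistics */
PROC MEANS DATA=scores N NMISS MEAN MEDIAN Q1 Q3 QRANGE P5 P95 MAXDEC=1;
     VAR score;
RUN;
```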

Hypothesis Testing Statistics:

      T                   is the t statistic to test the null hypothesis that the
                          population mean is equal to μ0. For PROC MEANS, μ0
                          is the value zero and cannot be set to any other
                          value.

      PROBT               is the two-tailed p-value for Student's t statistic, T
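
Because the null value is fixed at zero, one common workaround for testing
a nonzero value is to subtract that value from the variable in a DATA step
and test the difference instead. The following is a minimal sketch of that
idea, not something taken from this handout; the data set scores, the
variable score, and the null value 70 are all hypothetical.

```sas
/* Hypothetical data set: score is an illustrative name only */
DATA scores;
     INPUT score;
     DATALINES;
82
90
75
68
71
;
RUN;

/* Shift the variable so that "mean of score = 70" becomes "mean of diff = 0" */
DATA shifted;
     SET scores;
     diff = score - 70;   /* 70 is a hypothetical null value */
RUN;

/* T and PROBT now test whether the mean of score differs from 70 */
PROC MEANS DATA=shifted N MEAN T PROBT;
     VAR diff;
RUN;
```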


An example of PROC MEANS is shown below. The data file Country.dat is
used for this example. The data step to read the file is:

DATA country;
     /* Tab-delimited file ('09'x); reading starts at the second line */
     INFILE '/u2/example/Country.dat' firstobs=2 dlm='09'x;
     INPUT cont $ country $ pop92 urban gdp lifeexpm lifeexpf
          birthrat deathrat;
RUN;

This example requests the mean, the number of nonmissing observations,
the t statistic for testing that the mean is zero, and the p-value of this
test.
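
The PROC MEANS step itself is not reproduced in this section. A call
consistent with the description and with the column order of the output
(Mean, N, t Value, Pr > |t|) might look like the following sketch. The
keyword order is inferred from the output rather than taken from the
original, and the step relies on the country data set created by the data
step in this example.

```sas
/* Reconstructed sketch: no VAR statement, so all numeric variables are used */
PROC MEANS DATA=country MEAN N T PROBT;
RUN;
```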


The output for this procedure is displayed below:

                                 The SAS System       11:35 Monday, May 31, 2004    2

                                The MEANS Procedure

              Variable            Mean      N    t Value    Pr > |t|
              pop92         40.7485574    122       3.34      0.0011
              urban         48.7762295    122      21.88      <.0001
              gdp              4157.71    122       7.51      <.0001
              lifeexpm      61.9016393    122      68.68      <.0001
              lifeexpf      66.3114754    122      64.90      <.0001
              birthrat      31.2867769    121      26.02      <.0001
              deathrat      10.4628099    121      24.72      <.0001

NOTE:              Since no VAR statement was used in PROC MEANS, the
                   descriptive statistics were calculated for all the numeric
                   variables in the data set.

                                                                    End of Example

                                                            Last Updated: January 2, 2009
