

                       USING SAS PC WITH WINDOWS
                               Statistics 511

                              Professor Naomi Altman

                   revised from previous editions by McShane and Altman,
                              and Nshinyabakobeje and Altman




                               TABLE OF CONTENTS


A. OVERVIEW OF THE SAS SYSTEM

B. TWO STEPS NEEDED IN THE SAS PROGRAMMING LANGUAGE

C. SYNTAX

D. CHARACTERISTICS OF A SAS DATA SET

E. CREATING A SAS DATA SET

E. CREATING A SAS PROGRAM FOR DESCRIPTIVE STATISTICS

F. PRODUCING A REPORT FROM A SAS OUTPUT

G. HELPFUL HINTS







                              A. OVERVIEW OF THE SAS SYSTEM



   SAS (Statistical Analysis System) is a software system/package for data analysis. SAS provides
    tools for: information storage and file handling; data modification and management; statistical
    analysis; and report writing.
   The SAS system is a powerful programming language plus a collection of ready-to-use programs
    called procedures or PROC’s, which can perform a large variety of applications.
   We will use primarily the Basic and Statistical tools – a small fraction of the capabilities of SAS.
   On-line documentation is available at www.sas.psu.edu. There is also on-line help when you run
    SAS PC, but this is difficult to use.



            B. TWO STEPS NEEDED IN THE SAS PROGRAMMING LANGUAGE

   The SAS language has its own vocabulary and syntax - words and the rules for putting them
    together.
   A SAS statement is a string of SAS keywords, SAS names, and special characters and operators
    ending in a semicolon that instructs SAS to perform an operation or gives SAS information.
   A sequence of SAS statements is called a SAS program.
   A SAS program consists of two kinds of steps: DATA steps and PROC steps. DATA and PROC
    steps can appear in any order, and any number of DATA and PROC steps can be used in a SAS
    program.
   Usually, DATA steps create SAS data sets, and PROC steps do analysis of SAS data sets. A
    PROC may also create variables, such as residuals and fitted values, which can be placed in a new
    data set or appended to an existing data set.


A DATA step is a group of SAS statements that begins with a DATA statement. Example:
DATA ONE;                                     Creates a data set named "ONE".
INFILE 'A:YIELD.TXT';                         Reads the data from the file A:YIELD.TXT
INPUT TREAT REP YIELD;                        The file has 3 variables named TREAT, REP and YIELD.
LOGY=LOG (YIELD);                             A new variable LOGY is created and added to "ONE".


   The DATA step begins with a DATA statement and can include any number of program
    statements.






   You can use the DATA step for these purposes:
    *   retrieval: reading input data from a file;
    *   editing: checking the data for errors and correcting them, and computing new variables;
    *   outputting: writing data sets to disk;
    *   creating: producing new SAS data sets from existing ones by subsetting, merging, and
        updating.
   Every SAS data set has a name. If you do not name a data set in a PROC statement, SAS uses
    the most recently created data set. If your program uses several data sets, it is best to name
    the one you want with DATA=datasetname each time you use it. That will avoid problems as you
    change your program.
   The DATA step can include statements telling SAS to create one or more new SAS data sets and
    programming statements that perform the manipulations necessary to build the data sets. Because
    each newly created data set becomes the default, naming data sets explicitly is the safest habit.
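
As a rough sketch of these uses, the program below reads a file, computes a new variable, and then builds a second data set by subsetting. ONE, YIELD.TXT, and the variable names come from the example above; the data set name HIGH and the cutoff of 50 are made up for illustration.

```sas
/* Sketch only: HIGH and the cutoff 50 are placeholders, not part of the course data. */
DATA ONE;
  INFILE 'A:YIELD.TXT';       /* retrieval: read the raw text file */
  INPUT TREAT REP YIELD;
  LOGY = LOG(YIELD);          /* editing: compute a new variable */
RUN;

/* creating: build a new data set from an existing one */
DATA HIGH;
  SET ONE;                    /* start from the observations in ONE */
  IF YIELD > 50;              /* subsetting IF: keep only rows where this is true */
RUN;

/* Name the data set explicitly rather than relying on the default. */
PROC PRINT DATA=HIGH;
RUN;
```

The data set ONE is left unchanged; HIGH holds a copy of the observations that pass the condition.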

A PROC is a group of SAS statements that begins with a PROC statement. Example:

PROC REG DATA=HOUSING;                         Calls the regression procedure with data set
                                               “HOUSING”.
MODEL PRICE=SQFT NOBEDRM;                      Tells SAS that PRICE is the dependent variable and
                                               SQFT and NOBEDRM are the independent variables.
OUTPUT OUT=NEWDATA R=R P=P;                    Stores the residuals (R) and predicted values (P) in a
                                               new data set called “NEWDATA”.

 The PROC step (or PROCEDURE step) instructs SAS to call a procedure from its library and to
    execute that procedure, usually with a SAS data set as input.
 The PROC step begins with a PROC statement. Other statements in the PROC step give the
    program more information about the results that you want.
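
The variables stored by an OUTPUT statement can be used by a later PROC. The sketch below assumes that the HOUSING data set from the example exists. It uses PROC PLOT, which is not otherwise covered in this tutorial, to plot the residuals R against the predicted values P.

```sas
/* Sketch only: assumes a data set HOUSING with variables PRICE, SQFT and NOBEDRM. */
PROC REG DATA=HOUSING;
  MODEL PRICE = SQFT NOBEDRM;
  OUTPUT OUT=NEWDATA R=R P=P;   /* save residuals and predicted values */
RUN;
QUIT;                            /* PROC REG stays active until QUIT or the next step */

/* Residual plot: vertical variable * horizontal variable */
PROC PLOT DATA=NEWDATA;
  PLOT R*P;
RUN;
```

Note that the second PROC names NEWDATA, because R and P are stored there and not in HOUSING.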




                                               C. SYNTAX


There are 4 main syntax rules:


    1. Every SAS statement ends with a semicolon “;”. Omitting the semicolon is the most
        common error, and unfortunately it leads to error messages that are difficult to decipher.
    2. Names of variables and data sets should contain 8 or fewer letters, digits, or underscores
        and must not begin with a digit. (Version 7 and later accept up to 32 characters.)
    3. SAS is case insensitive. DOG, Dog and dog all mean the same thing to SAS.
    4. SAS ignores line breaks and multiple spaces.
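
To illustrate rules 1, 3 and 4, the three steps below should all be read by SAS as the same request. This is only a sketch; it assumes that the data set ONE with the variable YIELD from section B has already been created.

```sas
/* Everything on one line, upper case */
PROC PRINT DATA=ONE; VAR YIELD; RUN;

/* One statement per line, lower case */
proc print data=one;
  var yield;
run;

/* Mixed case, extra blanks and line breaks */
Proc   Print
   Data = One ;
   Var  Yield ;
Run ;
```

The second layout, with one statement per line, is the easiest to read and to check for missing semicolons.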


                           D. CHARACTERISTICS OF A SAS DATA SET


The SAS system reads data (letters or numbers) in various forms and organizes them into a
SAS data set, which is similar to a spreadsheet. Once the data have been organized into a SAS data
set, you can access, analyze, revise, and display the data. You can also store data sets; however, for
small data sets, it is most convenient to store them as text files.
The data consist of the following components: data value, variable, and observation.
    A data value is a single unit of information (a single cell).
    A variable is a set of data values describing a characteristic (a single column).
    An observation is a set of data values for the same item (a single row).
The following data set contains 5 variables, 18 observations, and 90 data values, one of which is
missing.

       NAME          SEX     AGE     HEIGHT     WEIGHT
1      Aubrey        M       41      74         170
2      Ron           M       42      68         166
3      Carl          M       32      70         155
4      Antonio       M       39      72         167
5      Deborah       F       30      66         124
6      Jacqueline    F       33      66         115
7      Helen         F       26      64         121
8      David         M       30      71         158
9      James         M       53      72         175
10     Michael       M       32      69         143
11     Ruth          F       47      69         139
12     Joel          M       34      72         163
13     Donna         F       23      62         98
14     Roger         M       36      75         160
15     Yao           M       .       70         145
16     Elizabeth     F       31      67         135
17     Tim           M       29      71         176
18     Susan         F       28      65         131

Each column is a variable, each row is an observation, and each cell is a data value. The period (.)
for AGE in row 15 marks a missing value.

Now you are ready to create a data set file and a SAS program file.






                                  E. CREATING A SAS DATA SET

The initial step in most SAS programs will involve reading data from a text file. Any text processor
or spreadsheet can be used to create the file. We demonstrate using Notepad. Usually I use my
favorite text editor and save in txt format.


(In the SAS manuals you will occasionally see data embedded in a DATA step. However, this is
awkward and means that the program file must be edited every time the data are modified.)
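
For reference, embedded data looks roughly like the sketch below, which uses three rows from the table in section D. The data set name PEOPLE is made up for this example. A $ after a variable name tells SAS that the variable holds characters, not numbers. With this simple form of INPUT, character values are kept only up to 8 characters, so a longer name such as Jacqueline would need extra handling that is not shown here.

```sas
/* Sketch only: PEOPLE is a hypothetical data set name. */
DATA PEOPLE;
  INPUT NAME $ SEX $ AGE HEIGHT WEIGHT;   /* $ marks a character variable */
  DATALINES;
Aubrey M 41 74 170
Ron M 42 68 166
Yao M . 70 145
;
RUN;

PROC PRINT DATA=PEOPLE;
RUN;
```

The period in the third data line is read as a missing value for AGE. The line containing only a semicolon marks the end of the data.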


The data set to be created consists of two variables measured on a random sample of 9 steers. The
first variable is the live weight (in hundreds of pounds) and the second variable is the dressed weight
(in hundreds of pounds). This sample data set will be used to obtain simple summary statistics and a
Normal Probability Plot.


Live weight        4.2   3.8      4.8    3.4     4.5      4.6       4.3     3.7      3.9
Dressed weight     2.8   2.5      3.1    2.1     2.9      2.8       2.6     2.4      2.5
Data from Lyman Ott, An Introduction to Statistical Methods and Data Analysis, p. 143.

Start Notepad as follows:
Start > Programs > Accessories > Notepad.
Now type the following data set, one steer per line, with the live weight first and the dressed
weight second.
4.2 2.8
3.8 2.5
4.8 3.1
3.4 2.1
4.5 2.9
4.6 2.8
4.3 2.6
3.7 2.4
3.9 2.5
   Save the file on a diskette in drive A: as follows. File > Save > A:\STEERS.TXT > Save.
   File > Exit.
Now you are going to create the SAS Program.



              E. CREATING A SAS PROGRAM FOR DESCRIPTIVE STATISTICS





Although SAS has some interactive features, it is basically a batch program. This means that it is
convenient to create and save programs as text files. Usually, I create my program in my favorite text
editor and save it as a plain text file with the extension “.sas” instead of “.txt”. Clicking on the file
then opens SAS and places the text in the SAS text editor, from where it can be run.

You can also create and save your program in the SAS text editor. Instructions are below.

Start SAS as follows:
Start > Programs > The SAS System > The SAS System for Windows v8
After SAS opens, you will see a screen with two windows. The upper window is the Log window, which
shows the SAS statements that have already been processed, along with comments. The bottom window
is the Program Editor window. You will enter and edit your SAS program in the Program Editor
window.




Now, create the following SAS program in the Program Editor window. Use upper or lower case
letters as you choose.





/*
THIS PROGRAM IS USED TO CREATE A SMALL SAS PROGRAM
WRITTEN BY: LAST NAME, FIRST NAME OF STUDENT.
DATE: MONTH/DAY/YEAR
*/
The text above, which is delimited by /* and */, is a comment and is ignored by SAS. Comments are a
helpful way to document your program and your data.

OPTIONS LS=79 NOCENTER;                  OPTIONS picks options for the output. LS=
                                         selects the number of characters per line.
TITLE 'SUMMARY STATISTICS';              TITLE provides a title that
                                         appears on each page of the output.
DATA MARY;                               We now create the data set named "MARY".
INFILE 'A:\STEERS.TXT';                  We read the data from A:\STEERS.TXT
INPUT LIVEWT DRESSWT;                    There are two variables named LIVEWT and
                                         DRESSWT. The variable names are separated by
                                         blanks. You need to name all variables in
                                         the data set, even if you do not want to use
                                         them all.
TITLE2 'PRINTING LIVEWT';                TITLE2 provides a subtitle. It can be used
                                         e.g. if several analyses are performed in the
                                         same program.
PROC PRINT DATA=MARY;                    We now run our first PROC. It prints some or
                                         all of the data in data set MARY.
VAR LIVEWT;                              Tells SAS to print LIVEWT only. Otherwise it
                                         prints all of the data.
RUN;                                     The RUN statement ends a PROC or DATA step. The
                                         last step you submit will not execute until
                                         SAS reaches a RUN statement.

Try to run the SAS program now by clicking the SUBMIT icon (the running figure) or by pressing
the F3 function key. Look at your SAS output. You should see a list of the data. If you do not, you
have made an error. However, you can continue reading this tutorial, as error correction is the next
topic. Whether or not the output appears, open the Log window as follows: Window > Log. You
should see the SAS commands you entered, with comments about how they executed, including any
error messages (in red). Warnings are printed in green.


We now want to see what happens if you make an error. To do this, we will start over. To clear a
window, click on it to make it active and then choose Edit > Clear Text. Clear the OUTPUT and LOG
windows this way: Window > Output, then Edit > Clear Text; Window > Log, then Edit > Clear Text.

Recall the current SAS file in the Program Editor window as follows: Window > Program Editor to
open the window and Locals > Recall Text to bring back the most recently submitted text.
(Repeating Recall Text brings back the second most recently submitted text, etc.) To see how SAS
handles errors in a program file, change the statement PROC PRINT DATA=MARY; to PROC PRINT
DATA=MARIAM; in the program above, then run SAS. Once a SAS program has been submitted for
processing, error messages are written in the Log window. Open it as follows: Window > Log. Scroll
down and you will find the following error message, written in red:
ERROR: File WORK.MARIAM.DATA does not exist.
If you did not clear the Log window of its previous contents beforehand, clear it now and run the
SAS program again.


Reopen your SAS program file as follows: Window > Program Editor > Locals > Recall Text. Go
to the statement PROC PRINT DATA=MARIAM; and change MARIAM back to MARY.
Add the remaining statements below to your SAS program.
PROC UNIVARIATE DATA=MARY;               PROC UNIVARIATE prints summary statistics. It is part of
                                         Base SAS rather than SAS/STAT.
VAR LIVEWT DRESSWT;                      We will obtain summary statistics for both variables.
QQPLOT;                                  We request a Normal Probability Plot for both variables.
RUN;
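
Because the listings above mix statements with explanatory notes, it may help to see the finished program as it would be typed. The sketch below collects the statements from this section. The INFILE path is the file saved from Notepad in the previous section (A:\STEERS.TXT), and the RUN after the DATA step is added here only to make the end of that step explicit.

```sas
/* Summary statistics for the steers data (sketch of the complete program). */
OPTIONS LS=79 NOCENTER;
TITLE 'SUMMARY STATISTICS';

DATA MARY;
  INFILE 'A:\STEERS.TXT';      /* the text file created in Notepad */
  INPUT LIVEWT DRESSWT;        /* two numeric variables per line */
RUN;

TITLE2 'PRINTING LIVEWT';
PROC PRINT DATA=MARY;
  VAR LIVEWT;                  /* print only LIVEWT */
RUN;

PROC UNIVARIATE DATA=MARY;
  VAR LIVEWT DRESSWT;          /* summary statistics for both variables */
  QQPLOT;                      /* normal probability plots */
RUN;
```

Titles stay in effect until they are changed, so the TITLE2 text will also appear above the PROC UNIVARIATE output unless a new TITLE2 statement is placed before that step.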


   Save the SAS program file on the A: drive as follows: File > Save > A:\steers.sas > Save.
   To run the SAS program, click on the SUBMIT icon or simply press the F3 key. Other function
    keys are listed under Help > Key.
   By selecting the Window menu, you can open the Output, Program Editor, and Log windows
    whenever necessary.
   You can save the Output window’s contents as follows: File > Save > A:\steers.lst > Save.
    I usually copy and paste the entire window into a text editor instead, as described below.






                      F. PRODUCING A REPORT FROM A SAS OUTPUT


There are many different ways to produce a report using SAS output. We will go through one way,
which assumes that you have a word processor such as Word on your computer and that your
computer is powerful enough to run it and SAS at the same time.


Start the word processor.
Write your report in it. For example, type the heading and introductory material describing the
problem you analyzed, and discuss your analysis of the data. Suppose you would like to include
SAS analysis output in your report.
Copy the analysis output to the Clipboard:
   Open the SAS output: Window > Output.
   Edit > Select All; Edit > Copy. Note that copying only a portion of the output by highlighting
    does not always work. I copy the entire output to a text editor and edit there.
   Switch to the word processor and paste your SAS output: Edit > Paste.
You will notice that the pasted output is in the SAS Monospace font at size 10 by default. The
output looks better at a smaller font size. Change it as follows.
   E.g., in Microsoft Word, choose Edit > Select All and change the font size to 8.
Now you can edit your Word document by adding text and/or removing parts of the output you judge
unimportant.
   You can also copy and paste graph sheets directly into your document.
   Save your report as follows: File > Save > A:\report.doc > Save.
   Now you can print your report in Microsoft Word as follows: File > Print > OK.






                                        G. HELPFUL HINTS

   Every SAS statement ends with a semicolon (;). You may continue statements on two or more lines.
    Forgetting the semicolon leads to error messages which are hard to decipher. This is always the
    first thing to check if your program does not run.
   The next most common error is misspelling a SAS command or variable name. Keep SAS variable
    names to no more than 8 characters (the limit in older versions of SAS).
   The third most common error is trying to use a variable that is not available. For example, this
    error can occur if you try to draw a residual plot but forgot to store the residuals from the
    regression, or if you are in the wrong data set.
   SAS ignores extra blanks, including blank lines, so you can space your program so that it is neat
    and easy to read.
   You can put more than one statement on a line, but this usually makes your program hard to read.
    SAS program files can get quite long, so it is useful to keep them readable.
   Statements added in the SAS editor are not saved in your SAS program file. If you want to have
    them available for future use, you must explicitly save them using File > Save.
   Most PROCs can use only one data set. If variables from multiple data sets are needed, the data
    sets can be merged in a DATA step.
   The online SAS help can be difficult to use due to statements with the same name in different
    PROCs. When seeking help for a PROC, try searching on PROC. This gives a list of all the
    PROCs. You can then click on the PROC you want, which gives a list of the statements valid for
    that PROC. You will likely find the online manual more useful.
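
As a sketch of the merging hint above, the program below combines two data sets that share an identifying variable. The names LIVE, DRESSED, BOTH and ID are hypothetical and do not appear elsewhere in this tutorial. Both data sets are sorted by ID first, because a MERGE with a BY statement expects its inputs in BY order.

```sas
/* Sketch only: LIVE and DRESSED are hypothetical data sets that both contain ID. */
PROC SORT DATA=LIVE;
  BY ID;
RUN;

PROC SORT DATA=DRESSED;
  BY ID;
RUN;

DATA BOTH;
  MERGE LIVE DRESSED;   /* match observations with the same ID */
  BY ID;
RUN;

PROC PRINT DATA=BOTH;
RUN;
```

The merged data set BOTH can then be named in the DATA= option of any PROC that needs variables from both sources.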



