DOC 1 by yaoyufang

                                SAS
                                Lab1: Introduction

Preliminaries
  1. Working environment: program editor, log, output (Menu -> View).
  2. Programming in SAS 4GL:
        a. The language is not case-sensitive.
        b. Comments: * this is a comment; and /* this is also a comment */
        c. Each statement ends with a semicolon.
        d. Variable names: at most 8 characters in SAS version 6 (32 since version 7);
            they start with a letter or “_” and can contain letters, digits and “_”.
  3. To run a program: Submit (or submit the selection).
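
     The rules above can be tried with a minimal sketch like the following. The
     variable name Height_cm is only an illustration; the step writes its value to
     the log instead of creating a dataset.

     /* A block comment can span
        several lines. */
     * A statement comment ends at the first semicolon;
     data _null_;            /* _NULL_: run the step without creating a dataset */
          Height_cm = 178;
          PUT height_CM=;    /* same variable: names are not case-sensitive */
     RUN;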

Data manipulation
  1. User defined data:

     /* 1 */
     DATA lab1_1;
          LENGTH name $ 12;   /* default $ length is 8; 'Krzysztof' needs 9 */
          INPUT name $ sex $ height weight;
          CARDS;
              Jurek m 178 78
              Ania k 167 60
              Krzysztof m 190 90
          ;
     RUN;
     PROC PRINT DATA=lab1_1;
          TITLE 'Introduction to the SAS system';
          TITLE2 'Group 1';
          FOOTNOTE 'Monday, 8.15';
     RUN;
     (Students run this code)

     /* 2 */
     DATA random;
          drop n;

          Group = 'H';
          do n = 1 to 20;
             X = 4.5 + 2 * normal(57391);
             Y = X + .5 + normal(57391);
             output;
          end;

          Group = 'O';
          do n = 1 to 20;
             X = 6.25 + 2 * normal(57391);
             Y = X - 1 + normal(57391);
             output;
          end;
     RUN;
     /* After reading in the data with a data step, it is usually a good
     idea to print the first few cases of your dataset to check that
     things were read correctly. */
     TITLE "sample data";
     TITLE2 "";
     FOOTNOTE "";
     PROC PRINT DATA=random(obs=5);
     RUN;
     (Students run this code and add one loop that generates data for a ‘G’ class)
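
     An expression of the form m + s * normal(seed) yields a normal variate with
     mean m and standard deviation s, so X has mean 4.5 in group H and 6.25 in group
     O (standard deviation 2 in both), while Y has mean 5.0 and 5.25 respectively.
     One possible way to check this, not part of the original lab, is PROC MEANS;
     with only 20 observations per group the sample values will match only roughly.

     /* sketch: per-group sample size, mean and standard deviation */
     PROC MEANS DATA=random N MEAN STD;
          CLASS Group;
          VAR X Y;
     RUN;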

  2. Reading data from an external file:
     /* 3 */
     DATA lab1_2;
            INFILE 'E:\SAS_DMU\Lab1_DMU\files\students.txt';
            INPUT ... ;
     PROC PRINT data=lab1_2;
          TITLE 'Data from file';
          TITLE2 '';
          FOOTNOTE '';
     RUN;
     (Students modify the path to the file, finish the INPUT statement and run this code)

  3. Different delimiters:
         a. Comma           INFILE 'a path to the file' delimiter=',';
         b. Tab             INFILE 'a path to the file' delimiter='09'x;
         (Students modify code 3 to read comma- and tab-delimited data from the
         students.csv and students.dat files)
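
         A hedged sketch of a comma-delimited read. It ASSUMES the file has a header
         row and the same four columns as code 1 (name, sex, height, weight); the
         real layout of students.csv may differ, and the dataset name lab1_csv is
         hypothetical. DSD makes two consecutive delimiters count as a missing value
         and strips quotes around values; FIRSTOBS=2 skips the assumed header row.

         DATA lab1_csv;                          /* hypothetical dataset name */
              INFILE 'a path to the file' DELIMITER=',' DSD FIRSTOBS=2;
              LENGTH name $ 12;                  /* avoid truncation at 8 chars */
              INPUT name $ sex $ height weight;  /* assumed column layout */
         RUN;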

  4. Importing data from Excel xls files.
     /* 6 */
     PROC IMPORT out=lab1_5
           DATAFILE = "E:\SAS_DMU\Lab1_DMU\files\students.xls"
           DBMS = excel2000 replace;
           GETNAMES = yes;
     RUN;
     TITLE "Data from xls file";
     PROC PRINT DATA=lab1_5;
     RUN;
     (Students modify the path to the file and run this code)
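
     With GETNAMES = yes the variable names are taken from the first row of the
     sheet. As an optional check, not part of the original lab, PROC CONTENTS lists
     the names and types that the import actually produced:

     /* sketch: inspect the imported dataset */
     PROC CONTENTS DATA=lab1_5;
     RUN;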


Basic procedures
  1. PRINT

     /* 7 */
     PROC PRINT DATA=lab1_2;
          TITLE 'We print all variables.';
     RUN;

     /* 8 */
     PROC PRINT DATA=lab1_2;
          VAR name height weight;
          WHERE height>180 AND weight>80;
          TITLE 'height>180 and weight>80';
     RUN;

  2. SORT

     /* 9 */
     PROC SORT DATA=lab1_2 OUT=lab1_sort;
            BY height;
     PROC PRINT DATA=lab1_sort;
          BY height;
     RUN;
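
     A sketch of a sort on two keys, assuming lab1_2 contains the variable sex (it
     is used with this dataset in code 15); the output name lab1_sort2 is
     hypothetical. DESCENDING applies only to the variable that follows it, so rows
     are ordered by sex ascending and, within each sex, from tallest to shortest.

     PROC SORT DATA=lab1_2 OUT=lab1_sort2;   /* hypothetical output name */
          BY sex DESCENDING height;
     RUN;
     PROC PRINT DATA=lab1_sort2;
     RUN;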

  3. PLOT

     /* 10 */
     PROC PLOT DATA=lab1_2;
          PLOT height*weight / VAXIS= 120 140 160 180 200
                               HAXIS= 50 to 100 by 10;
          TITLE 'height * weight';
          FOOTNOTE 'students';
     RUN;

     /* 11 */
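     /* The ='character' form is PROC PLOT syntax; PROC GPLOT expects
     y*x=n or y*x=variable together with SYMBOL statements, as in code 13. */
     PROC PLOT DATA=lab1_2;
          PLOT age*height='x' age*weight='*';
     RUN;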

     /* 12 */
     /* Specify the percentage of the available horizontal space for each
     plot */
     PROC PLOT DATA=lab1_2 HPERCENT=50;
          PLOT age*height;
          PLOT age*weight;
     RUN;

     /* 13 */
     symbol1 v='H' c=blue;
     symbol2 v='O' c=yellow;
     proc gplot DATA=random;
           plot Y*X=Group / cframe=ligr nolegend;
     run;

  4. CHART

     /* 14 */
     PROC CHART DATA=lab1_2;
          VBAR height;
          /* VBAR - vertical bar chart (histogram), HBAR - horizontal,
             BLOCK - block chart, PIE - pie chart */
     RUN;

     /* 15 */
     PROC CHART DATA=lab1_2;
          BLOCK height / MIDPOINTS=150 160 170 180 190
          GROUP=sex;    /* or SUBGROUP=sex */
     RUN;



Exercise
  1. Generate a dataset with three variables x, y, z and a class variable. There are three
     classes: A, B and C. The variables x, y and z are randomly generated (use the normal
     function) with the following mean vector for each class: A – [0,0,0], B – [3,3,3],
     C – [1,2,3].
  2. Print the generated data.
  3. Print a histogram of each of the variables x, y and z.
  4. Print histograms of all variables grouped by the class attribute.
  5. Plot the data in 2D space.

								