USING SAS, SASACCESS ® , AND SQL PASSTHROUGH TO by xit16869

VIEWS: 13 PAGES: 5

									      USING SAS, SAS/ACCESS®, AND SQL PASSTHROUGH TO QUERY AND JOIN ORACLE TABLES:
          An Example Using the Health Care Finance Administration’s SDPS (Medicare) Database
                                       Barbara B. Okerson, Ph.D.
                              Mid-South Foundation for Medical Care, Inc.


ABSTRACT                                                               admissions and their diagnosis and procedure codes.
                                                                       Progression, comorbidities, treatment, and outcome are
                                                                       compared and contrasted by beneficiary, hospital, and region. All
Medicare Encounter data is received by each state’s Peer Review        of this information is contained in the Oracle SDPS database.
Organization as an Oracle database known as the SDPS
(Standard Data Processing System) data set. This data set is           SAS software provides an interface with Oracle that provides a
used for quality improvement projects, integrity investigations,       number of options for interfacing and isolating the data needed
and research. This data set is then queried with SAS software          for this analysis. This paper contrasts four methods for solving
and the SAS SQL pass-through facility. Often the criteria for          this Oracle query in terms of ease of use, ease of programming,
selecting data from the SDPS data set are contained in a pre-          time and space considerations, and situational advantages and
extracted SAS data set. Using the criterion information from the       disadvantages.
SAS data set, additional data is extracted from the SDPS
database and merged or combined with the existing SAS data set         These examples use SAS version 6.12 running on a Dell Pentium
or its information. In this example used to document                   running Windows 95, also Oracle version 8 and SAS version 6.12
comorbidities and disease progression for HIV, a SAS data set is       on an RS/6000 model 390 running AIX version 4.1.
run against the SDPS Oracle database to obtain subsequent              Communication between the two platforms is achieved with
admissions and diagnoses for each patient (Medicare                    SAS/ACCESS, SAS/CONNECT, and SQL Passthrough.
beneficiary). This paper illustrates four methods for performing
this query using SAS/ACCESS, SAS/CONNECT®, and Proc
SQL.
                                                                       STEP ONE: QUERYING THE DATABASE TO
                                                                       ISOLATE THE AFFECTED POPULATION
INTRODUCTION
                                                                       The first step is to extract the first admission in the database for
                                                                       code 042. Because the total HIV admissions by year are used as
Mid-South Foundation for Medical Care, Inc. functions as the           a part of the report, all of these are written to a SAS data set and
Peer Review Organization (PRO) for the state of Tennessee.             a new data set created that includes the beneficiary number
The PRO program exists to ensure that Medicare beneficiaries           (bene_clm) and the discharge date (hse_clm0) of the first
receive appropriate, high quality care as mandated by HCFA             admission. The following code creates these two data sets.
(Health Care Finance Administration). Among the tasks required
for the PROs are:
                                                                         UVXEPLW 6$6695
• Research and distribution of statistical analysis and data on          SURF VTO
the quality of care among Medicare beneficiaries within the state.          FRQQHFW WR RUDFOHXVHU 
ERNHUVRQ
 SDVV [[[[[[
                                                                              SDWK 
WQBGEVYU

• Monitoring of trends and variations in medical care by                    FUHDWH WDEOH KLYWHPS DV
geographic area within the state and in state-to-nation                     VHOHFW EHQHBFOP KVSBLG KVHBFOPB KVHBFOP
comparisons.
                                                                              KVHBLQSD KVHBEHQH EHQHBVWD EHQHBVH[
                                                                              EHQHBUDF EHQHBPGF KVHBGJQV GJQVBFG KVHBGJQ
• Identification of sentinel events that may indicate less-than-
                                                                            IURP FRQQHFWLRQ WR RUDFOHVHOHFW $%(1(B&/0B180
optimal care.
                                                                              $+63B,' $+6(B&/0B)520B'7
                                                                              $+6(B&/0B7+58B'7 $+6(B,13$7B2873$7B,1'
• Identification, prioritization and action upon opportunities for
improvement.                                                                  $+6(B%(1(B$*( $%(1(B67$7(B&2'(
                                                                              $%(1(B6(;B,'(17B&'     $%(1(B5$&(B&'
• Evaluation of the impact of each quality improvement                        $%(1(B0'&5B6786B&'     $+6(B'*16B6(4
intervention.                                                                 $'*16B&'     $+6(B'*16B&2025%,'
                                                                            IURP 71B6'36+6( $ 71B6'36+6(B',$* $
In accordance with this tasking, Mid-South Foundation monitors            ZKHUH $%(1(B&/0B180    $%(1(B&/0B180 $1'
medical care, diagnosis groups, and treatment patterns                        $+6(B&/0B)520B'7    $+6(B&/0B)520B'7 $1'
throughout the state of Tennessee. Part of this monitoring                    $+6(B&/0B7+58B'7    $+6(B&/0B7+58B'7 $1'
involves following the normal progression of a diagnosis to                   $'*16B&'   


identify commonalties in additional diagnoses, comorbidities,             RUGHU E\ $%(1(B&/0B180 $+6(B&/0B7+58B'7 DV
length of stays, and outcomes.                                                W EHQHBFOP KVSBLG KVHBFOPB KVHBFOP
                                                                              KVHBLQSD KVHBEHQH EHQHBVWD EHQHBVH[
This paper follows the progression of Medicare beneficiaries after            EHQHBUDF EHQHBPGF KVHBGJQV GJQVBFG
the first in-patient encounter for any beneficiary with a primary or          KVHBGJQ
other diagnosis code equaling HIV (042 ICD-9). The                       TXLW
methodology involves identification of those patients, isolating         HQGUVXEPLW
first occurrence, then, for each patient, identifying all subsequent
This data set is then used for admission, length of stay, and other     the SAS data set. In this method, the SAS data set variable
trends for HIV admissions.                                              names cannot be the same variable names as those from the
                                                                        Oracle data base.
The following code is then used to isolate the first admission for
each beneficiary for this diagnosis.
                                                                          UVXEPLW 6$6695
  UVXEPLW 6$6695                                                        SURF VTO
  GDWD WHPSEER                                                           FRQQHFW WR RUDFOHXVHU 
ERNHUVRQ
 SDVV [[[[[[
      VHW KLYWHPS E\ EHQHBFOP KVHBFOP                                   SDWK 
WQBGEVYU

      LI ILUVWEHQHBFOP                                                  FUHDWH WDEOH ZRUNRUD DV
  UXQ                                                                    VHOHFW EHQHBFO KVSBLG KVHBFOI KVHBFOW KVHBGJQ
  HQGUVXEPLW                                                              KVHBVHT KVHBFPEG WHPSEEREHQHBFOP WHPSEERKVSBLG
                                                                           WHPSEERKVHBFOPB WHPSEERKVHBFOPWHPSEERKVHBEHQH
                                                                           WHPSEEREHQHBVWD WHPSEEREHQHBVH[WHPSEEREHQHBUDF
The beneficiary numbers and dates contained in this data set are           WHPSEEREHQHBPGF WHPSEERKVHBGJQV WHPSEERGJQVBFG
then used as the criteria for querying and selecting all encounter         WHPSEERKVHBGJQ
data for each of these beneficiaries after their initial admission        IURP FRQQHFWLRQ WR RUDFOHVHOHFW %%(1(B&/0B180
from the Oracle database. Each of the methods given have                   %+63B,' %+6(B&/0B)520B'7
advantages and disadvantages.                                              %+6(B&/0B7+58B'7 %+6(B'*16B6(4
                                                                           %'*16B&' %+6(B'*16B6(4
                                                                          IURP 71B6'36+6(B',$* % DV RUEHQHBFO KVSBLG
STEP TWO: USING THE SAS DATA SET TO                                        KVHBFOI KVHBFOW KVHBGJQ KVHBVHT KVHBFPEG
FURTHER QUERY ORACLE                                                      ULJKW MRLQ ZRUNWHPSEER DV VDV
                                                                           RQ RUEHQHBFO VDVEHQHBFOP DQG RUKVHBFOW JH
The programming task required is to use the SAS data set as a              VDVKVHBFOP
means for querying the Oracle database. The resultant database            TXLW
should have all the beneficiary information contained in the              HQGUVXEPLW
original SAS data set as well as any additional diagnoses for
each beneficiary.

In terms of ease of programming, this task could be
accomplished by extracting the encounter table and diagnosis            Method Two: Creating a SAS Data View of the
table (both millions of records) and then merging with SAS. If
one has nothing but unlimited space and time and is not worried         Joined SAS Data Set and Oracle Tables
about computing costs this could be your method. Since that is
not usually the case in the real world, a more efficient and            The code to create a data set view differs very little from the code
expedient solution is necessary. Four such methods are                  in method one. What differs is when and where in the processing
illustrated here:                                                       the use of space and time occurs. However, interestingly, even if
                                                                        the view is used only one time to create a permanent SAS data
    •   Joining the SAS data set directly with the Oracle tables        set, the total processing time is less than creating a SAS data set
                                                                        as illustrated in method one (by about two seconds.)
    •   Joining the SAS data set with a view of the Oracle table
                                                                        The code follows:
    • Create a temporary Oracle table to join with the SAS
                                                                          UVXEPLW 6$6695
data set
                                                                          SURF VTO
                                                                          FRQQHFW WR RUDFOHXVHU 
ERNHUVRQ
 SDVV [[[[[[
   • Pass the matching information to Oracle using the SAS
macro language                                                             SDWK 
WQBGEVYU

                                                                          FUHDWH YLHZ ZRUNRUD DV
                                                                          VHOHFW EHQHBFO KVSBLG KVHBFOI KVHBFOW KVHBGJQ
                                                                           KVHBVHT KVHBFPEG WHPSEEREHQHBFOP WHPSEERKVSBLG
Method One: Merging the SAS Data Set Directly                              WHPSEERKVHBFOPB WHPSEERKVHBFOPWHPSEERKVHBEHQH
with the Oracle Tables                                                     WHPSEEREHQHBVWD WHPSEEREHQHBVH[WHPSEEREHQHBUDF
                                                                           WHPSEEREHQHBPGF WHPSEERKVHBGJQV WHPSEERGJQVBFG
When merging the SAS data set with the Oracle table in a single            WHPSEERKVHBGJQ
step, advantages include keeping the processing on the UNIX               IURP FRQQHFWLRQ WR RUDFOHVHOHFW %%(1(B&/0B180
server, accomplishing the join in a single program, and ease of            %+63B,' %+6(B&/0B)520B'7
programming. Disadvantages include processing time and the                 %+6(B&/0B7+58B'7 %+6(B'*16B6(4
work space required for the intermediate tables.                           %'*16B&' %+6(B'*16B6(4
                                                                          IURP 71B6'36+6(B',$* % DV RUEHQHBFO KVSBLG
In this solution, all the required variables are extracted from            KVHBFOI KVHBFOW KVHBGJQ KVHBVHT KVHBFPEG
Oracle into a SAS work file called ora. This file is joined with the      ULJKW MRLQ ZRUNWHPSEER DV VDV
original SAS data set using the defined criteria.                          RQ RUEHQHBFO VDVEHQHBFOP DQG RUKVHBFOW JH
                                                                           VDVKVHBFOP
The code for this solution follows. Note that it is necessary to list     TXLW
the data set name prefix to identify those variables selected from        HQGUVXEPLW
Any time subsequently that the view would be used instead of a       Method Four: Pass the Criterion Information to
data set would result in considerable processing time. However,
if a number of such joins are used regularly, the storage space
                                                                     Oracle using the SAS Macro Language
saved may be worth the increased processing time.
                                                                     In this method, the beneficiary number and date information are
                                                                     passed to Oracle through macro processing. Again, like the
                                                                     previous solution, the main advantage is that the processing
Method Three: Create a Temporary Oracle Table                        remains in Oracle. In this instance, Oracle privileges are not
                                                                     required.
to Join with the SAS Data Set
                                                                     Although difficult to code, with difficulty increasing by the number
In this method, the SAS data set containing the join criteria is     of variables that comprise the query criteria, the result is very
written to the Oracle data base as a temporary Oracle table. The     efficient code that takes advantage of client-server technology,
advantages are that all processing again remains on the UNIX         keeping all the selection processing in Oracle, while eliminating
side, all processing is in Oracle, the amount of data processed by   the necessity of creating a temporary Oracle table.
Oracle and by SAS is reduced, the actual join is in a single step,
and the programming is relatively easy.                              In the following code, two macros are created. In the first (bene),
                                                                     all the beneficiary numbers to be selected are written out as a
In order for this method to work, it is necessary for the            string or array of macro variables that can be retrieved by calling
programmer to have CREATE TABLE and DROP TABLE                       the macro. The SAS Autocall macro, %cmpres is used to
privileges for the Oracle data base. Where this is true, this        eliminate any multiple blanks from the string. The %cmpres
method probably provides the best solution. The following code       macro calls both the %left and the %trim macros to remove any
writes the SAS data set as an Oracle table.                          leading and trailing blanks.
  UVXEPLW 6$6695
                                                                     In the second macro (benedate), a series of macro variables are
  SURF GEORDG GEPV RUDFOH GDWD WHPSEER
                                                                     created that pair the beneficiary numbers with the initial discharge
          XVHU ERNHUVRQ RUDSZ [[[[[[ SDWK 
#I6'36
               date. These can then be called by the SQL to Oracle code. It is
          WDEOH WULWHPS ORDG UXQ                                  necessary to use the %global statement when creating these
  HQGUVXEPLW                                                        variables; it is not implicit in this case.

It is important to make sure that the criterion variables have the   The code to create the macros follows:
same characteristics in the temporary Oracle table as they do in
the permanent Oracle data base. SAS software automatically               UVXEPLW
converts variable length formats to fixed formats when creating          GDWD BQXOOB
SAS data sets from Oracle. Although that code is not given, in               VHW WHPSEER QREV QREV
this example it was necessary to reformat the beneficiary number         FDOO V\PSXWFRXQWSXWQREV
variable (bene_clm) to a varying format. This was not necessary          FDOO V\PSXWEHQH__OHIWQREV__WULPEHQHBFOP__
for the previous two methods. The join code follows:                     UXQ
                                                                         PDFUR EHQH
  UVXEPLW 6$6695                                                           GR L    WR 	FRXQW
  SURF VTO                                                                  FPSUHV		EHQH	L HQG
  FRQQHFW WR RUDFOHXVHU 
ERNHUVRQ
 SDVV ;;;;;;                          PHQG
   SDWK 
WQBGEVYU
                                                     PDFUR EHQHGDWH
  FUHDWH WDEOH ZRUNRUD DV                                               GR L     WR 	FRXQW
  VHOHFW EHQHBFOP KVSBLG KVHBFOPB KVHBFOP                               GDWD BQXOOB VHW RQH
   KVHBLQSD KVHBEHQH EHQHBVWD EHQHBVH[ EHQHBUDF                            LI QREV 	L
   EHQHBPGF KVHBGJQV GJQVBFG KVHBGJQ KVHBFOPI                          JOREDO QDPH	L GDWH	L
   KVHBFOPW GJQVBFG KVHGJQVKVHGJQ                                          FDOO V\PSXW1$0(	LWULPEHQHBFOP
  IURP FRQQHFWLRQ WR RUDFOHVHOHFW $%(1(B&/0                               FDOO V\PSXW'$7(	LKVHBFOP
   $+63B,' $+6(B&/0B $+6(B&/0                            HQG PHQG
   $+6(B,13$ $+6(B%(1( $%(1(B67$
   $%(1(B6(; $%(1(B5$& $%(1(B0'&
   $+6(B'*16 $'*16B&' $+6(B'*1                       Although it would be possible to use only the paired macro
   $+6(B&/0B)520B'7 $+6(B&/0B7+58B'7                       variables to query Oracle, it is actually more efficient to isolate
   $'*16B&' $+6(B'*16B6(4                                  within Oracle those beneficiary numbers that match before
   $+6(B'*16B&2025%,'                                            performing the date processing. If processing space
  IURP %2.(562175,7(03 $ 71B6'36+6(B',$* $                      considerations are the primary concern, rather than a balance of
  ZKHUH $%(1(B&/0   $%(1(B&/0B180                             time and space, this problem can be addressed using only the
     $1' $+6(B&/0 OH $+6(B&/0B7+58B'7                       second macro.
      DV W EHQHBFOP KVSBLG KVHBFOPB KVHBFOP
     KVHBLQSD KVHBEHQH EHQHBVWD EHQHBVH[ EHQHBUDF               The following code uses the macro variables, rather than joins, to
     EHQHBPGF KVHBGJQV GJQVBFG KVHBGJQ KVHBFOPI                create the data set containing the identical information resulting
     KVHBFOPW GJQVBFG KVHGJQV KVHGJQ                           from the previous three methods.
  TXLW
  HQGUVXEPLW
    3URF VTO                                                       BIBLIOGRAPHY
    FRQQHFW WR RUDFOHXVHU 
ERNHUVRQ
 SDVV ;;;;;;
       SDWK 
WQBGEVYU
                                            Buffum, Henry W., “Strategic Uses of SAS Data Step
    FUHDWH WDEOH RUDKLY EHQHBFOP FKDU KVSBLG FKDU                    Programming and SQL Passthrough to Query Oracle
       KVHBFOPB GDWH KVHBFOP GDWH KVHBLQSD FKDU                     Databases,” Proceedings of the Twenty-first Annual SAS
       KVHBEHQH QXP EHQHBVWD FKDU EHQHBVH[ FKDU                      Users Group International Conference, Cary, NC: SAS
       EHQHBUDF FKDU EHQHBPGF FKDU KVHBGJQV FKDU                     Institute, Inc., 1996.
       GJQVBFG FKDU KVHBGJQ FKDU
    FUHDWH WDEOH WHPSRUD DV VHOHFW EHQHBFOP KVSBLG               Carpenter, Art, Carpenter’s Complete Guide to the SAS Macro
       KVHBFOPB KVHBFOP KVHBLQSD KVHBEHQH                        Language, Cary, NC: SAS Institute, 1998.
       EHQHBVWD EHQHBVH[ EHQHBUDF EHQHBPGF
       KVHBGJQV GJQVBFG KVHBGJQ                                  Health Care Finance Organization. “Peer Review Organization
    IURP FRQQHFWLRQ WR RUDFOHVHOHFW $%(1(B&/0B180                Overview.” http://www.sdps.org/pro_over.html.
       $+63B,' $+6(B&/0B)520B'7
       $+6(B&/0B7+58B'7                                        Health Care Finance Administration, “Introducing the Medicare
       $+6(B,13$7B2873$7B,1' $+6(B%(1(B$*(                  Quality of Care Surveillance System,” Quality Resume,
       $%(1(B67$7(B&2'( $%(1(B6(;B,'(17B&'                  1(1997): 1-2.
       $%(1(B5$&(B&' $%(1(B0'&5B6786B&'
       %+6(B'*16B6(4 %'*16B&'                             Klenz, Bradley W. , “Using the SAS/ACCESS Interface with
       %+6(B'*16B&2025%,'                                         Relational DBMSs: Beyond the Basics,” Proceedings of the
    IURP 71B6'36+6(B',$* $ 71B6'36+6(B',$* % ZKHUH               Sixteenth Annual SAS Users Group International Conference,
       $p%(1(B&/0B180p LQEHQH    DQG                             Cary, NC: SAS Institute, Inc., 1991.
       $p%(1(B&/0B180p %p%(1(B&/0B180p DV
    RUEHQHBFOP KVSBLG KVHBFOPB   KVHBFOP                     Okerson, Barbara B., “Using the SAS SQL Procedure,” The
       KVHBLQSD KVHBEHQH EHQHBVWD EHQHBVH[                        University of Memphis, 1993, 1996.
       EHQHBUDF EHQHBPGF KVHBGJQV GJQVBFG
                                                                    SAS Institute. SAS/ACCESS Software for Relational Databases:
       KVHBGJQ
                                                                      Reference, Version 6, Cary, NC.
    PDFUR PGDWD
    GR L    WR 	FRXQW
                                                                    SAS Institute. SAS Guide to the SQL Procedure. Usage and
       OHW QDPH 		QDPH	L                                          Reference, Version 6., Cary, NC.
       OHW GDWH 		GDWH	L
       LQVHUW LQWR WHPSRUDKLY                                      SAS Institute. SAS Language. Reference, Version 6, Cary, NC.
       VHOHFW 
 IURP RUD ZKHUH RUDEHQHBFOP 	QDPH DQG
           RUDKVHBFOP JH 	GDWH
    HQG                                                           ACKNOWLEDGMENTS
    PHQG
    PGDWD
    TXLW                                                           SAS, SAS/ACCESS, and SAS/CONNECT are registered
                                                                      trademarks of SAS Institute, Inc. of Cary, North Carolina.
    HQGUVXEPLW
                                                                    Other brand and product names are registered trademarks or
                                                                      trademarks of their respective companies.

STEP THREE: PROCESS THE DATA AND WRITE                              The analyses upon which this publication is based were
THE REPORTS                                                         performed under Contract Number 500- - , entitled “Utilization and
                                                                    Quality Control Peer review organization for the State of
                                                                    Tennessee,” sponsored by the Health Care Financing
This is where the fun begins: the actual analysis, search for
                                                                    Administration, Department of Health and Human Services. The
commonalties, trends, problems, and reportable phenomena.
                                                                    content of this publication does not necessarily reflect the views
                                                                    or policies of the Department of Health and Human Services, nor
                                                                    does mention of trade names, commercial products or
CONCLUSION                                                          organizations imply endorsement by the U.S. Government.

The best solution generally becomes a combination of                The author assumes full responsibility for the accuracy and
programming ability, database size, data needs, hardware and        completeness of the ideas presented. This article is a direct
software limitations, and application. In this situation, ease of   result of the Health Care Quality Improvement Program initiated
programming was sacrificed for efficiency because of the size of    by the Health Care Financing Administration, which has
the database.                                                       encouraged identification of quality improvement projects derived
                                                                    from analysis of patterns of care, and therefore required no
In all cases, good knowledge of SAS/ACCESS, SAS/CONNECT,            special funding on the part of this contractor. Ideas and
SQL language including Pass-Through syntax, macro language,         contributions to the author concerning experience in engaging
and Oracle formats is imperative. The code for any good solution    with issues presented are welcome.
should ultimately be generic enough that it can be adapted to
future applications. This code can be used to track any group of
patients through subsequent admissions.
For more information contact:

   Barbara B. Okerson, Ph.D.
   Biostatistician/Health Care Data Analyst
   Mid-South Foundation for Medical Care, Inc.
   6401 Poplar Ave
   Memphis, TN 38119
   (901) 682-0381, ext. 304
   e-mail: tnpro.bokerson@sdps.org

								
To top