                                  Syllabus for
                 Introduction to Data Mining With Linear Models
                                Using Your Data

Instructor: Patricia B. Cerrito
            Department of Mathematics
            University of Louisville
            Louisville, KY 40292
            pcerrito@louisville.edu
            502-852-6010
            502-852-7132 (fax)

Office:              115 Natural Sciences Building

The best way to contact the instructor is via e-mail. I will usually respond
within an hour or two (meetings excepted), including weekends. If you
would like to speak on the phone, or meet in my office, please request this
meeting via e-mail so that we can arrange a mutually agreeable time.

You are welcome to supply your own data for this course. However, you need to send
me a complete description of the data so that I can approve it before you start the
course.

Proposed assignments:

1. You will have three paper assignments that contain specific information
   related to the material in the course.
2. There will be a required final project, which will be a submission to
   a conference or professional journal. If you submit to a conference, you must
   be prepared to present your work if it is accepted. The project must be related
   to your choice of data.

There will be three papers and a final paper, and each will count for 20% of the
grade. The papers are to be submitted via lulu.com/CECS694. The remaining 20%
will be assessed on participation on the course discussion board at
lulu.com/CECS694 and on practice assignments. There will be no tests or
examinations in the course. The userid and password for this account are
onlinececs@gmail.com and cecs694class, respectively. Each of you will have a
project on this website, and you should upload all of your papers to your
specific project there.

You will be able to start on course material at home as soon as you have
purchased a license for SAS, and have successfully installed the SAS license.
Otherwise, you will need to use the computer lab. It is suggested that you pace
yourself through the semester so that all assignments are completed in a timely
fashion. If you fall too far behind the rest of the students in the course, you will
have difficulty with the participation portion of the grade. You should be prepared
to allocate a minimum of 10 hours per week to working in the course. You need
to block out that time in your schedule and use it for study.

The statistical software, SAS, will be used throughout the course. It is the
statistical software most used in the business world, and in healthcare. It is the
most comprehensive statistical software available. Even if you have a version of
SAS installed on your machine already, you will need to re-install the more recent
version that is now available. You will need to use my license instead of the
license supplied by IT. I will need to see your purchase receipt in order to do this.
Separate instructions on how to obtain SAS will be provided. I am hoping to have
SAS Forecast Server available as well.

In order to use SAS, you need to purchase a license for $30.00. You can
purchase the license through IT software resales located at
http://louisville.edu/it/services/software/. Once you have purchased the license,
please send me the receipt and I will send you the license you need to use.

Currently, the full version of SAS is located at https://webapp.louisville.edu/sas.
The download contains CD images. You should have a CD creation program
such as Roxio or Sonic on your computer. That program will have an option for
copying CD images to CDs. Do NOT just use the diskcopy option or you will get
unusable install CDs. This install requires Windows XP, and you should have
512 MB of RAM on your machine in order to run SAS. However, without the
license, the install CDs will not load.

Once you have downloaded the CDs, start with SAS Setup CD to install SAS, or
with SAS Software Navigator to install Enterprise Miner 5.2 on a personal
workstation. We will not be using Enterprise Miner in this class, and it is only for
those who are interested. From that point on, SAS will instruct you on which CD to
install next. SAS must be installed sequentially. If you have difficulties with the
installation, tell me about it immediately! While I may not be able to solve the
problem, I can get you in contact with someone who can. Once you have
completed this part of the installation, you need to install Enterprise Guide. Find
the Enterprise Guide Setup CD and install it.

The most important job skill wanted by employers is the ability to communicate.
Therefore, written communication will be strongly encouraged, and you will be
required to write according to a technical style. Materials will be posted on how to
write papers.

Approximate Paper due dates:
Paper 1                                   Week 5
Paper 2                                   Week 9
Paper 3                                   Week 12
Final Paper                               Week 15

Each paper counts 20% of the final grade. The remaining 20% of your grade is
based upon participation on the discussion threads as outlined on
lulu.com/cecs694. A minimum number of postings to the discussion threads is
required for this grade. You need to submit at least 3 postings per week
throughout the semester.

You will need to resubmit each paper assignment until it is accepted by me,
at which point the grade will be an A. I will grade each paper strictly, but I
will provide feedback on each paper submission so that you have the
opportunity to learn from your writing.

Students with disabilities should contact me or the University of Louisville’s
Disability Resource Center as soon as possible. We will make every effort to
accommodate your needs.

Course Topics
Section   Description                        Outline
1         This section provides the basic    Introduction to datasets,
          foundation of SAS so that the      Enterprise Guide. Do just enough of the
          student (or reader) can safely     training guide to become familiar with EG.
          navigate through SAS to solve
          problems.
2         Data Visualization Techniques      Bar graphs in SAS
                                             Scatterplots in SAS
                                             Kernel density estimation in SAS
                                             Examples of graphs and interpretations.
                                             Use material posted on kernel density
                                             estimation.
3         General Linear Model and           Regression versus the general linear
          Logistic Regression                model
                                             The meaning of the p-value and the
                                             correlation coefficient
                                             Type I versus Type III Sums of Squares.
                                             Use posted material and SAS textbook.
4         Mixed Models                       The difference between the mixed models
                                             algorithm and the general linear model
                                             Repeated Measures. Use posted material
                                             and SAS textbook.
5         Generalized Linear Models and      An introduction to the generalized linear
          generalized linear mixed models    model and how it differs from the general
                                             linear model and from mixed models. Use
                                             posted material and SAS textbook.




Proposed Textbook:

1. Data Mining With Linear Models
2. Data download for textbook

You can order the text online at http://stores.lulu.com/CECSTextbooks. It
will take a week or so for the text to arrive, so be certain to order
early.

Optional Textbook: SAS for Mixed Models, Second Edition, by Ramon Littell,
George Milliken, Walter Stroup, Russell Wolfinger, and Oliver Schabenberger. I
suggest you purchase this text from SAS Press at
http://www.sas.com/apps/pubscat/complete.jsp.


Proposed assignments:

1. You should have a dataset of sufficient complexity that you can write on data
   visualization, logistic regression, and linear regression from the same dataset.
   If you don’t have a dataset available to you, I have posted several sites where
   you can find data.
2. Paper 1: Logistic Regression
3. Paper 2: Mixed Model
4. Paper 3: Generalized Linear Model or Generalized Linear Mixed Model
5. Paper 4: Combine papers 1-3 into a cohesive whole for submission to a
   professional journal or conference.




Writing
Writing is required in this course, and you will need to write according to a
specific format. There needs to be a title page, an abstract page, the body of the
paper, and references. Except for the title page, all pages are numbered.

The abstract should be 250 words or fewer, and it should have the format:
objective, methods, results, and conclusion. Results are specific; a conclusion is
more general in tone. The body of the paper then expands on the four sections
given in the abstract.

The introduction should have the objective and some results. It should also
contain any background material. As each of you has a separate disease, you
will need to have some information concerning your disease in this background
section. The introduction is followed by methods, results, and conclusion.

The methods section should describe both the data and the statistical methods
used on the data. The results section should always contain a data summary in
addition to a data analysis.

The paper should also read something like a narrative. The data tell a story. You
have to discover that story, and then you have to write it! Assume that the reader
is generally educated but has little specific medical or statistical knowledge.
There is no specific minimum number of pages or words. Include what you need,
but be as brief as possible.

Tables and figures should be numbered and labeled. Distinguish between them.
A table is not a figure!

Writing is a process. Expect to receive feedback from me and be prepared to
resubmit your paper. Once a paper is accepted by me, it is an A paper. I have
attached an example written by one of my students in a previous semester. It
was paper 3, so it shows some of the statistical methods required for the paper.




Factors Influencing the Length of Stay in a General
Hospital for Inpatients Diagnosed with Depression
Name
Department of Mathematics
University of Louisville
Louisville, KY 40208
Email
Date
Abstract

Hospital inpatients are consumers of hospital bed time, a precious commodity in
our society. The length of stay in a general hospital is therefore of interest to
the medical community (doctors, nurses, and patients) as well as to the businesses
that bring them together (hospitals, insurers, HMOs, etc.). Furthermore, the
financial costs of hospital stays are of particular interest to any parties
responsible for their payment. Comorbid depression is linked to extended stays,
and among depressed patients, labor & delivery (L&D) patients are the most common;
however, L&D patients average significantly shorter lengths of stay (2.5 days vs.
5.4 days) and lower costs ($6,900 vs. $18,900) than other inpatients with
comorbid depression. Among the twenty most prevalent diagnoses, urinary tract
infections, anemia, fluid & electrolyte disorders, and hypertension extended the
lengths of stay most. Of the top twenty procedures performed, diagnostic vascular
catheterization, respiratory intubation & mechanical ventilation, hemodialysis,
and blood transfusions were correlated with longer stays and higher expenses in
the hospital. Alcohol & drug rehabilitation/detoxification was shown to decrease
an inpatient’s length of stay, although this may be due in part to transfer to a
separate facility. Fewer diagnoses or procedures were significant among labor &
delivery patients, and their impact was smaller in that subgroup. It was
determined that, while longer lengths of stay can significantly inflate total
charges, the reliability of the length of stay as a predictor is usually below 95%.
The variation in length of stay and total charges remains largely unexplained by
these findings.




Introduction

According to the most recent national survey by the U.S. Centers for Disease
Control [7], almost 37 million American adults reported having no health
insurance. Numerous news articles, books, political platforms, and studies have
drawn attention to the rapid growth of the costs and utilization of our health care
system. In light of the high demand for medical attention, including by those who
currently are unable to afford it, the amount of time an inpatient spends in the
hospital is of particular interest not only to the patients and their doctors, but also
to other, yet-to-be-seen patients, hospital administrators, and insurance
companies or other payers. In addition, the financial costs are highly relevant to
this discussion and must be analyzed alongside the length of stay.

Under the supervision of the federal Agency for Healthcare Research and
Quality, the Healthcare Cost and Utilization Project has gathered, cleaned, and
released an extensive and modern database of hospital-reported information.
The Nationwide Inpatient Sample (2004) will be the primary source of information
in this study [1].

Depression is a prevalent illness in the United States. Major Depressive
Disorder affects roughly 6.7% of American adults in a given year and about twice
as many women as men. It is the leading cause of disability in Americans age 15
to 44, according to the National Institute of Mental Health [2].

Previous research has shown that patients seldom report feeling depressed;
about 1.2% of patients visiting their primary care providers listed depression as
the reason for their visit [3][5]. It should be noted that depression is often not the
primary reason for admission to the hospital or the principal diagnosis, and
therefore, further understanding of its effects as a comorbid factor is desirable.

It has been demonstrated that psychiatric comorbidity increases an inpatient’s
length of stay in a general hospital [5]. Given that a patient has been diagnosed
with depression, however, it would be useful to understand what other factors
are likely to affect the length of time spent in the hospital. These factors could be
descriptive of the patient, such as age, race, or gender; other factors may relate
directly to why the patient is in the hospital (diagnoses) or what types of care
(procedures) the patient received.

Certain Diagnosis-Related Groups (DRGs) of patients, as categorized for
Medicare billing purposes, are more often found to be comorbid with depression
than others. This study will examine a nationally representative sample of
inpatients for the most common DRGs associated with depression.

The International Classification of Diseases, published by the World Health
Organization, provides doctors and hospitals worldwide with a standard, exhaustive
numerical coding system for diagnoses and procedures. Using these codes, it is
possible to analyze the statistical significance of each diagnosis and procedure.

The goal of this analysis is to determine what, if any, relationships exist between
a patient’s length of stay and the patient’s age, race, gender, DRG codes,
diagnosis codes, or procedure codes reported during the stay. The total charges
incurred will also be regressed to identify which factors have the greatest
monetary impact on the cost of care. This paper will focus specifically on those
patients who were diagnosed as depressed.

This study will use logistic and linear regression techniques to determine which
factors (age, race, gender, DRG, diagnosis, procedure, and payer type)
have the most significant influence on an inpatient’s length of stay and total
charges incurred in the hospital. Results show a strong relationship between
an inpatient’s length of stay and both gender and certain diagnoses, such as
chest pain, chronic illness, or mental disorder.




Methods

This study analyzed a modern, large, objective, representative sample
(N=40,394) of inpatients who were diagnosed with depression. The sample, a
subset of the Nationwide Inpatient Sample (NIS), was obtained from the Healthcare
Cost & Utilization Project [1]. The statistical modeling software SAS was used to
clean the data, determine the descriptive properties of the sample, perform the
regressions, and create the figures and tables provided in this paper. Any
records with missing values in the critical categories analyzed were omitted from
the results. Some categorical variables in the NIS data contained a wide array of
values; any patient whose record was missing codes for any of these categories
was omitted from the regression. In the NIS data, each patient is assigned a
single DRG code from a list of between 500 and 600 codes. To reduce the
degrees of freedom in the analysis, only the 10 most frequent DRG codes
were kept. The remaining DRGs were recoded into an
“Other” category. Table 1 (below) lists these top-10 DRGs.


         TABLE 1: TOP TEN DRG CODES IN INPATIENTS WITH DEPRESSION
  Recode_DRG   Description                                                               Percent
       Other   Other DRG                                                                   70.29
         373   Vaginal delivery w/o complicating diagnosis                                  6.32
         426   Depressive Neuroses                                                          3.95
          89   Simple pneumonia & pleurisy age >17 w/ CC                                    2.85
         209   Major joint & limb reattachment procedures of lower extremity                2.71
         127   Heart failure & shock                                                        2.55
         143   Chest pain                                                                   2.37
         182   Esophagitis, gastroent & miscellaneous digestive disorders age >17 w/ CC     2.36
          88   Chronic obstructive pulmonary disease                                        2.36
         462   Rehabilitation                                                               2.25
          14   Intracranial hemorrhage & stroke w/ infarction                               1.99
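
The recoding behind Table 1 can be sketched in a few lines. The following is a minimal Python illustration, not the SAS code used in the study (which the paper does not show). The function names and the eight DRG values are hypothetical and serve only to show the mechanics of keeping the most frequent codes and collapsing the rest into “Other.”

```python
from collections import Counter


def recode_top_n(codes, n, other_label="Other"):
    """Keep the n most frequent codes; map every other code to other_label."""
    top = {code for code, _ in Counter(codes).most_common(n)}
    return [code if code in top else other_label for code in codes]


def percent_table(recoded):
    """Return (code, percent) pairs, most frequent first."""
    counts = Counter(recoded)
    total = len(recoded)
    return [(code, round(100 * k / total, 2)) for code, k in counts.most_common()]


# Hypothetical DRG codes for eight patients (illustration only, not NIS data).
drg = ["373", "373", "373", "426", "426", "89", "127", "143"]
recoded = recode_top_n(drg, n=2)
table = percent_table(recoded)
```

With n=10 and the full DRG column, the same two steps would yield a frequency table shaped like Table 1. Ties at the cutoff are resolved by first appearance here, which is an arbitrary choice of this sketch.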




The NIS database permits up to 15 ICD-9 DX (diagnosis) codes, stored for each
patient in variables labeled DX1, DX2, …, DX15. Because each DXn variable
may contain any of thousands of codes, it was necessary to recode the ICD-9
codes reported in the data, keeping for regression only the twenty DX (diagnosis)
codes most prevalent in patients with depression, while recoding all
lower-ranking codes as “Other,” as shown in Table 2 below.

                   TABLE 2: TOP TWENTY ICD-9 DIAGNOSIS CODES
                       AMONG INPATIENTS WITH DEPRESSION
    Recode_DX   Description                                                       Percent
        Other   Other Diagnosis                                                     62.05
          311   Mood disorders                                                      11.60
         4019   Essential hypertension                                               5.01
        53081   Esophageal disorders                                                 1.94
        25000   Diabetes mellitus without complication                               1.77
         3051   Screening and history of mental health and substance abuse codes     1.60
         4280   Congestive heart failure; nonhypertensive                            1.45
        41401   Coronary atherosclerosis and other heart disease                     1.39
         2449   Thyroid disorders                                                    1.37
         2724   Disorders of lipid metabolism                                        1.35
         V270   Normal pregnancy and/or delivery                                     1.23
         2765   Fluid and electrolyte disorders                                      1.09
         2859   Deficiency and other anemia                                          1.08
         5990   Urinary tract infections                                             1.07
        66311   Umbilical cord complication                                          1.00
          496   Chronic obstructive pulmonary disease and bronchiectasis             0.99
        42731   Cardiac dysrhythmias                                                 0.95
         2720   Disorders of lipid metabolism                                        0.90
         2768   Fluid and electrolyte disorders                                      0.75
        30000   Anxiety disorders                                                    0.71
        71590   Osteoarthritis                                                       0.70




Similarly, the top 20 ICD-9 PR (procedure) codes were kept while recoding the
remainder as “Other.” These results are shown in Table 3 below.

               TABLE 3: TOP TWENTY ICD-9 PROCEDURE (PR) CODES
                      AMONG INPATIENTS WITH DEPRESSION
     Recode_PR Description                                                   Percent
          Other Other Procedure                                                 65.02
           9904 Blood transfusion                                                 4.09
           7359 Other procedures to assist delivery                                3.2
           3893 Other vascular catheterization; not heart                         3.11
           7569 Repair of current obstetric laceration                            2.29
           8856 Diagnostic cardiac catheterization; coronary arteriography        1.95
           3722 Diagnostic cardiac catheterization; coronary arteriography        1.82
           7309 Artificial rupture of membranes to assist delivery                1.79
            741 Other diagnostic procedures on musculoskeletal system             1.74
           8853 Diagnostic cardiac catheterization; coronary arteriography         1.7
           4516 Upper gastrointestinal endoscopy; biopsy                          1.54
           3995 Hemodialysis                                                      1.46
           7534 Fetal monitoring                                                  1.39
            734 Other procedures to assist delivery                               1.36
           8872 Diagnostic ultrasound of heart (echocardiogram)                   1.35
           7154 Arthroplasty knee                                                 1.18
           4513 Upper gastrointestinal endoscopy; biopsy                           1.1
           9604 Respiratory intubation and mechanical ventilation                 1.02
           9462 Alcohol and drug rehabilitation/detoxification                    1.02
            736 Episiotomy                                                        0.94
           9671 Respiratory intubation and mechanical ventilation                 0.93

Each of these top-20 DX and PR codes, along with “Other DX” and “Other
PR,” was then assigned its own variable in each patient’s profile, with possible
values 0 (not present) or 1 (present). By reducing the possible diagnoses and
procedures, the degrees of freedom in the regressions were kept to a minimum
in order to improve the accuracy of the models created. For the same reason,
the dependent variable, length of stay, was recoded into five categories. These
were intervals of days in the hospital: 0 to 2 days, 3 to 5 days, 6 to 8 days, 9 to
11 days, and 12 or more days. Total charges were grouped into ten categories.
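
The two recoding steps in this paragraph, 0/1 indicator variables and length-of-stay intervals, can be illustrated with a short Python sketch. This is not the study's SAS code. The helper names are hypothetical, the example patient is invented, and treating “Other” as 1 whenever any untracked code is present is an assumption about how the flag was defined.

```python
def indicator_flags(patient_codes, tracked_codes):
    """Return a 0/1 flag per tracked code, plus an 'Other' flag."""
    present = set(patient_codes)
    tracked = set(tracked_codes)
    flags = {code: int(code in present) for code in tracked_codes}
    # Assumption: 'Other' is 1 when the patient has any code outside the tracked list.
    flags["Other"] = int(bool(present - tracked))
    return flags


def los_category(days):
    """Map a length of stay in days to one of the five intervals in the text."""
    if days <= 2:
        return "0-2"
    if days <= 5:
        return "3-5"
    if days <= 8:
        return "6-8"
    if days <= 11:
        return "9-11"
    return "12+"


# Hypothetical patient: two tracked diagnoses and one untracked code.
flags = indicator_flags(["311", "4019", "78650"], ["311", "4019", "25000"])
categories = [los_category(d) for d in (0, 4, 8, 11, 12)]
```

The ten total-charge categories could be built the same way, but the paper does not give their boundaries, so they are not sketched here.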

In order to eliminate outliers, all inpatients in the dataset with a length of stay
greater than 30 days (<1% of inpatients) were excluded, as were any records
with missing values for any of the independent variables analyzed. This reduced
the number of data points by about half, given the number of variables.

Prior to performing the linear regression, pregnancy-related patients (DRG codes
370-384) were separated from the others. Although other indicators of
pregnancy exist, such as diagnosis codes for pregnancy and procedure codes for
various methods of delivery, these options were less consistent and too varied
to be an effective screening method.
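
The exclusion and separation rules above can be summarized in a short Python sketch. It is an illustration under assumptions rather than the study's SAS code: records are modeled as dictionaries with hypothetical field names ("los", "drg", "age"), and a missing value is represented by None.

```python
def split_sample(records, required, max_los=30):
    """Drop incomplete or long-stay records, then split by pregnancy-related DRG."""
    fields = set(required) | {"los", "drg"}
    pregnancy, other = [], []
    for rec in records:
        if any(rec.get(field) is None for field in fields):
            continue  # missing value in one of the analyzed variables
        if rec["los"] > max_los:
            continue  # stays longer than max_los days are treated as outliers
        if 370 <= rec["drg"] <= 384:
            pregnancy.append(rec)  # pregnancy-related DRG range from the text
        else:
            other.append(rec)
    return pregnancy, other


# Hypothetical records (illustration only, not NIS data).
records = [
    {"los": 2, "drg": 373, "age": 28},
    {"los": 45, "drg": 127, "age": 70},    # excluded: stay longer than 30 days
    {"los": 6, "drg": 426, "age": None},   # excluded: missing age
    {"los": 4, "drg": 89, "age": 66},
]
pregnancy, other = split_sample(records, required=["age"])
```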



Results

First, the sample was analyzed for basic patient descriptors. See the figures
below for analysis of patient race, gender, and age. It should be mentioned that
for these preliminary census-style analyses, very few sample points were
omitted.

Figure 1: Race (frequency)

Key:
1.00 - White
2.00 - Black
3.00 - Hispanic
4.00 - Asian or Pacific Islander
5.00 - Native American
6.00 - Other



Figure 1 illustrates the cross-section of inpatients from the database who were
diagnosed with depression. According to the U.S. Census Bureau [4], 80.2% of
Americans are white, 12.8% are black, 14.4% are Hispanic, 4.3% are Asian, and
1% are Native American. By comparison, therefore, the three most prevalent
races in the U.S. (white, black, and Hispanic) are also the three most common
races diagnosed with depression. However, of these three groups, those of
Hispanic descent are much more underrepresented than whites or blacks.




Figure 2: Gender (frequency)

Key:
0.00 - Male
1.00 - Female

Figure 2 shows that the majority of those diagnosed with depression are female.
As in Figure 1, a majority representation here should not necessarily be taken to
mean that women are at a higher risk for depression than men, only that women
are diagnosed more frequently than men. However, because men and women
represent approximately equal portions of the population, this does suggest that
being female increases a patient’s risk of being depressed. A fact sheet on
depression in women from the website WebMD.com [6] indicates that the
most likely explanation for the higher concentration of women in the NIS data is
that women are indeed twice as prone to depression as men.




Figure 3: Age (density)




It is clear from Figure 3 that the prevalence of depression varies between
different age groups, with the highest concentration of diagnoses for patients
between 45 and 60 years of age. The graph shows a sharp decline in
diagnoses around ages 62-65, dropping to a low otherwise seen only
before age 40. Assuming that most people retire in their mid-60s, the data seem to
indicate that retirement suppresses the risk of developing clinical depression.

An interesting point is the steady increase in the diagnosis of depression
following retirement. This could be due to a late-life crisis or to the loss of
personal relationships, such as when spouses and friends of the same age pass
away. It may also be directly related to the likely physiological deterioration of
the mind and/or body in one’s later years, according to WebMD.com [4].
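
The paper does not state how the density curve in Figure 3 was produced. Assuming it is a kernel density estimate (a topic listed in the course outline), the idea can be sketched in Python as below. The Gaussian kernel, the bandwidth of 5 years, and the ten ages are all assumptions made for illustration and are not taken from the NIS data.

```python
import math


def gaussian_kde(sample, bandwidth):
    """Return a function x -> estimated density at x, using a Gaussian kernel."""
    if not sample or bandwidth <= 0:
        raise ValueError("need a non-empty sample and a positive bandwidth")
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Each observation contributes a small bell curve centered on itself.
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample
        )

    return density


# Hypothetical patient ages (illustration only).
ages = [23, 31, 47, 49, 52, 55, 58, 71, 78, 84]
density = gaussian_kde(ages, bandwidth=5.0)
```

Evaluating density over a grid of ages gives a smooth curve whose peaks mark the ages where observations cluster. A smaller bandwidth produces a bumpier curve, so dips such as the one described around ages 62-65 should be read with the smoothing choice in mind.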




Figure 4: Depression: Age (density) by Gender




Figure 4 shows that women (1) age 20 or older are fairly equally represented in
the NIS data, with a mild dip around age 40 and a larger one at around age 65.
Men (0), however, are shown to have a sharp increase in the prevalence of
depression between the ages of 30 and 45, peaking around age 50-55. This
period of increased depressive prevalence has often been described as a man’s
midlife crisis.

Among the cohort of inpatients diagnosed with depression, comorbidity with other
conditions such as heart trouble, labor and delivery, history of mental health or
drug abuse, and diabetes impacted the patient’s length of stay in the hospital.

A logistic regression was performed to examine the outcome of a patient’s length
of stay in the hospital when compared to independent variables such as patient
demographics and diagnosis and procedure codes. Several variables
considered by the regression do not appear among the results shown below in
Table 4. Though it is reasonable to believe that a patient’s status as alive or
dead when leaving the hospital may influence the time spent in the hospital, the
patients in this sample who died represent a very small minority (less than 2%).
Hence, there were not enough data points to accurately conclude
whether death statistically influenced the length of stay. In order to maintain
95% confidence in the results, all variables with a P-value higher than 5% (or
0.05) have been removed from consideration. Table 4 (below) lists those
variables determined to be statistically significant in predicting an inpatient’s
length of stay in the hospital.
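
For effects with one degree of freedom, the link between the Wald chi-square statistic and the P-value used for this 0.05 cutoff can be checked by hand. The Python sketch below relies on the standard identity that the upper-tail probability of a chi-square variable with 1 degree of freedom equals erfc(sqrt(x/2)). The function names are hypothetical, and effects with more than one degree of freedom (RACE, Recode_DRG, PAY1) would need the general chi-square distribution instead, which this sketch does not cover.

```python
import math


def wald_p_value_df1(chi_square):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))


def is_significant(p_value, alpha=0.05):
    """Apply the paper's cutoff: keep an effect only if its P-value is below alpha."""
    return p_value < alpha


# Wald chi-square values reported in Table 4 for two procedure codes.
p_pr_3722 = wald_p_value_df1(7.0759)  # Table 4 reports 0.0078
p_pr_7309 = wald_p_value_df1(4.3645)  # Table 4 reports 0.0367
```

Both values fall below 0.05, which is consistent with these two effects being retained in Table 4.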


                              TABLE 4: LOGISTIC REGRESSION
                                  Type 3 Analysis of Effects
                                                                Wald
   Effect                                               DF Chi-Square     Pr > ChiSq
   AGE                                                   1      56.7190        <.0001
   FEMALE                                                1      45.9837        <.0001
   RACE                                                  5      69.4379        <.0001
   Recode_DRG                                           10     711.1957        <.0001
   PAY1                                                  5     100.1254        <.0001
   Mood disorders (DX 311)                               1      34.2698        <.0001
   Esophageal disorders (DX 53081)                       1      11.8615        0.0006
   Screening and history of mental health and            1      10.7686        0.0010
   substance abuse codes (DX 3051)
   Congestive heart failure; nonhypertensive             1      48.1990        <.0001
   (DX 4280)
   Disorders of lipid metabolism (DX 2724)               1      12.7309        0.0004
   Normal pregnancy and/or delivery (DX V270)            1      11.7503        0.0006
   Fluid and electrolyte disorders (DX 2765)             1      65.9981        <.0001
   Deficiency and other anemia (DX 2859)                 1      57.0486        <.0001
   Urinary tract infections (DX 5990)                    1     137.7252        <.0001
   Umbilical cord complication (DX 66311)                1      25.7440        <.0001
   Chronic obstructive pulmonary disease and             1      28.5103        <.0001
   bronchiectasis (DX 496)
   Cardiac dysrhythmias (DX 42731)                       1      33.8118        <.0001
   Disorders of lipid metabolism (DX 2720)               1      13.1757        0.0003
   Fluid and electrolyte disorders (DX 2768)             1      56.6711        <.0001
   Anxiety disorders (DX 30000)                          1      16.6161        <.0001
   Blood transfusion (PR 9904)                           1     252.9093        <.0001
   Other vascular catheterization; not heart             1     974.7734        <.0001
   (PR 3893)
   Diagnostic cardiac catheterization; coronary          1       7.0759        0.0078
   arteriography (PR 3722)
   Artificial rupture of membranes to assist delivery    1       4.3645        0.0367
   (PR 7309)
   Other diagnostic procedures on musculoskeletal        1      76.5503        <.0001
   system (PR 741)
    Upper gastrointestinal endoscopy; biopsy           1      17.2931        <.0001
    (PR 4516)
    Hemodialysis (PR 3995)                             1      43.8447        <.0001
    Other procedures to assist delivery (PR 734)       1       7.6946        0.0055
    Diagnostic ultrasound of heart (echocardiogram)    1      15.9908        <.0001
    (PR 8872)
    Arthroplasty knee (PR 8154)                        1      11.4148        0.0007
    Upper gastrointestinal endoscopy; biopsy           1      14.3687        0.0002
    (PR 4513)
    Respiratory intubation and mechanical              1      49.4953        <.0001
    ventilation (PR 9604)
    Alcohol and drug rehabilitation/detoxification     1      35.9534        <.0001
    (PR 9462)
    Respiratory intubation and mechanical              1       4.1589        0.0414
    ventilation (PR 9671)


Variable descriptions:
    AGE refers to the patient’s age in years
    FEMALE is the patient’s gender, with 0 for male and 1 for female
    RACE is the patient’s race, possible values: (1) white, (2) black, (3)
      Hispanic, (4) Asian or Pacific Islander, (5) Native American, (6) other
    PAY1 is the patient’s expected primary payer, possible values: (1)
      Medicare, (2) Medicaid, (3) private including HMO, (4) self-pay, (5) no
      charge, (6) other
    Other variables are the presence of ICD-9 DX codes, followed by PR
      codes, where (0) not present and (1) present.

The regression also generated odds ratios for each variable, where the “Point
Estimate” column shows the ratio. An odds ratio equal to 1.000 for a particular
independent variable in the table indicates that the patient’s length of stay is
independent of that variable; that is, length of stay would not be expected to
increase or decrease if the variable examined increases or decreases. When the
odds ratio is greater than 1, the length of stay is expected to increase when the
variable does; when the odds ratio is less than 1, the length of stay is expected to
decrease when the variable value increases. The further the odds ratio is from 1,
the greater the impact of that variable on the length of stay. Note: for most
variables listed in this study, “0 increasing to 1” would mean “Not Present
increasing to Present”.
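
Because the odds ratio tables report each indicator as "0 vs 1", it can help to restate a ratio in the "1 vs 0" direction and to compare distances from 1 on the log scale. The sketch below is a Python illustration under those assumptions only; the helper names are hypothetical, and it does not assume which outcome level the logistic regression modeled.

```python
import math


def invert_odds_ratio(point, lower, upper):
    """Convert a '0 vs 1' odds ratio and its limits to '1 vs 0'.

    Taking reciprocals reverses the comparison; the confidence limits
    swap places because 1/x is a decreasing function.
    """
    return 1.0 / point, 1.0 / upper, 1.0 / lower


def distance_from_one(odds_ratio):
    """Strength of an effect on the log scale, so 0.5 and 2.0 compare equal."""
    return abs(math.log(odds_ratio))


def interval_excludes_one(lower, upper):
    """True when the confidence interval does not contain 1.0."""
    return lower > 1.0 or upper < 1.0


# FEMALE, 0 vs 1: point estimate and 95% Wald limits (0.796, 0.745, 0.850).
point, lower, upper = invert_odds_ratio(0.796, 0.745, 0.850)
print(round(point, 3), round(lower, 3), round(upper, 3))
```

An interval that contains 1.0, such as 0.821 to 1.365, does not by itself indicate a difference for that particular contrast.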




Table 5 lists these Odds Ratios; those variables with odds ratios between 0.95
and 1.05 were considered relatively inconsequential and have been omitted from
the table, along with any variables ruled insignificant in (and omitted from) Table
4 (above).


                                 TABLE 5a: Odds Ratio Estimates
                                                                     Point      95% Wald
  Effect                                                          Estimate   Confidence Limits
  FEMALE        0 vs 1                                               0.796     0.745       0.850
  Heart Failure & shock (DRG 127) vs Other                           1.059     0.821       1.365
  Intracranial hemorrhage & stroke w infarct (DRG 14) vs Other       0.516     0.409       0.653
  Chest pain (DRG 143) vs Other                                      4.523     3.193       6.408
  Esophagitis, gastroent & misc digest disorders age >17 w cc        1.130     0.910       1.405
  (DRG 182) vs Other
  Major joint & limb reattachment procedures of lower extremity      0.777     0.645       0.937
  (DRG 209) vs Other
  Vaginal delivery w/o complicating diagnoses (DRG 373)              1.838     1.477       2.287
  vs Other
  Depressive neuroses (DRG 426) vs Other                             0.483     0.380       0.616
  Rehabilitation (DRG 462) vs Other                                  0.061     0.048       0.078
  Chronic obstructive pulmonary disease (DRG 88)     vs Other        0.505     0.376       0.678
  Simple pneumonia & pleurisy age >17 w cc (DRG 89) vs Other         0.587     0.456       0.755
  Mood disorders (DX 311)     0 vs 1                                 0.651     0.564       0.751
  Esophageal disorders (DX 53081) 0 vs 1                             0.867     0.800       0.940
  Screening and history of mental health and substance abuse         0.862     0.789       0.942
  codes (DX 3051) 0 vs 1
  Congestive heart failure; nonhypertensive (DX 4280) 0 vs 1         1.465     1.316       1.632
  Disorders of lipid metabolism (DX 2724) 0 vs 1                     0.834     0.754       0.921
  Normal pregnancy and/or delivery (DX V270) 0 vs 1                  0.654     0.513       0.834
  Fluid and electrolyte disorders (DX 2765) 0 vs 1                   1.645     1.459       1.855
  Deficiency and other anemia (DX 2859) 0 vs 1                       1.461     1.324       1.612
  Urinary tract infections (DX 5990) 0 vs 1                          1.977     1.765       2.216
  Umbilical cord complication (DX 66311) 0 vs 1                      0.560     0.447       0.700
  Chronic obstructive pulmonary disease and bronchiectasis           1.355     1.212       1.515
  (DX 496) 0 vs 1
  Cardiac dysrhythmias (DX 42731) 0 vs 1                             1.423     1.264       1.603
  Disorders of lipid metabolism (DX 2720) 0 vs 1                     0.816     0.731       0.911
  Fluid and electrolyte disorders (DX 2768) 0 vs 1                   1.649     1.448       1.879
  Anxiety disorders (DX 30000)         0 vs 1                        1.316     1.153       1.501
  Blood transfusion (PR 9904) 0 vs 1                                 2.223     2.015       2.453
  Other vascular catheterization; not heart (PR 3893) 0 vs 1                  6.180           5.513    6.929
  Diagnostic cardiac catheterization; coronary arteriography                  0.650           0.473    0.893
  (PR 3722) 0 vs 1
  Artificial rupture of membranes to assist delivery (PR 7309)                0.806           0.658    0.987
  0 vs 1
  Other diagnostic procedures on musculoskeletal system (PR 741)              3.009           2.351    3.850
  0 vs 1
  Upper gastrointestinal endoscopy; biopsy (PR 4516)      0 vs 1              1.391           1.191    1.626
  Hemodialysis (PR 3995) 0 vs 1                                               1.759           1.488    2.078
  Other procedures to assist delivery (PR 734)   0 vs 1                       1.343           1.090    1.654
  Diagnostic ultrasound of heart (echocardiogram) (PR 8872)                   1.385           1.180    1.624
  0 vs 1
  Arthroplasty knee (PR 8154) 0 vs 1                                          1.547           1.201    1.992
  Upper gastrointestinal endoscopy; biopsy (PR 4513) 0 vs 1                   1.424           1.186    1.710
  Respiratory intubation and mechanical ventilation (PR 9604)                 2.401           1.881    3.065
  0 vs 1
  Alcohol and drug rehabilitation/ detoxification (PR 9462) 0 vs 1            1.809           1.490    2.195
  Respiratory intubation and mechanical ventilation (PR 9671)                 0.770           0.599    0.990
  0 vs 1



                       TABLE 5b: Association of Predicted Probabilities
                                 and Observed Responses
                     Percent Concordant                      74.7 Somers' D           0.499
                     Percent Discordant                      24.9 Gamma               0.501
                     Percent Tied                             0.4 Tau-a               0.360
                     Pairs                          108567131 c                       0.749


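Table 5b (above) shows that the model was concordant for 74.7% of all possible
(about 108.6 million) pairs of observations with different observed responses;
that is, in 74.7% of those pairs the model assigned the higher predicted
probability to the observation that actually had the event. A model that
predicts no better than chance would be concordant for about 50% of pairs, so
the logistic regression should have well over 50% concordant, and well under
50% discordant, to be considered accurate.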
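
The rank statistics in Table 5b follow from the three pair percentages. The Python sketch below uses the standard definitions of Somers' D, Gamma, and c; the function name is an assumption, and because the published percentages are rounded to one decimal, the results may differ from the table in the third decimal place.

```python
def association_statistics(pct_concordant, pct_discordant, pct_tied):
    """Return (Somers' D, Gamma, c) from pair percentages.

    Standard definitions in terms of concordant, discordant, and tied
    pairs, here expressed as fractions of all pairs.
    """
    nc = pct_concordant / 100.0
    nd = pct_discordant / 100.0
    nt = pct_tied / 100.0
    somers_d = nc - nd              # (concordant - discordant) / all pairs
    gamma = (nc - nd) / (nc + nd)   # ties excluded from the denominator
    c_stat = nc + 0.5 * nt          # ties count as half concordant
    return somers_d, gamma, c_stat


# Percent concordant, discordant, and tied from Table 5b.
somers_d, gamma, c_stat = association_statistics(74.7, 24.9, 0.4)
print(round(somers_d, 3), round(gamma, 3), round(c_stat, 3))
```

Tau-a is left out because it also needs the total number of observations, which Table 5b does not list.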


Note from the summary statistics in Table 6 (below) that there are significant
differences between those patients who are giving birth during their stay and
those who are not. The length of stay for L&D (Labor and Delivery, still comorbid
with depression) is about 2.5 days on average, compared to about 5.4 days for
non-pregnant depression-diagnosed patients; also, the standard deviation for
LOS (length of stay) is much lower for L&D patients. L&D patients are less
likely to have died while in the hospital’s care: only about 0.02% of delivering
mothers died, compared with about 2% for the remaining patients. Also, the
median age of L&D patients was 27, versus 57 for other patients.


     TABLE 6a: DESCRIPTIVE STATISTICS FOR PATIENTS NOT IN LABOR & DELIVERY
Variable     Label                               Mean     Std Dev       N   Median     t Value        Pr > |t|
LOS          Length of stay (cleaned)             5.388      4.657   18496      4.0     157.35        <.0001
DIED         Died during hospitalization          0.019      0.137   18497        0      18.97        <.0001
FEMALE       Indicator of sex                     0.650      0.477   18497      1.0     185.42        <.0001
TOTCHG       Total charges (cleaned)           26906.24   27731.88   18051 18910.00     130.35        <.0001
ELECTIVE     Elective/non-elective admission      0.330      0.470   18435        0      95.33        <.0001
AGE          Age in years at admission           58.215     18.291   18495     57.0     432.83        <.0001



    TABLE 6b: DESCRIPTIVE STATISTICS FOR PATIENTS IN LABOR & DELIVERY ONLY
Variable     Label                               Mean     Std Dev       N    Median    t Value    Pr > |t|
LOS          Length of stay (cleaned)             2.520      1.487   4076        2.0   108.24     <.0001
DIED         Died during hospitalization         0.0002      0.016   4076          0     1.00     0.3174
FEMALE       Indicator of sex                     1.000          0   4076        1.0        .          .
TOTCHG       Total charges (cleaned)            8483.26    5971.25   3997    6864.00    89.82     <.0001
ELECTIVE     Elective/non-elective admission      0.475      0.499   4067          0    60.63     <.0001
AGE          Age in years at admission           27.783      6.174   4076      27.00   287.29     <.0001



      The significant differences shown in Table 6 suggest a general disparity between
      L&D patients with post-partum depression and the rest of the depressed patients
      in the sample. The L&D patients, therefore, were treated separately in the linear
      regression that follows. That is, two regressions were performed—one for
      women in Labor & Delivery and another for non-L&D (containing both men and
      women).
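
The t values printed in Table 6 test each mean against zero rather than comparing the two groups with each other. A direct comparison of mean length of stay can be sketched from the summary statistics alone. The Python below is a minimal illustration; the use of Welch's unequal-variance statistic is an assumption made here, not a method stated in the paper.

```python
import math


def one_sample_t(mean, std_dev, n):
    """t statistic for H0: mean = 0, as reported in the t Value column."""
    return mean / (std_dev / math.sqrt(n))


def welch_t(mean_a, std_a, n_a, mean_b, std_b, n_b):
    """Welch two-sample t statistic (unequal variances) from summary stats."""
    standard_error = math.sqrt(std_a ** 2 / n_a + std_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# LOS rows (mean, std dev, N) of Table 6a (non-L&D) and Table 6b (L&D).
non_ld = (5.388, 4.657, 18496)
ld = (2.520, 1.487, 4076)

print(round(one_sample_t(*non_ld), 2))    # compare with the table's t Value
print(round(welch_t(*non_ld, *ld), 2))    # between-group comparison
```

The one-sample value can be checked against the LOS row of Table 6a; the Welch statistic is the quantity that speaks to the difference between the two subgroups.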

      The significance of each independent explanatory variable in the linear
      regression for the dependent variable, Length of Stay, varied between the
      two subgroups. Hence, only those variables significant in each specific
      regression were kept and reported here.

      Because several of the diagnoses considered here commonly lead to
      procedures that are also considered in the regressions, DX (diagnosis) codes
      and PR (procedure) codes were regressed separately to improve the results.




  Table 7 (below) represents patients who were not in Labor & Delivery. This set
 includes diagnoses among the parameters, but not procedures:

                     TABLE 7a: INFLUENCE OF DIAGNOSIS CODES
                         ON LENGTH OF STAY (NON-L&D)
                                  Analysis of Variance
                                                 Sum of        Mean
               Source                    DF     Squares       Square    F Value    Pr > F
               Model                     20       35221   1761.06404      89.02    <.0001
               Error                  18411      364222     19.78287
               Corrected Total        18431      399444


                            TABLE 7b: INFLUENCE OF DIAGNOSIS
                           CODES ON LENGTH OF STAY (NON-L&D)
                                      Regression Details
                         Root MSE                 4.44779 R-Square      0.0882
                         Dependent Mean           5.38873 Adj R-Sq      0.0872
                         Coeff Var               82.53888

 Note the mean of 5.39 days and the R-Square of 0.088.
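
The fit statistics reported with each regression can be recomputed from the Model and Error rows of the corresponding Analysis of Variance table. The following Python sketch shows the arithmetic using the Table 7a values; the function name and dictionary keys are illustrative assumptions.

```python
import math


def regression_summary(ss_model, ss_error, df_model, df_error):
    """Derive fit statistics from the sums of squares in an ANOVA table."""
    ss_total = ss_model + ss_error
    df_total = df_model + df_error
    ms_model = ss_model / df_model
    ms_error = ss_error / df_error
    return {
        "r_square": ss_model / ss_total,
        "adj_r_square": 1.0 - ms_error / (ss_total / df_total),
        "f_value": ms_model / ms_error,
        "root_mse": math.sqrt(ms_error),
    }


# Model and Error rows of Table 7a (sum of squares, degrees of freedom).
summary = regression_summary(35221, 364222, 20, 18411)
for key, value in summary.items():
    print(key, round(value, 4))
```

An R-Square near 0.09 means the diagnosis codes and demographics together account for roughly 9% of the variation in length of stay.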

       TABLE 7c: INFLUENCE OF DIAGNOSIS CODES ON LENGTH OF STAY (NON-L&D)
                                 Parameter Estimates
                                                                 Parameter   Standard
Variable     Label                                          DF    Estimate      Error t Value       Pr > |t|
Intercept    Intercept                                       1     3.57208       0.94800    3.77    0.0002
AGE          Age in years at admission                       1     0.01879       0.00211    8.90    <.0001
DIED         Died during hospitalization                     1     0.66662       0.24267    2.75    0.0060
ELECTIVE     Elective versus non-elective admission          1    -0.78513       0.07230   -10.86   <.0001
FEMALE       Indicator of sex                                1    -0.37903       0.07006    -5.41   <.0001
c311_Max     Mood disorders                                  1    -1.70090       0.16347   -10.40   <.0001
c4019_Max    Essential hypertension                          1    -0.15638       0.07068    -2.21   0.0270
c53081_Max   Esophageal disorders                            1    -0.37682       0.08701    -4.33   <.0001
c3051_Max    Screening and history of mental health          1    -0.52312       0.09800    -5.34   <.0001
             and substance abuse codes
c4280_Max    Congestive heart failure; nonhypertensive       1     0.83912       0.11273    7.44    <.0001
c41401_Max   Coronary atherosclerosis and other heart        1    -0.29621       0.10361    -2.86   0.0043
             disease
c2724_Max    Disorders of lipid metabolism                   1    -0.54065       0.10503    -5.15   <.0001
c2765_Max    Fluid and electrolyte disorders                 1     0.93315       0.13120    7.11    <.0001
c2859_Max    Deficiency and other anemia                   1     1.27633    0.11195   11.40   <.0001
c5990_Max    Urinary tract infections                      1     1.82946    0.12870   14.21   <.0001
c496_Max     Chronic obstructive pulmonary disease         1     0.67518    0.12338    5.47   <.0001
             and bronchiectasis
c42731_Max   Cardiac dysrhythmias                          1     1.04874    0.13245    7.92   <.0001
c2720_Max    Disorders of lipid metabolism                 1    -0.44402    0.11813   -3.76   0.0002
c2768_Max    Fluid and electrolyte disorders               1     1.00783    0.14273    7.06   <.0001
c30000_Max   Anxiety disorders                             1     0.31867    0.14551    2.19   0.0285
cother_Max   Other diagnosis(es) present but not listed    1     2.56860    0.92939    2.76   0.0057
             among the top 20 most prevalent


 The estimates are relative to a base stay of about 3.6 days (the intercept in
 Table 7c). The following remarks reflect a decrease or increase relative to this
 base stay.

 Table 7c shows that elective patients stayed about 0.8 days less than
 non-elective patients. Those diagnosed with mood disorders stayed about 1.7
 fewer days than others, and patients with a history of mental health/drug abuse
 stayed about half a day less than the rest.

 Note that the biggest influence that increased the length of stay was reflected in
 the “Other diagnoses” category, adding an average of 2.6 days. Of the top twenty
 diagnoses regressed (which also proved to be significant here), the largest
 increase in an inpatient’s length of stay was due to Urinary Tract Infections (1.8
 days), followed by Deficiency and Other Anemia (1.3 days).
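
To see how the parameter estimates combine, the fitted equation can be evaluated for a single patient. The Python sketch below uses only a subset of the Table 7c coefficients and a hypothetical patient, so the result is an illustration of the arithmetic rather than a prediction from the full model. Note that the intercept corresponds to a patient with every indicator at 0 and an age of 0.

```python
# Selected coefficients copied from Table 7c (non-L&D, diagnosis codes).
COEFFICIENTS = {
    "Intercept": 3.57208,
    "AGE": 0.01879,
    "ELECTIVE": -0.78513,
    "FEMALE": -0.37903,
    "c5990_Max": 1.82946,   # urinary tract infections
    "cother_Max": 2.56860,  # other diagnosis(es) present
}


def predicted_los(patient, coefficients=COEFFICIENTS):
    """Intercept plus coefficient * value for each supplied variable.

    Variables missing from `patient` are treated as 0 (not present).
    """
    total = coefficients["Intercept"]
    for name, value in patient.items():
        total += coefficients[name] * value
    return total


# Hypothetical patient: 60-year-old woman, non-elective admission,
# with a urinary tract infection and at least one "other" diagnosis.
patient = {"AGE": 60, "FEMALE": 1, "c5990_Max": 1, "cother_Max": 1}
print(round(predicted_los(patient), 2))
```

Each indicator shifts the estimate by its coefficient, while AGE contributes its coefficient once per year of age.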


 Need a little bit of transition here from logistic regression to linear regression, and
 mention that you are considering total charges.




 Table 8 (below) describes the influence of various factors on the total charges
 incurred by patients not in Labor & Delivery.

                         TABLE 8a: INFLUENCE OF DIAGNOSIS CODES
                              ON TOTAL CHARGES (NON-L&D)
                                     Analysis of Variance
                                                Sum of            Mean
             Source                   DF       Squares          Square      F Value   Pr > F
             Model                    15   4.806686E12     3.204457E11       641.00   <.0001
             Error                 17973   8.985041E12       499918850
             Corrected Total       17988   1.379173E13



                            TABLE 8b: INFLUENCE OF DIAGNOSIS
                           CODES ON TOTAL CHARGES (NON-L&D)
                                      Regression Details
                         Root MSE                 22359 R-Square          0.3485
                         Dependent Mean           26908 Adj R-Sq          0.3480
                         Coeff Var             83.09504

 Note the unusually high (for diagnosis codes) R-Square value of 0.3485.

       TABLE 8c: INFLUENCE OF DIAGNOSIS CODES ON TOTAL CHARGES (NON-L&D)
                                 Parameter Estimates
                                                               Parameter      Standard
Variable     Label                                    DF        Estimate         Error    t Value      Pr > |t|
Intercept    Intercept                                    1        11784      866.29073        13.60   <.0001
DIED         Died during hospitalization                  1    9807.96950 1227.44810            7.99   <.0001
ELECTIVE     Elective versus non-elective admission       1    6179.24261     366.68600        16.85   <.0001
LOS          Length of stay (cleaned)                     1    3373.95242      37.21934        90.65   <.0001
c311_Max     Mood disorders                               1 -7839.58058       825.89905        -9.49   <.0001
c4019_Max    Essential hypertension                       1    1116.20749     346.11399         3.22   0.0013
c4280_Max    Congestive heart failure;                    1    1996.24773     566.13199         3.53   0.0004
             nonhypertensive
c41401_Max   Coronary atherosclerosis and other           1    7300.41112     519.88569        14.04   <.0001
             heart disease
c2724_Max    Disorders of lipid metabolism                1    2377.85057     537.10491         4.43   <.0001
c2765_Max    Fluid and electrolyte disorders              1 -3417.52157       663.46701        -5.15   <.0001
c2859_Max    Deficiency and other anemia                  1    1732.85151     569.41293         3.04   0.0023
c5990_Max    Urinary tract infections                     1 -1969.51099       654.04824        -3.01   0.0026
c42731_Max   Cardiac dysrhythmias                         1    1364.58965     664.09011         2.05   0.0399
c2720_Max    Disorders of lipid metabolism      1   4167.13044   599.51176      6.95   <.0001
c2768_Max    Fluid and electrolyte disorders    1   1671.91687   725.82344      2.30   0.0213
c30000_Max   Anxiety disorders                  1 -1956.24617    732.44000     -2.67   0.0076



From this regression, the factors with the greatest positive impact on total
charges were death during hospitalization (about 9,800), diagnosis of coronary
atherosclerosis and other heart disease (about 7,300), elective admission (about
6,200), and diagnosis of disorders of lipid metabolism (about 4,200 for DX 2720);
those with the greatest negative impact were diagnosis of mood disorders (about
-7,800) and diagnosis of fluid and electrolyte disorders (about -3,400 for DX 2765).
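
The ordering of effects described above can be reproduced by sorting the parameter estimates from Table 8c. This is a small Python sketch; the function name and the choice to set LOS aside are assumptions made for the illustration.

```python
# Parameter estimates copied from Table 8c (intercept omitted).
ESTIMATES = {
    "DIED": 9807.96950,
    "ELECTIVE": 6179.24261,
    "LOS": 3373.95242,
    "c311_Max": -7839.58058,
    "c4019_Max": 1116.20749,
    "c4280_Max": 1996.24773,
    "c41401_Max": 7300.41112,
    "c2724_Max": 2377.85057,
    "c2765_Max": -3417.52157,
    "c2859_Max": 1732.85151,
    "c5990_Max": -1969.51099,
    "c42731_Max": 1364.58965,
    "c2720_Max": 4167.13044,
    "c2768_Max": 1671.91687,
    "c30000_Max": -1956.24617,
}


def rank_effects(estimates, exclude=("LOS",)):
    """Sort effects from most positive to most negative estimate.

    LOS is excluded by default because its estimate is per day of stay,
    while the other variables are 0/1 indicators.
    """
    kept = {k: v for k, v in estimates.items() if k not in exclude}
    return sorted(kept.items(), key=lambda item: item[1], reverse=True)


ranked = rank_effects(ESTIMATES)
print("largest increases:", [name for name, _ in ranked[:4]])
print("largest decreases:", [name for name, _ in ranked[-2:]])
```

Ranking by raw estimate ignores the standard errors, so it orders the size of each effect, not the strength of the evidence for it.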




  Table 9 (below) shows the regression results for non-L&D patients again, but this
  time considering procedures instead of diagnoses.

                    TABLE 9a: INFLUENCE OF PROCEDURE CODES ON
                              LENGTH OF STAY (NON-L&D)
                                  Analysis of Variance
                                               Sum of         Mean
               Source                   DF    Squares       Square     F Value   Pr > F
               Model                    15      58636   3909.05454      211.23   <.0001
               Error                 18416     340808     18.50608
               Corrected Total       18431     399444


                           TABLE 9b: INFLUENCE OF PROCEDURE
                           CODES ON LENGTH OF STAY (NON-L&D)
                                      Regression Details
                         Root MSE                4.30187 R-Square       0.1468
                         Dependent Mean          5.38873 Adj R-Sq       0.1461
                         Coeff Var              79.83091

  As expected, the dependent mean did not change, although the R-square
  improved from 0.088 in the previous length-of-stay regression to 0.147 here.

      TABLE 9c: INFLUENCE OF PROCEDURE CODES ON LENGTH OF STAY (NON-L&D)
                                Parameter Estimates
                                                                 Parameter Standard
Variable    Label                                           DF    Estimate    Error        t Value   Pr > |t|
Intercept   Intercept                                        1        1.65787    0.15337    10.81    <.0001
AGE         Age in years at admission                        1        0.03046    0.00178    17.08    <.0001
ELECTIVE    Elective versus non-elective admission           1    -1.02877       0.07194    -14.30   <.0001
FEMALE      Indicator of sex                                 1    -0.37669       0.06747     -5.58   <.0001
c9904_Max   Blood transfusion                                1        1.81794    0.10485    17.34    <.0001
c3893_Max   Other vascular catheterization; not heart        1        4.42029    0.12035    36.73    <.0001
c3722_Max   Diagnostic cardiac catheterization; coronary     1    -0.65005       0.14672     -4.43   <.0001
            arteriography
c4516_Max   Upper gastrointestinal endoscopy; biopsy         1        0.43036    0.16003      2.69   0.0072
c3995_Max   Hemodialysis                                     1        1.82856    0.17290    10.58    <.0001
c8872_Max   Diagnostic ultrasound of heart                   1        0.91470    0.17106      5.35   <.0001
            (echocardiogram)
c8154_Max   Arthroplasty knee                                1        0.61597    0.19811      3.11   0.0019
c4513_Max   Upper gastrointestinal endoscopy; biopsy         1        0.80027    0.18865      4.24   <.0001
c9604_Max       Respiratory intubation and mechanical     1     1.92933   0.19976      9.66   <.0001
                ventilation
c9462_Max       Alcohol and drug                          1     0.92978   0.20931      4.44   <.0001
                rehabilitation/detoxification
cprother_Max Other procedure(s) present but not listed    1     2.19375   0.10019    21.90    <.0001
             among the top 20 most prevalent


  Although the dataset of patients did not change, this second regression gives a
  slightly clearer picture of why the length of stay fluctuated. Hence, more weight
  could be given to the variables instead of the (lower) base stay value, now held
  to be approximately 1.66 days.

  The list of significant variables is shorter in this regression. As with the previous
  results, elective patients had shorter stays than non-elective patients by about
  one day. Coronary arteriography shortened stays by 0.65 days. In this
  regression, as in the last one, being female meant a shorter stay of about 0.38
  days. These regressions do not include Labor & Delivery patients.

  In this regression, unlike the previous one, a top-twenty procedure (“Other
  vascular catheterization; not heart”) proved to add the most time to an inpatient’s
  stay, at 4.42 days. Next was the “Other procedures” category with an additional
  2.19 days, followed by Respiratory intubation and mechanical ventilation, which
  added about 1.93 days onto the base stay.




  The final regression results for non-L&D patients, evaluating the monetary
  implications of various procedure codes, are given below in Table 10.

                      TABLE 10a: INFLUENCE OF PROCEDURE CODES
                            ON TOTAL CHARGES (NON-L&D)
                                    Analysis of Variance
                                                Sum of          Mean
             Source                   DF       Squares        Square    F Value   Pr > F
             Model                    13   5.433289E12   4.179453E11     898.80   <.0001
             Error                 17975   8.358438E12     465003519
             Corrected Total       17988   1.379173E13



                           TABLE 10b: INFLUENCE OF PROCEDURE
                           CODES ON TOTAL CHARGES (NON-L&D)
                                      Regression Details
                          Root MSE                  21564 R-Square     0.3940
                          Dependent Mean            26908 Adj R-Sq     0.3935
                          Coeff Var            80.14076



      TABLE 10c: INFLUENCE OF PROCEDURE CODES ON TOTAL CHARGES (NON-L&D)
                                 Parameter Estimates
                                                              Parameter      Standard
Variable      Label                                      DF    Estimate         Error   t Value     Pr > |t|
Intercept     Intercept                                   1 -5371.14727     476.45293      -11.27   <.0001
DIED          Died during hospitalization                 1   3740.16073 1217.61077         3.07    0.0021
ELECTIVE      Elective versus non-elective admission      1   5474.31572    364.80981      15.01    <.0001
LOS           Length of stay (cleaned)                    1   3173.00836     36.95528      85.86    <.0001
c9904_Max     Blood transfusion                           1   8350.81021    531.10690      15.72    <.0001
c3893_Max     Other vascular catheterization; not         1   5158.34361    624.12294       8.26    <.0001
              heart
c8856_Max     Diagnostic cardiac catheterization;         1       16550 1427.74196         11.59    <.0001
              coronary arteriography
c3722_Max     Diagnostic cardiac catheterization;         1   5840.87643 1474.64150         3.96    <.0001
              coronary arteriography
c3995_Max     Hemodialysis                                1   6447.86093    870.16108       7.41    <.0001
c8872_Max     Diagnostic ultrasound of heart              1   5826.69351    870.48470       6.69    <.0001
              (echocardiogram)
c8154_Max     Arthroplasty knee                           1       16932     987.29797      17.15    <.0001
c4513_Max      Upper gastrointestinal endoscopy;         1   1935.52034   947.51398      2.04   0.0411
               biopsy
c9604_Max      Respiratory intubation and mechanical     1       21636 1042.98729      20.74    <.0001
               ventilation
cprother_Max   Other procedure(s) present but not        1       11098    463.17860    23.96    <.0001
               listed among the top 20 most prevalent



 Those procedures that added the greatest amounts to the total charges were
 respiratory intubation and mechanical ventilation (about 21,600), arthroplasty
 knee (about 16,900), and diagnostic cardiac catheterization/coronary
 arteriography (about 16,550 for PR 8856). No procedure in Table 10c had a
 negative estimate; the smallest increases were associated with upper
 gastrointestinal endoscopy/biopsy (about 1,900) and other vascular
 catheterization, not heart (about 5,200). Each additional day of stay added
 about 3,200 to the total charges, so shorter stays were associated with lower
 charges.




Beginning with Table 11 (below), all subsequent regressions are specific to only
those patients who were admitted to the hospital for labor and delivery. The
results of linear regression evaluating the influence of diagnosis codes on the
length of stay are shown in Table 11.

                     TABLE 11a: INFLUENCE OF DIAGNOSIS CODES ON
                             LENGTH OF STAY (L&D ONLY)
                                   Analysis of Variance
                                              Sum of          Mean
              Source                DF       Squares        Square    F Value    Pr > F
              Model                  9     683.59507      75.95501      37.07    <.0001
              Error               4057    8313.48140       2.04917
              Corrected Total     4066    8997.07647


                           TABLE 11b: INFLUENCE OF DIAGNOSIS
                          CODES ON LENGTH OF STAY (L&D ONLY)
                                     Regression Details
                         Root MSE                1.43149 R-Square         0.0760
                         Dependent Mean          2.52029 Adj R-Sq         0.0739
                         Coeff Var              56.79881


The mean length of stay in this regression is considerably smaller, at 2.52
days, and the R-square is down to 0.076.
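
The summary figures in Table 11b can be recomputed from the sums of squares in Table 11a. The short Python sketch below does this; the variable names are illustrative only, and the numeric inputs are copied from the two tables.

```python
import math

# Inputs copied from Table 11a; the dependent mean comes from Table 11b.
ss_model, df_model = 683.59507, 9
ss_error, df_error = 8313.48140, 4057
ss_total, df_total = 8997.07647, 4066
dependent_mean = 2.52029

# Mean squares are sums of squares divided by their degrees of freedom.
ms_model = ss_model / df_model
ms_error = ss_error / df_error

f_value = ms_model / ms_error                         # F Value
r_square = ss_model / ss_total                        # R-Square
adj_r_square = 1 - ms_error / (ss_total / df_total)   # Adj R-Sq
root_mse = math.sqrt(ms_error)                        # Root MSE
coeff_var = 100 * root_mse / dependent_mean           # Coeff Var

print(f"F Value   {f_value:.2f}")
print(f"R-Square  {r_square:.4f}")
print(f"Adj R-Sq  {adj_r_square:.4f}")
print(f"Root MSE  {root_mse:.5f}")
print(f"Coeff Var {coeff_var:.3f}")
```

The same arithmetic applies to the other analysis-of-variance tables in this report; only the inputs change.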

    TABLE 11c: INFLUENCE OF DIAGNOSIS CODES ON LENGTH OF STAY (L&D ONLY)
                               Parameter Estimates
                                                                 Parameter    Standard
Variable     Label                                     DF         Estimate       Error      t Value   Pr > |t|
Intercept    Intercept                                     1        3.17886    0.12713       25.01    <.0001
ELECTIVE     Elective versus non-elective admission        1       -0.09990    0.04505        -2.22   0.0266
c4019_Max    Essential hypertension                        1        1.30340    0.21336         6.11   <.0001
c25000_Max   Diabetes mellitus without complication        1        0.70772    0.31426         2.25   0.0244
c3051_Max    Screening and history of mental health        1       -0.25242    0.09011        -2.80   0.0051
             and substance abuse codes
cV270_Max    Normal pregnancy and/or delivery              1       -0.73414    0.08758        -8.38   <.0001
c2765_Max    Fluid and electrolyte disorders               1        1.62918    0.41635         3.91   <.0001
c5990_Max    Urinary tract infections                      1        0.94446    0.28228         3.35   0.0008
c66311_Max   Umbilical cord complication                   1       -0.46687    0.05467        -8.54   <.0001
cother_Max   Other diagnosis(es) present but not           1        0.43987    0.08761         5.02   <.0001
             listed among the top 20 most prevalent




Here, the intercept of about 3.18 days is greater than the observed mean of 2.52
days.

Although "Elective" was again a significant variable, it shortened the length of
stay by only about 0.1 days, a practically negligible amount. Perhaps not
unexpectedly, "Normal pregnancy and/or delivery" shortened the length of stay by
about 0.73 days, bringing the expectation for normal deliveries back to around
the observed mean.

Fluid and electrolyte disorders added about 1.6 days to a patient’s stay, while
hypertension extended the length of stay by approximately 1.3 days. Urinary
tract infections were estimated to increase the stay by 0.9 days.
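
To make the reading of Table 11c concrete, the sketch below adds the intercept to the coefficients of whichever indicators are present. The function name and the two patient profiles are hypothetical; only the coefficients come from Table 11c.

```python
# Coefficients copied from Table 11c (units: days).
INTERCEPT = 3.17886
COEFFICIENTS = {
    "ELECTIVE": -0.09990,     # Elective versus non-elective admission
    "c4019_Max": 1.30340,     # Essential hypertension
    "c25000_Max": 0.70772,    # Diabetes mellitus without complication
    "c3051_Max": -0.25242,    # Screening/history of mental health codes
    "cV270_Max": -0.73414,    # Normal pregnancy and/or delivery
    "c2765_Max": 1.62918,     # Fluid and electrolyte disorders
    "c5990_Max": 0.94446,     # Urinary tract infections
    "c66311_Max": -0.46687,   # Umbilical cord complication
    "cother_Max": 0.43987,    # Other diagnosis(es) present
}


def predicted_length_of_stay(present):
    """Return the fitted stay for a patient with the given indicators set."""
    return INTERCEPT + sum(COEFFICIENTS[name] for name in present)


# Hypothetical profiles, for illustration only.
normal_delivery = predicted_length_of_stay(["cV270_Max"])
with_hypertension = predicted_length_of_stay(["cV270_Max", "c4019_Max"])

print(f"Normal delivery:            {normal_delivery:.2f} days")
print(f"Normal delivery + c4019:    {with_hypertension:.2f} days")
```

The first profile evaluates to roughly 2.44 days, which is the "back to around the observed mean" figure discussed above.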

Table 12 (below) describes the results of the regression evaluating total charges
and how they are influenced by various significant diagnosis codes.

              TABLE 12a: INFLUENCE OF DIAGNOSIS CODES ON TOTAL
                             CHARGES (L&D ONLY)
                               Analysis of Variance
                                            Sum of          Mean
            Source                DF       Squares        Square    F Value    Pr > F
            Model                 12    8845443744     737120312      21.95    <.0001
            Error               3975   1.334849E11      33581110
            Corrected Total     3987   1.423304E11



                        TABLE 12b: INFLUENCE OF DIAGNOSIS
                       CODES ON TOTAL CHARGES (L&D ONLY)
                                  Regression Details
                     Root MSE            5794.92105 R-Square      0.0621
                     Dependent Mean      8487.14594 Adj R-Sq      0.0593
                     Coeff Var             68.27880


The low R-square (0.0621) in this regression indicates that most of the variation
in total charges is due to factors other than those regressed.




    TABLE 12c: INFLUENCE OF DIAGNOSIS CODES ON TOTAL CHARGES (L&D ONLY)
                               Parameter Estimates
                                                             Parameter     Standard
Variable     Label                                    DF      Estimate        Error   t Value   Pr > |t|
Intercept    Intercept                                 1         11627    721.67893     16.11     <.0001
ELECTIVE     Elective versus non-elective admission    1    -932.47919    184.14838     -5.06     <.0001
c311_Max     Mood disorders                            1   -1740.37300    569.22771     -3.06     0.0022
c4019_Max    Essential hypertension                    1    5429.00097    903.85470      6.01     <.0001
c53081_Max   Esophageal disorders                      1   -2227.88752   1128.36540     -1.97     0.0484
c25000_Max   Diabetes mellitus without complication    1    3296.34768   1273.30916      2.59     0.0097
c3051_Max    Screening and history of mental health    1   -1652.01791    369.54445     -4.47     <.0001
             and drug abuse codes
cV270_Max    Normal pregnancy and/or delivery          1   -2467.74572    359.48901     -6.86     <.0001
c5990_Max    Urinary tract infections                  1    2802.48892   1141.89697      2.45     0.0142
c66311_Max   Umbilical cord complications              1   -2164.04465    559.94751     -3.86     0.0001
c2720_Max    Disorders of lipid metabolism             1         20689   5806.98104      3.56     0.0004
c2768_Max    Fluid and electrolyte disorders           1         11467   2058.08907      5.57     <.0001
cother_Max   Other diagnosis(es) present but not       1    1765.90586    356.07944      4.96     <.0001
             listed among the top 20 most prevalent


 With an intercept (base charge) of $11,627, the diagnoses that increased charges
 most were disorders of lipid metabolism, fluid and electrolyte disorders, and
 essential hypertension. The diagnoses linked with the largest reductions in
 charges were normal pregnancy and/or delivery, esophageal disorders, and
 umbilical cord complications.
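
 The size of an estimate in Table 12c says nothing about its precision. As a rough check, the sketch below recomputes the t value (estimate divided by standard error) and an approximate 95% confidence interval for four rows. The use of 1.96 as the critical value is an assumption that relies on the large error degrees of freedom (3975); it is not taken from the original output.

```python
# Estimates and standard errors copied from Table 12c (units: dollars).
ROWS = {
    "Disorders of lipid metabolism": (20689.0, 5806.98104),
    "Fluid and electrolyte disorders": (11467.0, 2058.08907),
    "Essential hypertension": (5429.00097, 903.85470),
    "Esophageal disorders": (-2227.88752, 1128.36540),
}

# Assumption: with 3975 error degrees of freedom the t distribution is
# close to the standard normal, so 1.96 approximates the 95% critical value.
CRITICAL_VALUE = 1.96


def summarize(estimate, std_error):
    """Return the t value and an approximate 95% confidence interval."""
    t_value = estimate / std_error
    half_width = CRITICAL_VALUE * std_error
    return t_value, (estimate - half_width, estimate + half_width)


for label, (estimate, std_error) in ROWS.items():
    t_value, (low, high) = summarize(estimate, std_error)
    print(f"{label:32s} t = {t_value:6.2f}   95% CI = ({low:9.2f}, {high:9.2f})")
```

 Under this approximation the interval for disorders of lipid metabolism is very wide, and the interval for esophageal disorders only just excludes zero, which is consistent with its reported Pr > |t| of 0.0484.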

 Table 13 (below) describes L&D patients, showing how procedures influenced
 their length of stay.

                     TABLE 13a: INFLUENCE OF PROCEDURE CODES ON
                              LENGTH OF STAY (L&D ONLY)
                                    Analysis of Variance
                                              Sum of          Mean
              Source                DF       Squares        Square    F Value    Pr > F
              Model                 11    1803.88849     163.98986      92.45    <.0001
              Error               4055    7193.18798       1.77391
              Corrected Total     4066    8997.07647




                            TABLE 13b: INFLUENCE OF PROCEDURE
                            CODES ON LENGTH OF STAY (L&D ONLY)
                                       Regression Details
                           Root MSE              1.33188 R-Square    0.2005
                           Dependent Mean        2.52029 Adj R-Sq    0.1983
                           Coeff Var            52.84642


 Again, the observed mean length of stay was 2.52 days, but here the R-square
 (0.2005) is much higher than in the other regressions for length of stay.

    TABLE 13c: INFLUENCE OF PROCEDURE CODES ON LENGTH OF STAY (L&D ONLY)
                               Parameter Estimates
                                                               Parameter   Standard
Variable       Label                                    DF      Estimate      Error     t Value   Pr > |t|
Intercept      Intercept                                   1     2.31126      0.05570    41.50    <.0001
ELECTIVE       Elective versus non-elective admission      1    -0.16775      0.04215     -3.98   <.0001
c9904_Max      Blood transfusion                           1     2.15398      0.23449      9.19   <.0001
c7359_Max      Other procedures to assist delivery         1    -0.17643      0.04976     -3.55   0.0004
c3893_Max      Other vascular catheterization; not         1     1.90135      0.42549      4.47   <.0001
               heart
c7569_Max      Repair of current obstetric laceration      1    -0.13983      0.04997     -2.80   0.0052
c7309_Max      Artificial rupture of membranes to          1    -0.13019      0.05031     -2.59   0.0097
               assist delivery
c741_Max       Other diagnostic procedures on              1     1.08308      0.06220    17.41    <.0001
               musculoskeletal system
c3995_Max      Hemodialysis                                1    15.60981      1.33286    11.71    <.0001
c734_Max       Other procedures to assist delivery         1     0.29078      0.05535      5.25   <.0001
c4513_Max      Upper gastrointestinal endoscopy;           1     6.68874      1.33304      5.02   <.0001
               biopsy
cprother_Max   Other procedure(s) present but not          1     0.25536      0.04317      5.92   <.0001
               listed among the top 20 most prevalent


 The base stay (intercept) is 2.31 days, which is lower than the 3.18 days in
 the diagnosis-code regression of Table 11c. This is probably due to the larger
 positive coefficients for the parameters listed here. Note the high estimates
 for both hemodialysis (about 15.6 days) and upper gastrointestinal endoscopy
 (about 6.7 days).
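
 One property of least squares may help explain why the intercept moves between regressions: when a model includes an intercept, the dependent mean equals the intercept plus each coefficient multiplied by the mean of its predictor. The sketch below checks this on a single indicator variable. The data are synthetic and chosen only for illustration; they are not drawn from the HCUP sample.

```python
from statistics import mean

# Synthetic lengths of stay (days), for illustration only; not HCUP data.
# x = 1 marks patients who received a hypothetical procedure.
x = [0, 0, 0, 0, 0, 0, 1, 1]
y = [2.0, 2.5, 2.0, 3.0, 2.5, 2.0, 6.0, 8.0]

x_bar, y_bar = mean(x), mean(y)

# Closed-form least-squares fit for one predictor plus an intercept.
slope = (
    sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    / sum((xi - x_bar) ** 2 for xi in x)
)
intercept = y_bar - slope * x_bar

print(f"dependent mean        = {y_bar:.4f}")
print(f"intercept             = {intercept:.4f}")
print(f"slope                 = {slope:.4f}")
print(f"intercept + slope*x̄   = {intercept + slope * x_bar:.4f}")
```

 Because the indicator here has a positive coefficient, the intercept falls below the dependent mean. Indicators with negative coefficients push the intercept the other way.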

 The final regression, with results given in Table 14 (below), analyzed the impact
 of procedure codes on total charges.




                       TABLE 14a: INFLUENCE OF PROCEDURE CODES ON
                                TOTAL CHARGES (L&D ONLY)
                                      Analysis of Variance
                                               Sum of          Mean
               Source                DF       Squares        Square    F Value    Pr > F
               Model                 10   53079379488    5307937949     236.52    <.0001
               Error               3977   89250976477      22441784
               Corrected Total     3987   1.423304E11



                              TABLE 14b: INFLUENCE OF PROCEDURE
                              CODES ON TOTAL CHARGES (L&D ONLY)
                                         Regression Details
                            Root MSE             4737.27605 R-Square         0.3729
                            Dependent Mean       8487.14594 Adj R-Sq         0.3714
                            Coeff Var              55.81707



      TABLE 14c: INFLUENCE OF PROCEDURE CODES ON TOTAL CHARGES (L&D ONLY)
                                 Parameter Estimates
                                                              Parameter     Standard
Variable      Label                                    DF      Estimate        Error   t Value   Pr > |t|
Intercept     Intercept                                 1    2271.56740    374.64696      6.06     <.0001
AGE           Age in years at admission                 1      44.81763     12.19917      3.67     0.0002
ELECTIVE      Elective versus non-elective admission    1    -834.00972    151.28793     -5.51     <.0001
LOS           Length of stay (cleaned)                  1    1753.77763     56.13722     31.24     <.0001
c9904_Max     Blood transfusion                         1    5404.64695    842.94011      6.41     <.0001
c3893_Max     Other vascular catheterization; not       1    8587.75579   1513.88569      5.67     <.0001
              heart
c741_Max      Other diagnostic procedures on            1    2652.45746    193.65595     13.70     <.0001
              musculoskeletal system
c3995_Max     Hemodialysis                              1         50319   4820.30098     10.44     <.0001
c734_Max      Other procedures to assist delivery       1     489.20354    198.79901      2.46     0.0139
c4513_Max     Upper gastrointestinal endoscopy;         1         19339   4756.80897      4.07     <.0001
              biopsy
cprother_Max  Other procedure(s) present, but not       1     405.43768    153.42088      2.64     0.0083
              listed among the top 20 most prevalent


 As can be observed in Table 14c, the procedure codes that added the most to the
 total charges were hemodialysis, upper gastrointestinal endoscopy/biopsy, and
 other vascular catheterization (not heart). Elective admission and lower age
 were also associated with lower charges.
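
 Unlike the earlier tables, Table 14c mixes continuous predictors (AGE and LOS) with indicator variables, so each additional year of age or day of stay contributes its coefficient once per unit. The sketch below shows this; the function name and the example patient (age 28, two-day stay) are hypothetical, and only the coefficients are taken from Table 14c.

```python
# Coefficients copied from Table 14c (units: dollars).
INTERCEPT = 2271.56740
PER_YEAR_OF_AGE = 44.81763
PER_DAY_OF_STAY = 1753.77763
INDICATORS = {
    "ELECTIVE": -834.00972,      # Elective versus non-elective admission
    "c9904_Max": 5404.64695,     # Blood transfusion
    "c3893_Max": 8587.75579,     # Other vascular catheterization; not heart
    "c741_Max": 2652.45746,      # Other diagnostic procedures, musculoskeletal
    "c3995_Max": 50319.0,        # Hemodialysis
    "c734_Max": 489.20354,       # Other procedures to assist delivery
    "c4513_Max": 19339.0,        # Upper gastrointestinal endoscopy; biopsy
    "cprother_Max": 405.43768,   # Other procedure(s) present
}


def predicted_total_charges(age, length_of_stay, present=()):
    """Return fitted charges for the given age, stay, and indicators."""
    return (
        INTERCEPT
        + PER_YEAR_OF_AGE * age
        + PER_DAY_OF_STAY * length_of_stay
        + sum(INDICATORS[name] for name in present)
    )


# Hypothetical patient, for illustration only.
two_days = predicted_total_charges(age=28, length_of_stay=2)
three_days = predicted_total_charges(age=28, length_of_stay=3)

print(f"Two-day stay:    ${two_days:,.2f}")
print(f"Three-day stay:  ${three_days:,.2f}")
print(f"Extra day adds:  ${three_days - two_days:,.2f}")
```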




Discussion

In each of the above regressions, note that although all of the retained
parameters were statistically significant (P<.05), most of the R-square values
were low. The R-square is the proportion of the total variation that is
explained by the regression equation; higher values indicate a better-fitting,
and hence more useful, equation. The low R-square values in many of these tables
indicate that, although the coefficient estimates given here are reliable, the
resulting equations still leave much unexplained about the inpatient’s length of
stay or total charges. During the refinement of the regressions, many parameters
were found not to be statistically significant (P>.05) and were removed. There
was a tradeoff, however, in that the R-square decreased after removing many of
these parameters. It was in the best interests of this analysis to sacrifice
scope in order to retain reliability.

In both L&D patients and non-L&D patients, the procedures performed explained
more of the variation than did the diagnoses. Because the necessary treatment
for each condition often differs from patient to patient, it follows that the
length of stay would be more closely correlated with the procedures performed
than with the diagnosis. The exception is Table 8, which shows the strong correlation
between Length of Stay and Total Charges. Although Length of Stay was
considered in all Total-Charge regressions, it was not retained at the 95%
confidence level in all of them; Table 8 and Table 14c are among those in which
it was a significant predictor of Total Charges.

Diagnostic procedures such as catheterization consistently added time to a
patient’s stay, which is understandable given that these procedures indicate that
doctors are still performing the time-consuming process of diagnosis.
Conversely, alcohol and drug rehabilitation/detoxification was estimated to
shorten an inpatient’s length of stay, which may indicate that doctors are
transferring rehab/detox patients to specialized clinics or sending them home in
less time than it takes to treat other patients.

Elective patients stayed for shorter periods and experienced lower total charges
than those admitted non-electively, which is intuitive given that non-elective
hospitalization is often related to more serious conditions, especially
when complicated by depression. Respiratory intubation and mechanical
ventilation are seldom used in stable patients; this agrees with the estimate that
these procedures correlate with longer stays and higher charges.

After filtering the shorter-staying labor & delivery patients from the rest, the
gender imbalance in length of stay was expected to equalize, but did not. It
remained the case that the women in this sample exhibited shorter stays (roughly
0.5 days) in the hospital than did the men. There are, however, twice as many
women in the sample as men, although this is largely influenced by the
prevalence of post-partum depression in L&D patients. The non-L&D gender
bias is difficult to explain without exhaustive consideration of the specific impact
of gender-specific conditions such as prostate cancer in men or osteoporosis in
women, the prevalence of each condition, and the weighted average lengths of
stay and expenses associated with each.

Many other factors, not considered or not able to be considered here, could
further explain this remaining variability, such as geographic location and relative
community affluence of the hospital, the specific doctors, hospitals, or private
insurers, or varying severity of illness. It is possible, and perhaps even likely,
that some factor or factors that extend or shorten a patient’s length of stay and
inflate or deflate the total expenses incurred continue to elude modern medical
experts and professionals dedicated to the study of these costs. Some factors
are easier to measure than others, adding further variability to the population.

Even with the most exhaustive list of parameters and an ideal (error-free) system
of measurement and reporting, it remains very unlikely that any researcher will
be able to improve length of stay predictions beyond a certain level of
uncertainty. In a large population, there are inherently too many degrees of
freedom.

In its search for greater understanding about the use of medical resources in the
United States, the research community should continue to seek funding of
projects such as the Healthcare Cost & Utilization Project, with the goal of
obtaining an even broader, more inclusive, objective, standardized database.




References:
1. HCUP Nationwide Inpatient Sample (NIS). Healthcare Cost and Utilization
      Project (HCUP). 2000-01. Agency for Healthcare Research and Quality,
      Rockville, MD. <www.hcup-us.ahrq.gov/nisoverview.jsp>.
2. United States. National Institute of Mental Health. NIMH – The Numbers
      Count: Mental Disorders in America. 17 Dec. 2007
      <http://www.nimh.nih.gov/health/publications/the-numbers-count-mental-disorders-in-america.shtml#MajorDepressive>.
3. Zung WW, Broadhead WE, Roth ME. "Prevalence of Depressive Symptoms in
      Primary Care." Journal of Family Practice, 1993; Vol. 37, pp. 337-344.
4. "Depression: Depression in the Elderly." 2007. WebMD, Inc.
      16 Dec. 2007 <http://www.webmd.com/depression/depression-elderly>.
5. Bressi, Sara K., Steven C. Marcus, and Phyllis L. Solomon. "The Impact of
      Psychiatric Comorbidity on General Hospital Length of Stay." Psychiatric
      Quarterly, 2006; Vol. 77, pp. 203-209.
6. "Depression: Depression in Women." 2007. WebMD, Inc.
      <http://www.webmd.com/depression/depression-women>.
7. United States. Centers for Disease Control. National Center for Health
      Statistics. National Health Interview Survey. 28 Mar. 2007. 16 Jan. 2008
      <http://www.cdc.gov/nchs/data/nhis/earlyrelease/earlyrelease200703.pdf>.



