INTRODUCTION TO BIOSTATISTICS

 Dr. Zafar Mahmood

 zafarjan90@yahoo.com
 0346-9079308
This session covers:
 § Background and need to know
   Biostatistics
 § Origin and development of Biostatistics
 § Definition of Statistics and Biostatistics
 § Types of data
 § Graphical representation of data
 § Frequency distribution of data
§ “Statistics is the science which deals
 with collection, classification and
 tabulation of numerical facts as the
 basis for explanation, description
 and comparison of phenomena”.

           ------ Lovitt
“BIOSTATISTICS”
§ (1) Statistics arising out of biological
  sciences, particularly from the fields of
  Medicine and public health.
§ (2) The methods used in dealing with
  statistics in the fields of medicine, biology
  and public health for planning,
  conducting and analyzing data which
  arise in investigations of these branches.
Main Branches of
Biostatistics
 § Descriptive Biostatistics:
 § Methods of producing quantitative
   summaries of information in biological
   sciences
 § Tabulation and graphical presentations
   § Measures of central tendency
   § Measures of dispersion
Branches of Biostatistics
……
 § Inferential Biostatistics:
 § Methods of making generalizations
   about a larger group based on
   information about a subset (sample)
   of that group in biological sciences
 § Estimation
 § Testing of hypothesis
Populations and Samples

 § Before we can determine what
   statistical tools and techniques to use,
   we need to know if our information
   represents a population or a sample

 § A sample is a subset which should be
   representative of a population
Samples

 § A sample should be representative if
   selected randomly (i.e., each data
   point should have the same chance
   for selection as every other point)

 § In some cases, the sample may be
   stratified but then randomized within
   the strata
Example

 We want a sample that will reflect a
  population’s gender and age:
 1. Stratify the data by gender

 2. Within each stratum, further stratify by age

 3. Select randomly within each gender/age
   stratum so that the number selected will be
   proportional to that of the population
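
The three steps above can be sketched in Python. This is a minimal illustration, not a prescribed method: the population of 1,000 people, the two age groups, the function name `stratified_sample` and the sample size of 100 are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_sample(population, strata_key, sample_size, seed=0):
    """Draw a random sample within each stratum, proportional to its size."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in population:
        strata[strata_key(person)].append(person)

    sample = []
    for members in strata.values():
        # Allocate in proportion to the stratum's share of the population.
        k = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: 600 women and 400 men, half of each under 40.
population = [
    {"gender": gender, "age_group": age_group}
    for gender, n in (("F", 600), ("M", 400))
    for age_group in ("<40", "40+")
    for _ in range(n // 2)
]

sample = stratified_sample(
    population, lambda p: (p["gender"], p["age_group"]), sample_size=100
)
women = sum(1 for p in sample if p["gender"] == "F")
print(len(sample), women)
```

With these assumed numbers each gender/age stratum contributes in proportion to its size, so the sample keeps the population's 60/40 gender split. In general, rounding the allocation can make the total differ slightly from the requested sample size.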
Population
 § The totality of all the observations, whether
   finite or infinite, in any field of interest is
   called a population
 § Example
 § Total number of patients in HMC
Parameter and Statistic

 § Parameter: Summary value or
   characteristic of population or universe

 § Statistic: Summary value or characteristic
   of sample used for making inferences
   about parameter
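
As a rough illustration of the distinction, the sketch below builds a made-up population and compares its mean (the parameter) with the mean of a random sample (the statistic). The 10,000 simulated Hb values, their mean and spread, and the sample size of 30 are assumptions chosen only for the example.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population of 10,000 Hb values (g/dl); not real patient data.
population = [rng.gauss(12.5, 1.5) for _ in range(10_000)]
parameter = statistics.mean(population)  # summary value of the population

sample = rng.sample(population, 30)
statistic = statistics.mean(sample)      # summary value of the sample

print(f"parameter (population mean): {parameter:.2f}")
print(f"statistic (sample mean):     {statistic:.2f}")
```

The statistic is used to make inferences about the parameter; it will usually be close to it but not identical, because of sampling fluctuation.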
Origin and development of
statistics in Medical Research
§ In 1929 a huge paper on application of
  statistics was published in Physiology
  Journal by Dunn.
§ In 1937, 15 articles on statistical methods
  by Austin Bradford Hill, were published in
  book form.
§ In 1948, an RCT of streptomycin for
  pulmonary TB was published, in which
  Bradford Hill had a key influence.
§ The use of statistics in medicine then
  grew 8-fold between 1952 and 1982.
Pictured: C.R. Rao, Douglas Altman, Ronald Fisher, Karl Pearson, Gauss

Basis of
 Biostatistics
Sources of Medical
Uncertainties
 1. Intrinsic due to biological,
    environmental and sampling factors
 2. Natural variation among methods,
    observers, instruments etc.
 3. Errors in measurement or assessment
    or errors in knowledge
 4. Incomplete knowledge
Intrinsic variation as a
source of medical
uncertainties
 § Biological due to age, gender, heredity, parity, height,
   weight, etc. Also due to variation in anatomical,
   physiological and biochemical parameters
 § Environmental due to nutrition, smoking, pollution,
   facilities of water and sanitation, road traffic, legislation,
   stress and strains etc.,
 § Sampling fluctuations because the entire world cannot
   be studied and at least future cases can never be
   included
 § Chance variation due to factors that are unknown
   or too complex to comprehend
Natural variation despite
best care as a source of
uncertainties
 §   In assessment of any medical parameter
 §   Due to partial compliance by the patients
 §   Due to incomplete information in
     conditions such as the patient in coma
Medical Errors that cause
Uncertainties
 § Carelessness of the providers such as physicians,
   surgeons, nursing staff, radiographers and pharmacists.
 § Errors in methods such as in using incorrect quantity or
   quality of chemicals and reagents, misinterpretation of
   ECG, using inappropriate diagnostic tools,
   misrecording of information etc.
 § Instrument error due to use of non-standardized or
   faulty instrument and improper use of a right instrument.
 § Not collecting full information
 § Inconsistent response by the patients or other subjects
   under evaluation
Incomplete knowledge as a
source of Uncertainties
 § Diagnostic, therapeutic and prognostic
   uncertainties due to lack of knowledge
 § Predictive uncertainties such as in
   survival duration of a patient of cancer
 § Other uncertainties such as how to
   measure positive health
Biostatistics is the
science that helps in
managing medical
uncertainties
Reasons to know about
biostatistics:
 § Medicine is becoming increasingly
   quantitative.
 § The planning, conduct and interpretation
   of much of medical research are
   becoming increasingly reliant on the
   statistical methodology.
 § Statistics pervade the medical
   literature.
CLINICAL MEDICINE

 § Documentation of medical history of
   diseases.
 § Planning and conduct of clinical studies.
 § Evaluating the merits of different
   procedures.
 § In providing methods for definition of
   “normal” and “abnormal”.
  Role of Biostatistics in
  patient care
§ In increasing awareness regarding diagnostic,
  therapeutic and prognostic uncertainties and
  providing rules of probability to delineate those
  uncertainties
§ In providing methods to integrate chances with value
  judgments that could be most beneficial to patient
§ In providing methods such as sensitivity-specificity
  and predictivities that help choose valid tests for
  patient assessment
§ In providing tools such as scoring system and expert
  system that can help reduce epistemic uncertainties
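
The sensitivity-specificity and predictivity measures mentioned above can be computed from a 2x2 table of test result against true disease status. The sketch below uses the standard definitions; the function name and the counts are hypothetical.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Validity measures of a diagnostic test from 2x2 table counts.

    tp/fn: diseased subjects testing positive/negative
    fp/tn: non-diseased subjects testing positive/negative
    """
    return {
        "sensitivity": tp / (tp + fn),  # positives among the diseased
        "specificity": tn / (tn + fp),  # negatives among the non-diseased
        "ppv": tp / (tp + fp),          # diseased among test positives
        "npv": tn / (tn + fn),          # non-diseased among test negatives
    }


# Hypothetical counts: 100 diseased and 200 non-diseased subjects.
measures = diagnostic_measures(tp=90, fp=30, fn=10, tn=170)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

Note that the predictive values depend on how common the disease is in the group tested, whereas sensitivity and specificity do not.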
PREVENTIVE MEDICINE

§ To provide the magnitude of any health
  problem in the community.
§ To find out the basic factors underlying
  the ill-health.
§ To evaluate the health programs which
  were introduced in the community
  (success/failure).
§ To introduce and promote health
  legislation.
Role of Biostatistics in Health
Planning and Evaluation
 § In carrying out a valid and reliable health
   situation analysis, including in proper
   summarization and interpretation of data.

 § In proper evaluation of the achievements
   and failures of a health programme
Role of Biostatistics in
Medical Research
 § In developing a research design that can
   minimize the impact of uncertainties
 § In assessing reliability and validity of
   tools and instruments to collect the
   information
 § In proper analysis of data
Example: Evaluation of Penicillin (treatment
A) vs Penicillin & Chloramphenicol
(treatment B) for treating bacterial
pneumonia in children < 2 yrs.
§ What is the sample size needed to demonstrate the
  significance of one group against other ?
§ Is treatment A better than treatment B or vice versa ?
§ If so, how much better ?
§ What is the normal variation in clinical measurement ? (mild,
  moderate & severe) ?
§ How reliable and valid is the measurement ? (clinical &
  radiological) ?
§ What is the magnitude and effect of laboratory and technical
  error ?
§ How does one interpret abnormal values ?
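
For the first question, one commonly used approximation for comparing two proportions gives the number of subjects per group as n = (z_alpha/2 + z_beta)^2 [p1(1 - p1) + p2(1 - p2)] / (p1 - p2)^2. The sketch below applies it; the cure rates of 70% and 85%, the 5% significance level and the 80% power are assumptions for illustration, not figures from the study.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to compare two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)


# Hypothetical cure rates: 70% on treatment A, 85% on treatment B.
n_per_group = sample_size_two_proportions(0.70, 0.85)
print(n_per_group)
```

Other formulas (for example with a continuity correction) give somewhat larger numbers, so this should be read as a rough planning figure only.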
WHAT DOES BIOSTATISTICS COVER?
        Planning
        Design
        Execution (Data collection)
        Data Processing
        Data analysis
        Presentation
        Interpretation
        Publication
 BASIC CONCEPTS
  Data : Set of values of one or more variables recorded 
  on one or more observational units

    Sources of data    1. Routinely kept records
                        2. Surveys (census)
                        3. Experiments
                        4. External source
Categories of data
  1. Primary data: observation, questionnaire, record form,
      interviews, survey
  2. Secondary data: census, medical record, registry, etc.
TYPES OF DATA or VARIABLE

 § QUALITATIVE DATA
 § DISCRETE QUANTITATIVE
 § CONTINUOUS QUANTITATIVE
Qualitative Data or Variable

 § a variable or characteristic which cannot
   be measured in quantitative form but can
   only be identified by name or categories,
   for instance place of birth, ethnic group,
   type of drug, stages of breast cancer (I,
   II, III, or IV), degree of pain (minimal,
   moderate, severe or unbearable).
Quantitative Data or Variable
 § A quantitative variable is one that can be
   measured and expressed numerically and they
   can be of two types (discrete or continuous).
 § The values of a discrete variable are usually
   whole numbers, such as the number of
   episodes of diarrhea in the first five years of life.
 § A continuous variable is a measurement on a
   continuous scale. Examples include weight,
   height, blood pressure, age, etc.
TYPES OF DATA or VARIABLE …

 § Although the types of variables could be
   broadly divided into categorical
   (qualitative) and quantitative , it has been
   a common practice to see four basic
   types of data (scales of measurement).
 § Nominal, Ordinal, Interval & Ratio data
Qualitative Nominal data

 § Data that represent categories or names. There is no
   implied order to the categories of nominal data. In
   these types of data, individuals are simply placed in the
   proper category or group, and the number in each
   category is counted. Each item must fit into exactly one
   category.
 § The simplest data consist of unordered, dichotomous,
   or "either-or" types of observations, i.e., either the
   patient lives or the patient dies, either he has some
   particular attribute or he does not.
Nominal scale data Example

 § Survival status of propranolol-treated and
   control patients with myocardial infarction

       Status 28 days after    Propranolol-treated   Control
       hospital admission      patients              patients

       Dead                     7                    17
       Alive                   38                    29
       Total                   45                    46
       Survival rate           84%                   63%
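
The survival rates in the table follow directly from the counts: alive divided by total in each group. A short check in Python (the function name is only illustrative):

```python
def survival_rate(alive, dead):
    """Proportion of patients alive among all patients in the group."""
    return alive / (alive + dead)


treated = survival_rate(alive=38, dead=7)   # 38 of 45 patients
control = survival_rate(alive=29, dead=17)  # 29 of 46 patients
print(f"treated: {treated:.0%}, control: {control:.0%}")
```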
Some other examples of nominal data


 Examples: Sex (M, F)
           Exam result (P, F)
           Blood group (A, B, O or AB)
           Color of eyes (blue, green, brown, black)
           Anemias (microcytic, macrocytic)
           Religion (Christianity, Islam, Hinduism, etc.)
Qualitative Ordinal data

 § The ordinal scale data have order among
   the response classifications (categories).
   The spaces or intervals between the
   categories are not necessarily equal.
 § It is similar to nominal because the
   measurement involves categories;
   however, the categories are ordered by
   rank.
Ordinal Scale data Examples


 § Pain level (Mild, Moderate, Severe)

 §   Tumors (Stage 0, ……, IV)
 §   Arthritis (Class 1, ……, 4 )
 §   Military Rank (Lt., Capt., Maj., Col.,
     General)
Some other examples of ordinal data


        Response to treatment
              (poor, fair, good)
        Severity of disease
               (mild, moderate, severe)
       Income status
              (low, middle, high)
QUANTITATIVE (DISCRETE)

 Example: The no. of family members
         The no. of heart beats
         The no. of admissions in a day

QUANTITATIVE (CONTINUOUS)

 Example: Height, Weight, Age, BP,
 Serum Cholesterol and BMI
Discrete data -- gaps between possible values (e.g., number of children)
Continuous data -- theoretically, no gaps between possible values (e.g., Hb)
 CONTINUOUS DATA grouped into QUALITATIVE DATA

Wt. (in kg): underweight, normal and overweight
Ht. (in cm): short, medium and tall
Table 1 Distribution of blunt injured patients
according to hospital length of stay
Scale of measurement
 Qualitative variable:
 A categorical variable

 Nominal (classificatory) scale
       - gender, marital status, race

 Ordinal (ranking) scale
        - severity scale, good/better/best
Quantitative Variable:
 Quantitative variable:
      A numerical variable: discrete; continuous
 Numerical discrete data occur when the observations are
 integers that correspond with a count of some sort. Some
 common examples are: the number of bacteria colonies on
 a plate, the number of cells within a prescribed area upon
 microscopic examination, the number of heart beats within
 a specified time interval, a mother’s history of number of
 births (parity) and pregnancies (gravidity), the number of
 episodes of illness a patient experiences during some time
 period, etc.
Quantitative Variable…..
 Numerical continuous

 The scale with the greatest degree of quantification is a
 numerical continuous scale. Each observation theoretically
 falls somewhere along a continuum (range). One is not
 restricted, in principle, to particular values such as the
 integers of the discrete scale. The restricting factor is the
 degree of accuracy of the measuring instrument. Most
 clinical measurements, such as blood pressure, serum
 cholesterol level, height, weight, age etc. are on a numerical
 continuous scale.
      Quantitative Interval Scale of
      measurement

       Quantitative variable:
       A numerical variable: discrete; continuous

       Interval scale:
       Data are placed in meaningful intervals and order. The
       units of measurement are arbitrary. There is no true zero.

 -     Temperature: equal differences are equal (37º C - 36º C
       equals 38º C - 37º C), but there is no implication of
       ratio (30º C is not twice as hot as 15º C)
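
A quick way to see why ratios are not meaningful on an interval scale is to re-express the same temperatures on the Kelvin scale, which has a true zero. This is a small illustrative sketch; the helper name is made up.

```python
def celsius_to_kelvin(celsius):
    """Convert Celsius (arbitrary zero) to Kelvin (true zero)."""
    return celsius + 273.15


# Differences are meaningful on the interval scale.
equal_steps = (37 - 36) == (38 - 37)

# The apparent ratio in Celsius is an artifact of the arbitrary zero.
ratio_celsius = 30 / 15
ratio_kelvin = celsius_to_kelvin(30) / celsius_to_kelvin(15)

print(equal_steps)
print(ratio_celsius, round(ratio_kelvin, 3))
```

On the Kelvin scale the two temperatures differ by only about 5%, so "twice as hot" is not a valid statement about 30º C and 15º C.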
 Quantitative Ratio Scale of
 measurement


   Data is presented in frequency distribution
   in logical order. A meaningful ratio exists.
   There is a true zero

- Age, weight, height, pulse rate
- pulse rate of 120 is twice as fast as 60
- person with weight of 80kg is twice as heavy
as the one with weight of 40 kg.
Scales of Measure

 §   Nominal – qualitative classification of
     equal value: gender, race, color, city
 §   Ordinal - qualitative classification
     which can be rank ordered:
     socioeconomic status of families
 §   Interval - Numerical or quantitative
     data: can be rank ordered and sizes
     compared : temperature
 §   Ratio - Quantitative interval data along
     with ratio: time, age.
CLINIMETRICS
 Clinimetrics is a science in which
   qualities are converted to meaningful
   quantities by using a scoring system.

 Examples: (1) Apgar score based on
   appearance, pulse, grimace, activity and
   respiration is used for neonatal prognosis.
 (2) Smoking Index: no. of cigarettes, duration,
   filter or not, whether pipe, cigar, etc.
 (3) APACHE (Acute Physiology and Chronic
   Health Evaluation) score: to quantify the
   severity of condition of a patient
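
As a sketch of how a scoring system turns qualities into a quantity, the five-component neonatal score in example (1), commonly known as the Apgar score, rates each component 0, 1 or 2 and adds the ratings to give a total from 0 to 10. The function name and the ratings below are hypothetical.

```python
APGAR_COMPONENTS = ("appearance", "pulse", "grimace", "activity", "respiration")


def apgar_score(ratings):
    """Sum the five component ratings (each 0, 1 or 2) into a 0-10 score."""
    for name in APGAR_COMPONENTS:
        if ratings[name] not in (0, 1, 2):
            raise ValueError(f"{name} must be rated 0, 1 or 2")
    return sum(ratings[name] for name in APGAR_COMPONENTS)


# Hypothetical ratings for one newborn.
score = apgar_score(
    {"appearance": 1, "pulse": 2, "grimace": 2, "activity": 1, "respiration": 2}
)
print(score)
```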
INVESTIGATION
  Data Collection
    Data Presentation
      Tabulation
      Diagrams
      Graphs
    Descriptive Statistics
      Measures of Location
      Measures of Dispersion
      Measures of Skewness & Kurtosis
    Inferential Statistics
      Estimation
        Point estimate
        Interval estimate
      Hypothesis Testing
      Univariate analysis
      Multivariate analysis
Methods Of Data Collection, Organization
And Presentation
Learning Objectives

1. Identify the different methods of data organization
    and presentation

2. Understand the criteria for the selection of a method to organize
    and present data

3. Identify the different methods of data collection and the criteria that
    we use to select a method of data collection

4. Define a questionnaire, identify the different parts of a
questionnaire and indicate the procedures to prepare a
questionnaire
        Data Collection Methods
Various data collection techniques can be used such
as:

•   Observation
•   Face-to-face and self-administered interviews
•   Postal or mail method and telephone interviews
•   Using available information
•   Focus group discussions (FGD)

• Other data collection techniques – Rapid appraisal
techniques, 3L technique, Nominal group techniques, Delphi
techniques, life histories, case studies, etc.
                     Observation
Observation is a technique that involves systematically selecting,
watching and recording behaviors of people or other phenomena
and aspects of the setting in which they occur, for the purpose
of gaining specified information. It includes all methods
from simple visual observations to the use of high level
machines and measurements and sophisticated equipment or
facilities, such as radiographic, biochemical and X-ray machines,
microscopes, clinical examinations, and microbiological
examinations.
      Interviews and self-administered
      questionnaire
Interviews and self-administered questionnaires are
probably the most commonly used research data
collection techniques. Therefore, designing good
“questioning tools” forms an important and time
consuming phase in the development of most research
proposals.
Once the decision has been made to use these
techniques, the following questions should be
considered before designing our tools:
     Interviews and self-administered
     questionnaire…..
1.   What exactly do we want to know, according to the
     objectives and variables we identified earlier? Is
     questioning the right technique to obtain all answers, or
     do we need additional techniques, such as observations
     or analysis of records?

2.   Of whom will we ask questions and what techniques will
     we use? Do we understand the topic sufficiently to
     design a questionnaire, or do we need some loosely
     structured interviews with key informants or a focus
     group discussion first to orient ourselves?
Interviews and self-administered
questionnaire…..
3. Are our informants mainly literate or illiterate? If
      illiterate, the use of self-administered questionnaires is
      not an option.

4. How large is the sample that will be interviewed?
      Studies with many respondents often use shorter,
      highly structured questionnaires, whereas smaller
      studies allow more flexibility and may use
      questionnaires with a number of open-ended
   questions.
  Face-to-face and telephone
  interviews
Face-to-face and telephone interviews have many
advantages
A good interviewer can stimulate and maintain the
respondent’s interest, and can create a rapport
(understanding, concord) and atmosphere conducive to the
answering of questions.
 If anxiety is aroused, the interviewer can allay it. If a question
is not understood an interviewer can repeat it and if necessary
(and in accordance with guidelines decided in advance)
provide an explanation or alternative wording.

In face-to-face interviews, observations can be made as well.
 Mailed Questionnaire Method

Under this method, the investigator prepares a questionnaire
containing a number of questions pertaining to the field of inquiry.
The questionnaires are sent by post to the informants together
with a polite covering letter explaining the detail, the aims and
objectives of collecting the information, and requesting the
respondents to cooperate by furnishing the correct replies and
returning the questionnaire duly filled in. In order to ensure
quick response, the return postage expenses are usually borne
by the investigator.
 Use of documentary sources

Clinical and other personal records, death certificates,
published mortality statistics, census publications, etc. are
documentary sources. Examples include:

1. Official publications of Central Statistical Authority
2. Publication of Ministry of Health and Other Ministries
3. Newspapers and Journals.
4. International Publications like Publications by WHO, World
Bank, UNICEF
5. Records of hospitals or any Health Institutions.
    Problems in gathering data
Common problems might include:

  • Language barriers
  • Lack of adequate time
  • Expense
  • Inadequately trained and experienced staff
  • Invasion of privacy
  • Suspicion
  • Bias (spatial, project, person, season, diplomatic,
    professional)
  • Cultural norms (e.g. which may preclude men interviewing
    women)
  Choosing a Method of Data Collection
Decision-makers (consultants) need information that is
relevant, timely, accurate and usable. The cost of
obtaining, processing and analyzing these data is high.

The challenge is to find ways that lead to information
that is cost-effective, relevant, timely and important for
immediate use.

Some methods pay attention to timeliness and reduction
in cost. Others pay attention to accuracy and the
strength of the method in using scientific approaches.
     Categories of Data
Primary Data: those data which are collected by the
investigator himself for the purpose of a specific inquiry or
study. Such data are original in character and are mostly
generated by surveys conducted by individuals or research
institutions.
The first-hand information obtained by the investigator is more
reliable and accurate since the investigator can extract the correct
information by removing doubts, if any, in the minds of the
respondents regarding certain questions.
  Categories of Data….
Secondary Data: When an investigator uses data,
which have already been collected by others, such data are
called "Secondary Data". Such data are primary data for the
agency that collected them, and become secondary for
someone else who uses these data for his own purposes.

The secondary data can be obtained from journals, reports,
government publications, publications of professionals and
research organizations.
        Types of Questions
Before examining the steps in designing a
questionnaire, we need to review the types of
questions used in questionnaires. Depending on
how questions are asked and recorded we can
distinguish two major possibilities –
      1. Open-ended questions,
      2. Closed questions.
  Open-ended questions
Open-ended questions permit free responses that should
be recorded in the respondent’s own words. The
respondent is not given any possible answers to choose
from.
For example
“Can you describe exactly what the traditional birth
attendant did when your labor started?”

“What do you think are the reasons for a high drop-out
rate of village health committee members?”

“What would you do if you noticed that your daughter
(school girl) had a problem in education?”
  Closed Questions
Closed questions offer a list of possible options or answers
from which the respondents must choose. When designing
closed questions one should try to:

   • Offer a list of options that are exhaustive and mutually
     exclusive
   • Keep the number of options as few as possible.

    For example
    “What is your marital status?”
    1. Single
    2. Married/living together
    3. Separated/divorced/widowed
  Closed Questions….
Closed questions may also be used if one is only interested in
certain aspects of an issue and does not want to waste the
time of the respondent and interviewer by obtaining more
information than one needs.
  For example, a researcher who is only interested in
  the protein content of a family diet may ask:

 “Did you eat any of the following foods yesterday?” (Circle
 yes or no for each set of items)

     •   Peas, beans, lentils       Yes    No
     •   Fish or meat               Yes    No
     •   Eggs                       Yes    No
     •   Milk or cheese             Yes    No
      Designing the Questionnaire
Steps involved in designing the questionnaire
1)  Content
    ·   Start from your objectives and variables
    ·   Decide the measures of quantitative variables or levels of qualitative
        variables needed to reach your objectives
2)  Formulating Questions
    ·   Questions need to be clearly worded so as not to confuse the respondent or
        arouse extraneous attitudes.
    ·   Questions should provide a clear understanding of the information sought.
    ·   Be precise; avoid ambiguity and wording that might be perceived to elicit a
        specific purpose.
    ·   Questions may be open-ended, multiple choice, completion or variations of
        these.
    ·   Studiously avoid overly complex questions.
3)  Sequencing Questions
    ·   The sequence of questions should be informant friendly, beginning with
        natural conversation questions (e.g., age, marital status, education, etc.)
    ·   Restrict yourself to an essential minimum of questions while asking for
        personal information
    ·   Then start with interesting but non-controversial questions
    ·   Pose more sensitive questions at the end
4)  Formatting the Questionnaire
    ·   Provide a separate page explaining the purpose of the study, requesting the
        informant's consent to be interviewed and assuring confidentiality of the
        data recorded.
    ·   Each questionnaire must have a heading and space for the S.No., date and
        location of the interviewer.
    ·   Provide sufficient space for answers to open-ended questions.
    ·   Use a proper and attractive layout
5)  Translation
    ·   The interview will be conducted in one or more local languages and should be
        translated to the original language for standardizing the questions.
Key Principles for Constructing a
Questionnaire

1)   It should be easy for the respondent to read,
     understand, and answer.
2)   Motivate the respondents to answer
3)   Be designed for efficient data processing
4)   Have a well designed professional appearance
5)   Be designed to minimize missing data
Frequency Distributions

  § data distribution – pattern of
    variability.
    § the center of a distribution
    § the ranges
    § the shapes
  § simple frequency distributions
  § grouped frequency distributions
    § midpoint
Tabulate the hemoglobin values of 30 adult
        male patients listed below

      Patient   Hb        Patient   Hb        Patient   Hb
      No.       (g/dl)    No.       (g/dl)    No.       (g/dl)
      1         12.0      11        11.2      21        14.9
      2         11.9      12        13.6      22        12.2
      3         11.5      13        10.8      23        12.2
      4         14.2      14        12.3      24        11.4
      5         12.3      15        12.3      25        10.7
      6         13.0      16        15.7      26        12.5
      7         10.5      17        12.6      27        11.8
      8         12.8      18         9.1      28        15.1
      9         13.2      19        12.9      29        13.4
      10        11.2      20        14.6      30        13.1
        Steps for making a
        table
Step1   Find Minimum (9.1) & Maximum (15.7)

Step2   Calculate difference 15.7 – 9.1 = 6.6

Step3   Decide the number and width of
        the classes (7 classes): 9.0-9.9, 10.0-10.9, ...

Step4   Prepare dummy table –
        Hb (g/dl), Tally mark, No. patients
DUMMY TABLE

    Hb (g/dl)      Tally marks    No. patients
     9.0 – 9.9
    10.0 – 10.9
    11.0 – 11.9
    12.0 – 12.9
    13.0 – 13.9
    14.0 – 14.9
    15.0 – 15.9
    Total

TALLY MARKS TABLE

    Hb (g/dl)      Tally marks    No. patients
     9.0 – 9.9     l                1
    10.0 – 10.9    lll              3
    11.0 – 11.9    lllll l          6
    12.0 – 12.9    lllll lllll     10
    13.0 – 13.9    lllll            5
    14.0 – 14.9    lll              3
    15.0 – 15.9    ll               2
    Total          -               30
Table Frequency distribution of 30 adult male
              patients by Hb
            Hb (g/dl)       No. of
                           patients
            9.0 – 9.9          1
           10.0 – 10.9         3
           11.0 – 11.9         6
           12.0 – 12.9        10
           13.0 – 13.9         5
           14.0 – 14.9         3
           15.0 – 15.9         2
              Total           30
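
The tabulation steps can be reproduced with a few lines of Python. This is only a sketch: it takes the 30 Hb values listed earlier and uses 1 g/dl wide classes starting at 9.0, so each value's class is its integer part.

```python
import math
from collections import Counter

# Hb (g/dl) of the 30 adult male patients, in patient-number order.
hb = [
    12.0, 11.9, 11.5, 14.2, 12.3, 13.0, 10.5, 12.8, 13.2, 11.2,
    11.2, 13.6, 10.8, 12.3, 12.3, 15.7, 12.6, 9.1, 12.9, 14.6,
    14.9, 12.2, 12.2, 11.4, 10.7, 12.5, 11.8, 15.1, 13.4, 13.1,
]

# Steps 1-2: minimum, maximum and their difference (the range).
print(min(hb), max(hb), round(max(hb) - min(hb), 1))

# Steps 3-4: classes 9.0-9.9, 10.0-10.9, ...; count the values in each.
counts = Counter(math.floor(value) for value in hb)
for lower in range(9, 16):
    print(f"{lower:>2}.0 - {lower:>2}.9   {counts[lower]:>2}")
print(f"Total         {sum(counts.values()):>2}")
```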
Table Frequency distribution of adult patients by
               Hb and gender:
           Hb                Gender        Total
          (g/dl)
                      Male        Female


           <9.0         0             2      2
         9.0 – 9.9      1             3      4
        10.0 – 10.9     3             5      8
        11.0 – 11.9     6             8     14
        12.0 – 12.9    10             6     16
        13.0 – 13.9     5             4      9
        14.0 – 14.9     3             2      5
        15.0 – 15.9     2             0      2


          Total        30             30    60
                    Elements of a Table
An ideal table should have: Number, Title, Column headings, Foot-notes

Number           - Table number for identification in a report

Title, place,    - Describe the body of the table, variables
time period        (what, how classified, where and when)

Column heading   - Variable name, No., percentages (%), etc.

Foot-note(s)     - To describe some column/row headings,
                   special cells, source, etc.
Table II. Distribution of 120 Corporation divisions according to
annual death rate based on registered deaths in 1975 and 1976




             Figures in parentheses indicate percentages
DIAGRAMS/GRAPHS

Discrete data
   --- Bar charts (one or two groups)

Continuous data
  --- Histogram
  --- Frequency polygon (curve)
  --- Stem-and-leaf plot
  --- Box-and-whisker plot
Example data

   68   63   42   27   30   36   28   32
   79   27   22   28   24   25   44   65
   43   25   74   51   36   42   28   31
   28   25   45   12   57   51   12   32
   49   38   42   27   31   50   38   21
   16   24   64   47   23   22   43   27
   49   28   23   19   11   52   46   31
   30   43   49   12
Histogram




    Figure 1 Histogram of ages of 60 subjects
Polygon
         Stem and leaf plot
Stem-and-leaf of Age     N = 60
Leaf Unit = 1.0


   6    1  122269
  19    2  1223344555777788888
 (11)   3  00111226688
  13    4  2223334567999
   5    5  01127
   4    6  3458
   2    7  49
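
The stem-and-leaf display can be rebuilt from the 60 ages in the example data. The sketch below splits each age into a tens digit (stem) and a units digit (leaf); the `stem | leaves` layout is a simplification of the display above and omits the count column.

```python
from collections import defaultdict

# Ages of the 60 subjects from the example data.
ages = [
    68, 63, 42, 27, 30, 36, 28, 32,
    79, 27, 22, 28, 24, 25, 44, 65,
    43, 25, 74, 51, 36, 42, 28, 31,
    28, 25, 45, 12, 57, 51, 12, 32,
    49, 38, 42, 27, 31, 50, 38, 21,
    16, 24, 64, 47, 23, 22, 43, 27,
    49, 28, 23, 19, 11, 52, 46, 31,
    30, 43, 49, 12,
]

stems = defaultdict(list)
for age in sorted(ages):
    stems[age // 10].append(age % 10)  # stem = tens digit, leaf = units digit

lines = [
    f"{stem} | {''.join(str(leaf) for leaf in leaves)}"
    for stem, leaves in sorted(stems.items())
]
print("\n".join(lines))
```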
Box plot
Descriptive statistics report:
Boxplot
  - minimum score
  - maximum score
  - lower quartile
  - upper quartile
  - median
  - mean



 - the skew of the distribution:
    positive skew: mean > median & high-score whisker is longer
    negative skew: mean < median & low-score whisker is longer
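
The quantities a boxplot reports can be computed for the 60 ages in the example data. This is a sketch using Python's standard library; quartile values depend on the interpolation method, and `statistics.quantiles` uses its default ("exclusive") method here, so other software may give slightly different quartiles.

```python
import statistics

# Ages of the 60 subjects from the example data.
ages = [
    68, 63, 42, 27, 30, 36, 28, 32,
    79, 27, 22, 28, 24, 25, 44, 65,
    43, 25, 74, 51, 36, 42, 28, 31,
    28, 25, 45, 12, 57, 51, 12, 32,
    49, 38, 42, 27, 31, 50, 38, 21,
    16, 24, 64, 47, 23, 22, 43, 27,
    49, 28, 23, 19, 11, 52, 46, 31,
    30, 43, 49, 12,
]

lower_quartile, median, upper_quartile = statistics.quantiles(ages, n=4)
summary = {
    "minimum": min(ages),
    "lower quartile": lower_quartile,
    "median": median,
    "upper quartile": upper_quartile,
    "maximum": max(ages),
    "mean": statistics.mean(ages),
}

# Compare mean and median to judge the direction of skew.
if summary["mean"] > summary["median"]:
    skew = "positive"
elif summary["mean"] < summary["median"]:
    skew = "negative"
else:
    skew = "none"

for name, value in summary.items():
    print(f"{name}: {value}")
print("skew:", skew)
```

For these ages the mean exceeds the median, which by the rule above indicates a positive skew.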
             Pie   Chart
                           • Circular diagram – total 100%

                           • Divided into segments, each
                           representing a category

                           • Decide adjacent category

                           • The size of each slice of the pie is
                           proportional to the amount for its category




The prevalence of different degrees of
           Hypertension
         in the population
 Bar Graphs
                         Height of each bar indicates
                         frequency

                         Frequency on the Y axis
                         and categories of the variable
                         on the X axis

                         The bars should be of equal
                         width and not touching the
                         other bars
The distribution of risk factors among cases with
            cardiovascular diseases
HIV cases enrollment in
USA by gender
          Bar chart
HIV cases enrollment
in USA by gender
               Stacked bar chart
Graphic Presentation of
Data
   the frequency polygon
   (quantitative data)



   the histogram
   (quantitative data)



   the bar graph
   (qualitative data)
General rules for designing
graphs
 § A graph should have a self-explanatory
   legend
 § A graph should help reader to understand
   data
 § Axis labeled, units of measurement
   indicated
 § Scales important. Start with zero (otherwise
   // break)
 § Avoid graphs with a three-dimensional
   impression; they may be misleading (the reader
   visualizes them less easily)
Exercise
 § Identify the type of data (nominal, ordinal, interval and
   ratio) represented by each of the following. Confirm
   your answers by giving your own examples.

 §   1. Blood group
 §   2. Temperature (Celsius)
 §   3. Ethnic group
 §   4. Job satisfaction index (1-5)
 §   5. Number of heart attacks
Exercise ....
 § 6. Calendar year
 § 7. Serum uric acid (mg/100ml)
 § 8. Number of accidents in 3 - year period
 § 9. Number of cases of each reportable disease
   reported by a health worker
 § 10. The average weight gain of six 1-year-old dogs (with
   a special diet supplement) was 950 grams last month.
Any Questions
thanks

								