					Additional file II: Summary of the theoretical review
Generic         Underlying construct       Scaling strategy      Item generation                     Item reduction                  Response          Scoring method
measures                                                         Technique                                                           formats
EuroQol         Health-related quality     Health state          Meta-analysis of existing health    Descriptors selected to         3 response        Profile or weighted
                of life – not defined      valuation             status questionnaires               ‘cover as many as possible      categories        health index
                Standardised non-                                                                    of the domains frequently       Different
                disease-specific single                                                              covered by others’ and          wordings
                index                                                                                cover wide range of severity
                                                                                                     within each domain - no         VAS:
                                                                                                     further details in literature   (0-100 mm)

McGill Pain     Based on Melzack’s         Pain Rating           120 words from questionnaires       PRI: Non-agreement of           PRI: Select one   PRI: 3 originally
Questionnaire   theory of pain             Index (PRI) -         & literature                        judges & equal appearing        word from each    proposed: No. of items
(MPQ)                                      Equal appearing                                           intervals=78 words              group if any      (NWC); Mean scale
                                           intervals for                                                                             applicable.       values (PRI(S));
                                           words &                                                                                                     Rank of values
                                           groupings                                                                                 PPI: 1-5          (PRI(R))

                                           Present Pain                                                                                                Subsequently:
                                           Intensity (PPI) - 5                                                                                         weighted rank method
                                           point scale                                                                                                 developed

                                                                                                                                                       PPI: scale value
SF-36           Physical and mental        Likert method of      Review of existing measures.        Full MOS as criterion           2-6 response      Summated scores for
                concepts and multiple      summated ratings      Items selected to reproduce the     & psychometric standards        categories        each dimension
                manifestations of well-                          parent scale of the Medical         considered                      Different         Recalibrated for
                being and health =                               Outcomes Study (MOS) - 245          Difficult to find detail of     wordings          linearity and
                8 health concepts with                           items.                              item reduction to 36 items.                       transformed
                5 defined                                                                            Likert assumptions tested
                                                                                                     for the 36 items

WHOQOL          WHOQOL group               Likert                From focus groups, question         Principal from each centre      5 response        Sum for each facet.
                defined quality of life                          writing panels in each centre.      ranked each item on             categories        6 domain scores:
                then worldwide (15                               Maximum of 12 items per facet       importance as judged from       Different         from EFA and CFA
                centres) generated                               – all items pooled (across          focus group discussions         wordings          physical,
                facets by focus groups                           centres). This resulted in 1800     - 236 items /29 facets in                         psychological, social,
                                                                 items with 1000 dissimilar items.   pilot                                             environment, spiritual,
                                                                                                     Then tested on 300 subjects                       independence
                                                                                                     in each centre -                                  Later reduced to 4
                                                                                                     psychometric methods used                         domains (1st four
                                                                                                     to reduce to 24 facets.                           above)
                                                                                                     4 items per facet to be able
                                                                                                     to test reliability
Disease specific   Underlying construct         Scaling strategy           Item generation                   Item reduction         Response format           Scoring method
Measure                                                                    Technique
Clinician report
American Knee      Not defined - Knee rating    No details in              Consensus of Knee Society         No details in          3-7 response categories   Knee & Function
Society            and function                 literature                                                   literature             Different wordings        score - additive
Score (AKS)                                                                                                                                                   100 points with
Harris Hip         Not defined - function &     No details in literature   No details in literature          No details in          1-6 response categories   Additive 100 points
Score              capacity                                                                                  literature             Different wordings        Separate method for
                                                                                                                                                              Range of Motion
Hospital for       Not defined - Knee           No details in literature   No details in literature          No details in          1-9 response categories   Additive points with
Special Surgery    disability                                                                                literature             Different wordings        deductions
Knee Score (HSS)
Lequesne Hip &     Not defined - severity       No details in literature   No details in literature          No details in          1-8 response categories   Additive
Knee Indices       index                                                                                     literature             Different wordings
Merle d'Aubigné    Not defined - Function of    No details in literature   No details in literature          No details in          6 response categories     For function: table of
Hip Rating         the hip/improvement                                                                       literature             Different wordings        scores to grade hip
                                                                                                                                                              For improvement:
                                                                                                                                                              based on sum of
Self report
Arthritis Impact   WHO definition of health     Guttman for item           Previous questionnaires           Items examined to      2-6 response categories   Standardised
Measurement                                     selection then Likert      Initially 55 items                produce optimal        Different wordings        summated scores:
Scale (AIMS)                                    for response format                                          Guttman scales - 46                              9 dimensions and
                                                and scoring                                                  items - 1 item (sex)                             overall
                                                                                                             dropped in
                                                                                                             subsequent version
Disease            Individualised measure:      Graphical rating scale     ‘Grounded theory’ approach        Responses from         Open questions            A handicap profile
Repercussion       Individual function,                                    based on a survey on impact of    patients could be                                obtained by plotting
Profile (DRP)      social, psychological,                                  RA on 458 patients                separated into 6       Severity rated on 10      the handicap rating
                   emotional and economic                                                                    domains                point graphic rating      for each domain on a
                   disadvantage i.e. ‘patient                                                                                       scales                    bar chart.
                   perceived’ handicap
Health             Hierarchical model :         No details in literature   Existing instruments – range of   Pilot of 62 items      Disability Index:         Disability Index:
Assessment         death, disability,                                      questionnaires considered 200                            8 components 20 items     Average of highest
Questionnaire      discomfort, drug toxicity                               items.                            Pre-tested and         (all 4 response           score for any
(HAQ)              and dollar cost                                                                           revised repeatedly     categories - same         question within
                                                                           Disability Index: 62 items        Redundant items        wording: based on ARA     component (adjusted
                                                                                                             eliminated.            functional classes)       for use of aids)
                                                                                                                                                              Based on the 8
                                                                                                                                    Checklist for ‘use of     components (0-3).
                                                                                                                                    aids/help needed’ items
Oxford Hip and   Not defined - Patients'        Likert (not explicitly   20 patients interviewed & review    20 items tested on       All 5 response categories   Overall sum
Knee             perception of outcome          stated)                  of existing questionnaires. 20      3x20 patients,           Different wordings
Questionnaires                                                           items drafted (method of            reviewed and
                                                                         reduction to 20 items not           modified, until final
                                                                         explained in literature)            version of 12 items
WOMAC            Objective of defining the      Likert                   100 patients probing 5 dimensions   Psychometric, quasi-     Likert: All 5 response      Sum each dimension
                 dimensionality of pain and                              pilot: 41 items                     experimental trials      categories with same        and overall.
                 disability. Five dimensions    VAS                                                          and scaling &            wording                     - Other weighting and
                 initially. Final version had                                                                statistical analysis -   VAS: 0-100mm                aggregation methods
                 3 subscales of pain,           Numeric rating scale                                         24 items                 NRS: 0-10                   considered
                 stiffness and physical                                                                                                                           - Signal method
                 function                                                                                                                                         explored but not
Abbreviations: ARA=American Rheumatism Association; CFA=confirmatory factor analysis; EFA=exploratory factor analysis; NRS=numeric rating scale;
RA=rheumatoid arthritis; VAS=visual analogue scale
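
Worked example (illustrative only): the HAQ Disability Index scoring summarised in the table, i.e. the average over the 8 components of the highest item score within each component (each item scored 0-3), can be sketched as below. This is a minimal sketch under stated assumptions, not the instrument's official scoring procedure. The function name, the component names, and the sample responses are hypothetical. The table says only that scores are "adjusted for use of aids"; the specific rule used here (an aid raises that component's score to at least 2) is an assumption.

```python
def haq_disability_index(components, aids_used=()):
    """Average of the highest item score in each component.

    components: dict mapping a component name to its list of item scores (0-3).
    aids_used:  names of components for which an aid or help was reported.
    """
    scores = []
    for name, items in components.items():
        score = max(items)  # highest score for any question within the component
        if name in aids_used:
            # Assumed adjustment rule: use of an aid lifts the score to at least 2.
            score = max(score, 2)
        scores.append(score)
    return sum(scores) / len(scores)


# Hypothetical responses: 8 components, 20 items in total.
responses = {
    "dressing": [0, 1],
    "arising": [1, 1],
    "eating": [0, 0, 2],
    "walking": [0, 0],
    "hygiene": [1, 0, 0],
    "reach": [3, 1],
    "grip": [0, 0, 0],
    "activities": [1, 2, 1],
}

print(haq_disability_index(responses))                         # 1.25
print(haq_disability_index(responses, aids_used={"walking"}))  # 1.5
```

Taking the maximum within each component means one severely limited activity determines that component's score, so the index reflects the worst difficulty in each area rather than an item-level average.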