Chapter 6 Coding and Classification

6.1 Introduction

06.01a Traditional patient record
In the traditional patient record, data are available in written format only, mainly as free text, but sometimes also as numeric data, such as laboratory test results. The patient record is primarily used for patient care itself, that is, for diagnosis, therapy, and prognosis. Reconstructing the patient history from such a handwritten patient record by a clinician other than the original author is hindered by the fact that many medical terms are ill-defined and are perhaps even ambiguous.

06.01b Computer-based patient record
Since many patient data are becoming available in computer-based patient records (CPRs) (see Chapter 7), use of these data for purposes other than traditional archiving and reporting is becoming feasible. Reasons for storing medical data in a computer are given in Panel 6.1. Decision-support systems may support care providers in making decisions based on CPR data (see Chapters 15 and 16). For instance, prescription of a drug may trigger a decision-support system that checks for contraindications or drug interactions. Such a system will be able to operate properly only if all the diseases and symptoms of a patient are recorded in a standardized and consistent way.

06.01c Need for classification
Many data in health care, such as diagnoses, patient history data, physical examination data, or the reporting of X-ray pictures, are expressed as free text (see also Chapters 2 and 3). This leads to an infinite list of possible expressions. However, statistical overviews and decision-support systems can cope with only a finite number of classes. Rules for assigning expressions from the patient record to classes must be well defined by objective criteria. Assigning such an expression to a class always implies data reduction (i.e., loss of information), but this is not necessarily a disadvantage.
06.01d Type of classification
The appropriate level of detail and the structure of the classification system depend on the purpose for which the classification system has been designed. A classification of diagnoses for health statistics, for example, may require categories other than those of a classification for planning patient care in a hospital ward. On the other hand, it must be possible to represent all medically relevant expressions in CPRs without any data reduction. Therefore, standardized terminologies are used in these types of applications.

06.01e Terminology for coding
In this chapter, we will follow as much as possible the standard terminology used in the International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC) Technical Report TR 9789 (Information Technology; Guidelines for the Organization and Representation of Data Elements for Data Interchange. Coding Methods and Principles) (see also Chapter 34). This means that three basic elements are used in the so-called semantic triangle: (1) object, (2) concept, and (3) term. Objects, also called referents, are particular things in reality, and they may be concrete (e.g., the stomach) or abstract (e.g., the mind). A concept is a unit of thought formed by using the common properties of a set of objects (e.g., an organ). A term is a designation by a linguistic expression of a concept or an object in a specific language.

6.2 Classifications

06.02a Classifying
The term classifying has two different meanings:
1. the process of designing a classification, and
2. the coding or description of an object by using codes or terms that are designators of the concepts in a classification.

06.02b Definition
Here, we use only the first meaning of classifying. A classification is an ordered system of concepts within a domain, with implicit or explicit ordering principles. The way in which classes are defined depends on their intended use.
A classification is based on prior knowledge and forms a key to the extension of knowledge (see also Fig. 1.1).

06.02c Purpose
The purpose of a classification is, for example, to support the generation of health care statistics or to facilitate research. Examples are the classification of abnormalities of electrocardiograms or diagnoses of patients into disease classes.

06.02d Concepts
In a classification, concepts are ordered according to generic relations. Generic relations are relations of the type "A is a kind of B," for example, pneumonia is a kind of lung disease, where pneumonia represents the narrower concept and lung disease represents the broader concept. Classifications contain concepts within a certain domain. Examples of domains are reason for encounter, diagnosis, and medical procedure. In this respect the International Classification of Diseases, 9th edition (ICD-9), which will be discussed in Section 6.5.1, is a classification of diagnoses. A classification allows one to compare findings collected in different environments. For instance, if we want to compute the number of beds required per age category in a hospital, we could use the following age classes:

babies, age 0-3
children, age 4-12
teenagers, age 13-18
adults, age 19-64
elderly, age 65+

In this hypothetical example, defining the classes is a relatively simple task and the requirements for a classification are easily met (see Panel 6.2). Classifying is done according to a single criterion: age; that is, age is used as a differentiating criterion.

6.2.1 Ordering Principles

06.02.01a Ordering principles
In classifications that use more than one ordering principle, the situation is more complicated. In classifying diseases we deal with the following aspects, among others: anatomic location, etiology, morphology, and dysfunction. Each of these aspects can be used for a different ordering. Such an ordering throughout a classification is called an axis.
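A single axis can be made concrete with the hypothetical age classes given above. The following Python sketch assigns an age in whole years to exactly one class; the names `AGE_CLASSES` and `classify_age` are illustrative assumptions and do not come from any classification standard.

```python
# Age classes from the hypothetical bed-planning example. Each entry is
# (class name, lowest age, highest age); None marks an open upper bound.
AGE_CLASSES = [
    ("babies", 0, 3),
    ("children", 4, 12),
    ("teenagers", 13, 18),
    ("adults", 19, 64),
    ("elderly", 65, None),
]


def classify_age(age: int) -> str:
    """Assign an age in whole years to exactly one age class."""
    if age < 0:
        raise ValueError("age must be non-negative")
    for name, low, high in AGE_CLASSES:
        if age >= low and (high is None or age <= high):
            return name
    # Unreachable while the classes cover every non-negative age.
    raise ValueError(f"no class covers age {age}")
```

Because the class boundaries neither overlap nor leave gaps, every age falls into one and only one class, which is what makes a single differentiating criterion easy to apply.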
Multiaxial classifications use several orderings simultaneously. In the International Classification of Primary Care (ICPC), for instance, the diagnoses are classified along two axes, one for the organ system (an alphabetic character) and one for the components (a number; see Table 6.1). ICPC has primarily been designed for epidemiological purposes. Therefore, the classes were chosen in such a way that for health care studies in primary care, each class will contain a sufficient number of cases. This is why all tropical diseases are grouped together. This classification may be of use in areas such as Europe or North America, but it is definitely impractical for general practitioners operating in tropical areas, such as Africa, Central and South America, India, or Indonesia.

6.2.2 Nomenclatures and Thesauri

06.02.02a Thesaurus
One of the problems of uniform registration in health care is the lack of a common terminology. A thesaurus is a list of terms used for a certain application area or domain. Examples are a list of diagnostic terms or a list of terms for laboratory tests. A thesaurus is always intended to be complete for its domain. For practical usage, thesauri that contain a list of synonyms for each preferred term have also been developed. In this way, a thesaurus stimulates the usage of standardized terminology. A restricted set of preferred terms used within an organization for a given purpose is called a controlled vocabulary.

06.02.02b Nomenclature
In a nomenclature, codes are assigned to medical concepts, and medical concepts can be combined according to specific rules to form more complex concepts. This leads to a large number of possible code combinations.

06.02.02c Differences between classification and nomenclature
The difference between a classification system and a nomenclature is that in the former the possible codes are predefined, whereas in the latter a user is free to combine codes for all aspects involved.
The retrieval of records for patients whose data fulfill certain classification codes from a large database is relatively easy; retrieving records for patients stored by using a nomenclature is more difficult because of the high degree of freedom, leading to very complex codes. A nomenclature, however, is useful in producing standardized reports, such as discharge letters.

06.02.02d Nomenclature of diseases
In 1933, the New York Academy of Medicine started work on a database of medical terms, the Standard Classified Nomenclature of Diseases. The American Medical Association continued this work in 1961, and in 1965 the Systematic Nomenclature of Pathology (SNOP) coding system was published by the College of American Pathologists. SNOP formed the basis for the development of the Systematized Nomenclature of Human and Veterinary Medicine (SNOMED), which is an example of such a nomenclature (see Section 6.5.4).

6.2.3 Codes

06.02.03a Coding
Coding is the process of assigning an individual object or case to a class, or to a set of classes in the case of a multiaxial classification. In most classifications, classes are designated by codes. Coding is, in fact, interpretation of the aspects of an object. Codes may be formed by numbers, alphabetic characters, or both. The following list describes different types of codes.

06.02.03b Number codes
Number codes may be issued sequentially. This means that each new class will be given the next unused number. The advantage is that new classes can easily be added. Numbers can also be issued at random, so that no patient-specific information is hidden in the code. Alternatively, series of numbers can be reserved for sets of classes. Issuing this type of number is only of use with a fixed set of classes, that is, when no expansion of the set of classes is expected.

06.02.03c Mnemonic codes
A mnemonic code is formed from one or more characters of its related class rubric. This helps users to memorize codes.
However, for classifications with many classes this may lead either to long codes or to codes with no resemblance to the class rubrics. Therefore, mnemonic codes are generally used for limited lists of classes. For example, hospital departments are often indicated by a mnemonic code, such as ENT for the Department of Ear, Nose, Throat, CAR for Cardiology, or OB-GYN for the Department of Obstetrics and Gynecology.

06.02.03d Hierarchical codes
Hierarchical codes are formed by extending an existing code with one or more additional characters for each additional level of detail. A hierarchical code thus bears information on the level of detail of the related class and on the hierarchical relation with its parent class. This way of coding bears resemblance to the structure of hierarchical databases (Chapter 4), with "parents" at the higher level, and "children" at the lower levels. This implies that patient data can be retrieved by using hierarchical codes at a certain level, even when significant extensions or modifications are made at lower levels. The codes used in ICD-9 are an example of hierarchical codes.

06.02.03e Juxtaposition codes
Juxtaposition codes are composite codes consisting of segments. Each segment provides a characteristic of the associated class. In ICPC, for instance, a diagnostic code consists of one letter of the alphabet (a mnemonic code for the tract), followed by a two-digit number. For instance, all codes with the character "D" are related to the tractus digestivus and all codes starting with an "N" describe disorders of the nervous system. In the example of ICPC, two independent characteristics are coded simultaneously, and each characteristic has its own position in the code.

06.02.03f Combination codes
Another example is a classification of medical procedures using four ordering principles: action, equipment, aim, and anatomical site (see Fig. 6.1).
The combination of 100 anatomical sites with 20 different actions, 10 different instruments, and 5 different purposes results in a classification system with a potential of 100,000 classes and codes (100 × 20 × 10 × 5). A way to cope with this explosion is the use of a combination code. By using a six-digit combination code consisting of four segments, with segments dedicated to action (two digits), equipment (one digit), aim (one digit), and anatomical site (two digits), respectively, a coding clerk has to distinguish only 135 codes (100 + 20 + 10 + 5), with which 100,000 combinations can be generated.

06.02.03g Value addition codes
In value addition codes, in general only powers of 2 are used as a representation of a data item or class. Just as in a combination code, several characteristics can be coded. In this case, however, only one number instead of a segment for each characteristic is used as a code. This is easily illustrated if we code the presence or absence of risk factors, such as:

2^0 = 1 for smoker/0 for nonsmoker,
2^1 = 2 for overweight/0 for no overweight,
2^2 = 4 for increased cholesterol/0 for not increased cholesterol.

The code is the sum of the values of the risk factors that are present, so the codes 0 to 7 cover every combination of the three risk factors mentioned above. A smoker who is overweight but with no increased cholesterol level is coded as 3, and a nonsmoker who is overweight and who has an increased cholesterol level is coded as 6.

6.2.4 Taxonomy

06.02.04a Taxonomy
Taxonomy is the theoretical study of classification, including its basic principles, procedures, and rules. The term taxonomy is known from Linnaeus's work in classifying biological organisms. The term taxonomy is also used to designate the end product of a taxonomic design process and is then frequently synonymous with classification. In this book we will use the term taxonomy for the first definition: the science of classification. The term classification is used for the end product of the design process. Taxonomy is concerned with classifications in general.
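As a brief aside on the value addition codes of Section 6.2.3, the risk-factor example given there can be sketched in Python. This is a minimal illustration only; the dictionary keys and the function names `encode` and `decode` are hypothetical and are not taken from any coding standard.

```python
# Each risk factor is represented by a distinct power of 2, as in the text.
RISK_FACTORS = {
    "smoker": 2 ** 0,                 # 1
    "overweight": 2 ** 1,             # 2
    "increased_cholesterol": 2 ** 2,  # 4
}


def encode(present: set[str]) -> int:
    """Sum the values of the risk factors that are present."""
    return sum(RISK_FACTORS[name] for name in present)


def decode(code: int) -> set[str]:
    """Recover the risk factors from a code by testing each power of 2."""
    if not 0 <= code < 2 ** len(RISK_FACTORS):
        raise ValueError(f"code {code} is out of range")
    return {name for name, value in RISK_FACTORS.items() if code & value}
```

For example, `encode({"smoker", "overweight"})` gives 3 and `encode({"overweight", "increased_cholesterol"})` gives 6, matching the two patients described in the text. Because every value is a different power of 2, each sum can be decoded in only one way.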
All objects in a group have some features in common, that is, they fall within the boundaries of a group. All mammals form one group, to which people, cats, and whales belong. A group may be further subdivided on the basis of another feature or character. The lion, the tiger, and Felis domestica (the cat in our home) all belong to the group (or set) of cats. In a disease classification system such as ICD-9, the classification and subdivision are performed by the grouping of diseases in organ systems or by etiology. The different "chapters" (main disease categories or etiological categories) of ICD-9 are subdivided into groups, the groups are divided into three-digit classes, and so on (see Section 6.5.1 for a description of ICD-9).

6.2.5 Nosology

06.02.05a Nosology
Nosology is usually defined as the science of the classification of diseases. Since nosological discussions usually involve symptoms, syndromes, disorders, and injuries, as well as diseases, it would be more appropriate to define nosology as the science of the classification of diagnostic terms, that is, the taxonomy of diagnostic terms. Increasing information needs in health care have highlighted many nosological problems. It seems that the impressive expansion of the diagnostic vocabulary during the last century has not been matched by the development of a precise meta-language for describing relations between diagnostic terms. Although meta-terms such as disease, disorder, and syndrome are widely used, there is much confusion as to their proper meaning. A meta-language for describing nosological relations is either lacking or unused.

06.02.05b Nosography
Nosology is usually distinguished from nosography, which is the science of the description of diseases.
The difference between the definition and the description of disease is usually explained as follows: A disease definition gives only essential characteristics of the disease, whereas a description includes accidental characteristics, that is, characteristics that are empirically correlated with the essence of the disease, such as the so-called classification criteria of rheumatoid arthritis by the American Rheumatism Association (ARA) (see Table 6.2). There are no essential characteristics in this definition; all characteristics are accidental. This kind of definition, in which a set of accidental characteristics is used, is called polythetic. Apparently, the essential characteristic of rheumatoid arthritis is something that medical science has not yet discovered. There is a growing feeling that classifications such as ICD, SNOMED, and the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) do no justice to the way in which diagnostic terms are actually used in health care and that a new paradigm is needed.

6.3 History of Classification

06.03a History of classification
In health care, the most widely used classification system is ICD and the classifications derived from it. The first attempt at registration was the London Bills of Mortality in 1629. The first edition of the International List of Causes of Death, as it was then called, was presented by Jacques Bertillon at a meeting of the International Statistical Institute (ISI) in 1893 in Chicago, and it was officially accepted in 1900. This list was regularly revised under the supervision of ISI until its fifth edition in 1938. Until then, the code list was primarily used for mortality statistics. Health insurance companies, hospitals, medical services, the military, and other agencies felt a growing need to extend the list with codes for morbidity.
The International Health Conference, held in New York City in 1946, entrusted the Interim Commission of the World Health Organization (WHO) with the responsibility of undertaking the necessary preparatory work to extend the International List of Causes of Death with an International List for the Causes of Morbidity.

6.4 Classification and Coding Problems

06.04a Classification and coding problems
Classification problems should be distinguished from coding problems:

Classification problems concern the ordering of concepts in a way that is logically sound, elegant, and well-suited for the potential users of the classification.

Coding problems concern the technical support that must be provided to enable coding clerks to assign an individual case to the right class and produce the right code in an efficient and reliable way.

6.4.1 Classification Problems

06.04.01a Combination of categories
A problem with juxtaposition codes, combination codes, and value addition codes is that not all combinations that can be generated are sensible. In the example of the medical procedures, a "transplantation to remove an abscess" is not sensible. Combination codes can also give ambiguous results. The combination of a code for larynx, a code for removal, and a code for tube is ambiguous. It is unclear whether the tube or the larynx is removed, because the code lacks semantic information about how the items are related. In a prestandard for the classification of surgical procedures of the Comité Européen de Normalisation (CEN) (see Chapter 34), this ambiguity problem is tackled by the use of both semantic categories and syntactic categories.

06.04.01b Classification problems
When developing a classification of diseases, etiology, location, and pathophysiological mechanism can be useful ordering principles. However, we cannot always apply each ordering principle to all diseases.
Using etiology as the ordering principle, we can classify "viral pneumonia" as a viral disease, but we cannot assign "pneumonia" with the same degree of certainty to any etiological class. Therefore, pneumonia will be classified, using an anatomical ordering principle, as a pulmonary disease. Most classifications combine several ordering principles on one level. The overlap of disease classes that then results violates the rule of mutual exclusiveness (see Panel 6.2). The class "pulmonary disease" intersects with the class "viral disease." When a disease is already classified elsewhere, an exclusion statement is used to indicate that the disease is considered a member of one class only. However, this will cause problems in statistical analysis. If we want to compute the number of cases resulting from a viral disease, we cannot simply count the members in the class "viral diseases," since "viral pneumonia" is also a viral disease but those cases are classified in the class "pulmonary diseases." Adding the two classes will include cases of nonviral pulmonary diseases as well.

06.04.01c Maintenance of classification
The dynamic nature of classification explains the continuous need for maintenance of classifications such as ICD and SNOMED. The classification of acquired immune deficiency syndrome (AIDS) as a viral disease was preceded by the classification of AIDS as an immune deficiency disease. The question whether AIDS was a viral disease was accompanied by much discussion. Nowadays, the hypothesis that AIDS is caused by infection with human immunodeficiency virus (HIV) is widely accepted. Creutzfeldt-Jakob disease is currently regarded as a prion disease, but it used to be regarded as a slow virus disease. Because diagnostic terms can disappear or serve different diagnostic goals over periods of time, one should be aware that statistical analysis of existing data may require a different algorithm each time a change is made to the classification scheme.
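The counting problem described above for viral pneumonia can be made concrete with a small Python sketch. The case list is entirely invented for illustration, as are the names `CASES`, `count_by_class`, and `count_viral`; the point is only that neither counting one class nor adding two classes yields the number of viral cases.

```python
# Invented cases: (diagnosis, class assigned by the classification,
# whether the disease is in fact viral).
CASES = [
    ("influenza", "viral diseases", True),
    ("measles", "viral diseases", True),
    ("viral pneumonia", "pulmonary diseases", True),
    ("viral pneumonia", "pulmonary diseases", True),
    ("asthma", "pulmonary diseases", False),
    ("pneumococcal pneumonia", "pulmonary diseases", False),
]


def count_by_class(cases, classes):
    """Count the cases whose assigned class is one of the given classes."""
    return sum(1 for _diagnosis, assigned, _viral in cases if assigned in classes)


def count_viral(cases):
    """Count the cases that are truly viral, regardless of assigned class."""
    return sum(1 for _diagnosis, _assigned, viral in cases if viral)


undercount = count_by_class(CASES, {"viral diseases"})
overcount = count_by_class(CASES, {"viral diseases", "pulmonary diseases"})
actual = count_viral(CASES)
```

With this invented data, counting the class "viral diseases" alone finds 2 cases, adding "pulmonary diseases" finds 6, and the true number of viral cases is 4. A correct count needs information about etiology that the single assigned class no longer carries.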
6.4.2 Coding Problems

06.04.02a Differences between coding language and clinical practice
Browsing large medical classifications of diagnoses and procedures is required to encode a patient's condition for medicoeconomic purposes. The basic problem with this kind of browsing is the fact that the language used in the classification is rather different from the clinical language found in the patient record. Regardless of who encodes the patient's condition, there are difficulties in terms of mismatches between the terms in the classification and the overall representation of the patient. This gap can be bridged by using adequate computer programs.

06.04.02b Intelligent assistance
Two different techniques are used to provide clinicians and encoders with intelligent help. The first type is morpho-semantic analysis of the input language to extract all underlying concepts. This analysis decomposes all compound words into their parts: prefixes, stems, and suffixes. It then groups similar stems into more general categories. On this basis, an analysis of all available sentences around the classification in use is performed and a corresponding indexing is precomputed. Any future query of the browsing process will be handled in this context. The net result is a somewhat conceptual indexing of the classification, which has been shown to be much more valuable than the usual lexical indexing. The other type of assistance is the incorporation of a thesaurus with synonym expressions that all point to an existing entry in the classification. Such a thesaurus, which may be hidden from the user, is part of the corpus on which the indexing is done. By using a large thesaurus, the overall performance of the browser may be dramatically increased. Local thesauri that use expressions and other medical terms specific to a language or country are also possible.
In general, synonyms may include equivalent expressions (e.g., proper names) or subexpressions that represent a specialization of the initial expression. At the implementation level, browsers for medical classifications are readily available for use on personal computers, and they usually have adequate response times.

6.5 Classification Systems

6.5.1 ICD - International Classification of Diseases

06.05.01a ICD
As discussed in Section 6.3, ICD is the archetypal coding system for patient record abstraction. The first edition was published in 1900, and it is being revised at approximately 10-year intervals. The most recent version is ICD-10, which was published in 1992. WHO is responsible for its maintenance. Most present registration systems, however, are still based on ICD-9 or its modification, ICD-9-CM, which contains more detailed codes. ICD consists of a core classification of three-digit codes, which are the minimum requirement for reporting mortality statistics to WHO. An optional fourth digit provides an additional level of detail. At all levels, the numbers 0 to 7 are used for further detail, whereas the number 8 is reserved for all other cases and the number 9 is reserved for unspecified coding. The basic ICD is meant to be used for coding diagnostic terms, but ICD-9 as well as ICD-10 also contain a set of expansions for other families of medical terms. For instance, ICD-9 also contains a list of codes starting with the letter "V" for reasons for encounter or other factors that are related to someone's health status. A list of codes starting with the letter "E" is used to code external causes of death. The nomenclature of the morphology of neoplasms is coded by the "M" list. The disease codes of both ICD-9 and ICD-10 are grouped into chapters.
For example, in ICD-9, infectious and parasitic diseases are coded with the three-digit codes 001 to 139, and in ICD-10 the codes are renumbered and extended as codes starting with the letters A or B; for tuberculosis the three-digit codes 010 to 018 are used in ICD-9, and the codes A15 to A19 are used in ICD-10. The four-digit levels and optional five-digit levels enable the encoder to provide more detail. Table 6.3 gives examples of some codes in the ICD-9 system. The U.S. National Center for Health Statistics published a set of clinical modifications to ICD-9, known as ICD-9-CM. It is fully compatible with ICD-9, but it contains an extra level of detail where needed (see Table 6.3). In addition, ICD-9-CM contains a volume III on medical procedures.

6.5.2 ICPC - International Classification of Primary Care

06.05.02a ICPC - International Classification of Primary Care
The World Organization of National Colleges, Academies and Academic Associations of General Practitioners/Family Physicians (WONCA) did not accept ICD-9, but came up with its own classification. The granularity of this system is coarser than that of ICD-9. It is not only used for coding diagnoses but it also contains codes for reasons for encounter (RfE) and for therapies and laboratory tests. In most primary care information systems, the laboratory test results are directly entered as coded numerical values, so there is no need for manual coding, and a drug prescription module automatically stores the generic code for the drug and other prescription data. ICPC is compatible with earlier WONCA classifications, such as the ICHPPC-2-Defined (International Classification of Health Problems in Primary Care) and the IC-Process-PC. For codes derived from ICHPPC-2-Defined, inclusion criteria (for further specification of the code) are used. ICPC is a two-axis system (see Table 6.1).
The first axis, primarily oriented toward body systems (the tracts), is coded by a letter, and the second axis, the component, is coded by two digits. The component axis contains seven code groups. In this system the diagnosis pneumonia is coded R81 (R for respiratory tract and 81 for the diagnostic component). Codes that can be applied to more than one tract are described only as a two-digit component and must be combined with a tract letter before use. For instance, the procedure code 42 (electrical tracing) can be used for electrocardiograph registration by using the code K42.

06.05.02b SOAP
ICPC is used to encode encounters structured according to the SOAP principle (S is for subjective information, e.g., complaints; O is for objective information, e.g., test and lab results; A is for assessment, e.g., diagnosis; and P is for plan, e.g., diagnostic tests, treatment, medication, etc.; see also Chapter 7). Optionally, a fourth digit is used for some cases when an extra level of detail is required or to specify synonyms, which is a mixture of coding principles. ICPC can be used in the RfE mode (i.e., for coding the reason for encounter or the complaints), the diagnostic mode, or the process mode, where further actions, such as laboratory tests and therapies, are coded. The process mode is not coded directly, since most of its components are already incorporated as alphanumeric values.

06.05.02c Disease episode
An attractive way to organize patient-oriented information is by disease episodes. ICPC can be used to organize the registration of a disease episode over time, from its onset to its resolution. A disease episode may include several encounters. Each problem in an encounter should be coded separately. The same holds for complications of primary diseases. The committee that developed ICPC also produced conversions to and from ICD-9 and ICD-10. For several diagnoses of ICPC, criteria derived from ICHPPC-2-Defined have been defined.
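The two-axis structure of an ICPC code described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions: it handles only the basic pattern of one tract letter followed by two digits (not the optional fourth digit), the names `IcpcCode`, `parse_icpc`, and `with_tract` are hypothetical, and the tract table lists only the four letters used as examples in this chapter, with K taken to denote the circulatory tract.

```python
import re
from typing import NamedTuple

# Only the tract letters that appear as examples in this chapter.
TRACTS = {
    "D": "digestive",
    "K": "circulatory",
    "N": "neurological",
    "R": "respiratory",
}

_ICPC_PATTERN = re.compile(r"([A-Z])(\d{2})")


class IcpcCode(NamedTuple):
    tract: str      # first axis: one letter for the body system
    component: int  # second axis: two-digit component number


def parse_icpc(code: str) -> IcpcCode:
    """Split a code such as 'R81' into its tract letter and component."""
    match = _ICPC_PATTERN.fullmatch(code.strip().upper())
    if match is None:
        raise ValueError(f"not a letter followed by two digits: {code!r}")
    return IcpcCode(match.group(1), int(match.group(2)))


def with_tract(tract: str, component: int) -> str:
    """Combine a tract letter with a tract-independent component number."""
    return f"{tract.upper()}{component:02d}"
```

For example, `parse_icpc("R81")` yields the tract `R` and the component 81, and `with_tract("K", 42)` builds the code `K42` from the tract-independent procedure component 42.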
6.5.3 DSM - Diagnostic and Statistical Manual of Mental Disorders

06.05.03a DSM
A specialist code designed by the American Psychiatric Association is the Diagnostic and Statistical Manual of Mental Disorders (DSM) coding system. The first edition (DSM-I) was published in 1952. In developing DSM-II the decision was made to base it on the then newly developed ICD-8. Both systems became effective in 1968. DSM-IV has been coordinated with the development of ICD-10. The chapter on mental disorders of ICD-9-CM was compatible with DSM-III-R, the revised third edition. The fourth edition, DSM-IV, is compatible with the chapter on mental disorders in ICD-10. The classification is intended to be used by psychiatrists. However, the etiology or the pathophysiological processes are only known for some mental disorders. The approach taken in DSM-III, DSM-III-R, and DSM-IV is nontheoretical with regard to etiology or the pathophysiological process except for disorders for which the etiology or the pathology is established. In these cases etiology and pathology are included in the definition of the disorder. For instance, it is believed that phobic disorders represent some displacement of anxiety, resulting from the breakdown of defense mechanisms that keep internal conflicts out of one's consciousness. Others explain phobic disorders on the basis of acquired or learned avoidance responses to conditional anxiety. Still others believe that certain phobias result from a dysregulation of basic biological systems that mediate separation anxiety. Clinicians, however, agree on the clinical manifestations. Since it is not possible to define a theory for each disorder, let alone know the etiology, DSM is designed to describe the clinical manifestations of the disease along several axes. Therefore, DSM is a multiaxial classification system. Like ICPC, DSM also uses definitions for the disorders, including criteria for assigning a diagnosis.
Disorders in the DSM systems are classified along five axes:
1. clinical syndromes,
2. personality disorders and special developmental disorders,
3. relevant physical conditions,
4. severity of psychological stressors, and
5. overall psychological functioning.

6.5.4 SNOMED - Systematized Nomenclature of Human and Veterinary Medicine

06.05.04a SNOMED
SNOMED allows for the coding of several aspects of a disease. SNOMED was first published in 1975 and was revised in 1979. Its current version is called SNOMED International (Systematized Nomenclature of Human and Veterinary Medicine). SNOMED is also a multiaxial system. SNOMED II was a code with 7 axes, and SNOMED International has 11 axes or modules. Each of these axes forms a complete hierarchical classification system (see Table 6.4). A diagnosis in SNOMED may consist of a topographic code, a morphology code, a living organism code, and a function code. When a well-defined diagnosis for a combination of these four codes exists, a dedicated diagnostic code is defined. For example, the disease code D-13510 (Pneumococcal pneumonia) is equivalent to the combination of: T-28000 (topography code for Lung, not otherwise specified), M-40000 (morphology code for Inflammation, not otherwise specified), and L-25116 (for Streptococcus pneumoniae) along the living organism axis. Tuberculosis (D-14800), for instance, could also be coded as Lung (T-28000) + Granuloma (M-44000) + Mycobacterium tuberculosis (L-21801) + Fever (F-03003). However, this can be confusing since tuberculosis is not restricted to the lung. SNOMED is also able to combine medical concepts, using so-called combination or juxtaposition codes, to form more complex concepts. Linkage between concepts, for instance, can be expressed by "is caused by." In SNOMED International, almost all diagnostic terms of ICD-9-CM are incorporated in the disease/diagnostic module (D-codes).
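The equivalence between a dedicated diagnostic code and a combination of axis codes can be sketched in Python as a simple lookup. This is an illustration only, not how SNOMED is actually distributed or implemented; the names `AXES`, `DEDICATED_CODES`, `axis_of`, and `dedicated_code` are hypothetical, and the only mapping included is the pneumococcal pneumonia example quoted above.

```python
from typing import Iterable, Optional

# Axis letters that occur in the examples of this section.
AXES = {
    "T": "topography",
    "M": "morphology",
    "L": "living organism",
    "F": "function",
    "D": "disease/diagnosis",
}

# Combination of axis codes -> dedicated diagnostic code (example from the text).
DEDICATED_CODES = {
    frozenset({"T-28000", "M-40000", "L-25116"}): "D-13510",  # Pneumococcal pneumonia
}


def axis_of(code: str) -> str:
    """Return the axis name for a code such as 'T-28000'."""
    return AXES[code.split("-", 1)[0]]


def dedicated_code(codes: Iterable[str]) -> Optional[str]:
    """Return the dedicated D-code for a combination, if one is defined."""
    return DEDICATED_CODES.get(frozenset(codes))
```

Using a `frozenset` as the key makes the lookup independent of the order in which the axis codes are given. A combination with no dedicated code simply returns `None`.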
Rules for combining SNOMED terms to form complex entities or complex concepts have not yet been developed. Any SNOMED term may be combined with any other SNOMED term. This means that the same valid concept can often be coded in multiple ways and that the resulting combinations are not always meaningful. This freedom to combine codes across all axes allows for meaningless codes; checking such codes for correctness by a computer is almost impossible.
6.5.5 ICD-O - International Classification of Diseases for Oncology
06.05.05a ICD-O
In 1976, WHO published the first edition of the International Classification of Diseases for Oncology (ICD-O) after extensive field testing. It was based on ICD-9. The second edition, published in 1990, is an extension of the draft neoplasm chapter of ICD-10. ICD-O combines a four-digit topography code based on ICD with a morphology code that includes a neoplasm behavior code and a code for histological grading and differentiation. These neoplasm morphology codes have been adopted in the morphology axes of SNOMED and SNOMED International. ICD-O is widely used for cancer registrations.
6.5.6 CPT - Current Procedural Terminology
06.05.06a CPT
Another coding system used in the United States for billing and reimbursement is the Current Procedural Terminology (CPT) code. It provides a coding scheme for diagnostic and therapeutic procedures in which procedures are defined with codes based on cost.
6.5.7 ICPM - International Classification of Procedures in Medicine
06.05.07a ICPM
ICPM was published in 1976 by WHO for trial purposes. It contained chapters on diagnostic, laboratory, preventive, surgical, other therapeutic, and ancillary procedures. Originally, WHO planned to add chapters on radiology and drugs and to revise the classification after some years on the basis of comments received from users. Unfortunately, this never happened. Nevertheless, ICPM has been a source of inspiration for a number of other procedural classifications.
Both the procedural part of ICD-9-CM and CCP were based on ICPM. In Germany and The Netherlands, extensions of ICPM are mandatory in hospitals for reimbursement and administration purposes.
6.5.8 RCC - Read Clinical Classification
06.05.08a RCC
The Read Clinical Classification (RCC), or Read code, was developed privately in the early 1980s by a British GP (James Read) and was adopted by the British National Health Service (NHS) in 1990. RCC has been further expanded by the Clinical Terms Project, a working group chaired by the chief executive of NHS, which consists of representatives from the Royal College of Medicine, the Joint Consultants Committee, the General Medical Services Committee of the British Medical Association, and the NHS executive. RCC tries to cover the entire field of health care (see Table 6.5).
06.05.08b RCC and CPR
RCC has been developed especially for use by CPR systems. It aims to cover all terms that may be written in a patient record. The terms are arranged in chapters that cover all aspects of care. Each code represents a clinical concept and an associated "preferred term." Each code can also be linked to a number of synonyms, acronyms, eponyms, and abbreviations, which allows for the use of natural language. The concepts are arranged in a hierarchical structure, with each successive level representing greater detail. RCC uses a five-digit alphanumeric code which, in principle, allows for more than 650 million possible codes. RCC is compatible with and is cross-referenced to all widely used standard classifications, such as ICD-9 (see Table 6.6), ICD-9-CM, OPCS-4, CPT-4, and Diagnosis-Related Groups (DRGs). RCC has a one-to-one cross-reference, or mapping, to all the terms in the classifications mentioned (Table 6.6). This hierarchy in coding detail is found in all code categories. In RCC version 3, terms may have multiple parents in the hierarchy.
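As a rough check on the size of the code space, and to show how a level of detail could be read from a five-position code, consider the following sketch. Two points are assumptions that are not stated in the text: that each position can take one of 58 characters, and that unused trailing positions are padded with dots. The codes `G30..` and `G3...` are placeholders for illustration only. Deriving a single parent from the code itself applies only to a strict hierarchy, not to version 3, in which a term may have multiple parents.

```python
from typing import Optional

ALPHABET_SIZE = 58   # assumption: number of characters allowed per position
CODE_LENGTH = 5      # from the text: a five-position alphanumeric code
PAD = "."            # assumption: unused trailing positions are dots


def code_space() -> int:
    """Number of distinct codes under the assumed alphabet size."""
    return ALPHABET_SIZE ** CODE_LENGTH


def level(code: str) -> int:
    """Depth in the hierarchy: the number of positions actually used."""
    if len(code) != CODE_LENGTH:
        raise ValueError(f"expected {CODE_LENGTH} characters: {code!r}")
    return len(code.rstrip(PAD))


def parent(code: str) -> Optional[str]:
    """Code one level up, or None for a top-level (chapter) code."""
    depth = level(code)
    if depth <= 1:
        return None
    return code[:depth - 1].ljust(CODE_LENGTH, PAD)


print(code_space())     # 656356768, i.e. more than 650 million
print(parent("G30.."))  # G3...
```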
RCC version 3.1 adds the ability to combine terms in a specific, controlled way.
6.5.9 ATC - Anatomic Therapeutic Chemical Code
06.05.09a ATC
The Anatomic Therapeutic Chemical Code (ATC) has been developed for the systematic and hierarchical classification of drugs. In the early 1970s, the Norwegian Medicinal Depot expanded the existing three-level anatomic and therapeutic classification system of the European Pharmaceutical Market Research Association and added two chemical levels. Later, the WHO Drug Utilization Research Group accepted the ATC classification as a standard. Presently, the WHO Collaborating Center for Drug Statistics Methodology in Oslo is responsible for maintaining the ATC codes. ATC is an acronym for anatomical (A), the organ system in the body for which the drug is given; therapeutic (T), the therapeutic purpose for which the drug is used; and chemical (C), the chemical class to which the drug belongs. Table 6.7 provides an example of an ATC code and its composition, while Table 6.8 provides a listing of the definitions used in the ATC code.
06.05.09b Advantages and disadvantages of ATC
All classifications have disadvantages. No coding system fulfills all needs of all users. The advantages of ATC are as follows: (1) it identifies a drug product, including the active substance, the route of administration, and, if relevant, the dose; (2) it is therapeutically as well as chemically oriented, a feature that most other systems lack; (3) its hierarchical structure allows for a logical grouping; and (4) it is accepted as the international WHO standard for drug utilization research. A disadvantage is that it does not cover combination products, dermatological preparations, and locally compounded preparations.
06.05.09c ATC and national drug database
In some countries, national drug databases contain the ATC code for each drug product. This allows pharmaceutical information systems to select alternative drugs.
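The five hierarchical levels of an ATC code, and their use for finding related drugs, can be sketched as follows. This is a minimal illustration under stated assumptions: the seven-character layout (one letter, two digits, two letters, two digits) and the two example codes are not taken from Table 6.7, and the function names `atc_levels` and `deepest_shared_level` are hypothetical.

```python
import re
from typing import List

# Assumed layout of a complete code: letter, 2 digits, letter, letter, 2 digits.
ATC_PATTERN = re.compile(r"[A-Z]\d{2}[A-Z]{2}\d{2}")
# Prefix length of each of the five levels under that layout.
LEVEL_LENGTHS = (1, 3, 4, 5, 7)


def atc_levels(code: str) -> List[str]:
    """Return the prefixes of a complete ATC code, from level 1 to level 5."""
    normalized = code.strip().upper()
    if ATC_PATTERN.fullmatch(normalized) is None:
        raise ValueError(f"not a complete five-level ATC code: {code!r}")
    return [normalized[:n] for n in LEVEL_LENGTHS]


def deepest_shared_level(code_a: str, code_b: str) -> int:
    """Number of leading levels two codes share (0 = nothing, 5 = identical)."""
    shared = 0
    for prefix_a, prefix_b in zip(atc_levels(code_a), atc_levels(code_b)):
        if prefix_a != prefix_b:
            break
        shared += 1
    return shared


# Example codes chosen for illustration only.
print(atc_levels("A10BA02"))
print(deepest_shared_level("A10BA02", "A10BB01"))
```

A pharmaceutical information system could, for example, treat drugs that share the first three or four levels as candidates when looking for an alternative; where to draw that line is a design choice that the text does not prescribe.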
The ATC code also provides decision-support systems with the information needed to check for drug interactions and double medication and to perform dosage control.
6.5.10 MeSH - Medical Subject Headings
06.05.10a MeSH
The Medical Subject Headings (MeSH) classification is developed and maintained by the National Library of Medicine (NLM) in the United States. It is generally used to index the world medical literature. Within the hierarchy of MeSH, a concept may appear as a narrower concept of more than one broader concept. For example, pneumonia is listed as a respiratory tract infection as well as a lung disease. MeSH forms the basis for the Unified Medical Language System (UMLS), which is also developed by NLM (see Panel 6.3).
6.5.11 DRG - Diagnosis-Related Groups
06.05.11a DRG
The DRG classification is based on ICD-9-CM codes and other factors not included in ICD-9. The grouping of ICD codes is based on factors that affect the cost of treatment and the length of stay in the hospital, such as severity, complications, and type of treatment. The resulting classes are homogeneous with respect to costs and they are medically recognized. DRGs may thus be used for budgeting. However, because factors related to the delivery of care are included, their usefulness for budgeting is disputable. Some disease groups are clustered further, which is called case mix.
6.6 Current Developments
06.06a GALEN and UMLS
The American Society for Testing and Materials (ASTM) is working on the standardization of an extensive nomenclature system. In Europe, standardization efforts are undertaken by the European Union (see Chapter 34). The GALEN project, for example, aims at the development of a reference model for medical concepts that will be independent of language, of existing coding systems, and of the data model used by computer-based patient record systems. NLM is developing UMLS (see Panel 6.3).
UMLS contains a meta-thesaurus with medical concepts and a semantic network, which provides information on the semantic relationships between medical concepts. These concepts are taken from established vocabularies such as SNOMED, ICD-9-CM, and MeSH. NLM is developing methods to enhance the use of UMLS for encoding clinical data.
6.7 Conclusion
06.07a Conclusion
There are many overlapping classifications, not only for the coding of diagnoses but also for the classification of medical events. Although most diagnostic coding systems try to be compatible with the ICD family, ICD itself represents only a limited view and is unable to fulfill the needs of all users. Another problem is that all coding systems require well-defined criteria, but a standardized medical terminology is still lacking. Systems such as SNOMED have much more expressive power than more rigid systems such as ICD-9-CM. In studies in which several coding schemes were compared with respect to their expressive power, SNOMED scored much higher than ICD-9-CM. On the other hand, data coded with such expressive systems are more complicated to use in database queries for statistical overviews and in expert systems. Wide acceptance of a coding system is essential for the development of decision-support systems. International institutions such as WHO, with its recognized collaborating centers, play an important role in the standardization process.