"User-centered semantic harmonization A case study"
Journal of Biomedical Informatics 40 (2007) 353–364
www.elsevier.com/locate/yjbin

Methodological Review

User-centered semantic harmonization: A case study

Chunhua Weng a,*, John H. Gennari b, Douglas B. Fridsma a

a Department of Biomedical Informatics, University of Pittsburgh, UPMC Shadyside Cancer Pavilion Suite 301, 5150 Centre Avenue, Pittsburgh, PA 15232, USA
b Department of Medical Education and Biomedical Informatics, University of Washington, Seattle, WA, USA

Received 15 September 2006
Available online 21 March 2007

* Corresponding author. Fax: +1 412 647 5380. E-mail address: firstname.lastname@example.org (C. Weng).
1532-0464/$ - see front matter. Published by Elsevier Inc. doi:10.1016/j.jbi.2007.03.004

Abstract

Semantic interoperability is one of the great challenges in biomedical informatics. Methods such as ontology alignment or use of metadata neither scale nor fundamentally alleviate semantic heterogeneity among information sources. In the context of the Cancer Biomedical Informatics Grid program, the Biomedical Research Integrated Domain Group (BRIDG) has been making an ambitious effort to harmonize existing information models for clinical research from a variety of sources and to model agreed-upon semantics shared by the technical harmonization committee and the developers of these models. This paper provides some observations on this user-centered semantic harmonization effort and its inherent technical and social challenges. The authors also compare BRIDG with related efforts to achieve semantic interoperability in healthcare, including UMLS, InterMed, the Semantic Web, and the Ontology for Biomedical Investigations initiative. The BRIDG project demonstrates the feasibility of user-centered collaborative domain modeling as an approach to semantic harmonization, but also highlights a number of technology gaps in support of collaborative semantic harmonization that remain to be filled.
Published by Elsevier Inc.

Keywords: Semantic interoperability; Semantic harmonization; Knowledge management and engineering; caBIG™; Collaborative domain modeling; Group consensus

1. Background and motivation

Despite years of research in biomedical informatics, interoperability among healthcare information systems remains an unresolved challenge. A lack of interoperability is linked to medical errors and other problems: The US Department of Health and Human Services has stated that, “Interoperability is needed for clinicians to make fact-based decisions so medical errors and redundant tests can be reduced”. According to the Institute of Electrical and Electronics Engineers (IEEE), interoperability is the ability for two or more systems first, to exchange information and second, to use the exchanged information. The former can be achieved by making data structures or formats consistent across systems and is also referred to as syntactic interoperability, while the latter, semantic interoperability, deals with terms denoting different concepts or concepts with mismatched scopes and uses. Semantic interoperability failures introduce severe information integration errors that are much harder to detect and resolve than syntactic interoperability problems.

Semantic interoperability problems in the healthcare arena can be largely attributed to the decentralized nature of multidisciplinary healthcare communities and healthcare information systems. Numerous healthcare specialties have coevolved and established their own cultures and values for a long time. Therefore, stakeholders from disparate fields often organize healthcare information in different ways from varied perspectives. At present, despite the myriad terminology services, a big problem is the lack of explicit semantics shared among multidisciplinary practitioners and stakeholders across healthcare communities. Most medical concepts in use have heterogeneous and vague definitions which cannot be generalized across disciplines or organizations. Consequently, the uses of healthcare information often have to be confined to local contexts.

Semantic interoperability has presented a challenge to a wide range of integration activities involving exchange of healthcare information, geographic information, international laws, and electric devices. As an important practical problem, it has been tackled from different angles [4–8]. Methods to achieve semantic interoperability largely fall into the following three categories: model alignment, using semantic tags or metadata, and developing shared conceptual references.

1.1. Interoperability via model alignment

The first approach, model alignment, creates mappings among models to support their semantic interoperability [9–11]. It allows multiple models to co-exist and provides only an ad hoc integration solution that does not alleviate the heterogeneity among disparate data sources. Its advantage is that sources do not have to be modified to achieve interoperability. Its drawback is that every pair of information sources needs a mapping. Therefore, this approach has poor scalability. The number of required mappings is $C_N^2 = N(N-1)/2$, which is proportional to the square of the number of data sources (N). Tremendous computing resources are required to construct and maintain mappings, which may still have limited accuracy because mapped terms across different systems do not necessarily share the same conceptual definitions or scopes of uses.

1.2. Interoperability via metadata

The second method is to use semantic tags or metadata, such as the Dublin Core Metadata Initiative, which is one of the key technologies for building the Semantic Web. Mappings are created not directly between data sources, but either between a data source and a metadata set or between different metadata sets. Supposing there are N data sources and K metadata sets, the number of mappings to be created is $C_K^2 + N$, which is proportional to the square of the number of metadata sets (K) and the number of the data sources (N). The smaller the K, the more data sources will share the metadata sets and the fewer mappings need to be created to support the interoperability. However, autonomous metadata development is often localized and introduces potential semantic interoperability problems among metadata sets. Similar problems apply to the uses of standard terminology services since regular metadata or terminology services usually carry limited contextual information and cannot easily generalize across disciplines.

1.3. Interoperability via a shared conceptual reference model

The third approach, which is also the ideal solution to semantic interoperability, is to develop core ontology or a shared conceptual reference model to serve as the common ground for all systems or to define a shared metadata set [15–18]. The number of needed mappings equals the number of data sources (N).

The effort to construct mappings across data sources in the above three methods can be represented by a common formula: $C_K^2 + N$, where K is the number of metadata sets, and N is the total number of data sources. When K equals 1, all data sources are mapped to one metadata set provided by a shared conceptual reference model; when K equals N, then all data sources require mappings for each pair of them. Therefore, construction of a shared conceptual reference model needs a minimum number of mappings to achieve semantic interoperability among information sources. For example, with N = 10 data sources, pairwise model alignment requires $C_{10}^2 = 45$ mappings, whereas mapping every source to a single shared reference model (K = 1) requires only 10.

In reality, it is very difficult to construct a comprehensive conceptual reference model. Domain modeling, which is to conceptualize a domain and represent this conceptualization in computable knowledge as ontology or domain analysis models, is the common approach to developing shared conceptual reference models. However, most domain modeling efforts do not scale beyond individual organizations. Distributed autonomous domain modeling efforts have produced voluminous overlapping or inconsistent model resources for biomedical researchers, and contributed to the interoperability problems across healthcare information systems. A shared conceptual reference model is also required to support the interoperability of legacy systems built on existing models. Therefore, an important task in building a shared conceptual reference model is to effectively reuse extant knowledge resources from myriad domain analysis models that have been developed and have evolved constantly in a poorly-coordinated manner.

With the quickly expanding number of information and knowledge sources, researchers have been trying to find effective approaches to integrate or merge the semantics from different models. However, it is no trivial task. One panel in the 2004 WWW workshop on the semantic web for health science pointed out, “all the theoretical elements, including operations research, econometrics, systems theory, dynamical systems, and machine learning and model building, are in place today, but the practical tools for large-scale semantic unification and semantics-driven integration and interoperability are yet to be constructed”. Domain consensus modeling is one of the most important challenges for domain modeling in healthcare presently [20,21].

1.4. Semantic harmonization

In this paper, we use “semantic harmonization” to refer to a domain consensus modeling approach, which includes three steps: (1) investigating semantically connected domain analysis models; (2) deriving common semantics based on the group consensus of the developers of the models; and (3) building a coherent conceptual reference model through explicit and formal knowledge representations of the shared semantics. This approach is different from other domain modeling approaches for its consensus-based and user-centered processes. It requires community-based sharing, deriving, and integrating of common domain knowledge from distributed miscellaneous resources in a scalable manner in an open and collaborative environment. It is related to an earlier concept called semantic unification, but highly emphasizes user-centered collaborative modeling processes, instead of reliance on automatic model integration or synthesis algorithms [24,25].

In practice, various forms of semantic harmonization for harmonizing international laws or electric device standards have existed for a long time [4–6,26]; however, the processes and methodologies have rarely been discussed and remained as vague and intricate interdisciplinary group work. Moreover, to our knowledge, there is no collaborative modeling technology that effectively supports user-centered semantic harmonization. We feel that there is a pressing need for developing effective approaches to semantic harmonization. The purposes of this paper are to share our experience with the BRIDG semantic harmonization effort, to examine the challenges and opportunities for supporting user-centered domain consensus modeling, and to inform future technology designs in support of this task.

2. Methodology of BRIDG

In the domain of cancer clinical trials, there is an increasing need for global trial data sharing and knowledge discovery. Many standards for clinical trials have been or are being developed by various clinical research standardization organizations, including the National Cancer Institute (NCI), the Food and Drug Administration (FDA), Health Level Seven (HL7), and the Clinical Data Interchange Standards Consortium (CDISC). For example, HL7 uses a Reference Information Model (RIM) to develop interchange specifications for healthcare information systems; CDISC is developing platform-independent standards to support the acquisition, exchange, submission, and archiving of clinical data for pharmaceutical companies and FDA. Meanwhile, many pharmaceutical and technology companies as well as academic researchers are developing clinical trial research knowledge models or ontologies for improving the quality of clinical trial protocols or the computability of reported clinical trial data.

In February 2004, the Cancer Biomedical Informatics Grid (caBIG™) project was created as a voluntary virtual informatics infrastructure to develop interoperable cancer research tools. The project initiated scientific collaboration among domain experts from more than 50 cancer institutes across the United States. As part of the caBIG™ efforts, the Biomedical Research Integrated Domain Group (BRIDG) was formed in late 2004 to support protocol-driven clinical research through a shared domain analysis model. The mission of BRIDG is twofold: (1) to harmonize the semantics from available clinical trials information models into a shared model; and (2) to explore a methodology for user-centered semantic harmonization. Next, we introduce the BRIDG project in terms of its desiderata for a shared reference model, modeling mechanism, major participants and source models, and harmonization processes.

2.1. Desiderata of the BRIDG model

BRIDG has defined three desiderata for developing the shared reference model: that it be (1) comprehensive; (2) consensus-based; and (3) abstraction and context neutral.

(1) Comprehensive: To enable the shared model to address the needs of the broad clinical research community, BRIDG has made “being comprehensive” its important design principle. By building an open-source modeling environment, BRIDG has widely connected and engaged interested stakeholders of clinical research from academia, industry, government, standardization agencies, and patient advocates to participate in the semantic harmonization activities.

(2) Consensus-based: As a user-centered project, BRIDG strongly advocates community participation and group discussions. For all source models that need to be harmonized, model developers participate in at least one harmonization session with the BRIDG harmonization committee. During such face-to-face meetings, each concept, its attributes, and its relationships to other concepts are discussed thoroughly in the group to reach agreed-upon modeling decisions.

(3) Abstraction and context neutral: Earlier research on the Unified Medical Language System (UMLS) identified the importance of collaboration at the right level of abstraction, where involved parties would have sufficient collective experience and understanding to reach consensus. In accordance with this, to make concepts in the shared reference model reusable for different applications and contexts, BRIDG has tried to separate content from representations and focuses on representing the abstract meanings of the concepts shared by the clinical research communities. Subsequently, the BRIDG model only contains concept definitions, attributes, and concept relationships. By design, it is not application-oriented and excludes all implementation-specific application information.

However, the implementation of these desiderata has practical challenges. For example, when striving to be comprehensive, BRIDG has difficulty in staying focused and prioritizing relevant source models. The BRIDG model is also too abstract to be used directly by many application-oriented users. In Section 4, we will revisit these desiderata when we describe the unresolved research issues related to the BRIDG project.

2.2. Modeling language: UML

BRIDG chose the Unified Modeling Language (UML) as the representation language for its software engineering strengths. Domain experts can easily view graphical class relationships in UML diagrams. UML models also support a model-driven architecture so that changes in UML models can be efficiently reflected in future applications that use the models. Moreover, UML supports package importing and exporting so that collaborative modelers can work on different portions of the shared model simultaneously.

2.3. Community participation

There are three major stakeholders on the BRIDG project, which are HL7, CDISC, and NCI. Representatives from these stakeholders make up an Advisory Board and a Technical Harmonization Committee (THC), which oversee the evolution of the BRIDG model. The approximately 10-member BRIDG Advisory Board serves to identify the priorities for harmonization, and the four-member THC maintains the BRIDG model and leads all harmonization meetings. Through the open-source collaborative mode, any interested individual or parties can join the BRIDG community, learn how to work within the BRIDG project, and submit relevant source models for harmonization. At the very beginning, BRIDG started with only the Study Data Tabulation Model (SDTM) contributed by CDISC, and the Clinical Trial Object Model (CTOM) contributed by NCI, but it has grown to include regular representatives and content from other organizations including NCI, CDISC, HL7, FDA, WHO, FAET (the Federal Adverse Events Taskforce), PhRMA (the Pharmaceutical Research and Manufacturers of America), and ClinicalTrials.gov. There are also some fluid members in the BRIDG project, who participate in the harmonization activities occasionally for a particular subproject of BRIDG of interest to them.

At present, there are six active domains within the whole BRIDG modeling space, including (1) lab data modeling, (2) patient study calendar, (3) clinical trials registration, (4) adverse events, (5) SDTM, and (6) CTOM. User participation in these domains is illustrated in Fig. 1.

Fig. 1. Participation in the BRIDG modeling domains as of February 2007. (Diagram not reproduced: it links the BRIDG model's domains, namely SDTM, clinical trials registration, CTOM, adverse events, lab specification, and patient study calendar, to participating organizations such as NCI, caBIG, CDISC, WHO, PDQ.gov, ClinicalTrials.gov, the Federal Adverse Events Task Force, FDA, and HL7.)

2.4. Group communication and workflow

BRIDG participants are widely distributed across distances and organizations. To actively engage the collaboration of a variety of parties, BRIDG employs GForge, a web-based project management and collaboration software, to host the project and store all discussions and the shared model at http://www.bridgproject.org. Here any project team member can download the formative shared model. Although GForge provides online communication and collaboration features, such as messaging, these have never been used. Email list services and biweekly teleconferences are set up to facilitate communications among interested BRIDG participants and to share the latest modeling activities. About fifteen to twenty people have been participating in the biweekly teleconferences on a regular basis. A face-to-face harmonization meeting also occurs
every month. Source model providers and any interested parties can attend those meetings.

The evolving BRIDG model is divided into two packages: the harmonized area and the staging area. All source models are first imported into the staging area for harmonization. Then the harmonization results are stored in the harmonized area. Classes archived in the harmonized area can be reused by anyone who is interested in developing BRIDG-compliant applications.

A source model to be harmonized can be in any format, including a UML model, an Excel spreadsheet, a database model, or an ontology, because the harmonization is focused on the concepts and their definitions, not on the representation differences. To date, the source models that BRIDG has encountered have been database-like data dictionaries or schemas, rather than richer ontologies. These sorts of source models are re-formatted and imported into the UML modeling environment through a manual process. BRIDG modelers extract key concepts from source models and then reconstruct a UML model for these concepts and import this model as a package into the staging area of the shared BRIDG model. If the source model was an ontology with higher-order constructs such as axioms, constraints, or necessary and sufficient conditions from a description logic representation, then these would have to be ignored or transferred as text into the UML environment. However, from a practical standpoint, current applications do not typically use such constructs.

Next, the technical harmonization committee prints out semantically connected concept definitions from both the sources and the shared model, juxtaposes related concepts in an Excel spreadsheet, and distributes it to the group. This step helps modelers focus their attention on the semantically interconnected areas of related models to create mappings across models. Mappings are explicit representations of similarities and mismatches between the shared concepts (classes or attributes). During the harmonization meetings, for each concept, the chairman of the meetings reads aloud these definitions and the group discusses them and comes up with an agreed-upon definition for all. To arrive at a well agreed-upon decision, the deliberation of a concept can take from an hour to four hours or more.

BRIDG employs stewardship to facilitate collaborative modeling. This means that only members of the technical harmonization committee can make changes to the shared model; others can only suggest changes to the technical harmonization committee. Therefore, after each harmonization meeting, a member from the technical harmonization committee applies changes to the shared model and distributes the new model to all on the shared project GForge site. On this basis, that source model is considered harmonized. The complete versioning history of the shared model is archived on the shared web site so that people can track the differences across versions. Therefore, semantic harmonization is an iterative and cumulative modeling process.

3. Result 1: BRIDG harmonization work practices

For the BRIDG project, we observed some common semantic heterogeneity across models to be harmonized and identified a set of typical semantic harmonization actions, as listed below.

3.1. Common semantic heterogeneity

Semantic heterogeneity can be either conceptualization mismatches, the way a domain is interpreted, or explication mismatches, the way a domain is represented. In Table 1, we use the framework created by Visser and examples from the BRIDG project to illustrate the typical semantic mismatches across the domain models that we collected for BRIDG.

It is time-consuming and error-prone to manually categorize semantic relationships among related concepts from different domain analysis models. In the BRIDG project, this step is often the bottleneck of semantic harmonization processes.

3.2. A collection of harmonization actions

To resolve the above semantic mismatches, BRIDG has used a list of typical harmonization actions as follows:

Renaming a class: This action resolves any term mismatch. When two classes referring to the same concept have a term mismatch, one of the classes is renamed to be mapped to the other.

Redesigning an attribute (to rename or to change the type): When two attributes referring to the same concept have different data types or different terms, one of the attributes is renamed or its type is changed to be mapped to the other.

Unification of attributes for a concept: When models define different attributes for the same concept, the ultimate definition of the concept should include the unification of all possible attributes. This operator is used to resolve attribute or “concept and attribute” mismatch.

Moving attributes from a subclass to a super class: When both models have the same pair of super classes and subclasses but different attribute assignments, an attribute is moved from the subclass to the super class to make it more general.

Creating a more abstract super class for two mismatched concepts: This operator is used to resolve a concept mismatch. For example, if two models both contain the concept of “protocol”, but one refers to a document and the other refers to a series of activities, we could create a class called “protocol” as an abstract class, with “document” and “procedure” as two subclasses of it.

Creating a subsumption between two concepts: If one model defines “eligibility criteria” and the other model defines “inclusion criteria” and “exclusion criteria”, these two models represent the same concept at different abstraction levels. We can leverage “eligibility criteria” and make
Table 1
Major semantic mismatches for harmonization
(Each entry gives the major category, followed by its definition and an example.)

1. Conceptualization mismatch
  1.1 Class mismatch
    1.1.1 Categorization mismatch. Different super classes for the same class. E.g., in model A, the super class for class “clinical trial protocol” is “document”; in model B, the super class for “clinical trial protocol” is “study”.
    1.1.2 Granularity mismatch. Different abstraction levels for the same class. E.g., model A defines “inclusion eligibility” and “exclusion eligibility”, and model B defines “Eligibility criteria”.
  1.2 Relation mismatch
    1.2.1 Structure mismatch. Different class relationships. E.g., class “adverse events” associates class “observation” in model A, while class “observation” subsumes class “adverse events” in model B.
    1.2.2 Attribute-assignment mismatch. Different attribute assignment between super and subclasses. E.g., model A and B have the same super class and subclass, but model A defines a shared attribute at the super class level while model B defines it at the subclass level.
    1.2.3 Attribute-type mismatch. Different types of attribute for the same class. E.g., model A records temperature in Celsius, while model B records temperature in Fahrenheit.
2. Explication mismatch
  2.1 Concept mismatch. Models have the same term and definition for different concepts. E.g., class “Protocol” in model A refers to a document, while in model B it refers to a study procedure.
  2.2 Term mismatch. Models use synonyms to define the same concept. E.g., model A has the class “Subject”, and model B has the class “Participant”, both referring to patients in clinical trials.
  2.3 Attribute mismatch. Models use different attributes to define the same concept. E.g., class “Investigator” has 5 attributes in model A, but 7 attributes in model B.
  2.4 Concept and term mismatch. Models use the same attributes to define different concepts in different terms. E.g., class “Drug” in model A and class “Agent” in model B have the same set of attributes.
  2.5 Concept and attribute mismatch. Models use the same term but different attributes to represent different concepts. E.g., class “Subject” in model A refers to clinical trial subjects and has three attributes, while class “Subject” in model B refers to general subjects and has ten attributes.
  2.6 Term and attribute mismatch. Models use synonyms and different attributes for the same concept. E.g., class “Subject” in model A has three attributes, while class “Participant” in model B has eight.

it a super class of “inclusion criteria” and “exclusion criteria”.

Creating an association for two concepts: For example, “study” and “document” are two perspectives on a clinical trial protocol. For the principal investigator, a clinical trial protocol describes a study design to test a treatment, whereas for clinical staff, a clinical trial protocol is a document to be used as a reference or guideline. Therefore, class “protocol” should have an association to both the class “study” and the class “document”.

Copying of a class: When a source model contains a distinctive class not present in the shared model, a full copy of this class is needed to insert this new concept to the shared model. Here a full copy means a copy of the class, its subclasses, and its super classes.

Deletion of an obsolete attribute: An obsolete attribute that is out of practice is removed.

Deletion of an obsolete class: An obsolete concept that contradicts the shared concepts is removed.

Model unification or merging is more of an art than a science. There is no standard way to resolve each type of semantic mismatch listed in Table 1. Many of the above actions can be applied to solve each type of semantic mismatch. During a BRIDG harmonization meeting, initially several actions are recommended and reviewed, and then eventually one action is selected to revise the model. Group discussion support is critical for reconciling diversified perspectives.

4. Result 2: unresolved research challenges for BRIDG

Over the past 18 months, the BRIDG project has revealed some challenges with regard to user-centered semantic harmonization: some are due to the gap between available technology and the goals of BRIDG, some are inherent to the task itself, and some have been caused by the to-be-improved semantic harmonization methodology. Next we analyze the problems that we have encountered.

4.1. A gap between a reference model and applications

Designed to be a domain reference, the BRIDG model has been developed at a high level and to be context-neutral. Aimed at being generic and reusable for different application contexts, the BRIDG model has nevertheless been criticized for providing little application development support and for being disconnected from realistic application models. “How can we use the harmonized BRIDG model?” has been a recurring question frequently posted in teleconferences and harmonization meetings. Participants often first show enthusiasm for this effort and agree with its importance, but give up due to difficulties connecting the model to real-world uses.

In fact, this challenge is not unique to BRIDG, but inevitable for any reference model. Reference ontologies are broad and deep, such as the comprehensive and abstract BRIDG model, while application models are narrow and concrete, and often focused on particular user needs. As pointed out by Brinkley et al. based on their experience with the Foundational Model of Anatomy, reference ontologies are too large and detailed to be used “out-of-the-box” in applications, even when developers are aware of them and would like to use them. How we can make a reference model usable in practice is still an open research challenge because changes in the evolving reference models need to be systematically reflected in the applications of reference models. Looking back, we could have incorporated a broad range of application needs into consideration during the BRIDG harmonization process, instead of postponing this process until the completion of the BRIDG model. Timely consideration of application scenarios could have served as a formative evaluation tool for the abstract BRIDG model.

4.2. UML for semantics representation

Despite its desirable software engineering features listed in Section 2.2, UML is not a satisfying knowledge representation language for constructing a shared domain reference. UML diagrams are full of associations, which represent different relationships between classes, such as “whole and part”, “actor and action”, “cause and consequence”, and many others. These different types of associations created a lot of ambiguity in a UML model.

Moreover, UML is not a formal knowledge representation language for defining coherent class relationships and constraints. In UML diagrams, relations of a super class are not automatically inherited to subclasses; therefore, modelers can purposefully or unintentionally overwrite a super class when defining attribute and class-relationships for subclasses, which often creates inconsistency in a UML model. Furthermore, UML does not support knowledge reuse well, and attribute definitions cannot be easily shared across classes. For example, in the BRIDG model, many classes require the same attribute “ID”. There is no mechanism for sharing the definition of the same attribute among different modelers. Sometimes “ID” is defined as a string, and other times “ID” is defined as an integer, which causes much redundancy and inconsistency.

Alternative domain modeling tools to UML include Protégé and OWL, which provide more formal and coherent modeling mechanisms. However, these languages are necessarily further from the underlying database technology used by most applications. In addition, the default user interface for Protégé or Protégé/OWL requires a fairly high level of modeling sophistication, whereas our domain modelers were much more comfortable with the style of graphical modeling user interfaces provided by UML tools. Therefore, we realize that no single existing domain modeling tool satisfies user needs perfectly, especially for the mixed BRIDG modeling group. A combination of multiple tools or a tool that integrates features from multiple tools may work for such collaborative modeling tasks in the future.

4.3. Knowledge provenance

Semantic harmonization essentially distills and integrates domain semantics from existing knowledge sources. It is a modeling process based on but beyond semantics merging. To make a shared reference model convincing to assorted communities in a domain, we need to provide source meta-information or a description of the origins of the shared semantics, which is also called knowledge provenance. In addition, the harmonization result is often structurally different from all the sources. Knowledge provenance is necessary to support evidence-based modeling by making explicit the origins of the semantics in the harmonization result.

Earlier in the BRIDG project, the old versions of the BRIDG model did not provide sufficient knowledge provenance information, so some users could not tell how their source models related to the shared reference model. We learned this lesson from user feedback and inserted mappings to the original source models for each class and attribute in the current version of the BRIDG model. These mappings effectively provided preliminary provenance information; however, we saw a great need for a systematic approach to tracking and maintaining links between a harmonization result and the origins of the semantics.

4.4. Model alignment and merging

Semantic harmonization relies on the merging of semantically connected domain models, which is “the process of finding commonalities between two different ontologies A and B and deriving a new ontology C that facilitates interoperability between computer systems that are based on the ontologies A and B”. During semantic harmonization, a key task is to compare various source models and unify those semantically connected areas. It is tedious to compare thousands of concepts in related models manually. In the BRIDG project, we did not use an automatic model alignment tool and we relied on manual concept comparisons, which was very time-consuming and error-prone. Although it may be that we could have used a tool such as PROMPT for this step, it seemed as though the overhead cost for converting models into Protégé and learning PROMPT was too high. Most participants requested some assistance for model alignment such as receiving recommendations of merging options.

4.5. Handling divergence

Semantic harmonization is essentially a dynamic knowledge engineering process. The shared reference model and source models inevitably coevolved. On one hand, the shared reference model needs to be ready to assimilate any knowledge updates from source models to maintain knowledge provenance.

[...] “software code change representation” in software engineering, has been a big technical challenge for the BRIDG harmonization effort.

4.6. Consensus-based modeling in open-source communities

According to the feedback provided by the BRIDG participants, group discussions and consensus-based modeling have been helpful from the collaboration perspective. It showed each participant's commitment to making the model generic and their respect for the needs of other communities. However, consensus-based modeling has proven technically impossible for a distributed open-source modeling community such as BRIDG without sophisticated technology support. It was impossible to organize large semantic harmonization face-to-face meetings or to achieve a consensus among this large distributed community through teleconferences or emails. Through monthly face-to-face harmonization meetings, BRIDG only achieved the “small group consensus” in its harmonization effort, by which every concept included in the shared reference model has the definitions agreed upon by each source model contributor and the technical harmonization committee. This was a compromise. From the BRIDG experience, we can see the value of supporting collaborative modeling and negotiation as well as a substantial technology gap in achieving these when a big group of participants is involved.

4.7. Tradeoffs between being comprehensive and focused

BRIDG strived for comprehensiveness, but again,
On the other hand, because of the large group involved with many diverse per- the changes in the shared model are requested to be spectives, it suﬀers from the challenge for focus control in quickly communicated to model developers who use the its modeling process. To make the harmonization process shared reference model as a foundation for domain mod- eﬃcient, it is important to stay focused. There are two eling. It is important to eﬀectively support two-way syn- important principles for focus control: (1) identifying the chronization, or divergence handling for both source shared concepts across all domain models; (2) strictly fol- models and the shared reference model. Version control lowing the top-down principle and moving from abstract has been studied in software engineering for a long time; to context-speciﬁc concepts. However, focus control in however, divergence handling for co-evolving models is practice is hard. Related research questions include still a practical challenge. ‘‘How can we eﬃciently identify semantic connections Without appropriate technology support, the BRIDG across models?’’ and ‘‘How we can eﬃciently recognize a team uses a stewardship mechanism to handle divergences. semantically relevant source information model?’’ Only the steward of the BRIDG model, a technical harmo- In the BRIDG project, many interested parties in this nization member, can revise the BRIDG model. All other volunteer project wanted their source models to be harmo- revision suggestions based on group discussions are nized into the shared conceptual reference. Interests of reviewed by this person and applied to the shared model. modeling subgroups on the BRIDG project include adverse This person also releases and distributes the new versions events, study calendar, lab specimen, and best modeling of the BRIDG model. A big problem with this method practice, etc. 
Therefore, the domain modeling requests is, as mentioned above, that the steward can be a bottle- from various perspectives expanded fast and the complex- neck to the harmonization process and severely interfere ity of the project management increased quickly. BRIDG with the group’s collaboration eﬃciency. In addition, we needed to know how these requests overlap and how an lack an eﬃcient mechanism to show the changes to partic- individual’s interest is relevant to the ﬁnal shared seman- ipants before and after harmonization. Model change rep- tics. Overall, while focus control is critical for semantic har- resentation, an extension to the classic problem of monization from the project management perspective, it C. Weng et al. / Journal of Biomedical Informatics 40 (2007) 353–364 361 involves challenges in the areas of relevance measurement 5. Related work and harmonization tasks prioritizing. There is a wealth of related work that aims at achieving 4.8. Group discussions and rationale capture system interoperability as a general concept. An important example of such work is the semantic web , which aims The goal of increased interoperability between commu- at making web resources and services interoperable. In nities will not be achieved through further formalization addition to this general approach, we also compare our and abstraction. Rather, negotiation within, and especially work to three more speciﬁc examples of interoperability between communities, is indispensable . In the BRIDG projects within medical informatics: the UMLS, the Inter- project, modelers frequently use ‘‘sticky notes’’ or annota- Med, and the OBI project. tions in UML diagrams to document the sources of a con- cept or to make suggestions about how the model or a 5.1. Interoperability and the semantic web particular class or attribute should be harmonized. 
During a harmonization process, a concept could go through sev- The Semantic Web has a number of diﬀerent aspects and eral statuses: to be reviewed, to be approved, to be modiﬁed, deﬁnitions; here, we focus on those aspects that deal with to be incorporated or discarded, etc. Valuable knowledge resolving diﬀerences among the terminologies used by provenance information or harmonization design ratio- web resources. In particular, for two web resources to nales were embedded in these notes and shared within the interoperate, they must share at least some terms and con- group. Since semantic harmonization involves group col- cepts; in our terms, the web resources must use a common laboration among multidisciplinary domain experts, reference ontology to avoid semantic mismatches . As rational modeling practices are crucial to support system- should be clear, the idea of reference ontologies for the atic modeling. Poor management of design rationales con- semantic web is the same idea as the shared conceptual ref- veyed in these notes may have caused unnecessary erence model used by BRIDG and that we espouse here for recurring group discussions and sometimes created nega- semantic harmonization. tive social impact on the group work, especially for those However, there are signiﬁcant diﬀerences. First, the whose opinions are not adopted. semantic web has embraced the use of OWL as its knowl- Given the above challenges and the lack of adequate edge representation language: OWL is built up from RDF, technology support in the BRIDG project, we relied on and is an expressive language that is well-suited for certain human practice guidelines to regulate harmonization activ- types of inference. In theory, OWL has a number of advan- ities. For instance, we encouraged all modelers to develop tages over a simpler language such as UML. However, in use cases before modeling to ﬁnd a common focus shared practice the diﬀerence between models created in UML by the group. 
Moreover, we provided detailed instructions (for BRIDG) and OWL (for the semantic web) are often about handling divergence, such as how to get the latest not so great, because much practical domain modeling version of the shared model, how to synchronize the local work is in creating very simple concept and relationship version with the evolving shared model, and so on. How- deﬁnitions, without using the richer knowledge representa- ever, there was a lack of formal mechanisms to enforce tional capabilities of OWL. these guidelines; therefore, variations of harmonization Rather than focusing on knowledge representational practices were common. choices, we emphasize the hard work of face-to-face Overall, the BRIDG project did not resolve the above negotiations and the need for improved technology sup- research challenges, such as choosing appropriate repre- port for this critical step in semantic harmonization. sentation mechanisms for reference models, balancing Yet early formulations of the semantic web seemed to being comprehensive and staying focused and supporting brush over this step and assume that reference ontologies group consensus within the large open-source community. would be easily created and that semantic diﬀerences We also identiﬁed several technology gaps such as with might be resolved automatically, perhaps through classiﬁ- model alignment and merging, handling divergence, sup- cation (the inference method for OWL). However, more port of consensus-modeling in distributed open-source recent views of the semantic web have recognized the communities, and support for knowledge provenance hard work involved in reference ontology development and design rationale capture. Such knowledge could pro- and maintenance . vide implications for the design of future collaborative Finally, the semantic web is usually viewed as existing modeling or semantic harmonization technology. For across a broad range of web resources. 
Thus, the semantic example, in prior work, we established that annotations web envisions a network of related reference ontologies, can eﬀectively capture ‘‘design rationale’’ and support pro- perhaps building on each other in a principled manner. gress tracking, group discussion, and group activity coor- In contrast, our focus is much narrower: for BRIDG, all dination during collaborative writing processes [38–40]. stakeholders are interested in interoperability around sys- Therefore, it is possible to develop a web-based model tems that work with clinical trial protocols. Thus, we have annotation system to support semantic harmonization proposed building a single shared conceptual reference online. model, rather than a network of models. 362 C. Weng et al. / Journal of Biomedical Informatics 40 (2007) 353–364 5.2. UMLS . Researchers on this project ﬁrst hoped to make GLIF an Interlingua in the clinical guideline world, thereby sup- Within healthcare, The Uniﬁed Medical Language Sys- porting guideline interoperability. However, they soon dis- tem (UMLS) is currently the most widely used Interlingua. covered that true interoperation among the four major A core component of the UMLS is the metathesaurus, guideline representation methodologies, including Medical which provides a unifying paradigm that integrates Logic Modules (MLM), GEODE, MBTA, and EON, was machine-readable knowledge from a variety of sources impossible due to their incompatible functionality support. including patient record systems, bibliographic databases, Thus, InterMed developed a generic model that ‘‘would factual databases, expert systems, etc. 
The metathesaurus capture a large common subset of functionality shared by keeps a copy for all the knowledge sources and reserves the diﬀerent models and that would facilitate its adaptation the names, meanings, hierarchical contexts, attributes, to local settings and its integration with other systems such and inter-term relationships in the sources, and also creates as electronic medical records and order entry systems’’ . bridging relationships among related concepts across BRIDG and InterMed are alike in many ways in that knowledge sources. As with the semantic web and our they both pursue the same mode of sharing, which is to own work, one goal of UMLS is to provide support for encourage the community to adopt a shared aggregated connecting terms across multiple sources. standard. This view is diﬀerent than the UMLS or the UMLS is diﬀerent from BRIDG in multiple ways. First, semantic web approach which both focus on linking or the BRIDG model uniﬁes various aspects of all the con- mapping diﬀerent vocabularies or ontologies together. cepts in the clinical research domain and creates a shared GLIF was also developed via a consensus-based multi- generic representation for each concept, while the UMLS institutional collaboration and contained most of the model retains various representations of important bio- important features that are needed in clinical guidelines medical concepts from all sources and creates only post . In addition, the BRIDG project veriﬁed some of the hoc mappings among them. Since the local usage of sepa- ﬁndings from InterMed, e.g., (1) ‘‘standardization works rately built source models is respected, there is no mecha- most smoothly if focused on well-deﬁned common compo- nism to ensure the cross-framework consistency. 
Thus nents’’; (2) ‘‘using existing standards as a starting point, UMLS terminologies have diﬀerent formalisms and while aiding in establishing credibility and consensus, does degrees of completeness, as well as uncoordinated updating not always meet the modeling requirements’’; (3) face-to- policies. face meetings are crucial precursors to the eﬀective use of Second, the construction of the BRIDG model relies on distance communication technologies; and (4) abstraction consensus among domain experts through discussions of is important for building a shared standard . deep semantics for each concept, while the UMLS model The BRIDG project is built on the major success factors is produced by automated processing of machine-readable or design principles of other projects, such as open-source versions of its source vocabularies, followed by human collaboration, utilization of a variety of source models, and review and editing by subject experts . concept abstraction. However, it does not receive funding Third, BRIDG is a truly collaborative project in that for tool development as was the case for both the UMLS every concept in the model is based on group consensus and the InterMed project; therefore, it faces great chal- among all stakeholders present in harmonization meetings, lenges in continuing costly face-to-face meetings for con- while it is less clear if a true collaboration has been devel- sensus-based domain modeling that involve a large oped among the various groups who maintain the compo- interested community. After two years of harmonization nent vocabularies from which the UMLS is constructed . experiences, we realize that the project eﬃciency would Finally, UMLS accommodates dynamic links to all have been improved had we had the resources to develop knowledge sources and the mappings among them can be some model alignment and collaborative semantic uniﬁca- automatically updated regularly, while BRIDG hard-codes tion tools. 
So far all these tasks remain tedious manual representations of concepts from all the knowledge sources processes. and does not provide tools that facilitate dynamic knowl- edge updates. Overall, BRIDG and UMLS have diﬀerent 5.4. The OBO foundry and OBI project philosophies: BRIDG aims at a shared generic standard for knowledge sharing through conceptual uniﬁcation, In recent years, domain-agreed upon standards have while UMLS is about facilitating conceptual exchange been recognized as important for achieving semantic inter- across systems through automatic model alignment. operability, especially by the ontology research commu- nity. The Open Biomedical Ontology Foundry and the 5.3. InterMed Ontology for Biomedical Investigations (OBI) are two examples of community-based, domain-speciﬁc standardi- Prior to BRIDG, InterMed was an important collabora- zation eﬀorts at the knowledge level. tive informatics initiative that developed a common model The OBO Foundry  is a collaborative experiment, for guidelines (and to a lesser extent, clinical trial proto- involving a group of ontology developers who have agreed cols) known as the Guideline Interchange Format (GLIF) in advance to the adoption of a set of principles specifying C. Weng et al. / Journal of Biomedical Informatics 40 (2007) 353–364 363 best practices in ontology development. These principles lenges, and potential pitfalls of modeling practice in build- are designed to foster interoperability of ontologies within ing a domain reference model. the broader OBO framework, and also to ensure a gradual With the soon-to-be completed pilot phase of the caBIG improvement of quality and formal rigor in ontologies, in program, it is still too early to see the fruits of this model- ways designed to meet the increasing needs of data and ing eﬀort, but we want to bring the attention of our information integration in the biomedical domain. 
research community to a few technology gaps in support The Ontology for Biomedical Investigations (OBI) pro- of collaborative modeling and domain knowledge synthe- ject is developing an integrated ontology for the description sis. As we describe in Section 4, some of these gaps include of biological and medical experiments and investigations a need for better modeling tools (4.2), methods for manag- by leveraging broad international research communities. ing knowledge provenance (4.3), technologies for model The purpose of this ontology is to support the consistent alignment (4.4), tools for divergence handling and model annotation of biomedical investigations, regardless of the change representation (4.7), technologies for very large particular ﬁeld of study. This project was formerly called group collaborations (4.6), and tools for annotation and the Functional Genomics Investigation Ontology (FuGO) design rationale capture (4.8). Semantic harmonization is project . To date, the OBI project contains seventeen a challenging research area. Several intertwined unresolved ontology development groups including a ‘‘clinical trials research issues from software engineering, computer-sup- ontology’’ group. ported collaborative work, particularly collaborative mod- These two eﬀorts, like BRIDG, emphasize community eling, and semantic knowledge representation and feedback and community convergence on a single reference engineering all come into play. For collaborative domain- ontology. The OBI project is focused on metadata develop- consensus modeling to scale, future technology support will ment for annotation purposes, while the BRIDG project is be indispensable and should be urgently investigated. focused on developing a shared domain analysis model and supporting model-driven application development. 
More- Acknowledgments over, as we mentioned in Section 2.4, the BRIDG project does not include many source models as strictly deﬁned This work has been funded through the caBIGä project ontologies, but more information models. To some degree, by the National Cancer Institute. The authors express their BRIDG strives for agreed-upon content as a ﬁrst step great appreciation to the two reviewers for their construc- toward interoperability, while OBI aims for both agreed- tive and insightful review comments. The manuscript has upon content and a formal representation (in ontology). beneﬁted tremendously from the reviews. The authors also Thus, the BRIDG work is complementary to these more thank Roger Day, Heather Piwowar, and Diane Paul for formal ontology eﬀorts—as future work, we could take their helpful comments. the consensus that BRIDG achieved around content, and explore how well or easily this content could be expressed References in a formal ontology, one that might be deﬁned by an OBO or OBI-style eﬀort.  Mori AR, Consorti F. Exploiting the terminological approach from CEN/TC251 and GALEN to support semantic interoperability of healthcare record systems. International Journal of Medical Infor- 6. Conclusion matics 1998;48:111–24.  WHO. Workshopon semantic interoperability prerequisites for eﬃcient e-health systems. Information Society, Available from: <http:// There is little prior empirical knowledge about semantic www.who.int/classiﬁcations/terminology/prerequisites.pdf/>, 2005. harmonization. In this paper, we bridge this knowledge gap  Berg M, Goorman E. The contextual nature of medical information. by reﬂecting on our experiences with the BRIDG project International Journal of Medical Informatics 1999;56:51–60. and summarize some unresolved challenges for user-cen-  Cugini J. The common criteria: on the road to international harmonization. Computer Standards & Interfaces 1995;17(4):315–20. tered semantic harmonization processes.  Daly JM. 
Electrotechnical standardization harmonization in North At present, domain modeling is convenient for almost America. In: Conference record of the 1996 IEEE industry applica- any organization, but there is little support for commu- tions conference, thirty-ﬁrst IAS annual meeting (Cat. nity-based modeling that eﬀectively utilizes existing domain No.96CH25977); 1996. p. 2457–62. knowledge resources. BRIDG made a ﬁrst step to explore  London WR. Harmonization of IT laws. Computer Law and Security Report 1994;10(2):64–75. this open ground. The BRIDG project successfully estab-  Salamon H, Slater T, Ritter O. Industry perspectives on semantic web lished collaboration across diﬀerent clinical research com- for life sciences. In: W3C Workshop on semantic web for life sciences munities in academia, industry, government, and summary; 2004. Available from: http://www.w3.org/2004/10/ organized them to work together toward a shared ambi- swls-workshop-report.html. tious goal. Instead of building yet another new standard  Issues in Crosswalking Content Metadata Standards. Available from: http://www.niso.org/press/whitepapers/crsswalk.html. in an isolated research mode, BRIDG explored an open-  Campbell KE, Oliver DE, Shortliﬀe EH. The uniﬁed medical source approach for supporting community-based domain language system: toward a collaborative approach for solving analysis and knowledge synthesis and contributed ﬁrst- terminologic problems. Journal of the America Medical Informatics hand experience about this methodology, its inherent chal- Association 1998;5(1):12–6. 364 C. Weng et al. / Journal of Biomedical Informatics 40 (2007) 353–364  Noy N, Musen M. PROMPT: algorithm and tool for automated international symposium on foundations of software engineering, ontology merging and alignment. In: Proceedings of the seventeenth Newport Beach, CA, USA; 2004. p. 33–42. national conference on artiﬁcial intelligence (AAAI-2000), Austin,  Boer A, Engers TV, Winkels R. 
Using ontologies for comparing and TX; 2000. p. 450–55. harmonizing legislation. In: International conference on artiﬁcial  Klein M. Combining and relating ontologies: an analysis of problems intelligence and law, Edinburgh, Scotland, UK; 2003. p. 60–9. and solutions. In: Gomez-Perez A et al., editors. Workshop on  Kahn M et al. A model-based method for improving protocol ontologies and information sharing, IJCAI’01, Seattle; 2001. quality. Applied Clinical Trials 2002:40–50.  Sciore E, Siegel M, Rosenthal A. Using semantic values to facilitate  Sim I, Detmer DE. Beyond trial registration: a global trial bank for interoperability among heterogeneous information systems. ACM clinical trial reporting. Health in Action 2005;2(11):1090–2. Transactions on Database Systems 1994;19(2):254–90.  caBIG, 2006. Available from: <https://cabig.nci.nih.gov/>.  Dublin Core Metadata Standards. Available from: <http://dublin-  GForge. Available from: <http://gforgegroup.com/>. core.org/>.  Pinto HS, Gomez-Perez A, Martins JP. Some issues on ontology  Ingenerf J, Reiner J, Seik B. Standardized terminological services integration. In: Proceedings of IJCAI99’s workshop on ontologies enabling semantic interoperability between distributed and heteroge- and problem solving methods: lessons learned and future trends; neous systems. International Journal of Medical Informatics 1999. p. 7.1–.12. 2001;64:223–40.  Brinkley JF et al. A framework for using reference ontologies as a  Lenat DB. CYC: a large-scale investment in knowledge infrastruc- foundation for the semantic web. In: Proceedings, American medical ture. Communications of the ACM 1995;38(11):33–8. informatics association fall symposium, Bethesda, MD; 2006. p. 96–  Doerr M. The CIDOC conceptual reference module: an ontological 100. approach to semantic interoperability of metadata. AI Magzine  Protege. Available from: <http://protege.stanford.edu/>. 2003;24(3):75–92.  OWL. Available from: <http://www.w3.org/TR/owl-features/>.  
Gennari JH, Silberfein A, Wiley JC. Integrating genomic knowledge  Silva PPd, McGuinness DL, McCool R. Knowledge provenance sources through an anatomy ontology. In: Proceedings of the paciﬁc infrastructure. IEEE Data Engineering Bulletin 2003;26(4):26–32. symposium on biocomputing; 2005. p. 115–26.  Sowa JF. Building, sharing, and merging ontologies. Electronic  Rosse C, Mejino JLV. A reference ontology for bioinformatics: the communication in the onto-std mailing list 1997. foundational model of anatomy. Journal of Biomedical Informatics  Friesen N. Semantic interoperability, communities of practice and the 2003;36(6):478–500. CanCore learning object metadata proﬁle. In: WWW2002 alternate  Arango G, Prieto-Diaz R. Domain analysis: concepts and research paper tracks proceedings; 2002. directions. In: Prieto-Diaz R, Arango G, editors. Domain analysis:  Weng C, Gennari JH, McDonald DW. A collaborative clinical trial acquisition of reusable information for software construction. IEEE protocol writing system. In: Proceedings of MedInfo’2004, San Computer Society Press; 1989. Francisco, CA; 2004. p. 1481–6.  Visser PRS et al. An analysis of ontological mismatches: heteroge-  Weng C, Gennari JH. Asynchronous collaborative writing through neity versus interoperability. In: AAAI 1997 spring symposium on annotations. In: Notes, Proceedings of ACM conference on computer ontological engineering, Stanford, USA, 1997; 1997. p. 164–72. supported cooperative work (CSCW’04), Chicago, IL; 2004. p. 578–  Schlenoﬀ C et al. An analysis of existing ontological systems for 81. applications in manufacturing and healthcare. NISTIR 6301,  Weng C, McDonald DW, Gennari JH. Participatory design of a National Institute of Standards and Technology, Gaithersburg, collaborative clinical trial protocol authoring system. In: The second MD; 1999. international conference on IT in health care: socio-technical  Doerr M, Hunter J, Lagoze C. Towards a core ontology for approaches, Portland, OR; 2004. 
information integration. Journal of Digital Information 2003;4(1).  Shadbolt N, Berners-Lee T, Hall W. The semantic web revisited.  Kapur D, Narendran P. Matching, uniﬁcation and complexity. IEEE Intelligent Systems 2006;21(3):96–101. SIGSAM Bulletin 1987;21(4):6–9.  Peleg M et al. The intermed approach to sharable computer-  Nejati S. Formal support for merging and negotiation. In: Proceed- interpretable guidelines: a review. Journal of the American Medical ings of the 20th IEEE/ACM international Conference on Automated Informatics Association 2004;11(1):1–10. software engineering, Long Beach, CA, USA; 2005. p. 456–60.  The OBO Foundry. Available from: <http://obofoundry.org/>.  Uchitel S et al. System architecture: the context for scenario-based  Ontology for Biomedical Investigations. Available from: <http:// model synthesis. In: Proceedings of the 12th ACM SIGSOFT twelfth obi.sourceforge.net/index.php#organization/>.
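
As a supplementary illustration of the inconsistency described in Section 4.2, where the attribute “ID” is defined as a string in some classes and as an integer in others, the following minimal Python sketch shows one way such conflicting attribute definitions might be detected automatically. It is only a sketch under stated assumptions: the class names, the dictionary layout, and the function find_conflicting_attributes are hypothetical and are not part of the BRIDG model or of any tool used in the project.

```python
from collections import defaultdict

# Hypothetical, simplified view of a UML model: class name -> {attribute: type}.
# These classes and types are illustrative only and are not taken from BRIDG.
MODEL = {
    "Study": {"ID": "string", "title": "string"},
    "Participant": {"ID": "integer", "birthDate": "date"},
    "AdverseEvent": {"ID": "string", "onsetDate": "date"},
}


def find_conflicting_attributes(model):
    """Return attributes that are declared with more than one type.

    The result maps each conflicting attribute name to a dict of
    {type: sorted list of classes that declare the attribute with that type}.
    """
    usage = defaultdict(lambda: defaultdict(list))
    for class_name, attributes in model.items():
        for attr_name, attr_type in attributes.items():
            usage[attr_name][attr_type].append(class_name)
    return {
        attr: {t: sorted(classes) for t, classes in by_type.items()}
        for attr, by_type in usage.items()
        if len(by_type) > 1  # keep only attributes with diverging types
    }


conflicts = find_conflicting_attributes(MODEL)
for attr, by_type in conflicts.items():
    print(attr, by_type)
```

A report of this kind could be circulated before a harmonization meeting so that modelers agree on a single definition for each shared attribute; whether that would fit the BRIDG workflow is an assumption, not something the project evaluated.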