									Terminology Tools Report




     - an overview -




       October 2004




        Don Walker
                                            Terminology Tools Report – an overview



                                                            CONTENTS

1    Introduction
      1.1     Terms of reference
      1.2     Some tasks
      1.3     The tools that might be used
      1.4     Those who might use terminology tools
2    The issues raised
      2.1     Availability of tools are a priority for CATCH if it is to be implemented within 12
              months or so; maintenance is a particular concern and tools that are responsive to
              changes/updates are crucial. Developments in web-based access tools are of
              particular interest.
      2.2     Terminology management tools need to be able to handle more than one
              terminology. Tools need to support a broad architecture of terminology and
              classification. Mapping between terminologies is crucial. The quality of the user
              interface and general usability by dispersed and varied users is very important.
      2.3     Tools need to deal with both terminologies and classifications and the link between
              the two.
      2.4     Maintenance and responsiveness is critical – web service models.
      2.5     Should we expect one tool to serve development, maintenance and dissemination
              requirements or do we need a different tool for each? If multiple tools are needed
              what import and export functions are available? Access and version controls are
              particularly important.
      2.6     Ideal to minimise the number of terminologies as the effort of mapping increases with
              the number of different terminologies.
      2.7     What are the developments in natural language processing? E.g. is encoding of
              semi-structured pathology reports possible now eg for Cancer registry?
      2.8     Links with messaging must be kept in mind within the terminology architecture.
              Interface/integration with HL7 is an important support to agency integration.
      2.9     Ability to support multilingual terminologies preferable. Information about recent
              Dutch experience/work in this area would be of interest; getting the syntax right is
              crucial.
      2.10    Need to focus on what can be realistically expected in the short term as well as keep
              an eye on future, such as natural language processing. But important not to 'bite off
              more than we can chew'.
      2.11    Semantic interoperability is the key/main game, syntactic interoperability is an
              important facilitator of semantic interoperability.
      2.12    Pre and Post coordination, how do tools manage these?
      2.13    Information about the general state of the market is required. Currently the market is
              quite small for terminology development tools but large for terminology
              implementation tools. Do we buy development tools or build ourselves?
      2.14    Interest in latest developments in the relationship between interface and reference
              terminologies.
      2.15    AIHW is also interested in functional requirements for Meteor (on-line metadata
              registry) in a terminology environment. Main focus is on linkage between
              terminologies and classifications.
      2.16    Where is SNOMED up to in general? Domain coverage, localisation? Uptake: who
              is using where? If others have chosen other paths, what are they? What impact might
              that have on Australia?
      2.17    What has happened in the US in year since purchase of national SNOMED licence?
              Has purchase of licence 'opened the floodgates' for use? Extent of growth in use of
              SNOMED CT by hospitals and other key health care providers?
      2.18    Interest in governance and intellectual property issues, especially any moves toward
              internationalisation of control of SNOMED-CT.
      2.19    What is the uptake by software vendors on international scale?
      2.20    What is the use of terminology services by software vendors, or are they using
              internal solutions?
      2.21    Are there tools which manage the 3 levels of term to classification to grouping (eg
              DRG)?
      2.22    Important to consider the customer interface not just developer/maintainer
              interface. Are there tools that could support a process of submission of new terms
              by users? Web based?
      2.23    What are the costs and licensing options for tools such as Health Language?
      2.24    How would tools be used if we had them? Need to keep in mind there may be many
              users with different purposes. What is the capacity of these tools? What would be the
              best licensing arrangements for Australia?
3    Appendix
      3.1     Teleconference Notes
              3.1.1   Briefing notes for Australian participants in SNOMED CT User Group meeting and
                      HL7 meeting – US, September 04
              3.1.2   Notes from teleconference Friday 17th September
      3.2     E-Mail message from Tad McKeon of St Jude
4    Attachments
      4.1     Attached "NIS approach to in-house tools at NCCH"
      4.2     Attached documents describing the PAT
      4.3     Attached Health Language report
      4.4     Attached "Essential SNOMED" or "SNOMED ET"





1    Introduction
A wide variety of computer tools may be used when working with terminologies. The
range includes spreadsheets, spelling checkers, access databases, phonetic algorithms,
normalisation methods, poly-browsers, authoring (modelling) software applications,
and terminology servers.


1.1 Terms of reference
This report is based on the questions and comments that resulted from a
teleconference meeting on Friday 17th September, 2004. Participants included
members of the (ex) CTWG, DoHA, States and Territories, NCCH, and AIHW. Notes
from the meeting are attached under paragraph 3.1 on page 26.


1.2 Some tasks
A glance at the above notes suggests that the following broad tasks and features may
be relevant to the topic of "terminology tools"…
    1. Parse original text to create a vocabulary of terms, either manually, or by
        machine natural language processing (NLP).
    2. Create (model/author) a terminology or classification system
    3. Link vocabulary items to terminology or classification terms
    4. Link vocabulary items to terminology or classification concepts
    5. Link concepts from one terminology or classification system to concepts in
        another
    6. Map (using many to many relations) the concepts from one terminology or
        classification system to concepts in another
    7. Edit (maintain) terminologies and classifications
    8. Distribute terminology data
    9. Update terminologies in remote sites
    10. Integrate updated terminology with local-extensions and resolve conflicts
    11. Process clinical data entries prior to concept identification
    12. Identify pre-coordinated concepts in a terminology
    13. Identify the post-coordinated concepts and their relationships that are
        contained in a clinical phrase
    14. Data mine the concepts in the free text of clinical records
    15. Create user subsets with user defined hierarchies
    16. Display the features of one terminology
    17. Display the features of multiple terminologies
    18. Offer web based access and communication
    19. Offer standard syntax for mapping-rules

The above tasks and features are revisited in more detail later in the report.




1.3 The tools that might be used
The following tools might be involved in the above tasks:

Tool                                Reference                                                 Task number
Aust Parsing Tool                   donald.walker@adelaide.edu.au                             1
Mayo NLP Tools                      elkin.peter@mayo.edu                                      1,2,7,11,12,13,14,17,18
Language & Computing (L&C)          dave@LandCglobal.com                                      1,11,12,13,14…
Computer Science Innovations (CSI)  jmortimer@csi-inc.com                                     1,11,12,13,14…
Apelon                              JBowie@apelon.com                                         2,5,6,7,8,9,12,17,18
Health Language                     marc@healthlanguage.com                                   2,5,7,8,9,10,12,17,18
Protégé                             http://protege.stanford.edu/                              2,5,7,12,17,18
PAT                                 donald.walker@adelaide.edu.au                             2,3,4,5,6,7,11,12,17
SNOMED Clue Browser                 http://www.netcluesoft.com (David Markwell)               11,12,16
SNOMED Subset tool                  http://www.snomed.org/products/tools/subset_editor.html   15
Other specialised tools             Mostly in-house applications                              19




1.4 Those who might use terminology tools
The following groups might use terminology tools:
   - Terminology developers, maintainers and researchers
   - Electronic health record (EHR) software developers
   - EHR administrators and maintainers
   - EHR creators (e.g. clinicians, nurses, etc.)
   - EHR researchers



2      The issues raised
The following issues were raised during the teleconference on 17th September 2004.


2.1       Availability of tools are a priority for CATCH if it is to be implemented
          within 12 months or so; maintenance is a particular concern and tools that
          are responsive to changes/updates are crucial. Developments in web-based
          access tools are of particular interest.
The NCCH is responsible for the maintenance of CATCH. With this in mind, the
ideal situation would be that NCCH has in-house computing skills adequate for the
maintenance of CATCH, which is a small and simple terminology/classification
system.
Currently the NCCH is exploring the development of in-house tools (called "NIS")
that are web based (using Microsoft ".NET" and "SQL") – see the attachment in
paragraph 4.1 on page 28.
CATCH could also be managed using "Protégé" from http://protege.stanford.edu/.
This program is free and has a comprehensive list of APIs. Protégé loads its data into
RAM, and CATCH, being small, would easily fit. Protégé is an open-source program
that is stable and very sophisticated.
The PAT (from the University of Adelaide), which has been used to edit CATCH, needs
some enhancements to speed up the maintenance work. These have been discussed
with NCCH. The PAT is not currently web based. Changes to PAT are "on hold"
while NCCH develops its in-house tools. A document that describes some aspects of
PAT is attached – see paragraph 4.2 on page 29.
"Health Language" (HL) (http://www.healthlanguage.com/index2.html) could be used
to maintain CATCH. It would offer powerful, mission-critical web support for
updating CATCH in the field. Experience in the UK will offer further insight into its
ability and reliability. At this stage, and for CATCH alone, HL might be too expensive.
It has limited (but probably adequate) authoring functionality.
Apelon (http://www.apelon.com/) would be too complex for the CATCH task.


2.2    Terminology management tools need to be able to handle more than one
       terminology. Tools need to support a broad architecture of terminology
       and classification. Mapping between terminologies is crucial. The quality
       of the user interface and general usability by dispersed and varied users is
       very important.
Yes…
When placing multiple terminologies in a single tool, their data structures must first be
modified so that each has a like structure. However, no "standard structure" exists.
SNOMED CT has three core tables and a "directed structure" that is very efficient and
simple. Additional data needs are met by the PAT export structure. Dr Chris Chute
and Dr Harold Solbrig of the Mayo Clinic have evolved a very similar structure.
The data in a modified data structure can then be imported into a single and suitable
authoring, and mapping environment. Multiple terminologies in the one environment
result in a large application. Searching and web-based work may be slow. Some
systems (like Protégé) load very large files into RAM. Loading all SNOMED CT into
RAM is beyond the capacity of most systems.
Mapping between terminologies can be complex. Many-to-many relationships in
"groups" may be required. Complex and accurate maps may require "syntax" and
"rules". The standards for accurate map definitions are not yet defined. Most
authoring tools currently offer simplistic mapping facilities.
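As an illustration only, a grouped many-to-many map might be held as data along the
following lines. The identifiers ("src:A1", "tgt:X1" and so on), the MapGroup and
ConceptMap names, and the rule text are hypothetical; they are not taken from any of the
terminologies or tools discussed in this report, and no standard rule syntax is implied.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MapGroup:
    """One alternative within a map: all targets in the group apply together."""
    targets: list[str]
    rule: str = "ALWAYS"  # hypothetical rule text; no standard syntax exists yet


@dataclass
class ConceptMap:
    """Many-to-many map from source concepts to groups of target concepts."""
    groups: dict[str, list[MapGroup]] = field(default_factory=dict)

    def add(self, source: str, targets: list[str], rule: str = "ALWAYS") -> None:
        self.groups.setdefault(source, []).append(MapGroup(targets, rule))

    def targets_for(self, source: str) -> list[list[str]]:
        """Return each group of targets recorded for a source concept."""
        return [g.targets for g in self.groups.get(source, [])]

    def sources_for(self, target: str) -> list[str]:
        """Reverse lookup: every source concept that maps to the target."""
        return sorted(
            s for s, gs in self.groups.items()
            if any(target in g.targets for g in gs)
        )


cmap = ConceptMap()
cmap.add("src:A1", ["tgt:X1", "tgt:X2"])            # one source, two targets together
cmap.add("src:A1", ["tgt:X3"], rule="IF age < 16")  # an alternative group with a rule
cmap.add("src:A2", ["tgt:X1"])                      # two sources share one target
```

Here "src:A1" maps to two alternative groups, and "tgt:X1" is reached from two different
sources, which is the situation a simple one-to-one cross-reference table cannot express.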
An efficient user interface is vital for speedy productivity and reduction of tedium. A
good interface may result from a combination of a practical knowledge of the tasks,
an ingenious approach, the data design, application platform, hardware, and an
inspired art-form. Field testing a product is the only way to assess its interface. Only
enlightened end users can do this effectively.




2.3    Tools need to deal with both terminologies and classifications and the link
       between the two.
Both terminologies and classifications can be moulded into unified structures and
placed within the same authoring tool. If the tool has adequate linking and mapping
facilities, cross-reference tables can be created that map one to the other. As stated
earlier, these may require many-to-many relationships, groupings, and a "standard"
syntax that can express rules. These features are rare in authoring software. The
desirable standards are currently being explored.


2.4    Maintenance and responsiveness is critical – web service models.
The maintenance of a terminology in the field is vital. Current technology suggests
deployed terminology servers that are updated via a cascading web network. The
network should update all systems from a central authority (e.g. the College
of American Pathologists (CAP)). Satellite centres then integrate any required
region-specific data (e.g. the UK). Finally, local servers are updated and local data
preserved and integrated (e.g. at hospitals and clinics). Conflicts between updated and
existing concepts and terms need to be resolved along the way. The above is mission
critical and far from trivial. Updates may be required daily.
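The last step of the cascade, applying a central update to a local server while preserving
local data, can be sketched as follows. This is a simplified assumption about how such a
merge might work, not a description of how Health Language or any other product does it;
the function name, the concept identifiers and the terms are all hypothetical.

```python
from __future__ import annotations


def apply_update(
    local: dict[str, str],
    local_extension_ids: set[str],
    update: dict[str, str],
) -> tuple[dict[str, str], list[str]]:
    """Merge a central update into a local copy.

    local               concept id -> term currently held locally
    local_extension_ids ids that were created locally and must be preserved
    update              concept id -> term issued by the central authority

    Returns the merged terminology and the ids whose conflicts need human review.
    """
    merged = dict(local)
    conflicts: list[str] = []
    for concept_id, term in update.items():
        if concept_id in local_extension_ids and local.get(concept_id) != term:
            # The update collides with a local extension: keep the local entry
            # and flag it for review rather than overwrite it silently.
            conflicts.append(concept_id)
        else:
            merged[concept_id] = term
    return merged, sorted(conflicts)


# Hypothetical data for illustration only.
local = {"100": "asthma", "L-1": "ward 7 asthma protocol"}
update = {"100": "asthma (disorder)", "200": "eczema", "L-1": "something else"}
merged, conflicts = apply_update(local, {"L-1"}, update)
```

In this sketch the central change to "100" and the new concept "200" are accepted, while
the collision on the local extension "L-1" is reported instead of being resolved
automatically.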
The UK has selected Health Language alone for the above tasks. Having only one
system avoids multiple interface and standards issues.
Work is underway on standard representation of terminologies. When these are
perfected, then perhaps a single terminology server system may not be required.
The responsiveness of web-based authoring tools may well be adequate, as only very
few users are likely to be involved at any one time. However, a terminology server
that is web-based and tries to cope with a nation's terminology requirements would be
slow. It could only be as reliable as the "internet service" that carries it. It is generally
recommended that terminologies be updated via the web, but accessed locally – just
as the operating system of a personal computer is updated via the web and stored and
used locally.


2.5    Should we expect one tool to serve development, maintenance and
       dissemination requirements or do we need a different tool for each? If
       multiple tools are needed what import and export functions are available?
       Access and version controls are particularly important.
It depends on the task…
Constructing a vocabulary of the terms that are used in reports and documents might
require the use of a "parser" like that developed at the University of Adelaide for the
"GP Vocabulary stage-1 Project". (See attachment referenced in paragraph 4.4 on
page 29).
A simple system, like CATCH, could be developed, maintained and deployed using
the one tool (e.g. Health Language).
A complex terminology, like SNOMED, requires specialised resources for its creation
(e.g. Apelon with its "description logic" and "inference classification"). Once created,
its central development, maintenance and integration (e.g. by CAP) would probably
continue to be done using its creation tool (i.e. Apelon). Local extensions of the
terminology (e.g. at hospitals or clinics) could be created and maintained with a less
powerful authoring tool (e.g. Health Language). It would need to be part of the
dissemination environment, as the local extensions would need to be preserved and
integrated with updated data.
Transferring data from one tool to another without an accepted standard format is a
difficult task. A system should be able to export its data in a predictable form using a
recognised format. The exported data can then be manipulated to suit the
system into which it is to be imported. An intimate knowledge of the internal structure
is required for this import. Wrongly formatted import data can cause problems that
only a programmer with access to the inner workings of the system can resolve.
Manageable and predictable export options are provided by Protégé, PAT and Health
Language. Apelon has its own export that is difficult to decipher.
Large and complex systems tend to offer an import service rather than import tools
(e.g. Health Language, Apelon, PAT). Data may be imported into Protégé.
Standards related to terminologies are evolving. SNOMED has three core data files
(Concepts, Terms, and Relationships). Web Ontology Language (OWL) seems to be a
promising environment in which to express the contents of a terminology. A very
large terminology may create an enormous OWL file.
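The three-file layout of concepts, terms and relationships can be pictured with a small
sketch. The field names and sample identifiers below are simplified assumptions for
illustration; they are not the actual SNOMED column names or codes.

```python
from __future__ import annotations

from typing import NamedTuple


class Concept(NamedTuple):
    concept_id: str


class Term(NamedTuple):
    term_id: str
    concept_id: str   # the concept this term describes
    text: str


class Relationship(NamedTuple):
    source_id: str    # concept the relationship starts from
    type_name: str    # e.g. "is_a"
    target_id: str    # concept the relationship points to


# Hypothetical sample data, not real SNOMED content.
concepts = [Concept("C1"), Concept("C2"), Concept("C3")]
terms = [
    Term("T1", "C1", "disorder"),
    Term("T2", "C2", "heart disease"),
    Term("T3", "C2", "cardiac disease"),
    Term("T4", "C3", "myocardial infarction"),
]
relationships = [
    Relationship("C2", "is_a", "C1"),
    Relationship("C3", "is_a", "C2"),
]


def terms_of(concept_id: str) -> list[str]:
    """All terms attached to one concept."""
    return [t.text for t in terms if t.concept_id == concept_id]


def ancestors(concept_id: str) -> list[str]:
    """Follow the directed "is_a" relationships upwards from a concept."""
    found: list[str] = []
    frontier = [concept_id]
    while frontier:
        current = frontier.pop()
        for r in relationships:
            if (r.source_id == current and r.type_name == "is_a"
                    and r.target_id not in found):
                found.append(r.target_id)
                frontier.append(r.target_id)
    return found
```

One concept may carry several terms (C2 has two), and the hierarchy is recovered purely
from rows in the relationships table, which is what keeps the structure simple to export
and import.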
Access to, and the export of elements of individual data tables is provided by PAT.
Other systems only allow selected and limited access.
Version control may involve "publication version identifiers", in which case some
data elements may need to be stamped with a "version identifier". Version control can
be simply a matter of "date and time stamping" selected or all edits. Accessing data
according to its stamp may be more complex. Merging data from different versions and
preserving "local extensions" can be extremely complex. Health Language may prove
to be good at this task when field tested in the UK.
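A minimal sketch of "date and time stamping", assuming every edit is kept with its stamp so
that the state at an earlier moment can be read back. The StampedStore class and the sample
data are hypothetical and do not describe the versioning of any tool named in this report.

```python
from __future__ import annotations

from datetime import datetime


class StampedStore:
    """Keeps every edit with a date/time stamp so earlier states can be read back."""

    def __init__(self) -> None:
        # concept id -> list of (timestamp, term), in the order the edits were made
        self._history: dict[str, list[tuple[datetime, str]]] = {}

    def edit(self, concept_id: str, term: str, when: datetime) -> None:
        self._history.setdefault(concept_id, []).append((when, term))

    def as_of(self, concept_id: str, when: datetime) -> str | None:
        """Return the term that was current at the given moment, if any."""
        current = None
        for stamp, term in sorted(self._history.get(concept_id, [])):
            if stamp <= when:
                current = term
        return current


store = StampedStore()
store.edit("100", "asthma", datetime(2004, 1, 1))
store.edit("100", "asthma (disorder)", datetime(2004, 7, 1))
```

Stamping the edits is the easy part; reading data "as of" a date already needs a search
through the history, and merging two such histories with local extensions is harder still.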


2.6    Ideal to minimise the number of terminologies as the effort of mapping
       increases with the number of different terminologies.
There should be only one healthcare terminology, as the many concepts encountered
in healthcare are shared between several domains of healthcare – GPs, surgeons,
pathologists, nurses, ambulance staff, pharmacists, dentists etc. all share common
healthcare concepts which might be applied to a single patient.
It is not feasible to create and maintain accurate maps between concepts in different
terminologies - the context and definitions of the concepts differ, even when the
words that describe them are the same. It is however appropriate to map concepts in a
terminology to classification systems.
When a "standard terminology" is first adopted in an environment where several
different terminologies are already used, it would be desirable (for continuity of data
management) to provide cross-mapping tables from the concepts in the old systems to
those in the new standard system. The cross-mapping tables would inevitably be
approximate, and need not be maintained. They would simply offer a bridge from the
old to the new.
Any attempt to map between terminologies should be made via a "standard reference
terminology". If not, the number of cross-reference tables increases dramatically as
the number of disparate terminologies is increased.
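The growth in the number of tables can be made concrete with a short calculation. It
assumes one two-way cross-reference table per pair of terminologies; if a separate table
were kept for each direction, the pairwise figure would double.

```python
def pairwise_tables(n: int) -> int:
    """Tables needed to map each of n terminologies directly to every other one."""
    return n * (n - 1) // 2


def via_reference_tables(n: int) -> int:
    """Tables needed when each of n terminologies maps only to one reference terminology."""
    return n


for n in (3, 5, 10):
    print(n, pairwise_tables(n), via_reference_tables(n))
```

For ten disparate terminologies this gives 45 direct cross-reference tables, against 10
when every terminology is mapped only to the reference terminology.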


2.7    What are the developments in natural language processing? E.g. is
       encoding of semi-structured pathology reports possible now eg for Cancer
       registry?
The short answer is "yes…"; however, there is room for improvement.
Natural language processing (NLP) should be considered in the context in which it is
applied. NLP might attempt to translate a book from one language to another, or it
might simply facilitate an index search. Currently the former would be unsatisfactory;
however, the latter has been applied, with success, for many years.
The term "semi-structured free-text" tends to apply to a free text entry in a
pre-defined field – such as "diagnosis" or "pathology test". The shorter and more
specific the entry, the more reliable will be the NLP.
NLP of full text clinical records is performed at the Mayo Clinic by Peter Elkin and
Russell Hamm. Their results appear impressive. In-house software is used. It has been
developed for research purposes over many years.
Several USA commercial firms analyse free text documents. They do so to "data
mine" clinical records. Their claimed accuracy is in the order of 90 to 96%. Results
are "coded" reports. The systems tend to have their own compilation of several
terminologies and classification systems, plus "learnt" terms and concepts.
The commercial NLP systems include…
   - Language & Computing (http://www.landCglobal.com)
   - CSI – Computer Science Innovations (http://www.csi-inc.com/CSI/welcome.html)

Heather Grain of the School of Public Health, La Trobe University, Australia has
some experience in the NLP of free text diagnostic phrases.
There are many well established NLP techniques. Some are described on pages 13-18
of the report of the "GP Vocabulary stage-2" in the document titled "2.05 Target
System Analysis using Term Matching Techniques".
The Unified Medical Language System of the National Library of Medicine (NLM)
offers many NLP tools and data files. The following notes about "normalisation" were
gleaned from (a) the UMLS-Users' e-mail messages (largely authored by Allen C.
Browne) and (b) the documentation of the UMLS Knowledge Resources. They offer
insight into aspects of NLP…


Normalization reduces each string to its morphologically uninflected form.
    o The uninflected form of a noun is its singular, e.g. "dog" and "dogs" are both normalized to "dog", the
         singular form. Likewise "geese" becomes "goose".
    o The uninflected form of a verb is its infinitive, e.g. "goes" and "went" are both normalized to the
         infinitive "go".
    o The uninflected form of an adjective or adverb is its positive, e.g. "reddish", "redder" and "red" are
         normalized to "red", the positive form.

If a word could be the inflection of more than one base form, then multiple uninflected forms are used. For
example, "found" could be the past tense of the verb "find" or the infinitive of the verb "found". So, "found" has
two normalized forms, "find" and "found".
Two published papers on the subject are available from the web-page at:
http://umlslex.nlm.nih.gov/Lexicon/SpecialistLexicon.html
      o Lexical Methods for Managing Variation in Biomedical Terminologies, by Alexa T. McCray, Suresh
          Srinivasan, Allen C. Browne
      o Evaluating Lexical Variant Generation to Improve Information Retrieval, by Guy Divita, Allen C.
          Browne, and Thomas Rindflesch. (in PDF format)

Lexical tools (including "normalization" tools) are provided on the UMLS CD in directory:
http://umlslex.nlm.nih.gov/lvg/2003/. Documentation for "normalization" is available via:
http://umlslex.nlm.gov/lvg/2003/docs/userDoc/index.html.
The UMLS CD contains a table called "MRXNW.ENG" which converts words to their uninflected form. The
UMLS describes generic algorithms that convert words not found in the extensive file.
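
The normalisation notes above can be illustrated with a small sketch: a lookup table that
may return several uninflected forms for one word, with a generic fallback for words not in
the table. The table below is hand-made for illustration and is not the UMLS data file, and
the plural-stripping rule is only a crude stand-in for the generic algorithms that the
UMLS describes.

```python
from __future__ import annotations

# A tiny hand-made lookup table for illustration only; it is not the UMLS data file.
UNINFLECTED: dict[str, list[str]] = {
    "dogs": ["dog"],
    "geese": ["goose"],
    "goes": ["go"],
    "went": ["go"],
    "redder": ["red"],
    "found": ["find", "found"],  # past tense of "find", or the verb "found"
}


def normalise_word(word: str) -> list[str]:
    """Return every uninflected form of a word, falling back to a crude rule."""
    key = word.lower()
    if key in UNINFLECTED:
        return UNINFLECTED[key]
    # Crude stand-in for a generic algorithm: strip a plural "s".
    if key.endswith("s") and not key.endswith("ss") and len(key) > 3:
        return [key[:-1]]
    return [key]


def normalise_phrase(phrase: str) -> list[list[str]]:
    """Normalise each word of a phrase; a word may yield several forms."""
    return [normalise_word(w) for w in phrase.split()]
```

A search index built on the normalised forms would match "dogs" against "dog", and would
need to try both "find" and "found" when the user types "found".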



Clinical records will need an NLP utility. The user should be able to enter a phrase and
the terminology server should return the appropriate concepts and relationships or an
appropriate short list of options. The user would thus check the results of NLP at data
entry. The objective is to increase speed, reduce tedium and increase accuracy
when entering clinical information.
A terminology server may provide the NLP described above. It is understood that
Health Language and Apelon do not offer NLP. They rely on external pre-processing
of text.
As NLP is complex, not currently provided, and will be required by
all systems that grapple with "post-coordination" of semi-structured free-text phrases,
it may be timely and wise to encourage the creation of an "open-system NLP module"
that could be used by many systems as a "plug-in". For this, Government funding (i.e.
investment) could be indicated.


2.8     Links with messaging must be kept in mind within the terminology
        architecture. Interface/integration with HL7 is an important support to
        agency integration.
HL7 would like a "standard terminology" to be used in messages. In the USA and
UK, SNOMED CT is the more-than-likely contender. CAP and SNOMED users are
working closely with various HL7 committees, as are those familiar with archetype
issues.
The terminology used within archetypes could be explored and harmonised with
"standard clinical terminology".
Medicines and device terminology tend to be, at least partially, country-specific.
Hence the USA and UK are developing their "local versions" that link to appropriate
concepts in SNOMED-CT. The evolving AMDT (Australian Medicine and Device
Terminology) is similar to the UK model.
HL7 messages allow for any coded concept to be included in a message. The concept
description, its code, the coding system identifier, and its version are all included in
the message.
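The four items listed above can be pictured as a simple record. This sketch is not HL7
message syntax; the class, the field names and the sample values are hypothetical and serve
only to show why the coding system identifier must travel with the code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedConcept:
    """The four items the text says accompany a coded concept in a message."""
    description: str
    code: str
    coding_system: str          # identifier of the terminology or classification
    coding_system_version: str


def same_concept(a: CodedConcept, b: CodedConcept) -> bool:
    """Two entries denote the same concept only if code AND coding system agree."""
    return (a.code, a.coding_system) == (b.code, b.coding_system)


# Hypothetical values for illustration only.
sent = CodedConcept("asthma", "1234", "EXAMPLE-TERMINOLOGY", "2004-07")
other = CodedConcept("asthma", "1234", "ANOTHER-SYSTEM", "3.1")
```

The same code string issued by two different coding systems does not identify the same
concept, so a receiver that ignored the coding system identifier could misread the message.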




2.9    Ability to support multilingual terminologies preferable. Information about
       recent Dutch experience/work in this area would be of interest; getting the
       syntax right is crucial.
Terminology contains "concepts". They are universal and are independent of their
description. They exist "in the mind's eye". Concepts are described by words that are
language dependent. Hence a concept can be displayed in any language. Current
computers are able to display almost all characters from all countries.
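The separation of a concept from its language-dependent description can be sketched as a
concept identifier with one description per language. The identifier and the fallback
behaviour are assumptions made for illustration, not features of any particular
terminology.

```python
from __future__ import annotations

# Hypothetical concept identifier with hand-written translations, for illustration only.
DESCRIPTIONS: dict[str, dict[str, str]] = {
    "C-0001": {"en": "heart", "nl": "hart", "fr": "cœur", "de": "Herz"},
}


def display(concept_id: str, language: str, fallback: str = "en") -> str:
    """Show a concept in the requested language, else in the fallback language."""
    by_language = DESCRIPTIONS[concept_id]
    return by_language.get(language, by_language[fallback])
```

Records store only the concept identifier; the language is chosen when the concept is
displayed.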
Problems arise when the health culture of a nation does not recognise a concept, or
recognises it in a different context – e.g. some nations consider "depression" to be a
"problem of the heart", and therefore in the cardiovascular area.
SNOMED is currently being translated into several European and Asian languages.


2.10 Need to focus on what can be realistically expected in the short term as well
     as keep an eye on future, such as natural language processing. But
     important not to 'bite off more than we can chew'.
The following are realistic and will eventually have to be bitten and chewed. The size
of the bite, and its digestibility, may be a function of vision, resolve and funding.
    1. Decide to follow the USA and UK (and possibly the EU) and accept
        "SNOMED" as an Australian national standard terminology.
    2. Move to improve SNOMED for Australian use.
             a. Reduce the pre-coordination of SNOMED CT – (1) Identify the
                essential SNOMED concepts from which all other clinical concepts
                can be created by post-coordination, and (2) convert the remaining
                pre-coordinated SNOMED-CT concepts into
                "predefined-post-coordinated-concepts". An additional "SNOMED-ET"
                (essential terminology) product might result.
             b. Harmonise the terminology used by archetypes with that of SNOMED
                – by linking and adding to SNOMED-CT. "SNOMED-AT" (archetype
                terminology) could become an additional product.
    3. Develop archetypes further so they can fulfil the role of a
        standard-interface-EHR-structure.
    4. Continue to support and adopt HL7 and IT14 standards.
    5. Explore in more detail the acquisition and/or construction of terminology
        related tools and related services.
    6. Construct links between some existing clinical terminologies and
        SNOMED-CT to enable a smooth transition to SNOMED (e.g. between DOCLE,
        ICPC2Plus, CATCH etc.).
    7. Construct any necessary Australian extensions to SNOMED-CT (e.g.
        medicines and devices).


2.11 Semantic interoperability is the key/main game, syntactic interoperability
     is an important facilitator of semantic interoperability.
Syntactic interoperability refers to the accurate physical transfer of items from A to B
and B to A. This is usually best done through a “standard interface” – be it a plug and
socket, message format, structure or middleware.



Semantic interoperability describes the accurate transfer of “meaning” between A and B
and vice versa. Hence a “concept” that exists and is processed and managed in A but
not in B cannot be fully transferred from A to B, although it may be physically
moved from A to B.
As an example, in order to fulfil the semantic interoperability requirements of EHR, it
is vital that a “single standard terminology”, common to all systems, is used in
health informatics. Each region, district and location may have “local extensions” of
the standard core terminology. The whole terminology would require constant
management to keep all users compatible with each other and with core system
updates, and therefore “semantically interoperable”.


2.12 Pre and Post coordination, how do tools manage these?
A “pre-coordinated” concept is one that is composed of several “atomic concepts”
but exists as a single concept (with a single identifier) within a terminology – e.g.
“fractured left femur”.
A “post-coordinated” concept is composed of several “atomic” or “primitive”
concepts that are joined together to represent the meaning of a phrase – e.g.
{[problem] “fracture” + [location of] “femur” + [laterality] “left”}.
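A minimal sketch of how the femur example might be held as data is given here. It assumes an expression is simply an unordered set of (role, atomic concept) pairs; the function names are hypothetical and the role labels are those used in the example, not SNOMED CT attribute names.

```python
def post_coordinate(**roles: str) -> frozenset:
    """Build a post-coordinated expression as an unordered set of
    (role, atomic concept) pairs."""
    return frozenset(roles.items())


def render(expression: frozenset) -> str:
    """Show the expression in the bracketed notation used in the text,
    with roles listed alphabetically."""
    parts = [f'[{role.replace("_", " ")}] "{value}"'
             for role, value in sorted(expression)]
    return "{" + " + ".join(parts) + "}"


expr = post_coordinate(problem="fracture", location_of="femur", laterality="left")
same = post_coordinate(laterality="left", problem="fracture", location_of="femur")

print(render(expr))
print(expr == same)  # the order in which the parts were supplied does not matter
```

Holding the parts as a set means two statements built in a different order compare as equal, which is the least a tool must do before the harder task of equating them with a pre-coordinated concept.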
There are two aspects related to the tools that manage these…
   1. Equating pre- and post-coordinated “coded” statements
   2. NLP of a description phrase and delivering suitably post-coordinated “coded”
       statements. This task has been described earlier in paragraph 2.7 on page 9.

Neither of these tasks is easy.
Fully equating pre- and post-coordination requires that both the pre-coordinated
concept and its equivalent post-coordinated form are “fully defined”. A concept is
defined by its hierarchies (i.e. “is_a” (or vertical) relationships) and its role (or lateral)
relationships.
The hierarchies and inherited defining properties of a pre-coordinated SNOMED CT
concept and those of its post-coordinated equivalent are shown below…
Using the SNOMED CT example phrase “Laparoscopic repair of inguinal hernia”,
the pre-coordinated short canonical definition is as follows…


        laparoscopic repair of inguinal hernia

           Fully defined by ...
             Is a
               procedure
             Approach
               procedural approach
             Direct morphology
               hernia
             Group
               Access
                 endoscopic approach
               Method
                 inspection - action
               Procedure site
                 peritoneal cavity structure
               Access instrument
                 laparoscope
             Group
               Method
                 inspection - action
               Procedure site
                 inguinal canal structure
             Group
               Method
                 repair - action
               Procedure site
                 inguinal canal structure


The post-coordinated composition is as follows…

        laparoscopic procedure + surgical repair + inguinal hernia

        The short canonical definitions are as follows…

        laparoscopic procedure
           Fully defined by ...
             Is a
               procedure
             Approach
               procedural approach
             Group
               Access
                 endoscopic approach
               Method
                 inspection - action
               Procedure site
                 peritoneal cavity structure
               Access instrument
                 laparoscope

        surgical repair
           Fully defined by ...
             Is a
               procedure
             Method
               repair - action

        inguinal hernia
           Primitive
             Is a
               inguinal hernia


A comparison of the defining attributes is shown in the table below.




laparoscopic repair of inguinal hernia                  laparoscopic procedure + surgical repair + inguinal hernia

                                                        laparoscopic procedure
Is a                    procedure                       Is a                         procedure
Approach                procedural approach             Approach                     procedural approach
Direct morphology       hernia                          Access                       endoscopic approach
Access                  endoscopic approach             Method                       inspection - action
Method                  inspection - action             Procedure site               peritoneal cavity structure
Procedure site          peritoneal cavity structure     Access instrument            laparoscope
Access instrument       laparoscope                     surgical repair
Method                  inspection - action             Is a                         procedure
Procedure site          inguinal canal structure        Method                       repair - action
Method                  repair - action                 inguinal hernia
Procedure site          inguinal canal structure        Is a = Direct morphology     inguinal hernia

Missing/differences                                     Missing/differences
Direct morphology       inguinal hernia                 Direct morphology            hernia
                                                        Procedure site               inguinal canal structure


From the above table, the two definitions differ only in the values “inguinal canal
structure”, “inguinal hernia”, and “hernia”. Otherwise, the pre- and post-coordinated
definitions are very close. It would appear that, as the concept “inguinal hernia” is
not yet fully defined in SNOMED CT, some “differences” between the two result.
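The comparison in the table above can be sketched as a set difference. This is a simplification that assumes each definition can be flattened to a set of (attribute, value) pairs, ignoring the role groups; a real description logic classifier does considerably more. The variable names are illustrative, and “inguinal hernia” is entered as a direct morphology, as in the table.

```python
# Defining attributes of the pre-coordinated concept, flattened from the table.
PRE_COORDINATED = {
    ("Is a", "procedure"),
    ("Approach", "procedural approach"),
    ("Direct morphology", "hernia"),
    ("Access", "endoscopic approach"),
    ("Method", "inspection - action"),
    ("Procedure site", "peritoneal cavity structure"),
    ("Access instrument", "laparoscope"),
    ("Procedure site", "inguinal canal structure"),
    ("Method", "repair - action"),
}

# Defining attributes of the three post-coordinated components.
LAPAROSCOPIC_PROCEDURE = {
    ("Is a", "procedure"),
    ("Approach", "procedural approach"),
    ("Access", "endoscopic approach"),
    ("Method", "inspection - action"),
    ("Procedure site", "peritoneal cavity structure"),
    ("Access instrument", "laparoscope"),
}
SURGICAL_REPAIR = {
    ("Is a", "procedure"),
    ("Method", "repair - action"),
}
# The primitive concept contributes itself as the morphology (see the table).
INGUINAL_HERNIA = {("Direct morphology", "inguinal hernia")}

post_coordinated = LAPAROSCOPIC_PROCEDURE | SURGICAL_REPAIR | INGUINAL_HERNIA

missing_from_post = PRE_COORDINATED - post_coordinated
missing_from_pre = post_coordinated - PRE_COORDINATED

print(sorted(missing_from_post))
print(sorted(missing_from_pre))
```

The two difference sets reproduce the “Missing/differences” rows of the table, and both would be empty if the two forms were defined identically.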
Satisfactory equation of pre- and post-coordinated concepts has not yet been achieved;
however, the description logic mathematics has been proposed. Fully defining the
appropriate SNOMED-CT concepts is a necessary first task. Currently only about
10-15% of SNOMED-CT is fully defined.
An alternative approach could be to reduce the size and complexity of SNOMED CT
by removing many of its pre-coordinated concepts, and leaving the “essential atomic
concepts” (“Essential SNOMED” or “SNOMED ET” might result). These could then
be used to post-coordinate the almost infinite number of things encountered in
healthcare. The rejected pre-coordinated SNOMED CT concepts could be converted
to “pre-defined post-coordinated concepts”. Using the above “essential SNOMED” in
a post-coordinated world could reduce the problem of equating pre- and
post-coordination, as most statements would be post-coordinated.
The tool most used to fully define concepts and place them in appropriate hierarchies
is Apelon. The expert knowledge required to complete the SNOMED CT
“definition-task” is the problem.
The PAT tool could be used (perhaps with some modification) to construct an atomic
SNOMED; however, the cooperation of SNOMED would be necessary.
Attached is a paper titled “Essential SNOMED” or “SNOMED ET”. It expands on
this thinking (see paragraph 4.5 on page 29).
An experience along the above lines is described in an e-mail message from Tad
McKeon of St Jude. It is included in the appendix – see paragraph 3.2 on page 27.




2.13 Information about the general state of the market is required. Currently
     the market is quite small for terminology development tools but large for
     terminology implementation tools. Do we buy development tools or build
     ourselves?
The terminology tools fall into three main groups:
   1. In-house tools developed by researchers and enthusiasts – e.g. PAT
   2. Open source tools – of which Protégé is the prime example
   3. Commercial tools – of which there are only two that attempt to provide
       authoring/modelling facilities. They are: Health Language and Apelon.

Health Language was reviewed in detail for the DHAC in August 2001 by Don
Walker. The paper was titled “Cyber+LE and its components LExScape and
LExIndex (Health Language Inc. USA)”. The only change of note is its ability to act
as a terminology server on the UK data spine and associated networks. A copy of the
report is attached – see paragraph 4.3 on page 29.
Apelon is a large, complex and expensive tool designed for major terminology
developers like CAP, Kaiser Permanente, USA Veterans Affairs, and the UK
SNOMED CT centres. It requires a full-time programmer and extensive training.

Do we buy development tools or build them ourselves?
This question raises others…
    Tools to do what?
    Who would use them?
    How often would they be used?
    Would available tools be suitable?
    Could available tools be modified to suit?
    Who might build and who might maintain the tools?
    Who would own the tools?
    What would be the motivation for responsibility?
    What are the options – e.g. build, buy, contract, etc.?
    What would be the outcome differences and risks between the various
       options?

If the task is specialised, complex and mission critical (e.g. a national terminology
server), an established, tested and proven commercial product could be the best
choice. Hence it may be prudent to wait and see how Health Language performs in the UK.

If a solution is difficult, challenging, non-critical and required by many, it may well
be appropriate to build local tools for others to share. An example could be NLP tools
that facilitate “coding” of post-coordinated clinical phrases.

In-house tasks often require in-house tools. Tools to manage CATCH could be
developed in-house by the NCCH as the task is relatively small and straightforward.

A web-based open-source version of the PAT could be built; however, it would be
expensive, used by very few, and might lack “ownership”. Building on Protégé might
be a better approach.



2.14 Interest in latest developments in the relationship between interface and
     reference terminologies.
The “reference terminology” most mentioned during the recent USA visit was
SNOMED RT; however, it seemed to be a thing of the past now that SNOMED CT
has taken centre stage. This, no doubt, is because SNOMED CT is regarded as both a
reference and interface terminology, has been nominated by UK Health as their
“standard”, and has been made available free of charge in the USA.
Those who use SNOMED CT for natural language processing add considerably to it,
largely by increasing the interface terms. At the Mayo Clinic (Peter Elkin) SNOMED
CT doubles in size after it is enhanced and made fully functional for NLP.
For interface and efficiency reasons, it is recommended (Kent Spackman) that any
user of SNOMED CT creates or uses “subsets” of it, as “nobody needs all of
SNOMED CT”.


2.15 AIHW is also interested in functional requirements for Meteor (on-line
     metadata registry) in a terminology environment. Main focus is on linkage
     between terminologies and classifications.
This topic was not encountered during the recent USA visit. However, SNOMED and
HL7 have committees involved in maps between SNOMED CT and classification
systems. The SNOMED GP Users Group is in the process of submitting to CAP a
request that SNOMED CT be mapped to ICPC2. Maps to ICD9 and ICD10 have been
worked on for some time by CAP Groups (ICD9 in the USA and ICD10 in the UK -
Rosemary Roberts (NCCH Australia) is involved in the latter).


2.16 Where is SNOMED up to in general? Domain coverage, localisation?
     Uptake- who is using where? If others have chosen other paths, what are
     they? What impact might that have on Australia?


Domain coverage:
The concepts in SNOMED CT cover most of healthcare. Deficient areas include
medicines and devices. Many interface terms need to be added. Nevertheless, it is
by far the most detailed and comprehensive terminology.
IOTA (International Organisation for Terminology in Anaesthesia) is enhancing the
SNOMED anaesthetic domain. There are other working groups (see:
http://www.snomed.org/clinical/workinggroups.html).


Localisation:
As medicines and devices have many regional and local versions, it has been agreed
that countries establish their own “extension” terminologies that link to SNOMED
CT concepts. SNOMED CT has been designed to manage local extensions. CAP
recommends the addition of interface terms, the creation of subsets and clinically
relevant hierarchies. CAP provides an excellent “subset tool” for this purpose.




Uptake:
The uptake of SNOMED CT will be dramatic as it will be the standard used in the
UK. In the USA Kaiser Permanente have made it their standard. The impression from
the USA was that SNOMED CT will become the de facto USA standard. The EU are
in the process of deciding about SNOMED CT - it has been recommended (with some
ownership and management provisos). The uptake of SNOMED CT, of course, will
be related to the penetration of electronic health records.
Translation of SNOMED CT to several languages has been achieved. More are
underway (e.g. French and Japanese scholars involved in the process were
encountered at the meetings). If the EU nominates SNOMED CT it will be widely
translated.
Alternatives:
Serious alternatives to SNOMED CT (as the “terminology for the whole of
healthcare”) do not seem to exist. Some work has been done in Germany to use
ICD10 as the basis for a terminology. However, an interface to a classification seems
to be their objective.
It is generally agreed that SNOMED CT could be improved, and in fact needs to be.
Its use in the UK could well accelerate improvements.
Impact on Australia:
The position that SNOMED CT is taking in the world should make the choice for
Australia easy. However, “when?”, rather than “what?”, may be the better question.


2.17 What has happened in the US in year since purchase of national SNOMED
     licence? Has purchase of licence ‘opened the floodgates’ for use? Extent of
     growth in use of SNOMED CT by hospitals and other key health care
     providers?
There seems to be a consensus that SNOMED CT is the standard terminology to use
in the USA. However, conversion to it will be expensive and therefore slow.
Organisations with minimal EHRs will have the easier task.
In the UK the “floodgates” are certainly opened and the government “pumps” are
working.
The implementation of SNOMED CT is trivial compared to the establishment of
electronic health records in a previously non-electronic site.


2.18 Interest in governance and intellectual property issues, especially any
     moves toward internationalisation of control of SNOMED-CT.
A private meeting with Bernd Blobel (HL7 Germany; Professor of Informatics at the
University of Magdeburg), author of an EU report on the recommended healthcare
terminology for the EU, revealed that “SNOMED CT” was recommended. However, it
was also recommended that the chosen terminology should be “owned” and “managed”
by a “public organisation”. The National Library of Medicine (in Bethesda, near
Washington DC) was mentioned. When the report is published, comments will be
sought. Australia may then be able to contribute to the topic.



2.19 What is the uptake by software vendors on international scale?
All USA software developers encountered assume SNOMED CT will be used. They
are “gearing up for it”. In the UK contracts are dictating progress.
Simple systems can use SNOMED CT in simple ways. A subset of SNOMED CT that
has had its hierarchies re-arranged so they are clinically appropriate need be no more
difficult to implement than any existing terminology. The advantage would be that its
concepts would be compatible with those in other systems, some of which may be the
most complex.
Kaiser Permanente use relatively small subsets of SNOMED CT.
It is only when attempts are made to fully harness the potential power of SNOMED
that the task becomes challenging.
SNOMED CT is not ready to use “out of the box” (Kent Spackman).
Hospital systems that do not have EHR, and require ICD9 plus DRGs for
management, have the challenge of establishing EHRs rather than the uptake of a
terminology.
The uptake of SNOMED elsewhere in the world (i.e. other than USA and UK) is
virtually non-existent (apart from its traditional pathology role). Denmark has just
negotiated a national licence. All EU attendees spoken to seemed to acknowledge that
there was no real alternative to SNOMED CT. Nevertheless, they would “wait and
see”. Cost, ownership, governance, and need were the main issues mentioned. “The
cost of doing something, rather than nothing” was an interesting lunchtime discussion.


2.20 What is the use of terminology services by software vendors, or are they
     using internal solutions?
A “terminology service” implies a maintenance infrastructure that would probably
publish updates on the web. A “terminology server” (a software application like
Health Language) would receive updates in the field and integrate them with existing
terminology and local extensions. The EHR software interacts with the terminology
server via an API (application programming interface) when a concept is sought.
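The division of labour described above can be sketched as a toy terminology server. This is not the API of Health Language, Apelon or any other product; the class, its method names and the concept identifiers (“C1”, “L1”, etc.) are all hypothetical, and a real server would hold far richer content than an identifier and a term.

```python
class TerminologyServer:
    """Toy server holding a core terminology plus local extensions."""

    def __init__(self, core, local=None):
        self.core = dict(core)            # concept id -> preferred term
        self.local = dict(local or {})    # local extension id -> term

    def find(self, text):
        """Return (id, term) pairs whose term contains the text, ignoring case."""
        needle = text.lower()
        merged = {**self.core, **self.local}
        return sorted((cid, term) for cid, term in merged.items()
                      if needle in term.lower())

    def apply_update(self, new_core):
        """Install an updated core and report local ids that now clash with it."""
        conflicts = sorted(set(new_core) & set(self.local))
        self.core = dict(new_core)
        return conflicts


# The EHR software would call the server whenever a concept is sought.
server = TerminologyServer(
    core={"C1": "fracture of femur", "C2": "inguinal hernia"},
    local={"L1": "fracture clinic review"},
)
print(server.find("fracture"))
print(server.apply_update({"C1": "fracture of femur", "C3": "hernia"}))
```

The point of the design is that the EHR software only ever calls `find`; receiving an update and reconciling it with local extensions happens inside the server.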
Terminologies must be maintained and regularly updated. This is not an easy task.
Terminologies are complex and available expertise is scant. Local extensions to a
terminology are required. These must be managed with each update. Most software
vendors are not equipped to manage all these things. New generation systems that
could afford it would (if it existed) opt for a “terminology service”. Health
Language has the lion’s share of this market at present, including National Health
Services Information Authority’s National electronic Library for Health (NeLH) (i.e.
UK Health); Amicore, Inc.; Cerner Corporation; HealthTrio; iSOFT; McKesson
Information Solutions; MICROMEDEX; Misys; NewChurch Ltd.; Per-Se
Technologies.
The issues of (a) searching tools, (b) coping with natural language, and (c) managing
post-coordination, need to be examined with care. These tasks may “fall between” the
terminology server and the EMR applications.




Systems can manage without terminology services, as they have in the past. However,
the size and complexity of a comprehensive terminology, and the need to standardise
and maintain terminology across the whole health sector, make the task formidable.


2.21 Are there tools which manage the 3 levels of— term to classification to
     grouping (eg DRG)?
Yes…
Any advanced terminology modelling/authoring environment could be set up to
manage terminologies, classifications and DRGs – e.g. Apelon, HL, Protégé, or PAT.


2.22 Important to consider the customer interface not just developer/
     maintainer interface— Are there tools that could support a process of
     submission of new terms by users? Web based?
Yes…
The infrastructure implied by a “terminology service” involves the creation of local-
extensions in the field, some of which may become requests for regional-extensions.
Likewise some may go on to be national-extensions. Some may progress to be
included internationally. The status of new and old terms and concepts would be
managed by the service. It would be backed by expert committees.
The tools that support the above process are a terminology-server (in the field), world
wide web (for reporting, requesting and updating), and terminology
authoring/modelling tools (for use by regional and national authorities).


2.23 What are the costs and licensing options for tools such as Health Language?
The cost and licence options for Health Language are a matter for discussion and
negotiation with HL.
The “conversation cost” sounded as though both HL and Apelon could be in the order
of US$50,000 per single workstation per year.


2.24 How would tools be used if we had them? Need to keep in mind there may
     be many users with different purposes. What is the capacity of these tools?
     What would be the best licensing arrangements for Australia?
Firstly, a setting needs to be proposed for this question. Let us consider that…


Scenario A
   (1) a national terminology infrastructure is provided in the form of a “terminology
       service” that can manage SNOMED CT, several classifications and their maps
       to SNOMED CT, and links to other “legacy” terminologies;
   (2) EHRs in clinics and hospitals use SNOMED CT and report using various
       classification systems;
   (3) the EHR software uses a “terminology server” that is updated via the web;
   (4) some systems have a structure that is archetype based.


The terminology functions and tools that might be involved in the above setting
include…

In the clinic…

Significant Functions                       Tool examples            Comments

Find a pre-coordinated concept from a       Simple NLP Tools to      These could be (1) a new entity, (2) part of the
phrase                                      process query            EMR or (3) part of the terminology server.
                                                                     Note: HL & Apelon have no NLP ability of note
                                            Terminology server       The server searches for and returns the most
                                            (e.g. HL, Apelon)        likely concept

Find the post-coordinated concepts and      Clever NLP Tools to      These could be (1) a new entity, (2) part of the
their relationships relevant to a phrase    process query            EMR or (3) part of the terminology server.
                                                                     Note: HL & Apelon have no NLP ability of note
                                            Terminology server       The server searches for and returns the most
                                            (e.g. HL, Apelon)        likely concepts and their relationships

Determine if a concept is an ancestor or    Terminology server       Returns the answer
descendant of a concept mentioned in a      (e.g. HL, Apelon)
decision support rule

Provide a list of all descendants of a      Terminology server       Returns a list of concepts
concept that could be used in complex       (e.g. HL, Apelon)
record searching and analysis

Add a local term to the terminology         Terminology server       Local extension terms are supported by
                                            (e.g. HL, Apelon)        SNOMED CT identifiers & HL & Apelon

Add a local concept to the terminology      Terminology server       Local extension concepts are supported by
                                            (e.g. HL, Apelon)        SNOMED CT identifiers & HL & Apelon

Create a local subset and clinical          Subset editor e.g.       SNOMED “subset tool” is excellent. It creates
hierarchy                                   SNOMED tool (by          user hierarchies and lists.
                                            David Markwell)          HL offers only “user-lists” not “user hierarchies”

Send proposals for additional terms and     Web technology +         Proposals from the field are sent to regional
concepts for regional use                   terminology server       centres
                                            (e.g. HL, Apelon)

Receive updated terminology and             Web technology +         The latest versions of the terminologies or
classifications                             terminology server       classifications arrive in the field
                                            (e.g. HL, Apelon)

Integrate updates with existing local       Web technology +         Latest versions are integrated with existing
terminologies                               terminology server       versions. Conflicts are presented for resolution.
                                            (e.g. HL, Apelon)
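The ancestor and descendant functions in the table above can be sketched over a set of “is_a” links. The hierarchy below is invented for illustration and is not SNOMED CT content; the function names are hypothetical, and a production terminology server would normally precompute these relationships rather than walk the hierarchy on each request.

```python
# Each concept maps to its direct "is_a" parents (invented toy hierarchy).
PARENTS = {
    "fracture of femur": ["fracture of lower limb"],
    "fracture of lower limb": ["fracture"],
    "fracture": ["disorder"],
    "inguinal hernia": ["hernia"],
    "hernia": ["disorder"],
}


def ancestors(concept, parents=PARENTS):
    """Return every concept reachable by following is_a links upwards."""
    seen = set()
    stack = [concept]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_descendant_of(concept, candidate_ancestor, parents=PARENTS):
    """Answer the decision-support question: does the rule's concept subsume this one?"""
    return candidate_ancestor in ancestors(concept, parents)


def descendants(concept, parents=PARENTS):
    """List all concepts below the given one, e.g. for record searching."""
    return {c for c in parents if concept in ancestors(c, parents)}


print(is_descendant_of("fracture of femur", "fracture"))
print(sorted(descendants("fracture")))
```

A decision support rule written against “fracture” would then fire for a record coded “fracture of femur”, and a search for “fracture” can be expanded to every descendant before the records are scanned.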



In the terminology centres…

Significant Functions                         Tool examples          Comments
                                              Web technology +
                                                                     Term and concept proposals from the field are
Receive requests                              terminology server
                                                                     received at the centre
                                              (e.g. HL, Apelon)
                                                                     Adequate searching tools reduce duplication
                                              Searching tools
Find existing terms and concepts                                     HL seems to have only rudimentary searching
                                                                     tools
                                                                     Concepts are added and their defining
                                              e.g. Apelon, HL or
Model concepts                                                       relationships are constructed. Local extension
                                              Protégé
                                                                     IDs are managed.




Don Walker (October 2004)                                                                                          20
                                      Terminology Tools Report – an overview



Significant Functions                          Tool examples            Comments
                                               e.g. Apelon, HL or       Interface terms are added. Local extension IDs
Add terms
                                               Protégé                  are managed.
Construct hierarchies                          e.g. Apelon, HL or       Hierarchies partly define concepts. Local
                                               Protégé                  extension IDs are managed.

Create non-hierarchical relationships          e.g. Apelon, HL or       Non-hierarchical relationships partly define
                                               Protégé                  concepts.

Check for redundancy by using defining         e.g. Apelon              Description logic is a feature of Apelon and not
relationships                                  Not HL or Protégé        HL or Protégé.

Construct subsets and browser                  SNOMED CT Subset         Clinically relevant hierarchies (i.e. browser
hierarchies                                    tools                    hierarchies) differ from those created for
                                                                        description logic, which are the hierarchies
                                                                        supplied with SNOMED CT.

Send proposals for additional terms and        Web technology           Nationalisation and internationalisation of terms
concepts for national and international                                 and concepts.
use

Receive updated terminology and                Web technology           The latest updates are received.
classifications from international centres

Integrate updates with existing regional       Web technology +         Updates are merged with existing regional
terminologies                                  terminology server       extensions.
                                               (e.g. HL, Apelon)

Send updates to field terminology              Web technology +         Distribute the latest regional data.
servers                                        terminology server
                                               (e.g. HL, Apelon)
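
The "integrate updates with existing regional terminologies" step in the table above can be illustrated with a minimal Python sketch. It is not how HL or Apelon implement the step. The Concept record, the concept IDs, the sample terms and the merge policy (the update is treated as a full international release, and local extension concepts are never overwritten) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    concept_id: str         # identifier (the format here is hypothetical)
    term: str               # preferred term
    is_local: bool = False  # True for regional/local extension concepts


def integrate_update(regional, update):
    """Merge an international update into a regional terminology.

    Both arguments map concept_id -> Concept. Assumed policy: the update
    is a full international release, so international content is taken
    from it, while local extension concepts are always kept.
    Returns (merged, conflicts); conflicts lists local IDs that the
    update also tries to define and that would need manual review.
    """
    # Start from the local extensions only.
    merged = {cid: c for cid, c in regional.items() if c.is_local}
    conflicts = sorted(cid for cid in update if cid in merged)
    for cid, concept in update.items():
        if cid not in merged:  # never overwrite a local extension
            merged[cid] = concept
    return merged, conflicts


# Made-up sample data
regional = {
    "INT-1": Concept("INT-1", "Fracture of arm"),
    "LOC-1": Concept("LOC-1", "Bluebottle sting", is_local=True),
}
update = {
    "INT-1": Concept("INT-1", "Fracture of upper limb"),
    "INT-2": Concept("INT-2", "Fracture of femur"),
}
merged, conflicts = integrate_update(regional, update)
```

In this sketch the merged result holds the two international concepts from the update plus the untouched local concept, and the conflict list is empty. A real terminology server would also need to handle retired concepts, relationships and version history.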




Scenario B

Reference the concepts in semi-structured text to SNOMED CT concepts, for example in
the cascading questions of the Advanced Incident Management System (AIMS) of the
Australian Patient Safety Foundation (APSF). The functions involved include:

Significant Functions                        Tool examples                        Comments

Manual parsing of text phrases to            Manual Parser from University of     Atomic concepts and semantic
create a “vocabulary” of terms               Adelaide                             relationships are identified

Link vocabulary terms to a reference         PAT, Protégé, HL, Apelon             A simple 1-to-1 mapping
terminology (e.g. SNOMED CT)

Add local-extension concepts and             PAT, Protégé, HL, Apelon             New terms are added and new
terms to the reference terminology                                                concepts modelled

Machine parse and link text phrases –        Mayo NLP Tools; Language &           For quality control, some machine
a research exercise                          Computing (L&C); Computer            results must be compared with the
                                             Science Innovations (CSI)            manual results
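
The second and third functions in the table above (a simple 1-to-1 link to a reference terminology, with local-extension concepts for terms that have no match) can be sketched in Python as follows. This is an illustration only and does not describe PAT, Protégé, HL or Apelon. The function name, the "LOCAL-" prefix, the "REF-" identifiers and the case-insensitive exact matching are assumptions.

```python
def link_vocabulary(vocabulary, reference, local_prefix="LOCAL-"):
    """Link each vocabulary term to one reference concept (1-to-1).

    vocabulary: iterable of term strings from the manual parsing step.
    reference:  dict mapping a reference term to its concept ID.
    Terms with no match receive a new local-extension ID so that they
    can be modelled later. Matching is a case-insensitive exact match.
    """
    index = {term.lower(): cid for term, cid in reference.items()}
    links = {}
    next_local = 1
    for term in vocabulary:
        key = term.strip().lower()
        if key in links:      # term already linked
            continue
        if key in index:      # found in the reference terminology
            links[key] = index[key]
        else:                 # no match: allocate a local extension ID
            links[key] = f"{local_prefix}{next_local}"
            next_local += 1
    return links


# Made-up reference terms and IDs
reference = {"Asthma": "REF-001", "Cough": "REF-002"}
links = link_vocabulary(["asthma", "Cough", "wheezy chest"], reference)
```

Here "asthma" and "cough" are linked to the made-up reference IDs, and "wheezy chest" receives the local ID "LOCAL-1". Real linking would normally need synonym handling and human review rather than exact string matching.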




From the above table, the following tools come to mind:

Tool               Source                  Type and comment

Aust Parsing       University of           Manual parsing tool that helps in the creation of vocabulary term
Tool               Adelaide                lists from free-text phrases. It was purpose built for the “GP
                                           Vocabulary stage-1” project.

Mayo NLP           Mayo Clinic, USA        NLP tools developed in-house to machine “code” clinical records
Tools                                      for data-mining purposes.

Language &         USA Corporation         NLP tools developed in-house to machine “code” free text
Computing                                  documents for data-mining purposes.
(L&C)

Computer           USA Corporation         NLP tools developed in-house to machine “code” free text
Science                                    documents for data-mining purposes.
Innovations
(CSI)

Apelon             USA Corporation         Terminology and classification modelling/authoring tool and
                                           web-based terminology maintenance tool. It has very sophisticated
                                           description logic and machine classification tools.

Health             USA Corporation         Modelling tool and terminology service tool. It has limited
Language                                   modelling, mapping, and searching ability. It is strong on
                                           web-based terminology updating and integration of local extensions.

Protégé            Stanford                Knowledge modelling tool able to hold multiple terminologies and
                   University, USA         their relationships. Data is held in RAM. Very large terminologies
                                           may use all available memory.

PAT                University of           Terminology and classification modelling/authoring tool with a
                   Adelaide, Australia     “list browser” and linking/mapping tools. It is designed for
                                           “in-house” use.

SNOMED Clue        NetClue                 A purpose built browser for SNOMED CT. It has no editing
Browser            Corporation, UK         functions. It provides an excellent display of SNOMED CT
                   (David Markwell)        features.

SNOMED             NetClue                 A purpose built subset generating tool for SNOMED CT. It can
Subset tool        Corporation, UK         create user defined lists of concepts and user defined hierarchies
                   (David Markwell)        that present SNOMED CT in a simpler and more clinically
                                           relevant way.

Other              Mostly in-house         A variety of applications developed in-house, including
specialised        application             poly-browsers, NLP, syntax for rules, etc.
tools








Tool status and availability




                                              Aust     Mayo     L&C      CSI      Apelon   Health   Protégé  PAT      SNOMED   SNOMED   Other
                                              Parsing  NLP                                 Language                   Clue     Subset   specialised
Status and Availability                       Tool     Tools                                                          Browser  tool     tools

Commercial                                    -        -        x        x        x        x        -        -        x        x        -
Open-source                                   -        -        -        -        -        -        x        -        -        -        -
Public ownership                              x        -        -        -        -        -        -        -        -        -        -
In-house product                              x        x        x        x        -        -        x        x        -        -        x
Licence fee large                             -        -        x        x        x        x        -        -        -        -        -
Licence fee minimal or free                   x        -        -        -        -        -        x        x        x        x        -
Requested enhancements provided at cost       x        -        -        -        -        -        -        x        -        -        -
Requires a trained programmer to import       x        x        x        x        x        x        x        x        x        -        x
new data and “set-up” ready for use
Requires a trained programmer to operate      -        x        x        x        x        -        -        -        -        -        x

(x = marked for that tool; - = not marked)




A closer, though still cursory, look at some of the significant functions of the tools
is shown in the table below.
                                              Aust     Mayo     L&C      CSI      Apelon   Health   Protégé  PAT      SNOMED   SNOMED   Other
                                              Parsing  NLP                                 Language                   Clue     Subset   specialised
Significant functions                         Tool     Tools                                                          Browser  tool     tools

Manually parse phrases                        x        -        -        -        -        -        -        -        -        -        -
Machine parse phrases                         -        x        x        x        -        -        -        -        -        -        x
Natural Language Process single concepts      -        x        x        x        -        -        -        x        -        -        -
Natural Language Process & machine            -        x        x        x        -        -        -        -        -        -        x
parse multiple concept phrases
Search and return pre-coordinated             -        x        x        x        x        x        x        x        x        -        -
concepts
Search and return post-coordinated            -        x        x        x        -        -        -        -        -        -        x
concepts
Data mine free text                           -        x        x        x        -        -        -        -        -        -        -
Remember user-lists of terms                  -        -        -        -        -        -        -        x        -        -        -
Remember user lists of concepts               -        -        -        -        x        x        x        x        -        x        -
Create user defined hierarchies of            -        -        -        -        -        -        -        -        -        x        -
concepts
Model local and regional additions            -        -        -        -        x        x        x        x        -        -        -
Model via web                                 -        -        -        -        x        x        x        -        -        -        -
Define by relationships                       -        -        -        -        x        x        x        x        -        -        -
Apply description logic                       -        -        -        -        x        -        -        -        -        -        -
Infer hierarchies                             -        -        -        -        x        -        -        -        -        -        -
Poly-browser - hold and display multiple      -        x        x        x        x        x        x        x        -        -        -
systems (terminologies and
classifications)
Provide a “List browser” for a “list of       -        -        -        -        -        -        -        x        -        -        -
terms” – e.g. “vocabulary terms”
Link list-records to Terms                    -        -        -        -        -        -        -        x        -        -        -
Link list-records to Concepts                 -        -        -        -        -        -        -        x        -        -        -
Link terms between systems                    -        -        -        -        -        -        -        x        -        -        -
Map concepts between systems (1-to-1)         -        -        -        -        x        x        x        x        -        -        -
Map concepts between systems (1-to-many)      -        -        -        -        x        -        -        x        -        -        -
Map concepts between systems (many-to-many)   -        -        -        -        x        -        -        x        -        -        -
Provide a “standard language” for             -        -        -        -        -        -        -        -        -        -        x
mapping-rules
Send and receive terminology and              -        -        -        -        x        x        x        -        -        -        -
classification data via web
Integrate updated data with local             -        -        -        -        -        x        -        -        -        -        -
extensions
Import all data by user and be fully          x        -        -        -        -        -        x        -        -        -        -
functional
Export some data by user                      x        -        -        -        x        x        x        x        x        x        -
Export any and all data by user               x        -        -        -        -        -        x        x        -        -        -

(x = marked for that tool; - = not marked)
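
The table above distinguishes 1-to-1, 1-to-many and many-to-many mapping between systems. As a minimal sketch of what those categories mean for a set of map records, the Python below labels each (source, target) pair by its cardinality. It is not taken from any of the tools listed. The function name, the concept codes and the extra "many-to-1" label are assumptions made for illustration.

```python
from collections import defaultdict


def classify_mappings(pairs):
    """Label each (source, target) pair by the cardinality of its mapping.

    pairs: iterable of (source_concept, target_concept) tuples that link
    two terminologies. Returns a dict mapping each pair to "1-to-1",
    "1-to-many", "many-to-1" or "many-to-many".
    """
    pairs = set(pairs)
    targets_of = defaultdict(set)  # source -> all targets it maps to
    sources_of = defaultdict(set)  # target -> all sources mapped to it
    for src, tgt in pairs:
        targets_of[src].add(tgt)
        sources_of[tgt].add(src)

    labels = {}
    for src, tgt in pairs:
        many_targets = len(targets_of[src]) > 1
        many_sources = len(sources_of[tgt]) > 1
        if many_targets and many_sources:
            labels[(src, tgt)] = "many-to-many"
        elif many_targets:
            labels[(src, tgt)] = "1-to-many"
        elif many_sources:
            labels[(src, tgt)] = "many-to-1"
        else:
            labels[(src, tgt)] = "1-to-1"
    return labels


# Made-up concept codes in two hypothetical systems A and B
pairs = [("A1", "B1"), ("A2", "B2"), ("A2", "B3"), ("A3", "B4"), ("A4", "B4")]
labels = classify_mappings(pairs)
```

A tool that supports only 1-to-1 mapping could store the first pair but not the others. The labelling is done per pair, which is a simplification: a fuller treatment would examine whole groups of connected concepts.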




What would be the best licensing arrangements for Australia?
(Assumption: the question concerns a “licence for tools” and not a licence for
SNOMED CT.)
Licence fees are a significant cost for Health Language, Apelon, and possibly some
specialised NLP software. As the PAT is an in-house tool, the only fees for the PAT
are those that cover its costs.
Paying for an expensive licence for a tool that is not used, or need not be used, seems
wasteful. On the other hand, if a tool will be needed, there may be some value in
acquiring skills and familiarity with it beforehand.
Terminology centres and terminology researchers need in-house tools and
programming skills regardless of the special terminology tools that may be licensed.
“The best licensing arrangement for Australia” would surely be the lowest fee
that offers an adequate, reliable and responsive service for tasks that could not readily
be performed without the resource in question.
In “Scenario A” depicted earlier, a mission-critical terminology service would need
professional terminology servers (e.g. HL or Apelon). Without established standards
for terminology structure and transmission, a single-brand terminology server system
could avoid incompatibility problems. Hence the UK has chosen HL for all
terminology servers.

An “archetype service” might also be employed. This is, however, outside the scope of
this report.








3     Appendix
3.1    Teleconference Notes


3.1.1 Briefing notes for Australian participants in SNOMED CT User Group
      meeting and HL7 meeting – US, September 04
The Department of Health and Ageing and the AIHW are supporting the attendance of
Dr Don Walker, Dr Peter MacIsaac and Karen Malam (SNOMED CT meeting only).
The intention is that such attendance contributes to the Australian effort to reach
well-informed decisions on ways forward in the development and implementation of
terminologies. In particular, Dr Walker is focusing on the application of terminology
development, maintenance and implementation tools.


3.1.2 Notes from teleconference Friday 17th September
Participants included members of the (ex) CTWG, DoHA, States and Territories,
NCCH and AIHW.
The following dot points outline the issues that teleconference participants identified
as national concerns that would benefit from some exploration and feedback
following the upcoming SNOMED CT User Group meeting and HL7 meeting in the
USA.
The focus of discussion was on identifying current priority issues related to
terminology toolsets for development, maintenance and implementation. Broader
issues related to terminologies were also discussed and are noted below to inform
those attending these international meetings of matters of current concern in Australia.
     •  Availability of tools is a priority for CATCH if it is to be implemented within
        12 months or so; maintenance is a particular concern and tools that are
        responsive to changes/updates are crucial. Developments in web-based access
        tools are of particular interest.
     •  Terminology management tools need to be able to handle more than one
        terminology. Tools need to support a broad architecture of terminology and
        classification. Mapping between terminologies is crucial. The quality of the
        user interface and general usability by dispersed and varied users are very
        important.
     •  Tools need to deal with both terminologies and classifications and the link
        between the two.
     •  Maintenance and responsiveness are critical – web service models.
     •  Should we expect one tool to serve development, maintenance and
        dissemination requirements or do we need a different tool for each? If multiple
        tools are needed, what import and export functions are available? Access and
        version controls are particularly important.
     •  Ideal to minimise the number of terminologies, as the effort of mapping
        increases with the number of different terminologies.
     •  What are the developments in natural language processing? E.g. is encoding of
        semi-structured pathology reports possible now, e.g. for a cancer registry?
     •  Links with messaging must be kept in mind within the terminology
        architecture. Interface/integration with HL7 is an important support to agency
        integration.
     •  Ability to support multilingual terminologies preferable. Information about
        recent Dutch experience/work in this area would be of interest; getting the
        syntax right is crucial.
     •  Need to focus on what can be realistically expected in the short term as well as
        keep an eye on the future, such as natural language processing. But important
        not to ‘bite off more than we can chew’.
     •  Semantic interoperability is the key/main game; syntactic interoperability is an
        important facilitator of semantic interoperability.
     •  Pre- and post-coordination: how do tools manage these?
     •  Information about the general state of the market is required. Currently the
        market is quite small for terminology development tools but large for
        terminology implementation tools. Do we buy development tools or build
        them ourselves?
     •  Interest in latest developments in the relationship between interface and
        reference terminologies.
     •  AIHW is also interested in functional requirements for Meteor (on-line
        metadata registry) in a terminology environment. Main focus is on linkage
        between terminologies and classifications.
     •  Where is SNOMED up to in general? Domain coverage, localisation? Uptake:
        who is using it, and where? If others have chosen other paths, what are they?
        What impact might that have on Australia?
     •  What has happened in the US in the year since purchase of the national
        SNOMED licence? Has purchase of the licence ‘opened the floodgates’ for use?
        Extent of growth in use of SNOMED CT by hospitals and other key health care
        providers?
     •  Interest in governance and intellectual property issues, especially any moves
        toward internationalisation of control of SNOMED CT.
     •  What is the uptake by software vendors on an international scale?
     •  What is the use of terminology services by software vendors, or are they using
        internal solutions?
     •  Are there tools which manage the 3 levels of term to classification to
        grouping (e.g. DRG)?
     •  Important to consider the customer interface, not just the developer/maintainer
        interface. Are there tools that could support a process of submission of new
        terms by users? Web based?
     •  What are the costs and licensing options for tools such as Health Language?
     •  How would tools be used if we had them? Need to keep in mind there may be
        many users with different purposes. What is the capacity of these tools? What
        would be the best licensing arrangements for Australia?


3.2       E-Mail message from Tad McKeon of St Jude

           From:      McKeon, Tad [Tad.McKeon@STJUDE.ORG]
           Sent:      Friday, 15 October 2004 11:23 PM
           To:        donald.walker@adelaide.edu.au
           Cc:        sarah.a.ryan1@jsc.nasa.gov
           Subject:   Essential SNOMED







Dear Mr. Walker,
A mutual friend of ours, Sarah Ryan, forwarded your “working document” of the Essential SNOMED
to me. I think you are on the right track and would like to take a couple of minutes to provide you with
our history using SNOMED RT and some current observations I made at the SNOMED User’s Group
in Phoenix.
Approximately three years ago, St. Jude Children’s Research Hospital adopted the SNOMED RT
vocabulary for use in our research databases. After working with the terminology, we decided that we
were going to develop our own post-coordinated terms (PCTs). We made this decision for several
reasons:
             1. Not all of the terms we wanted to use were pre-coordinated within SNOMED RT.
             2. We wanted consistency in our approach to data retrieval and could enforce that
                  through our own PCTs.
             3. We also had an issue with how the terms in the database would display. Creating our
                  own PCTs gave us the best of both worlds. We could satisfy the display issues and
                  still enjoy the benefit of the SNOMED hierarchies.
             4. We also encountered problems with some of the SNOMED concepts, such as
                  combined sites.
             5. We also developed our own coder so that we could string together concepts to form
                  our own PCTs.

This approach seemed to work well, and we have gotten to the point of proof of concept. In my view,
proof of concept was not assigning a SNOMED concept to some type of clinical event and being able to
use the concepts for data aggregation, but instead being able to use the concepts for analysis. In our model,
we do not rely upon the vocabulary to be the total answer. We utilize a relational information model
that has multiple components coded. This helps us to maintain context. Anyway, this proof of concept
was shared at the SNOMED User’s Group.
While at the User’s Group, I attended a breakout session that Kent Spackman was leading. After an
hour, I began to wonder whether SNOMED’s vision was beginning to lean towards being the ultimate
solution for the EMR. As I listened to the conversation, the issue of combined sites began to surface.
Any term that has an “AND” or “OR” would be in essence a combined site. SNOMED has no way of
creating a hierarchy for a combined site. As I began to think about the implications of creating more
pre-coordinated terms, I realized that SNOMED might no longer be a good vocabulary for purposes of
research.
So, with all of that said, I agree with our need to have a vocabulary based on root concepts that can
then be modified for each user’s special circumstances. This resolves the issue of analysis if SNOMED
properly models. Further, creating PCT models would then enable all users to interpret anyone’s PCT.
Personally, I would have rather seen them create a cross-walk between RT and CT so that users like us
will not have to spend hours and days figuring out how concepts and their inherent meaning have
changed between the two vocabularies.
Anyway, that is my two cents for this morning.
Tad McKeon



4     Attachments
The following documents are attached. Their electronic file names are included.

4.1    Attached “NIS approach to in-house tools at NCCH”
See the separate attachment “National Centre for Classification in Health Information
System (NIS) – system overview” by Young Tjoa.
             (Electronic file name: NIS System Overview (Young Tjoa).doc)







4.2    Attached documents describing the PAT
The Poly-browser and Authoring Tool (PAT) was developed for in-house terminology
project use. A document that describes some aspects of its application to CATCH and
the GP Vocabulary Project is attached.
                (Electronic file name: 2-01-04 PAT Authoring Notes.doc)


4.3    Attached Health Language report
Health Language was reviewed in detail for the DHAC in August 2001 by Don
Walker. The paper was titled “Cyber+LE and its components LExScape and
LExIndex (Health Language Inc. USA)”. It is attached.
              (Electronic file name: CyberLEReport-Final from DHAC.zip)


4.4    Attached “Parser Notes”
The “GP Vocabulary stage-1 project” involved parsing GP phrases (utterances) into
“atomic terms” and “themes”. The tool used is partly described in the attached
document called “Getting Started with the Semantic Parser for GP Terms”.
             (Electronic file name: 040 Semantic Parser Getting Started.doc)


4.5    Attached “Essential SNOMED” or “SNOMED ET”
See the separate attachment titled “Essential SNOMED” or “SNOMED ET”. It
presents some early thinking on a restructured, smaller and simpler additional version
of SNOMED.
                   (Electronic file name: 4-00 Essential SNOMED.doc)



