Informatics for Clinicians and Clinical Investigators – v4 5/28/2010
Subtitle: Why is Clinical Informatics So Hard?
Goal: train the next generation of clinical researchers in the basics of clinical information systems (CIS) so they can
both use the data derived from these systems and understand the issues surrounding the design,
development, implementation, and evaluation of CIS-based interventions.
1. Identify the key clinical information system-related challenges facing clinical researchers over the next 3-5 years.
2. Identify the knowledge that a person with an MD degree and training in health services research should have
about clinical information systems.
Tentative Course Schedule:
June 30 – Sittig, Introduction to course and Informatics.
Read Sittig, Singh JAMA 2009 – 8 Rights of Safe and Effective EHR Use (QSH article).
section 1.1.4 (page 13) in Shortliffe and Cimino.
opportunity for HS researchers to study the effectiveness of information systems
Marc Berg's Health Information Management - chapter 4, particularly pages 71 – 78,
July 7 – Sittig, Controlled Clinical Vocabularies
July 14 – Sittig, Clinical decision support
July 21 – Herskovic, Natural Language Processing
July 28 – Johnson, User Interfaces
Aug 4 – Bernstam or Sittig, Data warehouses
Aug 11 – Singh, e-Communication
Aug 18 – Sittig, Final course – Future of Clinical Informatics
1. Sittig – Introduction to Clinical Informatics
a. Right System – Hardware and software must be capable of supporting the clinical activities. It must be
fast, reliable, and appropriately protected to ensure the safety, privacy, and integrity of the clinical and
administrative data it contains.
b. Right Content – EMR vocabulary used to encode the clinical findings, enter orders, and store laboratory
results must be standardized and used to encode all data. The clinical knowledge that forms the basis of
the clinical decision support must be evidence-based and appropriate for the user's practice as well as
c. Right Human-Computer User Interface – The EMR's user interface must be user-friendly: easy to learn
and use. The interface should present all the relevant patient data in a format that allows clinicians to
rapidly perceive the problem, formulate a response, and document their actions.
d. Right People – Users must be appropriately trained and re-trained and interact closely with the
informatics experts and clinical application coordinators responsible for designing and maintaining the
e. Right Workflow / Communication – the EMR must fit into the workflow of the clinic or hospital and
enhance situational awareness of its users who often practice in time pressured settings.
f. Right Organizational Policy & Procedures – the organization must adjust existing policies or create
new policies that account for EMR use.
g. Right State and Federal Rules and Regulations – both the State and Federal governments must continue
to work to create the appropriate regulatory environment that will enable these systems to continue
evolving while maintaining appropriate safety and privacy oversight.
h. Right Monitoring – organizations or users must continually evaluate the performance of EMRs through
robust monitoring systems and test whether automated processes are working as expected after implementation.
2. Sittig - Controlled Clinical Vocabularies – Common, standards-based clinical vocabularies will become more
important as time passes. In addition, before we can have wide-spread adoption and sharing of clinical data, much
more work will need to be done with the existing clinical vocabulary standards. (based on paper by Alan Rector –
Why is clinical terminology so hard?)
a. The scale and multiplicity of activities, tasks, and users it is expected to serve are vast.
b. Conflicts between the needs of users and the requirements for rigorously developed software must be
c. The complexity of clinical pragmatics – support for practical use for data entry, browsing, and retrieval –
and the need for testing the pragmatics of terminologies implemented in software.
d. Separating language and concept representation is difficult and has often been inadequate.
e. Pragmatic clinical conventions often do not conform to general logical or linguistic paradigms.
f. Both defining formalisms for clinical concept representation and populating them with clinical knowledge
or 'ontologies' are hard, and their difficulty has often been underestimated.
g. Determining and achieving the appropriate level of clinical consensus is hard and requires that the
terminology be open ended and allow local tailoring.
h. The structural idiosyncrasies of existing conventional coding and classification systems must be addressed.
The terminology must be coordinated and coherent with medical record and messaging models and
i. Change must be managed, and it must be managed without corrupting information already recorded in
Rector AL. Clinical terminology: why is it so hard? Methods Inf Med. 1999 Dec;38(4-5):239-52.
Rosenbloom ST, Brown SH, Froehling D, Bauer BA, Wahner-Roedler DL, Gregg WM, Elkin PL. Using
SNOMED CT to represent two interface terminologies. J Am Med Inform Assoc. 2009 Jan-Feb;16(1):81-8
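Point (d) above, separating language from concept representation, can be illustrated with a minimal Python sketch. It assumes a tiny lookup table in which each concept has one identifier and several surface terms. The identifiers (C001, C002), the term lists, and the function names are made up for illustration and are not taken from SNOMED CT or any other real terminology.

```python
from typing import Optional

# Hypothetical concepts: one identifier, several surface terms.
# IDs and terms are illustrative only, not from a real vocabulary.
CONCEPT_TERMS = {
    "C001": ["myocardial infarction", "heart attack", "MI"],
    "C002": ["hypertension", "high blood pressure", "HTN"],
}


def build_term_index(concept_terms):
    """Map each normalized surface term to its concept identifier."""
    index = {}
    for concept_id, terms in concept_terms.items():
        for term in terms:
            index[term.strip().lower()] = concept_id
    return index


TERM_INDEX = build_term_index(CONCEPT_TERMS)


def encode(term: str) -> Optional[str]:
    """Return the concept identifier for a term, or None if it is unknown."""
    return TERM_INDEX.get(term.strip().lower())
```

In this sketch the record would store the concept identifier rather than the typed string, so a term can be added or renamed without touching data already recorded. Real abbreviations are often ambiguous (one abbreviation, several concepts), which a one-to-one lookup like this cannot handle.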
3. Clinical decision support – (Sittig) Following the clinical decision-making process, there is a tremendous amount
of work involved in setting up and maintaining any clinical decision support system. (Sittig – Grand Challenges;
Ash – CDS)
a. CDS means different things to different people
b. For patient-specific CDS, you need DATA!
c. Clinical Knowledge Management is necessary for CDS
d. Knowledge engineers are "special people"
e. Work to facilitate translation for collaboration
f. The system, including the hardware, software and user interface must be easy to use and fast
g. Workflow analysis must be a part of the organizational culture
h. Communicating new CDS features and functions to clinicians is hard
i. Training and supporting CDS users is difficult
j. Nurture and support your clinical champions
Ash JS. CDS Themes paper
Sittig DF, Wright A, Osheroff JA, et al. Grand challenges in clinical decision support. J Biomed Inform.
2008 Apr;41(2):387-92. Epub 2007 Sep 21.
4. Natural Language Processing (Herskovic) (Friedman papers)
a) Gold standards
b) Part-of-record detection (e.g., family history vs. personal history, prescription vs. current meds) - telling
"history of" or "family history of" apart from an illness the patient has now
c) Temporality, especially relative time
d) Anaphoric referent disambiguation
e) Cross-document reference disambiguation
f) Word sense disambiguation - "hand": clap, help, set of cards in poker, end of your arm, height of a horse
g) Misspellings, abbreviations, acronyms
h) Relationship detection and extraction
i) Named entity recognition
j) Quality and usefulness of the dictionaries.
k) Negatives in text – need to recognize these
l) Severity of conditions or illnesses
m) Identifying quantity or counts.
n) Optical character recognition vs. ASCII vs. voice recognition – many confuse these
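Point (k), recognizing negatives in text, can be sketched in Python as a toy rule: a concept counts as negated if it starts within a few tokens after a negation trigger. The trigger list, the window size, and the function name are assumptions made for this illustration. This is not a description of any particular published system.

```python
import re

# Hypothetical, deliberately tiny trigger list (an assumption for this sketch).
NEGATION_TRIGGERS = ("no", "denies", "without", "negative for")
WINDOW = 4  # tokens after a trigger that are treated as negated (assumed)


def tokenize(text: str):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def is_negated(sentence: str, concept: str) -> bool:
    """Return True if `concept` starts within WINDOW tokens after a trigger."""
    tokens = tokenize(sentence)
    concept_tokens = tokenize(concept)
    if not concept_tokens:
        return False
    n = len(concept_tokens)
    for trigger in NEGATION_TRIGGERS:
        trigger_tokens = trigger.split()
        t = len(trigger_tokens)
        for i in range(len(tokens) - t + 1):
            if tokens[i:i + t] != trigger_tokens:
                continue
            start = i + t
            # Look for the concept starting anywhere inside the window.
            for j in range(start, min(start + WINDOW, len(tokens))):
                if tokens[j:j + n] == concept_tokens:
                    return True
    return False
```

With these settings, "Patient denies chest pain" is flagged as negated for "chest pain" and "Patient reports chest pain" is not. The rule also flags "No fever, but chest pain persists", which is wrong, and that kind of error is one reason this problem is hard.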
5. User Interface Design – (Johnson) The user interface is the "face" of the clinical information system. This is the
only aspect of the system that most clinicians know about. Using the screen design tools that vendors provide to
customize various screens for local use is one of the keys to a successful implementation. (Sanderson – Australia)
a. screen customization
b. paper form design
c. For health services researchers, interface design is really all about reliable data capture. Interfaces must be
designed to fit the workflow of clinicians and reliably capture the required data. An interface that is
acceptable to a clinician and captures data adequately for the care of individual patients may not meet the
needs of researchers (or of quality measurement). Typically, an interface that meets the needs of researchers
will require 'buy-in' from the people using it, i.e., they must agree that it's worth the effort to capture
the data reliably in a standard format. Still, the researcher must understand the workflow in order to create an
6. Data warehouses – (Bernstam) In addition to the real-time, transaction-oriented face of the EMR, there is also the
vast amount of clinical data contained in the off-line clinical data warehouses. Over time, use of this data
will become even more important for administrative and clinical decision support. (Bernstam)
1. Missing data cannot be assumed to be "normal" or unimportant.
2. Data collected for one purpose is not valid for another purpose
3. It is difficult to understand "why" something was done from billing codes. They are better at telling us "what"
happened. They also give no indication of the severity of the illness.
4. The freetext portion of the EHR contains at least 50% of the important data.
5. Difficult to track relationships in data from a database since you only have timestamps (which may be
inaccurate). The rest is conjecture. Also not everything that is done is tracked.
6. Difficult to get all the data you want even prospectively since other people are not as interested in particular
data items as you are. Therefore, large DB-centric trials reduce to the least common denominator.
7. No matter how big your database is, if you apply enough filtering criteria you can run out of sample. (ref:
Weiner, M ?)
8. No matter how many study inclusion or exclusion criteria you develop, you will always have a few individuals
in the sample that are not appropriate and you will always miss a few who should be included in the sample, but
9. There are many patients in your database that have essentially no data (they registered but never came, went to
the ED once, came for a test, etc.) and will wreak havoc with your denominators.
10. There's often more than one storage location, or multiple ways to code the same concept, for any particular
data item, so make sure you find them all (e.g., HbA1c could be in labs, health maintenance, a flowsheet, a note, etc.).
At the same time, make sure you aren't double counting (e.g., the same HbA1c result in two places, or a pending
and final result). (Ref: Safran BP article)
11. Even so-called standardized data, such as ICD-9 and CPT codes or even admit time, are often used differently
in different clinics, sometimes within the same location but especially across locations. Clinics can differ in
billing practices, standard operating procedures, aggressiveness of billing, etc., which makes the "standard"
codes assigned to patients non-standard.
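The double-counting warning in point 10 can be sketched in Python. The sketch assumes that copies of the same lab result share a patient identifier, test name, and specimen collection timestamp, and that a final result should replace a pending one. The record layout and field names are hypothetical. Point 5 warns that timestamps may be inaccurate, so a real warehouse may need a looser match than this.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LabResult:
    # Hypothetical record layout, for illustration only.
    patient_id: str
    test: str
    collected_at: str       # specimen collection time, ISO 8601 string
    value: Optional[float]  # None while the result is still pending
    status: str             # "pending" or "final"
    source: str             # e.g. "labs", "flowsheet", "health_maintenance"


def deduplicate(results):
    """Keep one result per (patient, test, collection time), preferring final."""
    best = {}
    for r in results:
        key = (r.patient_id, r.test, r.collected_at)
        current = best.get(key)
        # Take the first result seen, then upgrade pending to final if found.
        if current is None or (current.status != "final" and r.status == "final"):
            best[key] = r
    return list(best.values())
```

For example, a pending HbA1c in the lab table, its final value in the same table, and a copy of the final value in a flowsheet would collapse to a single final result for that patient and collection time.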
7. Communication & Workflow analysis – (Singh) prior to system implementation, careful workflow analysis and
documentation can improve the chances of implementation success.
a. Figuring out to whom a message (whether computer generated or not) should be sent.
b. Acknowledgement: Making sure that all messages are received.
c. Attestation is the act of applying an electronic signature to the content, showing authorship and legal
responsibility for a particular unit of information.
d. Authentication is the security process of verifying a user's identity with the system that authorizes the
individual to access the system (e.g., the sign-on process). Authentication is important because it assigns
responsibility to users for the entries they create, modify, or view.
e. Non-repudiation – strong and substantial evidence that will make it difficult for the signer to claim that the
electronic presentation is not valid.
f. Asynchronous vs. synchronous
g. Channels – a wide variety of communication channels is available, from basic face-to-face
conversation, through telecommunication channels like the telephone or e-mail, to computational
channels like the medical record. Includes written, spoken, e-mail, and message channels.
h. Coded vs. freetext messages
i. Fail-safe mechanisms
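Points (b) and (i) above, acknowledgement and fail-safe mechanisms, can be sketched together in Python: find every message that is still unacknowledged after a set waiting period so that it can be escalated. The message structure, the function name, and the 24-hour default are assumptions made for illustration, not a recommended policy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Message:
    # Hypothetical message record, for illustration only.
    message_id: str
    recipient: str
    sent_at: datetime
    acknowledged_at: Optional[datetime] = None


def overdue_messages(messages, now, max_wait=timedelta(hours=24)):
    """Return messages still unacknowledged after `max_wait` has elapsed."""
    return [
        m for m in messages
        if m.acknowledged_at is None and now - m.sent_at > max_wait
    ]
```

A fail-safe process could run a check like this on a schedule and forward each overdue message to a backup recipient. Deciding who that backup should be is the routing problem in point (a).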
Coiera E. Communication systems in healthcare. Clin Biochem Rev. 2006 May;27(2):89-98.
AHIMA. "Electronic Signature, Attestation, and Authorship. Appendix B: Laws, Regulations, and
Electronic Signature Acts." Journal of AHIMA 80, no.11 (November-December 2009). Available at:
8. Sittig – Future of Clinical Informatics
a. The next generation Internet;
b. Real-time clinical decision support systems;
c. Off-line, population-based systems;
d. Large, integrated, individual patient-level phenotypic and genotypic databases with intelligent data mining
e. Wireless, invasive and non-invasive physiologic monitoring devices;
f. Natural Language Processing (NLP) systems;
g. Mathematical models of complex biological systems
Reading: Sittig DF. Potential impact of advanced clinical information technology on cancer care in 2015. Cancer
Causes Control. 2006 Aug;17(6):813-20.
Preliminary Grading scheme:
Students are required to attend 5 of the 8 classes during the course. For each class session attended, students will
receive 2% of their final grade (in other words, class attendance counts for 10% of the final grade). Special
exceptions may be made for students who are not physically located at UTHouston.
Following each class there will be a quiz consisting of 5-10 multiple choice or true/false or matching questions that
cover key points from that lecture. This quiz will be available online for 1 week after the end of the lecture. The
results of all these quizzes will count toward 40% of the final course grade.
The final project, which counts for 50% of the final course grade, will consist of ten 1-2 page (200-500 word)
explanations, each covering one of the key points from a week of the course. Students must choose at least 1 and not more
than 2 topics from any single lecture. Each explanation should include at a minimum:
1. A definition or explanation of the key point
2. An explanation of why this point is important
3. An explanation of what has been done so far to address this issue
4. Three references that relate to the topic you are explaining
The first 3 explanations will be due following the end of week 3 (July 14, 2010); the second 3 explanations will be
due following the end of week 6 (August 4, 2010) and the final 4 explanations will be due at the end of week 8
(August 18, 2010). There is no penalty for turning in these assignments early.
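The weighting above can be worked through in a short Python sketch. It assumes that attendance credit is capped at 10% (five sessions at 2% each), that all quizzes count equally toward the 40%, and that quiz and project scores are given as fractions between 0 and 1. None of these details are stated in the scheme itself, and the function name is hypothetical.

```python
def final_grade(sessions_attended, quiz_scores, project_score):
    """Combine the three components into a final percentage (0 to 100).

    quiz_scores is a list of fractions in [0, 1], one per quiz taken;
    project_score is a single fraction in [0, 1].
    """
    # 2% per session attended, assumed to be capped at 10% in total.
    attendance = min(sessions_attended, 5) * 2.0
    # Quizzes are assumed to be equally weighted within the 40%.
    quiz_average = sum(quiz_scores) / len(quiz_scores) if quiz_scores else 0.0
    return attendance + 40.0 * quiz_average + 50.0 * project_score
```

Under these assumptions, a student who attends 3 sessions, averages 50% on the quizzes, and earns 80% on the final project would receive 6 + 20 + 40 = 66% overall.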