              Diagnosis and Interpretation
• We concentrate on diagnosis and interpretation because
  historically they are significant problems that AI has addressed
   – And there are numerous and varied solutions, providing us
     with an interesting cross-section of AI techniques to examine
• Diagnosis is the process of determining whether the
  behavior of a system is correct
• If incorrect, which part(s) of the system is(are) failing
   – We often refer to the result of a diagnosis as one or more
     malfunction hypotheses
   – The system being diagnosed can be an artificial system
     (man-made) or a natural system (e.g., the human body, an ecosystem)
       • man-made systems are easier to diagnose because we understand the
         systems thoroughly enough to develop an accurate model
• Interpretation is a related problem: it is the process of
  explaining the meaning of some object of attention
             Data Driven Processes
• While both diagnosis and interpretation have goals
  of “seeking to explain”, the processes are triggered
  by data
   – We use the data (symptoms, manifestations, observations)
     to trigger possible reasons for why those data have arisen
• Thus, these problems are distinct from goal-driven problems
   – Like planning, design, and control
      • control encompasses planning, interpretation, diagnosis and
        possibly prediction
• One way to view diagnosis/interpretation is that
  given data, explain why the data has arisen
   – Thus, it is an explanation-oriented process
      • the result of the process is an explanation which attempts to
        describe why we have the resulting behavior (malfunctions or
        misbehaviors)
      • we will reconsider this idea (explanation as a process) later
               The Diagnostic Task
• Data triggers causes (hypotheses of malfunctions, or potential
  diagnoses), typically an associational form of knowledge
• Hypotheses must be confirmed through additional testing and
  inspection of the situation
• Hypotheses should be as specific as possible, so they need to be
  refined (e.g., given a general class of disease, find the most
  specific subclass)
             Forms of Interpretation
• The idea behind interpretation is that we are trying to
  understand why something has happened
   – Diagnosis is a form of interpretation in that we are trying to
     understand a system’s deviation from the norm
       • what caused the system to deviate? what components have broken down?
• Diagnosis is a form of interpretation, but there are other forms
   – Data analysis – what phenomenon caused the data to arise, e.g.,
     studying astronomical phenomena by looking at radio signals, or
     looking at blood clots and deciding on blood types
   – Object identification – viewing a description (in some form, whether
     visual or data) of an object, what is the object
   – Speech recognition – interpret the acoustic signal in terms of words
   – Communication – what is the meaning behind a given message?
     This can be carried over to analysis of artwork
   – Evidence analysis – trying to decipher the data from a crime scene to
     determine what happened, who committed the crime and why
   – Social behavior – explaining why someone acted in a particular way
                 Some Definitions
• Let us assume that our knowledge of a given system
  is contained as a model
  – A diagnosis is a particular hypothesis of how the system
    differs from the model
     • what component(s) is(are) not functioning as modeled?
  – A diagnosis is a description of one possible state of the
    system where the state is not the “normal state”
  – A consistency-based diagnosis is a diagnosis where each
    component of the system is labeled as either normal or
    abnormal (functioning correctly or not) such that the
    description is consistent with the observations
      • If there are n components in a system, there are 2^n different
        diagnoses because we must consider that multiple components
        may fail
   – A minimal diagnosis is a diagnosis consisting of some set
     of components C such that there is no consistent diagnosis
     that is a proper subset of C
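These definitions can be made concrete with a brute-force sketch. The two-component system and the `consistent` predicate below are invented stand-ins for whatever model-vs-observation check a real domain would supply:

```python
from itertools import combinations

# Hypothetical two-component system: a candidate diagnosis labels
# some subset of components abnormal.
components = ["C1", "C2"]

def consistent(abnormal):
    # Stand-in consistency check: assume the observations rule out
    # the all-normal state, so any non-empty abnormal set is consistent.
    return len(abnormal) > 0

# All 2^n candidate diagnoses (subsets of components marked abnormal)
candidates = [set(c) for r in range(len(components) + 1)
              for c in combinations(components, r)]
diagnoses = [d for d in candidates if consistent(d)]

# A minimal diagnosis has no proper subset that is also a diagnosis
minimal = [d for d in diagnoses
           if not any(d2 < d for d2 in diagnoses)]
print(minimal)   # [{'C1'}, {'C2'}]
```

With a real model, `consistent` would simulate the system under the abnormality assumptions and compare against the observations.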
       First Interpretation System
• The system Dendral, from 1966, was given mass
  spectrogram data and inferred the chemical
  composition from that data
  – The input would be the mass of the substance along with
    other experimental lab data
  – Dendral would apply knowledge of atomic masses,
    valence rules and connectivity among atoms to determine
    combinations and connections of the atoms in the
    unknown compound
      • The number of combinations grows exponentially with the size
        (mass) of the unknown compound
  – Dendral used a plan-generate-test process
      • First, constraints would be generated based on heuristic
        knowledge of what molecules might appear given the initial
        input and any knowledge presented about the unknown compound
                 Dendral Continued
• The planning step would constrain the generate step
   – At this step, graphical representations of possible molecules would
     be generated
   – The constraints are necessary to reduce the number of possible
     graphs generated
• The final step, testing, attempts to eliminate all but the correct structure
   – Each remaining graph is scored by examining the candidate
     molecular structure and comparing it against mass spectrometry rules
     and reaction chemistry rules
   – Structures are discarded if they are inconsistent with the spectrum or
     known reactions
   – Any remaining structures are presented to the operator
• At this point, the operator can input additional heuristic rules
  that can be applied to this case to prune away incorrect structures
   – These rules are added to the heuristics, so Dendral “learns”
   – A thorough examination is presented in
• Mycin was the next important step in the evolution of
  AI expert systems and AI in medicine
   – The first well known and well received expert system, it
     also presented a generic solution to reasoning through rules
   – It provided uncertainty handling in the form of certainty factors
   – After creating Mycin, some of the researchers developed
     the rule-based language E-Mycin (Essential or Empty
     Mycin) so that others could develop their own rule-based
     expert systems
• Mycin had the ability to explain its conclusions by
  showing matching rules that it used in its chain of logic
• Mycin outperformed the infectious disease experts
  when tested, coming to an “acceptable” therapy in 69%
  of its cases
   – A spinoff of Mycin was a teaching tool called GUIDON
     which is based on the Mycin knowledge base
     The Importance of Explanation
• The Dendral system presented an answer but did not
  explain how it arrived at its conclusions
• Mycin could easily generate an explanation by
  outputting the rules that matched in the final chain of logic
   – E.g., rule 12 & rule 15 → rule 119 → rule 351
   – A user can ask questions like “why was rule 351
     selected?” to which Mycin responds by showing the rule’s
     conditions (lhs) and why those conditions were true
   – The reason why a rule is true is usually based on previous
     rules being true leading to conclusions that made the
     given rule true
• By being able to see the explanation, one can feel
  more confident with the system’s answers
   – But it is also a great tool to help debug and develop the
     knowledge base
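The rule-chain explanation can be sketched as a forward chainer that records which rule supported each conclusion, so a "why?" query walks the chain backwards. The rule names echo the slide's example but their contents are invented:

```python
# Hypothetical rule base: each rule has conditions (facts or earlier
# conclusions) and a conclusion. Firing records support for "why?".
rules = {
    "rule12": (["fact_a"], "intermediate1"),
    "rule15": (["fact_b"], "intermediate2"),
    "rule119": (["intermediate1", "intermediate2"], "intermediate3"),
    "rule351": (["intermediate3"], "final_diagnosis"),
}

facts = {"fact_a", "fact_b"}
support = {}                # conclusion -> rule that produced it

changed = True
while changed:              # naive forward chaining to a fixpoint
    changed = False
    for name, (conds, concl) in rules.items():
        if concl not in facts and all(c in facts for c in conds):
            facts.add(concl)
            support[concl] = name
            changed = True

def why(conclusion):
    """Answer 'why was this concluded?' by walking supporting rules."""
    if conclusion not in support:
        return conclusion   # a given datum, not a derived conclusion
    name = support[conclusion]
    conds, _ = rules[name]
    return {name: [why(c) for c in conds]}

print(why("final_diagnosis"))
```

The nested result mirrors the slide's chain: rule 351 was selected because its condition was made true by rule 119, whose conditions were made true by rules 12 and 15.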
               Mycin Sample Rules
IF: 1) the identity of ORGANISM-1 is not known
** 2) the gram stain of ORGANISM-1 is not known**
   3) the morphology of ORGANISM-1 is not known
   4) the site of CULTURE-1 is csf
   5) the infection is meningitis
   6) the age (in years) of the patient is less than or equal to .17
THEN: There is weakly suggestive evidence (.3) that the
    category of ORGANISM-1 is enterobacteriaceae

IF: 1) the morphology of ORGANISM-1 is rod
    2) the gram stain of ORGANISM-1 is gramneg
    3) the aerobicity of ORGANISM-1 is facultative
** 4) the infection with ORGANISM-1 was acquired while the
         patient was hospitalized**
THEN: There is evidence that the category of ORGANISM-1
is enterobacteriaceae
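When several rules independently support the same conclusion, Mycin combines their certainty factors incrementally; for two positive factors the standard combination is CF1 + CF2*(1 - CF1). A minimal sketch (the .3 and .4 values are illustrative, not taken from the actual knowledge base):

```python
def combine_cf(cf1, cf2):
    # Mycin's combination rule for two positive certainty factors:
    # the second rule contributes a fraction of the remaining
    # uncertainty. (The full scheme also handles negative CFs.)
    assert cf1 >= 0 and cf2 >= 0
    return cf1 + cf2 * (1 - cf1)

# Two independent rules weakly suggest the same organism category
cf = combine_cf(0.3, 0.4)
print(round(cf, 2))   # 0.58
```

Note that the combined belief never exceeds 1.0 and grows monotonically as more supporting rules fire.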
   Systems Generated From Emycin
• SACON – Structural Analysis CONsultant
IF: 1) The material composing the sub-structure is one of the metals, and
   2) The analysis error that is tolerable is between 5% and 30%, and
   3) The non-dimensional stress of the sub-structure > .9, and
   4) The number of cycles the loading is to be applied is between 1000 and 10000
THEN: It is definite (1.0) that fatigue is one of the stress behavior phenomena in
the sub-structure
• Puff – pulmonary disorders
   – originally implemented in Emycin before being re-implemented as an OO system
IF: 1) The mmf/mmf-predicted ratio is [35..45] & the fvc/fvc-predicted ratio > 88
    2) The mmf/mmf-predicted ratio is [25..35] & the fvc/fvc-predicted ratio < 88
THEN: There is suggestive evidence (.5) that the degree of obstructive airways
disease as indicated by the MMF is moderate, and it is definite (1.8) that the
following is one of the findings about the diagnosis of obstructive airways
disease: Reduced mid-expiratory flow indicates moderate airway obstruction.
         A Fuzzy Logic Approach
• The process is one of
  – Fuzzifying the inputs
     • blood pressure of 145 mmHg can be denoted as {low/0,
       medium/.4, high/.6}
  – Fuzzy reasoning
     • applying rules similar to Mycin
         – recall that fuzzy systems do poorly with lengthy chains of rules, so we
           will primarily use fuzzy logic in diagnosis when there are few rules
           and limited chains of logic
     • we use fuzzy logic and set theory to compute AND, OR, NOT,
       Implication, Difference, etc. as needed for the rules
  – Fuzzy classes
     • given the result of our rules, we defuzzify by identifying which
       class (malfunction(s)/diagnosis(es)) is rated the highest
  – FL has been used for automotive diagnosis, clinical lab
    test interpretation, mammography interpretation, …
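The fuzzification step can be sketched with piecewise-linear membership functions. The breakpoints below are invented purely to reproduce the slide's {low/0, medium/.4, high/.6} example for 145 mmHg; they are not clinical values:

```python
def ramp_down(x, a, b):
    # membership 1 below a, falling linearly to 0 at b
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    return (b - x) / (b - a)

def ramp_up(x, a, b):
    return 1.0 - ramp_down(x, a, b)

def fuzzify_bp(mmhg):
    # Breakpoints chosen only to match the slide's example
    return {
        "low": ramp_down(mmhg, 90, 120),
        "medium": min(ramp_up(mmhg, 90, 110), ramp_down(mmhg, 130, 155)),
        "high": ramp_up(mmhg, 130, 155),
    }

memberships = fuzzify_bp(145)
print({k: round(v, 2) for k, v in memberships.items()})
# {'low': 0.0, 'medium': 0.4, 'high': 0.6}
```

The fuzzy reasoning step would then compute rule conditions with min for AND and max for OR, and defuzzification would pick the malfunction class with the highest resulting membership.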
         Analyzing Mycin’s Process
• A thorough analysis of Mycin was performed and it
  was discovered that the rule-based approach of
  Mycin was actually following three specific tasks
   – Data are first translated using data abstraction from
     specific values to values that may be of more use (e.g.,
     changing a real value into a qualitative value)
   – The disease(s) is then classified
   – The hypothesis is refined into more detail
• By considering the diagnostic process as three
  related but different tasks, it allows one to more
  clearly understand the process
   – With that knowledge, it becomes easier to see how to
     solve a diagnostic task – use classification
           Classification as a Task
• One can organize the space of diagnostic conclusions
  (malfunctions) into a taxonomy
  – The diagnostic task is then one of searching the taxonomy
     • Coined hierarchical classification
  – The task can be solved by establish-refine
     • Attempt to establish a node in the hierarchy
     • If found relevant, refine it by recursively trying to establish any
       of the node’s children
     • If found non-relevant, prune that portion of the hierarchy away
       and thus reduce the complexity of the search
• How does one establish a node as relevant?
  – Here, we can employ any number of possible approaches
    including rules
      • Think of the node as a “specialist” in identifying that particular
        hypothesis
      • Encode any relevant knowledge to recognize (establish) that
        hypothesis in the node itself
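Establish-refine can be sketched as a recursive walk over a toy taxonomy. The `establish` function here is a stand-in keyword matcher for whatever specialist knowledge (rules, patterns, a classifier) each node would actually hold:

```python
# A toy malfunction taxonomy (invented for illustration)
taxonomy = {
    "engine trouble": ["fuel problem", "spark problem"],
    "fuel problem": ["clogged filter", "bad pump"],
    "spark problem": ["bad plug"],
}

def establish(node, data):
    # Stand-in specialist: a node is relevant if any of its
    # keyword evidence appears in the data
    evidence = {
        "engine trouble": {"stall"},
        "fuel problem": {"fuel_smell", "stall"},
        "clogged filter": {"fuel_smell"},
        "bad pump": {"no_fuel_pressure"},
        "spark problem": {"misfire"},
        "bad plug": {"misfire"},
    }
    return bool(evidence.get(node, set()) & data)

def establish_refine(node, data):
    """Return established leaf hypotheses; prune rejected subtrees."""
    if not establish(node, data):
        return []                    # prune this whole subtree
    children = taxonomy.get(node, [])
    if not children:
        return [node]                # an established leaf hypothesis
    found = []
    for child in children:
        found += establish_refine(child, data)
    return found

print(establish_refine("engine trouble", {"stall", "fuel_smell"}))
# ['clogged filter']
```

Note how the "spark problem" subtree is never explored: rejecting it at the parent prunes all of its children, which is where the complexity reduction comes from.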
             Supporting Classification
• The establish knowledge can take on any number of different forms
   –   Rules (possibly using fuzzy logic or certainty factors, or other)
   –   Feature-based pattern matching
   –   Bayesian probabilities or HMM
   –   Neural network activation strength
   –   Genetic algorithm fitness function
• In nearly every case, what we are seeking is a set of pre-determined features
   – Which features are present? Which are absent?
   – How strongly do we believe in a given feature?
• If the feature is not found in the database, how do we acquire it?
   – By asking the user? By asking for a test result? By performing
     additional inference?
   – Notice that in the neural network case, features are inputs whereas in
     most of the rest of the cases, they are conditions usually found on the
     LHS of rules
       Feature-based Pattern Matching
• A simple way to encode associational knowledge to support a
  hypothesis is to enumerate the features (observations,
  symptoms) we expect to find if the hypothesis is true
   – We can then enumerate patterns that provide a confidence value that
     we might have if we saw the given collection of features
• Consider for hypothesis H, we expect features F1 and F2 and
  possibly F3 and F4, but not F5 where F1 is essential but F2 is
  somewhat less essential
   –   F1       F2      F3    F4      F5              Result
   –   yes      yes     yes   yes     no              confirmed
   –   yes      yes     ?     ?       no              likely
   –   yes      ?       ?     ?       no              somewhat likely
   –   ?        yes     ?     ?       no              neutral/unsure
   –   ?        ?       ?     ?       yes             ruled out
   –   ? means “don’t care”
• We return the result from the first pattern to match, so this is in
  essence a nested if-else statement
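The pattern table above translates directly into an ordered first-match search, in essence the nested if-else the slide describes:

```python
# The slide's pattern table: each row maps expected feature values
# for (F1..F5) to a result; None means "don't care"; first match wins.
PATTERNS = [
    ((True, True, True, True, False), "confirmed"),
    ((True, True, None, None, False), "likely"),
    ((True, None, None, None, False), "somewhat likely"),
    ((None, True, None, None, False), "neutral/unsure"),
    ((None, None, None, None, True), "ruled out"),
]

def match(features):
    """features: observed truth values for (F1, F2, F3, F4, F5)."""
    for pattern, result in PATTERNS:
        if all(p is None or p == f for p, f in zip(pattern, features)):
            return result
    return "no match"

print(match((True, True, False, False, False)))   # likely
print(match((False, False, False, False, True)))  # ruled out
```

Because rows are tried in order, the most specific (and most confident) patterns must come first, exactly as in the table.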
                    Data Abstraction
• In Mycin, many rules were provided to perform data abstraction
   – In a pattern matching approach, we might have a feature of
     interest that may not be directly evident from the data but the
     data might be abstracted to provide us with the answer
      • Example: Was the patient anesthetized in the last 6 months?
      • No data indicates this, but we see that the patient had surgery 2 months
        ago and so we can infer that the patient was anesthetized
• Data abstractions might be domain specific
   – In which case we have to codify each inference as shown
• Or may be domain independent
   – Such as temporal reasoning or spatial reasoning
• Another form is to discard a specific value in favor of a
  more qualitative value (e.g., temperature 102 becomes
  “high fever”)
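Both kinds of abstraction can be sketched directly. The surgery-implies-anesthesia inference is the slide's example; the fever thresholds are illustrative, not clinical:

```python
from datetime import date

# Domain-specific abstraction: surgery implies anesthesia (a codified
# inference), so a recent surgery answers "anesthetized in the last
# 6 months?" even though no datum states it directly.
def anesthetized_within(surgery_date, today, months=6):
    return (today - surgery_date).days <= months * 30

# Qualitative abstraction: discard the specific reading in favor of
# a qualitative value (thresholds illustrative, not clinical)
def qualify_temp(temp_f):
    if temp_f >= 104:
        return "very high fever"
    if temp_f >= 101:
        return "high fever"
    if temp_f >= 99.5:
        return "mild fever"
    return "normal"

print(qualify_temp(102))                                        # high fever
print(anesthetized_within(date(2024, 3, 1), date(2024, 5, 1)))  # True
```

Downstream rules can then condition on the abstract feature ("high fever", "recently anesthetized") rather than the raw values.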
Example 1: Automotive Diagnosis
• Engine Trouble
   – Fuel: Fuel Pump Relay, Fuel Pump Fuse, Fuel Pump, Fuel Injector,
     Fuel Pressure Regulator, Fuel Filter
   – Electrical: Battery, Voltage
   – Air and Exhaust: EGR Valve Solenoid, Air Filter, Catalytic Converter,
     Idle Air Control, Idle Speed Control Motor
   – Spark: Spark Plug, Spark Plug Wire, Ignition Coil
   – Control: Oxygen Sensor, Mass Air Flow Sensor, Throttle Position Sensor,
     Knock Sensor, Manifold Absolute Pressure Sensor, Manifold Air
     Temperature Sensor, Coolant Temperature Sensor, Engine Control Module,
     Crankshaft Position Sensor, Camshaft Position Sensor, Vehicle Speed Sensor
Example 2: Syntactic Debugging
Ex 3: Linux User Classification
            Lack of Differentiation
• Notice that through the use of simple classification
  (what is called hierarchical classification), one does
  not differentiate among possible hypotheses
   – If two hypotheses are found to be relevant, we do not have
     additional knowledge to select one
      • What if X and Y are both established with X being more certain
        than Y, which should we select?
      • What if X and Y have some form of association with each other
        such as mutually incompatible, or jointly likely?
• We would like to employ a process that contains
  such knowledge as to let us select only the most
  likely hypothesis(es) given the data
   – In a neural network, we would only select the most likely
     node, and similarly for an HMM, the most likely path
• This leads us to abduction, a form of inference first termed
  by philosopher Charles Peirce
   – Peirce saw abduction as the following:
      • Deduction says that
          – If we have the rule A → B
          – And given that A is true
          – Then we can conclude B
      • But abduction says that
          – If we have the rule A → B
          – And given that B is true
          – Then we can conclude A
   – Notice that deduction is truth preserving but abduction is not
   – We can expand the idea of abduction to be as follows:
      • If A1 v A2 v A3 v … v An → B
      • And given that B is true
      • And if Ai is more likely than any other Aj (1<=j<=n), then we can infer
        that Ai is true
          – for this to work, we need a way to determine which is most likely
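A minimal sketch of that expanded abductive step, with assumed illustrative likelihoods standing in for whatever scoring mechanism the domain provides:

```python
# Abductive inference: from A1 v A2 v A3 -> B and the observation B,
# conclude the most likely Ai. The likelihoods are assumed priors.
likelihood = {"A1": 0.2, "A2": 0.5, "A3": 0.3}

def abduce(b_observed, antecedents):
    # Not truth preserving: this picks the best available
    # explanation, which may still be wrong
    if not b_observed:
        return None
    return max(antecedents, key=lambda a: likelihood[a])

print(abduce(True, ["A1", "A2", "A3"]))   # A2
```

The interesting design question, taken up in the following slides, is where these likelihoods come from and what happens when antecedents interact.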
   Inference to the Best Explanation
• Another way to view abduction is as follows:
   –   D is a collection of data (facts, observations, symptoms) to explain
   –   H explains D (if H is true, then H can explain why D has appeared)
   –   No other hypothesis explains D as well as H does
   –   Therefore H is probably correct
• Although the problem can be viewed similarly to classification
  – we need to locate an H that accounts for D
   – We now need additional knowledge, explanatory knowledge
        • What data can H explain?
        • How well can H explain the data?
        • Is there some way to evaluate H given D?
   – Additionally, we will want to know if
        • H is consistent
        • Did we consider all H’s in our domain?
• What complicates generating a best explanation is that H and
  D are probably not singletons but sets
• Assume H is a collection of hypotheses that can all
  contribute to an explanation, H = {H1, H2, H3, …, Hn}
• D is a collection of data to be explained, D = {d1, d2,
  d3, …, dn}
   – a given hypothesis can account for one or more data (e.g., H3
     can explain {d1, d5})
   – assume that we have ranked all elements of H with some
     scoring algorithm (Bayesian probability, neural network
     strength of activation, feature-based pattern matching, etc)

• The abductive process is
  to generate the best
  subset of H that can
  explain D
   – what does best mean?
               Ways to View “Best”
• We will call a set of hypotheses that can explain the data a
  composite hypothesis
• The best composite hypothesis should have these features
   – Complete – explains all data (or as much as is possible)
   – Consistent – there are no incompatibilities among the hypotheses
   – Parsimonious – the composite has no superfluous parts
   – Simplest – all things considered, the composite should have as few
     individual hypotheses as possible
   – Most likely – this might be the most likely composite or the
     composite with the most likely hypotheses (how do we compute this?)
• In addition, we might want to include additional factors
   – Cheapest costing (if applicable) – the composite that would be the
     least expensive to believe
   – Generated with a reasonable amount of effort – generating the
     composite in a non-intractable way (abduction is generally an
     NP-complete problem)
   Internist – Rule based Abduction
• One of the earliest expert systems to apply
  abduction was Internist, built to diagnose diseases of internal medicine
   – Internist was largely a rule-based system
   – The abduction process worked as follows
      • Data trigger rules of possible diseases
      • For each disease triggered, determine what other symptoms are
        expected by that disease, which are present and which are absent
          – Generate a score for that disease hypothesis
      • Now compare disease hypotheses to differentiate them
          – If one hypothesis is more likely, try to confirm it
          – If many possible hypotheses, try to rule some out
          – If a few hypotheses available, try to differentiate between them by
            seeking data (e.g., test results) that one expects that the others do not
   – The diagnostic conclusions are those hypotheses that still
     remain at the end that each explain some of the data
         Neural Network Approach
• Paul Thagard developed ECHO, a system that evaluates explanatory coherence
   – ECHO was developed as a neural network where nodes represent
     hypotheses and data
   – links represent potential explanations between hypotheses and data
   – and hypothesis relationships (mutual incompatibilities, mutual
     support, analogy)
• Unlike a normal neural network, nodes here represent specific
  hypotheses and data
   – weights reflect the strength of the relationships found in test data
• In fact, the approach is far more like a Bayesian network with
  edge weights representing conditional probabilities (counts of
  how often a hypothesis supports a datum)
   – When data are introduced, perform a propagation algorithm over the
     present data until the hypothesis nodes and data nodes have reached
     a stable state (similar to a Hopfield net); the best explanation
     then consists of those hypothesis nodes whose probabilities are
     above a preset threshold
Ex: Evolution (DH) vs Creationism (CH)
        Probabilistic Approach(es)
• Pearl’s Belief networks and the generic idea behind
  the HMM are thought to be abductive problem
  solving techniques
  – Notice that there is no explicit coverage of hypotheses to
    data, for instance, we do not select a datum and ask “what
    will explain this?”
   – Instead, the solution is derived to be the best explanation
     but where the explanation is generated by finding the most
     probable cause of the collection of data in a holistic manner
• The typical Bayesian approach contains probabilities
  of a hypothesis (state) being true, of a hypothesis
  transitioning to another hypothesis, and of an output
  being seen from a given hypothesis
  – But there is no apparent mechanism to encode hypothesis
    incompatibilities or analogies
• In the diagram (omitted) of a model-based diagnosis problem
   – I represents inputs
   – O represents outputs
   – Ab represents component parts that might be abnormal
• In the formula, dc is a diagnostic conclusion based on input and
  output i, o
             The Peirce Algorithm
• The previous strategies assume that knowledge is
  available in either a rule-based or probabilistic form
• The Peirce algorithm instead uses generic tasks
   – The algorithm has evolved over the course of constructing
     several knowledge-based systems
• The basic idea is
   – Generate hypotheses
      • this might be through hierarchical classification, neural network
        activity, or other
   – Instantiate generated hypotheses
      • for each hypothesis, determine its explanatory power (what it can
        explain from the data), hypothesis interactions (for the other
        generated hypotheses, are they compatible, incompatible, etc) and
        some form of ranking
   – Assemble the best explanation
      • see the next slide
              The Assembly Algorithm
• Examine all data and see if there are any data that can only be explained
  by a single hypothesis
     – such a hypothesis is called an essential hypothesis
•   Include all essential hypotheses in the composite
•   Propagate the effects of including these hypotheses (see next slide)
•   Remove from the data all data that can be explained
•   Start from the top (this may have created new essentials)
•   Examine remaining data and see if there are any data that can only be
    explained by a superior hypothesis
     – such a hypothesis would clearly beat all competitors by having a much
       higher ranking
• Include all superior hypotheses in the composite, propagate and remove
• Start from the top (this may have created new essentials)
• Examine remaining data and see if there are any data that can only be
  explained by a better hypothesis
     – such a hypothesis would be better than all competitors
• Include all better hypotheses in the composite, propagate and remove
• Start from the top (this may have created new essentials)
• If there are still data to explain, either guess or quit with
  unexplained data remaining
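The essentials pass of this assembly loop, with incompatibility propagation, can be sketched as follows. The hypotheses, data, and the incompatibility relation are invented; the "superior" and "better" passes would work the same way using score margins instead of unique coverage:

```python
# Toy instance: which data each hypothesis can explain, plus one
# incompatibility relation (all invented for illustration)
explains = {
    "H1": {"d1", "d2"},
    "H2": {"d2", "d3"},
    "H3": {"d3"},
}
incompatible = {("H1", "H3")}
data = {"d1", "d2", "d3"}

composite, remaining, hyps = set(), set(data), set(explains)
changed = True
while changed and remaining:
    changed = False
    for d in sorted(remaining):
        coverers = [h for h in sorted(hyps) if d in explains[h]]
        if len(coverers) == 1:           # unique explainer: essential
            h = coverers[0]
            composite.add(h)
            remaining -= explains[h]     # remove explained data
            # propagate: discard hypotheses incompatible with h
            hyps -= {x for x in hyps
                     if (h, x) in incompatible or (x, h) in incompatible}
            changed = True               # start over: new essentials?
            break

print(sorted(composite), sorted(remaining))   # ['H1', 'H2'] []
```

The trace shows the island-building effect: d1 makes H1 essential, including H1 discards the incompatible H3, and that in turn leaves H2 as the only remaining explainer of d3, creating a new essential.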
• The idea behind the Peirce algorithm is to build on islands of certainty
   – If a hypothesis is essential, i.e., it is the only way to explain
     something, it MUST be part of the best explanation
• If a hypothesis is included in the composite, we can leverage
  knowledge of how that hypothesis relates to others
   – If the hypothesis, say H1, is incompatible with H2, since we
     believe H1 is true, H2 must be false, discard it
   – If hypothesis H1 is very unlikely to appear with H2, we can
     downgrade H2’s ranking
   – If hypothesis H1 is likely to appear with H2, we can either
     reconsider H2 or just bump up its ranking
   – If hypothesis H1 can be inferred to be H2 by analogy, we can
     include H2
• Since H1 was included because it was the only (or best) way
  to explain some data, we build upon that island of certainty
  by perhaps creating new essentials because H1 is
  incompatible with other hypotheses
                Layered Abduction
• For some problems, a single data to hypothesis
  mapping is insufficient
  – Either because we have more knowledge to bring to bear
    on the problem or because we want an explanation at a
    higher level of reasoning
     • For instance, in speech recognition, we wouldn’t want to just
       generate an explanation of the acoustic signal as a sequence of
       phonetic units
     • So we map the output of one level into another
         – The explanation of one layer becomes the input of the next layer – we
           explain the phonetic unit output as a sequence of syllables, and we
           explain the syllables as a sequence of words, and then explain the
           sequence of words as a meaningful statement
  – We can use partially formed hypotheses at a higher level
    to generate expectations for a lower layer thus giving us
    some top-down guidance
Example: Handwritten Character Recognition (CHREC)
                Overall Architecture
 • The system has a search space of hypotheses
    – the characters that can be recognized
        • this may be organized hierarchically, but here, it's just a flat
          space – a list of the characters
    – each character has at least one recognizer
        • some have multiple recognizers if there are multiple ways to
          write the character, like 0 which may or may not have a
          diagonal line from right to left
After hypotheses are generated for each character in the input, the
assembler selects the best ones to account for the features
            Explaining a Character
• The features (data) found to be explained for this character
  are three horizontal lines and two curves
• While both the E and F characters were highly rated, “E”
  can explain all of the features while “F” cannot, so “E” is
  the better explanation
             Top-down Guidance
• One benefit of this approach is that, by using
  domain dependent knowledge
   – the abductive assembler can increase or decrease
     individual character hypothesis beliefs based on
     partially formed explanations
    – for instance, in the postal mail domain, if the
      assembler detects that it is working on the zip code
      (because it already found the city and state on one
      line), then it can rule out any letters that it thinks it sees
       • since we know we are looking at Saint James, NY, the
         following five characters must be numbers, so “I” (for one of
         the 1’s), “B” (for the 8), and “O” (for the 0) can all be ruled
         out (or at least scored less highly)
   Model-based Diagnosis: Functional
• In all of our previous examples of diagnosis and
  interpretation, our knowledge was associational
   – We associate these symptoms/data with these malfunctions
      • This is fine when we do not have a complete understanding of the
        system, as in
         – Medical diagnosis
         – Speech recognition
         – Vision understanding
  – What if we do understand the system?
     • E.g., a human-made artifact
  – If this is the case, we should be able to provide
    knowledge in the form of the function that a given
    component will provide in the system and how that
    function is achieved through its behavior (process)
     • Debugging can be performed by simulating performance with
       various components not working
                The Clapper Buzzer
• This mechanical device works as follows:
   – When you press the button (not shown) it completes the circuit
     causing current to flow to the coil
   – When the magnetic coil charges, it pulls the clapper hand toward it
   – When the clapper hand moves, it disconnects the
     circuit causing the coil to stop pulling the hand, and
     the hand falls back, hitting a bell (not shown)
     causing the ringing sound
   – This also reconnects the circuit, and so this process
     repeats until the button is no longer pressed
           Generating a Diagnosis
• Given a functional representation, we can reason
  over whether a function can be achieved or not
  – Hypothetical or “what would happen if” reasoning
     • What would happen if the coil was not working?
     • What would happen if the battery was not charged?
     • What would happen if the clapper arm were blocked?
  – We can also use the behavior and test results to find out
    what function(s) was not being achieved
     • With the switch pressed, we measure current at the coil, so the
       coil is being charged
      • We measure a magnetic attraction to show that the coil is working
      • We do not hear a clapping sound, so the magnetic attraction is
        either not working, or the acoustic law is not being fulfilled
          – Why not? Perhaps the arm is not magnetic? Perhaps there is
            something on the arm so that when it hits the bell, no sound
            is produced
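This what-if reasoning can be sketched by simulating the buzzer's causal chain with chosen components disabled. The chain below is a simplification of the device description, with component names invented for the sketch:

```python
# Causal chain of the clapper buzzer (simplified): each function
# depends on a component working AND the previous function holding.
CHAIN = [
    ("current_flows", "battery"),
    ("coil_magnetized", "coil"),
    ("clapper_pulled", "clapper_arm"),
    ("bell_rings", "bell"),
]

def simulate(broken):
    """Return the functions still achieved when `broken` components fail."""
    achieved, ok = [], True
    for function, component in CHAIN:
        ok = ok and component not in broken
        if ok:
            achieved.append(function)
    return achieved

print(simulate(set()))       # the whole chain is achieved
print(simulate({"coil"}))    # ['current_flows']
```

Running the simulation backwards supports diagnosis: if tests show the chain was achieved up to "coil_magnetized" but the bell does not ring, the suspects are the components downstream of the last achieved function.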
 Model-based Diagnosis: Probabilistic
• While a functional representation can be useful for
  diagnosis, it is somewhat problem independent
  – FRs can be used for prediction (WWHI reasoning),
    diagnosis, planning and redesign, etc
• Diagnosis typically is more focused, so we can
  create a model of system components and their
  performance and enhance the system with
  – Failure rates can be used for prior probabilities
  – Evidential probabilities can be used to denote the
    likelihood of seeing a particular output from a
    component given that it has failed
• Bayesian probabilities can then be easily computed
• The device consists of 3 multipliers (M1, M2, M3)
  and 2 adders (A1, A2)
• F computes A*C + B*D
• G computes B*D + C*E
   – Given the inputs, F should output 12 but computes 10
   – Given the inputs, G should output 12 and does
• We use the model to compute the diagnosis
   – Possible malfunctions are with M1, M2, A1 but not M3 or A2
• If we can probe the inside of the machine
   – we can obtain values for X, Y and Z to remove some of the
     contending malfunction hypotheses
• We can employ probabilities of component failure rate and
  likelihood of seeing particular values given the input to
  compute the most likely cause
   – note: it could be a multiple component failure
• If we have a model of the multiplier and adder, we can also
  use that knowledge to assist in diagnosis
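The candidate generation step above can be sketched directly: simulate the model, compare predicted and observed outputs, and suspect every component upstream of a wrong output. The input values below (A=3, B=2, C=2, D=3, E=3) are assumed for illustration, chosen so that F and G should both be 12.

```python
def simulate(A, B, C, D, E):
    """Model of the device: X, Y, Z are the multiplier outputs."""
    X, Y, Z = A * C, B * D, C * E       # multipliers M1, M2, M3
    F, G = X + Y, Y + Z                 # adders A1, A2
    return {"X": X, "Y": Y, "Z": Z, "F": F, "G": G}

# Which components lie upstream of (can affect) each output
CONE = {"F": {"M1", "M2", "A1"}, "G": {"M2", "M3", "A2"}}

def suspects(predicted, observed):
    """Every component in the cone of an incorrect output is a
    candidate malfunction; components feeding only correct outputs
    are not implicated."""
    bad = {o for o in observed if observed[o] != predicted[o]}
    result = set()
    for o in bad:
        result |= CONE[o]
    return result

pred = simulate(3, 2, 2, 3, 3)              # predicts F = 12, G = 12
print(suspects(pred, {"F": 10, "G": 12}))   # M1, M2, A1 but not M3, A2
```

Probing X, Y and Z would then compare measured against predicted internal values to prune this suspect set further, exactly as the slide describes.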
       Neural Network Approach
• Recall that neural networks, while trainable to perform
  recognition tasks, are knowledge-poor
   – Therefore, they seem unsuitable for diagnosis
• However, there are many diagnostic tasks or subtasks
  that revolve around
   – data interpretation
   – visual understanding
• And neural networks might contribute to diagnosis by
  solving these lower level tasks
• NNs have been applied to assist in
   – Congestive heart failure prediction based on patient
     background and habits
   – Medical imaging interpretation for lung cancer and breast
      cancer (MRI, chest X-ray, CT scan, radioactive isotope, etc)
   – Interpreting forms of acidosis based on blood work analysis
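At the heart of such lower-level tasks is a learned mapping from numeric features to a class score. The sketch below shows the forward pass of a single trained neuron; the feature vector and weights are entirely hypothetical placeholders — a real system would learn them from labeled data (e.g., blood-work measurements).

```python
import math

def sigmoid(x):
    """Squash a weighted sum into a confidence-like value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def score(features, weights, bias):
    """Forward pass of one neuron: weighted sum, then squashing."""
    return sigmoid(bias + sum(w * f for w, f in zip(weights, features)))

# Hypothetical learned parameters for one diagnostic class
w, b = [2.0, -1.5, 0.5], -0.25
print(score([1.0, 0.2, 0.4], w, b))   # a value in (0, 1)
```

A full network stacks many such units and is trained by backpropagation; the point here is only that the output is a graded score an AIM system can use as evidence, not a symbolic explanation — which is why NNs fit the knowledge-poor, lower-level role the slide describes.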
           Case-Based Diagnosis
• Case based reasoning is most applicable when
  – There are a sufficiently large number of cases
  – There is knowledge of how to manipulate a previous case
    to fit the current situation
      • This is most commonly done with planning/design, not diagnosis
  – So for diagnosis, we need a different approach
     • Retrieve all cases that are deemed relevant for the current input
      • Recommend those cases that match closely, combining
        common diagnoses via a weighted voting scheme
     • Supply a confidence based on the strength of the votes
     • If deemed useful, retain the case to provide the system with a
       mechanism for “learning” based on new situations
  – This approach has been employed by GE for diagnosing
    gas engine turbine problems
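The retrieve/vote/confidence steps above can be sketched as follows. The similarity measure, threshold, case features and fault labels are all illustrative assumptions, not details of the GE system.

```python
def similarity(a, b):
    """Illustrative inverse-distance similarity between feature vectors."""
    return 1.0 / (1.0 + sum(abs(x - y) for x, y in zip(a, b)))

def diagnose(new_case, case_library, threshold=0.2):
    """Retrieve relevant cases, let each vote for its diagnosis
    weighted by similarity, and report a confidence."""
    votes = {}
    for features, diagnosis in case_library:
        w = similarity(new_case, features)
        if w >= threshold:                       # retrieval step
            votes[diagnosis] = votes.get(diagnosis, 0.0) + w
    if not votes:
        return None, 0.0
    best = max(votes, key=votes.get)
    confidence = votes[best] / sum(votes.values())
    return best, confidence

# Illustrative case library: (feature vector, recorded diagnosis)
library = [([1.0, 0.2], "bearing wear"), ([0.9, 0.3], "bearing wear"),
           ([0.1, 1.0], "fuel fault")]
print(diagnose([0.95, 0.25], library))
```

Retaining the new case (with its eventually confirmed diagnosis) in the library is what gives the system its "learning" mechanism.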
                   AI in Medicine
• The term (abbreviated as AIM) was first coined in
  1959, although actual usage didn't occur until the
  1970s with Mycin
  – Surprisingly, using AI for medical diagnosis has largely
    not been adopted in practice, in spite of all of the research
    systems developed, in part because
     • the expert systems impose changes to the way that a clinician
       would perform their task (for instance, the need to have certain
       tests ordered at times when needed by the system, not when the
       clinician would normally order such a test)
      • the problem(s) solved by the expert system is not a particular
        issue needing solving (either because the clinician can solve the
        problem adequately, or the problem is too narrow in scope)
     • the cost of developing and testing the system is prohibitive
                          AIM Today
• So while AI diagnosis still plays a role in AIM, it is a small
  role, much smaller than those in the 1980s would have predicted
• Today, AIM performs a variety of other tasks
   – Aiding with laboratory experiments
   – Enhancing medical education
   – Running with other medical software (e.g., databases) to
     determine if inconsistent data or knowledge has been entered
       • for instance, a doctor prescribing medication that the patient is known to
         be allergic to
    – Generating alerts and reminders about specific patients to nurses,
      doctors or the patients themselves
    – Diagnostic assistance – rather than performing the diagnosis,
      they help the medical expert when the particular problem is a
      rare case
   – Therapy critiquing and planning, for instance by finding
     omissions or inconsistencies in a treatment
    – Image interpretation of X-Rays, CT scans, MRI, etc
               AI Systems in Use
• Puff – interpretation of pulmonary function tests has been
  sold to hundreds of sites world-wide starting as early as
• GermWatcher – used in hospitals to detect in-patient
  acquired infections by monitoring lab culture data
• PEIRS – pathology expert interpretive reporting system is
  similar; it generates 80-100 reports daily with an accuracy
  of about 95%, providing reports on such things as thyroid
  function tests, arterial blood gases, urine and plasma
  catecholamines, glucose test results and more
• KARDIO – a decision tree learning system that interprets
  ECG test results
• Athena – a decision support system that implements guidelines
  for hypertension patients to instruct them on how to be
  more healthy, in use since 2002 in clinics in NC and
  northern CA
• PERFEX – an expert rule-based system to assist with
  medical image analysis for heart disease patients
• Orthoplanner – plans orthodonture treatments using
  rule-based forward and backward chaining and fuzzy
  logic, in use in the UK since 1994
• PharmAde and DoseChecker – expert systems to
  evaluate drug therapy prescriptions given the patient’s
  background for inaccuracies, negative interactions, and
  adjustments, in use in many hospitals starting in
• IPROB – intelligent clinical management system to keep
  track of obstetrics/gynecology patient records and
  cases, risk reduction, decision support through
  distributed databases and rules based on hospital
  guidelines, practices, etc, in use since 1995
