Annotating Scientific Images

Annotating Scientific Images:
A Concept-based Approach

Kai-Uwe Sattler, TU Dresden/Uni Magdeburg (Germany)
Michael Gertz, Fredric Gorin, Michael Hogarth, Jim Stone, UC Davis (US)

SSDBM 2002, Edinburgh
      Context & Requirements
      Model for Conceptualized Annotations
        –   Base Components
        –   Query Model
        –   Views
        –   Compatibility Checking
      Prototype Implementation

Kai-Uwe Sattler        Annotating Scientific Images   2
      Study of the (human) brain
      Brain slices, digitally photographed under a microscope
        – approx. 1200 sections of the monkey brain
        – each image about 55-85MB
        – cell and nuclei details at < 10 µm
      Researchers mark regions of interest (e.g., cell
      structures) and assign concepts (e.g., cell types)
      Collaborative research environment, geographically distributed
      Controlled vocabularies play a very important role
      (e.g., for naming structures)

        – heterogeneity of data and lack of local and global schema
        – extensive use of metadata to semantically enrich and
          describe (Web accessible) scientific datasets
        – What metadata model is appropriate and how is it used?
      Usage scenarios:
        –   Yahoo-like categorizations and catalogs of images
        –   Metadata search engine
        –   Browsing metadata and annotations
        –   Support of different views on the same data / images

Conceptualized Annotations
      Domain specific concept specifications:
        – Serve as semantically rich metadata schemes to encode
          domain knowledge
      Conceptualized data annotations:
        – Instances of specific metadata schemes associated with
          data (images)
        – Data annotations are external to datasets
        – Can occur at different levels of granularity
        – Provide semantically rich linkage structures among datasets
          and domain specific concepts
        – Different users can annotate the same dataset using
          different metadata schemes

Annotation Graph Model
      Uniform modeling constructs for
        –   base concepts,
        –   relationships among concepts,
        –   images (documents) and regions, and
        –   data annotations
      Formal model including a precise syntax and semantics
        – Graph structure consisting of
              nodes (concepts, annotations, and images)
              edges (relationships between nodes)
        – Basis for a query model

        – concept identifier, definition
        – set of concept names (terms) including synonyms and
          preferred term
        – set of properties (attributes, e.g. type, default value(s))
        – creational metadata elements (author, date)
      Base concepts and relationship type concepts
        – Relationship type concepts describe admissible
          relationships among base concepts (e.g., sub-type or
          spatially oriented)
        – annotates / annotatedBy
        – ofConcept / hasAnnotation
        – contains / containedIn
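
The concept components listed above can be pictured as a small record type. The following is only a sketch under stated assumptions: the class name `Concept`, its field names, the synonym "amacrine neuron", and the `size` property are illustrative and do not come from the slides. The three relationship pairs are taken from the slide.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, Optional, Set


@dataclass
class Concept:
    """One concept node: identifier, definition, terms, properties, creational metadata."""
    concept_id: str
    definition: str
    preferred_term: str
    synonyms: Set[str] = field(default_factory=set)
    # property name -> {"type": ..., "default": ...}
    properties: Dict[str, dict] = field(default_factory=dict)
    author: str = ""
    created: Optional[date] = None

    def names(self) -> Set[str]:
        """All terms (preferred term plus synonyms) under which the concept is known."""
        return {self.preferred_term} | self.synonyms


# The three relationship pairs from the slide, made usable in both directions.
INVERSE = {"annotates": "annotatedBy", "ofConcept": "hasAnnotation", "contains": "containedIn"}
INVERSE.update({inverse: name for name, inverse in list(INVERSE.items())})

# Illustrative instance; synonym and property are made up for the example.
amacrine = Concept(
    concept_id="C3",
    definition="(definition text)",
    preferred_term="amacrine cell",
    synonyms={"amacrine neuron"},
    properties={"size": {"type": "float", "default": None}},
    author="Smith",
    created=date(2002, 1, 15),
)
```

Keeping the inverse names in one table means a traversal can follow any relationship in either direction without storing each edge twice.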
Example: Concept Classification
      Spatial vs. sub-type relationships
      [Figure: two example concept hierarchies]
        – Retina cells: amacrine cell, bipolar cell, ganglion cell
        – Brain regions: … cerebrum, … diencephalon; epithalamus, dorsal, subthalamus
Images & Annotations
        – Images are assumed to be Web accessible (have a URL, e.g., managed
          by image repositories)
        – Description of fragments: region information (rectangle,
          polygon, circle, ...)
        – Annotations link image (fragments) to base concepts; they consist of
                 id of the referenced concept
                 URI of the image (fragment)
                 values for properties, based on the chosen base concept
                 creational metadata
        – An annotation can be viewed as an instance of a concept plus additional
          reference and author information
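
To make the annotation components concrete, here is a minimal sketch. The names `Rect`, `Annotation`, and `fragment_uri` are hypothetical, as are the example URL and property values. The `#xywh=` suffix borrows the spatial syntax of the W3C Media Fragments URI recommendation, which postdates the talk; the slides do not say how fragments were actually addressed.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Rect:
    """Rectangular region in pixel coordinates (one of the region shapes named above)."""
    x: int
    y: int
    width: int
    height: int


def fragment_uri(image_uri: str, region: Rect) -> str:
    """Address an image fragment by appending a Media Fragments style suffix (assumption)."""
    return f"{image_uri}#xywh={region.x},{region.y},{region.width},{region.height}"


@dataclass
class Annotation:
    """Links an image fragment to a base concept and carries property values."""
    annotation_id: str
    concept_id: str                  # id of the referenced concept
    image_uri: str                   # URI of the image
    region: Rect                     # fragment of the image being annotated
    values: Dict[str, object] = field(default_factory=dict)  # values for the concept's properties
    author: str = ""                 # creational metadata
    created: str = ""

    def target(self) -> str:
        return fragment_uri(self.image_uri, self.region)


a1 = Annotation(
    annotation_id="A1",
    concept_id="C3",
    image_uri="http://example.org/slices/0042.tif",
    region=Rect(10, 20, 300, 200),
    values={"size": 8.5},
    author="Smith",
    created="2002-03-12",
)
```

Because the annotation only stores a URI and a region, it stays external to the image, as the slides require.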

Example: Annotation Graph
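      [Figure: example annotation graph]
        – Concept nodes: C1 (cell), C2, C3 (cell type A), C4 (cell type B),
          connected by is_of_cell_type edges
        – Annotation nodes: A1, A3, connected to concepts by ofConcept edges
        – Image nodes: D1, D2, connected to annotations by annotates edges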

Query Model
      Queries for retrieving concepts, annotations, or images
      Based on path expressions and predicates on
      graphs (algebra and XPath-like syntax)
        – selections (concepts, annotations, images): predicate(V)
        – Graph traversal: traversing between concepts,
          annotations, and images (up/down traversal): relationship(V)
        – transitive closure: +relationship(V)
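
The three operators can be sketched over a plain edge list. This is an illustration, not the authors' algebra: `select`, `traverse`, and `closure` are hypothetical names for predicate(V), relationship(V), and +relationship(V), and the direction of the is_of_cell_type edges (from sub-type to super-type) and the node C5 are assumptions made for the example.

```python
from typing import Callable, Dict, List, Set, Tuple

Edge = Tuple[str, str, str]  # (source, relationship, target)


def select(nodes: Dict[str, dict], predicate: Callable[[dict], bool]) -> Set[str]:
    """predicate(V): keep the nodes whose attributes satisfy the predicate."""
    return {node_id for node_id, attrs in nodes.items() if predicate(attrs)}


def traverse(edges: List[Edge], relationship: str, start: Set[str]) -> Set[str]:
    """relationship(V): follow one edge of the given type from every node in V."""
    return {t for s, r, t in edges if r == relationship and s in start}


def closure(edges: List[Edge], relationship: str, start: Set[str]) -> Set[str]:
    """+relationship(V): all nodes reachable from V in one or more steps."""
    reached: Set[str] = set()
    frontier = set(start)
    while frontier:
        # Only expand nodes not seen before, so cycles terminate.
        frontier = traverse(edges, relationship, frontier) - reached
        reached |= frontier
    return reached


NODES = {
    "C1": {"kind": "concept", "term": "cell"},
    "C3": {"kind": "concept", "term": "cell type A"},
    "C5": {"kind": "concept", "term": "cell type A1"},   # hypothetical extra level
    "A1": {"kind": "annotation"},
    "D1": {"kind": "image"},
}
EDGES = [
    ("C3", "is_of_cell_type", "C1"),
    ("C5", "is_of_cell_type", "C3"),
    ("A1", "ofConcept", "C3"),
    ("A1", "annotates", "D1"),
]

concepts = select(NODES, lambda attrs: attrs["kind"] == "concept")
images_of_a1 = traverse(EDGES, "annotates", {"A1"})
ancestors_of_c5 = closure(EDGES, "is_of_cell_type", {"C5"})
```

Every operator takes a set of nodes and returns a set of nodes, which is what lets them be chained into path expressions.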

Example Queries
      „Images annotated using a given concept“

      „Images linked by the same annotation“

      „Similar images“, e.g. annotated by the
      same or a derived concept
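
The query expressions themselves did not survive on this slide, so the following is only a guess at how the three example queries might be answered. All function names are hypothetical, the small graph is invented for the example (A4 and D3 do not appear in the slides), and "derived concept" is read here as a concept reachable backwards over is_of_cell_type edges.

```python
from typing import Set

# (source, relationship, target); illustrative data only.
EDGES = [
    ("C3", "is_of_cell_type", "C1"),
    ("C4", "is_of_cell_type", "C1"),
    ("A1", "ofConcept", "C1"), ("A1", "annotates", "D1"),
    ("A3", "ofConcept", "C3"), ("A3", "annotates", "D2"),
    ("A4", "ofConcept", "C4"), ("A4", "annotates", "D2"), ("A4", "annotates", "D3"),
]


def follow(relationship: str, sources: Set[str]) -> Set[str]:
    """Follow edges of one type in their stored direction."""
    return {t for s, r, t in EDGES if r == relationship and s in sources}


def follow_back(relationship: str, targets: Set[str]) -> Set[str]:
    """Follow edges of one type against their stored direction."""
    return {s for s, r, t in EDGES if r == relationship and t in targets}


def images_for_concept(concept: str) -> Set[str]:
    """Images annotated using a given concept."""
    return follow("annotates", follow_back("ofConcept", {concept}))


def images_sharing_annotation(image: str) -> Set[str]:
    """Other images linked to this one by the same annotation."""
    return follow("annotates", follow_back("annotates", {image})) - {image}


def derived_concepts(concept: str) -> Set[str]:
    """The concept itself plus every concept derived from it (transitively)."""
    result, frontier = {concept}, {concept}
    while frontier:
        frontier = follow_back("is_of_cell_type", frontier) - result
        result |= frontier
    return result


def similar_images(image: str) -> Set[str]:
    """Images annotated by the same concept as this image, or by a derived one."""
    concepts = follow("ofConcept", follow_back("annotates", {image}))
    found: Set[str] = set()
    for concept in concepts:
        for derived in derived_concepts(concept):
            found |= images_for_concept(derived)
    return found - {image}
```

With this data, `similar_images("D1")` reaches D2 and D3 because D1 is annotated with the general concept C1, while `similar_images("D3")` reaches only D2.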

Views
      Individual researchers / groups have different
      views (interpretation, interest, …) on concepts,
      relationships, annotations, and images
        – View is a sub-graph of an annotation graph

           define view my_view as
             annotation := annotation[created >= '01/01/02']
             concept := concept[author = 'Smith']
        – „Only concepts that are spatially related“
        – „Only annotations that are based on certain cell types“
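
A view as a sub-graph can be sketched as a filter over nodes followed by dropping dangling edges. This is an assumption about the semantics, not the authors' definition: `define_view`, the node attributes, and the sample data are hypothetical, and node kinds without a predicate (here, images) are kept unchanged.

```python
from datetime import date
from typing import Callable, Dict, List, Tuple

Edge = Tuple[str, str, str]  # (source, relationship, target)


def define_view(
    nodes: Dict[str, dict],
    edges: List[Edge],
    predicates: Dict[str, Callable[[dict], bool]],
) -> Tuple[Dict[str, dict], List[Edge]]:
    """Return the sub-graph whose nodes pass the predicate given for their kind."""
    kept = {}
    for node_id, attrs in nodes.items():
        predicate = predicates.get(attrs["kind"])
        if predicate is None or predicate(attrs):
            kept[node_id] = attrs
    # An edge survives only if both of its endpoints are in the view.
    kept_edges = [(s, r, t) for s, r, t in edges if s in kept and t in kept]
    return kept, kept_edges


NODES = {
    "C3": {"kind": "concept", "author": "Smith"},
    "C4": {"kind": "concept", "author": "Jones"},
    "A1": {"kind": "annotation", "created": date(2001, 11, 5)},
    "A3": {"kind": "annotation", "created": date(2002, 3, 12)},
    "D1": {"kind": "image"},
    "D2": {"kind": "image"},
}
EDGES = [
    ("A1", "ofConcept", "C4"), ("A1", "annotates", "D1"),
    ("A3", "ofConcept", "C3"), ("A3", "annotates", "D2"),
]

# Counterpart of the my_view definition on the slide.
view_nodes, view_edges = define_view(
    NODES,
    EDGES,
    {
        "annotation": lambda a: a["created"] >= date(2002, 1, 1),
        "concept": lambda c: c["author"] == "Smith",
    },
)
```

Since the result is again a set of nodes and edges, the query operators could be applied to a view exactly as they are to the full graph.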

Compatibility Checking
        – Concepts and annotations introduced by
          different users should satisfy certain
          compatibility criteria
        – Poor quality of concepts and how concepts are
          used for annotating data can have a negative
          impact on data retrieval
        – Mechanisms that provide users with feedback
          on possible incompatibilities (analysis) and
          possible additional annotations (synthesis)
Annotation Level Mechanisms
        Task: determine annotations in the same image
        that might be incompatible
        –   Determine the spatial relationship between the regions associated with
            the new annotation a (based on concept c) and an existing annotation a’
             → same region, overlapping, or disjoint
        For „same region“:
        –   a’ is also based on concept c and has the
            same property instances
             → the new annotation is redundant
        –   a’ is also based on c but has different property instances
             → data conflict; review procedure can be triggered
        –   a’ is based on a concept different from c
             → concept reference conflict
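
The decision rules for the "same region" case translate almost directly into code. The sketch below assumes rectangular regions given as (x, y, width, height) and treats rectangles that merely touch as disjoint; both choices, like the function names and dictionary keys, are assumptions. The slides do not say what happens for overlapping or disjoint regions, so the check returns `None` there.

```python
from typing import Optional, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)


def spatial_relation(a: Rect, b: Rect) -> str:
    """Classify two rectangles as 'same', 'overlapping', or 'disjoint'."""
    if a == b:
        return "same"
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Strict comparisons: rectangles sharing only a border count as disjoint.
    overlaps = ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah
    return "overlapping" if overlaps else "disjoint"


def check_new_annotation(new: dict, existing: dict) -> Optional[str]:
    """Apply the 'same region' rules; return None when they do not apply."""
    if spatial_relation(new["region"], existing["region"]) != "same":
        return None
    if new["concept"] != existing["concept"]:
        return "concept reference conflict"
    if new["values"] == existing["values"]:
        return "redundant"
    return "data conflict"  # same concept, different property instances


existing = {"region": (10, 20, 300, 200), "concept": "C3", "values": {"size": 8.5}}
duplicate = {"region": (10, 20, 300, 200), "concept": "C3", "values": {"size": 8.5}}
disagreeing = {"region": (10, 20, 300, 200), "concept": "C3", "values": {"size": 9.0}}
other_concept = {"region": (10, 20, 300, 200), "concept": "C4", "values": {}}
```

A caller would run `check_new_annotation` against every existing annotation of the same image and report any non-`None` result back to the user as feedback.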

Concept Level Mechanisms
      Task: upon creating or modifying concepts,
      check for similar concepts
      Concept similarity:
        – Similar phrases for specifying terms (text similarity, …)
        – Similar structure / properties (schema matching)
      Usage scenarios:
        – Upon creation → provide an already existing and similar
          concept to the user
        – Exploit annotations to decide about similarity (concepts
          used for annotating the same regions)
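
Two of the similarity signals can be sketched with the Python standard library. The slides do not name a specific measure, so the choices here are assumptions: character-level similarity from `difflib.SequenceMatcher` for terms, Jaccard overlap of property names as a crude stand-in for schema matching, and an arbitrary threshold of 0.8. All function names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Set


def term_similarity(names_a: Set[str], names_b: Set[str]) -> float:
    """Best pairwise string similarity between the terms of two concepts (0.0 to 1.0)."""
    return max(
        (SequenceMatcher(None, a.lower(), b.lower()).ratio() for a in names_a for b in names_b),
        default=0.0,
    )


def property_similarity(props_a: Set[str], props_b: Set[str]) -> float:
    """Jaccard overlap of property names."""
    if not props_a and not props_b:
        return 0.0
    return len(props_a & props_b) / len(props_a | props_b)


def suggest_similar(
    new_names: Set[str], existing: Dict[str, Set[str]], threshold: float = 0.8
) -> List[str]:
    """Ids of existing concepts whose terms resemble the new concept's terms, best first."""
    scored = [(term_similarity(new_names, names), cid) for cid, names in existing.items()]
    return [cid for score, cid in sorted(scored, reverse=True) if score >= threshold]


EXISTING = {
    "C3": {"amacrine cell"},
    "C4": {"ganglion cell"},
}
suggestions = suggest_similar({"amacrin cell"}, EXISTING)
```

Here a user who types the misspelled term "amacrin cell" would be pointed to the existing concept C3 instead of creating a near-duplicate.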

      Web-based annotation service
        – Registering images
        – Creating and managing concepts
        – Creating and managing annotations (as instances
          of concepts, linked to images / image regions)
        – Web Service interface
        – Annotation graphs are stored in an RDBMS
        – Query evaluation by translating queries into SQL
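
The slide does not show the relational schema or the generated queries, and the target language of the translation is cut off (the SQL Builder component in the architecture suggests SQL). As one possible reading, the sketch below stores the graph in a single hypothetical `edge` table in SQLite and evaluates a transitive-closure step such as +contains(V) with a recursive common table expression. The table layout, the function name, and the sample rows are assumptions.

```python
import sqlite3

# +relationship(start): all nodes reachable over edges of one type.
# UNION (rather than UNION ALL) removes duplicates, so cycles terminate.
CLOSURE_SQL = """
WITH RECURSIVE reached(node) AS (
    SELECT target FROM edge
    WHERE relationship = :rel AND source = :start
    UNION
    SELECT e.target FROM edge AS e
    JOIN reached AS r ON e.source = r.node
    WHERE e.relationship = :rel
)
SELECT node FROM reached ORDER BY node
"""


def transitive_closure(conn: sqlite3.Connection, relationship: str, start: str) -> list:
    """Run the closure query and return the reachable node ids, sorted."""
    rows = conn.execute(CLOSURE_SQL, {"rel": relationship, "start": start})
    return [node for (node,) in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edge (source TEXT, relationship TEXT, target TEXT)")
conn.executemany(
    "INSERT INTO edge VALUES (?, ?, ?)",
    [
        # Illustrative rows only.
        ("cerebrum", "contains", "diencephalon"),
        ("diencephalon", "contains", "epithalamus"),
        ("diencephalon", "contains", "subthalamus"),
        ("A1", "annotates", "D1"),
    ],
)
conn.commit()

parts_of_cerebrum = transitive_closure(conn, "contains", "cerebrum")
```

Selections and single traversal steps would map to ordinary `WHERE` clauses and joins on the same table; only the closure operator needs recursion.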

System Architecture
      [Figure: system architecture]
        – Above the interface: Concept Modeller, Annotation Tool,
          Concept & … Browser, Search Engine
        – SOAP Interface
        – Below the interface: Query Parser, Translator, SQL Builder;
          Graph Mapping; Consistency Checker; Integrity Rules


      Annotation system
        – System for displaying images (of brain slices)
        – Concept browser for choosing concepts
        – Annotating regions by
              Marking a region in an image
              Instantiating a concept (incl. assigning property values)
      Search engine
        – Interface to query engine
        – Results: concepts, annotations, or images
        – Relationships (edges) represented as links
          between them → browsing the graph

Annotation System

Search Engine

      Framework to semantically enrich regions in
      scientific images
        – applicable to other types of documents
        – Annotations are external to documents
        – Support of multiple views and interpretations of the
          same data
        – Uniform and transparent access to data occurs through
          querying annotation graphs
        – Certain properties of classification hierarchies for
          concepts can be employed for compatibility checks
      Future work
        – Bootstrapping of concepts, terms, and definitions
        – Distributed annotation services
