Image Tagging

Attaching textual meta-information or semantic linkages to images

By Perry Rajnovic
What is a Digital Image?
- A digital image is usually defined as an organized grid of pixels, often called a bitmap.
- Each pixel is a numeric representation of the color intensities at that point (a minimal sketch follows).
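
To make the bitmap idea concrete, here is a minimal sketch in TypeScript of an image as a grid of numeric pixel values; the type and field names are illustrative assumptions, and real formats add color depth, compression, and so on.

```typescript
// A minimal sketch of a bitmap: a width x height grid of numeric RGB pixels.
// The names and the flat row-major layout are illustrative assumptions,
// not any particular file format.
interface Pixel {
  r: number; // red intensity, 0-255
  g: number; // green intensity, 0-255
  b: number; // blue intensity, 0-255
}

interface Bitmap {
  width: number;
  height: number;
  pixels: Pixel[]; // row-major: pixel (x, y) sits at index y * width + x
}

// Look up one pixel's numeric color values; note there is no text or
// semantic description anywhere in this structure.
function pixelAt(image: Bitmap, x: number, y: number): Pixel {
  return image.pixels[y * image.width + x];
}
```

Nothing in such a structure carries words or meaning, which is exactly the machine-readability gap the next slides discuss.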
What is a Digital Image?
- Each pixel may be stored explicitly or produced by rendering, for example from vector graphics or a drawing package.
- These representations include no inherent textual elements or semantic description.
- As a result, images are not easily machine-readable.
Why machine-readability?
- Most searches are issued as textual queries, so there must be a mechanism to link applicable keywords or phrases to images.
- For blind users, conveying the information in an image through another medium improves accessibility.
Image Contents
- The contents of an image can be described in full prose (as in the adage "a picture is worth a thousand words") or with just a few keywords covering spatial, temporal, or emotional aspects.
- In many cases, accurately identifying the content of an image requires human intervention.
Identifying Image Contents
- Many good pattern recognition algorithms exist; however, few can interpret the patterns they extract.
- Artificial intelligence algorithms can learn to recognize patterns, but such a system's flexibility is limited by its predefined knowledge base.
Identifying Text in Images
- CAPTCHAs (Completely Automated Public Turing tests to tell Computers and Humans Apart) are images that contain a distorted rendering of some text.
Identifying Text in Images
- Their goal is to pose a task that is easy for humans but extremely hard for computer programs to perform.
- For this task, OCR alone is generally not sufficient to extract the text.
- This is a good example of why machine-readable information should be available.
Example Tag Contents
- As an example of what might be provided to tag an image, here is a list of words and phrases describing this slide's header graphic:
  - Navy Blue
  - Squares
  - Fade-out
  - Horizontal Bar
  - Minimalist
  - Decorative
User Applications
- Many applications take advantage of image tagging; below are a few examples:
  - Apple iPhoto
  - Google Picasa
  - Adobe Photoshop Elements
- Generally these programs use tagging for organization and user-defined searching.
Web Applications
- Several web-based applications now include tagging for images, alongside other non-image features:
  - Google Image Labeler
  - Flickr.com
  - Facebook.com
  - 23hq.com
  - Fotki.com
Google Images
- Luis von Ahn developed the "ESP Game", which can be used to tag images.
- He presented a Google tech talk about the game as a form of human computation.
- Google later licensed the technology to create a similar web application called Google Image Labeler.
Google Image Labeler
- The Image Labeler game pairs two random users to generate tags that accurately describe images (a sketch of the agreement rule follows this slide).
- The tags tend to be accurate because of the game's constraints, and they gain specificity over several rounds.
- The collected tags can then improve image searches.
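
As a rough sketch of the agreement rule behind ESP-style labeling games (the function and its details are illustrative assumptions, not Google's implementation): a label counts as a tag only when both paired players enter it, and labels from earlier rounds can be marked taboo to push players toward more specific tags.

```typescript
// Sketch of an ESP-game-style agreement rule: a label is kept as a tag only if
// both paired players typed it, and labels marked taboo from earlier rounds
// are excluded, nudging players toward more specific tags. Illustrative only.
function agreedTags(
  playerA: string[],
  playerB: string[],
  taboo: Set<string>,
): string[] {
  const normalize = (s: string) => s.trim().toLowerCase();
  const fromB = new Set(playerB.map(normalize));
  return playerA
    .map(normalize)
    .filter((label) => fromB.has(label) && !taboo.has(label));
}

// Both players typed "dog", so it becomes a tag; "animal" was agreed on in an
// earlier round and is now taboo, so the players must get more specific.
const tags = agreedTags(
  ["animal", "dog", "grass"],
  ["dog", "puppy", "animal"],
  new Set(["animal"]),
);
console.log(tags); // ["dog"]
```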
Flickr
- Flickr is a "Web 2.0" photo hosting and sharing site.
- Users are encouraged to upload photos, then to name, describe, tag, annotate, geotag, comment on, and group their photos in collaborative ways.
Flickr - Tags
- Tags are words or phrases meant to act as keywords.
- They are searchable within the site and can show popular topics.
- They improve search relevance.
Flickr - Geotagging
- Geotagging means adding geospatial metadata to images, such as the latitude, longitude, and other location details describing where a photo was taken (an illustrative record is sketched below).
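
For illustration only (not Flickr's or EXIF's actual schema), a geotag record might look like this:

```typescript
// Illustrative shape for geospatial photo metadata; the field names are
// assumptions, not Flickr's or EXIF's exact schema.
interface GeoTag {
  latitude: number;  // decimal degrees, north positive
  longitude: number; // decimal degrees, east positive
  altitude?: number; // meters above sea level, if known
  heading?: number;  // compass direction the camera faced, in degrees
}

// Example: a photo taken near the Eiffel Tower, camera facing roughly west.
const parisShot: GeoTag = {
  latitude: 48.8584,
  longitude: 2.2945,
  heading: 270,
};
```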
What are annotations?
- Wikipedia defines them as: extra information associated with a particular point in a document or other piece of information.
- The US DoD defines them as: a marking placed on imagery or drawings for explanatory purposes or to indicate items or areas of special importance.
Annotating Images
- The use of annotations with images can provide several useful functions. Below are some examples:
  - Point out a specific piece of content.
  - Explain some icon or graphic.
  - Summarize the meaning of some region.
  - Provide additional information via text.
Flickr - Notes
- Flickr provides a feature called Notes, a Flash-based implementation of an annotation system.
- You can dynamically size a rectangular region over a portion of the image, then attach a snippet of text to describe it (see the sketch below).
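
A note of this kind can be modeled as a rectangle plus a text snippet; the sketch below illustrates the idea and is not Flickr's internal format.

```typescript
// Sketch of a Notes-style annotation: a rectangle over part of the image plus
// a short text snippet. Field names and pixel units are illustrative assumptions.
interface ImageNote {
  x: number;      // left edge of the rectangle, in pixels from the image's left
  y: number;      // top edge, in pixels from the image's top
  width: number;  // rectangle width in pixels
  height: number; // rectangle height in pixels
  text: string;   // the snippet describing this region
}

const notes: ImageNote[] = [
  { x: 120, y: 40, width: 200, height: 150, text: "The lighthouse on the cliff" },
];
```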
FotoNotes
- FotoNotes is a data format for annotating images.
- It allows you to embed the metadata directly into the image files for portability.
- Flickr's Notes feature was inspired by this standard and its accompanying visualization implementation.
FotoNotes - More
- It was developed by Greg Elin.
- The homepage provides links to groups working with the standard.
- Additionally, an implementation that works in most browsers is provided as-is for customization.
Facebook
- Facebook.com has a tagging feature that is integrated with "My Photos".
- It allows you to add a textual descriptor (a tag or a person's name) to a specific point in the image.
- This lets the module describe who or what is included in a specific album.
Facebook – Tag Display
- When the images are viewed, placing the mouse over a tag displays a fixed-size square indicating where the tagged person or object is located within the image.
- This enables users to identify objects by visual inspection or by matching the list of contained objects with their tag displays.
Facebook – Links
- Another capability incorporates the site's concept of friends: if the person you tag is identified as your friend on the site, their name will link to their profile.
- The site will also count the image in the "photos of" feature on their profile, allowing inclusion of photos added by other users.
Other Image Metadata: MPEG
- The MPEG-7 standard is a "Multimedia Content Description Interface".
- "MPEG-7 is not aimed at any one application in particular; rather, the elements that MPEG-7 standardises shall support as broad a range of applications as possible."
Other Image Metadata: Adobe
- Adobe Systems created a new metadata framework for images called XMP (Extensible Metadata Platform).
- It is publicly documented, based on W3C standards, built on XML, and designed to eliminate growing incompatibilities in metadata storage.
Other Image Metadata: IPTC
- The International Press Telecommunications Council created standards for the interchange of news data over a decade ago.
- These standards persist in the IPTC's IIM (Information Interchange Model) standard and can also be used within the newer XMP framework.
Improving Clustering Search Interfaces

Joint Term Project
By Perry Rajnovic and Mark Zalar
Term Project
- For my term project, I will be working with Mark Zalar to develop a new search engine interface.
- It will draw inspiration from today's top search engines, along with the enhancements now possible using emerging technologies.
Project Goal
- The goal of the project is to implement a search site that provides a highly usable interface for query refinement.
- Our backend will use clustering mechanisms to allow easy refocusing of search topics.
- Our frontend will use AJAX for flexibility.
Frontend Design
- The frontend will be built with a technology known as Asynchronous JavaScript and XML (AJAX).
- This technology allows the site designer to issue background requests to the server and parse the XML-based results in the scripting language for interactivity (a minimal sketch follows this slide).
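
A minimal sketch of that request-and-parse pattern in modern browser TypeScript is below; the /clusters endpoint, the <cluster label="..."> elements, and the element ID are hypothetical, and a project of that era would likely have used XMLHttpRequest rather than fetch.

```typescript
// Sketch of the AJAX pattern: request XML in the background, parse it, and
// update the page without a reload. The /clusters endpoint, the
// <cluster label="..."> elements, and the "cluster-menu" element ID are
// hypothetical.
async function loadClusters(query: string): Promise<string[]> {
  const response = await fetch(`/clusters?q=${encodeURIComponent(query)}`);
  const xmlText = await response.text();
  const doc = new DOMParser().parseFromString(xmlText, "application/xml");
  return Array.from(doc.querySelectorAll("cluster")).map(
    (el) => el.getAttribute("label") ?? "",
  );
}

// Rebuild the cluster menu in place as the user refines the query.
async function refreshClusterMenu(query: string): Promise<void> {
  const menu = document.getElementById("cluster-menu");
  if (!menu) return;
  menu.innerHTML = ""; // clear the old entries
  for (const label of await loadClusters(query)) {
    const item = document.createElement("li");
    item.textContent = label;
    menu.appendChild(item);
  }
}
```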
Frontend Theory
- Most clustering-based search solutions available today offer minimal interactivity.
- Our theory is that the ability to harness clustering dynamically while refining a search will improve both the results and the time needed to complete a search.
AJAX Functionality
- Our site will use AJAX to dynamically reconfigure the clustering menu. This allows quicker browsing of clusters to identify the most relevant set of pages to search within.
- The menu will also use a novel interface that shows sibling and parent clusters.
AJAX Functionality 2
- The results will be displayed to the user with some animation.
- This will help alert users when changes are made to the order or set of results.
- Another advantage is that users will be more aware of the differences between clusters as they browse them.
Search Target
- This search engine could target both websites and images.
- Accurate keywords give the engine better knowledge of each item's content.
- Clustering would be highly useful for finding an image with a desired scene or set of objects.
Example: search "creature"
- The engine might identify general clusters such as "animal" or "being".
- "Animal" might have more results, so the mid-level clusters are shown for that branch.
Example: search "creature"
- The user wants a general discussion of mammals and selects that cluster.
- The results change to focus on those related to mammals as a group and to specific mammals (an illustrative cluster tree is sketched below).
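
One way to picture the cluster structure in this example is as a small tree; the sketch below uses the cluster names from these slides, while the extra labels, result counts, and helper function are illustrative assumptions.

```typescript
// Illustrative cluster tree for the "creature" example. The structure, the
// extra labels, and the result counts are assumptions used only to show the idea.
interface Cluster {
  label: string;
  resultCount: number;
  children: Cluster[];
}

const creature: Cluster = {
  label: "creature",
  resultCount: 500,
  children: [
    {
      label: "animal",
      resultCount: 350,
      children: [
        { label: "mammal", resultCount: 120, children: [] },
        { label: "bird", resultCount: 90, children: [] },
        { label: "reptile", resultCount: 60, children: [] },
      ],
    },
    { label: "being", resultCount: 150, children: [] },
  ],
};

// When the user selects "mammal", its parent ("animal") and siblings stay
// visible in the menu so the search can be refocused quickly.
function siblingsOf(parent: Cluster, selected: string): string[] {
  return parent.children
    .filter((c) => c.label !== selected)
    .map((c) => c.label);
}
console.log(siblingsOf(creature.children[0], "mammal")); // ["bird", "reptile"]
```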
Results Display
- To provide animation, a technology similar to that found in "TiddlyWiki" will be used.
- This interface allows topics to be added and removed dynamically with animation (see the sketch below).
- Additionally, extra links can be attached to each topic for more functionality (open in new window, similar items, etc.).
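
A minimal sketch of the add/remove-with-animation idea, using the browser's Web Animations API rather than TiddlyWiki's actual mechanism:

```typescript
// Sketch of animated result updates: fade new items in and fade removed items
// out so users notice when the result set or ordering changes. This uses the
// Web Animations API, not TiddlyWiki's actual mechanism.
function addResult(list: HTMLElement, title: string): void {
  const item = document.createElement("li");
  item.textContent = title;
  list.appendChild(item);
  item.animate([{ opacity: 0 }, { opacity: 1 }], { duration: 300 });
}

function removeResult(item: HTMLElement): void {
  const fade = item.animate([{ opacity: 1 }, { opacity: 0 }], { duration: 300 });
  fade.onfinish = () => item.remove();
}
```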
Future Enhancements
- Our implementation will provide a basic mockup of the interface and the refinement techniques it makes available.
- Several enhancements could be made to this interface to improve its usability or functionality.
Enhancements in Search
- Taking advantage of meta-search would give the clustering algorithm a higher volume of data from which to generate topic structures to explore.
- Using adaptive search (per user ID or via global optimization) would improve clustering by favoring the clusters users select most often.
Enhancements in Interface
- Because the site will be AJAX-based, a large amount of flexibility is possible with respect to changes in the interface.
- The browser window acts much like a canvas, with all of the site's underlying Document Object Model available for addition, modification, or deletion.