A New Approach To Cross-Modal Multimedia Retrieval

Nikhil Rasiwasia, Jose M. Costa Pereira, Emanuele Coviello, Gabriel Doyle, Gert Lanckriet, Roger Levy, Nuno Vasconcelos

University of California, San Diego




                  SVCL
Motivation
• Massive explosion of "content" on the web.
• Content is rich in multiple modalities: text, images, videos, music, etc.
• There is a need for retrieval systems that are transparent to modalities.
  – Cross-modal text queries, e.g. retrieving images from photoblogs using a textual query.
  – Finding images to go along with a text article.
  – Finding music to enhance videos.
  – Positioning an image in the text.
  – Etc.



  "Cross Modal Retrieval System": a retrieval system that operates across multiple modalities.

Current Retrieval Systems
• Current retrieval systems are
  predominantly uni-modal.
   – The query and retrieved results are
     from the same modality

  [Figure: query modalities (Text, Images, Music, Videos) on the left, each matched to the same modality (Text, Images, Music, Videos) on the right]



• Is Google Image search cross-modal retrieval?
  – No, text is matched to the text metadata for the image.
  – The operation would fail in the absence of a text modality for the retrieval set.

  [Figure: retrieval-set items shown with their accompanying text, e.g. "Florence is renowned for helping usher in the Renaissance, but at the same time, it can seem like layer upon layer of gray. A tour of Florence Italy is a bit confusing at first everything is stone and commands your attention like an insult."]

Current Retrieval Systems
• Several multi-modal systems have been proposed [TRECVID, ImageCLEF, Iria'09, Wang'09, Escalante'08, Pham'07, Snoek'05, Westerveld'02, etc.]
  – Given a query consisting of multiple modalities, retrieve examples containing the same multiple modalities.
  – E.g. combining the modalities into a single modality, or combining the outputs of multiple uni-modal systems.

  [Figure: query modalities (Text, Images, Music, Videos) on the left; retrieved modalities (Text, Images, Music, Videos) on the right]

• Annotation systems [TRECVID, ImageCLEF, Carneiro'07, Feng'04, Lavrenko'03, Barnard'03, etc.]
  – Given a query from one modality (say, an image), assign text labels.
  – These are true cross-modal systems.
  – However, the text modality is constrained to a few keywords.


 Cross Modal Retrieval
• Given a query from modality A, retrieve results from modality B.
  – The query and retrieved items are not required to share a common modality.

  [Figure: query modalities (Text, Images, Music, Videos) on the left; retrieved modalities (Text, Images, Music, Videos) on the right]

• In this work, we restrict ourselves to the text and image modalities.
  – Similar ideas can, however, be applied to other modalities.
• Thus,
  – the retrieval of text in response to a query image,
  – and the retrieval of images in response to a query text.

  [Figure: query images with retrieved text passages, and query text passages with retrieved images. Sample passages begin "Like most of the UK, the Manchester area mobilised extensively during World War II. ...", "Martin Luther King's presence in Birmingham was not welcomed by all in the black community. ...", and "In 1920, at the age of 20, Coward starred in his own play, the light comedy ''I'll Leave It to You''. ..."]

Design of Retrieval Systems
• Uni-modal Retrieval System
  – Design a feature space for the given modality.
  – Map the query and the retrieval set onto this feature space.
  – Use a suitable similarity function to rank the retrieval set.
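
The three uni-modal steps above can be sketched as follows. This is only an illustration: the slides do not name a similarity function, so cosine similarity is an assumed choice, and `rank_by_cosine` and the toy word-count vectors are hypothetical.

```python
import numpy as np

def rank_by_cosine(query, retrieval_set):
    """Rank retrieval-set rows from most to least similar to the query.

    Cosine similarity is an assumed choice of similarity function.
    """
    q = query / (np.linalg.norm(query) + 1e-12)
    R = retrieval_set / (np.linalg.norm(retrieval_set, axis=1, keepdims=True) + 1e-12)
    scores = R @ q                      # one cosine score per retrieval item
    return np.argsort(-scores), scores  # best match first

# Toy example: query and retrieval set live in the same 3-word feature space.
docs = np.array([[2.0, 0.0, 1.0],
                 [0.0, 3.0, 0.0],
                 [1.0, 0.0, 1.0]])
query = np.array([1.0, 0.0, 1.0])
order, scores = rank_by_cosine(query, docs)
print(order)
```

The sketch only works because the query and the retrieval set share one feature space, which is exactly what is missing in the cross-modal case.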


• Can this be applied to Cross Modal Retrieval?
  – Design feature spaces for the two modalities.
  – Map the query onto one feature space and the retrieval set onto the other.
  – But what similarity function should be used for ranking?

  [Figure: sample text passages and images, each mapped to the feature space of its own modality]

The problem.
   • No natural correspondence between representations of different modalities.
   • For example, we use a bag-of-words representation for both images and text:
      – Images: vectors over visual textures ($\mathcal{R}^I$)
      – Text: vectors of word counts ($\mathcal{R}^T$)

   [Figure: "Image Space" and "Text Space" with question marks between them. Sample text includes "The population of Turkey stood at 71.5 million with a growth rate of 1.31% per annum, based on the 2008 Census. ..."; text is represented by counts over words such as America, Iran, Poverty, Food, Music, Books, Sun, Army, Navy, President, Prime, Weather, Success, India, Terrorist, etc.]

   • How do we compute similarity?

An Idea
• Learn mappings that map the different modalities into intermediate spaces ($\mathcal{U}^I$ for images, $\mathcal{U}^T$ for text) that have a natural and invertible correspondence.

  [Figure: "Image Space" and "Text Space", each mapped to an intermediate space; sample text passages as on the previous slides]

• Given a text query in the text space, cross-modal retrieval reduces to finding the nearest neighbor of the mapped query among the mapped images.
• Similarly for an image query, with the roles of the two modalities exchanged.

• The task now is to design these mappings.
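
One possible way to write this idea down is sketched here. Only the feature spaces $\mathcal{R}^I$, $\mathcal{R}^T$ and the intermediate spaces $\mathcal{U}^I$, $\mathcal{U}^T$ appear in the slides; the mapping names $\mathcal{M}_I$, $\mathcal{M}_T$, $\mathcal{M}$, the query symbols $T_q$, $I_q$ and the distance $d$ are assumed notation for illustration.

```latex
% Assumed notation: M_I, M_T, M, T_q, I_q and d are illustrative, not from the slides.
\begin{gather*}
  \mathcal{M}_I : \mathcal{R}^I \to \mathcal{U}^I, \qquad
  \mathcal{M}_T : \mathcal{R}^T \to \mathcal{U}^T, \qquad
  \mathcal{M} : \mathcal{U}^I \to \mathcal{U}^T \ \text{(invertible)} \\[4pt]
  \text{text query } T_q: \quad
  \hat{I} = \arg\min_{I} \; d\bigl(\mathcal{M}_I(I),\; \mathcal{M}^{-1}(\mathcal{M}_T(T_q))\bigr) \\[4pt]
  \text{image query } I_q: \quad
  \hat{T} = \arg\min_{T} \; d\bigl(\mathcal{M}_T(T),\; \mathcal{M}(\mathcal{M}_I(I_q))\bigr)
\end{gather*}
```

Under this reading, the invertible correspondence is what lets a single distance be used in both retrieval directions.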

The Fundamental Hypotheses
• We explore two fundamental hypotheses

  1.   Correlation Matching (CM) Hypothesis: The problem is that
       there is no correlation between the representations of
       different modalities.

       This can be tested by designing intermediate representations that
       maximize the correlation between modalities.

  2.   Semantic Matching (SM) Hypothesis: The problem is that the
       representations lack common semantics.

       This can be tested by designing a shared semantic representation
       for all modalities.




Correlation Matching (CM)
• Learn subspaces that maximize the correlation between the two modalities.

  [Figure: "Image Space" and "Text Space" projected onto the maximally correlated subspaces $\mathcal{U}^I$ and $\mathcal{U}^T$]

• We use Canonical Correlation Analysis (CCA) to obtain mappings that maximize correlation.
  – CCA performs joint dimensionality reduction across two (or more) spaces.

  [Equation: CCA objective, annotated with "basis for the maximally correlated space" and "empirical covariance for images and text, and their cross-covariance"]
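
A minimal sketch of correlation matching follows, assuming the standard formulation of CCA as a singular value decomposition of the whitened cross-covariance. The function names, the ridge term `reg`, and the Euclidean ranking in the projected space are assumptions made for illustration; the slides do not specify them.

```python
import numpy as np

def _inv_sqrt(S):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(S)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T

def fit_cca(X_img, X_txt, k, reg=1e-3):
    """Fit k pairs of canonical directions from paired image/text features.

    reg is a small ridge term (an assumption, not from the slides) that
    keeps the empirical covariance matrices invertible.
    """
    mu_i, mu_t = X_img.mean(axis=0), X_txt.mean(axis=0)
    Xi, Xt = X_img - mu_i, X_txt - mu_t
    n = Xi.shape[0]
    S_ii = Xi.T @ Xi / (n - 1) + reg * np.eye(Xi.shape[1])  # image covariance
    S_tt = Xt.T @ Xt / (n - 1) + reg * np.eye(Xt.shape[1])  # text covariance
    S_it = Xi.T @ Xt / (n - 1)                              # cross-covariance
    K_i, K_t = _inv_sqrt(S_ii), _inv_sqrt(S_tt)
    # Singular vectors of the whitened cross-covariance give the bases;
    # the singular values are the canonical correlations.
    U, corrs, Vt = np.linalg.svd(K_i @ S_it @ K_t)
    return {"mu_i": mu_i, "mu_t": mu_t,
            "W_i": K_i @ U[:, :k], "W_t": K_t @ Vt[:k].T,
            "corrs": corrs[:k]}

def rank_texts_for_image(img, texts, model):
    """Project an image query and all texts, then rank texts by distance."""
    q = (img - model["mu_i"]) @ model["W_i"]
    P = (texts - model["mu_t"]) @ model["W_t"]
    return np.argsort(np.linalg.norm(P - q, axis=1))
```

Text-to-image retrieval is symmetric: project the text query with `W_t` and rank the images projected with `W_i`.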
Semantic Matching (SM)
• Design semantic spaces for both modalities [Rasiwasia'07, Smith'03]
   – A space where each dimension is a semantic concept.
   – Each point in this space is a weight vector over these concepts.

  [Figure: image classifiers map the Image Space $\mathcal{R}^I$, and text classifiers map the Text Space $\mathcal{R}^T$, into a common Semantic Space $\mathcal{S}$ with axes Semantic Concept 1, Semantic Concept 2, ..., Semantic Concept V. Example concepts: Literature, Biology, Warfare, History, Places, Art, ...]

• We use multiclass logistic regression to classify both text and images.
• The posterior probability under the learned classifiers serves as the semantic representation.

  [Equation: class posterior of the logistic regression model, annotated with "text/image features", "learned parameters" and "total number of classes"]
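
A minimal sketch of semantic matching follows, assuming the usual softmax form of multiclass logistic regression. The gradient-descent training loop, its hyperparameters (`lr`, `steps`, `l2`), the function names, and the Euclidean comparison of posterior vectors are assumptions for illustration, not details taken from the slides.

```python
import numpy as np

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)  # stabilise the exponentials
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def fit_logreg(X, y, n_classes, lr=0.1, steps=500, l2=1e-3):
    """Multiclass logistic regression trained by plain gradient descent."""
    n, d = X.shape
    Xb = np.hstack([X, np.ones((n, 1))])  # append a bias feature
    W = np.zeros((d + 1, n_classes))      # learned parameters
    Y = np.eye(n_classes)[y]              # one-hot class labels
    for _ in range(steps):
        P = softmax(Xb @ W)
        W -= lr * (Xb.T @ (P - Y) / n + l2 * W)
    return W

def semantic_vector(X, W):
    """Posterior class probabilities: one point in the semantic space per row."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return softmax(Xb @ W)

def rank_by_semantics(query_sem, retrieval_sem):
    """Rank retrieval items by distance to the query in the semantic space."""
    return np.argsort(np.linalg.norm(retrieval_sem - query_sem, axis=1))
```

One classifier is trained per modality on the same set of classes, so an image posterior and a text posterior have the same length and can be compared directly, even though the underlying feature spaces differ in dimension.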
Cross Modal Retrieval
Example: Image-to-text retrieval using CM
Example: Text-to-image retrieval using CM

  [Figure: subspaces $\mathcal{U}^I$ and $\mathcal{U}^T$, with panels labeled "Closest Text to the Query Image" and "Closest Image to the Query Text". Sample text: "Like most of the UK, the Manchester area mobilised extensively during World War II. For example, casting and machining expertise at Beyer, Peacock and Company's locomotive works in Gorton was switched to bomb making; Dunlop's rubber works in Chorlton-on-Medlock made barrage balloons;"]
                For example, casting and machining
                expertise at Beyer, Peacock and Company's                                               Concept V
                locomotive works in Gorton was switched to
                bomb making; Dunlop's rubber works in
                Chorlton-on-Medlock made barrage balloons;
                                                                    Semantic
                                                                    Concept 2
            Correlated Sub-space                                                     Semantic Space


• Ranking is based on a suitable similarity function
   – L2 distance, L1 distance, normalized correlation, KL divergence
     (for SM only), etc.
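
The similarity measures listed above can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' code. The function names, the smoothing constant `eps`, and the reading of normalized correlation as the cosine between the two vectors are assumptions made for this example. KL divergence expects probability vectors, which is presumably why it applies to SM only.

```python
import math

def l2_distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def l1_distance(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def normalized_correlation(p, q):
    # Assumed definition: cosine of the angle between p and q (higher = closer).
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm

def kl_divergence(p, q, eps=1e-12):
    # p and q are probability vectors; eps (an assumption) avoids log(0).
    return sum(a * math.log((a + eps) / (b + eps)) for a, b in zip(p, q))

def rank_by(query, candidates, distance):
    """Return candidate indices sorted from closest to farthest."""
    return sorted(range(len(candidates)),
                  key=lambda i: distance(query, candidates[i]))

# Toy semantic-space vectors (made up for illustration).
query = [0.7, 0.2, 0.1]
candidates = [[0.1, 0.1, 0.8], [0.6, 0.3, 0.1], [0.3, 0.4, 0.3]]
ranking = rank_by(query, candidates, kl_divergence)
# For a similarity such as normalized correlation, negate it to rank by it.
ranking_nc = rank_by(query, candidates,
                     lambda p, q: -normalized_correlation(p, q))
```

Because `rank_by` takes the distance as a parameter, the same ranking code serves both the correlated sub-space and the semantic space.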

Dataset
• We propose a dataset built using Wikipedia's
  featured articles

  – 2700 articles, selected and reviewed by Wikipedia's
    editors since 2009.

  – The articles are accompanied by one or more pictures
    from the Wikimedia Commons

  – Each article is split into sections that may or may not
    have an assigned image (sections without images
    were dropped)

  – Each article is categorized into one of 29 categories
    (only the 10 most populated categories were chosen)

  – Each 'document' in the proposed set is a 'section of
    a Wikipedia featured article' and its 'associated image'.
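
The filtering steps above could look roughly like the sketch below. The input layout (a list of dicts with "category" and "sections" keys) and the function name `build_documents` are hypothetical, and the sketch assumes that "most populated" is counted over documents rather than articles; the slides do not say which.

```python
from collections import Counter

def build_documents(articles, n_categories=10):
    """Turn articles into (text, image, category) documents.

    `articles` is a hypothetical list of dicts with keys "category" and
    "sections"; each section has "text" and an optional "image".
    """
    # One document per section; sections without an image are dropped.
    docs = [
        {"text": s["text"], "image": s["image"], "category": a["category"]}
        for a in articles
        for s in a["sections"]
        if s.get("image")
    ]
    # Keep only the most populated categories (counted over documents here).
    counts = Counter(d["category"] for d in docs)
    kept = {c for c, _ in counts.most_common(n_categories)}
    return [d for d in docs if d["category"] in kept]

# Toy input, made up for illustration.
toy_articles = [
    {"category": "Warfare",
     "sections": [{"text": "a", "image": "a.jpg"}, {"text": "b", "image": None}]},
    {"category": "Biology",
     "sections": [{"text": "c", "image": "c.jpg"}, {"text": "d", "image": "d.jpg"}]},
    {"category": "Music", "sections": [{"text": "e"}]},
]
toy_docs = build_documents(toy_articles, n_categories=1)
```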




Dataset (examples)

• Geography and Places
  The population of Turkey stood at 71.5 million with a growth rate of 1.31% per annum, based on the 2008 Census. It has an average population density of 92 persons per km². The proportion of the population residing in urban areas is 70.5%. People within the 15–64 age group constitute 66.5% of the total population, the 0–14 age group corresponds 26.4% of the population, while 65 years and higher of age correspond to 7.1% of the total population. Life expectancy stands at 70.67 years for men and 75.73 years for women, with an overall average of 73.14 years for the populace as a whole. Education is compulsory and free from ages 6 to 15. The literacy rate is 95.3% for men and 79.6% for women, with an overall average of 87.4%. The low figures for women are mainly due to the traditional customs of the Arabs and Kurds who live in the southeastern provinces of the country. Article 66 of the Turkish Constitution defines a "Turk" as "anyone who is bound to the Turkish state through the bond of citizenship"; therefore, the legal use of the term "Turkish" as a citizen of Turkey is different from the ethnic definition. (…)

• Warfare
  A number of variants were built on the same chassis as the TAM tank. The original program called for the design of an infantry fighting vehicle, and in 1977 the program finished manufacturing the prototype of the Vehículo de Combate Transporte de Personal (Personnel Transport Combat Vehicle), or VCTP. The VCTP is able to transport a squad of 12 men, including the squad leader and nine riflemen. The squad leader is situated in the turret of the vehicle; one rifleman sits behind him and another six are seated in the chassis, the eighth manning the hull machine gun and the ninth situated in the turret with the gunner. All personnel can fire their weapons from inside the vehicle, and the VCTP's turret is armed with Rheinmetall's Rh-202 20 millimeter (.79 in) autocannon. The VCTP holds 880 rounds for the autocannon, including subcaliber armor-piercing DM63 rounds. It is also armed with a 7.62 millimeter FN MAG 60-20 mounted on the turret roof. Infantry can dismount through a door on the rear of the hull. (…)

• Culture and Society
  Despite agreeing on most issues regarding the protection of national parks, friction between the NPA and NPS was seemingly unavoidable. Mather and Yard disagreed on many issues; whereas Mather was not interested in the protection of wildlife and accepted the Biological Survey's efforts to exterminate predators within parks, Yard vehemently criticized the program as early as 1924 (Fox, p. 204). Yard was also highly critical of Mather's administration of the parks. Mather advocated plush accommodations, city comforts and various entertainments to encourage park visitation. These plans clashed with Yard's ideals, and he considered such urbanization of the nation's parks misguided. While visiting Yosemite National Park in 1926, he stated that the valley was "lost" after finding crowds, automobiles, jazz music and even a bear show (Sutter, p. 126). In 1924, the United States Forest Service initiated a program to set aside "primitive areas" in the national forests that protected wilderness while opening it to use. (…)


 Dataset characterization
  • Wikipedia featured articles (10 categories)
  • Overall 2,866 pairs of (text; image) documents

Category                  Training    Query/Retrieval    Total documents
Art & Architecture           138             34                172
Biology                      272             88                360
Geography & Places           244             96                340
History                      248             85                333
Literature & Theatre         202             65                267
Media                        178             58                236
Music                        186             51                237
Royalty & Nobility           144             41                185
Sport & Recreation           214             71                285
Warfare                      347            104                451
TOTAL                       2173            693               2866

Retrieval Performance
• The performance of both Correlation & Semantic
  Matching is ~90% better than chance.

Mean Average Precision

Model      Image query    Text query    Avg.
Chance        0.118          0.118      0.118
CM            0.249          0.196      0.223
SM            0.225          0.223      0.224
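
For readers unfamiliar with the metric in the table, mean average precision (MAP) can be computed as in the sketch below. It assumes that a retrieved item counts as relevant when it shares the query's category, which this section of the slides does not state explicitly; the function names and toy rankings are illustrative only.

```python
def average_precision(ranked_labels, query_label):
    """Average of precision@rank over the ranks where a relevant item appears."""
    hits, precisions = 0, []
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:  # assumed relevance criterion: same category
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

def mean_average_precision(rankings):
    """`rankings` is a list of (query_label, ranked_labels) pairs."""
    return sum(average_precision(r, q) for q, r in rankings) / len(rankings)

# Two toy queries with made-up rankings.
toy_rankings = [
    ("Warfare", ["Warfare", "Biology", "Warfare"]),  # AP = (1/1 + 2/3) / 2
    ("Biology", ["Music", "Biology"]),               # AP = (1/2) / 1
]
toy_map = mean_average_precision(toy_rankings)
```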




Semantic Correlation Matching (SCM)
• Although CM and SM work on different principles, they are
  not mutually exclusive.
• Combining the two approaches can lead to improved
  performance
  – Learn the maximally-correlated subspaces using CCA
  – Design semantic spaces using the correlated features as the low-level
    representation.



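[Figure: the image space R^I and the text space R^T are mapped by Canonical Correlation Analysis to the correlated subspaces U_I and U_T; image classifiers and text classifiers then map these to the correlated semantic space S, with axes Semantic Concept 1, Semantic Concept 2, ..., Semantic Concept V.]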
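
The two SCM steps (CCA, then classifiers on the correlated features) can be sketched with NumPy as follows. This is an illustration on synthetic data, not the authors' implementation: the regularizer `reg`, the small softmax-regression classifier, the learning rate, and all data shapes are assumptions, since this section does not specify the classifier or the features.

```python
import numpy as np

def fit_cca(X, Y, k, reg=1e-3):
    """Return means and projections (Ux, Uy) for the top-k canonical directions."""
    mx, my = X.mean(0), Y.mean(0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    Cxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])  # reg keeps Cxx invertible
    Cyy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n
    # Whitening matrices W with W.T @ C @ W = I, from the Cholesky factors.
    Wx = np.linalg.inv(np.linalg.cholesky(Cxx)).T
    Wy = np.linalg.inv(np.linalg.cholesky(Cyy)).T
    A, _, Bt = np.linalg.svd(Wx.T @ Cxy @ Wy)
    return mx, my, Wx @ A[:, :k], Wy @ Bt.T[:, :k]

def softmax(a):
    e = np.exp(a - a.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def fit_softmax(Z, labels, n_classes, lr=0.5, steps=300):
    """Softmax regression by gradient descent (stand-in classifier)."""
    Zb = np.hstack([Z, np.ones((len(Z), 1))])  # append a bias column
    W = np.zeros((Zb.shape[1], n_classes))
    onehot = np.eye(n_classes)[labels]
    for _ in range(steps):
        W -= lr * Zb.T @ (softmax(Zb @ W) - onehot) / len(Z)
    return W

def posteriors(Z, W):
    return softmax(np.hstack([Z, np.ones((len(Z), 1))]) @ W)

# Synthetic paired data: both modalities share a class-dependent latent code.
rng = np.random.default_rng(0)
n, n_classes, k = 200, 3, 2
labels = rng.integers(0, n_classes, size=n)
latent = np.eye(n_classes)[labels] + 0.1 * rng.normal(size=(n, n_classes))
X_img = latent @ rng.normal(size=(n_classes, 8)) + 0.1 * rng.normal(size=(n, 8))
X_txt = latent @ rng.normal(size=(n_classes, 5)) + 0.1 * rng.normal(size=(n, 5))

# Step 1: project both modalities into the correlated subspaces.
m_i, m_t, U_i, U_t = fit_cca(X_img, X_txt, k)
Z_img, Z_txt = (X_img - m_i) @ U_i, (X_txt - m_t) @ U_t

# Step 2: map the correlated features to class posteriors (semantic space).
S_img = posteriors(Z_img, fit_softmax(Z_img, labels, n_classes))
S_txt = posteriors(Z_txt, fit_softmax(Z_txt, labels, n_classes))

# Text query 0: rank all images by L2 distance in the semantic space.
ranking = np.argsort(np.linalg.norm(S_img - S_txt[0], axis=1))
```

The sketch trains and queries on the same synthetic points for brevity; a real evaluation would use the separate training and query/retrieval splits.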




Retrieval Performance
• Combining the benefits of CM and SM leads to a
  further ~13% improvement.

Mean Average Precision

Model      Image query    Text query    Avg.
Chance        0.118          0.118      0.118
CM            0.249          0.196      0.223
SM            0.225          0.223      0.224
SCM           0.277          0.226      0.252
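
The percentage figures quoted on the performance slides can be checked from the average MAP column. The sketch below assumes they are relative gains, (new - baseline) / baseline; which baseline the ~13% refers to is not stated, so both CM and SM are shown.

```python
def relative_gain(new, baseline):
    """Relative improvement of `new` over `baseline`."""
    return (new - baseline) / baseline

# Average MAP values from the table.
chance, cm, sm, scm = 0.118, 0.223, 0.224, 0.252

gain_cm_over_chance = relative_gain(cm, chance)  # about 0.89
gain_sm_over_chance = relative_gain(sm, chance)  # about 0.90
gain_scm_over_cm = relative_gain(scm, cm)        # about 0.13
gain_scm_over_sm = relative_gain(scm, sm)        # about 0.125
```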




Text to Image Query (1)
Between October 1 and October 17, the Japanese delivered 15,000 troops to Guadalcanal, giving
Hyakutake 20,000 total troops to employ for his planned offensive. Because of the loss of their
positions on the east side of the Matanikau, the Japanese decided that an attack on the U.S. defenses
along the coast would be prohibitively difficult. Therefore, Hyakutake decided that the main thrust of
his planned attack would be from south of Henderson Field. His 2nd Division (augmented by troops
from the 38th Division), under Lieutenant General Masao Maruyama and comprising 7,000 soldiers in
three infantry regiments of three battalions each was ordered to march through the jungle and attack
the American defences from the south near the east bank of the Lunga River. The date of the attack
was set for October 22, then changed to October 23. To distract the Americans from the planned
attack from the south, Hyakutake's heavy artillery plus five battalions of infantry (about 2,900 men)
under Major General Tadashi Sumiyoshi were to attack the American defenses from the west along the
coastal corridor. The Japanese estimated that there were 10,000 American troops on the island, when
in fact there were about 23,000…

                                     Top 5 Retrieved Images




Text to Image Query (2)
Around 850, out of obscurity rose Vijayalaya, made use of an opportunity arising out of a conflict
between Pandyas and Pallavas, captured Thanjavur and eventually established the imperial line of the
medieval Cholas. Vijayalaya revived the Chola dynasty and his son Aditya I helped establish their
independence. He invaded Pallava kingdom in 903 and killed the Pallava king Aparajita in battle,
ending the Pallava reign (K.A.N. Sastri, A History of South India, p. 159). The Chola kingdom under
Parantaka I expanded to cover the entire Pandya country. However towards the end of his reign he
suffered several reverses by the Rashtrakutas who had extended their territories well into the Chola
kingdom…


                                    Top 5 Retrieved Images




Text to Image Query (3)
The lumber boom on Plunketts Creek ended when the virgin timber ran out. By 1898, the old growth
hemlock was exhausted and the Proctor tannery, then owned by the Elk Tanning Company, was closed
and dismantled. Lumbering continued in the watershed, but the last logs were floated down Plunketts
Creek to the Loyalsock in 1905. The Susquehanna and Eagles Mere Railroad was abandoned in
sections between 1922 and 1930, as the lumber it was built to transport was depleted. The CPL
logging railroad and their Masten sawmills were abandoned in 1930. Without timber, the populations of
Proctor and Barbours declined. The Barbours post office closed in the 1930s and the Proctor post office
closed on July 1, 1953. Both villages also lost their schools and almost all of their businesses. Proctor
celebrated its centennial in 1968, and a 1970 newspaper article on its thirty-ninth annual "Proctor
Homecoming" reunion called it a "near-deserted old tannery town". In the 1980s, the last store in
Barbours closed, and the former hotel (which had become a hunting club) was torn down to make way
for a new bridge across Loyalsock Creek…


                                      Top 5 Retrieved Images




Text to Image Retrieval Example
• Ground truth image corresponding to the retrieved text is
  shown




Conclusion
• Proposed an approach to build cross-modal retrieval
  systems.

• Explored two hypotheses
  – CM: The problem is that there is no correlation between the
    representations of different modalities.
  – SM: The problem is that the representation lacks common
    semantics.


• Both the CM and SM hypotheses hold true
  – Tested by building intermediate spaces based on maximizing correlation
    and a common semantic representation.


• CM and SM are not mutually exclusive and their combination
  leads to further improvements.

       Questions?


