“Duplicate” Entries in Gazetteers

Jordan Hastings
Department of Geography
University of California
Santa Barbara
Gazetteer “Duplicates”
Names & Features (1)


   Naming Features in the Environment
       Linguistic Necessity
       Identity and Ownership
       Navigation and Wayfinding
   Features Cover a Large Territory
       Crisp or Diffuse
       Compact or Extended
       Tangible or Abstract
Gazetteer “Duplicates”
Names & Features (2)


   Locations are Numerous & Various
       Multiscale
       Generalized
       Dis-coordinated
       Time-variant
Gazetteer “Duplicates”
Names & Features (3)


   Names are Numerous & Various
       Polynymous
       Mis-spelled
       Multilingual
       Time-variant
Gazetteer “Duplicates”
Names & Features (4)




Lake Bigler, thru 1920s
Lake Bonpland (also Bondland), thru 1890s
Da-ow-a-ga, thru 1850s
   [Map of the Lake Tahoe area with labeled places: Kings Beach, Incline Village-Crystal Bay, Tahoe Vista, Dollar Point, Sunnyside-Tahoe City, Carson, Indian Hills, Johnson Lane, Zephyr Cove-Round Hill Village, Kingsbury, Stateline, South Lake Tahoe, Minden]
Gazetteer “Duplicates”
Feature Types (1)


   Dependable Type System
       Because Features are “Objects”
       Because Human Mind Categorizes
   Types present in Taxonomy
       Hierarchy is Natural in Environment
       Because Human Mind Categorizes
Gazetteer “Duplicates”
Feature Types (2) – Examples


   Cultural Environment
       Nations-->States-->Provinces-->Districts

   Physical Environment
       Watersources:
         Springs-->Seeps
       Watercourses:
         Rivers-->Streams-->Creeks
       Waterbodies:
         Lakes-->Ponds-->Sloughs
         ?Glaciers
Gazetteer “Duplicates”
Fundaments (1)


   Definition: Gazetteer
       A spatial dictionary of
       named & typed features
       in the environment
   Implications
       Features uniquely identified
       Searchable by name and type
       Also searchable geospatially
Gazetteer “Duplicates”
Fundaments (2)


   Duplicates: An approximate notion
       Firm types, ±close in hierarchy
       Locations ±close dependent on scale
       Names ±close dependent on language
         … or not at all
       All aspects variant in time
Gazetteer “Duplicates”
Fundaments (3)


   Database Implications / Support
       Custom Datatypes
            Hierarchy
            Geometry
       Multiple Attribution (unlimited)
            Names
            Locations
       Efficient Geospatial Processing
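
A minimal sketch of the record shape these requirements suggest, assuming a plain in-memory model rather than any particular database. The class and field names (`Feature`, `Name`, `Location`, `type_path`, `used_through`) are hypothetical, and the coordinates are approximate and for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Name:
    text: str
    language: Optional[str] = None       # language tag, if known
    used_through: Optional[str] = None   # e.g. "1920s"; None if still current


@dataclass
class Location:
    lon: float
    lat: float
    scale: Optional[int] = None          # source map scale denominator, if known


@dataclass
class Feature:
    feature_id: int
    type_path: Tuple[str, ...]           # position in the type hierarchy
    names: List[Name] = field(default_factory=list)          # unlimited names
    locations: List[Location] = field(default_factory=list)  # unlimited locations

    def has_name(self, text: str) -> bool:
        """Case-insensitive exact match against any of the feature's names."""
        wanted = text.casefold()
        return any(n.text.casefold() == wanted for n in self.names)


# One feature carrying several historical names (taken from the map slide).
lake = Feature(
    feature_id=1,
    type_path=("Waterbodies", "Lakes"),
    names=[
        Name("Lake Bigler", used_through="1920s"),
        Name("Lake Bonpland", used_through="1890s"),
        Name("Da-ow-a-ga", used_through="1850s"),
    ],
    locations=[Location(lon=-120.0, lat=39.1)],  # approximate, illustrative
)
```

Keeping names and locations as lists on a single feature is one way to avoid storing each variant as a separate, apparently duplicate, entry.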
Gazetteer “Duplicates”
Approach (1)


   Independent Measures of Duplicates
       1. Type Thesaurus Metrics
            Inter-feature: hierarchy, explicit linkages
       2. Geospatial Metrics
            Intra-feature: size, compactness, …
            Inter-feature: distance, overlap, …
       3. Geonomial Metrics
            Intra-feature: NL translation [not considered yet]
            Intra-feature: stemming, soundex, substitution
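
The three kinds of measure can each be sketched with a simple stand-in. The functions below are illustrative assumptions, not the metrics used in this work: `type_distance` counts steps between two paths in a type hierarchy, `haversine_km` gives great-circle distance between two points, and `soundex` is the standard American Soundex code for a name.

```python
import math
from typing import Sequence


def type_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Steps between two type paths via their deepest common ancestor."""
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return (len(a) - common) + (len(b) - common)


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))


# Soundex digit for each coded consonant; all other letters are uncoded.
_CODES = {}
for _digit, _letters in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1):
    for _ch in _letters:
        _CODES[_ch] = str(_digit)


def soundex(name: str) -> str:
    """American Soundex: first letter plus three digits, zero-padded."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    out = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = _CODES.get(ch, "")
        if code and code != prev:
            out += code
        if ch not in "HW":   # H and W do not separate repeated codes
            prev = code
    return (out + "000")[:4]
```

For example, `type_distance(("Watercourses", "Rivers"), ("Waterbodies", "Lakes"))` is 4, while a path and its direct child are 1 apart. Soundex alone is a coarse test: it would typically be combined with stemming and substitution rules, as the slide lists.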
Gazetteer “Duplicates”
Approach (2)


   Unified Assessment of Duplicates
       Weighted Combination of Measures
            1 Type
            2 Location(s)
            3 Name(s)
       Geographic Visualization, over Maps
       Final Authority of Human Cataloger
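
A weighted combination could look like the sketch below, assuming each measure has already been turned into a similarity between 0 and 1. The weights and the name `duplicate_score` are hypothetical placeholders; the slides do not give actual values.

```python
from typing import Dict

# Hypothetical weights for the three measures; not taken from the source.
DEFAULT_WEIGHTS: Dict[str, float] = {"type": 0.2, "location": 0.4, "name": 0.4}


def duplicate_score(similarities: Dict[str, float],
                    weights: Dict[str, float] = DEFAULT_WEIGHTS) -> float:
    """Weighted mean of per-measure similarities, each expected in [0, 1]."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    for key in weights:
        value = similarities[key]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"similarity for {key!r} is outside [0, 1]")
    return sum(weights[k] * similarities[k] for k in weights) / total


# Same type, moderately close location, fairly similar name.
score = duplicate_score({"type": 1.0, "location": 0.5, "name": 0.75})
```

The score only ranks candidate pairs for display and review; the decision itself stays with the human cataloger.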
Gazetteer “Duplicates”
Processing Cycle




   [Diagram: random features, prep, grouped features, rework]
Gazetteer “Duplicates”
Processing Cycle

   [Diagram: random features, prep, grouped features, weigh, accepted, suspended, feature database]
Gazetteer “Duplicates”
Processing Cycle

   [Diagram: random features, prep, grouped features, weigh, review, accepted, suspended, feature database]
Gazetteer “Duplicates”
Processing Cycle

   [Diagram: random features, prep, grouped features, rework, weigh, review, accepted, suspended, post, feature database, reject, trash]
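
The labels in this cycle can be read as a small state machine. The sketch below is one assumed reading: the stage names and step names come from the diagram, but which step connects which stages is an assumption, as are the identifiers `Stage`, `ALLOWED`, and `advance`.

```python
from enum import Enum


class Stage(Enum):
    RANDOM = "random features"
    GROUPED = "grouped features"
    ACCEPTED = "accepted"
    SUSPENDED = "suspended"
    DATABASE = "feature database"
    TRASH = "trash"


# Assumed transitions: (current stage, step) -> stages that step may lead to.
ALLOWED = {
    (Stage.RANDOM, "prep"): {Stage.GROUPED},
    (Stage.GROUPED, "weigh"): {Stage.ACCEPTED, Stage.SUSPENDED},
    (Stage.SUSPENDED, "review"): {Stage.ACCEPTED},
    (Stage.SUSPENDED, "rework"): {Stage.GROUPED},
    (Stage.SUSPENDED, "reject"): {Stage.TRASH},
    (Stage.ACCEPTED, "post"): {Stage.DATABASE},
}


def advance(stage: Stage, step: str, target: Stage) -> Stage:
    """Return the target stage if the step allows it, else raise ValueError."""
    if target not in ALLOWED.get((stage, step), set()):
        raise ValueError(f"{step!r} cannot move {stage.value!r} to {target.value!r}")
    return target


# A group that is weighed, suspended, approved on review, then posted.
s = advance(Stage.RANDOM, "prep", Stage.GROUPED)
s = advance(s, "weigh", Stage.SUSPENDED)
s = advance(s, "review", Stage.ACCEPTED)
s = advance(s, "post", Stage.DATABASE)
```

Under this reading, only accepted groups reach the feature database, and suspended groups wait for a cataloger to approve, rework, or reject them.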
[end]

								