Introduction to spatial data analysis

Geodemographic classification schemes




GEOG3025
Geodemographic classification schemes
• Lecture overview:
  – History
  – Data sources
  – Classification methods
  – Selection and labelling of classes
  – Examples
  – Applications


Objectives
• To be familiar with the core techniques
  of geodemographic classification
• To recognise the inherent subjectivity
  and limitations of the approach
• To understand the application and use
  of geodemographic classification
  schemes


Introductory questions…

• What is a geodemographic classifier?
• How does it capture the characteristics of a
  neighbourhood?

History

• Originally derived from census small
  area statistics (1971)
• Richard Webber: a classification of
  residential neighbourhoods (ACORN)
• Data-led area classification scheme
• Labelled neighbourhood ‘types’
• General purpose neighbourhood
  classification initially for commercial use

Motivation for classification

• To find customers by identifying
  neighbourhoods with similar population
  characteristics
• By inference, finding local populations
  with similar consumer behaviour
• More recently, used for differentiating
  strategies for existing customers and
  branches

Data sources

• More modern classifiers use a
  combination of census and non-census
  data
• Information from customer databases
  and lifestyle/consumption surveys
• Data generally recorded by postcode
• Postcode/census matching issues

What makes neighbourhoods different?

Multiple dimensions…

Classification methods
• Large pool of input variables
• Geographical linkage to clustering scale?
• Data reduction/classification methodology
  intended to capture key dimensions
  – Cluster analysis
  – Principal components/factor analysis
  – Numerous variants and extensions
• Standardisation of variables
• Clustering in multidimensional ‘space’
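Standardisation can be sketched as converting each variable to z-scores, so that variables measured in different units contribute on a comparable scale before clustering. This is a minimal illustration only: the function name `standardise` and the five-area figures are hypothetical, and z-scoring is one common choice rather than necessarily the method used by any particular commercial classifier.

```python
from statistics import mean, pstdev


def standardise(values):
    """Return z-scores: (value - mean) / population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant variable carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical values for five small areas (not real census data).
pct_owner_occupied = [20.0, 35.0, 50.0, 65.0, 80.0]
mean_rooms = [3.1, 4.0, 4.6, 5.2, 6.1]

z_owner = standardise(pct_owner_occupied)
z_rooms = standardise(mean_rooms)
```

After this step each variable has mean 0 and standard deviation 1, so a percentage and a room count can be placed on the same axes.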

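e.g. 2 standardised variables

[Figure: example neighbourhood types on two axes,
e.g. income (horizontal, LOW to HIGH) and
e.g. house size (vertical, LOW to HIGH)]
• LOW income, HIGH house size: e.g. run-down
  larger houses in unfashionable inner suburbs
• HIGH income, HIGH house size: e.g. prestigious
  large houses in semi-rural commuter locations
• LOW income, LOW house size: e.g. inner city
  high-rise rented flats
• HIGH income, LOW house size: e.g. apartments in
  prestigious modern urban redevelopments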

Data on 2 standardised variables
[Figure; axes: e.g. income, e.g. house size]

Initial cluster centres
[Figure; axes: e.g. income, e.g. house size]

Allocation of data to cluster centres
[Figure; axes: e.g. income, e.g. house size]

Iteration to achieve final cluster memberships
[Figure; axes: e.g. income, e.g. house size]

Design considerations

• How to determine initial cluster centres?
  – Subjective
  – Data-driven
• How to measure distances in
  multidimensional space?
• Treat all variables equally or assign
  weights?
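The sequence of initial centres, allocation and iteration, together with the design questions above, can be sketched as a small k-means routine. This is an illustrative sketch under stated assumptions, not the procedure of any specific classification scheme: the names `k_means` and `weighted_distance` are hypothetical, initial centres are drawn at random from the data (the data-driven option), distance is weighted Euclidean, and the six example areas are invented.

```python
import math
import random


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two points."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def k_means(points, k, weights=None, max_iter=100, seed=0):
    """Allocate points to k clusters; returns (labels, centres)."""
    dims = len(points[0])
    weights = weights or [1.0] * dims  # default: all variables treated equally
    rng = random.Random(seed)
    centres = rng.sample(points, k)    # data-driven initial centres
    labels = None
    for _ in range(max_iter):
        # Allocation: each point joins its nearest centre.
        new_labels = [
            min(range(k), key=lambda c: weighted_distance(p, centres[c], weights))
            for p in points
        ]
        if new_labels == labels:
            break                      # memberships stable: stop iterating
        labels = new_labels
        # Update: move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centres[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centres


# Hypothetical standardised (income, house size) values for six areas.
areas = [(-1.2, -1.0), (-1.0, -1.1), (-0.9, -0.8),
         (1.0, 1.1), (1.1, 0.9), (0.9, 1.2)]
labels, centres = k_means(areas, k=2)
```

Passing, say, `weights=[2.0, 1.0]` makes the first variable count double in the distance, which is one way to express the "assign weights" option; a subjective start would replace the random sample with hand-picked centres.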

Clustering by grouping of most similar observations
[Figure: labels 1 to 4 marked on the plot;
axes: e.g. income, e.g. house size]

Classification dendrogram
[Figure: dendrogram; axes: Number of classes,
Individual observations…]
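Grouping the most similar observations step by step produces the merge history that a dendrogram draws. The sketch below is a minimal illustration, assuming single-linkage distance (closest pair of members) as the similarity measure; the slides do not state which linkage is used, and the names `agglomerate` and `single_linkage` and the five points are hypothetical.

```python
import math


def single_linkage(cluster_a, cluster_b, points):
    """Distance between two clusters = distance of their closest members."""
    return min(math.dist(points[i], points[j])
               for i in cluster_a for j in cluster_b)


def agglomerate(points, n_classes=1):
    """Merge the two closest clusters until n_classes remain.

    Returns (clusters, merges); merges lists each join in order,
    which is the information a dendrogram displays.
    """
    clusters = [[i] for i in range(len(points))]  # start: one cluster per observation
    merges = []
    while len(clusters) > n_classes:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = single_linkage(clusters[a], clusters[b], points)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]
    return clusters, merges


# Hypothetical standardised observations (indices 0 to 4).
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.2, 1.0), (5.0, 5.0)]
clusters, merges = agglomerate(points, n_classes=2)
```

Choosing `n_classes` corresponds to cutting the dendrogram at a given height: a smaller value gives fewer, broader classes.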

Selection and labelling of clusters

• Subjective decision regarding number of
  clusters
• Most classification schemes offer
  several different levels of grouping
• Subjective naming of clusters and
  groups


Example ACORN hierarchy…

New (2001) ACORN classification:
• 5 Categories
• 17 Groups
• 56 Types




Pre-2001 classification – outer Southampton

New ACORN for same postcode
Reflecting late 1990s new-build housing

Contemporary applications

• Increasing opportunity to integrate data
  from different scales:
  – Person/address
  – Postcode
  – Output area
• Neighbourhood data most powerful
  where relatively little transactional data
  available about customer base

Assignment
• Select variable set and study region
• CASWEB retrieval (including denominators)
• Run Geodemo spreadsheet
• Import dataset
• Define denominators
• Select number of classes
• Run/re-run
• Label classes
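The denominator steps presumably exist so that raw census counts can be turned into rates that are comparable between areas of different size. The sketch below illustrates that idea only; it is not the Geodemo spreadsheet's own logic, and the function name `to_percentages` and the counts are hypothetical.

```python
def to_percentages(counts, denominators):
    """Convert raw counts to percentages of their denominators.

    Areas with a zero denominator get None rather than a misleading 0.
    """
    return [100.0 * c / d if d else None
            for c, d in zip(counts, denominators)]


# Hypothetical counts for three areas (not real census data).
households_no_car = [50, 30, 0]
all_households = [200, 120, 0]

pct_no_car = to_percentages(households_no_car, all_households)
```

Here the first two areas have different counts but the same rate, which is why rates rather than counts are the natural input to a classification.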

Lecture summary
• Geodemographic classifications seek
  to capture key differentiating
  characteristics by data reduction
• Subjective decisions in cluster
  methodology and labelling
• Use of census and non-census data,
  multiple geographies
• Widespread commercial applications