					Data Mining Query Languages

             Donato Malerba


     Dipartimento di Informatica
     Università degli studi di Bari
     malerba@di.uniba.it
     http://www.di.uniba.it/~malerba/
A database perspective on KDD

Most current KDD systems offer isolated
 discovery features using tree inducers,
 neural nets, and rule discovery algorithms
They cannot be embedded into a large
 application and typically offer just one
 knowledge discovery feature
True also for OLAP tools
  This is the first generation of KDD tools

  DMQL – Prof. D. Malerba
                                              2
Short term research program
 Efficient DM algorithms on top of large
  databases and utilizing the existing DBMS
  support
Example:
1. Realization of C4.5 on top of a large database requires
    tighter coupling with the DBMS and intelligent use of
    indexing techniques.
2. Exploitation of caching techniques for association rule
    mining
3. Exploitation of special indexing techniques for
    clustering
See IBM's Intelligent Miner
Long term research program
 KDD should follow one of the key DBMS
  paradigms: building interpreters for query
  languages and compilers for ad hoc queries
  and embedding queries in application
  programming interfaces (API)
 Focus: increasing programmer productivity for
  KDD application development
  Knowledge and Data Discovery Management Systems
     (KDDMS) are the second generation KDD systems.


Imielinski & Mannila’s view
 KDD object
    Rule: probabilistic formula or multidimensional
     correlation
X.Diagnosis=“heart disease” and X.Age <50 → X.BMI > 29 [300, 0.80]
    Classifier: decision trees, neural network,
     multidimensional regression
    Clustering: collection of objects
 KDD query: a predicate which returns a set of
  objects that can either be KDD objects or
  database objects (records or tuples)

Imielinski & Mannila’s view
    The KDD objects typically will not exist a priori, thus
     querying the KDD objects requires their generation at
     run time.
    KDD objects may also be pre-generated and stored in
     an “inductive” database, such as metadata.
    In such cases querying can be reduced to retrieval.
    KDDMS should be able to persistently store and
     manage the KDD objects as well as provide the ability
     to query them
    Querying involves
         The generation of new KDD objects
         Retrieval of the ones which were generated before

Imielinski & Mannila’s view
 Closure principle: the result of a query is a
  relation that can be queried further.
 A result of a KDD query may be an argument
  of another compatible type of KDD query.
 In principle a KDD query can be nested within
  a regular relational query.
 KDD queries can be embedded in a host
  programming environment just as SQL queries
  can be embedded in host languages.

Imielinski & Mannila’s view

    Generate a decision tree on a user-defined training set
     (specified through a database query) with user-
     defined attributes and user-specified classification
     categories. Then find all records in a database that are wrongly
     classified by that classifier and use them as training data for
     another classifier.
    Generate all rules with consequent values computed
     by an SQL query (KDD queries may not be completely
     known at compile time!).
    Find tuples that belong to the largest cluster in a
     clustering constructed according to a user-specified
     distance metric.
Imielinski & Mannila’s view

Research program:
1. A KDD query language has to be formally defined
2. Query optimization tools have to be developed to
    compile queries into reasonably efficient execution
    plans.
Very challenging!
    KDD queries are much more powerful than SQL
    queries


Imielinski & Mannila’s view

Example:
Patient(Age, Sex, City, Diagnosis, Height, Weight,
     ClaimAmount, …)
City(State, Population, …)
X.Diagnosis=“heart disease” and Sex=“male” →
     X.Age>50 [1200,0.70]
The user wants to see all the rules about a patient with
     heart disease such that the consequent of this rule
     says something about the age of the patient, there are
     at least 1,000 cases to which the rule body applies, and
     the confidence of the rule is at least 65%.
Imielinski & Mannila’s view
In M-SQL (Imielinski et al., Proc. KDD'96)
SELECT
FROM MINE(T):R
WHERE R.Body={(Diagnosis=“heart disease”)} AND
    R.Consequent = {(Age=*)} AND
    R.Support > 1000 AND
    R.Confidence > 0.65
R renames MINE(T)
MINE(T) is an operator that takes a class T and generates
    all propositional rules about T
         Rule discovery: Another type of querying!
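
The query above can be read as a filter over a collection of rules. The following is a minimal Python sketch of that reading, not M-SQL itself: the `Rule` class, the `select_rules` function and the toy rule list are hypothetical, and the rules are assumed to be already generated (here a hand-written list stands in for MINE(T)).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    body: frozenset      # set of (attribute, value) pairs
    consequent: tuple    # a single (attribute, value) pair
    support: int         # number of cases the rule body applies to
    confidence: float


def select_rules(rules, body, consequent_attr, min_support, min_confidence):
    """Keep rules with exactly the given body, a consequent on the given
    attribute, and support/confidence strictly above the thresholds
    (mirroring the '>' comparisons in the query)."""
    wanted = frozenset(body)
    return [
        r for r in rules
        if r.body == wanted
        and r.consequent[0] == consequent_attr
        and r.support > min_support
        and r.confidence > min_confidence
    ]


# Toy, made-up rules standing in for the output of MINE(T).
heart = frozenset({("Diagnosis", "heart disease")})
mined = [
    Rule(heart, ("Age", ">50"), 1200, 0.70),
    Rule(heart, ("Age", "<30"), 900, 0.10),
    Rule(heart, ("BMI", ">29"), 1500, 0.80),
]

selected = select_rules(mined, [("Diagnosis", "heart disease")], "Age", 1000, 0.65)
for rule in selected:
    print(rule.consequent, rule.support, rule.confidence)
```

In a real KDDMS the rules would typically be generated at query time, so the selection conditions could be pushed into the mining step instead of being applied afterwards.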
Imielinski & Mannila’s view

Rules are not necessarily the final product of KDD
    applications.
A proper API, which embeds a rule query
    language in a more expressive, general
    purpose, host programming environment is
    necessary.
   Iterate over a collection of rules




KDD query languages
Imielinski, Virmani, Abdulghani. Discovery board application
   programming interface and query language for database mining.
   Proc. KDD96
Imielinski and Virmani. MSQL: A query language for database mining.
   Journal of Data Mining and Knowledge Discovery, 3(4), 1999.
Meo, Psaila, and Ceri. A new SQL-like operator for mining association
  rules. Proc. VLDB, 1996.
Han, Fu, Koperski, Wang, and Zaiane. DMQL: A Data Mining Query
  Language for Relational Databases. Proc. SIGMOD'96 Workshop
   on Research Issues on Data Mining and Knowledge Discovery
   (DMKD'96), 1996.
Shen, Ong, Mitbander, and Zaniolo. Metaqueries for Data Mining. In:
  Fayyad et al. Advances in Knowledge Discovery and Data Mining,
  AAAI Press, 1996.
KDD query languages
Giannotti, Manco. Querying Inductive Databases via Logic-Based User-
   Defined Aggregates. PKDD 1999
De Raedt. An Inductive Logic Programming Query Language for Database
   Mining. AISC 1998
De Raedt. A Logical Database Mining Query Language. ILP 2000
De Raedt. Query execution and optimization for inductive databases. Proc.
   EDBT Workshop on Database Technologies for Data Mining, 2002
Boulicaut, Klemettinen, Mannila. Querying inductive databases: a case
   study on the MINE RULE operator. In: Proceedings of the Second
   European Symposium on Principles of Data Mining and Knowledge
   Discovery PKDD'98, LNAI 1510, 1998
Elfeky, Saad, Fouad. ODMQL: Object Data Mining Query Language. In
   Dittrich et al. (eds), Objects and Databases 2000, LNCS 1944, 2001
Johnson, Lakshmanan, Ng. The 3w model and algebra for unified data
   mining. Proc. VLDB, 1998
KDD query languages

Han, Koperski, Stefanovic. GeoMiner: A System Prototype for
  Spatial Data Mining. SIGMOD Conference 1997
Malerba, Appice, Ceci, Vacca. SDMOQL: An OQL-based Data
  Mining Query Language for Map Interpretation. Proc. EDBT
  Workshop on Database Technologies for Data Mining, 2002




DMQL: just some syntactic sugar on top of DM algorithms?
 A user can formulate a DM task without paying attention
  to
   Logical and physical representation problems
   The correct procedural order in which some DM steps should be
    performed
 The development of decision support applications is
  easier, just as SQL makes the implementation of operational
  information systems easy
 A casual user can find patterns by means of a DMQL in
  the same way he can find data by means of an SQL
  query: no development of ad hoc applications
 A DMQL provides a foundation on which a GUI can be
  built
                 Spatial Data Mining

    Spatial Data Mining: the extraction of spatial
     patterns from both spatial and aspatial data,
     possibly stored in a spatial database
    Spatial Pattern: a pattern showing the interaction
     of two or more spatial objects or space-dependent
     attributes according to a particular spacing or set
     of arrangements

IF a large town intersects the motorway A14
       THEN it is also close to the Adriatic sea (13%, 90%)
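
As a small illustration of the two numbers attached to the rule, the sketch below computes them on made-up data. It assumes the pair is (support, confidence), which the slide does not state explicitly; the function name and the four toy towns are hypothetical.

```python
def support_and_confidence(objects, antecedent, consequent):
    """Support: fraction of all objects satisfying both antecedent and consequent.
    Confidence: fraction of objects satisfying the antecedent that also
    satisfy the consequent."""
    n = len(objects)
    body = [o for o in objects if antecedent(o)]
    both = [o for o in body if consequent(o)]
    support = len(both) / n if n else 0.0
    confidence = len(both) / len(body) if body else 0.0
    return support, confidence


# Toy data: the spatial predicates are assumed to be precomputed per town.
towns = [
    {"name": "t1", "intersects_A14": True, "close_to_adriatic": True},
    {"name": "t2", "intersects_A14": True, "close_to_adriatic": False},
    {"name": "t3", "intersects_A14": False, "close_to_adriatic": True},
    {"name": "t4", "intersects_A14": False, "close_to_adriatic": False},
]

s, c = support_and_confidence(
    towns,
    antecedent=lambda t: t["intersects_A14"],
    consequent=lambda t: t["close_to_adriatic"],
)
print(f"support={s:.0%}, confidence={c:.0%}")
```

In a spatial setting the expensive part is usually the evaluation of predicates such as "intersects" or "close to" on the geometries; here they are simply given as Boolean attributes.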


Spatial Data Mining & GIS

Geographical Information Systems (GIS) offer an
important application area where spatial data mining
techniques can be effectively used
Example: topographic map interpretation




 Interpreting Topographic Maps
 Topographic map: large scale
  (1:10000 to 1:100000) composite
  map showing relief, vegetation and
  man-made features of a portion of
  a land surface.
 Interpreting the colored lines,
  areas, and other symbols is the first
  step in using topographic maps.
 Easy! Symbols correspond univocally to concepts
  explicitly modelled by the map creator.
 Difficult! locating in a map some geographical objects not
  explicitly modelled (e.g., industrial area)
 Interpreting Topographic Maps
 Solution: embedding intelligent capabilities in geo-based
  tools
 Knowledge-based GIS use
   spatial reasoning capabilities
   available domain knowledge
  to support map interpretation
 But operational definitions of some complex concepts
   are difficult to elicit
   are not portable on different data models
   depend on the scale of the map

Data Mining to Support Map Interpretation Tasks

Data Mining tools and techniques to find
 spatial patterns of interest.
INGENS (INductive GEographic iNformation
 System) = GIS + Data Mining Server + …
Training functionality
The user can train the system by providing
 instances of geographical objects to be
 recognized in a map

INGENS Architecture

[Figure: INGENS architecture, organized in three layers]
  Interface Layer: GUI (Web Browser)
  Application Enabler: Map Converter, Map Editor, Query Interpreter,
    Map Descriptor, Data Mining Server
  Resource Manager: Map Storage Subsystem, ObjectStore DBMS, Deductive DBMS
  Repositories: Map Repository, Knowledge Repository
The data model for the map repository

Hybrid tessellation-topological model
Tessellation model: a map is decomposed
 according to a regular grid of cells
Topological model has two structural hierarchies:
  physical (describes the geographical objects by means
   of the most appropriate geometric entity);
  logical (expresses the semantics of geographical
   objects).



The object-oriented data model in UML

[Figure: UML class diagram of the map repository]
  Core classes: Map (linked to maps at a lower scale), Grid, Gif, Cell
    (cells related by N/NE/NW/S/SE/SW/E/W)
  Logical structure: Logical Object
  Physical structure: Physical Object, specialized into Point, Line (Line vertex)
    and Region (Boundary); relations Inside/Border and
    Disjoint/Meet/Overlap/Contains/Equal/Covers
  Representation: association between logical and physical objects
  Layers: Hydrography, Orography, Land Administration, Vegetation,
    Administrative Boundary, Ground Transportation Net., Construction,
    Built-up Area
  Logical classes: River, Canal, Lake, Parcel, Park, Cultivation, Forest, Road,
    Ropeway, Railway, Bridge, Hamlet, Town, Chief Town, Regional Capital,
    Capital, Font, Sea, Contour Slope, Slope, Level point, City, Province,
    County, State, Building, Airport, Wall, Power Station, Factory,
    Boat Station, Deposit
Different technologies: what support for the user?
 Problem: The user should not suffer from problems
  related to the integration of different technologies, such
  as
   Data mining
   OODBMS
   Deductive databases
   GIS
 Solution: A data mining query language (DMQL)
  interfaces users with the whole system and hides the
  different technologies.



SDMOQL

 DMQL is the data mining query language defined by Han
  et al. (1996) for relational databases
 GMQL (Geo Mining Query Language) is a language for
  spatial data mining, based on DMQL (Koperski 1999)
 Both are inspired by SQL and the relational model, hence not
  appropriate for an OO information system like INGENS
 SDMOQL (Spatial Data Mining Object Query Language)
  is a spatial mining query language for INGENS users
  based on OQL


Data Mining primitives
A DMQL must incorporate a set of DM primitives
 designed to facilitate efficient, fruitful
 knowledge discovery.
Primitives include:
  The specification of portions of the database in which
   the user is interested;
  The kinds of knowledge to be mined
  Background knowledge useful in guiding the
   discovery process;
  Interestingness measures of pattern evaluation
  How the discovered knowledge should be visualized
Task-relevant data specification
    In traditional DM applications, it is sufficient to specify
     Database attributes or
     Datawarehouse dimensions
since: 1. No complex transformation of stored data is required
       2. No interaction between objects is assumed, so that
            each object can be effectively described by a tuple
            in the relation
    Not in spatial data mining, where working at the level of
       stored data, that is geometric representations (points,
       lines and regions) of geographic objects, is undesirable.
      The user is interested in working at higher conceptual
       levels, where human-interpretable properties and
       relations between geographical objects are expressed.
    Not in spatial data mining, where attributes of
       neighbors of some spatial object of interest may
       influence the object itself.
      The data set to mine cannot be straightforwardly
       represented by means of a relational table, where
       distinct tuples refer to distinct, independent objects.
Example

Two roads can cross each other, or run parallel,
 or can be confluent, independently of the fact
 that they are represented by one or more tuples
 of a relational table of “lines” or “regions”




A solution
The SDMOQL interpreter allows the user to select the
 geographical objects that are relevant to the
 data mining task, and then it invokes the Map
 Descriptor to produce their high level conceptual
 descriptions.
Conceptual descriptions are based on first-order
 logic, where both properties and
 relations of selected geographical objects can be
 easily represented.


  Example
  SELECT x FROM x IN Cell
  WHERE x->num_cell = 11
contain(x1,x2)=true, …, contain(x1,x70)=true,
type_of(x1)=cell, …, type_of(x4)=vegetation,…,
subtype_of(x2)=cultivation,…, subtype_of(x7)=cart_track_road,…,
color(x2)=black, …, color(x70)=black,
extension(x7)=111.018,…, extension(x33)=1104.74,
geographic_direction(x7)=north, …, geographic_direction(x68)=north,
line_shape(x7)=straight,…, line_shape(x33)=cuspidal,…,
altitude(x19)=106.00,…, altitude(x43)=102.00,
area(x2)=187525.00, …, area(x62)=30250.00,
density(x2)=high, …, density(x62)=low,
line_to_line(x7,x68)=almost_parallel, …, region_to_region(x2,x21)=meet,…,
distance(x7,x68)=5.00, line_to_region(x8,x27)=adjacent, …,
point_to_region(x4,x18)=outside,…
Describing topographic maps
 33 geographical objects: contour_slope, slope, river, canal,
  primary_road, farm_road, interfarm_road, main_road, …
 16 descriptors: contain(x, y), type_of(y), subtype_of(y),
  color(y), area(y), density(y), extension(y),
  geographic_direction(y), line_shape(y), altitude(y),
  line_to_line(y,z), distance(y,z), region_to_region(y,z),
  line_to_region(y,z), point_to_region(y,z)
 Defined together with town planners, the set of descriptors
  is quite general and can capture geometric, topological and
  directional features of geographical objects in a topographic
  map.
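
One possible in-memory representation of such a description is sketched below in Python. It is only an illustration, not the Map Descriptor's actual output format: the dictionary layout and the helper names `facts_about` and `arity` are hypothetical, while the facts themselves are taken from the cell description shown earlier.

```python
# Ground facts of the form descriptor(args) = value,
# keyed by (descriptor name, tuple of arguments).
description = {
    ("contain", ("x1", "x2")): True,
    ("type_of", ("x1",)): "cell",
    ("subtype_of", ("x2",)): "cultivation",
    ("color", ("x2",)): "black",
    ("area", ("x2",)): 187525.00,
    ("density", ("x2",)): "high",
    ("line_to_line", ("x7", "x68")): "almost_parallel",
    ("distance", ("x7", "x68")): 5.00,
}


def facts_about(description, obj):
    """Return all facts whose argument list mentions the given object."""
    return {key: value for key, value in description.items() if obj in key[1]}


def arity(description, descriptor):
    """Set of arities with which a descriptor occurs, e.g. {2} for distance/2."""
    return {len(args) for (name, args) in description if name == descriptor}


print(facts_about(description, "x68"))
print(arity(description, "distance"))
```

Binary descriptors such as `distance` and `line_to_line` are what distinguish this description from a single relational tuple: they relate two distinct objects of the same cell.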
Task-relevant data specification

 In SDMOQL the selection of geographical objects is
  performed by means of simplified OQL queries with a
  SELECT-FROM-WHERE structure.
 Example 1: cell-level query
The user selects cell 26 from the topographic map of Canosa
  (Apulia, Italy)
SELECT x
FROM x IN Cell
WHERE x->num_cell = 26 AND x->part_map->map_name =
  “Canosa”
The Map Descriptor generates the description of all the
  objects in this cell.
Task-relevant data specification

 Example 2: layer-level query
The user selects the layer Orography from the
  topographic map of Canosa and the layer Construction
  from any map.
SELECT x, y
FROM x IN Orography, y IN Construction
WHERE x->part_map->map_name = “Canosa”

The Map Descriptor generates the description of the objects
  in these layers.

Task-relevant data specification

 Example 3: object-level query
The user selects the objects of the logic class River and the objects
  of type motorway (instances of the class Road), from cell 26 of
  the topographic map of Canosa.
SELECT x, y
FROM x IN River, y IN Road
WHERE x->part_map->map_name = “Canosa” AND
       y->part_map->map_name = “Canosa” AND
       x->log_incell->num_cell = 26 AND
       y->log_incell->num_cell = 26 AND
       y->type_road = “motorway”
The Map Descriptor generates the description of these objects.

Task-relevant data specification
   Example 4: Semantically ambiguous query
SELECT x, y
FROM x IN Cell, y IN River
WHERE x->num_cell = 26 AND
        y->log_incell->num_cell = 26
This query selects the object cell 26 and all rivers in it. However, it is
     unclear whether the Map Descriptor should describe
1. the entire cell 26 (formulate a cell-level query), or
2. only the rivers in it (formulate an object-level query), or
3. both (an unusual case; the problem can be solved by applying the UNION
     operator to the cell-level query and the object-level query).
Task-relevant data specification
The following constraint is imposed on SDMOQL:
the selected data must belong to the same level (cell, layer or
    logic object).
More formally the FROM clause can contain either a group of
   Cells or a set of Layers, or a set of Logic Objects, but
   never a mixture of them.
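
The constraint can be checked mechanically before a query is executed. The Python sketch below is a hypothetical validator, not part of SDMOQL: the `LEVELS` table lists only a few class names that appear in these slides, and the function name is an assumption.

```python
# Level of each class that may appear in a FROM clause (partial, illustrative).
LEVELS = {
    "Cell": "cell",
    "Hydrography": "layer",
    "Orography": "layer",
    "Construction": "layer",
    "River": "logic object",
    "Road": "logic object",
}


def check_same_level(from_classes):
    """Return the common level of the classes in a FROM clause,
    or raise ValueError if the classes are unknown or mix levels."""
    unknown = [c for c in from_classes if c not in LEVELS]
    if unknown:
        raise ValueError(f"unknown classes: {unknown}")
    levels = {LEVELS[c] for c in from_classes}
    if len(levels) != 1:
        raise ValueError(f"mixed levels in FROM clause: {sorted(levels)}")
    return levels.pop()


print(check_same_level(["River", "Road"]))          # object-level query
print(check_same_level(["Orography", "Construction"]))  # layer-level query

try:
    check_same_level(["Cell", "River"])             # the ambiguous query
except ValueError as err:
    print("rejected:", err)
```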




The kind of knowledge to be mined
<Spatial_Data_Mining_Statement> ::=
        <Limited_OQL_Query>
        mine
        <Kind_of_Pattern>
<Kind_of_Pattern> ::=
<Classification_Rules> | <Association_Rules>

<Classification_Rules> ::=
        classification as <Pattern_Name>
        for <Classification_Concept>{,<Classification_Concept>}
         [analyze <Descriptor> {, <Descriptor>}]
The analyze clause indicates that the description of the selected data is
  based on the spatial/aspatial descriptors in the list

Example
SELECT x
FROM x in Cell
WHERE x->num_cell >= 5 AND x->num_cell <= 12
mine classification as MorphologicalElements
for class(_)=system_of_farms, class(_)=fluvial_landscape
analyze      contain/2, type_of/1, subtype_of/1,
             area/1, density/1, extension/1,
             line_shape/1, geographic_direction/1,
             line_to_line/2, distance/2, line_to_region/2,
             region_to_region/2, point_to_region/2


Defining background knowledge

 In SDMOQL the BK is defined as a set of definite clauses.
 Example:
define knowledge
  close_to(X,Y)=true :- region_to_region(X,Y)=meet.
  close_to(X,Y)=true :- close_to(Y,X)=true.
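
The second clause is recursive, so a naive top-down evaluation could recurse forever. The Python sketch below shows one way to read the two clauses, as a bottom-up fixpoint computation; this is an assumption made for illustration, since the slides do not say how the deductive DBMS evaluates background knowledge. The function name is hypothetical.

```python
def close_to_closure(meets):
    """meets: set of (X, Y) pairs for which region_to_region(X,Y)=meet.
    Returns the set of (X, Y) pairs for which close_to(X,Y)=true."""
    close = set(meets)                 # clause 1: meet implies close_to
    changed = True
    while changed:                     # iterate until no new fact is derived
        changed = False
        for (x, y) in list(close):
            if (y, x) not in close:    # clause 2: close_to is symmetric
                close.add((y, x))
                changed = True
    return close


facts = {("x2", "x21")}                # region_to_region(x2,x21)=meet
print(sorted(close_to_closure(facts)))
```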




Defining schema hierarchies

 Define a total or partial order among attributes in the database
  schema.
 Example:

Activity
  business_activity
    low_business_activity
    high_business_activity
  other_activity

define hierarchy Activity as
  level1:{business_activity, other_activity} < level0: Activity;
  level2:{low_business_activity,high_business_activity} < level1:
  business_activity;
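
A hierarchy like this one can be used to generalize a value to a coarser level. The Python sketch below is a hypothetical encoding of the Activity hierarchy as a child-to-parent map; the `generalize` function is an assumption for illustration and not an SDMOQL construct.

```python
# Child -> parent links of the Activity hierarchy defined above.
PARENT = {
    "business_activity": "Activity",
    "other_activity": "Activity",
    "low_business_activity": "business_activity",
    "high_business_activity": "business_activity",
}


def generalize(concept, to_level):
    """Return the ancestor of `concept` at level `to_level` (level 0 is the root).
    Raises ValueError if the concept sits above the requested level."""
    path = [concept]
    while path[-1] in PARENT:          # climb up to the root
        path.append(PARENT[path[-1]])
    depth = len(path) - 1              # level of the concept itself
    if to_level > depth:
        raise ValueError(f"{concept} is at level {depth}, above level {to_level}")
    return path[depth - to_level]


print(generalize("low_business_activity", 1))
print(generalize("low_business_activity", 0))
```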
Defining set-grouping hierarchies
 Organize values for given attributes or dimensions into groups of
  constants or range of values
 Example:

Distance
  near: 0 m … 1,999 m
  far: 2 Km … +∞


define hierarchy Distance for distance/2 as
  level1:{far, near} < level0: Distance;
  level2:{0, 1999} < level1: near;
  level2:{2000, +inf} < level1: far;
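
Applied to a numeric descriptor, a set-grouping hierarchy amounts to a discretization. The Python sketch below maps a distance value to its level-1 group; the names are hypothetical, and it assumes that distances are expressed in metres and that both range bounds are inclusive, neither of which is stated in the definition.

```python
import math

# (label, lower bound, upper bound) for the level-2 ranges of distance/2.
DISTANCE_GROUPS = [
    ("near", 0, 1999),
    ("far", 2000, math.inf),
]


def distance_level1(value_m):
    """Map a distance in metres to its level-1 group ('near' or 'far')."""
    for label, low, high in DISTANCE_GROUPS:
        if low <= value_m <= high:
            return label
    raise ValueError(f"{value_m} is not covered by any range")


print(distance_level1(5.0))
print(distance_level1(2500))
```

With these bounds a value strictly between 1999 and 2000 is not covered by any range, so the sketch raises an error for it rather than guessing a group.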
Interestingness measure specification
 threshold values: e.g. the user can set thresholds such
   as confidence and support as follows:
           ThresholdParameter threshold Value
 search biases in the hypotheses space: The user can
   specify a number of preference criteria, such as
   maximization of the number of covered examples or
   minimization of the number of variables in the body of a
   learned clause, according to the following syntax:
 preference criteria (minimize | maximize ) Criterion
                    with tolerance Value.
 generic input parameter of a data mining algorithm:
                 ParameterName = Value
An example

Problem: locate a “sistema poderale” (system
 of farms) in maps of Apulia.
The user browses the maps with INGENS and
 finds some examples of systems of farms …




An example: the data
… and some counterexamples




An example: the DM query
 Formulate a data mining task through SDMOQL:
SELECT x FROM x in Cell
WHERE(x->num_cell>=1 AND x->num_cell<=6) OR x->num_cell=11
   OR x->num_cell=34 OR (x->num_cell>=15 and x->num_cell <= 17)
mine classification as MorphologicalElements
for class(X)=system_of_farms
analyze contain/2, type_of/1, subtype_of/1, color/1, altitude/1,
   area/1, density/1, extension/1, line_shape/1, geographic_direction/1,
        line_to_line/2, distance/2, line_to_region/2,
        region_to_region/2, point_to_region/2         with preference
   criteria
minimize negative_example_covered with tolerance 0.6,
maximize positive_example_covered with tolerance 0.4,
minimize cost with tolerance 0.4
number_of_rules threshold 15, consistent threshold 500
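
The WHERE clause of the query picks the cells that the mining step works on. As a quick check of which cell numbers it matches, the same condition can be written in Python; the scan over cell numbers 1 to 40 is an arbitrary assumption made for this illustration, not the actual number of cells in the map.

```python
def selected(num_cell):
    """Same condition as the WHERE clause of the SDMOQL query."""
    return ((1 <= num_cell <= 6)
            or num_cell == 11
            or num_cell == 34
            or (15 <= num_cell <= 17))


# Assumed range of cell numbers, chosen only to exercise the condition.
cells = [n for n in range(1, 41) if selected(n)]
```

Under that assumption the condition matches eleven cells: 1 to 6, 11, 15 to 17, and 34.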
An example: the process
[Diagram of the process. Components shown: query of spatial data mining, data mining algorithms, visualization, discovered knowledge, map descriptor, symbolic descriptions, deductive database, object-oriented DBMS, object-oriented database.]



An example: results

class(S1)=system_of_farms ←
   contain(S1,S2)=true, region_to_region(S2,S3)=meet,
   area(S2)∈[68437.5 .. 187525],
   region_to_region(S2,S4)=disjoint,
   region_to_region(S4,S3)=meet, type_of(S1)=cell,
   type_of(S2)=parcel, type_of(S4)=parcel,
   type_of(S3)=parcel

There are two pairs of adjacent parcels, (S2, S3) and (S4,
  S3); one parcel, S2, is relatively large (its area is between
  68437.5 and 187525 m²).
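
To make the reading of this rule concrete, the following Python sketch tests whether a symbolic description satisfies the rule body by trying every assignment of objects to S1..S4. It is a brute-force illustration under stated assumptions, not the prover used by INGENS: the dictionary encoding of facts is hypothetical, distinct variables are assumed to denote distinct objects, and each fact must be stored with its arguments in the order used by the rule.

```python
from itertools import permutations


def covers_system_of_farms(objects, facts):
    """Return True if some binding of S1..S4 satisfies the rule body.

    `facts` maps (predicate, argument tuple) to a value, for example
    ("region_to_region", ("p1", "p2")) -> "meet".
    """
    def holds(pred, args, value):
        return facts.get((pred, args)) == value

    for s1 in objects:
        if not holds("type_of", (s1,), "cell"):
            continue
        others = [o for o in objects if o != s1]
        # Assumption: different variables are bound to different objects.
        for s2, s3, s4 in permutations(others, 3):
            area = facts.get(("area", (s2,)))
            if (holds("contain", (s1, s2), True)
                    and holds("region_to_region", (s2, s3), "meet")
                    and area is not None and 68437.5 <= area <= 187525
                    and holds("region_to_region", (s2, s4), "disjoint")
                    and holds("region_to_region", (s4, s3), "meet")
                    and all(holds("type_of", (s,), "parcel")
                            for s in (s2, s3, s4))):
                return True
    return False
```

The exhaustive search is exponential in the number of variables, so it only suits small descriptions; it is meant to show what "covering" a cell means, not how to do it efficiently.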

An example: results

class(S1)=system_of_farms ←
   contain(S1,S2)=true, region_to_region(S2,S3)=disjoint,
   density(S3)=high, region_to_region(S2,S4)=meet,
   region_to_region(S4,S5)=meet, region_to_region(S2,S5)=meet,
   type_of(S1)=cell, area(S2)∈[12381.2 .. 25981.2], type_of(S2)=parcel

There are three mutually adjacent regions (S2, S4, S5), one of which (S2) is
   certainly a medium-sized parcel (its area is between 12381.2 and 25981.2
   m²), and there is a fourth region (S3) with a high density
   (presumably of vegetation), disjoint from the parcel S2.



An example: use of results
• The user asks INGENS to find all cells in the Canosa map
  that are classified as system of farms and contain a main
  road.
   SELECT C
   FROM M in Map, C in Cell, R in Road
   WHERE M->name = "Canosa" AND C->map = M AND R->log_incell = C AND
     R->type_road = "main_road" AND class(C) = system_of_farms
• To check the condition defined by the predicate
  class(C)=system_of_farms, the Query Interpreter
  generates the symbolic description of each cell in the map
  and asks the Query Engine of the Deductive Database to
  prove the goal class(C)=system_of_farms given the logic
  program previously learned.
Conclusions and future work
• A query language for spatial data mining based on OQL
• A solution to the problem of integrating different
   technologies (OODBMS, deductive database, DM, …)
• Differences with respect to traditional DMQL
• Implementation of the interpreter in INGENS
Future work
• Extension of the set of descriptors automatically
   extracted from a vectorized map
• Extension to other spatial data mining tasks supporting
   quantitative interpretation of maps
