ISO/IEC 11179 Standard Edition 3 Revision Issues
(also ISO/IEC 20944)

F. Olken, K.D. Keck, J. McCarthy, B. Bargmeyer
Lawrence Berkeley National Laboratory
Presented to XMDR Project Meeting, July 14, 2005, Berkeley, CA
Version 026
2005-07-14 F. Olken, et al. 1

Outline of Talk

● Goals of Revisions to ISO/IEC 11179
● Review of ISO/IEC Revision Process
● Review of Major Issue Areas
● Detailed discussion of issue proposals
● Attempt to reach consensus on each proposal and subsequent recommendation to L8
● Add new issues as necessary

Goals for ISO/IEC 11179 Revisions

● Facilitate incorporation of terminologies, ontologies, axioms
  – To provide names, definitions for data element concepts, value domains, ...
  – To capture more precise semantics
  – To support semantics-based computing (SOA, Semantic web services, ...)
  – To support ontology life cycle management
● Better capture provenance of metadata (for traceability)
● Better support for relationships among concepts, metadata information, ...
● Better support DB integration:
  – For federated/mediated queries
  – For data exchange
  – By capturing complete DB schemas, ontologies, terminologies, ...

Issue Management for ISO/IEC 11179 Revisions

● LBNL (or others) prepare draft issue document
● Discuss issue at XMDR meeting
● Proposed issue is discussed at L8 meeting
● Revisions to proposed issue document
● Proposed issue is discussed again at L8 meeting
● Issue given official consideration by L8
● Discuss issue again at XMDR meeting
● Further discussion of issue by L8
● Recommendation to JTC1 SC32 WG2
● WG2 debates issue, votes adoption/rejection
● Issue draft incorporated into 11179 Edition 3 Draft
● 11179 Edition 3 goes to committee draft, draft international standard, international standard

Overview

● Issues related to Relationships
● Issues related to Concepts
● Issues related to Ontologies and Axioms
● Issues related to Type System
● Internationalization Issues

Relationships


Relationships Issues

● Binary relationships
● Graph theoretic relationship constraints
● N-ary relationships
● Dependency constraints (key, foreign key, ...)?
● Administered items ....
● Standard semantic relationships

Uses of Relationships

● Semantic Relationships
  – Used for terminologies, ontologies
  – Examples include: is-a, instance-of, part-of, ...
● Capture Relationships From DB Schemas, ...

Binary Relationships

● Most relationships in terminology and ontology systems are binary
● Used in NIAM
● Can be represented as directed graphs
● Properties:
  – Reflexive/irreflexive, symmetric/anti-symmetric, transitive
  – Cardinality constraints
  – Type constraints (need a type system)
  – Name, name of inverse
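The properties listed above can be checked mechanically once a binary relationship is stored as a set of (subject, object) pairs. The following is a minimal sketch, not part of the standard; the function names and the sample `is_a` data are illustrative assumptions.

```python
from itertools import product

# A binary relationship is modelled as a set of (subject, object) pairs.

def is_reflexive(nodes, edges):
    return all((n, n) in edges for n in nodes)

def is_irreflexive(nodes, edges):
    return all((n, n) not in edges for n in nodes)

def is_symmetric(edges):
    return all((b, a) in edges for a, b in edges)

def is_antisymmetric(edges):
    return all(a == b or (b, a) not in edges for a, b in edges)

def is_transitive(edges):
    # For every chain a -> b -> d, the shortcut a -> d must be present.
    return all((a, d) in edges
               for (a, b), (c, d) in product(edges, edges) if b == c)

# Hypothetical sample data: a tiny is-a taxonomy, already transitively closed.
nodes = {"animal", "mammal", "dog"}
is_a = {("dog", "mammal"), ("mammal", "animal"), ("dog", "animal")}
```

Recording these flags on the relationship type, rather than recomputing them, is what lets a registry choose storage and inference strategies in advance.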

Graph Theoretic Relationship Constraints

● Undirected vs. Directed graphs
● Bipartite graphs vs. non-bipartite
● Trees
● DAGs (Directed Acyclic Graphs)
● Partial Order (DAG + transitivity)
● Simple Graphs = nodes + edges (pairs of nodes)
● Nested Graphs = graphs within nodes
● Compound Graphs = edges connect subgraphs (used in conceptual graphs) [cf. Named graphs]
● Hypergraphs (variously, edge = set of nodes, edge connects sets of nodes) [used for relational schemas]

Why do we care about graph theoretic constraints on relationships?

● They affect relationship representation.
● They affect choice of algorithms for searching, comparison, inference
● These are semantic constraints
● Taxonomies are always partial orders!
● Partonomies are DAGs, maybe partial orders
● Conceptual graphs, schema matchings are bipartite graphs
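As an illustration of how such constraints could be verified, the sketch below tests whether a set of edges forms a DAG and computes the transitive closure that turns a DAG into a (strict) partial order. It assumes Python 3.9+ for the standard `graphlib` module; the `part_of` data and function names are made up for the example.

```python
from graphlib import TopologicalSorter, CycleError

def is_dag(edges):
    """True if the directed edges (child, parent) contain no cycle."""
    graph = {}
    for child, parent in edges:
        graph.setdefault(child, set()).add(parent)
        graph.setdefault(parent, set())
    try:
        # static_order() raises CycleError when a cycle exists.
        list(TopologicalSorter(graph).static_order())
        return True
    except CycleError:
        return False

def transitive_closure(edges):
    """Add every edge implied by chaining existing edges."""
    closure = set(edges)
    changed = True
    while changed:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        changed = not new <= closure
        closure |= new
    return closure

# Hypothetical partonomy: a DAG that is not yet transitively closed.
part_of = {("finger", "hand"), ("hand", "arm")}
```

A registry that knows a relationship type is constrained to be a DAG can reject an edge that would introduce a cycle before storing it.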

N-ary Relationships

● Relationships with arbitrary (fixed) arity:
  – e.g., ternary, ...
● Basis for relational data models, entity-relationship modeling
● More concise than binary relationships
● Properties:
  – Analogs for reflexive, symmetric, transitive properties?
  – Cardinalities
  – Roles: names of roles, names of inverses, ...

Dependency Constraints

● Functional, inclusion dependencies – keys, foreign keys
● Usually captured during relational DB design
● Enforced by relational DBMS integrity constraints
● Important for data exchange, querying multiple DBs
● Not yet included in ISO 11179
● Inclusion dependencies are a generalization of a controlled vocabulary (stronger than a type constraint)
● Dependency constraints exist in SQL, XML Schema, OCL
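To make the inclusion dependency idea concrete, here is a minimal sketch of a foreign-key style check over in-memory rows. The table contents, column names and the function `inclusion_violations` are hypothetical, chosen only to show why such a dependency behaves like a controlled vocabulary.

```python
def inclusion_violations(child_rows, child_col, parent_rows, parent_col):
    """Return child rows whose value does not appear in the parent column."""
    allowed = {row[parent_col] for row in parent_rows}
    return [row for row in child_rows if row[child_col] not in allowed]

# Hypothetical data: the parent table acts as a controlled vocabulary.
countries = [{"code": "US"}, {"code": "CA"}, {"code": "JP"}]
addresses = [
    {"id": 1, "country": "US"},
    {"id": 2, "country": "XX"},  # violates the inclusion dependency
]

bad_rows = inclusion_violations(addresses, "country", countries, "code")
```

A type constraint would only require `country` to be a two-letter string; the inclusion dependency additionally requires the value to be one of the registered codes.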

Standard Semantic Relations

● For new ontologies, ... we would like a standard set of semantic relations
● This facilitates ontology comparison, integration, ...
● Obvious candidates include:
  – Is-a, instance-of, part-of, and their inverses
● Other candidate sets of semantic relations
  – From UMLS metathesaurus
  – From Foundational Model of Anatomy
  – .....

Modified Classification v006

Relationship Types - v003

Modified Classification v005

Relationship Types - v002

Issue 136: Directed (Binary) Relationships

● Problem: At present we are unable to specify asymmetric (binary) relationships, because the existing standard does not differentiate between the ends (roles) of a relationship.
● Solution: Make (binary) relationships asymmetric. Label relationship ends (roles) separately, i.e., subject and object.
● See following diagram.
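A minimal sketch of what labelled ends buy us: once subject and object are distinct roles, "wheel part-of car" and "car part-of wheel" are different assertions, and an inverse can be derived by swapping the roles. The class and field names are illustrative assumptions, not the names used in the proposal's diagram.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Relationship:
    relationship_type: str  # still a plain string here (see Issue 138)
    subject: str            # the "from" end of the directed relationship
    object_: str            # the "to" end of the directed relationship

    def inverse(self, inverse_type):
        """Swap the roles and apply the name of the inverse relationship."""
        return Relationship(inverse_type, self.object_, self.subject)

wheel_part_of_car = Relationship("part-of", "wheel", "car")
car_has_part_wheel = wheel_part_of_car.inverse("has-part")
```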

Issue 136: Directed (Binary) Relationships


Issue 138: Relationship Type Description

● Problem:
  – Relationship types are merely strings, precluding the possibility of attributes on relationship types, etc.
● Solution:
  – Make relationship type an object, connect to relationship via an association.
  – Relationship type attributes include:
    ● Reflexivity, anti-reflexivity, symmetry, anti-symmetry, transitivity
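The sketch below shows one way the proposed solution might look in code: the relationship type becomes an object carrying the listed attributes, and each relationship refers to it. All names (`RelationshipType`, `implied_by_symmetry`, the sample types) are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RelationshipType:
    name: str
    inverse_name: str
    reflexive: bool = False
    anti_reflexive: bool = False
    symmetric: bool = False
    anti_symmetric: bool = False
    transitive: bool = False

@dataclass(frozen=True)
class Relationship:
    rel_type: RelationshipType  # association to the type object, not a string
    subject: str
    object_: str

def implied_by_symmetry(rel):
    """Return the mirrored relationship if the type is symmetric, else None."""
    if rel.rel_type.symmetric:
        return Relationship(rel.rel_type, rel.object_, rel.subject)
    return None

PART_OF = RelationshipType("part-of", "has-part", anti_reflexive=True,
                           anti_symmetric=True, transitive=True)
SIBLING_OF = RelationshipType("sibling-of", "sibling-of", anti_reflexive=True,
                              symmetric=True)
```

With a bare string such as "sibling-of" the registry could not know that the mirrored assertion follows; with a type object the attribute is available to any consumer.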

Issue 138: Relationship Type Description

Issue 138: Relationship Type Description (n)

Issue 139: Rename Relationships in Figure 3 of Part 3 of ISO/IEC 11179

Issue 139: Rename Relationships in Figure 3 of Part 3 of ISO/IEC 11179

● Problem: The relationship names in Figure 3 Part 3 do not reflect standard naming conventions and are confusing.
● Solution:
  – Rename the horizontal role and association names in Clause 4.7.3
  – Replace “specifying” and “representing” with “has domain” and “domain of”
  – Replace “data element representation” with “data_element_value_domain_relationship”

Issue 140: Make “relationship types” into “administered items”

Issue 146: Add cardinality constraints for both ends of binary relationships

Issue 155: Reify reference relationships to permit types of reference relationships

Issue 162: Default set of “relationship types”

● Problem: Absence of a default set of relationship types (a.k.a. semantic relationships) invites every user to adopt their own conventions – this is a barrier to ontology integration, etc.
● Solution: Adopt a default set of “relationship types” for use in new ontologies/schemas, ...

Concepts


Concept Issues

● Kinds of concepts
  – Data element concepts = (object, property)
  – Concepts in ISO/IEC 11179 Part 2 Classification systems
  – Our discussion focuses on revisions to concept modeling in ISO/IEC 11179 Part 2 Classification Systems

Revisions to Concepts: Simplification and Modularization

● Data Element Concept has properties object class and property
● ISO/IEC 11179 Edition 2
  – These reference instances of classes Object_Class and Property, which are subclasses of Administered_Item
● ISO/IEC 11179 Edition 3 proposal
  – These reference instances of the class Concept, which are contained in Concept_System, which is a subclass of Administered_Item
  – This avoids the need to replicate relationships of Concept
  – Also provides a means for modularizing Concepts

Revisions to Concepts v002

Ontologies and Axioms

Ontologies and Axioms

● Semantic Transparency
● Notations
● Granularity, Modularity

Semantic Transparency

● Are axiom sets, etc. considered to be opaque blobs, or are they transparent to the registry – is the registry “aware” of the content of the axiom set, ...?
● How much of the semantics of the notation does the registry understand?
● Alternatives:
  – Opaque blob = bag of bits
  – Extract symbols
  – Extract symbols + taxonomic relations
  – ....
  – Store axioms as parse trees + symbol tables

Semantic Transparency Considerations

● Opaque blobs:
  – Easiest to implement
  – Least useful
● More transparent treatments:
  – Requires additional processing of axiom sets, ...
  – Requires additional indexing, query facilities
  – Either standard notation, or multiple parsers needed

Semantic Transparency Recommendation

● Blobs plus extraction of symbols, taxonomic relations
  – i.e., is-a, instance-of, part-of, ...
  – Perhaps broader and narrower term (KDK)
● Intermediate effort
● Intermediate utility

Notations for Axioms, Ontologies, ...

● Ontologies may be encoded in:
  – OWL-DL, OWL-full, OWL-Lite
  – KIF, XCL, XCG (XML Conceptual Graphs)
  – UML/XMI, MOF/XMI, CWM/XMI, ...
  – OWI, OIL+DAML
  – Ontylog, CycL, CycML, ...
  – XTM (XML Topic Maps)
● Alternatives
  – Notation Agnostic
  – Standardize Notation
  – Translate to Standard Notation

Implications of Notation Decisions

● Notation Agnostic (with notation designation)
  – Allows great freedom of notation in registries
  – Makes it very difficult to compute over axioms, ontology fragments, ontology definitions, ontologies
● Standardize Notation
  – Extremely difficult to enforce, limits utility of registry
● Translate to Standard Notation
  – Considerable investment (unless automated?)
  – Requires a very powerful Standard Notation:
    ● Either full First Order Logic, possibly Higher Order Logic
    ● Problems with decidability, computational complexity

Current Notation Proposal

● Notation agnosticism
  – Do NOT standardize notation
● Implications:
  – Requires notation annotation for all definitions, axiom sets, ontologies, ontology fragments
  – Suggests use of a controlled vocabulary (taxonomy?) of notations
  – Either:
    ● Constrains registry computations over definitions, axiom sets, ontologies, etc.
    ● Or requires multiple (or multi-lingual) reasoners

Granularity and Modularity of Registration / Administration for Ontologies, Axiom Sets, etc.

● At what granularity should ontologies, axiom sets, etc. be registered / administered?
● Fine granules
  – Permits fine grained tracking of changes, provenance
  – Requires additional effort to record metadata
  – Ignores the need to GROUP axioms, etc. into meaningful units for release, etc.

Granularity / Modularity (cont.)

● Coarse granules
  – Less effort to record metadata
  – Permits grouping of individual axioms, etc. into meaningful collections, releases, ...
  – Obscures detailed record of changes, provenance

Granularity / Modularity

● Two separate issues
  – Addressability granularity – ideally very fine grained
  – Administrative granularity – possibly coarser, to reflect groupings of axioms, etc.
● Modularity
  – Would like to be able to assemble axioms, etc. into variable size modules, modules of modules ... akin to module management in programming languages

Granularity/Modularity Recommendations

● Fine grained addressability of axioms, ...
● Possibility of coarser groupings for administration
● Support for hierarchical (DAG) module structures
● Modules = administrative units (?)
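One possible reading of these recommendations, sketched under assumptions: axioms keep individual identifiers (fine grained addressability), while modules group axioms and other modules into a DAG that can serve as the administrative unit. The `Module` class, its fields and the sample identifiers are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Module:
    name: str
    axiom_ids: set = field(default_factory=set)      # individually addressable axioms
    submodules: list = field(default_factory=list)   # modules of modules (DAG)

def all_axioms(module, _seen=None):
    """Collect every axiom id reachable from a module, visiting shared submodules once."""
    seen = set() if _seen is None else _seen
    if module.name in seen:
        return set()
    seen.add(module.name)
    result = set(module.axiom_ids)
    for sub in module.submodules:
        result |= all_axioms(sub, seen)
    return result

# Hypothetical structure: "core" is shared by two modules, so this is a DAG, not a tree.
core = Module("core", {"ax1", "ax2"})
time = Module("time", {"ax3"}, [core])
space = Module("space", {"ax4"}, [core])
release = Module("release-1", set(), [time, space])
```

Administrative metadata could then be recorded once per module (for example per release) while change tracking still refers to individual axiom identifiers.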

Issue 143: Register Formal Definitions

● Problem:
  – We have no mechanism to register formal definitions.
● Solution:
  – Create generalization hierarchy of definition, natural language definition, and formal definition
  – Attributes of formal definition include notation (KIF, OWL, ...) and (natural) language (of symbols, etc.)
  – Farance suggests collapsing notions into a single object (definition) and treating notation as a language.

Issue 143: Register Formal Definitions


Issue 148: Register formal axioms

● Problem:
  – We have no way to register axioms (sets of axioms), hence no way to register portions of axiomatized ontologies.
● Solution:
  – Add an axiom set object, with notation attribute.

Issue 148: Register formal axioms


Issue 150: Represent “rule templates”

● Requested by Pragati Systems
● Problem:
  – No way to capture “rule templates” in data model. These are used to partition large axiom sets.
● Solution:
  – Add “rule template set” objects, notation attribute, ...

Issue 150: Represent “rule templates” in data model

Types

Type Systems Issues

● Current Status:
  – ISO/IEC 11179 has a weak type system
  – Views types as an issue of representation (e.g., in value domains)
  – Does not view types as part of semantic specification
  – Unable to specify “types” as part of constraints on relationships among “concepts”

Why worry about types?

● Type constraints are ubiquitously used in DB design, KR, and programming languages
● Types are not just issues of “representation” but often capture important semantics:
  – Integers support addition, subtraction, and multiplication. Rational numbers allow division; real numbers allow square root, etc.
  – Ultimately need ability to specify various types of mathematical structures (scalars, vectors, tensors, matrices, geometric structures: lines, polygons, polytopes, ...)
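A small illustration of the point that a type determines which operations are meaningful, independent of how values are stored. This sketch uses only Python's standard `fractions` and `math` modules; the variable names are arbitrary, and floating point stands in (imperfectly) for the reals.

```python
from fractions import Fraction
import math

a, b = 7, 2

# Integers are closed under addition, subtraction and multiplication.
total, difference, product_ = a + b, a - b, a * b

# Exact division needs rational numbers; integer division loses information.
truncated = a // b           # stays an integer, but (a // b) * b != a
quotient = Fraction(a, b)    # exact rational result

# Square roots generally need reals (approximated here by floating point).
root = math.sqrt(2)
```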

Semantic Types vs. Representational Types

● Given common usage of types to refer to aspects of representation we may need to distinguish “semantic types” from “representational types”
● KDK suggests “abstract types” vs. “concrete types”

References


Issue 154: Add Dublin Core Attributes to “References”

● Problem: At present “references” treat bibliographic citations as a text blob, precluding selective searching and causing loss of semantic structure when round-tripping from more detailed sources.
● Solution: Add Dublin Core Metadata attributes
● Comment: Farance, et al. want to allow other bibliographic citation attribute sets, e.g., MARC
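To show what structured citations enable, here is a minimal sketch of a reference record using a few Dublin Core element names (title, creator, date, publisher, identifier) as fields. The `Reference` class, the helper `by_creator` and the sample records are placeholders invented for the example, not actual registry content.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Reference:
    # Field names follow Dublin Core elements; only a subset is shown.
    title: str
    creator: tuple          # one entry per creator
    date: str
    publisher: str = ""
    identifier: str = ""

def by_creator(references, name):
    """Selective search that a free-text citation blob cannot support reliably."""
    return [r for r in references if name in r.creator]

# Placeholder records, not real citations.
refs = [
    Reference("Example Report A", ("Author One", "Author Two"), "2004"),
    Reference("Example Report B", ("Author Two",), "2005",
              identifier="urn:example:b"),
]
```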

Issue 154: Add Dublin Core Attributes to “References”

Internationalization

Internationalization Issues

● Issues revolving around support for multiple natural languages and character sets
  – Treatment of Language Identifiers
  – Treatment of Character Encodings

Which governments care about Internationalization?

● US domestic agencies: English, Spanish (?)
● US DOD, State, CIA, FBI: Yes, multiple languages
● EU: Yes, multiple languages
● China: Yes, Mandarin, English, ....
● Japan: Yes, Japanese, Mandarin, Korean, English, ...
● Korea: Yes, Korean, Mandarin, Japanese, English, ...
● Canada: Yes, English, French, ...

Commercial Aspects of Internationalization

● Support for multi-national corporations
● Support for international trade
● Support for global marketing of software
  – IBM, Microsoft, Oracle, Sun, BEA, ....
  – This is the most important driver for internationalisation of IT standards

Language Identifiers – Current Status

● Status in Edition 2 of ISO/IEC 11179
  – ISO/IEC 11179 Language Identifiers do not conform to IETF RFC 3066
  – Language IDs: language code + country code (for dialect)
  – IETF RFC 3066:
    ● Specifies use of 2-letter language codes (from ISO 639) in preference to 3-letter language codes (from ISO 639) where possible
    ● Allows UN region codes in addition to country codes

Language Identifiers – The Problem

● IETF RFC 3066 language tags are the most popular language codes. Used in:
  – MIME
  – XML
  – HTML
  – Java (?)
● Use of non-standard language IDs requires translation of language IDs whenever importing/exporting from other formats (i.e., always). Extra work, lossy transformations (?), ...
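The translation burden can be pictured with a small sketch. It assumes, purely for illustration, that a registry stores a 3-letter ISO 639-2 language code plus an optional country code, and converts that pair to an RFC 3066 style tag that prefers the 2-letter code when one exists. The mapping table is a tiny hand-picked subset and `to_rfc3066` is a hypothetical helper, not an existing API.

```python
# Partial mapping from ISO 639-2 (3-letter) to ISO 639-1 (2-letter) codes.
# Only a few entries are listed; a real table would be far larger.
ISO639_2_TO_1 = {
    "eng": "en", "fra": "fr", "spa": "es",
    "jpn": "ja", "kor": "ko", "zho": "zh",
}

def to_rfc3066(language, country=None):
    """Build an RFC 3066 style tag, using the 2-letter code where one exists."""
    lang = language.lower()
    primary = ISO639_2_TO_1.get(lang, lang)  # fall back to the 3-letter code
    return f"{primary}-{country.upper()}" if country else primary

tag_us_english = to_rfc3066("eng", "us")
tag_hawaiian = to_rfc3066("haw")  # Hawaiian has no 2-letter code
```

Every import or export path would need a conversion of this kind, plus its reverse, which is the extra work the slide refers to.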

Language Identifiers in ISO/IEC 11179

● Use of RFC 3066 language identifiers would facilitate interchange of language metadata
● Recommendation: Change ISO/IEC 11179 Language Identifiers to comply with IETF RFC 3066 language identifiers.
● Implication: will require modification of existing registry contents
● Issue number: not yet registered with WG2

Character Encodings in ISO 20944 & ISO/IEC 11179

● Issue:
  – When, where, and how should character set encoding issues (Latin-1, UTF-8, UTF-16, ...) concerning strings input/output/stored in ISO 11179 registries be recorded?
● Possibilities:
  – Store char set info in ISO/IEC 11179 registry (along with strings)
  – Specify char set info in ISO 20944 (ISO/IEC 11179 API)
  – Do not standardize char set encoding metadata, but leave it to implementations
● Related issue: canonicalization of strings ...

Rationales for various approaches to character set encodings

● Record character set encoding on per string basis in ISO/IEC 11179 metamodel [F. Farance]
  – Preservationist approach – assures preservation of original string encoding when round tripping (store/retrieve from registry)
● Record character set encoding in API (ISO 20944), silence in ISO/IEC 11179
  – Allows flexibility of registry implementation
    ● Various native string implementations, Unicode, ...
  – Risks loss of original encodings

Rationales for Character Set Encoding Decisions

● Silence on character set encodings
  – Maximum flexibility in registry implementation
  – Possible confusion in APIs
  – Risks loss of original encodings
● Canonicalization of all string encodings
  – Facilitates comparison of strings
  – Requires conversion of all non-canonical encodings
  – Risks loss of original encodings
  – Requires agreement on canonicalization

Status of Character Set Encodings

● In Edition 2 there was complete silence
● For Edition 3, L8 has recently approved F. Farance's multi-string approach
  – Preservationist approach
  – Strings in ISO/IEC 11179 consist of:
    ● Character set encoding descriptor
    ● Sequence of octets (bytes)
● Our Recommendation:
  – Move character set encoding specifications to APIs, i.e., ISO/IEC 20944
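The trade-off between the preservationist and canonicalizing approaches can be sketched as follows. The `EncodedString` class mirrors the "descriptor plus octets" structure described above, but its name, fields and methods are assumptions made for this illustration; only Python's built-in codecs are used.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EncodedString:
    encoding: str   # character set encoding descriptor
    octets: bytes   # sequence of octets exactly as supplied

    def text(self):
        """Decode the stored octets into abstract characters."""
        return self.octets.decode(self.encoding)

    def canonical(self, target="utf-8"):
        """Re-encode into one agreed encoding; the original octets are not kept."""
        return EncodedString(target, self.text().encode(target))

# Preservationist storage keeps the Latin-1 octets as received.
original = EncodedString("latin-1", "café".encode("latin-1"))
canonical = original.canonical()
```

The two records carry the same characters, so comparison is easy after canonicalization, but the stored octets differ, which is the loss of original encoding the slides mention.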


								