Rough Sets and Incomplete Information

Inés Couso (1), Didier Dubois (2)
1. Department of Statistics, University of Oviedo, Spain, e-mail: couso@uniovi.es
2. IRIT - CNRS, Université Paul Sabatier, Toulouse, France, e-mail: dubois@irit.fr
May, 2009

Introduction

• Rough sets were introduced to cope with the limited expressivity of descriptions of objects by means of attributes in databases (indiscernibility).
• Another source of uncertainty is the lack of information about objects (incompleteness). Both situations lead to upper and lower approximations of sets of objects.
• Independently, formal definitions of rough sets have been extended to relations other than equivalence relations:
  – Fuzzy similarity relations (fuzzy rough sets induced by fuzzy partitions)
  – Tolerance relations (rough sets induced by coverings)

Goal: define approximations of sets when both indiscernibility and incompleteness are present, and bridge the gap with covering-based rough sets.

Pawlak's rough sets

• Let $f : U \to V$ be an attribute function from a finite set of objects to some domain $V = \{v_1, \dots, v_m\}$ ($f$ may represent a collection of attributes).
• Let $C = f^{-1}(\{v\})$ be the collection of objects associated to $v$.
• The non-empty $C$'s form a partition $\Pi = \{C_1, \dots, C_m\}$ of $U$.
• Upper and lower approximations of an arbitrary set of objects $S$ induced by $f$:
  $$\overline{\mathrm{appr}}_\Pi(S) = \bigcup\{C \in \Pi : C \cap S \neq \emptyset\}; \qquad \underline{\mathrm{appr}}_\Pi(S) = \bigcup\{C \in \Pi : C \subseteq S\}. \qquad (1)$$
  – $S$ is an exact set when $\overline{\mathrm{appr}}_\Pi(S) = S = \underline{\mathrm{appr}}_\Pi(S)$.
  – If not, it is called a rough set. Then $\underline{\mathrm{appr}}_\Pi(S) \subsetneq S \subsetneq \overline{\mathrm{appr}}_\Pi(S)$ is the best we can do to describe $S$ with attribute function $f$.

For instance, in a classification problem, the partition induced by a decision function $d : U \to V$ will be approximated by the partition induced by an attribute function $f$.

Ill-known sets

• A one-to-many mapping $F : U \to \wp(V)$ represents an imprecise attribute function $f : U \to V$.
• How can we describe the set of objects that satisfy a property $A \subseteq V$, namely $f^{-1}(A) \subseteq U$?
• Because of incomplete information, the subset $f^{-1}(A)$ is an ill-known set.

NOTE: $F$ is NOT a set-valued attribute. For each object $u \in U$, all that is known about the attribute value $f(u)$ is that it belongs to the set $F(u) \subseteq V$.

$f^{-1}(A)$ can be approximated by upper and lower inverses of $A$ via $F$:
• $F^*(A) = \{u \in U : F(u) \cap A \neq \emptyset\}$: all objects that possibly belong to $f^{-1}(A)$.
• $F_*(A) = \{u \in U : F(u) \subseteq A\}$: all objects that surely belong to $f^{-1}(A)$.

The pair $(F_*(A), F^*(A))$ is such that $F_*(A) \subseteq f^{-1}(A) \subseteq F^*(A)$. The mappings $F^*$ and $F_* : 2^V \to 2^U$ are Dempster's upper and lower inverses of $F$.

Ill-known rough sets

• In the rough set construction, it is impossible to precisely describe sets defined in extension by means of attribute values, subsets thereof (properties), etc.: insufficient language.
• In the ill-known set construction, it is impossible to give an explicit list of objects defined by means of properties: incompletely informed attributes.

This paper: the case when both sources of imperfection are combined, that is, when a set cannot be described perfectly, neither in extension in terms of properties, nor in intension.

Covering induced by an ill-known attribute function

Consider again the multimapping $F$ between $U$ and $V$. For each value $v \in V$, consider its upper inverse image, the subset of objects of $U$ for which it is possible that $f(u) = v$:
$$F^*(\{v\}) = \{u \in U : v \in F(u)\} \subseteq U.$$
In other words, if $u \notin F^*(\{v\})$, we are sure that $f(u) \neq v$.

COVERING INDUCED BY $F$: $\mathcal{C} = \{F^*(\{v_1\}), \dots, F^*(\{v_m\})\} = \{C_1, \dots, C_k\}$.

Then, it is obvious that:
1. If $F(u) \neq \emptyset$, $\forall u \in U$, then $\mathcal{C}$ is a covering of $U$, i.e. $\bigcup_{i=1}^{m} C_i = U$.
2. $v \in F(u)$ if and only if $u \in F^*(\{v\})$, the set attached to attribute value $v$ in the covering.
3. If $F^*$ is injective, then the covering $\mathcal{C}$ determines $F$ only up to a possible permutation of elements of $V$, i.e. $|\mathcal{C}| = |V|$.

Example

• Let $U = \{u_1, u_2, u_3, u_4\}$.
Let $V = \{v_1, v_2, v_3\}$ and $F(u_1) = \{v_1, v_2\}$, $F(u_2) = \{v_1, v_3\}$, $F(u_3) = \{v_2, v_3\}$, $F(u_4) = \{v_3\}$.
• The covering associated to $F$, $\mathcal{C} = \{C_1, C_2, C_3\}$, is given by:
  $C_1 = F^*(\{v_1\}) = \{u_1, u_2\}$, $C_2 = F^*(\{v_2\}) = \{u_1, u_3\}$, $C_3 = F^*(\{v_3\}) = \{u_2, u_3, u_4\}$.
• If we only know the covering $\mathcal{C} = \{C_1, C_2, C_3\}$, $F$ can then be retrieved (up to a renaming of elements in $V$) as follows:
  $F(u_1) = \{v_k : u_1 \in C_k\} = \{v_1, v_2\}$
  $F(u_2) = \{v_k : u_2 \in C_k\} = \{v_1, v_3\}$
  $F(u_3) = \{v_k : u_3 \in C_k\} = \{v_2, v_3\}$
  $F(u_4) = \{v_k : u_4 \in C_k\} = \{v_3\}$

Interpretation of coverings

• $C_i$ is the class of objects that are possibly in one equivalence class induced by the real information on objects.
• The covering $\mathcal{C}$ encodes an ill-known partition. According to the information provided by $F$, we know that $f$ induces one of the following 7 partitions of $U$:
  $\Pi_1 = \{\{u_1, u_2\}, \{u_3\}, \{u_4\}\}$; $\Pi_2 = \{\{u_1, u_2\}, \{u_3, u_4\}\}$;
  $\Pi_3 = \{\{u_1\}, \{u_2, u_4\}, \{u_3\}\}$; $\Pi_4 = \{\{u_1\}, \{u_2, u_3, u_4\}\}$;
  $\Pi_5 = \{\{u_1, u_3\}, \{u_2\}, \{u_4\}\}$; $\Pi_6 = \{\{u_1\}, \{u_2\}, \{u_3, u_4\}\}$;
  $\Pi_7 = \{\{u_1, u_3\}, \{u_2, u_4\}\}$.

Note:
• There are at most $\prod_{u \in U} |F(u)|$ partitions.
• The covering could be a family of nested sets!

Covering-based rough sets: Y.Y. Yao

The same definitions of rough sets as for a partition can be used, but the duality between upper and lower approximations is lost.
• Y.Y. Yao (1998) considers the following two pairs of approximations:
  – The loose pair:
    $$\overline{\mathrm{appr}}^L_{\mathcal{C}}(S) = \bigcup\{C \in \mathcal{C} : C \cap S \neq \emptyset\}$$
    $$\underline{\mathrm{appr}}^L_{\mathcal{C}}(S) = [\overline{\mathrm{appr}}^L_{\mathcal{C}}(S^c)]^c = \{u \in U : \forall C \in \mathcal{C},\ [u \in C \Rightarrow C \subseteq S]\} = \bigcup\{C \in \mathcal{C} : C \subseteq S \wedge [\nexists C' \in \mathcal{C},\ C' \cap S^c \neq \emptyset \wedge C \cap C' \neq \emptyset]\}$$
  – The tight pair:
    $$\underline{\mathrm{appr}}^T_{\mathcal{C}}(S) = \bigcup\{C \in \mathcal{C} : C \subseteq S\}$$
    $$\overline{\mathrm{appr}}^T_{\mathcal{C}}(S) = [\underline{\mathrm{appr}}^T_{\mathcal{C}}(S^c)]^c = \{u \in U : \forall C \in \mathcal{C},\ [u \in C \Rightarrow C \cap S \neq \emptyset]\} = \bigcup\{C \in \mathcal{C} : C \cap S \neq \emptyset \wedge [\nexists C' \in \mathcal{C},\ C' \subseteq S^c \wedge C \cap C' \neq \emptyset]\}$$
• The loose inner approximation $\underline{\mathrm{appr}}^L_{\mathcal{C}}(S)$ of $S$ includes all elements of the covering included in $S$, but not intersecting the loose outer approximation $\overline{\mathrm{appr}}^L_{\mathcal{C}}(S^c)$ of its complement.
• The tight outer approximation $\overline{\mathrm{appr}}^T_{\mathcal{C}}(S)$ of $S$ includes all elements of the covering intersecting S, but not intersecting the tight inner approximation $\underline{\mathrm{appr}}^T_{\mathcal{C}}(S^c)$ of its complement.

Then:
$$\underline{\mathrm{appr}}^L_{\mathcal{C}}(S) \subset \underline{\mathrm{appr}}^T_{\mathcal{C}}(S) \subseteq S \subseteq \overline{\mathrm{appr}}^T_{\mathcal{C}}(S) \subset \overline{\mathrm{appr}}^L_{\mathcal{C}}(S).$$
The first approximation pair is looser than the second pair of sets.

Covering-based rough sets: Bonikowski

Bonikowski et al. (1998) rely on the duality between intensions (properties) and extensions (sets of objects) along the lines of formal concept analysis. Then, a covering is a set of known concepts or properties.
• The minimal description $M(u)$ of object $u$ is the set of minimal elements in the covering $\mathcal{C}$ that contain $u$.
• The lower approximation of a subset $S$ of objects is chosen as $\underline{\mathrm{appr}}^T_{\mathcal{C}}(S)$.
• The boundary of $S$ is $Bn(S) = \bigcup_{u \in S \setminus \underline{\mathrm{appr}}^T_{\mathcal{C}}(S)} \bigcup_{C \in M(u)} C$.
• The upper approximation is $\overline{\mathrm{appr}}^B_{\mathcal{C}}(S) = \underline{\mathrm{appr}}^T_{\mathcal{C}}(S) \cup Bn(S)$.

The top-class mapping

• Based on the multi-valued mapping $F : U \to \wp(V)$, another multi-valued mapping $I_F : U \to \wp(U)$ is defined:
  $$I_F(u) := F^*(F(u)) = \{u' \in U : F(u') \cap F(u) \neq \emptyset\}, \quad \forall u \in U.$$
  $I_F$ is called the top-class function associated to $F$.
• $I_F(u)$ is the set of objects that could be in the same equivalence class as $u$ if the attribute function were better known: a kind of neighborhood of $u$.
• Associated tolerance relation $R$: $u R u'$ if and only if $u' \in I_F(u)$. Orlowska & Pawlak (1984) interpret $u R u'$ as a similarity between $u$ and $u'$, but this is misleading as it is only potential similarity.

Upper and lower approximations induced by top-class mappings

• In terms of the covering: $I_F(u) = \bigcup\{C \in \mathcal{C} : u \in C\} = \bigcup_{v \in F(u)} F^*(\{v\})$.
• $\bigcup\{C \in M(u)\} \subset I_F(u)$: the latter is wider than the set of objects having the same minimal description.
• Let $I_F : U \to \wp(U)$ be the top-class function associated to $F$.
Let $\overline{\mathrm{appr}}^L_{\mathcal{C}}(S)$ and $\underline{\mathrm{appr}}^L_{\mathcal{C}}(S)$ be Y.Y. Yao's loose upper and lower approximations of $S$. Then:
  – $\overline{\mathrm{appr}}^L_{\mathcal{C}}(S) = (I_F)^*(S) = \bigcup_{u \in S} (I_F)^*(\{u\}) = \{u : I_F(u) \cap S \neq \emptyset\} = \{u : \exists u' \in S,\ u' R u\}$
  – $\underline{\mathrm{appr}}^L_{\mathcal{C}}(S) = (I_F)_*(S) = \bigcap_{u \notin S} (I_F)^*(\{u\})^c = \{u : I_F(u) \subseteq S\} = \{u : \forall u',\ u' R u \text{ implies } u' \in S\}$
• These definitions are thus the natural ones in the setting of incomplete information.

Differences with pure rough sets

• The covering $\mathcal{C}$ provides more information than $R$ and $I_F$.
  Example: $F : U \to \wp(V)$ and $F' : U \to \wp(V)$ defined as follows:
  $F(u_1) = \{v_1, v_2\}$, $F(u_2) = \{v_2, v_3\}$, $F(u_3) = \{v_1, v_3\}$;
  $F'(u_1) = F'(u_2) = F'(u_3) = \{v_1, v_2, v_3\}$.
  They induce the same binary relation $R = U \times U$, but different coverings $\mathcal{C} = \{\{u_1, u_3\}, \{u_1, u_2\}, \{u_2, u_3\}\}$ and $\mathcal{C}' = \{\{u_1, u_2, u_3\}\}$.
• For a property $A \subset V$, $\overline{\mathrm{appr}}^L_{\mathcal{C}}(F^*(A))$ does not necessarily coincide with $F^*(A)$ (the set of objects defined by a property is not representable by the covering).

The selection function approach

• The multiple-valued mapping $F$ represents a set of attribute functions $f$ such that $\forall u,\ f(u) \in F(u)$ ($f$ is a selection of $F$, written $f \in F$ below).
• Each selection $f$ is associated with a possible partition $\Pi_f$ of $U$, with equivalence classes $[u]_f = f^{-1}(f(u))$.
• Each subset $S \subseteq U$ can be approximated with respect to $f$: $\underline{\mathrm{appr}}_{\Pi_f}(S) \subseteq S \subseteq \overline{\mathrm{appr}}_{\Pi_f}(S)$.

Then we can express approximations with respect to the incomplete mapping $F$ in terms of its selections:
• $F^*(\{v\}) = \bigcup_{f \in F} f^{-1}(\{v\})$; $\quad I_F(u) = \bigcup_{f \in F} [u]_f$;
• $\overline{\mathrm{appr}}^L_{\mathcal{C}}(S) = \bigcup_{f \in F} \overline{\mathrm{appr}}_{\Pi_f}(S)$; $\quad \underline{\mathrm{appr}}^L_{\mathcal{C}}(S) = \bigcap_{f \in F} \underline{\mathrm{appr}}_{\Pi_f}(S)$.

Ill-known rough sets as nested 4-tuples of sets

• The tight pair of upper and lower approximations of $S$ by covering $\mathcal{C}$ in the sense of Y.Y. Yao, as induced by $F$, is
  $$\underline{\mathrm{appr}}^T_{\mathcal{C}}(S) = \bigcup\{C \in \mathcal{C} : C \subseteq S\} = \bigcup_{f \in F} \bigcup\{f^{-1}(\{v\}) \subseteq S\} = \bigcup_{f \in F} \underline{\mathrm{appr}}_f(S) \subseteq S$$
  (union of all possible lower approximations).
• Hence, by duality, $\overline{\mathrm{appr}}^T_{\mathcal{C}}(S) = \bigcap_{f \in F} \overline{\mathrm{appr}}_f(S)$. It contains $S$.
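The selection-based expressions for the loose pair can be checked by brute force on the four-object example. The Python sketch below rebuilds the covering from $F$, enumerates every selection $f$ of $F$, and compares the covering-based loose approximations with the union of upper and the intersection of lower Pawlak approximations over all selections. The function names and the test set `S` are illustrative choices, not part of the slides, and the check covers the loose pair only.

```python
from itertools import product

# Running example from the slides: objects, values and the multimapping F.
U = ["u1", "u2", "u3", "u4"]
V = ["v1", "v2", "v3"]
F = {
    "u1": {"v1", "v2"},
    "u2": {"v1", "v3"},
    "u3": {"v2", "v3"},
    "u4": {"v3"},
}

def covering(F, V):
    """Upper inverse images F*({v}) of the singletons; empty blocks are dropped."""
    blocks = [frozenset(u for u in F if v in F[u]) for v in V]
    return [c for c in blocks if c]

def loose_upper(C, S):
    """Union of the covering blocks that intersect S."""
    return frozenset().union(*[c for c in C if c & S])

def loose_lower(C, S, U):
    """Objects all of whose containing blocks lie inside S."""
    return frozenset(u for u in U if all(c <= S for c in C if u in c))

def selections(F, U):
    """All precise attribute functions f with f(u) in F(u) for every u."""
    for values in product(*(sorted(F[u]) for u in U)):
        yield dict(zip(U, values))

def partition(f, U):
    """Partition of U into the equivalence classes [u]_f."""
    return frozenset(frozenset(w for w in U if f[w] == f[u]) for u in U)

def pawlak(f, S, U):
    """Pawlak (lower, upper) approximations of S for a precise f."""
    cls = {u: frozenset(w for w in U if f[w] == f[u]) for u in U}
    lower = frozenset(u for u in U if cls[u] <= S)
    upper = frozenset(u for u in U if cls[u] & S)
    return lower, upper

C = covering(F, V)
S = frozenset({"u1", "u2", "u3"})  # illustrative choice of S
approx = [pawlak(f, S, U) for f in selections(F, U)]
union_upper = frozenset().union(*[up for _, up in approx])
inter_lower = frozenset(U).intersection(*[lo for lo, _ in approx])

print("distinct partitions:", len({partition(f, U) for f in selections(F, U)}))
print("loose upper:", sorted(loose_upper(C, S)), "| union over f:", sorted(union_upper))
print("loose lower:", sorted(loose_lower(C, S, U)), "| intersection over f:", sorted(inter_lower))
```

Enumerating selections is exponential in $|U|$, so this is only a sanity check for small examples; the covering-based formulas are the practical way to compute the loose pair.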
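BASIC CLAIM: An ill-known rough set is the description of a subset $S \subseteq U$ of ill-known objects by means of an imprecise and incomplete attribute function described by a multimapping $F$, and it consists of four subsets:
$$\underline{\mathrm{appr}}^L_{\mathcal{C}}(S) \subseteq \underline{\mathrm{appr}}_f(S) \subseteq \underline{\mathrm{appr}}^T_{\mathcal{C}}(S) \subseteq S$$
$$S \subseteq \overline{\mathrm{appr}}^T_{\mathcal{C}}(S) \subseteq \overline{\mathrm{appr}}_f(S) \subseteq \overline{\mathrm{appr}}^L_{\mathcal{C}}(S)$$

Quality functions of an ill-known rough set

• The upper and lower quality functions of $S$ reflect how well $S$ is described by attribute function $f$:
  $$\overline{q}_f(S) = \frac{|\overline{\mathrm{appr}}_f(S)|}{|U|} \quad \text{and} \quad \underline{q}_f(S) = \frac{|\underline{\mathrm{appr}}_f(S)|}{|U|}$$
• The upper and lower quality functions of $S$ associated to an imprecise representation $F$ of $f$ are ill-known:
  $$\overline{q}_{\mathcal{C}}(S) = \left[\frac{|\overline{\mathrm{appr}}^T_{\mathcal{C}}(S)|}{|U|}, \frac{|\overline{\mathrm{appr}}^L_{\mathcal{C}}(S)|}{|U|}\right] = \left[\frac{|\bigcap_{f \in F} \overline{\mathrm{appr}}_f(S)|}{|U|}, \frac{|\bigcup_{f \in F} \overline{\mathrm{appr}}_f(S)|}{|U|}\right]$$
  and
  $$\underline{q}_{\mathcal{C}}(S) = \left[\frac{|\underline{\mathrm{appr}}^L_{\mathcal{C}}(S)|}{|U|}, \frac{|\underline{\mathrm{appr}}^T_{\mathcal{C}}(S)|}{|U|}\right] = \left[\frac{|\bigcap_{f \in F} \underline{\mathrm{appr}}_f(S)|}{|U|}, \frac{|\bigcup_{f \in F} \underline{\mathrm{appr}}_f(S)|}{|U|}\right]$$

Accuracy of an ill-known rough set

• The accuracy of approximation of $S$ by $f$ is the quantity $\alpha_R(S) = \dfrac{\underline{q}(S)}{\overline{q}(S)} \in [0, 1]$.
• The accuracy of approximation of $S$ by an ill-known $f$ is the interval
  $$\tilde{\alpha}_R(S) = \left[\inf_{f \in F} \frac{\underline{q}_f(S)}{\overline{q}_f(S)},\ \sup_{f \in F} \frac{\underline{q}_f(S)}{\overline{q}_f(S)}\right]$$
  and not
  $$\frac{\underline{q}_{\mathcal{C}}(S)}{\overline{q}_{\mathcal{C}}(S)} = \left[\frac{|\underline{\mathrm{appr}}^L_{\mathcal{C}}(S)|}{|\overline{\mathrm{appr}}^L_{\mathcal{C}}(S)|}, \frac{|\underline{\mathrm{appr}}^T_{\mathcal{C}}(S)|}{|\overline{\mathrm{appr}}^T_{\mathcal{C}}(S)|}\right]$$
  (because the bounds of the latter do not correspond to the same $f$ in numerator and denominator).
• Imprecise rough membership function: the Laplacean probability $P(S \mid u)$ that an object $u$ belongs to $S$ is only known to lie in the interval
  $$\left[\inf_{f \in F} \frac{|[u]_f \cap S|}{|[u]_f|},\ \sup_{f \in F} \frac{|[u]_f \cap S|}{|[u]_f|}\right]$$

Imprecise rough probability

Let $P$ be a probability measure on $U$, and
$$\overline{P}_{\mathcal{C}}(S) := P(\overline{\mathrm{appr}}^L_{\mathcal{C}}(S)), \quad \underline{P}_{\mathcal{C}}(S) := P(\underline{\mathrm{appr}}^L_{\mathcal{C}}(S)), \quad \forall S \subseteq U.$$
Theorem: $\overline{P}_{\mathcal{C}}$ and $\underline{P}_{\mathcal{C}}$ are respectively a plausibility and a belief function.
Proof: This is because $\overline{\mathrm{appr}}^L_{\mathcal{C}}(S)$ is the upper inverse image of $S$ via the top-class mapping $I_F$.

We get an interval $[\underline{P}_{\mathcal{C}}(S), \overline{P}_{\mathcal{C}}(S)]$ that coincides with Pawlak's rough probability if $\mathcal{C}$ is a partition.

It is not clear whether $\overline{P}^T_{\mathcal{C}}(S) := P(\overline{\mathrm{appr}}^T_{\mathcal{C}}(S))$ and $\underline{P}^T_{\mathcal{C}}(S) := P(\underline{\mathrm{appr}}^T_{\mathcal{C}}(S))$ are plausibility and belief functions too.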
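To make the theorem on the loose pair concrete, the following Python sketch treats the top-class mapping $I_F$ of the four-object example as a random set under a probability $P$ and compares the probabilities of the loose approximations with the belief and plausibility computed from the induced mass function. The uniform $P$, the function names and the test set are assumptions made for illustration only.

```python
from fractions import Fraction
from itertools import product

# Four-object example from the slides.
U = ["u1", "u2", "u3", "u4"]
F = {
    "u1": {"v1", "v2"},
    "u2": {"v1", "v3"},
    "u3": {"v2", "v3"},
    "u4": {"v3"},
}

# Uniform probability on U: an assumption made only for this illustration.
P = {u: Fraction(1, len(U)) for u in U}

# Top-class mapping I_F(u) = {w : F(w) and F(u) share at least one value}.
I = {u: frozenset(w for w in U if F[w] & F[u]) for u in U}

def upper_prob(S):
    """P of the loose upper approximation {u : I_F(u) meets S}."""
    return sum((P[u] for u in U if I[u] & S), Fraction(0))

def lower_prob(S):
    """P of the loose lower approximation {u : I_F(u) is included in S}."""
    return sum((P[u] for u in U if I[u] <= S), Fraction(0))

# Mass function of the random set u -> I_F(u) under P.
mass = {}
for u in U:
    mass[I[u]] = mass.get(I[u], Fraction(0)) + P[u]

def bel(S):
    """Belief: total mass of the focal sets included in S."""
    return sum((m for E, m in mass.items() if E <= S), Fraction(0))

def pl(S):
    """Plausibility: total mass of the focal sets meeting S."""
    return sum((m for E, m in mass.items() if E & S), Fraction(0))

subsets = [frozenset(u for u, keep in zip(U, mask) if keep)
           for mask in product([False, True], repeat=len(U))]

for S in subsets:
    # The rough probabilities coincide with Bel and Pl of the random set.
    assert lower_prob(S) == bel(S) and upper_prob(S) == pl(S)
    # Duality between the upper and the lower probability.
    assert upper_prob(S) == 1 - lower_prob(frozenset(U) - S)

S = frozenset({"u1", "u2", "u3"})
print(lower_prob(S), upper_prob(S))
```

For this choice of `S`, only $u_1$ has its whole neighborhood $I_F(u_1) = \{u_1, u_2, u_3\}$ inside `S`, so the printed interval is `1/4 1`.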
Ill-known sets from fuzzy attribute mappings

• A fuzzy mapping $\tilde{F} : U \to [0, 1]^V$ represents an ill-known attribute function $f$.
• How can we describe the set $f^{-1}(A) \subseteq U$ of objects that satisfy a crisp property $A$?
• Because of incomplete information, the subset $f^{-1}(A)$ is an ill-known set bracketed by a pair of fuzzy sets.

For each object $u \in U$, $\mu_{\tilde{F}(u)}(v)$ is the degree of possibility that $f(u) = v$. Define fuzzy sets $\tilde{F}^*(A)$ and $\tilde{F}_*(A)$ as:
• $\mu_{\tilde{F}^*(A)}(u) = \sup_{v \in A} \mu_{\tilde{F}(u)}(v)$: all objects more or less possibly in $f^{-1}(A)$.
• $\mu_{\tilde{F}_*(A)}(u) = \inf_{v \notin A} [1 - \mu_{\tilde{F}(u)}(v)]$: all objects that surely belong to $f^{-1}(A)$.

The pair $(\tilde{F}_*(A), \tilde{F}^*(A))$ is such that $\mathrm{Support}(\tilde{F}_*(A)) \subseteq f^{-1}(A) \subseteq \mathrm{core}(\tilde{F}^*(A))$ and is called a twofold fuzzy set (Dubois - Prade, 1987).

Fuzzy rough sets

• Let $R$ be a fuzzy relation on $U$ that is symmetric, reflexive and min-transitive (a similarity relation).
• Any subset $S \subseteq U$ of objects can be described by a fuzzy rough set defined as a pair of nested fuzzy sets $(\underline{\mathrm{appr}}_R(S), \overline{\mathrm{appr}}_R(S))$:
  – $\mu_{\overline{\mathrm{appr}}_R(S)}(u) = \sup_{u' \in S} R(u, u')$: all objects more or less possibly in $S$.
  – $\mu_{\underline{\mathrm{appr}}_R(S)}(u) = \inf_{u' \notin S} [1 - R(u, u')]$: all objects that surely belong to $S$.
• Again, $\mathrm{Support}(\underline{\mathrm{appr}}_R(S)) \subseteq S \subseteq \mathrm{core}(\overline{\mathrm{appr}}_R(S))$.
• It is in the spirit of the loose approximation pairs of Y.Y. Yao, taking $u R u'$ as $\exists C, C' \in \mathcal{C},\ u \in C,\ u' \in C'$ and $C \cap C' \neq \emptyset$ in the crisp case. It is the tolerance relation induced by the covering.

Fuzzy rough set from a fuzzy attribute mapping

We show that:
1. A fuzzy-valued imprecise attribute function $\tilde{F}$ induces a fuzzy rough set in a natural way. But now, $R$ will not be a similarity relation. It will be reflexive and symmetric, but it will not necessarily be min-transitive.
2. The fuzzy rough set expresses loose fuzzy upper and lower approximations of a crisp rough set.

• Define $R_{\tilde{F}}(u, u') = \sup_{v \in V} \min(\mu_{\tilde{F}(u)}(v), \mu_{\tilde{F}(u')}(v))$.
• Let $S \subseteq U$: define $\overline{\mathrm{appr}}_{\tilde{F}}(S)$ and $\underline{\mathrm{appr}}_{\tilde{F}}(S)$ as follows:
  $$\mu_{\overline{\mathrm{appr}}_{\tilde{F}}(S)}(u) = \mu_{\overline{\mathrm{appr}}_{R_{\tilde{F}}}(S)}(u) = \sup_{u' \in S} R_{\tilde{F}}(u, u'), \quad \forall u \in U,$$
  $$\mu_{\underline{\mathrm{appr}}_{\tilde{F}}(S)}(u) = \mu_{\underline{\mathrm{appr}}_{R_{\tilde{F}}}(S)}(u) = \inf_{u' \notin S} [1 - R_{\tilde{F}}(u, u')], \quad \forall u \in U.$$

Interpretation as families of ill-known rough sets

• Consider the crisp multi-mapping $F_\alpha(u) = \{v \in V : \mu_{\tilde{F}(u)}(v) \geq \alpha\}$.
• Interpretation of the fuzzy mapping $\tilde{F}$ in terms of imprecise probabilities: the probability that $f(u)$ belongs to $F_\alpha(u) = \{v \in V : \mu_{\tilde{F}(u)}(v) \geq \alpha\}$ is greater than or equal to $1 - \alpha$.
• The ill-known fuzzy rough set approximating $S$ can be retrieved as follows:
  – Consider $\mathcal{C}_\alpha = \{F_\alpha^*(\{v\}) : v \in V\}$, the covering induced by $F_\alpha$.
  – Consider $R_\alpha$, the tolerance relation defined by $u R_\alpha u'$ if and only if $F_\alpha(u) \cap F_\alpha(u') \neq \emptyset$. Then
    $$\mu_{\overline{\mathrm{appr}}_{R_{\tilde{F}}}(S)}(u) = \sup\{\alpha \in (0, 1] : u \in \overline{\mathrm{appr}}^L_{\mathcal{C}_\alpha}(S)\}$$
    $$\mu_{\underline{\mathrm{appr}}_{R_{\tilde{F}}}(S)}(u) = \sup\{\alpha \in (0, 1] : u \in \underline{\mathrm{appr}}^L_{\mathcal{C}_\alpha}(S)\},$$
    using the loose upper and lower approximations of Y.Y. Yao with respect to the covering $\mathcal{C}_\alpha$.
  – Each $\alpha$-cut of the fuzzy rough set $(\underline{\mathrm{appr}}_{R_{\tilde{F}}}(S), \overline{\mathrm{appr}}_{R_{\tilde{F}}}(S))$ is a pair of sets bracketing $S$ with probability at least $1 - \alpha$.

Conclusion

• We proposed an interpretation of covering-based rough sets as ill-known rough sets induced both by the ill-observation of attribute values and by the lack of discrimination of the set of attributes.
• An upper and a lower approximation are not enough: in fact the rough approximations are themselves bracketed from above and from below, since they are ill-known.
• Our choice of covering-based generalization of rough sets is justified by the incomplete information semantics.
• Perspectives:
  1. Relate the definition of ill-known rough sets to incomplete information database research (Nakata, especially).
  2. Complete the fuzzy extension by the study of tight pairs of approximations, fuzzy top-class function, etc.
  3. Fuzzy quality indices and fuzzy rough probabilities.
  4. Connection with formal concept analysis.
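As a small numerical companion to the fuzzy construction, the Python sketch below computes the relation $R_{\tilde{F}}$ and the fuzzy upper and lower approximations directly from their definitions, then rebuilds the upper approximation level by level from the loose upper approximations with respect to the coverings $\mathcal{C}_\alpha$. The fuzzy mapping `mu`, the set `S` and all function names are hypothetical values chosen for illustration; only the upper approximation is reconstructed from the cuts here.

```python
# Hypothetical fuzzy attribute mapping: mu[u][v] is the possibility that f(u) = v.
mu = {
    "u1": {"v1": 1.0, "v2": 0.5, "v3": 0.0},
    "u2": {"v1": 0.5, "v2": 1.0, "v3": 0.25},
    "u3": {"v1": 0.0, "v2": 0.25, "v3": 1.0},
}
U = list(mu)
V = ["v1", "v2", "v3"]

def R(u, w):
    """Fuzzy tolerance relation: sup over v of min(mu_u(v), mu_w(v))."""
    return max(min(mu[u][v], mu[w][v]) for v in V)

def fuzzy_upper(S):
    """Membership sup over w in S of R(u, w)."""
    return {u: max((R(u, w) for w in S), default=0.0) for u in U}

def fuzzy_lower(S):
    """Membership inf over w outside S of 1 - R(u, w); equals 1 when S = U."""
    outside = [w for w in U if w not in S]
    return {u: min((1 - R(u, w) for w in outside), default=1.0) for u in U}

def cut(alpha):
    """Crisp multimapping F_alpha(u) = {v : mu_u(v) >= alpha}."""
    return {u: {v for v in V if mu[u][v] >= alpha} for u in U}

def loose_upper(Fa, S):
    """Loose upper approximation of S w.r.t. the covering induced by Fa."""
    blocks = [{u for u in U if v in Fa[u]} for v in V]
    return set().union(*[c for c in blocks if c & S])

def upper_from_cuts(S):
    """Largest level alpha at which u is in the loose upper approximation."""
    # F_alpha only changes at the positive membership values, so the
    # supremum over (0, 1] is attained on this finite set of levels.
    levels = sorted({m for u in U for m in mu[u].values() if m > 0})
    return {u: max((a for a in levels if u in loose_upper(cut(a), S)),
                   default=0.0)
            for u in U}

S = {"u1"}  # illustrative crisp set of objects
print("upper (definition):", fuzzy_upper(S))
print("upper (from cuts): ", upper_from_cuts(S))
print("lower (definition):", fuzzy_lower(S))
```

Restricting the levels to the membership values that actually occur keeps the computation finite without changing the supremum, since the cuts $F_\alpha$ are constant between consecutive levels.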
