Recent Trends in Fuzzy Clustering: From Data to Knowledge
pedrycz@ee.ualberta.ca
Shenyang, August 2009

Agenda
• Introduction: clustering, information granulation and paradigm shift
• Key challenges in clustering
• Fuzzy objective-based clustering
• Knowledge-based augmentation of fuzzy clustering
• Collaborative fuzzy clustering
• Concluding comments

Clustering
Areas of research and applications:
• Data analysis
• Modeling
• Structure determination
Google Scholar: 2,190,000 hits for “clustering” (as of August 6, 2009)

Clustering as a conceptual and algorithmic framework of information granulation
Data → information granules (clusters): abstraction of data
Formalism of:
• set theory (K-Means)
• fuzzy sets (FCM)
• rough sets
• shadowed sets

Main categories of clustering
• Graph-oriented and hierarchical (single linkage, complete linkage, average linkage, ...)
• Objective function-based clustering
Diversity of formalisms and optimization tools (e.g., methods of Evolutionary Computing)

Key challenges of clustering
• Data-driven methods
• Selection of distance function (geometry of clusters)
• Number of clusters
• Quality of clustering results

The dichotomy and the shift of paradigm

Fuzzy Clustering: Fuzzy C-Means (FCM)
Given data x1, x2, …, xN, determine its structure by forming a collection of information granules (fuzzy sets).
Objective function:
$$Q = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m} \, \|x_k - v_i\|^2$$
Minimize Q; structure in data (partition matrix and prototypes)

Fuzzy Clustering: Fuzzy C-Means (FCM)
$v_i$: prototypes
U: partition matrix

FCM: optimization
Minimize
$$Q = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m} \, \|x_k - v_i\|^2$$
subject to $\sum_{i=1}^{c} u_{ik} = 1$ for each k, with respect to
(a) prototypes
(b) partition matrix

Optimization: details
Partition matrix: the use of Lagrange multipliers. For a fixed data point $x_k$:
$$V = \sum_{i=1}^{c} u_{ik}^{m} d_{ik}^{2} + \lambda \Big( \sum_{i=1}^{c} u_{ik} - 1 \Big)$$
where $d_{ik}^{2} = \|x_k - v_i\|^2$ and $\lambda$ is the Lagrange multiplier.
Necessary conditions:
$$\frac{\partial V}{\partial u_{st}} = 0, \qquad \frac{\partial V}{\partial \lambda} = 0$$

Optimization: partition matrix (1)
$$\frac{\partial V}{\partial u_{st}} = m \, u_{st}^{m-1} d_{st}^{2} + \lambda = 0 \quad\Longrightarrow\quad u_{st} = \Big( -\frac{\lambda}{m} \Big)^{\frac{1}{m-1}} \frac{1}{d_{st}^{\,2/(m-1)}}$$
Normalization $\sum_{j=1}^{c} u_{jt} = 1$ gives
$$\Big( -\frac{\lambda}{m} \Big)^{\frac{1}{m-1}} = \frac{1}{\sum_{j=1}^{c} d_{jt}^{\,-2/(m-1)}}$$
and therefore
$$u_{st} = \frac{1}{\sum_{j=1}^{c} \Big( \dfrac{d_{st}^{2}}{d_{jt}^{2}} \Big)^{\frac{1}{m-1}}}$$
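The membership update that the Lagrange-multiplier derivation arrives at, $u_{st} = 1 / \sum_{j} (d_{st}^2 / d_{jt}^2)^{1/(m-1)}$, can be checked numerically. The following is a minimal NumPy sketch, not the author's implementation; the function name `fcm_memberships`, the guard `eps` and the toy data are assumptions made for illustration only.

```python
import numpy as np

def fcm_memberships(X, V, m=2.0, eps=1e-12):
    """Partition matrix U (c x N) for data X (N x n) and prototypes V (c x n).

    Hypothetical helper: implements u_ik = 1 / sum_j (d_ik^2 / d_jk^2)^(1/(m-1)).
    """
    # Squared Euclidean distances d_ik^2 = ||x_k - v_i||^2, shape (c, N).
    d2 = ((V[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    # Guard (assumption): avoid division by zero when a point coincides with a prototype.
    d2 = np.maximum(d2, eps)
    # Equivalent form: u_ik = d_ik^(-2/(m-1)) / sum_j d_jk^(-2/(m-1)).
    w = d2 ** (-1.0 / (m - 1.0))
    return w / w.sum(axis=0, keepdims=True)

# Toy one-dimensional data with two prototypes (illustrative values).
X = np.array([[0.0], [1.0], [4.0], [5.0]])
V = np.array([[0.5], [4.5]])
U = fcm_memberships(X, V, m=2.0)
```

Each column of `U` sums to 1, which is the constraint enforced by the multiplier; larger values of `m` push all memberships toward 1/c.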
Optimization: prototypes (2)
With the Euclidean distance,
$$Q = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m} \sum_{j=1}^{n} (x_{kj} - v_{ij})^2$$
Gradient of Q with respect to $v_s$ set to zero:
$$\sum_{k=1}^{N} u_{sk}^{m} (x_{kt} - v_{st}) = 0$$
$$v_{st} = \frac{\sum_{k=1}^{N} u_{sk}^{m} x_{kt}}{\sum_{k=1}^{N} u_{sk}^{m}}$$

Fuzzy C-Means (FCM): An overview
procedure FCM-CLUSTERING (x) returns prototypes and partition matrix
    input: data x = {x1, x2, ..., xN}
    local: fuzzification parameter: m
           threshold: ε
           norm: ||.||
    INITIALIZE-PARTITION-MATRIX
    t ← 0
    repeat
        for i = 1:c do
            compute prototypes:
            v_i(t) = sum_{k=1..N} u_ik(t)^m x_k / sum_{k=1..N} u_ik(t)^m
        for i = 1:c do
            for k = 1:N do
                update partition matrix:
                u_ik(t+1) = 1 / sum_{j=1..c} ( ||x_k - v_i(t)|| / ||x_k - v_j(t)|| )^(2/(m-1))
        t ← t + 1
    until ||U(t) - U(t-1)|| ≤ ε
    return U, V

Geometry of information granules
[Figure: geometry of information granules for n = 1; panels for m = 1.2, m = 2.0 and m = 3.5]

Domain Knowledge: Category of knowledge-oriented guidance
• Partially labeled data: some data are provided with labels (classes)
• Proximity knowledge: some pairs of data are quantified in terms of their proximity (closeness)
• Viewpoints: some structural information is provided
• Context-based guidance: clustering realized in a certain context specified with regard to some attribute

Clustering with domain knowledge (Knowledge-based clustering)
[Figure: two schemes. Data-driven: Data → CLUSTERING → information granules (structure). Data- and knowledge-driven: Data and domain knowledge → CLUSTERING → information granules (structure)]

Context-based clustering
To align the agenda of fuzzy clustering with the principles of fuzzy modeling, the following features are considered:
• Active role of the designer [customization of the model]
• The structural backbone of the model is fully reflective of relationships between information granules in the input and output space
Clustering: construct clusters in input space X
Context-based clustering: construct clusters in input space X given some context expressed in output space Y

Context-based clustering: Computing considerations
[Figure: Data → structure, compared with Data and context → structure]
• computationally more efficient
• well-focused
• designer-guided clustering
process

Context-based clustering
Context-based clustering: construct clusters in input space X given some context expressed in output space Y
Context: hint (piece of domain knowledge) provided by the designer, who actively impacts the development of the model

Context-based clustering: Context design
Context: hint (piece of domain knowledge) provided by the designer, who actively impacts the development of the model. As such, context is imposed by the designer at the beginning.
Realization of context:
(a) Designer: focus information granule (fuzzy set)
(b) Clustering of scalar data in output space
Context: fuzzy set (set) formed in the output space

Context-based clustering: Modeling
• Determine structure in input space given the output is high
• Determine structure in input space given the output is medium
• Determine structure in input space given the output is low
[Figure: input space (data)]

Context-based clustering: examples
• Find a structure of customer data [clustering]: no context
• Find a structure of customer data considering context: customers making weekly purchases in the range [$1,000, $3,000]
• Find a structure of customer data considering context: customers making weekly purchases at the level of around $2,500
• Find a structure of customer data considering context (compound): customers making significant weekly purchases who are young

Context-oriented FCM
Data (x_k, target_k), k = 1, 2, …, N
Contexts: fuzzy sets W_1, W_2, …, W_p
w_jk = W_j(target_k): membership of the k-th data point in the j-th context
Context-driven partition matrix:
$$U(W_j) = \Big\{ u_{ik} \in [0,1] \;\Big|\; \sum_{i=1}^{c} u_{ik} = w_{jk} \ \ \forall k \ \text{ and } \ 0 < \sum_{k=1}^{N} u_{ik} < N \ \ \forall i \Big\}$$

Context-oriented FCM: Optimization flow
Objective function:
$$Q = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m} \, \|x_k - v_i\|^2$$
Subject to constraint: U in $U(W_j)$
Iterative adjustment of partition matrix and prototypes:
$$u_{ik} = \frac{w_{jk}}{\sum_{l=1}^{c} \Big( \dfrac{\|x_k - v_i\|}{\|x_k - v_l\|} \Big)^{\frac{2}{m-1}}}, \qquad v_i = \frac{\sum_{k=1}^{N} u_{ik}^{m} x_k}{\sum_{k=1}^{N} u_{ik}^{m}}$$

Viewpoints: definition
Description of entity (concept) which is deemed essential in describing phenomenon (system) and helpful in casting an overall analysis in
a required setting
“external”, “reinforced” clusters

Viewpoints: definition
[Figure: two panels in the (x1, x2) plane: viewpoint (a, b) and viewpoint (a, ?); a third panel shows a scatter plot of data]

Viewpoints in fuzzy clustering
B: Boolean matrix characterizing structure: viewpoints versus prototypes (induced by data)
$$b_{ij} = \begin{cases} 1, & \text{if the } j\text{-th feature of the } i\text{-th row of } B \text{ is determined by the viewpoint} \\ 0, & \text{otherwise} \end{cases}$$
$$B = \begin{bmatrix} 1 & 1 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}, \qquad F = \begin{bmatrix} a & b \\ 0 & 0 \\ 0 & 0 \end{bmatrix}$$
[Figure: viewpoint (a, b) in the (x1, x2) plane]

Viewpoints in fuzzy clustering
$$Q = \sum_{k=1}^{N} \sum_{\substack{i=1,\, j=1 \\ i,j:\ b_{ij}=0}}^{c,\ n} u_{ik}^{m} (x_{kj} - v_{ij})^2 + \sum_{k=1}^{N} \sum_{\substack{i=1,\, j=1 \\ i,j:\ b_{ij}=1}}^{c,\ n} u_{ik}^{m} (x_{kj} - f_{ij})^2$$
$$g_{ij} = \begin{cases} v_{ij}, & \text{if } b_{ij} = 0 \\ f_{ij}, & \text{if } b_{ij} = 1 \end{cases}$$
$$Q = \sum_{k=1}^{N} \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ik}^{m} (x_{kj} - g_{ij})^2$$

Labelled data and their description
Characterization in terms of membership degrees: F = [f_ik], i = 1, 2, …, c, k = 1, 2, …, N, and supervision indicator b = [b_k], k = 1, 2, …, N
Augmented objective function:
$$Q = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{2} \, \|x_k - v_i\|^2 + \alpha \sum_{i=1}^{c} \sum_{k=1}^{N} (u_{ik} - f_{ik})^2 \, b_k \, \|x_k - v_i\|^2, \qquad \alpha > 0$$

Proximity hints
Characterization in terms of proximity degrees: Prox(k, l), k, l = 1, 2, …, N, and supervision indicator matrix B = [b_kl], k, l = 1, 2, …, N
[Figure: pairs of data with proximity hints Prox(k, l) and Prox(s, t)]

Proximity measure
Properties of proximity:
(a) Prox(k, k) = 1
(b) Prox(k, l) = Prox(l, k)
Proximity induced by partition matrix U:
$$\mathrm{Prox}(k, l) = \sum_{i=1}^{c} \min(u_{ik}, u_{il})$$
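The proximity induced by a partition matrix, Prox(k, l) = sum over i of min(u_ik, u_il), together with properties (a) and (b), can be illustrated with a short NumPy sketch. The function name `induced_proximity` and the sample matrix are assumptions chosen for illustration; the columns of U are assumed to sum to 1, which is what makes Prox(k, k) = 1 hold.

```python
import numpy as np

def induced_proximity(U):
    """Prox(k, l) = sum_i min(u_ik, u_il) for a partition matrix U of shape (c, N).

    Hypothetical helper; returns an N x N matrix.
    """
    # Broadcast U against itself: entry [i, k, l] is min(u_ik, u_il).
    return np.minimum(U[:, :, None], U[:, None, :]).sum(axis=0)

# Illustrative partition matrix: 2 clusters, 3 data points, columns sum to 1.
U = np.array([[0.9, 0.8, 0.1],
              [0.1, 0.2, 0.9]])
P = induced_proximity(U)
```

Points 1 and 2 share most of their membership in the first cluster, so their induced proximity is high (0.9), whereas points 1 and 3 belong mostly to different clusters and obtain a low proximity (0.2).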
Augmented objective function:
$$Q = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{2} \, \|x_k - v_i\|^2 + \alpha \sum_{i=1}^{c} \sum_{k_1=1}^{N} \sum_{k_2=1}^{N} \big[ \mathrm{Prox}(k_1, k_2) - \mathrm{Prox}(U)(k_1, k_2) \big]^2 \, b(k_1, k_2) \, \|x_{k_1} - x_{k_2}\|^2, \qquad \alpha > 0$$
where $\mathrm{Prox}(U)(k_1, k_2) = \sum_{i=1}^{c} \min(u_{i k_1}, u_{i k_2})$ is the proximity induced by the partition matrix U.

Two general development strategies
SELECTION OF A “MEANINGFUL” SUBSET OF INFORMATION GRANULES

Two general development strategies (1)
HIERARCHICAL DEVELOPMENT OF INFORMATION GRANULES (INFORMATION GRANULES OF HIGHER TYPE)
[Figure: information granules of type-1 forming information granules of type-2]

Two general development strategies (2)
HIERARCHICAL DEVELOPMENT OF INFORMATION GRANULES AND THE USE OF VIEWPOINTS
[Figure: viewpoints; information granules of type-1 forming information granules of type-2]

Two general development strategies (3)
HIERARCHICAL DEVELOPMENT OF INFORMATION GRANULES: A MODE OF SUCCESSIVE CONSTRUCTION

Information granules and their representatives
Represent $v_k[ii]$ with the use of $z_1, z_2, \ldots, z_c$:
$$u_i(v_k[ii]) = \frac{1}{\sum_{j=1}^{c} \Big( \dfrac{\|v_k[ii] - z_i\|_{F_{ii} \cap F}}{\|v_k[ii] - z_j\|_{F_{ii} \cap F}} \Big)^{\frac{2}{m-1}}}$$
[Figure: prototypes z1, z2, …, zc in feature space F and prototype v1[ii] in feature space F_ii]

Representation of fuzzy sets: two performance measures
• Entropy measure
• Reconstruction criterion (error)

Expressing performance through entropy measure:
$$\sum_{ii=1}^{P} \sum_{i=1}^{c} \sum_{k=1}^{c[ii]} H\big( u_i(v_k[ii]) \big)$$
Reconstruction error:
$$Q = \sum_{ii=1}^{P} \sum_{k=1}^{c[ii]} \big\| \hat{v}(v_k[ii]) - v_k[ii] \big\|_{F_{ii}}^{2}$$
where
$$\hat{v}(v_k[ii]) = \frac{\sum_{i=1}^{c} u_i^{m}(v_k[ii]) \, z_i}{\sum_{i=1}^{c} u_i^{m}(v_k[ii])}$$

Requirement of “coverage” condition:
$$\bigcup_{k=1}^{c} F_{i_k} = \bigcup_{i=1}^{P} F_i$$

Optimization problem
Form a collection of prototypes Z = {z1, z2, …, zc} such that entropy (or reconstruction error) is minimized while satisfying the coverage criterion:
$$\min_{Z} Q \quad \text{subject to} \quad \bigcup_{k=1}^{c} F_{i_k} = \bigcup_{i=1}^{P} F_i$$
Optimization of the fuzzification coefficient (m):
$$\min_{Z} Q \quad \text{subject to} \quad m > 1 \ \text{ and } \ \bigcup_{k=1}^{c} F_{i_k} = \bigcup_{i=1}^{P} F_i$$

Collaborative structure development (2)
[Figure: data-1, data-2, …, data-P describing a phenomenon, process, system…; each yields information granules, which together form information granules of higher type]

Collaborative structure determination: Information granules of higher order
[Figure: prototypes obtained for D[1], D[2], …, D[P] are clustered to form prototypes of higher order]
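The representation of local prototypes v_k[ii] by higher-order prototypes z_1, …, z_c and the reconstruction error Q defined earlier can be sketched as follows. This is an illustrative sketch only: it assumes that all data sets share one feature space, so the restricted norms reduce to the plain Euclidean norm, and the names `memberships`, `reconstruct`, `reconstruction_error` and the sample prototypes are hypothetical.

```python
import numpy as np

def memberships(v, Z, m=2.0, eps=1e-12):
    """u_i(v) of a local prototype v (n,) in higher-order prototypes Z (c, n)."""
    # Assumption: common feature space, so distances are plain Euclidean.
    d2 = np.maximum(((Z - v) ** 2).sum(axis=1), eps)
    w = d2 ** (-1.0 / (m - 1.0))
    return w / w.sum()

def reconstruct(v, Z, m=2.0):
    """v_hat(v) = sum_i u_i^m(v) z_i / sum_i u_i^m(v)."""
    um = memberships(v, Z, m) ** m
    return (um[:, None] * Z).sum(axis=0) / um.sum()

def reconstruction_error(local_prototypes, Z, m=2.0):
    """Q: sum over data sets ii and prototypes k of ||v_hat(v_k[ii]) - v_k[ii]||^2."""
    return sum(((reconstruct(v, Z, m) - v) ** 2).sum()
               for V in local_prototypes for v in V)

# Two hypothetical data sets, each described by two local prototypes.
Z = np.array([[0.0, 0.0], [4.0, 4.0]])
local_prototypes = [np.array([[0.1, -0.1], [3.9, 4.2]]),
                    np.array([[0.0, 0.0], [4.0, 4.0]])]
Q = reconstruction_error(local_prototypes, Z)
```

Local prototypes that coincide with some z_i are reconstructed almost exactly, so they contribute nearly nothing to Q; the error grows as local prototypes move away from the higher-order ones.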
Determining correspondence between clusters (3)
[Figure: clustering yields prototypes z_j (higher order)]
Select prototypes in D[1], D[2], …, D[P] associated with z_j with the highest degree of membership

Determining correspondence between clusters (4)
[Figure: higher-order prototype z_j and prototype v_i[ii] of data set D[ii]]
$$\lambda_{ij}[ii] = \frac{1}{\sum_{k=1}^{c[ii]} \Big( \dfrac{\|v_i[ii] - z_j\|}{\|v_k[ii] - z_j\|} \Big)^{2}}$$
Prototype $i_0$ associated with prototype $z_j$:
$$\lambda_{i_0 j}[ii] = \max_{i = 1, 2, \ldots, c[ii]} \lambda_{ij}[ii]$$

Family of associated prototypes
Prototype $i_1$ in D[1] associated with prototype $z_j$
Prototype $i_2$ in D[2] associated with prototype $z_j$
…
Prototype $i_P$ in D[P] associated with prototype $z_j$
$$v_{i_1}[1],\ v_{i_2}[2],\ \ldots,\ v_{i_P}[P] \qquad \lambda_{i_1},\ \lambda_{i_2},\ \ldots,\ \lambda_{i_P}$$

From numeric prototypes to granular prototypes
$$v_{i_1}[1],\ v_{i_2}[2],\ \ldots,\ v_{i_P}[P] \qquad \lambda_{i_1},\ \lambda_{i_2},\ \ldots,\ \lambda_{i_P}$$
Individual coordinate of the associated prototypes: $a_1, a_2, \ldots, a_P \in \mathbb{R}$ with membership grades $\mu_1, \mu_2, \ldots, \mu_P \in [0, 1]$ → information granule

The principle of justifiable granularity: Interval representation
[Figure: values a_1, a_2, …, a_P with membership grades μ_1, μ_2, …, μ_P on a 0 to 1 scale; interval [b, d] around a_0]
If $a_i \in [b, d]$, then elevate the membership grade to 1; required change: $1 - \mu_i$
If $a_i \notin [b, d]$, then reduce the membership grade to 0; required change: $\mu_i$

The principle of justifiable granularity: optimization criterion
$$\min_{b, d \in \mathbb{R}:\ b \le d} \Big\{ \sum_{a_i \in [b, d]} (1 - \mu_i) + \sum_{a_i \notin [b, d]} \mu_i \Big\}$$

Hyperbox prototypes
[Figure: hyperboxes H_i and H_j]
$$\forall\, i \ne j:\ H_i \cap H_j = \emptyset$$
(the number of clusters at the aggregation level)

Interval-valued fuzzy sets and granular prototypes
[Figure: point x and hyperboxes H_i, H_j]

Interval-valued fuzzy sets and granular prototypes
[Figure: granular prototype v_i, point x, and the distances ||x - v_i||_min and ||x - v_i||_max]
Bounds of distances determined coordinate-wise

Interval-valued fuzzy sets: membership function
Upper bound:
$$u_i^{+}(x) = \frac{1}{\sum_{j=1}^{c} \Big( \dfrac{\|x - v_i\|_{\min}}{\|x - v_j\|_{\max}} \Big)^{\frac{2}{m-1}}}$$
Lower bound:
$$u_i^{-}(x) = \frac{1}{\sum_{j=1}^{c} \Big( \dfrac{\|x - v_i\|_{\max}}{\|x - v_j\|_{\min}} \Big)^{\frac{2}{m-1}}}$$

Collaborative structure determination: Structure refinement
[Figure: feedback and structure refinement]

Collaborative structure determination: Structure refinement
Iterate
• Clustering at the local level
• Sharing findings and clustering at the higher (global) level
• Assessment of quality of clusters, $\gamma_i(U)[ii]$, in light of the global structure formed at the higher level
• Refinement of clustering:
$$Q[ii] = \sum_{i=1}^{c[ii]} \gamma_i(U)[ii] \sum_{x_k \in X[ii]} \|x_k - v_i[ii]\|^2$$
Until termination criterion satisfied

Concluding comments
• Paradigm shift from data-based clustering to knowledge-based clustering
• Accommodation of knowledge in augmented objective functions
• Emergence of type-2 (higher type) information granules when working with collaborative clustering
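As a closing illustration, the optimization criterion of the principle of justifiable granularity (sum of 1 - μ_i over values inside [b, d] plus sum of μ_i over values outside) can be minimized by exhaustive search. The sketch below is not the author's procedure: the function name `justifiable_interval` and the sample values are hypothetical, and the bounds b and d are assumed to be taken from the observed values a_i, which is sufficient because the cost changes only when a bound crosses one of them.

```python
import numpy as np
from itertools import combinations_with_replacement

def justifiable_interval(a, mu):
    """Return (cost, b, d) minimising
    sum_{a_i in [b,d]} (1 - mu_i) + sum_{a_i not in [b,d]} mu_i.

    Hypothetical brute-force helper; candidate bounds are the observed values.
    """
    a = np.asarray(a, dtype=float)
    mu = np.asarray(mu, dtype=float)
    best = None
    # Pairs (b, d) with b <= d drawn from the sorted observations.
    for b, d in combinations_with_replacement(np.sort(a), 2):
        inside = (a >= b) & (a <= d)
        # Inside the interval grades are elevated to 1, outside reduced to 0.
        cost = (1.0 - mu[inside]).sum() + mu[~inside].sum()
        if best is None or cost < best[0]:
            best = (cost, b, d)
    return best

# Illustrative coordinate values and membership grades.
a = [1.0, 2.0, 2.5, 3.0, 9.0]
mu = [0.2, 0.9, 0.8, 0.7, 0.1]
cost, b, d = justifiable_interval(a, mu)
```

For these values the search settles on the interval [2.0, 3.0]: it encloses the three values with high membership grades and leaves out the two with low grades.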