VIEWS: 24 PAGES: 77 POSTED ON: 11/19/2010
Additional Material: Mahalanobis Distance
Prof. Dr. Rudolf Kruse

Interpretation of a Covariance Matrix
A univariate normal distribution with mean $\mu$ and variance $\sigma^2$ has the density function
$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$.
A multivariate normal distribution in $m$ dimensions with mean vector $\vec{\mu}$ and covariance matrix $\Sigma$ has the density function
$f(\vec{x}) = \frac{1}{\sqrt{(2\pi)^m |\Sigma|}} \exp\left(-\frac{1}{2} (\vec{x}-\vec{\mu})^\top \Sigma^{-1} (\vec{x}-\vec{\mu})\right)$.

Variance and Standard Deviation
Univariate normal/Gaussian distribution: the variance/standard deviation provides information about the height of the mode and the width of the curve.

Interpretation of a Covariance Matrix
- The variance/standard deviation relates the spread of the distribution to the spread of a standard normal distribution.
- The covariance matrix relates the spread of the distribution to the spread of a multivariate standard normal distribution.
- Example: bivariate normal distribution. [Figure not preserved]
- Question: Is there a multivariate analog of standard deviation?

Eigenvalue Decomposition
- Yields an analog of standard deviation.
- Let S be a symmetric, positive definite matrix (e.g. a covariance matrix).
- Special case: two dimensions. [Formulas and figures not preserved]

Cluster-Specific Distance Functions
- The similarity of a data point to a prototype depends on their distance.
- If the cluster prototype is a simple cluster center, a general distance measure can be defined on the data space.
- In this case the Euclidean distance is most often used because it is rotation invariant. It leads to (hyper-)spherical clusters.
- More flexible clustering approaches (with size and shape parameters) use cluster-specific distance functions.
- The most common approach is to use a Mahalanobis distance with a cluster-specific covariance matrix.
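As a minimal sketch of the Mahalanobis distance named above, the following Python function computes $d(\vec{x}, \vec{c}) = \sqrt{(\vec{x}-\vec{c})^\top \Sigma^{-1} (\vec{x}-\vec{c})}$ for the two-dimensional case. The function name, the tuple-based matrix layout and the example values are illustrative assumptions and do not come from the slides.

```python
import math

def mahalanobis_2d(x, center, cov):
    """Mahalanobis distance of point x from center for a 2x2 covariance matrix.

    cov is given as ((a, b), (c, d)); it is assumed to be symmetric and
    positive definite, as a covariance matrix of a non-degenerate cluster is.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx, dy = x[0] - center[0], x[1] - center[1]
    # The inverse of ((a, b), (c, d)) is (1/det) * ((d, -b), (-c, a)),
    # so the quadratic form expands to the expression below.
    q = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(q)
```

With the identity matrix as covariance, `mahalanobis_2d((3, 4), (0, 0), ((1, 0), (0, 1)))` gives the ordinary Euclidean distance 5.0, while a larger variance along one axis shrinks distances measured in that direction.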
- The covariance matrix comprises shape and size parameters.
- The Euclidean distance is the special case that results when the covariance matrix is the identity matrix.

Additional Material: Neuro-Fuzzy Systems

Example: Automatic Gearbox
Task: improve the VW automatic gearbox
- no additional sensors
- individual adaptation of the gear-shift behaviour
Idea (1995): the vehicle "observes" the driver and classifies the driver by sportiness
- calm, normal, sporty: determine a sport factor from [0, 1]
- nervous: calm the driver down
Test vehicle:
- different drivers, classification by experts (passengers)
- simultaneous measurements: speed, position, speed of the accelerator pedal, steering-wheel angle, ... (14 attributes)

Modelling Imprecise Information with Fuzzy Sets
[Figure: degree of membership for the fuzzy sets "almost exactly 2", "approximately between 6 and 8" and "about 13" (axis marks 1, 2, 6, 8, 13), and a partition with the linguistic terms negative big, negative medium, negative small, approximately zero, positive small, positive medium, positive big]

Example: Continuously Adapting Gear Shift Schedule in VW New Beetle
[Block diagram. Inputs: accelerator pedal, filtered speed of accelerator pedal, number of changes in pedal direction, sport factor [t-1]. Blocks: classification of driver / driving situation by fuzzy logic (fuzzification, inference machine with rule base, defuzzification) producing sport factor [t]; gear shift computation (interpolation, determination of speed limits for shifting into a higher or lower gear depending on the sport factor); gear selection]

Fuzzy Rule Evaluation
- If X is positive small and Y is positive small then Z is positive small.
- If X is positive big and Y is positive small then Z is positive big.
Input values: x and y. Control value: z (the defuzzified value).
[Figure: membership functions over X, Y and Z for both rules and the resulting output fuzzy set]
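The two rules above can be evaluated as in the following sketch. It assumes min for "and", clipping of the consequent at the rule activation, max for combining the rules, and centre-of-gravity defuzzification on a grid; the slides show only a picture, so these operators, the triangular sets on [0, 1] and all names are assumptions made for illustration.

```python
def tri(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    if a < x <= b:
        return (x - a) / (b - a)
    if b < x < c:
        return (c - x) / (c - b)
    return 0.0

# Hypothetical fuzzy sets; the slides do not give their parameters.
def positive_small(v):
    return tri(v, 0.0, 0.25, 0.5)

def positive_big(v):
    return tri(v, 0.5, 0.75, 1.0)

def control_value(x, y, steps=200):
    """Return the defuzzified control value z, or None if no rule fires."""
    # Rule activation: "and" realised as minimum (assumption).
    act_small = min(positive_small(x), positive_small(y))
    act_big = min(positive_big(x), positive_small(y))
    num = den = 0.0
    for i in range(steps + 1):
        z = i / steps
        # Clip each consequent at its rule activation, combine by maximum.
        mu = max(min(act_small, positive_small(z)),
                 min(act_big, positive_big(z)))
        num += z * mu
        den += mu
    # Centre of gravity of the combined output fuzzy set.
    return num / den if den > 0 else None
```

When only the first rule fires fully, for example `control_value(0.25, 0.25)`, the result sits at the peak of "positive small"; inputs that activate no rule return `None` rather than an arbitrary value.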
AG 4
- Fuzzy controller with 7 rules
- Optimized program on Digimat: 24 bytes RAM, 702 bytes ROM
- Run time 80 ms; a new sport factor is determined 12 times per second
- In series production in the VW group
- Learning of rule systems with the help of artificial neural networks, optimization with evolutionary algorithms

Example: Fuzzy Database
Successors for top-management positions
[Diagram labels: top management, successors, talent bank, management]

Example: Automated Sensor-Based Landing
[Figure not preserved]

Neuro-Fuzzy Systems
Building a fuzzy system requires
- prior knowledge (fuzzy rules, fuzzy sets)
- manual tuning: time consuming and error-prone
Therefore: support this process by learning
- learning fuzzy rules (structure learning)
- learning fuzzy sets (parameter learning)
Approaches from neural networks can be used.

Example: Prognosis of the Daily Proportional Changes of the DAX at the Frankfurt Stock Exchange (Siemens)
Database: time series from 1986 - 1997
- DAX
- Composite DAX
- German 3 month interest rates
- Return Germany
- Morgan Stanley index Germany
- Dow Jones industrial index
- DM / US-$
- US treasury bonds
- Gold price
- Nikkei index Japan
- Morgan Stanley index Europe
- Price earning ratio

Fuzzy Rules in Finance
Trend Rule
  IF DAX = decreasing AND US-$ = decreasing
  THEN DAX prediction = decrease WITH high certainty
Turning Point Rule
  IF DAX = decreasing AND US-$ = increasing
  THEN DAX prediction = increase WITH low certainty
Delay Rule
  IF DAX = stable AND US-$ = decreasing
  THEN DAX prediction = decrease WITH very high certainty
In general
  IF x1 is m1 AND x2 is m2 THEN y = h WITH weight k
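One plausible reading of the general rule form "IF x1 is m1 AND x2 is m2 THEN y = h WITH weight k" is sketched below: each rule is activated by the product of its two membership degrees, and the rule outputs h are combined as an average weighted by k times the activation. The tuple layout, the function name and the choice of the product for "AND" are assumptions made for this illustration.

```python
def rule_output(rules, x1, x2):
    """Combine weighted fuzzy rules into one crisp prediction.

    rules is a list of tuples (m1, m2, h, k): two membership functions
    returning degrees in [0, 1], the rule output h and the rule weight k.
    Returns None when no rule is activated.
    """
    num = den = 0.0
    for m1, m2, h, k in rules:
        activation = m1(x1) * m2(x2)  # "AND" as product (assumption)
        num += k * activation * h
        den += k * activation
    return num / den if den > 0 else None
```

With two fully activated rules that predict -1.0 with weight 1 and +1.0 with weight 3, the result is (-1 + 3) / 4 = 0.5, so the rule with the higher certainty dominates the prediction.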
Classical Probabilistic Expert Opinion Pooling Method
- The decision maker (DM) analyzes each source (human expert, data + forecasting model) in terms of (1) statistical accuracy and (2) informativeness, by asking the source to assess quantities (quantile assessment).
- The DM obtains a "weight" for each source.
- The DM "eliminates" bad sources.
- The DM determines the weighted sum of source outputs.
- Determination of the return on investment.

E experts, R quantiles for N quantities: each expert has to assess R·N values.
Statistical accuracy:
$C = 1 - \chi^2_R\bigl(2 \cdot N \cdot I(s, p)\bigr), \qquad I(s, p) = \sum_{i=0}^{R} s_i \ln \frac{s_i}{p_i}$
Information score:
$I = \frac{1}{N} \sum_{i=1}^{N} \left( \ln(v_{i,R+1} - v_{i,0}) + \sum_{r=1}^{R+1} p_{r-1} \ln \frac{p_{r-1}}{v_{i,r} - v_{i,r-1}} \right)$
Weight for expert e:
$w_e = \frac{c_e \, I_e \, \mathrm{id}(c_e)}{\sum_{e'=1}^{E} c_{e'} \, I_{e'} \, \mathrm{id}(c_{e'})}$
Output of the decision maker:
$\mathrm{output}_t^{DM} = \sum_{e=1}^{E} w_e \, \mathrm{output}_t^{e}$
Return on investment:
$\mathrm{roi} = \sum_{t=1}^{T} y_t \, \mathrm{sign}\bigl(\mathrm{output}_t^{DM}\bigr)$

Formal Analysis
Sources of information
- R1: rule set given by expert 1
- R2: rule set given by expert 2
- D: data set (time series)
Operator schema
- fuse(R1, R2): fuse two rule sets
- induce(D): induce a rule set from D
- revise(R, D): revise a rule set R by D
Strategies:
- fuse(fuse(R1, R2), induce(D))
- revise(fuse(R1, R2), D)
- fuse(revise(R1, D), revise(R2, D))
Technique: Neuro-Fuzzy Systems
- Nauck, Klawonn, Kruse: Foundations of Neuro-Fuzzy Systems, Wiley, 1997
- SENN (commercial neural network environment, Siemens)

Neuro-Fuzzy Architecture
[Figure not preserved]

From Rules to Neural Networks
1. Evaluation of membership degrees
2. Evaluation of rules (rule activity)
   $\mu_l : \mathbb{R}^n \to [0,1]^r, \quad x \mapsto \prod_{j=1}^{D_l} \mu_{c,s}^{(j)}(x_i)$
3. Accumulation of rule inputs and normalization
   $NF : \mathbb{R}^n \to \mathbb{R}, \quad x \mapsto \sum_{l=1}^{r} w_l \, \frac{k_l \, \mu_l(x)}{\sum_{j=1}^{r} k_j \, \mu_j(x)}$

The Semantics-Preserving Learning Algorithm
Reduction of the dimension of the weight space:
1. Membership functions of different inputs share their parameters, e.g. $\mu^{dax}_{stable} = \mu^{cdax}_{stable}$.
2. Membership functions of the same input variable are not allowed to pass each other; they must keep their original order, e.g. $\mu_{decreasing} < \mu_{stable} < \mu_{increasing}$.
Benefits:
- the optimized rule base can still be interpreted
- the number of free parameters is reduced

Return-on-Investment Curves of the Different Models
Validation data from March 01, 1994 until April 1997
[Figure not preserved]

Neuro-Fuzzy Systems in Data Analysis
- Neuro-fuzzy system: a system of linguistic rules (fuzzy rules).
- Not rules in a logical sense, but function approximation.
- Fuzzy rule = vague prototype / sample.
- Neuro-fuzzy system: adds a learning algorithm inspired by neural networks.
- Feature: local adaptation of parameters.

A Neuro-Fuzzy System
- is a fuzzy system trained by heuristic learning techniques derived from neural networks
- can be viewed as a 3-layer neural network with fuzzy weights and special activation functions
- is always interpretable as a fuzzy system
- uses constrained learning procedures
- is a function approximator (classifier, controller)

Learning Fuzzy Rules
- Cluster-oriented approaches => find clusters in data, each cluster is a rule
- Hyperbox-oriented approaches => find clusters in the form of hyperboxes
- Structure-oriented approaches => use predefined fuzzy sets to structure the data space, pick rules from grid cells

Hyperbox-Oriented Rule Learning
- Search for hyperboxes in the data space.
- Create fuzzy rules by projecting the hyperboxes.
- Fuzzy rules and fuzzy sets are created at the same time.
- Usually very fast.
[Figure: hyperboxes in the x-y plane and their projections onto the axes]
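The projection step can be pictured with the sketch below: a hyperbox is given by its lower and upper corner, each axis interval becomes one fuzzy set, and the rule activation is the minimum over all dimensions. The trapezoidal shape with a fixed `margin`, the min operator and all names are assumptions for illustration; the slides do not specify how the projected fuzzy sets are shaped.

```python
def trapezoid(x, lo, hi, margin):
    """Degree 1 inside [lo, hi], falling linearly to 0 within margin outside."""
    if lo <= x <= hi:
        return 1.0
    gap = lo - x if x < lo else x - hi
    return max(0.0, 1.0 - gap / margin)

def hyperbox_rule(lower, upper, label, margin=0.1):
    """Turn a hyperbox (lower and upper corner) into a fuzzy rule.

    Returns (activation, label): activation(point) is the degree to which
    the point fulfils the antecedent, label is the class of the rule.
    """
    def activation(point):
        # One projected fuzzy set per dimension, combined by minimum.
        return min(trapezoid(v, lo, hi, margin)
                   for v, lo, hi in zip(point, lower, upper))
    return activation, label
```

A point inside the box, such as `(0.2, 0.3)` for the box from `(0.0, 0.0)` to `(0.4, 0.4)`, activates the rule fully, and the activation drops to zero once any coordinate is more than `margin` away from the box.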
Hyperbox-Oriented Rule Learning
Detect hyperboxes in the data, example: XOR function.
[Figure: four x-y plots showing the stepwise detection of hyperboxes]
Advantages over fuzzy cluster analysis:
- No loss of information when hyperboxes are represented as fuzzy rules.
- Not all variables need to be used; "don't care" variables can be discovered.
Disadvantage: each fuzzy rule uses individual fuzzy sets, i.e. the rule base is complex.

Structure-Oriented Rule Learning
- Provide initial fuzzy sets for all variables. The data space is partitioned by a fuzzy grid.
- Detect all grid cells that contain data (approach by Wang/Mendel 1992).
- Compute best consequents and select best rules (extension by Nauck/Kruse 1995, NEFCLASS model).
[Figure: x-y data space partitioned by the fuzzy sets small, medium, large on each axis]

Structure-Oriented Rule Learning
Simple: the rule base is available after two cycles through the training data.
- 1. Cycle: discover all antecedents
- 2. Cycle: determine best consequents
Missing values can be handled.
Numeric and symbolic attributes can be processed at the same time (mixed fuzzy rules).
Advantage: all rules share the same fuzzy sets.
Disadvantage: fuzzy sets must be given.

Learning Fuzzy Sets
- Gradient descent procedures: only applicable if differentiation is possible, e.g. for Sugeno-type fuzzy systems.
- Special heuristic procedures that do not use gradient information.
- The learning algorithms are based on the idea of backpropagation.

Learning Fuzzy Sets: Constraints
Mandatory constraints:
- Fuzzy sets must stay normal and convex.
- Fuzzy sets must not exchange their relative positions (they must not "pass" each other).
- Fuzzy sets must always overlap.
Optional constraints:
- Fuzzy sets must stay symmetric.
- Degrees of membership must add up to 1.0.
The learning algorithm must enforce these constraints.
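The mandatory constraints can be checked mechanically once a shape is fixed. The sketch below assumes triangular fuzzy sets given as tuples (a, b, c) with support [a, c] and peak b, listed from left to right; such triangles are normal and convex whenever a <= b <= c holds. The function name and the textual violation labels are illustrative assumptions, and the sketch only detects violations, it does not repair them.

```python
def constraint_violations(partition):
    """Check a fuzzy partition of triangles (a, b, c), ordered left to right.

    Returns a list of (index, problem) pairs; an empty list means that all
    mandatory constraints hold.
    """
    problems = []
    for i, (a, b, c) in enumerate(partition):
        # A triangle is normal and convex if its parameters are ordered.
        if not (a <= b <= c and a < c):
            problems.append((i, "invalid parameters"))
    for i in range(len(partition) - 1):
        _, b_left, c_left = partition[i]
        a_right, b_right, _ = partition[i + 1]
        if b_left >= b_right:
            # The peaks have passed each other.
            problems.append((i, "order"))
        if c_left <= a_right:
            # The supports of adjacent sets no longer intersect.
            problems.append((i, "no overlap"))
    return problems
```

A learning step that produces a non-empty result would have to be corrected or undone before training continues.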
Example: Medical Diagnosis
- Results from patients tested for breast cancer (Wisconsin Breast Cancer Data).
- Decision support: do the data indicate a malignant or a benign case?
- A surgeon must be able to check the classification for plausibility.
- We are looking for a simple and interpretable classifier: knowledge discovery.

Example: WBC Data Set
- 699 cases (16 cases have missing values).
- 2 classes: benign (458), malignant (241).
- 9 attributes with values from {1, ... , 10} (ordinal scale, but usually interpreted as a numerical scale).
- Experiment: x3 and x6 are interpreted as nominal attributes.
- x3 and x6 are usually seen as "important" attributes.

Applying NEFCLASS-J
- Tool for developing neuro-fuzzy classifiers
- Written in Java
- Free version for research available
- Project started at the Neuro-Fuzzy Group of the University of Magdeburg, Germany

NEFCLASS: Neuro-Fuzzy Classifier
[Network diagram, layers from top to bottom: output variables (class labels), unweighted connections, fuzzy rules, fuzzy sets (antecedents), input variables (attributes)]

NEFCLASS: Features
- Automatic induction of a fuzzy rule base from data
- Training of several forms of fuzzy sets
- Processing of numeric and symbolic attributes
- Treatment of missing values (no imputation)
- Automatic pruning strategies
- Fusion of expert knowledge and knowledge obtained from data

Representation of Fuzzy Rules
Example: 2 rules
- R1: if x is large and y is small, then class is c1.
- R2: if x is large and y is large, then class is c2.
The connections x -> R1 and x -> R2 are linked. The fuzzy set large is a shared weight. That means the term large always has the same meaning in both rules.
[Network diagram: inputs x and y, rules R1 and R2, classes c1 and c2; connection weights small and large]

1. Training Step: Initialisation
Specify initial fuzzy partitions for all input variables.
[Figure: x-y data space with the fuzzy sets small, medium, large on each axis; network with inputs x, y and classes c1, c2]

2. Training Step: Rule Base
Algorithm:
  for (all patterns p) do
    find antecedent A, such that A(p) is maximal;
    if (A not in L) then add A to L;
  end;
  for (all antecedents A in L) do
    find best consequent C for A;
    create rule base candidate R = (A, C);
    determine the performance of R;
    add R to B;
  end;
  select a rule base from B;
Variations: fuzzy rule bases can also be created by using prior knowledge, fuzzy cluster analysis, fuzzy decision trees, genetic algorithms, ...

Selection of a Rule Base
Performance of a rule:
$P_r = \frac{1}{N} \sum_{p=1}^{N} (-1)^c \, R_r(x_p), \quad \text{with } c = \begin{cases} 0 & \text{if } \mathrm{class}(x_p) = \mathrm{con}(R_r), \\ 1 & \text{otherwise.} \end{cases}$
- Order rules by performance.
- Either select the best r rules or the best r/m rules per class.
- r is either given or is determined automatically such that all patterns are covered.

Rule Base Induction
NEFCLASS uses a modified Wang-Mendel procedure.
[Figure: fuzzy grid over the x-y data space (small, medium, large) and the resulting network with rules R1, R2, R3 and classes c1, c2]

Computing the Error Signal
Fuzzy error (jth output):
$E_j = \mathrm{sgn}(d) \, (1 - \gamma(d)), \quad \text{with } d = t_j - o_j \text{ and } \gamma : \mathbb{R} \to [0, 1], \; \gamma(d) = e^{-\left(\frac{a \cdot d}{d_{max}}\right)^2}$
(t: correct output, o: actual output)
Rule error:
$E_r = \bigl(\tau_r (1 - \tau_r) + \varepsilon\bigr) \, E_{\mathrm{con}(R_r)}, \quad \text{with } 0 < \varepsilon \ll 1$
[Network diagram: error signal propagated from classes c1, c2 to rules R1, R2, R3 and inputs x, y]

3. Training Step: Fuzzy Sets
Example: triangular membership function.
$\mu_{a,b,c} : \mathbb{R} \to [0, 1], \quad \mu_{a,b,c}(x) = \begin{cases} \frac{x - a}{b - a} & \text{if } x \in [a, b), \\ \frac{c - x}{c - b} & \text{if } x \in [b, c], \\ 0 & \text{otherwise} \end{cases}$
Parameter updates for an antecedent fuzzy set, with learning rate $\sigma$:
$f = \begin{cases} \sigma \mu(x) & \text{if } E < 0, \\ \sigma (1 - \mu(x)) & \text{otherwise} \end{cases}$
$\Delta b = f \cdot E \cdot (c - a) \cdot \mathrm{sgn}(x - b)$
$\Delta a = -f \cdot E \cdot (b - a) + \Delta b$
$\Delta c = f \cdot E \cdot (c - b) + \Delta b$
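The triangular membership function and the parameter updates can be tried out with the short sketch below. It follows the formulas of the third training step as reconstructed here; the function names, the default learning rate `sigma=0.1` and the example values are assumptions for illustration, and no constraint handling is included.

```python
def tri(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    if a <= x < b:
        return (x - a) / (b - a)
    if b <= x <= c:
        return (c - x) / (c - b)
    return 0.0

def update(x, a, b, c, error, sigma=0.1):
    """Return the updated parameters (a, b, c) for one pattern x.

    error is the rule error E; sigma is the learning rate (assumed value).
    """
    mu = tri(x, a, b, c)
    f = sigma * mu if error < 0 else sigma * (1.0 - mu)
    sign = (x > b) - (x < b)          # sgn(x - b)
    delta_b = f * error * (c - a) * sign
    delta_a = -f * error * (b - a) + delta_b
    delta_c = f * error * (c - b) + delta_b
    return a + delta_a, b + delta_b, c + delta_c
```

For the triangle (0, 1, 2) and x = 1.5 the membership degree is 0.5. A negative error moves the peak away from x and shrinks the support, so the degree of membership of x drops; a positive error does the opposite.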
Training of Fuzzy Sets
Heuristics: a fuzzy set is moved away from x (towards x) and its support is reduced (enlarged), in order to reduce (enlarge) the degree of membership of x.
[Figure: initial fuzzy set with membership degree 0.55 at x, reduced to 0.30 or enlarged to 0.85; partition small, medium, large on both axes]

Training of Fuzzy Sets
Algorithm:
  repeat
    for (all patterns) do
      accumulate parameter updates;
      accumulate error;
    end;
    modify parameters;
  until (no change in error);
[Figure annotation at the stopping point: local minimum]
Variations:
- Adaptive learning rate
- Online/batch learning
- Optimistic learning (n step look ahead)
- Observing the error on a validation set

Constraints for Training Fuzzy Sets
- Valid parameter values
- Non-empty intersection of adjacent fuzzy sets
- Keep relative positions
- Maintain symmetry
- Complete coverage (degrees of membership add up to 1 for each element)
[Figure: correcting a partition after modifying the parameters, in three steps]

4. Training Step: Pruning
Goal: remove variables, rules and fuzzy sets, in order to improve interpretability and generalisation.

Pruning
Algorithm:
  repeat
    select pruning method;
    repeat
      execute pruning step;
      train fuzzy sets;
      if (no improvement) then undo step;
    until (no improvement);
  until (no further method);
Pruning methods:
1. Remove variables (use correlations, information gain etc.)
2. Remove rules (use rule performance)
3. Remove terms (use degree of fulfilment)
4. Remove fuzzy sets (use fuzziness)

WBC Learning Result: Fuzzy Rules
- R1: if uniformity of cell size is small and bare nuclei is fuzzy0 then benign
- R2: if uniformity of cell size is large then malignant
WBC Learning Result: Classification Performance

                 Predicted class
Actual class     malign          benign          not classified   sum
malign           228 (32.62%)    13 (1.86%)      0 (0%)           241 (34.48%)
benign           15 (2.15%)      443 (63.38%)    0 (0%)           458 (65.52%)
sum              243 (34.76%)    456 (65.24%)    0 (0%)           699 (100.00%)

Estimated performance on unseen data (cross validation):
  NEFCLASS-J:             95.42%
  NEFCLASS-J (numeric):   94.14%
  Discriminant Analysis:  96.05%
  Multilayer Perceptron:  94.82%
  C 4.5:                  95.10%
  C 4.5 Rules:            95.40%

WBC Learning Result: Fuzzy Sets
[Figure: membership functions (degree 0.0 to 1.0) over the value range 1.0 to 10.0 for "uniformity of cell size" (fuzzy sets sm and lg) and for "bare nuclei"]

NEFCLASS-J
[Screenshot not preserved]

Resources
- Detlef Nauck, Frank Klawonn & Rudolf Kruse: Foundations of Neuro-Fuzzy Systems. Wiley, Chichester, 1997, ISBN: 0-471-97151-0
- Neuro-Fuzzy Software (NEFCLASS, NEFCON, NEFPROX): http://www.neuro-fuzzy.de
- Beta version of NEFCLASS-J: http://www.neuro-fuzzy.de/nefclass/nefclassj

Download NEFCLASS-J
Download the free version of NEFCLASS-J at http://fuzzy.cs.uni-magdeburg.de

Conclusions
- Neuro-fuzzy systems can be useful for knowledge discovery.
- Interpretability enables plausibility checks and improves acceptance.
- (Neuro-)fuzzy systems exploit tolerance for sub-optimal solutions.
- Neuro-fuzzy learning algorithms must observe constraints in order not to jeopardise the semantics of the model.
- Not an automatic model creator: the user must work with the tool.
- Simple learning techniques support explorative data analysis.

Information Mining
Information mining is the non-trivial process of identifying valid, novel, potentially useful, and understandable information and patterns in heterogeneous information sources.
Information sources are databases, expert background knowledge, textual descriptions, images, sounds, ...

Information Mining
Phases and tasks of the process:
- Problem Understanding: Determine Problem Objectives; Assess Situations; Determine Information Mining Goals; Produce Project Plan
- Information Understanding: Collect Initial Information; Describe Information; Explore Information; Verify Information Quality
- Information Preparation: Select Information; Clean Information; Construct Information; Integrate Information; Format Information
- Modeling: Select Modeling Technique; Generate Test Design; Build Model; Assess Model
- Evaluation: Evaluate Results; Review Process; Determine Next Steps
- Deployment: Plan Deployment; Plan Monitoring and Maintenance; Produce Final Results; Review Project

Example: Line Filtering
- Extraction of edge segments (Burns' operator)
- Production net: edges -> lines -> long lines -> parallel lines -> runways

Example: Line Filtering
Problems:
- extremely many lines due to distorted images
- long execution times of the production net

Example: Line Filtering
Only few lines are used for runway assembly.
Approach:
- Extract textural features of lines
- Identify and discard superfluous lines
[Figure: a line with its left window, right window and gradient]

Example: Line Filtering
- Several classifiers: minimum distance, k-nearest neighbor, decision trees, NEFCLASS
- Problems: classes are overlapping and extremely unbalanced
- Result above with modified NEFCLASS: all lines for runway construction found, reduction to 8.7% of edge segments

Surface Quality Control: the 2 Approaches
Today's Approach
The current surface quality control is done manually: an experienced worker treats the exterior surfaces with a grindstone. The experts classify surface form deviations by means of linguistic descriptions.
Cumbersome, subjective, error-prone, time consuming.
The Proposed Approach
Our approach is based on the digitization of the exterior body panel surface with an optical measuring system. We characterize the form deviation by mathematical properties that are close to the subjective properties that the experts used in their linguistic description.

Topometric 3-D Measuring System
- Triangulation and grating projection
- Miniaturized projection technique
- Pixel coding (Gray code, phase shift)
- High point density
- Fast data collection
- Measurement accuracy
- Contactless and non-destructive
[Figure: measuring geometry with point P(x,y), phase φ(x,y), base b(x,y) and height z(x,y) in an x-y-z coordinate system]

Data Processing
1. 3-D data acquisition: 3-D point cloud $z(x,y)$
2. Post-processing: approximation by a polynomial surface $\tilde{z}(x,y)$
3. Detection of form deviation: difference $\Delta z(x,y)$ between $z(x,y)$ and $\tilde{z}(x,y)$, colour-coded visualization
4. Feature analysis: feature calculation, classification (data mining)

Color Coded Visualization
Result of grinding. [Figure not preserved]

3D Visualization of Local Surface Defects
- Uneven surface (several sink marks in series or adjoined)
- Press mark (local smoothing of (micro-)surface)
- Sink mark (slight flat based depression inward)
- Waviness (several heavier wrinklings in series)

Data Characteristics
- We analysed 9 master pieces with a total number of 99 defects.
- For each defect we calculated 42 features.
- The types are rather unbalanced.
- We discarded the rare classes.
- We discarded some of the extremely correlated features (31 features left).
- We ranked the 31 features by importance.
- We use stratified 4-fold cross validation for the experiment.
[Bar chart: number of defects per type (axis 0 to 50) for uneven radius, line, waviness, draw line, flat area, sink mark, press mark, uneven surface]

Application and Results
The rule base for NEFCLASS. [Figure not preserved]
Classification accuracy:

             NBC      DTree    NN       NEFCLASS   DC
Train Set    89.0%    94.7%    90%      81.6%      46.8%
Test Set     75.6%    75.6%    85.5%    79.9%      46.8%
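The stratified 4-fold cross validation mentioned under Data Characteristics keeps the class proportions roughly equal in every fold, which matters for unbalanced defect types. The sketch below shows one simple way to build such folds by dealing the cases of each class round-robin; the function name and the example labels are assumptions, the cases are not shuffled, and the slides do not say how the folds were actually drawn.

```python
from collections import defaultdict

def stratified_folds(labels, k=4):
    """Split case indices into k folds with similar class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal the cases of one class round-robin over the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds
```

In each of the k runs one fold serves as the test set and the remaining folds as the training set, and the accuracies are averaged over the runs.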