									                 Berkeley Initiative in Soft Computing (BISC)
Fuzzy Set: 1965 … Fuzzy Logic: 1973 … Soft Decision: 1981 … BISC: 1990 … Human-Machine Perception: 2000 -


                                   BISC Program
  Masoud Nikravesh*
  BISC Program, EECS-UCB
  &
  Imaging and Informatics – Life Sciences
  Lawrence Berkeley National Laboratory (LBNL)

  http://www-bisc.cs.berkeley.edu/
  Email: Nikravesh@cs.berkeley.edu
  Tel: (510) 643-4522; Fax: (510) 642-5775

  CITRIS-Europe
  June 19-21
  Helsinki, Finland

  *Member of Executive Committee, UC Discovery (Appointed by Provost and Senior Vice President)
  *Member of Research Council, UC Discovery (Appointed by Provost and Senior Vice President)
  *LBNL-NERSC Representative in UC Discovery Program (Appointed by NERSC Director)

                                               Outline

       Introduction to the BISC Program
       Technologies within BISC
             BISC-DSS
             BISC-FLINT-Search-NeuSearch
                         Theory and the Applications of Natural
                         Language Computing: Computation and
                         Reasoning with Information Presented in
                         Natural Languages

       Current Sponsors’ Projects

      BISC-UCB        EECS-CS Division-UCB



The BISC Program is an international center for
basic and applied research in soft computing.

The principal constituents of soft computing
(SC) are fuzzy logic (FL), neural network theory
(NN), and evolutionary computing (EC).

20+ BISC special interest groups (BISC-SIG) and 5,000+
members from around the world.
                                EVOLUTION OF FUZZY LOGIC

[Figure: generality plotted against time (1965, 1973, 1999), rising from
classical bivalent logic through f-generalization, f.g-generalization, and
nl-generalization to computing with words and perceptions (CWP)]

 1965: crisp sets → fuzzy sets
 1973: fuzzy sets → granulated fuzzy sets (linguistic variable)
 1999: measurements → perceptions
 Factual Information About the Impact of Fuzzy Logic
                                                          January 26, 2005
                                    PATENTS

-- Number of fuzzy-logic-related patents applied for in Japan: 17,740
-- Number of fuzzy-logic-related patents issued in Japan: 4,801
-- Number of fuzzy-logic-related patents issued in the US: around 1,700


   INSPEC - "fuzzy" in the title
       1970-1979:        569
       1980-1989:      2,403
       1990-1999:     23,210
       2000-present:  21,147
       Total:         47,329

   MathSciNet - "fuzzy" in the title
       1970-1979:        443
       1980-1989:      2,465
       1990-1999:      5,487
       2000-present:   5,504
       Total:         13,899

      Machine Intelligence – Human Intelligence
                     Year is 2020
• Computing Power ==> Quadrillion/sec/$100
   – 5-15 Quadrillion/sec (IBM’s fastest computer = 100 Trillion)
• High Resolution Imaging (Brain and Neuroscience)
   – Human Brain, Reverse Engineering
   – Dynamic Neuron Level Imaging/Scanning and Visualization
• Searching, Logical Analysis, and Reasoning
   – Searching for the Next Google
• Technology goes Nano and Molecular Level
   – Nanotechnology
   – Nano Wireless Devices and OS
       • Tiny, blood-cell-size robots
       • Virtual Reality through controlling Brain Cell Signals
• Who should work and who should get paid?
Four Phases of Problem (Predictive and Monitoring Tools)

• Anomaly Prediction: normal or anomaly
• Anomaly Detection: normal or anomaly
• Diagnosis: normal or anomaly (Reason 1, Reason 2)
• Defect Detection: Def1, Def2

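                           Customer Satisfaction

[Figure: satisfaction scale running from totally dissatisfied through
dissatisfied and satisfied to totally satisfied; profile analysis looks for
potential cross-over factors between dissatisfied and satisfied customers]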


Successful efforts to proactively convert marginally dissatisfied
customers into satisfied ones, even by a few percentage points, will
benefit most companies. Preventing “decay” in the other direction is
equally important and beneficial.
                      Opportunity

   • The overall forecast for document and
     content management estimates that the
     market will grow from $1 billion in 1999 to
     $4 billion in 2004.

 Intelligent Information Management

 Merrill Lynch: The adoption of the e-commerce business model
 drives demand for solutions that address unstructured data. We
 forecast the overall market for Unstructured Data Management
 software to grow at a CAGR of 31% from $1.2 billion in 1999 to
 $4.7 billion in 2004.
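
The growth figures quoted above can be checked with the standard compound annual growth rate formula. This is a minimal sketch, assuming five compounding years between 1999 and 2004; the function name `cagr` is illustrative and not part of the original slides.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Unstructured data management forecast: $1.2 billion (1999) to $4.7 billion (2004).
unstructured = cagr(1.2, 4.7, 2004 - 1999)
# Document and content management forecast: $1 billion (1999) to $4 billion (2004).
content = cagr(1.0, 4.0, 2004 - 1999)

print(f"Unstructured data management: {unstructured:.1%}")
print(f"Document and content management: {content:.1%}")
```

Under this assumption the first rate rounds to the quoted 31%, and the second forecast implies a similar rate of about 32%.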

          Google is just the beginning.




Internet advertising is expected to attract
 $8 billion this year, up 15% from 2004.

 US advertisers will spend $275 billion
     this year, up 5.7% from 2004.
                  TV and the Internet

           The Next Big Thing
           Real Interactive TV
       Internet Protocol TV (IPTV)

  Every viewer could potentially receive
   different advertisements based on their
profile, their searches, and the shows they
               have watched.
Families will not skip the ads, because the
         advertising is targeted.
                         BISC-DSS Software


               Neuro-Fuzzy-Evolutionary Computing
Multi-Criteria Decision Analysis with Uncertain and Incomplete Information

                     OBJECTIVES
Develop soft-computing-based techniques
           for decision analysis:

• Tools to assist decision-makers in assessing the
  consequences of decisions made in an environment of
  imprecision, uncertainty, and partial truth, and in providing a
  systematic risk analysis;

• Tools to assist decision-makers in answering “what-if”
  questions, examining numerous alternatives very quickly,
  and finding the values of the inputs that achieve a desired level of
  output;

• Tools to be used with human interaction and feedback to
  achieve a capability to learn and adapt through time.
                        BISC-DSS: Components and Structure

• Model and Data Visualization
• Model Management
     • Query
     • Aggregation
     • Ranking
     • Fitness Evaluation
• Evolutionary Kernel (Genetic Algorithm, Genetic Programming, and DNA)
     • Selection
     • Cross Over
     • Mutation
• Input From Decision Makers
• Experts Knowledge
• Model Representation Including Linguistic Formulation
     • Functional Requirements
     • Constraints
     • Goals and Objectives
     • Linguistic Variables Requirement
• Data Management
              BISC-DSS: Interaction and Optimization

[Figure: the User Interface sends a QUERY to the Fuzzy Search Engine (FSE) and
receives ANSWERS; the FSE works with the DB and with a MODEL for comparison,
aggregation, and scoring; user preferences (re-ranking, selection) feed the
OPTIMIZATION performed by the Evolutionary Computing Kernel]

MODEL based on
   • Aggregation operators
   • Similarity measures
   • Norm-Pairs
   • Fuzzy sets
                        Basic concepts

Advanced Multi-Aggregator Model
   Parameters
    - aggregators
    - weights
    - tree structure

[Figure: aggregation tree, with aggregators at the internal nodes and
attributes at the leaves]
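
An aggregation tree of this kind can be sketched in a few lines of Python. This is a minimal illustration, not the BISC-DSS implementation: the class names `Leaf` and `Node`, the three aggregators, and the attribute names and scores are all assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence, Union

# An aggregator maps child values and their weights to one score in [0, 1].
Aggregator = Callable[[Sequence[float], Sequence[float]], float]


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def fuzzy_and(values: Sequence[float], weights: Sequence[float]) -> float:
    return min(values)  # minimum t-norm; weights are ignored


def fuzzy_or(values: Sequence[float], weights: Sequence[float]) -> float:
    return max(values)  # maximum t-conorm; weights are ignored


@dataclass
class Leaf:
    attribute: str  # name of an attribute whose score is looked up

    def evaluate(self, scores: Dict[str, float]) -> float:
        return scores[self.attribute]


@dataclass
class Node:
    aggregator: Aggregator
    children: List[Union["Node", Leaf]]
    weights: List[float] = field(default_factory=list)

    def evaluate(self, scores: Dict[str, float]) -> float:
        values = [child.evaluate(scores) for child in self.children]
        weights = self.weights or [1.0] * len(values)  # default: equal weights
        return self.aggregator(values, weights)


# Hypothetical tree: mean of (price AND location), weighted 2, and size, weighted 1.
tree = Node(weighted_mean,
            [Node(fuzzy_and, [Leaf("price"), Leaf("location")]), Leaf("size")],
            weights=[2.0, 1.0])
scores = {"price": 0.8, "location": 0.6, "size": 0.9}
overall = tree.evaluate(scores)
print(round(overall, 3))
```

The three parameters named on the slide map directly onto the sketch: the aggregator function at each node, the per-child weights, and the nesting of nodes that forms the tree structure.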
                   Multi-Criteria Decision Model (2)


Inputs: Query and Data

For each attribute:
   1. Fuzzification (fuzzy sets)
   2. Fuzzy similarity calculation (norm-pairs, fuzzy similarity measures)

Then:
   3. Aggregation (aggregation model)
   4. Scoring (ranking or selecting answers)
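
The four stages above can be sketched end to end. This is a minimal illustration under stated assumptions: attribute values are normalized to [0, 1], fuzzification uses three triangular labels (low, medium, high), similarity is the fuzzy Jaccard measure built on the min/max norm pair, and aggregation is a weighted mean. The attribute names, weights, and records are hypothetical.

```python
from typing import Dict, List, Tuple


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership function with feet a and c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


# Assumed linguistic labels over a normalized [0, 1] attribute.
LABELS = [(-0.5, 0.0, 0.5), (0.0, 0.5, 1.0), (0.5, 1.0, 1.5)]


def fuzzify(x: float) -> List[float]:
    """Stage 1: memberships of x in (low, medium, high)."""
    return [triangular(x, a, b, c) for a, b, c in LABELS]


def similarity(u: List[float], v: List[float]) -> float:
    """Stage 2: fuzzy Jaccard similarity using min (t-norm) and max (t-conorm)."""
    return sum(map(min, u, v)) / sum(map(max, u, v))


def score(query: Dict[str, float], record: Dict[str, float],
          weights: Dict[str, float]) -> float:
    """Stage 3: weighted-mean aggregation of the per-attribute similarities."""
    total = sum(weights.values())
    return sum(weights[k] * similarity(fuzzify(query[k]), fuzzify(record[k]))
               for k in query) / total


def rank(query, records, weights) -> List[Tuple[str, float]]:
    """Stage 4: order the records by decreasing score."""
    scored = [(name, score(query, rec, weights)) for name, rec in records.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


query = {"price": 0.2, "quality": 0.9}
records = {
    "A": {"price": 0.25, "quality": 0.8},
    "B": {"price": 0.7, "quality": 0.9},
    "C": {"price": 0.2, "quality": 0.3},
}
ranking = rank(query, records, {"price": 0.5, "quality": 0.5})
for name, value in ranking:
    print(name, round(value, 3))
```

Swapping in a different norm pair or aggregation operator only changes `similarity` or `score`; the surrounding pipeline stays the same.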
                   Other Applications
Application       Description
Finance           • stock prices and characteristics, credit
                    scoring, credit card ranking
Military          • battlefield simulation and decision
                    making
Medicine          • diagnosis
Marketing         • store and product display
                  • electronic shopping
Internet          • provide knowledge and advice to
                    large numbers of users
Education         • university admission
Banking           • fraud detection

                                       NeuFCSearch
                   NeuSearch: Neuroscience Approach
          Search Engine Based on Conceptual Semantic Indexing

[Figure: families of search techniques built on the Term-Document Matrix,
leading from Classical Search to Neuro-Fuzzy Conceptual Search (NeuFCS)]

• The use of bivalent-logic Theory, entries in [0, 1]: keyword search;
  classical techniques; Google, Teoma, etc.
• The use of statistical-Probabilistic Theory, entries [tf-idf]: LSI,
  Probability, Bayesian; Lycos, etc.
• The use of Fuzzy Set Theory, entries [set]: Fuzzy, NNnet (BP, GA-GP, SVM);
  Imprecise Search
• The use of Fuzzy Set-Object-Based Theory: use Graph Theory and Semantic Net;
  NLP with GA-GP Based NLP; possibly AskJeeves
• Specialization, weights w(i, j): PRBF, GRRBF, ANFIS, RBFNN (BP, GA-GP, SVM)
• FCS Based on Neuroscience Approach: RBF, NeuFCS; GA-GP; Context-Based tf-idf;
  Ranked tf-idf; Topic, Title, Summarization; Concept-Based Indexing
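
The difference between the bivalent and the tf-idf term-document matrices can be shown on a toy corpus. This is a minimal sketch, assuming the common weighting tf × log(N / df); the three documents are invented for illustration, and the slides do not specify which tf-idf variant is meant.

```python
import math
from collections import Counter

# Hypothetical three-document corpus.
docs = {
    "d1": "fuzzy logic search",
    "d2": "fuzzy search engine search",
    "d3": "neural network",
}

vocab = sorted({word for text in docs.values() for word in text.split()})
counts = {name: Counter(text.split()) for name, text in docs.items()}

# Bivalent matrix: 1 if the term occurs in the document, 0 otherwise.
binary = {name: [1 if counts[name][w] else 0 for w in vocab] for name in docs}

# tf-idf matrix: term frequency times log(N / document frequency).
n_docs = len(docs)
df = {w: sum(1 for name in docs if counts[name][w] > 0) for w in vocab}
tfidf = {name: [counts[name][w] * math.log(n_docs / df[w]) for w in vocab]
         for name in docs}

print(vocab)
for name in docs:
    print(name, binary[name], [round(x, 2) for x in tfidf[name]])
```

A keyword engine only sees the 0/1 rows, while the tf-idf rows rank "search" higher in d2, where it occurs twice, than in d1.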

                                                        NeuFCSearch

   PNL-Based Conceptual Fuzzy Sets Using Brain Science

[Figure: network linking the Document (Corpus) layer, the Word Space, and the
Concept-Context Dependent Word Space]

Weights between the word layer and the document layer:
   w(i, j) = f(p(i, j), p(i), p(j))
   i: neuron in word layer
   j: neuron in document or Concept-Context layer

Weights between the document layer and the word layer:
   w(j, k) = f(p(j, k), p(j), p(k))
   j: neuron in document or Concept-Context layer
   k: neuron in word layer

Interconnection based on Mutual Information:
   w(i, j) = r_ij
   i: neuron in document layer
   j: neuron in word layer

   r_ij:  i is an instance of j     (is or isu)
          i is a subset of j        (is or isu)
          i is a superset of j      (is or isu)
          j is an attribute of i
          i causes j                (or usually)
          i and j are related
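
One concrete choice for the function f in w(i, j) = f(p(i, j), p(i), p(j)) is pointwise mutual information. The slides only say that the interconnection is based on mutual information, so the exact form below, the clipping of negative values to zero, and the toy counts are all assumptions made for illustration.

```python
import math

# Hypothetical word-by-document occurrence counts.
counts = {
    "fuzzy":  {"d1": 2, "d2": 1, "d3": 0},
    "search": {"d1": 1, "d2": 3, "d3": 0},
    "neuron": {"d1": 0, "d2": 0, "d3": 4},
}
doc_names = ["d1", "d2", "d3"]

total = sum(sum(row.values()) for row in counts.values())
p_word = {i: sum(row.values()) / total for i, row in counts.items()}
p_doc = {j: sum(counts[i][j] for i in counts) / total for j in doc_names}


def weight(i: str, j: str) -> float:
    """Positive pointwise mutual information between word i and document j."""
    p_ij = counts[i][j] / total
    if p_ij == 0.0:
        return 0.0  # no co-occurrence, no connection
    return max(0.0, math.log(p_ij / (p_word[i] * p_doc[j])))


for i in counts:
    print(i, [round(weight(i, j), 3) for j in doc_names])
```

Words that occur only in one document, such as "neuron" in d3, receive the strongest connection to it.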

                                                   NeuFCSearch
                                                     Neu-FCS
                    Concept-Context Dependent Word Space

[Figure: the Word Space is connected to the Document (Corpus) layer, labelled
Documents Space or Concept and Context Space (based on SOM or PCA), and from
there to the Concept-Context Dependent Word Space]

w(i, j) = f(p(i, j), p(i), p(j))
   i: neuron in word layer
   j: neuron in document or Concept-Context layer

w(j, k) = f(p(j, k), p(j), p(k))
   j: neuron in document or Concept-Context layer
   k: neuron in word layer

W(i, j) is calculated based on Fuzzy-LSI or Probabilistic LSI
(in general form, it can be calculated based on PNL).

                             NeuFCSearch
  Neu-FCS

[Figure: Input: Word (Word Space) → Activated Document or Concept-Context
(Document (Corpus)) → Output: Concept-Context Dependent Word]
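
The input-to-output flow in this diagram can be sketched as a two-step activation: an input word activates documents through the word-to-document weights, and the activated documents in turn activate other words. This is a minimal illustration; the weight values are invented, and using the same weights in both directions is a simplifying assumption.

```python
from typing import Dict, List, Tuple

# Hypothetical word-to-document weights w(i, j).
weights: Dict[str, Dict[str, float]] = {
    "fuzzy":  {"d1": 0.9, "d2": 0.4, "d3": 0.0},
    "logic":  {"d1": 0.8, "d2": 0.0, "d3": 0.1},
    "search": {"d1": 0.1, "d2": 0.9, "d3": 0.0},
    "neuron": {"d1": 0.0, "d2": 0.0, "d3": 0.9},
}


def activate_documents(word: str) -> Dict[str, float]:
    """Step 1: the input word activates each document by its weight."""
    return dict(weights[word])


def related_words(word: str, top_k: int = 2) -> List[Tuple[str, float]]:
    """Step 2: activated documents activate other words; return the strongest."""
    doc_activation = activate_documents(word)
    scores = {
        other: sum(doc_activation[d] * row[d] for d in doc_activation)
        for other, row in weights.items()
        if other != word
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top_k]


result = related_words("fuzzy")
print(result)
```

With these weights, "fuzzy" leads to "logic" and then "search", because those words share the most strongly activated documents.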
              Pattern Trees




          Zhiheng Huang
                  Pattern Trees



A pattern tree represents the pattern for one output
class.
The model consists of several pattern
trees, each corresponding to one output
class.
Assume two fuzzy variables, A and B, which take the
fuzzy linguistic labels A1, A2 and B1, B2
respectively, and two output classes, X and Y.

                     Pattern Trees

[Figure: Pattern Trees (top) compared with Decision Trees (bottom)]
                 Pattern Trees


Both trees generate the same rules

  If A = A1 Then class = X
  If A = A2 And B = B1 Then class = X
  If A = A2 And B = B2 Then class = Y

Pattern Trees are not just inverted
decision trees(!)
               Pattern Tree Induction


1. Choose the fuzzy linguistic label which
   has the maximal similarity to the output
   class as the bottom leaf
2. Try all the remaining fuzzy linguistic
   labels with different aggregations,
   choose the label and aggregation which
   together lead to the highest similarity
3. Apply steps 1 and 2 until no fuzzy label
   and aggregation can lead to a higher
   similarity
             Pattern Tree Induction Example


An artificial dataset

         A               B           Class
   A1        A2    B1        B2     X     Y
   0.8       0.2   0.0       1.0   0.9   0.1
   0.9       0.1   0.1       0.9   0.8   0.2
   0.5       0.5   0.9       0.1   0.7   0.3
   0.2       0.8   0.8       0.2   0.9   0.1
   0.4       0.6   0.4       0.6   0.3   0.7
   0.3       0.7   0.5       0.5   0.1   0.9
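
The induction steps can be tried on this dataset with a short greedy sketch. Two choices below are assumptions, since the slides do not name them: similarity is the fuzzy Jaccard measure (sum of minima over sum of maxima), and the candidate aggregations are AND (min) and OR (max). The function name `induce` is illustrative.

```python
from typing import Dict, List, Tuple

# Memberships from the artificial dataset, one list entry per row.
LABELS: Dict[str, List[float]] = {
    "A1": [0.8, 0.9, 0.5, 0.2, 0.4, 0.3],
    "A2": [0.2, 0.1, 0.5, 0.8, 0.6, 0.7],
    "B1": [0.0, 0.1, 0.9, 0.8, 0.4, 0.5],
    "B2": [1.0, 0.9, 0.1, 0.2, 0.6, 0.5],
}
CLASSES: Dict[str, List[float]] = {
    "X": [0.9, 0.8, 0.7, 0.9, 0.3, 0.1],
    "Y": [0.1, 0.2, 0.3, 0.1, 0.7, 0.9],
}
AGGREGATIONS = {"AND": min, "OR": max}  # assumed aggregation set


def similarity(u: List[float], v: List[float]) -> float:
    """Fuzzy Jaccard similarity (an assumed choice of measure)."""
    return sum(map(min, u, v)) / sum(map(max, u, v))


def induce(target: List[float]) -> Tuple[str, float]:
    remaining = dict(LABELS)
    # Step 1: the label most similar to the output class becomes the bottom leaf.
    name = max(remaining, key=lambda k: similarity(remaining[k], target))
    values = remaining.pop(name)
    best = similarity(values, target)
    # Steps 2 and 3: keep adding the best (label, aggregation) pair while it helps.
    while remaining:
        candidates = [
            (similarity([agg(a, b) for a, b in zip(values, vals)], target), label, op)
            for label, vals in remaining.items()
            for op, agg in AGGREGATIONS.items()
        ]
        score, label, op = max(candidates)
        if score <= best:
            break
        new_vals = remaining.pop(label)
        values = [AGGREGATIONS[op](a, b) for a, b in zip(values, new_vals)]
        name, best = f"({name} {op} {label})", score
    return name, best


tree_x, sim_x = induce(CLASSES["X"])
print(tree_x, round(sim_x, 4))
```

Under these assumptions the tree for class X starts from the leaf A1 and then adds B1 with OR, which is in line with the first two rules listed earlier (A1, or A2 together with B1, leads to X).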
          Pattern Tree Induction Example

[Figures: successive steps of constructing the pattern tree, followed by the
constructed pattern tree]
            BISC-DSS-ASIS Software



Automated Sensory Inspection System
       OMRON-BISC Project

     Automated Sensory Inspection System (ASIS)

What is ASIS?


• A software system that generates time
  series classifiers.
• Its principal application is classifying faults
  in sensory inspection systems.

Performances
Prediction of NG: 89.5%
Prediction of OK: 100%

AllAttributes/Old
            Observed    Highest     Second      Third       Fourth      Fifth       6th
*kachikachi kachikachi  OK          kachikachi  gagaga      bu-         bi-         kakaka
bu-         bu-         bu-         kakaka      gagaga      OK          kachikachi  bi-
kakaka      kakaka      kakaka      gagaga      bu-         OK          kachikachi  bi-
gagaga      gagaga      gagaga      bu-         OK          kachikachi  kakaka      bi-
kachikachi  kachikachi  kachikachi  OK          gagaga      bi-         bu-         kakaka
kachikachi  kachikachi  kachikachi  gagaga      OK          bu-         bi-         kakaka
gagaga      gagaga      gagaga      kakaka      kachikachi  OK          bu-         bi-
kachikachi  kachikachi  kachikachi  OK          gagaga      bu-         bi-         kakaka
bu-         bu-         bu-         gagaga      OK          kakaka      bi-         kachikachi
bu-         bu-         bu-         kakaka      gagaga      OK          kachikachi  bi-
bu-         bu-         bu-         gagaga      OK          kachikachi  kakaka      bi-
bu-         bu-         bu-         gagaga      OK          kachikachi  kakaka      bi-
bu-         bu-         bu-         gagaga      OK          kakaka      bi-         kachikachi
bu-         bu-         bu-         gagaga      OK          kachikachi  bi-         kakaka
bi-         bi-         bi-         kakaka      gagaga      bu-         OK          kachikachi
bi-         bi-         bi-         OK          kachikachi  gagaga      kakaka      bu-
*bi-        bi-         OK          bi-         kachikachi  bu-         gagaga      kakaka
kachikachi  kachikachi  kachikachi  OK          gagaga      bu-         bi-         kakaka
bu-         bu-         bu-         OK          gagaga      kachikachi  kakaka      bi-
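
One reading of the "Prediction of NG" figure that reproduces the 89.5% reported for this table: among samples whose observed label is a fault class (anything other than OK), count those whose highest-ranked prediction is also a fault class. This interpretation is an assumption; the slides do not define the metric. The pairs below are the Observed and Highest columns of the AllAttributes/Old table.

```python
from typing import List, Tuple

# (observed, highest-ranked prediction) for the 19 rows of the table.
rows: List[Tuple[str, str]] = (
    [("kachikachi", "OK"), ("bu-", "bu-"), ("kakaka", "kakaka"),
     ("gagaga", "gagaga"), ("kachikachi", "kachikachi"),
     ("kachikachi", "kachikachi"), ("gagaga", "gagaga"),
     ("kachikachi", "kachikachi")]
    + [("bu-", "bu-")] * 6
    + [("bi-", "bi-"), ("bi-", "bi-"), ("bi-", "OK"),
       ("kachikachi", "kachikachi"), ("bu-", "bu-")]
)


def ng_prediction_rate(pairs: List[Tuple[str, str]]) -> float:
    """Share of observed-NG samples whose top prediction is also not OK."""
    ng = [(obs, top) for obs, top in pairs if obs != "OK"]
    flagged = sum(1 for _, top in ng if top != "OK")
    return flagged / len(ng)


rate = ng_prediction_rate(rows)
print(f"Prediction of NG: {rate:.1%}")
```

Here 17 of the 19 fault samples are flagged as faults, and the two misses are the rows marked with an asterisk.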

Performances
Prediction of NG: 95%
Prediction of OK: 99.7%

Onlyattribut1
            Observed    Highest     Second      Third       Fourth      Fifth       6th
kachikachi  kachikachi  kachikachi  kakaka      OK          bu-         gagaga      bi-
*bu-        bu-         kachikachi  kakaka      bu-         gagaga      bi-         OK
*kakaka     kakaka      kachikachi  kakaka      bu-         gagaga      bi-         OK
*gagaga     gagaga      bi-         gagaga      bu-         kachikachi  kakaka      OK
kachikachi  kachikachi  kachikachi  kakaka      bu-         gagaga      bi-         OK
*kachikachi kachikachi  bu-         kachikachi  kakaka      gagaga      bi-         OK
*gagaga     gagaga      bu-         bi-         gagaga      kachikachi  kakaka      OK
kachikachi  kachikachi  kachikachi  kakaka      bu-         gagaga      bi-         OK
*bu-        bu-         bi-         gagaga      bu-         kachikachi  kakaka      OK
*bu-        bu-         bi-         gagaga      bu-         kachikachi  kakaka      OK
*bu-        bu-         bi-         gagaga      bu-         kachikachi  kakaka      OK
bu-         bu-         bu-         kachikachi  kakaka      gagaga      bi-         OK
*bu-        bu-         bi-         gagaga      bu-         kachikachi  kakaka      OK
*bu-        bu-         bi-         gagaga      bu-         kachikachi  kakaka      OK
bi-         bi-         bi-         gagaga      bu-         kachikachi  kakaka      OK
bi-         bi-         bi-         gagaga      bu-         kachikachi  kakaka      OK
*bi-        bi-         kachikachi  kakaka      OK          bu-         gagaga      bi-
kachikachi  kachikachi  kachikachi  kakaka      bu-         gagaga      bi-         OK
*bu-        bu-         OK          kakaka      kachikachi  bu-         gagaga      bi-
OK          OK          OK          kakaka      kachikachi  bu-         gagaga      bi-
*OK         OK          kachikachi  kakaka      bu-         gagaga      bi-         OK
OK          OK          OK          kakaka      kachikachi  bu-         gagaga      bi-

Performances
Prediction of NG: 100%
Prediction of OK: 100%

AllAttributes/Old+2NewAttrbs
            Observed    Highest     Second      Third       Fourth      Fifth       6th
kachikachi  kachikachi  kachikachi  kakaka      gagaga      bi-         OK          bu-
bu-         bu-         bu-         gagaga      kakaka      OK          bi-         kachikachi
kakaka      kakaka      kakaka      gagaga      bu-         bi-         OK          kachikachi
gagaga      gagaga      gagaga      bu-         kakaka      OK          bi-         kachikachi
kachikachi  kachikachi  kachikachi  OK          bi-         gagaga      bu-         kakaka
kachikachi  kachikachi  kachikachi  kakaka      gagaga      bi-         OK          bu-
gagaga      gagaga      gagaga      kakaka      bu-         OK          kachikachi  bi-
kachikachi  kachikachi  kachikachi  gagaga      OK          bi-         kakaka      bu-
bu-         bu-         bu-         gagaga      OK          kakaka      bi-         kachikachi
bu-         bu-         bu-         kakaka      OK          gagaga      bi-         kachikachi
bu-         bu-         bu-         gagaga      OK          bi-         kakaka      kachikachi
bu-         bu-         bu-         OK          gagaga      bi-         kakaka      kachikachi
bu-         bu-         bu-         gagaga      OK          bi-         kakaka      kachikachi
bu-         bu-         bu-         OK          gagaga      kachikachi  bi-         kakaka
bi-         bi-         bi-         kakaka      gagaga      bu-         OK          kachikachi
bi-         bi-         bi-         kakaka      gagaga      bu-         OK          kachikachi
bi-         bi-         bi-         OK          bu-         gagaga      kakaka      kachikachi
kachikachi  kachikachi  kachikachi  kakaka      gagaga      bi-         OK          bu-
bu-         bu-         bu-         OK          gagaga      kakaka      bi-         kachikachi
OK          OK          OK          bu-         gagaga      bi-         kakaka      kachikachi

Performances
Prediction of NG: 95%
Prediction of OK: 99.8%

TwoNEwAttrb/Label/(22,23,26,30,35,39): 6 Variables of Old OMRON
            Observed    Highest     Second      Third       Fourth      Fifth       6th
kachikachi  kachikachi  kachikachi  kakaka      gagaga      bu-         OK          bi-
bu-         bu-         bu-         gagaga      kakaka      bi-         OK          kachikachi
kakaka      kakaka      kakaka      bu-         gagaga      bi-         kachikachi  OK
*gagaga     gagaga      bu-         gagaga      bi-         kakaka      OK          kachikachi
*kachikachi kachikachi  kakaka      gagaga      bu-         kachikachi  bi-         OK
kachikachi  kachikachi  kachikachi  gagaga      kakaka      bu-         OK          bi-
gagaga      gagaga      gagaga      bu-         kakaka      kachikachi  bi-         OK
kachikachi  kachikachi  kachikachi  gagaga      kakaka      bu-         OK          bi-
bu-         bu-         bu-         bi-         gagaga      kakaka      kachikachi  OK
*bu-        bu-         gagaga      bu-         kakaka      bi-         OK          kachikachi
bu-         bu-         bu-         gagaga      bi-         kakaka      OK          kachikachi
*bu-        bu-         gagaga      bu-         kakaka      bi-         OK          kachikachi
*bu-        bu-         bi-         bu-         kakaka      gagaga      OK          kachikachi
*bu-        bu-         gagaga      bu-         kachikachi  kakaka      bi-         OK
bi-         bi-         bi-         bu-         gagaga      kakaka      OK          kachikachi
bi-         bi-         bi-         bu-         kakaka      gagaga      OK          kachikachi
*bi-        bi-         kakaka      bu-         bi-         gagaga      OK          kachikachi
kachikachi  kachikachi  kachikachi  kakaka      gagaga      bu-         OK          bi-
*bu-        bu-         OK          bu-         gagaga      kakaka      bi-         kachikachi
                    Berkeley Initiative in Soft Computing (BISC)

                                                Performances
                                               Prediction of NG: 100%
                                               Prediction of OK: 99.8%


TwoNEwAttrb/Label/(22,23,26,30,35,39) and (24,25,28,29,31,32,33,34)
                                                    Observed Highest        Second     Third     Fourth     Fifth         6th
                                        kachikachi kachikachi kachikachi     gagaga    kakaka         bi-      OK        bu-
                                         bu-              bu-        bu-     gagaga    kakaka         bi-      OK kachikachi
          13 Variables of Old OMRON      kakaka        kakaka     kakaka     gagaga       bu-         bi-      OK kachikachi
                                         gagaga       gagaga     gagaga          bi-      bu-        OK     kakaka kachikachi
                                         *kachikachi kachikachi  gagaga kachikachi     kakaka        bu-            bi-     OK
                                         kachikachi kachikachi kachikachi    gagaga    kakaka         bi-      OK               bu-
                                         gagaga       gagaga     gagaga      kakaka       bu- kachikachi       OK               bi-
                                         kachikachi kachikachi kachikachi    gagaga    kakaka        bu-            bi-     OK
                                         bu-              bu-        bu-         bi-   gagaga        OK     kakaka kachikachi
                                         bu-              bu-        bu-     gagaga        bi-    kakaka       OK kachikachi
                                         bu-              bu-        bu-         bi-   gagaga        OK kachikachi   kakaka
                                         bu-              bu-        bu-         bi-   gagaga        OK    kakaka kachikachi
                                         *bu-             bu-         bi-       bu-    gagaga     kakaka      OK kachikachi
                                         bu-              bu-        bu-     gagaga        bi-       OK    kakaka kachikachi
                                         bi-               bi-        bi-       bu-    gagaga     kakaka      OK kachikachi
                                         bi-               bi-        bi-       bu-    gagaga     kakaka      OK kachikachi
                                         bi-               bi-        bi-       bu-      OK      gagaga     kakaka kachikachi
                                         kachikachi kachikachi kachikachi    gagaga    kakaka       bu-         bi-      OK
                                         bu-              bu-        bu-       OK      gagaga         bi-   kakaka kachikachi
                        Berkeley Initiative in Soft Computing (BISC)

                                                      Performances
                                                     Prediction of NG: 100%
                                                     Prediction of OK: 99.8%



TwoNEwAttrb/Label/(22,23,26,30,35,39) and (24,25,28,29,31,32,33,34) and (32, 36, 37, 38)
                                                              Observed Highest        Second      Third     Fourth     Fifth          6th
                                                   kachikachi kachikachi kachikachi      gagaga   kakaka         bi-      OK                bu-
                                                     bu-              bu-        bu-    gagaga    kakaka        OK             bi- kachikachi
17 Variables of Old OMRON                            kakaka        kakaka     kakaka    gagaga        bu-        bi-      OK kachikachi
                                                     gagaga       gagaga     gagaga        bu-        bi-       OK     kakaka kachikachi
                                                     *kachikachi kachikachi  gagaga kachikachi    kakaka        bu-       OK                bi-
                                                     kachikachi kachikachi kachikachi   gagaga    kakaka         bi-      OK                bu-
                                                     gagaga       gagaga     gagaga     kakaka kachikachi       bu-       OK                bi-
                                                     kachikachi kachikachi kachikachi   gagaga    kakaka        bu-       OK                bi-
                                                     bu-              bu-        bu-        bi-   gagaga        OK     kakaka kachikachi
                                                     bu-              bu-        bu-    gagaga    kakaka         bi-      OK kachikachi
                                                     bu-              bu-        bu-        bi-   gagaga        OK kachikachi   kakaka
                                                     bu-              bu-        bu-        bi-   gagaga        OK    kakaka kachikachi
                                                     *bu-             bu-         bi-      bu-    gagaga     kakaka      OK kachikachi
                                                     bu-              bu-        bu-    gagaga       OK          bi-   kakaka kachikachi
                                                     bi-               bi-        bi-      bu-    gagaga     kakaka       OK kachikachi
                                                     bi-               bi-        bi-      bu-    gagaga     kakaka       OK kachikachi
                                                     bi-               bi-        bi-      bu-    gagaga        OK     kakaka kachikachi
                                                     kachikachi kachikachi kachikachi   gagaga    kakaka        bu-        bi-      OK
                                                     bu-              bu-        bu-       OK     gagaga     kakaka            bi- kachikachi
Berkeley Initiative in Soft Computing (BISC)

  PCA Result using all the Attributes
Berkeley Initiative in Soft Computing (BISC)
            BISC-DSS-CS3 Software


        British Telecom-BISC
                   CS3
    Berkeley Initiative in Soft Computing (BISC)
          BT funded BISC Research Project


Theme: application of soft-computing to data analysis and
modelling

Focus: predictive capability of data model

Domain: customer satisfaction data analysis

Type of data: customer survey responses, ~ 50 parameters,
5000 to 50000 records

Model: handle soft concepts and bias decisions based on
user preferred criteria (weights)
             Berkeley Initiative in Soft Computing (BISC)




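[Figure: customer satisfaction scale running from "totally dissatisfied" through "dissatisfied" and "satisfied" to "totally satisfied"; the middle of the scale is marked "potential cross-over factors?", with the label "profile analysis" beneath it]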


Successful efforts to proactively convert marginally dissatisfied
customers to satisfied ones by even a few percentage points will
benefit most companies. Preventing “decay” in the other direction is
equally important and beneficial.
          Berkeley Initiative in Soft Computing (BISC)
           Sample Survey Questions (repair & fault resolution)

•   Questions with Yes/No responses (majority):
    – Whether BT person knew how to deal with the requirement?
    – Whether BT person explained what would happen next?
    – Whether BT person understood the requirement?

• Questions with ratings:
 (Esat, Vsat, Fsat, Nsat/Dis, Fdis, Vdis, Edis)
  – Overall Satisfaction with BT's handling of event?
  – Satisfaction with Engineer's performance?
  – Satisfaction with BT person?

•   Questions with indirect clues:
    – Did you use the automated reporting system?
    – On ACE, was the information clear and easy to understand?
    – Were the instructions clear and easy to follow?
        Berkeley Initiative in Soft Computing (BISC)
                      Sample Survey Questions



  – Was the length of the message about right?
  – Did you use the automated reporting system?
  – Whether respondent encountered Call Steering when they
    called BT?


• Questions with special responses:
  – Whether respondent was given a time that the Engineer would
    visit when they reported the fault? (precise, 2hr slot, AM/PM,
    which day)
  – How many BT engineers visited? (1, >=2)
  – Whether Engineer(s) arrived when expected? (yes, earlier, later,
    no expectations)
        Berkeley Initiative in Soft Computing (BISC)
               Customer Satisfaction Data Modelling


• Classification:
  – Allow user to modify priorities of different survey
    parameters (e.g. via weights)
  – Establish soft definitions of categories (e.g. Vsat, Fsat,
    Nsat/Dis, Fdis, Vdis) that can be user defined
  – Based on survey data (and user-specified priorities),
    produce hard & soft classification of customers into
    predefined categories
  – Provide simple visualization of the classification results
  – Model adaptation based on user feedback/correction
        Berkeley Initiative in Soft Computing (BISC)
               Customer Satisfaction Data Modelling


• Predictive tool to answer:
  – What would cause the movement of customers from one
    class to another? (e.g. from ‘Fsat’ customers to ‘Vsat’,
    or ‘Nsat’ customers to ‘Fdis’)

• Identify which parameters play an important role in each
  category
• Identify parameter relations/dependencies between
  categories, especially non-linear ones
• Model should adapt to changes in user priorities of
  parameters (survey questions)
Berkeley Initiative in Soft Computing (BISC)
      Customer Satisfaction Data Modelling

[Diagram: Data and UserW enter Preprocessing, followed by ModelW and the BISC Model, which outputs Predictive Results and Classification Results]
        Berkeley Initiative in Soft Computing (BISC)
               Customer Satisfaction Data Modelling


• Desired features:
  – Focus on algorithms for classification and data modelling
      – Predicting the key parameters that would cause
        movement of customers from one class to another is
        the highest priority for this project
  – Minimal UI for effective demonstration
      – Classification results (pie charts, scattergrams)
      – Prediction results (with verification capabilities)
  – First version may be stand-alone (rather than a web-based
    client-server application) to simplify software
    engineering issues
 Berkeley Initiative in Soft Computing (BISC)




Question and Answering System
        BISC/BT Text Mining & QA Research




  Mirza Sufyan Beg and Zengchang Qin
             Berkeley Initiative in Soft Computing (BISC)

                 BISC/BT Text Mining & QA Research

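[Diagram: QA system architecture.
 • Documents (doc) pass through Preprocessing (NLP + PNL) and the IE module, producing facts & relations.
 • The query passes through dialog clarification and becomes a precisiated query.
 • The Deduction / Reasoning Engine (GrC / CW) combines the precisiated query with the facts & relations and passes the result to Response formulation, which returns the answer.
 • Knowledge sources: corpora, world knowledge, domain / background knowledge, and application knowledge (DDB).
 • Part of the diagram is marked "later stage".]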
      Berkeley Initiative in Soft Computing (BISC)
                      Approach


• Q&A System
   – Precisiation
   – Deduction

• Begin with Simplified Natural Language
   – Only simple sentences (Subject, Verb, Object)
   – Object is optional (intransitive verbs)

• Look for Protoforms first

• Otherwise, fall back on existing approaches
     Berkeley Initiative in Soft Computing (BISC)
         Subject, Verb and Object Phrases


• Extract the subject phrase, verb phrase and the
  object phrase from a simple sentence that is
  already tagged according to Penn Treebank
  tagset
   – Tagged Proposition: Airway/NNP allows/VBZ
     linking/VBG a/DT number/NN of/IN
     telephones/NNS ./.
   – The Subject Phrase is:- Airway
   – The Verb Phrase is:- allows linking
   – The Object Phrase is:- a number of telephones
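
The extraction step above can be sketched in a few lines. This is only an illustrative sketch, not the BISC implementation: the function name `split_svo` is hypothetical, and it assumes the verb phrase is the first unbroken run of tokens whose Penn Treebank tag starts with `VB` (modals, adverbs, and nested clauses are not handled).

```python
def split_svo(tagged: str):
    """Split a tagged simple sentence into (subject, verb, object) phrases.

    Assumption: the verb phrase is the first consecutive run of VB* tags;
    everything before it is the subject, everything after it is the object.
    """
    tokens = []
    for item in tagged.split():
        word, _, tag = item.rpartition("/")
        if tag != ".":  # drop sentence-final punctuation
            tokens.append((word, tag))

    # Index of the first verb-tagged token, if any
    start = next((i for i, (_, t) in enumerate(tokens) if t.startswith("VB")), None)
    if start is None:
        return " ".join(w for w, _ in tokens), "", ""

    # Extend the verb phrase over consecutive verb tags
    end = start
    while end < len(tokens) and tokens[end][1].startswith("VB"):
        end += 1

    def join(part):
        return " ".join(w for w, _ in part)

    return join(tokens[:start]), join(tokens[start:end]), join(tokens[end:])


subject, verb, obj = split_svo(
    "Airway/NNP allows/VBZ linking/VBG a/DT number/NN of/IN telephones/NNS ./."
)
print(subject, "|", verb, "|", obj)
```

For an intransitive sentence the third element is simply an empty string, which matches the "object is optional" rule.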
  Berkeley Initiative in Soft Computing (BISC)
                  Protoforms


X is A
Y is (X+B)
Q As are Bs
(Q1 × Q2)As are (B and C)s
f(X) is A
X isr A
Causal Facts
If-Then rules
           Berkeley Initiative in Soft Computing (BISC)
                               Protoforms
   “X is A” form: Look in A for any of the following
"is", "was", "were", "are", "am", "shall be", "will be", "should be",
"would be", "can be", "could be", "must be", "have to be", "had to be",
"might be", "ought to be", "is likely to be", "was likely to be", "were
likely to be", "are likely to be", "am likely to be", "shall probably be",
"shall usually be", "shall partially be", "shall possibly be", ….

  “X isr A” form: Look in A for any of the following
"probably", "usually", "partially", "possibly", "mostly", "likely"

  “Q As are Bs” form: Look in A for any of the following
   "some", "somewhat", "few", "none", "many", "almost", "a little bit",
      "about", "most", "all", "a lot" (many), numbers (words and digits),
      …

  Causal Facts/ “Why” questions: Look for “due to” / “because” /
“since” in a sentence

  For If Then Rule: Simply look for “If”
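
The cue-word lookups above can be sketched as a small classifier. This is a simplified illustration under stated assumptions, not the BISC code: the cue lists are abbreviated, the names `classify`, `has_cue`, and `COPULA` are hypothetical, each sentence gets a single label, and quantifiers are searched only in the text before the copula. Because it works on raw text rather than tagged phrases, passive sentences such as "Data Sockets are used to connect computers" would be labelled XisA here rather than Fact.

```python
import re

# Abbreviated cue lists taken from the slide; the full lists are longer.
COPULA = re.compile(
    r"\b(?:(?:shall|will|should|would|can|could|must|might) be|is|was|were|are|am)\b",
    re.IGNORECASE,
)
ISR_CUES = ("probably", "usually", "partially", "possibly", "mostly", "likely")
QUANTIFIERS = ("some", "few", "none", "many", "almost", "about", "most", "all", "a lot")
CAUSAL_CUES = ("due to", "because", "since")


def has_cue(text, cues):
    """True if any cue occurs in text as a whole word or phrase."""
    return any(
        re.search(r"\b" + re.escape(cue) + r"\b", text, re.IGNORECASE) for cue in cues
    )


def classify(sentence):
    """Return one protoform label for a simple sentence (first match wins)."""
    s = sentence.strip()
    if s.lower().startswith("if "):
        return "IF"
    if has_cue(s, CAUSAL_CUES):
        return "CAUSE"
    copula = COPULA.search(s)
    if copula is None:
        return "Fact"  # no recognizable protoform
    if has_cue(s, ISR_CUES):
        return "XisrA"
    if has_cue(s[: copula.start()], QUANTIFIERS):  # quantifier in the subject part
        return "QAs_are_Bs"
    return "XisA"


for line in ("His son is quick.", "Most balls are large.",
             "Airway allows linking a number of telephones."):
    print(classify(line), "<-", line)
```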
    Berkeley Initiative in Soft Computing (BISC)
              Question Processing


• Get keywords from a query
• Pull out the facts that contain any of these
  keywords or their synonyms
• Run the deduction module on this subset of facts
• “What is X?” is answered from IS forms
• “Why is X A?” is answered from causal facts
• “How” questions require procedures
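
The first two steps (keyword extraction and fact selection) can be sketched as follows. The stopword list, the synonym table, and the function names are hypothetical and chosen only for illustration; the slides do not specify how keywords or synonyms are obtained.

```python
import re

# Hypothetical stopword list; question words and function words are ignored.
STOPWORDS = {"what", "why", "how", "is", "are", "the", "a", "an", "of", "to", "do", "does"}


def words(text):
    """Lower-cased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def query_keywords(query, synonyms=None):
    """Keywords of the query plus any synonyms listed for them."""
    keys = words(query) - STOPWORDS
    for key in list(keys):
        keys.update((synonyms or {}).get(key, ()))
    return keys


def relevant_facts(query, facts, synonyms=None):
    """Facts sharing at least one keyword (or synonym) with the query."""
    keys = query_keywords(query, synonyms)
    return [fact for fact in facts if keys & words(fact)]


facts = [
    "Airway allows linking a number of telephones",
    "Most balls are large",
    "The cost of calls to Spain is about 40p per minute",
]
print(relevant_facts("What is Airway?", facts))
print(relevant_facts("What is the price?", facts, {"price": ["cost"]}))
```

The deduction module would then run only on the returned subset rather than on the whole fact base.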
           Berkeley Initiative in Soft Computing (BISC)
                            Example Input File
Airway is Ideal for the home or small office.
The cost of calls to Spain is about 40p per minute.
Airway allows linking a number of telephones. Airway allows linking a
number of computers.
The cost of calls to Germany is a little more than the costs of calls to Spain.
The cost of calls to Germany will be probably a little more than the cost of
calls to Spain.
Airway allows linking a number of fax machines.
This sentence is written because we want to test the causal sentences.
Airways allows linking of telephones, computers, and fax machines into a
single system.
Most balls are large.
If this sentence comes all right then we can be sure that all If-then cases are
fine.
Airway does not require any internal telephone wiring.
Here most Swedes are tall.
The heart of Airway network is possibly the Controller.
Many large balls are heavy.
His son is quick.
Controller provides wireless links to the telephone handsets and sockets.
Data Sockets are used to connect computers. Phone Sockets are used to
connect faxes and traditional phones. Up to 16 handsets and sockets can be
connected to one Controller.
          Berkeley Initiative in Soft Computing (BISC)
                          Example Output File

XisA: Airway IS Ideal for the home or small office
XisA: The cost of calls to Spain IS about 40p per minute
Fact: Airway allows linking a number of telephones
Fact: Airway allows linking a number of computers
XisA: The cost of calls to Germany IS a little more than the costs of calls to
     Spain
XisrA: probably - The cost of calls to Germany IS a little more than the cost of
     calls to Spain
Fact: Airway allows linking a number of fax machines
CAUSE: we want to test the causal sentences, EFFECT: This sentence is
     written
Fact: Airways allows linking of telephones , computers , and fax machines
     into a single system
XisA: Most balls IS large
IF: this sentence comes all right, THEN: we can be sure that all If-then cases
     are fine
Fact: Airway does not require any internal telephone wiring
           Berkeley Initiative in Soft Computing (BISC)
                          Example Output File

XisA: Here most Swedes IS tall
XisrA: possibly - The heart of Airway network IS the Controller
XisA: Many large balls IS heavy
XisA: His son IS quick
Fact: Controller provides wireless links to the telephone handsets and sockets
Fact: Data Sockets are used to connect computers
Fact: Phone Sockets are used to connect faxes and traditional phones
Fact: Up to 16 handsets and sockets can be connected to one Controller
YisX+B: The cost of calls to Germany IS The cost of calls to Spain PLUS a little
    more than ; with similarity value 0.5
YisrX+B: probably - The cost of calls to Germany IS The cost of calls to Spain
    PLUS a little more than; with similarity value 0.7
QAs_are_Bs: most - balls ARE large
QAs_are_Bs: most - Here Swedes ARE tall
QAs_are_Bs: many - large balls ARE heavy
Q1Q2AsAreBandCs: (most x many) (balls)s are (large and heavy)s
size(Airway) IS small – WRONG (What to do?)
size(Most balls) IS large
weight(Many large balls) IS heavy
speed(His son) IS quick
              Berkeley Initiative in Soft Computing (BISC)

      Deduction Engine for PNL Based QA Systems

Deductions are made from recognized protoforms of
PNL.

[Diagram: precisiated expressions are mapped to protoforms; the Deduction / Reasoning Engine takes the protoforms, together with the DDB and background knowledge, and produces conclusions]
 Berkeley Initiative in Soft Computing (BISC)

           Multi-pipe Deduction


Fuzzy Reasoning
Deduction Rule
Concept Matching
Keyword Searching
    Berkeley Initiative in Soft Computing (BISC)
                 Fuzzy Reasoning

If-Then Fuzzy Rule Deduction

IF someone is well-educated, THEN his/her salary is
likely to be medium or high.

Prob(IF A.education = well-educated, THEN
A.salary = medium or high) is likely.

• The variables “education” and “salary” need to be defined.

• The values for these variables must be defined in order
  to proceed with fuzzy reasoning.
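
To make the point about defining variables and values concrete, here is a minimal sketch of one way such a rule could be evaluated once membership functions exist. Everything numeric is an assumption made for illustration: the breakpoints (12 to 18 years of education, 30,000 to 60,000 salary) are invented, the rule is evaluated Mamdani-style with `min`, and the probability qualifier "likely" from the slide is not modelled.

```python
def ramp(x, lo, hi):
    """Membership that rises linearly from 0 at lo to 1 at hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def well_educated(years):
    # Hypothetical definition: 12 years of schooling -> 0, 18 years -> 1
    return ramp(years, 12, 18)


def medium_or_high(salary):
    # Hypothetical definition: 30,000 -> 0, 60,000 -> 1
    return ramp(salary, 30_000, 60_000)


def rule_output(years, salary):
    """Mamdani-style evaluation: the consequent membership of `salary`
    is clipped by the firing strength of the antecedent."""
    return min(well_educated(years), medium_or_high(salary))


print(well_educated(15))           # firing strength of the antecedent
print(rule_output(15, 60_000))     # consequent clipped to that strength
```

Without definitions such as `well_educated` and `medium_or_high`, the rule cannot be evaluated numerically, which is why the deduction falls back on the non-fuzzy rules in that case.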
     Berkeley Initiative in Soft Computing (BISC)
                    Deduction Rules


Deduction rules apply when the relevant terms are NOT
defined by fuzzy variables.

XisA Form: if we are asked “what is X”, the answer is
A.

[Diagram: “Education is well-educated” leads to “Salary is medium or high”; concept matching connects “John has good education” to this rule]
  Berkeley Initiative in Soft Computing (BISC)
             Keyword Searching



Many sentences are not recognizable PNL
protoforms; we treat them as facts.

If none of the approaches mentioned above
applies, we can still fall back on
keyword searching.

At a minimum, the QA system is a
keyword-based information retrieval system.
    Berkeley Initiative in Soft Computing (BISC)




        Energy – Earth Sciences
Intelligent Reservoir Characterization
Berkeley Initiative in Soft Computing (BISC)

                  IRESC
 Risk Assessment and Well Placement
[Figure: Similarity Map with a Degree of Confidence color scale (50 to 90). Axis ticks run from 20 to 160 and from 50 to 300. Well labels: Hansen 1-6, H 1-6, H B-6, W O-6, A N-6. Legend entries: Non, High.]
Berkeley Initiative in Soft Computing (BISC)




Scientific Data and Simulation
            Berkeley Initiative in Soft Computing (BISC)
Center for Computational Machine Intelligence and Systems Science

• Scientific instruments and numerical simulation models
  generate massive amounts of multi-spectral spatio-temporal
  image data
   – Difficult to visualize and summarize
• With meaningful high level representation: data

For example, simulated sea surface temperature data at a typical
ocean model resolution of 100km (horizontal grid size of 384x320)
from a 100-year simulation run with a 24-hour sampling interval
yields, through singular value decomposition, a 36,500-by-36,500
matrix, which is roughly 1.3 billion (1.3 × 10^9) points. Even at
this resolution, standard principal component analysis (PCA) may
require sampling in a high performance computing environment.

[Diagram: multi-level representation supporting feature-based
query, predictive models, quantitative analysis, feature-based
visualization, and comparative analysis]

                       B. Parvin and Masoud Nikravesh; vision.lbl.gov
                  Berkeley Initiative in Soft Computing (BISC)
Center for Computational Machine Intelligence and Systems Science

 Feature-based Representation of Spatio-temporal data




Feature-based representation of Oceanography simulation data
                                                               Feature-based representation of core collapse super nova simulation data




                                 B. Parvin and Masoud Nikravesh; vision.lbl.gov
              Berkeley Initiative in Soft Computing (BISC)
  Center for Computational Machine Intelligence and Systems Science

Teraflop-scale computation on scientific data is going to be routine

•For example, simulated sea surface temperature data at a typical ocean model
resolution of 100km (horizontal grid size of 384x320) from a 100-year simulation run
with a 24-hour sampling interval yields, through singular value decomposition, a
36,500-by-36,500 matrix, which is roughly 1.3 billion (1.3 × 10^9) points. Even at this
resolution, standard principal component analysis (PCA) may require sampling in a high
performance computing environment.
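
The sizes in this example follow from simple arithmetic, sketched below. The variable names are illustrative, and leap days are ignored (365 daily samples per year is an assumption).

```python
# Horizontal grid of the ocean model
grid_points = 384 * 320                    # 122,880 grid cells

# Daily sampling over a 100-year run (leap days ignored)
time_samples = 100 * 365                   # 36,500 snapshots

# Raw space-time data matrix: one row per grid cell, one column per snapshot
data_entries = grid_points * time_samples  # 4,485,120,000 values

# Time-by-time matrix used in the decomposition
square_entries = time_samples ** 2         # 1,332,250,000 values

print(f"{grid_points:,} grid points x {time_samples:,} samples = {data_entries:,} values")
print(f"{time_samples:,} x {time_samples:,} matrix = {square_entries:,} values")
```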

•An example is CCSM3, which was used to provide a suite of simulations for the
Fourth Assessment report of the Intergovernmental Panel on Climate Change (IPCC).
CCSM3 provided simulations with the unprecedented atmosphere resolution of 180km.
Over 7.5 TB of data are produced in each 100-year simulation of this model, with most
data output only at monthly intervals. Several 100-year integrations are required to
simulate all of the IPCC future emission scenarios.
• Climate
   – Now: 20-40TB per simulated year
   – 5 yrs: 100TB/yr 5-10PB/yr
• Astrophysics
   – Now and 5 yrs: Can soak up anything!
• Fusion
   – Now: 100Mbytes/15min
   – 5 yrs: 1000Mbytes/2 min
         Berkeley Initiative in Soft Computing (BISC)
Center for Computational Machine Intelligence and Systems Science
 Algorithmic Complexity:
 Calculate Means              O(n)
 Calculate FFT                O(n log n)
 Calculate PCA                O(r • c)
 Kernelized CCA               O(n m^2)
 Hierarchical Clustering      O(n^2)
 Decision Tree Induction      O(m n log n) + O(n (log n)^2)

 High Performance Computer: 1 Tflop/sec == 7000 1GHz computers

 Data size, n     Algorithm complexity
                  n             n log(n)       n^2
 1 MB             10^-6 sec     10^-5 sec      1 sec
 100 MB           10^-4 sec     10^-3 sec      3 hours
 1 GB             10^-3 sec     10^-2 sec      12 days
 10 GB            10^-2 sec     0.1 sec        3 years
 100 GB           0.1 sec       1 sec          317 years
 1 TB             1 sec         10 sec         31,710 years
 10 TB            10 sec        100 sec        3,171,000 years
 100 TB           100 sec       1000 sec       317,100,000 years
 1 PB             1000 sec      3 hours        31,710,000,000 years
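
The n and n^2 columns of the table can be reproduced with a short calculation. This is a back-of-the-envelope sketch under two assumptions that the slide does not state explicitly: each byte counts as one data item, and each elementary operation costs one floating-point operation on a machine sustaining 1 Tflop/sec. The function name `runtime_seconds` is hypothetical.

```python
OPS_PER_SEC = 1e12  # 1 Tflop/sec, as on the slide


def runtime_seconds(n, power):
    """Seconds needed for n**power operations at OPS_PER_SEC."""
    return n ** power / OPS_PER_SEC


SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY

sizes = {"1 MB": 1e6, "1 GB": 1e9, "10 GB": 1e10, "1 TB": 1e12}
for label, n in sizes.items():
    linear = runtime_seconds(n, 1)
    quadratic = runtime_seconds(n, 2)
    print(f"{label:>6}: O(n) {linear:.0e} sec, "
          f"O(n^2) {quadratic / SECONDS_PER_DAY:.3g} days "
          f"({quadratic / SECONDS_PER_YEAR:.3g} years)")
```

The quadratic column grows by a factor of 100 for every tenfold increase in data size, which is why O(n^2) methods such as hierarchical clustering stop being practical well before the terabyte range.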
       Berkeley Initiative in Soft Computing (BISC)

     Mining Associations in Earth Science Data: Challenges




How to transform Earth Science data into transactions?
                    Berkeley Initiative in Soft Computing (BISC)
    Center for Computational Machine Intelligence and Systems Science
      Ocean and Land Temperature (Jan 1982)

[Figure: global temperature map on a latitude/longitude grid. A stack of variables (NPP, Pressure, Precipitation, SST, ...) is shown over time, once for a single grid cell and once for a zone.]


Vipin Kumar
         Berkeley Initiative in Soft Computing (BISC)

                 Natural Language Computing
                    FC-DNA as a basis for
•Common Sense Knowledge, Human Reasoning and Deduction
•Next Generation of Concept-Based Search Engine
       Berkeley Initiative in Soft Computing (BISC)




Knowledge Mobilization and Intelligent Augmentation
                    (KnowMInA)

        Application to Health Care and
    Defense & Security (Homeland Security)
                                           Berkeley Initiative in Soft Computing (BISC)




Knowledge Mobilization for Health care and Defense & Security

[Diagram, side label "Knowledge mobilization"; labels are grouped by their approximate position in the original figure]

• Embedded Systems
   – Intelligent Augmentation
   – Multi-agent systems
   – EDSS for decision alternatives
   – Information-alert
   – Recurrent Fuzzy Logic (and similar Algorithms) to include State Related Information
• Soft Computing Technology for Knowledge Mobilization
   – Q&A (Reasoning and deduction)
   – Precisiated Natural Languages and Computing with Words
   – Deduction from query-relevant information
   – Organization of world knowledge; Epistemic (knowledge directed) and Lexicon (EL) (Ontology related)
   – Behavior/Profile Modeling (Fuzzy Semantic net)
   – Extracting behavior from Multimedia data (TIKManD)
   – Advanced summarization capabilities using profile modeling
   – Rendering Huge Data into Relevant Data to help Summarization
• Human-Machine Interaction (User and Graphic-visualization)
   – Visual Analytics
   – Feature-based Data Mining
   – Data Visualization and Visual Interactive Decision Making
• Others?
   – Navigation including Navigation using Voice (VoIP)
   – Clarification Dialog and Ambiguity Resolution
          Berkeley Initiative in Soft Computing (BISC)
Berkeley International Institute for System and Computational Intelligence