

									Mária Markošová
1.   Language lexicon as a network
2.   Functional brain network
3.   Network of sexual contacts
4.   Computer science applications
5.   Image processing networks
6.   Exam: an example

word = node of a graph
                                               word net
connection between words = graph edge

How can the connection between words be defined?
Two possible word nets:

1. Conceptual (related to semantics) – a word is a node, and it is connected
    to all words that appear in its entry in an explanatory dictionary
    (and that are themselves entries in the dictionary).

                Motter et al 2002

2. Positional (related to syntax) – a word is a node, and it is connected to
    all words that are its neighbors in the text.

                Ferrer, Solé, 2001
                Dorogovtsev, Mendes, 2001
       Conceptual word net


     paw                dog

           fur                animal

Properties of the conceptual word web:

-has small world character

                             N        k_aver   C_aver   l

    Conceptual network       30 244   59.9     0.53     3.16
    Random network           30 244   59.9     0.002    2.5

N - number of nodes, k_aver - average degree, C_aver - average clustering
coefficient, l - average shortest distance

-Does not have scale free character – degree distribution is not power law
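
The two small world quantities from the table above, C_aver and l, can be computed as in the following minimal sketch. The edges among the four example words are hypothetical (the links of the slide's drawing are not given in the text), and the function names are chosen here for illustration only.

```python
from collections import deque

def average_clustering(adj):
    """Average over all nodes of the fraction of neighbour pairs that are linked."""
    total = 0.0
    for nbrs in adj.values():
        nb = list(nbrs)
        k = len(nb)
        if k < 2:
            continue  # a node with fewer than two neighbours contributes 0
        linked = sum(1 for a in range(k) for b in range(a + 1, k)
                     if nb[b] in adj[nb[a]])
        total += 2.0 * linked / (k * (k - 1))
    return total / len(adj)

def average_shortest_distance(adj):
    """Average BFS distance over all pairs of distinct, mutually reachable nodes."""
    dist_sum = 0
    pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        dist_sum += sum(dist.values())
        pairs += len(dist) - 1
    return dist_sum / pairs

# Hypothetical edges among the four words of the conceptual word net example.
edges = [("paw", "dog"), ("paw", "fur"), ("dog", "fur"),
         ("dog", "animal"), ("fur", "animal")]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

c_aver = average_clustering(adj)
l_aver = average_shortest_distance(adj)
print(c_aver, l_aver)
```

A network is usually called small world when C_aver is much larger than in a random graph of the same size and average degree while l stays comparably small, which is the pattern in the table.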
           Positional, syntactical word web

  I suppose, tomorrow will be a wonderful day.

Words that are neighbors in a sentence are taken as mutually connected.
An additional condition can be imposed: $p_{ij} > p_i p_j$

 $p_{ij}$ - probability of mutual occurrence of the i-th and j-th word

 $p_i$ - probability of occurrence of the i-th word
 $p_i p_j$ - probability of random occurrence of the i-th and j-th word together
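
A minimal sketch of how a positional word web could be built from text. The tokenisation rule and the way the probabilities are estimated (relative frequencies of words and of adjacent pairs) are assumptions made for this illustration, not the procedure of the cited authors.

```python
import re
from collections import Counter

def positional_word_web(sentences, restrict=False):
    """Return the set of undirected edges between words adjacent in a sentence.

    With restrict=True an edge is kept only if p_ij > p_i * p_j, where the
    probabilities are estimated as relative frequencies (an assumption).
    """
    tokenised = [re.findall(r"[a-z']+", s.lower()) for s in sentences]
    word_count = Counter(w for words in tokenised for w in words)
    pair_count = Counter()
    for words in tokenised:
        for a, b in zip(words, words[1:]):
            if a != b:  # ignore a word followed by itself
                pair_count[frozenset((a, b))] += 1
    n_words = sum(word_count.values())
    n_pairs = sum(pair_count.values())
    edges = set()
    for pair, count in pair_count.items():
        a, b = tuple(pair)
        p_ij = count / n_pairs
        p_i = word_count[a] / n_words
        p_j = word_count[b] / n_words
        if not restrict or p_ij > p_i * p_j:
            edges.add(pair)
    return edges

web = positional_word_web(["I suppose, tomorrow will be a wonderful day."])
print(len(web))
```

For the example sentence, each of the eight words is linked to its neighbors, so the web is a chain of seven edges.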
National language corpus

A database which includes all possible information about a certain language,
e.g. all possible texts in all possible variants (slang, dialects, argot), in the
same ratio as they occur in the society.

The national language corpus thus represents the knowledge about the language.

Kernel lexicon: About 10 000 – 15 000 words which form the basis of a
certain language. These words are used by the majority of the population,
regardless of education, status, gender, etc.

Positional word web: Ferrer and Solé created a network on the basis of the
English national corpus and studied the properties of the network.

Properties of the positional word web: it has small world and scale free
 character (Ferrer, Solé, 2001)

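 Degree distribution of the positional word net

[Figure: log-log plot of P(k) against k, with k from 10^0 to 10^6. Two power law regimes are marked: exponent 1.5 at small degrees and exponent 2.7 at large degrees (kernel lexicon).]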
Dorogovtsev - Mendes model
(Dorogovtsev, Mendes, 2001b)

                               Preferential attachment

                               New links between old
                               nodes - preferential
k s, t                  k s, t 
            1  2ct  t
                         du k u, t 

                  2ct new ends of edges among old sites

      one edge of a new coming node

                      The edges are preferentially distributed
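Solution with the initial condition $k(s,s) = 1$:

$$k(s,t) = \left(\frac{ct}{cs}\right)^{1/2} \left(\frac{2+ct}{2+cs}\right)^{3/2}$$

so that $k \propto s^{-1/2}$ for $cs \ll 2$ and $k \propto s^{-1/2}\, s^{-3/2} = s^{-2}$ for $cs \gg 2$.

$$P(k) \propto k^{-\gamma}$$

 $\gamma = 3$   - for great degrees, $s \ll t$          Data: $\gamma = 2.7$
 $\gamma = 1.5$ - for small degrees, $s \sim t$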
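
As a hedged numerical check of the Dorogovtsev - Mendes result, the sketch below integrates the growth equation and compares it with the closed form $k(s,t) = (t/s)^{1/2}\,((2+ct)/(2+cs))^{3/2}$. It assumes that the total degree is $\int_0^t du\, k(u,t) = 2t + ct^2$ (one new-node edge and $ct$ edges among old nodes per time step) and uses the standard continuum relation $\gamma = 1 + 1/\beta$ for $k \propto s^{-\beta}$. The function names and parameter values are illustrative only.

```python
def k_closed(s, t, c):
    """Closed-form degree at time t of the node born at time s, with k(s, s) = 1."""
    return (t / s) ** 0.5 * ((2 + c * t) / (2 + c * s)) ** 1.5

def k_numeric(s, t, c, steps=20000):
    """RK4 integration of dk/dt = (1 + 2ct) k / (2t + ct^2) starting from k(s, s) = 1."""
    def rhs(time, k):
        return (1 + 2 * c * time) * k / (2 * time + c * time * time)
    h = (t - s) / steps
    k, time = 1.0, s
    for _ in range(steps):
        k1 = rhs(time, k)
        k2 = rhs(time + h / 2, k + h * k1 / 2)
        k3 = rhs(time + h / 2, k + h * k2 / 2)
        k4 = rhs(time + h, k + h * k3)
        k += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        time += h
    return k

def beta(s, c):
    """Local exponent -d ln k / d ln s of the closed form."""
    return 0.5 + 1.5 * c * s / (2 + c * s)

def gamma(s, c):
    """Degree distribution exponent from gamma = 1 + 1 / beta."""
    return 1 + 1 / beta(s, c)

print(k_numeric(10, 1000, 0.01), k_closed(10, 1000, 0.01))
print(gamma(1e-6, 1.0), gamma(1e9, 1.0))  # old nodes, young nodes
```

The two limits of `gamma` correspond to the two regimes on the slide: old nodes with large degree approach 3, young nodes with small degree approach 1.5.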
1.   Is the small world property of the word web universal?
2.   What is the reason for the discrepancy between data and theory?

Markošová, Jazyk ako sieť malého sveta, in Jazyk a kognícia, Bratislava,
Kaligram (2005) 306
Markošová, Hrebčík, Orosi, Modelling language as a small world network, IPSI
Amsterdam, 1-4.9.2005
Markošová, Náther, Language as a small world network, Hybrid Intelligent
Systems, Neuro Computing and Evolving Intelligence, Auckland, 13. –15.
Markošová, Network model of human language, Physica A 387 (2008) 661
English language (positional network): British national corpus, n = 450 000 nodes
(Ferrer, Solé, 2001)

   Graph type      C_aver     C_rand            l        l_rand

                   0.687      1.55 × 10^-4      2.63     3.03
                   0.473      1.55 × 10^-4      2.67     3.06
Slovak language: positional word web based on various texts found on
the internet (Orosi, diploma thesis, 2004)

n = 59542 nodes

   Graph type      C_aver     C_rand            l        k_aver
      G1           0.369      3.36 × 10^-4      2.87     29.96
      G2           0.607      8.09 × 10^-4      2.62     53.01

             Nearest neighbor and next nearest neighbor interaction

             Nearest neighbor interaction
           Positional word web of The Bible (Náther, dissertation)

             Degree distribution scaling (log – log plots)


Our variant of Dorogovtsev Mendes process
Markošová 2008

                           Preferential attachment

                           New links between old
                           nodes - preferential

                            Old links are chosen
                            randomly and rewired
      k s, t                       k s, t 
                  m  2ct  mr  t
        t                                           t
                                    du k u, t 

new links added by
                           revired old links
new node
                                               revired old links

          new links between
          old nodes                             random exclusion

              preferential linking

The integral in the denominator is the sum of the degrees of all nodes in the
system (twice the number of edges). Rewiring processes do not influence this
sum, therefore it has the same form as in the Dorogovtsev – Mendes model:

$$\int_0^t k(s,t)\, ds = 2mt + ct^2$$

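For nodes of large degree the preferential terms dominate the random
exclusion term, and the solution takes the form:

$$k(s,t) = \left(\frac{t}{s}\right)^{\frac{m+m_r}{2m}} \left(\frac{2m+ct}{2m+cs}\right)^{2-\frac{m+m_r}{2m}}$$

 $s \ll t$:   $\beta = \frac{m+m_r}{2m}$,   $\gamma = 1 + \frac{1}{\beta} = 1 + \frac{2m}{m+m_r}$

 $s \sim t$:   $\beta = \frac{m+m_r}{2m} + 2 - \frac{m+m_r}{2m} = 2$,   $\gamma = 1.5$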
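
A short arithmetic sketch of what the rewiring term does to the large-degree exponent. It reads the slide's garbled formula as $\beta = (m + m_r)/(2m)$ and $\gamma = 1 + 1/\beta$, which is an assumption. The value 2.7 is the measured exponent quoted earlier, and the resulting ratio $m_r/m$ is plain algebra, not a fit reported by the author.

```python
def gamma_large_degree(m, m_r):
    """Exponent for s << t, assuming beta = (m + m_r) / (2m) and gamma = 1 + 1/beta."""
    beta = (m + m_r) / (2 * m)
    return 1 + 1 / beta

def rewiring_ratio_for(gamma):
    """Ratio m_r / m that yields the given large-degree exponent (2 < gamma <= 3)."""
    return 2 / (gamma - 1) - 1

print(gamma_large_degree(1, 0))      # no rewiring: the Dorogovtsev - Mendes value
print(rewiring_ratio_for(2.7))       # rewiring needed to reach the measured exponent
```

Without rewiring the exponent is 3. Under this reading, any positive $m_r$ lowers it towards 2, so a modest amount of rewiring is enough to move the theoretical value to the measured 2.7.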
Maria Markosova, Liz Franz, Lubica

    Department of Applied Informatics,
      Comenius University, Slovakia,
       Department of Psychology &
     Department of Computer Science
     University of Otago, New Zealand
   - Measures neural activity based on the ratio of
     de- and oxygenated haemoglobin (iron) in the blood

   - Blood Oxygenation Level Dependent (BOLD)

     neural activity → blood oxygen → fMRI signal
[Figure: fMRI acquisition geometry, sagittal slice and in-plane slice]

   Slice thickness: e.g., 6 mm
   Number of slices: e.g., 10
   Field of View (FOV): e.g., 19.2 cm
   Matrix size: e.g., 64 x 64
   In-plane resolution: e.g., 192 mm / 64 = 3 mm
   Voxel (volumetric pixel): 3 mm x 3 mm x 6 mm

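   - Networks of functional units (e.g., voxels) that
     temporarily self-organize to engage in
     a given task

   - If the temporal BOLD activity in two voxels is
     well correlated, a link is established

   - I.e., when the linear correlation coefficient exceeds
     some threshold, $|r(i,j)| \ge r_c$, where

$$r(i,j) = \frac{\langle V(i,t)\, V(j,t) \rangle - \langle V(i,t) \rangle \langle V(j,t) \rangle}{\sigma(V(i,t))\, \sigma(V(j,t))}$$

     Here $V(i,t)$ is the activity of voxel i at time t, $\langle \cdot \rangle$ is the average
     over time and $\sigma^2(V(i,t)) = \langle V(i,t)^2 \rangle - \langle V(i,t) \rangle^2$.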
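
The link rule can be sketched as follows, using the correlation coefficient exactly as defined above. The four short time series are invented toy data, not fMRI measurements, and the threshold 0.7 is chosen only for illustration.

```python
from math import sqrt

def correlation(x, y):
    """Linear correlation coefficient r of two equally long time series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum(a * b for a, b in zip(x, y)) / n - mean_x * mean_y
    var_x = max(sum(a * a for a in x) / n - mean_x ** 2, 0.0)
    var_y = max(sum(b * b for b in y) / n - mean_y ** 2, 0.0)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0  # a constant series carries no correlation information
    return cov / sqrt(var_x * var_y)

def functional_links(series, r_c):
    """Pairs (i, j), i < j, of voxels whose |r(i, j)| reaches the threshold r_c."""
    links = set()
    for i in range(len(series)):
        for j in range(i + 1, len(series)):
            if abs(correlation(series[i], series[j])) >= r_c:
                links.add((i, j))
    return links

# Toy activity of four "voxels" over six time points (invented numbers).
voxels = [
    [1, 2, 3, 4, 5, 6],
    [2, 4, 6, 8, 10, 12],   # follows voxel 0
    [6, 5, 4, 3, 2, 1],     # mirrors voxel 0
    [1, -1, 1, -1, 1, -1],  # unrelated oscillation
]
links = functional_links(voxels, r_c=0.7)
print(sorted(links))
```

Because the rule uses the absolute value of r, strongly anti-correlated voxels are linked as well, while the oscillating voxel stays isolated at this threshold.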
   - Previous studies indicated
     that fMRI functional
     networks have both scale-free
     and small world character

   - A scale-free network
     implies hubs with many links

   - A small-world network: many local
     clusters with occasional
     global interactions.

   - fMRI data were obtained for 4 healthy adults
    ◦ 6 cycles of task and rest periods, each lasting 20 s
    ◦ Task: bimanual finger tapping according to a 1 Hz
    ◦ 32, 728 voxels, i.e. [( Z = 1 to 8 slices) x (X = 64)]

   - We calculated $r_{i,j}$ for 80 million randomly chosen
     pairs of voxels whose raw activity was more than
     100 or 200 (the choice made no difference). If $|r_{i,j}| > r_c$, there is
     a functional link.


   - We calculated network characteristics for each
     subject, for task and rest data, in order to see
    ◦ difference between the task and rest condition
    ◦ difference between subjects

[Figure panels: Rest, Task]

   - Both rest and task functional networks have a small-
     world character across subjects (many local clusters with
     occasional long-range interactions; subjects listened to a tone);

   - There seem to be differences between subjects with
     respect to the scaling coefficient;

   - There seem to be differences between conditions (task /
     rest) in terms of the length of the linear portion of the

   - Functional networks seem to be on the edge between
     an exponential and a scale-free distribution (few hubs in rest,
     more hubs during task)

The first studies were done by Liljeros et al. (2001) on 2810 respondents
who represented the Swedish population.

Network: bipartite, nodes are males and females, edges = partnerships

Results: The power law degree distribution suggests preferential
         attachment of new nodes.

         Plausible mechanisms responsible for the structure:

         1. The skill of getting a new partner grows with the number of
            previous partners
         2. Different levels of attractiveness (more attractive people probably
            have more partners)
         3. The need to have more partners to maintain self-image
  What was analysed:

  1. Short time network (number of partners during one year).
  2. Network where the number of partners during lifetime was taken into
     account.

[Figure: two degree distributions in log-log plots, k from 1 to 1000. Left: short time network (gamma exponent 3.54 males, 3.31 females). Right: lifetime network (exponents 3.1 females, 2.6 males).]
    Other social networks
Movie Actor network:
            An actor is a node. If two actors play in the same movie,
            they are connected by an edge.

             This network has:
              a) small world structure with a high average clustering coefficient
                 and a small average shortest distance, which indicates that
                 actors tend to play with common partners and only some of
                 them are „universal“
              b) scale free structure, which is explained by the popularity
                 of some actors and the short „life time in movies“ of many actors.
The Kevin Bacon number shows the small world structure.

                                 Kevin Bacon has Bacon
                                 number 0.

                                 Julia Roberts and Tom Hanks
                                 have Bacon number
                                 equal to 1.

                                 Johnny Depp, Robin
                                 Williams ... have Bacon number
                                 equal to 2.
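
A Bacon number is the shortest path distance to Kevin Bacon in the actor network, so it can be found by breadth-first search. The sketch below uses invented cast lists with placeholder actor names; it does not reproduce real filmographies.

```python
from collections import deque
from itertools import combinations

def actor_graph(casts):
    """Link every pair of actors that appear in the same cast list."""
    adj = {}
    for cast in casts:
        for actor in cast:
            adj.setdefault(actor, set())
        for a, b in combinations(cast, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj

def distances_from(adj, source):
    """Breadth-first search distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Hypothetical casts: each inner list is one movie.
casts = [
    ["Kevin Bacon", "actor A", "actor B"],
    ["actor A", "actor C"],
    ["actor C", "actor D"],
]
bacon = distances_from(actor_graph(casts), "Kevin Bacon")
print(bacon)
```

In a small world network these distances stay small even when the number of actors is very large.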
Phone call Network:
               Nodes are users and mutual phone calls between users
               are edges (only if the call really took place, i.e. it has been
               received). The weights on the edges represent the duration of the
               call. The network has a power law degree distribution (it is scale free).
               Strong edges were inside communities of friends. Weak
               edges were between communities. Removing weak edges can
               disrupt the net.

[Figure: short time phone call network based on Orange data from one European country.]
Social network of dolphins (Lusseau 2003): Nodes are dolphins. If two
dolphins were seen together more often than expected by chance, they were
connected by an edge. The data were collected over 7 years.

Properties: The network has small world structure, indicating
            communities of the animals.

           The network has scale free structure: hubs were old females.

[Figure: original Lusseau network based on 62 dolphins living in Doubtful Sound, New Zealand.]
Email communication Ebel et al:

Important also from the point of view of computer virus spreading.

Two possible email networks:

A) Nodes are email addresses, link is established if mail is sent from the
   address i to the address j.
B) Nodes are email addresses, if the address j is in the address book of the user
   with the address i , link is established.

Both networks were studied as undirected. Data were collected from the email
   network of several universities.
Properties of the email networks:

Both networks have small world and scale free character. Both networks
show communities.

Braha and Bar Yam studied the dynamical changes of such a network and
showed that the average degree and betweenness change dramatically from day
to day (especially in network A). This shows that in such ad hoc dynamical
networks notions such as “hub” should be reinterpreted.

Models of worm spreading on the email network:

Spreading of email worms (Zou et al.): A worm is a malicious computer
program propagating through mail attachments. When the user clicks on the
attachment, the worm finds all email addresses stored on the computer and
sends a worm email to them.
The authors simulated the spreading on model B.
T_i - email checking time interval of the i-th user (the user with address i)
P_i - probability that the i-th user opens the attachment
The email checking time is a random variable with the mean value E(T_i). The
random variable T_i can have various distributions.

Model parameters:    A Gaussian distribution was used for the checking time
                     interval. A user always checks all new mails when he checks
                     the mailbox.
                     The probability of opening an attachment was constant for
                     each user, and among users it had a Gaussian distribution.

                     State of the user: infected, if he opens an attachment.
                     N_0 - number of initially infected users
                     N_t - number of infected users at time t
                     N_∞ - number of users who are not infected at all when
                           the worm propagation is over, because they
                           did not open the attachment
                     V - total number of email users
                     N_0 ≤ N_t ≤ V
                     E(N_t) - average number of infected users at time t
                reinfection model: the worm is sent out every time the
                                    user opens the attachment
                non reinfection model: the worm is sent out only once (after the
                                    first attachment opening).

Non reinfection case: User i having m_i neighbors gets at most m_i worm
                      copies, so the probability of his not being infected is at least:

                        $(1 - P_i)^{m_i}$

                       Let all users be equally likely to open the attachment, that is

                         $P_i = p$
                       Then the number of uninfected people, who never
                       open the attachment, is estimated as

$$E[N_\infty] \geq V \sum_{j=1}^{h} P(k=j)\,(1-p)^{j}$$

                       where P(k = j) is the probability that a user has j neighbors.
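
The estimate of the number of users who never get infected can be evaluated directly once a degree distribution is given. In the sketch below the three-point distribution, the total number of users and the opening probability are invented for illustration. The sum simply runs over all degrees present in the distribution, which is an assumption about the upper limit of the sum.

```python
def expected_never_infected(total_users, degree_distribution, p):
    """Evaluate V * sum_j P(k = j) * (1 - p)**j.

    degree_distribution maps a degree j to the probability P(k = j).
    """
    return total_users * sum(prob * (1 - p) ** degree
                             for degree, prob in degree_distribution.items())

# Hypothetical degree distribution: half of the users have one neighbor, etc.
toy_distribution = {1: 0.5, 2: 0.3, 3: 0.2}
estimate = expected_never_infected(1000, toy_distribution, p=0.5)
print(estimate)
```

Users with many neighbors contribute little to the sum, because the factor $(1-p)^j$ decays quickly with the degree j. Well-connected users are therefore the least likely to stay uninfected.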

[Figure: left panel, E[N_t] (in tens of thousands) against time (0 to 500) for the scale free network and for a random graph topology (same number of nodes and average degree as the scale free graph); right panel, average degree of infected users.]

In a scale free network, hubs are infected first. Therefore the worm
spreads faster. Because the network also has small world structure, the
average shortest distance between nodes is small and spreading is
more effective.
 Effect of email checking time distribution: The distribution has some effect, but not on
                       the shape of the E[N_t] curve (left figure on the previous slide). The worm
                       propagates faster if the email checking time is not
                       constant, but variable.

 Effect of selective immunization:

[Figure: E[N_t] (in tens of thousands) against time (0 to 500). Left panel: scale free network. Right panel: random graph (same number of nodes and average degree as the scale free net). Curves: no immunization; immunization of 5 percent randomly chosen nodes; immunization of the 5 percent most connected nodes.]
Immunization of email network against worm – summary

To prevent worm propagation, the most connected users should be
immunized first, because they are the most important infectors in the scale free
network.
Software architecture (Valverde et al. 2002): The authors have shown that
          an important class of networks derived from software architecture maps
          displays scale free and small world character due to the design
          optimization process.

Software architecture: the structure of the program. Nodes are software
components (classes), links are relations (interactions) among the components
described by the class diagram. A software component is thus a class.

The authors analyzed the class diagram of the public Java development framework
1.2, which is a large set of software components used by Java applications
and is also a highly optimized structure.
The software graph is defined by the set of nodes (classes) and links
(connections) between classes.

Software is developed by engineers in parallel: different people build
different parts, which are then connected together.

Heuristic rules: optimize communication among modules, optimize cost in
                 terms of wiring, avoid hubs.

The authors have found that a local optimizing process leads to the scale
free and small world structure of the class graph. In this local optimizing
process no preferential attachment has been implemented explicitly.

Question: Is local optimizing a new mechanism leading to the scale free
           structure?

           Is preferential node attachment hidden in the local optimizing
           process?
  Image processing analysis: Costa (2004)

An uncolored image contains various levels of gray ranging from 0 to 1
(normalized grey levels) in M x M = N pixels.
Pixels are nodes and every pair of pixels is connected, which gives N(N-1)/2
weighted edges.

Edges represent several types of interactions: light intensity, color components,
local shape, texture, spatial adjacency… The scalar values of these
features form a feature vector f. Each pixel is associated with its feature
vector. The weight of the edge connecting two nodes (pixels) i, j is

$w_{i,j} = \| f_i - f_j \|$, where $\| \cdot \|$ is the Euclidean norm.

 The weight denotes the visual dissimilarity of the pixels i, j.
We can create a matrix W of weights. Rows and columns are pixels, and the matrix
element $w_{ij}$ is the weight between pixel i and pixel j.

                pix1   pix2   pix3
        pix1    w11    w12    w13
    W = pix2    w21    w22    w23
        pix3    w31    w32    w33

Adjacency matrix A: A is the thresholded W. If the matrix element is smaller
than the threshold T, put one, if not, put zero (the diagonal weights are zero,
so the diagonal of A is one).

                pix1   pix2   pix3
        pix1     1      1      0
    A = pix2     1      1      1          - because w13, w31 > T
        pix3     0      1      1
 Image segmentation

The network is partitioned into connected components (using an appropriate
algorithm and the W and A matrices) according to the visual similarity of the
pixels, which leads to the image segmentation.

[Figure: images segmented for T = 0.05]
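
One simple way to partition the network is to label the connected components of the adjacency matrix A with a graph traversal. This is a generic sketch, not necessarily the algorithm used by Costa, and the 5 x 5 matrix is an invented example.

```python
def connected_components(adjacency):
    """Return a list with one component label per node of a 0/1 adjacency matrix."""
    n = len(adjacency)
    labels = [None] * n
    current = 0
    for start in range(n):
        if labels[start] is not None:
            continue  # already reached from an earlier start node
        labels[start] = current
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                if adjacency[u][v] and labels[v] is None:
                    labels[v] = current
                    stack.append(v)
        current += 1
    return labels

# Toy adjacency matrix: pixels 0-1-2 form a chain, pixels 3-4 form a pair.
A_toy = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
segments = connected_components(A_toy)
print(segments)
```

Each label corresponds to one image segment. A lower threshold T leaves fewer edges in A and therefore produces more, smaller segments.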
1. Calculate the average clustering coefficient of the graph:

2. Can a simple graph (with no multiple edges, loops or directed edges) exist
   that has 15 vertices, each of degree 5? Justify your answer mathematically.
3. A) Write the dynamic equation for the Barabási Albert model and describe the
   meaning of its individual terms.
   B) What does the degree distribution of the BA model look like, and what does it
   say about the structure of a network grown by the BA process?
4. Which ways of representing networks mathematically do you know?
   Write the adjacency matrix and the incidence matrix for the graph from problem 1.

5. Describe epidemic models and their properties.

                       We wish you a lovely
