Mária Markošová

1. Language lexicon as a network
2. Functional brain network
3. Network of sexual contacts
4. Computer science applications
5. Image processing networks
6. Exam: an example

Language: word net
word = node of a graph
connection between words = graph edge
How should the connection between words be defined?

Two possibilities for the word net
1. Conceptual (related to semantics): each word is a node, and it is connected to all words that appear in its entry in an explanatory dictionary (provided they are also entries of the dictionary). Motter et al., 2002
2. Positional (related to syntax): each word is a node, and it is connected to all words that are its neighbors in the text. Ferrer, Solé, 2001; Dorogovtsev, Mendes, 2001

Conceptual word net
[Figure: example of a conceptual word net with the nodes paw, dog, fur, animal, mammal]

Properties of the conceptual word web:
- It has small world character.

                     N        k_aver   C_aver   l
Conceptual network   30 244   59.9     0.53     3.16
Random network       30 244   59.9     0.002    2.5

N: number of nodes, k_aver: average degree, C_aver: average clustering coefficient, l: average shortest distance
- It does not have scale free character: the degree distribution is not a power law.

Positional, syntactical word web
Example sentence: I suppose, tomorrow will be a wonderful day.
Words that are neighbors in a sentence are taken as mutually connected. An additional condition can be:
$$p_{ij} > p_i p_j$$
$p_{ij}$: probability of the mutual occurrence of the i-th and j-th word
$p_i$: probability of occurrence of the i-th word
$p_i p_j$: probability of a random occurrence of the i-th and j-th word together

National language corpus
A database that includes all possible information about a certain language, e.g. all possible texts in all possible variants (slang, dialects, argots), in realistic proportions (the same as in society). The national language corpus thus represents the knowledge about the language.
Kernel lexicon: about 10 000 to 15 000 words that form the basis of a given language.
These words are used by the majority of the population, regardless of education, status, gender, etc.

Positional word web
Ferrer and Solé created a network on the basis of the British National Corpus and studied its properties.
Properties of the positional word web: it has small world and scale free character (Ferrer, Solé, 2001).

Degree distribution of the positional word net
[Figure: log-log plot of P(k) against k with two scaling regions, exponent 1.5 and exponent 2.7, separated by the crossover degree k_cross; the kernel lexicon is marked in the plot]

Dorogovtsev - Mendes model (Dorogovtsev, Mendes, 2001b)
- Preferential attachment of a new node
- New links between old nodes, also placed preferentially

$$\frac{\partial k(s,t)}{\partial t} = (1 + 2ct)\,\frac{k(s,t)}{\int_0^t du\, k(u,t)}$$

1: one edge of a newly arriving node
2ct: new ends of edges among old sites
The edges are distributed preferentially.

Solution:
$$k(s,t) = \left(\frac{ct}{cs}\right)^{1/2} \left(\frac{2+ct}{2+cs}\right)^{3/2}$$

$$P(k) \propto k^{-\gamma}$$
- for large degrees ($s \ll t$): $\gamma = 3$ (data: 2.7)
- for small degrees ($s \sim t$): $\gamma = 1.5$

Goals:
1. Is the small world property of the word web universal?
2. What is the reason for the discrepancy between data and theory?

Markošová, Jazyk ako sieť malého sveta, in Jazyk a kognícia, Bratislava, Kaligram (2005) 306
Markošová, Hrebčík, Orosi, Modelling language as a small world network, IPSI Amsterdam, 1.-4.9.2005
Markošová, Náther, Language as a small world network, Hybrid Intelligent Systems, Neuro Computing and Evolving Intelligence, Auckland, 13.-15.12.2006
Markošová, Network model of human language, Physica A 387 (2008) 661

English language (positional network): British National Corpus, n = 450 000 nodes (Ferrer, Solé, 2001)

Graph type   C_aver   C(rand)         l      l(rand)
UWN          0.687    1.55 x 10^-4    2.63   3.03
RWN          0.473    1.55 x 10^-4    2.67   3.06

Slovak language: positional word web based on various texts found on the internet (Orosi, diploma thesis, 2004), n = 59 542 nodes

Graph type   C_aver   C_rand          l      k_aver
G1           0.369    3.36 x 10^-4    2.87   29.96
G2           0.607    8.09 x 10^-4    2.62   53.01

Graph variants: nearest neighbor and next nearest neighbor interaction; nearest neighbor interaction.

Positional word web of The Bible (Náther, dissertation)
[Figure: degree distribution scaling, log-log plots of ln(N(k)) against ln(k)]

Our variant of the Dorogovtsev - Mendes process (Markošová, 2008)
- Preferential attachment of new nodes
- New links between old nodes, placed preferentially
- Old links are chosen randomly and rewired preferentially

$$\frac{\partial k(s,t)}{\partial t} = (m + 2ct + m_r)\,\frac{k(s,t)}{\int_0^t du\, k(u,t)} - \frac{m_r}{t}$$

m: new links added by the new node
2ct: new links between old nodes
$m_r$: rewired old links
First term: preferential linking. Second term: random exclusion.

Solution
Integral: it is the sum of the degrees of all nodes, i.e. every edge in the system counted twice.
Rewiring processes do not influence this sum, therefore it has the same form as in the Dorogovtsev - Mendes model:
$$\int_0^t k(u,t)\, du = 2mt + ct^2$$

$$k(s,t) \simeq \left(\frac{t}{s}\right)^{\frac{m+m_r}{2m}} \left(\frac{2m+ct}{2m+cs}\right)^{2-\frac{m+m_r}{2m}}$$

- for $s \ll t$: $\gamma = 1 + \frac{2m}{m+m_r}$
- for $s \sim t$: the two exponents add up to $\frac{m+m_r}{2m} + 2 - \frac{m+m_r}{2m} = 2$, so $\gamma = 1.5$

Maria Markosova, Liz Franz, Lubica Benuskova
Department of Applied Informatics, Comenius University, Slovakia
Department of Psychology & Department of Computer Science, University of Otago, New Zealand

fMRI
◦ Measures neural activity based on the ratio of deoxygenated and oxygenated haemoglobin (iron) in blood
◦ Blood Oxygenation Level Dependent (BOLD) signal
◦ neural activity -> blood oxygen -> fMRI signal

Voxel (volumetric pixel)
◦ Slice thickness: e.g., 6 mm
◦ In-plane resolution: e.g., 192 mm / 64 = 3 mm
◦ Number of slices: e.g., 10
◦ Matrix size: e.g., 64 x 64
◦ Field of view (FOV): e.g., 19.2 cm
[Figure: sagittal slice and in-plane slice showing a 3 mm x 3 mm x 6 mm voxel]
Source: http://psychology.uwo.ca/fmri4newbies/

Functional networks
Networks of functional units (e.g., voxels) that temporarily self-organize to engage in a given task.
If the temporal BOLD activity of two voxels is well correlated, a link is established between them, i.e., when the linear correlation coefficient reaches some threshold, $|r(i,j)| \ge r_c$, where
$$r(i,j) = \frac{\langle V(i,t)\,V(j,t)\rangle - \langle V(i,t)\rangle \langle V(j,t)\rangle}{\sigma(V(i,t))\,\sigma(V(j,t))}$$
Here V(i, t) is the activity of voxel i at time t, the angle brackets denote the time average and $\sigma$ the standard deviation over time.

Previous studies indicated that fMRI functional networks have both scale-free and small world properties.
◦ Scale-free network: implies hubs with many connections.
◦ Small-world network: many local clusters with occasional global interactions.

fMRI data were obtained for 4 healthy adult subjects
◦ 6 cycles of task and rest periods, each lasting 20 s
◦ Task: bimanual finger tapping according to a 1 Hz tone
◦ 32,768 voxels, i.e. (Z = 1 to 8 slices) x (X = 64) x (Y = 64)
We calculated r(i, j) for 80 million randomly chosen pairs of voxels whose raw activity was more than 100 or 200 (the choice made no difference). If |r(i, j)| > r_c, there is a functional link.
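The link rule above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis pipeline: the voxel names, the toy time series and the threshold value 0.7 are hypothetical, and all pairs are compared instead of a random sample of pairs.

```python
import math
import random


def pearson_r(x, y):
    """Linear correlation r = (<xy> - <x><y>) / (sigma_x * sigma_y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    mean_xy = sum(a * b for a, b in zip(x, y)) / n
    var_x = sum(a * a for a in x) / n - mean_x ** 2
    var_y = sum(b * b for b in y) / n - mean_y ** 2
    if var_x <= 0 or var_y <= 0:
        return 0.0  # a constant series is treated as uncorrelated
    return (mean_xy - mean_x * mean_y) / math.sqrt(var_x * var_y)


def functional_links(series, r_c):
    """Return the set of voxel pairs (i, j) with |r(i, j)| >= r_c."""
    ids = sorted(series)
    links = set()
    for a in range(len(ids)):
        for b in range(a + 1, len(ids)):
            i, j = ids[a], ids[b]
            if abs(pearson_r(series[i], series[j])) >= r_c:
                links.add((i, j))
    return links


# Hypothetical toy data: v1 and v2 follow the same slow oscillation,
# v3 is pure noise around a constant baseline.
random.seed(0)
drive = [math.sin(2 * math.pi * t / 40) for t in range(120)]
series = {
    "v1": [100 + 5 * s + random.gauss(0, 0.5) for s in drive],
    "v2": [120 + 4 * s + random.gauss(0, 0.5) for s in drive],
    "v3": [110 + random.gauss(0, 1) for _ in drive],
}
links = functional_links(series, r_c=0.7)
print(links)
```

The resulting set of links is the edge list of the functional network; degree distribution, clustering coefficient and shortest paths can then be computed on it.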
We calculated network characteristics for each subject, for task and rest data, in order to see
◦ the difference between the task and rest conditions
◦ the difference between subjects

[Figures: results for the rest and task conditions]

◦ Both rest and task functional networks have a small-world character across subjects (many local clusters with occasional long-range interactions); subjects listened to the tone.
◦ There seem to be differences between subjects with respect to the scaling coefficient.
◦ There seem to be differences between conditions (task / rest) in terms of the length of the linear portion of the plot.
◦ Functional networks seem to be on the edge between an exponential and a scale-free degree distribution (few hubs at rest, more hubs during the task).

Network of sexual contacts: the first studies were done by Liljeros et al. (2001) on 2810 respondents representing the Swedish population.
Network: bipartite; nodes are males and females, edges = partnerships.
Results: the power law degree distribution suggests possible preferential attachment of new nodes.
Plausible mechanisms responsible for the structure:
1. The skill of getting a new partner grows with the number of partners.
2. Different levels of attractiveness (more attractive people probably have more partners).
3. The need to have more partners to maintain self-image.
What was analysed:
1. Short time network (number of partners during one year).
2. Network in which the number of partners during the lifetime was taken into account.
[Figure: two log-log plots of P(k) against k. Left: short time network (exponent gamma 3.54 males, 3.31 females). Right: lifetime network (exponents 3.1 females, 2.6 males)]

Other social networks
Movie actor network: an actor is a node. If two actors play in the same movie, they are connected by an edge.
This network has:
a) small world structure with a high average clustering coefficient and a small average shortest distance, which indicates that actors tend to play with common partners and only some of them are "universal";
b) scale free structure, which is explained by the popularity of some actors and the short "lifetime in movies" of many actors.
The Kevin Bacon number shows the small world structure. Kevin Bacon has Bacon number 0. Julia Roberts and Tom Hanks have Bacon number 1. Johnny Depp, Robin Williams ... have Bacon number 2.

Phone call network: nodes are users, and mutual phone calls between users are edges (only if the call really took place, i.e. it was received). The weights on the edges represent the duration of the calls. The network has a power law degree distribution (it is scale free). Strong edges were inside communities of friends; weak edges were between communities. Removing weak edges can disrupt the net.
[Figure: short time phone call network based on Orange data from one European country]

Social network of dolphins (Lusseau, 2003): nodes are dolphins. If two dolphins were seen together more often than expected by chance, they were connected by an edge. The data were collected over 7 years.
Properties:
- The network has small world structure, indicating communities of the animals.
- The network has scale free structure: the hubs were old females.
[Figure: original Lusseau network based on 62 dolphins living in Doubtful Sound, New Zealand]

Email communication (Ebel et al.): also important from the point of view of computer virus spreading.
Two possible email networks:
A) Nodes are email addresses; a link is established if mail is sent from address i to address j.
B) Nodes are email addresses; a link is established if address j is in the address book of the user with address i.
Both networks were studied as undirected. Data were collected from the email networks of several universities.
Properties of the email networks:
- Both networks have small world and scale free character.
- Both networks show communities.
Braha and Bar-Yam studied dynamical changes of such networks and showed that the average degree and betweenness change dramatically from day to day (especially in network A). This shows that notions such as "hub" should be reinterpreted in such ad hoc dynamical networks.

Models of worm spreading on the email network
Spreading of email worms (Zou et al.): a worm is a malicious computer program propagating through mail attachments. When a user clicks on the attachment, the worm finds all email addresses stored on the computer and sends a worm email to them. The authors simulated the spreading on model B.
$T_i$: email checking time interval of the i-th user (the user with address i)
$P_i$: probability that the i-th user opens the attachment
The email checking time is a random variable with mean value $E(T_i)$. The random variable $T_i$ can have various distributions.

Model parameters:
- A Gaussian distribution was used for the checking time interval.
- A user always checks all new mails when checking the mailbox.
- The probability of opening an attachment is constant for each user; among users it has a Gaussian distribution.
- State of the user: infected, if he or she opens an attachment.

$N_0$: number of initially infected users
$N_t$: number of infected users at time t
$N_h$: number of users who are not infected at all when the worm propagation is over, because they did not open the attachment
V: total number of email users
$$N_0 \le N_t \le V$$
$E(N_t)$: average number of infected users at time t
Reinfection model: the worm is sent repeatedly, each time the user opens the attachment.
Non-reinfection model: the worm is sent only once (after the first attachment opening).
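The non-reinfection model described above can be sketched as a small discrete-time simulation. This is a simplified illustration, not the simulator of Zou et al.: the function name, the time discretization, the clipping of the Gaussian draws and all default parameter values are assumptions made for the example.

```python
import random


def simulate_worm(neighbors, initially_infected, mean_check=40.0, sd_check=20.0,
                  mean_open=0.5, sd_open=0.3, t_max=500, seed=0):
    """Return the list N_t (number of infected users) for t = 0..t_max.

    neighbors: dict mapping each user to the users in the address book
    (model B, treated as undirected).
    """
    rng = random.Random(seed)
    users = list(neighbors)
    # Per-user parameters, drawn once from Gaussians and clipped
    # to valid ranges (the clipping is an assumption of this sketch).
    check_interval = {u: max(1, round(rng.gauss(mean_check, sd_check))) for u in users}
    open_prob = {u: min(1.0, max(0.0, rng.gauss(mean_open, sd_open))) for u in users}
    next_check = {u: rng.randint(1, check_interval[u]) for u in users}
    inbox = {u: 0 for u in users}  # unread worm emails per user

    infected = set(initially_infected)
    for u in infected:
        for v in neighbors[u]:
            inbox[v] += 1

    history = [len(infected)]
    for t in range(1, t_max + 1):
        newly_infected = []
        for u in users:
            if next_check[u] != t:
                continue
            next_check[u] = t + check_interval[u]
            copies, inbox[u] = inbox[u], 0  # the user reads all new mail
            if u in infected:
                continue  # non-reinfection: the worm is sent only once
            if any(rng.random() < open_prob[u] for _ in range(copies)):
                newly_infected.append(u)
        for u in newly_infected:
            infected.add(u)
            for v in neighbors[u]:
                inbox[v] += 1
        history.append(len(infected))
    return history


# Hypothetical example: a ring of 10 users, user 0 infected at t = 0.
ring = {i: [(i - 1) % 10, (i + 1) % 10] for i in range(10)}
n_t = simulate_worm(ring, initially_infected=[0], t_max=200)
print(n_t[0], n_t[-1])
```

Averaging the returned curves over many seeds gives an estimate of $E(N_t)$; replacing the ring by a scale free or a random graph allows the two topologies to be compared.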
Non-reinfection case: user i with $m_i$ neighbors gets at most $m_i$ worm copies, and the probability of not being infected is
$$(1 - P_i)^{m_i}$$
Let all users be equally likely to open the attachment, that is $P_i = p$. Then the number of uninfected users, who never open the attachment, is estimated as
$$E[N_h] = V \sum_{j \ge 1} P(k = j)\,(1 - p)^j$$
where $P(k = j)$ is the probability that a user has j neighbors.

Results:
[Figures: $E[N_t]$ (in tens of thousands) against time for the scale free network topology and for a random graph with the same number of nodes and average degree as the scale free graph; average degree of infected users against time]
In the scale free network, the hubs are infected first, therefore the worm spreads faster. Because the network also has small world structure, the average shortest distance between nodes is small and the spreading is more effective.

Effect of the email checking time distribution: this has some effect, but not on the shape of the curve in the left figure on the previous slide. The worm propagates faster if the email checking time is variable rather than constant.

Effect of selective immunization:
[Figures: $E[N_t]$ (in tens of thousands) against time for the scale free network and for a random graph with the same number of nodes and average degree as the scale free net. Curves: no immunization; immunization of 5 percent of randomly chosen nodes; immunization of the 5 percent most connected nodes]

Immunization of the email network against worms: summary
To prevent worm propagation, the most connected users should be immunized first, because they are the most important infectors in the scale free network.

Software architecture (Valverde et al., 2002): the authors showed that an important class of networks derived from software architecture maps displays scale free and small world character due to the design optimization process.
Software architecture: the structure of the program. Nodes are software components (classes); links are relations (interactions) among the components, as described by the class diagram. A software component is thus a class.
The authors analyzed the class diagram of the public Java development framework 1.2, which is a large set of software components used by Java applications and is also a highly optimized structure. The software graph is defined by the set of nodes (classes) and links (connections) between classes.
Software is developed by engineers in parallel: different people build different parts, which are then connected together.
Heuristic rules: optimize communication among modules, optimize cost in terms of wiring, avoid hubs.
The authors found that a local optimizing process leads to the scale free and small world structure of the class graph. In this local optimizing process no preferential attachment was implemented explicitly.
Questions: Is local optimization a new mechanism leading to scale free architecture? Is preferential node attachment hidden in the local optimization process?

Image processing analysis: Costa (2004)
An uncolored image of M x M = N pixels containing various levels of gray ranging from 0 to 1 (normalized gray levels). Pixels are nodes, and every pixel is connected to every other pixel, giving N(N-1)/2 weighted edges. Edges represent several types of interactions: light intensity, color components, local shape, texture, spatial adjacency, ... The scalar values of these features form a feature vector f. Each pixel is associated with its feature vector. The weight of the edge connecting two nodes (pixels) i, j is
$$w_{ij} = \lVert f_i - f_j \rVert$$
where $\lVert \cdot \rVert$ is the Euclidean norm. The weight denotes the visual dissimilarity of pixels i and j.
We can create the matrix W of weights. Rows and columns are pixels, and the matrix element $w_{ij}$ is the weight between pixel i and pixel j.

W =     pix1   pix2   pix3
pix1    w11    w12    w13
pix2    w21    w22    w23
pix3    w31    w32    w33

Adjacency matrix A: A is the thresholded W. If the matrix element does not exceed the threshold T, put one; otherwise put zero. An edge thus links visually similar pixels, and the diagonal of A consists of ones because $w_{ii} = 0$.

A =     pix1   pix2   pix3
pix1    1      1      0
pix2    1      1      1
pix3    0      1      1

The zeros appear because $w_{13}, w_{31} > T$.

Image segmentation
The network is partitioned into connected components (using an appropriate algorithm and the W and A matrices) according to the visual similarity of the pixels, which leads to the image segmentation.
[Figure: images and their segmentations for T = 0.05]

Exam: an example
1. Compute the average clustering coefficient of the graph: [Figure: graph for problem 1]
2. Can there exist a simple graph (no multiple edges, loops or directed edges) that has 15 vertices, each of degree 5? Justify your answer mathematically.
3. A) Write the dynamical equation for the Barabási-Albert model and describe the meaning of its individual terms. B) What does the node degree distribution of the BA model look like, and what does it say about the structure of a network grown by the BA process?
4. Which ways of representing networks mathematically do you know? Write the adjacency matrix and the incidence matrix for the graph from problem 1.
5. Describe epidemic models and their properties.
We wish you happy holidays.