International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367(Print),
ISSN 0976 – 6375(Online), Volume 4, Issue 1, January-February (2013), © IAEME

INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET)
ISSN 0976 – 6367(Print)
ISSN 0976 – 6375(Online)
Volume 4, Issue 1, January-February (2013), pp. 313-317
© IAEME: www.iaeme.com/ijcet.asp
Journal Impact Factor (2012): 3.9580 (Calculated by GISI)
www.jifactor.com




      MULTIMODE NETWORK BASED EFFICIENT AND SCALABLE
         LEARNING OF COLLECTIVE BEHAVIOR: A SURVEY

                 Vibha B. Lahane¹, Prof. Santoshkumar V. Chobe²
         ¹ Department of Computer Engineering, Pad. Dr. DYPIET Pimpri, Pune
         ² Department of Computer Engineering, Pad. Dr. DYPIET Pimpri, Pune



  ABSTRACT

          The study of collective behavior seeks to understand how individuals behave in a
  social networking environment. Oceans of data generated by social media such as Facebook,
  Twitter, Flickr, and YouTube present opportunities and challenges for studying collective
  behavior on a large scale. The goal is to learn to predict collective behavior in social
  media: given information about some individuals, how can the behavior of unobserved
  individuals in the same network be inferred? A social-dimension-based approach has been
  shown to be effective in addressing the heterogeneity of connections present in social
  media. However, the networks in social media are normally of colossal size, involving
  hundreds of thousands of actors. The scale of these networks entails scalable learning of
  models for collective behavior prediction. To address the scalability issue, an edge-centric
  clustering scheme is proposed to extract sparse social dimensions. With sparse social
  dimensions, the proposed approach can efficiently handle networks of millions of actors
  while demonstrating prediction performance comparable to that of other, non-scalable
  methods.

  I. INTRODUCTION

  a) Collective Behavior
          Collective behavior refers to the behaviors of individuals in a social networking
  environment, but it is not simply the aggregation of individual behaviors. In a connected
  environment, individuals’ behaviors tend to be interdependent, influenced by the behavior of
  friends. This naturally leads to behavior correlation between connected users. Take marketing
  as an example: if our friends buy something, there is a better-than-average chance that we
  will buy it, too.



b) Need for Learning Collective Behavior
        Advances in computing and communication technologies enable people to get
together and share information in innovative ways. Social media, in the form of Web 2.0
and popular social networking sites such as Facebook, Flickr, YouTube, Digg, and blogs,
is reshaping various fields including online business, marketing, epidemics and intelligent
analysis. Concomitant with the opportunities indicated by the rocketing online traffic
in social media are the challenges of user/customer profiling, accurate user search,
matching, and recommendation, as well as effective advertising and marketing. Take the
blogosphere as an example. Bloggers can upload tags for their own blog sites. The tags of a
blog site describe the blogger, facilitating blog search, retrieval and other tasks.
Unfortunately, not all bloggers provide tags; even those who do may just choose a few
for convenience. Thus, it becomes a challenge to infer the likely tags of bloggers from
partial information. Another problem is advertising on social networks, which currently
faces many challenges.
        A recent study from the research firm IDC suggested that “just 57% of all users of
social networks clicked on an ad in the last year, and only 11% of those clicks lead to a
purchase”. Note that some social networking sites can collect only very limited user
profile information, either because of privacy concerns or because users decline to share
true information. In contrast, the friendship network is normally available. If one
can leverage a small portion of user information and the network data wisely, the
situation might improve significantly.
        Social networking sites (a recent phenomenon) empower people of different ages and
backgrounds with new forms of collaboration, communication, and collective intelligence.
Prodigious numbers of online volunteers collaboratively write encyclopedia articles of
unprecedented scope and scale; online marketplaces recommend products by investigating
user shopping behavior and interactions; and political movements exploit new forms of
engagement and collective action. In the same process, social media provides ample
opportunities to study human interactions and collective behavior on an unprecedented scale.

II. LITERATURE SURVEY

       L. Tang and H. Liu [1] stated that collective behavior refers to how individuals
behave when they are exposed to a social network environment. In this paper, they
examined how the online behaviors of users in a network can be predicted, given the
behavior information of some actors in the network. Many social media tasks can be
connected to the problem of collective behavior prediction. Since connections in a social
network represent various kinds of relations, a social-learning framework based on social
dimensions is introduced. This framework suggests extracting social dimensions that
represent the latent affiliations associated with actors, and then applying supervised
learning to determine which dimensions are informative for behavior prediction. It
demonstrates many advantages and is especially suitable for large-scale networks, paving the
way for the study of collective behavior in many real-world applications. Collective
behavior is not simply the aggregation of individuals' behavior. In a connected
environment, behaviors of individuals tend to be interdependent. That is, one's behavior
can be influenced by the behavior of his/her friends. This naturally leads to behavior
correlation between connected users. Such collective behavior correlation can also be
explained by homophily. M. McPherson, L. Smith-Lovin, and J. M. Cook [7] discussed that
people who interact with each other share certain similarities. It was also described that
this correlated behavior information can be used for the prediction of online behaviors in
a network.
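
The second step of this framework, learning which social dimensions are informative, can
be illustrated with a minimal sketch. It assumes the social dimensions have already been
extracted by some method and are available as binary affiliation vectors. The toy
affiliations, the behavior labels, the function names and the choice of a simple
perceptron as the supervised learner are all illustrative assumptions, not the data or
the learner used in [1].

```python
# Hypothetical toy data: each actor has a binary vector over 3 latent affiliations
# (social dimensions). The values are made up for illustration.
SOCIAL_DIMS = {
    0: [1, 0, 0], 1: [1, 1, 0], 2: [0, 1, 0], 3: [0, 1, 1],
    4: [0, 0, 1], 5: [1, 0, 1], 6: [0, 1, 1],
}
# Observed behavior (1 = exhibits it, 0 = does not) for some actors only.
KNOWN_BEHAVIOR = {0: 1, 1: 1, 2: 0, 3: 0, 4: 0}


def score(weights, bias, features):
    """Weighted sum of an actor's social dimensions plus a bias term."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def train_perceptron(dims, labels, epochs=100):
    """Learn one weight per social dimension from the labeled actors."""
    n_dims = len(next(iter(dims.values())))
    weights, bias = [0.0] * n_dims, 0.0
    for _ in range(epochs):
        mistakes = 0
        for actor in sorted(labels):
            predicted = 1 if score(weights, bias, dims[actor]) > 0 else 0
            error = labels[actor] - predicted
            if error:
                mistakes += 1
                weights = [w + error * x for w, x in zip(weights, dims[actor])]
                bias += error
        if mistakes == 0:  # every labeled actor is classified correctly
            break
    return weights, bias


def predict(weights, bias, dims, actors):
    """Predict the behavior of actors whose behavior was not observed."""
    return {a: 1 if score(weights, bias, dims[a]) > 0 else 0 for a in actors}


weights, bias = train_perceptron(SOCIAL_DIMS, KNOWN_BEHAVIOR)
unobserved = [a for a in SOCIAL_DIMS if a not in KNOWN_BEHAVIOR]
predictions = predict(weights, bias, SOCIAL_DIMS, unobserved)
print(weights, predictions)
```

On this toy data the learned weight is positive for the first affiliation, negative for
the second and zero for the third, which is the sense in which the supervised step picks
out the informative dimensions.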
    M. E. J. Newman, A.-L. Barabási and D. J. Watts [17] proposed a concept called collective
inference. It assumes that the behavior of one actor is dependent upon that of his friends. To
make a prediction, collective inference is required to find an equilibrium status such that the
inconsistency between connected actors is minimized. This is normally done by iteratively
updating the possible behavior output of one actor while fixing the behavior output (or
attributes) of his connected friends in the network. It has been shown that methods that
consider this network connectivity for behavior prediction outperform those that do not.
However, connections in social media are often not homogeneous. The heterogeneity present in
network connectivity can hinder the success of collective inference.
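
The iterative update behind collective inference can be sketched as follows. This is a
minimal illustration that assumes a simple majority vote among neighbors as the update
rule for each actor and a toy network of six actors; the function name
`collective_inference` and the data are hypothetical and are not taken from [17].

```python
from collections import Counter

# Hypothetical toy network: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
ADJACENCY = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
KNOWN = {0: "a", 4: "b", 5: "b"}  # actors whose behavior is observed


def collective_inference(adjacency, known_labels, max_iters=20):
    """Relabel each unobserved actor from its neighbors until nothing changes."""
    labels = dict(known_labels)
    unlabeled = [v for v in adjacency if v not in known_labels]
    for _ in range(max_iters):
        changed = False
        for v in unlabeled:
            # Update one actor while the labels of its neighbors are held fixed.
            votes = Counter(labels[u] for u in adjacency[v] if u in labels)
            if not votes:
                continue  # no labeled neighbor yet; try again in a later pass
            # Most votes wins; ties go to the alphabetically smallest label.
            best = min(votes, key=lambda lab: (-votes[lab], lab))
            if labels.get(v) != best:
                labels[v] = best
                changed = True
        if not changed:  # equilibrium: a full pass changed no label
            break
    return labels


inferred = collective_inference(ADJACENCY, KNOWN)
print(inferred)
```

The loop stops at the first full pass in which no label changes, which plays the role of
the equilibrium status mentioned above.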
    P. Singla and M. Richardson [4] applied data mining techniques to study the relationship
between a person's social group and his or her personal behavior for a population of over
10 million people, by turning to online sources of data. The analysis reveals that people who
chat with each other (using instant messaging) are more likely to share interests (their Web
searches are the same or topically similar). The more time they spend talking, the stronger
this relationship is. People who chat with each other are also more likely to share other
personal characteristics, such as their age and location (and they are likely to be of
opposite gender). Similar findings hold for people who do not necessarily talk to each other
but do have a friend in common. Their analysis is based on a well-defined mathematical
formulation of the problem, and is the largest such study they were aware of.
    M. E. J. Newman [3] considered the problem of detecting communities or modules in
networks, that is, groups of vertices with a higher-than-average density of edges connecting
them. Previous work indicates that a robust approach to this problem is the maximization of
the benefit function known as “modularity” over possible divisions of a network. The author
showed that this maximization process can be written in terms of the eigenspectrum of a
matrix he called the modularity matrix, which plays a role in community detection similar
to that played by the graph Laplacian in graph partitioning calculations. This result leads to
a number of possible algorithms for detecting community structure, as well as several other
results, including a spectral measure of bipartite structure in networks and a new centrality
measure that identifies those vertices that occupy central positions within the communities to
which they belong. The algorithms and measures proposed are illustrated with applications to
a variety of real-world complex networks.
    H. W. Lauw, J. C. Shafer, R. Agrawal, and A. Ntoulas [11] studied the phenomenon of
homophily in the digital world. A common assumption about human nature is that people have a
tendency to associate with other, similar people (a phenomenon called homophily). Unlike the
physical world, the digital world does not impose any geographic or organizational constraints
on friendships. So although online friends might share common interests, there is no reason
to believe that two users with common interests are more likely to be friends. Sociology has
studied homophily in the physical world extensively. However, the studies have generally been
conducted on a small scale, and the similarity factors examined have been limited mostly to
easily observed or surveyed sociodemographic characteristics, such as race, gender, religion,
and occupation. These characteristics do not necessarily manifest themselves in online social
networks. One of the strongest underlying sources of homophily in the physical world is
locality due to geographic proximity, family ties, and organizational factors, such as school
and work. However, in the digital world, physical locality becomes less important, and other
factors such as common interests might play a greater role.
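
To make the modularity matrix of [3] concrete, the sketch below builds the matrix B with
entries B_ij = A_ij - k_i k_j / (2m), where A is the adjacency matrix, k_i is the degree of
vertex i and m is the number of edges, and then divides a network in two according to the
signs of the leading eigenvector of B. The shifted power iteration, the starting vector, the
function names and the toy network are assumptions made for this illustration and are not
the implementation of [3]. The sketch also assumes an undirected network with at least one
edge.

```python
import math

# Hypothetical toy network: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
ADJ = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]


def modularity_matrix(adj):
    """B_ij = A_ij - k_i * k_j / (2m) for an undirected adjacency matrix."""
    n = len(adj)
    degree = [sum(row) for row in adj]
    two_m = sum(degree)  # every edge is counted twice in the degree sum
    return [[adj[i][j] - degree[i] * degree[j] / two_m for j in range(n)]
            for i in range(n)]


def leading_eigenvector(matrix, iters=500):
    """Power iteration on (matrix + shift * I).

    The shift is a Gershgorin bound on the eigenvalue magnitudes, so the shifted
    matrix has no negative eigenvalues and the iteration converges toward the
    eigenvector of the largest (most positive) eigenvalue of the input matrix.
    """
    n = len(matrix)
    shift = max(sum(abs(x) for x in row) for row in matrix)
    vec = [float(i + 1) for i in range(n)]  # arbitrary deterministic start
    for _ in range(iters):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) + shift * vec[i]
               for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    return vec


def split_in_two(adj):
    """Divide the vertices by the sign of their leading-eigenvector entry."""
    vec = leading_eigenvector(modularity_matrix(adj))
    positive = {i for i, x in enumerate(vec) if x > 0}
    return positive, set(range(len(adj))) - positive


groups = split_in_two(ADJ)
print(groups)
```

On this toy network the two groups are the two triangles. Which of them ends up on the
positive side depends on the arbitrary sign of the eigenvector, so the groups are best
compared as unordered sets.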
    S. A. Macskassy and F. Provost [9] addressed the classification of entities that are
interlinked with entities for which the class is known, and presented a modular toolkit
called NetKit for classification in networked data. NetKit is based on a node-centric
framework in which classifiers comprise a local classifier, a relational classifier, and a
collective inference procedure. Various existing node-centric relational learning algorithms
can be instantiated with appropriate choices for these components, and new combinations of
components realize new algorithms. This study focuses on univariate network classification,
for which the only information used is the structure of class linkage in the network (i.e.,
only links and some class labels). It also shows that there are two sets of techniques that
are preferable in different situations, namely when few versus many labels are known
initially. They also demonstrated that link selection plays an important role similar to
traditional feature selection.
    L. Tang, H. Liu, J. Zhang, and Z. Nazeri [22] stated that a multi-mode network
typically consists of multiple heterogeneous social actors among which various types of
interactions could occur. Identifying communities in a multi-mode network can help
understand the structural properties of the network, address data shortage and imbalance
problems, and assist tasks like targeted marketing and finding influential actors within or
between groups. In general, a network and the membership of groups often evolve gradually.
In a dynamic multi-mode network, both actor membership and interactions can evolve, which
poses a challenging problem of identifying community evolution. In this work, they try to
address this issue by employing temporal information to analyze a multi-mode network.
A spectral framework and its scalability issue are carefully studied. Experiments on both
synthetic data and real-world large-scale networks demonstrate the efficacy of the algorithm
and suggest its generality in solving problems with complex relationships.

III. CONCLUSION

        This paper has surveyed different schemes that are used for collective behavior
prediction. The networks in social media are normally large, involving hundreds of
thousands of actors. The scale of these networks entails scalable learning of models for
collective behavior prediction. To address the scalability issue, an edge-centric clustering
scheme is proposed. The proposed approach can efficiently handle networks of millions of
actors while demonstrating prediction performance comparable to that of other, non-scalable
methods. The social dimensions extracted by methods such as modularity maximization are
dense, so their memory requirement is high, and this hinders both the extraction of social
dimensions and the subsequent discriminative learning. With the edge-centric view, the
extracted social dimensions are guaranteed to be sparse.
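
The edge-centric view can be illustrated with a small sketch. It treats every edge as an
instance whose features are its two endpoints, groups the edges with a k-means-style
procedure under cosine similarity, and then reads off the affiliations of each actor from
the clusters of its edges. The clustering procedure, the seed edges, the function names and
the toy network are assumptions chosen for illustration; the survey does not spell out the
exact algorithm.

```python
import math
from collections import defaultdict

# Hypothetical toy network: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]


def cosine(edge, centroid):
    """Cosine similarity between an edge (indicator on its endpoints) and a centroid."""
    dot = sum(centroid.get(node, 0.0) for node in edge)
    norm = math.sqrt(sum(w * w for w in centroid.values()))
    return dot / (math.sqrt(len(edge)) * norm) if norm else 0.0


def mean_of(edges):
    """Centroid of a group of edges, stored sparsely as {node: weight}."""
    centroid = defaultdict(float)
    for edge in edges:
        for node in edge:
            centroid[node] += 1.0 / len(edges)
    return dict(centroid)


def edge_clustering(edges, seeds, max_iters=50):
    """k-means-style clustering of edges; `seeds` are indices of the initial edges."""
    centroids = [mean_of([edges[s]]) for s in seeds]
    assignment = None
    for _ in range(max_iters):
        new_assignment = [
            max(range(len(centroids)), key=lambda c: cosine(e, centroids[c]))
            for e in edges
        ]
        if new_assignment == assignment:  # no edge changed cluster
            break
        assignment = new_assignment
        for c in range(len(centroids)):
            members = [e for e, a in zip(edges, assignment) if a == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = mean_of(members)
    return assignment


def social_dimensions(edges, assignment):
    """An actor is affiliated with every cluster that contains one of its edges."""
    dims = defaultdict(set)
    for edge, cluster in zip(edges, assignment):
        for node in edge:
            dims[node].add(cluster)
    return dict(dims)


assignment = edge_clustering(EDGES, seeds=[0, 6])
dims = social_dimensions(EDGES, assignment)
print(assignment, dims)
```

Because an actor can only be affiliated with clusters that contain one of its own edges,
the number of non-zero social dimensions per actor is bounded by its degree. In the toy
run only actor 3, which has edges in both groups, receives two affiliations.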

REFERENCES

[1] L. Tang and H. Liu, “Toward predicting collective behavior via social dimension
extraction,” IEEE Intelligent Systems, vol. 25, pp. 19–25, 2010.
[2] N. J. Smelser, Theory of collective behavior. London (UK): Routledge & Kegan Paul,
1962.



[3] M. Newman, “Finding community structure in networks using the eigenvectors of matrices,”
Physical Review E (Statistical, Nonlinear, and Soft Matter Physics), vol. 74, no. 3, 2006. [Online].
Available: http://dx.doi.org/10.1103/PhysRevE.74.036104
[4] P. Singla and M. Richardson, “Yes, there is a correlation: - from social networks to personal
behavior on the web,” in WWW ’08: Proceeding of the 17th international conference on World Wide
Web. New York, NY, USA: ACM, 2008, pp. 655–664.
[5] N. J. Smelser, Theory of collective behavior. London (UK): Routledge & Kegan Paul, 1962.
[6] L. Tang and H. Liu, “Relational learning via latent social dimensions,” in KDD ’09: Proceedings of
the 15th ACM SIGKDD international conference on Knowledge discovery and data mining. New
York, NY, USA: ACM, 2009, pp. 817–826.
[7] M. McPherson, L. Smith-Lovin, and J. M. Cook, “Birds of a feather: Homophily in social
networks,” Annual Review of Sociology, vol. 27, pp. 415–444, 2001.
[8] H. W. Lauw, J. C. Shafer, R. Agrawal, and A. Ntoulas, “Homophily in the digital world: A
LiveJournal case study,” IEEE Internet Computing, vol. 14, pp. 15–23, 2010.
[9] S. A. Macskassy and F. Provost, “Classification in networked data: A toolkit and a univariate case
study,” J. Mach. Learn. Res., vol. 8, pp. 935–983, 2007.
[10] S. Gupta, R. M. Anderson, and R. M. May, Networks of sexual contacts: Implications for the
pattern of spread of HIV. AIDS 3, 807–817 (1989).
[11] H. W. Lauw, J. C. Shafer, R. Agrawal, and A. Ntoulas, “Homophily in the digital world: A
LiveJournal case study,” IEEE Internet Computing, vol. 14, pp. 15–23, 2010.
[12] M. E. J. Newman, Fast algorithm for detecting community structure in networks. Phys. Rev. E
69, 066133 (2004).
[13] J. Reichardt and S. Bornholdt, Statistical mechanics of community detection. Preprint
cond-mat/0603718 (2006).
[14] J. Duch and A. Arenas, Community detection in complex networks using extremal optimization.
Phys. Rev. E 72, 027104 (2005).
[15] M. Granovetter, Threshold models of collective behavior. American Journal of Sociology,
83(6):1420, 1978.
[16] T. C. Schelling, Dynamic models of segregation. Journal of Mathematical Sociology, 1:143–186,
1971.
[17] M. E. J. Newman, A.-L. Barabási, and D. J. Watts, The Structure and Dynamics of Networks.
Princeton University Press, Princeton (2006).
[18] M. Girvan and M. E. J. Newman, Community structure in social and biological networks. Proc.
Natl. Acad. Sci. USA 99, 7821–7826 (2002).
[19] P. Holme, M. Huss, and H. Jeong, Subnetwork hierarchies of biochemical pathways.
Bioinformatics 19, 532–538 (2003).
[20] G. W. Flake, S. R. Lawrence, C. L. Giles, and F. M. Coetzee, Self-organization and identification
of Web communities. IEEE Computer 35, 66–71 (2002).
[21] R. Guimerà and L. A. N. Amaral, Functional cartography of complex metabolic networks.
Nature 433, 895–900 (2005).
[22] L. Tang, H. Liu, J. Zhang, and Z. Nazeri, “Community evolution in dynamic multi-mode
networks,” in KDD ’08: Proceeding of the 14th ACM SIGKDD international conference on
Knowledge discovery and data mining. New York, NY, USA: ACM, 2008, pp. 677–685.
[23] Sonia and Satinder Pal, “An Effective Approach To Contention Based Bandwidth
Request Mechanism In Wimax Networks,” International Journal of Computer Engineering &
Technology (IJCET), Volume 3, Issue 2, 2012, pp. 603 - 620, Published by IAEME.
[24] F. Fenita and Sumitha C.H, “Prediction of Customer Behavior Using CMA,” International
Journal of Computer Engineering & Technology (IJCET), Volume 1, Issue 2, 2010,
pp. 192 - 200, Published by IAEME.
