              INTERNATIONAL JOURNAL OF ADVANCED RESEARCH IN ENGINEERING
                                AND TECHNOLOGY (IJARET)

ISSN 0976 - 6480 (Print)
ISSN 0976 - 6499 (Online)
Volume 4, Issue 7, November - December 2013, pp. 176-182
© IAEME: www.iaeme.com/ijaret.asp
Journal Impact Factor (2013): 5.8376 (Calculated by GISI)
www.jifactor.com




               AN OVERVIEW OF OPINION MINING TECHNIQUES

                                         Dr. Jamshed Siddiqui
             Department of Computer Science, Aligarh Muslim University, Aligarh, U.P.




ABSTRACT

        Rapid technological change and the growth of the Internet have dramatically changed the
world and, with it, the lifestyle of the individual. Users use Internet tools such as blogs and social
networking sites to share their views and opinions on various daily-life issues, ranging from online
marketing and product reviews to election campaigns and voters' views of their electoral candidates.
Opinion mining is required in such situations to infer correct results and predict future behavior. In
this paper, a short review of various opinion mining techniques is presented, with a table
summarizing recent contributions and illustrative figures. The techniques used, corpus descriptions
and basic concepts are also discussed. This work may be useful for researchers seeking a
background in opinion mining and looking for future trends.

Keywords: Blogs, Opinion Mining, Internet, Social Networking, Corpus.

I. INTRODUCTION

         The rapid growth of the Internet has had diverse effects on the daily life of the common
man. Rapid technological change and the growth of the Internet have dramatically changed the world
and the lifestyle of the individual. Because the Internet is so commonly used, social lives are
affected as well. Users use Internet tools such as blogs and social networking sites to share their
views and opinions on various daily-life issues, ranging from online marketing and product reviews
to election campaigns and voters' views of their electoral candidates. The opinions of one person
influence others and play a key role in deciding behavior. Almost all of us draw on the ideas and
views of others to decide how to proceed: individuals would like to know the opinions of their family
and friends, while organizations use opinion polls, conduct surveys and hire consultants to shape
their strategies. The need for opinion


mining emerges in such situations, to infer the right choice and to benefit from the experiences
and views of others. In the last few decades, opinion mining has gained much popularity [1].
        Opinion mining can be considered the computational study of opinions [2]. The sources of
opinions may be social networking sites such as Facebook and Twitter, feedback in emails from the
employees of an organization, opinions in news articles, blogs, and product web sites where
thousands of user-generated free-text reviews are available.
        Inferring the true aspect of an opinion is not an easy task, and sometimes a very confusing
situation arises. The same object may be perceived differently by different people. For example,
consider Mahatma Gandhi: for India he is the non-violent father of the nation, but the British rulers
would have referred to him as their enemy. So the main task in opinion finding is to consider all
these aspects and to infer the right conclusion according to the situation for the target customer,
who may be a newspaper reader, an online customer or even a voter who wishes to vote for the
candidate of his choice.
        The rest of the paper is organized as follows. The second section reviews the literature, the
third section elaborates the different aspects of opinion mining, and the fourth section describes the
criteria for feature-based opinion mining. The fifth section concludes the paper.

II. LITERATURE REVIEW

        Data mining is one of the esteemed branches of computer science. It was started initially
for business purposes [3], and with time its area of research has spread to the medical sciences,
scientific research and social networking [4]. Opinion mining is one of the emerging branches of
data mining, widely used worldwide, and the term is used interchangeably with sentiment analysis [2].
The mining of direction-based text, which may include biases, was first proposed by Hearst and
Wiebe [5]. Opinion mining is carried out as text analysis and is similar to finding positive and
negative text, which can be done by both supervised and unsupervised learning [6].
        Reviews and opinions in user-generated free text play an important role in sentiment
analysis [7]. Finn [8] contributed to identifying reviews in heterogeneous groups of text, which
essentially amounts to identifying subjectivity. The work in [9] tried to overcome the difficulties
in capturing the similarities and differences between connoted words. The technicalities of the
similarity between two words were also described in [10].
        In [11, 12, 13] the authors tried to give a solution to the problem of overly specific
patterns in subjectivity identification. Manual annotation has two main issues: it is time-consuming
and expensive [14, 15, 16]. For linguistic corpora, corpus-based research is very useful. MPQA is one
such existing corpus, but its annotation was done only at the word level, not at the sentence
level [17]. In [18] the contribution of the Pang corpus generation is also discussed. An international
corpus of English of 1 million words is supposed to be launched soon [19]. A detailed discussion of
opinion mining is given in [20]; the subtask in the TREC blog track is one example of opinion
retrieval. Feature-based opinion summarization of reviews was suggested by Hu and Liu [21].





 Sl.No.  Authors                  Year  Proposed Model       Basis of Research     Contribution
   1.    Yi Fang, Luo Si,         2012  Cross-perspective    Query topic, text     Opinion of individual
         Naveen Somasundaram,           topic model          collections           perspective on the topic,
         Zhengtao Yu                                                               quantifying opinion
                                                                                   differences

   2.    Daniel E. O'Leary        2011  Reviewing blogs      Blogs                 Extensive review on the
                                                                                   topic

   3.    Pawel Sobkowicz,         2012  Opinion formation    Previous research,    Emotion and opinion
         Michael Kaschesky,             framework            online opinion        detection, model of
         Guillaume Bouchard                                  emergence and         opinion network,
                                                             diffusion             information flow modeling,
                                                                                   agent-based simulation

   4.    Kaiquan Xu, Stephen      2010  Graphical model to   Customer-generated    Extracting customer reviews,
         Shaoyi Liao, Jiexun Li,        extract and          reviews               visualization, analyzing
         Yuxia Song                     visualize                                  customer-generated data

   5.    M. Eirinaki, S. Pisal,   2011  Sentiment analysis   Reviews, search       Analysis of sentiments,
         J. Singh                       and semantic         engine                semantic orientation,
                                        orientation                                opinion search engine
                                        algorithm

   6.    Amna Asmi and            2012  Auto generation of   User-generated        Generation of corpus
         Tankyo Ishaya                  corpus               content, WordNet

   7.    S.S. Patil and           2012  SVM classification   Corpus, reviews       Emotion classification into
         A.P. Chaudhari                                                            6 categories

   8.    Kushal Dave, Steve       2003  Review classifiers   Product reviews       Distinguishing positive and
         Lawrence, David M.                                                        negative reviews, grouping
         Pennock                                                                   sentiments into attributes


III. OPINION MINING TECHNIQUE

       The term opinion mining is similar to sentiment analysis, and both terms are widely used in
academia, though in industry the term sentiment analysis is used extensively [2]. In general, there
are two types of opinions:

        1) Regular opinions
        2) Comparative opinions




       In a regular opinion, sentiments about some target entity are expressed; put simply, they are
either negative or positive expressions of opinion on some specific aspects. There are two
sub-categories:

   a) Direct opinion
   b) Indirect opinion

        A direct opinion involves a direct statement such as “the book is very interesting and
knowledge-giving”; an indirect opinion involves an indirect expression such as “after reading the
book, I fill the bucket of my knowledge”.
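The difference between the two sub-categories can be illustrated with a minimal sketch. It assumes a tiny hand-made polarity lexicon (the words and weights in `POLARITY` are hypothetical and not taken from any published resource) and simply sums the weights of the words found in a sentence. Under this assumption, the direct statement is picked up because it contains an explicit sentiment word, whereas the indirect statement is missed, which hints at why indirect opinions are harder to mine.

```python
import re

# Tiny hand-made polarity lexicon; the words and weights are illustrative
# assumptions, not taken from any published resource.
POLARITY = {"interesting": 1, "good": 1, "better": 1, "boring": -1, "bad": -1}


def polarity_score(sentence):
    """Sum the lexicon weights of the words that occur in the sentence."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return sum(POLARITY.get(word, 0) for word in words)


direct = "The book is very interesting and knowledge-giving"
indirect = "After reading the book, I fill the bucket of my knowledge"

# The direct opinion contains an explicit sentiment word from the lexicon.
print(polarity_score(direct))
# The indirect opinion contains no lexicon word, so its score stays at zero.
print(polarity_score(indirect))
```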
        In comparative opinions, two or more entities are compared with each other, e.g. “MCQ
physics by Upkar is better than Objective Physics by Sanjeev Gupta.” The term entity refers to a
product in the general sense: it can be a product, a person, an organization or even a topic of
discussion. An entity is associated with a set of components and attributes, each of which is a
node when the entity is represented graphically. One can express an opinion on any of these nodes. As
depicted in Figure 1, a university is an organization that has several branches, here the faculties,
and these branches in turn have nodes, here the departments. These can be considered the components
of the entity university. Drawing a conclusion about the university, i.e. the entity, may therefore
require knowing the opinions about its departments, i.e. its components. The term aspect or feature
refers to both attributes and components.
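The entity and component hierarchy described above can be sketched as a small tree in which opinions attached to lower nodes contribute to the overall opinion about the entity. This is only an illustrative sketch: the `Node` class, the polarity values (+1 for positive, -1 for negative), the placement of departments under particular faculties and the use of a plain average are all assumptions made for the example, not part of the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity or one of its components, with the opinions expressed on it."""
    name: str
    opinions: list = field(default_factory=list)   # polarity values, +1 or -1
    children: list = field(default_factory=list)   # sub-components

    def all_opinions(self):
        """Collect opinions on this node and on every node below it."""
        collected = list(self.opinions)
        for child in self.children:
            collected.extend(child.all_opinions())
        return collected

    def mean_polarity(self):
        """Average polarity over the whole subtree (0.0 if no opinions)."""
        values = self.all_opinions()
        return sum(values) / len(values) if values else 0.0


# Hypothetical opinions; the faculty/department assignment is assumed.
physics = Node("Physics", opinions=[1, 1, -1])
english = Node("English", opinions=[1])
science = Node("Faculty (Science)", children=[physics])
arts = Node("Faculty (Arts)", children=[english])
university = Node("University", opinions=[-1], children=[arts, science])

print(university.all_opinions())
print(university.mean_polarity())
```

An opinion about the university as a whole is thus informed by the opinions on its departments, while each faculty or department can still be inspected on its own through the same two methods.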


        [Figure: tree with root University; second level Faculty (Arts), Faculty (Soc. Sc.),
         Faculty (Science), Faculty (Medical); third level Departments: English, Urdu, Hindi,
         Geology, Physics, Computer]

                   Figure 1. Opinion Mining for different branches of an item

IV. CRITERIA FOR OPINION MINING

        Opinion mining can be categorized on several bases. Extracting features from opinions and
inferring conclusions from those opinions are very important in the field of sentiment analysis, so
feature-based opinion mining plays a vital role. Feature-based opinion mining involves the following
steps:



IV. I Reviews Collection
       The collection of reviews is also very important. The source of the reviews determines their
content, and decisions are made accordingly. Blogs, social networking sites, news, emails, product
web sites etc. are the main sources on the Internet for collecting reviews.

IV. II Feature Extraction
       Feature extraction is one of the important tasks in opinion mining. The work in [s] gave a
formulation-based method to extract opinions, though the work was done manually. Feature extraction
methods are also discussed in [22], where the author discusses two types of features:

   a) Implicit Feature
   b) Explicit Feature

        Implicit features are those features of a product that users express in some specific form,
such as adjective form. A review sentence like the one below, in which the adjective huge points to
the size of the book, could be implicit.

                                       The book is too huge to read

Explicit features involve the features of a product described in Noun form, like:

                                  The book is precise and interesting

        Feature pruning is also applied, to remove unnecessary features that are not essential and
are probably incorrect. A diagrammatic representation of opinion mining is depicted in Figure 2.
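Feature pruning can be sketched as follows. The paper does not state which pruning criterion is used, so this example assumes a simple minimum-support rule: a candidate feature is kept only if it is mentioned in at least `min_support` reviews. The function name `prune_features`, the threshold and the sample candidate lists are hypothetical, and the candidates are assumed to have been extracted from the reviews beforehand.

```python
from collections import Counter


def prune_features(candidates_per_review, min_support=2):
    """Keep candidate features mentioned in at least min_support reviews."""
    support = Counter()
    for candidates in candidates_per_review:
        support.update(set(candidates))   # count each feature once per review
    return {feature for feature, n in support.items() if n >= min_support}


# Hypothetical candidate features, one list per review.
reviews = [
    ["size", "content"],
    ["content", "price"],
    ["content", "size", "thing"],
]

kept = prune_features(reviews, min_support=2)
print(sorted(kept))
```

With these sample reviews, rarely mentioned candidates such as "thing" fall below the threshold and are discarded, which is the intended effect of pruning features that are unlikely to be genuine.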

        [Figure: blocks labelled Review/Blog, User, Sentiment and Inference, together with
         blocks labelled Mining and Corpus]

                         Figure 2. Overview of Opinion Mining Technique

IV. III Inferring Results
         After the extraction of features, inferring the conclusion is the major task. It depends
upon the experimental work done and on the parameters and algorithms developed. Various statistical
parameters and different classifiers are usually used for this purpose. In [24], the authors
discussed various classifiers, including Support Vector Machine (SVM), Vector Machine (VM) and
Naive Bayes.
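As an illustration of the inference step, the following is a minimal sketch of one classifier of this kind, a multinomial Naive Bayes model with add-one (Laplace) smoothing, written in plain Python. It is not the method of any cited work; the function names `train` and `predict`, the whitespace tokenization and the four training sentences are assumptions chosen only to keep the example self-contained.

```python
import math
from collections import Counter, defaultdict


def train(samples):
    """samples: list of (text, label). Returns the counts needed for prediction."""
    word_counts = defaultdict(Counter)   # label -> word -> count
    label_counts = Counter()             # label -> number of training texts
    vocabulary = set()
    for text, label in samples:
        words = text.lower().split()
        word_counts[label].update(words)
        label_counts[label] += 1
        vocabulary.update(words)
    return word_counts, label_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocabulary = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total)            # class prior
        denom = sum(word_counts[label].values()) + len(vocabulary)
        for word in text.lower().split():
            if word in vocabulary:        # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data.
training = [
    ("the book is interesting", "positive"),
    ("precise and interesting content", "positive"),
    ("the book is too huge", "negative"),
    ("boring and too long", "negative"),
]
model = train(training)

print(predict(model, "interesting book"))
print(predict(model, "too huge"))
```

In practice the training set would come from an annotated corpus of reviews, and the features fed to the classifier would be those retained after feature extraction and pruning.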



V. CONCLUSION

        We usually draw on the ideas and views of others to decide how to proceed: individuals
would like to know the opinions of their family and friends, while organizations use opinion polls,
conduct surveys and hire consultants to shape their strategies. The need for opinion mining emerges
in such situations, to infer the right choice and to benefit from the experiences and views of
others. In this paper, the importance of user-generated reviews, including a description of existing
corpora and the techniques used to extract features from these reviews, is discussed. Different types
of opinions and opinion mining techniques are also discussed. In future, this work should help
researchers identify upcoming challenges and contribute to the field of opinion mining.

REFERENCES

 [1]    Amna Asmi and Tankyo Ishaya, A framework for automated corpus generation for semantic
        sentiment analysis, Proceedings of the World Congress on Engineering, Vol. 1, 2012.
 [2]    Bing Liu, Sentiment Analysis and Opinion Mining, (San Francisco: Morgan & Claypool
        Publishers, 2012).
 [3]    A.M. Patel, A.R.Patel and H.R.Patel, A Comparative analysis of data mining tools for
        performance mapping of WLAN data, 4(2), 241-251.
 [4]    R. Manickam, An Analysis of data mining: past, present and future, 3(1), 1-9.
 [5]    A. Abbasi, H. Chen and A. Salem, Sentiment analysis in multiple languages: Feature
        selection for opinion classification in Web forums. ACM Trans. Inf. Syst., 26(3), 2008, 1-34.
 [6]    M. J. Silva et al., The Design of OPTIMISM, an Opinion Mining System for Portuguese
        Politics. In: New Trends in Artificial Intelligence: Proceedings of EPIA 2009 - Fourteenth
        Portuguese Conference on Artificial Intelligence. Aveiro, Portugal. Universidade de Aveiro,
        2009, 565-576.
 [7]    B. Pang, L. Lee, and S. Vaithyanathan, Thumbs up?: sentiment classification using machine
        learning techniques. In: The ACL-02 conference on Empirical methods in natural language
        processing Philadelphia, PA, USA. Association for Computational Linguistics, 2002, 79-86.
 [8]    Aidan Finn, Nicholas Kushmerick, and Barry Smyth, Genre classification and domain
        transfer for information filtering. In Fabio Crestani, Mark Girolami, Proceedings of ECIR-
        02, 24th European Colloquium on Information Retrieval Research, Glasgow, UK. Springer
        Verlag, Heidelberg, DE, 2002
 [9]    Vasileios Hatzivassiloglou and Kathleen R. McKeown, Predicting the semantic orientation of
        adjectives. In Proceedings of the 35th Annual Meeting of ACL, 1997.
 [10]   P.D. Turney and M.L. Littman. Unsupervised learning of semantic orientation from a
        hundred-billion-word corpus. Technical Report ERB-1094, National Research Council
        Canada, Institute for Information Technology, 2002.
 [11]   Deepak Ravichandran and Eduard Hovy. Learning surface text patterns for a question
        answering system. In ACL Conference, 2002.
 [12]   Ellen Riloff. Automatically generating extraction patterns from untagged text. In Proceedings
        of AAAI/IAAI, Vol. 2, 1996, 1044–1049.
 [13]   Janyce Wiebe, Theresa Wilson, and Matthew Bell. Identifying collocations for recognizing
        opinions. In Proceedings of ACL/EACL 2001 Workshop on Collocation.
 [14]   David Holtzmann. Detecting and tracking opinions in on-line discussions. In UCB/SIMS
        Web Mining Workshop, 2001.
 [15]   Dekang Lin. Automatic retrieval and clustering of similar words. In Proceedings of
        COLING-ACL, 1998, 768–774.
 [16]   Janyce Wiebe. Learning subjective adjectives from corpora. In AAAI/IAAI, 2000, 735–740.

 [17] J. Wiebe and E. Riloff. Finding Mutual Benefit between Subjectivity Analysis and
      Information Extraction. Affective Computing, IEEE Transactions, Vol. 99, 2011, 1-1.
 [18] M. J. Silva et al. (2009) The Design of OPTIMISM, an Opinion Mining System for
      Portuguese Politics. In: New Trends in Artificial Intelligence: Proceedings of EPIA -
      Fourteenth Portuguese Conference on Artificial Intelligence. Aveiro, Portugal. Universidade
      de Aveiro, 2009, 565-576.
 [19] Bhattacharyya, D., et al., Refine Crude Corpus for Opinion Mining. In: The First
      International Conference on Computational Intelligence, Communication Systems and
      Networks. Washington, DC, USA. IEEE Computer Society, 2009, 17-22.
 [20] B. Pang and L. Lee. Opinion mining and sentiment analysis. Foundations and Trends in
      Information Retrieval, 2 (1-2), 2008, 1-135.
 [21] Mingqing Hu and Bing Liu, Mining Opinion Features in Customer Reviews, American
      Association for Artificial Intelligence. 2004.
 [22] T. Saranya, Mining features and ranking products from online customer reviews,
      International Journal of Engineering Research & Technology, 10(2), 2013, 643-648.
 [23] R. Manickam, D. Boominath, V. Bhuvaneswari, “An Analysis of Data Mining: Past, Present
      and Future”, International Journal of Computer Engineering & Technology (IJCET),
      Volume 3, Issue 1, 2012, pp. 1 - 9, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375.
 [24] Sandip S. Patil and Asha P. Chaudhari, “Classification of Emotions from Text using SVM
      Based Opinion Mining”, International Journal of Computer Engineering & Technology
      (IJCET), Volume 3, Issue 1, 2012, pp. 330 - 338, ISSN Print: 0976 – 6367, ISSN Online:
      0976 – 6375.
 [25] Jamshed Siddiqui, “A Framework for ICT Adoption in Indian Smes: Issues and Challenges”,
      International Journal of Information Technology and Management Information Systems
      (IJITMIS), Volume 4, Issue 3, 2013, pp. 114 - 120, ISSN Print: 0976 – 6405, ISSN Online:
      0976 – 6413.
 [26] M. Karthikeyan, M. Suriya Kumar and Dr. S. Karthikeyan, “A Literature Review on the
      Data Mining and Information Security”, International Journal of Computer Engineering &
      Technology (IJCET), Volume 3, Issue 1, 2012, pp. 141 - 146, ISSN Print: 0976 – 6367,
      ISSN Online: 0976 – 6375.
 [27] Jamshed Siddiqui, “An Exploration of Total Quality Management and Supply Chain
      Management Enablers”, International Journal of Computer Engineering & Technology
      (IJCET), Volume 4, Issue 6, 2013, pp. 212-218, ISSN Print: 0976 – 6367, ISSN Online:
      0976 – 6375.



