
Sentiment Analysis

Description

Sentiment analysis or opinion mining refers to the application of language processing to identify and extract subjective information in source materials. Generally speaking, sentiment analysis aims to determine the attitude of a speaker or a writer with respect to some topic or the overall contextual polarity of a document.

Sentiment Analysis



michel.bruley@teradata.com









Extract from various presentations: Bing Liu, Aditya Joshi, Aster Data …

January 2012





www.decideo.fr/bruley

Introduction





▪ Two main types of textual information: facts and opinions

▪ Most current text information processing methods work with factual information (e.g., web search, text mining)

▪ Sentiment analysis, or opinion mining, is the computational study of opinions (sentiments, emotions) expressed in text

▪ Why opinion mining now? Mainly because of the Web: huge volumes of opinionated text.










What is Sentiment Analysis?





▪ Identify the orientation of opinion in a piece of text (blogs, user comments, review websites, community websites, …); in other words, determine whether a sentence or a document expresses positive, negative, or neutral sentiment towards some object.







"The movie was fabulous!"   [ Sentimental ]
"The movie stars Mr. X"     [ Factual ]
"The movie was horrible!"   [ Sentimental ]








SA at different levels







[Figure: example movie-review texts analysed at three levels: word-level SA, sentence-level SA, and document-level SA]

Example words: fabulous, interesting, boring

Example parse: police (subj.) stopped (verb) corruption (obj.)




What is an Opinion?



▪ An opinion is a quintuple:

(o_j, f_jk, so_ijkl, h_i, t_l)

where
– o_j is a target object
– f_jk is a feature of the object o_j
– so_ijkl is the sentiment value of the opinion of the opinion holder h_i on feature f_jk of object o_j at time t_l
– h_i is an opinion holder
– t_l is the time when the opinion is expressed
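
The quintuple above maps naturally onto a small record type. The following is a minimal sketch, not part of the original slides: the class name `Opinion`, the field names, the +1/0/-1 sentiment encoding, and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Opinion:
    """One opinion quintuple: object, feature, sentiment, holder, time."""
    target: str      # the target object being discussed
    feature: str     # the feature of that object
    sentiment: int   # assumed encoding: +1 positive, 0 neutral, -1 negative
    holder: str      # the opinion holder
    time: date       # when the opinion was expressed


# Hypothetical example: a reviewer praises a laptop's keyboard.
example = Opinion(
    target="ThinkPad T43",
    feature="keyboard",
    sentiment=+1,
    holder="reviewer_42",
    time=date(2012, 1, 15),
)
print(example)
```

Making the record immutable (`frozen=True`) is a design choice of this sketch: an extracted opinion is a fact about a text and should not change after extraction.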






Objective: structure the unstructured



▪ Objective: given an opinionated document,
– Discover all quintuples (o_j, f_jk, so_ijkl, h_i, t_l),
• i.e., mine the five corresponding pieces of information in each quintuple

▪ With the quintuples,
– Unstructured text → structured data
• Traditional data and visualization tools can be used to slice, dice and visualize the results in all kinds of ways
• Enable qualitative and quantitative analysis

▪ With all quintuples, all kinds of analyses become possible
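
Once opinions are stored as quintuples, ordinary aggregation applies. The sketch below is an illustration under stated assumptions: the sample rows, the holder names, and the +1/0/-1 sentiment encoding are hypothetical, and `summarize` is a helper invented for this example.

```python
from collections import defaultdict

# Hypothetical quintuples: (object, feature, sentiment, holder, time).
# Sentiment is encoded as +1 / 0 / -1, an assumption of this sketch.
quintuples = [
    ("ThinkPad T43", "keyboard", +1, "alice", "2012-01-03"),
    ("ThinkPad T43", "keyboard", +1, "bob",   "2012-01-05"),
    ("ThinkPad T43", "battery",  -1, "alice", "2012-01-03"),
    ("ThinkPad T43", "battery",  -1, "carol", "2012-01-09"),
    ("ThinkPad T43", "battery",  +1, "bob",   "2012-01-05"),
]


def summarize(rows):
    """Count positive, neutral, and negative opinions per (object, feature)."""
    summary = defaultdict(lambda: {"positive": 0, "neutral": 0, "negative": 0})
    for obj, feature, sentiment, _holder, _time in rows:
        if sentiment > 0:
            label = "positive"
        elif sentiment < 0:
            label = "negative"
        else:
            label = "neutral"
        summary[(obj, feature)][label] += 1
    return dict(summary)


for key, counts in summarize(quintuples).items():
    print(key, counts)
```

The same rows could equally be grouped by holder or by time, which is what makes the structured form useful for slicing and dicing.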




SA is not Just ONE Problem



▪ Track direct opinions:
– document
– sentence
– feature level

▪ Compare opinions: different types of comparisons

▪ Detect opinion spam: fake reviews










Polarity Classifier



▪ First eliminate objective sentences, then use the remaining sentences to classify document polarity (reduces noise)
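
The two-stage idea can be sketched with toy word lists. This is an illustration only, not the classifier from the original presentations: the lexicons, the cue list in `SUBJECTIVE_CUES`, and all function names are assumptions, and a real system would typically train both stages on labelled data.

```python
import re

# Toy word lists, assumed for illustration only.
POSITIVE = {"fabulous", "great", "interesting"}
NEGATIVE = {"horrible", "nasty", "boring"}
# Hypothetical cues that a sentence states an opinion rather than a fact.
SUBJECTIVE_CUES = {"i", "my", "we", "think", "feel", "recommend"}


def tokens(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def is_subjective(sentence):
    """Stage 1: treat a sentence as subjective if it contains an opinion cue."""
    return any(t in SUBJECTIVE_CUES for t in tokens(sentence))


def polarity(sentences):
    """Stage 2: sum lexicon hits over the given sentences."""
    score = sum((t in POSITIVE) - (t in NEGATIVE)
                for s in sentences for t in tokens(s))
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"


def document_polarity(document):
    """Drop objective sentences, then classify what remains."""
    sentences = re.split(r"(?<=[.!?])\s+", document.strip())
    return polarity([s for s in sentences if is_subjective(s)])


review = "The villain is a horrible and nasty killer. I think the movie was fabulous."
print(document_polarity(review))  # with the objective sentence filtered out
print(polarity([review]))         # without filtering
```

In this example the plot description contains negative words that say nothing about the reviewer's opinion. Filtering it out changes the result from negative to positive, which is the noise reduction the slide refers to.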










Level of Analysis



We can inquire about sentiment at various linguistic levels:



▪ Words – objective, positive, negative, neutral

▪ Clauses – “going out of my mind”

▪ Sentences – possibly multiple sentiments

▪ Documents










Words

▪ Adjectives
– objective: red, metallic
– positive: honest, important, mature, large, patient
– negative: harmful, hypocritical, inefficient
– subjective (but not positive or negative): curious, peculiar, odd, likely, probable

▪ Verbs
– positive: praise, love
– negative: blame, criticize
– subjective: predict

▪ Nouns
– positive: pleasure, enjoyment
– negative: pain, criticism
– subjective: prediction, feeling
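
The word lists above can be turned directly into a lookup table. The following is a minimal sketch: the words come from the slide, while the dictionary layout, the `word_sentiment` helper, and the "unknown" fallback are assumptions of this example.

```python
# Lexicon built from the word lists above; the layout is an assumption.
LEXICON = {
    "objective": {"red", "metallic"},
    "positive": {"honest", "important", "mature", "large", "patient",
                 "praise", "love", "pleasure", "enjoyment"},
    "negative": {"harmful", "hypocritical", "inefficient",
                 "blame", "criticize", "pain", "criticism"},
    "subjective": {"curious", "peculiar", "odd", "likely", "probable",
                   "predict", "prediction", "feeling"},
}


def word_sentiment(word):
    """Return the category of a word, or 'unknown' if it is not in the lexicon."""
    w = word.lower()
    for category, words in LEXICON.items():
        if w in words:
            return category
    return "unknown"


for w in ["honest", "harmful", "peculiar", "metallic", "keyboard"]:
    print(w, "->", word_sentiment(w))
```

A lookup like this ignores context entirely, so a word such as "large" is always scored positive even where size is a drawback.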




Clauses



▪ Might flip word sentiment
– “not good at all”
– “not all good”

▪ Might express sentiment not in any word
– “convinced my watch had stopped”
– “got up and walked out”
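
A common first approximation for clause-level flipping is a negation window. The sketch below is illustrative only: the tiny `POLARITY` lexicon, the `NEGATORS` list, and the window size of two tokens are assumptions, not something the slides prescribe.

```python
# Toy polarity lexicon and negator list, assumed for illustration only.
POLARITY = {"good": 1, "great": 1, "bad": -1, "boring": -1}
NEGATORS = {"not", "never", "no"}


def clause_score(clause, window=2):
    """Sum word polarities, flipping any word preceded by a negator within `window` tokens."""
    words = clause.lower().split()
    score = 0
    for i, w in enumerate(words):
        value = POLARITY.get(w, 0)
        # Look back at up to `window` preceding tokens for a negator.
        if value and any(p in NEGATORS for p in words[max(0, i - window):i]):
            value = -value
        score += value
    return score


for clause in ["good", "not good at all", "not all good", "got up and walked out"]:
    print(repr(clause), clause_score(clause))
```

The rule gives “not good at all” and “not all good” the same score, although the second is only partly negative, and it scores “got up and walked out” as neutral because no single word carries the sentiment. Both failures are the clause-level difficulties the slide points out.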










Some Problems



▪ Which features to use? Words (unigrams), phrases/n-grams, sentences

▪ How to interpret features for sentiment detection? Bag of words (IR), annotated lexicons (WordNet, SentiWordNet), syntactic patterns, paragraph structure

▪ Must consider other features due to…
– Subtlety of sentiment expression
• irony
• expression of sentiment using neutral words
– Domain/context dependence
• words/phrases can mean different things in different contexts and domains
– Effect of syntax on semantics
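
To make the unigram and n-gram feature choice concrete, here is a small bag-of-features sketch using only the Python standard library. The function names and the sample sentence are assumptions made for this example.

```python
import re
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_features(text, max_n=2):
    """Count all n-grams of length 1..max_n in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    features = Counter()
    for n in range(1, max_n + 1):
        features.update(ngrams(tokens, n))
    return features


features = bag_of_features("The movie was not good, not good at all")
print(features["good"])      # unigram count
print(features["not good"])  # bigram count
```

The unigram "good" alone looks positive, while the bigram "not good" keeps the negation attached. That is one reason phrases are often used alongside single words.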


Some Application Examples



▪ Review classification: Is a review positive or negative toward the movie?

▪ Product review mining: What features of the ThinkPad T43 do customers like/dislike?

▪ Tracking sentiments toward topics over time: Is anger ratcheting up or cooling down?

▪ Prediction (election outcomes, market trends): Will Obama or the Republican candidate win?

▪ Etcetera






Aster Data position for Text Analysis



▪ Data Acquisition: Gather text from relevant sources (web crawling, document scanning, news feeds, Twitter feeds, …)

▪ Pre-Processing: Perform processing required to transform and store text data and information (stemming, parsing, indexing, entity extraction, …)

▪ Mining: Apply data mining techniques to derive insights about stored information (statistical analysis, classification, natural language processing, …)

▪ Analytic Applications: Leverage insights from text mining to provide information that improves decisions and processes (sentiment analysis, document management, fraud analysis, e-discovery, ...)









Aster Data Fit

Third-Party Tools Fit





Aster Data Value: massive scalability of text storage and processing, functions for text processing, flexibility to develop diverse custom analytics and incorporate third-party libraries







