Information Filtering / Personalization

Luz M. Quiroga
Stimulate 2005


Stimulate 2005         IF-Personalization / Luz M. Quiroga
Information Filtering (IF) / Personalization
    What do we understand by IF?

    How is IF different from IR?

    Why might we need it?

    What does personalization mean to you?

    Do you make use of it? For what purpose?


IF / Personalization: issues and related concepts
(from class feedback)
    Blocking, delivering
    Profiles, information needs, user modeling
    Organizing, searching, finding, discovering
    Web design, usability, personas
    Databases, web, e-mail, distribution lists, blogs,
     communities of practice
    Recommenders, alerts, agents
    Privacy, ethics, trust

Information Filtering: variants

    SDI (selective dissemination of information)
    Current awareness
    Alert
    Routing
    Customization
    Recommenders
    Personalization


Main concepts in IF

      Information Filtering vs.
       Information Retrieval (definition)
      Profiles
      User models
      Agents




IF vs. IR: definitions of IF
    “a field of study designed for creating a
     systematic approach to extracting
     information that a particular person finds
     important from a larger stream of
     information” (Canavese 1994, p. 2).
    “tools … which try to filter out irrelevant
     material” (Khan & Card 1997, p. 305).
    A process of selecting things from a larger
     set of possibilities, then presenting them in
     a prioritized order (Malone et al. 1987).

Defining Information Filtering
Belkin & Croft, 1992. “IF and IR: two sides of the
  same coin”
 Typical characteristics of the IF process
    Document set: Dynamic

    Information need: Stable, long term, specified in
     a profile
    Profile: Highly personalized

    Selection process: Delegated

 Filtering: “the process of determining which
  profiles have a high probability of being satisfied
  by a particular object from the incoming stream”
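
The filtering definition above can be illustrated with a minimal sketch: each incoming document is scored against every stored profile, and the document is routed to the profiles whose score passes a threshold. The keyword-weight representation of a profile, the `score` and `route` functions, the sample data and the threshold value are all illustrative assumptions, not part of Belkin & Croft's model.

```python
from collections import Counter

# Hypothetical long-term profiles: term -> weight (an assumed representation).
PROFILES = {
    "ana": {"filtering": 2.0, "profile": 1.5, "agent": 1.0},
    "ben": {"museum": 2.0, "guide": 1.0},
}

def score(profile, text):
    """Sum of profile weights times term counts in the document."""
    counts = Counter(text.lower().split())
    return sum(weight * counts[term] for term, weight in profile.items())

def route(text, profiles, threshold=2.0):
    """Return the users whose profile the document is likely to satisfy, best first."""
    scored = [(score(p, text), user) for user, p in profiles.items()]
    return [user for s, user in sorted(scored, reverse=True) if s >= threshold]

# A small stand-in for the dynamic document stream.
incoming = [
    "an agent builds a profile for information filtering",
    "a personalized guide to the museum",
    "weather report for today",
]
for doc in incoming:
    print(route(doc, PROFILES), "<-", doc)
```

Note the direction of the match: the profiles are stable and the documents arrive one by one, which is the reverse of an IR system, where a stable collection is matched against a one-off query.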
Retrieval System Model (Douglas Oard)

[Figure: block diagram of a retrieval system. Components: User, Query
Formulation, Detection, Index, Selection, Examination, Indexing, Docs,
Delivery]

IF System Model

[Figure: block diagram of an IF system. Components: Information need
(long term), User profile, Profile acquisition, Detection, Selection
(delegated: agent), Index, Examination, Delivery, Indexing, Docs (dynamic)]

Why do we need IF?
    Internet growth is exponential: MIDS (Matrix
     Information and Directory Services) home page:
     http://www.mids.org/
    One of the impacts of the Internet is that any person
     with access to it can become an author
     and a publisher. As a consequence, the quality of
     the information to be found on the Internet is
     extremely diverse and the quantity of information
     available is enormous (Lynch 1997)
                     Information overload

Information overload
    With the explosion of information, the major
     concern is not availability but obtaining
     the right information. Information that is
     highly important for one individual has no
     meaning for many others.
    “at least 99% of available data is of no
     interest to at least 99% of the users”
     (Bowman et al. 1994, p. 106).



  The need for IF: History
      1945: Vannevar Bush / Memex
        “... There is a new profession of trail blazers,
        those who find delight in the task of
        establishing useful trails through the
        enormous mass of the common record ...”
      1958: Luhn / Selective Dissemination of
       Information
      1965: Ted Nelson / Xanadu / Hypertext
            ... Professionals who would compete to create
             better trails, which would attract more users and
             royalties ...

The need for IF: History

         1969: Hollis & Hollis, “Personalizing
          Information processes”
                the amount of information was doubling every
                 seven to ten years
         1982: Denning (ACM president) / Filtering
          e-mail
         1987: Malone / Social filtering (collaboration,
          annotation in documents, groupware)


     The need for IF: History

        Information Filtering / User profiles / Agents

    We need a system that selectively weeds out the
     irrelevant information based on the user's
     preferences (user profile)

    The system will act on behalf of the user and
     will deliver selected, prioritized information
     (active, agent)

Profiles
   User characteristics; user preferences
   Profiles are the basis for the performance of IF
    systems:
        “the construction of accurate profiles is a key task -- the system’s
         success will depend to a large extent on the ability of the learned
         profile to represent the user’s actual interest” (Balabanović &
         Shoham 1997, p. 68)
        Building a “good” profile is still the central obstacle to achieving
         reasonable performance in IF systems

   Need: evaluation of IF (profiles)
        Fidel (corporations’ employees)
        Quiroga (consumer health information systems)

User modeling

   In order to build a good system in which a
    person and a machine cooperate to perform a
    task it is important to take into account some
    significant characteristics of people (Elaine
    Rich, 1983)
   User models are personal characteristics of the
    user that the system maintains (Chris Borgman)

        A profile can be thought of as a user model.


 Profiles, IF and User modeling
All information filtering models and systems are based
on modeling the user and representing the user's information
needs in the form of a profile [1]
A conceptual framework for the design of IF systems
comes from two established lines of research: IR & User
Modeling [2]

[1] Shapira, Peretz & Hanani. Dept. of Industrial Engineering, Ben Gurion
University; Dept. of IS, Bar-Ilan University
[2] Oard & Marchionini. University of Maryland


Agents
    Software programs that implement user
     delegation [1]
    A personal assistant who is collaborating with the
     user in the same work environment; information
     filtering is one of the many applications an agent
     can assist with [2]
    Mental agents / Society of agents. Each mental
     agent can only perform a small process; joining these
     agents in societies leads to true intelligence [3]
     [1] Jansen, James. PhD candidate, Texas University, Computer Science; US
     Military Academy. Research: combination of agents & search engines
     [2] Maes, Pattie. MIT Media Lab. Research: AI
     [3] Minsky, Marvin. The Society of Mind, 1986




Types of user models (Rich)
Depending on:

   The user being modeled
       Individual
       Canonical (stereotype; group)


   Acquisition model
       Explicit (stated)
       Implicit (inferred)


Individual / Canonical user models
(Elaine Rich)
    Individual: Each user with one interface,
     appropriate to his/her needs; emphasis on
     individual differences
    Canonical [stereotype, group]: The user is
     part of a group; interface for the group;
     emphasis on what the group has in common
         Shared knowledge; communities of practice
         Collaborative filtering
         Influencing the design of web sites for e-commerce

   Individual / Canonical user models
   (Elaine Rich)
  GRUNDY: an example of a canonical type of
  user model
  • A case study in the use of stereotypes
  • Grundy recommends novels that people might like to read
  • Stereotypes contain facets that relate to people’s taste in books
  • Grundy learns from user feedback: have they read it / liked it
  (reinforcement); if not, why?
  • Experiments showed that Grundy does significantly better with
  the user model than without it
  • It is a good start toward the construction of individual models


Explicit / Implicit user models (Rich)

    Explicit [stated]: the model is built by the system based on
     explicit information provided by the user
    Implicit [inferred]: the model is built by the system by means
     of a learning process based on:
         User feedback (inferred from responses)
         User behavior (inferred from actions) -> AGENTS

    Issues to consider:
         How to capture “user pre-knowledge”?
         User effort
         User control (acceptability, understanding)
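
As a rough sketch of the implicit (inferred) acquisition described above, the profile below is learned from a log of user feedback instead of being stated by the user. The additive update rule, the `update_profile` function, the `rate` parameter and the feedback log are assumptions chosen for illustration; they are not Rich's method or that of any particular system.

```python
from collections import defaultdict

def update_profile(profile, text, liked, rate=0.5):
    """Nudge term weights up for liked documents and down for disliked ones."""
    step = rate if liked else -rate
    for term in set(text.lower().split()):
        profile[term] += step
    return profile

profile = defaultdict(float)
# Hypothetical feedback log: (document text, did the user like it?)
feedback = [
    ("semantic web metadata", True),
    ("web shopping offers", False),
    ("semantic web ontologies", True),
]
for text, liked in feedback:
    update_profile(profile, text, liked)

# Strongest interests first; ties are broken alphabetically.
top = sorted(profile.items(), key=lambda kv: (-kv[1], kv[0]))
print(top[:3])
```

The user never sees or edits the weights here, which is exactly where the issues of user effort and user control listed above come in.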


ASIST: Closing keynote presentations.
Plenary debate: the future of IR, IF
    ASIST 2001
          James Hendler: chief scientist of the Information Systems Office at
           the Defense Advanced Research Projects Agency (DARPA). He has joint
           appointments in Computer Science, Electrical Engineering
           and Advanced Computer Studies at the University of
           Maryland, College Park
          Ben Shneiderman: professor in the Department of Computer
           Science at the University of Maryland, College Park. Founder of
           the Human-Computer Interaction Laboratory; fellow of the ACM; he
           received the ACM CHI Lifetime Achievement Award in 2001
    ASIST 2004
          Tim Berners-Lee: inventor of the WWW; currently director of the
           W3C (World Wide Web Consortium)


ASIST: Closing keynote presentations.
Plenary debate: the future of IR, IF
   James Hendler (ASIST 2001)
        Solution: AUTONOMOUS AGENTS. When we need
         information, one way to find it is to talk to an expert; both
         engage in a conversation; the expert learns about our
         needs, constraints and preferences; the expert presents
         options; we decide.
   Ben Shneiderman (ASIST 2001)
        Solution: good interfaces. With autonomous agents we
         lose control; we cannot trust agents; who has the power:
         the agent or the user?
   Tim Berners-Lee (ASIST 2004)
        The semantic web; ontological representation of
         knowledge (metadata)
        Critics: any system that requires metadata is bound to fail

Some other user modeling techniques

    Social and collective profiles
    Collaborative filtering
    Social data mining
    Filtering and communities of practice




Social Profiles
      Ardissono & Goy (1999)
            SETA: A recommender system for electronic
             shops
            Based on Stereotypes
            Profiles include “beneficiaries models”: user
             models for each third person for whom the
             shopper is selecting goods




Social profiles
   Petrelli et al (1999)
        Personalized guides to museums
        Based on stereotypes
        The study suggests including “family profiles”
         besides the individualized museum guide




Collaborative profiles
     A process where the system gives
      suggestions based on information gleaned
      from members of a community or peer
      group.
     Example: Amazon
     People who (bought, read) X
      also (bought, read) Y
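
The "people who bought X also bought Y" idea can be sketched with simple co-occurrence counts over a community's purchase histories. This is a minimal illustration under stated assumptions: the basket data and the `co_counts` and `also_bought` functions are hypothetical, and nothing here is claimed to be how Amazon's recommender actually works.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical purchase histories: user -> set of items.
BASKETS = {
    "u1": {"X", "Y", "Z"},
    "u2": {"X", "Y"},
    "u3": {"X", "W"},
    "u4": {"Y", "Z"},
}

def co_counts(baskets):
    """Count, for each item, how often every other item shares a basket with it."""
    counts = defaultdict(Counter)
    for items in baskets.values():
        for a, b in permutations(sorted(items), 2):
            counts[a][b] += 1
    return counts

def also_bought(item, baskets, n=2):
    """Items most often bought together with `item`, most frequent first."""
    ranked = sorted(co_counts(baskets)[item].items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:n]]

print(also_bought("X", BASKETS))
```

No individual profile is consulted: the suggestion for one user comes entirely from what other members of the group did, which is what distinguishes a collaborative profile from an individual one.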



Social data mining


                 Blogs
                 Communities of practice / knowledge
                 sharing




Web usability / Personas /
User models for web design
    Sources:
         Personas: Setting the Stage for Building Usable
          Information Sites
          By Alison J. Head
          http://www.infotoday.com/online/jul03/head.shtml
         Alan Cooper, The Inmates Are Running the
          Asylum: Why High-Tech Products Drive Us Crazy
          and How to Restore the Sanity, Indianapolis:
          Sams, 1999


    Web usability / Personas /
    User models for web design
    Personas are hypothetical archetypes; imaginary
    Personas are defined by their goals (detailed)
    Developed through a series of ethnographic
     interviews with real and potential users.
      Demographic (quantitative) data, such as age,
        education, and job title. (similar to marketing
        segmentation)
      More important: collecting qualitative data
        (persona)
    Interfaces are built to satisfy personas' needs and
     goals.

Personalization and web design
Web usability / Personas
    Alan Cooper's original idea: using a fictitious user with
     a set of goals to guide and focus the design of a
     product.
    “His original idea was turned out into a rigorous form
     of user model, based on behavior patterns that
     emerge from ethnographic research.”
    “A set of personas represents the key behaviors,
     attitudes, skill levels, goals, and workflows of real
     people we interview and observe, which we then use
     along with scenarios to guide the product's
     functionality and design.”
    “The method has matured to the point that anyone
     trained in it should be able to get the same personas
     from the same data.”

Personalization: environments
where it is being used
      Databases
      Newsgroups, discussion lists
      Personal Information Management (desktop files, E-mail,
       bookmarks, etc.)
      News: electronic journals
      Search engines
      Web sites
        Business
        e-commerce
        e-health
        e-etc.


LIS 678: IF & Personalization
Examples of special topics (previous semesters)
    Privacy and personalization
    E-commerce and personalization
    Mining usage data for web personalization
    Machine learning and personalization
    Adaptive web sites: learning from visitor access patterns
    Children's information seeking for electronic resources
    Users' criteria for relevance in IF systems
    Patterns in the use of search engines
    Satisfaction of information users
    Individual differences in organizing, searching, retrieving
     and evaluating information
    Information retrieval technologies for special users

    LIS 678: IF & Personalization
    Examples of special topics (this semester)

    Personal Ontologies
    Personal Information Management
    Social / Collaborative filtering (wikis, blogs,
     community of practice)
    Desktop searching
    Semantic Web: metadata, XML, RDF
    Probabilistic IR / IF

LIS 678: IF & Personalization
Examples of projects (this semester)
    Technology and literacy in developing
     countries (panel)
    Business application of IF products
    Personalized ranking
    Semantic web and personalization




IF Independent studies

    Alex Guilloux: usability study of bookmarking
     behaviour; how the specificity level in the hierarchy of
     bookmarks affects relevance
    Susan Lin:
         Bookmarking software; specification for design
         Bookmarking habits of reference librarians (Information
          Architecture class)
    Steve Lum: Ontology mapping; bookmark mapping
     for collaborative filtering
    Jennifer Cambell: Personalization and communities
     of practice (evaluation)

LIS 678: Projects
      Evaluation, comparison of IR / IF systems (e.g. search engines;
       recommenders, personalization features in digital libraries and
       portals)
      Designing / running an IR/IF experiment (e.g. building a
       collaborative profile using a movie recommender; testing usability
       of a search interface; incorporating personalization in the design of
       a digital library)
      Analysis / design / prototype of an IR/IF component (e.g. a ranking
       algorithm; building a prototype of a searching interface; designing
       personalized web sites)
      Writing a paper: literature review, reaction paper on IR/IF/User
       modeling
      Conducting research or development on IF - User modeling (e.g.
       using faceted classification schemes for personalized web-IR;
       using bookmarks as a source of profiles; visualization for personal
       information management; observing users' searching behavior -
       children, young adults, patients, students, members of a
       community)

Exercises
   Use Sifter filtering system
    http://ella.slis.indiana.edu/~junzhang/demo.html

   Use the information filtering agent at:
    http://www.ics.uci.edu/~pazzani/Publications/ - download several papers of
    interest and see what recommendations you get

   Use the MovieLens system (http://movielens.umn.edu/): rate movies (you
    decide how many you need to rate to adjust your profile) and see what
    recommendations you get


For all exercises discuss:

   Content of the profile
   Does the profile represent the user's interests?
   To what extent do these systems allow the user control over their profile?



People / Resources

    Douglas Oard IF page:
     http://www.ee.umd.edu/medlab/filter/

    SIFTER Project
     http://sifter.indiana.edu/




People interested in IF at UH

    User modeling: Martha Crosby, David Chin
    User – Information interaction: Diane Nahl
    Filtering in corporations: Bob SW.
    Profile acquisition and representation: Luz
     Quiroga




Comments

    Comments, Questions?



    Thanks!



