          Citation Analysis for the Free, Online Literature
                            Tim Brody
            Intelligence, Agents, Multimedia Group
                    University of Southampton


28 April 2004, Second Nordic Conference on Scholarly Communication
                     Content
• Current services for Open Access
  Literature
• Institutional Archives Registry
• Metadata Harvesting through Celestial
• Citebase Search
     – Citation Linking
     – Search and Navigation Service
• Web Impact as a predictor of Citation
  Impact
    Institutional Archives Registry




                 Sites in the IAR
• Things we want to know:
     – GNU EPrints sites
     – Other research collections (Other Archives, Open
       Journals)
     – BOAI-1 (self-archiving) vs. BOAI-2 (open-access journals)
• A submission form consisting of:
     – URL, Name, OAI URL, Country, ‘type’, full-text,
       software
• Can’t (yet) track full-texts
• (Create a master list so archives only register once?)

                    Celestial
• Designed to:
     – Be an abstraction over OAI-PMH versions
     – Cache OAI metadata records
• Technological questions:
     – How far can OAI-PMH scale? (OK for 5 million
       records so far)
     – How reliable are OAI-PMH implementations?
• Feeds Citebase, IAR, some external users
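The caching idea can be illustrated with a minimal sketch. It is not Celestial's actual code: the sample response, the `oai:example.org` identifiers and the `cache_records` function are assumptions made for illustration. It parses one OAI-PMH 2.0 `ListRecords` response (Dublin Core metadata) and keeps the records in a dictionary keyed by OAI identifier.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical ListRecords response; a real harvester would fetch this over HTTP.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
        <datestamp>2004-04-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example paper</dc:title>
          <dc:creator>Author, A.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header status="deleted">
        <identifier>oai:example.org:2</identifier>
        <datestamp>2004-04-02</datestamp>
      </header>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def cache_records(xml_text, cache):
    """Parse one ListRecords response and update `cache` in place."""
    root = ET.fromstring(xml_text)
    for record in root.iterfind("oai:ListRecords/oai:record", NS):
        header = record.find("oai:header", NS)
        identifier = header.findtext("oai:identifier", namespaces=NS)
        # A header marked as deleted carries no metadata: drop any cached copy.
        if header.get("status") == "deleted":
            cache.pop(identifier, None)
            continue
        dc = record.find("oai:metadata/oai_dc:dc", NS)
        cache[identifier] = {
            "datestamp": header.findtext("oai:datestamp", namespaces=NS),
            "title": dc.findtext("dc:title", namespaces=NS),
            "creators": [c.text for c in dc.findall("dc:creator", NS)],
        }
    return cache


cache = cache_records(SAMPLE_RESPONSE, {})
print(cache)
```

Downstream services could then read from such a cache instead of querying each archive directly, which is one plausible reading of the "abstraction" bullet above.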
             Services for Open Access Literature
[Figure: layered diagram of services]
• Layers labelled on the slide, base first:
     – Open Access Publishing
     – Self-Archived Full-texts (Pre/Post-prints)
     – Version Linking Services
     – Citation Analysis/Linking Services
       (Citebase / Citeseer / OpenURL / DOI)
     – Analysis & Assessment
     – Navigation Tools
     – Search Engines
     – OAI-PMH Transport
• Services placed on the diagram: arXiv.org, BMC, Citeseer,
  Citebase, Scirus, Google, OAIster
• n.b. Scirus/OAIster aren’t citation-analysis aware yet; Google
  indexes Citeseer. Not an exhaustive list …
         Citation Analysis & Linking
• A citation is a reference from one work to
  another [as a hyperlink: a citation link]
• Citation analysis uses citation
  relationships to analyse patterns in
  research
• As a graph: a work (paper, book, etc.) is a
  vertex and a citation is a directed edge
• ‘Bibliometrics’
     – (study of patterns in literature)
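A small sketch of the graph view, assuming made-up paper names and a plain edge list (the `CITATIONS` data and both function names are hypothetical). Each citation is stored as a directed edge from the citing work to the cited work, and a work's citation count is its number of incoming edges.

```python
from collections import defaultdict

# Hypothetical citation links: (citing work, cited work).
CITATIONS = [
    ("paper_c", "paper_a"),
    ("paper_c", "paper_b"),
    ("paper_d", "paper_a"),
    ("paper_d", "paper_c"),
]


def build_graph(citations):
    """Return (references, cited_by) adjacency maps for a list of edges."""
    references = defaultdict(set)  # outgoing edges: what each work cites
    cited_by = defaultdict(set)    # incoming edges: who cites each work
    for citing, cited in citations:
        references[citing].add(cited)
        cited_by[cited].add(citing)
    return references, cited_by


def citation_counts(cited_by):
    """Citation count of a work = number of incoming edges (in-degree)."""
    return {work: len(citers) for work, citers in cited_by.items()}


references, cited_by = build_graph(CITATIONS)
counts = citation_counts(cited_by)
print(sorted(counts.items()))
```

Works that are never cited (here `paper_d`) simply do not appear in `counts`; a fuller analysis would list them with a count of zero.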
 Digitometric/Informetric Analysis
• Bibliometrics for the online age
• Couple citation analysis with Web analysis
     – (how many times has x been accessed?)
• Similar to readership studies, but easier to
  survey and more comprehensive
     – (though subject to the same problems of
       copies being re-distributed, multiple accesses
       etc.)

                     Citebase Search
[Figure: Citebase architecture diagram]
• Components labelled on the slide:
     – Repositories
     – Metadata Harvest (OAI-PMH), Full-text Harvest
     – Meta Database, References Database, Citation Database
     – Web Interface, OAI-PMH Interface
     – Citebase


                Citation Linking
• Retrieve and cache full-texts
     – LaTeX, PDF, XML
• Extract reference list
• Extract individual references
• Parse references into components
     – Author, year, title, journal, volume, pagination
• Store in structured database
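The extraction and parsing steps can be sketched as follows. This is only an illustration under strong assumptions, not Citebase's parser: it assumes a numbered reference list (`[1]`, `[2]`, …) and one citation style, "authors, journal volume, page (year)", and it ignores titles. The sample references are invented, and `split_references` and `parse_reference` are hypothetical names.

```python
import re

# Hypothetical reference list as it might appear at the end of a full-text.
SAMPLE_LIST = (
    "[1] A. Author and B. Writer, J. Hypothet. Phys. 1, 23 (2000).\n"
    "[2] C. Scholar, Hypothet. Lett. B 45, 678 (2001)."
)

# One assumed style: "authors, journal volume, page (year)".
REFERENCE_PATTERN = re.compile(
    r"^(?P<authors>.+?),\s+"               # author list
    r"(?P<journal>[A-Z][A-Za-z. ]+?)\s+"   # abbreviated journal name
    r"(?P<volume>\d+),\s+"                 # volume
    r"(?P<page>\d+)\s+"                    # first page
    r"\((?P<year>\d{4})\)\.?$"             # year in parentheses
)


def split_references(reference_list):
    """Split a numbered reference list into individual reference strings."""
    parts = re.split(r"\[\d+\]", reference_list)
    return [" ".join(part.split()) for part in parts if part.strip()]


def parse_reference(reference):
    """Return the reference's components as a dict, or None if unparsed."""
    match = REFERENCE_PATTERN.match(reference)
    return match.groupdict() if match else None


for ref in split_references(SAMPLE_LIST):
    print(parse_reference(ref))
```

Returning `None` for references that do not fit the pattern keeps unparsed entries visible, so they can be counted or handled by another rule rather than silently dropped.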

                Citebase Search




                Citebase Search:
           Navigation by Citation Links
[Figure: citation links around the current article]
• Labels on the slide: Past, Current Article, Future,
  Related, Co-cited
• Legend: article with reference list; reference link
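A minimal sketch of this kind of navigation, assuming (as a reading of the slide's labels, not something the slide states) that "Future" means later articles citing the current one, "Past" means the articles it references, and "Co-cited" means articles that appear in the same reference lists as the current one. The `LINKS` data and the function names are hypothetical.

```python
# Hypothetical reference links: (citing article, cited article).
LINKS = [
    ("new_1", "current"),
    ("new_1", "other_x"),
    ("new_2", "current"),
    ("new_2", "other_x"),
    ("new_2", "other_y"),
    ("current", "old_1"),
]


def cited_by(article, links):
    """'Future' direction: articles whose reference lists include `article`."""
    return {citing for citing, cited in links if cited == article}


def references_of(article, links):
    """'Past' direction: articles that `article` itself references."""
    return {cited for citing, cited in links if citing == article}


def co_cited_with(article, links):
    """Count how often other articles share a reference list with `article`."""
    counts = {}
    for citing in cited_by(article, links):
        for other in references_of(citing, links):
            if other != article:
                counts[other] = counts.get(other, 0) + 1
    return counts


print(cited_by("current", LINKS))
print(references_of("current", LINKS))
print(co_cited_with("current", LINKS))
```

Sorting the co-citation counts in descending order would give one simple ranking of related articles.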




          Predicting Citation Impact
• The Web gives us access to new metrics
     – Download/access frequency
• Can early-day ‘download’ frequency give an
  indication of longer-term citation frequency?
• (Web logs from the UK arXiv.org mirror, Citation
  data from Citebase Search)
• Pearson correlation between downloads and citations,
  using 6 months of web logs: r = 0.42 for the High
  Energy Physics sub-arXiv
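For readers unfamiliar with the measure, here is a small sketch of a Pearson correlation between per-paper download and citation counts. The numbers are invented for illustration and are not the arXiv or Citebase data; `pearson_r` is a hypothetical helper written out by hand rather than taken from a library.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-paper counts: early downloads and later citations.
downloads = [12, 40, 7, 55, 23, 31]
citations = [1, 5, 0, 4, 6, 2]

print(round(pearson_r(downloads, citations), 2))
```

A value near 1 would mean early downloads track later citations closely; a value near 0 would mean they carry little linear information about them.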


[Figure: correlation (r), axis 0 to 0.5, plotted against days since deposit, axis 0 to 800]
           Assessing Research(ers)
• Citation Impact
     – By-Paper, Author, [Journal, Institution]
• Web Impact
     – A predictor of citation impact; can be combined
       with citation impact
• Search Engines
• More detailed research assessment

    Comparing Online/Offline Impact
• Using ISI CD-ROM data
• Use Web crawlers to find ‘online’ articles
• Compare citation impact of online and
  offline articles
     – By discipline, by journal, by author?
• Initial results for Physics show a 2-3x
  increase in citation impact for online articles
     – arXiv.org
• Southampton, U. Quebec, Oldenburg (de)
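One simple way to express such a comparison is the ratio of mean citation counts for online and offline articles within each discipline. The sketch below uses invented records, not ISI data, and the `impact_ratio_by_discipline` function is a hypothetical illustration rather than the method used in the study.

```python
from statistics import mean

# Hypothetical records: (discipline, found online?, citation count).
ARTICLES = [
    ("physics", True, 12),
    ("physics", True, 8),
    ("physics", False, 4),
    ("physics", False, 6),
    ("maths", True, 3),
    ("maths", False, 3),
]


def impact_ratio_by_discipline(articles):
    """Ratio of mean citations, online vs offline, for each discipline."""
    groups = {}
    for discipline, is_online, cites in articles:
        split = groups.setdefault(discipline, {True: [], False: []})
        split[is_online].append(cites)
    ratios = {}
    for discipline, split in groups.items():
        # Skip disciplines missing one group or with no offline citations.
        if split[True] and split[False] and mean(split[False]) > 0:
            ratios[discipline] = mean(split[True]) / mean(split[False])
    return ratios


print(impact_ratio_by_discipline(ARTICLES))
```

The same grouping could be keyed by journal or by author instead of discipline.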
                Relevant Web Pages
• EPrints – http://www.eprints.org/
     – IAR: http://archives.eprints.org/
• Citebase Search
     – http://citebase.eprints.org/
• Celestial
     – http://celestial.eprints.org/
• Correlation Generator
     – http://citebase.eprints.org/analysis/correlation.php

• Tim Brody <tdb01r@ecs.soton.ac.uk>