Open Access; Open Data
I647
Fall 2006
Budapest Open Access Initiative
• Based on:
– Self archiving by authors
– Open Access journals, e.g., BioMed Central
• http://www.soros.org/openaccess/
Open Access
• Institute of Physics: most papers free for 30 days
after publication
– http://www.iop.org/EJ/ and
http://www.iop.org/EJ/journal/NJP
• Public Library of Science
– http://www.publiclibraryofscience.org
• Highwire Press
– http://www.highwire.org/
• PubMed Central
– http://www.pubmedcentral.nih.gov/
Opposition to Open Access
• Reacting to NIH’s proposed policy on open
access, C&EN Editor Rudy Baum says:
“[This] action will inflict long-term damage on
the communication of scientific results and
on maintenance of the archive of scientific
knowledge.”
-- C&EN, September 20, 2004, p. 7
Open Access + Semantic Web
• "Almost all of an author's output (compounds,
spectra, reactions, properties, etc.) is nowadays
computerised and in principle redistributable to
the community for re-use. Few journals actively
validate the primary data (e.g. spectra) involved
in a publication (chemical crystallography being
a clear exception where data are intensively
reviewed by machine). We reassert that
chemists must now move towards publishing
their collective knowledge in a systematic and
easily accessible form for re-use and
innovation....
Open Access + Semantic Web
• We urge that authors, funders, editors,
publishers and readers move further towards the
following protocol:
[1] All information should be ultimately machine-
understandable in XML....
[2] Machine-understandable information for a compound
should include a connection table, the IUPAC unique
identifier (INChI) which guarantees that the
connection table can be checked and regenerated,
and a name....
[3] Rights metadata.”
-- Murray-Rust, Rzepa, Tyrrell, Zhang (2004)
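As a rough illustration of points [1] to [3], the sketch below builds a small XML record for one compound: a name, an InChI, a connection table, and a rights statement. The element and attribute names (compound, identifier, connectionTable, atom, bond, rights) and the helper compound_record are hypothetical, chosen only for this example; they are not CML and not a schema proposed by the authors. The InChI shown is the present-day standard form for ethanol.

```python
import xml.etree.ElementTree as ET


def compound_record(name, inchi, atoms, bonds, rights):
    """Build an illustrative XML record; element names are hypothetical, not CML."""
    root = ET.Element("compound")
    ET.SubElement(root, "name").text = name
    ET.SubElement(root, "identifier", {"type": "InChI"}).text = inchi

    # Connection table: one element per atom, one per bond.
    table = ET.SubElement(root, "connectionTable")
    for atom_id, element in atoms:
        ET.SubElement(table, "atom", {"id": atom_id, "element": element})
    for begin, end, order in bonds:
        ET.SubElement(table, "bond", {"from": begin, "to": end, "order": str(order)})

    # Rights metadata travels with the data (free text here, as an assumption).
    ET.SubElement(root, "rights").text = rights
    return root


record = compound_record(
    name="ethanol",
    inchi="InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3",
    atoms=[("a1", "C"), ("a2", "C"), ("a3", "O")],  # heavy atoms only
    bonds=[("a1", "a2", 1), ("a2", "a3", 1)],
    rights="placeholder rights statement",
)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Because the record carries both the connection table and the identifier, a reader's software could in principle regenerate one from the other and check that they agree, which is the kind of validation point [2] has in mind.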
Google Digitization Plans
• Digitize all content of:
– University of Michigan
• committed to complete digitization of all 7 million volumes in
its collection, excluding its rare books and other fragile
material
– Harvard University
– New York Public Library
– Stanford University
• Aimed at out-of-print material, whether public
domain or in copyright
• Opportunity for libraries to concentrate their local
digitization on truly unique or special holdings
Getting at the Data
• New CAS Information Use Policies
– http://www.cas.org/infopolicy.html
• STN’s Information Keep & Share Program
– http://info.cas.org/copyright/index.html
• SciFinder Scholar download restrictions:
100 items at a time
Data Analysis Tools
• STN’s Analyze and Tabulate feature
• STN Express with Discover! (Analysis
Edition)
• Limited access because of A&I publishers’
reluctance to release the data
InChI
• IUPAC-NIST Chemical Identifier
• a unique, non-proprietary label for chemical
substances that can be used in printed and
electronic data sources, making it easier to link
diverse data compilations
• latest version handles:
– organic, covalent structures
– inorganic and organometallic compounds
• http://chemdata.nist.gov/IChI/INChIv11b.zip
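An InChI string is built from layers separated by "/": a version, a chemical formula, then layers marked by a one-letter prefix (for example "c" for connections and "h" for hydrogens). The sketch below splits a string into those layers. The function name split_inchi is hypothetical, and the code assumes the second field is always the formula and that each prefix appears once, which holds for simple cases like the one shown but not for every InChI.

```python
def split_inchi(inchi):
    """Split an InChI string into its layers (simplified, illustrative only)."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")

    parts = inchi[len(prefix):].split("/")
    layers = {"version": parts[0]}

    # Assumption: the field after the version is the chemical formula.
    if len(parts) > 1:
        layers["formula"] = parts[1]

    # Remaining layers start with a one-letter prefix such as 'c' or 'h'.
    for part in parts[2:]:
        if part:
            layers[part[0]] = part[1:]
    return layers


ethanol = split_inchi("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3")
print(ethanol)
```

For ethanol the connection layer "1-2-3" says that atom 1 is bonded to atom 2 and atom 2 to atom 3; the hydrogen layer records how many hydrogens sit on each of those atoms.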
Future
• XML and metadata
– Dymond (DYnamic Metadata ON Demand)
• Virtual journals (e.g., Virtual Journal of Nanoscale
Science and Technology)
• Copyright question and open access resolution
• Legal protection of databases
• Impact of InChI and CML
• Demise of Abstracting and Indexing Services?
Conclusion
• “The main challenge is for chemists to
recognise the value of making their data
machine-understandable, rather than
destroying it with traditional paper or slide-
focused publication and dissemination
processes.”
-- Murray-Rust, Rzepa, Tyrrell, Zhang (2004)
Parting words . . .
If you're not part of the solution, you're part
of the precipitate!
Bibliography
• Gasaway, Laura. “The open archives movement.”
Information Outlook October 2004, 8(10), 36, 39-40.
• Murray-Rust, Peter; Rzepa, Henry S.; Tyrrell, Simon M.;
Zhang, Yong. “Representation and use of chemistry in
the global electronic age.” Organic & Biomolecular
Chemistry 2004, 2(22), 3192-3203.
http://www.ch.ic.ac.uk/rzepa/obc/ (preprint)
• Townsend, Joe A.; Adams, Sam E.; Waudby,
Christopher A.; de Souza, Vanessa K.; Goodman,
Jonathan M.; Murray-Rust, Peter. “Chemical
documents: machine understanding and automated
information extraction.” Organic & Biomolecular
Chemistry 2004, 2(22), 3294-3300.