The Library of Congress
Cooperative Web Archiving Project
November 4, 2009
Abbie Grotke, Library of Congress
Grant Harris, Library of Congress
Jennifer Long, Georgetown University
Agenda
• LC’s Web archiving program
• Overview of the Cooperative Project
• Featured Partner: Georgetown University
• Lessons Learned
Library of Congress Web Archives:
loc.gov/lcwa
LC Collections: over 130 TB
– US National Elections: 2000, 2002, 2004, 2006, 2008
– Iraq War 2003
– September 11, 2001 & September 11 Remembrance 2002
– Olympics 2002
– Congress: 106th, 107th, 108th, 109th, 110th
– Supreme Court Nominations
– Legal Blawgs
– Papal Transition
– Overseas Operations: Indian and Indonesian Elections
– Case Studies: health care, terrorism, visual image content,
organizational Web sites, Crisis in Darfur, "single site"
http://www.loc.gov/webarchiving/projects.html
Organizational Structure
CURATORS/RECOMMENDING OFFICERS
In Library Services, Congressional Research Service, and the Law Library. Pick the collections and what URLs to archive, and research who to contact for permission.
WEB ARCHIVING TEAM
In the Office of Strategic Initiatives (OSI). We are project managers and technical staff focused on capture, tools, and permissions.
INFORMATION TECHNOLOGY OFFICE and TECHNICAL ARCHITECTURE TEAM
Also in OSI. Supports Wayback and Web Curator Tool development, Repository development and Data Transfers. Contractors are also used in this area.
BIBLIOGRAPHIC ACCESS
MODS records are created in Library Services: the Network Development & MARC Standards Office & Acquisitions & Bibliographic Access staff do the cataloging.
Collaborations and Partnerships
• Early collections: Election 00 and 02,
September 11
• End of Term Project
• Hurricane Katrina Archive
• IIPC – upcoming Olympics Collection
• NDIIPP Partners
• K-12 Web Archiving
• Cooperative Archive-It projects
Problem
• Web content that will be important for future
research is disappearing before it can be
collected
• Identification of sites, and review of captured
sites, is labor-intensive; LC staff are stretched
thin
• Outside institutions may not have
resources/budgets for collecting web sites
Cooperative Archive-It Project Concept
• Enlist Library Services subject experts to identify
international and national high-value collecting areas,
with a focus on foreign countries experiencing volatile
political situations
• Enlist Library Services subject experts to identify
scholarly centers, or partner institutions, with
recognized expertise in the collecting areas, to assist in
the collection and preservation of important at-risk
materials
• Prioritize collecting areas/centers of expertise (7
priority areas selected)
Goals
• To enable institutions outside the Library to gain
experience creating Web site collections
• To extend the network of NDIIPP partners working to
identify and collect high-value, at-risk Web materials
• To develop subject-area collections that could become
part of the Library’s collections in the future, and
• To broaden the understanding of issues related to the
development of curated collections of Web content.
Library of Congress agreed to:
• Establish and fund an Archive-It account for the partner
for up to one year (with possible extension);
• Provide support as needed;
• Provide subject matter expertise as requested by the
partner;
• Invite partner institutions to at least one conference at
the Library (if funding is available);
• Maintain a second copy of the harvested content.
Each Center Was Asked To:
• Identify high-risk, high-value web sites for their
area, and use Archive-It to harvest the sites;
• Document their selection criteria and provide them
to the Library;
• Document issues, lessons learned, etc. related
to their web collecting;
• Participate in a conference with Library experts
and other participants (if scheduled).
• Electronic Literature Organization
– Collection: Literary Sites
– Dates: July 12, 2008 – (ongoing)
– Size: 9,214,920 documents, 401.29 GB
• George Washington University, Institute for European, Russian, and Eurasian Studies
– Collection: Russian Parliamentary Elections, Dec. 07, and the Russian Presidential Election 08
– Dates: August 13, 2007 – August 12, 2008
– Size: 18,175,664 documents, 870.09 GB
• Georgetown University
– Collection: Belarus, Moldova, Ukraine
– Dates: September 17, 2007 – (ongoing)
– Size: 19,880,435 documents, 580 GB
• University of North Carolina, Chapel Hill
– Collection: Islam in Asia
– Dates: September 27, 2007 – February 1, 2008
– Size: 3,856,205 documents, 105.35 GB
• Stanford University Libraries, Islamic Studies
– Collection: Iranian Blogs
– Dates: February 29, 2008 – (ongoing)
– Size: 27,997,040 documents, 2,099.70 GB
• George Washington University, Center for Global Health
– Collection: Avian bird flu in Asian countries
– Dates: June 3, 2008 – January 6, 2009
– Size: 18,699,986 documents, 640.6 GB
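The partner collection figures above can be tallied to get a sense of the project's overall scale. The following is a minimal sketch in Python, not part of the original presentation: the document counts and sizes are copied from the slide, while the `collections` list and the `summarize` helper are hypothetical names introduced only for illustration.

```python
# Figures copied from the partner collections slide:
# (partner, documents captured, size in GB).
# The variable and function names are illustrative assumptions.
collections = [
    ("Electronic Literature Organization", 9_214_920, 401.29),
    ("George Washington University, IERES", 18_175_664, 870.09),
    ("Georgetown University", 19_880_435, 580.0),
    ("University of North Carolina, Chapel Hill", 3_856_205, 105.35),
    ("Stanford University Libraries, Islamic Studies", 27_997_040, 2_099.70),
    ("George Washington University, Center for Global Health", 18_699_986, 640.6),
]


def summarize(rows):
    """Return total documents, total size in GB, and the largest partner by size."""
    total_docs = sum(docs for _, docs, _ in rows)
    total_gb = round(sum(gb for _, _, gb in rows), 2)
    largest_partner = max(rows, key=lambda row: row[2])[0]
    return total_docs, total_gb, largest_partner


total_docs, total_gb, largest = summarize(collections)
print(f"{total_docs:,} documents, {total_gb:,.2f} GB in total")
print(f"Largest collection by size: {largest}")
```

Taken at face value, these figures add up to roughly 97.8 million documents and about 4.7 TB across the six partner collections, with the Stanford Iranian Blogs collection the largest by size.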
Featured Partner: Georgetown University
Belarus, Moldova, Ukraine Collection
• Proposed by LC Curator: Grant Harris
• Aim: the web capture of fragile websites from
Belarus, Moldova, and Ukraine, to include
selected government websites, opposition
parties, ethnic and religious groups, elections,
and security issues.
Lessons Learned
• Finding good partners was KEY - partners should
be committed and really "get" the concept of
web archiving and archiving primary source
materials
• Crawling ALL of Twitter – not so good.
• Confusion over LC’s own web archiving program
vs. this project
Lessons Learned
• Collaborative collection building is a good thing
– New partnerships formed
– New ways for our curators to get engaged with
web archiving
– LC might not have been able to archive some of the
collected content on its own (permissions, staff
time, etc.)
Next Steps
• Three partners collecting for at least another
year: ELO, Georgetown, and Stanford
• Focus on description and access: George
Washington University/Russian Elections
• Future: Data transfer to LC
For more information
• LC Web Archiving:
http://www.loc.gov/webarchiving/
• LCWA: http://loc.gov/lcwa/
• National Digital Information Infrastructure
and Preservation Program:
http://www.digitalpreservation.gov/
• Georgetown’s Archive-It collections:
http://archive-it.org/public/partner?id=168
Questions?
• Abbie Grotke abgr@loc.gov
• Grant Harris grha@loc.gov
• Jennifer Long longj@georgetown.edu