caBIG™ Essentials:
Self-Paced Training Program


Prepared by the caBIG™ Documentation and Training
(D&T) Workspace – Last Updated: 8 November 2007


Training Overview:
Training Topic, Audience & Prerequisites


• Topic Statement: This training provides an overview of the caBIG™
  program and its offerings. It introduces caBIG™ tools, describes
  different ways to connect with caBIG™, and points to additional
  resources to support specific next steps.

• Target Audience: This training program is designed for
  organizations wishing to learn about and then connect with caBIG™.
  Specific audiences include leaders charged with planning and guiding
  caBIG™ deployment efforts in their organizations.


• Prerequisites: This training does not assume any pre-existing
  knowledge about caBIG™. Those familiar with caBIG™ may wish to
  skip material in Lesson 1, and proceed to later lessons.
Training Overview:

Lesson Plan


•   Lesson 1:   Introduction to caBIG™
•   Lesson 2:   caBIG™ Self-Assessment & Tools Overview
•   Lesson 3:   Focus on Clinical Trials Compatibility Framework
•   Lesson 4:   Focus on Life Sciences Distribution
•   Lesson 5:   Making A Tool caBIG™ Compatible – Overview
•   Lesson 6:   Focus on the Grid
•   Lesson 7:   Focus on caBIG™ Data Sharing and Security Framework
•   Lesson 8:   Conclusion
Training Overview:
Training Program Goals


• By the end of this training, you will be able to:

     • Describe the caBIG™ initiative and its goals.

     • Locate caBIG™ software tools, so that you can evaluate their benefits
       against your needs, and determine their appropriateness for
       implementation in your organization’s environment.

     • Explain the roles of interoperability, the Grid, and data sharing in
       achieving caBIG™ goals.

     • Evaluate and select specific actions that your organization can take to
       connect with caBIG™, including adopting caBIG™ tools and adapting
       your own tools to become caBIG™ compatible.
Lesson 1: Introduction to caBIG™


• This lesson introduces the purpose of caBIG™ and provides an
  overview of what it means to connect with caBIG™.

• Learning Objectives for this Lesson:

   •   Describe caBIG™ and Its Goals
   •   Introduce the caBIG™ Structure and Organization
   •   Introduce Key caBIG™ Concepts
   •   Explain What it Means to Connect with caBIG™
   •   Determine Your Path Forward
   •   Decide Where to Go Next in This Training
Lesson 1: Introduction to caBIG™
caBIG™ and Its Goals


• Launched in 2004, the cancer Biomedical Informatics Grid (caBIG™)
  is an information technology program that is redefining how cancer
  research is conducted.

• The vision of caBIG™ is a full cycle of integrated cancer research,
  extending from bench to bedside, and back again.

• We do this by:

     • Developing software tools that support cancer research efforts, and by
     • Establishing a common infrastructure that can be used to share data and
       applications across organizations.
Lesson 1: Introduction to caBIG™
caBIG™ and Its Goals


• The Goals of caBIG™ are to:

    • Connect cancer research communities through a shareable and
      interoperable infrastructure
    • Develop standard rules and a common language to more easily share
      information
    • Build or adapt tools for collecting, analyzing, integrating, and
      disseminating information associated with cancer research


      This training program is designed for people who want to support
      caBIG™ goals by adopting caBIG™ tools, applying the standards that
      support data sharing with others in the caBIG™ community, and
      connecting to the caBIG™ infrastructure. If your interest in caBIG™ is
      more general, please visit the caBIG™ Public Website:
      http://cabig.cancer.gov – Note: all links in this training open a new
      window/tab!
Lesson 1: Introduction to caBIG™
How Connecting with caBIG™ Benefits You


• Connecting with caBIG™ helps cancer research organizations:

    • Facilitate collaborative research through greater access to research data
      made available through the caBIG™ infrastructure.
    • Increase the research potential of data sets by facilitating the flow of
      data between multiple tools on the “bench to bedside” path.
    • Install free, customizable caBIG™ software applications to support the
      conduct of basic and clinical cancer research.
    • Learn about the tools and resources that other organizations are willing to
      share, and advertise their own presence and services to others.


• The increased data sharing made possible by connecting with
  caBIG™ increases the effectiveness and efficiency of cancer
  research, helping individual scientists, the cancer research
  community, and, ultimately, patients.
Lesson 1: Introduction to caBIG™
caBIG™: From Pilot to Enterprise


   caBIG™ began with the vision of creating a virtual web of
   interconnected data, individuals, and organizations that
   redefines how research is conducted, care is provided, and
   patients/participants interact with the research enterprise.

   PILOT PHASE (2004 – 2006)
   • 1,000 individuals from over 190 organizations
   • Developed a standards-based caGrid infrastructure
   • 40 applications that support basic/clinical cancer research

   ENTERPRISE PHASE (2007)
   • Rollout of available caBIG™ tools and infrastructure to
     NCI-designated Cancer Centers
   • Formation of an enterprise support network
   • Engagement with the broader biomedical research community
Lesson 1: Introduction to caBIG™
caBIG™ Structure and Organization


• caBIG™ is sponsored by the National Cancer Institute (NCI), and is
  administered by the National Cancer Institute’s Center for Biomedical
  Informatics and Information Technology.

• Many caBIG™ activities are carried out in management structures
  called workspaces.

    • Domain workspaces focus on specific areas of cancer research, and
      oversee the development of domain-focused software tools.

    • Cross-cutting and Strategic workspaces support these groups by
      developing standards, infrastructure, policies, and documentation.


• The following slides introduce these workspaces and their missions.
Lesson 1: Introduction to caBIG™
What it Means to Connect with caBIG™


• There are two basic ways to become connected with caBIG™:

   • Adopt a caBIG™ Tool and then Connect to the Grid: You can connect
     with caBIG™ by adopting pre-existing caBIG™ compatible tools and then
     connecting through caGrid. This is best if you need domain-specific tools,
     caBIG™ has tools that meet these needs, and you are prepared to install
     the caBIG™ tools and integrate them into your workflow.

   • Adapt a Tool to make it caBIG™ Compatible and then Connect to the
     Grid: If you have a tool (either created in-house or by a vendor) that is not
     caBIG™ compatible, you can achieve compatibility by reengineering the
     existing tool or creating an interface that maps an existing tool’s data
     structures and APIs to caBIG™ standards. Once the tool is compatible,
     you can connect to the Grid.


• The following sequence of diagrams shows how these options play out
  – ultimately aligning with the key concepts just covered.
Lesson 1: Introduction to caBIG™
Pathways to caBIG™ Compatibility

[Diagram: an ADOPT column (available caBIG™ applications App 1 and App 2) and an ADAPT column, both above caGrid]

• There are many existing caBIG™ compatible applications that can be
  adopted and then connected to the Grid with ease.

• Lessons 2-4 introduce these tools.
Lesson 1: Introduction to caBIG™
Pathways to caBIG™ Compatibility

[Diagram: an ADOPT column (available caBIG™ applications App 1 and App 2) and an ADAPT column, both above caGrid]

• Connecting a caBIG™ compatible application to the Grid (creating a Grid
  node) is facilitated by caBIG™ tools and infrastructure.

• Lesson 6 addresses this process in more depth.
Lesson 1: Introduction to caBIG™
Pathways to caBIG™ Compatibility

[Diagram: an ADOPT column (available caBIG™ applications App 1 and App 2) and an ADAPT column (existing applications App 3 and App 4), both above caGrid]

• You can also adapt your own tool to be caBIG™ compatible and then connect
  to the Grid. This is generally more labor intensive than adopting a tool,
  and will require support from a development team with certain skills
  and experience.

• We discuss this in Lessons 5 and 6.
Lesson 1: Introduction to caBIG™
Pathways to caBIG™ Compatibility

[Diagram: an ADOPT column (available caBIG™ applications App 1 and App 2) and an ADAPT column (existing applications App 3 and App 4), both above caGrid]

• You do not need to make the entire application compatible to connect to
  the Grid, just the elements that you wish to expose. Once compatibility is
  achieved, connecting to the Grid is facilitated by caBIG™ tools and
  infrastructure.

• Lessons 5 and 6 address these processes in more depth.
Lesson 1: Introduction to caBIG™
Pathways to caBIG™ Compatibility

[Diagram: an ADOPT column (available caBIG™ applications App 1 and App 2) and an ADAPT column (existing applications App 3 and App 4), both above caGrid]
Lesson 1: Introduction to caBIG™
Life on the Grid: Implementation Scenarios
Lesson 1: Introduction to caBIG™
Determining the Best Path for You


ADOPTING MAY BE BEST IF…

• You have requirements that caBIG™ tools can fulfill in the domains of
  clinical trials, bench research, imaging, and/or tissue banking.
• End users are open to learning new tools and integrating them into their
  workflow.
• You do not have a technical team available to build or reengineer
  software, but you can leverage internal resources to support a
  software installation effort.

  To learn about this path, you may want to focus on Lessons 2, 3 and 4.

ADAPTING MAY BE BEST IF…

• You have existing technology infrastructure and tools that meet
  the needs of the users, and that are already integrated into workflows.
• You have technical team(s) available with (1) software development
  expertise, particularly in the areas of data modeling and application
  programming interface (API) development, and (2) previous experience
  with the NCI’s vocabulary and data model tools/repositories.

  To learn about this path, you may want to focus on Lessons 5 and 6.
Lesson 1: Introduction to caBIG™
Determining the Best Path for You


• While “adopt or adapt” are the two basic ways to connect with
  caBIG™, hybrid approaches exist. Here are two examples:

     • Integrate a caBIG™ module within your existing bioinformatics
       infrastructure, and develop an interface to facilitate data exchange
       between them.

     • Create a caBIG™-compatible “wrapper” around an existing tool, so that
       data appropriate for external data sharing can be more easily served,
       while not reengineering an entire tool.

• Which options are best for your organization? The rest of this
  training program will help you determine the best choices based
  on your existing needs, infrastructure, and capability.
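
The “wrapper” approach described above can be pictured with a small sketch. Everything in it is an illustrative assumption rather than part of any caBIG™ API: the record type, the field names, and the mapping are hypothetical. The sketch only shows the general idea of leaving an existing tool untouched while exposing a mapped, shareable subset of its data.

```python
from dataclasses import dataclass


@dataclass
class LocalSpecimen:
    """Record shape used by a hypothetical in-house tool."""
    spec_id: str
    tissue: str
    patient_name: str  # internal only; not appropriate for external sharing


# Hypothetical mapping from local field names to shared, standardized names.
# Fields that are absent from this mapping are never exposed.
FIELD_MAP = {"spec_id": "specimenIdentifier", "tissue": "anatomicSite"}


class SharingWrapper:
    """Exposes only the mapped fields of the existing tool's records."""

    def __init__(self, records):
        self._records = list(records)

    def shared_view(self):
        # Rename each mapped field; unmapped fields are left out.
        return [
            {shared: getattr(rec, local) for local, shared in FIELD_MAP.items()}
            for rec in self._records
        ]


wrapper = SharingWrapper([LocalSpecimen("S-001", "lung", "Jane Doe")])
print(wrapper.shared_view())
```

The existing tool's record type is not modified; only the wrapper knows about the shared names, which is what keeps the effort smaller than reengineering the whole tool.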
Lesson 1: Introduction to caBIG™
Decide Where to Go Next in This Training


• Here are some tips on what to explore next in this training:

 If You Are Interested In…                       Visit…

 Accessing a self-assessment and seeing the      Lesson 2: caBIG™ Self-Assessment &
 range of tools that caBIG™ has to offer.        Tools Overview

 Focusing on caBIG™ tools related to clinical    Lesson 3: Focus on the Clinical
 trials management.                              Trials Compatibility Framework

 Learning more about caBIG™ tools related to     Lesson 4: Focus on Life
 bioinformatics research, bio-banking, and the   Sciences Distribution
 Grid.

 Determining whether your team is ready to       Lesson 5: Making a Tool
 engage in a development effort to adapt your    caBIG™ Compatible –
 existing tools to be caBIG™ compatible.         Overview

 Learning more about caBIG™ policies and         Lesson 7: Focus on caBIG™
 resources related to data sharing.              Data Sharing and Security
                                                 Framework
Lesson 1: Introduction to caBIG™

Lesson Review

• In this lesson, we:

    • Discussed the goals of caBIG™ and the benefits of getting connected.

    • Reviewed key concepts and terms related to the caBIG™ effort.

    • Introduced different methods of getting connected with caBIG™, including
      adopting caBIG™ tools and adapting existing tools.


• The next three lessons will focus on the different tools available
  from caBIG™ - these focus on connecting with caBIG™ through
  tools adoption.
Lesson 2: caBIG™ Self-Assessment &
Tools Overview

• This lesson begins by pointing to a self-assessment to help identify
  your readiness to connect with caBIG™. Then, we introduce you to
  the full range of tools available from caBIG™. While some
  organizations will adopt groupings of tools called bundles, others may
  want to adopt individual tools independent of those bundles. This
  lesson will also introduce you to the infrastructure offered by caBIG™.

• Learning Objectives for this Lesson:

    •   Point to a caBIG™ self-assessment
    •   Point to a comprehensive list of caBIG™ tools
    •   Introduce key elements of the caBIG™ Infrastructure
    •   Introduce the “Bundles” concept
Lesson 2: caBIG™ Self-Assessment & Tools

Starting with a Self-Assessment

• Getting started with caBIG™ involves three key steps:

    – Assess where you are – What tools and infrastructure does your
      organization already have? What’s your level of readiness to engage with
      caBIG™? Want a systematic set of questions to define your starting
      state? Review the caBIG™ Deployment Self-Assessment (PDF).

    – Determine where you want to be – Where are the gaps that caBIG™
      may be able to fill?

    – Build a plan for getting there – Decide what you will adopt and adapt,
      and how you will engage with the Grid.


• Understanding what’s available can help you determine where you are
  and where you want to go. The rest of this lesson will point you to the
  tools and infrastructure already available through caBIG™.
Lesson 2: caBIG™ Self-Assessment & Tools

caBIG™ Website Tools Landing Pages

•   The Tools pages on the caBIG™ website provide a full listing of caBIG™ tools, with
    information to support their use. Visit https://cabig.nci.nih.gov/tools/




Lesson 2: caBIG™ Self-Assessment & Tools
A Look at the caBIG™ Infrastructure

• caGrid (The Grid) and caCORE are two vital resources when
  connecting with caBIG™. We’ll talk more about these later, but visit
  https://cabig.nci.nih.gov/inventory/Infrastructure/ to preview the
  caBIG™ infrastructure – the backbone of caBIG™.




Lesson 2: caBIG™ Self-Assessment & Tools
A Starting List of Candidate Tools


• To help audiences identify related tools more easily, caBIG™ has
  identified three “tools bundles.” These are a subset of the full list of
  available caBIG™ tools.

• Two of these bundles include caGrid-enabled software components,
  which can interoperate or be used in workflows through the caGrid
  infrastructure. A third bundle deals with data sharing and compatibility
  policies.

• The three bundles are listed below; example tools from each appear on the next slide:

    • Clinical Trials Compatibility Framework
    • Life Sciences Distribution
    • Data Sharing and Security Framework
Lesson 2: caBIG™ Self-Assessment & Tools
Introducing the Bundles


              Compatibility Achieved through caBIG™ Bundles

Clinical Trials             Life Sciences          Data Sharing and
Compatibility Framework     Distribution           Security Framework
   • C3PR                      • CTODS                • caBIG™ Policies
   • PSC                       • caArray              • Processes and Best Practices
   • caAERS                    • caTissue             • Model Documents
   • caXchange                 • geWorkbench          • Trust Fabric
   • CTODS                     • caGWAS
   • caGrid                    • NCIA
                               • caGrid

 More in Lesson 3!           More in Lesson 4!      More in Lesson 7!
Lesson 2: caBIG™ Self-Assessment & Tools

Lesson Review

• In this lesson, we:

    • Presented the three central steps to connecting with caBIG™: Assessing
      where you are, deciding where you want to go, and planning to get there.
    • Provided links to existing tools and infrastructure provided by caBIG™
    • Introduced the concept of the caBIG™ bundles to facilitate tools adoption


• The next two lessons will focus more closely on the software
  bundles related to clinical trials and life sciences. Both lessons
  focus on connecting with caBIG™ through tools adoption.
Lesson 3: Focus on the Clinical Trials
Compatibility Framework

• This lesson provides a closer look at the Clinical Trials Compatibility
  Framework, including the caBIG™ Clinical Trials Suite, which is the
  software product component of the framework.

• Learning Objectives for this Lesson:

    • Describe the Purpose of the Framework and End User Benefits
    • Provide an Overview of the Clinical Trials Suite and
      Supporting Tools
    • Review Success Criteria
    • Decision Support: Adopt, Adapt or Hybrid
Lesson 3: Clinical Trials Compatibility Framework

Welcome to CTMS & Its Goals


•   The Clinical Trials Management Systems (CTMS) Workspace
    oversees the development and integration of tools for the framework.

•   Workspace goals include facilitating the following aspects of clinical
    trials:

    • Planning and instantiation of clinical trials (and monitoring of
      trials once they are instantiated)
    • Conduct of clinical trials
    • Reporting of clinical trials data to sponsors
    • Interoperability
        • Increase the ability to share data
        • Increase the ability to use the functionality of tools on other
           systems
Lesson 3: Clinical Trials Compatibility Framework
Purpose of the Framework


• The caBIG™ Clinical Trials Suite is an integrated, stable, and secure
  collection of interoperable software tools that support the management
  of study participant information through the clinical trial lifecycle.

• The current version of the framework enables management of tasks
  such as:

    • Screening and registering patients for accrual to clinical trials;
    • Scheduling and tracking of patient encounters during the course of a
      study;
    • Integrating laboratory results with the patient record;
    • Tracking and managing adverse events;
    • Capturing, storing, analyzing and routing clinical data in a meaningful
      manner.
Lesson 3: Clinical Trials Compatibility Framework
End User Benefits of the Framework


• Adopting the Clinical Trials Compatibility Framework will
  ultimately facilitate the management of clinical trials.

• The caBIG™ infrastructure supplements the software tools to
  support communication and security.

• These tools are available at no cost to the end user – they will
  all be available by early 2008.

• Once installed by an organization, the tools are designed to be
  used through a standard internet web browser.
caBIG™ Clinical Trials Compatibility
Framework Architecture


[Diagram: PSC, caXchange, caAERS, C3PR, and CTODS grouped as the Clinical Trials Compatibility Framework, shown above the Data Sharing & Security Framework, caGrid, and Data for Sharing]

• caAERS: Cancer Adverse Event Reporting System
• caXchange: Cancer Data Exchange System
• PSC: Patient Study Calendar
• C3PR: Cancer Central Clinical Participant Registry
• CTODS: Clinical Trials Object Data System
• caGrid: caBIG™ compatible systems architecture
Lesson 3: Clinical Trials Compatibility Framework
Cancer Adverse Event Reporting System (caAERS)


• Cancer Adverse Event Reporting System
  (caAERS) – Captures and manages reports
  describing adverse events that occur during
  clinical trials.

    • Allows the collection, management and querying
      of adverse event data (both routine and serious)
    • Provides configurable rules to facilitate
      management and reporting of adverse events at
      protocol, sponsor and institution levels
    • Generates customizable reports and submits them to
      external agencies, including NCI- and
      FDA-compliant reports
    • Supports regulatory compliance
    • Maps to vocabularies and coding systems
Lesson 3: Clinical Trials Compatibility Framework
caXchange


• caXchange – Facilitates the automatic capture of
  data from clinical systems and then performs an
  automatic translation and import to caBIG™
  compatible clinical trials databases.

    • Enables automatic transfer of clinical data from
      point-of-care systems in medical centers, such as
      clinical chemistry lab systems
    • Incorporates caXchange Lab Viewer, allowing
      viewing of clinical lab data imported from clinical
      chemistry and other lab systems
    • Allows labs to be selected for loading into clinical trials
      databases and/or adverse event reporting systems
    • Automatically flags lab values that may indicate
      toxicities
Lesson 3: Clinical Trials Compatibility Framework
Patient Study Calendar (PSC)


• Patient Study Calendar (PSC) – Enables clinical trial
  managers to schedule and manage treatment and
  care events for participants in a clinical trial.

    • Accommodates interventional, epidemiological (and
      population), and observational studies
    • Represents study workflow in time, process, and
      phases
    • Represents event-driven and date-driven behaviors
    • Allows creation, editing, and management of study
      calendar templates over the lifecycle of a study
    • Allows retrospective outcomes review and reporting of
      calendar activities
    • Tracks activities as they occur and provides a
      framework for reviewing historic study calendar events
    • Provides consent / re-consent notification and tracking
Lesson 3: Clinical Trials Compatibility Framework
Cancer Central Clinical Participant Registry (C3PR)


• Cancer Central Clinical Participant Registry
  (C3PR) – Tracks subject registrations to clinical
  trials.

    • Provides a repository for participant information
      across studies, sites, systems, and organizations
      as well as current enrollment statistics
    • Verifies registration criteria (study open,
      participant eligible, consent received)
    • Facilitates Summary 3 & 4 reporting by providing
      extracts of data elements needed for these
      reports
    • Manages study personnel who have access to
      the registry
    • Facilitates compliance with Federal regulations
      including 21 CFR Part 11, HIPAA and Section
      508
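
The registration criteria listed above (study open, participant eligible, consent received) amount to a simple all-conditions-must-hold check. The sketch below illustrates that logic only; it is not C3PR code, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RegistrationRequest:
    """Hypothetical summary of what is known when a registration is attempted."""
    study_open: bool
    participant_eligible: bool
    consent_received: bool


def unmet_criteria(req):
    """Return the names of the criteria that are not yet satisfied."""
    checks = {
        "study open": req.study_open,
        "participant eligible": req.participant_eligible,
        "consent received": req.consent_received,
    }
    return [name for name, ok in checks.items() if not ok]


def can_register(req):
    """A registration may proceed only when every criterion is met."""
    return not unmet_criteria(req)


pending = RegistrationRequest(study_open=True, participant_eligible=True,
                              consent_received=False)
print(can_register(pending), unmet_criteria(pending))
```

Reporting which criteria are unmet, rather than a bare yes/no, makes it easier for study staff to see what is still outstanding.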
Lesson 3: Clinical Trials Compatibility Framework

Clinical Trials Object Data System (CTODS)

• Clinical Trials Object Data System (CTODS) - Enables storing and
  sharing of clinical trials data in both identifiable and de-identified form.

    • Enables data from any Clinical Data Management System (CDMS)
      or data source to be available to the cancer research community
    • Provides the broader cancer research community with de-identified clinical
      trials data (data that have all patient identification information removed)
    • Consistent with the Biomedical Research Integrated Domain Group
      (BRIDG) model that underpins data interchange standards and technology
      solutions, which enable harmonization between the biomedical/clinical
      research and healthcare arenas.

    • Note: CTODS was previously known as CTOM (Clinical Trials Object
      Model).
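
As a rough illustration of what “de-identified” means here, the sketch below drops patient-identifying fields from a record before it is shared. It is a simplification under stated assumptions: the field names are hypothetical, the list of identifiers is not exhaustive, and real de-identification (for example under HIPAA) involves more than removing a few named fields. It does not represent how CTODS itself works.

```python
# Hypothetical, non-exhaustive set of fields treated as patient identifiers.
IDENTIFYING_FIELDS = {"name", "medical_record_number", "date_of_birth", "address"}


def deidentify(record):
    """Return a copy of the record without the identifying fields."""
    return {key: value for key, value in record.items()
            if key not in IDENTIFYING_FIELDS}


identified = {
    "name": "Jane Doe",
    "medical_record_number": "MRN-0001",
    "study_arm": "A",
    "adverse_event_grade": 2,
}
print(deidentify(identified))
```

The original record is left unchanged, so the identifiable form can remain available to authorized users while only the reduced copy is shared more broadly.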
Lesson 3: Clinical Trials Compatibility Framework

Integration and Interoperability Tools

• A vital part of the framework consists of the tools and infrastructure
  that help you connect local clinical trials data management
  investments to the caBIG™ network.

    • caGrid - caBIG™-compatible systems architecture
    • Clinical Data Management System (CDMS) Integration


• These also provide security features and access controls to ensure
  protection of human subject information and clinical research data.

• These are covered further on the following two slides.
Lesson 3: Clinical Trials Compatibility Framework

caGrid

• caGrid provides the services backbone for data
  and message exchange across all the tools.

    • Connects all the tools in the caBIG™ Clinical Trials
      Framework
    • Provides common identity and security
      management across the tools
    • Facilitates message transport and routing between
      systems
    • Manages secure access, query and retrieval of data
      across tools
    • Contains an index of registered services

• caGrid is the topic of Lesson 6.
Lesson 3: Clinical Trials Compatibility Framework
CDMS Integration

• Clinical Data Management System (CDMS) Integration enables the
  exchange of data between the framework tools and a caBIG™
  compatible clinical data management system.

    • Retrieves data from CDMS products that use the caBIG™ Common Data
      Elements (CDEs) and standard Case Report Forms (CRFs)
    • Supports conforming CDMS products, such as Cancer Central Clinical
      Database (C3D)
    • Reduces data entry errors and facilitates clinical trials workflows
Lesson 3: Clinical Trials Compatibility Framework
Installation and Infrastructure Basics


• The caBIG™ Clinical Trials Framework is a series of enterprise
  applications that must be installed in a sufficiently robust computer
  server environment.

• Examples of the dependent software that may be required include:

    •   Apache Ant, Maven, ServiceMix, and Tomcat
    •   Java SE Development Kit (JDK)
    •   MySQL Database, Oracle Database, or PostgreSQL Database
    •   caBIG™ compatible Clinical Data Management System (CDMS)
Lesson 3: Clinical Trials Compatibility Framework
Adoption Success Criteria


• Successful adoption of the caBIG™ Clinical Trials Framework
  includes:

    • Deploying and integrating one or more components of the caBIG™
      framework into the institution’s clinical research enterprise

    • Using caBIG™ harmonized Common Data Elements (CDEs) to capture
      information collected in clinical trials activities.

    • Identifying clinical data that can be made available electronically through
      the caBIG™ data sharing federation
Lesson 3: Clinical Trials Compatibility Framework
Decision Support: Adopt, Adapt or Hybrid


• Each organization’s circumstances and environment will be unique.

• The caBIG™ Deployment Self-Assessment can help decide whether
  the path to caBIG™ compatibility lies in:

    • Adoption of the full caBIG™ Clinical Trials Suite.
    • Adaptation of existing systems to connect to the Grid.
    • A hybrid approach that bridges existing, non-caBIG™
      systems to the caBIG™ infrastructure.


• Organizations can choose which of these paths, or which combination
  of these paths, best serves their needs.
Lesson 3: Clinical Trials Compatibility Framework
Lesson Review

• In this lesson, we:

    • Presented the purpose and end benefits of the Clinical Trials Compatibility
      Framework

    • Introduced the tools and infrastructure provided by the framework

    • Summarized success criteria related to adopting the framework, and
      reminded users of the different paths towards that end




   See Also: Getting Connected with caBIG™ Clinical Trials
   Compatibility Framework (PDF)
Lesson 4: Focus on the Life Sciences Bundle


• This lesson provides a closer look at the Life Sciences software
  bundle.

• Learning Objectives for this Lesson:

    • Describe the Purpose of the Bundle and End User Benefits
    • Provide an Overview of Bundle Tools
    • Review Success Criteria
Lesson 4: Life Science Distribution
Purpose and End User Benefits


• Modern Cancer Research involves:
     • Complex study designs, including capture and refinement of clinical data
       and selection of tissue/serum samples for molecular analysis
     • High-throughput molecular assays (e.g., RNA microarrays, ChIP-on-chip,
       protein gels, spectrometry, SNP chips, etc.)
     • Sophisticated analyses involving interdisciplinary teams of investigators,
       and advanced algorithms and software.
     • Integration of data and analyses (automated and human)

• The Life Sciences Distribution is a set of tools and infrastructure to
  provide support for interdisciplinary teams engaged in translational
  cancer research. It supports:
     • Data capture and analysis
     • Management of clinical trial and biospecimen resources.
     • Sharing of data and interchange of experimental results among
       geographically distributed teams of interdisciplinary investigators.
Lesson 4: Life Science Distribution
caBIG™ Life Sciences Distribution


• caArray: Microarray data management system
• caTissue: Biorepository management system
• caGWAS: Cancer Genome Wide Association Studies
• NCIA: National Cancer Imaging Archive
• geWorkbench: Microarray gene expression and sequence data management
• CTODS: Clinical Trials Object Data System
• caGrid: caBIG™ compatible systems architecture

[Figure: the Life Sciences Distribution tools (caArray, caTissue, caGWAS, NCIA, geWorkbench, CTODS) sit on top of the Data Sharing & Security Framework and caGrid to provide Data for Sharing.]
Lesson 4: Life Science Distribution
caArray – Microarray Data Management System


• caArray is a microarray data management system that guides annotation
  and supports the exchange of microarray gene expression data.

    • Provides both web browser-based and programmatic access to
      microarray data
    • Facilitates integration of array data with diverse data types, including
      clinical, imaging, tissue, and other functional genomics data, through
      harmonization with relevant caBIG™ models
    • Connects to analytical tools like geWorkbench and GenePattern

  See Also: caArray Tools Page
Lesson 4: Life Science Distribution
caTissue Core


• caTissue Core is a biorepository management system that supports
  collecting, processing, managing, annotating, requesting, and distributing
  biospecimens and associated information.

    • Provides browser-based and programmatic access to biospecimen data
    • Provides a means for collecting, processing, storing, and distributing
      specimens for correlative science cancer research
    • Manages tissue, fluid, cell, and molecular biospecimen information
    • Allows users to find and request specimens needed for use in molecular
      correlative studies

  See Also: caTissue Core Tools Page
Lesson 4: Life Science Distribution
caGWAS – Managing Large Association Studies


• Cancer Genome-Wide Association Studies (caGWAS) allows
  researchers to integrate, query, report, and analyze a variety of data
  types from multiple sources including microarray, genomic,
  immunohistochemistry, imaging, and clinical data through a single
  application

    • Facilitates rapid sharing of information
    • Accelerates the process of analyzing results from various biomedical
      studies
    • Allows researchers and bioinformaticians to access and analyze clinical
      and experimental data across multiple clinical studies
Lesson 4: Life Science Distribution
NCIA - National Cancer Imaging Archive


• National Cancer Imaging Archive (NCIA) is a searchable repository of
  in vivo cancer images, such as CT, MRI, and digital X-rays. NCIA also
  contains annotation files (PDF, image markup) and annotation data provided
  by a curator. Cancer images are integrated with clinical and genomic data.

    • Enables development of imaging resources that will lead to improved
      clinical decision support
    • Provides an accessible repository for images along with key annotations
    • Accelerates diagnostic imaging decision-making and quantitative imaging
      assessment of drug response
    • Serves as a platform for image data management and integration with
      other research data types

  See Also: NCIA Tools Page
Lesson 4: Life Science Distribution
geWorkbench – Expression and Sequence
Analysis Tools

• geWorkbench is a desktop bioinformatics platform that offers a
  comprehensive and extensible collection of tools for the management,
  analysis, visualization, and annotation of microarray-based gene expression
  and sequence data.

    • Desktop application with a powerful graphical interface
    • Enables integrated analysis of genomic data (gene expression,
      sequence, pathway, structure)
    • Brings together analysis and visualization tools for gene expression,
      sequences, pathways, and other biomedical data
    • Provides seamless access to databases, computational services, and
      biological annotation sources
    • Enables sophisticated analysis of genomic data through the integration
      of visualization tools, external databases, and computational services

  See Also: geWorkbench Tools Page
Lesson 4: Life Science Distribution

Clinical Trials Object Data System (CTODS)

• Clinical Trials Object Data System (CTODS) - Enables storing and
  sharing of clinical trials data in both identifiable and de-identified form.

    • Enables data from any Clinical Trials Data Management System (CDMS)
      or data source to be available to the cancer research community
    • Provides the broader cancer research community with de-identified clinical
      trials data (data that have all patient identification information removed)
    • Consistent with the Biomedical Research Integrated Domain Group
      (BRIDG) model that underpins data interchange standards and technology
      solutions, which enable harmonization between the biomedical/clinical
      research and healthcare arenas.
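
The idea of de-identified data (records with patient identification information removed) can be illustrated with a minimal Python sketch. This is not how CTODS is implemented, and real de-identification covers far more than dropping a few fields; the field names in `IDENTIFYING_FIELDS` and the sample record are hypothetical.

```python
# Hypothetical field names; a real system would follow its own data model
# and the applicable privacy rules.
IDENTIFYING_FIELDS = frozenset(
    {"patient_name", "medical_record_number", "date_of_birth", "address"}
)


def deidentify(record, identifying_fields=IDENTIFYING_FIELDS):
    """Return a copy of the record without the identifying fields."""
    return {
        key: value
        for key, value in record.items()
        if key not in identifying_fields
    }


record = {
    "patient_name": "Jane Doe",
    "medical_record_number": "MRN-001",
    "diagnosis": "breast carcinoma",
    "treatment_arm": "A",
}
shared = deidentify(record)
print(shared)  # {'diagnosis': 'breast carcinoma', 'treatment_arm': 'A'}
```

The original record is left untouched, so the identifiable form can stay inside the institution while only the reduced copy is shared.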
Lesson 4: Life Science Distribution

caGrid

• caGrid is a service-oriented architecture and federation that connects
  caBIG™-compatible systems together regardless of where they are
  installed.

     • Query across data resources installed in different locations
     • Automatically integrate comparable data from different sources
     • Create workflow pipelines for data retrieval and analysis using
       resources across the grid

  See Also: caGrid Page


• caGrid is the topic of Lesson 6.
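
Querying across resources at different locations can be pictured with a small Python sketch. It is an illustration only and does not use the caGrid API: in caGrid the sources would be remote grid services, whereas here they are in-memory lists, and the site names, field names, and `federated_query` function are hypothetical.

```python
def federated_query(sources, predicate):
    """Apply one predicate to every source and tag each hit with its origin."""
    hits = []
    for site, records in sources.items():
        for record in records:
            if predicate(record):
                hits.append({**record, "source": site})
    return hits


# Two hypothetical cancer centers that use the same field names.
sources = {
    "center_a": [
        {"subject": "A-1", "diagnosis": "breast cancer"},
        {"subject": "A-2", "diagnosis": "lung cancer"},
    ],
    "center_b": [
        {"subject": "B-1", "diagnosis": "breast cancer"},
    ],
}

matches = federated_query(sources, lambda r: r["diagnosis"] == "breast cancer")
for match in matches:
    print(match["source"], match["subject"])
```

The single predicate works on both sites only because they share field names and values, which is the kind of agreement that harmonized data elements are meant to provide.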
Lesson 4: Life Science Distribution
Bringing Tools Together: Case Study


• Here is a scenario for how these tools might be used together:
     • A team of clinical and basic science investigators from multiple cancer
       centers wishes to collaborate to discover new candidate genes involved in
       breast cancer.
     • Subjects need to be selected that fit a rigorous profile: CTODS and the
       NCIA are used to search the databases of the multiple centers to select
       the patient pool. caGrid makes it possible for the databases at multiple
       institutions to be interconnected for a joint search.
     • Molecular studies at the gene expression and genome structure levels are
       envisioned for this study. caTissue makes it possible to determine which
       subjects have suitable tissue and serum samples to permit genome-wide
       SNP analyses and gene expression profiles before and after treatments.
     • caGWAS is used to design and execute the SNP-based study.
     • geWorkbench supports a number of analyses related to RNA expression
       profiles in tissues of affected patients.
Lesson 4: Life Science Distribution

Installation and Infrastructure Basics


• The caBIG™ Life Science Distribution is a collection of diverse
  applications that require different types and levels of dependent
  software and infrastructure support.

• Examples of the dependent software that may be required include:

     • Apache Ant
     • Apache Axis
     • Java Development Kit (JDK)
     • JBoss Application Server
     • MySQL Database
     • Hibernate
     • Common Security Module: CSM
     • Other software: MIRC T29-a and Cedara I-Response Workstation (IRW)
       (for NCIA)
     • Castor (for CTODS)
Lesson 4: Life Science Distribution

Adoption Success Criteria


• Successful adoption of caBIG™ Life Sciences Distribution includes:

     • Deploying and integrating at least one of the caGrid-enabled software
       applications into the organization's working activities
     • Deploying and maintaining a functional caGrid node at the organization
     • Providing access to appropriate real data from the underlying Life
       Sciences Distribution applications through the caGrid node


• The caBIG™ Deployment Self-Assessment (PDF) can help determine
  your best path to connecting to caBIG™ and adopting a Life Sciences
  Distribution tool.

     • In addition to adopting one of these tools, you can adapt your own life
       sciences tool to be caBIG™ compatible.
Lesson 4: Life Science Distribution

Lesson Review

• In this lesson, we:

    • Presented the purpose and end benefits of the Life Sciences Distribution
      set of tools

    • Introduced the tools and infrastructure provided by this tool set

    • Summarized success criteria related to adopting the distribution, and
      reminded users of the different paths towards that end




   See Also: Getting Connected with caBIG™ Life Sciences
   Distribution (PDF)
Lesson 5: Making a Tool caBIG™ Compatible


• This lesson provides an overview of the process and criteria for
  making a tool caBIG™ compatible. This lesson is for organizations
  that wish to adapt existing tools to be caBIG™ compatible.

• Learning Objectives for this Lesson:

   •   Introduce Compatibility and the Compatibility Criteria
   •   Summarize the Compatibility Life Cycle
   •   List Variables Impacting Compatibility Development Time
   •   Summarize the Compatibility Review Process
Lesson 5: Making a Tool caBIG™ Compatible
What is caBIG™ Compatibility?


• caBIG™ compatibility is about utilizing standards to ensure
  interoperability among caBIG™ tools – so that data can be
  exchanged and understood between systems.
Lesson 5: Making a Tool caBIG™ Compatible
When Compatibility is Important


• If you are adopting caBIG™ tools, don't worry about making them
  caBIG™ compatible. Compatibility is built into caBIG™ tools; look
  for tools advertised at the “Silver” or “Gold” level.

• You need to make a software tool caBIG™ compatible if:

    • You want to share data (a “data service”) with others via the Grid.
    • You want to share data analysis software (an “analytical service”) with
      others via the Grid.
    • You want to integrate a caBIG™ tool into your organization's workflow or
      existing tool base, and exchange data between tools and with the Grid.
    • You want to extend or modify an existing open source caBIG™ tool to
      meet your needs.


 For More Information Access: caBIG™ Compatibility Guidelines
Lesson 5: Making a Tool caBIG™ Compatible
Compatibility Guidelines

• Differing degrees of interoperability are assigned based on the tool's
  adherence to the caBIG™ Compatibility Guidelines.

    • There are four different levels of interoperability: Legacy, Bronze, Silver,
      and Gold. caBIG™ focuses on achieving Silver and Gold compatibility.


• There are four areas of compatibility. An application must meet the
  guidelines in all four areas to be considered “caBIG Compatible”:

    • Semantic Interoperability
        • Information Models
        • Vocabularies and Ontologies
        • Common Data Elements (CDEs)
    • Syntactic Interoperability
        • Programming and Messaging Interfaces
          (e.g., Application Programming Interfaces)

[Figure: the four compatibility areas: Information Models, Vocabularies, CDEs, and APIs]
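
The "all four areas" rule can be sketched in Python. This is an illustration under stated assumptions, not the official scoring method: treating the overall level as the lowest level reached in any area, and using Silver as the default target, are assumptions made here, and the area keys and function names are hypothetical.

```python
from enum import IntEnum


class Level(IntEnum):
    """The four interoperability levels, ordered from lowest to highest."""
    LEGACY = 0
    BRONZE = 1
    SILVER = 2
    GOLD = 3


# Hypothetical keys for the four areas of compatibility.
AREAS = (
    "information_models",
    "vocabularies",
    "common_data_elements",
    "interfaces",
)


def overall_level(assessment):
    """Assumed rule: a tool is only as compatible as its weakest area."""
    missing = [area for area in AREAS if area not in assessment]
    if missing:
        raise ValueError(f"no assessment for: {', '.join(missing)}")
    return min(assessment[area] for area in AREAS)


def is_cabig_compatible(assessment, target=Level.SILVER):
    """True when every area reaches at least the target level."""
    return overall_level(assessment) >= target


assessment = {area: Level.GOLD for area in AREAS}
assessment["vocabularies"] = Level.BRONZE
print(overall_level(assessment).name, is_cabig_compatible(assessment))
```

In this example a single Bronze area keeps the tool below the Silver target even though the other three areas are at Gold.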
Lesson 5: Making a Tool caBIG™ Compatible
How the Compatibility Criteria Work Together


• Information Models are developed to represent the interfaces of a
  system. The information model is then annotated with controlled
  vocabularies to establish a foundation of shared meaning for all
  components of the information model. This annotated information
  model is then converted into Common Data Elements that provide the
  structure (or format) for the data.

• The information model is really the starting point of all things caBIG™
  Compatible. The information model also serves as the starting point to
  generate the API, which is the mechanism by which the data is exchanged.
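
The flow from information model to annotated model to Common Data Elements can be sketched in Python. This is a simplified illustration, not the caCORE SDK or caDSR data model: the class names, the `annotate` and `to_cdes` functions, and the placeholder concept codes (`EXAMPLE:0001` and so on) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    name: str
    datatype: str
    concept_codes: list = field(default_factory=list)  # filled by annotation


@dataclass
class ModelClass:
    name: str
    attributes: list


@dataclass(frozen=True)
class CommonDataElement:
    object_class: str
    property_name: str
    datatype: str
    concept_codes: tuple


def annotate(model_class, vocabulary):
    """Attach concept codes from a controlled vocabulary to each attribute."""
    for attr in model_class.attributes:
        attr.concept_codes = list(vocabulary.get(attr.name, []))
    return model_class


def to_cdes(model_class):
    """Convert annotated attributes into CDEs; reject unannotated ones."""
    cdes = []
    for attr in model_class.attributes:
        if not attr.concept_codes:
            raise ValueError(
                f"{model_class.name}.{attr.name} has no concept annotation"
            )
        cdes.append(
            CommonDataElement(
                model_class.name,
                attr.name,
                attr.datatype,
                tuple(attr.concept_codes),
            )
        )
    return cdes


# Hypothetical model and vocabulary; the codes are placeholders.
specimen = ModelClass(
    "Specimen",
    [Attribute("tissueType", "String"), Attribute("collectionDate", "Date")],
)
vocabulary = {
    "tissueType": ["EXAMPLE:0001"],
    "collectionDate": ["EXAMPLE:0002"],
}
cdes = to_cdes(annotate(specimen, vocabulary))
for cde in cdes:
    print(cde.object_class, cde.property_name, cde.concept_codes)
```

Refusing to produce a CDE for an unannotated attribute mirrors the point made above: shared meaning comes from the vocabulary annotation, so an attribute without one has nothing to harmonize against.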
Lesson 5: Making a Tool caBIG™ Compatible
Compatibility Criteria & Resources


• The following four slides introduce the compatibility criteria. The slides
  also point to classes in the caCORE curriculum related to the criteria.
• caCORE is composed of a controlled vocabulary repository and a
  metadata repository. The caCORE Software Development Kit (SDK)
  provides tools to generate a caCORE-like system, which satisfies Silver
  compatibility. caCORE classes teach these tools.

• caCORE Courses to Get Started:

    • 1000: Intro to caCORE and caDSR
    • 1010: Intro to ISO/IEC 11179
    • 1020: Using the CDE Browser and UML Model Browser

The caCORE Software Development Kit (SDK) is an important set of tools used to
create a caBIG compatible system. Using the kit requires intermediate Java
development skills.
Lesson 5: Making a Tool caBIG™ Compatible
Information Models

• Information Models
    • Object-oriented model of the system
    • Detail associations and relationships between objects
    • Objects of system defined by a controlled public vocabulary

• Silver Level Review
    • The model accurately reflects the scientific domain
    • Semantic annotation is complete and accurate
    • UML modeling best practices are followed

  caCORE Training Courses:
  • 1070: Curating Metadata from UML Models
  • 2010: Using the caCORE Software Development Kit (SDK)
Lesson 5: Making a Tool caBIG™ Compatible
Vocabularies & Ontologies

• Vocabularies and Ontologies
    • Contain agreed-upon concepts, terms, and definitions
    • Used to annotate object models, which results in defined metadata (e.g.,
      CDEs) and data (e.g., permissible values)
    • Agreement upon the basic concepts, terms, and definitions that are
      inherent in all biomedical information is essential for achieving semantic
      interoperability

• Silver Level Checklist – Main Points
    • Controlled terminologies are used where appropriate
    • All terminologies must be publicly available
    • Concepts used to annotate classes and attributes must be publicly
      available and from a designated caBIG™ standard vocabulary

caCORE Training Courses:
• 1030: Using Enterprise Vocabulary System (EVS)
• 1040: Creating Well-formed Metadata and Metadata Business Rules
Lesson 5: Making a Tool caBIG™ Compatible
Common Data Elements


• Common Data Elements
    • Metadata descriptions that define and describe data, including data
      representations like UML models
    • Tagged with concepts from accepted vocabularies

• Silver Level Checklist – Main Points
    •   All attributes are correctly annotated and loaded into the caDSR
    •   Semantic annotation is complete and accurate
    •   Value domains are curated where appropriate
    •   Data Elements are reused where appropriate

caCORE Training Courses:
• 1070: Curating Metadata from UML Models
• 2010: Using the caCORE Software Development Kit (SDK)
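
A Common Data Element pairs a definition with a value domain (a data type and, where appropriate, a set of permissible values). The Python sketch below shows how such metadata could be used to check a record. It is an assumption-laden illustration rather than the caDSR model: the class names, the `validate` function, and the sample element with its placeholder concept code are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueDomain:
    name: str
    datatype: type
    # An empty set means the domain is not enumerated.
    permissible_values: frozenset = frozenset()

    def accepts(self, value):
        if not isinstance(value, self.datatype):
            return False
        return not self.permissible_values or value in self.permissible_values


@dataclass(frozen=True)
class DataElement:
    name: str
    definition: str
    concept_codes: tuple
    value_domain: ValueDomain


def validate(record, elements):
    """Return a list of (field, problem) pairs for one record."""
    by_name = {element.name: element for element in elements}
    problems = []
    for field_name, value in record.items():
        element = by_name.get(field_name)
        if element is None:
            problems.append((field_name, "no matching data element"))
        elif not element.value_domain.accepts(value):
            problems.append((field_name, f"value {value!r} not in value domain"))
    return problems


# Hypothetical element; the concept code is a placeholder.
specimen_type = DataElement(
    name="specimenType",
    definition="The kind of biospecimen collected.",
    concept_codes=("EXAMPLE:0003",),
    value_domain=ValueDomain(
        "SpecimenTypeDomain", str, frozenset({"tissue", "fluid", "cell", "molecular"})
    ),
)

print(validate({"specimenType": "tissue"}, [specimen_type]))
print(validate({"specimenType": "rock", "colour": "red"}, [specimen_type]))
```

Two systems that validate against the same data elements can exchange records knowing that a field such as `specimenType` carries one of the same agreed values on both sides.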
Lesson 5: Making a Tool caBIG™ Compatible
Application Programming Interfaces


• Programming and Messaging Interfaces
    • Standards-based Application Programming Interfaces (APIs) provide
      access to data in the form of objects as specified in a UML model
    • Standards-based messaging protocols are supported
    • Syntax of interfaces is defined by agreed-upon standards

• Silver Level Checklist – Main Points
    • API exists
    • API is well-described and documented
    • API parameters are objects defined by CDEs

    caCORE Training Courses:
    • caCORE SDK Programmers Guide
Lesson 5: Making a Tool caBIG™ Compatible
The Five Steps to Compatibility


•     There are five steps in developing a caBIG™ compatible application –
      the steps relate to the four compatibility criteria above:

       1.   Creating an Information Model
       2.   Performing Semantic Integration (Vocabularies)
       3.   Transforming the Information Model into Metadata (Common Data Elements)
       4.   Generating Code and Messaging Interfaces (APIs)
       5.   Generating a caGrid Interface


[Figure: the five steps, the tool used for each, and the criterion each addresses: (1) create an information model in a modeling tool (Information Models); (2) perform semantic integration using the Semantic Integration Workbench (SIW) (Vocabularies); (3) transform the information model into metadata using the UML Loader (CDEs); (4) generate code and messaging interfaces using the caCORE SDK code generator (APIs); (5) generate a caGrid interface using “Introduce”.]
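
The five steps and the tool named for each can be captured as ordered data, which is handy for tracking where a project stands. The following Python sketch is only a bookkeeping illustration: the `STEPS` list restates the slide, while the `next_step` function is a hypothetical helper and does not invoke any of the tools.

```python
# (step, tool) pairs in the order given on the slide.
STEPS = [
    ("Create an information model", "UML modeling tool"),
    ("Perform semantic integration", "Semantic Integration Workbench (SIW)"),
    ("Transform the information model into metadata", "UML Loader"),
    ("Generate code and messaging interfaces", "caCORE SDK code generator"),
    ("Generate a caGrid interface", "Introduce"),
]


def next_step(completed_count):
    """Return (number, step, tool) for the next step, or None when done."""
    if not 0 <= completed_count <= len(STEPS):
        raise ValueError("completed_count must be between 0 and 5")
    if completed_count == len(STEPS):
        return None
    step, tool = STEPS[completed_count]
    return completed_count + 1, step, tool


print(next_step(2))
```

Keeping the steps in a fixed order reflects that each one builds on the output of the one before it, starting from the information model.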
Lesson 5: Making a Tool caBIG™ Compatible
A Closer Look at the Compatibility Process

[Figure: flowchart of the compatibility process. Labels recoverable from the figure, in approximate order: create an information model in a UML modeling tool and export it as an XMI file (SDK 3.1 format); if using SDK CodeGen, run caCORE SDK code generation and check “CodeGen Success?”; perform semantic integration using the Semantic Integration Workbench (SIW) with Terminology Services, producing a verified annotated UML model; load the XMI file to the caDSR Sandbox with the UML Loader and check “Load Success?”; take the approved annotated XMI through compatibility review and load it to caDSR Production; use SIW RoundTrip to produce a roundtrip UML model and XMI file (input for the next version); run the final caCORE SDK code generation to produce public APIs; upload the application to the Grid using the caGrid Introduce Toolkit, yielding a caGrid service alongside the Index Service and Global Model Exchange, with metadata retrieval from caDSR Production.]
Lesson 5: Making a Tool caBIG™ Compatible
Variables Impacting Development Time

• Past experience has shown that six key variables will impact the effort
  and time required to make a tool caBIG™ compatible:

    •   Existing familiarity with tools: Access to a development team that has
        knowledge and skills related to UML and CDE models and tools, Enterprise
        Architect, and NCI tools such as the Semantic Integration Workbench (SIW),
        caAdapter, and Introduce will help speed development.
    •   UML modeling skills: Team understanding of, and facility with, classes,
        attributes and data types; cardinality; UML structures; inheritance and
        associations; and camel case naming conventions will reduce development time.
    •   Projected number of classes and attributes: The projected number of
        classes/attributes for the tool in question will impact development time.
    •   Technology/Infrastructure: The following technical environment and infrastructure
        need to be in place: Windows, Java WebStart, Internet Explorer 6.0, and the
        Enterprise Architect software tool.
    •   Access to Domain Expertise: Access to the appropriate domain specialists to
        support the data modeling effort will facilitate and speed development.
    •   Time Availability: Having the development team available to spend concentrated
        time on the project will help speed efforts.
Lesson 5: Making a Tool caBIG™ Compatible
Critical Deliverables


• A number of project deliverables help determine whether an application
  is caBIG™ compatible:

   • UML Model – represents the domain model, with class diagrams for the API
   • Annotated XMI file – used to verify semantic consistency between the UML
     and CDE definitions; must have a clean SIW error log
   • UML Loader Checklist – for registration in caDSR; gives information about
     the model and constituent packages
   • Value Domain Report – data types and permissible values
   • Vocabulary Report – lists the controlled vocabularies used
   • Standard CDE Report – lists the standard CDEs used and loaded into the
     caDSR, and the reasons why other standard CDEs could not be used
   • CDE Report – export of all CDEs
   • API Documentation – documents methods, parameters, messaging
     interfaces, etc. (e.g., Javadoc)
   • Test Scripts – establish the existence of the API; test APIs and objects
   • Test Log – output of the API test scripts; establishes existence of the API
Lesson 5: Making a Tool caBIG™ Compatible
Compatibility Review Process


• As of October 2007, compatibility reviews for funded caBIG™ projects
  are conducted through the VCDE and Architecture workspaces.

• This review process may be modified in the future based on demand
  as more organizations deploy caBIG™ in their environment and adapt
  their tools.

• If you have questions about the current status of the compatibility
  process, visit the Compatibility and Certification page of the caBIG™
  website.

• The next slide provides a snapshot of the current compatibility review
  process for reference.
Lesson 5: Making a Tool caBIG™ Compatible
Compatibility Review Process

[Figure: snapshot of the current compatibility review process]
Lesson 5: Making a Tool caBIG™ Compatible
Compatibility Resources


Reference                                Link

Current Version of Compatibility         https://cabig.nci.nih.gov/guidelines_documentation
Guidelines
caCORE Software Development Kit          http://ncicb.nci.nih.gov/NCICB/infrastructure/cacoresdk
(SDK)
caDSR Tooling (Semantic Integration      http://ncicb.nci.nih.gov/NCICB/infrastructure/cacore_overview/cadsr
Workbench, CDE Browser, UML Model
Browser)
caCORE Curriculum                        http://ncicbtraining.nci.nih.gov/TP2005/tp2000web.dll/NCICBTraining
Lesson 5: Making a Tool caBIG™ Compatible

Lesson Review

• In this lesson, we:

    •   Introduced caBIG™ compatibility and the four compatibility criteria
    •   Summarized the Compatibility Development Life Cycle
    •   Listed Variables Impacting Compatibility Development Time
    •   Summarized the Current Compatibility Review Process
Lesson 6: Focusing on the Grid


• This lesson provides a closer look at caGrid. This topic is critical
  because “getting on the Grid” is the ultimate target of the compatibility
  process.

• Learning Objectives for this Lesson:

    •   Understanding the Grid: What it means to “be on the Grid”
    •   Implementing caGrid Infrastructure and Tools
    •   Understanding Grid Security
    •   Setting up a Grid Node: Overview and Resources
    •   Surfacing a Grid Data Service
Lesson 6: Focusing on the Grid
What is a Grid?


  • Grids have evolved from the concept of distributed computing to
    support science and engineering.

  • Key Features and Benefits:

      • Sharing of resources (computational, storage, data, etc)
      • Secure Access (global authentication, local authorization, policies,
        trust, etc)
      • Open Standards
      • Virtualization


   “The real and specific problem that underlies the Grid concept is coordinated resource
   sharing and problem solving in dynamic, multi-institutional virtual organizations.”

        I. Foster, C. Kesselman, S. Tuecke. International J. Supercomputer Applications,
        15(3), 2001.


       Source: caBIG Annual Meeting 2007: caGrid 1.0 Tutorial Overview
Lesson 6: Focusing on the Grid
Grids Help Users Find Services & Data


•   Metadata (information about the stored data) is deposited in a “Grid index
    service” that can be queried by grid users (Advertisement and Discovery).


    [Figure: Grid client applications and users reach a Grid Service
    (Metadata & Index Service) through Advertisement and Discovery. A tool
    or data source, such as caBIO, is exposed to the Grid as a Grid Service.]


    Source: Modified from caBIG Annual Meeting 2007: caGrid 1.0 Tutorial Overview
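
The advertisement and discovery idea above can be sketched in a few lines. This is a toy, in-memory stand-in and not the caGrid Index Service API: the class names, fields, and example.org URLs are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceMetadata:
    """Descriptive metadata a service advertises (all fields are illustrative)."""
    name: str
    url: str
    data_types: set = field(default_factory=set)


class IndexService:
    """Toy in-memory stand-in for a Grid index service (hypothetical)."""

    def __init__(self):
        self._entries = {}

    def advertise(self, metadata):
        # A service registers (or refreshes) its metadata under its URL.
        self._entries[metadata.url] = metadata

    def discover(self, data_type):
        # A client asks which services expose a given data type.
        return sorted(
            m.url for m in self._entries.values() if data_type in m.data_types
        )


index = IndexService()
index.advertise(ServiceMetadata("caBIO", "https://example.org/cabio", {"Gene", "Protein"}))
index.advertise(ServiceMetadata("TissueBank", "https://example.org/tissue", {"Specimen"}))

gene_services = index.discover("Gene")
print(gene_services)
```

A real index holds much richer metadata and is queried over the network; the sketch only shows the two roles, a service that advertises and a client that discovers.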
Lesson 6: Focusing on the Grid
What is caGrid?


• caGrid (or the Grid) is an Architecture development project that
  provides the core infrastructure and tooling needed to connect to the
  Grid - this is the ultimate target of the compatibility process.

• caGrid and Gold Compliant tools create the G in caBIG™
    • Gold => Grid => Connecting Silver Systems. HOWEVER:
    • Grid-enabling your tool does NOT make it Gold
      compatible. Additional criteria at the Gold level include CDE re-use,
      CDE standard enforcement, vocabulary standard usage, and model
      harmonization.

• “Getting something on the Grid” – or “establishing a Grid node” -
  means meeting requirements for interoperability (through adoption or
  adaptation), and then connecting that interoperable service or data
  source to a Grid (internal or external). Grid services are “advertised”
  through an index. You can also connect to a Grid to “discover”
  services or data provided by others.
Getting on the Grid:
Advertising and Discovering Services

Core services – including data standards, shared vocabularies, and indexing –
provide the critical elements needed to advertise and discover other Grid resources.




              Source: caGrid 1.1 Users Guide, Figure 5-2, pg. 42
Getting on the Grid:
Key Requirements

•   You cannot “get your tool on the Grid” unless you have: (1) adopted or
    adapted a tool that is caBIG™ compatible; and (2) installed caGrid software.

•   To Grid-enable a system (expose a compatible tool on the Grid) you must:

     • Use object types and information models registered in the caDSR.
     • Develop object oriented APIs and data resources.
     • Define a Grid service interface that defines the functionality that you
       are exposing to the Grid. This grid service interface uses the same
       object types as your existing system, but represents them in a way that
       is platform and language neutral (e.g., using XML)
     • Complete Grid service implementation by mapping service invocations
       to API calls or queries into the existing system
     • All of these activities are enabled through caGrid tools and
       infrastructure. Adopting a caBIG™ compatible tool means much of the
       work has already been done for you; adaptation may take more effort.

       Source: caBIG Annual Meeting 2007: caGrid 1.0 Tutorial Overview
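
As a rough illustration of the last steps in the list above, the sketch below wraps an existing object-oriented API in a service interface that returns XML. Every name here (Specimen, SpecimenApi, SpecimenGridService, the element names) is hypothetical; real caGrid services are generated with caGrid tooling and run in a Globus container, which this toy does not attempt to model.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Specimen:
    """Object type from the existing system (hypothetical)."""
    specimen_id: str
    tissue_site: str


class SpecimenApi:
    """Existing object-oriented API over a local data resource (hypothetical)."""

    def __init__(self, records):
        self._records = records

    def find_by_site(self, tissue_site):
        return [s for s in self._records if s.tissue_site == tissue_site]


def to_xml(specimen):
    # Platform- and language-neutral representation of the same object type.
    element = ET.Element("Specimen")
    ET.SubElement(element, "specimenId").text = specimen.specimen_id
    ET.SubElement(element, "tissueSite").text = specimen.tissue_site
    return element


class SpecimenGridService:
    """Service interface: each exposed operation maps to one API call."""

    def __init__(self, api):
        self._api = api

    def query_by_site(self, tissue_site):
        # Map the service invocation onto the existing API, then serialize.
        response = ET.Element("SpecimenList")
        for specimen in self._api.find_by_site(tissue_site):
            response.append(to_xml(specimen))
        return ET.tostring(response, encoding="unicode")


api = SpecimenApi([Specimen("S-1", "lung"), Specimen("S-2", "breast")])
service = SpecimenGridService(api)
xml_response = service.query_by_site("lung")
print(xml_response)
```

The point of the separation is that the existing API stays untouched; only the thin service layer knows about the XML representation.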
 Getting on the Grid:
 A Closer Look at the Data Infrastructure

• Client and service APIs are object-oriented, and operate over well-
  defined and curated data types.

• Objects are defined in UML and converted into ISO/IEC 11179
  Administered Components. These components are in turn registered
  in the Cancer Data Standards Repository (caDSR).

• Object definitions draw from controlled terminology and vocabulary
  registered in the Enterprise Vocabulary Services (EVS), and their
  relationships are thus semantically described.

• XML serializations of objects adhere to XML schemas registered in the
  Global Model Exchange (GME).

  [Figure: Core Services (Cancer Data Standards Repository, Enterprise
  Vocabulary Services, Global Model Exchange). Labels: Object Definitions
  (Registered In, Semantically Described In); Data Type Definitions / XSD
  (Registered In); Service Definition (WSDL); Service API and Objects behind
  the Grid Service; Objects Serialize To XML; XML Validates Against XSD;
  Grid Client, Client API and Objects (Client Uses).]

                 Source: caBIG Annual Meeting 2007: caGrid 1.0 Tutorial Overview
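
The last bullet, objects serialized to XML that adheres to a registered schema, can be sketched as follows. This is a deliberately simplified stand-in: the namespace, the REGISTERED_SCHEMAS dictionary, and the conformance check are assumptions for illustration, and the check only compares child element names in order. Real validation would use the XSD registered in the GME.

```python
import xml.etree.ElementTree as ET

NS = "gme://example.org/demo/1.0"  # hypothetical namespace, not a real GME entry

# Toy stand-in for a schema registry: namespace -> type -> required child fields.
REGISTERED_SCHEMAS = {NS: {"Gene": ["symbol", "chromosome"]}}


def serialize(type_name, fields):
    """Serialize an object's fields into namespaced XML."""
    root = ET.Element(f"{{{NS}}}{type_name}")
    for key, value in fields.items():
        ET.SubElement(root, f"{{{NS}}}{key}").text = str(value)
    return ET.tostring(root, encoding="unicode")


def conforms(xml_text):
    """Check the XML against the toy registry (element names and order only)."""
    root = ET.fromstring(xml_text)
    namespace, _, type_name = root.tag[1:].partition("}")
    required = REGISTERED_SCHEMAS.get(namespace, {}).get(type_name)
    if required is None:
        return False  # unknown namespace or type
    present = [child.tag.partition("}")[2] for child in root]
    return present == required


good = serialize("Gene", {"symbol": "TP53", "chromosome": "17"})
incomplete = serialize("Gene", {"symbol": "TP53"})
print(conforms(good), conforms(incomplete))
```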
caGrid Infrastructure & Tooling:
High Level View

• Ultimately, the Grid consists of a collection of applications and
  services, connected to each other through a secure infrastructure.




  Source: www.cagrid.org
caGrid Infrastructure & Tooling:
Technology Overview

caGrid 1.1 leverages the following key technologies:

Technology              Description                                             Provider

Globus Toolkit          Provides the core Grid infrastructure and supports      Globus Alliance
                        service deployment, service registry, invocation and
                        secure communication
Mobius GME              Provides grid repository for XML Schemas of strongly    Ohio State
                        typed objects transferred on caGrid                     University
Cancer Data Standards   Provides repository for Common Data Elements and        National Cancer
Repository (caDSR)      UML models                                              Institute Center
                                                                                for Bioinformatics
Enterprise Vocabulary   Provides controlled vocabularies                        National Cancer
Services (EVS)                                                                  Institute Center
                                                                                for Bioinformatics
ActiveBPEL™             Provides an open source workflow engine whose           Active Endpoints,
                        implementation follows the Business Process Execution   Inc.
                        Language standard
Grouper                 Provides ability to manage group information across     Internet2
                        integrated applications and repositories



           Source: https://cabig.nci.nih.gov/workspaces/Architecture/caGrid/
Lesson 6: Focusing on the Grid
Installing caGrid v1.1


• caGrid 1.1 requires that Java 1.5 JDK be installed on the target
  machine. Download from http://java.sun.com.

• The caGrid 1.1 installer provides a graphical, wizard-like interface for
  installing caGrid dependencies, source, services, and applications.
  Features include:

    • Component installers: Install all caGrid services and applications
    • Designed for re-use: Can be used to re-install/re-configure previous
      installations

• Detailed installation instructions are available online at:
  http://www.cagrid.org/mwiki/index.php?title=CaGrid:Software




            Source: http://www.cagrid.org/mwiki/index.php?title=CaGrid:Software
Lesson 6: Focusing on the Grid
What About Security?!

•   caGrid uses several packages to provide security services.

     •   Dorian allows institutions to locally authenticate their users onto caGrid.
     •   GridGrouper allows group memberships and project access rights to be managed.
     •   Trust Relationships specify which institutions trust each other's authentication.


•   GAARDS was developed on top of the Globus Toolkit and extends the Grid Security
    Infrastructure (GSI) to provide enterprise services and administrative tools for:

     •   Grid user management
     •   Identity federation
     •   Trust management
     •   Group/VO management
     •   Access control policy management and enforcement
     •   Integration between existing security domains and the grid security domain


•   The following figure shows how these all fit together to form a security architecture.
Lesson 6: Focusing on the Grid
caGrid GAARDS Security




   For Explanatory Content Visit: GAARDS Wiki OR caGrid 1.1 Users Guide, Figure 6-1, pg. 60
Lesson 6: Focusing on the Grid
caGrid Security Flows

    For Explanatory Content Visit: Dorian Wiki OR caGrid 1.1 Users Guide, Figure 6-2, pg. 53
Lesson 6: Focusing on the Grid
Grid Trust Service - Detail




      For more context, visit: caGrid 1.1 Users Guide Figure 6-20, pg. 85
Lesson 6: Focusing on the Grid
Creating a Grid Service

• What kinds of Grid services are there?

    • An Analytical Service provides operations to perform a particular
      analysis on data of interest.
    • Services providing data resources to the grid are required to be developed
      as Data Services. In addition to meeting basic service requirements,
      these must implement a standard query operation and language, and
      expose standardized data service metadata.


• Why create or access a Grid service? Here are two use cases related
  to the bundles introduced earlier:

    • geWorkbench can be used to download, view and analyze data stored
      within caArray.
    • Queries could be developed joining clinical annotations from caTissue
      Core with association data from caGWAS (Cancer Genome-Wide
      Association Studies).
Lesson 6: Focusing on the Grid
Creating a Grid Service


• The first step in creating a Grid service is to complete the compatibility
  process described in earlier lessons. The caCORE Software
  Development Kit (SDK) is a useful resource for that process.
• Once you have loaded your data model into caDSR and generated
  public APIs, the Introduce Toolkit supports the next step. Introduce is
  a graphical development environment, and is one of the Grid services
  installed when you install caGrid 1.1.
• A Closer Look at the Technical Implications:
    •   XML Schemas specify how objects are communicated on the grid. The caCORE SDK can be
        used to create the XML Schemas.
    •   If the SDK is not used, then you can either provide an XML schema, or ask Introduce to use the
        SDK internally to create one.
    •   Introduce also maps XML to Java beans, which can be preexisting classes (such as those
        generated by the SDK), or it can generate classes from the XML schemas using Axis. It can
        also import data types from the caDSR and further customize them as needed.
                  (Source: Scott Oster, personal communication; also see caGrid 1.1 User's Guide, pgs. 15, 18 and 33.)
Lesson 6: Focusing on the Grid
Introduce – Capabilities & Prerequisites


• Why use Introduce? Here’s a partial set of capabilities:

   • Domain Model import from caDSR
   • Service creation and modification
   • Import of new data types from GME (Global Model Exchange), file
     system or other sources
   • Service object code generation if needed
   • Generation of service method stubs; the developer provides the
     service methods within these
   • Client API generation
   • Configuring security
   • Deploying the service to a container
   • Enabling special data transfer methods such as Bulk Data Transfer or
     WS-Enumeration

• Here’s the software recommended for use with Introduce V1.1:

   • Java 1.5 JDK
   • Ant 1.6.5 or 1.7
   • Globus 4.0.x
   • Eclipse 3.x (not required)

                  Source: http://www.cagrid.org
Introduce Graphical Development
Environment

• Includes Graphical User
  Interface (GUI) for creating and
  manipulating a grid service.

• Allows you to create a simple
  service skeleton that a
  developer can then implement,
  build, and deploy.

• Automatically generates code
  for a completely caBIG™
  compliant grid service,
  configured to provide:

   •   Advertisement
   •   Standard Metadata
   •   Security
   •   Complete Client API


                  Source: caBIG Annual Meeting 2007: caGrid 1.0 Tutorial Overview
Lesson 6: Focusing on the Grid
Introduce – Setting Up a Service




   Source: http://www.cagrid.org/mwiki/index.php?title=CaGrid:Tutorials:Services:Beginner:Creating
Lesson 6: Focusing on the Grid
Introduce – Choosing the Data Source


 • There are several steps to configuring the new service. Only the
   first, choosing the type of data source, is shown here.




Source: http://www.cagrid.org/mwiki/index.php?title=CaGrid:Tutorials:Services:Beginner:Configuring
Lesson 6: Focusing on the Grid
Introduce – Configuring the Data Service




           Source: caGrid 1.1 Users Guide, Figure 4-1, pg. 32
Lesson 6: Focusing on the Grid
Introduce – Final Notes on Services & APIs

• Invoking Operations on a Service
   • While the grid makes it possible to dynamically invoke services for which a client has
     no APIs (and this is true for caGrid services), this is generally not the procedure
     clients or applications follow. Generally, the API for the service is already available,
     and is “bound” to the particular service of interest at runtime.
   • For example, to query any caGrid Data Service, a common client API can be used,
     regardless of the type of data it exposes. Applications built to query data services
     generally would build against this API. For Analytical Services, however, it is more
     likely a client API specific to a type of analytical service would be used, and, again,
     instances of that service would be bound to it at runtime. In both cases, the
     application developer would just make use of a pre-provided client API.

• Dynamic Invocation for Workflows
   • Constructing a workflow is one instance where a client or application may wish to
     invoke operations on a service without having previously downloaded a client API.
   • The caGrid workflow infrastructure provides the mechanism to describe service
     invocations using a workflow language (BPEL), and request that the workflow service
     perform the invocation.
   • Further details on the workflow infrastructure can be found in the caGrid 1.1
     Programmer's Guide.

                        Source: caGrid 1.1 User's Guide, pgs. 44-45
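
The idea of one common client API bound to a particular service at runtime can be sketched like this. It is not the caGrid client API: DataServiceClient, the fake transport, and the example.org URLs are assumptions, and an in-memory dictionary replaces real network calls.

```python
class DataServiceClient:
    """Common client API: the same class is used for every data service."""

    def __init__(self, service_url, transport):
        # The service of interest is chosen at runtime, not at build time.
        self.service_url = service_url
        self._transport = transport

    def query(self, target_type):
        return self._transport(self.service_url, target_type)


# Hypothetical in-memory "grid": URL -> objects by type, replacing network calls.
FAKE_GRID = {
    "https://example.org/genes": {"Gene": ["TP53", "BRCA1"]},
    "https://example.org/tissue": {"Specimen": ["S-1"]},
}


def fake_transport(url, target_type):
    return list(FAKE_GRID[url].get(target_type, []))


# The application is written once against DataServiceClient and then bound
# to whichever services it discovers.
clients = [DataServiceClient(url, fake_transport) for url in FAKE_GRID]
results = {c.service_url: c.query("Gene") for c in clients}
print(results)
```

An Analytical Service would instead get a client API specific to its operations, but the binding step, passing in the service location at runtime, would look the same.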
Lesson 6: Focusing on the Grid
caGrid Service Layers

Here's how it all fits together.
Web Server (Apache/Tomcat): Binds to server port(s)

 Web Application Server (Tomcat): Hosts web applications connected to the web server

   SOAP Engine (Axis): Interprets SOAP requests, installed as a web application

     Web/Grid Service (Globus): Binds “protocol” to operations on local application resources

       Security (GSI)
       * Secure Communication
       * Authentication
       * Authorization

       Metadata (WSRF – Resource Properties)
       * caGrid Service Metadata
       * caGrid Service Security Metadata
       * (caGrid Data Service Metadata)
       * (Custom Metadata)

       Service Definitions
       * WSDL
       * XSDs

       Service Implementation
       * Advertisement (WSRF-SG)
       * Configuration Properties
       * Business Logic

       Resources (WSRF Resource)





         Source: caBIG Annual Meeting 2007: caGrid 1.0 Tutorial Overview
Lesson 6: Focusing on the Grid
Grid Addendum: Case Study


• The following slides present a case study related to building a Grid
  service. The example relies on caTRIP (Cancer Translational
  Research Informatics Platform), a platform created by Duke University
  during the caBIG™ pilot.

• The case study demonstrates the technologies needed for a GUI-
  based client to perform queries across multiple grid services. The
  current version can perform cross-database joins using de-identified
  medical record numbers.

• Case Study Focus:
    •   Connecting existing data systems, including basic science data, to enhance patient care
    •   Initial problem scenario is on outcome analysis: Use data from existing patients to inform the treatment of
        another patient, leveraging clinical, pathology, tissue, and basic science data
    •   Scenario: Patient A enters the clinic. What treatments were applied with success on other patients with similar
        characteristics (race, sex, symptoms, pathology results, adverse events, biomarkers)?


             We are grateful for slides from Patrick McConnell at Duke Comprehensive Cancer Center.
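
To make the cross-database join on de-identified medical record numbers concrete, here is a minimal sketch. The source does not say how caTRIP de-identifies MRNs; the keyed hash (HMAC-SHA256), the shared key, and the sample records below are all assumptions for illustration.

```python
import hashlib
import hmac

SHARED_KEY = b"demo-key-not-for-production"  # hypothetical shared secret


def deidentify(mrn):
    """Replace an MRN with a keyed hash so services can join without exposing it."""
    return hmac.new(SHARED_KEY, mrn.encode("utf-8"), hashlib.sha256).hexdigest()


# Each "service" exposes only de-identified identifiers (illustrative data).
tumor_registry = [
    {"pid": deidentify("MRN001"), "diagnosis": "adenocarcinoma"},
    {"pid": deidentify("MRN002"), "diagnosis": "sarcoma"},
]
tissue_bank = [
    {"pid": deidentify("MRN001"), "specimen": "S-17"},
    {"pid": deidentify("MRN003"), "specimen": "S-42"},
]


def join_on_pid(left, right):
    """Inner join of two result sets on the de-identified identifier."""
    by_pid = {}
    for row in right:
        by_pid.setdefault(row["pid"], []).append(row)
    return [
        {**lrow, **rrow}
        for lrow in left
        for rrow in by_pid.get(lrow["pid"], [])
    ]


joined = join_on_pid(tumor_registry, tissue_bank)
print(len(joined), joined[0]["diagnosis"], joined[0]["specimen"])
```

Only the patient present in both sources survives the join, and the raw MRN never appears in the combined record.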
Lesson 6: Focusing on the Grid
Overview: Connecting Disparate Data Systems


   [Figure: five data systems linked through the medical record number (MRN):
   CAE (Pathology Biomarkers); Tumor Registry (Diagnosis, Treatment,
   Recurrence, Follow-up); caTissue CORE (Tissue Bank); caIntegrator (SNP
   Data); caTIES (Pathology Reports).]


   Source: caTRIP: A translational tool in action – caTRIP_icr_2007_05.ppt
Lesson 6: Focusing on the Grid
caTRIP In-depth Architecture




   [Figure: a GUI and a Distributed Query Engine sit above two groups of
   services. Core Grid Services: IdP Service, Index Service, Grid Grouper
   (authorize). Domain Grid Services: caTissue CORE, caTIES, CAE, TR, CGEMS
   SNP. At Duke, these are backed by the Domain Controller, caTissue CORE,
   caTIES, CAE, TR, caIntegrator, MAW3, Tumor Registry, and Illumina.]


   Source: caTRIP: A translational tool in action – caTRIP_icr_2007_05.ppt
Lesson 6: Focusing on the Grid
caTRIP In-depth Service Implementation


   [Figure: a caGrid Data Service (caCORE SDK/Hibernate, CQL Engine, domain
   model) advertises to the Index Service, receives CQL Queries from the
   Distributed Query Engine, and reaches its database through an
   object-relational mapping.]


   Source: caTRIP: A translational tool in action – caTRIP_icr_2007_05.ppt
  Lesson 6: Focusing on the Grid
  caTRIP caGrid Security


   [Figure: Authentication: User Credentials go to the caGrid Authentication
   Service (Duke Authentication Plugin, Duke Domain Controller, NT Security),
   which returns a SAML Assertion; Dorian exchanges it for a User Grid
   Certificate, backed by the Trust Fabric. Authorization: the Grid Data
   Service uses CSM and Grid Grouper to control access to backend data.]


      Source: caTRIP: A translational tool in action – caTRIP_icr_2007_05.ppt
Lesson 6: Focusing on the Grid
caTRIP caGrid Security

   [Figure: security flow. Labels: Authenticate with Local Credential
   Provider; SAML Assertion; Grid Credentials; Should I trust the credential
   signer?; Is Authorized?; Is member of?]

   Source: caTRIP: A translational tool in action – caTRIP_icr_2007_05.ppt
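
The questions in this flow (authenticate locally, exchange the assertion for grid credentials, check the signer, check group membership) can be walked through with a toy simulation. Nothing here is a GAARDS, Dorian, or Grid Grouper API, and no real cryptography is performed: the dictionaries, names, and functions are hypothetical stand-ins for the decisions each component makes.

```python
LOCAL_USERS = {"jdoe": "s3cret"}        # local credential provider (hypothetical)
TRUSTED_IDPS = {"duke-idp"}             # identity providers the issuer trusts
TRUSTED_SIGNERS = {"demo-dorian"}       # credential signers the service trusts
GROUPS = {"catrip-users": {"jdoe"}}     # group memberships


def authenticate_locally(user, password):
    """Step 1: the local provider checks credentials and returns an assertion."""
    if LOCAL_USERS.get(user) != password:
        raise PermissionError("local authentication failed")
    return {"subject": user, "issuer": "duke-idp"}


def issue_grid_credential(assertion):
    """Step 2: the assertion is exchanged for a grid credential."""
    if assertion["issuer"] not in TRUSTED_IDPS:
        raise PermissionError("untrusted identity provider")
    return {"subject": assertion["subject"], "signer": "demo-dorian"}


def authorize(credential, required_group):
    """Step 3: the service checks the signer, then group membership."""
    if credential["signer"] not in TRUSTED_SIGNERS:
        return False  # "Should I trust the credential signer?" -> no
    return credential["subject"] in GROUPS.get(required_group, set())


credential = issue_grid_credential(authenticate_locally("jdoe", "s3cret"))
allowed = authorize(credential, "catrip-users")
print(allowed)
```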
Lesson 6: Focusing on the Grid
Technical Details: Interfaces and Metadata

   [Figure: a client discovers services and metadata, then queries and
   invokes through the Distributed Query Engine, which sends a Query to each
   Grid Service; each Grid Service sits in front of its own Database.
   Services register their Object Definitions with the Cancer Data Standards
   Repository and Enterprise Vocabulary Services, and their Model (XSD) with
   the Global Model Exchange (GME).]

   Source: caTRIP: A translational tool in action – caTRIP_icr_2007_05.ppt
Lesson 6: Focusing on the Grid
caTRIP Extensibility: Creating a New Service


Prerequisite: There is some data available in a relational database…

1. Domain model
    1. Create a new domain model (it is easier to reuse one)
    2. Register it in the caDSR
2. Create object/relational mapping (use caCORE SDK or do manually)
    1. Create Java beans
    2. Generate Hibernate mappings
3. Create grid service
    1. Use Introduce to generate a data service
    2. Select the domain model
    3. Select the caTRIP CQL Processor
4. Deploy grid service
    1. Use Introduce to deploy to your container (Tomcat)
5. Add to caTRIP

   Source: caTRIP: A translational tool in action – caTRIP_icr_2007_05.ppt
Lesson 6: Focusing on the Grid
Lesson Review


• In this lesson, we provided a closer look at caGrid by:

    • Describing what it means to “be on the Grid”
    • Outlining how caGrid is installed
    • Introducing the infrastructure and tools that make establishing a grid
      presence possible
    • Reviewing the Grid security architecture
    • Discussing how to establish a grid node, so that services can be
      advertised and discovered.
For More Grid Information –
Resources & References

•   caGrid Websites:
     • https://cabig.nci.nih.gov/workspaces/Architecture/caGrid - Primary
       website, with links to software and documentation
     • http://cagrid.org – Primary developer site for caGrid.
     • https://cagrid-portal.nci.nih.gov/portal/ - Lists available services and
       participants (under construction)

•   caGrid Users Mailing List
     • https://list.nih.gov/archives/cagrid_users-l.html
     • cagrid_users-l@list.nih.gov

•   Materials from caBIG Developer’s Boot Camp, April 2007
     • https://cabig.nci.nih.gov/training/2007_boot_camp

•   Article about caGrid in Bioinformatics, 2006
     • http://bioinformatics.oxfordjournals.org/cgi/content/full/22/15/1910
Lesson 7: Focus on the caBIG™ Data
Sharing and Security Framework

• This lesson provides a closer look at the Data Sharing and Security
  Framework, which is designed to facilitate appropriate data sharing
  between and among organizations by addressing legal, regulatory,
  policy, ethical, proprietary, contractual, and socio-cultural barriers.

• Learning Objectives for this Lesson:

    •   Introduce the Data Sharing and Security Framework
    •   Outline the End User Benefits of Data Sharing
    •   Learn How to Use the Framework as a Decision Support Tool
    •   Review the Next Steps toward Adopting the Framework


• This framework was established and is being
  further developed by the caBIG™ Data
  Sharing and Intellectual Capital (DSIC) Workspace.
caBIG™ Data Sharing and
Security Framework




When fully developed, the Data Sharing and Security Framework will
consist of:

• caBIG™ Policies
• Processes and Best Practices
• Model Documents
• Trust Fabric

[Figure: layered stack, top to bottom: Compatible Applications; Data Sharing
& Security Framework; caGrid; Data for Sharing.]
Lesson 7: Focus on the caBIG™ Data Sharing and Security Framework
End User Benefits of Data Sharing


• WHY SHARE DATA?
    • The large volumes of research data created by the high throughput
      genomics and proteomics technologies can best be harvested by
      teams of individuals - rather than by single PIs.
    • To realize the scientific and public health benefits of translational
      and personalized medicine, collaboration across and within
      disciplines is required to leverage broad knowledge and skill
      bases.
    • Data sharing raises the visibility of individual studies and data
      collections; it opens avenues of data dissemination and validation
      – leading to more citations in publications and increased
      prominence.
    • NIH grant applications requesting $500,000 or more in direct costs
      in any single year must include a plan for data sharing.
Lesson 7: Focus on the caBIG™ Data Sharing and Security Framework
Participating in the DSIC Workspace

•   If you are in a Cancer Center participating in the “Getting Connected
    with caBIG™” deployment effort, you will need to participate in the
    activities of the DSIC Workspace in order to facilitate data sharing.
•   Your participation will involve assisting with the review and refinement
    of the tools the Workspace is developing to assist researchers and
    institutions in their data sharing initiatives. You will be asked to
    evaluate using these products in your own institution.
•   These DSIC tools include such documents and processes as:
    •   Using the caBIG™ Data Sharing and Security Framework as a decision
        support tool
    •   Model documents such as data sharing plans for use with IRBs,
        standardized click-through data use agreements between providers and
        users of data, model informed consent provisions for sharing data via
        caBIG™
    •   Policy papers on topics such as data de-identification, incentives to share
        data, and timeframes for sharing unpublished data
Lesson 7: Focus on the caBIG™ Data Sharing and Security Framework
Understanding the Framework


• You can use the caBIG™ Data Sharing and Security Framework as a
  decision support tool to facilitate data sharing at your Center by
  determining which data can be shared and under which type of access
  and data security controls. To do so, you will need to assess the
  sensitivity of the data by using the Framework's four elements:

    •   Proprietary Value
    •   Data Sensitivity
    •   IRB or Institutional Restrictions
    •   Sponsor Restrictions


• The organization assesses the data to be shared along the four
  elements and assigns a low, medium or high sensitivity rating to the
  data, which drives the selection of the sharing mechanism.
Lesson 7: Focus on the caBIG™ Data Sharing and Security Framework
caBIG™ Data Sharing and Security Framework – Roll-up View
caBIG™ DSIC WS Framework for Data Sharing Terms and Conditions


• Data/Images/Specimens are assessed along four elements:

    •   Economic/Proprietary/IP Value (Need for Protection from Institution
        or PI Perspective)
        Examples: Is the data subject to a restrictive license? Is it related
        to an invention report you have or intend to file with your
        institution?
        Low: None/Low
        Medium: Medium
        High: High

    •   Data Sensitivity (Privacy/Security Considerations – Legal/Regulatory)
        Do federal or state law or your institution's policies prohibit or
        restrict disclosure?
        Low: De-Identified/Anonymized Data Set
        Medium: Coded/Limited Data Set
        High: Identifiable Data

    •   IRB/Institutional Restrictions (Human Subjects Considerations –
        Ethical)
        Do your institution's or IRB's policies or the applicable informed
        consent documents explicitly or implicitly restrict or permit
        disclosure (e.g., “no commercial use”)?
        Low: Explicit Permission for Registry Participation
        Medium: Policy Limitations
        High: Explicit Consent Limitations/Restrictions

    •   Sponsor Restrictions (Grant or Contract Terms and Conditions)
        Do terms and conditions in any sponsored agreements prohibit or
        restrict disclosure outside the institution or to caGrid?
        Low: No Restrictions
        Medium: Delays or Other Moderate Restrictions
        High: Classified Research/Major Restrictions

• The element ratings roll up to a sharing mechanism:

    •   “EZ Pass” - General Website Terms of Use
        ALL of the following:
        - no IP value
        - low sensitivity data
        - no IRB restrictions
        - no sponsor restrictions

    •   Standardized Click-Through Terms and Conditions
        ANY of the following:
        - moderate IP value
        - moderate sensitivity data (e.g., LDS)
        - limited institutional or IRB policy restrictions
        - moderate sponsor restrictions

    •   Individually Negotiated Bi-Lateral or Multi-Lateral MTA
        ANY of the following:
        - high IP value
        - high sensitivity data (e.g., PHI)
        - significant IRB/consent restrictions
        - major sponsor restrictions
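
The roll-up logic in the view above can be sketched in a few lines of Python. This is only an illustration, not part of the Framework: the names `Level`, `ELEMENTS` and `roll_up` are hypothetical, and the sketch assumes that a high rating on any element takes precedence over a medium rating. An organization may weight the elements differently according to its own judgment.

```python
from enum import IntEnum


class Level(IntEnum):
    """Rating assigned to each of the four Framework elements."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical keys for the four elements named in the Framework.
ELEMENTS = (
    "proprietary_value",
    "data_sensitivity",
    "irb_restrictions",
    "sponsor_restrictions",
)


def roll_up(ratings):
    """Return the overall rating for a data set.

    LOW only if ALL elements are LOW; MEDIUM if ANY element is MEDIUM
    and none is HIGH; HIGH if ANY element is HIGH. Taking the maximum
    of the four ratings expresses exactly that rule.
    """
    missing = [e for e in ELEMENTS if e not in ratings]
    if missing:
        raise ValueError("missing ratings for: " + ", ".join(missing))
    return max(ratings[e] for e in ELEMENTS)


# Example: a coded/limited data set with no other restrictions.
example = {
    "proprietary_value": Level.LOW,
    "data_sensitivity": Level.MEDIUM,
    "irb_restrictions": Level.LOW,
    "sponsor_restrictions": Level.LOW,
}
overall = roll_up(example)
print(overall.name)
```

Using the maximum keeps the rule conservative: one restrictive element is enough to move the whole data set to a more controlled sharing mechanism.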
Lesson 7: Focus on the caBIG™ Data Sharing and Security Framework
Understanding the Framework


• The organization arrives at an overall level of sensitivity for the data by
  weighting the outcomes of the four elements according to its own
  judgment.
• The outcome, that is, a low, medium or high sensitivity rating,
  determines how the organization wants to control access to that data.
• The organization offering to share data determines the controls on
  access to that data by determining
    • The level of certainty needed regarding the authentication of the identity of
      data users, and
    • Whether particular authenticated groups or individuals are authorized to
      access the particular data.
• The levels of security attached to data sensitivities of various levels
  are informed by guidance from the National Institute of Standards and
  Technology (NIST).
        The assessment is discussed on the following four slides.
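
The two access-control decisions described on this slide (how certain the provider must be of a user's identity, and whether that user or group is authorized) can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the class names, the `may_access` function and the numeric 1 to 3 assurance scale are hypothetical and are not taken from caGrid or from NIST guidance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataUser:
    name: str
    assurance_level: int            # hypothetical scale: 1 (weak) to 3 (strong)
    groups: frozenset = frozenset()


@dataclass(frozen=True)
class SharedDataSet:
    name: str
    required_assurance: int         # minimum certainty about user identity
    authorized_groups: frozenset = frozenset()  # empty: any authenticated user


def may_access(user, data):
    """Apply the two checks the data provider controls."""
    # 1. Authentication: is the user's identity established strongly enough?
    if user.assurance_level < data.required_assurance:
        return False
    # 2. Authorization: is the user in a group allowed to see this data?
    if not data.authorized_groups:
        return True
    return bool(user.groups & data.authorized_groups)


registry = SharedDataSet("tumor registry", required_assurance=2,
                         authorized_groups=frozenset({"consortium"}))
alice = DataUser("alice", assurance_level=3, groups=frozenset({"consortium"}))
bob = DataUser("bob", assurance_level=1, groups=frozenset({"consortium"}))

print(may_access(alice, registry))
print(may_access(bob, registry))
```

In this sketch the provider sets both thresholds per data set, which mirrors the slide's point that the organization offering the data decides the controls.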
Lesson 7: Focus on the caBIG™ Data Sharing and Security Framework
Proprietary Value


• Relates to the Need for Protection. Sample Questions: Are the data
  subject to a restrictive license? Do the data relate to an invention
  report you have, or intend to file, with your institution? Is the study
  closed? Are the data or study findings awaiting publication?

• The Framework asks you to select the category of proprietary value
  that best describes your data:

                        None/Low – Not subject to
                  restrictive license or invention report

           Medium – Data not yet submitted for publication

               High – PHI, nonpublic intellectual property
                     or other significant restriction
Lesson 7: Focus on the caBIG™ Data Sharing and Security Framework
Data Sensitivity


• Relates to Privacy and Security. Sample Question: Do federal or
  state laws or your institution's policies prohibit or restrict disclosure?

• The Framework asks you to select the category of sensitivity that best
  describes your data:


                     Low Sensitivity - De-Identified/
                        Anonymized Data Set


                    Medium - Coded/Limited Data Set


                    High Sensitivity - Identifiable Data
Lesson 7: Focus on the caBIG™ Data Sharing and Security Framework
IRB or Institutional Restrictions


• Relates to Human Subjects Research Considerations. Sample
  Question: Do your institution's or IRB's policies or the applicable
  informed consent documents explicitly or implicitly restrict or permit
  disclosure (e.g., “no commercial use”)?

• The Framework asks you to select the level of restriction that best
  describes your data:

                         Low - Explicit Permission
                         for Registry Participation

                        Medium - Policy or Consent
                               Limitations

                          High - Explicit Consent
                          Limitations/Restrictions
Lesson 7: Focus on the caBIG™ Data Sharing and Security Framework
Sponsor Restrictions


• Relates to Restrictions from Sponsors in Grants and Contracts.
  Sample Question: Do the terms and conditions in any sponsored
  agreements prohibit or restrict disclosure outside the institution or to
  the Grid?

• The Framework asks you to select the level of sponsor restriction that
  best describes your data:


                            Low - No Restrictions

                          Medium - Delays or Other
                           Moderate Restrictions

                         High - Classified Research/
                             Major Restrictions
Lesson 7: Focus on the caBIG™ Data Sharing and Security Framework
Using the Framework as a Decision Support Tool


  • Using the Framework as a decision support tool can help you
    determine the structures and mechanisms needed to share the data
    under consideration:

                         General Website Terms of Use
                          No Restrictions on Access

             Standardized Click-Through Terms and Conditions-
                  Some Limitations on Access to the Data

        Individually Negotiated Bi-Lateral or Multi-Lateral Agreement-
                      More Restricted Access Conditions

      Note: Each organization must select the type of agreement that best
      fits its needs. The Data Sharing and Security Framework is a
      decision support tool, not a strict policy or guideline.
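
As a rough sketch of how the overall rating maps to an agreement type, the lookup below pairs each rating with the mechanism listed on this slide. The names `MECHANISMS` and `suggest_mechanism` and the `chosen` parameter are hypothetical. The override exists because the Framework only suggests a mechanism, and the organization makes the final choice.

```python
from typing import Optional

# Suggested mechanism for each overall sensitivity rating, as on the slide.
MECHANISMS = {
    "low": "General Website Terms of Use",
    "medium": "Standardized Click-Through Terms and Conditions",
    "high": "Individually Negotiated Bi-Lateral or Multi-Lateral Agreement",
}


def suggest_mechanism(rating: str, chosen: Optional[str] = None) -> str:
    """Return the organization's own choice if given, else the suggestion."""
    key = rating.strip().lower()
    if key not in MECHANISMS:
        raise ValueError("rating must be low, medium or high")
    # The Framework is advisory, so an explicit choice always wins.
    return chosen if chosen is not None else MECHANISMS[key]


print(suggest_mechanism("Medium"))
print(suggest_mechanism("low", chosen=MECHANISMS["medium"]))
```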
Lesson 7: Focus on the caBIG™ Data Sharing and Security Framework
Next Steps


•   If you are in a Cancer Center participating in the “Getting Connected
    with caBIG™” deployment effort, complete the following steps to
    begin implementing the caBIG™ Data Sharing and Security
    Framework:

    1. Visit the DSIC Workspace to review workspace activities and identify a
       qualified Cancer Center representative who can best contribute to the
       review of future workspace products, such as model documents, policies
       and best practices. Much of DSIC's work occurs in two special interest
       groups (SIGs): the Regulatory SIG and the Proprietary SIG.
    2. Contact the DSIC WS Lead, Marsha Young (young_marsha@bah.com),
       with your representative's name, contact information, and area of
       interest/expertise.
    3. Join the DSIC listservs to receive announcements.
       CABIG_DSIC-L - Data Sharing and Intellectual Capital Workspace
       CABIG_DSIC_PRO_SIG-L - Working Group - Proprietary SIG
       CABIG_DSIC_REG_SIG-L - Working Group - Regulatory SIG
Lesson 7: Focus on the caBIG™ Data Sharing and Security Framework
Lesson Review


• In this lesson, we:

    • Presented the purpose of the Data Sharing and Security
      Framework
    • Outlined the benefits of data sharing
    • Learned how to use the framework as a data sharing decision
      support tool
    • Summarized the next steps toward adopting the framework




  See Also: Getting Connected with caBIG™ Data Sharing and
  Security Framework (PDF)
Lesson 8: Conclusion


• This training program has:

   • Described the caBIG™ initiative and its goals.
   • Pointed to the location of caBIG™ software tools and described their
     benefits.
   • Explained the roles of interoperability, caGrid, and data sharing in
     achieving caBIG™ goals.
   • Described specific actions that your organization can take to connect
     with caBIG™, including adopting caBIG™ tools and adapting your own
     tools to become caBIG™ compatible.
Lesson 8: Conclusion
Next Steps


  • The caBIG™ website is the best centralized resource for more
    information: https://cabig.nci.nih.gov/

  • To send in a specific question about caBIG™, write to:
    caBIGconnect@cancer.gov

  • The Learning Management System includes both the caCORE
    training curriculum and self-paced training modules about tools:
    http://ncicbtraining.nci.nih.gov
Lesson 8: Conclusion
caBIG™ Essentials Training Evaluation


• This training program was developed by the caBIG™ Documentation
  and Training Workspace, and will be updated as deployment
  experiences continue.

• The following screens ask three feedback questions about this
  training – your responses will help us better meet community needs.

• Thanks for connecting with caBIG™!
Training Credits


• This training was developed by the caBIG™ Documentation and Training
  Workspace. Special thanks go to:

    •   Tom Casavant - Holden Comprehensive Cancer Center at the University of Iowa
    •   Robert Freimuth - Mayo Clinic Comprehensive Cancer Center
    •   Brooke Hatcher - Booz Allen Hamilton
    •   Eugene Kraus - Meyer L. Prentis/Karmanos Comprehensive Cancer Center
    •   Salvatore Mungal - Duke Comprehensive Cancer Center
    •   Jamie Parker - ScenPro
    •   Kenneth Smith - Herbert Irving Comprehensive Cancer Center - Columbia
        University
    •   Many slides pertaining to caGrid, its architecture, security and its implementation
        were drawn from publicly available presentations. Original authors included Scott
        Oster and Shannon Hastings of Ohio State University, and Patrick McConnell of
        Duke University. We are grateful for these contributions!

• D&T Workspace Leads:
  Leslie Derr (NCI–CBIIT) & Jennifer Tucker (OKA – Otto Kroeger Associates)