NEESGrid Data and MetaData Technology
Kincho Law, Jun Peng, Jim Eng, Terry Weymouth, Paul Hubbard, Charles Severance
Outline
Model activities
Metadata tools and repository
Data as video
What is done
What we have to do
The Slide
Data Ingestors
This is the layer where we develop tools that take advantage of the “meaning” of the data and begin to depend on it, for example on the meaning of a second.
Data
measurement distance mile travel model car vehicle volume gallon fluid gas unit mileage consumption efficiency fuel estimate rate numerator ratio denominator
Data Mappers
Where we make a viewer capable of viewing a certain type of object. This is where we build things which make use of knowledge. This layer will never be complete but it is a large focus of the coming months.
Metadata
Data Viewers
09/2003
How to prioritize model exploration and development
Focus on the following areas:
– Areas where we have or are building tools
– Areas where we already have incoming data in some format
– Build the model through experiment-based deployment: solve real problems in an open way and see if (with some adaptation) the solutions apply more broadly (i.e., Minnesota)
11/2003
Go Forward - Tools
Evaluate the ORST interface and use it to implement an experiment-based interface to the metadata repository
Investigate tools to represent structural data (like SAC data)
Extend and improve viewers – publish an API so that sites can extend the viewers
Improve notebook
– Single sign-on using CHEF/Grid credentials
– Integration with Metadata
– Smoother integration with CHEF
Explore automated synchronized video and data capture and after-experiment replay of synchronized video and data (ORST, UMinn)
Explore the capture of high-quality still images as data (UMinn)
Investigate adopting a data-editing tool (XMLSpy)
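One way to picture the synchronized replay item is as a timestamp alignment problem: for each video frame, find the data sample recorded closest to it. The following is only a minimal sketch under stated assumptions. The function names, the use of integer millisecond timestamps, and the shared clock are all hypothetical and are not taken from any NEESgrid tool.

```python
from bisect import bisect_left

def nearest_sample(sample_times, t):
    """Return the index of the sample whose timestamp is closest to t.

    sample_times must be sorted ascending and non-empty (an assumption
    of this sketch).
    """
    i = bisect_left(sample_times, t)
    if i == 0:
        return 0
    if i == len(sample_times):
        return len(sample_times) - 1
    # Pick whichever neighbour is closer; ties go to the earlier sample.
    before, after = sample_times[i - 1], sample_times[i]
    return i - 1 if (t - before) <= (after - t) else i

def align_frames(frame_times, sample_times):
    """Pair each video frame time with the index of the nearest data sample."""
    return [(ft, nearest_sample(sample_times, ft)) for ft in frame_times]

# Hypothetical timestamps in milliseconds on a shared clock:
# video at 2 frames per second, data sampled every 100 ms.
frames = [0, 500, 1000]
samples = list(range(0, 1001, 100))
pairs = align_frames(frames, samples)
```

A replay tool built this way would step through `pairs` and show each frame next to its matching sample. Interpolating between samples instead of picking the nearest one is an equally plausible design.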
09/2003
RDF Integration
Some of the data and metadata task force members are using Protégé-2000 to develop their models and are expressing them in RDF.
RDF and NEESML are very similar but not identical, so it may be challenging to ingest arbitrary RDF.
We expect that we will be able to map a subset of RDF to NEESML for ingestion, or adapt an RDF parser (Jena or Raptor) to ingest that subset directly into the repository.
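The "subset of RDF" idea can be illustrated by treating RDF as a list of subject/predicate/object triples and accepting only those whose predicate the ingest step understands. This is a sketch, not the NEESgrid ingest code: the whitelist, the function name, and the example.org resources are hypothetical, and nothing here reflects the actual NEESML format.

```python
# Hypothetical whitelist of predicates the ingest step can handle.
# The two URIs are the standard rdf:type and rdfs:label properties;
# choosing them as "the supported subset" is an assumption of this sketch.
SUPPORTED_PREDICATES = {
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
    "http://www.w3.org/2000/01/rdf-schema#label",
}

def split_supported(triples, supported=SUPPORTED_PREDICATES):
    """Partition triples into (accepted, rejected) by predicate."""
    accepted, rejected = [], []
    for triple in triples:
        _subject, predicate, _obj = triple
        if predicate in supported:
            accepted.append(triple)
        else:
            rejected.append(triple)
    return accepted, rejected

# Hypothetical triples describing a shake table resource.
triples = [
    ("http://example.org/table1",
     "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
     "http://example.org/ShakeTable"),
    ("http://example.org/table1",
     "http://www.w3.org/2000/01/rdf-schema#label",
     "Table 1"),
    ("http://example.org/table1",
     "http://example.org/unmodelledProperty",
     "42"),
]
accepted, rejected = split_supported(triples)
```

Reporting the rejected triples back to the model author, rather than dropping them silently, would make it clear which parts of a model fall outside the ingestible subset.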
11/2003
Data Model Activities
Groups have been formed to develop coordinated models
– Shake Table
– Centrifuge
– Tsunami
– GeoTech
Gokhan Pekcan is organizing this effort
Kincho / Jun / Jim pushed forward with an RDF-based Shake Table model from Protégé-2000 and developed software
Models + Data Model
[Diagram labels: Data, RDF Load, Repo, Models, RDF/OWL, Protégé-2K, Configure]
Electronic Notebook
Collaborative effort with the DOE SciDAC
– Electronic notebook - metadata entry
– Data mapping
– Data provenance
– Data display
– Slide data/metadata (jakarta.apache.org/slide/)
Ultimate integration will be via JSR-170
www.scidac.org/SAM/
collaboratory.emsl.pnl.gov/docs/collab/sam/samtechoverview.html
DOE ELN / NEESgrid Integration (to date)
MyProxy
NEESgrid Repository
Chef Grid Security SAM / Slide Repository
Technology Celebration
DOE ELN / NEESgrid Integration (ultimate)
MyProxy
NEESgrid Repository
Chef Grid Security
DAQ
“Skunkworks” project
09/2003
Data Turbine
Commercial, free data streaming toolkit
Data Turbine (cont)
Existing data viewers will be adapted to access and display data from Data Turbine
Data acquisition software will be adapted to place information in Data Turbine channels
Metadata elements will be developed to represent Data Turbine live, stored, and derived channels
New efforts (video as data) will be developed from the ground up using Data Turbine
outlet.creare.com/rbnb/
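The channel idea behind these items can be sketched as a set of named, bounded buffers: acquisition software writes timestamped values into a channel, and viewers fetch a time range from it. The class below is a toy in-memory stand-in written for illustration. It is not the Data Turbine API, and the class name, method names, and channel names are all hypothetical.

```python
from collections import deque

class ChannelBuffer:
    """Toy stand-in for a streaming server with named channels.

    Each channel keeps only its most recent `capacity` entries,
    so old data is dropped as new data arrives.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.channels = {}

    def put(self, channel, timestamp, value):
        """Append a timestamped value to a channel, creating it if needed."""
        buf = self.channels.setdefault(channel, deque(maxlen=self.capacity))
        buf.append((timestamp, value))

    def fetch(self, channel, start, end):
        """Return buffered (timestamp, value) pairs with start <= timestamp <= end."""
        return [(t, v) for t, v in self.channels.get(channel, ())
                if start <= t <= end]

# A hypothetical DAQ source writes four samples into a channel that holds three.
server = ChannelBuffer(capacity=3)
for t in range(4):
    server.put("daq/strain", t, t * 10)

# A viewer asks for everything between t=0 and t=10; the oldest sample is gone.
recent = server.fetch("daq/strain", 0, 10)
```

In this picture a stored channel would be one whose contents are also written to permanent storage, and a derived channel would be one filled by a client that reads other channels. Both readings are assumptions about the terms on the slide.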
11/2003
Video as Data
Follow-on to initial demonstration at ORST
Experiment-based development: Minnesota
Design phase complete
Joint effort: NEESgrid SI, ORST, Minnesota, UC Davis, Texas, Buffalo, and others
Data Turbine - Today
CTL NTCP Plugin NTCP Control Control Plugin
DT Main System Axis
AXIS / DT Gateway
BT848
rbnbjcap DT Client
NEES NSDS Driver
DAQ
Data Capture DT Client
Data Turbine
Control
NTCP Control
Control Plugin
Make Smoothie
DT Main System
Thumbs
Technology Celebration
DT Capturing
PTZ/ USB Still Capture DT Client
Each still capture produces two channels: a small 1-5 fps stream and large single images when a picture is taken
Camera Control Control Plugin DT Main System
BT848
rbnbjcap DT Client
Audio
Audio Encoder DT Client
NEES NSDS Driver
Still Capture - Minnesota / Paul Hubbard
Video capture - From Creare
Audio capture - From Creare (TBD)
Data Capture - From sites (upwards compatible)
NEES NSDS Driver - Paul Hubbard
Camera Control Plugin - Mich / Minn
DAQ
Data Capture DT Client
User Views / Still Camera
Control Plugin DT Main System
Still Image / Camera Control
Thumbnail + Audio + Data
Thumbnail - UBuffalo / UMichigan
Thumbnail viewer - Creare / Mich
Camera Control Applet - Minn / Mich
QuickTime Slicing tool - Mich (low)
Stored Data Viewer - Mich
JPEG Viewer - Creare
QuickTime Viewer - Apple
Thumbnail Process
Quicktime Storage System
Quicktime Slicing Tool
Data Viewer
Minnesota Mock-up
If you are a developer and interested in following, helping with, or participating in this activity, join the mailing list:
neesgrid-dv@neesgrid.org
Tool List - To Do
Next release of repository
Integrate ELN into repository
DAQ Control Panel in CHEF
– Set/Retrieve Metadata
– Start / Stop
– Ingest data from staging space
Data Turbine Control Panel in CHEF
– Start / Stop / Configure Sources
• Video | Audio | Data | Thumbnail
– Control permanent storage of video
To Do (cont)
NTCP Debugging and Monitoring in CHEF
– Needed Data Turbine
Data as Video Client Tools in CHEF
– New Monitor Tool
– Still Image
– Camera Control
Data Turbine Audio Capture
We may need to support XML Schema
QuickTime Capabilities
– Archive, retrieve, slice, dice, convert, present
– Probably will not be completed as part of SI effort
Data Model Work
Data Curation Summit
– Understand issues, form go-forward plan
– Meeting 3/18/2004
Data and Metadata Task Force
– Finish the tsunami and centrifuge models
DSAC Committee
– Meeting 3/19/2004
Summary
In September 2003, we met and “re-visioned” data
A good deal of requirements gathering and development has been done
– The “high risk” elements are working now
There is more to do - We will run out of time
– Evolutionary development approach - there will always be usable working code - we will stop when we run out of time
The people…
Gokhan Pekcan - Data Models
Kincho Law - Data Models / Software design
Jun Peng - Data Models / Software design
Jim Eng - Parse / Ingest / RDF / Project Browser
Jim Myers - Electronic Notebook
Terry Weymouth - Data Turbine
Paul Hubbard - DAQ and NTCP
Joe Futrelle - Data / Metadata Repository