Design and Rationale of the CenSSIS Image Database System

Student: Huanmei Wu (CCS, NU)
Advisors: Prof. David Kaeli (ECE, NU), Prof. Betty Salzberg (CCS, NU)
Contributor: Becky Norum (CenSSIS, NU)

This work was supported in part by CenSSIS, the Center for Subsurface Sensing and Imaging Systems, under the Engineering Research Centers Program of the National Science Foundation (Award Number EEC-9986821).

Abstract

The CenSSIS Image Database System (CenSSIS-DB) is a scientific database that enables effective collaborative sharing of scientific data and accelerates fundamental research. We describe a state-of-the-art system that uses the Oracle RDBMS and J2EE technologies to provide remote, Internet-based data access. The system incorporates efficient submission and retrieval of images and metadata, indexing of metadata for efficient searching, and complex relational query capabilities.

2.2 Software Architecture

CenSSIS-DB uses a standard client-server architecture (Figure 4). The system is divided into layers comprising database access, application logic, and presentation.

The components are:
• A user interface written in HTML
• Java™ source code, Java™ Servlets, Enterprise JavaBeans™ (EJB), JDBC™, JavaServer Pages™ (JSP)
• Metadata stored in a relational database system (Oracle)
• Image and data files stored on a separate file server and referenced by pointers in the relational database

Figure 4 diagram labels: Web Client (HTTP); Secure Web Client (HTTPS); J2EE™ Application Server with Web Container (JSP™) and EJB™ Container (Data Services, Notification Services, Miscellaneous Services); RMI; JDBC™; JavaMail™; Oracle RDBMS; SMTP Server; Image File Server.

A few key types of queries are currently available:
• ID Search: the simplest query, based on the image ID, which is assigned uniquely to each image upon submittal.
• Complex Queries: form based, with multiple entries chosen from a list of metadata or entered as criteria. The criteria can be combined with AND or OR operations. Figure 6 shows the execution of a complex query.
• Textual Search: searches the keyword and description fields.
• Subtree Queries (Hierarchical View): as described in Section 3.2.

1. Challenges and Significance

A major barrier facing CenSSIS researchers is the storing, indexing, and sharing of subsurface image and sensor data. The geographical separation of CenSSIS members and the diversity of their disciplines make collaboration a particular challenge. In addition, scientific disciplines such as biology and the earth sciences have recently been generating data at enormous rates, making it difficult for scientists to track and organize these vast repositories. The development of a centralized database system to store, organize and retrieve subsurface imaging data is key to addressing these challenges.

A centralized image database system has several benefits. First, it facilitates data collection for individual members by providing a framework for experimental annotations and variables. Second, it provides a valuable resource for the educational initiatives of CenSSIS by supplying real data for students to use in the classroom. Third, it minimizes the effort individual CenSSIS members must spend managing data sets, freeing their time for analysis and research. Fourth, it forces a consensus on data and imaging standards within the CenSSIS community. These standards will in turn facilitate the development of CenSSIS toolboxes and other data management tools.

3.4 File System Management

The file system helps research groups manage and navigate their data. When a client logs in, the system lists the groups to which the client belongs. The client can upload or view files in any of those groups' file folders. Once the client chooses a group, the client enters the root directory of that group's file folders. The interface is very similar to Windows Explorer: it is easy to upload a file, delete a file from a folder, create a new folder, and change groups.

Figure 4. CenSSIS-DB System Architecture.
The client can interact with the system via a secure (HTTPS) or insecure (HTTP) connection. Web pages are generated by the J2EE™ application server using Java™ Servlets and JavaServer Pages™, which interact with the Enterprise JavaBeans™ to retrieve and submit data. The J2EE™ application server interacts with the database using JDBC™ and with the SMTP server using JavaMail™, and it also accesses the file server.

4. Accomplishments

The database is presently online and being populated with a diverse set of subsurface sensing and imaging data. Research groups from several academic partners and strategic affiliates are using the database and will explore it further. By April 2003, we expect to have more registered users and more image data with accompanying metadata.

For example, Dr. Carol Warner and Dr. Charles DiMarzio of NEU are investigating embryo viability and plan to use the CenSSIS database to facilitate their work. Figure 7 is a sample oocyte image and its associated metadata. Dr. George Chen at MGH has many 4-D CT images, which are processed at RPI. They do not currently have their own web site, and they plan to use CenSSIS-DB to search, query, organize and process their data.

Figure 7. Example of an oocyte and its associated metadata.

In 2002, we redesigned the system to handle collections. The concept of collections is critical to our project's success. For example, in quadrature tomographic microscopy, multiple images of a sample are taken and reconstructed into a final processed image. It is important to store not only the initial raw images but also any and all reconstructions, because additional reconstructions may be executed on the raw data. The new design is implemented using J2EE technology and will be available at the end of this year.

2.3 File Server

The binary data files could be stored in the database itself or in a separate file system.
There were several compelling reasons to store them in a separate file system, with links to the data stored alongside the descriptive metadata:
• Storage of binary data is not standardized across relational database systems.
• A file server is more reliable for storing relatively large amounts of binary data.
• The files are easily accessible to other tools that need to manipulate the data.
• The database holding the metadata is smaller, so searches are more efficient.

2. Technical Approach

2.1 Data Model

Our key considerations in developing a data model and choosing a relational database system were flexibility, extensibility, and reliability. The broad research base of the CenSSIS community means that many different types of image data are generated, each with unique metadata characteristics. Although we have incorporated several types in our data model, it is likely that additional image types will be identified in the future; therefore, designing for extensibility and flexibility in our model is imperative.

2.4 Security

The security of CenSSIS-DB is of special concern because the system is accessible over the World Wide Web. The search and retrieval areas of the system are publicly accessible. Some CenSSIS clients, however, need to restrict access to their data sets. We need not only to restrict access to particular data sets but also to restrict who may submit data, in order to minimize the need to curate data.

A client must select an access permission level when submitting a data set:
• Public: anyone in the world with a web browser
• CenSSIS: registered CenSSIS users
• Client: a registered client
• Group: a predefined group of users

A data set can have different permission levels for viewing and for updating. This functionality allows CenSSIS members to create online communities where they can share privileged information.
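The four permission levels of Section 2.4, applied separately to viewing and updating, can be sketched as a small check. This is only an illustrative sketch in Python, not the J2EE implementation: the names `Access`, `DataSet`, `is_allowed`, `can_view`, and `can_update` are hypothetical, as are the example logins, and the sketch assumes that the "Client" level means the submitting client alone.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional


class Access(Enum):
    PUBLIC = "Public"    # anyone in the world with a web browser
    CENSSIS = "CenSSIS"  # any registered CenSSIS user
    CLIENT = "Client"    # assumption: only the client who submitted the data set
    GROUP = "Group"      # members of a predefined group of users


@dataclass(frozen=True)
class DataSet:
    owner: str                           # hypothetical: login of the submitting client
    view_level: Access                   # level required to view the data set
    update_level: Access                 # level required to update the data set
    group: FrozenSet[str] = frozenset()  # hypothetical: logins in the predefined group


def is_allowed(level: Access, data_set: DataSet,
               user: Optional[str], registered: FrozenSet[str]) -> bool:
    """Return True if `user` (None for an anonymous visitor) satisfies `level`."""
    if level is Access.PUBLIC:
        return True
    if user is None or user not in registered:
        return False  # every non-public level requires a registered user
    if level is Access.CENSSIS:
        return True
    if level is Access.CLIENT:
        return user == data_set.owner
    return user in data_set.group  # Access.GROUP


def can_view(data_set: DataSet, user: Optional[str],
             registered: FrozenSet[str]) -> bool:
    return is_allowed(data_set.view_level, data_set, user, registered)


def can_update(data_set: DataSet, user: Optional[str],
               registered: FrozenSet[str]) -> bool:
    return is_allowed(data_set.update_level, data_set, user, registered)


# Example: viewable by any registered user, updatable only by the group.
registered = frozenset({"alice", "bob", "carol"})
oocytes = DataSet(owner="alice",
                  view_level=Access.CENSSIS,
                  update_level=Access.GROUP,
                  group=frozenset({"alice", "bob"}))
```

Keeping the view and update levels as separate fields mirrors the statement that a data set can carry different permission levels for each operation.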
We have identified a set of common characteristics to be included with all data sets; these are the metadata for all categories, where a category refers to the image type. We then add the additional metadata required for a particular category. Figure 1 presents a partial data model of the system as an entity-relationship (ER) model. Each box in the diagram corresponds to an entity in the database (i.e., a table). An entity has attributes (table fields). Entities can be related to one another using relations.

Figure 1. Partial CenSSIS-DB Data Model.

Two relationships are critical in our model and are key to understanding it. The first is the relationship between the DATA entity and its subtypes, represented by an "IS-A" relationship. This design allows us to extend the DATA entity attributes by creating subtypes with minimal redundancy. It also makes our model flexible and extensible, since we can create new subtypes quickly without negatively impacting the model.

5. Plans

Future research topics include content-based indexing and retrieval (CBIR), multidimensional database indexing, and content-based image tagging and searching. We will develop new tools to ensure seamless image format interchange and develop an advanced graphical interface that allows researchers to annotate and query parts of any image. We will also continue to collaborate with other CenSSIS members to broaden the scope of our data collection. In addition, we plan to add mass-update functionality to the system in order to permit batch submissions of images and data.

3. Applications of CenSSIS-DB

3.1 Data Submission

Data submission is a critical challenge for CenSSIS-DB. Clients can choose to create a new data collection or add data to an existing collection. The client can save some existing files as default settings for later submissions. Figure 5 is an example of the submission page. The metadata goes through a quality check based upon expected values.
Upon submission, a data set is available for retrieval immediately. However, it carries a conditional tag until it is approved by a system administrator.

Figure 5. A sample data submission page.

6. Relation to Center's Mission

The CenSSIS-DB effort is only possible through the combined efforts of many individual researchers working toward a common vision. As the Center was being conceptualized, a number of common barriers were identified across the diverse sensing and imaging domains. Efficient image and sensor data management was identified as one of the Center's seven barriers. Giving researchers the ability to share and search image data efficiently will enable CenSSIS both to develop solutions to problems using real data and to develop new solutions that bridge traditional disciplinary boundaries.

7. Impact/Implications

The classes of imaging problems we are addressing in CenSSIS include medical, environmental, biological, and civil applications. Many of these problems come from the most pressing societal issues in these fields: breast cancer detection, bridge deck assessment, cardiovascular plaque imaging, landmine detection, embryo viability and coral reef assessment.

3.2 Hierarchical View

A client can select a data set as a root element and be given a tree presentation of all of its child nodes. This presentation can be expanded and collapsed on request. It presents data sets in a way that is convenient for the client and easy to navigate.

Figure 2. Representation of the bill-of-materials data structure in the CenSSIS-DB data model. (a) The data_id field in the DATA entity is the primary key. In order to establish relations between data, the DATA_RELATIONS entity contains two fields, both deriving from the data_id field. (b) Observe that element 100 contains elements 200, 210, and 220, and that element 300 is contained by element 210.
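The tree presentation of Section 3.2 rests on the parent/child pairs that Figure 2 stores in DATA_RELATIONS. The following is a minimal sketch of that structure, using Python's built-in SQLite rather than Oracle so that it is self-contained. The table names and the data_id field follow the figure; the column names `parent_id` and `child_id` and the helper functions `roots` and `subtree` are assumptions made for illustration. The node numbers are the example values shown in Figures 2 and 3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data (data_id INTEGER PRIMARY KEY);
    -- Both columns derive from data.data_id; the column names are assumed.
    CREATE TABLE data_relations (
        parent_id INTEGER NOT NULL REFERENCES data(data_id),
        child_id  INTEGER NOT NULL REFERENCES data(data_id),
        PRIMARY KEY (parent_id, child_id)
    );
""")

# Parent/child pairs from the example hierarchy; node 600 has no children.
EDGES = [(100, 200), (100, 210), (100, 220),
         (210, 300), (210, 310),
         (400, 510), (400, 520)]
NODES = sorted({n for edge in EDGES for n in edge} | {600})
conn.executemany("INSERT INTO data (data_id) VALUES (?)", [(n,) for n in NODES])
conn.executemany("INSERT INTO data_relations VALUES (?, ?)", EDGES)


def roots(conn):
    """Data sets that never appear as a child, i.e. root collections."""
    rows = conn.execute("""
        SELECT data_id FROM data
        WHERE data_id NOT IN (SELECT child_id FROM data_relations)
        ORDER BY data_id
    """)
    return [row[0] for row in rows]


def subtree(conn, root_id):
    """Return (data_id, depth) for root_id and all of its descendants."""
    rows = conn.execute("""
        WITH RECURSIVE tree(data_id, depth) AS (
            SELECT ?, 0
            UNION ALL
            SELECT r.child_id, t.depth + 1
            FROM data_relations AS r
            JOIN tree AS t ON r.parent_id = t.data_id
        )
        SELECT data_id, depth FROM tree ORDER BY depth, data_id
    """, (root_id,))
    return rows.fetchall()


print(roots(conn))         # [100, 400, 600]
print(subtree(conn, 210))  # [(210, 0), (300, 1), (310, 1)]
```

Because each relationship is a separate row, a collection can be nested to any depth without changing the schema; the recursive query simply follows the rows downward from the chosen root.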
The second interesting relationship is between the DATA and DATA_RELATIONS entities. This is a bill-of-materials (BOM) data structure, used to represent a data hierarchy. We created a DATA_RELATIONS entity containing the attributes parent and child to associate data sets with one another. This allows us to generate an unlimited number of relationships between data entities (Figure 2) and thus allows clients to organize data sets into collections (Figure 3). The client creates a root collection (refer to nodes 100 and 400 in Figure 3). Within this collection, the client can store images (for example, nodes 200 and 220 in Figure 3) or additional collections (refer to node 210 in Figure 3).

Figure 3. A hierarchical representation of the data relationships presented in Figure 2. The root nodes are nodes 100, 400, and 600. Node 100 has three children: 200, 210, and 220. Node 210 has two children: 300 and 310. Node 400 has two children: 510 and 520. Node 600 has no children.

3.3 Searching Abilities

Special features:
• CenSSIS-DB permits clients to retrieve image/data sets based upon metadata.
• Clients can indicate whether results should be presented one at a time or in a list format.
• Clients can select whether to have thumbnails of the data sets returned with the results for viewing.
• Clients can place data of interest in a cart folder for later review.

Figure 6. An example of a complex query.

Contact Info
Name: Huanmei Wu
Institution: Northeastern University
Office: Egan 225
Phone: (617) 373 7349
Email: hwu@ece.neu.edu

CenSSIS Research and Industrial Collaboration Conference, Dec. 2002
