project description for the graduate course by luckboy


Project Description of the Graduate Course

Distributed Database Systems
Department of Computer Science and Technology, Tsinghua University, 2008-11-7

This project, an experimental DDBMS, is an integral part of the graduate course Distributed Database Systems. It accounts for 60% of the course mark, of which 50% is for system design and implementation and 10% is for project documentation. Students in this course are required to form teams of three members to complete the project. The purposes of the project are:
• To help students gain a deep and insightful understanding of distributed database systems through a hands-on software design and implementation experiment.
• To strengthen students' problem-solving ability by giving them the system requirements while leaving the design and implementation issues for them to solve on their own.
• To nurture team spirit through cooperative work on a joint project.



1. System Requirements
Every student team shall build a DDBMS (Distributed Database Management System) with the functions described below.

1.1 Distributed Database Building
(1) Support for creating a distributed database, including the creation of a named database and of tables, and the following data fragmentation methods:
  • Horizontal fragmentation, compulsory
  • Vertical fragmentation, compulsory
  • Hybrid fragmentation, optional
  • Derived fragmentation, optional
(2) Building of the data dictionary.
(3) Loading of data from prepared files into the database according to the database schema definition and the fragmentation and allocation scheme. The format of the data file is given in Table 1. Note that if a field is of character type, its values are enclosed in single quotation marks.
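The two compulsory fragmentation methods in 1.1 can be illustrated with a minimal in-memory sketch. It is only an illustration, not a prescribed design: the function names (`horizontal_fragments`, `vertical_fragments`, `rejoin`) and the second sample row are hypothetical, while the `student` schema and the predicates `id > 1000` and `id <= 1000` are borrowed from the example command script in Table 2. A real implementation would store each fragment in the local DBMS of a site rather than in Python lists.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def horizontal_fragments(rows: List[Row],
                         predicates: List[Callable[[Row], bool]]) -> List[List[Row]]:
    """Split rows into one fragment per predicate.

    Each row must satisfy exactly one predicate, so the fragments are
    complete (no row is lost) and disjoint (no row is duplicated).
    """
    fragments: List[List[Row]] = [[] for _ in predicates]
    for row in rows:
        hits = [i for i, pred in enumerate(predicates) if pred(row)]
        if len(hits) != 1:
            raise ValueError(f"row {row} matches {len(hits)} fragments, expected 1")
        fragments[hits[0]].append(row)
    return fragments


def vertical_fragments(rows: List[Row], key: str,
                       column_groups: List[List[str]]) -> List[List[Row]]:
    """Project rows onto each column group.

    The key column is repeated in every fragment so that the original
    table can be reconstructed by joining the fragments on the key.
    """
    fragments = []
    for group in column_groups:
        columns = [key] + [c for c in group if c != key]
        fragments.append([{c: row[c] for c in columns} for row in rows])
    return fragments


def rejoin(fragments: List[List[Row]], key: str) -> List[Row]:
    """Reconstruct full rows from vertical fragments by joining on the key."""
    merged: Dict[object, Row] = {}
    for fragment in fragments:
        for row in fragment:
            merged.setdefault(row[key], {}).update(row)
    return list(merged.values())


# Sample rows for the student(id, name, rank) table; the second row is made up.
students = [
    {"id": 10, "name": "fanju", "rank": 10},
    {"id": 2000, "name": "li", "rank": 3},
]

# Horizontal: student.1 holds id > 1000, student.2 holds id <= 1000.
student_1, student_2 = horizontal_fragments(
    students, [lambda r: r["id"] > 1000, lambda r: r["id"] <= 1000])

# Vertical: one fragment carries name, the other carries rank.
by_name, by_rank = vertical_fragments(students, "id", [["name"], ["rank"]])
```

Raising an error when a row matches zero or several predicates is one way to enforce completeness and disjointness at load time; a team may equally check the predicates once when the fragmentation is defined.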

Table 1. Format of the data file

    Table1_Name (field1, field2, …, fieldn)
    100
    value1,value2,…,valuen
    ……
    value1,value2,…,valuen
    Table1_Name (field1, field2, …, fieldm)
    200
    value1,'value2',…,valuem
    ……
    value1,'value2',…,valuem
    …….

Note that the data file will not be uploaded to the website until 2 weeks before the project benchmark. You can generate some sample files to test the import function.

1.2 SQL Support
Support the SQL query statement select … from … where …. The where clause allows users to express simple predicates connected by logical AND. Note that this is a minimal form of the select statement; other features of the select statement are not required for the project.

1.3 Functions of the DDBMS
Either a graphical user interface or a command line may be used to invoke the following functions:
• createdb, for creating a database
• dropdb, for deleting a database
• createTable, for creating a relational table, including its fragmentation strategy (horizontal, vertical, etc.)
• dropTable, for deleting a table from the database
• select … from … where …, for database queries
• insert, for adding a record to a table
• delete, for deleting a record from a table
Note that the project does not require a complete SQL language system, but the statements above are necessary for the testing and final evaluation of the DDBMS. An example command script is listed in Table 2. You can either follow this format or use your own.
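The minimal where clause described in 1.2 (simple predicates connected by logical AND) can be handled without a full SQL parser. The sketch below is one possible approach under stated assumptions: the helper names `parse_where` and `matches` are hypothetical, the set of comparison operators is assumed because the project description does not list them, and literals are assumed to be either integers or quoted strings.

```python
import operator
import re

# Assumed operator set; the project description does not enumerate it.
_OPS = {
    "=": operator.eq,
    "<>": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}

# column, operator, literal; two-character operators are tried first.
_PREDICATE = re.compile(r"^\s*(\w+)\s*(<=|>=|<>|=|<|>)\s*(.+?)\s*$")


def parse_where(clause):
    """Parse 'a > 1 and b = "x"' into a list of (column, op, value) triples.

    Limitation of this sketch: a string literal that itself contains
    ' and ' would be split incorrectly.
    """
    predicates = []
    for part in re.split(r"\s+and\s+", clause.strip(), flags=re.IGNORECASE):
        match = _PREDICATE.match(part)
        if match is None:
            raise ValueError(f"cannot parse predicate: {part!r}")
        column, op, literal = match.groups()
        quoted = (len(literal) >= 2 and literal[0] in "'\""
                  and literal[-1] == literal[0])
        # Quoted literals are strings; everything else must be an integer.
        value = literal[1:-1] if quoted else int(literal)
        predicates.append((column, op, value))
    return predicates


def matches(row, predicates):
    """Return True if the row satisfies every predicate (logical AND)."""
    return all(_OPS[op](row[column], value) for column, op, value in predicates)


# Usage with the student table from the example command script.
predicates = parse_where("id <= 1000 and name = 'fanju'")
row = {"id": 10, "name": "fanju", "rank": 10}
selected = matches(row, predicates)
```

Keeping the predicates as plain triples, rather than compiling them straight into a filter function, makes it easy to compare them later against fragment definitions when the query is localized.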
Table 2. An example command script

    definesite site1 localhost 12345  // define site
    definesite site2 localhost 23456  // define site
    createdb ddb  // create a database with name "ddb"
    createtable student (id int key, name char(10), rank int)  // create a table
    fragment student horizontally into id > 1000, id <= 1000  // horizontal fragmentation
    allocate student.1 to site1  // allocation
    allocate student.2 to site2  // allocation
    insert into student(id, name, rank) values(10, 'fanju', 10)  // insert
    delete from student where id = 10  // deletion
    import infile "D:/data.txt" into database ddb  // import from a file
    select * from student where name = "fanju"  // query
    // for more select clauses, refer to the 2007 benchmark documentation
    drop table student  // drop the table
    dropdb ddb  // drop the database

1.4 System Environment
The DDBMS shall run on three computers (sites) connected by a network. The architecture of the system must be peer-to-peer (P2P). At each site, the local DBMS can be any commercial product or open-source software. The communication mechanism between sites can be sockets, RPC, or another mechanism. Each team is free to choose the programming language for its DDBMS.

1.5 Distributed Query Processing
The DDBMS shall have components for query decomposition and localization and for query optimization, such as:
• Optimization of the initial general query tree
• Query tree reduction using fragmentation information
• Network traffic optimization
Note that the effect of the selected optimization method must be presented on the screen. For example, displaying the initial query tree and the optimized tree to show the optimization result would be a good option.

1.6 Documentation
For the project, every team shall hand in the following two reports:
• Design report: This report is due in the week of Nov. 21. It shall include the preliminary design and implementation of the DDBMS and a timeline for the project work plan.
• Final report: This report is due before the on-site evaluation of the DDBMS. It shall include the details of the design and implementation of the DDBMS, such as the architecture, the query optimization method, and the implementation of the communication protocols between sites.
An example of the documentation is available on the course site:

2. Project Team
The DDBMS project shall be completed by a team of three students. One of the three students shall take the leading role in the project. The project workload should be distributed properly among the members. This workload allocation must be included in the final project report.
