Computer Laboratory MySQL Biological Databases and Distributed Computing Term

Reviews
Computer Laboratory 4: MySQL 140.637 Biological Databases and Distributed Computing Term 3, 2003-2004 F. J. Pineda The purpose of this lab is to practice the mechanics of creating and loading a MySQL database from tab delimited files. Start by creating a subdirectory named lab4 in your home directory and cd into it. Create all your files in this directory. There are two appendices in this lab write-up Appendix A. is a MySQL cheat sheet, Appendix B. shows example SQL statements for creating tables. 1. MySQL basics MySQL documentation can be found on-line at www.mysql.com/documentation/index.html. I have also included a cheat sheet at the end of this write-up. a. Basics Open a terminal window and start a MySQL session by using the mysql client to connect to the mysql server. Enter the following command (with your user id in place of userid). Enter your password when you are prompted mysql –h bilbo.sph.ad.jhsph.edu –u userid -p Once you have the mysql command line prompt, try using some basic commands to get around. SHOW DATABASES; USE database; SHOW TABLES; DESCRIBE table; # list what databases are available #select a default database # list the tables in the default database # describe the attributes in a table Note that ‘#’ is used to indicate a comment. Everything from the ‘#’ to the end of the line is ignored by mysql (just like in perl). b. Running MySQL batch files It is useful to develop databases using mysql batch files. Exit from mysql. Open an editor window and create a mysql batch file named test.sql. Enter the same commands you entered at the MySQL command line and close the file. Now execute the commands in your batch file by passing the batch file to the standard input of the mysql client: mysql –h bilbo.sph.ad.jhsph.edu –u userid –p < test.sql c. Configuring your mysql client As you no-doubt noticed, it gets very tedious having to retype the host and username every time you want to invoke the mysql client. To solve this problem we will create a configuration file that the mysql client will read everytime it starts up. This configuration file will contain the your userid and the name of the MySQL server. You can put the password here too, but it’s probably not a very good idea. a. In your home directory, create a file name .my.cnf (The leading dot makes the file invisible when you do an ls without specifying the filename.) Change the permissions so only you can read and modify the file. chmod go-rwx .my.cnf Check that you succeeded. ls –l .my.cnf Edit the file and add the following lines (as always, replace userid with your userid). [mysql] user = userid host = bilbo.sph.ad.jhsph.edu d. Creating and dropping databases and tables Use the create and drop commands to create and drop database and tables. You will probably have to create and drop your databases many times in the course of developing your databases. You cannot create a database or table that already exists, and you cannot drop a database or table that does not exist. To prevent these errors it is useful to always drop a database or table with the if exists option, before trying to create it. Try creating a database with the same name as your userid by adding the following sql commands to your mysql batch file. show databases; drop database if exists userid; create database mydb; Replace userid with your userid so that you create a database with the same name as your userid. If your database does not already exist you will not see your database in the listing the first time you execute the batch file. The second time you execute the batch file you will see your database in the list of databases. 3. Creating and loading the database We will create a database and load it with data. a. Download the datafiles The data is contained in four tab- delimited text files that you can download from the course web site. The files contain data extracted from the Swiss-prot database. Note that some files contain a row number in the first column. This id can be used as a primary key, if you wish. The sequence.txt file is missing the first column of row numbers so you will have to figure out how to create a primary key for it. The filenames and the contents of each column are summarized below. sequence.txt (gene_name, gene_position, gene_length, amino_acid_sequence) species.txt( id, gene_name, species_name) accession.txt( id, accession_number, gene_name) feature.txt( id, gene_name, feature_type, start, stop, description) b. Add commands to your batch file to create and load a database of InnoDB tables. I recommend that you initially try to get just one table defined and loaded. Hint 1: The species table is fairly small and simple). Follow the SQL examples in Appendix B. for creating tables. Hint 2: Use the describe table SQL command, to make sure that you have correctly created your the tables. c. Load the data into your table. After you have successfully created a table, you can use the load data SQL command to load the data into the table. Again, simply add the command to the SQL batch file and execute it. d. Once you have created and loaded the species table, you can repeat the process for each of the subsequent tables. 4. Testing the database Take your database for a spin by construct SQL queries that answer the following questions: a. How many rows are in each of the four tables? b. How many distinct species are there? c. What is the average length of a gene (in terms of amino acids)? d. What is the average length of human genes (in your table)? Appendix A. MySQL cheat sheet Basic SQL database commands: SHOW DATABASES; USE database; SHOW TABLES; DESCRIBE table; Creating a database: CREATE DATABASE db_name; Creating a table: CREATE TABLE table_name (field1_name TYPE(SIZE), field2_name TYPE(SIZE ) type=tableType; For examples see appendix B. Load tab-delimited data into a table: LOAD DATA LOCAL INFILE "infile.txt" INTO TABLE table_name; (Use ‘\n’ to represent NULL in your input file) Reloading a new data set into existing table: SET AUTOCOMMIT=1; # used for quick recreation of table DELETE FROM tablename; LOAD DATA LOCAL INFILE "infile.txt" INTO TABLE table; Inserting one row at a time: INSERT INTO tablename VALUES (value, value, value, …); (Use NULL to represent NULL) Retrieving information (general): SELECT from_columns FROM table WHERE conditions; All values: SELECT * FROM table; Some values: SELECT * FROM tableName WHERE attribute = value; Multiple critera: SELECT * FROM tableName WHERE attribute = value AND attribute = value Updating just one record: UPDATE tablename SET value = newvalue WHERE attribute = attributename ; Selecting specific columns: SELECT column_name FROM table; # list what databases are available #select a default database # list the tables in the default database # describe the attributes in a table Retrieving unique output records: SELECT DISTINCT column_name FROM table; Sorting: SELECT col1, col2 FROM table ORDER BY col2; #ascending order SELECT col1, col2 FROM table ORDER BY col2 DESC; #descending Date calculations: SELECT CURRENT_DATE, (YEAR(CURRENT_DATE)-YEAR(date_col)) AS time_diff [FROM table]; MONTH(some_date) extracts the month value and DAYOFMONTH() extracts day. Pattern Matching: SELECT * FROM table WHERE rec LIKE "blah%"; (% is wildcard - arbitrary # of chars) Find 5-char values: SELECT * FROM table WHERE rec like "_____"; (_ is any single character) Extended Regular Expression Matching: SELECT * FROM table WHERE rec RLIKE "^b$"; (. for char, [...] for char class, * for 0 or more instances ^ for beginning, {n} for repeat n times, and $ for end) (RLIKE or REGEXP) To force case-sensitivity, use "REGEXP BINARY" Counting Rows: SELECT COUNT(*) FROM table; Grouping with Counting: SELECT owner, COUNT(*) FROM table GROUP BY owner; (GROUP BY groups together all records for each 'owner') Selecting from multiple tables: (Example) SELECT pet.name, comment FROM pet, event WHERE pet.name = event.name; (You can join a table to itself to compare by using 'AS') Currently selected database: mysql> SELECT DATABASE(); Maximum value: SELECT MAX(col_name) AS label FROM table; Auto-incrementing rows: CREATE TABLE table (number INT NOT NULL AUTO_INCREMENT, name CHAR(10) NOT NULL); INSERT INTO table (name) VALUES ("tom"),("dick"),("harry"); Adding a column to an already-created table: ALTER TABLE tbl ADD COLUMN [column_create syntax] AFTER col_name; Removing a column: ALTER TABLE tbl DROP COLUMN col; (Full ALTER TABLE syntax available at mysql.com.) Backing up a database with mysqldump: mysqldump --opt -u username -p database > database_backup.sql (Use 'mysqldump --opt --all-databases > all_backup.sql' to backup everything.) (More info at MySQL's docs .) Appendix B. Example SQL statements for creating tables in the videoStore database create table customer ( customer_id int NOT NULL AUTO_INCREMENT, customer_name varchar(64), customer_address varchar(64), customer_telephone char(12), customer_credit_card char(20), PRIMARY KEY (customer_id) ) type=InnoDB; create table theOrder ( order_id int NOT NULL AUTO_INCREMENT, customer_id int, order_date DATE, FOREIGN KEY (customer_id) REFERENCES customer(customer_id), PRIMARY KEY (order_id), INDEX(customer_id) ) type=InnoDB; create table lineItem ( order_id int NOT NULL, vid_id int NOT NULL, quantity int, FOREIGN KEY (order_id) REFERENCES theOrder(order_id), FOREIGN KEY (vid_id) REFERENCES video(vid_id), PRIMARY KEY (order_id,vid_id), INDEX (order_id), INDEX (vid_id) ) type=InnoDB;

Related docs
Distributed Biological Databases
Views: 9  |  Downloads: 1
PHP and MySQL
Views: 436  |  Downloads: 91
Mysql
Views: 923  |  Downloads: 106
MySQL Tutorial
Views: 111  |  Downloads: 32
Distributed Computing
Views: 26  |  Downloads: 3
Parallel and Distributed Databases
Views: 1  |  Downloads: 1
MySQL
Views: 3004  |  Downloads: 59
premium docs
Other docs by One Seven