An Introduction to SQL

Document Sample
An Introduction to SQL Powered By Docstoc
					An Introduction to SQL
             Kirk Anne
    Computing & Information Technology
             SUNY Geneseo
           kma@geneseo.edu
What is a database?
           Parts of a database
• Attributes (fields)
   – An attribute or field is a component of a record that
     describes something about an item.
• Records
   – A record is the representation of an individual item.
• Table
   – A collection of records
• Database
   – A collection of tables and rules for accessing the
     tables
 What is a relational database?
• Originally developed by E.F. Codd in 1970
• Organizes data into tables where each
  item is a row and the attributes of the item
  are in columns.
• Different from “flat file” databases because
  you can define “relationships” between
  items in different tables.
             Parts of a database
                     Record

Tables


         Attribute/Field




                              • Records become “rows”
                              • Attributes/fields become “columns”
                              • Rules determine the relationship
                                between the tables and tie the data
                                together to form a database
      I need a new database!
• Many people ask for “new databases”
  when in fact they only need a new table
  within an existing database.
• The data within the tables should be all
  related somehow.
  – By owner
  – By project
             Creating a database
• What information are we trying to store?
• How do we describe the information?
• Phone Book/Contact entries
   –   Name
   –   Address
   –   Company
   –   Phone Number
   –   URL/Web Page
   –   Age
   –   Height (in meters)
   –   Birthday
   –   When we added the entry
                  Data Types
• Binary
  – Database specific binary objects
  – Pictures, digital signatures, etc.
• Boolean
  – True/False values
• Character
  – Fixed width or variable size
• Numeric
  – Integer, Real (floating decimal point), Money
• Temporal
  – Time, Date, Timestamp
Phone Book/Contact Record
Name                      Character
Address                   Character
Company                   Character
Phone Number              Character
URL/Web Page              Character
Age                       Integer
Height                    Real (float)
Birthday                  Date
When we added the entry   Timestamp
          “Normal Forms”

 Summarized from Barry Wise’s
article on Database Normalization
http://www.phpbuilder.com/columns/barry20000731.php3?page=1
 What are the “normal forms”?
• E. F. Codd in 1972 wrote a paper on
  “Further Normalization of the Data Base
  Relational Model”
• Normal forms reduce the amount of
  redundancy and inconsistent dependency
  within databases.
• Codd proposed three normal forms and
  through the years two more have been
  added.
                  The Zero Form
• No rules have                            Contacts
  been applied     Name    Company   Address   Phone1   Phone2   Phone3   ZipCode

• Where most
                   Joe     ABC       123       5532     2234     3211     12345
  people start
                   Jane    XYZ       456       3421                       14454
  (and stop)
                   Chris   PDQ       789       2341     6655              14423
• No room for
  growth
• Usually
  wastes space
                   First Normal Form
• Eliminate
  repeating columns
                                               Contacts
  in each table          Id        Name      Company Address     Phone     ZipCode

• Create a separate      1         Joe       ABC       123       5532      12345
  table for each set     1         Joe       ABC       123       2234      12345
  of related data        1         Joe       ABC       123       3211      12345
• Identify each set of   2         Jane      XYZ       456       3421      14454
  related data with a    3         Chris     PDQ       789       2341      14423
  primary key            3         Chris     PDQ       789       6655      14423

                         Benefits: Now we can have infinite phone numbers or
                         company addresses for each contact.

                         Drawback: Now we have to type in everything over and over
                         again. This leads to inconsistency, redundancy and wasting
                         space. Thus, the second normal form…
         Second Normal Form
                                            People
• Create separate         Id   Name    Company   Address      Zip
  tables for sets of      1    Joe     ABC       123          12345

  values that apply to    2    Jane    XYZ       456          14454
                          3    Chris   PDQ       789          14423
  multiple records
• Relate these tables                  PhoneNumbers

  with a “foreign key”.    PhoneID      Id             Phone
                           1            1              5532
                           2            1              2234
                           3            1              3211
                           4            2              3421
                           5            3              2341
                           6            3              6655
                Third Normal Form
• Eliminate fields that                       People
  do not depend on the       Id             Name          AddressID
                             1              Joe           1
  primary key.
                             2              Jane          2
                             3              Chris         3
          PhoneNumbers
PhoneID    Id        Phone
                                             Address
1          1         5532
                             AddressID     Company Address     Zip
2          1         2234
                             1             ABC      123        12345
3          1         3211
                             2             XYZ      456        14454
4          2         3421
                             3             PDQ      789        14423
5          3         2341
6          3         6655        Is this enough? Codd thought so…
                                 What about “many to many”?
      Kinds of Relationships
• “One to One”
  – One row of a table matches exactly to another
    • One person, one id number, one address
• “One to Many”
  – One row of a table matches many of another
    • One person, many phone numbers
• “Many to Many”
  – One row may match many of another or many
    rows match one row of another
               Fourth Normal Form
• In a “many to many” relationship,
                                                   PhoneNumbers
  independent entities cannot be                   PhoneID        Phone

  stored in the same table.                        1              5532
                                                   2              2234
                                                   3              3211
                       People
                                                   4              3421
          Id     Name      AddressID
                                                   5              2341
          1      Joe       1                       6              6655
          2      Jane      2
          3      Chris     3                       PhoneRelations
                                               PhoneRelID    Id      PhoneID

                   Address                     1             1       1

     AddressID   Company Address       Zip     2             1       2

     1           ABC        123        12345   3             1       3

                                               4             2       4
     2           XYZ        456        14454
                                               5             3       5
     3           PDQ        789        14423
                                               6             3       6
           Fifth Normal Form
• The “very esoteric” one that is probably not
  required to get the most out of your database.
• “The original table must be reconstructed from
  the tables into which it has been broken down.”
• The rule ensures that you have not created any
  extraneous columns and all the tables are only
  as large as they need to be.
             Why normalize?
•   Increases the integrity of the data
•   Reduces redundancy
•   Improves efficiency
•   Although normalization can be hard, it is
    worth it in the long run.
 What do I need to remember?
• Keep normalization in mind.
• Don‟t replicate data in a table.
• If you break the rules, know why you are
  breaking the rules and do it for a good
  reason.
All you need to know about
SQL in 30 minutes (or less)
        Basic SQL Commands
•   Creating tables with CREATE
•   Adding data with INSERT
•   Viewing data with SELECT
•   Removing data with DELETE
•   Modifying data with UPDATE
•   Destroying tables with DROP
  Creating tables with CREATE
• Generic form

  CREATE TABLE tablename (
     column_name data_type attributes…,
     column_name data_type attributes…,
     …
   )


• Table and column names can‟t have spaces or
  be “reserved words” like TABLE, CREATE, etc.
Phone Book/Contact Record
Name                      Character
Address                   Character
Company                   Character
Phone Number              Character
URL/Web Page              Character
Age                       Integer
Height                    Real (float)
Birthday                  Date
When we added the entry   Timestamp
  Phone Book/Contact Table
CREATE TABLE contacts (
   Name                            VARCHAR(40),
   Address                         VARCHAR(60),
   Company                         VARCHAR(60),
   Phone                           VARCHAR(11),
   URL                             VARCHAR(80),
   Age                             INT,
   Height                          FLOAT,
   Birthday                        DATE,
   WhenEntered                     TIMESTAMP
);
                Plan your tables very carefully!
            Once created, they are difficult to change!
  Phone Book/Contact Table
CREATE TABLE contacts (
   ContactID                               INT PRIMARY KEY,
   Name                                    VARCHAR(40),
   Address                                 VARCHAR(60),
   Company                                 VARCHAR(60),
   Phone                                   VARCHAR(11),
   URL                                     VARCHAR(80),
   Age                                     INT,
   Height                                  FLOAT,
   Birthday                                DATE,
   WhenEntered                             TIMESTAMP
);
         If you are going to use the relational nature of a database,
        don’t forget you need to have a unique way to access records!
          There is a way to make the key automatically increment,
              so you don’t have to worry about which one is next.
                 Data Types
• Binary
  – Database specific binary objects (BLOB)
• Boolean
  – True/False values (BOOLEAN)
• Character
  – Fixed width (CHAR) or variable size (VARCHAR)
• Numeric
  – Integer (INT), Real (FLOAT), Money (MONEY)
• Temporal
  – Time (TIME), Date (DATE), Timestamp (TIMESTAMP)
    Adding data with INSERT
• Generic Form

  INSERT INTO tablename (column_name,…)
    VALUES (value,…)
Inserting a record into „contacts‟
INSERT INTO contacts
  (contactid,name,address,company,phone,
  url,age,height,birthday,whenentered)
VALUES
  (1,„Joe‟,‟123 Any St.‟,‟ABC‟,
  ‟800-555-1212‟,„http://abc.com‟,30,1.9,
  ‟6/14/1972‟,
  now());
     Inserting a partial record
INSERT INTO contacts
  (contactid,name,phone)
  VALUES (2,‟Jane‟,‟212-555-1212‟);
     Automatic key generation
• CREATE SEQUENCE contactidseq;
• Change the ContactID line in the
  CREATE TABLE to:
    ContactID INT DEFAULT nextval(„contactidseq‟) PRIMARY KEY

• Or when inserting into a table
    INSERT contacts (contactid,name,phone)
    VALUES (nextval(„contactidseq‟),‟Jack‟,
              „716-555-1212‟);
    Viewing data with SELECT
• Generic Form
  SELECT column,… FROM table,…
      WHERE condition
      GROUP BY group_by_expression
      HAVING condition
      ORDER BY order_expression
• The most used command
• Probably the most complicated also
• If used improperly, can cause very long waits
  because complex computations
      A few simple SELECTs
• SELECT * FROM contacts;
  – Display all records in the „contacts‟ table
• SELECT contactid,name FROM contacts;
  – Display only the record number and names
• SELECT DISTINCT url FROM contacts;
  – Display only one entry for every value of URL.
 Refining selections with WHERE
• The WHERE “subclause” allows you to
  select records based on a condition.
• SELECT * FROM contacts
     WHERE age<10;
  – Display records from contacts where age<10
• SELECT * FROM contacts
    WHERE age BETWEEN 18 AND 35;
  – Display records where age is 18-35
        Additional selections
• The “LIKE” condition
  – Allows you to look at strings that are alike
• SELECT * FROM contacts
    WHERE name LIKE „J%‟;
  – Display records where the name starts with „J‟
• SELECT * FROM contacts
    WHERE url LIKE „%.com‟;
  – Display records where url ends in “.com”
 Removing data with DELETE
• Generic Form

DELETE FROM table WHERE condition;

DELETE FROM contacts WHERE age<13;
  Modifying data with UPDATE
• Generic Form

UPDATE table SET column=expression
 WHERE condition;

UPDATE contacts SET company=„AOL‟
 WHERE company=„Time Warner‟;
 Destroying tables with DROP
• Generic Form

DROP TABLE tablename;

DROP TABLE contacts;
More about SELECT
  “Normal Forms” and SELECT
• Good database design using the normal
  forms requires data to be separated into
  different tables
• SELECT allows us to join the data back
  together
• We can use “views” to create virtual tables
              The Normal Forms
• First Form
   – Eliminate replicated data in tables
   – Create separate tables for each set of related data
   – Identify each set of related data with a primary key
• Second Form
   – Create separate tables for sets of values that apply to multiple
     records
   – Relate the tables with a foreign key
• Third Form
   – Eliminate fields that do not depend on the primary key
• Fourth Form
   – In many-to-many relationships, independent entities cannot be
     stored in the same table
            Joining together tables
•   SELECT name,phone,zip FROM
    people, phonenumbers, address                       People
    WHERE                                Id           Name          AddressID
    people.addressid=address.addressid
                                         1            Joe           1
    AND people.id=phonenumbers.id;
                                         2            Jane          2
                                         3            Chris         3
          PhoneNumbers
PhoneID      Id            Phone
                                                       Address
1            1             5532
                                         AddressID   Company Address     Zip
2            1             2234
                                         1           ABC      123        12345
3            1             3211
                                         2           XYZ      456        14454
4            2             3421
                                         3           PDQ      789        14423
5            3             2341
6            3             6655
       Different types of JOINs
• “Inner Join”
   – Unmatched rows in either table aren‟t printed
• “Left Outer Join”
   – All records from the “left” side are printed
• “Right Outer Join”
   – All records from the “right” side are printed
• “Full Outer Join”
   – All records are printed
• Multiple Table Join
   – Join records from multiple tables
 General form of SELECT/JOIN
SELECT columns,…
 FROM left_table
 join_type JOIN right_table ON condition;

SELECT name,phone FROM people
 JOIN phonenumbers ON
    people.id=phonenumbers.id;
          Other versions
SELECT name,phone FROM people
 LEFT JOIN phonenumbers ON
    people.id=phonenumbers.id;
SELECT name,phone FROM people
 RIGHT JOIN phonenumbers ON
    people.id=phonenumbers.id;
SELECT name,phone FROM people
 FULL JOIN phonenumbers ON
    people.id=phonenumbers.id;
           “Theta style” vs. ANSI
• Theta Style (used in most SQL books)
SELECT name,phone,zip FROM people, phonenumbers, address
  WHERE people.addressid=address.addressid AND
              people.id=phonenumbers.id;


• ANSI Style uses JOIN
SELECT name,phone,zip FROM people
  JOIN phonenumbers ON people.id=phonenumbers.id
  JOIN address ON people.addressid=address.addressid;
    Other SELECT examples
• SELECT * FROM contacts
    WHERE name is NULL;
• SELECT * FROM contacts
    WHERE zip IN („14454‟,‟12345‟);
• SELECT * FROM contacts
    WHERE zip IN (
        SELECT zip FROM address
             WHERE state=„NY‟
        );
       GROUP BY/HAVING
• The “GROUP BY” clause allows you to
  group results together with “aggregate
  functions”
  – AVG(), COUNT(), MAX(), MIN(), SUM()
  – COUNT DISTINCT
• HAVING allows you to search the GROUP
  BY results
     GROUP BY Examples
SELECT company,count(company)
 FROM contacts
 GROUP BY company;

SELECT company,count(company)
 FROM contacts
 GROUP BY company
 HAVING count(company) > 5;
             ORDER BY
• The “ORDER BY” clause allows you to
  sort the results returned by SELECT.

SELECT * FROM contacts
 ORDER BY company;

SELECT * FROM contacts
 ORDER BY company, name;
                Views
• You can use “CREATE VIEW” to create a
  virtual table from a SELECT statement.

CREATE VIEW contactview AS
 (SELECT name,phone,zip FROM
    people,phonenumbers,address
 WHERE people.id=phonenumbers.id AND
    people.addressid=address.addressid);
Now… Let‟s do it on the web

          PHP style
     Basic PHP/SQL interaction
1.   Open a connection to the database
2.   If ok, generate SQL command
3.   “Execute” SQL command
4.   Handle responses from the server
5.   If not done, go back to step 2
6.   If done, close connection to database
           Creating a table with PHP
<?php
// create a connection with the database
$connection=pg_Connect("host=oracle.geneseo.edu port=5432
        dbname=mgmt357 user=mgmt357 password=ecommerce");
// if there is no connection, generate an error and get out
if (!$connection) {
        print pg_ErrorMessage(); exit(-1);
};
// Create the table with the pg_exec command
$result=pg_exec($connection,"create table customer (
                                           id int8,
                                           name varchar(50),
                                           address varchar(50),
                                           email varchar(64)
                              )");
if ($result) {
        print "The customer table has been created.";
} else {
        print pg_ErrorMessage();
}
pg_close($connection);
?>
                 Entering data with PHP
<?php                                                          pg_close($connection);
                                                         } else {
if ($submit) {                                                 // display form
// create a connection with the database                 ?>
$connection=pg_Connect("host=oracle.geneseo.edu
      port=5432 dbname=mgmt357 user=mgmt357
                            password=ecommerce");        <form method="post"
                                                              action="<?php echo $PHP_SELF?>">
if (!$connection) {                                      Name: <input type="Text" name="name”>
       print pg_ErrorMessage();                          Address: <input type="Text" name="address”>
       exit(-1);                                         Email:<input type="Text" name="email”>
};                                                       <input type="Submit" name="submit" value="Enter
                                                              information”>
$id=time(); // set a unique id number to each record     </form>

$result=pg_exec($connection, "insert into customer
     VALUES ($id,'$name','$address','$email')");         <?
                                                         }
if ($result) {                                           ?>
      print "The customer data has been inserted.<p>";
} else {
      print pg_ErrorMessage();
}
                  Updating data with PHP
<?php                                                              pg_close($connection);
                                                            } else {
if ($submit) {                                                     // display form
// create a connection with the database                    ?>
$connection=pg_Connect("host=oracle.geneseo.edu port=5432
       dbname=mgmt357 user=mgmt357
                                  password=ecommerce");     <form method="post"
                                                                   action="<?php echo $PHP_SELF?>">
if (!$connection) {                                         Id: <input type=“Text” name=“id”>
       print pg_ErrorMessage();                             Name: <input type="Text" name="name”>
       exit(-1);                                            Address: <input type="Text" name="address”>
};                                                          Email:<input type="Text" name="email”>
                                                            <input type="Submit" name="submit" value="Enter information”>
$result=pg_exec($connection, "update customer               </form>
               set name='$name',
               set address='$address',                      <?
               set email='$email' where id='$id'");
                                                            }
                                                            ?>
if ($result) {
       print "The customer data has been inserted.<p>";
} else {
       print pg_ErrorMessage();
}
             Displaying data with PHP
$result=pg_exec($connection,"select * from customer");

if ($result) {
      // get the number of rows and store it in $r
      $r = pg_numrows($result);
      print "The customer table has $r row(s) of data.<p>";
      // set the row counter to the beginning 0
      $row=0;
      print "<table border=1>\n";
      // while there are records to deal with...
      while ($data=pg_fetch_object($result,$row)) {
                print "<tr>";
                print "<td>$data->name</td>";
                print "<td>$data->address</td>";
                print "<td>$data->email</td>";
                print "</tr>\n";
                $row++;                                       // increment the counter
      };
      print "</table>\n";
} else {
      print pg_ErrorMessage();
}
    Updating data with PHP part 2
<?php                                                              pg_close($connection);
                                                            } else {
if ($submit) {                                                     // display form
// create a connection with the database                    ?>
$connection=pg_Connect("host=oracle.geneseo.edu port=5432   <form method="post" action="<?php echo $PHP_SELF?>">
       dbname=mgmt357 user=mgmt357                          <?php
                                  password=ecommerce");     $connection=pg_Connect("host=oracle.geneseo.edu port=5432
                                                                   dbname=mgmt357 user=mgmt357 password=ecommerce");
if (!$connection) {                                         $result=pg_exec($connection,"select id from customer limit 30");
       print pg_ErrorMessage();                             $row=0;
       exit(-1);                                            print "<select name=id>\n";
};                                                          while ($data=pg_fetch_object($result,$row)) {
                                                                   print "<option>$data->id\n";
$result=pg_exec($connection, "update customer                      $row++;
               set name='$name',                            };
               set address='$address',                      print "</select>\n";
               set email='$email' where id='$id'");         ?>
                                                            Name: <input type="Text" name="name”>
if ($result) {                                              Address: <input type="Text" name="address”>
       print "The customer data has been inserted.<p>";     Email:<input type="Text" name="email”>
} else {                                                    <input type="Submit" name="submit" value="Enter information”>
       print pg_ErrorMessage();                             </form>
}                                                           <?
                                                            }
                                                            ?>
    Getting even easier… PEAR
• Abstracting the database interface
<?php
include('dbinfo.php');
require_once( 'DB.php' );
$db = DB::connect( "mysql://$user:$pass@$host/$dbname" );
// no need to select DB
$sql = 'SELECT * FROM demo';
$demoResult = $db->query($sql);
while ($demoRow = $demoResult->fetchRow()) {
    echo $demoRow[2] . '<br>';
}
$db->disconnect();
?>

http://www.phpbuilder.com/columns/allan20010115.php3?page=1
                  Resources
• PHP Sites
  –   http://www.php.net
  –   http://www.phpbuilder.com
  –   http://www.devshed.com
  –   http://www.geneseo.edu/~kma/PHP_Intro
• Books
  – SQL in a Nutshell, Kevin Kline, O‟Reilly
  – PostgreSQL: Introduction and Concepts, Bruce
    Momjian, Addision Wesley
  – Introduction to Database Systems, C.J. Date