Basic Information about Data Desk by danman21


                         Basic Information about Data Desk
                                     Fall 2006

DataDesk is a statistical package written by statistician Paul Velleman. He is well known
for his contributions to statistics, especially in the area of exploratory data analysis. By
buying ActivStats, you are also getting a student version of DataDesk, which contains all
the tools and techniques needed for an introductory course like Stat100.


Insert the ActivStats CD and follow the instructions to launch the program. Note that
you will need to enter a serial number (contained in your ActivStats packing materials)
the first time you run it. If you have enough space on your computer, copy the whole
CD over to a directory of your choice. You can run ActivStats from there and not have to
bother with inserting the CD etc.

To start DataDesk directly, go to the Tools menu and select Launch DataDesk. An annoying
feature is that DataDesk displays some pop-up advice windows when it first starts.
To close these, click in the top right-hand corner. If this does not work, you might need to
“unlock” the window first by right-clicking on it and selecting “unlock.”

It is an unadvertised feature of DataDesk that you can copy the DataDesk executable
file without having to install ActivStats. For a Windows system, or for a Mac with OS9,
copy Data Desk AS (which will be in the Courses folder) onto your hard drive (e.g., the
desktop). For a Mac with OS-X, copy Data Desk ASX from the Courses folder onto your
hard drive. You can then run DataDesk without a CD and without launching ActivStats
by simply double-clicking the DataDesk icon where you placed the file on your
computer. The drawback of this approach is that the help files won’t be available. To
fix this, copy the entire \Course folder onto your computer instead, and
then run DataDesk by navigating to the \Course folder and double-clicking the
DataDesk program.

Reading in data:

There are a few different ways to enter data into DataDesk.

  1. Read it in from a file that has already been saved in DataDesk format (extension is
     dsk). To do this, go to the File menu and select “Open DataFile.” A dialog window
     will pop up. Under “Files of Type,” specify datafile.

  2. Read it from a text file. This is the preferred method to read in data because it also
     provides the flexibility to read the data into other programs. It is generally a good
     idea to use the first row of the dataset to list out the variable names.

     To read a text file into DataDesk, follow the same steps as above, except specify Text
     as the file type. A dialog window will pop up showing what the first row of the
     dataset looks like and asking if you want to use these variable names. The default
     is for the columns to be separated by tabs. In the dialog box, tabs are indicated by
     a vertical line. If you have a dataset with other kinds of separators (e.g. commas
     or spaces), you can specify this by selecting “set Delimiters.” If you have a dataset
     without variable names in the first row, select Prompt for Variable Name and you
     can enter them one by one.

  3. Enter it directly into DataDesk. This method is not recommended for several reasons.
     First, it is complicated (see the Data Entry topic in DataDesk help). More importantly,
     it is error prone. A much simpler and safer way is to create your dataset
     outside of DataDesk (e.g. by using Excel or Access), save it as a text file, and follow
     method 2.
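
If you would rather build the text file with a short script than with Excel or Access, the
layout described in method 2 (variable names in the first row, columns separated by tabs)
can be sketched in Python as follows. This is only an illustration: the function name
write_tab_delimited, the variable names, and the data values are made up for the example
and do not come from DataDesk or ActivStats.

```python
import csv
import io


def write_tab_delimited(rows, variable_names, out):
    """Write rows as tab-delimited text, with the variable names in the first row."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(variable_names)
    for row in rows:
        # Every case must have one value per variable, or the columns will not line up.
        if len(row) != len(variable_names):
            raise ValueError(f"expected {len(variable_names)} values, got {len(row)}")
        writer.writerow(row)


# Hypothetical example data (not from the course materials).
variable_names = ["name", "height_cm", "weight_kg"]
rows = [
    ["Ann", 162, 55],
    ["Bob", 180, 82],
]

# Write to an in-memory buffer here; to produce a real file, pass a file object
# opened with open("mydata.txt", "w", newline="") instead.
buffer = io.StringIO()
write_tab_delimited(rows, variable_names, buffer)
text = buffer.getvalue()
print(text)
```

A file written this way uses the default tab separator, so it should not require changing
the delimiter setting when it is opened as a Text file.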

DataDesk manual:

To access the DataDesk manual from your ActivStats CD, start up ActivStats, and then go
to the first page of the last section (i.e., the Reference section). An icon on that page opens
up the DataDesk documentation.

I’ve placed a more abridged summary of DataDesk commands (“borrowed” from Paul
Velleman’s web site) on the course web page:

This document is a good resource for quick descriptions of DataDesk commands.

Final comments:

The specific commands you need to analyze and manipulate data will be explained in
the homework problem sets. You are encouraged, however, to explore techniques even
just by trial and error to familiarize yourself with data analysis beyond what is taught in

(This document is largely based on one written by David Harrington.)

