					     STATA Tutorial


         Elena Capatina
  elena.statahelp@gmail.com
Office hours: Mondays 10am-12,
             SS5017
          Stata 10: How to get it?
• Link from Blackboard course website.
• Buy from Stata website, pick up at Robarts:
http://www.stata.com/order/new/edu/gradplans/cgpcampus-order.html
                             or
http://www.utoronto.ca/ic/software/detail/stata.html
              STATA windows
•   The command window
•   The viewer/results window
•   The review of commands window
•   The variable window
         Working with STATA
1. From the command window
2. Using a “do” file

• Note: STATA is case-sensitive. Command names are
  lower case, and variable names must be typed exactly
  as they appear in the dataset
                The “do” file
• A text file that can be edited using any text
  editor (the STATA do-file editor, Notepad,
  Word, etc.), but you need to save it as plain text named
  “filename.do” for STATA to read it
• From the STATA do-file editor, click “do” for
  STATA to execute all commands
• Can highlight and click “do” to execute only
  the highlighted command lines
       Data editor/data browser
• Shows you your data
• Check this frequently, especially after
  commands you are unsure about
           Type of commands
1. Administrative commands that tell STATA
  where to save results, how to manage
  computer memory, and so forth
2. Commands that tell STATA to read and
  manage datasets
3. Commands that tell STATA to modify existing
  variables or to create new variables
4. Commands that tell STATA to carry out the
  statistical analysis
           Example: “stata1.do”
clear
log using "C:\Users\Elena\Documents\Various\STATA-TA\stata1.log", replace
use "C:\Users\Elena\Documents\Various\STATA-TA\caschool.dta"
describe
generate income = avginc*1000
summarize income
log close
exit
      The “log using” command
• The log file is an “output file”
• Creates and saves a log with all the actions
  performed by STATA and all the results
• How to view it later?
  – In Stata, go to “File”, then “log”, “view”, and
    search for your filename, keeping in mind it has
    extension “.log”
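A minimal sketch of opening, closing, and viewing a log from a do-file. The filename example.log is hypothetical, and the file is written to the current working folder.

```stata
* Close any log left open by an earlier run (no error if none is open)
capture log close

* Hypothetical filename; "replace" overwrites an existing log of that name
log using "example.log", replace text

display "this line is recorded in the log"

log close

* Print the saved log in the results window
type "example.log"
```

The File > Log > View menu route described above opens the same file.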
            Loading your data
• If your data is in STATA format, i.e.
  “filename.dta”, then enter:
            use "filename.dta"
• If your data is a comma delimited file:
           insheet using "filename.txt"
• For other formats, can use “StatTransfer” to
  convert to STATA format
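A small sketch of reading a comma delimited file. To keep it runnable without any external data, it first writes a toy file with outsheet and then reads it back with insheet. The filename toy_example.txt and the values are hypothetical.

```stata
* Hypothetical toy data, typed in directly
clear
input income teachers
10000 5
20000 12
end

* Write a comma delimited file, then read it back
outsheet using "toy_example.txt", comma replace
clear
insheet using "toy_example.txt", comma

describe
```

The first row of the file holds the variable names, which insheet picks up automatically here.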
             Useful Commands:
• “describe”:
  – STATA will list all the variables, their labels, types,
    and tell you the # of observations
• Two types of variables:
    1. Numerical
    2. String (usually appear in red in the data browser)
            More commands:
• “generate” or “gen”
  – Creates a new variable

  – e.g. generate income = avginc*1000
  – e.g. generate log_inc = log(income)
  – e.g. gen inc_sq = (income)^2
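The generate commands above can be tried on a tiny made-up dataset, so the sketch runs without caschool.dta. The three avginc values are invented for illustration, and avginc is assumed to be measured in thousands of dollars, as in the example do-file.

```stata
* Hypothetical toy data: avginc in thousands of dollars
clear
input avginc
10
20.5
40
end

* Income in dollars
generate income = avginc*1000
* Natural logarithm of income
generate log_inc = log(income)
* Squared income ("gen" is an abbreviation of "generate")
gen inc_sq = (income)^2

list
```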
             More commands:
• “summarize”
  – tells STATA to compute summary statistics (mean,
    standard deviations, and so forth) for all variables
  – Useful to identify outliers and get an idea of your
    data

  – e.g. summarize
  – e.g. summ income inc_sq
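A short sketch of summarize on made-up data. It also shows two things not covered above: the detail option, and the stored results r(mean) and r(N) that summarize leaves behind. The data values are hypothetical.

```stata
* Hypothetical toy data
clear
input avginc
10
20
30
end
generate income = avginc*1000

summarize income

* summarize stores its results in r(); show the mean and the count
display r(mean)
display r(N)

* The detail option adds percentiles, skewness, and kurtosis
summarize income, detail
```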
             Ending the do file
• log close closes the file stata1.log that
  contains the output.
• The command exit tells STATA that the
  program has ended.
                       Example: stata2.do
#delimit ;
* Administrative Commands;
set more off;
clear;
log using "C:\Users\Elena\Documents\Various\STATA-TA\stata1.log", replace;
* Read in the Dataset;
use "C:\Users\Elena\Documents\Various\STATA-TA\caschool.dta";
describe;
* Transform data and Create New Variables;
**** Construct average district income in $'s;
generate income = avginc*1000;
* Carry Out Statistical Analysis;
***** Summary Statistics for Income;
summarize
income;
* End the Program ;
log close;
exit;
      Comments in your do file:
• Asterisk:
• STATA ignores a line that begins with *
  (it does not execute it)
• These lines can be used to describe what the
  commands are doing or to write other
  comments.
  – e.g. * Administrative Commands
            Useful commands
• #delimit ;
  – Tells STATA that each STATA command ends with a
    semicolon.
  – Useful for long commands, which can then run over
    several lines
  – Do not forget the “;” at the end of every command,
    including the comment lines that start with *.
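A minimal sketch of a command split over several lines under #delimit ;. The data are hypothetical. The line #delimit cr, which is not used elsewhere in this tutorial, switches back to the default of one command per line.

```stata
* Hypothetical toy data (entered before switching the delimiter)
clear
input income teachers
10000 5
20000 12
30000 8
end

#delimit ;
* A comment line also needs a terminating semicolon;
summarize income
          teachers
    if teachers <= 10;
#delimit cr

* Back to the default: each line is one command
describe
```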
            Useful Commands
• set more off
  – Lets STATA run all commands without pausing. Otherwise,
    when the output fills the results window, STATA
    displays --more-- at the bottom and waits for a key
    press before continuing
           Increasing memory
• If you have a large dataset, increase the memory
  available to STATA, for example:
  – “set memory 600m” (or a smaller or larger
    number depending on the size of your data)
  – Needed in Stata 10; Stata 12 and later manage
    memory automatically
        Example of my typical admin
                commands
•   clear
•   #delimit ;
•   set more off;
•   set mem 1200000;
•   cap log close; (this closes any open log file)
•   cd "C:\NLSY_1\mainfiles\";
•   log using "Logs\Calibration_statistics.log",
    replace;
            The “cd” command
• Tells STATA the path to the folder on your
  computer where you store your data and
  other files. After this command, Stata looks for files in this
  folder and saves output files there whenever a filename is
  given without a full path
             Other commands
• tabulate
  – e.g. tabulate county
  – Shows the frequency and percent of each value of
    “county” in the dataset
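A tiny sketch of tabulate on a made-up string variable. The county names are only placeholders for illustration.

```stata
* Hypothetical toy data: one string variable
clear
input str10 county
"Nevada"
"Alameda"
"Nevada"
end

* One row per distinct value, with frequency, percent, and cumulative percent
tabulate county
```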
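           The “if” qualifier
• e.g. generate teachers_new = teachers if teachers<=10
  replace teachers_new=0 if teachers>10
  – “if” is not a command on its own; it is added to a
    command to restrict it to the observations that
    satisfy the condition

• e.g. summarize teachers if county=="Nevada"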
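One caveat when combining “if” with > is that STATA treats a missing numeric value (.) as larger than any number, so teachers>10 is also true when teachers is missing. A minimal sketch on hypothetical data, adding a check with the missing() function:

```stata
* Hypothetical toy data; the last observation is missing
clear
input teachers
4
10
25
.
end

generate teachers_new = teachers if teachers <= 10

* Exclude missing values explicitly, otherwise they would be set to 0
replace teachers_new = 0 if teachers > 10 & !missing(teachers)

list
```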
                   Operators
•   < less than
•   > greater than
•   <= less than or equal to
•   >= greater than or equal to
•   == equal to
•   ~= not equal to
              Sorting the data
• sort
  – e.g. sort income
  – e.g. sort county income
          The “by” command
• e.g. by county, sort: summarize income
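A small sketch of the “by” prefix on made-up data. It also shows bysort, a shorthand for the same thing. The county names and income values are hypothetical.

```stata
* Hypothetical toy data
clear
input str10 county income
"Nevada" 10000
"Alameda" 30000
"Nevada" 20000
"Alameda" 50000
end

* One summary table per county; the sort option sorts by county first
by county, sort: summarize income

* Equivalent shorthand
bysort county: summarize income
```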
 Deleting variables and observations
• drop
  – e.g. drop avginc
    - this drops the variable avginc

  - e.g. drop if teachers<=5
     - this deletes only the observations for which teachers is
       less than or equal to 5.
 Deleting variables and observations
• keep
  – e.g. keep if teachers>=7
    - this keeps only the observations for which teachers is
      at least 7 and deletes the rest
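The drop and keep examples can be sketched on a hypothetical four-row dataset, using count to check how many observations remain:

```stata
* Hypothetical toy data
clear
input teachers avginc
3 10
5 12
7 15
9 20
end

count

* Remove a variable
drop avginc

* Remove observations with 5 or fewer teachers
drop if teachers <= 5

* Keep only observations with at least 7 teachers (no further change here)
keep if teachers >= 7

count
```

Neither command can be undone, so it is worth checking the data browser afterwards.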
         Statistical relationships
1. Correlations:
• correlate
  – e.g. correlate income teachers
  – e.g. correlate income teachers computer

2. Regressions:
• reg
  –   e.g. reg income teachers
  –   e.g. reg income teachers computer
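A minimal sketch of correlate and reg on invented data. The numbers are hypothetical and carry no meaning. After reg, the estimated coefficients can be read back with _b[].

```stata
* Hypothetical toy data
clear
input income teachers
10000 2
20000 4
30000 6
45000 8
end

correlate income teachers

* Regress income on teachers (a constant is included by default)
reg income teachers

* Estimated slope and intercept
display _b[teachers]
display _b[_cons]
```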
                    Graphs
• Scatter Plots
  – e.g. twoway (scatter income computer)
              Saving your data
• Saving in Stata format:
  – e.g. save "file name.dta"
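A short sketch of saving and reloading a dataset. The filename toy_example.dta is hypothetical, and the file is written to the current working folder. The replace option lets the do-file be rerun without stopping on a "file already exists" error.

```stata
* Hypothetical toy data
clear
input income
10000
20000
end

* Save in Stata format, overwriting any earlier copy
save "toy_example.dta", replace

* Reload it to confirm the round trip
clear
use "toy_example.dta"
describe
```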
• You can export your data in another format
  from “File”, then “Export”, then choose the
  type of file you want.
         GETTING MORE INFORMATION
                ABOUT STATA
The Help menu in STATA
• STATA has detailed help files available for all
  STATA commands.
• STATA commands are described in detail in the
  STATA User’s Guide and Reference Manual.
• www.stata.com.
• Finally, you can find several good STATA tutorials
  on the Web. An easy way to find a list is to do a
  Google search for Stata tutorial.
• (This tutorial was prepared using information from “STATA Tutorial to accompany Stock/Watson
  Introduction to Econometrics”, Pearson 2003.)

				