Stata help by fanzhongqing

									Stata help                                                                  May. 12
1      BASICS
    1.1     HELP
    1.2     SHORT CUTS
    1.3     OPTIONS
       1.3.1    Memory, Max variables and max matrix size
    1.4     SAVE COMMANDS
    1.5     SAVE OUTPUT
    1.6     NOTATION
       1.6.1    Variable names
    1.7     COMMAND SYNTAX
       1.7.1    By
       1.7.2    Weights
       1.7.3    If exp
       1.7.4    Ranges
    1.8     PREFIX COMMANDS
    1.9     ESTIMATION COMMANDS
    1.10    POSTESTIMATION COMMANDS
2      FUNCTIONS
    2.1     MATHEMATICAL FUNCTIONS
    2.2     STATISTICAL FUNCTIONS
       2.2.1    Examples
    2.3     LOGICAL
3      DATA HANDLING
    3.1     IMPORT DATA
    3.2     USE AND SAVE
    3.3     DESCRIBE, LABELS
    3.4     FORMATS
    3.5     RECODING
    3.6     GENERATE, REPLACE
    3.7     EXTENDED GENERATE
       3.7.1    Functions
    3.8     DROP, KEEP
    3.9     MISSING
    3.10    SORT
    3.11    STRING COMMANDS
    3.12    ACCESSING RESULTS FROM COMMANDS
       3.12.1   System variables
       3.12.2   Saved results
       3.12.3   Accessing results from commands, save as macros
4      UNI- AND BIVARIATE
    4.1     LIST
    4.2     TABULATE
       4.2.1    One-way tables
       4.2.2    Two-way tables
       4.2.3    Three-way tables
    4.3     TABLE OF SUMMARY STATISTICS
    4.4     MEANS AND CONFIDENCE INTERVALS
    4.5     SUMMARIZE
    4.6     T-TEST
       4.6.1    Test of equal variance (standard deviation)
       4.6.2    One-way ANOVA
    4.7     NON-PARAMETRIC ANALYSIS

-508                                                                        5/18/2012             8:02 PM                                                                         H.S. 
    4.8     PROPORTIONS
5      GRAPHICS
    5.1     PLOT TYPES
    5.2     GRAPH TWOWAY
       5.2.1    Twoway syntax
       5.2.2    Twoway plot types
       5.2.3    Twoway fitlines
    5.3     GRAPH BAR, HBAR AND DOT
       5.3.1    Syntax
       5.3.2    Options
    5.4     GRAPH BOX, HBOX
    5.5     GRAPH PIE
       5.5.1    Options
    5.6     GRAPH MATRIX
    5.7     OTHER GRAPHS
    5.8     TITLES
       5.8.1    Title options
    5.9     LEGEND
    5.10    AXIS SCALE, LABEL, TICKS AND GRID
       5.10.1   Axis title
       5.10.2   Axis scale
       5.10.3   Axis labels and ticks
    5.11    TEXT
    5.12    MARKERS AND MARKER LABELS
       5.12.1   Markers
       5.12.2   Marker labels
    5.13    LINES
       5.13.1   Connecting points
       5.13.2   Line options
    5.14    TEXT BOX OPTIONS
    5.15    OTHER OPTIONS
       5.15.1   Colors
       5.15.2   Positions
    5.16    OVER()
    5.17    BY()
    5.18    SCHEMES
    5.19    COMBINING GRAPHS
6      REGRESSION COMMANDS
    6.1     REGRESSION MODELS
       6.1.1    Linear regression with simple error structure
       6.1.2    GLM
       6.1.3    Conditional logistic
       6.1.4    Multiple outcome
       6.1.5    Linear regression with complex error structure
       6.1.6    Survival models
    6.2     ORTHOGONAL VARIABLES
    6.3     TEST AFTER REGRESSION COMMANDS
       6.3.1    Wald test
       6.3.2    Likelihood ratio test
    6.4     CATALOGING ESTIMATION RESULTS
    6.5     COV, CORR, AIC, BIC AND SAMPLE
    6.6     PREDICTION
7      LINEAR REGRESSION
       7.1.1    Test of assumptions
       7.1.2    Test of influence
       7.1.3    Test of multicollinearity
8      LOGISTIC REGRESSION
    8.1     SYNTAX
    8.2     CATEGORICAL COVARIATES
    8.3     RESIDUALS, GOODNESS-OF-FIT
    8.4     DIAGNOSTIC PLOTS
9      ST SURVIVAL TIME DATA
    9.1     INITIAL SETTINGS AND DESCRIPTION
    9.2     KAPLAN-MEIER …
    9.3     SURVIVAL REGRESSION MODELS
       9.3.1    Cox
       9.3.2    Parametric survival
10     XTMIXED -- MULTILEVEL MIXED-EFFECTS LINEAR REGRESSION
    10.1    SYNTAX
    10.2    RANDOM EFFECT COVARIANCES
    10.3    PREDICT
11     DATA REDUCTION
    11.1    FACTOR ANALYSIS
12     PROGRAMMING
    12.1    PROGRAMS
       12.1.1   Program definition
    12.2    MACROS
    12.3    LOOPS
       12.3.1   For loop
       12.3.2   Foreach
       12.3.3   While
    12.4    CONDITIONS
       12.4.1   If
    12.5    MATRIX EXPRESSIONS
       12.5.1   Matrix operators
13     GLLAMM
    13.1    INSTALLATION
    13.2    DATA FORMAT
    13.3    SYNTAX EXAMPLES
       13.3.1   A two-level random intercept model (logistic)
       13.3.2   A two-level random intercept and slope model (linear)
       13.3.3   A two-level random intercept model, x1 and x2 categorical
    13.4    PREDICTION
       13.4.1   Syntax and options
14     SURVEY COMMANDS
    14.1    SETTING STRATIFICATION, CLUSTERING, FINITE POPULATION CORRECTION AND SAMPLE WEIGHTS
    14.2    MEANS AND PROPORTIONS
    14.3    TABLES
    14.4    REGRESSION
    14.5    STATA WEB LINKS

                                                                1 Basics
1.1      Help
help cmd                                             show help file for cmd

1.2      Short cuts
Ctrl-R                                               run selection in do file
Ctrl-D                                               do selection in do file
Ctrl-Alt-T                                           start Stata
PgUp / PgDown                                        prev/next command in command window
#review n                                            show last n commands
esc                                                  clear command

1.3      Options
1.3.1    Memory, Max variables and max matrix size
set memory 100m                            default = 10 MB, max = as large as the OS allows
set maxvar 8000                            default = 5000, min = 2048, max = 32767
set matsize 500                            default = 400, max = 11000
set xxx, permanently                       keep the setting for all future sessions

1.4      Save commands
cmdlog using myfile                              start a command log file
cmdlog close                                     close (and save) command log file
Can also save the Review window as a do file: click the “minus” in the upper left corner

1.5      Save output
(set more off), Begin log, …….., close log, save log, print log
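
The log workflow above can be sketched as follows. The file name mylog is only an assumption for illustration; the auto dataset ships with Stata.

```stata
* Minimal sketch of a logging session (mylog is a hypothetical file name)
* do not pause output at --more--
set more off
* begin log; the text option writes plain text to mylog.log
log using mylog, replace text
* anything run from here on is written to the log
sysuse auto, clear
summarize price mpg
* close (and save) the log
log close
```

The resulting mylog.log can then be opened and printed from any text editor.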

1.6      Notation
==                                                equal
~= (or !=)                                        not equal
&                                                 and
|                                                 or
~ (or !)                                          not
x^2                                               x squared
+                                                 string concatenation
.                                                 missing
x[3]                                              3rd observation of x
x[_n-1]                                           previous value of x
replace x=2 if _n==3                              set x[3] to 2
1.6.1    Variable names
Names can be 1-32 characters long: letters (case sensitive), digits and underscores. Start with a letter.

1.7      Command syntax
[by varlist:] command [varlist] [weight] [if exp] [in range] [using filename] [, options]
OBS: all commands are lower case!
1.7.1    By
by varlist:                                         repeat for all combinations of values in varlist, use sort varlist first
by varlist, sort:                                   same, but sorts by varlist first
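
A short sketch of by-group processing combined with _n subscripts, assuming the shipped auto data; the new variable names rank and prevprice are made up for the example:

```stata
sysuse auto, clear
* within each value of foreign, sort by price and number the cars 1, 2, ...
by foreign (price), sort: gen rank = _n
* previous price within the same group; missing for the first car of each group
by foreign: gen prevprice = price[_n-1]
list foreign price rank prevprice in 1/5
```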
1.7.2    Weights
[weighttype=var]
fweight=freq                                        frequency weighting for aggregated data
aweight=1/var                                       analytic weighting by precision (inverse variance)
pweight=1/prob                                      probability weighting by inverse sampling probabilities
iweight=                                            importance weighting, manual control of weights
ref U 23.13 and U 30
1.7.3      If exp
if exp                             run only for observations where exp is true (OBS: missing counts as larger than any number, so x>5 includes missing)
1.7.4      Ranges
in range                           restrict to range (in first/last), f=first, l=last, -n from end. Ex: 5/25, -10/l
list x in 5/10                     x from 5 to 10
list x in f/10                     x from first to 10
list x in -10/l                    x from -10 to last = the last 10 observations
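
A small illustration of in ranges and of the missing-value trap with if, assuming the shipped auto data (rep78 contains missing values):

```stata
sysuse auto, clear
* first five and last three observations
list make price in 1/5
list make price in -3/l
* the first count also includes cars with missing rep78, the second does not
count if rep78 > 3
count if rep78 > 3 & rep78 < .
```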

1.8      Prefix commands
by:
statsby:
bootstrap:
jackknife:
simulate:
svy:
stepwise:
xi:

1.9      Estimation commands


1.10     Postestimation commands
mfx                                marginal effects
adjust                             adjusted means
estat vce                          variance/covariance of estimates
predict, predictnl
ereturn list                       list of saved results
test, testnl                       linear and nonlinear Wald test
lrtest                             likelihood ratio tests
lincom                             point estimates and conf int of linear combinations
nlcom                              non-linear comb
estimates                          store and retrieve results
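
A possible postestimation sequence after a linear regression, sketched on the shipped auto data; the model and the values 20 and 3000 are arbitrary choices for illustration:

```stata
sysuse auto, clear
regress price mpg weight foreign
* joint Wald (F) test that both coefficients are zero
test mpg weight
* predicted price for a domestic car with mpg=20 and weight=3000, with CI
lincom _cons + 20*mpg + 3000*weight
* covariance matrix of the estimates and the list of stored e() results
estat vce
ereturn list
* keep the results under the (made-up) name m1
estimates store m1
```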

                                           2 Functions
2.1      Mathematical functions
sqrt()
ln() or log()                      natural log
log10()
abs()
int()
exp()
min(x1,…,xn) max….

2.2      Statistical functions
comb(n,k)                          “n over k”
binomial(n,k,p)
chi2(df,x)                         cum chi2
normden(z,s)                       N(0,s2)
norm(z)                            cum N(0,1)
uniform()                          0-1
2.2.1    Examples
a+(b-a)*uniform()                  random uniform [a,b)
a+int((b-a+1)*uniform())           random integers [a,b]
mu+s*invnorm(uniform())            random normal N(mu, s^2)
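
The three recipes above, tried out on 1000 generated observations. The bounds (2, 5), (1, 6) and the parameters mu=10, s=2 are arbitrary. Newer Stata versions document these functions under the names runiform() and invnormal(); the old names used here are assumed to still be accepted.

```stata
clear
set obs 1000
* fix the seed so the draws can be reproduced
set seed 12345
* uniform on [2,5)
gen u = 2 + (5-2)*uniform()
* random integers 1..6 (a die)
gen k = 1 + int((6-1+1)*uniform())
* normal with mean 10 and sd 2
gen z = 10 + 2*invnorm(uniform())
summarize u k z
```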
2.3      Logical
cond(x,a,b)                                      if x then a else b

                                                     3 Data handling
3.1      Import data
Use DBMS/COPY to convert from SPSS to Stata format. Choose Stata 6 with 8-byte doubles as the output file format.

3.2      Use and save
use file.dta
save newfile.dta                                 save new copy
save file.dta ,replace                           Overwrite original data

3.3      Describe, labels
describe                                         overview of variables
label var varname "text"                         variable label
label define lblname # "text" # "text"…          define mapping between numeric values (#) and labels ("text") called lblname
label values varname lblname                     associate mapping with variable
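
A minimal sketch of the three labelling steps, assuming the shipped auto data; the variable heavy and the label name heavylbl are invented for the example:

```stata
sysuse auto, clear
* 0/1 indicator, left missing where weight is missing
gen byte heavy = weight > 3000 if weight < .
label var heavy "Weight above 3000 lbs"
label define heavylbl 0 "Light" 1 "Heavy"
label values heavy heavylbl
tabulate heavy
describe heavy
```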

3.4      Formats
format varname %w.d type                         w=width in columns, d=decimal places,
type                                             g=general, f=fixed, s=string.
Examples: %9.0g , %9.2f, %10s

3.5      Recoding
recode varlist (rule) (rule), gen(varlist) copy  syntax
recode x (1 2=1 low) (3 4=2 high) (missing=.), gen(x2)        recode 1 and 2 into 1, 3 and 4 into 2, give labels and generate new x2
recode x (1=2) if sex==1, gen(x2) copy          recode only where sex==1, copy original values for sex!=1
egen ageGr3=cut(age), group(3) label            3 equal sized groups
egen ageGr2=cut(age), at(0,50,80) label         2 groups 0-50, 50-80, values outside set to missing
encode stringvar, generate(newvar)              make numerical newvar (1,2,3…) based on stringvar values
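
The recoding commands above, sketched on the shipped auto data. The cut points 10, 20, 30, 50 and the new variable names are assumptions chosen to cover the range of mpg:

```stata
sysuse auto, clear
* collapse repair record into three labelled groups; missing stays missing
recode rep78 (1 2 = 1 low) (3 = 2 medium) (4 5 = 3 high), gen(rep3)
tabulate rep78 rep3, missing
* three roughly equal-sized groups of mpg
egen mpg3 = cut(mpg), group(3) label
* fixed intervals [10,20), [20,30), [30,50)
egen mpgat = cut(mpg), at(10,20,30,50) label
tabulate mpg3 mpgat
```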

3.6      Generate, replace
generate newvar=exp                              create new variable
replace oldvar=exp
gen agegr=age>=30 if age!=.                      missing values are greater than all numerical values
gen xlag=x[_n-1]
gen xlead=x[_n+1]

3.7      Extended generate
egen [type] newvar = fcn(arguments) [if exp] [in range] [, options]
egen newvar=fcn(arg)                              extended generate: make newvar from stored functions.
Ex: by code, sort: egen mx=median(x)              gives medians of x by values of code
by ... : may be used with some egen functions
3.7.1     Functions
count(exp)                                        number of nonmissing observations of exp.
cut(varname), {at(#,#,...,#)|group(#)}            cut at the at() numbers, or in equal groups
mean, median, max, min, std, sum
pctile(exp) [, p(#)]                              percentiles
group(var1 var2)                                  new var from all combinations of var1 and var2
rmiss(varlist)                                    number of missing values across varlist in each observation
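
A few of these egen functions in use, assuming the shipped auto data; the new variable names are made up:

```stata
sysuse auto, clear
* median price within each value of foreign
by foreign, sort: egen medprice = median(price)
* number of nonmissing rep78 values within each value of foreign
by foreign, sort: egen n_rep = count(rep78)
* one group number per combination of foreign and rep78
egen grp = group(foreign rep78)
list foreign rep78 medprice n_rep grp in 1/5
```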

3.8      Drop, keep
drop varnames                                    drop variables from memory
drop in 3                                        drop observation 3
keep var1-var5                                   keep variables var1 to var5. OBS: keep if age>=10 will also keep missing.
drop if age==.                                   Remove missing
3.9      Missing
.                                              numerical missing
""                                             string missing
missing(x)                                     is equivalent to x==. if x is numeric, and to x=="" if x is string
Missing values are greater than all numerical values and are sorted last, so age>=30 will include missing.
gen agegr=age>=30 if age!=.
drop if age==.                                 Remove missing
mvdecode x1, mv(99)                            set 99 to missing
mvencode x1, mv(.=99)                          set missing to 99
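
A tiny typed-in data set showing why the if age!=. guard matters. The data and variable names are invented for the example:

```stata
clear
* four made-up subjects; 99 is a code for "unknown"
input id age
1 25
2 99
3 40
4 .
end
* turn the code 99 into missing
mvdecode age, mv(99)
* unguarded: subjects with missing age are coded 1
gen old = age >= 30
* guarded: subjects with missing age stay missing
gen old2 = age >= 30 if !missing(age)
list
```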

3.10     Sort
sort varname                                      sort by variable. Use before “by var:” command

3.11     String commands
fname+" "+lname                                   string concatenation
substr(name,1,10)
See U 16.3.5

3.12     Aggregate
contract vars, freq(fname) percent(pname)         contract (aggregate) over variable patterns to freq and percents
collapse vars                                     collapse data to means (or other stats) over variable patterns

3.13     Accessing results from commands
3.13.1 System variables
_b[varname]                                        regression coef
_b[_cons]                                          intercept
_se[varname]                                       SE of regression coef
_n                                                 current observation
_N                                                 total number of obs
_pi                                                pi
Ex: regress y x; _b[_cons] gives constant term, _b[x[1]] gives coeff of first category of x, _se[x[1]] gives standard error
Ex: xi: regress y i.x; _b[_Ix_2] gives coef of second level of x (created dummy called _Ix_2)
3.13.2 Saved results
return list                                        run after a command to find list of saved results
ereturn list                                       run after a command to find list of estimated saved results
e(name)                                            estimation class, live until next estimation
r(name)                                            result class, live until next command
Ex: summarize age, then gen agedev=age-r(mean)     deviation from the mean
Ex: regress y x1 x2, then matrix B=e(b) and matrix V=e(V)     save coefficient vector and variance-covariance matrix
3.13.3 Accessing results from commands, save as macros
sum w if c==1                                      mean of w for c=1
global w1=r(mean)                                  save as global macro
dis $w1                                            show content of macro
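
How r() and e() results can be picked up after a command, sketched on the shipped auto data; pricedev, B and V are names chosen for the example:

```stata
sysuse auto, clear
summarize price
* show what summarize left behind in r()
return list
gen pricedev = price - r(mean)
regress price mpg weight
* show what regress left behind in e()
ereturn list
* coefficient vector and variance-covariance matrix
matrix B = e(b)
matrix V = e(V)
matrix list B
display _b[mpg] "  " _se[mpg]
```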

                                                   4 Uni- and bivariate
4.1      List
list varlist [, [no]display nolabel]              list variables, nodisplay gives tabular data, nolabel gives values
list var_i-var_j                                  list a group of variables
list in 3                                         3rd observation, -1=last, 1/10 = 1 to 10
list if exp                                       list if var>10, list if var==10
list var1 if var2==.                              List if var2 is missing

4.2      Tabulate
4.2.1    One-way tables
tabulate var [weight][if expr][in range][,nofreq plot missing nolabel]   nolabel shows category values
tab1 varlist                                       one way tables for all variables
tab c, gen(c)                                      create dummies c1, c2,.. for each category of c
4.2.2     Two-way tables
tab var1 var2 [weight][if expr][in range][,nofreq col row cells chi2 exact missing nolabel]
tab var1 var2 , nofreq col chi2                    crosstab column % no freq with chi-square test
tab var1 var2 ,exact                               Fisher exact test
tabi 30 20 \ 20 10, col chi2                       immediate table
tab var1 var2, summarize(var3)                     mean, sd and freq of var3 by var1 and var2. Use means, standard or freq to limit output
4.2.3     Three-way tables
sort var3
by var3: tab var1 var2

4.3      Table of summary statistics
table rowvar [colvar [supercolvar]] [if] [in] [weight] [, options]
table rowvar, contents(clist) row col              clist:freq, mean, sd, sum, n, max, min, median, p# (percentile),iqr. Totals: row col.
                                                   Show missing: missing
table rowvar colvar supercolvar by superrowvarlist multi way tables
Ex: table sex, c(n age mean age mean educ) row     number of subjects, mean age and mean educ by sex, plus total row

tabstat varlist [if] [in] [weight] [, options]

epitab

4.4      Means and confidence intervals
means varlist                                      3 types of means with ci
ci varlist, binomial poisson total                 ci for means, proportions and counts

4.5      Summarize
summarize vars                                     number, mean, sd, min, max. Summarize alone takes all variables.
summarize vars ,detail                             percentiles, var, skew, kurt
inspect var                                        details on values

4.6      T-test
ttest var=#                                         one sample T-test
ttest var, by(c)                                    two sample T-test
ttest var1=var2                                     paired two sample T-test
ttest var1=var2, unpaired                           two sample T-test
,unequal                                            equal variances not assumed
Ex: sdtest age, by(sex); if equal variances are rejected: ttest age, by(sex) unequal
4.6.1     Test of equal variance (standard deviation)
sdtest var=#                                        standard deviation=#
sdtest var, by(c)                                   two groups compared
sdtest var1=var2                                    same variance in both variables
4.6.2     One way anova
oneway response_var factor_var [weight] [if exp] [in range] [, noanova nolabel missing wrap tabulate
           [no]means [no]standard [no]freq [no]obs bonferroni scheffe sidak ]
Ex: oneway var c, tabulate                          analysis of var by c
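
The tests in this section, run in sequence on the shipped auto data (an assumption; any numeric outcome with a grouping variable would do):

```stata
sysuse auto, clear
* first check whether the two groups have equal variances
sdtest mpg, by(foreign)
* two-sample t-test, equal variances assumed
ttest mpg, by(foreign)
* the same test without assuming equal variances
ttest mpg, by(foreign) unequal
* one-way anova with summary table and Bonferroni comparisons
oneway mpg rep78, tabulate bonferroni
```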

4.7      Non-parametric analysis
by gender, sort: centile partners, centile(25 50 75) cci      percentiles with exact confidence interval
ranksum partners, by(gender)                        Mann-Whitney test=Wilcoxon rank sum, 2 group
kwallis partners, by(age3)                          Kruskal Wallis K-group test


4.8      Proportions
proportion x1, over(c)                             proportions with ci
                                                            5 Graphics
5.1      Plot types
graph twoway                                       scatter, line, density, histogram, function,..
graph matrix
graph bar, hbar, dot
graph box
graph pie

5.2      Graph Twoway
5.2.1       Twoway syntax
graph twoway plot [if exp] [in range] [, options] twoway syntax (graph may be omitted)
where plot=(plottype varlist, options)            plot syntax, several plots may be listed and combined
where varlist= y1 y2 … x                          last variable is x
Ex: twoway scatter y x                            plot y by x
5.2.2       Twoway plot types
scatter, line, connected, area
dot, bar, histogram, kdensity                     kernel density
function y=f(x),range( x1 x2)                     f(x) from x1 to x2
rarea rcap rbar                                   range area, range cap, range bar ,
Ex: twoway area y x , sort base(50)               gives shading from 50
Ex: histogram x, bin(10) start(-2.5) percent      or frequency instead of percent
Ex: twoway (histogram x, width(1) frequency) (kdensity x, area(3200))                area scaled to the sum of subjects
Ex: function y=normden(x), range(-4 4) droplines(-1.96 1.96) function plots
Ex: twoway dropline db id if abs(db)>.25 , mlabel(id)            deltabeta >0.25
5.2.3       Twoway fitlines
lfit, qfit, mband, mspline,lowess                 linear and quadratic fits, median band, median splines and lowess
lfitci, qfitci, fpfitci                           fit with CI: linear, quadratic, fractional polynom
Ex: (lfitci y x, ciplot(rline)) default is rarea
Ex.: twoway (lfit y x) (lowess y x) (scatter y x)      scatter with linear and lowess fit
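
A sketch of an overlaid twoway graph with a fit line, its confidence band and a lowess smoother, assuming the shipped auto data; titles are arbitrary. The /// line continuation works in do files only.

```stata
sysuse auto, clear
* later plots are drawn on top of earlier ones, so the scatter goes last
twoway (lfitci mpg weight) (lowess mpg weight) (scatter mpg weight), ///
    title("Mileage by weight") ytitle("Miles per gallon") legend(cols(1))
```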

5.3      Graph Bar, Hbar and Dot
5.3.1    Syntax
graph bar/hbar/dot yvars [if exp] [in range] [, options]
Where yvars=varlist, or =(stat) varlist, or= (stat) name=varname
stat= mean, median, p1, p2, p99, sum, count, min, max
Ex: graph bar x ,over(c) nofill                           means of x over categories of c
Ex: graph bar (mean) meany=x (median) medy=x              mean and median of the same variable
Ex: graph bar (median) x1 x2 , percent stack              stacked percentages
5.3.2    Options
nofill                                               skip empty categories
sort(1)                                              sort by 1 variable
over(c1)                                             values for each c1
by(c2)                                               separate plots for each c2
bargap(0)                                            % overlap, -30=30% overlap, 30=gap.
blabel(what,where_and_how)                           bar labels
what: bar/ total/ name/ group                        print height, total height, name of yvar, name of first over() group
Where_and_how:
position(outside/ inside/base/center)                where to place the bar label
format(%9.1f) gap(rel_size) textbox_options          options for labels
Ex: graph bar teq1 ,over(landsdel) nofill blabel(bar, pos(inside) size(*1.3) format(%9.1f) color(white))
Ex: graph hbar teq1 ,over(landsdel,axis(off) sort(1))nofill blabel(group, pos(base) size(*1.3) format(%9.1f) color(white))

5.4      Graph Box, Hbox
graph box x1 x2 x3, ascategory                     boxplot of separate variables, ascat puts labels on the y-axis
graph hbox x, over(c, total)                        plot of x over cat of c plus total

5.5      Graph Pie
graph pie x1 x2 x3                               sum of x1, x2 and x3
graph pie x ,over(c)                             sum of x for each category of c
graph pie ,over(c)                               number of cases for each category of c
5.5.1    Options
plabel(_all sum/ percent/ name/ text, text_box_options)        label all slices with sum, percent, x-names or a given text

5.6      Graph Matrix
graph matrix x1-x5                                  scatter of all 5 variables

5.7      Other graphs
gladder y, qladder y                                histograms over different transformations of y, QQ plot of the same

5.8      Titles
title("text"), xtitle("text"), ytitle("text")       titles
title, subtitle, caption, note                      title types
5.8.1     Title options
position(clockpos)
ring(ringpos)
span
text_box_options
Ex: scatter teq1 moralder, title("Title", position(12) ring(0))

5.9      Legend
legend([contents] [location])
Contents:
order(1 2 3)                                     may also use order(1 "label1" 2 3)
label(1 "label1")                                override legend for var 1
cols(1)                                          legend in 1 column. rows(1) …
stack                                            stack symbol and text
rowgap(2) colgap(2)                              gap between each element
Location:
on/off                                           legend on/off
position(clock)                                  position of legend
ring(1)                                          radial distance from plot, ring(0)=inside
Ex: legend(label(1 "Density of TEQ") label(2 "Mean") label(3 "Median") ring(0) pos(2) cols(1))
Ex: graph bar teq_di teq_fu teq_npcb teq_mopc teq_hcb ///
   , legend(row(1) stack colgap(10) label(1 "Dioxin") label(2 "Furan") label(3 "Non-o") label(4 "Mono") label(5 "HCB"))

5.10     Axis scale, label, ticks and grid
5.10.1 Axis title
x|ytitle("line1" "line2")
5.10.2 Axis scale
x|yscale(opts)
Options:
axis(1)                                          axis to modify (1-9)
[no]log
[no]reverse
range(0 100)                                     extend range, will not decrease range. range(0): start at 0, range(100): end at 100
alt                                              axis at alternative side
on/off                                           axis on/off
Ex: scatter teq1 moralder,xscale(range(0 80)) yscale(off)       no y-axis
5.10.3 Axis labels and ticks
x|ylabel(rule_or_values,opts)                major ticks and labels
x|ytick(rule_or_values)                      major ticks
x|ymlabel(rule_or_values)                    minor ticks and labels
x|ymtick(rule_or_values)                     minor ticks
rule or values (may use both):
#10                                          10 nice values
1 5 50                                       labels at 1, 5 and 50
0 5 10 "mean" 15 20                          labels every 5, with mean printed at 10
0 (10) 100                                   labels from 0 to 100 in steps of 10
minmax                                       min and max values
none
Label options:
angle(0)
[no]grid                                     add gridlines
format(%5.0f)                                5 places, 0 decimals, fixed
Ex: xlabel(1 "Low" 2 "Medium" 3 "High", angle(45)) text labels at values 1 2 and 3, at 45 deg
Ex: scatter teq1 moralder,xlabel(#10,grid)

5.11     Text
text(y x "text", opts)                           text at y,x in the plot
placement(c)                                     c=centered, n=north, s=south, ..
orientation(vertical)
box                                              draw box around text
Ex: graph …, text(10 50 "Line1" "Line2", just(left) color(blue))        two lines of text at (y,x)=(10,50)

5.12     Markers and marker labels
5.12.1 Markers
mstyle(p1 p2 )                                 default styles
msymbol(sym1 sym2 …)                           marker: S square, s small square, Sh hollow square, D diamond, T triangle, O circle, X,
                                               +, p point, . default, i invisible. Ex msymbol(S)
msize(small medium large), msize(*2)           small, medium, large markers; twice the size
mcolor(green)                                  both outside and inside color
Ex msymbol(. t Oh)                             markers for 3 variables: default, small triangles and hollow circles
Ex twoway scatter y x [aweight=z], msymbol(oh) msize(small)              point size prop to z
5.12.2 Marker labels
mlabel(var)                                    label marker by var content
mlabsize(size)
mlabcolor(color)
mlabelpos(12)                                  label at 12 o’clock position
mlabvposition(var)                             positions based on variable containing clock positions
mlabgap(*3)                                    3 times larger gap between marker and label
Ex scatter y x, mlabel(z) mlabpos(0) msymbol(i)     use contents of z to label points, labels in the center (clock position 0) and invisible points

5.13     Lines
5.13.1 Connecting points
twoway scatter y x, connect(l) sort               sort points, connect with line
connect(l)                                        line
connect(L)                                        separate line for each series
connect(J) or connect(stepstair)                  stairstep connections, for survival curves
5.13.2 Line options
lcolor(red)                                       line color
lwidth(thick) or lwidth(*3)                       thick line
lpattern(dash)
lpattern("l" ".-" "-###")                         solid, dotdashed, dash+3 spaces
5.14     Text box options
tstyle(textboxstyle)                              overall style
box/nobox                                         border
size(textsizestyle)
color(colorstyle)                                 text color
justification(justificationstyle)                 text left, center, right
alignment(alignmentstyle)                         text top, middle, bottom, baseline
bfcolor(colorstyle)                               background color
bcolor(colorstyle)                                background and border color
blstyle(linestyle)                                style of border
orientation(orientationstyle)                     vertical/horizontal, rvertical/rhorizontal
placement(compassdirstyle)                        location
ring(1)                                           0:inside, 1-7 outside
format(%9.1f)                                     9 places, 1 decimal, fixed
Ex: graph…,title("My title", color(red) box size(*1.5))

5.15     Other options
5.15.1 Colors
black, white, red, blue, cyan, green, mint, yellow….
gs0… gs16                                            gray scales from black to white
gray=gs8
color*0.5                                            half the intensity
5.15.2 Positions
clockpos(12)                                         12 o’clock. clockpos(0) means center if valid
placement(north)                                     alternative to clock with 9 positions
ring(1)                                              0:inside, 1-7 outside
justification(left/ centered/ right)                 text justification
alignment(top/ middle/ bottom/ baseline)             text alignment
orientation(horizontal/ vertical/rhorizontal/ rvertical)

5.16     Over()
over(c, total)                                     split by categories of c plus total, can use over(c1) over(c2)
over(c, descending)                                sort values.
over(c, sort(c2)), sort(1)                         sort by c2 or by the first y variable
over(var, relabel(1 "lab1" 2 "lab2"))              new labels for "over" variable
ascategory / asyvars                               as categories: plotted with spaces, as yvars: plotted dense
missing, nofill                                    show missing, do not show empty combinations
Ex: graph bar teq_di teq_fu ,over(landsdel, total) nofill

5.17     By()
by(varlist, suboptions)                            separate graphs for each varlist
total                                              add total group
missing                                            add missing groups
colfirst                                           display down columns
rows(#), cols(#)                                   number of rows or cols
holes(numlist)                                     positions to leave blank
compact
Ex: graph bar teq_di teq_fu ,by(star, total rows(1) compact)

5.18     Schemes
set scheme schemename [, permanently]              set overall look of graphs
graph …, scheme(schemename)                        set overall look for current graph
graph query, schemes                               list installed schemes
schemenames:
s2color                                            Default, will vary colors of lines and markers
s2mono                                             monochrome, will vary patterns of lines and markers
5.19    Combining graphs
graph …., saving(plt1,replace)                  saving to file
graph …., name(plt1,replace)                    saving to memory
graph use plt1.gph                              show saved graph from file (graph display plt1 shows a graph in memory)
graph combine plt1 plt2, ycommon cols(1)        combine from memory in 1 column with same y scaling
graph combine plt1.gph plt2.gph                 combine from file
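
A sketch of the combine workflow: two graphs kept in memory, combined, then saved to and reloaded from a file. It assumes the shipped auto data; plt1, plt2 and combined.gph are made-up names.

```stata
sysuse auto, clear
* two graphs stored in memory under their own names
scatter mpg weight, name(plt1, replace)
scatter mpg length, name(plt2, replace)
* stack them in one column with a common y scale
graph combine plt1 plt2, ycommon cols(1)
* save the combined graph to disk and show it again from the file
graph save combined.gph, replace
graph use combined.gph
```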


5.20    Graph query
graph query                                     list of all styletypes
graph query color                               list of all colorstyles
graph query linepattern                         list of all linepatternstyles

5.21    Palettes
palette line                                    plot showing the linetypes
palette symbol                                  plot showing the symboltypes
palette color1 color2                           plot comparing colors


                                               6 Regression commands
6.1     Regression models
6.1.1    Linear regression with simple error structure
regress                                        linear regression (also heteroskedastic errors)
boxcox                                         linear regression on BoxCox transformations of y and x’s
nl                                             non linear least squares
6.1.2    GLM
logistic                                       logistic regression
poisson                                        Poisson regression
binreg                                         binary outcome, OR, RR, or RD effect measures
glm                                            use for non-canonical links
6.1.3    Conditional logistic
clogit                                         for matched case-control data
6.1.4    Multiple outcome
mlogit                                         multinomial logit (not ordered)
ologit                                         ordered logit
6.1.5    Linear regression with complex error structure
xtmixed                                        linear mixed models
xtlogit                                        random effect logistic
xtpoisson                                      random effect Poisson
6.1.6    Survival models
stcox                                          Cox proportional hazard models (with frailty)
streg                                          parametric survival models (with frailty)

6.2     Orthogonal variables
orthog x1 x2 x3, gen(q1 q2 q3) matrix(R)        make orthogonal variables and transformation matrix R
regress y q1 q2 q3                              regression command
matrix b=e(b)*inv(R)'                           transforming coefs back to original metric
matrix list b                                   show coefs

6.3     Test after regression commands
6.3.1    Wald test
test x1 x2                                      joint effect of two variables
test x1=-2                                      H0: x1=-2
test x1-2*x2=3                                  test of linear combinations of variables
6.3.2    Likelihood ratio test
regress y x1 x2 x3 x4                          fit model 1
estimates store m1                             store model 1
regress y x1 x2                                fit model 2
lrtest m1 .                                    test model 1 against current model
lrtest m1 m2                                   test m1 vs m2

6.4      Cataloging estimation results
quietly: regress y x1 x2                       fit model without output
estimates store m1                             store results as m1
estimates dir                                  list stored results
est table m1 m2 …                              compare coefs
est stats m1 m2 …                              compare fit (ll, AIC..)
estimates replay                               show results
estimates restore m2                           make m2 active
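
Storing two nested models and comparing them, as a sketch on the shipped auto data; the model names full and reduced are invented:

```stata
sysuse auto, clear
* larger model, stored under the name full
quietly regress price mpg weight length foreign
estimates store full
* nested smaller model, stored under the name reduced
quietly regress price mpg weight
estimates store reduced
estimates dir
* likelihood ratio test of the two dropped terms
lrtest full reduced
* coefficients side by side with N, log likelihood, AIC and BIC
estimates table full reduced, stats(N ll aic bic)
```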

6.5      Cov, Corr, AIC, BIC and sample
estat vce                                      vce=variance-covariance estimate
estat vce, corr                                correlation matrix
estat ic                                       information criteria: AIC and BIC
estat summarize                                show mean, min and max for variables in the model

6.6      Prediction
regress y x1 x2                                fit model
gen y1=_b[_cons]+_b[x1]*x1+_b[x2]*x2           direct prediction
predict y1                                     prediction in the same metric as the outcome, prob of success for logistic, counts for
                                               Poisson, …
predict y1, xb                                 linear prediction
pred y1 if e(sample), xb                       linear prediction restricted to the estimation sample
pred sey, stdp                                 standard error of prediction
pred r1, resid                                 residuals
pred c1, cooksd                                Cooks distance
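
A short check that predict and the hand-computed prediction agree, assuming the shipped auto data; yhat, yhand, res and cook are names chosen for the example:

```stata
sysuse auto, clear
regress price mpg weight
* linear prediction from predict and the same thing computed by hand
predict yhat, xb
gen yhand = _b[_cons] + _b[mpg]*mpg + _b[weight]*weight
* residuals and Cook's distance
predict res, resid
predict cook, cooksd
summarize yhat yhand res cook
```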

                                                7 Linear regression
regress y x1 x2 x3                             regress y on x1 x2 x3
regress                                        repeat last result
test x2 x3                                     F-test of joint effect of x2 and x3
vce                                            variance covariance matrix of estimators. vce, rho gives corr matrix
predict                                        predicted values
predict newvar, stat                           pred, resid, DFBeta,…
regress y x1 x2 x3 if influ<1                  Stored Cooks dist in influ, rerun without high influential points
7.1.1     Test of assumptions
predict fteq ,xb                               predicted y
predict res ,res                               residuals
twoway (qfitci res fteq ) (scatter res fteq)   scatter with quadratic fit + ci
rvfplot, mlabel(id) yline(0)                    residuals versus fitted, look for non linearity and heterosk.
ovtest                                         test for omitted higher order y's, p<.05 means non-linear effects
ovtest, rhs                                    test for omitted higher order x-variables, p<.05 means non-linear effects
hettest                                        test for heterosk., p<0.05 means heterosk.
7.1.2     Test of influence
lvr2plot ,mlabel(id)                           leverage vs residuals squared, look for high leverage
avplot moralder ,mlabel(id)                    added variable plot
7.1.3     Test of multicollinearity
vif                                            variance inflation factor, look for vif>10 (or 30) and mean vif>1
                                                 8 Logistic regression
8.1      Syntax
logistic y x1 x2 x3                             show odds ratios
logistic , coef                                 show coefs of last model
logit                                           show coefs of last model

8.2      Categorical covariates
xi: logistic y x1 i.x2 x3                       indicator variables for x2
char _dta[omit] prevalent                       make the most prevalent group the reference category (Permanent setting)
char _dta[omit]                                 make the first group the reference (permanent setting)
char catvar[omit] 3                             make the third group of catvar the reference (permanent setting)
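
As a minimal sketch of changing the reference category, the example below assumes the nlsw88 dataset shipped with Stata, where race is coded 1, 2, 3. The model itself is only illustrative.

```stata
* Example data shipped with Stata (an assumption for illustration)
sysuse nlsw88, clear

* Make category 2 of race the reference group for xi
char race[omit] 2

* xi creates the indicators _Irace_1 and _Irace_3; category 2 is omitted
xi: logistic union age i.race
```

From Stata 11 onward, factor-variable notation such as ib2.race gives the same choice of reference group without xi.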

8.3      Residuals, goodness-of-fit
predict newvar, stat                               predict statistic and put into newvar.
stat: p=probabilities, xb=fitted values (linear predictor), db=delta beta, de=deviance resid, r=Pearson resid, rsta=standardized resid, hat=leverage
test x1 x2                                         test joint effect of x1 x2
lfit                                               Pearson chi-square goodness of fit. , group(10) gives Hosmer-Lemeshow with 10 groups
lstat                                              summary statistics
lincom                                             OR of one covariate pattern versus another

8.4      Diagnostic plots
After fitting the logistic model do:
predict p, p                                   probabilities
predict db, db                                 delta beta
predict dx2, dx2                               Hosmer Lemeshow delta chi-square influence
scatter dx2 p [aw=db], msymbol(oh) title("Symbol size prop. to delta-beta")
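
The following sketch runs the goodness-of-fit and diagnostic steps end to end. It assumes the nlsw88 dataset shipped with Stata and uses estat gof and estat classification, the names Stata 9 and later document for lfit and lstat.

```stata
* Example data shipped with Stata (an assumption for illustration)
sysuse nlsw88, clear
xi: logistic union age grade i.race

* Hosmer-Lemeshow test with 10 groups, then the classification table
estat gof, group(10)
estat classification

* Odds ratio comparing race category 2 with category 3
lincom _Irace_2 - _Irace_3

* Diagnostics: probabilities, delta-beta, delta chi-square
predict p, pr
predict db, dbeta
predict dx2, dx2

* Influence plot with marker size proportional to delta-beta
scatter dx2 p [aw=db], msymbol(oh) title("Symbol size proportional to delta-beta")
```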

                                              9 ST Survival time data
9.1      Initial settings and description
stset timevar, failure(died)                    set time variable and failure indicator
stdes                                           describe data
stsum                                           summarize data

9.2      Kaplan-Meier …
sts graph, by(drug)                             Kaplan-Meier plot
sts test drug                                   log rank test

stci, by(sex) p(25)                             25th percentile with CI, by sex
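
A short worked run of these survival commands might look as follows. It assumes the cancer dataset shipped with Stata (variables studytime, died, drug), so the grouping variable is drug rather than sex.

```stata
* Example data shipped with Stata (an assumption for illustration)
sysuse cancer, clear

* Declare the time variable and the failure indicator
stset studytime, failure(died)

* Describe and summarize the survival-time data
stdescribe
stsum

* Kaplan-Meier curves and log-rank test by treatment group
sts graph, by(drug)
sts test drug

* 25th percentile of survival time with confidence interval, by group
stci, by(drug) p(25)
```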


9.3      Survival regression models
9.3.1    Cox

9.3.2    Parametric survival

                      10 xtmixed -- Multilevel mixed-effects linear regression
10.1     Syntax
xtmixed y x1 x2 x3 ||id: x1 , cov(ind)          y and fixed part || id for second level: random part (intercept understood), covariance structure

10.2     Random effect covariances
independent                                     one variance parameter per random effect, all covariances zero; default
exchangeable                                    equal variances for random effects, and one common pairwise covariance
identity                                        equal variances for random effects, all covariances zero; the default for factor variables
unstructured                                     all variances/covariances distinctly estimated

10.3     Predict
xb                                               xb, linear predictor for the fixed portion of the model
stdp                                             standard error of the fixed-portion linear prediction xb
fitted                                           fitted values, linear predictor of the fixed portion plus predicted random effects
residuals                                        residuals, response minus fitted values
rstandard                                        standardized residuals
Ex: predict yhat, fitted                         predict fixed and random effect into new variable yhat
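
To make the syntax and the predict options concrete, here is a sketch on simulated data. Everything about the data (50 groups, 6 observations each, the chosen variances, the names u0, u1, yfix, yhat, res) is an assumption for illustration. rnormal() requires Stata 10.1 or later, and newer Stata versions document this command as mixed.

```stata
* Simulate two-level data: 50 groups with a random intercept and slope
clear
set seed 12345
set obs 50
generate id = _n
generate u0 = rnormal(0, 2)
generate u1 = rnormal(0, 0.5)
expand 6
bysort id: generate x1 = _n
generate y = 10 + (0.5 + u1)*x1 + u0 + rnormal()

* Random intercept and random slope on x1, unstructured covariance
xtmixed y x1 || id: x1, cov(unstructured)

* Fixed-portion prediction, fitted values including random effects, residuals
predict yfix, xb
predict yhat, fitted
predict res, residuals
```

The residuals are the response minus the fitted values, so res should equal y - yhat up to rounding.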

                                                   11 Data reduction
11.1     Factor analysis
factor v1 v2 v3 v4, mineigen(1) factors(5)       minimum eigenvalue 1, max number of factors 5
estat anti                                       anti-image corr and cov
estat kmo                                        Kaiser-Meyer-Olkin measure of sampling adequacy, 0.00 to 0.49 unacceptable,
                                                 0.50 to 0.59 miserable, 0.60 to 0.69 mediocre, 0.70 to 0.79 middling, 0.80 to
                                                 0.89 meritorious, 0.90 to 1.00 marvelous
rotate                                           varimax orthogonal
loadingplot                                      plot 2 factors


                                                      12 Programming
12.1     Programs
12.1.1 Program definition
program define name
  args x1 x2 x3
  local m=`x1' +1
.
end

program drop name                                remove old program definition
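
A small complete program along these lines is sketched below. The name addone and the returned scalar are hypothetical, chosen only to show the pattern.

```stata
* Drop any earlier definition so the do-file can be rerun
capture program drop addone

* A program that takes one argument, adds 1, and returns the result in r(m)
program define addone, rclass
    args x1
    local m = `x1' + 1
    display `m'
    return scalar m = `m'
end

* Call the program with the argument 4
addone 4
```

Declaring the program rclass is optional; it is used here so that the result is available to the caller as r(m).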

12.2     Macros
local name "content"                             define macro
local name= expression                           define macro
`name'                                           use local macro
global name= expression                          define macro
$name                                            use global macro
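
As a brief illustration of both macro types (the macro names greeting, n and base are made up for this example):

```stata
* Local macro holding text, and one holding the result of an expression
local greeting "hello"
local n = 2 + 3
display "`greeting'"
display `n'

* Global macro, referenced with a dollar sign
global base = 10
display $base + `n'
```

Local macros exist only within the do-file or program that defines them, while global macros persist for the whole session.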

12.3     Loops
12.3.1 For loop
forvalues i=1(1)10 {
disp `i'                                         commands on separate lines
}
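
For instance, a forvalues loop can accumulate a running total in a local macro (the macro name total is illustrative):

```stata
* Sum the integers 1 to 10
local total = 0
forvalues i = 1(1)10 {
    local total = `total' + `i'
}
display `total'
```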
12.3.2 Foreach
foreach lname in any_list {
foreach lname of local lmacname {
foreach lname of global gmacname {
foreach lname of varlist varlist {
foreach lname of newlist newvarlist {
foreach lname of numlist numlist {
Ex:
local grains "rice wheat corn rye barley oats"
     foreach x of local grains {
           display "`x'"
     }
Ex: foreach x of varlist mpg weight-turn {
           ...
     }
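
A runnable version of the varlist form is sketched here, assuming the auto dataset shipped with Stata, in which weight-turn expands to weight, length and turn. The counter k is only there to show how many variables were visited.

```stata
* Example data shipped with Stata (an assumption for illustration)
sysuse auto, clear

* Loop over mpg and the variables from weight through turn
local k = 0
foreach v of varlist mpg weight-turn {
    quietly summarize `v'
    display "`v': mean = " r(mean)
    local k = `k' + 1
}
display "variables processed: `k'"
```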
12.3.3 While
local i=1
while `i'<5 {
  commands
  local i= `i'+1
}
}

12.4     Conditions
12.4.1 If
if exp {
    commands
}
else {                                             the else part is optional
   commands
}
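
The while loop and the if/else construct can be combined as in this sketch (the macro names i and n_even are illustrative):

```stata
* Classify the integers 1 to 4 as odd or even, counting the even ones
local i = 1
local n_even = 0
while `i' < 5 {
    if mod(`i', 2) == 0 {
        display "`i' is even"
        local n_even = `n_even' + 1
    }
    else {
        display "`i' is odd"
    }
    local i = `i' + 1
}
```

Note that this if is the programming command, evaluated once per pass; it is different from the if qualifier that restricts a command to certain observations.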


12.5     Matrix expressions
matrix A=(1,2,3\4,5,6)                             define matrix A as 2 by 3
A[1...,"col1"] or A[1...,1]                        first col, "col1" is the column name
A["row1",1...] or A[1,1...]                        first row
A["row i","col j"] or A[i,j]                       element i,j
A[2...,1..2]                                       submatrix (2-n) by (1-2), may also use names
mat B=J(3,4,0)                                     3 by 4 matrix of zeros
mat B[2,2]=1                                       change element
12.5.1 Matrix operators
-B     negate
B'    transpose
B \ C add rows of C below rows of B
B , C add columns of C to the right of B
B + C add
B - C subtract
B * C multiply (including mult. by scalar)
B / z division by scalar
B # C Kronecker product

matrix list A                                      show matrix
matrix dir                                         List the currently defined matrices
matrix list                                        Display the contents of a matrix
matrix rename                                      Rename a matrix
matrix drop                                        Drop a matrix
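
A few of these matrix expressions in one short sketch (the matrix names C, R and S are illustrative):

```stata
* Define a 2 by 3 matrix and a 3 by 4 matrix of zeros with one element changed
matrix A = (1,2,3\4,5,6)
matrix B = J(3,4,0)
matrix B[2,2] = 1

* Matrix product: 2 by 3 times 3 by 4 gives 2 by 4
matrix C = A*B

* Extract the first row, and all rows of columns 1 to 2
matrix R = A[1,1...]
matrix S = A[1...,1..2]

* Show a matrix and a single element
matrix list C
display A[2,3]
```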

                                                         13 GLLAMM
13.1     Installation
Run the following Stata command to install gllamm:
ssc install gllamm, replace

13.2     Data format
Use long data format with identifiers at the different levels


13.3       Syntax examples
13.3.1 A two-level random intercept model (logistic)
gllamm y x1 x2, i(level2_id) family(binom) link(logit) nip(8)                  number of integration points=8
13.3.2 A two-level random intercept and slope model (linear)
gen cons=1
eq interc: cons
eq slope1: x1
gllamm y x1, i(level2_id) nrf(2) eqs(interc slope1)                            number of random functions=2,
3 random parameters estimated: var(interc), var(slope1) and covar(interc,slope1).
Option nocor would set the last to 0
13.3.3 A two-level random intercept model, x1 and x2 categorical
xi: gllamm y i.x1 i.x2, i(level2_id) family(binom) link(logit) nip(8)

13.4       Prediction
13.4.1 Syntax and options
gllapred varname [, xb u linpred]
xb                                               fixed effect part of linear prediction
u                                                posterior means and std for latent variables
linpred                                          linear prediction of both fixed and random parts


                                                 14 Survey commands
A family of commands to account for survey design (stratification and clustering)

14.1       Setting stratification, clustering, finite population correction and sample weights
svyset strata varname                            stratification
svyset psu varname                               clustering (psu=primary sampling unit)
svyset fpc varname                               finite population correction
svyset pweight varname                           sample probability weights
Settings remain until cleared:
svyset, clear
svyset                                           shows current settings

14.2       Means and proportions
svymean varname, by(variable) subpop(variable)                subpop selects observations where the variable is neither 0 nor missing.
                                                              Use subpop() rather than if in svy commands
svyprop varname
svyratio varname
svytotal varname

14.3       Tables
svytab x y, row column obs se ci                 two-way tables

14.4       Regression
svyreg                                           linear
svylogit                                         logistic
svypois                                          Poisson
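
The command names in this chapter follow the older survey syntax. From Stata 9 onward the same analyses are written with a single svyset call and the svy: prefix. The sketch below assumes the nhanes2 example dataset, which webuse downloads from the Stata website (so it needs an internet connection), and its variables psuid, stratid and finalwgt.

```stata
* Example survey data from the Stata website (an assumption for illustration)
webuse nhanes2, clear

* Declare PSU, probability weights and strata in one call
svyset psuid [pweight=finalwgt], strata(stratid)

* Means by group, and a mean restricted to a subpopulation
svy: mean weight, over(sex)
svy, subpop(female): mean weight

* Two-way table with row proportions, standard errors and confidence intervals
svy: tabulate race diabetes, row se ci

* Linear and logistic regression with the survey design taken into account
svy: regress weight height age
svy: logistic highbp age female
```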



14.5       Stata web links
Stata programs for generalized linear measurement error models, USA
        Programs by R. J. Carroll, J. Hardin, and H. Schmiediche, fit generalized linear models when one or more
        covariates are measured with error.
Stata program by Tony Brady, Sealed Envelope Ltd
        Programs for Hosmer–Lemeshow goodness of fit test, conversion of regression output into near publication
        quality tables, time utilities to translate strings in 24 hr clock HH:MM format to elapsed times and back again,
        tabulate longitudinal data at the cluster level, count clusters in longitudinal data, etc.
Stata programs from Dr. Gareth Ambler, University College, UK
        Programs for Hosmer-Lemeshow test, penalised logistic regression, and generalized additive models, and a
        postestimation routine.


One great source for user-written software for Stata is the Stata Journal (SJ). There are many other resources available,
including the Statalist archive, but we will use the SJ archive for this example.
From Stata's toolbar, click on Help > SJ and User-written Programs, or at the command prompt, type [view] help
net_mnu.



                                                    15 New in Stata 10

15.1     Graph editor

15.2     Exact

15.3     Mixed models
xtmelogit
xtmepoisson


15.4     Survival
sts graph, risktable ci plotopts() ciopts()
stcurve, ---#---

15.5     Power
stpower cox
stpower logrank 0.7 0.8, power(0.8)              sample size required to detect an increase in survival from 0.7 (untreated) to 0.8 (treated)
                                                 at the end of the study, with 80% power
stpower logrank 0.7, n(100 250 500) hratio(0.1(0.01)0.9) saving(mypower)


15.6     Saved results
est save filename
est use filename
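
A minimal round trip through a saved estimation file could look like this. It assumes the auto dataset shipped with Stata; the file name myfit is hypothetical, and the file is written to the current working directory as myfit.ster.

```stata
* Example data shipped with Stata (an assumption for illustration)
sysuse auto, clear

* Fit a model and save its results to disk
regress price mpg weight
estimates save myfit, replace

* A different model now occupies the current estimation results
regress price length

* Load the saved results back and replay the original output
estimates use myfit
regress
```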

15.7     Mata

15.8     Diverse
lpoly
mkspline

								