MP_L4_Using_Excel

Document Sample
MP_L4_Using_Excel Powered By Docstoc
					Using Excel

 Biostatistics 212

    Lecture 4
             Housekeeping
• Questions about Lab 3?
  – replace vs. recode


• Final Project Dataset!
  – “Housekeeping” commands vs. data cleaning
    (don’t show data cleaning)

• A little short-handed today…
                       Today...
• Why are we talking about spreadsheets?
• Pro’s and Con’s of using a spreadsheet for:
    – Data management, Statistics, Calculating, Modeling,
      Tables, Figures
•   Cells
•   Formulas
•   Cutting and pasting formulas
•   Spreadsheet style
•   Examples
            Why spreadsheets?
• Excel is widely used, and for good reason
  –   Store numbers and text
  –   Calculations
  –   Desktop graphics – Tables and Figures
  –   Flexible creation of ledgers, models, other
      complex programs
          Why spreadsheets?
• How is a spreadsheet different than Stata’s
  data editor?
  – Less structured
  – Formulas
  – Formatting
          Why spreadsheets?
• How is a spreadsheet different than a
  database program like Access?
  – Less structured
  – Formula chains
  – Formatting
 Pro’s and Con’s of spreadsheets
• For data management
  – Pro’s
     • Easy start – just name columns and start typing


  – Con’s
     • No structure
     • Can’t sort, filter or query data
     • “Flat” file – no relational table structure allowed
 Pro’s and Con’s of spreadsheets
• For statistical analysis
   – Pro’s
      • Easy start, if you know how to do formulas


   – Con’s
      • Extremely limited range of options
      • Difficult to document
 Pro’s and Con’s of spreadsheets
• For calculating, or “modeling”
  – Pro’s
     • Repetitive calculations easy
     • Complex calculations easy

  – Con’s
     • Simple, 1-time calculations not as fast as a
       calculator
     • Sometimes hard to decipher in retrospect
 Pro’s and Con’s of spreadsheets
• Tables and Figures – will discuss in
  Sessions 6 and 7
                             Cells
• The basic building block of a spreadsheet

• Can contain:
   –   Numbers
   –   Text
   –   Dates, times, other special formats
   –   “blanks”
        • Start with 46 million blank cells!
         (230 cols x 66536 rows x 3 worksheets)
                Cells, cont
• Enter anything you like into each cell
  (numbers, text, symbols, etc) using
  keyboard

• Contents displayed on spreadsheet

• Organized and named by column/row
                Formulas
• Use when you want the contents of one cell
  to depend on the contents of other cells



• ALWAYS starts with:           =
                           (an “equals sign”)
                    Formulas
• Can contain:
  –   Numbers
  –   Text
  –   References to cells
  –   The usual math operators (+ - * / ^ )
  –   Built-in functions
                Formulas
• Cell contents update automatically when a
  referenced cell content changes

• “Chains” of formulas make for flexible
  calculating
                Formulas
• Contents of a cell displayed on spreadsheet

• The formula determining that content is
  displayed in the “formula box”

• Example
                        Formulas
• Types of formulas
  – Arithmetic
     • +, -, *, /, ^
  – Logic
     • IF(boolean, value 1, value 2)
          – Returns value 1 if TRUE, value2 if FALSE
     • AND(boolean, boolean, boolean…)
          – Returns TRUE if all booleans are true, otherwise FALSE
     • OR(boolean, boolean, boolean…)
          – Returns TRUE if any booleans are true, otherwise FALSE
                        Formulas
• Types of formulas, cont
   – Functions, for example:
      • SUM(range of cells)
          – Returns the sum of the values in the range
          – SUM(A5:A10)
      • AVERAGE(range of cells)
          – Returns the average of the values in the range
      • STDEV(range of cells)
          – Returns the standard deviation
      • NORMINV(probability, mean of dist, SD of dist)
          – Returns the z-value associated with a given probability…
                         Formulas
• Types of formulas, cont
   – Functions, for example:
      • LN(number)
          – Returns the natural log of a number
      • ABS(number)
          – Returns the absolute value of a number
      • LEFT(text, number of characters=x)
          – Returns x number of characters from the text in the cell, starting
            at the left side…
      • NOW()
          – Returns the current date, time
                      Formulas
• Tips
  – Use parentheses
     • IF(SUM(A5:A10)>5,1,IF(C9=“y”,2,3))


  – Or do in multiple steps
   Cutting/Copying and Pasting
• Cutting and Copying treat formulas
  differently!
   Cutting and pasting formulas
• Excel assumes the cell references are
  ABSOLUTE, and you’re just moving the location
  of the formula cell

• Example
    Copying and pasting formulas
• Excel assumes the cell references are RELATIVE

• Example


• Shortcut: drag little square in the corner…
  Copying and pasting formulas
• If you want to FIX the position of a referenced
  cell, use $’s
   = A5 + $B$6


• Example
                       Examples
• Repetitive calculations
  – Back-transforming linear regression coefficients


• Complex calculations
  – 2 x 2 template


• Modeling
  – Mortgage calculator
  – Risk integrator
  – Figure 2 for LDL-lowering paper
             Spreadsheet style
• Formatting
  –   Text
  –   Column width
  –   Borders
  –   Placement of stuff on the page
             Spreadsheet style
• For models:
  –   Inputs on the left, in red
  –   Outputs on the right, in blue, boxed, bolded, etc
  –   Calculations on other sheets
  –   “Protect” all cells besides inputs
       • Format/Cells…/Protection
       • Tools/Protect
          Take home points
• Understand cells and formulas
• Use copy/paste with and without fixed cells
  ($A$45)
• Good formatting adds significant value to
  your spreadsheet
                        Lab 4
• Practice with:
   – A repetitive calculation spreadsheet
   – A complex calculation spreadsheet
   – Introduction to making a figure with Excel


• Due before lecture next week

• Extra credit puzzle challenge – 2x2 excel template
   – Due Sept 18th – email to mpletcher@epi.ucsf.edu
                  To come…
• Next lecture
  – Epidemiologic analysis with Stata
     • 2 x 2 tables, confounding and interaction
     • Epitab commands
     • Logistic regression introduction

				
DOCUMENT INFO
Shared By:
Categories:
Tags:
Stats:
views:0
posted:5/18/2012
language:
pages:30
fanzhongqing fanzhongqing http://
About