# Introduction to R exercises

```
Introduction to R exercises
Renée de Menezes and Judith Boer, May 2008

These exercises are meant to familiarise you with the way R works and with its syntax,
and to teach some basics: how to log-transform variables, calculate sums and means, and
make graphs. Note that the assignment operator <- can be replaced by = in the commands
below. To scroll through earlier commands use Arrow Up and Down. To move within a
command line use Arrow Left and Right, or End and Home.

Exercise 0: Getting started
# Start R and copy & paste one line at a time:

X <- 3
X
Y <- X+2
Y
Z <- c(10,100,1000,10000,100000,1000000)
Z
W1 <- log10(Z)
plot(Z,W1)
W2 <- log(Z)
plot(Z,W2)
exp(1)
exp(2)
log(exp(1))
log(exp(2))
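
# A small sketch added for illustration (toy.z is a made-up vector, not part
# of the exercise): logarithms to different bases differ only by a constant
# factor, which is why the two plots above have the same shape.
toy.z <- c(10, 100, 1000)
log10(toy.z)
log(toy.z) / log(10)
# all.equal() compares numbers up to rounding error
all.equal(log10(toy.z), log(toy.z) / log(10))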

# Working with matrices
mx <- matrix(c(1,2,3,4,5,6,7,8,9),3,3)
mx
c1 <- mx[,1:2]
c1
r1 <- mx[2:3,]
r1
c2 <- cbind(mx[,1],mx[,3])
c2
r2 <- rbind(mx[1,],mx[3,])
r2
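
# A small sketch added for illustration (m.col and m.row are made-up names):
# matrix() fills column by column unless byrow = TRUE is given, and a negative
# index drops the corresponding row or column.
m.col <- matrix(1:9, 3, 3)
m.row <- matrix(1:9, 3, 3, byrow = TRUE)
m.col
m.row
# Element in row 2, column 3 of each matrix
m.col[2, 3]
m.row[2, 3]
# Dropping column 2 gives the same columns as cbind(m.col[,1], m.col[,3])
m.col[, -2]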

Exercise 1: Read table from text file with expressions only
Data file: exercise1_signal.txt
This file contains Affymetrix expression measurements for 4 arrays, the first two under
control conditions and the last two under treatment. There are n measurements in each
array. The file is formatted as follows: the first column contains the row labels (probe
sets), and subsequent columns contain the expression measurements, in the order control,
control, treatment, treatment. The first row contains the array labels for the second
column onwards (there is no label for the column of row labels). The entries are separated
by tabs, which R handles by default. Before starting to work in R, open the data file with
Notepad to check that the structure really is as described above.
# After checking that all is well, start R and paste one line at a time:

# Set the working directory using the menu File > Change dir... and Browse.
# The directory must already exist; you cannot create it here.

# Or type your working directory
setwd("indicate here your working directory; use forward slashes!!!")
# For example:
setwd("C:/the_relevant_directory/R exercises")

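# Read in the file with expression data present in your working directory.
# The read commands are missing from this copy of the handout; the two calls
# below are an assumption: dataset1 uses the probe set labels as row names,
# dataset2 keeps them as an ordinary first column.
dataset1 <- read.table("exercise1_signal.txt")
dataset2 <- read.table("exercise1_signal.txt", row.names = NULL)
dim(dataset1)
dim(dataset2)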
# Note that you can extract the numbers of row and columns separately
nrow(dataset1)
ncol(dataset1)
# Examining the first 5 rows of the datasets read
dataset1[1:5,]
dataset2[1:5,]
# Note that only the expression measurements were read into dataset1,
# using the first row as header (column names) and the first column as row names.
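
# A sketch added for illustration, using a made-up toy file (not the exercise
# data): when the first line has one field fewer than the other lines,
# read.table() treats it as the header and uses the first column as row names.
toy.file <- tempfile(fileext = ".txt")
writeLines(c("c1\tc2\tt1\tt2",
             "probeA\t10\t12\t40\t44",
             "probeB\t100\t90\t95\t105"), toy.file)
toy <- read.table(toy.file)
dim(toy)
rownames(toy)
colnames(toy)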

# We split expressions under treatment and control conditions
# into two matrices:
cont <- data.matrix(dataset1[,1:2])
trea <- data.matrix(dataset1[,3:4])
plot(cont[,1],cont[,2])
plot(trea[,1],trea[,2])
# Note that you can draw two graphs side by side.
# First, define the grid of panels in which the figures are to be drawn;
# then, produce the figures in the desired order, as usual.
par(mfrow = c(1,2))
plot(cont[,1],cont[,2])
plot(trea[,1],trea[,2])
# If you now wish to carry on with only one graph at a time,
# you need to reset the grid using
par(mfrow = c(1,1))
# Do you see any pattern in these graphs? What sort of pattern?
# Is it the same in the treated and control groups?

# Now compute the base-2 logarithms of the measurements and make graphs.
lcont <- log2(cont)
ltrea <- log2(trea)
plot(lcont[,1],lcont[,2])
plot(ltrea[,1],ltrea[,2])
# What pattern do you see now?
# Is it different from the one in the original data?
# Is it different comparing treated and control groups?
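
# A sketch with made-up numbers (toy.signal is not the exercise data): on the
# log2 scale every doubling of the signal adds 1, so low and high intensities
# are spread more evenly than on the original scale.
toy.signal <- c(50, 100, 200, 400, 6400)
log2(toy.signal)
# Differences between neighbouring values on the log2 scale
diff(log2(toy.signal))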

# The data matrix can also be used directly to produce plots
# of all arrays against each other.
# This is useful because if a specific array had
# hybridisation problems, these should be visible in this graph.
pairs(dataset1)
# The same, now for log-transformed values
pairs(log2(dataset1))

# You can save a graph as a file, for example in pdf format.
# First, define the output format using pdf("myfigure.pdf"),
# then produce the graph(s) as usual,
# and finish by terminating the device driver using dev.off()
# For example for a pdf in landscape orientation:
pdf("scatterplots2.pdf",height=5,width=8)
par(mfrow = c(1,2))
plot(cont[,1],cont[,2])
plot(lcont[,1],lcont[,2])
dev.off()
# These commands are useful when producing many graphs at a time
# Alternatively, you can save graphs from the R graphics window
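
# A sketch added for illustration, with toy data and a temporary file name
# (toy.pdf, toy.x and toy.y are made up): the file is only complete after
# dev.off(), so it is worth checking that it exists afterwards.
toy.x <- c(10, 100, 1000, 10000)
toy.y <- c(12, 90, 1100, 9500)
toy.pdf <- tempfile(fileext = ".pdf")
pdf(toy.pdf, height = 5, width = 8)
par(mfrow = c(1, 2))
plot(toy.x, toy.y)
plot(log2(toy.x), log2(toy.y))
dev.off()
file.exists(toy.pdf)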

# Compute the mean expression level for each probeset per condition:
lmean.c <- (lcont[,1]+lcont[,2])/2
lmean.t <- (ltrea[,1]+ltrea[,2])/2

# Alternatively, use function "apply" to take means across the rows (1)
lmean.c <- apply(lcont, 1, mean)
# See what comes out
lmean.c[1:3]
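
# A sketch added for illustration on a toy matrix (toy.m is made up, not the
# exercise data): rowMeans() gives the same result as apply(x, 1, mean) and
# is usually faster on large matrices.
toy.m <- matrix(c(1, 2, 3, 5, 6, 7), nrow = 3)
toy.m
apply(toy.m, 1, mean)
rowMeans(toy.m)
all.equal(apply(toy.m, 1, mean), rowMeans(toy.m))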

# You can make a scatter plot of these two means:
plot(lmean.c,lmean.t)

# To see all objects in your workspace
ls()

# To save the workspace
save.image("GettingStarted.Rdata")

# To save the command history
savehistory("GettingStarted.Rhistory")
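
# A sketch added for illustration, with a made-up object and a temporary file
# name (toy.obj, toy.rdata): save() stores selected objects, whereas
# save.image() stores the whole workspace; load() restores them later.
toy.obj <- c(1, 2, 3)
toy.rdata <- tempfile(fileext = ".Rdata")
save(toy.obj, file = toy.rdata)
rm(toy.obj)
load(toy.rdata)
toy.obj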

# To quit R
q()

```