9/10/2012 B.Ramamurthy 1
October 25, 2011
Location 107 Talbert
Pencils, pens and erasers.
This is a closed book exam.
No other material is allowed.
Arrive on time; no extra time will be
given if you arrive late.
Defining data-intensive computing (as in Fourth Paradigm: up
Enabling Technologies (ET):
ET1: Web service
ET2: Special data structures and algorithms
MapReduce model: components: Mapper, Reducer, Partitioner,
Combiner; execution framework, shuffle and sort
Hadoop (HDFS): as in the Yahoo site: Ch. 1, 2, 4; Ch. 5 partitioner only.
Problem solving with MR:
Chapters 1-4 in Lin and Dyer's text
Tom White's analysis of a web log (don't ask me for the
handout; go find it)
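
For practice, the MapReduce components listed above (Mapper, Combiner, Partitioner, shuffle and sort, Reducer) can be sketched as a small in-memory word count. This is only a study aid under stated assumptions, not Hadoop code: the names `run_job` and `group_by_key`, the toy partitioning rule, and the sample records are all made up for illustration, and each input record is treated as its own map task.

```python
from collections import defaultdict

def mapper(doc_id, text):
    """Emit (word, 1) for every word in one input record."""
    for word in text.lower().split():
        yield word, 1

def combiner(word, counts):
    """Local aggregation of one map task's output before it is shuffled."""
    yield word, sum(counts)

def partitioner(key, num_reducers):
    """Pick the reducer for a key. Frameworks usually hash the key;
    this toy rule (an assumption) keeps the run reproducible."""
    return sum(ord(c) for c in key) % num_reducers

def reducer(word, counts):
    """Sum all partial counts that arrived for one word."""
    yield word, sum(counts)

def group_by_key(pairs):
    """Collect values per key, as the shuffle does for each reducer."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def run_job(records, num_reducers=2):
    partitions = [[] for _ in range(num_reducers)]
    # Map phase: map, combine locally, then route each pair to a partition.
    for doc_id, text in records:
        local = group_by_key(mapper(doc_id, text))
        for word, counts in local.items():
            for key, value in combiner(word, counts):
                partitions[partitioner(key, num_reducers)].append((key, value))
    # Reduce phase: within each partition, group by key and visit keys in sorted order.
    output = {}
    for part in partitions:
        grouped = group_by_key(part)
        for word in sorted(grouped):
            for key, value in reducer(word, grouped[word]):
                output[key] = value
    return output

# Hypothetical input: two tiny "documents".
records = [(1, "a b a"), (2, "b c")]
result = run_job(records)
print(result)
```

Tracing the two records by hand (what the combiner emits for record 1, which partition each word lands in) is a quick way to check your understanding of each module's role.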
Defining data-intensive computing: J. Gray
Given a problem, solve it using MR
Given an MR solution, provide a numerical example
Best practices and design patterns described in the
Web services and project 1
Hadoop (HDFS) architecture
Functions of various MR modules
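
As an illustration of the kind of numerical example asked for above, here is a small sketch of computing a mean with MR. It assumes one key whose values are split across two map tasks (the numbers are made up) and uses the common approach of having the combiner emit partial (sum, count) pairs, since averaging the per-mapper averages generally gives the wrong answer. The function names are hypothetical.

```python
def combine(values):
    """Combiner: collapse one map task's values into a partial (sum, count)."""
    return (sum(values), len(values))

def reduce_mean(partials):
    """Reducer: add up the partial sums and counts, then divide once."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count

# Hypothetical values for a single key, as seen by two map tasks.
mapper_outputs = [[1, 2, 3], [10]]

partials = [combine(values) for values in mapper_outputs]
correct_mean = reduce_mean(partials)

# Mean of the per-mapper means, shown only to contrast with the result above.
naive_mean = sum(sum(v) / len(v) for v in mapper_outputs) / len(mapper_outputs)

print(partials, correct_mean, naive_mean)
```

Here the partial pairs are (6, 3) and (10, 1), so the reducer computes 16 / 4 = 4.0, while the mean of the two per-mapper means is (2.0 + 10.0) / 2 = 6.0.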
How to study?
Make a list of all material to study.
Study the material
Practice writing pseudocode for the
Use block diagrams and numerical
examples when necessary