TO MEASURE OR NOT TO
MEASURE?
By
Baruch Lev
New York University
blev@stern.nyu.edu
www.baruch-lev.com
Any use of Baruch Lev's material for public presentation
should receive his written confirmation
December 2010
THINGS TO COME
Measures, measures, everywhere.
Why the urge to measure?
So what? Let them measure.
Costs of inadequate measures.
Criteria for useful measures.
Underlying knowledge/theory
Representativeness
Reliability (Objectivity)
Integrity
Validity (Performance)
What to do with useless measures?
MEASURES, MEASURES EVERYWHERE
(Definition: Assigning numbers to phenomena or individuals in a systematic
way to represent their properties.)
“Shiur Komah,” the dimensions of “god’s body.”
People’s optimism.
Parents’ love.
Trust in societies.
Corporate reputation.
People’s social connections.
Professors’ teaching effectiveness.
Researchers’ scientific standing.
The value of life.
How much is saving the planet worth?
The social benefits of drugs.
CONTINUED
“Level 3 Fair Value” of assets.
How happy are you?
Envy in organizations.
“Good companies.”
You get the drift.
Everything (at least in social sciences) can be measured.
But would you rely—put money—on any of the above
measures?
(Hire optimistic, well-connected people, promote high-impact
scientists, refuse to work for “envious” organizations, invest
in reputable corporations, sell high “Level 3” shares, move to
“trust societies”?)
WHY THE URGE TO MEASURE?
Science/knowledge advances by measurement (e.g., empirically
validating relativity theory).
Measurement replaces intuition/guessing in the accumulation
of knowledge (corporate income → systematic performance
evaluation → replaces intuition in investment decisions).
Goals/objectives of policies are stated as measures, and effectiveness
is assessed against quantitative goals (Obama’s $1 trillion
“rescue plan” expected to yield 8% unemployment).
Daily decisions based on measures (invest in high return
mutual funds; enroll in MBA programs with high graduate
salaries; work in profitable companies; go to high-rated
movies; visit “low failure” doctors/hospitals).
What is not measured is not managed.
WHO CARES? LET THEM MEASURE
Wrong! Measurements have consequences: they lead to
costly decisions/actions.
Increase of unemployment rate in the U.S. from 9.5% to
9.6% (Fall 2010) → extending unemployment benefits.
Accounting measurements have strong effects on
managerial decisions: Expensing stock options (2005).
Low-rated professors → denied tenure.
Students with low GMAT rejected from top business
schools.
SEC: Reporting corporate damage to the environment.
Measurement affects behavior
CRITERIA FOR USEFUL MEASURES
Two examples for concrete discussion:
Instructors’ effectiveness: Students’ ratings.
Value of patents: forward citations, scope of
claims, science-based, other patent attributes.
CRITERION 1: UNDERLYING
KNOWLEDGE/THEORY
What is the body of systematic knowledge, or theory
underlying the design of the measure and directing its
use?
How much do we know about people’s optimism?
Instructors’ rating:
Little is known about what makes for effective university instruction:
Course organization?
Instructor enthusiasm?
Easy/hard grading?
Socratic method?
Morning/afternoon?
Strict/lenient teaching?
Patent value:
The science of bibliometrics: inferring value/effectiveness/contribution
from references/citations to work.
CRITERION 2: REPRESENTATIVENESS
Correspondence of measure to what is presumed to be
measured.
• Perfect correspondence: Stock prices.
• Poor correspondence: Manipulated earnings (Enron,
WorldCom); unemployment (stopped seeking
employment).
Instructors’ rating:
Weak correspondence. By course-end, students have insufficient
information to evaluate the instructor.
Patent value:
Moderate-to-good correspondence. Citation intensity reflects the
contribution of the patent to technological progress.
CRITERION 3: RELIABILITY (OBJECTIVITY)
Repeat measurements, or measurements by different
persons, are highly correlated.
• Perfect reliability: Air temperature
• Poor reliability: Values of houses on sale
Instructors’ rating:
Poor reliability. My 5.9 and 6.5 student ratings last year.
Patent value:
Perfect reliability. Based on Patent office data.
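The reliability criterion above (repeat measurements, or measurements by different persons, are highly correlated) can be illustrated with a small calculation. The following is only a sketch: the function name `pearson_r` and all the numbers are hypothetical, invented for illustration, and the Pearson coefficient is one common choice of correlation, not necessarily the one the author has in mind.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long series of measurements."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: two thermometers reading the same five days (deg C).
thermometer_a = [20.1, 22.4, 25.0, 18.3, 30.2]
thermometer_b = [20.0, 22.5, 25.1, 18.2, 30.3]

# Hypothetical data: two appraisers valuing the same five houses ($ thousands).
appraiser_a = [300, 450, 500, 380, 620]
appraiser_b = [420, 400, 610, 350, 500]

print("thermometers:", round(pearson_r(thermometer_a, thermometer_b), 3))
print("appraisers:  ", round(pearson_r(appraiser_a, appraiser_b), 3))
```

With made-up numbers like these, the two thermometers agree almost perfectly while the two appraisers agree only loosely, which is the contrast the slide draws between air temperature and values of houses on sale.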
CRITERION 4: INTEGRITY
Measure is not subject to bias, manipulation, or political
agendas.
• Perfect integrity: instrumental measurements in sport
events.
• Poor integrity: Corporate earnings, climate change,
production in the Soviet Union.
Instructors’ rating:
Poor-Moderate integrity. Affected by students’ biases.
Patent value:
High integrity. Hard to manipulate.
CRITERION 5: VALIDITY
Putting the measure to use: The measure’s performance in
prediction, or decision consequence (“the test of the pudding
is in the eating”)
• Good validity: R&D predicting corporate growth.
• Poor validity: “Good companies” don’t predict positive
stock returns.
Instructors’ rating:
Unknown: Instructors’ rating predicts students’ success?
Patent value:
High validity: Patent value predicts corporate success, stock returns,
etc.
Don’t tolerate measures which fail the usefulness criteria.
SO, WHAT’S TO BE DONE WITH POOR/DEFICIENT
MEASURES?
Have the courage to say: “this is unreliable”
(measuring trust), or “this is outright nonsense.”
Reject publications with dubious measures.
“Harden” soft measures (measure performance by
cash flows when earnings are of low quality).
Indicate range of error or uncertainty like polling
data (separate earnings based on facts from
estimates).
Always apply the “embarrassment test”
(measurements of god).
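The suggestion above to indicate a range of error "like polling data" can be sketched numerically. This is a minimal illustration, assuming a simple random sample and the usual normal approximation for a proportion; the function name `margin_of_error` and the poll figures are hypothetical, not taken from the presentation.

```python
from math import sqrt


def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    p: observed proportion (between 0 and 1)
    n: sample size
    z: critical value; 1.96 corresponds to roughly 95% confidence
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return z * sqrt(p * (1.0 - p) / n)


# Hypothetical poll: 52% approval among 1,000 respondents.
p_hat, sample_size = 0.52, 1000
moe = margin_of_error(p_hat, sample_size)
print(f"{p_hat:.0%} plus or minus {moe:.1%}")
```

Reporting the measure together with its range tells the reader how much weight the number can bear, which is the point of the bullet.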
The Ultimate Measure—
Time to Lunch: O