Why the existing theory of software reliability must be discarded... and what should replace it?
Aditya P. Mathur
Professor, Department of Computer Science,
Associate Dean, Graduate Education and International Programs
Purdue University
Wednesday July 26, 2006. Microsoft@Redmond, WA, USA.
Reliability
Probability of failure-free operation in a given environment over a
given time.
Mean Time To Failure (MTTF)
Mean Time To Disruption (MTTD)
Mean Time To Restore (MTTR)
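As a small illustration of the metrics listed above, the sketch below computes MTTF and MTTR as plain averages over observed intervals. The function name `mttf_and_mttr` and the sample figures are hypothetical, introduced only for this example; they are not data from the talk.

```python
from statistics import mean

def mttf_and_mttr(intervals):
    """Return (MTTF, MTTR) from (hours_until_failure, hours_to_restore) pairs."""
    uptimes = [up for up, _ in intervals]
    repairs = [rep for _, rep in intervals]
    return mean(uptimes), mean(repairs)

# Hypothetical observations: hours of operation before each failure,
# followed by the hours needed to restore service.
observed = [(120.0, 2.0), (80.0, 1.0), (100.0, 3.0)]
mttf, mttr = mttf_and_mttr(observed)
print(f"MTTF = {mttf} h, MTTR = {mttr} h")
```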
Operational profile
Probability distribution of usage of features and/or scenarios.
Captures the usage pattern with respect to a class of customers.
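One way to picture an operational profile is as a mapping from features to usage probabilities, from which usage sessions can be sampled. The sketch below assumes a made-up profile; the feature names, the probabilities, and the helper `sample_usage` are all hypothetical.

```python
import random

# Hypothetical profile for one class of customers: feature -> probability of use.
profile = {"open_file": 0.5, "save_file": 0.3, "print": 0.15, "export_pdf": 0.05}

def sample_usage(profile, n, seed=0):
    """Draw n feature invocations according to the profile's probabilities."""
    rng = random.Random(seed)  # seeded so the sample is reproducible
    features = list(profile)
    weights = [profile[f] for f in features]
    return rng.choices(features, weights=weights, k=n)

session = sample_usage(profile, 1000)
print({f: session.count(f) for f in profile})
```

A different class of customers would need a different `profile`, which is why the profile is a moving target for reliability estimation.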
Reliability estimation
[Figure: operational profile -> random or semi-random test generation -> test execution -> failure data collection -> reliability estimation -> decision process]
Issues: Operational profile
Variable. It becomes known only after customers have access to
the product. It is a stochastic process... a moving target!
Random test generation requires an oracle. Hence it is generally
limited to specific outcomes, e.g., crash, hang.
Issues: Failure data
Should we analyze the failures?
If yes, then once the cause is removed the reliability
estimate is invalid.
If the cause is not removed because the failure is a “minor
incident”, then the reliability estimate corresponds to irrelevant
incidents.
Issues: Model selection
Rarely does a model fit the failure data.
Model selection becomes a problem. 200 models to choose
from? New ones keep arriving! More research papers!
Issues: Markovian models
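[Figure: Markov chain over components C1, C2, C3 with transition probabilities $p_{12}$ (C1 to C2), $p_{21}$ (C2 to C1), $p_{13}$ (C1 to C3), and $p_{32}$ (C3 to C2), where $p_{12} + p_{13} = 1$.]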
Markov chain models suffer from a lack of estimates of transition
probabilities.
To compute these probabilities, you need to execute the
application.
During execution you obtain failure data. Then why proceed further
with the model?
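To make the point concrete, the sketch below estimates transition probabilities by counting component-to-component transitions in execution traces. The traces and the helper `estimate_transitions` are hypothetical; the component names C1, C2, C3 follow the figure. The estimates exist only once the application has been executed.

```python
from collections import Counter, defaultdict

def estimate_transitions(traces):
    """Estimate P(next component | current component) from observed traces."""
    counts = defaultdict(Counter)
    for trace in traces:
        # Count each consecutive pair (source component, destination component).
        for src, dst in zip(trace, trace[1:]):
            counts[src][dst] += 1
    probs = {}
    for src, outgoing in counts.items():
        total = sum(outgoing.values())
        probs[src] = {dst: n / total for dst, n in outgoing.items()}
    return probs

# Hypothetical execution traces: the sequence of components visited in each run.
traces = [
    ["C1", "C2", "C1", "C3", "C2"],
    ["C1", "C2", "C1", "C2"],
]
probs = estimate_transitions(traces)
print(probs["C1"])  # outgoing probabilities from C1 sum to 1
```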
Issues: Assumptions
Software does not degrade over time; memory leak is not
degradation and is not a random process; a new version is a
different piece of software.
Reliability estimate varies with operational profile. Different
customers see different reliability.
Can we not have a reliability estimate that is independent of
operational profile?
Can we not advertise quality based on metrics that are a true
representation of reliability, not with respect to a subset of features
but over the entire set of features?
Sensitivity of Reliability to test adequacy
                    Coverage low    Coverage high
Reliability high    Risky           Desirable
Reliability low     Undesirable     Suspect model
Problem with existing approaches to reliability estimation.
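The four quadrants of the chart can be read as a lookup from (reliability estimate, coverage) to an interpretation. The sketch below encodes only that lookup; what counts as "high" is left to the caller, since the slides give no thresholds, and the names `QUADRANTS` and `interpret` are hypothetical.

```python
# Quadrants of the reliability/coverage chart, keyed by
# (estimated reliability is high, test coverage is high).
QUADRANTS = {
    (True, True): "desirable",
    (True, False): "risky",
    (False, True): "suspect model",
    (False, False): "undesirable",
}

def interpret(reliability_high: bool, coverage_high: bool) -> str:
    """Return the quadrant label for a reliability estimate and its coverage."""
    return QUADRANTS[(reliability_high, coverage_high)]

# A high reliability estimate backed by low coverage should not be trusted.
print(interpret(reliability_high=True, coverage_high=False))
```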
Basis for an alternate approach
Why not develop a theory based on coverage of testable items and
test adequacy?
Testable items: variables, statements, conditions, loops,
data flows, methods, classes, etc.
Pros: Errors hide in testable items.
Cons: Coverage of testable items is inadequate. Is it a good
predictor of reliability?
Yes, but only when used carefully. Let us see what happens when
coverage is not used or not used carefully.
Saturation Effect
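[Figure: reliability (vertical axis) versus testing effort (horizontal axis), showing true reliability (R), estimated reliability (R'), and a saturation region for each testing technique. Levels Rf, Rd, Rdf, Rm (true) and R'f, R'd, R'df, R'm (estimated) are marked for functional, decision, dataflow, and mutation testing; the start and end of each saturation region are marked on the effort axis.]
Functional, decision, dataflow, and mutation testing provide test adequacy criteria.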
Modeling an application
[Figure: an application modeled as a set of components running on the OS, linked by component interactions.]
Reliability of a component
Reliability, probability of correct operation, of function f based
on a given finite set of testable items.
$R(f) = \alpha \cdot (\text{covered}/\text{total})$, $0 < \alpha < 1$.
Issue: How to compute $\alpha$?
Approach: Empirical studies provide an estimate of $\alpha$ and
its variance for different sets of testable items.
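The component formula can be sketched as a coverage ratio scaled by an empirically estimated factor, written `alpha` here (a name assumed for this example). The function `component_reliability` and the sample numbers are hypothetical; in practice the factor would come from the empirical studies mentioned above.

```python
def component_reliability(covered: int, total: int, alpha: float) -> float:
    """Coverage-based reliability: alpha * (covered / total), with 0 < alpha < 1."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= covered <= total:
        raise ValueError("covered must lie between 0 and total")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return alpha * covered / total

# Hypothetical component: 180 of 200 testable items covered, alpha assumed to be 0.9.
r = component_reliability(covered=180, total=200, alpha=0.9)
print(round(r, 4))
```

Because the factor is strictly below 1, even full coverage of the chosen testable items never yields a reliability of 1.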
Reliability of a subsystem
$C = \{f_1, f_2, \ldots, f_n\}$ is a collection of components that collaborate
with each other to provide services.
$R(C) = g(R(f_1), R(f_2), \ldots, R(f_n), R(I))$
Issue 1: How to compute R(I), reliability of component
interactions?
Issue 2: What is g ?
Issue 3: Theory of systems reliability creates problems when
(a) components are in a loop and (b) components depend on each
other.
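The slides leave the combining function g open. Purely as an illustration, the sketch below assumes g is a product, which treats the components and their interactions as independent and all required for service. That is exactly the kind of assumption Issue 3 warns about, so this is one candidate for g, not the proposed answer. The name `subsystem_reliability` and the numbers are hypothetical.

```python
from math import prod

def subsystem_reliability(component_rs, interaction_r):
    """One candidate g: the product of component and interaction reliabilities.

    ASSUMPTION: components are independent and all must work (a series system).
    This does not hold when components are in a loop or depend on each other.
    """
    return prod(component_rs) * interaction_r

# Hypothetical values for R(f1), R(f2), and R(I).
r_c = subsystem_reliability([0.9, 0.8], interaction_r=0.95)
print(round(r_c, 4))
```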
Scalability
Is the component-based approach scalable?
Powerful coverage measures lead to better reliability
estimates, whereas measurement of coverage becomes
increasingly difficult as more powerful criteria are used.
Solution: Use a component-based, incremental approach.
Estimate reliability bottom-up. No need to measure coverage
of components whose reliability is known.
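A bottom-up estimate can be sketched as a walk over a component tree that measures coverage only where no reliability is already known. Everything here is hypothetical: the `Node` type, the product used to combine children, the scaling factor `alpha`, and the sample values.

```python
from dataclasses import dataclass, field
from math import prod
from typing import List, Optional

@dataclass
class Node:
    name: str
    known_reliability: Optional[float] = None  # set when already estimated
    covered: int = 0                           # testable items covered (leaf)
    total: int = 0                             # testable items in total (leaf)
    interaction_reliability: float = 1.0       # R(I) among the children
    children: List["Node"] = field(default_factory=list)

def estimate(node: Node, alpha: float, measured: List[str]) -> float:
    """Estimate reliability bottom-up; record which leaves needed coverage data."""
    if node.known_reliability is not None:
        return node.known_reliability          # reuse; no coverage measurement
    if node.children:
        # ASSUMPTION: children combine as a product, one possible choice of g.
        parts = [estimate(child, alpha, measured) for child in node.children]
        return prod(parts) * node.interaction_reliability
    measured.append(node.name)                 # coverage had to be measured here
    return alpha * node.covered / node.total

parser = Node("parser", known_reliability=0.95)
renderer = Node("renderer", covered=90, total=100)
app = Node("app", interaction_reliability=0.99, children=[parser, renderer])

measured: List[str] = []
r_app = estimate(app, alpha=0.9, measured=measured)
print(round(r_app, 6), measured)
```

Only `renderer` ends up in `measured`, which is the saving the incremental approach is after.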
Next steps
Develop a component-based theory of reliability.
Base the new theory on existing work in software testing and
reliability.
Experiment with large systems to investigate the
applicability of the theory and its effectiveness in predicting and
estimating various reliability metrics.