ARCHITECTURAL LEVEL
METRICS:
Foundations for a Disciplined
Approach
A. Mili, West Virginia University
ACKNOWLEDGEMENTS
Funded by NSF, under ITR program, for
2000-2003.
Collaboration with Dr H. Ammar (WVU),
Dr M. Shereshevsky (WVU) and Dr
Lionel Briand (Carleton U, Canada).
Co-funded by NASA IV&V, Fairmont,
WV.
Software Architectures: A Key
Paradigm
Codifying Best Practices into
recognizable abstractions.
Supporting various forms of Software
Reuse (PLE, CBSE, COTS).
Architecture: Captures scope of
reusable assets and inter-component
protocols.
PREMISES
Three Tier Approach: Qualitative
Attributes, Quantitative Factors,
Computable Metrics.
Combining Analytical and Empirical
methods.
Static and Dynamic Analysis of
Architectures, using Shannon’s Entropy.
AGENDA
Architectural Styles → Rapide
Representation.
Rapide Representation → Random
Variables of data/ control flow.
Random Variables → Metrics as
entropies.
A THREE TIER APPROACH
Qualitative Attributes: Quality
features that may be arbitrarily vague.
Quantitative Factors: Quantitative
functions that may be arbitrarily difficult
to compute.
Computable Metrics: Numeric factors
that are easy to compute from static/
dynamic analysis of the architecture.
QUALITATIVE ATTRIBUTES
Quality attributes of the architecture as
a product: integrity; completeness;
coherence; feasibility.
Quality Attributes of the architecture as
a Blueprint for software applications:
maintainability; testability; modifiability;
portability; reusability.
QUANTITATIVE FUNCTIONS
Error Propagation.
Change Propagation.
Requirements Propagation.
Design Propagation.
Expectation: that these enable us to
apprehend qualitative attributes.
COMPUTABLE METRICS
Coupling and Cohesion.
Static and Dynamic.
Data and Control.
For a given architecture: Four matrices of
metrics; cohesion on the diagonal;
coupling outside.
STATIC VS. DYNAMIC
Static Metrics: Entropy of the
vocabulary of information flow
within/ between components.
Dynamic Metrics: Entropy of the
language generated from that vocabulary
during a typical execution.
DATA VS. CONTROL
Data Interchange: carried by
messages, parameters, shared
data, etc. Usually high bandwidth.
Control Interchange: carried by method
calls, synchronization signals, event
notifications. Usually low bandwidth.
MODELING DECISIONS
Standardizing mapping from coupling to
cohesion (cohesion as self coupling).
Standardizing mapping from Static to
Dynamic (dynamic is language defined
by static vocabulary).
Standardizing mapping from Random
variable to Metric (Shannon’s entropy).
MODELING DECISIONS, II
Data vs. Control: Different ranges;
possibly different correlations.
Dynamic vs. Static: Static is easier to
compute, more reliable, but misses
relevant aspects.
Cohesion as Self Coupling: Gives
meaning to comparison (re: diagonality).
PROPAGATION
PROBABILITIES
Error Propagation:
EP(A,B) = P([B](x) ≠ [B](x’) |
x ≠ x’).
Change Propagation:
CP(A,B) = P(([B] ≠ [B’]) |
([A] ≠ [A’]) ∧ ([S] = [S’])).
PROPAGATION
PROBABILITIES, II
Requirements Propagation:
RP(A,B) = P(([B] ≠ [B’]) |
([A] ≠ [A’]) ∧ ([S] ≠ [S’])).
Design Propagation:
DP(A,B) = P((B ≠ B’) |
(A ≠ A’) ∧ ([S] = [S’])).
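The error propagation probability can be illustrated on a toy example. The sketch below assumes that EP(A,B) is read as the probability that B produces different results for two different inputs received from A, and that inputs are drawn uniformly from a small finite domain. The components `parity` and `identity` and the domain `range(8)` are hypothetical and not taken from the slides.

```python
from itertools import permutations

def error_propagation(component, inputs):
    """Fraction of ordered input pairs (x, x') with x != x' whose outputs differ.

    Assumes the inputs are distinct values, each equally likely.
    """
    pairs = list(permutations(inputs, 2))  # every ordered pair of distinct inputs
    differing = sum(1 for x, x2 in pairs if component(x) != component(x2))
    return differing / len(pairs)

def parity(x):
    # Hypothetical component B that keeps only the parity of what it receives.
    return x % 2

def identity(x):
    # Hypothetical component B that passes its input through unchanged.
    return x

domain = range(8)
ep_parity = error_propagation(parity, domain)
ep_identity = error_propagation(identity, domain)
print(ep_parity, ep_identity)
```

A component that discards information (here, all but one bit) masks part of the corrupted inputs, so its propagation probability falls below that of a component that preserves everything.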
STATIC DATA METRICS
Static Data Coupling:
SDR(A,B): Random variable that
represents the vocabulary of data
transfer from A to B.
SDC(A,B): H(SDR(A,B)).
STATIC DATA METRICS, II
SDR(A,B): an integer over 32 bits,
uniformly used ⇒ SDC(A,B) = 32 bits.
SDR(A,B): three independent integer
variables ⇒ SDC(A,B) = 96 bits.
SDR(A,B): an integer representing a
Boolean (a la C) ⇒ SDC(A,B) = 1 bit.
SDR(A,B): an array index 0..7, uniform
usage ⇒ SDC(A,B) = 3 bits.
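The figures above follow from Shannon's entropy applied to the distribution of values carried by SDR(A,B). A minimal sketch, assuming uniform usage and independent variables as in the examples; the helper names `shannon_entropy` and `uniform` are illustrative only.

```python
import math

def shannon_entropy(probabilities):
    """Shannon entropy in bits of a discrete distribution."""
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)

def uniform(n):
    """Uniform distribution over n values."""
    return [1.0 / n] * n

index_bits = shannon_entropy(uniform(8))    # array index 0..7, uniform usage
boolean_bits = shannon_entropy(uniform(2))  # integer used as a Boolean, both values equally likely
int32_bits = math.log2(2 ** 32)             # closed form for a uniform 32-bit integer
three_ints_bits = 3 * int32_bits            # entropies of independent variables add up

# Non-uniform usage lowers the entropy: a flag that is set 10% of the time.
skewed_bits = shannon_entropy([0.9, 0.1])
print(index_bits, boolean_bits, int32_bits, three_ints_bits, skewed_bits)
```

The last value shows why the usage distribution matters: the same Boolean carries less than half a bit when one value dominates.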
STATIC DATA METRICS, III
Static Data Cohesion:
SDR(A): shorthand for SDR(A,A).
Implicitly, SDR(A,A): data transferred
from A to A: state space of A.
Static Data Coupling, Cohesion: a Static
Data NxN Matrix. N: # of components.
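One way to picture the Static Data matrix is sketched below. The three components and their value distributions are invented for illustration; an absent flow is treated as a one-symbol vocabulary, which carries 0 bits.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy in bits of a discrete distribution."""
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)

# Hypothetical architecture: names and distributions are assumptions.
components = ["Parser", "Planner", "Logger"]
flows = {
    ("Parser", "Parser"): [1 / 16] * 16,   # state space of Parser: 16 equally likely states
    ("Parser", "Planner"): [0.5, 0.5],     # one Boolean passed from Parser to Planner
    ("Planner", "Planner"): [1 / 8] * 8,   # state space of Planner
    ("Planner", "Logger"): [0.25] * 4,     # four equally likely message kinds
    ("Logger", "Logger"): [0.5, 0.5],      # state space of Logger
}

def static_data_matrix(names, distributions):
    """Entry (i, j) is the entropy of the data sent from names[i] to names[j].

    Diagonal entries are cohesion values, the others are coupling values.
    """
    return [
        [entropy_bits(distributions.get((a, b), [1.0])) for b in names]
        for a in names
    ]

matrix = static_data_matrix(components, flows)
for name, row in zip(components, matrix):
    print(name, row)
```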
STATIC CONTROL METRICS
Static Control Coupling:
SCR(A,B): Random variable that
represents the vocabulary of control
transfer from A to B.
SCC(A,B): H(SCR(A,B)).
STATIC CONTROL
METRICS, II
SCR(A,B): A may call 8 methods of B,
with equal likelihood ⇒ SCC(A,B) = 3 bits.
SCR(A,B): A may call 2 methods of B,
with equal likelihood ⇒ SCC(A,B) = 1 bit.
SCR(A,B): A may call 1 method of B
⇒ SCC(A,B) = 0 bits. Dynamic control
metrics will distinguish from 0 methods.
STATIC CONTROL
METRICS, III
Static Control Cohesion:
SCR(A): shorthand for SCR(A,A).
Implicitly: control flow within A: evades
precise generic definition.
Static Control Coupling, Cohesion: Static
Control Matrix.
DYNAMIC METRICS
Static Random Variable: SR.
Dynamic Random Variable:
DR = plausible sequences on SR.
DDR: plausible data/parameter
sequences.
DCR: plausible call/control sequences.
DYNAMIC METRICS, II
Normalizing Dynamic Metrics: If a sample
execution produces a call sequence of 1000
method names, is it because
traffic between A and B is intense,
or the data sample is large?
Reflect the 1st dimension, normalize the 2nd.
DYNAMIC METRICS, III
Normalizing for the Size of Data: Let
Ln be the sequence generated by a
datum of size n. Rather than compute
H(Ln), we compute
lim n→∞ (H(Ln+1) - H(Ln)).
Does this limit exist? Under investigation.
lim n→∞ (1/n) H(Ln).
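The two normalizations can be compared on a small model. The sketch below assumes that a datum of size n generates a call sequence of n method names, and that the names follow a two-state Markov chain started in its stationary distribution. The method names and probabilities are invented for illustration.

```python
import math
from itertools import product

# Hypothetical call model: all names and numbers here are assumptions.
METHODS = ("read", "write")
INITIAL = {"read": 0.8, "write": 0.2}  # stationary distribution of the chain below
TRANSITION = {
    "read": {"read": 0.9, "write": 0.1},
    "write": {"read": 0.4, "write": 0.6},
}

def sequence_probability(sequence):
    """Probability of one call sequence under the Markov model."""
    probability = INITIAL[sequence[0]]
    for previous, current in zip(sequence, sequence[1:]):
        probability *= TRANSITION[previous][current]
    return probability

def block_entropy(n):
    """H(Ln): entropy in bits of the random sequence of n calls, by enumeration."""
    total = 0.0
    for sequence in product(METHODS, repeat=n):
        p = sequence_probability(sequence)
        if p > 0:
            total += p * math.log2(1.0 / p)
    return total

increments = [block_entropy(n + 1) - block_entropy(n) for n in range(1, 8)]
averages = [block_entropy(n) / n for n in range(1, 9)]
print(increments)
print(averages)
```

In this model the increment H(Ln+1) - H(Ln) is constant from n = 1 onward, while (1/n) H(Ln) approaches the same value only gradually. Both are insensitive to the size of the data sample, which is the purpose of the normalization.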
DYNAMIC METRICS, IV
Measuring the Size of Data: A
Generic Procedure.
- Well founded ordering on data space,
- Transitive root,
- Stratify data space,
- Size of a datum: ordinal of its stratum.
DYNAMIC METRICS, V
Metrics Dependent on Choice of
Ordering? Condition of Convergence
Weeds Out Poor Choices of Ordering.
Binary Trees: Height, vs. Number of
Nodes.
With number of nodes, limits are
defined. Sequence increment: traffic
generated by an extra node.
DYNAMIC METRICS, VI
Reflecting Dynamic Behavior: If A
calls a single method in B, static control
coupling is 0 bits. Dynamic control
coupling is the entropy of the random
variable that represents the length of
the (unitary) call sequence: a
meaningful non-trivial value.
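A minimal sketch of this contrast, assuming the length distribution is estimated from observed executions. The method name `poll` and the sampled lengths are hypothetical.

```python
import math
from collections import Counter

def empirical_entropy(observations):
    """Shannon entropy in bits of the empirical distribution of the observations."""
    counts = Counter(observations)
    total = len(observations)
    return sum((c / total) * math.log2(total / c) for c in counts.values())

# Static view: A can call only one method of B, so the vocabulary has one symbol.
static_bits = empirical_entropy(["poll"] * 8)

# Dynamic view: hypothetical lengths of the call sequence in eight sampled executions.
call_sequence_lengths = [1, 2, 2, 3, 3, 3, 3, 4]
dynamic_bits = empirical_entropy(call_sequence_lengths)
print(static_bits, dynamic_bits)
```

With these sample values the static figure is 0 bits and the dynamic figure is 1.75 bits, so the dynamic metric separates a component that is called a varying number of times from one that is never called.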
MEASURES OF
DIAGONALITY
Ideal Matrix: High diagonal values; low
values outside diagonal.
Absolute Diagonality: Distance to the
subspace of diagonal matrices.
Relative Diagonality: Absolute diagonality
divided by the norm of the matrix.
Captures modularity of the architecture by a
single scalar between 0 and 1.
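The slide does not fix a norm. The sketch below assumes the Frobenius norm, under which the closest diagonal matrix is obtained by zeroing the off-diagonal entries. With that reading, a relative diagonality of 0 means a perfectly diagonal matrix and values near 1 mean that coupling dominates. The sample matrices are invented.

```python
import math

def frobenius_norm(matrix):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(value * value for row in matrix for value in row))

def absolute_diagonality(matrix):
    """Frobenius distance from the matrix to the subspace of diagonal matrices."""
    off_diagonal = [
        [value if i != j else 0.0 for j, value in enumerate(row)]
        for i, row in enumerate(matrix)
    ]
    return frobenius_norm(off_diagonal)

def relative_diagonality(matrix):
    """Absolute diagonality divided by the norm of the matrix, in [0, 1]."""
    norm = frobenius_norm(matrix)
    return absolute_diagonality(matrix) / norm if norm > 0 else 0.0

# Hypothetical metric matrices (bits): cohesion on the diagonal, coupling outside.
modular = [[4.0, 1.0, 0.0], [0.0, 3.0, 2.0], [0.0, 0.0, 1.0]]
tangled = [[1.0, 4.0], [4.0, 1.0]]
diagonal = [[2.0, 0.0], [0.0, 5.0]]

modular_score = relative_diagonality(modular)
tangled_score = relative_diagonality(tangled)
diagonal_score = relative_diagonality(diagonal)
print(modular_score, tangled_score, diagonal_score)
```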
DEPLOYMENT PLAN:
Architectures to Rapide
UML as An Architectural
Representation: Rules for extracting
architectural information.
Five Architectural Styles
Independent Components: event based
systems; communicating processes.
Virtual Machines: Interpreter based.
Example: Rule Based Systems.
Data Flow Architectures: Data triggers
nodes. Examples: Batch; Pipe and Filter.
Data Centered Systems: Data Bases;
Blackboard Systems.
Call/ Return Architectures: Main/ Sub;
Remote Procedure Call; OO Systems;
Layered Systems.
Rapide Paradigms/ Constructs
Object Oriented Executable ADL.
Specifying and Prototyping Systems.
Collection of Interfaces, connections between
interfaces, and formal constraints.
Three types of connections: pipeline (),
agent(), and identification (to).
Execution model is event-based and supports
concurrency of node executions.
DEPLOYMENT PLAN:
Rapide to Random Variables
The most difficult/ contentious/ controversial
issues.
Mapping a Rapide Architectural description
into an NxN matrix of random variables.
Relies on information that is for the most part
available at the architectural level: Data/
Control Flow within and between nodes, with
relevant probability distributions.
Eliciting Interchange
Information
Data Flow within nodes: State
Variables.
Data Flow between nodes: Message
Passing, Parameters, Shared Data.
Control Flow between nodes:
Exchange of method calls.
Control Flow within nodes: Debate.
Eliciting Probability
Distributions
Specified Usage Probabilities.
Inferred Usage Probabilities (e.g., Stack).
Simulated Usage Probabilities.
Default Usage Probabilities (uniform
over data type, over known subrange).
DEPLOYMENT PLAN:
Random Variables to Metrics
Straightforward: Applying the Entropy
function.
Subject to Validation: Shannon vs.
Renyi. Perhaps other forms.
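For comparison, the two candidate entropy functions can be sketched side by side. This uses the standard definition of Renyi entropy of order alpha, which tends to Shannon's entropy as alpha approaches 1. The sample distributions are invented, and the order best suited to these metrics is left open, as on the slide.

```python
import math

def shannon_entropy(probabilities):
    """Shannon entropy in bits."""
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)

def renyi_entropy(probabilities, alpha):
    """Renyi entropy of order alpha in bits (alpha >= 0); alpha = 1 gives Shannon."""
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    if abs(alpha - 1.0) < 1e-12:
        return shannon_entropy(probabilities)
    power_sum = sum(p ** alpha for p in probabilities if p > 0)
    return math.log2(power_sum) / (1.0 - alpha)

uniform8 = [1.0 / 8] * 8      # hypothetical uniform usage over 8 values
skewed = [0.7, 0.2, 0.1]      # hypothetical non-uniform usage

uniform_collision = renyi_entropy(uniform8, 2.0)
skewed_hartley = renyi_entropy(skewed, 0.0)    # log2 of the number of possible values
skewed_shannon = renyi_entropy(skewed, 1.0)
skewed_collision = renyi_entropy(skewed, 2.0)
print(uniform_collision, skewed_hartley, skewed_shannon, skewed_collision)
```

On a uniform distribution every order gives the same value, so the choice between Shannon and Renyi only matters when usage is non-uniform.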
Selection of metrics formulas dependent
on validation step. Anticipated: logical/
numeric/ probabilistic relationships
between CM and QF.
AUTOMATION PLAN:
Rapide to Random Variables
Syntax Directed Translation (Yacc-like) of
Rapide declarations into Ensemble
definitions.
Bare Rapide Parser, progressively extended.
Investigation: Probabilistic Annotations of
Rapide, using closed PD vocabulary (closed
wrt aggregate declarations).
Outcome: A square matrix of random variables.
AUTOMATION PLAN:
Random Variables to Metrics
Deriving Matrix of metrics from Matrix of
random variables, using Shannon/
Renyi.
Assessing Diagonality, other properties.
Assessing/ Correlating/ Providing
Bounds for Quantitative Factors.
Apprehending/ Providing Ratings for
Qualitative Attributes.
VALIDATION PLAN:
Analytical Validation
Validating Computable Metrics with
respect to Quantitative Factors:
Documented approximations.
Under Weak Hypothesis, found equality
between EP(A,B) and Renyi entropy of
SDR(A,B).
VALIDATION PLAN:
Empirical Validation
Case Study: HCS (Hub Control
Software, ISS); UML descriptions.
Map to Rapide, Compute Metrics.
Correlate with measurable propagation
probabilities, in light of system logs.
Other examples: a Client Server, a
Pacemaker, a KWIC index.
CONCLUSION AND
PROSPECTS
A Three-Tier Quality Model.
A Three-Step Quantification Procedure.
A Three-Pronged Methodology
(Analytical/ Empirical/ Experimental).
Preliminary Work; tentative/
speculative.
Looks easier (nicer?) than it is.