The If Then Else Framework

					                                  The If-Then-Else Framework:
                              Coding branching logic without nested if’s
                             Part III: Enhancements To Support Scaling

                                      Paul Corazza (pcorazza@kdsi.net)



                Summary

                In the first two parts of the article, Paul described in detail how to use the
                If-Then-Else framework to code complex branching logic. In this final
                part, he analyzes the performance of the framework and shows how to
                optimize algorithms and implementation choices, and how to minimize the
                risks involved in using the framework in larger-scale projects. The final
                product is a sleek, industry-ready upgrade of the prototype presented in
                previous articles, complete with new user options, a performance test
                harness, and automated safety features capable of handling thousands of
                conditions and rules.



     The If-Then-Else Framework is a tool for coding complex branching logic in a maintainable way. In
the first two parts of this series, I described in some detail how to use the framework in a simple but typical
case. I used this example as a means to illustrate most of the framework's features; in actual practice,
however, the effort required to use the framework is more aptly rewarded when used in a more challenging
context.
     The purpose of this final installment is to study and solve the problems that arise when you do venture
into more challenging territory with this framework. The refinements that I will discuss address the
consequences of increasing the number of conditions, rules and actions. Using the framework without these
refinements, you will find that, as the number of conditions, rules, and actions grows, performance can
degrade rapidly, the order of evaluation becomes nearly impossible to keep straight, and the likelihood of
introducing a pair of inconsistent rules increases quickly. In this article, I will identify the performance
bottlenecks--involving in some cases algorithm selection and in others, Java implementation issues--and
rewrite these critical sections; I will introduce a sequencing mechanism that will automate the bookkeeping
involved in adhering to the order of evaluation; and I will introduce an automated consistency checker that
will throw an exception if inconsistent rules have been loaded. And after all of it, the fundamental steps for
using the framework, already described in Parts 1 and 2, will remain unchanged.
     The discussion here will provide the reader with a number of concrete advantages. First of all, the final
product of these refinements will be an industry-ready package that you can use reliably in any project that
requires implementation of complex branching logic. Also, having worked through the modifications, you
will have an advanced level of familiarity with the framework that will enable you to use its features to the
fullest. Better still, you will see useful Java solutions to design and implementation problems -- these
specific solutions, and the general problem-solving approaches that spawn them, can be reused in a variety
of programming contexts. In that sense, this last part of the series offers a smorgasbord of advanced Java
engineering techniques.
     The focal point of the discussion will be a set of questions that I raised at the end of Part 2 and that I
mentioned briefly above. Before formally stating these, I'll review some of the highlights of previous
installments.
     In order to get started with the framework, you perform a simple analysis of the logic that you need to
code; in the analysis, you determine the conditions (the "if/else if" parts) and the resulting actions (the
"then" parts), and you encapsulate these in instances of the classes Condition and Action,
respectively. Actions are always performed on other objects; in the framework, in order for an object to
be a recipient of an action, it must implement the Updateable interface, which involves implementing
the doUpdate() method. Finally, in order to access the "rules engine" (the guts of the framework that
will evaluate conditions and fire off actions), you need to load up rules that tell you which sequence of
booleans will fire off which action -- these are loaded typically as part of initialization. And you need to
load up the conditions that need to be evaluated -- these are typically determined sometime after
initialization has completed. Both of these operations are performed in a user-defined subclass of the
framework class Invoker by implementing the abstract methods loadRules() and
loadConditions(), respectively.
     Once you have taken these steps, you set the machinery in motion by creating an instance of the
Invoker subclass (which calls the loadRules() method) and by calling the execute() method on
this instance (which calls loadConditions() and passes the conditions and rules to the IfThenElse
class, which in turn orchestrates the evaluation of conditions, the lookup of the appropriate action, and the
execution of that action). Listing 1 displays the rewrite of the URLProcessor_bad class, introduced in
Part 1 as the central example of nested-if code; this rewritten version, called URLProcessor_good,
employs the basic principles of implementing the If-Then-Else Framework:

Listing 1: The class URLProcessor_good with user-defined inner classes

class URLProcessor_good extends URLProcessor implements logic.Updateable {
      static final String URL_STR = "urlString";
      URLProcessor_good(){
             super();
      }
      void decideUrl() {
             try {
                   Invoker inv = new ConcreteInvoker((Updateable)this);
                   inv.execute();
             }
             catch(NestingTooDeepException ntde) {}
             catch(IllegalExpressionException iee) {}
             catch(RuleNotFoundException rnfe) {}
             catch(DataNotFoundException dnfe) {}
      }
      /** Updateable interface implementation */
      public void doUpdate(Hashtable ht) {
             if(ht == null) return;
             Object ob = ht.get(Action.MAIN_KEY);
             if(ob != null && ob instanceof String) {
                   setUrl((String)ob);
             }
      }
      static class OtherCondition extends Condition {
             public static final String KEY = "key";
             public OtherCondition(Hashtable ht) {
                   super(ht);
             }
             public Boolean evaluate() throws DataNotFoundException {
                   Object ob = null;
                   if(getData() == null || (ob = getData().get(KEY)) == null) {
                         throw new DataNotFoundException();
                   }
                   if(!(ob instanceof DataBank.Region)) {
                         return Boolean.FALSE;
                   }
                   DataBank.Region region = (DataBank.Region)ob;
                   boolean result = !region.equals(DataBank.WEST_REGION) &&
                                      !region.equals(DataBank.EAST_REGION);
                   return new Boolean(result);
             }


     }
     static class MemberCondition extends Condition {
            public static final String KEY = "id";
            public static final String TABLE = "table";
            public MemberCondition(Hashtable ht) {
              super(ht);
         }
         public Boolean evaluate() throws DataNotFoundException {
              Object id = null, table = null;
              if(getData() == null || ((id = getData().get(KEY)) == null) ||
                    ((table = getData().get(TABLE))==null) ||
                    !(table instanceof Hashtable)) {
                          throw new DataNotFoundException();
              }
              Hashtable members = (Hashtable)table;
              return new Boolean(members.containsKey(id));
         }
     }
     public class ConcreteInvoker extends Invoker {
          public ConcreteInvoker(Updateable ud) throws NestingTooDeepException {
                super(ud);
          }
          public void loadRules() throws NestingTooDeepException {
                rules = new Rules();
                try {
                      //       Conditions:                Actions:
                      //     East|West|Other|Lim|Member
                      rules.addRule("TFFT*",         new Action(ud,EAST_PRIVILEGED));
                      rules.addRule("TFFF*",         new Action(ud,EAST_NOT_PRIVILEGED));
                      rules.addRule("FTFTT",         new Action(ud,WEST_MEMBER_PRIVILEGED));
                      rules.addRule("FTFTF",         new Action(ud,WEST_NONMEMBER_PRIVILEGED));
                      rules.addRule("FTFFT",         new Action(ud,WEST_MEMBER_NOT_PRIVILEGED));
                      rules.addRule("FTFFF",         new Action(ud,WEST_NONMEMBER_NOT_PRIVILEGED));
                      rules.addRule("FFT**",        new Action(ud,OTHER_REGION));
                }
                catch(NestingTooDeepException e) {
                      throw e;
                }
          }

         public void loadConditions() throws IllegalExpressionException {
              try {
                    conditions = new Vector(5);
                    conditions.addElement(new Condition(db.getRegion(),DataBank.EAST_REGION));
                    conditions.addElement(new Condition(db.getRegion(),DataBank.WEST_REGION));
                    Hashtable other = new Hashtable();
                    other.put(OtherCondition.KEY,db.getRegion());
                    conditions.addElement(new OtherCondition(other));
                    conditions.addElement(new Condition(db.LIMIT_THRESHOLD, Condition.LESS,db.getLimit() ));
                    Hashtable mem = new Hashtable();
                    mem.put(MemberCondition.TABLE, db.getWestMembers());
                    mem.put(MemberCondition.KEY, db.getUserId());
                    conditions.addElement(new MemberCondition(mem));
              }
              catch(IllegalExpressionException e) {
                    throw e;
              }
         }
     }
}



     Listing 1 will provide a point of reference as I attempt to refine the framework's functionality. The
following questions will provide a structure for the discussion:

Main Questions

1.   Order of evaluation. As you can see in Listing 1, the comments in the loadRules() method spell
     out the intended order of evaluation: I always consider conditions in the following order, whether I am
     loading conditions or creating a sequence of booleans:

                                            East, West, Other, Limit, Member
     But what happens when the number of conditions grows to 15 or 20? You can no longer expect to rely
     on a difficult-to-read set of notes in a documentation area as a means to guarantee strict adherence to a
     particular order of evaluation. Clearly, this part of the framework begs for an automated solution that
     can eliminate the human error that unavoidably creeps in with scaling.

2.   Performance. In Listing 1, you will notice that one of the exceptions that is handled in the
     decideUrl() method (which is the point of control for handling exceptions) is the
     NestingTooDeepException. The framework will throw this exception whenever the number of
     Condition instances exceeds 30. Though in practice, you probably would never need that many
     conditions, you may wonder what dark secrets I am protecting by imposing this limit. As you will see,
     although there are fundamental technical reasons for this restriction, the more significant issue is
     performance: When the number of conditions gets too close to 30, performance becomes unacceptably
     slow. You can actually force the framework to falter badly with as few as 12 or 13 conditions (I will
     show how this works shortly). A review of the framework code will reveal one main reason for
     performance degradation--namely, more than one exponential-time algorithm. Some additional
     profiling will reveal another performance hit that has to do with the use of Java Hashtables. These
     considerations will lead us to implement some new algorithms, and to make a different choice of Java
     container classes.

3.   Consistency checking. In Part 2 of this series, I pointed out how a careless use of wildcards in the
     creation of boolean sequences can easily result in a pair of inconsistent rules. For clarity, I will define
     what I mean by these terms. First, recall that a rule is a match-up between a sequence of booleans
     (represented as a String of Ts and Fs, denoting true's and false's) and an Action instance that
     should be fired off whenever a sequence of Conditions evaluates to this particular sequence of
     booleans. (You create a rule every time you invoke the addRule() method in Rules.) I will say
     that two rules are inconsistent whenever they match the same boolean sequence to two different
     Action instances. For instance, the following lines of code would indicate that inconsistent rules
     have been loaded:
     //assume that Action a and Action b are not equal
     Action a = new Action(..);
     Action b = new Action(..);
     rules.addRule("TFTT", a);
     .
     .
     .
     rules.addRule("TFTT", b);

     It is relatively easy to avoid explicit errors like this one; however, the following pair of inconsistent
     rules could easily be overlooked:
     rules.addRule("*FTT", a);
     .
     .
     .
     rules.addRule("T*TT",b);

     As I explained in Part 2, each of the boolean Strings *FTT and T*TT is actually treated by the
     framework as a pair of boolean Strings with no wildcards. The framework translates the two rules
     created above into the following four rules:


         TFTT --> a
         FFTT --> a

         TTTT --> b
         TFTT --> b
    As you can see, after translation, the two original rules give rise to the pair of inconsistent rules

         TFTT --> a and TFTT --> b.

         It is not hard to imagine that if you had to manually key in a hundred or more rules, many with
    wildcards, you would eventually make one of these subtle mistakes--you would key in a pair that is, at
    least under framework translation, inconsistent. What makes this unacceptable, however, is that the
    framework provides no run-time consistency checking that would allow you to catch and correct such
    errors. As it operates now, the framework will handle the creation of these rules as follows: It will
    silently overwrite the first rule involving TFTT with the second rule involving this String, and will
    therefore always execute Action b when it encounters TFTT.
         Below, I will suggest a consistency-checking mechanism. I will also discuss the performance hit
    that such a mechanism must inevitably cause.
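
     The wildcard translation and the resulting clash described in item 3 can be illustrated with a small, self-contained sketch. It is not the framework's own code: the class WildcardConflictSketch, its methods expand() and load(), and the use of plain Strings as stand-ins for Action instances are all assumptions made for illustration, and it uses the generic collections of later JDKs rather than JDK 1.1 classes. The sketch expands each '*' into a T and an F, and reports a conflict instead of silently overwriting an earlier entry.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: expands wildcard rule strings and reports conflicting rules. */
class WildcardConflictSketch {

    /** Returns every wildcard-free string matched by pattern ('*' stands for T or F). */
    static List<String> expand(String pattern) {
        List<String> results = new ArrayList<String>();
        results.add("");
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            List<String> next = new ArrayList<String>();
            for (String prefix : results) {
                if (c == '*') {
                    next.add(prefix + 'T');
                    next.add(prefix + 'F');
                } else {
                    next.add(prefix + c);
                }
            }
            results = next;
        }
        return results;
    }

    /**
     * Loads {pattern, actionName} pairs into a table keyed by expanded string.
     * Throws if one expanded string would be matched to two different actions.
     */
    static Map<String, String> load(String[][] rules) {
        Map<String, String> table = new LinkedHashMap<String, String>();
        for (String[] rule : rules) {
            for (String key : expand(rule[0])) {
                String previous = table.get(key);
                if (previous != null && !previous.equals(rule[1])) {
                    throw new IllegalStateException(
                        key + " is mapped to both " + previous + " and " + rule[1]);
                }
                table.put(key, rule[1]);
            }
        }
        return table;
    }

    public static void main(String[] args) {
        System.out.println(expand("*FTT"));
        System.out.println(expand("T*TT"));
        try {
            load(new String[][] { {"*FTT", "a"}, {"T*TT", "b"} });
        } catch (IllegalStateException e) {
            System.out.println("Inconsistent rules: " + e.getMessage());
        }
    }
}
```

     Running main() shows that both patterns expand to a list containing TFTT, which is exactly the overlap that the second addRule() call above would otherwise overwrite without warning.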


Order of Evaluation

     In order to reduce the risk of error in adhering to a chosen order of evaluation, I need to automate the
procedure that guarantees that the same order is used in loading conditions as is used in building up
boolean Strings during the phase of loading rules. To do this, I need to move the specification of the
order out of the comments section and into the code, and then use this piece of code to properly arrange the
conditions and the boolean Strings as they are loaded.
     One simple observation that suggests how to begin is that I can always specify a unique ordering of
any set of objects by mapping them in a one-to-one way into the integers. So, I can accomplish my
objective of specifying an order of evaluation within the code by using integer constants having names that
correspond to the conditions that I happen to be using. I can then use these constants to ensure a proper
order of loading in both the loadRules() and loadConditions() implementations.
     A handy way in Java to specify a set of constants for a class is to define them in an interface and
require the class that uses the constants to implement this interface. Below, I have created an interface
OrderOfEvaluation which specifies constants for the sample code I have been considering (see
Listing 1) -- the inner class ConcreteInvoker will now need to implement this new interface in order
to have direct access to the constants.

    public interface OrderOfEvaluation {
            final int EAST_CONDITION = 0;
            final int WEST_CONDITION = 1;
            final int OTHER_CONDITION = 2;
            final int LIMIT_CONDITION = 3;
            final int MEMBER_CONDITION= 4;
    }

      To make use of these constants during loading, I introduce wrappers, called sequencers, that will wrap
the target container class in a larger class that will know how to use the constants to maintain proper order.
Each sequencer will provide an "add" method that will permit framework users to add conditions and rules
in a familiar way. However, the method will require that the condition or rule be passed in with the
appropriate constants. Then, the sequencer will directly place the condition or rule in its proper location in
its target container class.
      To show how all this works concretely, I have rewritten the loadConditions() and
loadRules() methods of Listing 1, making use of sequencers. Let's begin with loadConditions()
since it is the easier of the two. Here is the code for the conditions sequencer:

public class ConditionSequencer {
    Vector conditions;
    public ConditionSequencer(Vector conditions) {
            this.conditions = conditions;
    }
    public void addCondition(Condition c, int position) throws RuleNotFoundException {
            if(conditions == null) throw new RuleNotFoundException();
            if(position >= conditions.capacity()) throw new RuleNotFoundException();
            conditions.insertElementAt(c, position);
    }
}

      You instantiate ConditionSequencer by passing in the Vector of conditions that you need to
load. You can then add conditions one-by-one using the addCondition() method. Notice that this
method requires you to pass in the position at which the condition should be inserted in the conditions
Vector. To do this, you just need to pass in the interface constant that corresponds to the particular
condition. The sequencer then places the passed-in condition in the specified location in conditions. You
will also note that a RuleNotFoundException will be thrown if you fail to initialize conditions or
fail to set its capacity to a sufficiently large value. (So, in this case, before using the sequencer, you would
initialize the conditions Vector with the total number of conditions.) The reason I chose to use this
particular exception at this point in the framework is that the loadConditions() method is part of the
overall process in which the user is requesting the framework to look up a rule and fire off an action,
initiated by the call to Invoker's execute() method. The code displayed below shows how you would begin
to rewrite the loadConditions() method in Listing 1, using this new approach:

public void loadConditions() throws IllegalExpressionException, RuleNotFoundException{
    try {
       conditions = new Vector(numConditions);
       ConditionSequencer sequencer = new ConditionSequencer(conditions);
       sequencer.addCondition(new Condition(db.getRegion(),DataBank.EAST_REGION),
                              EAST_CONDITION);
       sequencer.addCondition(new Condition(db.getRegion(),DataBank.WEST_REGION),
                              WEST_CONDITION);
        . . .
    }
    catch(Exception e){}
}
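
     One caveat about the sequencer is worth spelling out. Vector.insertElementAt() throws an ArrayIndexOutOfBoundsException when the index is greater than the Vector's current size (its capacity does not count), so the ConditionSequencer above works when conditions are added in ascending order starting from position 0, as in the fragment just shown. If you want to add conditions in any order, one possible variant is sketched below. It is an illustration under assumptions, not part of the framework: the class name SlotSequencerSketch is hypothetical, plain Objects stand in for Condition instances, and an IllegalArgumentException replaces the framework's RuleNotFoundException so that the example is self-contained.

```java
import java.util.Vector;

/** Illustrative variant of a sequencer that accepts items in any order. */
class SlotSequencerSketch {
    private final Vector<Object> slots;

    SlotSequencerSketch(Vector<Object> slots, int numConditions) {
        this.slots = slots;
        // Pre-fill with nulls so that every position already exists.
        slots.setSize(numConditions);
    }

    /** Stores item at position, replacing the placeholder (or an earlier entry). */
    void add(Object item, int position) {
        if (position < 0 || position >= slots.size()) {
            throw new IllegalArgumentException("No slot for position " + position);
        }
        slots.setElementAt(item, position);
    }

    public static void main(String[] args) {
        final int EAST = 0, WEST = 1, OTHER = 2;
        Vector<Object> conditions = new Vector<Object>();
        SlotSequencerSketch sequencer = new SlotSequencerSketch(conditions, 3);
        sequencer.add("other", OTHER);   // added out of order on purpose
        sequencer.add("east", EAST);
        sequencer.add("west", WEST);
        System.out.println(conditions);  // [east, west, other]
    }
}
```

     Because setElementAt() overwrites a slot rather than shifting later elements, the final order depends only on the position constants and not on the order of the calls.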

     The technique for sequencing rules is similar but slightly more complicated. The step in the process of
loading rules that requires attention to the order of evaluation is in the assembly of boolean Strings. A
String like TFTF has been assembled properly only if the T at position 0 signifies that the 0th condition
evaluates to true, and the F at position 1 signifies that condition number 1 evaluates to false, etc. I can use
the interface constants to ensure that this match-up is occurring correctly by encapsulating the requirement
in a small class that contains both a boolean value and the name of the condition to which it is supposed to
refer. I call this class a ConditionValue -- a skeleton of the class appears below:

public class ConditionValue {
    public ConditionValue(int pos, char c);
    public char getValue();
    public int getPosition();
    public void setPosition(int pos);
    public void setValue(char c);
}

     The instance variable value in the class will always be a 'T', 'F' or '*'. And the variable position
will always store the value of the appropriate interface constant.
     Using this approach, arranging Ts and Fs into a boolean String amounts to placing
ConditionValue instances into a set. I don't need to worry about arranging these in the correct order
because all necessary information about preserving correct order is stored in the individual
ConditionValues. Therefore, we need to create a class that plays the role of a Set in order to contain a
collection of ConditionValues. Java 1.2 provides an interface Set for the creation of just the kind of
class needed here; however, since I want to make this framework accessible to JDK 1.1 users, I have not
made use of the new Collections API. Below I list a skeleton of the class that I will be using; it provides
most of the methods that any implementation of the data type "set" should provide, specialized to my needs
here:

public class ConditionValueSet implements Enumeration {
    //constructors
    public ConditionValueSet();
    public ConditionValueSet(ConditionValue cv);
    //public methods
    public boolean addMember(ConditionValue cv);
    public boolean removeElement(ConditionValue cv);
    public boolean contains(ConditionValue cv);
    public int getCardinality();
    public ConditionValue[] getElements();
    //Enumeration implementation
    public boolean hasMoreElements();
    public Object nextElement();
    public void resetIterator();
}

     As you might expect, the class provides methods for adding and removing elements (addMember(),
removeElement()), for checking whether a particular ConditionValue is one of its elements
(contains()), for obtaining its size (getCardinality()), and for retrieving its elements in the form
of an array (getElements()). The class also implements the Enumeration interface, providing a
uniform way of iterating through the elements of a set.
     Each instance of this class will essentially encapsulate a boolean String like TF*T, with each
character of the String being stored, with its order of evaluation information, in a ConditionValue
instance. Now I need to associate each instance of ConditionValueSet with an Action instance; I
have been calling such a pairing a rule. It makes sense here to encapsulate the notion of a rule in a Rule
class:

public class Rule {
    public Rule(ConditionValueSet cvs, Action action);
    public Action getAction();
    public ConditionValueSet getCVS();
    public void setAction(Action anAction);
    public void setCVS(ConditionValueSet aCVS);
}

     Finally, I need a rule-sequencer, analogous to the ConditionSequencer class mentioned above,
that will serve as a wrapper to Rules so that rules may be loaded in the correct way. The
RuleSequencer class will allow you to add rules (instances of Rule) one by one with an addRule()
method; that method will unpack the data stored in the Rule instance and invoke the addRule() method
on Rules, ensuring that the boolean String is assembled according to the order of evaluation. Here is
the skeleton of RuleSequencer:

public class RuleSequencer {
    //constructor
    public RuleSequencer(Rules rules);
    public void addRule(Rule rule);
}

    Below, I show the first part of a rewrite of the loadRules() method from Listing 1, using this new
approach:

public void loadRules(){
    rules = new Rules();
    RuleSequencer sequencer = new RuleSequencer(rules);
    try {
        //add rule TFFT*, EAST_PRIVILEGED
        ConditionValueSet cvs_eastPriv = new ConditionValueSet();
        cvs_eastPriv.addMember(new ConditionValue(EAST_CONDITION,'T'));
        cvs_eastPriv.addMember(new ConditionValue(WEST_CONDITION,'F'));
        cvs_eastPriv.addMember(new ConditionValue(OTHER_CONDITION,'F'));
        cvs_eastPriv.addMember(new ConditionValue(LIMIT_CONDITION,'T'));
        cvs_eastPriv.addMember(new ConditionValue(MEMBER_CONDITION,'*'));
        sequencer.addRule(new Rule(cvs_eastPriv, new Action(ud,EAST_PRIVILEGED)));
        . . .
    }
    catch(Exception e){}
}

     As you can see, from the framework user's point of view, this new approach involves one additional
step at the top level of the loadRules() method: In addition to creating a new Rules instance, a new
RuleSequencer must also be created. It is then straightforward, but perhaps a little tedious, to create
one ConditionValue for each of the Conditions being considered, associating each with a truth
value. You add these ConditionValues one-by-one to a ConditionValueSet, associate this with
the appropriate Action by creating a new Rule instance, and then add this rule to the
RuleSequencer.
     For lack of space, I have not explained all the back-end details involved in processing these steps, but
these are not complicated at all; of course, you can find everything spelled out in the code itself -- see
Resources.
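
     To give a flavor of those back-end details, here is a minimal sketch of how a rule sequencer could turn an unordered collection of position/value pairs into a correctly ordered boolean String. It is not taken from the framework source: the class RuleStringAssemblerSketch, the nested Pair class (a hypothetical stand-in for ConditionValue), and the choice to treat unspecified positions as wildcards are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: builds a rule string from unordered (position, value) pairs. */
class RuleStringAssemblerSketch {

    /** Hypothetical stand-in for ConditionValue: a truth value tagged with its position. */
    static final class Pair {
        final int position;
        final char value;
        Pair(int position, char value) {
            this.position = position;
            this.value = value;
        }
    }

    /** Places each value at the index named by its position constant. */
    static String assemble(List<Pair> pairs, int numConditions) {
        char[] slots = new char[numConditions];
        Arrays.fill(slots, '*');   // assumption: unspecified conditions act as wildcards
        boolean[] seen = new boolean[numConditions];
        for (Pair p : pairs) {
            if (p.position < 0 || p.position >= numConditions) {
                throw new IllegalArgumentException("Unknown position " + p.position);
            }
            if (seen[p.position]) {
                throw new IllegalArgumentException("Position " + p.position + " given twice");
            }
            seen[p.position] = true;
            slots[p.position] = p.value;
        }
        return new String(slots);
    }

    public static void main(String[] args) {
        final int EAST = 0, WEST = 1, OTHER = 2, LIMIT = 3, MEMBER = 4;
        List<Pair> pairs = new ArrayList<Pair>();
        pairs.add(new Pair(LIMIT, 'T'));    // deliberately added out of order
        pairs.add(new Pair(EAST, 'T'));
        pairs.add(new Pair(MEMBER, '*'));
        pairs.add(new Pair(OTHER, 'F'));
        pairs.add(new Pair(WEST, 'F'));
        System.out.println(assemble(pairs, 5));   // TFFT*
    }
}
```

     The point of the design is that the order in which the pairs are added no longer matters; only the interface constants determine where each T, F or '*' ends up.
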
     Our solution to the problem of automating the bookkeeping for adhering to a specified order of
evaluation does avoid the pitfalls of relying on a specification that resides only in a comments section of
the code. The price we pay for this solution is an increased load of coding, particularly for loading rules.
On the other hand, in practice, there will almost never be a need to consider more than 30 conditions in a
particular implementation, and even that number of conditions would not require too much extra work with
our new approach to implementing the loadConditions() method. Nonetheless, as the number of
rules increases to 100 or more (which is not so uncommon), the increased coding demands introduced here
in implementing loadRules() would become unacceptable.
     I would like to sketch a realistic alternative to solving the problem of safely loading many rules -- I
implemented this alternative approach during performance testing of the framework (I will say more about
this in the Performance section below). When you are faced with a large number of rules, two patterns that
emerge--which can be used to advantage--are:

(1) The rules tend to be more uniform in character, with relatively few "special cases".
(2) There are usually relatively few Actions for you to deal with.

     For instance, when you have 300 rules to code, you may only have to worry about a total of 15 actions.
In that case, you can often automate rule loading by creating 15 separate loops, one for each action. For
each action, the corresponding loop would load all rules that are supposed to fire off this action. This
approach often works because of point (1) -- the uniform character of the rules makes it straightforward to
generate the required boolean Strings in each of the loops.
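
     As a rough illustration of the one-loop-per-action idea, the sketch below loads a small, made-up rule set. Everything in it is hypothetical: the class LoopRuleLoaderSketch, the helper oneHot(), the action names ACTION_A and ACTION_B, and the rule pattern itself (exactly one of four flags is true, followed by two wildcards). A plain Map stands in for the framework's Rules class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: loads uniform rules with one loop per action. */
class LoopRuleLoaderSketch {

    /** Builds a string with 'T' at trueIndex, 'F' at the other flag positions, then wildcards. */
    static String oneHot(int trueIndex, int numFlags, int numWildcards) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < numFlags; i++) {
            sb.append(i == trueIndex ? 'T' : 'F');
        }
        for (int i = 0; i < numWildcards; i++) {
            sb.append('*');
        }
        return sb.toString();
    }

    /** Hypothetical rule set: flags 0-1 fire ACTION_A, flags 2-3 fire ACTION_B. */
    static Map<String, String> loadRules() {
        Map<String, String> rules = new LinkedHashMap<String, String>();
        for (int i = 0; i <= 1; i++) {   // loop for ACTION_A
            rules.put(oneHot(i, 4, 2), "ACTION_A");
        }
        for (int i = 2; i <= 3; i++) {   // loop for ACTION_B
            rules.put(oneHot(i, 4, 2), "ACTION_B");
        }
        return rules;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : loadRules().entrySet()) {
            System.out.println(e.getKey() + " --> " + e.getValue());
        }
    }
}
```

     In a real project each loop body would call the framework's own rule-loading method; the part that carries over is that the boolean Strings are generated rather than typed in by hand.
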
     In order to introduce some extra flexibility in the initialization of the Invoker class so that
additional, user-defined classes--that might be used for automated rule-loading--will have room to be
initialized themselves, I have made one small change in the way you use the framework, whether or not
sequencers are used. As the framework was originally written, when you create an instance of Invoker,
its constructor automatically calls its init() method, which in turn calls the loadRules() method.
However, this sequence of calls prevents you from performing some other initializations inside your user-
defined ConcreteInvoker constructor that you might need to do before loading rules -- the call to
super() must occur before any other constructor code is run. To solve this problem, I have removed the
call to init() from the Invoker constructor. This means that when you wish to run the framework, you
have to make the following sequence of calls:

    Invoker inv = new ConcreteInvoker(anUpdateable);
    inv.init();
    inv.execute();

What is new here is, of course, the explicit call to Invoker's init() method.
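
     The underlying Java behavior is easy to reproduce outside the framework. In the sketch below, all class names (ConstructorOrderSketch, EagerBase, TwoStepBase and their subclasses) are hypothetical stand-ins and do not come from the framework; the example only shows that an overridable method called from a superclass constructor runs before the subclass has initialized its own fields, and that an explicit init() call avoids this.

```java
/** Illustrative only: why a superclass constructor should not call an overridable method. */
class ConstructorOrderSketch {

    static abstract class EagerBase {
        String loaded;
        EagerBase() {
            loaded = load();      // runs before the subclass constructor body
        }
        abstract String load();
    }

    static abstract class TwoStepBase {
        String loaded;
        void init() {             // called explicitly, after construction completes
            loaded = load();
        }
        abstract String load();
    }

    static class EagerChild extends EagerBase {
        private final String prefix;
        EagerChild(String prefix) {
            super();              // load() runs here, while prefix is still null
            this.prefix = prefix;
        }
        String load() { return prefix + "-rules"; }
    }

    static class TwoStepChild extends TwoStepBase {
        private final String prefix;
        TwoStepChild(String prefix) {
            this.prefix = prefix;
        }
        String load() { return prefix + "-rules"; }
    }

    public static void main(String[] args) {
        System.out.println(new EagerChild("east").loaded);   // null-rules
        TwoStepChild child = new TwoStepChild("east");
        child.init();
        System.out.println(child.loaded);                    // east-rules
    }
}
```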

Performance

     When I began performance testing, I already knew that one of the big bottlenecks would be in the
section of code that handles rule-loading because I knew that I had implemented some rather slow
algorithms there. On the other hand, I expected to find that the phase of code that involved loading and
evaluating of conditions (which results in the lookup and firing of the appropriate action), would perform
well because most of the steps in the process are quite simple. The one possible exception was the
procedure for looking up the correct Action based on a particular sequence of Ts and Fs--this step could
have involved lengthy search algorithms if I had implemented this step carelessly. But I had chosen to store
rules in a Hashtable, matching each sequence of booleans with its corresponding Action, so I knew that this
lookup step would perform as well as possible. (Hashtable lookups, like lookups on an array in which the
array indices play the role of keys, are generally the fastest way to perform a search, because the time
taken by such a lookup does not typically increase as the number of table elements grows. See, for
example, Robert Lafore's book, listed in Resources.)
     Performance testing revealed that as the number of conditions and rules increased, the percentage of
total run-time that was spent in the rule-loading sections of code grew close to 100%. Therefore, I will
devote this section to an investigation of these parts of the code, seeking to replace bad algorithms with
good ones and replacing less efficient Java implementations with more efficient ones.
     I found that I was able to slow framework performance to a crawl using relatively few conditions by
using many wildcards. On the other hand, using many conditions but very few wildcards did not slow
performance much at all. Therefore, I will look first at scenarios involving wildcard-intensive boolean
Strings. Before examining why the framework does so poorly in such scenarios, I'll describe the way that I
did my testing: I created a test environment which implemented the framework and did a simple kind of
automated rule-loading (a simple case of the kind of rule-loading I was describing in the last section). I
fixed one instance of Action, which I reused in all my rules. I required every boolean String to begin with
one of the following four T-F sequences: TT, TF, FT, FF. The test environment required me to specify
how many wildcards I wished to use--the environment would then append that number of copies of '*' to
each of these four base Strings, and then load these wildcard-intensive Strings in the
loadRules() section of code, each matched with the single Action instance defined earlier. This
environment was ideal for determining exactly how many wildcards were needed to degrade performance.
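
     A back-of-the-envelope version of this test setup can be sketched as follows. The class WildcardLoadSketch and its methods are hypothetical and only mimic the description above; the entry count assumes, as the article explains for rule translation, that each '*' is replaced by both a T and an F, so every wildcard doubles the number of wildcard-free Strings.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: rebuilds the wildcard-heavy test strings and counts their expansions. */
class WildcardLoadSketch {
    static final String[] BASES = {"TT", "TF", "FT", "FF"};

    /** Appends numWildcards copies of '*' to each of the four base strings. */
    static List<String> testStrings(int numWildcards) {
        StringBuilder stars = new StringBuilder();
        for (int i = 0; i < numWildcards; i++) {
            stars.append('*');
        }
        List<String> out = new ArrayList<String>();
        for (String base : BASES) {
            out.add(base + stars);
        }
        return out;
    }

    /** Number of wildcard-free strings a pattern stands for: 2 to the power of its '*' count. */
    static long expandedCount(String pattern) {
        long count = 1;
        for (int i = 0; i < pattern.length(); i++) {
            if (pattern.charAt(i) == '*') {
                count *= 2;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        for (int n = 8; n <= 12; n += 2) {
            long total = 0;
            for (String s : testStrings(n)) {
                total += expandedCount(s);
            }
            System.out.println(n + " wildcards -> " + total + " table entries");
        }
    }
}
```

     With 10 wildcards (12 conditions in all), the four rules stand for 4 x 1024 = 4096 wildcard-free Strings, and each additional wildcard doubles that figure.
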
     Figure 1 below provides a diagrammatic view of the section of code that handles rule-loading.

Figure 1. Code responsible for rule-loading

     As the diagram indicates, loadRules() makes multiple calls to the addRule() method of Rules.
Each such call entails a trip to the BooleanTupleFactory to transform a boolean String into one or
more BooleanTuples. These BooleanTuples are then matched with the Action instance that was
passed in by placing them in the Hashtable ruleTable inside Rules.
     As you may recall from earlier parts of the article, a BooleanTuple encapsulates the notion of a
sequence of booleans -- in particular, it represents a sequence of boolean values not as a String of Ts and
Fs but as an array of Booleans. I chose to use BooleanTuples as keys in my Hashtable of rules
rather than raw Strings of Ts and Fs for the following reason: I had read that the hashing algorithm for
Strings prior to JDK 1.2 was somewhat slow (see M. Weiss' book in Resources), and, of course, you
have no control over how Java implements the hashCode() method for one of its own data types. On the
other hand, it is easy to devise a near-perfect hash algorithm for a sequence of Booleans (as I will
describe in a moment).
     For each String consisting of Ts, Fs and wildcards ('*'), the BooleanTupleFactory is asked to
return BooleanTuples that are generated from the String. As I described in Part 2, the
BooleanTupleFactory replaces any boolean String having a '*' with two Strings--one in which
the '*' is replaced with a T, another in which it is replaced with an 'F'. All the boolean Strings that are
derived from a particular passed in String are transformed into boolean tuples and returned, to be loaded
into the rules table. This design provides one of the keys to performance degradation.
     As a matter of fact, as I will discuss in a moment, although all significant performance degradation
actually does occur within BooleanTupleFactory, the problem does not lie just in the fact that it
creates so many extra boolean Strings. Let's take a look at the code:

import java.util.Vector;

public class BooleanTupleFactory {

    public static BooleanTuple [] getTuples(String tf) throws NestingTooDeepException {
        if(tf == null || tf.equals("")) return null;
        if(tf.length() > BooleanTupleConstants.MAX_LEVELS_OF_NESTING){
            throw new NestingTooDeepException("Can't have more than "+
                  BooleanTupleConstants.MAX_LEVELS_OF_NESTING + " levels of nesting");
       }
       int numTuples = countTuples(tf);
       int len = tf.length();
       int numStars = countStars(tf);
       int [] indicesOfStarredCols = {};
       if(numStars > 0) {
           indicesOfStarredCols = computeIndicesOfStarredCols(tf);
       }
       StringBuffer [] tfArr = new StringBuffer[numTuples];
       BooleanTuple [] btArr = new BooleanTuple[numTuples];
       for(int col = 0; col < len; ++col) {
           for(int row = 0; row < numTuples; ++row) {
               if(col == 0) tfArr[row] = new StringBuffer("");
               char c = tf.charAt(col);
               tfArr[row].append(obtainBoolChar(col, row, c, indicesOfStarredCols));
           }
       }
       for(int row = 0; row < numTuples; ++row) {
           btArr[row] = getTuple(tfArr[row].toString());
       }
       return btArr;
   }

    public static int truthTable(int starCol, int row, int numStarCols) {
        //validate input
        if(starCol < 0 || starCol > numStarCols - 1 || row < 0 ||
            row > Math.pow(2,numStarCols)-1 || numStarCols < 1) {

             System.out.println("Invalid input ["+starCol+", "+row+", "+numStarCols+"]"+
                          "for truthTable method");
             return -1;
       }
       //if starCol is largest possible, return 0 if row lies in first half of the rows,
       // return 1 if in second half
       if( starCol == numStarCols - 1) {
           return (row < Math.pow(2,numStarCols-1)) ? 0 : 1;
       }
       //if starCol is one of the smaller possible values, use the version of truthTable
       //for which total number of columns is 1 less than the current value
       else {
           return truthTable(starCol,
                             row % ((int)Math.pow(2,numStarCols-1)),
                             numStarCols-1);
       }
   }
   private static char obtainBoolChar(   int col,
                                         int row,
                                         char c,
                                         int[] indicesOfStarredCols) {
       switch(c) {
           case 'T':
           case 't':
               return 'T';

           case 'F':
           case 'f':
               return 'F';

           case '*':
               //find index of starred col index in the indicesOfStarredCols array
               int adjustedColNum = Util.findIndex(col, indicesOfStarredCols);

              //call truthTable to determine appropriate truth value for this row and
              //column
              int booleanAsNumeric =
                  truthTable(adjustedColNum, row, indicesOfStarredCols.length);

              //convert to 'T' or 'F'
              return (booleanAsNumeric == 0) ? 'T' : 'F';
             default:
                 break;
        }
        return 'F';
    }
    private static int countStars(String s) {
        int count = 0;
        if(s == null) return count;
        for(int i = 0; i < s.length(); ++i) {
            if(s.charAt(i) == '*') ++count;
        }
        return count;
    }

    private static int [] computeIndicesOfStarredCols(String s) {
          int arraySize = countStars(s);
          int [] arrayOfIndices = new int[arraySize];
          int currIndex = 0;
          for(int i = 0; i < s.length(); ++i) {
            if(s.charAt(i) == '*') {
                arrayOfIndices[currIndex] = i;
                ++currIndex;
             }
          }
          return arrayOfIndices;
      }

    private static int countTuples(String s) {
        return (int)Math.pow(2,countStars(s));
    }

    private static Boolean getBoolean(char c) {
        if(c == 't' || c == 'T') return Boolean.TRUE;
        return Boolean.FALSE;
    }
    private static BooleanTuple getTuple(String tf) throws NestingTooDeepException {
        if(tf == null || tf.equals("")) return null;
        if(tf.length() > BooleanTupleConstants.MAX_LEVELS_OF_NESTING){
            throw new NestingTooDeepException("Can't have more than "+
                   BooleanTupleConstants.MAX_LEVELS_OF_NESTING + " levels of nesting");
        }
        int len = tf.length();
        Vector v = new Vector(len);

        for(int i = 0; i < len; ++i) {
            char c = tf.charAt(i);
            Boolean b = getBoolean(c);
            v.addElement(b);
        }
        return new BooleanTuple(v);
    }

}

     In the code, the BooleanTupleFactory receives a request to produce an array of BooleanTuples
from a String of Ts, Fs and wildcards when its getTuples(String tf) method is called. One of
the first things this method does is to determine how many BooleanTuples it will have to create--the
method countTuples(tf) takes care of this computation. This computation will reveal an important
bottleneck: If the String tf has no wildcard, clearly only one BooleanTuple will be created. If tf
has one wildcard, two BooleanTuples will be required--one for the case in which '*' is replaced by T,
another for the case in which it is replaced by F. In general, if there are n wildcards, then 2^n
BooleanTuples will need to be created. This explains why performance degrades so quickly when there
are many wildcards in a set of rules: If the boolean Strings in a set have about n wildcards each,
then the BooleanTupleFactory must create about 2^n BooleanTuples for
each of these boolean Strings. This means that, in the test environment described above, each time
I append another wildcard to my four original Strings, I double the number of
BooleanTuples that must be created by the BooleanTupleFactory. We have here an exponential-time
algorithm. I will refer to this algorithm, which typically produces so many BooleanTuples from a
given boolean String, as the BooleanTuple Algorithm.
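
To make the doubling concrete, here is a minimal sketch of wildcard expansion. It is not the BooleanTupleFactory code from the listing; the class WildcardExpansion and its methods are hypothetical, and the input is assumed to contain only the characters T, F and '*'.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; assumes the input contains only 'T', 'F' and '*'.
final class WildcardExpansion {
    static int countStars(String tf) {
        int count = 0;
        for (int i = 0; i < tf.length(); i++) {
            if (tf.charAt(i) == '*') count++;
        }
        return count;
    }

    // Returns every T/F String obtained by replacing each '*' with T or F.
    static List<String> expand(String tf) {
        int numStars = countStars(tf);
        if (numStars > 20) {
            // Cap chosen only to keep this sketch's memory use modest.
            throw new IllegalArgumentException("too many wildcards for this sketch");
        }
        int numRows = 1 << numStars; // 2^numStars expansions
        List<String> rows = new ArrayList<>(numRows);
        for (int row = 0; row < numRows; row++) {
            StringBuilder sb = new StringBuilder(tf.length());
            int starIndex = 0;
            for (int col = 0; col < tf.length(); col++) {
                char c = tf.charAt(col);
                if (c == '*') {
                    // Bit starIndex of the row number selects the value: 0 gives T, 1 gives F.
                    sb.append(((row >> starIndex) & 1) == 0 ? 'T' : 'F');
                    starIndex++;
                } else {
                    sb.append(c);
                }
            }
            rows.add(sb.toString());
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(expand("TF**")); // prints [TFTT, TFFT, TFTF, TFFF]
    }
}
```

With 11 wildcards the returned list already holds 2048 Strings for a single rule, which is the growth described above.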
     Moving further through the code, you will see that an array of StringBuffers (tfArr) and an array of
BooleanTuples (btArr) are initialized. The code first loads tfArr with Strings
having just Ts and Fs, after having replaced all occurrences of '*' in the original String tf. Then it
transforms each of these Strings into a BooleanTuple and loads it into btArr. The first (nested) for
loop loads tfArr; the second for loop loads btArr.
     As I've already mentioned, a major performance hit occurs each time this first for loop loads
tfArr, because of the exponential growth involved. However, lurking within this loop you can
find another performance killer in the call to the method obtainBoolChar(). The purpose of this
method is simply to examine the next unexamined character in tf and append the appropriate character to
the current row of tfArr. If it finds a T or an F, it simply appends a T or F, respectively. If it finds a '*', however, it
sometimes has to append a T and sometimes an F. The way it makes this selection involves another
horrendous algorithm.
     To see what's involved, let me recall the notion of a truth table. A truth table with k symbols is a table
that lists all possible values of Ts and Fs for k propositional symbols p1, p2, …, pk. Below are truth tables
for the cases k = 1 and k = 2:

                                                       p
                                                       T
                                                       F


                                                   p       q
                                                   T       T
                                                   F       T
                                                   T       F
                                                   F       F

The second table, for example, displays all the possible ways Ts and Fs could be assigned to p and q -- there
are 4 different possibilities.
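
A truth table with k symbols, in the same row order as the tables above, can be generated with a short loop. The sketch below is illustrative only; the class TruthTableSketch is hypothetical, and the use of the bits of the row number to pick each value is one convenient way to reproduce that row order, not necessarily how the framework does it.

```java
// Hypothetical sketch: builds the truth table for k symbols in the row order shown above.
final class TruthTableSketch {
    // Returns table[row][col], where each cell is 'T' or 'F'.
    static char[][] build(int k) {
        if (k < 1 || k > 20) {
            // Upper bound chosen only to keep this sketch's memory use modest.
            throw new IllegalArgumentException("k must be between 1 and 20");
        }
        int rows = 1 << k; // 2^k rows
        char[][] table = new char[rows][k];
        for (int row = 0; row < rows; row++) {
            for (int col = 0; col < k; col++) {
                // Bit col of the row number: 0 gives T, 1 gives F.
                table[row][col] = ((row >> col) & 1) == 0 ? 'T' : 'F';
            }
        }
        return table;
    }

    public static void main(String[] args) {
        // Prints TT, FT, TF, FF on separate lines, matching the table for p and q.
        for (char[] row : build(2)) {
            System.out.println(new String(row));
        }
    }
}
```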
     When the getTuples() method discovers that there are k wildcards in the boolean String tf, it
views these wildcards as determining a truth table with k symbols, because Ts and Fs need to be assigned
to each of the k occurrences of '*' just as in a truth table. In the obtainBoolChar() method, when a '*' is
encountered, a call is made to the truthTable() method to locate which value, T or F, should be read
next. It does this by recursion, which means in this case that it first builds the truth table for 1 symbol, then
for 2 symbols, and so forth, all the way up to k symbols, and then it reads off the appropriate value from the
table. Each time a call is made to truthTable(), all these truth tables are built up again from scratch.
Since the number of rows in a truth table grows exponentially in relation to the starting number of symbols,
this is another exponential-time algorithm that lurks in the mix. I will refer to this recursive truth-table
generating algorithm as the Truth Table Algorithm.
     Before discussing how to replace these two performance-killing algorithms, I should point out one
other performance degrader that became apparent only after I had fixed these other two problems. After the
getTuples() method in BooleanTupleFactory completes its processing, it returns the
BooleanTuple array btArr to the calling function; the calling function is the addRule() method in
Rules. After addRule() has received this array of BooleanTuples, it loads them all up into the
ruleTable, matching each with the appropriate action. When the number of BooleanTuples
approaches 10,000 (as it does when you have, in the test environment I described earlier, around 11
wildcards -- a total of 13 conditions -- since this forces the BooleanTupleFactory to produce 2048
BooleanTuples for each boolean String, yielding a total of 4*2048 = 8192 BooleanTuples), the
very act of loading this many rules into the Hashtable ruleTable begins to slow the framework
down. The reason for this performance hit is that, in order to load a BooleanTuple as a key, the
Java runtime first invokes BooleanTuple's version of hashCode(), performs a bitwise "and" between the
hashCode() result and the largest Java integer, and then applies the mod function (%) to it, using as
modulus the current table size. Worse, each time the number of elements in the Hashtable exceeds 75%
of the table's capacity, the Java runtime increases the capacity of the table (by creating a new one) and
rehashes all current elements. When the number of tuples is large enough, all this processing starts to become
inefficient. I will refer to this inefficiency as the Hashtable Loading Problem.
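
The cost of repeated rehashing can be estimated with a small model. The sketch below does not measure a real Hashtable; it assumes the documented defaults of java.util.Hashtable (initial capacity 11, load factor 0.75) and the growth rule used by the OpenJDK implementation (new capacity = 2 * old capacity + 1). The class HashtableGrowthModel is hypothetical.

```java
// Hypothetical model of Hashtable growth; it counts rehashes, it does not time them.
final class HashtableGrowthModel {
    static final int DEFAULT_CAPACITY = 11;   // documented default initial capacity
    static final float LOAD_FACTOR = 0.75f;   // documented default load factor

    // Bucket index as described in the text: clear the sign bit, then take the modulus.
    static int bucketIndex(int hashCode, int capacity) {
        return (hashCode & 0x7FFFFFFF) % capacity;
    }

    // Counts how many times the table is rebuilt while numKeys distinct keys are inserted.
    static int countRehashes(int numKeys) {
        int capacity = DEFAULT_CAPACITY;
        int threshold = (int) (capacity * LOAD_FACTOR);
        int rehashes = 0;
        for (int count = 0; count < numKeys; count++) {
            if (count >= threshold) {
                capacity = capacity * 2 + 1; // assumed OpenJDK growth rule
                threshold = (int) (capacity * LOAD_FACTOR);
                rehashes++;
            }
        }
        return rehashes;
    }

    public static void main(String[] args) {
        System.out.println(countRehashes(8192)); // prints 10
    }
}
```

Under these assumptions, loading the 8192 tuples from the example rebuilds the table ten times, and each rebuild re-inserts every element stored so far. Constructing the Hashtable with a sufficiently large initial capacity would avoid the rebuilds, though not the per-key hashing work.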
     I'll turn now to a description of solutions to these performance problems. I'll tackle them in the
following order (arranged in order from easier and less significant to harder and more significant):

A. Solving the Hashtable Loading Problem
B. Fixing the Truth Table Algorithm
C. Fixing the BooleanTuple Algorithm

     The solutions that I offer below are easy to understand at a descriptive level but in most cases, the
actual implementation involves more detail than I have space to explain. My approach will be to give the
high-level description of the solution and point you to the relevant code if you would like to see more
details. I have implemented each of the solutions provided below in the new version of the logic package,
available for download (see Resources).

Solving the Hashtable Loading Problem. There is an elegant solution to this problem that is made possible
by the way in which the hashCode() method in BooleanTuple was implemented. One of the nice
things about a Hashtable is that you can use any kind of object whatsoever as a key -- every object has a
default implementation of hashCode() that will be used in the hashing process. However, in the case of
BooleanTuples, I can obtain an especially nice set of hash values in applying my hashCode() method
to these objects. The important characteristics of these hash values are:

(a) I know in advance the biggest possible value that the hashCode() method can return
(b) I am guaranteed that different BooleanTuples will always have different hash codes.

     These two facts allow me to use an array as a Hashtable: First, by Fact (a), I know how big I should
make the array. Second, by Fact (b), I can use the hashed value of each BooleanTuple as an index in the
array. So, instead of trying to store rules by matching a BooleanTuple with an Action, I can let my
"table of rules" be simply an array of Action instances; the index of each Action instance in the array
will be the hash code of the BooleanTuple that corresponds to it.
     To show you how this hashCode() method works, I'll give an easy example. Suppose a
BooleanTuple encodes the following sequence of booleans: T, T, F, T. Let's transform these to 0's and
1's by letting T correspond to 1 and F to 0. This yields the sequence 1,1,0,1. Now let's view this sequence as
a binary integer, namely 1101. Finally, let's write its base 10 representation: 8 + 4 + 1 = 13. If we use this
scheme, no two distinct sequences of Ts and Fs of the same length could give rise to the same binary integer. This observation
establishes Fact (b). Also, if you are using m conditions, the largest possible hash value occurs
when the sequence consists of all 1's -- and in base 10, a binary integer consisting of m 1's corresponds to the number
2^m - 1. This establishes Fact (a).
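
The scheme can be sketched in a few lines. This is not the hashCode() method from the new BooleanTuple class, which is described as somewhat optimized; it is a minimal illustration in which the class ArrayRuleTable is hypothetical, plain Strings stand in for Action instances, and a boolean array stands in for a BooleanTuple.

```java
// Hypothetical sketch of an array used as a rule table; Strings stand in for Action instances.
final class ArrayRuleTable {
    private final int numConditions;
    private final String[] actions;

    ArrayRuleTable(int numConditions) {
        // The cap of 20 only keeps this sketch's array small; it is not the article's limit.
        if (numConditions < 1 || numConditions > 20) {
            throw new IllegalArgumentException("numConditions must be between 1 and 20");
        }
        this.numConditions = numConditions;
        this.actions = new String[1 << numConditions]; // one slot per hash value 0 .. 2^m - 1
    }

    // Reads the values as binary digits, first value most significant (true = 1, false = 0).
    static int hash(boolean[] values) {
        int h = 0;
        for (boolean value : values) {
            h = (h << 1) | (value ? 1 : 0);
        }
        return h;
    }

    void put(boolean[] key, String action) {
        checkLength(key);
        actions[hash(key)] = action; // the hash value is used directly as the array index
    }

    String get(boolean[] key) {
        checkLength(key);
        return actions[hash(key)];
    }

    private void checkLength(boolean[] key) {
        if (key.length != numConditions) {
            throw new IllegalArgumentException("key must have " + numConditions + " values");
        }
    }

    public static void main(String[] args) {
        System.out.println(hash(new boolean[] {true, true, false, true})); // prints 13
    }
}
```

Because every key of length m maps to its own slot, no collision handling and no resizing are needed; the price is an array of 2^m slots regardless of how many rules are loaded.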
     The details of this new implementation of a rules table appear in the new logic package. You can see a
somewhat optimized version of the hashCode() method (an improvement over the old framework
implementation) in the new BooleanTuple class. And you can see the new ruleTable array in the
new Rules class, replacing the old Hashtable version.
     Before looking at the next problem, I'll pause to reveal the technical reason for insisting on a limit of
30 instances of Condition: If I allowed 32 Condition instances, a BooleanTuple all of whose
values are "true" would have a hash value of 2^32 - 1, which is larger than the largest Java integer, 2^31 - 1. (Actually,
31 Conditions would have been acceptable from this point of view; I set the limit at 30 because it's a
nice "round number.") Recall, though, that the real issue concerning the number of Conditions has to
do with algorithm performance, which is not related to this relatively minor hashing issue.
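
The arithmetic behind the limit can be checked directly. The following sketch is only an illustration; the class HashRangeCheck and its methods are hypothetical names.

```java
// Hypothetical sketch: checks whether the largest hash value for m conditions fits in an int.
final class HashRangeCheck {
    // Largest hash value for the given number of conditions: a binary integer of m 1's.
    static long maxHash(int numConditions) {
        if (numConditions < 1 || numConditions > 62) {
            throw new IllegalArgumentException("numConditions must be between 1 and 62");
        }
        return (1L << numConditions) - 1;
    }

    static boolean fitsInInt(int numConditions) {
        return maxHash(numConditions) <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(fitsInInt(31)); // prints true
        System.out.println(fitsInInt(32)); // prints false
    }
}
```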

Fixing the Truth Table Algorithm. The problem with the Truth Table Algorithm is not the fact that it builds
so many truth tables, but that it does so every time the calling function requests to know the value in one of
the cells of one of these tables. The sensible thing to do would be to create all the truth tables that will ever
be needed once and for all during initialization. This is the solution I have implemented in the new logic
package.
      The creation of a sequence of truth tables is accomplished in the new Util class, via the static method
truthTable(int k), where the parameter k specifies the number of symbols of the largest truth table
to be created. The method represents the sequence of truth tables it creates as a three-dimensional array,
and stores it in a public static variable TRUTH_TABLES. Now when the BooleanTupleFactory
method obtainBoolChar() makes a request to find out which truth value occupies a particular cell of a
particular truth table, it reads off the value stored in TRUTH_TABLES. To ensure that
Util.truthTable() is called early enough to guarantee that it will be ready for such reads, a new
rules class, called Algorithm_fewStars, makes a call to this utility method as one of its first actions in
its initialization method init(). (This new class, as I explain below, will encapsulate the BooleanTuple
Algorithm. Placing it in a separate class makes room for an alternative, more efficient algorithm for use in
most of the rule loading that will be done.)
      Finally, in order to pass in the right int value to truthTable(), I now require that you pass a
parameter maxNumStars into the Rules constructor. The parameter maxNumStars represents the
largest number of wildcards contained in any of the boolean Strings that are being used in the rules
definition.
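
The idea of building the tables once and then reading cells from them can be sketched as follows. This is not the code of Util.truthTable() or TRUTH_TABLES from the logic package; the class TruthTableCache and its methods init() and cell() are hypothetical, and the cell values follow the convention of the earlier listing, in which 0 stands for T and 1 for F.

```java
// Hypothetical sketch of precomputed truth tables stored in a three-dimensional array.
final class TruthTableCache {
    // tables[k][row][col] holds the cell of the k-symbol table; 0 stands for T, 1 for F.
    private static int[][][] tables = new int[0][][];

    // Builds the tables for 1 .. maxNumStars symbols once, up front.
    static void init(int maxNumStars) {
        // The cap of 16 only keeps this sketch's memory use modest.
        if (maxNumStars < 1 || maxNumStars > 16) {
            throw new IllegalArgumentException("maxNumStars must be between 1 and 16");
        }
        int[][][] built = new int[maxNumStars + 1][][];
        built[0] = new int[0][0]; // index 0 is unused
        for (int k = 1; k <= maxNumStars; k++) {
            int rows = 1 << k;
            built[k] = new int[rows][k];
            for (int row = 0; row < rows; row++) {
                for (int col = 0; col < k; col++) {
                    built[k][row][col] = (row >> col) & 1;
                }
            }
        }
        tables = built;
    }

    // Reads one cell; init() must have been called with a large enough maxNumStars.
    static int cell(int starCol, int row, int numStarCols) {
        return tables[numStarCols][row][starCol];
    }
}
```

After a single call such as TruthTableCache.init(maxNumStars), each lookup is a plain array read, so the cost of building the tables is paid once rather than on every request.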

Fixing the BooleanTuple Algorithm. The main premise of the BooleanTuple Algorithm is that, before
transforming boolean Strings into BooleanTuples and loading up the rule table, I need to transform
all wildcards into actual Ts and Fs, forcing me to create potentially huge numbers of BooleanTuples.
One superficial reason I have to do this is that a BooleanTuple consists of a sequence of real boolean
values -- each value is either true or false; a wildcard as such is meaningless in this context. But if I were
willing to give up my requirement of using BooleanTuples as the keys for the rules Hashtable, and
just use raw boolean Strings instead, then it wouldn't matter that some of the characters might be '*' --
any String can be a key in a Hashtable. Then I wouldn't have to create all those extra boolean Strings as
I am now.
     This approach sounds good on the surface, but when you think about it, it begins to look a little fishy.
Suppose we use this approach and assume loadRules() has been called and all my rules have been
loaded into a Hashtable successfully, matching raw boolean Strings – possibly containing wildcard
characters -- to corresponding Action instances. Next, Conditions are loaded, and then the
IfThenElse class is asked to evaluate conditions, look up the appropriate Action, and fire it off. But
notice what happens when the conditions that have been loaded are evaluated: each condition is evaluated
to a boolean -- either true or false. We would assemble these boolean evaluations into a String of Ts
and Fs, and expect that this String will occur as one of the Hashtable keys. The problem is that this
particular boolean String is probably not one of the Hashtable keys.
     To see the difficulty, let's take a simple example. Suppose just two boolean Strings are loaded: FF*
and TFF. Suppose the evaluation of conditions resulted in the boolean String FFT. When we look for
FFT in the Hashtable, the hash of FFT won't look anything like the hash of FF*; yet, FFT really does
match FF*. This scenario should make it clear why I avoided this approach originally, but chose instead to
replace all wildcards with Ts and Fs.
     Despite the obvious difficulty, using the raw boolean Strings as keys is very appealing because
there are typically so few of them to worry about; if we could make the approach work somehow, we could
avoid the exponential-time problem.
     Here is the solution that I propose: I will use the approach of using raw boolean Strings as keys in
my rules Hashtable. I will also have on hand an auxiliary array that also stores all these raw boolean
Strings. When I load rules, I will match a boolean String – possibly containing wildcards – with an
Action in the rules table as usual, and I will also add this boolean String to my array. When it comes
time to look up a boolean String, which consists only of Ts and Fs, I first compare the lookup String
with each element in my auxiliary array. However, the comparison I use will not be the usual equals()
method. Instead, I will define a new equals(String s, String t) method that returns false if the
two Strings have unequal length, or if, for some integer i less than s.length(), the ith character in
one of the Strings is a T and the ith character in the other String is an F; otherwise it returns true.
This equals() method will, for example, declare that FF* and FFT are equal. Using this equals() method, I will search my
auxiliary array for a match with my test String, and unless the framework user was careless in
loading rules, I will certainly find a match. The match that I find may contain wildcards, but I am
guaranteed that this match is one of the keys in my rules Hashtable. Thus, I have a guarantee that the
appropriate Action will be found.
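
A minimal sketch of this lookup scheme follows. It is not the Algorithm_manyStars code from the logic package: the class WildcardRuleTable is hypothetical, Strings stand in for Action instances, and a HashMap and an ArrayList are used in place of the Hashtable and the fixed-size auxiliary array described in the text.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; Strings stand in for Action instances.
final class WildcardRuleTable {
    private final Map<String, String> ruleTable = new HashMap<>(); // raw boolean String -> action
    private final List<String> boolStrings = new ArrayList<>();    // auxiliary list of the keys

    // Wildcard-aware comparison: false on unequal length or on a T facing an F, true otherwise.
    static boolean matches(String s, String t) {
        if (s.length() != t.length()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            char a = s.charAt(i);
            char b = t.charAt(i);
            if ((a == 'T' && b == 'F') || (a == 'F' && b == 'T')) {
                return false;
            }
        }
        return true;
    }

    void addRule(String boolStr, String action) {
        if (!ruleTable.containsKey(boolStr)) {
            boolStrings.add(boolStr);
        }
        ruleTable.put(boolStr, action);
    }

    // Scans the auxiliary list for the first key that matches, then fetches its action.
    String lookUp(String results) {
        for (String candidate : boolStrings) {
            if (matches(candidate, results)) {
                return ruleTable.get(candidate);
            }
        }
        return null; // no rule covers this combination of results
    }
}
```

With the two rules FF* and TFF loaded, a lookup of FFT scans the list, finds that FF* matches, and returns the action stored under the key FF*.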
     The solution works, but at what price? The price I have to pay is that the lookup time has increased
somewhat. Let's take a look at the efficiency of this new search algorithm. First of all, let's say that there
are m conditions under consideration and n rules that are loaded – in particular, n boolean Strings to
consider. First, to compare two boolean Strings of length m using the new equals() method would
require c*m comparisons in the worst case, where c is a small constant (around 3 or 4) that depends on the
implementation. (To see this, note that the worst case occurs when this character-by-character comparison
of two boolean Strings does not turn up a mismatch – for that case to occur, the procedure has to
examine every character in both Strings. In particular, the procedure must check that the first String
does not have a T at position i when the other String has an F (two comparisons) and conversely (two
more comparisons).) For the sake of discussion, let's say that c = 4. Of course, if the worst case involves
4*m comparisons, the average case will require only half as many – that is, 2*m comparisons.
     Next, as we compare a test String against the Strings in the auxiliary array, I can expect that on
average, the procedure will have to search 50% of the elements before finding a match -- this means that it
will compare against n/2 Strings. Until a match is found, I can expect average case performance of the
comparison procedure provided by the equals() method (as described above). Thus, the total number of
comparisons on average is 2 * m * (n/2) = m*n. In practice, n doesn't tend to be too much larger than m, so
the average lookup time is on the order of m^2, which isn't too bad for an unordered collection. In practice, I
can load thousands of rules and thousands of conditions using this new algorithm, and lookup times are
acceptable.
     To see the implementation of the new algorithm, take a look at the new framework class
Algorithm_manyStars, and examine the addRule() and lookUp() methods. You will notice that
in order to initialize the auxiliary array, this class needs to know how many rules there are. For this reason,
you must now pass this data into the new Rules class when you create a new Rules instance. Also, the
new equals() method, which performs a comparison of Strings that takes into account the wildcard
symbol, is located in the new Util class.
     It appears that the new algorithm makes the old algorithm obsolete. For the most part, this is true.
However, as I shall explain in the next section, there are special cases in which the old algorithm does
better (certainly the two algorithms are close when there are no wildcards). In order to handle the presence
of two algorithms either of which may be dynamically invoked, I have created an abstract class
Algorithm and two subclasses Algorithm_manyStars and Algorithm_fewStars. When the
Rules class needs to select an algorithm to use, the framework uses certain criteria to decide which one to
use, and then turns control over to the selected class. I will have more to say about this new design below.
     The cases in which the old algorithm outperforms the new one, and the criteria for algorithm selection,
arise as part of a discussion about checking for rule consistency--the topic that I will take up next.

Consistency Checking

     It is vital to be able to verify that no pairs of inconsistent rules are creeping into our framework
implementations. Luckily, implementing a reliable consistency checker is easy to do. The basic idea is that,
with each call to addRule(boolStr, action) in Rules, I should look to see if the key boolStr
has already been used, and if so, whether in that usage its matching Action is the same as action. If the
key has already been used but has been matched with a different Action, I need to throw an
InconsistentRulesException.
     Since I've described two algorithms now for loading rules, and since I've claimed (without support so
far) that neither algorithm can be discarded, I will need to implement this consistency checking strategy for
both algorithms. In fact, since the discussion in this section will require some significant additional
comparison of these two algorithms, I will take a moment to describe how I have integrated them into the
new framework design. Figures 2 and 3 below show their roles and relationships.
Figure 2. Class diagram for the new algorithm classes.
Figure 3. Interactions with the new algorithm classes.

     As you can see, in the new design, the Rules class now aggregates a single instance of Algorithm,
which it obtains from a call to the AlgorithmAndConsisCheckSelector to select the correct
algorithm. All the criteria that I use (and which I will explain in this section) for selecting one of the
algorithms live in AlgorithmAndConsisCheckSelector. Algorithm is an abstract class with
two primary abstract methods: addRule(String boolStr, Action action) and
lookUp(Vector results). Its two subclasses, Algorithm_manyStars and
Algorithm_fewStars, implement these methods, using the different algorithms described in the last
section. The "many stars" algorithm (which works well with arbitrarily many wildcards) is the new,
speedier algorithm I described in the last section, while the "few stars" algorithm (which is efficient only
when the number of wildcards is small) is the old algorithm, which now includes the refinements that I
discussed in earlier sections. When Rules receives a request to addRule() or to lookUp(), it
delegates the request to its Algorithm instance, which then invokes its own algorithm to handle the
request. (This design is a version of the strategy pattern described in the well-known Design Patterns book
-- see Resources.)
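
The delegation can be sketched as below. This is a simplified illustration, not the framework's classes: every name ending in "Sketch" is hypothetical, Strings stand in for Action instances and for the Vector of results, and the selection criterion in the constructor is a placeholder rather than the logic of AlgorithmAndConsisCheckSelector.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// All names ending in "Sketch" are hypothetical; Strings stand in for Action instances.
abstract class AlgorithmSketch {
    abstract void addRule(String boolStr, String action);
    abstract String lookUp(String results);
}

// "Few stars": expand every wildcard when the rule is added, so lookup is a single get().
final class FewStarsSketch extends AlgorithmSketch {
    private final Map<String, String> expanded = new HashMap<>();

    @Override
    void addRule(String boolStr, String action) {
        expand(boolStr, 0, new StringBuilder(), action);
    }

    private void expand(String s, int pos, StringBuilder prefix, String action) {
        if (pos == s.length()) {
            expanded.put(prefix.toString(), action);
            return;
        }
        char c = s.charAt(pos);
        char[] choices = (c == '*') ? new char[] {'T', 'F'} : new char[] {c};
        for (char choice : choices) {
            prefix.append(choice);
            expand(s, pos + 1, prefix, action);
            prefix.setLength(prefix.length() - 1); // backtrack
        }
    }

    @Override
    String lookUp(String results) {
        return expanded.get(results);
    }
}

// "Many stars": store the raw String and scan with a wildcard-aware comparison at lookup.
final class ManyStarsSketch extends AlgorithmSketch {
    private final Map<String, String> rules = new LinkedHashMap<>();

    @Override
    void addRule(String boolStr, String action) {
        rules.put(boolStr, action);
    }

    @Override
    String lookUp(String results) {
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            if (matches(rule.getKey(), results)) {
                return rule.getValue();
            }
        }
        return null;
    }

    private static boolean matches(String pattern, String results) {
        if (pattern.length() != results.length()) return false;
        for (int i = 0; i < pattern.length(); i++) {
            char p = pattern.charAt(i);
            if (p != '*' && p != results.charAt(i)) return false;
        }
        return true;
    }
}

// The rules object holds one algorithm and delegates every request to it.
final class RulesSketch {
    private final AlgorithmSketch algorithm;

    RulesSketch(int maxNumStars) {
        // Placeholder criterion, for illustration only.
        algorithm = (maxNumStars <= 2) ? new FewStarsSketch() : new ManyStarsSketch();
    }

    void addRule(String boolStr, String action) { algorithm.addRule(boolStr, action); }

    String lookUp(String results) { return algorithm.lookUp(results); }

    String algorithmName() { return algorithm.getClass().getSimpleName(); }
}
```

Callers of RulesSketch never see which subclass is in use, so the selection criteria can change without touching any code that adds or looks up rules.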
     In implementing consistency checking for the old algorithm, I will choose to invoke consistency
checking inside the BooleanTupleFactory at the point where the passed in boolean String has
been converted to an array of boolean Strings that have no wildcards, just as they are being converted to
BooleanTuples. Each time we get another BooleanTuple to add to the BooleanTuple array, I
invoke its hashCode() method, check to see if that code is matched with a non-null Action in the rule
table, and if so, whether the passed in Action is equal to the Action at the hash index. Obviously, this
procedure is extremely efficient relative to the number of BooleanTuples produced by the
BooleanTupleFactory. This is how I implemented this consistency checker:

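// Throws if the slot for this tuple already holds a different Action.
private static void checkConsistency(Action action, int hash, BooleanTuple booltup,
                                      Action[] ruleTable) throws InconsistentRulesException {
   // ruleTable is indexed by the tuple's hash code; a null entry means the slot is unused
   Action currentAction = ruleTable[hash];
   if(currentAction != null && !currentAction.equals(action)) {
       throw new InconsistentRulesException();
   }
}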
The loadTuples() method in the BooleanTupleFactory that calls this method hashes the new
BooleanTuple and passes the hash value in, along with the Action, BooleanTuple, and
ruleTable.
    For the new algorithm, I will also perform consistency checking as each rule is added, verifying that I
am not introducing an inconsistency with any new rule I attempt to add. In order to compare each new
boolean String with the ones added so far, I will need to use the slower lookup procedure that this
algorithm uses. Here is the implementation:

for(int i = 0; i < nextAvailableIndex; ++i) {
    if(Util.equals(boolStrings[i], boolStr) &&
        !(action.equals((Action)ruleTable.get(boolStrings[i])))) {
             throw new InconsistentRulesException();
    }
}
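
For readers who want to run the check in isolation, here is a self-contained sketch built around the same loop. It is not the framework code: the class ConsistencyCheckedRules is hypothetical, Strings stand in for Action instances, the nested exception is a stand-in for the framework's InconsistentRulesException, and an ArrayList replaces the array and nextAvailableIndex.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; Strings stand in for Action instances.
final class ConsistencyCheckedRules {
    // Stand-in for the framework's exception class.
    static final class InconsistentRulesException extends RuntimeException {
        InconsistentRulesException(String existing, String added) {
            super("Rule " + added + " conflicts with existing rule " + existing);
        }
    }

    private final Map<String, String> ruleTable = new HashMap<>();
    private final List<String> boolStrings = new ArrayList<>();

    // True when some String of Ts and Fs is covered by both patterns.
    static boolean overlaps(String s, String t) {
        if (s.length() != t.length()) return false;
        for (int i = 0; i < s.length(); i++) {
            char a = s.charAt(i);
            char b = t.charAt(i);
            if ((a == 'T' && b == 'F') || (a == 'F' && b == 'T')) return false;
        }
        return true;
    }

    // Rejects a rule that overlaps an earlier rule bound to a different action.
    void addRule(String boolStr, String action) {
        for (String existing : boolStrings) {
            if (overlaps(existing, boolStr) && !action.equals(ruleTable.get(existing))) {
                throw new InconsistentRulesException(existing, boolStr);
            }
        }
        if (!ruleTable.containsKey(boolStr)) {
            boolStrings.add(boolStr);
        }
        ruleTable.put(boolStr, action);
    }

    int size() {
        return boolStrings.size();
    }
}
```

Note that the comparison also catches two wildcard patterns that merely overlap: FF* and F*T both cover FFT, so binding them to different actions is reported as an inconsistency even though the two Strings are not identical.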
     It will be useful to know how efficient this procedure is. Again, let's say that there are m
Conditions and n Rules. Let's suppose we are in the process of adding the kth rule, so that our
auxiliary array has k-1 Strings so far. It will take on average 4*m*(k-1)/2 = 2*m*(k-1) comparisons to
verify that the new boolean String has not been used so far (the typical situation). Summing up over the
n rules yields on the order of m*n*n comparisons. At best, when n is close to m, this is a cubic algorithm;
at worst, the computing time is proportional to the square of a large number of rules.
     In the presence of many rules, over 1000 say, you have to wonder whether the new algorithm will start
to slow down when consistency checking is enforced, since it has to perform around 10 million
comparisons in order to load rules. By contrast, if you load 1000 rules using the old algorithm, and there
are no wildcards, you would expect to see no performance degradation as a result of using consistency
checking.
     Using the test environment I described earlier, which allowed me to test performance by varying the
number of conditions, number of rules and number of wildcards (this test environment is part of the
downloadable sample code for this article -- see Resources), I discovered some valuable performance
details. Certainly, the exact amount of time required to execute the various scenarios that I tried will depend
on the processor speed, bus speed, the operating system and extraneous factors. But the fact that one
algorithm performs better than another under identical conditions depends much less on these factors.
     I found that neither algorithm can check consistency of more than 16,000 rules in a reasonable amount
of time (it took more than 10 seconds for each of them on my Win 98 machine with a 400 MHz processor).
The old algorithm does better when there are fewer wildcards, and can handle from 1000 to 8000 rules with
0 to 2 wildcards better than the new algorithm. For 1000 to 4000 rules with three or more wildcards, the
new algorithm is preferable. I have summarized the results in the following table:

                                                   Average Number of Wildcards

Number of Rules              0                   1                   2                 3                4
   500-1000                 old                 new                 new               new              new
   1000-2000                old                 old                 old               new              new
   2000-4000                old                 old                 old               new              new
   4000-8000                old                 old                 old               none             none
   8000-16000               old                 old                 none              none             none

Table 1. Best performing algorithms with consistency checking

     The entries in the main cells of the table indicate which of the algorithms performs best. The
cells marked "none" indicate that neither algorithm performs acceptably under those conditions.
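
     A decision table like Table 1 translates naturally into a small lookup. The following is a minimal
sketch only, not the code in the framework itself: the class name Table1Lookup, the enum Choice and the
method choose() are hypothetical, the cells where neither algorithm performs acceptably are mapped to
NONE, and each range of rule counts is assumed to include its upper bound (the table does not say which
row a boundary value such as 1000 belongs to).

```java
// Illustrative lookup over Table 1. The class, enum, and method names are
// hypothetical; this is not the framework's AlgorithmAndConsisCheckSelector.
final class Table1Lookup {
    enum Choice { OLD, NEW, NONE }

    // Short aliases to keep the table readable.
    private static final Choice O = Choice.OLD, N = Choice.NEW, X = Choice.NONE;

    // Inclusive upper bound of each "Number of Rules" row (an assumption,
    // since adjacent ranges in the table share their boundary value).
    private static final int[] ROW_UPPER = {1000, 2000, 4000, 8000, 16000};

    // One row per rule-count range; columns are average wildcard counts 0..4.
    // X marks the cells where neither algorithm performs acceptably.
    private static final Choice[][] TABLE = {
        {O, N, N, N, N},   // 500-1000
        {O, O, O, N, N},   // 1000-2000
        {O, O, O, N, N},   // 2000-4000
        {O, O, O, X, X},   // 4000-8000
        {O, O, X, X, X},   // 8000-16000
    };

    static Choice choose(int numRules, int avgWildcards) {
        if (numRules < 500 || numRules > 16000
                || avgWildcards < 0 || avgWildcards > 4) {
            throw new IllegalArgumentException("outside the range covered by Table 1");
        }
        for (int row = 0; row < ROW_UPPER.length; row++) {
            if (numRules <= ROW_UPPER[row]) {
                return TABLE[row][avgWildcards];
            }
        }
        throw new AssertionError("unreachable: numRules was range-checked above");
    }

    public static void main(String[] args) {
        System.out.println(choose(3000, 3));   // prints NEW
        System.out.println(choose(12000, 2));  // prints NONE
    }
}
```

     Keeping the table as data rather than as nested if statements means that new test results can be
folded in by editing a single array.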
     In my tests, all my boolean Strings had exactly the same number of wildcards. So, what I call in the
table the "average number of wildcards" always turned out to be the actual number of wildcards in each of
my Strings. In practice, however, the number of wildcards varies from String to String. And when
there are many rules, it would be unrealistic to expect a framework user to actually compute the average
number of wildcards in order to be able to use the framework. Luckily, it is possible to estimate this
average number using the number of rules and the number of conditions -- see the
computeAvgNumStars() method in the new Rules class for details. Because I need to know the average
number of wildcards, the new Rules constructor requires that the number of conditions be passed in.
(Combined with requirements mentioned earlier, this means that you can construct a new Rules instance
only if you pass in the number of conditions, number of rules, and maximum number of wildcards
occurring.)
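
     To see why the number of rules and the number of conditions can be enough for such an estimate,
consider one plausible heuristic. This is a sketch under stated assumptions, and the actual
computeAvgNumStars() in the sample code may work differently. A boolean String over n conditions with k
wildcards matches 2^k of the 2^n possible combinations. If we assume that the R rules do not overlap,
that together they cover every combination, and that each has the same number of wildcards, then
R * 2^k = 2^n, which gives k = n - log2(R). The class and method names below are hypothetical.

```java
// Hypothetical estimate of the average number of wildcards per rule.
// Assumes non-overlapping rules that together cover all 2^n combinations,
// each with the same number of wildcards; this is NOT the framework's
// computeAvgNumStars(), only an illustration of the idea.
final class WildcardEstimate {
    static double estimateAvgStars(int numConditions, int numRules) {
        if (numConditions < 0 || numRules < 1) {
            throw new IllegalArgumentException("need numConditions >= 0 and numRules >= 1");
        }
        // log2(R) computed via natural logarithms.
        double log2Rules = Math.log(numRules) / Math.log(2.0);
        // More rules than combinations would give a negative value; clamp to 0.
        return Math.max(0.0, numConditions - log2Rules);
    }

    public static void main(String[] args) {
        // 10 conditions and 256 rules: roughly 10 - 8 = 2 wildcards per rule.
        System.out.println(estimateAvgStars(10, 256));
    }
}
```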
     Because even the algorithm of choice eventually slows down significantly when consistency checking is
used, I decided to make consistency checking an optional feature of the framework rather than
the default behavior. Likewise, because there are occasions when the old algorithm is preferable, I
decided to give the framework user the option of selecting an algorithm. If the framework user does not
explicitly request consistency checking, the framework will not enable this functionality. If the user
requests consistency checking but does not specify an algorithm preference, and there is a large number of
rules, I use the table above to decide which algorithm to use. These and other decisions are made in the
class AlgorithmAndConsisCheckSelector, whose logic I implemented using the If-Then-Else
Framework. (This provides another example of the framework in action. Unfortunately, if you are trying to
follow the URLProcessor_good implementation of the framework and start reading the code in
AlgorithmAndConsisCheckSelector, where a separate implementation occurs, you may find it all
quite confusing. I would recommend staying away from the code in
AlgorithmAndConsisCheckSelector while you are working with framework implementations.)
     When the consistency checking option is selected and the framework is unable to find a suitable
algorithm (for example, if there are more than 16,000 rules), or if an algorithm has also been selected but is
unsuitable, the framework throws a PoorPerformanceException.
     You specify these options in the new framework through new versions of the Rules constructor. The
signatures of the two main versions of this constructor are as follows:

public Rules(int numConditions, int numRules, int maxNumStars);
public Rules(int numConditions, int numRules, int maxNumStars,
             Algorithm alg, Boolean consisCheck);

If you don't wish to make use of any of the options, you can use the first constructor, or use the second
constructor and pass in null for the last two arguments. If you wish to use just one of the two new options,
use the second constructor and pass in null for the argument you don't care about.
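
     The null-means-default convention can be sketched as follows. The types here are stand-ins so that
the example compiles on its own; RulesSketch and AlgorithmStub are hypothetical names, not the Rules and
Algorithm classes from the sample code, and the constructor bodies are an assumption about how such a
convention is typically implemented.

```java
// Stand-in for the framework's Algorithm type (hypothetical).
interface AlgorithmStub { }

// Stand-in for the Rules class, showing only how the two constructor
// signatures could relate to each other; the real class does much more.
final class RulesSketch {
    final int numConditions;
    final int numRules;
    final int maxNumStars;
    final AlgorithmStub alg;     // null: let the framework choose
    final boolean consisCheck;   // false unless Boolean.TRUE was passed

    // Three-parameter form: no options requested.
    RulesSketch(int numConditions, int numRules, int maxNumStars) {
        this(numConditions, numRules, maxNumStars, null, null);
    }

    // Five-parameter form: either option may be null ("don't care").
    RulesSketch(int numConditions, int numRules, int maxNumStars,
                AlgorithmStub alg, Boolean consisCheck) {
        this.numConditions = numConditions;
        this.numRules = numRules;
        this.maxNumStars = maxNumStars;
        this.alg = alg;
        // Null-safe test: only an explicit Boolean.TRUE enables the check.
        this.consisCheck = Boolean.TRUE.equals(consisCheck);
    }

    public static void main(String[] args) {
        RulesSketch plain = new RulesSketch(5, 32, 2);
        RulesSketch checked = new RulesSketch(5, 32, 2, null, Boolean.TRUE);
        System.out.println(plain.consisCheck);    // prints false
        System.out.println(checked.consisCheck);  // prints true
    }
}
```

     Using the wrapper type Boolean rather than the primitive boolean is what allows null to stand for
"no preference" in the last argument.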

Summary of Results

     The following table summarizes the results of our performance/maintenance analysis and enhancement
session:

Performance/Maintenance Problem                       Solution and Resulting Enhancements
No enforcement of proper order of         The new framework provides, as an option, condition and rule
evaluation                                sequencers that do enforce an order of evaluation. This order is
                                          specified with a user-defined interface of integer constants,
                                          implemented by the user-defined subclass of Invoker.
Performance degradation as the            Analysis and testing showed that the bottleneck was located in
number of conditions and rules            the addRules() method. The new framework provides a new
increases                                 default algorithm for adding rules and doing lookups. Since the
                                          old algorithm is still occasionally preferable, an enhanced version
                                          of that algorithm has been retained, and now the framework
                                          dynamically chooses which algorithm to use based on criteria
                                          gathered from testing.
Lack of consistency checking as rules     The new framework makes consistency checking available. Since
are added                                 this feature can degrade performance, it is provided as a
                                          selectable option rather than as the default. The new consistency
                                          checking feature does not degrade performance when the number
                                          of rules does not exceed 1000.

Table 2. Summary of solutions to performance/maintenance challenges

    The enhancements that address these problems require very few changes to the way in which a
framework user interfaces with the framework. The few changes there are, however, need to be
noted. I have summarized them all in the table below:


   Framework Function                            How User Interfaces With New Framework
Use of the Invoker class         When you create an instance of Invoker, before calling the
                                 execute() method you must now explicitly call its init() method.
                                 This makes it possible to initialize other classes, such as a rule-loader,
                                 before calling the loadRules() method (which is called by init()).
New Exceptions                   When you create and initialize Invoker and ask it to execute(), you
                                 must handle the following new Exceptions:
                                 InconsistentRulesException,
                                 PoorPerformanceException
                                 (Note that the NestingTooDeepException has been replaced by the
                                 PoorPerformanceException.)
Use of the Rules class           To create a new Rules instance (which is typically done when you
                                 implement the loadRules() method), you must use one of the new
                                 constructors. The three-parameter constructor requires that you pass in the
                                 number of conditions, number of rules and maximum number of
                                 wildcards. The five-parameter constructor requires you to pass in the same
                                 first three parameters, and also a selection of an algorithm (an instance of
                                 Algorithm) and a choice (a Boolean) about consistency checking;
                                 passing in Boolean.TRUE means that you are requesting consistency
                                 checking.
Optional use of order of         If you want to enforce adherence to an order of evaluation, you can create
evaluation interface and         an interface of integer constants whose names correspond to
sequencers                       Condition instances, and then use sequencers to load rules and load
                                 conditions. This approach is optional, though; it is still possible to use the
                                 framework in the original way, using code comments only as a reminder
                                 about the order of evaluation.
Optional use of consistency      By default, consistency checking is not done. You can request the
checking                         framework to activate consistency checking when you create a new
                                 Rules instance, by passing in a Boolean.TRUE as the fifth parameter
                                 in the five-parameter Rules constructor.
Optional selection of an         If you do not select an algorithm, the framework will select one by
algorithm                        considering the number of rules, the average number of wildcards used in
                                 boolean Strings, and whether consistency checking has been requested,
                                 and also by consulting, if necessary, an internal table that indicates which
                                 algorithm performs better in a variety of scenarios. If you do select an
                                 algorithm, the framework will attempt to use it. To create an instance of an
                                 algorithm, you can use the following code:
                                    Algorithm alg = (Algorithm)Util.getInstance(Algorithm_<choice>.NAME);
                                 where <choice> is either "fewStars" or "manyStars".

Table 3. New idioms for interfacing with the modified framework.


Conclusion

     The result of carefully analyzing and testing the framework algorithms and implementation
commitments has been the creation of a "new and better product" -- the revised framework that I have
introduced here performs well even in the presence of thousands of conditions and rules. It also provides
new user options and safety mechanisms -- such as consistency checking and automatic enforcement of
order of evaluation -- that enable a framework user to sharply reduce the risks involved in using the
framework for larger-scale projects. For logic-intensive projects in which you must implement complex
branching logic, I would recommend the If-Then-Else framework as the right tool for the job.

				