Sunday, November 21, 2010

When refactoring does not help

Part of an messaging application I'm working on contains a complex piece of code. This part deals with deciding on basis of content and recipients who is to receive an incoming message.
for (BusRoute route: busRoutes){
  if (route.isMatching(report)){
    for(MessageFilter filter: messageFilters){
      if (filter instanceof DelayOnBusRouteMessageFilter){
        DelayOnBusRouteMessageFilter delayFilter = (DelayOnBusRouteMessageFilter)filter;
        if (delayFilter.getBusRoute().equals(route)){
          if (route.isApproachingRoutePoint(report, delayFilter)){
            if (delayFilter.getDelayApproachingThreshold() < threshold || threshold == 0){
              threshold = delayFilter.getDelayApproachingThreshold();
              deltaThreshold = delayFilter.getDeltaThreshold();
              ....
            else....
             ...
          else ...
What's wrong with this code? Being overly complicated it is difficult to test and maintain and without doubt it will probably contain multiple errors, i.e. it is very difficult to verify that this piece of code indeed implements the specified  requirements of what this part of the application is supposed to do. Trying to refactor it to bring down the cyclomatic complexity (McCabe index - counts separate execution paths) didn't help much. I was looking for other ways to improve code quality when I remembered using a rule engine to implement business rules for a proof of concept some years back. Rule engines are good when you can write declarative rules and want to externalize business rules or have a requirement that the rules should be dynamically modifiable at runtime. Working with a rule engine requires a different way of thinking compared to OO or even functional programming. Some understanding of the rule engine basics is necessary before we go on.


Rule engine basics
Most rule engines are based on the Rete algorithm, which is a matching algorithm in modern forward-chaining inferencing rules engines. Without going into details, in contrast to naïve implementations which loop over each rule and check for matching conditions, the Rete algorithm builds a network of nodes, where each node corresponds to a pattern occurring in the condition part of a rule. A rule is defined with a condition (when) part and a consequence (then) part:
rule "my very simple first rule"
 when
  Person(age > 30 && < 40 || hair == "black")
 then
   retract(p)
In the condition part you access field constraints of the facts present in rule engine's working memory, f.i. to match all persons between 30 and 40 year of age or with black hair. Then in the consequence part you put actions to execute when this rule's condition evaluates to true and this rule is executed, f.i. the selected person is retracted from the rule engine's working memory. The normal sequence of interactions with a rule engine is as follows:
  1. insert all facts
  2. fire all rules
  3. retrieve the results
Facts are represented by object instances, for the example above we would have a Person object with a int property age and a String property hair. In the first step we insert each fact into the working memory of the engine, which implicitly constructs the network of matching conditions. Next we let the rule engine fire all rules against the inserted facts. The engine will put all rules for which the conditions evaluate to true in a certain order on an agenda list and then removes and executes the first rule on the agenda, i.e. run the consequence (then) part of the rule. If this execution changes the working memory (retract fact(s) or insert of new fact(s)) then all rules are again evaluated and a new agenda results from this. The engine then proceeds with the execution of the first rule on the agenda again. This process repeats itself until the agenda is empty. Now the fire-all-rules step has finished and we proceed with retrieving the results, which can be the list of remaining facts or an arbitrary result object created or filled by a rule consequence. From the above you may have noticed that the order in which rules are put on the agenda makes a big difference. Therefore rule engines offer additional directives to group and prioritize rules as they appear on the agenda. Also they offer a large variation in how you may construct the condition part and the consequence part, you can use rule engine's native language or may extend this language with a self written dsl. It is very important that the getters of Fact objects or other data consulted in the conditional part of rules may not have side effects, because of how the Rete algorithm works. Remember, only change fact data in the consequence part and don't forget to trigger the rule engine to reconsider the conditions of the rules if that happens.


Setting up the environment
I set out to work to rewrite the code as a set of rules to be executed by the rule engine. Since I'm working on a Java project, I selected  Drools Expert, open source software from Jboss/Redhat for this task. For a .Net project you could use Microsoft BRE. The remainder of this part will focus on setting up the Eclipse IDE to write (and debug) the rules and supporting code to use the Drools rule engine for a Java application. First we install the Jboss Tools plugin in Eclipse. This plugin gives us several rule editors, a Drools perspective and the execution/debug facilities for Drools. Now we can create a new Drools project or convert an existing project to a Drools project. If you choose for a new project, you can have a sample Hello World drools files generated, which provide great starting point for playing around with the rule engine. The following files are being generated:
  • Sample.drl
  • DroolsTest.java
rule "Hello World"
 when
  m : Message( status == Message.HELLO, myMessage : message )
 then
  System.out.println( myMessage ); 
  m.setMessage( "Goodbye cruel world" );
  m.setStatus( Message.GOODBYE );
  update( m );
end

rule "GoodBye"
 when
  Message( status == Message.GOODBYE, myMessage : message )
 then
  System.out.println( myMessage );
end
The first rule matches any Message with status equal to HELLO. In the consequence the message property is printed, the status of the message is changed to GOODBYE and also the message is changed. Two things to notice here. The 'm :', which binds a new local variable called 'm' to the matched Message instance. And also the myMessage variable is bound to the message property of the Message instance. The auto created variables can be references in subsequent 'when conditions' or in the consequence part of the rule like in this example. The second thing to notice is the update(m) statement, which notifies the rule engine that the Message instance was modified. This means that the engine will clear the agenda and reevaluate all rules, which sounds like a big thing, but can be accomplished very efficiently by the engine because of the Rete algorithm.
public class DroolsTest {

  public static final void main(String[] args) {
    try {
      // load up the knowledge base
      KnowledgeBase kbase = readKnowledgeBase();
      StatefulKnowledgeSession ksession = kbase.newStatefulKnowledgeSession();
      KnowledgeRuntimeLogger logger = KnowledgeRuntimeLoggerFactory.newFileLogger(ksession, "test");
      // go !
      Message message = new Message();
      message.setText("Hello World");
      message.setStatus(Message.HELLO);
      ksession.insert(message);
      ksession.fireAllRules();
      logger.close();
    } catch (Throwable t) {
      t.printStackTrace();
    }
  }

  private static KnowledgeBase readKnowledgeBase() throws Exception {
    KnowledgeBuilder kbuilder = KnowledgeBuilderFactory.newKnowledgeBuilder();
    kbuilder.add(ResourceFactory.newClassPathResource("Sample.drl"), ResourceType.DRL);
    KnowledgeBuilderErrors errors = kbuilder.getErrors();
    if (errors.size() > 0) {
      for (KnowledgeBuilderError error : errors) {
        System.err.println(error);
      }
      throw new IllegalArgumentException("Could not parse knowledge.");
    }
    KnowledgeBase kbase = KnowledgeBaseFactory.newKnowledgeBase();
    kbase.addKnowledgePackages(kbuilder.getKnowledgePackages());
    return kbase;
  }

  public static class Message {
    public static final int HELLO = 0;
    public static final int GOODBYE = 1;
    private String text;
    private int status;
    getters and setters ...
  }
}
The following picture shows the audit logging after execution of the program. The audit log is a tremendous help in understanding how rules are executed.
Audit logging (click to enlarge)
Posted by Picasa
The Message object is inserted, which results in the creation of an activation (rule is placed on the agenda). This line of logging is a result of the line ksession.insert(message) in main(). Next the program calls fireAllRules(). This will execute the first activation on the agenda, i.e. Rule Hello World. The activiation is executed (i.e. the consequence part of the rule is executed) which creates an activiation of the Goodbye rule. Then the Goodbye rule activation is executed which completes the rule engine, because there are no activiations on the agenda.

Conclusion
Now we have some understanding of how a rule engine works. I´ve managed to apply this technology to resolve our complexity problem in the messaging application. The resulting rule files are small and concise and easily maintainable because there is a clear relation to the functional requirements.
Dependency injection (Spring, CDI) gave us the ability to create applications with loosly coupled components. A rule engine can help us improving the internal logic of components on specific spots where complex if.. then constructions cause maintenance to be a helluvajob.

Tuesday, November 16, 2010

Using Selenium and Ajax on Richfaces

We use Selenium for functional testing of a Web Application based on Seam 2 and Richfaces. The test script typically invokes an action by pushing a button or clicking a link, which submits a form or issues an Ajax Request. Next you wait for the new Page to load or wait for an element to (dis)appear or a specific element value change. Then you assert the elements on the page to have the expected values.
Sometimes a Ajax request doesn't result in any change on the page or you don't want to include the expected page changes in the waiting code. On what condition do we wait before continuing. The easiest and also uggliest way is to insert a sleep. Problem is that we have to sleep long enough for the server to finish the ajax call. This is considered bad practice because now the test completion depends on  the hardware the test is executed and all tests will execute a considerable time longer than strictly necessary.
In our application we use the Richfaces a4j:status component (html is: _viewRoot:status.start) to show an indicator (shows text 'busy') on the page while the ajax request is being processed.  I created a waitForAjax method in our test base class which conditionally waits until the indicator is shown on the page and suqsequently waits until the indicator disappears again.
protected void waitForAjax() {
       waitForAnyText("_viewRoot:status.start", 5);
       waitForNoText("_viewRoot:status.start", 5);
}

protected void waitForAnyText(final String locator,int timeoutAsSeconds){
   waitFor(new WaitForCallback(){
     public boolean conditionMet(){
       return !selenium.getText(locator).equals("");
     }
   }, timeoutAsSeconds);
}

protected void waitForNoText(final String locator,int timeoutAsSeconds){
   waitFor(new WaitForCallback(){
     public boolean conditionMet(){
       return selenium.getText(locator).equals("");
     }
   }, timeoutAsSeconds);
}

protected void waitFor(WaitForCallback cb, int timeoutAsSeconds){
  for(int second=0;;second++) {
    if (second >= timeoutAsSeconds){
      fail("timeout");
    }
    if(cb.conditionMet()){
      break;
    }
    try {
      Thread.sleep(1000);
    }
    catch (InterruptedException e){}
  }
}
protected static interface WaitForCallback {
  boolean conditionMet();
}

The problem with this code is that if the server is really quick, the Ajax indicator is already gone when the script executes the waitForAnyText("_viewRoot:status.start") call, and the test will fail after the specified timeout. To fix this we need a non failing waitForAnyText().

protected void waitForAjax() {
       waitForAnyText("_viewRoot:status.start", 5, false);
       waitForNoText("_viewRoot:status.start", true);
}
protected void waitForAnyText(final String locator,int timeoutAsSeconds, boolean fail){
   waitFor(new WaitForCallback(){
     public boolean conditionMet(){
       return !selenium.getText(locator).equals("");
     }
   }, timeoutAsSeconds, fail);
}

protected void waitForNoText(final String locator,int timeoutAsSeconds, boolean fail){
   waitFor(new WaitForCallback(){
     public boolean conditionMet(){
       return selenium.getText(locator).equals("");
     }
   }, timeoutAsSeconds, fail);
}

protected void waitFor(WaitForCallback cb, int timeoutAsSeconds, boolean fail){
  for(int second=0;;second++) {
    if (second >= timeoutAsSeconds){
      if (fail) {fail("timeout");} else {break;}
    }
    if(cb.conditionMet()){
      break;
    }
    try {
      Thread.sleep(1000);
    }
    catch (InterruptedException e){}
  }
}
So, now after calling waitForAjax() from our test script we know for certain that the ajax call completed and we have waited only the minimum amount of time.