February 20, 2013

Profiling in Henshin

Graph transformations are a powerful formalism for realizing pattern- and rule-based modifications of high-level data structures. A lot of work has been done on the theoretical foundations of graph transformations, and a large part of this work is phrased in category theory. One of the nice things about category theory is that constructions and operations are not given explicitly -- instead, the results of operations are defined in terms of universal properties. For graph transformations this means, for example, that the result of a rule application exists and is uniquely defined (up to isomorphism). While this definition is useful from a theoretical perspective, it is not really helpful when you actually try to use graph transformations in practice, e.g., when you try to implement a graph transformation engine yourself. This is because the universal properties don't tell you anything about the algorithms and optimizations required to realize the operations efficiently.

The most time-critical part when implementing a graph transformation engine such as the Henshin interpreter is the matching engine for rules. The task is: given the left-hand side of a rule (and possibly nested application conditions), find one or all matches of this pattern in a given host graph. In Henshin, this is implemented using constraint-solving techniques and heuristics for finding good search plans. Nevertheless, as a user you can still be confronted with cases where a transformation takes an excessive amount of time. In such situations, it is very important to be able to find the bottlenecks in large and complex transformations.
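To get an intuition for why the search plan -- the order in which the pattern nodes are bound -- matters so much, here is a deliberately naive, hypothetical backtracking matcher. This is not Henshin's actual implementation; graphs are simplified to plain adjacency maps over integer node ids, and all names are made up for illustration:

import java.util.*;

// A naive backtracking matcher sketch (NOT Henshin's implementation).
// A bad search plan defers edge constraints and explores many dead branches;
// a good plan prunes them early.
class NaiveMatcher {

    Map<Integer, Set<Integer>> host;    // host graph: node -> successors
    Map<Integer, Set<Integer>> pattern; // left-hand side: node -> successors

    // Tries to bind the pattern nodes in the order given by the search plan.
    boolean find(List<Integer> plan, Map<Integer, Integer> match) {
        if (match.size() == plan.size()) {
            return true; // all pattern nodes are bound: match found
        }
        Integer next = plan.get(match.size());
        for (Integer candidate : host.keySet()) {
            if (match.containsValue(candidate)) {
                continue; // enforce injective matching
            }
            match.put(next, candidate);
            if (consistent(match) && find(plan, match)) {
                return true;
            }
            match.remove(next); // backtrack
        }
        return false;
    }

    // Every pattern edge between two bound nodes must exist in the host graph.
    boolean consistent(Map<Integer, Integer> match) {
        for (Map.Entry<Integer, Integer> entry : match.entrySet()) {
            Set<Integer> succs = pattern.getOrDefault(entry.getKey(), Collections.<Integer>emptySet());
            for (Integer target : succs) {
                Integer image = match.get(target);
                if (image != null
                        && !host.getOrDefault(entry.getValue(), Collections.<Integer>emptySet()).contains(image)) {
                    return false;
                }
            }
        }
        return true;
    }
}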

In Henshin, you can use our basic profiling support to find out which part of your transformation is time-critical. Let's take a look at an example. The rules and units below are taken from an old version of the Grid & Comb Pattern example and are used to construct a (full) grid of a given width and height.


The main unit of this transformation is buildGrid. We can execute this unit using Henshin's interpreter API, e.g., like this:
// Create the interpreter engine and turn off the automatic
// variable ordering (more on this option below):
Engine engine = new EngineImpl();
engine.getOptions().put(Engine.OPTION_SORT_VARIABLES, false);

// Apply the buildGrid unit to the graph, without an application monitor:
UnitApplication application =
   new UnitApplicationImpl(engine, graph, unit, null);
application.setParameterValue("width", 40);
application.setParameterValue("height", 40);
application.execute(null);
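For completeness, here is a hedged sketch of how graph and unit in the snippet above might be obtained (imports are omitted as in the snippets above; the file name and path are assumptions for this example):

// Load the transformation module and pick its main unit
// (the file name "grid.henshin" is an assumption):
HenshinResourceSet resourceSet = new HenshinResourceSet("path/to/example");
Module module = resourceSet.getModule("grid.henshin");
Unit unit = module.getUnit("buildGrid");

// The grid is constructed from scratch, so we start with an empty graph:
EGraph graph = new EGraphImpl();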
If we run this with the old version of the example, the execution takes about one minute, which is really slow. The generated grid has only 1,600 nodes, so it should be possible to do this much faster. This is a good opportunity to try out Henshin's profiling support. The trick is to pass a ProfilingApplicationMonitor to the unit execution, like this:
ProfilingApplicationMonitor monitor =
   new ProfilingApplicationMonitor();
application.execute(monitor); 
monitor.printStats();
This automatically collects statistics about the execution times of all rules. In the last line, we just print the gathered data. We get this output on the console:
Stats for rule 'extendFirstColumn':
 - Number of Executions: 38
 - Total execution time: 35ms
 - Aver. execution time: 0ms

Stats for rule 'startNextColumn':
 - Number of Executions: 39
 - Total execution time: 414ms
 - Aver. execution time: 10ms

Stats for rule 'extendNextColumn':
 - Number of Executions: 1521
 - Total execution time: 60086ms
 - Aver. execution time: 39ms
For every rule, we see the number of applications, the total execution time, and the average execution time. Now we see where the bottleneck is! It is the rule extendNextColumn, which is applied 1,521 times and takes on average 39ms per execution, summing up to about one minute. So why is this rule so slow? The reason is that in this version of the example, the nodes of the left-hand side of the rule are ordered in a way that is highly inefficient for the match finding. But wait, didn't I say before that Henshin uses heuristics to find good search plans? Yes, indeed. So why does that not work here? Because above, I initialized the interpreter engine with this line:
engine.getOptions().put(Engine.OPTION_SORT_VARIABLES, false);
This turns off the automatic variable ordering, which is useful only if you have optimized the node order of your rules yourself. By default, the option is enabled. To fix our example, we can simply remove that line, or set the option explicitly:
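// Let Henshin compute the search plan itself (this is also the default,
// so simply removing the line above has the same effect):
engine.getOptions().put(Engine.OPTION_SORT_VARIABLES, true);
If we re-run the example with the automatic variable ordering enabled, we get these execution times: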
Stats for rule 'extendFirstColumn':
 - Number of Executions: 38
 - Total execution time: 25ms
 - Aver. execution time: 0ms

Stats for rule 'startNextColumn':
 - Number of Executions: 39
 - Total execution time: 345ms
 - Aver. execution time: 8ms

Stats for rule 'extendNextColumn':
 - Number of Executions: 1521
 - Total execution time: 1662ms
 - Aver. execution time: 1ms
Et voilà -- the whole transformation now takes less than 2 seconds! The main point I wanted to show here is that if you have a large transformation with many rule applications, it can be very useful to use Henshin's ProfilingApplicationMonitor to quickly find out where the performance could be improved. If you have an example where a particular rule seems to take much too long, let us know on the mailing list and we will check whether the performance can be improved.

February 13, 2013

ECT 4

The Extensible Coordination Tools (ECT) are a set of plug-ins for the Eclipse IDE centered around the coordination language Reo. In the past, ECT was hosted in a Subversion repository at the CWI Amsterdam. Since most of the ECT developers no longer work at the CWI, we decided to manage the source code with a third-party provider for open source projects. ECT is now developed as a project at Google Code. We hope that this will open the project to new developers and make the tools more accessible to external users.

With the new home of ECT also comes a new build infrastructure and a new update/release strategy. We now deliver continuous updates so that we can react more quickly to problems and bugs. To reflect the new philosophy in the version numbers as well, the new version of ECT is 4.0.0.X (where X is a date-based qualifier). Hence the name ECT 4.

To install ECT 4, please first make sure that you uninstall any old ECT versions. For this, click on Help - About Eclipse - Installation Details, uninstall all Reo features with versions below 4.0.0, and restart Eclipse to complete the uninstall process. Then you are ready to install ECT 4: just point the Eclipse update manager to this URL: http://reo.project.cwi.nl/update (this is in fact the old ECT update site URL). Select the features you want to install and follow the installation instructions. Once you have upgraded to ECT 4, you can install updates as usual via Help - Check for Updates.

Note that the Distributed Engine feature is now part of the Reo Core Tools. Also, the conversion tools no longer include the BPMN converter because it required outdated packages. We hope that we can soon migrate the BPMN converter to the new BPMN2 Modeler tools.

If you would like to get involved in the development of ECT 4, you can check out the source code at our ECT project at Google Code. If you want to follow news about ECT, you can subscribe to our mailing list, which is also the best place to ask questions or to get in touch with the developers.

Last but not least, here are some stats about ECT 4. I used the cloc script to count the lines of code.

--------------------------------------------------------
Language       files       blank     comment        code
--------------------------------------------------------
Java            1400       34962       83922      142259
Scala             63        1486        3063        6845
XML               45         316          40        3819
JSP               14         148          50         531
Bourne Shell       2         114          28         260
C                  4          59          14         250
CSS                2          18           0          79
C/C++ Header       3          21           8          78
make               1          15           8          62
Javascript         1          10           1          32
HTML               2           4           0          25
--------------------------------------------------------
SUM:            1537       37153       87134      154240
--------------------------------------------------------

February 3, 2013

A Probabilistic Merger for Reo

Reo is a coordination language for component-based systems which is based on the idea of plugging communication channels together using nodes. The channels in Reo are user-defined and can implement virtually any kind of communication or synchronization. What is typically fixed in Reo, though, is the notion of nodes: nodes merge incoming data items and replicate them to all outgoing channel ends. The merging is the technically more interesting and challenging part, which is why in some papers this behavior is modeled by a ternary component called a Merger, often visualized like this:


This component reads data items from a or b and writes them to c. If there is data available at both a and b, the choice for one of them is nondeterministic. If no additional fairness assumptions are made, the model really does not say anything about the policy for choosing between a and b.

The semantics of such a merger component can be formalized using Reo automata. In these automata, transitions are labeled by pairs of guards and dataflow actions, written as "g|f" where g is a guard and f is a dataflow action. The Reo automaton for the merger component is shown below on the left. It consists of only one state and two transitions. The label "ac|ac" is read as: if there are I/O requests on a and c, then there can be dataflow at a and c (analogously for "bc|bc").


The interesting thing about Reo automata is that they can also directly (without encodings) express context-dependent behavior. An example is the automaton for a PriorityMerger, shown on the right. The "ac|ac"-transition is the same as in the nondeterministic merger, but the transition for choosing b contains an additional constraint in the guard: ā (the negation of a). Informally, this means that dataflow from b to c can occur only if there is no data item available at a. Thus, the choice between a and b is no longer nondeterministic but prioritized to a. So far, nothing new.
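As a compact summary (both automata have a single state q, and the labels are the "g|f"-pairs explained above):

Merger:          q --ac|ac--> q     q --bc|bc--> q
PriorityMerger:  q --ac|ac--> q     q --ābc|bc--> q

Ok, let's take a look at the following automaton.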


This automaton is not a Reo automaton anymore, because its transitions form discrete probability distributions. It is very similar to probabilistic constraint automata. If you take a look at the two states on the right, they essentially have the behavior of an a-PriorityMerger and a b-PriorityMerger. So, depending on whether the automaton is in the upper or the lower state, it favors either a or b. The interesting part is that which of a and b is favored is now random. From the initial state, there is one silent step in which an initial random decision for either a or b is made (a is chosen with probability p). Whenever data is available at both a and b, the automaton favors one of them as determined by its current state. Additionally, a coin is tossed to decide whether a or b should be favored the next time. Thus, we have modeled a merger where the choice between a and b is random. What is important here is that we still need context information, i.e., whether data items are available or not. This is most likely also the reason why this notion of probabilistic mergers was not considered in probabilistic constraint automata (which cannot directly express context information).
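To make this concrete, here is a minimal Java sketch that simulates the probabilistic merger as a plain state machine. The class and method names are hypothetical, and I assume for simplicity that the coin tossed after each step uses the same probability p as the initial silent step:

import java.util.Random;

// A minimal simulation sketch of the probabilistic merger described above.
// Assumption: the coin toss after each step reuses the initial probability p;
// the automaton itself only fixes the shape of the distributions.
class ProbabilisticMerger {

    enum State { FAVOR_A, FAVOR_B }

    private final Random random = new Random();
    private final double p; // probability of favoring a
    private State state;

    ProbabilisticMerger(double p) {
        this.p = p;
        this.state = toss(); // initial silent step: random priority
    }

    private State toss() {
        return (random.nextDouble() < p) ? State.FAVOR_A : State.FAVOR_B;
    }

    // Returns the input routed to c, given the pending I/O requests, or '-'
    // if no dataflow is possible. Note that context information (which
    // requests are pending) is still needed, as in a PriorityMerger.
    char fire(boolean requestAtA, boolean requestAtB) {
        char chosen;
        if (requestAtA && requestAtB) {
            chosen = (state == State.FAVOR_A) ? 'a' : 'b';
        } else if (requestAtA) {
            chosen = 'a';
        } else if (requestAtB) {
            chosen = 'b';
        } else {
            return '-';
        }
        state = toss(); // coin toss for the next priority
        return chosen;
    }

    public static void main(String[] args) {
        ProbabilisticMerger merger = new ProbabilisticMerger(0.5);
        for (int i = 0; i < 10; i++) {
            System.out.print(merger.fire(true, true)); // e.g. "abbabaabba"
        }
    }
}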