February 20, 2013

Profiling in Henshin

Graph transformations are a powerful formalism for realizing pattern- and rule-based modifications of high-level data structures. A lot of work has been done on the theoretical foundations of graph transformations, and much of this work is phrased in category theory. One of the nice things about category theory is that constructions and operations are not given explicitly -- instead, the results of operations are defined in terms of universal properties. For graph transformations this means, for example, that the result of a rule application exists and is uniquely defined (up to isomorphism). While this definition is useful from a theoretical perspective, it is not really helpful when you actually try to use graph transformations in practice, e.g., when you try to implement a graph transformation engine yourself. This is because the universal properties don't tell you anything about the algorithms and optimizations required for realizing the operations efficiently.

The most time-critical part when implementing a graph transformation engine such as the Henshin interpreter is the matching engine for rules. The task is: given the left-hand side of a rule (and its possibly nested application conditions), find one or all matches of this pattern in a given host graph. In Henshin, this is implemented using constraint-solving techniques and some heuristics for finding optimal search plans. Nevertheless, as a user you can still be confronted with cases where a transformation takes an excessive amount of time. In such situations, it is very important to be able to find the bottlenecks in large and complex transformations.
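If you want to see the matching engine at work in isolation, the interpreter API also lets you enumerate the matches of a single rule directly. Here is a minimal sketch using the classes from the org.eclipse.emf.henshin.interpreter packages; it assumes that a Rule and an EGraph have already been loaded (the variable names rule and graph are made up for this example):
Engine engine = new EngineImpl();
// Enumerate all matches of 'rule' in the host graph 'graph';
// a partial match can be passed instead of null to constrain the search.
for (Match match : engine.findMatches(rule, graph, null)) {
   System.out.println(match);
}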

In Henshin, you can use our basic profiling support to find out which part of your transformation is time-critical. Let's take a look at an example. The rules and units below are taken from an old version of the Grid & Comb Pattern example and are used to construct a (full) grid of a given width and height.


The main unit of this transformation is buildGrid. We can execute this unit using Henshin's interpreter API, e.g., like this:
Engine engine = new EngineImpl();
engine.getOptions().put(Engine.OPTION_SORT_VARIABLES, false);
UnitApplication application = 
   new UnitApplicationImpl(engine, graph, unit, null);
application.setParameterValue("width", 40);
application.setParameterValue("height", 40);
application.execute(null);
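The snippet above assumes that the EGraph graph and the Unit unit have already been set up. How exactly they are obtained depends on your setup and Henshin version; a minimal sketch using a HenshinResourceSet (with made-up project and file names, and starting on an empty graph since the grid is built from scratch) could look roughly like this:
// Load the transformation module (paths and file names are just examples):
HenshinResourceSet resourceSet = new HenshinResourceSet("my.project/model");
Module module = resourceSet.getModule("grid.henshin", false);
Unit unit = module.getUnit("buildGrid");
// The grid is constructed from scratch, so we start with an empty graph:
EGraph graph = new EGraphImpl();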
If we do it like this for the old version of the example, the execution takes about 1 minute, which is really slow. The generated grid has only 1,600 nodes, so it should be possible to do this much faster. This is a good opportunity to try out Henshin's profiling support. The trick is to pass a ProfilingApplicationMonitor to the unit execution, like this:
ProfilingApplicationMonitor monitor =
   new ProfilingApplicationMonitor();
application.execute(monitor); 
monitor.printStats();
This automatically collects statistics about the execution times of all rules. In the last line, we just print the gathered data. We get this output on the console:
Stats for rule 'extendFirstColumn':
 - Number of Executions: 38
 - Total execution time: 35ms
 - Aver. execution time: 0ms

Stats for rule 'startNextColumn':
 - Number of Executions: 39
 - Total execution time: 414ms
 - Aver. execution time: 10ms

Stats for rule 'extendNextColumn':
 - Number of Executions: 1521
 - Total execution time: 60086ms
 - Aver. execution time: 39ms
For every rule we see the number of applications, the total execution time, and the average execution time. OK, now we see where the bottleneck is! It is the rule extendNextColumn, which is applied 1,521 times and takes on average 39ms per execution, summing up to about one minute. So, why is this rule so slow? The reason is that in this old version of the example, the nodes of the rule's left-hand side are ordered in a way that is highly inefficient for the match finding. But wait, didn't I say before that Henshin uses heuristics to find optimal search plans? Yes, indeed. So why does it not work here? Well, because above I initialized the interpreter engine with this line:
engine.getOptions().put(Engine.OPTION_SORT_VARIABLES, false);
This turns off the automatic variable ordering, which is useful if you have optimized the node order of your rules yourself. By default, the option is enabled. If we use the automatic variable ordering for our example, we get these execution times:
Stats for rule 'extendFirstColumn':
 - Number of Executions: 38
 - Total execution time: 25ms
 - Aver. execution time: 0ms

Stats for rule 'startNextColumn':
 - Number of Executions: 39
 - Total execution time: 345ms
 - Aver. execution time: 8ms

Stats for rule 'extendNextColumn':
 - Number of Executions: 1521
 - Total execution time: 1662ms
 - Aver. execution time: 1ms
Et voilà -- the whole transformation now takes less than 2 seconds! The main point I wanted to show here is that if you have a large transformation with many rule applications, it can be very useful to quickly plug in Henshin's ProfilingApplicationMonitor to find out where the performance can be improved. If you have an example where a particular rule takes, in your eyes, much too long, let us know on the mailing list and we will check whether the performance can be improved.

1 comment:

  1. Update: Matthias Tichy contributed a patch that further increases the performance of the match finder in common cases. For more info see http://dev.eclipse.org/mhonarc/lists/henshin-dev/msg00702.html