October 15, 2013

Henshin + Giraph = Big Data Graph Transformations

Henshin is a graph and model transformation tool for Eclipse that lets you define rules and workflows for transformation in a graphical editor. By default, these transformations are executed using an interpreter on an in-memory EMF model.

Apache Giraph is a distributed, parallel graph processing framework built for high scalability. Facebook, for instance, uses Giraph to analyze the graphs underlying its social network.

So how do Henshin and Giraph fit together? Easy: Henshin provides an expressive graph transformation language with an intuitive graphical syntax. Giraph provides an infrastructure for highly parallel and distributed graph processing. The obvious way to combine the two is to use Henshin as the modeling tool and Giraph as the execution engine. That is what we are currently working on.



We implemented a code generator that produces Java code for Giraph from Henshin models. The generated code contains the pattern matching and graph manipulation code for rules, plus the coordination logic required to execute transformation units (workflows). The code generation is still an experimental feature, but we plan to stabilize it and ship it with the next release of Henshin. We already have a small test suite and have run some promising benchmarks. More details later.
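To give a flavor of what pattern matching looks like in a vertex-centric, superstep-based model such as Giraph's, here is a minimal sketch. It is not output of the code generator and does not use the Giraph API: it simulates the supersteps in plain Java on an in-memory adjacency map. The class name `PathMatchSketch`, the method `matchTwoEdgePaths`, and the hard-coded pattern (a directed path a -> b -> c over three distinct vertices) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: a plain-Java simulation of superstep-based
// matching, not the code Henshin generates and not the Giraph API.
final class PathMatchSketch {

    // Finds all injective matches of the pattern a -> b -> c.
    // outEdges maps each vertex id to the ids of its direct successors.
    static List<List<Integer>> matchTwoEdgePaths(Map<Integer, List<Integer>> outEdges) {
        final int patternSize = 3;
        List<List<Integer>> matches = new ArrayList<>();

        // Seed every vertex that has an entry in outEdges with an empty partial
        // match. Vertices without outgoing edges cannot start a path, so they
        // do not need a seed.
        Map<Integer, List<List<Integer>>> inbox = new TreeMap<>();
        for (Integer v : outEdges.keySet()) {
            List<List<Integer>> seed = new ArrayList<>();
            seed.add(new ArrayList<>());
            inbox.put(v, seed);
        }

        // One superstep per pattern node: each vertex extends the partial
        // matches it received and forwards them along its outgoing edges.
        for (int superstep = 0; superstep < patternSize; superstep++) {
            Map<Integer, List<List<Integer>>> nextInbox = new TreeMap<>();
            for (Map.Entry<Integer, List<List<Integer>>> entry : inbox.entrySet()) {
                Integer v = entry.getKey();
                for (List<Integer> partial : entry.getValue()) {
                    if (partial.contains(v)) {
                        continue; // injective matching: a vertex is used at most once
                    }
                    List<Integer> extended = new ArrayList<>(partial);
                    extended.add(v);
                    if (extended.size() == patternSize) {
                        matches.add(extended); // all pattern nodes are bound
                    } else {
                        for (Integer target : outEdges.getOrDefault(v, Collections.emptyList())) {
                            nextInbox.computeIfAbsent(target, k -> new ArrayList<>()).add(extended);
                        }
                    }
                }
            }
            inbox = nextInbox;
        }
        return matches;
    }

    public static void main(String[] args) {
        // A directed triangle: 1 -> 2 -> 3 -> 1
        Map<Integer, List<Integer>> graph = new TreeMap<>();
        graph.put(1, Arrays.asList(2));
        graph.put(2, Arrays.asList(3));
        graph.put(3, Arrays.asList(1));
        System.out.println(matchTwoEdgePaths(graph));
    }
}
```

The point of the sketch is that each vertex only ever looks at its own incoming messages and outgoing edges, which is what makes this style of matching easy to distribute. Applying a rule would additionally require modifying the graph once a match is complete, which is left out here.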

The Giraph code generator is available in the development version of Henshin (for Eclipse Kepler). You can get it from our nightly build update site or by getting the source code directly.
