Abstract
Existing workflow engines, such as Kepler and DAGMan, offer flexible ways to assemble components with rich functionality for managing control flow. What both lack is a way to easily deploy and manage the glue code required to connect the various components. The complexity of creating and maintaining glue components as output formats shifted, and of managing their deployment, proved too high; falling back to Python scripts managed by the application scientist was easier and faster to maintain. While this approach, which uses the parallel file system to stage intermediate data, was once sufficient, it is quickly becoming infeasible. The I/O overhead of the parallel file system is exceeding acceptable runtime percentages, forcing a reduction in output and making scientific insights harder to discover. To address this performance mismatch, Integrated Application Workflows (IAWs) are being developed. This poster describes our work on SuperGlue, a set of generic, reusable components for composing scientific workflows. These are distributed data analysis and manipulation tools that can be chained together to form a variety of real-time workflows, providing analytical results during the execution of the primary scientific code. Unlike existing components used in IAWs, SuperGlue components are not bound to a fixed data type. This one change allows the same components to serve completely different kinds of simulations that share nothing in their output formats. Key to making this work is a typed transport mechanism between components, which carries type information along with the data so that components need not be compiled against a fixed format. Many options exist for such transports, and the particular mechanism selected is not critical.
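As a minimal sketch of this idea, assuming a self-describing message format and hypothetical names (Message, select, scale, pipeline are illustrative, not SuperGlue's actual API), the following Python shows how type-agnostic components might chain over a typed transport: each message carries its schema, so the same components can process output from unrelated simulations.

```python
# Hypothetical sketch, not SuperGlue's real interface: type-agnostic
# components chained into a workflow over a self-describing message type.
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Message:
    """A self-describing record: the schema travels with the payload,
    so downstream components need no compiled-in data type."""
    schema: Dict[str, str]   # field name -> type name
    payload: Dict[str, Any]


def select(fields: List[str]) -> Callable[[Message], Message]:
    """Generic field-selection component; works on any schema."""
    def run(msg: Message) -> Message:
        return Message(
            schema={k: msg.schema[k] for k in fields},
            payload={k: msg.payload[k] for k in fields},
        )
    return run


def scale(field: str, factor: float) -> Callable[[Message], Message]:
    """Generic numeric transform; type-checks against the carried schema."""
    def run(msg: Message) -> Message:
        assert msg.schema[field] in ("float", "int"), "schema type check"
        out = dict(msg.payload)
        out[field] = out[field] * factor
        return Message(schema=msg.schema, payload=out)
    return run


def pipeline(*stages: Callable[[Message], Message]) -> Callable[[Message], Message]:
    """Chain components into a workflow; any typed transport (sockets,
    staging areas, an in-memory queue) could sit between the stages."""
    def run(msg: Message) -> Message:
        for stage in stages:
            msg = stage(msg)
        return msg
    return run


if __name__ == "__main__":
    # Two simulations with unrelated output formats could feed the same
    # components, since nothing here is bound to a fixed data type.
    sim_output = Message(schema={"temp": "float", "step": "int"},
                         payload={"temp": 1.5, "step": 7})
    workflow = pipeline(select(["temp"]), scale("temp", 10.0))
    print(workflow(sim_output).payload)  # {'temp': 15.0}
```

Because each stage consults the schema carried in the message rather than a fixed type, swapping the in-process calls for a networked transport would not change the components themselves, which is the property the abstract attributes to typed transports.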