Retida - Real-Time Data Analytics in the Mobility Domain

Enabling real-time, large-scale analytical workflows for data-intensive science requires integrating state-of-the-art technologies from several fields. We therefore propose to bring together the latest developments from Big Data (NoSQL solutions, data-intensive programming, streaming engines), HPC (parallel programming, multi-/many-core architectures, GPUs, clusters), data analytics (analytical models and algorithms), and workflow management (definition and orchestration) in an innovative manner. Optimizing and tuning the execution strategy for data analysis jobs requires explicit knowledge of the system (data and compute resources) and of the algorithms used inside the workflows. Even for experts, tuning complex workflows is becoming increasingly involved and time-consuming, and is therefore often restricted to a single application or to the outer workflow level. There is thus an urgent need for adaptive mechanisms that automatically configure and tune workflows and their execution environment. To achieve the required quality of service (QoS), these mechanisms must take into account the characteristics of the data, the workflow modules (which exist in different implementation variants), and the available heterogeneous hardware infrastructure.