School of Electrical and Computer Engineering - NTUA

Database and Knowledge Systems Lab

Projects


Relax Doctoral Network



The RELAX European Doctoral Network aims to train a cohort of highly mobile and adaptable researchers to become experts in the design of scalable and efficient data-intensive software systems. These experts will master the specific skill of navigating the semantics or correctness conditions of applications, with the goal of enhancing scalability, response times, and availability. Working across the disciplinary specialisms of data science, data management, distributed computing and computing systems, the Fellows will develop knowledge of the broad issues underpinning data analytics systems. The bespoke training programme fosters intellectual enquiry and combines technical and scientific research training with courses in innovation, management and leadership. The training network addresses a critical skills gap in data analytics expertise, which urgently needs to be addressed to support innovation and employment in a fast-growing European data economy. The 14 partner organizations, representing 8 countries, will benefit first-hand through intersectoral collaboration and an Open Innovation model. Two Fellows will conduct their PhD research at DBLab, focusing on data quality and its impact on large-scale data analytics.


HiDALGO2



HiDALGO2 aims to explore synergies between modelling, data acquisition, simulation, data analysis and visualization, while achieving better scalability on current and future HPC and AI infrastructures, in order to deliver highly scalable solutions that can effectively utilize pre-exascale systems. The project focuses on five use cases from the environmental domain: improving air quality in urban agglomerations, energy efficiency of buildings, renewable energy sources, wildfires, and meteo-hydrological forecasting. DBLab participates as the leading expert in high-performance data analytics (HPDA) for the aforementioned global challenges.
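
To give a flavour of what HPDA means in this setting, the sketch below post-processes a synthetic air-quality field in parallel across time steps. The pollutant distribution, grid size, and threshold are illustrative assumptions, not project data or HiDALGO2 code.

```python
# Illustrative HPDA-style sketch: parallel post-processing of a synthetic
# air-quality field. All numbers (grid size, gamma distribution, 50 ug/m3
# threshold) are assumptions for illustration only.
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def hourly_exceedance(args):
    """Fraction of grid cells exceeding a concentration threshold for one hour."""
    seed, threshold = args
    rng = np.random.default_rng(seed)
    field = rng.gamma(shape=2.0, scale=20.0, size=(512, 512))  # synthetic ug/m3 field
    return float((field > threshold).mean())

if __name__ == "__main__":
    hours = range(24)
    # Each hour is processed by a separate worker; on an HPC system this
    # fan-out would typically span nodes rather than local processes.
    with ProcessPoolExecutor() as pool:
        fractions = list(pool.map(hourly_exceedance, [(h, 50.0) for h in hours]))
    print(f"mean exceedance fraction over 24h: {sum(fractions) / len(fractions):.3f}")
```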


DAPHNE



The DAPHNE project aims to define and build an open and extensible system infrastructure for integrated data analysis pipelines, including data management and processing, high-performance computing (HPC), and machine learning (ML) training and scoring. This vision stems from several key observations in this research field:

  • Systems in these areas share many compilation and runtime techniques.
  • There is a trend towards complex data analysis pipelines that combine these systems (see the sketch after this list).
  • The underlying, increasingly heterogeneous hardware infrastructure is converging as well.
  • Yet the programming paradigms, cluster resource management, and data formats and representations still differ substantially.
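
To make the pipeline observation concrete, here is a minimal sketch of such a mixed pipeline, assuming a toy dataset and off-the-shelf Python libraries. It illustrates the status quo that DAPHNE addresses; it is not DAPHNE code.

```python
# Illustrative sketch of a mixed data analysis pipeline: relational-style
# preprocessing, numerical computation, and ML training live in different
# systems with different data representations. The toy dataset is an
# assumption for illustration only.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# 1) Data management step: relational-style cleaning in a dataframe system.
df = pd.DataFrame({"x1": [1.0, 2.0, None, 4.0],
                   "x2": [0.5, 0.1, 0.2, 0.9],
                   "y":  [2.1, 3.9, 4.2, 8.1]}).dropna()

# 2) Hand-off: the dataframe is converted to a dense array representation,
#    one of the format/representation boundaries noted above.
X, y = df[["x1", "x2"]].to_numpy(), df["y"].to_numpy()

# 3) HPC-style numeric step: feature scaling as plain linear algebra.
X = (X - X.mean(axis=0)) / X.std(axis=0)

# 4) ML step: training in yet another system with its own runtime.
model = LinearRegression().fit(X, y)
print(model.coef_)
```
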
To this end, the project brings together a joint consortium of experts from the data management, ML systems, and HPC communities to systematically investigate the necessary system infrastructure, language abstractions, compilation and runtime techniques, as well as the systems and tools needed to increase productivity when building such data analysis pipelines and to eliminate unnecessary performance bottlenecks. Our team is responsible for the DAPHNE runtime engine, which provides the implementation of kernels for local, distributed and accelerator-enhanced operations, vectorized execution, integration with existing frameworks and libraries for productivity and interoperability, as well as efficient I/O and communication primitives. (project code)
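
As a conceptual illustration of the vectorized execution mentioned above, the following sketch streams row tiles through a fused chain of elementwise operations instead of materializing full-size intermediates. This is a Python analogy only; the actual DAPHNE runtime implements such kernels in C++.

```python
# Conceptual sketch of vectorized (tiled, operator-fused) execution.
# A Python illustration of the idea only, not the DAPHNE runtime itself.
import numpy as np

def fused_pipeline(tile: np.ndarray) -> np.ndarray:
    """A chain of elementwise ops applied to one tile, avoiding the
    full-size intermediates a naive operator-at-a-time runtime would
    materialize between operators."""
    return np.maximum(tile * 2.0 + 1.0, 0.0)  # fused scale, shift, ReLU

def vectorized_execute(X: np.ndarray, tile_rows: int = 1024) -> np.ndarray:
    """Stream row tiles through the fused pipeline; in a real runtime the
    tiles could be dispatched to worker threads or accelerator queues."""
    out = np.empty_like(X)
    for start in range(0, X.shape[0], tile_rows):
        stop = min(start + tile_rows, X.shape[0])
        out[start:stop] = fused_pipeline(X[start:stop])
    return out

X = np.random.default_rng(0).standard_normal((4096, 64))
assert np.allclose(vectorized_execute(X), np.maximum(X * 2.0 + 1.0, 0.0))
```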