Project Overview
The AI4Science project is a multidisciplinary initiative aimed at advancing the role of artificial intelligence in the scientific process. It focuses on developing novel AI methodologies (spanning explainable machine learning, foundation models, automated scientific modeling, and semantic technologies) to address the unique challenges of applying AI in the physical and life sciences. The project emphasizes the integration of data and domain knowledge, transparency in AI models, and support for open science principles. Applications range from drug and gene therapy design to equation discovery, environmental modeling, and materials science. With a strong consortium of Slovenian research institutions and access to state-of-the-art computational infrastructure, AI4Science seeks to significantly enhance scientific discovery through AI-driven tools and frameworks.

The central objective of WP1 is to create new machine learning (ML) methods that produce models that are both effective and easy to interpret, particularly for complex data. The work package also aims to use these methods to explain model predictions and to monitor trends in scientific fields by analyzing bibliographic data. The motivation is to increase transparency, reproducibility, and trust in AI-driven discoveries, especially in fields such as healthcare and genomics. The methods developed under this work package will be applied to practical problems such as monitoring the development of scientific fields, designing gene therapies, and designing drugs.
WP2 aims to develop multimodal foundation models that can be applied to various scientific domains. The project will create new methodologies for pre-training and fine-tuning these models, which will be able to handle multiple data types (modalities) such as text, images, and other scientific data like patient records and genomic information. The models will be designed to handle missing data and fuse information from different modalities into a single representation. The developed models will be applied to practical tasks in different scientific fields, including medicine (e.g., diagnosing breast cancer and predicting brain cancer prognosis), life sciences (e.g., predicting protein-RNA interactions and designing gene therapies), and materials science (e.g., discovering new materials and simulating manufacturing processes).
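As a rough illustration of what fusing modalities with missing data can mean in practice, the sketch below averages whichever per-modality embeddings are present into a single vector. Mean fusion is a deliberately simple stand-in and not necessarily the approach WP2 will take; the function name `fuse_modalities` and the modality names are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Optional


def fuse_modalities(embeddings: Dict[str, Optional[List[float]]]) -> List[float]:
    """Fuse per-modality embeddings into one vector, skipping missing ones.

    Assumption: every available modality has already been encoded into a
    vector of the same length; a missing modality is represented by None.
    """
    available = [vec for vec in embeddings.values() if vec is not None]
    if not available:
        raise ValueError("at least one modality must be present")
    dim = len(available[0])
    if any(len(vec) != dim for vec in available):
        raise ValueError("all embeddings must share one dimension")
    # Element-wise mean over the modalities that are actually present.
    return [sum(vec[i] for vec in available) / len(available) for i in range(dim)]


# Hypothetical sample in which the image modality is missing.
sample = {"text": [1.0, 2.0], "image": None, "genomic": [3.0, 4.0]}
fused = fuse_modalities(sample)
print(fused)
```

Averaging only the available vectors keeps the fused representation on the same scale regardless of how many modalities are missing, which is one reason it is a common baseline before moving to learned fusion.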
The main objective of WP3 is to develop new AI methods for discovering scientific models, represented as equations, from both data and existing domain knowledge. The project will use both symbolic and neural approaches to ensure the models are both accurate and interpretable. It will also develop methods for discovering different types of equations, including ordinary, delay, and fractional-order differential equations, which can be used to model complex systems. The developed methods will be applied to various scientific domains, including plant biology, ecology, electrochemistry, and materials science, to solve problems such as estimating reaction rates in plant stress signaling networks and modeling the behavior of solid oxide cells.
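To make the idea of discovering an equation from data concrete, the following minimal sketch recovers the right-hand side of an ordinary differential equation by sparse regression over a small library of candidate terms (sequentially thresholded least squares). This is one well-known technique, used here only for illustration; it is not claimed to be the method WP3 will develop. The candidate library, the threshold value, and all function names are assumptions made for this example, and the derivative values are assumed to be given rather than estimated from noisy measurements.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def least_squares(columns, y):
    """Least-squares coefficients via the normal equations."""
    k = len(columns)
    gram = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
            for i in range(k)]
    rhs = [sum(p * q for p, q in zip(columns[i], y)) for i in range(k)]
    return solve_linear(gram, rhs)


def discover_equation(x, dxdt, threshold=0.1, iterations=5):
    """Fit dx/dt as a sparse combination of the candidate terms 1, x, x^2."""
    library = {"1": [1.0 for _ in x], "x": list(x), "x^2": [v * v for v in x]}
    active = list(library)
    coeffs = {}
    for _ in range(iterations):
        values = least_squares([library[name] for name in active], dxdt)
        coeffs = dict(zip(active, values))
        # Drop terms with small coefficients, then refit on the remaining ones.
        kept = [name for name in active if abs(coeffs[name]) >= threshold]
        if kept == active or not kept:
            break
        active = kept
    return {name: c for name, c in coeffs.items() if abs(c) >= threshold}


# Synthetic, noise-free data generated from dx/dt = -2 * x.
xs = [0.5, 1.0, 1.5, 2.0, 2.5]
derivatives = [-2.0 * v for v in xs]
print(discover_equation(xs, derivatives))
```

On this synthetic data the procedure should keep only the linear term, with a coefficient close to -2. With real measurements, derivative estimation, noise handling, and the choice of candidate terms (where domain knowledge enters) become the difficult parts.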
The primary goal of WP4 is to develop and extend semantic resources, such as ontologies, for the fields of machine learning and optimization. This includes creating resources that can effectively represent complex data types, tasks, and models, including large language models and multimodal foundation models. The project will also develop semantic resources to support AI applications in various scientific domains like materials science, mathematics, and medicine. The ultimate objective is to apply these semantic resources to explain the relationships between problem properties, algorithm configurations, and algorithm performance, thereby enabling tasks like automated machine learning (AutoML) and automated optimization (AutoOPT) in a more explainable manner.
The core objective of WP5 is to ensure the project is completed successfully and on time through efficient coordination, management, and outreach. This includes maintaining high quality standards for all project outcomes, implementing robust monitoring to manage risks, and promoting open science principles. The work package also focuses on broadly disseminating project results to a wide range of stakeholders through a dedicated project website, various communication materials, and events like workshops, a summer school, and conference presentations. This multi-faceted approach aims to maximize the project's impact by making its findings and resources transparent, accessible, and widely adopted by the scientific community.