AI4Science - Advancing science through explainable AI, foundation models, and automated scientific discovery.

Authors

Petrovčič, Job, Narváez, David E., Todorovski, Ljupčo

Publication

The 5th Workshop on Mathematical Reasoning and AI at NeurIPS 2025, MATH-AI 2025, 2025

DOI

https://openreview.net/attachment?id=9A3XkUON6U&name=pdf

Abstract

Premise selection is a key bottleneck for scaling theorem proving in large formal libraries. Yet existing language-based methods often treat premises in isolation, ignoring the web of dependencies that connects them. We present a graphaugmented approach that combines dense text embeddings of Lean formalizations with graph neural networks over a heterogeneous dependency graph capturing both state–premise and premise–premise relations. On the LeanDojo Benchmark, our method outperforms the ReProver language-based baseline by over 25% across standard retrieval metrics. These results suggest that relational information is beneficial for premise selection.

See other Publications

Combining textual and structural information for premise selection in Lean