
Enhancing Chemical Explainability Through Counterfactual Masking

Łukasz Janisiów, Marek Kochańczyk, Bartosz Zieliński, Tomasz Danel

Jagiellonian University, Poland

AAAI-26 Conference Proceedings

TL;DR

A new masking method designed specifically for molecular graphs. It replaces parts of a molecule with realistic alternatives, giving more meaningful and useful explanations of molecular property predictions than standard masking methods, which often produce out-of-distribution samples.

Motivation: Molecular Masking Is Broken

When atoms or bonds are naively masked (by zeroing features or deleting parts of the molecule), the resulting molecules often become chemically implausible or physically impossible, producing examples that fall outside the training distribution. This out-of-distribution problem undermines the reliability of both the explanations and their evaluation metrics. Furthermore, current masking approaches inadvertently leak information about the original graph topology to the model: even when certain atoms are “masked,” their structural relationships and connectivity patterns remain implicitly encoded in the modified graph.
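Both failure modes can be seen on a toy graph. The sketch below is only an illustration under stated assumptions: the three-atom chain, the one-hot feature layout `[C, N, O]`, and the helper `zero_mask` are hypothetical and are not taken from the paper's code.

```python
# Toy illustration only: a 3-atom chain C-C-O with one-hot atom-type features.
# The feature layout [C, N, O] and the helper name are assumptions for this sketch.

def zero_mask(features, masked_atoms):
    """Return a copy of the node features with the rows of masked atoms set to zero."""
    return [
        [0] * len(row) if i in masked_atoms else list(row)
        for i, row in enumerate(features)
    ]

features = [
    [1, 0, 0],  # atom 0: C
    [1, 0, 0],  # atom 1: C
    [0, 0, 1],  # atom 2: O
]
adjacency = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

masked = zero_mask(features, masked_atoms={2})

# A valid one-hot row sums to 1. The masked row sums to 0, so it encodes no
# atom type at all, which a model trained on real molecules has never seen.
invalid_rows = [i for i, row in enumerate(masked) if sum(row) != 1]

# The adjacency matrix is untouched, so the model can still tell that
# atom 1 has a second neighbour where the masked atom used to be.
degree_of_atom_1 = sum(adjacency[1])

print(invalid_rows, degree_of_atom_1)
```

The zeroed row is not a valid atom encoding, and the unchanged adjacency matrix still reveals where the masked atom was attached.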

Solution: Counterfactual Masking (CM)

Instead of zeroing features or deleting atoms, Counterfactual Masking (CM) masks by replacing important molecular fragments with chemically plausible alternatives. This enables a direct comparison between the model’s prediction for the original molecule and the average prediction across valid counterfactuals (i.e., all possible alternatives). By doing so, CM:

Provides chemically grounded and robust explanations
Clarifies why specific fragments drive predictions
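The comparison described above can be sketched as a single score: the prediction for the original molecule minus the mean prediction over its counterfactuals. This is a minimal sketch, not the authors' implementation. The function `cm_attribution`, the toy predictor, and the hand-written replacement molecules are all hypothetical; in practice the alternatives would come from a fragment generator rather than a fixed list.

```python
from statistics import mean

def cm_attribution(predict, original, counterfactuals):
    """Original prediction minus the mean prediction over fragment-replaced counterfactuals."""
    if not counterfactuals:
        raise ValueError("at least one counterfactual is required")
    return predict(original) - mean(predict(m) for m in counterfactuals)

# Hypothetical stand-in for a trained property predictor:
# it simply counts oxygen atoms in a SMILES string.
def toy_predict(smiles):
    return float(smiles.count("O"))

original = "CCO"  # ethanol; the hydroxyl group is the fragment being explained
counterfactuals = ["CCC", "CCN", "CCF"]  # hand-written replacements, not generator output

score = cm_attribution(toy_predict, original, counterfactuals)
print(score)
```

A score far from zero suggests that the replaced fragment drives the prediction, while a score near zero suggests the model is largely indifferent to it.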


Evaluation of Fragment Masking

To evaluate different masking techniques, we used the common substructure pair dataset, which contains pairs of molecules with the same core structure. In each pair, we masked the fragments that were not shared and measured the difference in the model’s predictions. An ideal masking technique should completely obscure the masked fragment, resulting in identical predictions for both molecules in the pair.
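The protocol above can be sketched as a mean absolute prediction gap over masked pairs. The code is a toy under stated assumptions: the `(core, fragment)` tuple representation, the function `masking_gap`, the two mask functions, and the toy predictor are hypothetical, and the paper may aggregate the differences in another way.

```python
from statistics import mean

def masking_gap(predict, mask, pairs):
    """Mean absolute prediction difference between the two masked molecules of each pair."""
    return mean(abs(predict(mask(a)) - predict(mask(b))) for a, b in pairs)

# Hypothetical representation: a molecule is a (core, fragment) tuple of SMILES-like strings.
def toy_predict(mol):
    core, fragment = mol
    return len(core) + 2.0 * fragment.count("O")

def ideal_mask(mol):
    """Hide the non-shared fragment completely."""
    core, _ = mol
    return (core, "")

def leaky_mask(mol):
    """Leave the fragment visible, standing in for a mask that leaks information."""
    return mol

# Each pair shares its core and differs only in the attached fragment.
pairs = [
    (("c1ccccc1", "O"), ("c1ccccc1", "N")),
    (("CC", "OO"), ("CC", "C")),
]

ideal_gap = masking_gap(toy_predict, ideal_mask, pairs)
leaky_gap = masking_gap(toy_predict, leaky_mask, pairs)
print(ideal_gap, leaky_gap)
```

Under this metric a lower gap is better: a mask that fully obscures the non-shared fragments drives the gap to zero.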

We found that the common masking method, feature zeroing, produces out-of-distribution molecules and, as a result, very often leads to unreliable explanations. In contrast, Counterfactual Masking using DiffLinker or CReM produces molecules much closer to the training distribution, so the explanations based on them are more faithful.

Can We Generate Counterfactuals?

Although originally designed for masking tasks, CM can effectively generate realistic, chemically valid counterfactual examples by ensuring that molecules stay within the data distribution. In comparison to other counterfactual generation methods, it also enables targeted, local modifications to specific fragments, which can be easily interpreted by people with a chemistry background.
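One way to picture this use is to keep only those fragment replacements that change the predicted class. The sketch below assumes a binary classifier with a decision threshold. The function `find_counterfactuals`, the toy classifier, and the candidate list are hypothetical and do not reflect the selection criteria used in the paper.

```python
def find_counterfactuals(predict, original, candidates, threshold=0.5):
    """Keep candidates whose predicted class (score >= threshold) differs from the original's."""
    original_class = predict(original) >= threshold
    return [c for c in candidates if (predict(c) >= threshold) != original_class]

# Hypothetical classifier: scores a molecule as "active" if its SMILES contains an oxygen.
def toy_predict(smiles):
    return 0.9 if "O" in smiles else 0.1

original = "c1ccccc1O"  # phenol; the hydroxyl group is the fragment being replaced
candidates = ["c1ccccc1C", "c1ccccc1N", "c1ccccc1OC"]  # hand-written local edits

flipped = find_counterfactuals(toy_predict, original, candidates)
print(flipped)
```

Because every candidate differs from the original in a single fragment, each retained counterfactual points to one local structural change that a chemist can inspect.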

📄 For full tables and results, read the paper.

Acknowledgements

This study was funded by the “Interpretable and Interactive Multimodal Retrieval in Drug Discovery” project (FENG.02.02-IP.05-0040/23), which is carried out within the First Team programme of the Foundation for Polish Science and co-financed by the European Union under the European Funds for Smart Economy 2021-2027 (FENG). We gratefully acknowledge the Polish high-performance computing infrastructure PLGrid (HPC Center: ACK Cyfronet AGH) for providing computer facilities and support within computational grant no. PLG/2025/018272.

Citation

@inproceedings{janisiow2026counterfactualmasking,
   title={Enhancing Chemical Explainability Through Counterfactual Masking},
   booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
   publisher={Association for the Advancement of Artificial Intelligence (AAAI)},
   author={Janisi{\'o}w, {\L}ukasz and Kocha{\'n}czyk, Marek and Zieli{\'n}ski, Bartosz and Danel, Tomasz},
   year={2026}
   }

Contact

For questions, please open an issue on GitHub or contact Tomasz Danel or Łukasz Janisiów (tomasz.danel <at> uj.edu.pl, lukasz.janisiow <at> doctoral.uj.edu.pl).
