New Preprint on Marginalized Graph Kernels

New preprint alert from our lab! “Interpretable Molecular Property Predictions Using Marginalized Graph Kernels”


Marginalized graph kernels are a new approach to quantifying molecular similarity directly from molecular graphs. They can serve as input to kernel machines (e.g. SVM or GPR), with performance similar to graph neural networks (Figure below from

1st Image MGK Paper

However, we cannot yet interpret graph-based similarity. Interpretability helps to build trust and detect biases. We derive two interpretations that identify (1) the most important atoms and (2) the most important training data points behind a given prediction.

2nd Image for MGK Paper
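
To make the second interpretation more concrete: in Gaussian process regression the predicted mean is a kernel-weighted sum over the training points, so it splits naturally into one term per training molecule. The sketch below illustrates that decomposition with NumPy. The function name `training_point_contributions`, the toy kernel matrix, and the noise value are assumptions made for illustration, and the exact attribution defined in the preprint may differ.

```python
import numpy as np

def training_point_contributions(K_train, k_star, y_train, noise=1e-6):
    """Split a GPR mean prediction into per-training-point terms.

    K_train: (n, n) kernel matrix between training molecules
    k_star:  (n,) kernel values between the query and each training molecule
    y_train: (n,) training targets (a zero-mean prior is assumed)
    """
    K_train = np.asarray(K_train, dtype=float)
    k_star = np.asarray(k_star, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    n = K_train.shape[0]
    # alpha = (K + noise * I)^-1 y, the usual GPR weight vector
    alpha = np.linalg.solve(K_train + noise * np.eye(n), y_train)
    # The prediction is sum_i k_star[i] * alpha[i]; keep the terms separate
    contributions = k_star * alpha
    return contributions, float(contributions.sum())

# Toy data (hypothetical similarities and target values)
K = np.array([[1.0, 0.8, 0.1],
              [0.8, 1.0, 0.2],
              [0.1, 0.2, 1.0]])
y = np.array([-3.0, -2.5, 1.0])
k_query = np.array([0.9, 0.7, 0.1])

contrib, prediction = training_point_contributions(K, k_query, y)
# Training molecules ranked by the size of their contribution
ranking = np.argsort(-np.abs(contrib))
```

Because the terms sum exactly to the prediction, sorting them by magnitude gives a simple ranking of which training molecules drive a given prediction.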


On the “logic” benchmark, our atomic attribution performed similarly to state-of-the-art GNN atomic attribution and never fell below 95% of the best approach's performance on any dataset, while neural network performance varied more widely.

3rd Image from MGK Paper


We evaluated the predictive performance of MGK-GPR on the FreeSolv benchmark and found that it outperformed standard RBF kernels.

4th Image from Yan Paper

When we consulted our “molecular attribution” to see why MGK outperforms RBF, we found that MGK created more “chemically reasonable” similarities between molecules, while classic fingerprints were misled by bit collisions.

5th Image from MGK Paper
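
For readers unfamiliar with bit collisions: hashed fingerprints map each substructure to a position in a fixed-length bit vector, so distinct substructures can land on the same bit. The sketch below is a simplified stand-in for a real fingerprint, not the one used in the preprint. The substructure labels, the SHA-1 hashing, and the tiny 4-bit length are all assumptions chosen to make collisions easy to see.

```python
import hashlib

def fold_features(features, n_bits):
    """Hash each substructure label onto one of n_bits positions."""
    bits = set()
    for feat in features:
        digest = hashlib.sha1(feat.encode("utf-8")).hexdigest()
        bits.add(int(digest, 16) % n_bits)
    return bits

def count_collisions(features, n_bits):
    """Number of distinct features lost because they share a bit."""
    distinct = set(features)
    return len(distinct) - len(fold_features(distinct, n_bits))

def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

# Hypothetical substructure labels for two molecules with nothing in common
mol_a = ["C-C", "C-O", "C=O", "c:c", "O-H"]
mol_b = ["C-N", "N-H", "C-Cl", "C-S", "S=O"]

# Similarity on the exact feature sets: no shared substructures
exact_similarity = tanimoto(set(mol_a), set(mol_b))

# Similarity after folding into 4 bits: shared bits can appear by collision
folded_similarity = tanimoto(fold_features(mol_a, 4), fold_features(mol_b, 4))

# Five features cannot fit into four bits without at least one collision
collisions_in_a = count_collisions(mol_a, 4)
```

Real fingerprints use far more bits, so collisions are rarer, but the mechanism is the same: a kernel computed on the folded bits can see similarity that the underlying structures do not share.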

Intrigued by this finding, we calculated the average molecular attribution and found that MGK creates “more significant” nearest-neighbor relationships than RBF (p = 1e-116, paired t-test).

6th Image from MGK Paper
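
As a reminder of what a paired t-test does, it compares two measurements taken on the same items (here, one score per molecule under each kernel) by testing whether the mean of the per-item differences is zero. The sketch below computes only the test statistic with the Python standard library. The function name and the example numbers are hypothetical and are not data from the preprint.

```python
import math
import statistics

def paired_t_statistic(x, y):
    """Return (t, degrees_of_freedom) for a paired t-test on x vs. y."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    if len(x) < 2:
        raise ValueError("need at least two pairs")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_d == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_d / (sd_d / math.sqrt(n))
    return t, n - 1

# Hypothetical per-molecule scores under two kernels
scores_mgk = [2.0, 4.0, 6.0]
scores_rbf = [1.0, 2.0, 3.0]

t_stat, dof = paired_t_statistic(scores_mgk, scores_rbf)
```

Turning the statistic into a p-value requires the t distribution; in practice `scipy.stats.ttest_rel` returns both the statistic and the p-value in one call.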

Overall, we hope that these measures of interpretability will help to further establish marginalized graph kernels for molecular machine learning and aid drug development.

This work was conducted by our postdoc Yan together with colleagues Yu-Hang Tang and Guang Lin.

Congratulations everyone on this exciting study!