%0 Journal Article
%T Self-Supervised Molecular Models for P-Glycoprotein Substrate Prediction Using Transporter Assay Data
%A Thomas Andersen
%A Lars Nielsen
%A Mette Sørensen
%J Pharmacophore
%@ 2229-5402
%D 2026
%V 17
%N 1
%R 10.51847/SLmtXHWnZI
%P 91-100
%X P-glycoprotein efflux can strongly constrain oral absorption, brain penetration, and intracellular drug exposure. Computational substrate prediction is therefore an important early filter for molecules likely to face transporter-mediated disposition liabilities. Most transporter models rely on limited labeled assay data and are often trained directly on endpoint-specific measurements. This ignores the broader chemical information contained in large collections of unlabeled molecular structures. This MDL article proposes a self-supervised molecular model for P-glycoprotein substrate prediction. The model pre-trains on large unlabeled chemical databases and is then adapted to a limited set of validated transporter assay labels. A molecular encoder would be pre-trained using contrastive and masked-structure objectives over graph or SMILES representations. The pre-trained encoder would then be coupled to a lightweight classifier for binary substrate prediction using curated P-glycoprotein assay labels. Conceptually, the self-supervised model would be expected to offer better data efficiency than a model trained only from limited labeled transporter data. Attribution methods could also highlight molecular features associated with P-glycoprotein recognition. Self-supervised molecular learning could make transporter prediction more accessible when labeled assay data are scarce. This approach may support earlier design of molecules with more favorable absorption and distribution profiles.
%U https://pharmacophorejournal.com/article/self-supervised-molecular-models-for-p-glycoprotein-substrate-prediction-using-transporter-assay-dat-t6myxqdmznqsgga