TY  - JOUR
T1  - Self-Supervised Molecular Models for P-Glycoprotein Substrate Prediction Using Transporter Assay Data
A1  - Thomas Andersen
A1  - Lars Nielsen
A1  - Mette Sørensen
JF  - Pharmacophore
JO  - Pharmacophore
SN  - 2229-5402
Y1  - 2026
VL  - 17
IS  - 1
DO  - 10.51847/SLmtXHWnZI
SP  - 91
EP  - 100
N2  - P-glycoprotein efflux can strongly constrain oral absorption, brain penetration, and intracellular drug exposure. Computational substrate prediction is therefore an important early filter for molecules likely to face transporter-mediated disposition liabilities. Most transporter models rely on limited labeled assay data and are often trained directly on endpoint-specific measurements. This ignores the broader chemical information contained in large collections of unlabeled molecular structures. This MDL article proposes a self-supervised molecular model for P-glycoprotein substrate prediction. The model pre-trains on large unlabeled chemical databases and is then adapted to a limited set of validated transporter assay labels. A molecular encoder would be pre-trained using contrastive and masked-structure objectives over graph or SMILES representations. The pre-trained encoder would then be coupled to a lightweight classifier for binary substrate prediction using curated P-glycoprotein assay labels. Conceptually, the self-supervised model would be expected to offer better data efficiency than a model trained only from limited labeled transporter data. Attribution methods could also highlight molecular features associated with P-glycoprotein recognition. Self-supervised molecular learning could make transporter prediction more accessible when labeled assay data are scarce. This approach may support earlier design of molecules with more favorable absorption and distribution profiles.
UR  - https://pharmacophorejournal.com/article/self-supervised-molecular-models-for-p-glycoprotein-substrate-prediction-using-transporter-assay-dat-t6myxqdmznqsgga
ER  - 