<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD with MathML3 v1.3 20210610//EN" "JATS-archivearticle1-3-mathml3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"
  dtd-version="1.3" xml:lang="en" article-type="research-article">
  <?DTDIdentifier.IdentifierValue -//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN?>
  <?DTDIdentifier.IdentifierType public?>
  <?SourceDTD.DTDName JATS-journalpublishing1.dtd?>
  <?SourceDTD.Version 1.2?>
  <?ConverterInfo.XSLTName jats2jats3.xsl?>
  <?ConverterInfo.Version 1?>
  <?properties open_access?>
  <front>
    <journal-meta>
      <journal-id journal-id-type="iso-abbrev">Pharmacophore</journal-id>
      <journal-id journal-id-type="publisher-id">pharmacophorejournal.com</journal-id>
      <journal-id journal-id-type="publisher-id">Pharmacophore</journal-id>
      <journal-title-group>
        <journal-title>Pharmacophore</journal-title>
      </journal-title-group>
      <issn pub-type="epub">2229-5402</issn>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="publisher-id">pharmacophorejournal.com-6906</article-id>
      <article-id pub-id-type="doi">10.51847/5KyoIKlui1</article-id>
      <article-categories>
        <subj-group subj-group-type="heading">
          <subject>Original research</subject>
        </subj-group>
      </article-categories>
      <title-group>
        <article-title>Molecular Foundation Models in Pharmaceutical Sciences: A Critical Review</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name>
            <surname>Hugo</surname>
            <given-names>Victor</given-names>
          </name>
          <xref rid="aff1" ref-type="aff">1</xref>
          <xref rid="cor1" ref-type="corresp" />
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Cruz</surname>
            <given-names>Daniel</given-names>
          </name>
          <xref rid="aff1" ref-type="aff">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Salazar</surname>
            <given-names>Javier</given-names>
          </name>
          <xref rid="aff2" ref-type="aff">2</xref>
        </contrib>
      </contrib-group>
      <aff id="aff1">
        <label>1</label>Department of Intelligent Pharmaceutical Analytics, Faculty of Pharmacy, National University of Colombia, Bogota, Colombia.
      </aff>
      <aff id="aff2">
        <label>2</label>Department of Computational Drug Sciences, Faculty of Medicine, University of Antioquia, Medellin, Colombia.
      </aff>
      <author-notes>
        <corresp id="cor1">
          <bold>Address for correspondence:</bold> Prof. Wael Abu Dayyih, Department of
          Pharmaceutical Chemistry, Faculty of Pharmacy, Mutah University, Al-Karak 61710, Jordan.
          E-mail: <email xlink:href="victor.hugo@gmail.com">victor.hugo@gmail.com</email>
        </corresp>
      </author-notes>
      <pub-date pub-type="epub">
        <day>28</day>
        <month>06</month>
        <year>2026</year>
      </pub-date>
      <volume>17</volume>
      <issue>3</issue>
      <fpage>72</fpage>
      <lpage>80</lpage>
      <permissions>
        <copyright-statement>
          Copyright: &#x000a9; 2026 Pharmacophore
        </copyright-statement>
        <copyright-year>2026</copyright-year>
        <license>
          <ali:license_ref xmlns:ali="http://www.niso.org/schemas/ali/1.0/"
            specific-use="textmining" content-type="ccbyncsalicense">https://creativecommons.org/licenses/by-nc-sa/4.0/</ali:license_ref>
          <license-p>This is an open access journal, and articles are distributed under the terms of
            the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License, which allows
            others to remix, tweak, and build upon the work non-commercially, as long as appropriate
            credit is given and the new creations are licensed under the identical terms.</license-p>
        </license>
      </permissions>
      <abstract>
        <title>A<sc>BSTRACT</sc></title>
        <p>Molecular foundation models, pre-trained on millions of chemical structures, are increasingly promoted as a universal solution for pharmaceutical prediction tasks. Their appeal lies in the possibility that large-scale chemical pretraining can reduce dependence on small, noisy, task-specific datasets. Despite their rapid proliferation, critical examination of their pretraining data, leakage risks, transferability, and validation practices remains limited and fragmented. This is problematic because pharmaceutical machine learning is especially vulnerable to hidden similarities between training and test compounds. This critical review evaluates molecular foundation models in pharmaceutical sciences, focusing on pretraining data quality, data leakage, transferability evidence, and validation rigour. It treats reported benchmark performance as a hypothesis requiring scrutiny rather than as sufficient evidence of utility. The review identifies pervasive data biases, frequent over-optimistic evaluation due to leakage, inconsistent evidence on transferability, and a widespread lack of external or prospective validation. These issues are not incidental limitations but structural weaknesses in how many molecular foundation models are developed and assessed. Uncritical adoption of molecular foundation models risks misleading performance claims and may slow, rather than accelerate, pharmaceutical applications. Greater attention to data provenance, split design, uncertainty, and prospective relevance is necessary before such models can be trusted in drug discovery workflows. A set of recommendations is proposed for more robust pretraining, transparent evaluation, and domain-appropriate validation. Molecular foundation models should be judged not by benchmark novelty alone but by their ability to generalize under conditions that resemble pharmaceutical decision-making.</p>
      </abstract>
      <kwd-group>
        <kwd>Molecular foundation models</kwd>
        <kwd>Pharmaceutical machine learning</kwd>
        <kwd>Data leakage</kwd>
        <kwd>Pretraining bias</kwd>
        <kwd>Transfer learning</kwd>
        <kwd>ADMET</kwd>
      </kwd-group>
    </article-meta>
  </front>
</article>