NEURAL NETWORK–BASED SEARCH FOR COX-2 ACTIVE LIGANDS FROM COXIB-LIKE AND SIMILAR COMPOUNDS

Received: 09 Feb 2023 Received in revised form: 10 May 2023 Accepted: 14 May 2023 Available online: 28 Jun 2023


Introduction
The long-established non-steroidal anti-inflammatory drugs (NSAIDs) such as ibuprofen, naproxen, and indomethacin possess adverse effects that include gastrointestinal bleeding, ulceration, and perforation. This is associated with the inhibition of the cytoprotective cyclooxygenase-1 (COX-1) enzyme [1] that is expressed in many tissues. COX-1 is a housekeeping enzyme [2] that mediates important physiological functions such as the production of the mucus lining for inner stomach protection and overall tissue homeostasis [1], and thus dubbed the "good COX". On the other hand, the inducible isozyme COX-2 is expressed largely at the sites of inflammation and is responsible for the production of pro-inflammatory prostaglandins that cause pain, fever, and inflammation [3], and hence tagged the "bad COX". Thus, drugs that would selectively inhibit the activity of COX-2 exert analgesic, antipyretic, and anti-inflammatory effects and the side effects associated with COX-1 inhibition [4,5]. The newer so-called "coxibs" such as celecoxib [6], etoricoxib [7], and parecoxib [8] are remarkably selective to COX-2. But even this group of NSAIDs has an array of undesirable side effects that are specific to a particular drug [9,10]. The later discovery of COX-2 expression in kidneys, brain, and spinal cord [11,12] indicated that the prostaglandins derived from COX-2 have beneficial effects including renal function regulation [13] and ulcer healing [14]. For this reason, some selective NSAIDs may also present side effects such as cardiovascular and renal problems [15,16] depending on the extent of COX-2 inhibition. Thus, the discovery of new classes of anti-inflammatory compounds with optimal COX-2 risk is still an active area of research. The ligand-based method is an established approach in drug discovery that uncovers new active chemical entities using the structural information of known active compounds [17]. It utilizes quantitative models that relate the observed biological activity to structure-derived molecular properties. The experimental bioactivity can be measured with a continuous variable like IC50 (half maximal inhibitory concentration), EC50 (half maximal effective concentration), or MIC (minimum inhibitory concentration); or a nominal variable like compound classification, i.e., active or inactive. Several statistical techniques and machine learning methods can be applied in classification tasks such as logistic regression [18], discriminant analysis [19], random forest [20], and artificial neural network [21], among others. This work developed a neural network (NN) model that was applied to the Derivatives or "coxib"-like variants, and to the Similars, another set of structurally related compounds.

Materials and Methods
The compounds included in this study were collected from literature as described elsewhere [22,23]. These published articles on COX inhibitors covered more than two decades of research. The compounds in the dataset were assigned to either the Active or Inactive group depending on their IC50 values, that is, Active if IC50 ≤ 10 μM and Inactive if IC50 > 10 μM. A personal computer running on Microsoft Windows 7 Professional 64-bit OS using a 3.50-GHz Intel  Core i7-4770K processor with 8.00-GB RAM was used in the creation, manipulation, and optimization of the molecular structures and calculation of their properties. A machine with a MacOS Catalina operating system, 3.1-GHz Dual-Core Intel Core i7 processor, and 16-GB RAM was used in the data analysis and model construction. The chemical structures were drawn using ChemDraw Professional 16.0 (www.perkinelmerinformayics.com) tool, the geometry optimization at the semi-empirical PM3 level was carried out in Spartan 16 (www.wavefun.com), and the calculation of molecular descriptors was performed in both Discovery Studio (DS) version 4.0 (Biovia, Inc.) and Spartan 16. MS Excel was used in organizing the data and RapidMiner Studio 9.7.001 (www.rapidminer.com) in Neural Net modeling and data analyses [24]. The ADMET (absorption, distribution, metabolism, excretion, and toxicity) properties, QED (Quantitative Estimate of Druglikeness) scores, Synthetic Accessibility (SA) scores, and molecular docking data were obtained as described in our previous work [22,23]. Two sets of compounds were generated where the hits would be determined. The set of Derivatives was composed of 1100 compounds created by performing bioisosteric replacement [25] at selected positions in 5 structures that represent the 5 largest families, i.e. cyclopentenes, imidazolyls, difluorobenzenes, furanyls/thiophenyls, and isoxazoles, of known COX-2 inhibitors [22,23]. The similarities included the 600 top hits in a similarity search study done in SwissSimilarity (http://www.swisssimilarity.ch).

Results and Discussion
Data was first cleaned by removing duplicate compounds and descriptors that were essentially invariable or had numerous missing items. The final number of compounds was 1380 with 929 actives and 451 inactives, and the number of descriptors was 184. Subsequently, the 80-20 dataset split was applied putting aside 1104 (80%) compounds for model construction and 276 (20%) for model validation. The best NN model was then implemented in the Derivatives and the Similars. Mimicking biological neural networks, artificial neural networks (ANN) or simply neural nets (NN) are composed of artificial neurons or nodes, which are interconnected by weights that resemble the biological synapses [26]. Excitatory connections are characterized by positive weights while the inhibitory synapses are reflections of negative weights [27]. A linear combination was formed that summed up all inputs, which are modified according to assigned or calculated weights. An activation function, which in many applications was a sigmoid function, was employed to limit the amplitude or keep the output within an acceptable range, i.e. scaled to 0 and 1, or -1 and +1. NN is an adaptive system that uses nonlinear data modeling of complex relationships between inputs and outputs like in this case, between molecular descriptors and compound classification, i.e., active or inactive. This work applied multilayer perceptron (MLP) that develops a model utilizing a feedforward neural network and backpropagation learning technique [28]. It consists of at least three layers of nonlinearly-activating nodes that are fully connected in a feedforward way [29], that is, each node in one layer connects with a certain weight to every node in the following layer, moving in only one direction from the input nodes through the hidden nodes to the output nodes. The backpropagation algorithm is a supervised learning method involving two phases, propagation and weight update, that are repeated until a good network performance is reached [30]. This algorithm computes the value of some predefined error function by comparing the predicted outcomes with the observed values. The error is then fed back through the network by which the algorithm adjusts the weights such that the error diminishes. This process is repeated for a sufficiently large number of training cycles until this error reaches a certain minimum value and the network has learned a certain target function. In this work, the optimization was stopped when the error decreased to 1 × 10 -4 . For hyperparameter tuning, the 80-20 data split was again applied to the dataset intended for model construction, so that a set of 883 compounds was used to build the models that were then tested on the remaining 221 compounds. Among the performance metrics, model specificity was considered also important in that a model should be able to identify potential dropouts and so avoid the costly high attrition rate at the later stages of the drug development. Figure 1 shows the best model constructed using 63 attributes, that allowed a maximum of 0.75 correlation coefficient among the variables. On the validation set, the corresponding best NN model had 96.4% accuracy, 97.3% sensitivity, and 94.4% specificity, outshining both the Multiple Logistic Regression [22] and Random Forest [23] models. Like in the RF study, the models with fewer input nodes, i.e., limited information, gave less accurate predictions. But the inclusion of more but very highly correlated descriptors (r > 0.75) did not further improve the model performance. The best NN model was also indicated with an excellent predictive performance by both ROC-AUC [31] and Cohen's kappa coefficient [32] with high values of 0.977 and 0.918, respectively (Figure 1b). a) b) Figure 1. Performance Scores of Neural Net models as a function of the correlation coefficient (r) among the independent variables: training cycle = 200, learning rate = 0.01, momentum = 0.9, number of hidden layers = 1; input data was shuffled before learning, the sigmoid function was used as activation function, and optimization was stopped at epsilon error below 1 × 10 -4 .  Figure 2. Performance Scores of Neural Net models as a function of the number of training cycles: r < 0.75, learning rate = 0.01, momentum = 0.9, number of hidden layers = 1; input data was shuffled before learning, the sigmoid function was used as activation function, and optimization was stopped at epsilon error below 1 × 10 -4 . a) b) Figure 3. Performance Scores of Neural Net models as a function of the hidden layer size: r < 0.75, learning rate = 0.01, momentum = 0.9, number of hidden layers = 1; input data was shuffled before learning, the sigmoid function was used as activation function, and optimization was stopped at epsilon error below 1 × 10 -4 .
Hence, the best NN model (Figure 4a)   When applied to the set of 1100 Derivatives, the NN model classified 875 as actives, i.e., orange to red data points in Figure  5. The average probability of being active 0.95 is slightly higher than the Multiple Logistic Regression (MLogR) model prediction (0.94) [22], and much higher than that of the RF model (0.69) [23]. Consistent with previous studies, nearly all (96%) cyclopentane derivatives are predicted as active.

Figure 5.
The NN model prediction of compound classification of the Derivatives. The closer to 1 (red) the greater the confidence that the compound is active against COX-2, the closer to 0 (blue) the greater the confidence that the compound is inactive.
On the Similars, the NN model classified 163 out of 600 (27%) compounds as actives (Figure 6). This is slightly fewer than the 188 compounds (33%) identified by the RF model [23]. However, the mean probability of being active (0.89) of compounds predicted active by NN active is remarkably greater than that in RF (0.61) and can be considered comparable to that of the MLogR (0.92). This is not surprising considering that the weight adjustments during NN modeling utilized a sigmoid activation function similar to the logistic function in MLogR modeling. Predicted active compounds from the Derivatives and Similars got through additional screening filters such as measures of drug-likeness, ADMETox, and synthetic accessibility. The average Quantitative Estimate of Druglikeness [33] or QED score of active Derivatives was 0.67 and 71% (797) of them have a QED score above 0.5, i.e., druglike. Having an average synthetic accessibility (SA) score of 3.4 which is within the acceptable range (SA score < 6) [34], the chemical synthesis of the active Derivatives is highly feasible. Most of them have optimal aqueous solubility (68%) and intestinal absorption (98%), all are non-mutagens and non-inhibitors of CYP2D6, and mostly are non-carcinogens (90%), though most are hepatotoxic (98%). Likewise, with an average QED of 0.63, the majority (61%) of the predicted active Similars possess druglike properties, i.e., QED score above 0.5, and can be easily synthesized in the laboratory, i.e., all SA scores were less than 6. They are mostly soluble in biological fluid (72%), non-carcinogens (84%), non-mutagens (81%), non-inhibitors of CYP2D6 (92%), and have good intestinal absorption (93%). The top 20 hits, that is, 10 from each set, are listed in Table 1 along with their properties. The top 10 hits from Derivatives were also the same compounds found in the previous MLogR study [22]. However, only 3 (S269, S265, and S499) of the top 10 hits from Similars were featured in that previous account. Among the new top Similars identified here, the diarylmethylenecyclobutenes (S34, S35, S375) appeared to be highly promising with greater binding energies (-8.6, -9.0, -8.8 kcal/mol, respectively) than the control ligands. Aside from being non-carcinogens, non-mutagens, and non-inhibitors of the CYP2D6 enzyme, the top 20 hits have excellent PA, QED, and SA scores, which are much-desired characteristics of potential candidate compounds.

Conclusion
A neural network model was constructed on a dataset consisting of 1380 compounds that have been tested experimentally against COX-2 enzyme and were classified as Active (IC50 ≤ 10 μM) or Inactive (IC50 > 10 μM). Over 180 molecular descriptors served as the independent variables, and 63 of these attributes were used as input in training the NN. The best performing NN consisted of a single hidden layer with 33 nodes and has learned after 200 training cycles at a learning rate of 0.01 and momentum of 0.9. This NN is an outstanding binary classifier as indicated by an excellent classification accuracy of 93.5% in the test set and ROC-AUC of 0.97.
The NN model classified 875 out of 1100 newly designed compounds (Derivatives) and 188 out of 600 structurally related compounds (Similars) as active against the COX-2 enzyme with high probability. The top hits of 10 Derivatives and 10 Similars have excellent drug-likeness and toxicity profiles and can be synthesized without much difficulty. Except for two from the Similars, these hits have superior (or at least comparable) binding affinities with the drug target compared to the control ligands. These top compounds can be prioritized in any subsequent drug discovery endeavor geared towards the development of a new generation of COX-2-acting NSAIDs.