To identify target molecules, we used the Similarity Ensemble Approach (SEA), a chemoinformatic search tool, to match our hit molecules with known bioactive molecules in public databases. ChEMBL is a 2D, open-access bioactivity database containing the chemical structures, calculated physicochemical properties and biological activities of over 700,000 small molecules (https://www.ebi.ac.uk/chembldb). It can be used to search for bioactive small molecules with structures similar to the hit compounds. The compounds in this database are mostly extracted from primary articles published in the field of drug discovery. In addition, ChEMBL data can be integrated with PubChem BioAssay data to increase the total number of compounds searched.
ChEMBL was used as the primary tool to identify possible off-target protein binding of the hit molecules. The greater the similarity of our hits to molecules in the database, the more likely it is that their off-target proteins can be identified. In our initial search, a database molecule was considered similar to a hit compound if their chemical similarity exceeded 90%. In addition, the assay used to evaluate the database compound had to be suitable for our study.
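The 90% criterion can be illustrated with a minimal sketch. The text does not state which similarity metric was used; the sketch assumes a Tanimoto coefficient computed on 2D fingerprints, a common choice for this kind of search. The fingerprints, compound names and the helper `similar_compounds` are hypothetical and serve only to show how the threshold filters a database.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def similar_compounds(hit, database, threshold=0.90):
    """Return (name, score) pairs scoring above the threshold, best first."""
    scored = ((name, tanimoto(hit, fp)) for name, fp in database.items())
    return sorted(
        ((name, score) for name, score in scored if score > threshold),
        key=lambda pair: pair[1],
        reverse=True,
    )


# Hypothetical fingerprints: each integer stands for one set bit.
hit_fp = frozenset(range(20))
database = {
    "analogue_A": frozenset(range(19)),         # 19 shared bits, union of 20
    "analogue_B": frozenset(range(2, 22)),      # 18 shared bits, union of 22
    "unrelated_C": frozenset(range(100, 120)),  # no shared bits
}

matches = similar_compounds(hit_fp, database)
for name, score in matches:
    print(f"{name}: {score:.2f}")
```

Under these assumptions only `analogue_A` passes the filter. `analogue_B`, at roughly 0.82, is excluded, which shows how a strict cut-off narrows the set of database compounds, and therefore of annotated targets, that a search returns.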
Searching off-targets inspired by ChEMBL
Initially, the structures of the three hit molecules, 3, 4, and 5, were used as starting points to search for compounds with well-defined biological activities that share >90% similarity with the hit molecules.
The resulting compounds were mapped based on their biological activities, and representative compounds were selected from the primary literature for each category (Figure 2.13). We found the following four interesting and suitable categories that share similar structures and may affect the gene-expression process: G9a, C-C chemokine receptor type 4 (CCR4), transforming growth factor beta (TGF-β), and adrenergic receptor inhibitors.
We also identified a few further clusters of bioactive molecules; however, these groups were set aside. One comprised compounds that kill Giardia lamblia. Another comprised inactive human flap endonuclease 1 inhibitors: compounds 3 and 4 were reported to be inactive against human flap endonuclease 1 and polymerase iota. The human flap endonuclease 1 and polymerase iota classes are of lower priority than the other four classes.
In addition to the data obtained from the ChEMBL database, a kinase inhibitor group was included in our consideration. Although the structures of the kinase inhibitors (Gefitinib) [132] and CDK2 inhibitors [133] are not highly similar to those of the hit molecules, previous studies indicate that they have significant potential to regulate the function of EZH2 [84, 85, 87].
To validate the computational results obtained, a number of experimental tests were conducted.
Figure 2.13: A chemical structure similarity map. The chemical structures of the three hits 3–5 were matched with four classes of inhibitors: G9a, CCR4, TGF-β, and adrenergic receptor inhibitors.

G9a activity First of all, the G9a inhibitor UNC0638, which is optimised for cell permeability, was evaluated in our cell-based assay. MDA-MB-231 cells were treated for 48 hours at concentrations of 1–10 µM. The mRNA levels of KRT17, FBXO32, EZH2 and JMJD3 were measured by RT-PCR. The resulting data were normalised against the housekeeping gene GAPDH (Figure 2.14.A). UNC0638 could not reactivate KRT17. Only FBXO32 was reactivated, and only at a high dose of UNC0638 (i.e., above the EC50 of 8.8 µM). This usually occurred when the toxicity of a compound induced the reactivation of FBXO32 and JMJD3 together. This interpretation was supported by the results of a single-dose experiment, which indicated that UNC0638 could not reactivate the EZH2 target genes (Table 2.5). On the other hand, our hits previously showed inhibitory activity against G9a in the low micromolar range (Table 2.4). Taken together, although the inhibition of G9a might be important for the cell-based effects observed, targeting only the G9a pathway is not sufficient to reactivate EZH2 target genes.
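The normalisation of RT-PCR data against GAPDH can be sketched as follows. The text does not state how relative expression was calculated; this sketch assumes the widely used 2^-ΔΔCt method with ideal amplification efficiency. The function name `relative_expression` and all Ct values are hypothetical, not measured data.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene in treated versus control cells.

    Assumes the 2^-ddCt method: each target Ct is first normalised to the
    reference (housekeeping) gene, then treated is compared with control.
    """
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_treated - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values for one target gene with GAPDH as the reference.
fold = relative_expression(
    ct_target_treated=24.0, ct_ref_treated=18.0,   # dCt = 6
    ct_target_control=26.0, ct_ref_control=18.0,   # dCt = 8
)
print(f"fold change: {fold:.1f}")
```

With these illustrative numbers the target gene crosses the threshold two cycles earlier after treatment while GAPDH is unchanged, which corresponds to a four-fold increase. A fold change of 1 would indicate no reactivation relative to untreated cells.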
Figure 2.14: The biological results were obtained by our collaborators; the assay details can be found in the experimental section. (A) RT-PCR results of UNC0638 and C021 at concentrations of 1–10 µM. (B) MTT assay results of UNC0638, C021 and the three hits. (C) The biochemical assay results of the three hits and the positive control against CDK1 and CDK2.

GPCR activity One function of CCR4, which is highly over-expressed in breast cancer cell lines, is to promote tumour growth [134]. No published articles link the CCR4 pathway to gene reactivation or any epigenetic mechanism. Therefore, based on the chemoinformatic results, we synthesised C021, a CCR4 inhibitor, to evaluate its activity in our cell-based assay. C021 (compound 139) was synthesised using the same route used to synthesise the hits (Scheme 2.18). C021 exhibited a dose-dependent response in our RT-PCR assay (Figure 2.14.A). However, the JMJD3 RNA level increased dramatically at concentrations of 7.5 µM and 10 µM. Therefore, the increases in KRT17 and FBXO32 expression likely stemmed from the toxicity of C021 (EC50 = 3.75 µM). In addition, the enzymatic assay showed that C021 could weakly inhibit EZH2 (IC50 = 40 µM). Although it is likely that C021 cannot reactivate the EZH2 target genes and it has low inhibitory activity against EZH2 in enzymatic studies, further evaluation of the hits against CCR4 is recommended to confirm the relationship between the CCR4 pathway and epigenetic mechanisms.

Scheme 2.18: The synthetic route of C021. (i) 4-Amino-1-benzylpiperidine, DIEA, THF, overnight. (ii) 1,4-Bipiperidine, toluene, reflux for 36 h.
considered, with a particular focus on the adrenergic receptors. At a concentration of 1 mM, UNC0638 showed 64%, 90% and 69% inhibition of the muscarinic M2, adrenergic α1A, and adrenergic α1B receptors, respectively [29]. At present, there is no strong evidence to support a relationship between GPCRs and epigenetic silencing processes. Nevertheless, several α1-adrenergic inhibitors appear similar to our leads [135].
Kinase activity The post-translational modifications of CDK1, CDK2 and Akt can disturb EZH2 activities [85, 88]. All three hits were evaluated for their activities against CDK1 and CDK2 using BS-194 as the positive control [136] (Figure 2.14.C). All three hits 3–5 showed no activity against CDK1 or CDK2 (IC50 > 100 µM).
Likewise, our hits were evaluated against Akt using Staurosporine as the positive control compound [137, 138]. The three hits were also inactive against Akt (IC50 > 100 µM, data not shown). Therefore, the kinase pathways via CDK1, CDK2 and Akt are not the off-target pathways for the post-translational modification of EZH2.

In our opinion, access to the recommended inhibitors (ALK5 and Prazosin) and assays (GPCR screening or TGF-β signalling assay) has been limited, and the results have not yet been clarified. Therefore, the benefits of using ChEMBL remain unclear. Moreover, when using ChEMBL data for analysis, the user should carefully consider the type of data used to address the question of interest. We selected only the appropriate target types, based on our understanding of these individual targets. However, the 90% similarity threshold applied to compounds 3–5 might be too high, restricting the possible protein targets. We were also aware of the bias introduced by relying only on the ChEMBL data set, which differs from commercial databases such as the MDL Drug Data Report and WOMBAT. In addition, the use of ChEMBL and other bioactivity databases can be limited by other factors, including missing information regarding stereochemistry and functional groups [139].