Anti-HIV reverse transcriptase plant polyphenolic natural products with in silico inhibitory properties on seven non-structural proteins vital in SARS-CoV-2 pathogenesis

Background Accessing COVID-19 vaccines is a challenge despite successful clinical trials. This burdens the COVID-19 treatment gap, thereby requiring accelerated discovery of anti-SARS-CoV-2 agents. This study explored the potential of anti-HIV reverse transcriptase (RT) phytochemicals as inhibitors of SARS-CoV-2 non-structural proteins (nsps) by targeting in silico key sites in the structures of SARS-CoV-2 nsps. One hundred four anti-HIV phytochemicals were subjected to molecular docking with nsp3, 5, 10, 12, 13, 15, and 16. Top compounds in complex with the nsps were investigated further through molecular dynamics. The drug-likeness and ADME (absorption, distribution, metabolism, and excretion) properties of the top compounds were also predicted using SwissADME. Their toxicity was likewise determined using OSIRIS Property Explorer. Results Among the top-scoring compounds, the polyphenolic functionalized natural products comprised of biflavones 1, 4, 11, 13, 14, 15; ellagitannin 9; and bisisoquinoline alkaloid 19 were multi-targeting and exhibited strongest binding affinities to at least two nsps (binding energy = − 7.7 to − 10.8 kcal/mol). The top ligands were stable in complex with their target nsps as determined by molecular dynamics. Several top-binding compounds were computationally druggable, showed good gastrointestinal absorptive property, and were also predicted to be non-toxic. Conclusions Twenty anti-HIV RT phytochemicals showed multi-targeting inhibitory potential against SARS-CoV-2 non-structural proteins 3, 5, 10, 12, 13, 15, and 16. Our results highlight the importance of polyhydroxylated aromatic substructures for effective attachment in the binding/catalytic sites of nsps involved in post-translational mechanism pathways. As such with the nsps playing vital roles in viral pathogenesis, our findings provide inspiration for the design and discovery of novel anti-COVID-19 drug prototypes. Supplementary Information The online version contains supplementary material available at 10.1186/s43141-021-00206-2.


Background
The rapid spread of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) marks itself as one of the deadliest viruses in recent history due to its high mortality and morbidity rates [1,2]. As of May 2021, the World Health Organization recorded over one hundred sixty-seven million cases worldwide with 3.4 million deaths [3]. Continuous efforts are being carried out to unravel the pathophysiology of the virus, paving the way to the discovery and development of efficacious vaccines and anti-SARS-CoV-2 drugs. While the world continues to make strides in vaccine development and rollout, drug-based treatments are still needed to cure the growing number of COVID-19-afflicted individuals. Thus, developing effective therapeutic agents against SARS-CoV-2 remains a global health need.
The discovery of antiviral chemotherapeutic prototypes requires accurate identification of drug targets. Among which, the SARS-CoV-2 non-structural proteins (nsps) are among the highly favored targets because of their role in viral replication, post-translational mechanisms, and host immunity evasion that influence SARS-CoV-2 virulence and pathogenesis [4]. The repurposing of bioactive natural products is one of the key strategies available for screening potential SARS-CoV-2 nsps inhibitors. To date, plant-based medicines as treatment for SARS-CoV-2 infection have not been reported.
Considering the similarity between SARS-CoV-2 and HIV, we repurposed previously reported anti-HIV reverse transcriptase (RT) secondary compounds using in silico simulations in this study. SARS-CoV-2 and HIV are single-stranded RNA viruses that utilize RNAdependent polymerases and code precursor polyproteins vital for their respective pathogenesis. In this paper, we disclose computational interrogation of 104 known anti-HIV RT phytochemicals against seven target proteins, namely nsp3 (PLpro), nsp5 (3CLpro), nsp12 (RdRp), nsp13 (helicase), nsp15 (endoribonuclease), and the nsp16-nsp10 complex (S-adenosylmethionine complex). The thermodynamic stability and the pharmacokinetic characteristics of the top-ranked compounds are also reported.

Target enzyme preparation
Seven target enzymes with important functions in SARS-CoV-2 infectivity were selected and obtained from the Protein Data Bank (PDB): 3CLpro (PDB ID: 6LU7), PLpro (PDB ID: 6W9C), RdRp (PDB ID: 6M71), helicase (6JYT), nsp16-nsp10 complex (6W4H), and nsp15 (6VWW). These proteins in three-dimensional structures were added to UCSF Chimera 1.14 platform as PDB files [20]. All proteins belong to SARS-CoV-2 except for helicase due to unavailability of nsp13. Thus, helicase model from SARS-CoV-1 which shares 99.8% sequence identity and 100% sequence similarity with that of SARS-CoV-2 was used [21]. Coronavirus helicase domains are distinct compared to other (+)-sense RNA virus domains due to the presence of linkage in a single protein to a binuclear zinc-binding domain at the Nterminus. This domain is composed of 12 conserved cysteine-histidine residues and is a good target in antiviral drug discovery [22][23][24][25].

Ligand selection and preparation
A total of 104 plant secondary metabolites (Supplementary Figure 1; Supplementary Table 1) previously reported to inhibit HIV RT [25] were used as ligands targeting the above-mentioned viral proteins. The plant metabolite structures were formatted as SYBYL mol2 file or in SMILES notation using Avogadro (version 1.20) and were added to the UCSF Chimera 1.14 platform [26].

Molecular docking simulations
Molecular docking experiments were carried out on UCSF Chimera 1.14 platform with AutoDock Vina plugged-in as docking algorithm [20]. Protein structures in three dimensions were opened in PDB formats. Cocrystallized ligands and other molecules were removed from the crystallized protein. Ligands were added in the platform as SYBYL mol2 files or in SMILES notation. Ligand and protein structures were minimized through addition of missing hydrogen atoms and charges to the structures using the Gasteiger charge method, which was computed using Amber's Antechamber module [27]. 'Flexible ligand into flexible active site' protocol was followed during execution of docking procedures. In this protocol, flexible ligands were allowed and positioned within a grid box (Supplementary Table 2) which encompasses the enzymatic ligand-binding cavity, as predicted using COACH algorithms [28].
Druggability, ADME, and toxicity prediction Top ten compounds per nsp, which in total were twenty-seven compounds, were selected for druggability, pharmacokinetic, and toxicity analyses. Absorption, distribution, metabolism, and excretion (ADME) properties of top twenty-seven compounds overall were computationally predicted using SwissADME software. Evaluation of pharmacokinetic profiles of compounds was performed according to Lipinski's 'rule of five' which assesses biochemical properties of a drug candidate involved in permeation and cell absorption. Three of the following values need to be met according to Lipinski's criteria: < 500 Daltons (Da) for molecular weight, < 5 for calculated lipophilicity (MLogP), < 10 for the number of hydrogen-bond acceptors, and < 5 for the number of hydrogen bond donors [29]. Moreover, toxicity of hit compounds, specifically mutagenicity, tumorigenicity, reproductive toxicity, and irritant effects, were predicted in silico using OSIRIS Property Explorer software [30]. Solubility (LogS) was also predicted using the same software in which LogS ≥ − 4 indicates good solubility and favorable absorption of compounds.

Molecular dynamics simulations
Molecular dynamics (MD) simulation was employed to understand the dynamic behavior of the top-binding complexes based on molecular docking analysis. All MD simulations were carried out using GROMACS version 2020.1 under the Ubuntu Linux platform version 2020.1-1 [31]. The SARS-CoV-2 non-structural protein topologies were generated using the CHARMM 36 force field with TIP3P water model, while ligand topology was generated using CGenFF (CHARMM general force field). The complex was solvated on a dodecahedron grid by single point charge (SPC) water. The system was then neutralized with counterions. Energy minimization was done on the system using the steepest descent integrator for 5000 steps and Particle Mesh Ewald (PME) algorithm for the Coulomb and van der Waals interactions [32].
After system equilibration, each system was subjected to molecular dynamic simulation for 20 ns at constant temperature of 300 K. The dynamic trajectories were recorded during the production every 0.01 ns which were used to analyze the root mean square deviation (RMSD) and root mean square fluctuation (RMSF) for each system.

Molecular docking with replication-transcription complex enzymes (nsp12 and nsp13)
Among the top ten compounds against RdRp with binding affinities of − 8.6 to − 9.5 kcal/mol, the ellagitannin punicalin (9) exhibited the highest affinity ( Asn497 (4.13 Å), which are both components of the RdRp finger domain that is responsible for the entry and exit of the RNA template during replication-transcription [33]. Moreover, its glucose hydroxyl and hydrogen participated in hydrogen bonding (4.79 Å) and carbon-hydrogen bonding respectively with Asp684, a component of the motif B of the polymerase active site [34]. Other interactions include the participation of its ellagic acid moiety in pi-alkyl interaction with Lys577, galloyl hydroxyl in hydrogen bonding with Gly590 (2.87 Å), carbonyl oxygen in hydrogen bonding with Tyr689 (5.76 Å), and glucose moiety in carbon-hydrogen bonding with Ala685. Meanwhile, the top-ranked ligands against helicase had binding affinities of − 8.4 to − 9.2 kcal/mol in which the biflavonoids rhusflavanone (13) and morelloflavone (14) exhibited the strongest affinity (Fig.  3B). Compound 13 occupied the helicase Rec1A domain, which is a component of the nucleotide binding site, through hydrogen bonding of its chromanone (ring A′) hydroxyl and pyrone (ring C′) oxygen with Lys288 (4.76 Å) and Ala316 (3.32 Å), respectively [35]; pi-alkyl interactions of its chromene moieties, rings A with Ala316 with Ala316, and ring A′ with Lys320; and pi-cation interaction of its hydroxyphenyl moiety (ring B) with Lys320.
Compound 13 also occupied the Rec2A domain of the nucleotide binding site through a hydrogen bond of its pyrone (ring C) carbonyl with Arg443 (4.73 Å), an amide-pi stacked interaction of its hydroxyphenyl moiety (ring B′) with Gly538, and a pi-sigma interaction of ring B′ with Ser539. A pyrone (ring C) carbonyl further contributed to the binding affinity of compound 13 by binding to Thr286 through van der Waals forces. On the other hand, the dihydroxyphenyl moiety (ring B′) of compound 14 bound Glu341 through a relatively strong hydrogen bonding (3.02 Å) and both Ala312 and Val340 by pi-alkyl interactions. Ring A of its chromanone functionality bound Ala313 through a pi-alkyl interaction and also Ala312 by pi-sigma interaction. These residues are members of the helicase Rec2A domain of the nucleotide binding site. In addition, a benzophyrone hydroxyl (ring A′) of compound 14 bound Asp534 (5.32 Å), which is a residue of the Rec1A helicase domain of the nucleotide binding site.
Molecular docking with enzymes functioning in the evasion of host immunity SAM-dependent 2′-O-methyltransferase complex enzymes (nsp16-nsp10 complex) Top compounds against nsp16 exhibited affinities from − 9.3 to − 10.6 kcal/mol. The SAM-binding site was  compound 19's isoquinoline moiety was in H-bonding with Asp6928 (3.68 Å) and Asp6897 (4.60 Å) and in carbon-hydrogen bonding with Gly6869. Another isoquinoline moiety was in pi-anion interaction with Asp6931. Moreover, the naphthalene moiety participated in pi-pi T-shaped interaction with Phe6947 and in pi-sulfur interaction with Cys6914. The methyl group connected to naphthalene manifested alkyl interactions with Met6929, Leu6898, and Cys6913. In connection, top compounds against nsp10 showed binding affinities of − 6.9 to − 7.7 kcal/mol. The interface between nsp10 and nsp16 was targeted and several interactions were observed. Biflavonoid robustaflavone (4) had the highest affinity (Fig. 4B). Its pyrone ring C′ was in carbonhydrogen bonding with Ile4334. Chromanone (ring A′) hydroxyl formed a strong H-bonding with Asp4335 (3.50 Å). Carbon atoms of chromanone (rings A and C) and hydroxyphenyl ring B′ formed salt bridges with Lys4346 while ring C′ carbonyl exhibited a salt bridge with Arg4331.

Endoribonuclease (nsp15)
Top-scoring compounds against nsp15 exhibited affinities of − 8.6 to − 7.3 kcal/mol, stronger ( Table 4). The biflavonone hinokiflavone (15) scored the highest affinity, noting its interactions with its putative binding site that is proximal to the catalytic triad of His235, His250, and Lys290: flavone moiety (rings A, B, and C) in hydrogen bonding (4.55 Å) and pi-alkyl interaction with Met243, pyrone ring C in pi-sigma interaction with Tyr262, pyrone ring C′ in pi-anion interaction with Glu258, and hydroxyphenyl ring B′ in pi-alkyl and pi-pi stacked interactions with Ala256 and His362, respectively (Fig. 4C).

Druggability, ADME, and toxicity
Six of the 20 top-scoring and multi-targeting repurposed phytochemicals were found to be druggable according to Lipinski's rule of five (Table 5). Hinokiflavone (15) is a top-scoring, multi-targeting, druggable compound. Moreover, compounds 5 and 17 were multi-targeting and exhibited good gastrointestinal absorption properties.
In addition, compounds 25 and 26 showed the best solubility in water of − 2.85, thereby depicting good excretion properties (Table 6). Toxicity prediction through OSIRIS Property Explorer showed that all top  (Table 6).

Molecular dynamics simulations
Molecular dynamics (MD) simulation was performed on the top-binding ligands, chosen based on molecular docking and ADMET analyses, to assess at an atomic level the binding behavior of the various polyphenols against SARS-CoV-2 non-structural proteins. The stability of the complexes, specifically PLpro-amentoflavone (1), 3CLpro-amentoflavone (1), RdRp-punicalin (9), helicase-rhusflavanone (13), nsp16-michellamine B (19), nsp10-robustaflavone (4), and nsp15-hinokiflavone (15), was evaluated using post-simulation parameters root mean square deviation (RMSD) and root mean square fluctuations (RMSF). RMSD is one of the widely used analyses using MD trajectories of protein-ligand complexes to establish equilibrium within a given simulation period. Based on RMSD analysis which was measured as an average throughout a 20-ns simulation, the complexes attained dynamic stability (Fig. 5). In the case of the amentoflavone (1)-bound PLpro, it took some time for the complex to reach equilibrium. As shown in the plot of RMSD (Å) versus simulation time (ns), a steady increase in RMSD can be observed up to 15 ns before stabilization occurs. A similar trend can be observed for the RdRp-punicalin (9) complex. After a steady rise in RMSD, the complex achieved equilibrium around 13 ns with an average RMSD value of 7.1 Å. Among the complexes, the amentoflavone (1)-bound 3CLpro appeared to be the most stable complex having the lowest average equilibrium RMSD (2.2 Å). Although a minor fluctuation can be observed around 11 ns, the complex remained stable for the entire simulation time. In the case of nsp16-michellamine B (19) complex, an incremental increase in RMSD can be noted from the start of the simulation until 8 ns and a relatively high divergence can be seen around 9 ns. However, the system gained equilibrium thereafter. For the helicase-rhusflavanone (13) complex, several minor fluctuations can be noticed from initial binding stage up to mid simulation time. Despite this observation, the average RMSD of the complex remained low (2.8 Å). Another relatively stable complex is robustaflavone (4) bound to nsp10 with an average equilibrium RMSD of 3.9 Å. Shortly after ligand-binding, the complex attained equilibrium. Lastly, the nsp15-hinokiflavone (15) complex exhibited relatively low stability based on the RMSD plot where some minor fluctuations are noted at the beginning of the simulation up to 15 ns. The time-averaged residual fluctuations of the seven top-binding complexes were also analyzed on the basis of trajectory data within a 20-ns simulation (Fig. 6). The results of the residual RMSF analysis revealed that most of the complexes, particularly 3CLpro-amentoflavone (1), nsp16-michellamine B (19), nsp10-robustaflavone (4), helicase-rhusflavanone (13), have average RMSF values ranging from 1.1 to 1.6 Å and have shown relatively stable fluctuation patterns. These data are consistent with RMSD analysis, which confirm that the said complexes are stable. For the hinokiflavone (15)-bound

Discussion
The SARS-CoV-2 non-structural proteins (nsps) play vital roles in the virus' pathogenesis, survival, and virulence. A number of these nsps have been considered as attractive and important drug targets due to their involvement in viral post-translational processing, replication, and host immunity evasion mechanisms (Fig. 7).  The cysteine proteases, nsp3 (PLpro) and nsp5 (3CLpro), are involved in the autolytic cleavage of the polyproteins pp1a and pp1ab wherein PLpro cleaves 3 sites at the Nterminus while 3CLpro cleaves through the remaining sites (11 sites in pp1ab) to release the nsps [36]. Proceeding to the replication-transcription complex, nsp12 (RdRp) elongates the daughter strand through the polymerization of nucleotides while nsp13 (helicase) clears RNA secondary structures and RNA-binding proteins [37]. The nsp 16 (SAM-dependent 2′-O-methyltransferase) in complex with nsp10 as its cofactor provides a 5′ cap to the RNA genome through C2′-Omethyl-ribosyladenine, conferring RNA stability and host cell immunity protection [38]. Lastly, the nsp15 (endoribonuclease) hinders recognition of dsRNA intermediates by host dsRNA sensors [33]. Our results, therefore, highlight the role of anti-HIV RT phytochemicals as potential antagonists of SARS-CoV-2 by interfering with the discussed mechanisms. Natural products have been a subject of investigation concerning their ability to antagonize SARS-CoV-2 due to their availability and wide range of health benefits [34,35,[39][40][41]. In relation, repurposing established anti-HIV phytochemicals means that the lead compounds in this study can be easily obtained from previously explored plants that are consumed by populations. Here, we focused on the employment of computational targetbased drug discovery methodologies, such as molecular docking, molecular dynamic simulations, and pharmacokinetic property predictions in search for potential hits for inhibiting the aforementioned SARS-CoV-2 nsps. Our study revealed that the biflavonoid amentoflavone (1) showed the highest binding to both SARS-CoV-2 cysteine proteases PLpro and 3CLpro. Compound 1, isolated from the Chinese olive fruit, Canarium album [42], was also reported in a previous study to be a potent inhibitor of SARS-CoV PLpro [43,44]. Volkensiflavone (11) from the seeds and rinds of Garcinia intermedia [45] was another top compound against 3CLpro. Punicalin (9) from the pomegranate Punica granatum peel [46] exhibited high binding propensity against RdRp, an enzyme considered to be a promising target inhibiting viral replication. Morelloflavone (14) from G. intermedia was first to be reported here with an inhibitory potential against SARS-CoV-2 helicase. Interestingly, its potential extends against 3CLpro [47]. Robustaflavone (4) from the leaves of Garcinia epunctata [48] showed the best potential against the 2′-O-methyltransferase and its cofactor. This is the first investigation of its activity against these nsps aside from its interaction with 3CLpro [49]. On the other hand, michellamine B (19) from Ancistrocladus korupensis leaves [50] manifested an inhibitory  [51] was reported to be a potential 3CLpro inhibitor and potent against the replication-transcription complex [43,47,49,52]. This, however, is the first investigation of its activity against the endoribonuclease of SARS-CoV-2 in silico. The multi-targeting potential of some of these compounds increases the chance of getting a maximal inhibitory effect [53].
To further validate the molecular docking analysis, the top-binding ligands were submitted for molecular dynamic simulations. Through post-simulation analyses, the top-binding ligands were generally found to be dynamically stable upon binding to respective proteins. Although most of the top 1 compounds were predicted in silico to be non-druggable, efforts are rising to explore compounds in the oral druggable space beyond the rule of five (bRo5) [54,55]. Additionally, four of these did not manifest toxicity in silico. The biflavonoids volkensiflavone (11), morelloflavone (14), and hinokiflavone (15) were computationally predicted as non-mutagenic, nontumorigenic, and non-irritant, but were predicted to pose reproductive toxicity risk which may be attributed to their chromene and hydroxyphenyl moieties. It should be noted, however, that hinokiflavone (15) is a druggable top 1 compound. Michellamine B (19) was predicted to be tumorigenic due to its naphthalene moiety. In addition, compounds 5 and 17 exhibited good gastrointestinal absorptive features as implicated by their favorable lipophilicity and polar surface area [56]. These also did not manifest any form of toxicity in silico. Despite computational incompatibilities, these compounds can still serve as templates for drug design and undergo in vitro and in vivo assays for validating their anti-SARS-CoV-2 properties, noting that their promising polyphenolic nature allowed them to form hydrogen bonds with key residues of the SARS-CoV-2 nsps. With the validation of pre-clinical experiments, the secondary metabolites can be produced through in vitro plant tissue cultures that can be augmented by metabolic engineering, elicitation, and even the use of bioreactors [57,58].

Conclusions
The search for anti-COVID-19 therapeutic agents is a response to the continuous spread of the virus amidst vaccine availability. The similarity between the pathogenesis of HIV and SARS-CoV-2 inspired the repurposing of previously reported anti-HIV reverse transcriptase phytochemicals against SARS-CoV-2 nsps implicated in viral replication, posttranslational processing, and host immunity evasion mechanisms. The top-ranking polyphenolics amentoflavone (1), robustaflavone (4), punicalin (9), volkensiflavone (11), rhusflavanone (13), morelloflavone (14), hinokiflavone (15), and michellamine B (19) can be further screened using confirmatory in vitro and in vivo assays, and can serve as prototypes for designing novel anti-COVID-19 drugs in consideration of their polyphenolic nature. As promising drug templates, functionalities in the compound structure can be modified to improve druggability and pharmacokinetic properties.

Additional files
Additional file 1

Acknowledgements
Not applicable.
Authors' contributions VNODL, JAHM, DYHP: conceptualization, investigation, data collection and analysis, manuscript preparation, and manuscript editing. RATF: conceptualization, investigation, data collection and analysis, manuscript preparation, and manuscript editing and review. JKARC: data collection. MTJQ: conceptualization, investigation, data collection and analysis, manuscript editing, and manuscript review. JCMA: investigation and manuscript editing. KIRN: conceptualization, investigation, manuscript editing and manuscript review. APGM: conceptualization, design, manuscript preparation, editing, and review. All authors have read and approved the present version of the manuscript.

Funding
Not applicable.

Availability of data and materials
All data generated in this study are included in this published article and the supplementary information files.