SARS-CoV-2 host cell entry: an in silico investigation of potential inhibitory roles of terpenoids

Background Targeting viral cell entry proteins is an emerging therapeutic strategy for inhibiting the first stage of SARS-CoV-2 infection. In this study, 106 bioactive terpenoids from African medicinal plants were screened through molecular docking analysis against human angiotensin-converting enzyme 2 (hACE2), human transmembrane protease serine 2 (TMPRSS2), and the spike (S) proteins of SARS-CoV-2, SARS-CoV, and MERS-CoV. In silico absorption-distribution-metabolism-excretion-toxicity (ADMET) and drug-likeness prediction, molecular dynamics (MD) simulation, binding free energy calculations, and clustering analysis of MD simulation trajectories were performed on the top docked terpenoids to respective protein targets. Results The results revealed eight terpenoids with high binding tendencies to the catalytic residues of different targets. Two pentacyclic terpenoids (24-methylene cycloartenol and isoiguesteri) interacted with the hACE2 binding hotspots for the SARS-CoV-2 spike protein, while the abietane diterpenes were found accommodated within the S1-specificity pocket, interacting strongly with the active site residues TMPRSS2. 3-benzoylhosloppone and cucurbitacin interacted with the RBD and S2 subunit of SARS-CoV-2 spike protein respectively. These interactions were preserved in a simulated dynamic environment, thereby, demonstrating high structural stability. The MM-GBSA binding free energy calculations corroborated the docking interactions. The top docked terpenoids showed favorable drug-likeness and ADMET properties over a wide range of molecular descriptors. Conclusion The identified terpenoids from this study provides core structure that can be exploited for further lead optimization to design drugs against SARS-CoV-2 cell-mediated entry proteins. They are therefore recommended for further in vitro and in vivo studies towards developing entry inhibitors against the ongoing COVID-19 pandemic. Supplementary Information The online version contains supplementary material available at 10.1186/s43141-021-00209-z.

different receptors [32,56,57]. Cell entry of coronaviruses depends on a fine interplay between the viral membrane spike (S) proteins and the host cell membrane proteins more importantly are the angiotensin-converting enzyme 2 (ACE2) and serine protease transmembrane protease serine 2 (TMPRSS2) [7]. The S-protein comprises two subunits; S1 as the receptor-binding domain (RBD) while S2 subunit is for the fusion of viral membrane and host cellular membrane. The SARS-CoV-2 relies on the host ACE2 for entry and the TMPRSS2 for S-protein priming. Upon binding of the S-protein to host receptor through the receptor-binding domain (RBD) in the S1 subunit, the S2 subunit mediates fusion of the viral envelope with the host membranes [12]. Although the overall sequence similarity between S-protein of SARS-COV-2 and SARS-CoV is approximately~76%, affinity between S-RBD of SARS-COV-2 and ACE2 is found to be approximately four fold higher when compared with SARS-CoV RBD [12,64]. This molecular interaction is responsible for regulating both the cross-species and higher human-tohuman transmissions of SARS-CoV-2 [63,74]. Therefore, these protein effectors of viral attachment, membrane fusion, and cell entry are known as emerging targets for development of entry inhibitors, antibodies, and vaccines [74].
The use of phytomedicines as alternatives to combat viral diseases and other infections forms an integral component of African cultural practices, and hence a prominent feature in Africa [3,5,18,37,41,61]. Terpenoids are a well-known class of phytochemicals of tremendous pharmaceutical value over time because of their relevant broad-spectrum utility in medicine [17,23,40]. Screening a database of phytochemicals from indigenous African medicinal plants may help identify terpenoids with therapeutic potentials against the COVID-19 pandemic. Therefore, this study explores computational screening of terpenoids from indigenous African medicinal plants as potential inhibitors of the emerging proteins responsible for coronavirus cell entry and subsequent infection.

Ligand preparation
One hundred and six bioactive terpenoids from African medicinal plants were collected based on literature search. Structure Data Format (SDF) of the reference inhibitors (S1: MLN-4760; S2: camostat and S3: nelfinavir mesylates) and 106 bioactive terpenoids derived from African plants were retrieved from the PubChem database (www.pubchem.ncbi.nlm.nih.gov) and converted to mol2 chemical format using Open babel [39]. Other compounds that were not available on the database were drawn with Chemdraw version 19 and converted to mol2 chemical format. Polar hydrogen charges of the gasteiger-type were assigned and the nonpolar hydrogen molecules were merged. The ligand molecules were further converted to the dockable pdbqt format using MGL-AutoDockTools (ADT, v1.5.6) [36].

Molecular docking
Molecular docking was performed to evaluate the binding affinity and to provide initial coordinates and topology parameters for the MD simulations. The screening of human enzymes and active regions of the coronaviruses spike protein and determination of binding affinities were carried out using AutoDock Vina [59]. The binding scores from vina analysis were further validated by BINDSURF [48]. Docking of bioactive terpenoids and reference compounds against human ACE2, human TMPRSS2, and SARS-CoV-2 spike protein was performed by AutoDock Vina to locate alternate binding sites enclosing the whole macromolecules. Default settings of Vina wase used, as the scoring matrix in this program is stochastic, and each run uses a random seed position except for the grid box which was adjusted with extended grid size (60 Å × 60 Å × 60 Å) to reveal all the possible interaction sites. The molecular docking was executed using a flexible docking protocol; all bonds contained in ligand were allowed to rotate freely, making the receptor rigid. Once the molecular docking experiments were completed and 10 configurations for each protein-ligand complex were generated for all the terpenoids, text files of scoring results were also produced for the purpose of manual comparative analysis. The top docked terpenoids were uploaded into the respective columns of BINDSURF webserver to validate and calculate the energetic interactions. The molecular interactions between proteins and selected compounds with higher binding affinity to the proteins were viewed with Discovery Studio Visualizer version 16.

Molecular dynamics simulation
Molecular dynamics simulations were carried out on the top ranked terpenoid to respective protein targets (SARS-CoV-2 spike (S) protein, human angiotensinconverting enzyme 2 (ACE2), and transmembrane protease serine 2 (TMPRSS2)). The complexes were prepared and solvated, in TIP3P water model and neutralized by adding NaCl ions and its concentration was set to 0.154 M using CHARMM-GUI webserver prior to running MD simulation using Nanoscale Molecular Dynamics (NAMD V 2.13) [6,27,44]. The ligands (terpenoids) were parameterization on the SwissParam webserver. The TIP3P water model was used to resemble the added water box, with 10 Å padding, for the periodic boundary condition to be applied [34]. Nose-Hoover Langevin piston was used to control the pressure at 1.01325 bar. In contrast, Langevin dynamics controlled the system's temperature at the physiological value (310 K, 7.0, and 0.154 M NaCl, respectively). The time step was set at its default two fs with SHAKE approximation. Visualizing molecular dynamics (VMD 1.9.3) software was used to prepare the input files and analyze the output trajectories [22]. Minimization step for the complexes was initiated for 10,000 steps using a conjugate gradient algorithm in constant number of atoms, constant volume, and constant temperature ensemble (NVT) using CHARMM 36 force field. Afterwards, equilibration of each system for one nanosecond was started in constant number of atoms, constant pressure, and constant temperature ensemble (NPT). Finally, a production run for 100 ns for each system was initialized in NVT ensemble. Periodic Boundary Conditions (PBC) was applied to the simulation. Trajectories were extracted each 0.1 ns and time step was set to 2 femto second. The analysis of the dynamics was performed by utilizing VMD scripts to calculate root mean square deviation (RMSD), root mean square fluctuation (RMSF), surface accessible surface area (SASA), radius of gyration (RoG), and hydrogen bonds (H-bonds) [22]. All the analyses were performed after removing the PBC using the pbctools package in VMD using this command pbc unwrap-sel "selection" where selection is replaced by the appropriate name.

Clustering of molecular dynamic trajectory
Afterwards, TTClust V 4.9.0 [60] was used to cluster the whole trajectory (1000 frame) using the elbow method to calculate the optimum number of clusters. For each representative frame produced, Protein Ligand Interaction Profiler (PLIP) [47] was used to know the types and number of interactions between the protein and the ligand.

MM/GBSA calculation and MM/GBSA free energy decomposition analysis
To calculate the binding free energies of the top docked terpenoids to each of the protein target, molecular mechanics-generalized born surface area (MM-GBSA) was calculated using the version implemented in Amber-Tools 20 for all frames in the trajectory [35,54]. Saltcon variable was set to 0.154 M and igb, which determines the generalized born method to use, was set to the default value of five. After the decomposition process, the energy contribution could be distributed to each residue of receptor and the binding interaction of each ligandresidue pair consists of three energy terms: van der Waals contribution (ΔE vdw ), electrostatic contribution (ΔE ele ), and the desolvation term (ΔG desolvation ) which included the polar (ΔG GB ), the non-polar (ΔG SA ), and total free energy (ΔG total ) term. Fifty frames separated by equal intervals of 20 frames were used to generate the binding free energies and were also used for the free energy decomposition analysis.

Drug-likeness and ADMET studies
The top terpenoids that demonstrated highest binding affinity for ACE2, TMPRSS2, and active regions of SARS-CoV-2 spike protein were subjected to several drug-likeness predictive descriptors which orally bioactive drug should comply [30,38]. The predicted absorption, distribution, metabolism, excretion, and toxicity (ADMET) studies were analyzed using the ADMET webserver [10]. The SDF file and SMILES of the compounds were downloaded from PubChem database to calculate ADMET properties using default parameters. The result from the docking analysis of the reference inhibitors and bioactive terpenoids with the human ACE2, TMPRSS2, and SARS-CoV-2 spike protein is shown in Table S1 (supplementary material). The top 20 terpenoids with the highest binding affinity for the ACE2 were further analyzed for binding interactions with SARS-CoV-2 chimeric receptor-binding domain complexed with its human receptor ACE2 (ACE2-RBD) and the S protein of SARS-CoV and MERS-CoV (Table  S3, supplementary material) (Fig. 2).

Molecular docking
The docking analysis revealed that the reference inhibitor (MLN-4760) to the human ACE2 protein had binding energy of − 7.7 Kcal/mol, respectively, while camostat an inhibitor of TMPRSS2 had a binding energy of − 7.6 Kcal/mol as represented in Fig. 3. It was further observed that the topmost docked terpenoids to the ACE2 had higher binding affinity for the S protein of SARS-CoV and MERS-CoV than SARS-CoV-2. More than 10 terpenoids had higher binding affinity than the 3 inhibitors used in this study (Table S1: supplementary material). The top 20 docked compounds to SARS-CoV-2 S-proteins had higher binding affinity than nelfinavir mesylates (Table S3: Supplementary material).
From the binding scores generated by the interacting terpenoids with the ACE2 and TMPRSS2 proteins, the top 2 docked terpenoids with the highest binding affinity are 24-methylene cycloarteno and isoiguesterin with corresponding binding energy of − 9.7, and − 9.5 Kcal/mol, respectively. The best two docked terpenoids to SARS-CoV-2 S protein are 3-benzoylhosloppone and cucurbitacin with binding energies of − 9.4 and − 9.3 Kcal/mol respectively. 3-benzoylhosloppone had the highest binding affinity for SARS-CoV-2 S protein and the second top binding affinity to MERS-CoV S protein (Fig. 3).

Interaction of selected terpenoids with amino acids of target proteins
The amino acid interactions of the human target proteins (ACE2 and TMPRSS2) with reference inhibitors and plant derived terpenoids that demonstrated the highest binding tendencies are represented in Table 1. In the same way, the amino acid residues of the coronaviruses S protein that interacted with reference inhibitors and terpenoids with the highest binding affinity are shown in Table 2. The interacting residues of human ACE2 and TMPRSS2 with respective ligand groups were majorly through hydrophobic interactions and H-bond. Few H-bonding below 3.40 Å were observed with coronaviruses S protein (Table 1 and Fig. 3). The binding of MLN-4760 to ACE2 showed that it was docked into the N terminus and zinc-containing subdomain I of ACE2   73 , and TRP 69 residues respectively in a similar binding pattern with MLN-4760 (Fig. 4c). Camostat was docked into the S1-specificity pocket of TMPR SS2 (Fig. 5a). It interacted via conventional H-bond to five amino residues (ARG 41 , SER 195 , TRP 215 , ALA 190 , and ASP 189 ) and via carbon hydrogen bond to GLN 192 of TMPRSS2. The conventional H-bond was formed in the direction of the guanidine group in this order: first ester bond, second ester bond, while the last three residues interacted with amidino nitrogen of guanidine group, respectively. The phenyl ring was responsible for the carbon-hydrogen bond with GLN 192 (Fig. 5a). T3 and T4 were docked into S1-specificity pocket of TMPR SS2 in a similar binding pattern as in the case of camostat (Fig. 5b, c). The only difference observed between the binding pattern of T3 and T4 was an additional Hbond between T3 with ARG 41 (Fig. 5b). Nelfinavir mesylates an inhibitor of SARS-CoV and MERS-CoV S protein interacted in its best docked conformation to the S protein of SARS-CoV-2 in a different manner. Nelfinavir mesylates was docked into the S2 subunit of SARS-CoV S protein (Fig. 7a). The same inhibitor was docked into to the N-terminal domain (NTD) region of the S1 subunit of SARS-CoV-2 and MERS-CoV S protein (Figs. 6a and 8a). 3-benzoylhosloppone with the highest binding affinity for SARS-CoV-2 S protein interacted via H-bond to THR 547 ; Alkyl interaction to PHE 541 and Pi-Alkyl interaction to PRO 589 and LEU 546 . The region of interaction was between the CTD and SD1 region of S1 subunit of SARS-CoV-2 S protein. Cucurbitacin B was docked to the S2 subunit of SARS-CoV-2 S protein but interacted with different amino acid residue. The interaction of cucurbitacin B to the protein was via H-bond   (Fig. 6c). The same pattern of interaction was observed in both 7-Deacetoxy-7-oxogedunin and 3friedelanone to the S2 subunit of SARS-CoV S protein.

Energy profile of best docked terpenoids to respective proteins
The overall energy profiles of terpenoid-protein complexes in the selected clusters with the best docked poses are shown in Figures S1-(supplementary data). Figure  S1a-a (supplementary data) shows the breakdown of the binding energy of the selected cluster into different contributions. Gauss 1 (blue) and 2 (leaf green) bars represent the non-bonding interactions, red bar: repulsion, light blue bar: hydrophobic, purple bar: hydrogen bonds, light green bar: rotational forces, while the black bar represents total binding affinity which is a representative contribution of all bonding and non-bonding interactions between the terpenoids and the protein residues. The contributions of the various type of interaction as presented in graph (Figures S1a-a: supplementary data) shows that of the total binding energy of − 9.7 Kcal/mol exhibited by the binding of 24-methylene cycloartenol to the ACE2, − 2.1 and 1.8 Kcal/mol of hydrophobic and H-bond energies respectively was contributed, while the rest were contributed by non-bonding interaction mainly van der Waals, repulsive, and rotational forces.

Molecular dynamics simulation
Four compounds including camostat, T3, 24-methylene cycloartenol, and 3-benzoylhosloppone were analyzed for their interactions with transmembrane protease serine 2 (TMPRSS2), and Angiotensin-converting enzyme 2 (ACE2) and SARS-CoV-2 Spike glycoprotein (S protein). Molecular dynamics simulation was done on each of the target protein-terpenoids complexes and the trajectories were analyzed. The radius of gyration (RoG), root mean square deviation (RMSD), root mean square fluctuation (RMSF), and surface accessible surface area (SASA) results were calculated for each trajectory. The RoG values give indication on the folding/unfolding of the protein. There was no observed difference between the RoG of TMPRSS2_camostat and TMPRSS2_T3 complexes (Fig. 9a)   protein_3-benzoylhoslopponecomplex are the most fluctuating. The RMSD values show the deviation of each frame from the initial configuration (Fig. 9a, b). The average RMSD values from the plots of the TMPRSS2_ camostat (2.13 Å) and TMPRSS2_T3 (2.14 Å) system were very close, while the ACE2-24_methylene cycloartenol and S protein_3-benzoylhosloppone complexes are around 3.6 Å and 16.78 Å, respectively (Figs. 10 and 11). The SASA plots indicate the rate of conformational changes in the protein based on its solvent accessibility. TMPRSS2_cemostat, TMPRSS2_T3, ACE2_24-methylene cycloartenol, and S protein 3-benzoylhosloppone complexes have average values of 11563 Å 2 , 11498 Å 2 , 29667 Å 2 , and 53680 Å 2 (Fig. 10). The RMSF plots give information on the fluctuation of individual amino acids. All the four complex systems have spikes at the end of RMSF plots that indicates the motion of the terminals. The mean RMSF values for TMPRSS2_camostat and TMPRSS2_T3 are 0.68 and 0.73 Å (Fig. 12a), while the ACE2_24-methylenecycloartenol and S protein_(3-benzoylhosloppone) complexes were fluctuating around 1.29 Å and 7.36 Å, respectively (Fig. 12b). The spikes in the middle and the start of the RMSF of ACE2_(24-methylene cycloartenol) complex between amino acid 265 and amino acid 443 and spikes in S protein_(3-benzoylhosloppone) complex corresponds to the loops in the two protein respectively (Fig. 12).  Figure S4 (supplementary data) shows the first and last cluster representatives for the protein-terpenoids complexes and the mode of interaction in the enlarged part of the image. Images were generated using PyMol software V 2.2.2.

Molecular mechanics/generalized born surface area and decomposition analysis
MM/GBSA free energy decomposition analysis was employed to decompose the total binding free energies (ΔGbind) into terpenoid-residue pairs, which would provide more detailed information regarding the contribution of each residue for ligand binding. It is obvious that the residue spectrograms of the TMPRSS2 systems were similar, though with different intensity of interactions.   (Fig. 13).

Drug likeness and pharmacokinetic properties of selected terpenoids
The result generated from the Lipinski and ADMET filtering analyses are represented in Table 4 and Figure S5 (supplementary file). Four terpenoids T1, T3, T5, and T6 fulfilled the requirement for Lipinski analysis of the rule-of-five with corresponding favorable predicted Fig. 9 The radius of gyration plots for a TMPRSS2_camostat and TMPRSS2_11-hydroxy-2-(3,4-dihydroxybenzoyloxy)abieta-5,7,9(11),13-tetraene-12one and b ACE2_24-methylene cycloartenol and S protein-3-benzoylhosloppone complexes ADMET parameters. The in silico predictive pharmacokinetic and ADMET properties from the filtering analyses suggested T1, T3, T5, and T6 with a high probability of absorption, subcellular distribution, and low toxicity. Though pharmacokinetic analysis indicated T1 (Table 4) to be less soluble while the ADME/tox analysis indicated high aqueous solubility, ability to pass the high human intestinal absorption, low acute oral toxicity with a good bioavailability score as exhibited by T3, T5, and T6 (Table 4).

Discussion
The prediction of drug-target interactions especially in new proteins is an essential stage in the drug discovery and development process [33]. Interference with several proteins that mediate viral attachment, membrane fusion, and cell entry of coronaviruses is an emerging therapeutic strategy for preventing COVID-19 infection [7,20]. This principle was earlier demonstrated with HIV [13,19] and SARS-CoV [2]. Earlier screening and prospecting of therapeutic phytocompounds have been reported for both SARS-CoV and MERS-CoV [42,46,50,65]. Cell-based assays have shown the antiviral potentials of specific plant terpenoids against severe acute respiratory syndrome coronavirus (SARS-CoV) [65,70]. This study was therefore undertaken to identify plantderived terpenoids with inhibitory potentials against membrane-mediated SARS-CoV-2 entry proteins. Specifically, two triterpenes namely 24-methylene cycloartenol and isoiguesterin were reported to target ACE2 as well as the host-virus interface (S-protein-ACE2 receptor complex). These compounds interacted with adjacent residues in the conserved domain, apparently portraying its ability to bind and block interactions of hotspot 31 residues. The residues near lysine 31, and tyrosine 41, 82-84, and 353-357 in human ACE2 are important for the binding of S-protein of coronavirus [28]. The hotspots, 31 and 353, make salt bridge between Lys31 and Glu35, and the hotspot 353, comprising a salt bridge between Lys353 and Asp38, and are both buried in hydrophobic environment; therefore, interaction within this  region is suggested to affect the binding of its substrate [69]. In a similar study in which five selected phytochemicals from Chinese and Indian herbs, though the individual compounds interacted differently with the active site of ACE2, they tend to distort the conformation that is necessary for its binding to the viral S protein [4]. The binding interactions of 24-methylene cycloartenol and isoiguesterin to the Site-2 binding site of ACE2 were similar to the pattern exhibited by some repurposed drugs such as delapril and lisinopril perindopril [24]. Abietane diterpenes, namely 11-hydroxy-2-(3,4-dihydroxybenzoyloxy) abieta-5,7,9(11), 13-tetraene-12-one (T3), and 11-hydroxy-2-(4-hydroxybenzoyloxy)-abieta-5,7,9(11), 13-tetraene-12-one (T4) showed the strongest interaction with TMPRSS2. In a similar binding pattern to camostat, these compounds were fitted into the S1specificity pocket. They interacted with residue ALA 190 , ASP 189 , and GLN 192 that are known to be part of the amino acid found at the basement of the pocket. ASP 189 at the bottom of the pocket is known to determine the specificity of the S1 pocket for basic residues Arg and Lys at position P1 of the substrate [26]. The result showed that the hydroxybenzoyloxyl moiety of the terpenoids (T3 and T4) was responsible for at least 75% of the H-Bond with the protein. It was further observed that just as in the case of benzamidine (the native ligand) and camostat, the hydroxybenzoyloxyl moiety of the two terpenoids points with its hydroxyl group towards the carboxylate group of ASP 189 forming strong H-bonds with ASP 189 and other residue in the pocket. For camostat, the phenylquanidine moiety pointed into the hydrophobic pocket with the negatively charged ASP 189 at its bottom. Unlike the H-bond formed between the amidino nitrogen of the phenylquanidine and benzamidine, in T3 and T4 the H-Bonds were formed mainly with the hydroxyl and carboxylate group. A striking similarity observed was that the ester bond that linked both the phenylquanidine moiety of camostat and the hydroxybenzoyloxyl moiety of T3 and T4 to the remaining structural unit of the compounds formed strong H-Bonds to the same residue SER 195 . The phenyl group of the hydroxybenzoyloxy moiety of T3 and T4 further interacted with hydrophobic interactions to CYS 119 and CYS 219 just as the peptide planes of the bonds between TRP 215 -GLY 216 and CYS 191 -GLN 192 sandwich the phenyl ring of benzamidine [16,26]. The additional hydrophobic interaction by T3 and T4 may have been responsible for the exhibited higher binding affinities than camostat and benzamidine. Furthermore, while the hydroxybenzoyloxy moiety was directed towards the hydrophobic cleft created by ASP 189 , the abietane agylcon interacted with the imidazol ring of HIS 57 of the S2 pocket that is found next to the S1 pocket and ARG 41 (in the case of T4) which are outside the hydrophobic cleft. A similar interaction as the later was observed with camostat. The strong similarity in the binding pattern and even a far strong binding affinity than camostat and benzamidine indicates that T3, T4, and other abietane diterpenes especially those with hydroxybenzoyloxyl moiety attached to the abietane aglycon are potential inhibitors of TMPRSS2, thus preventing some coronaviruses from entering host [26]. Some natural compounds were found to interact with the protease furan of TMPRSS2, and these compounds exhibited different binding modes in the active site [52,62]. It is known that, like SARS-CoV, SARS-CoV-2 S protein recognizes and binds to host-cell receptor angiotensinconverting enzyme 2 (ACE2) using a transmembrane protease serine 2 (TMPRSS2) which activates the S protein to facilitate viral fusion and entry into cells [68]. It is important to note that serine protease inhibitors like camostat mesylate, which blocks the activity of TMPR SS2 [77], has been approved in Japan for human use. Related compounds with antiviral activity potentiates as anti-SARS-CoV-2 agent [71]. Also, some abietane terpenoids have been identified to exhibit in vitro anti-SARS-CoV activity [65]. This corroborates the result of our study that shows that abietane diterpenes exhibits a wide spectrum and multiplicity of protein binding and may thereby specifically execute a complete blockage of viral entry. With regard to coronavirus S-proteins, 3benzoylhosloppone and cucurbitacin B were the two terpenoids of utmost interest. While 3benzoylhosloppone interacted with amino acid residue of the RBD and SD1 region of the S1 subunit, cucurbitacin B was docked into the S2 subunit of SARS-CoV-2 S protein. The former subunit is responsible for receptor recognition while the later mediates the fusion of viral membrane and the host cellular membrane [76]. Some phytochemicals known to interact with the RBD region and other binding site of the SARS-CoV-2 S protein have been reported to disrupt the binding of the S protein to the ACE2 protein [4,45]. These terpenoids may prevent interaction of spike protein with its host cell receptor, thereby preventing entry of virus into host cell. 3-benzoylhosloppone has been reported for its antimalarial property while cucurbitacin B is an anticancer agent [1,15].
Molecular dynamics (MD) simulations was performed after docking analysis to assess the physical transitions of atoms to effectively adopt the structureto-function relevance of top docked terpenoids-target proteins and to further understand the dynamic behavior of the top docked terpenoids in the binding site of the various conformations of the target protein complexes in a dynamic environment [75]. The stability and structural/conformational fluctuations that occurred in the target proteins-terpenoids systems were monitored by clustering analysis of the MDS trajectory files. The RMSD is a plausible measure of protein stability. RMSD data shows how much each frame is deviated from the initial conformation of the reference structure as a function of time [11]. The comparison of the RMSD plots for the camostat_ TMPRSS2 and T4_TMPRSS2 systems shows that the binding of T3 did not cause any structure deformation in TMPRSS2 as the binding of camostat. RMSF indicates the flexibility of different regions of a protein and the amino acid residue along the trajectory, which can be related to crystallographic B factors [11]. Though a lower amount of fluctuation occurred at with the interacting residues, it has been established that greater amounts of structural fluctuations usually occur in regions known to be involved in ligand binding and catalysis, notably the catalytic loop regions [14]. The RoG and SASA were assessed to evaluate the structural compactness and the accessibility of solvent to the proteins. A stably folded protein maintains a reasonably steady RoG over the simulation time. The stability of the complex is affected by loss of compactness through the introduction of weak intermolecular bonds [51].The RoG and SASA plots of all the systems did not show fluctuation that indicates deformation of the structural integrity of the proteins. The analyses of the thermodynamic parameters of the systems show that the top docked terpenoid complexed with respective proteins targets were stable and could be therefore subjected to experimental processes in further studies. At a quantitative level, simulation-based methods provide substantially more accurate estimates of ligand binding affinities (free energies) [43]. These results are calculated based on the total binding free energy of the complex. In these calculations, the binding free energy (ΔG bind ) measures the affinity of a ligand to its target protein. The free energy difference between the ligand-bound state (complex) and the corresponding unbound states of proteins and ligands are also employed in the calculations. Thus, the ΔG bind calculations are important to gain in-depth knowledge about the binding modes of the hits in drug design [25]. The result from the MMPBSA calculation further corroborated the docking studies. The same amino acid residues were involved in the interactions with the top docked terpenoids in the static and dynamic states. From the Lipinski, pharmacokinetic, and ADMET filtering analyses, we identified four druggable and non-toxic, natural terpenoids that exhibited strong binding tendency to the various protein targets that mediates coronavirus-host cell entry. The result from the predicted filtering analyses of the four compounds showed parameters that suggest a favorable in silico ADMET and pharmacokinetic properties. The terpenoids expressed high probability of human intestinal absorption. They were also non-substrate to the permeability-glycoprotein (P-gp) [29], expressed capability to cross the blood brain barrier (BBB). SARS-CoV-2 has been reported to infect the brain, thus indicating its ability to cross the blood brain barrier (BBB) [73]. Therefore, compounds that can cross the BBB will be beneficial in the overal all viral clearance. The four terpenoids did not show indication of mutagenicity in silico, thereby they may not cause genetic mutations. The compounds did not display inhibitory potential for the various cytochrome P450, thus may not adversely affect phase I drug metabolism in the liver. These terpenoids are therefore considered as potential drug candidates.

Conclusions
A virtual screening approach was successfully applied to identify plant-derived terpenoids as potential inhibitor of coronavirus cells entry proteins. Two pentacyclic terpenoids (4-methylene cycloartenol and isoiguesterin) interacted strongly with the binding sites residues that are known to interfere with the activity of ACE2. The abietane diterpene especially: 11-hydroxy-2-(3,4-dihydroxybenzoyloxy) abieta-5,7,9 (11), 13-tetraene-12-one (T3), and 11-hydroxy-2-(4-hydroxybenzoyloxy)-abieta-5,7,9(11), 13-tetraene-12-one (T4) exhibited a similar binding pattern to the S1-specificity pocket of TMPRSS2 as camostat (reference inhibitor). They also showed wide spectrum and multiplicity of entry protein binding. The terpenoids binding conformations in the complexes were stable in a simulated dynamic environment. The MM-GBSA binding free energy calculations corroborated the static docking analysis. Since the identified lead terpenoids showed drug-likeness and low toxicity as indicated by the in silico pharmacokinetically relevant molecular descriptors, they are postulated as potential inhibitors that can be considered for further in vitro and in vivo studies towards developing entry inhibitors against the ongoing coronavirus pandemic.
Additional file 1: Table S1. Binding energies of bioactive terpenoids from African plants with higher affinity to human ACE2 and TMPRSS2, and SARS-Cov-2 S protein. Table S2. AutoDock scores (binding energies) of standard drugs and top 20 bioactive terpenoids with human Angiotensin-Converting Enzyme 2 (ACE2), Transmembrane Protease Serine 2 (TMPRSS2), and ACE2-Spike Receptor Binding Domain complex (ACE2-RBD). Table S3. AutoDock scores (binding energies) of standard drug and bioactive terpenoids from selected African phytochemicals to the spike protein of Coronaviruses. Table S4. Shows the number of clusters produced from TTClust, its representative frame for each of the protein-ligand complexes, and the interactions between the ligand and the protein from PLIP webserver for that frame. Figure S1. Energy profile of 24-methylene cycloartenol binding groups in human ACE2: (a) Energetic contribution to the Binding energy (d) Energetic contributions for each atom in the ligand. Number of poses in selected cluster: 68, best pose: 116 and binding site coordinate: 39.14, 35.33, and 12.71. Figure S2.  Figure  S4. The representative structure for each cluster in cartoon representation, ligands in sticks representation and the types of interactions. Graydotted line: hydrophobic interactions, blue lines: H-bond interactions, yellow-dotted lines: salt-bridges interactions, and green-dotted lines: pistacking interactions. Single-letter amino acids are in red color. Figure  S5. Summary of phamacokinetic properties of top binding terpenoids from African plants (a) T1: 24-methylene cycloartenol; (b) T3:11-Hydroxy-2 -(3,4-dihydroxybenzoyloxy) abieta -5,7,9(11),13-tetraene-12-one: (c) T5: 3-Benzoylhosloppone and (d) T6: Cucurbitacin B to the ACE2, TMPRSS2 and S protein of SARS-Cov-2.