The phylogenetic history of human AADAC orthologues suggests that the common ancestral sequences evolved in the avian taxon, with subsequent speciation events occurring in the reptile, amphibian, and mammalian taxa, in that order.
Only nine AADAC orthologues showed interactions with carboxylesterases (CES): those of red fox (Vulpes vulpes), Bolivian squirrel monkey (Saimiri boliviensis), brown rat (Rattus norvegicus), house mouse (Mus musculus), shrew mouse (Mus pahari), Ryukyu mouse (Mus caroli), Mongolian gerbil (Meriones unguiculatus), one-humped camel (Camelus dromedarius), and human (Homo sapiens). It can be inferred that these nine AADAC orthologues may also be able to hydrolyze human AADAC substrates, such as xenobiotics.
The consensus sequence GDSAG, found in mammals, birds, reptiles, and amphibians, belongs to family IV of the α/β hydrolases, also known as the HSL family. The GESAG motif, found in only three orthologues (gray seal (Halichoerus grypus), common seal (Phoca vitulina), and Weddell seal (Leptonychotes weddellii)), belongs to family VII of the α/β hydrolases [43]. Three AADAC orthologues from the order Crocodilia (Alligator sinensis, Crocodylus porosus, and Gavialis gangeticus) and the orthologue of the shrew mouse (Mus pahari) showed the GDSSG consensus sequence. The AADAC orthologues of the natal long-fingered bat (Miniopterus natalensis) and the black flying fox (Pteropus alecto) were the only ones showing the consensus pentapeptide GSSSG (Table 1 and Supplementary Figure S1).
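The pentapeptide variants described above all follow the G-X-S-X-G pattern, so they can be located with a simple pattern search. The following is a minimal sketch, not the procedure used in this study; the function name `find_pentapeptides` and the toy sequence in the usage note are hypothetical, and the labels only repeat the family assignments stated in the text.

```python
import re

# Pentapeptide variants named in the text, with the labels given there.
MOTIF_LABELS = {
    "GDSAG": "GDSAG (family IV, HSL family)",
    "GESAG": "GESAG (family VII)",
    "GDSSG": "GDSSG",
    "GSSSG": "GSSSG",
}

# G-X-S-X-G with the serine in the central position.
# The lookahead lets overlapping candidates be reported as well.
GXSXG = re.compile(r"(?=(G[A-Z]S[A-Z]G))")


def find_pentapeptides(sequence):
    """Return (1-based position, motif, label) for every G-X-S-X-G match."""
    hits = []
    for match in GXSXG.finditer(sequence.upper()):
        motif = match.group(1)
        label = MOTIF_LABELS.get(motif, "other G-X-S-X-G variant")
        hits.append((match.start() + 1, motif, label))
    return hits
```

For example, calling `find_pentapeptides("MKTGDSAGLL")` on this made-up fragment reports a GDSAG hit starting at position 4.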
The HGGG motif forms a loop close to the active site and directly participates in catalytic hydrolysis by stabilizing the oxyanion intermediate of the reaction. The presence of multiple G/Gly residues confers greater flexibility to this loop. Accordingly, some studies indicate that replacing the last G/Gly residue with an A/Ala decreases enzyme activity by up to 40% [44]. Pteropus alecto AADAC was the only orthologue with an A/Ala residue at the last position (HGGA). The YXLXP motif has been previously reported in the telomeric protein SNM1B/Apollo (a member of the metallo-β-lactamase/β-CASP family), where it is involved in binding to TRF2 (telomeric repeat-binding factor 2) [45]. Similarly, the SLX4 protein, which is involved in DNA repair, contains an H/YXLXP motif and binds human TRF2. A similar site in the TRFH domain of TRF1 (around residue F142) recognizes an FXLXP motif and serves as an anchor for the TIN2 protein (TRF1-interacting protein 2) [46, 47].
Another essential structural region is the transmembrane region (Fig. 3C). The amino acids belonging to this region are responsible for anchoring the protein to the lipid bilayer. The most conserved amino acids among the studied orthologues were Y/Tyr, P/Pro, G/Gly, and K/Lys. In contrast, amino acids at positions 3, 5, 11, 12, 15, 21, and 23 showed increased variability among AADAC orthologues.
The predicted physicochemical characteristics of AADAC orthologues are summarized in Supplementary Table S1. The calculated molecular weight (MW) of AADAC orthologues is shown for amphibians, birds, mammals, and reptiles, and the minimum and maximum values are highlighted using a box-and-whisker plot (Fig. 7A). Human AADAC has been previously purified and characterized from liver tissue. Its MW was deduced to be 45.671 kDa by gel electrophoresis [4], which is consistent with the MW value obtained with ProtParam for human AADAC (45.73 kDa) and similar to the average MW value calculated for mammals (45.68 kDa). The highest MW value belonged to AADAC from reptiles (46.77 kDa), while AADAC from birds had an average of 45.1 kDa, making birds the taxon with the lowest MW values.
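A sequence-based MW such as the values above is obtained by summing the average masses of the residues and adding one water molecule for the free termini. The sketch below illustrates that arithmetic under the assumption of unmodified residues and standard average residue masses; the function name `molecular_weight_kda` is hypothetical, and the code is not the ProtParam implementation.

```python
# Average residue masses in Da (amino acid mass minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water for the free N- and C-termini


def molecular_weight_kda(sequence):
    """Average molecular weight of an unmodified polypeptide, in kDa."""
    total = sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER
    return total / 1000.0
```

Glycosylation and other post-translational modifications are ignored here, which is one reason a calculated MW can differ from a value estimated by gel electrophoresis.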
The numbers of acidic (Asp + Glu) and basic (Arg + Lys) residues are related to the isoelectric point of the protein (Fig. 5B, C). Human AADAC orthologues showed isoelectric point (pI) values between 5.5 and 9.3 (Fig. 5D and Supplementary Table S1). The theoretical pI of human AADAC was 8.75, which differed from the experimentally determined pI (9.36) [4]. Mammals showed pI values ranging from 6.0 to 9.3, followed by birds (5.6–9.2), reptiles (5.5–8.5), and amphibians (6.0–8.3).
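The link between charged-residue counts and pI can be made concrete by computing the net charge of a sequence with the Henderson-Hasselbalch equation and searching for the pH at which it is zero. The sketch below is illustrative only: the pKa values are one assumed textbook-style set, not the table used by ProtParam, so the resulting pI will differ somewhat from the values reported here. The function names are hypothetical.

```python
# Assumed pKa values (illustrative; prediction servers use their own tables).
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0, "Nterm": 9.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1, "Cterm": 2.0}


def net_charge(sequence, ph):
    """Net charge of the sequence at a given pH (Henderson-Hasselbalch)."""
    seq = sequence.upper()
    charge = 0.0
    for group, pka in PKA_POSITIVE.items():
        count = 1 if group == "Nterm" else seq.count(group)
        charge += count / (1.0 + 10 ** (ph - pka))
    for group, pka in PKA_NEGATIVE.items():
        count = 1 if group == "Cterm" else seq.count(group)
        charge -= count / (1.0 + 10 ** (pka - ph))
    return charge


def isoelectric_point(sequence, tol=1e-4):
    """Bisection on pH 0-14; the net charge decreases monotonically with pH."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0
```

With this scheme, a sequence richer in Arg + Lys than in Asp + Glu yields a basic pI, and the reverse yields an acidic pI.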
The grand average of hydropathy (GRAVY) value for a protein or a peptide is the sum of the hydropathy values of all amino acids divided by the protein length [48]. Positive GRAVY values indicate hydrophobicity, whereas negative values indicate hydrophilicity. The GRAVY value for human AADAC was negative, implying a hydrophilic character, and similar GRAVY results were observed for mammalian, amphibian, and reptile AADAC (Fig. 7E). In contrast, birds were the only taxon showing positive GRAVY values, indicating hydrophobicity.
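The GRAVY definition given above translates directly into a few lines of code. The following sketch assumes the Kyte-Doolittle hydropathy scale, which is the scale conventionally used for GRAVY; the function name `gravy` is hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Sum of hydropathy values divided by the sequence length."""
    seq = sequence.upper()
    return sum(HYDROPATHY[aa] for aa in seq) / len(seq)
```

A positive return value corresponds to an overall hydrophobic sequence and a negative value to an overall hydrophilic one.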
The instability index of a protein is calculated based on the presence of certain dipeptides. If the instability index of a given protein is less than 40, the protein is considered stable and is expected to have a longer in vivo half-life (~16 h), whereas proteins with an instability index > 40 show an in vivo half-life of less than 5 h [49]. The instability index of human AADAC was 33.96 (Fig. 7E), which classifies it as a stable protein. In fact, human AADAC has already been expressed in Sf9, DH10Bac, and E. coli cells, and subsequently purified and characterized for drug hydrolysis assays [6, 50, 51]. The average instability index values of AADAC orthologues ranged from 40 to 45 (Fig. 7F). Several model organisms used in toxicological studies within the mammalian taxon, such as mice, rats, and dogs, showed a protein instability index > 40, which classifies their AADAC as unstable in vivo, with a half-life of less than 5 h. Despite these values, mouse, rat, and dog microsomes containing AADAC have been successfully used in vitro to hydrolyze various prodrugs, confirming the role of the enzyme in the metabolism of the known substrates of human AADAC [11, 12, 52].
For human AADAC, the aliphatic index was high (99.15), which indicates thermostability. In general, all the AADAC orthologues studied showed aliphatic index values above 86.22, indicating a reliable degree of thermostability (Fig. 7G).
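The aliphatic index reflects the relative volume occupied by aliphatic side chains (Ala, Val, Ile, and Leu). The sketch below assumes the standard formulation, AI = X(Ala) + 2.9 X(Val) + 3.9 [X(Ile) + X(Leu)], where X is the mole percent of each residue; the function name `aliphatic_index` is hypothetical.

```python
def aliphatic_index(sequence):
    """Aliphatic index from the mole percent of Ala, Val, Ile, and Leu."""
    seq = sequence.upper()
    n = len(seq)

    def mole_percent(residue):
        return 100.0 * seq.count(residue) / n

    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )
```

Higher values indicate a larger aliphatic fraction, which is generally associated with increased thermostability of globular proteins.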
The potential N-glycosylation sites of amphibian, bird, mammal, and reptile AADAC were predicted with the NetNGlyc-4.0 server, and the results are detailed in Table 1. In N-glycosylation, an N-acetylglucosamine residue is linked via an amide bond to an asparagine (N/Asn) within the consensus sequence NX(S/T), where X can be any amino acid except proline (P/Pro). The presence of the consensus sequence is required for N-linked glycosylation; however, a potential site is not necessarily glycosylated [53]. For amphibians, the neural network prediction gave two main glycosylation sites. All four amphibian sequences showed a glycosylation site at 78(NVTV/I). For bird AADAC, 39 sequences had one glycosylation site (78(NVTV), 73.6% of sequences) and 12 sequences had two glycosylation sites (78(NVTV)-282(NWS[Q/R/V/L/N]) or 78(NVTV)-88(N[V/T/I][S/T]V), 22.6% of sequences), whereas the sequence of the bird Neopelma chrysocephalum showed no glycosylation site.
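Candidate sites matching the NX(S/T) consensus described above can be listed with a regular expression. This is a minimal sketch that only finds sequons; it does not reproduce the neural network scoring of the prediction server, and, as noted above, a sequon is not necessarily glycosylated. The function name `find_sequons` and the toy sequence in the usage note are hypothetical. Hits are reported as position plus four residues, following the notation used in this section, for example 78(NVTV).

```python
import re

# N, then any residue except P, then S or T.
# The lookahead allows overlapping sequons to be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return (1-based Asn position, four-residue context) for each sequon."""
    seq = sequence.upper()
    sites = []
    for match in SEQUON.finditer(seq):
        start = match.start()
        sites.append((start + 1, seq[start:start + 4]))
    return sites
```

For the made-up fragment `"AANVTVNPSANWSQ"`, the function returns `[(3, "NVTV"), (11, "NWSQ")]`; the NPS triplet at position 7 is skipped because proline occupies the X position.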
Mammalian AADAC showed six different glycosylation patterns. The most common pattern was 78(NV/ITV)-282(NWSS/A) (36 sequences in total, 53.7%), which included the AADAC sequences of Homo sapiens, Pan paniscus, Pan troglodytes, and Canis lupus familiaris. This agrees with results reported in the literature [52], which showed that N-glycosylation at the N282 residue of human AADAC is crucial for enzymatic activity because it enables proper folding. The next most frequent patterns were 77(NVTV) (7 sequences, 10.4%, including Rattus rattus and Rattus norvegicus), 282(NWSS) (5 sequences, 7.5%), and 77(NVTV)-281(NWSS) (4 sequences, 6%). Three glycosylation sites were found in four mammalian sequences (78(NITV)-281(NWSS)-298(NRTY), 78(NVTV)-282(NWSS)-302(NGTS), 56(NHSM)-78(NVTV)-282(NWSS), and 55(NHSM)-77(NVTV)-281(NWSS)), whereas 11 mammalian sequences showed other glycosylation patterns. The AADAC sequence of Mus musculus had no N-glycosylation sites.
The prediction of N-glycosylation sites for reptile AADAC revealed four different patterns. Six reptile sequences (33.3%) had the 282(NWS/N/R/H/K/E) site, 4 sequences (22.2%) had the 78(NVTV)-282(NWSN) sites, 5 sequences (27.8%) showed other N-glycosylation patterns, and 3 sequences (16.7%) had no N-glycosylation sites.
In humans, AADAC is a type II transmembrane protein comprising a short, positively charged N-terminal region oriented towards the cytoplasm, a single transmembrane region, and a C-terminal tail oriented towards the lumen of the ER, which contains the catalytic triad Ser189, Asp343, and His373 [5, 52, 54]. The luminal location of the catalytic triad was confirmed by the hydrolysis of D-13223, an activity that is not observed in human liver cytosol [55].
The transmembrane regions of human AADAC and its orthologues were predicted with TMHMM (Table 1). TMHMM reportedly distinguishes soluble from transmembrane proteins with a reliability of 70–80% [37]. The server predicted a transmembrane helix located at residues 5–24 (Table 1) and an outside region spanning residues 25–399, which is consistent with the literature [54]. The active site of human AADAC faces the luminal side of the endoplasmic reticulum (ER), which implies different catalytic capacities compared to other carboxylesterases [5]. For example, substrates for cytosolic lipases, such as glycerophospholipids or triglycerides stored in cytoplasmic lipid droplets, cannot cross the ER membrane and therefore are not hydrolyzed by AADAC [5].
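The predicted topology can be cross-checked against the catalytic triad with a short calculation. The sketch below takes the helix boundaries (residues 5–24), the protein length (399), and the triad positions (Ser189, Asp343, His373) from the text, and assumes the type II orientation described above (N-terminus cytosolic, C-terminus luminal). The function names are hypothetical.

```python
def topology_regions(length, tm_start, tm_end):
    """Regions of a single-pass type II membrane protein (1-based, inclusive)."""
    return {
        "cytosolic": (1, tm_start - 1),
        "transmembrane": (tm_start, tm_end),
        "luminal": (tm_end + 1, length),
    }


def locate(position, regions):
    """Name of the region that contains a given residue position."""
    for name, (start, end) in regions.items():
        if start <= position <= end:
            return name
    raise ValueError(f"position {position} is outside the protein")


# Values taken from the text for human AADAC.
regions = topology_regions(length=399, tm_start=5, tm_end=24)
catalytic_triad = {"Ser189": 189, "Asp343": 343, "His373": 373}
triad_location = {
    name: locate(pos, regions) for name, pos in catalytic_triad.items()
}
```

Under these assumptions, all three triad residues fall in the luminal region (residues 25–399), in line with the orientation of the active site described above.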
All studied orthologues shared the same topology: the amino terminus was located in the cytosol, whereas the C-terminus was oriented towards the lumen of the ER.