Characterization, comparative, and functional analysis of arylacetamide deacetylase from Gnathostomata organisms

Background Arylacetamide deacetylase (AADAC) is a lipolytic enzyme involved in xenobiotic metabolism. Its characterization in terms of activity and substrate preference has been limited to a few mammalian species, and the potential role and catalytic activities of AADAC from other organisms are still poorly understood. Therefore, in this work, the physicochemical properties, proteomic analysis, and protein-protein interactions of AADAC from Gnathostomata organisms were investigated. Results The analyses were performed with 142 orthologue sequences with ~48–100% identity with human AADAC. The catalytic motif HGG[A/G] tetrapeptide block was conserved in all AADAC orthologues. Four variations were found in the consensus pentapeptide GXSXG sequence (GDSAG, GESAG, GDSSG, and GSSSG), and a novel motif, YXLXP, was found. The prediction of N-glycosylation sites projected 4, 1, 6, and 4 different patterns for amphibians, birds, mammals, and reptiles, respectively. The transmembrane regions of AADAC orthologues were not conserved among groups, and variations in the number of transmembrane regions and in the orientation of the active site and C-terminal carboxyl were observed among the sequences studied. The protein-protein interactions of AADAC orthologues were related to cancer, lipid, and xenobiotic metabolism genes. Conclusion The findings from this computational analysis offer new insight into one of the main enzymes involved in xenobiotic metabolism from mammals, reptiles, amphibians, and birds and its potential use in medical and veterinary biotechnological approaches. Supplementary Information The online version contains supplementary material available at 10.1186/s43141-022-00443-z.

Background
Arylacetamide deacetylase (AADAC) is a lipolytic enzyme involved mainly in the hydrolysis of esters and amides [1-3]. Its active site domain contains a classic lipase/esterase GXSXG motif (catalytic triad Ser189, His373, and Asp343) and shares high sequence homology with the active site of hormone-sensitive lipases (HSL). Although AADAC was initially considered an esterase, current findings suggest that it could be classified as a lipase because of its strong homology with HSL, its acceptance of water-insoluble substrates, and its inhibition by classic lipase inhibitors such as E600 (diethyl-p-nitrophenyl phosphate) [4,5]. AADAC strongly contributes to drug hydrolysis, resulting in xenobiotic detoxification or prodrug activation [1]. Its catalytic activity has been tested in the hydrolysis and deacetylation of a wide variety of prodrugs, such as flutamide, phenacetin, indiplon, ketoconazole, rifabutin, rifampicin, and rifapentine [2,6-9]. In contrast to other transmembrane enzymes, such as carboxylesterase 1 (CES1), carboxylesterase 2 (CES2), and paraoxonase 1 (PON1), AADAC selectively prefers bulky and small acyl moieties, such as fluorescein diacetate, N-monoacetyldapsone, and propanil [1]. Similarly, AADAC has been linked to the hydrolysis and activation of abiraterone acetate, a prodrug for metastatic castration-resistant prostate cancer [10].
Although human AADAC is primarily expressed on the luminal side of the endoplasmic reticulum (ER) of the liver and gastrointestinal tract (jejunum, small intestine, and colon) [11,12], its expression has also been found in the brain, particularly in the Purkinje cell layer of the cerebellum, hippocampus, corpus callosum, and caudate nucleus [13], as well as in the adrenal cortex/medulla and pancreas [14]. Because of this wide distribution, the physiological effects linked to a higher or lower expression or to the inhibition of AADAC could be beneficial; e.g., the upregulation of AADAC expression contributed to a better prognosis for several types of cancer [15-19]. On the other hand, variants or deletions of the AADAC gene have been linked to increased susceptibility to Tourette syndrome [13].
In drug development, the extrapolation of data from orthologues is used to estimate the pharmacological effects and toxicity of drug candidates [20]. Orthologues are genes that evolved from a common ancestor by speciation; therefore, orthologue proteins are likely to have similar biological roles [21]. This similarity allows the identification of enzymes from organisms other than model organisms, such as birds, reptiles, or amphibians. Some AADAC orthologues, such as mouse, dog, rat, and cynomolgus macaque AADAC, have been studied for their substrate recognition and specificity [11,12,22-24]. For example, human AADAC can hydrolyze flutamide, phenacetin, indiplon, and rifampicin, whereas rat and mouse AADAC hydrolyze only flutamide and phenacetin, and not rifampicin [11]. In contrast, dog AADAC hydrolyzes only phenacetin and not indiplon [12]. Differences in enzyme location are also observed among AADAC orthologues. For instance, in humans and cynomolgus macaques, AADAC mRNA is found in the liver and small intestine, whereas in rats and mice, AADAC mRNA is expressed in the kidney [22].
Mice and rats are the preferred organisms for toxicological studies, and differences in substrate preference by human, rat, and mouse liver microsomes expressing AADAC have also been observed [2]. Human and rat liver microsomes could hydrolyze indiplon, but mouse liver microsomes could not, whereas human and mouse liver microsomes hydrolyzed phenacetin and ketoconazole. Interestingly, neither mouse nor rat liver microsomes had any activity on rifamycins, whereas human liver microsomes showed activity values from 20 to 60 pmol/min/mg.
Given the importance of human AADAC in lipid, endobiotic, and xenobiotic metabolism, the present work focused on the functional analysis of AADAC orthologues, human AADAC-protein interactions, and physicochemical properties of a series of AADAC orthologue sequences from mammals, reptiles, amphibians, and birds.

Collection of AADAC protein sequences
AADAC amino acid sequences were retrieved from the GenBank and Ensembl databases through the Basic Local Alignment Search Tool (BLAST). Searches were performed using the annotated human arylacetamide deacetylase amino acid sequence (GenBank accession no. NP_001077.2, and Ensembl accession no. ENSG00000114771), and InterProScan was used to confirm the protein domains [25]. The orthologue sequences were selected according to the following parameters: (i) sequences with lengths between 393 and 415 amino acids; (ii) sequences without gaps, unspecified amino acids, or truncated regions; (iii) sequences containing the α/β hydrolase-3 domain; and (iv) sequences belonging to mammals, amphibians, birds, and reptiles.
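Selection criteria (i) and (ii) can be expressed as a simple sequence filter. The following is a minimal sketch, not the authors' pipeline: the function names `parse_fasta` and `passes_filters` are hypothetical, and "unspecified amino acids" is assumed here to mean any character outside the 20 standard residues. Criteria (iii) and (iv) depend on domain and taxonomy annotations and are not covered.

```python
import re

# Length window taken from selection criterion (i) in the text.
MIN_LEN, MAX_LEN = 393, 415
# Assumption: only the 20 standard residues count as "specified" amino acids.
STANDARD_RESIDUES = re.compile(r"^[ACDEFGHIKLMNPQRSTVWY]+$")


def parse_fasta(text):
    """Return {header: sequence} from FASTA-formatted text (hypothetical helper)."""
    records, header, chunks = {}, None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        records[header] = "".join(chunks)
    return records


def passes_filters(seq):
    """Criteria (i) and (ii): length window, no gaps or ambiguous residues."""
    return MIN_LEN <= len(seq) <= MAX_LEN and bool(STANDARD_RESIDUES.match(seq))
```

A sequence containing `-`, `X`, or `B` is rejected by the residue check, and anything shorter than 393 or longer than 415 residues is rejected by the length check.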

Phylogenetic analysis
The phylogenetic tree was constructed from an amino acid alignment of human AADAC orthologues generated with MUSCLE in MEGA X software (RRID:SCR_000667) [26] with the following settings: (i) hydrophobicity multiplier (1.20), (ii) max iterations (16), and (iii) cluster method (UPGMA). The distances were calculated using the Jones-Taylor-Thornton (JTT) matrix, and the tree was created with the maximum likelihood method with 1000 bootstrap replicates. Finally, the Interactive Tree of Life (iTOL v6; https://itol.embl.de/) was employed to display the tree [27]. The Fitch (or similarity) matrix of the 142 orthologues was developed in RStudio software (v2022.07.2) using the R package SeqinR 1.0-2: Biological Sequences Retrieval and Analysis [28]. A protein alignment was used to calculate the pairwise distance matrix of aligned sequences using the function dist.alignment(x, matrix = "similarity"). The Numerical Taxonomy Multivariate Analysis System (NTSYSpc v2.02) software calculated the correlation matrix of amino acid sequences coding for the AADAC gene using the Dis/similarity module for qualitative data. Principal component analysis (PCA) was performed by plotting the 2-dimensional (2D) Dis/similarity matrix in NTSYSpc.
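To illustrate what a pairwise distance matrix over aligned sequences contains, the sketch below computes an identity-based distance, the square root of one minus the fraction of identical aligned positions. This is a simplified stand-in and not the Fitch similarity calculation used in the study; the function names are hypothetical, and skipping columns that contain a gap in either sequence is an assumption of this sketch.

```python
import math


def identity_distance(a, b):
    """sqrt(1 - fraction of identical columns) for two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Assumption: columns with a gap in either sequence are ignored.
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    identical = sum(x == y for x, y in pairs)
    return math.sqrt(1.0 - identical / len(pairs))


def distance_matrix(aligned):
    """Return {(name_i, name_j): distance} for a dict of aligned sequences."""
    names = list(aligned)
    return {
        (p, q): identity_distance(aligned[p], aligned[q])
        for p in names
        for q in names
    }
```

Two sequences that are 80% identical over their ungapped columns are therefore separated by a distance of about 0.447, and identical sequences by 0.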

AADAC-protein interaction prediction
Potential protein-protein association networks for human AADAC were predicted with the STRING online database (RRID:SCR_005223) [29], whereas the protein-protein association for AADAC orthologues was predicted with the BioGRID database (https://thebiogrid.org/) (RRID:SCR_007393) [30]. For the STRING database (https://string-db.org/), the sequences of AADAC were selected according to the above criteria. For the interaction analysis in BioGRID, the target-interacting protein [...]
Protein modeling of AADAC was performed with AlphaFold via the Google Colab platform [34], and visualized with UCSF Chimera software [35]. The physicochemical parameters of human AADAC orthologues were predicted with ProtParam using the ExPASy server (RRID:SCR_018087) [36]. The sequences in FASTA format were used to analyze the molecular weight (MW), theoretical pI, the total number of negatively and positively charged residues, amino acid composition, extinction coefficients, instability index, aliphatic index, and grand average of hydropathicity (GRAVY). In addition, the transmembrane regions were inferred with TMHMM 2.0 (RRID:SCR_014935) [37], and glycosylations were predicted with NetOGlyc 4.0 (RRID:SCR_009026) [38]. One-way non-parametric ANOVA was used to evaluate the significant differences between orthologue characteristics using GraphPad Prism version 8.0.2 (RRID:SCR_002798).
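The one-way non-parametric ANOVA mentioned above is commonly implemented as the Kruskal-Wallis test; that this is the test applied in GraphPad Prism is an assumption here, since the text does not name it. The sketch below computes only the H statistic with the standard tie correction; the function name is hypothetical, and the p-value (from a chi-squared distribution with k - 1 degrees of freedom) is left out.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (tie-corrected) for a list of samples."""
    # Pool all observations, remembering which group each one came from.
    pooled = sorted((value, gi) for gi, g in enumerate(groups) for value in g)
    n = len(pooled)
    if n < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need non-empty groups and at least two observations")

    rank_sums = [0.0] * len(groups)
    tie_term = 0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        mean_rank = (i + 1 + j) / 2.0  # average of the 1-based ranks i+1..j
        for k in range(i, j):
            rank_sums[pooled[k][1]] += mean_rank
        t = j - i
        tie_term += t**3 - t
        i = j

    h = 12.0 / (n * (n + 1)) * sum(
        r * r / len(g) for r, g in zip(rank_sums, groups)
    ) - 3 * (n + 1)
    correction = 1.0 - tie_term / (n**3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction
```

Each element of `groups` would hold one physicochemical parameter (for example, the pI values) for one taxon, giving four groups for amphibians, birds, mammals, and reptiles.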

Results
A total of 142 gene sequences from mammals (67), amphibians (4), birds (53), and reptiles (18) were collected from GenBank and confirmed as orthologues with the Ensembl database (Table 1). AADAC protein sequences showed ~48-100% identity across orthologues when human AADAC (NP_001077.2) was used as reference. In all cases, the α/β hydrolase domain was located between residues 107 and 376. Similarly, the amino acid residues of the catalytic triad of lipases/esterases (Ser, Asp/Glu, and His) were conserved among all the orthologues studied (Supplementary Figure S1).
The phylogenetic relationships of AADAC orthologues are shown in Fig. 1. The avian taxon had a common ancestor from which two distinct subgroups descended: (i) clades XII and XIII, and (ii) clades VIII to XI. Under these terms, it is inferred that birds have the lowest percentage of identity with human AADAC. The analysis shows a marked difference between AADAC from the selected organisms. In mammals, four distinct clades were observed: in clade I, human AADAC shared a common ancestor with AADAC from Pan paniscus and Pan troglodytes; in clade II, mouse and rat AADAC were grouped; clade III contained dog AADAC, whereas camels and hedgehogs were grouped in clade IV. Amphibians, represented by clade V, and clade IV from mammals evolved from a common ancestor. Reptiles, divided into clades VI and VII, share a common ancestor. PCA shows the similarity between orthologues of different taxa (Fig. 2A, B). Organisms of the same taxon share greater similarity (Fig. 2B), comparable to that observed in the phylogenetic tree (Fig. 1).
Human AADAC is closely related to three hydrolytic enzymes belonging to the carboxylesterase family: CES1, CES2, and CES3, which are ER-linked enzymes that contain the α/β hydrolase domain and are primarily known to be involved in phase I of drug or xenobiotic metabolism [39]. A single interactome to relate the genetic and physical interactions of CES1, CES2, PON1, and human AADAC proteins/genes was constructed (Fig. 3). The amyloid-beta precursor protein (APP) shows a chemical-protein interaction with CES2, AADAC, CES1, and PON1 (Fig. 3).
In humans, APP is a cell surface protein with signal transduction properties that control the viability, proliferation, migration, and aggressiveness of various cancer cells, such as breast cancer, ovarian cell adenocarcinoma, medulloblastoma, and neuroblastoma [32]. Additionally, APP interacted with Amyloid-Beta Precursor Protein Binding Protein 2 (APPBP2), which is involved in non-small cell lung cancer (NSCLC), a type of cancer that accounts for ~80% of all lung carcinoma cases [33]. On the other hand, the interactions of 70 AADAC orthologues were analyzed using the STRING database (Supplementary Table S2). From the sequences studied, 45 AADAC orthologues were related to dihydrodiol dehydrogenases (DHDH), 57 sequences showed an interaction with the succinate receptor 1 (SUCNR1), and 44 interacted with ATP8B3. DHDH is involved in the detoxification of trans-dihydrodiol and carcinogenic metabolites of polycyclic aromatic hydrocarbons [40]; SUCNR1 is related to conditions such as hypertension, diabetes, and obesity [41], whereas ATP8B3 is involved in aminophospholipid transport in the liver [42].
The structural motif of the consensus pentapeptide GXSXG (Table 1 and Fig. 4A) was illustrated using [...] Figure S2). Motif 7 (Fig. 5) corresponds to the transmembrane region, while motifs 1, 3, had glycosylation sites [...] (Fig. 5A, B). The AADAC folding among organisms with 10 motifs was similar to that of human AADAC (Fig. 5C), and sequences with any missing motif showed structural differences, mainly in the formation of alpha helices and beta strands (Fig. 6D, F).
The physicochemical characterization shows that the molecular mass among orthologues remains between 44 [...] showed the most stable AADAC (Fig. 7). The prediction of N-glycosylation sites with the NetNGlyc-4.0 server projected 4, 1, 6, and 4 different patterns for amphibians, birds, mammals, and reptiles, respectively. The number of transmembrane regions of AADAC orthologues varied between 0 and 3. Of the 142 orthologue sequences, 48 did not show transmembrane regions (33.8%), 73 showed a single transmembrane region with the C-terminal oriented towards the lumen (51.4%), 5 had 2 transmembrane regions with the active site towards the cytosol (3.5%), and 16 showed 3 transmembrane regions with the C-terminal carboxyl oriented towards the ER lumen (11.3%) (Table 1).

Discussion
The phylogenetic history of human AADAC orthologues suggests that the common ancestral sequences evolved in the avian taxon, with speciation events occurring in a descending fashion in the reptile, amphibian, and mammalian taxa, respectively.
The HGGG motif forms a loop close to the active site and directly participates in the catalytic hydrolysis by stabilizing the oxyanion intermediate of the reaction. The presence of multiple G/Gly residues confers greater flexibility to this loop. In contrast, some studies indicate that replacing the last G/Gly residue with an A/Ala decreases the enzyme activity by up to 40% [44]. Pteropus alecto AADAC was the only orthologue to show an A/Ala residue at the last position (HGGA). The YXLXP motif has been previously reported in the telomeric protein SNM1B/Apollo (a member of the metallo-β-lactamase/β-CASP family), which is involved in binding to TRF2 (telomeric repeat-binding factor 2) [45]. Similarly, the SLX4 protein, which is involved in DNA repair, contains an H/YXLXP motif and binds human TRF2. A similar site in the TRFH domain of TRF1 (around residue F142), the FXLXP motif, serves as an anchor to the TIN2 protein (TRF1-interacting protein 2) [46,47]. Another essential structural region is the transmembrane region (Fig. 3C). The amino acids belonging to this region are responsible for anchoring the protein to the lipid bilayer. The most conserved amino acids among the studied orthologues were Y/Tyr, P/Pro, G/Gly, and K/Lys. In contrast, amino acids at positions 3, 5, 11, 12, 15, 21, and 23 showed increased variability among AADAC orthologues.
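The sequence motifs discussed above (HGG[A/G], GXSXG, and YXLXP) can be located in a protein sequence with regular expressions. The following is a minimal sketch under the assumption that X stands for any single residue; the function name `find_motifs` is hypothetical and this is not the motif-discovery tool used in the study.

```python
import re

# X in the motif notation is taken to mean "any one residue" (regex ".").
MOTIFS = {
    "HGG[A/G]": "HGG[AG]",
    "GXSXG": "G.S.G",
    "YXLXP": "Y.L.P",
}


def find_motifs(seq):
    """Return {motif name: [(1-based start, matched text), ...]}."""
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead keeps overlapping occurrences that finditer would skip.
        lookahead = re.compile("(?=(" + pattern + "))")
        hits[name] = [(m.start() + 1, m.group(1)) for m in lookahead.finditer(seq)]
    return hits
```

Applied to an orthologue sequence, the GXSXG hits show which of the four reported variants (GDSAG, GESAG, GDSSG, or GSSSG) is present, and an HGGA hit flags the substitution reported for Pteropus alecto.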
The predicted physicochemical characteristics of AADAC orthologues are summarized in Supplementary Table S1. The calculated molecular weight (MW) of AADAC orthologues is shown for amphibians, birds, mammals, and reptiles, and the minimum and maximum values are highlighted using a box and whisker plot (Fig. 7A). Human AADAC has been previously [...] The amount of Asp + Glu and Arg + Lys is related to the isoelectric point of the protein (Fig. 5B, C). Human AADAC orthologues showed isoelectric point (pI) values between 5.5 and 9.3 (Fig. 5D and Supplementary Table S1). The theoretical pI of human AADAC was 8.75, which differed from the pI calculated by in vivo experiments (9.36) [4]. Mammals showed pI values ranging from 6.0 to 9.3, followed by birds (5.6-9.2), reptiles (5.5-8.5), and amphibians (6.0-8.3).
The GRAVY value for a protein or a peptide is the sum of the hydropathy values of all amino acids divided by the protein length [48]. Positive GRAVY values indicate hydrophobicity, whereas negative values indicate hydrophilicity. The GRAVY value for human AADAC was found to be negative, implying a hydrophilic character, and similar GRAVY results were observed for mammalian, amphibian, and reptilian AADAC (Fig. 7E). In contrast, birds were the only taxon showing positive GRAVY values, thus indicating hydrophobicity.
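The GRAVY definition above translates directly into a short calculation. The sketch below uses the Kyte-Doolittle hydropathy scale, which is assumed here to be the scale behind the ProtParam values; the function name `gravy` is illustrative.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Sum of residue hydropathy values divided by sequence length."""
    if not seq:
        raise ValueError("empty sequence")
    # A KeyError here signals a non-standard residue in the input.
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)
```

A sequence dominated by Ile, Val, and Leu yields a positive value (hydrophobic), whereas one rich in Arg, Lys, Asp, and Glu yields a negative value (hydrophilic).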
The instability index of a protein is calculated based on the presence of certain dipeptides. A protein with an instability index below 40 is considered stable under atmospheric conditions and is expected to have a longer in vivo half-life (~16 h), whereas proteins with an instability index > 40 show an in vivo half-life of less than 5 h [49]. The instability index value of human AADAC was 33.96 (Fig. 7E), which classifies it as a stable protein for in vivo experiments. In fact, human AADAC has already been expressed in Sf9, DH10Bac, and E. coli cells, and further purified and characterized for drug hydrolysis assays [6,50,51]. The average instability index values of AADAC orthologues ranged from 40 to 45 (Fig. 7F). AADAC from different model organisms used for toxicological studies in the mammalian taxon, such as mice, rats, and dogs, showed an instability index > 40, thus classifying it as unstable in vivo with a half-life of less than 5 h. Despite these values, mouse, rat, and dog AADAC microsomes have been successfully used in vitro to hydrolyze various prodrugs, confirming their role in the metabolism of the already known substrates of human AADAC [11,12,52].
Human AADAC showed a high aliphatic index (99.15), which indicates a thermostable protein. In general, all the AADAC orthologues studied showed aliphatic index values above 86.22, indicating a reliable degree of thermostability (Fig. 7G).
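For reference, the aliphatic index is conventionally computed from the mole percentages of Ala, Val, Ile, and Leu as AI = X(Ala) + 2.9 X(Val) + 3.9 [X(Ile) + X(Leu)]. The sketch below implements that formula on the assumption that it matches the ProtParam calculation; the function name is illustrative.

```python
def aliphatic_index(seq):
    """Aliphatic index from the mole percent of Ala, Val, Ile, and Leu."""
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")

    def mole_pct(aa):
        # Percentage of residues in the sequence that are `aa`.
        return 100.0 * seq.count(aa) / n

    return mole_pct("A") + 2.9 * mole_pct("V") + 3.9 * (mole_pct("I") + mole_pct("L"))
```

Because Ile and Leu carry the largest coefficients, sequences enriched in these two residues reach the highest index values.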
The potential N-glycosylation sites of amphibian, bird, mammal, and reptile AADAC were predicted with the NetNGlyc-4.0 server, and the results are detailed in Table 1. In N-glycosylation, an N-acetylglucosamine residue is linked via an amide bond to an asparagine (N/Asn) from the consensus sequence NX(S/T), where X can be any amino acid except proline (P/Pro). The presence of the consensus sequence is required for N-linked glycosylation; however, the potential sites might not be glycosylated [53]. For amphibians, the neural network prediction gave two main glycosylation sites. All four amphibian sequences showed glycosylation sites at 78(NVTV/I). Mammalian AADAC showed six different glycosylation patterns. The most common pattern was 78(NV/ITV)-282(NWSS/A) (36 sequences in total, 53.7%), including the AADAC sequences of Homo sapiens, Pan paniscus, Pan troglodytes, and Canis lupus familiaris. This agrees with the results reported in the literature [52], which showed that N-glycosylation at the N282 residue of human AADAC is crucial for enzymatic activity through proper folding. The next most common patterns were 77(NVTV) (7 sequences, 10.4%, including Rattus rattus and Rattus norvegicus), 282(NWSS) (5 sequences, 7.5%), and 77(NVTV)-281(NWSS) (4 sequences, 6%). Three glycosylation sites were found for four mammal sequences [...]
In humans, AADAC is a type II transmembrane protein comprising a short, positively charged N-terminal region oriented towards the cytoplasm, a single transmembrane region, and a C-terminal tail oriented towards the lumen of the ER, containing the catalytic triad Ser189, Asp343, and His373 [5,52,54]. The luminal location of the catalytic triad was confirmed by the hydrolysis of D-13223, which is not observed in human liver cytosol [55]. The transmembrane regions of human AADAC and its orthologues were predicted with TMHMM (Table 1).
TMHMM discriminates between soluble and transmembrane proteins with a reliability of 70-80% [37]. For human AADAC, the server predicted a transmembrane helix located at residues 5-24 (Table 1) and an outside region spanning amino acids 25-399, which is consistent with the literature [54]. The active site of human AADAC faces the luminal side of the endoplasmic reticulum (ER), which implies different catalytic capacities compared to other carboxylesterases [5]. For example, substrates for cytosolic lipases, such as glycerophospholipids or triglycerides stored in cytoplasmic lipid droplets, cannot cross the ER membrane and therefore are not hydrolyzed by AADAC [5].
All studied orthologues shared the same amino-terminal location at the cytosol, whereas the C-terminal was oriented towards the lumen of the ER.
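The N-glycosylation consensus NX(S/T), with X being any residue except proline, can be scanned for directly. The sketch below only lists candidate sequons by position; it is not a substitute for the neural network scoring of NetNGlyc, and the function name `find_sequons` is hypothetical.

```python
import re

# N, then any residue except P, then S or T. The lookahead allows
# overlapping sequons (e.g. "NNTS") to be reported individually.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(seq):
    """Return [(1-based position of Asn, matched tripeptide), ...]."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(seq)]
```

As the text notes, the presence of a sequon is necessary but not sufficient for glycosylation, so the returned positions are candidates only.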

Conclusions
Although AADAC is a ubiquitous protein spread among organisms from different taxa, drug hydrolysis assays and toxicological studies are limited to a few species. The predicted physicochemical characteristics of the 142 orthologues studied show similarity among the groups; however, the amino acid composition, such as the Arg + Lys and Asp + Glu ratios, differed among the studied organisms, which generates differences in the isoelectric point. The orthologues were related to DHDH, ATP8B3, and P2RY1 genes, expressed in the lipid membrane, and directly related to skin and lung cancer genes (APP and APPBP2), suggesting that they may share mechanisms among these diseases. In addition, the comparative analysis of AADAC orthologues revealed that the sequence YXLXP, which has been implicated in DNA repair mechanisms, was shared by all the organisms studied. Other structural motifs that are fingerprints of lipolytic enzymes (GXSXG and HGGG) were found in all orthologues; however, according to the predicted 3D structures, missing motifs can generate conformational changes that can affect enzyme-substrate interaction.
The number of transmembrane regions varied among organisms and did not follow a pattern among mammals, amphibians, birds, or reptiles. Some substrate selectivities among the studied orthologues could be inferred from the phylogenetic tree constructed; however, further biochemical studies are required to corroborate the results presented herein. This information, together with the analysis of the interactome, offers an overview of how AADAC from the studied organisms could interact with various drugs tested mainly in primates or rodents. The data collected in this work broaden the knowledge of AADAC orthologues and could help to better select new model organisms as future candidates for detoxification, xenobiotic processing, and drug disposition studies.