Genome-wide identification of StU-box gene family and assessment of their expression in developmental stages of Solanum tuberosum

Background The plant U-box (PUB) ubiquitin ligase gene family has a highly conserved domain. However, little information is available about U-box genes in potato (Solanum tuberosum). In this study, 62 U-box genes were detected in the potato genome using bioinformatics methods. Further, motif, gene structure, gene expression, TFBS, and synteny analyses were performed on the U-box genes.
Results Based on in silico analysis, most StU-box proteins included a U-box domain; however, some of them lacked the ARM, Pkinase_Tyr, and other domains. Based on their phylogenetic relationships, the StU-box family members were categorized into four classes. Analysis of transcription factor binding sites (TFBS) in the promoter regions of StU-box genes revealed that MYB and CSD sites were the most and the least frequent TFBS, respectively. Moreover, based on in silico and gene expression data, the variable frequencies of TFBS in StU-box genes could indicate that these genes control different developmental stages and are involved in complex regulatory mechanisms. The number of exons in U-box genes ranged from one to sixteen. Within each group, the exon–intron compositions of most U-box genes and the conserved motif compositions of most proteins were similar. The intron–exon patterns and the composition of conserved motifs supported the phylogenetic classification of the U-box genes. The StU-box genes were distributed unevenly on the 12 S. tuberosum chromosomes. The results showed that gene duplication may have played a significant role in the genome expansion of S. tuberosum. Furthermore, the genome evolution of S. tuberosum was surveyed by identifying orthologous and paralogous genes. We identified 40 orthologous gene pairs between S. tuberosum and Solanum lycopersicum, Oryza sativa, Triticum aestivum, Gossypium hirsutum, Zea mays, Coriaria mytifolia, and Arabidopsis thaliana, as well as eight duplicated (paralogous) genes in S. tuberosum. StU-box 51 is one of the most important StU-box genes in S. tuberosum under drought stress, being expressed in both tuber and leaf under these conditions. Furthermore, StU-box 51 had the highest expression levels in the four tissues examined (stem, root, leaf, and tuber) and the highest number of TFBS in its promoter region. Based on our results, StU-box 51 can be recommended to researchers for use in breeding programs and genetic engineering in potato.
Conclusions The results of this survey will be useful for further investigation of the probable role and molecular mechanisms of U-box genes in response to different stresses.
Supplementary Information The online version contains supplementary material available at 10.1186/s43141-022-00306-7.


Background
Ubiquitination is a highly conserved process in eukaryotes that is extensively implicated in various cellular processes, such as cell cycle control, transcription, and the circadian clock [6]. This intracellular proteolysis is mediated mostly by the ubiquitin-26S-proteasome system, a protein modification pathway that acts on cytosolic, membrane-localized, and nuclear proteins. Aberrant, truncated, active, and short-lived proteins from different cellular pathways are degraded, which regulates the protein load of the cell [30]. Ubiquitination is mediated by a three-step enzymatic process involving a ubiquitin-activating enzyme (E1), a ubiquitin-conjugating enzyme (E2), and a ubiquitin ligase (E3), which recognizes the substrate [1,11]. Degradation of proteins through the ubiquitin pathway involves two distinct and continuous steps. In this system, the ubiquitin complex mediates the alteration of proteins through a series of reactions that activate, transfer, and bind ubiquitin to cellular proteins, catalyzed by the E1, E2, and E3 enzymes, respectively. First, an E1-ubiquitin thioester bond is formed between the C-terminal Gly carboxyl group of ubiquitin and the active-site Cys of the E1 enzyme in an ATP-dependent reaction. Second, E1 transfers the activated ubiquitin to the Cys residue of the E2 enzyme to form an E2-ubiquitin thioester-linked intermediate by transesterification. Finally, E2 facilitates the attachment of the ubiquitin molecule to the target protein in the presence of E3. The E3 ligase plays a key role in protein ubiquitination because it recognizes the target proteins for modification [39]. The E3 activity can be carried by a single protein or by a protein complex.
Because E3 ligases recognize the cellular proteins undergoing Ub conjugation, the E3 enzyme is the main specificity factor in the ubiquitin (Ub)-proteasome pathway. Accordingly, E3 ligases belong to different gene families in plants. There are more than 1000 E3s in cells that join Ub to proteins in a highly regulated manner [13]. The E3 ligases are the most abundant of the three enzyme types and are grouped into various families based on their structure, function, and substrate specificity. The main classes of E3 ubiquitin ligases are RING (Really Interesting New Gene), HECT (Homologous to E6-associated protein C terminus), CRL (Cullin-RING ligase), and U-box [31]. U-box E3 ligases are a family of proteins containing the U-box domain, a motif of about 70 amino acids [3]. Most of the U-box proteins possess E3 ligase functions.
In Brassica, ARC1, a novel U-box protein, is required for the rejection of self-incompatible pollen in the pistil and is associated with ubiquitination and destruction of the S-receptor kinase [26]. Thus, the U-box gene family is considered a significant group of E3 ubiquitin ligases that affects many plant signaling pathways and functions differently from other E3 enzyme classes. However, the evolution of the U-box genes in potato is largely unknown. Potato (S. tuberosum) is one of the most important non-cereal food crops and is vital for the food security of the worldwide population. Although U-box genes may play significant roles in the development of potato, they have rarely been surveyed in this species to date. In this survey, the conserved domains, evolutionary relationships, intron and exon patterns, chromosomal locations, and expression profiles were analyzed, providing a theoretical basis for the analysis of U-box gene functions.

Detection of U-box genes in S. tuberosum
Two approaches were used to detect potential StU-box genes in potato, as described earlier [29]. In the first approach, a protein homology search was performed with available U-box proteins from Arabidopsis, rice, and tomato. The second approach retrieved U-box protein sequences by hidden Markov model (HMM) analysis, using the typical U-box domain profile (Pfam number PF04564) from the Pfam HMM library. The A. thaliana and rice protein sequences were taken from the TAIR and RAP-DB databases, respectively. The known tomato U-box protein sequences were taken from NCBI and used as query sequences for the tBLASTn program to search for similar sequences in potato. All putative sequences were verified with the SMART database and InterProScan. The remaining 62 non-redundant candidates were recognized as StU-box proteins. The putative StU-box proteins were validated by the presence of the U-box, Armadillo (ARM), and protein tyrosine kinase (Pkinase_Tyr) (PF01545) domains using the hmmscan tool.

Multiple sequence alignment and phylogenetic analysis
U-box protein sequences from S. tuberosum, S. lycopersicum, G. hirsutum, O. sativa, Z. mays, C. mytifolia, T. aestivum, and A. thaliana were aligned with CLUSTAL W as implemented in MEGA 6.0 software. The phylogenetic tree was constructed from the multiple sequence alignment using the maximum likelihood method, and its reliability was assessed with 1000 bootstrap replications [27].

Structural characteristics of U-box proteins
Peptide length, molecular weight, and isoelectric point (pI) were determined using the ProtParam tool. The MEME program and the Pfam tool were utilized to identify the conserved motifs and the StU-box protein domains, respectively [4,8]. Motif function was examined using the tool. The GSDS program was utilized to analyze the exon-intron structures of StU-box genes.

Chromosomal location and TFBS analysis
Chromosomal maps of S. tuberosum U-box genes were constructed with the chromosome map tools available in MapChart. The 1500 bp upstream promoter region of each StU-box gene was analyzed using PlantPAN to detect transcription factor binding sites (TFBS).
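The extraction of an upstream promoter window can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `upstream_region`, the 1-based inclusive coordinate convention, and the toy sequence are hypothetical and are not taken from the pipeline used in this study.

```python
# Hypothetical helper for illustration; not part of the study's workflow.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def upstream_region(chrom_seq, gene_start, gene_end, strand, length=1500):
    """Return up to `length` bp upstream of a gene, in the gene's orientation.

    Assumes 1-based, inclusive coordinates with gene_start <= gene_end.
    The window is truncated if it would run past a chromosome end.
    """
    if strand == "+":
        begin = max(0, gene_start - 1 - length)
        return chrom_seq[begin:gene_start - 1]
    if strand == "-":
        # On the minus strand the promoter lies after gene_end on the
        # forward sequence, so take that window and reverse-complement it.
        window = chrom_seq[gene_end:gene_end + length]
        return window.translate(_COMPLEMENT)[::-1]
    raise ValueError("strand must be '+' or '-'")


# Toy chromosome, used only to show the coordinate handling.
toy = "AAAACCCCGGGGTTTT"
print(upstream_region(toy, 9, 12, "+", length=4))
print(upstream_region(toy, 9, 12, "-", length=4))
```

The window length is a parameter, so the same helper can be used with whichever upstream length a given analysis requires.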

Synteny analysis and selective pressure estimation
To evaluate syntenic relationships, the orthologous genes between S. tuberosum and S. lycopersicum, G. hirsutum, O. sativa, Z. mays, C. mytifolia, T. aestivum, and A. thaliana were detected from Ensembl Plants. Gene pairs with a similarity exceeding 70% were considered orthologous. Paralogous genes among the StU-box proteins were identified from Ensembl Plants with a similarity threshold higher than 85%. Orthologous and paralogous StU-box genes were visualized using the Circos program. To categorize genes by selection type, the Ka/Ks ratio was determined for each orthologous gene pair. A Ka/Ks ratio < 1 indicates purifying selection, while the criterion for positive (adaptive) selection is Ka/Ks > 1.
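The Ka/Ks decision rule can be written as a short Python sketch. The function name `classify_selection` and the numerical tolerance used to treat a ratio as equal to 1 (neutral selection) are assumptions made for illustration, and the input values in the example are invented rather than taken from Table 2.

```python
import math


def classify_selection(ka, ks, tol=1e-9):
    """Classify a gene pair by its Ka/Ks ratio.

    ka: non-synonymous substitution rate; ks: synonymous substitution rate.
    Returns "purifying" (Ka/Ks < 1), "neutral" (Ka/Ks = 1, within tol),
    or "positive" (Ka/Ks > 1).
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to compute Ka/Ks")
    ratio = ka / ks
    if math.isclose(ratio, 1.0, abs_tol=tol):
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"


# Invented example values, for illustration only.
for ka, ks in [(0.12, 0.48), (0.20, 0.20), (0.30, 0.20)]:
    print(ka, ks, classify_selection(ka, ks))
```

Pairs with Ks equal to zero are rejected rather than classified, because the ratio is undefined in that case.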

Gene expression analysis
Plant growth, tissue-specific and drought-induced expression profiles of StU-box genes
For tissue-specific expression analysis, 2-week-old seedlings were utilized to collect the roots, stems, and leaves, while 4-month-old seedlings were utilized to collect the tubers from the Seed and Plant Improvement Institute (SPII). For each genotype under drought stress, three potato tubers of 50 ± 10 g were planted in plastic bags (26 cm height, ~ 25 cm diameter) filled with soil. For drought-treatment expression analysis, two treatments were performed: drought stress and well watered (control). Each treatment had a randomized complete block design with three blocks (replications). For 6 weeks, all plants in both treatments were watered equally. After 6 weeks, plants in the drought stress treatment were left unwatered for 2 weeks, while plants in the control treatment were irrigated optimally. Sampling was performed 2 weeks after the onset of drought stress, that is, on 8-week-old seedlings.
The samples were collected from leaves and tubers of the G29 genotype grown under the conditions described above. Then, leaves and tubers under normal and drought conditions (TN (tuber normal), LN (leaf normal), TS (tuber stress), and LS (leaf stress)) were quickly dipped into liquid nitrogen and stored at − 80 °C until RNA extraction.

RNA extraction and quantitative real-time PCR
Total RNA was extracted from tuber and leaf under normal and stress conditions after drought stress (6-week-old seedling) using the RNA-Plus kit (Sinaclone) according to the manufacturer's instructions. For the preparation of tissue-specific RNA, root, stem, leaf, and tubers were collected separately from 2-week-old seedlings. To remove residual genomic DNA contamination from the RNA samples, DNase I (Fermentas) was utilized. The purity and concentration of the RNA were determined with a NanoDrop spectrophotometer, and its quality was assessed by 1% agarose gel analysis. Then, cDNA synthesis was performed according to the Easy cDNA Synthesis Kit instructions. Three replications were performed for the analysis of each gene. The potato EF-1α gene was utilized as the reference gene. The gene-specific primers were designed using Vector NTI. Table S3 lists the primers and PCR conditions for amplification of StU-box 51, StU-box 27, StU-box 15, and StU-box 3, as well as the reference EF-1α gene. Real-time PCR was performed on an ABI 7500 using SYBR Green Supermix as described in the manufacturer's guidelines. Gene expression was analyzed using the 2^−ΔΔCq method for individual genes, with EF-1α as the internal control.
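As a worked illustration of the 2^−ΔΔCq calculation, the following Python sketch computes a relative expression value from four quantification cycle (Cq) values. The function name and the Cq numbers are hypothetical, and the formula assumes roughly 100% amplification efficiency for both the target and the reference gene, as the standard method does.

```python
def relative_expression(cq_target_treated, cq_ref_treated,
                        cq_target_control, cq_ref_control):
    """Fold change of a target gene by the 2^-ddCq method.

    Each dCq normalizes the target gene to the reference gene within one
    condition; ddCq then compares the treated sample to the control.
    """
    delta_treated = cq_target_treated - cq_ref_treated
    delta_control = cq_target_control - cq_ref_control
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


# Invented Cq values, for illustration only:
# treated: target 22.0, reference 18.0 -> dCq = 4.0
# control: target 24.0, reference 18.0 -> dCq = 6.0
# ddCq = -2.0, so the fold change is 2^2 = 4.
print(relative_expression(22.0, 18.0, 24.0, 18.0))
```

A value above 1 indicates higher expression in the treated sample than in the control, and a value below 1 indicates lower expression.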

Gene ontology analysis of DEGs and protein-protein interaction network analysis
Classification of DEGs by gene ontology (GO) analysis was performed using Blast2GO, indicating the probable pathways in which the genes are involved in terms of biological processes, molecular functions, and cellular components. The GO database generated an overview of the functional pathways in plant growth and developmental stages in potato. STRING (http://string-db.org/) was utilized to detect co-expressed genes and to draw the protein-protein interaction networks.

Identification and characterization of U-box genes
In this survey, 62 genes were detected in the potato genome. The StU-box proteins, each containing the conserved U-box domain, ranged in length from 123 aa (StU-box 24) to 1484 aa (StU-box 32). The molecular weight of the StU-box proteins ranged from 13781.97 Da (StU-box 24) to 166346.74 Da (StU-box 32). The pI ranged from 5.01 (StU-box 45) to 9.18 (StU-box 35). Analysis of subcellular localization predicted that 96% of the StU-box proteins were distributed in the nucleus and that only 8% were distributed in the cytoplasm (Table 1). These findings indicated that the majority of StU-box proteins function in the nucleus.

Gene structure and motif analysis
To gain more insight into the structural differences among StU-box genes, the exon-intron structure of each StU-box gene was examined. The number of exons in StU-box genes ranged from 1 to 16 (Fig. 1). About 61% of the class I genes possessed no introns and had approximately similar exon lengths, indicating genetic conservation. A large number of introns was identified in class III and IV members, with significant structural modifications. About 45% of all potato U-box genes were characterized by only one exon, a sign of functional conservation among members of the U-box gene family. Overall, our findings suggested that the ligase activity of U-box genes in potato is conserved. The structural organization also illustrated a degree of diversity among the members of the U-box gene family. The variation in exon number may reflect the diverse functional capabilities acquired by the genes, and the gain of exons and introns could be a main outcome of the expansion of the U-box genes in potato. Using a two-component finite mixture model, all detected U-box genes were investigated for the presence of novel, ungapped motifs with the MEME suite (Fig. 2). To anticipate the structural diversity and function of potato U-box proteins, 10 conserved motifs were identified using the MEME program. Motifs 1 and 2 were present throughout the potato U-box members (Fig. 2; Table S1). Motif 1 is a conserved motif of the U-box domain, motifs 2 and 3 are conserved motifs of the ARM domain, and motif 4 is a protein kinase motif. The features of the 10 motifs are shown in Fig. 2; motifs 5, 6, 7, 8, 9, and 10 are unidentified.
Motif 8 was found only in class I and motif 9 was frequently present in class I, whereas motifs 5 and 6 were characteristic of class I and II members, which may serve separate biological functions. The arrangement and positional features of the identified motifs reflect not only the preservation of the functional facets of the U-box domain but also the acquisition of new domains over the course of evolution. The detection of the 10 motifs across the U-box genes provides an indication of shared biological functions. The common motif patterns among the sequences point to conserved evolutionary relationships and similar cellular functions. Thus, it can be concluded that all the genes are implicated in ubiquitin ligation.

Chromosomal localization and phylogenetic analysis
The detected members of the U-box gene family were named StU-box 1 to StU-box 62 according to their chromosomal positions from chromosome 1 to 12 (Fig. 3). We divided the U-box proteins into four groups based on the presence of the U-box domain alone (class I), the U-box domain with armadillo repeats (class II), the U-box domain with a protein kinase domain (class III), and the U-box domain with other domains such as WD40, KAP, Ufd2P, TPR, and RPW8 (class IV). To survey the evolutionary relationships of U-box gene family members between potato and Arabidopsis, the U-box amino acid sequences from the two species (62 proteins from potato and 64 proteins from Arabidopsis) were analyzed and a phylogenetic tree was constructed. According to the classification of previous studies, the 62 U-box proteins that were similar to the U-box proteins in Arabidopsis, rice, cotton, wheat, citrus, tomato, and maize were categorized into four groups (class I, II, III, and IV). Phylogenetic analysis indicated that all detected U-box proteins from potato, together with those from Arabidopsis, were clearly divided into four subgroups. Of the four groups, class I possessed the largest number of StU-box proteins, with 36 members. Four potato proteins, StU-box 26, StU-box 32, StU-box 41, and StU-box 61, were grouped in class III, and seven potato proteins were grouped in class IV. StU-box 17 and StU-box 46, which contain the U-box and Ufd2p domains, were clustered in class IV (Fig. 4). In class II, 15 StU-box genes were clustered. Interestingly, StU-box genes with similar gene structures were grouped together. For example, StU-box 21/22/29 of class I each contained four exons, StU-box 26/41 of class III each contained eight exons, and StU-box 5/28/38/48/54/60/62 of class II each contained four exons.
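The domain-based grouping into four classes can be expressed as a simple rule, sketched here in Python. This is a hypothetical illustration rather than the procedure used in the study: the domain labels, the function name `assign_class`, and the choice to place any protein carrying a domain combination other than U-box alone, U-box with ARM, or U-box with a protein kinase domain into class IV are all assumptions.

```python
# Hypothetical domain labels; real annotations would come from Pfam or SMART.
U_BOX = "U-box"
ARM = "ARM"
KINASE = "Pkinase_Tyr"


def assign_class(domains):
    """Assign a U-box protein to class I-IV from its set of domain names.

    Assumed rule: U-box only -> I; U-box + ARM -> II;
    U-box + protein kinase -> III; any other combination -> IV.
    """
    found = set(domains)
    if U_BOX not in found:
        raise ValueError("a U-box domain is required")
    extra = found - {U_BOX}
    if not extra:
        return "I"
    if extra == {ARM}:
        return "II"
    if extra == {KINASE}:
        return "III"
    return "IV"


for doms in (["U-box"], ["U-box", "ARM"], ["U-box", "Pkinase_Tyr"],
             ["U-box", "Ufd2P"]):
    print(doms, "-> class", assign_class(doms))
```

Under this rule a protein with the U-box and Ufd2P domains falls into class IV, in line with the grouping described for StU-box 17 and StU-box 46.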

Analysis of the TFBS in the promoter regions of StU-box genes
TF binding sites (TFBS), the DNA regions in a promoter to which transcription factors bind, are important for the transcription initiation of target genes [36]. To detect the TFBS in the promoter regions, the 1000 bp upstream sequences of StU-box genes were retrieved from the S. tuberosum genome database and analyzed using PlantPAN. As shown in Table S2, 34 putative TFBS were detected in the promoter regions, with the potential to regulate gene expression in response to environmental stresses, light, tissue-specific signals, phytohormones, and other factors. The regulatory region of each gene contained a number of diverse elements, and their frequency differed among the members of the gene family. The TFBS distribution in the promoter regions of the StU-box gene family is presented in Table S2. Among the common TFBS elements, MYB, WRKY, and AP2/ERF were the most frequent (8855, 3810, and 2776 sites, respectively) and were observed in all StU-box genes. In addition, three types of elements, namely bHLH, Dof, and GATA, were found among the light-responsive elements. Further, five types of TFBS elements were found in response to hormones, namely AP2 involved in ethylene responsiveness, ARF in auxin responsiveness, EIN3 in ethylene and jasmonate responsiveness, VOZ in gibberellin responsiveness, and BES1 in strigolactone and brassinosteroid responsiveness. Moreover, four types of TFBS elements were involved in the response to different environmental stresses: MYB for stress responsiveness, WRKY for drought responsiveness, HSF for cold shock and heat stress responsiveness, and C2H2 for responsiveness to abiotic and biotic stresses. Additionally, elements related to tissue expression included AT-Hook for vasculature-specific expression, SBP for flower and fruit development, LOB for expression in root, MADS box and MADF for expression in floral organs, WOX for spatial and temporal expression, and TCR for male and female reproductive tissues.
Furthermore, elements related to transcription and expression included NF-YB for embryo development, Storekeeper for plant-specific DNA-binding proteins and regulation of patatin expression, WRC for functions in DNA binding, and Sox for cell fate decisions during development. Notably, elements involved in stress control were distributed in the promoter regions of all StU-box genes, while elements involved in transcription and expression responsiveness were less abundant than the others (Table S2). The presence of these elements suggests that StU-box genes could be transcriptionally regulated by abiotic and biotic stresses (Fig. 5, Table S2). The results showed that StU-box 51 and StU-box 37 had the highest and the lowest number of TFBS in their promoter sequences, respectively.

Orthologous and paralogous genes survey in StU-box
In this survey, an evolutionary comparative analysis was done to detect orthologs of StU-box genes between S. tuberosum and the A. thaliana, S. lycopersicum, O. sativa, and T. aestivum genomes. Based on our results, two genes in S. tuberosum revealed high similarity with four Arabidopsis genes, giving four orthologous gene pairs. Further, one orthologous gene pair was found between S. tuberosum and T. aestivum, as well as between S. tuberosum and O. sativa. Eleven orthologous gene pairs were detected between S. tuberosum and S. lycopersicum. The orthologous genes between S. tuberosum and A. thaliana suggested that duplication plays a critical role in the expansion of U-box genes. In addition, eight paralogous gene pairs with identity of more than 85% were detected in the U-box gene family. These outcomes revealed that gene duplication In the pathway, gene duplication included tandem/segmental duplications. The distribution of U-box genes on the 12 chromosomes revealed that about 66.66% of U-box genes were implicated in tandem duplication with identity of more than 90% (Figs. 6 and 7).

Synteny analysis and gene duplication
We observed that about 19.35% of the detected U-box genes participated in gene duplication events in the S. tuberosum genome. Furthermore, tandem and segmental duplications were the key contributors to the expansion of potato U-box genes, and both types were detected. The segmental duplications contained four of the 12 genes, located on chromosomes 1 and 4. A total of twelve duplication events were recorded in the U-box gene family. The gene duplications were found at one or two loci.
The synteny analysis showed that StU-box 6, 8, 9, 10, 11, and 13 were duplicated at two loci, while the remaining candidates were observed at a single locus.
To examine the selection types acting on the tandem and segmental duplications of potato U-box genes, the synonymous (Ks) and non-synonymous (Ka) substitutions between the gene pairs were examined. A Ka/Ks ratio of less than 1 indicates purifying selection on the gene pair, Ka/Ks = 1 indicates neutral selection, and a Ka/Ks ratio of more than 1 indicates positive selection. A summary of the Ka/Ks ratios for the four tandem and eight segmental duplications is shown in Table 2. The nature of the duplications and the evolutionary pattern in the genome were determined using the Ka/Ks ratio [36]. Among the 62 StU-box members, we selected 12 pairs of duplicated blocks in the potato genome. Eight of the duplicated U-box gene pairs in potato revealed a Ka/Ks ratio of less than 1, indicating that these one-to-one genes underwent purifying selection. StU-box 6/9, StU-box 7/12, StU-box 11/13, and StU-box 28/34 had a Ka/Ks ratio of more than 1, indicating that positive selection shaped these one-to-one genes. Most gene pairs of Arabidopsis and citrus underwent purifying (negative) selection, whereas most genes of tomato, cotton, rice, wheat, and maize were subjected to positive selection. Our findings showed that tandem duplication occurred in four gene pairs. Taken together, the outcomes indicated that tandem and segmental duplications, as a leading component of the expansion of the U-box genes, could efficiently contribute to the preservation of the structures and functions of the genes. They can also be a cause of the acquisition of novel functional domains by the genes [33].
These outcomes indicate that segmental duplications, rather than tandem duplications, have mainly contributed to the expansion of the StU-box family in potato. Furthermore, the duplicated gene pairs have evolved mainly under purifying selection, with no functional divergence after segmental duplication. Overall, tandem duplication showed a very high Ka/Ks ratio [41]. Since tandem duplication has generally resulted in gene clusters in the genome, these outcomes also indicate that genes within each cluster have evolved faster than others. Thus, this type of duplication would be more likely to produce new functions during the extended evolutionary history of the potato. In contrast, genome-wide duplication was characterized by very low Ka/Ks ratios (Ka/Ks < 1), showing that most of the genes in this category have retained their original functions during evolution. Our results are also in disagreement with a previously reported study on U-box genes in potato [24] (Table 2).

Analysis of gene expression of StU-box genes
Tissue-specific expression patterns of StU-box genes
To further analyze the characteristics and function of the StU-box genes, the tissue-specific expression of four U-box genes (StU-box 51, StU-box 27, StU-box 15, and StU-box 3) was analyzed. The expression patterns of these genes in four different potato tissues (tuber, root, leaf, and stem) were investigated using qPCR. As shown in Fig. 8A, B, C, D, the tissue expression patterns differed among the four genes. The expression levels of StU-box 3 and StU-box 15 were approximately similar, although StU-box 15 revealed higher expression than StU-box 3 in the four tissues. StU-box 51 had the highest expression levels in tuber, leaf, root, and stem, while StU-box 3 and StU-box 15 had the lowest expression levels in these tissues. StU-box 27 showed its maximum expression level in leaf and its minimum expression in tuber, root, and stem. For StU-box 3, gene expression levels were high in leaf, whereas they were low in root, stem, and tuber.

The expression patterns of StU-box under drought stress
To further understand how the expression levels of StU-box genes are influenced by drought stress, we selected four StU-box genes after investigating their structure and phylogeny, and examined their relative expression profiles by qPCR in leaves and tubers after drought treatment. The expression levels under drought and normal treatments are given in Fig. 8E, F, G, and H. The primers for quantitative real-time PCR (qRT-PCR) used in this study are provided in Table S3. The StU-box genes showed variation in expression under dehydration stress as compared to the control (Fig. 8I). The qPCR analysis revealed that StU-box 51 had the highest expression level in leaf and tuber under normal treatment, while it was downregulated under drought conditions in leaf. However, StU-box 51 was upregulated under drought stress in tuber. Furthermore, the expression level of StU-box 51 was higher than those of StU-box 3, StU-box 27, and StU-box 15 in both leaf and tuber under normal and drought conditions (Fig. 8E, F, G, and H). StU-box 27 had a higher expression level in leaf than in tuber under normal conditions, whereas it had a lower expression level in both tuber and leaf under drought stress. In addition, the gene expression profiles of StU-box 15 and StU-box 3 were nearly equal in leaf and tuber under normal and drought stress conditions. The expression of both genes was upregulated in leaf under normal conditions but was downregulated in leaf and tuber under stress treatment.

Co-expressed gene network and GO analysis
The protein-protein interaction network of the 62 genes revealed that most genes in the network belonged to class I of the U-box family. In this network, PGSC0003DMG400000791 (StU-box 33) seemed to be the central protein involved in the protein ubiquitination pathway. PGSC0003DMG400000043 (StU-box 15) possesses an essential role in protein ubiquitination and protein modification. OsU-box 40 (upregulated under salt stress and against pathogen invasion) was found to be orthologous with StU-box 15. PGSC0003DMG400015790 (StU-box 27), another gene in this network, acts as a receptor protein kinase, triggering a defense response under abiotic and biotic stress conditions (Fig. 9). This receptor can mediate the response to organic chemicals, namely the hormones ethylene, cytokinin, and ABA. For the generation of defense responses, the activation of signal transduction cascades and protein ubiquitination are necessary for the modulation of plant immunity. On the other hand, some U-box genes are involved in cellular regulation in eukaryotes, controlling a wide range of processes including embryogenesis, hormone signaling, and senescence. StU-box 3 had a function in response to spotted leaf protein and was found to be orthologous with SlU-box 4.
The GO analysis revealed that, in terms of biological processes, the majority of the StU-box genes were involved in the response to stimuli, cellular response, response to chemical, cellular response to stimulus, and response to inorganic substance. Further, at the level of molecular function, more genes were implicated in transporter activity, transmembrane transporter activity, ion transmembrane transporter activity, and organic acid transmembrane transporter activity. In terms of cellular component, most genes were assigned to cellular, cellular anatomical entity, protein-containing complex, catalytic complex, intracellular protein-containing complex, cell periphery membrane, organelle, and intracellular anatomical structure (Fig. 10).

Discussion
E3 ligases are an essential switcher of plant signaling paths that play through targeting proteins to the degradation path. These proteins constitute four separate subclasses, indicating that they are involved in various roles. In this study, our in silico analysis identified 62 potato StU-box genes. The detected StU-box genes were unevenly distributed on the 10 potato chromosomes.
Features of the StU-box genes including peptide length, MW, Pi, and sub-cellular were analyzed. Our results agreed with previous studies in Arabidopsis, banana, grapevine, tomato, rice, cotton, and apple [9,17,24,30,37,38,42], StU-box proteins were mostly anticipated to be localized in nucleus, cytoplasm, and cell membrane. It is suggested that StU-boxs could function in the cytoplasm and nucleus-localized. In cotton, most of GhU-box genes are localized in the nucleus which agrees with our results in potato, consistent with their function as conserved gene [17]. In this study, GO analysis showed that majority of U-box genes are localized in membrane, organelle, cytoplasm, and cytosol. These results agreed with those reported in cotton and tomato [17,24]. Further,  most genes were involved in cellular processes and metabolic which was in agreement with results of Sharma and Taganna (2020). The phylogenetic study of the potato U-box gene family revealed a great similarity among all the four classes due to the existence of the core U-box domain in all the members. The diversification of the U-box genes, whose members regulate key aspects of plant growth and development, is a clear example of the role that gene duplication and sub-functionalization play in shaping genetic systems. The sub-functionalization observed across the subfamily is required for the retention of family members in the genome. The augmentation of the gene family members could be a result of a neutral procedure of subfunctionalization. Together, these results indicate that sub-functionalization of expression has evolved relatively slowly. Sub-functionalization model predicts genomic features correlated with different expression profiles, phylogenetic and functional analyses, and the process of functional divergence of duplicated genes. 
Gene duplication is a powerful mechanism providing the raw material for the evolution of the species and is the most common mechanism for the formation of original genes in these species [42]. Gene family expansion is associated with segmental and tandem duplications. Furthermore, whole-genome duplication, tandem, and segmental duplication have played key roles in the evolutionary expansion of gene families. The extension duplicated genes can also develop the acquisition of extended functions for the new genes. Tandem duplication tends to start modifications in gene structure and function more quickly than other mechanisms of duplication. StU-box 6 and 11 were involved in tandem duplication, suggesting that tandemly duplicated genes as a whole may play a vital role in signaling paths implicated in plant growth in potato. In Eucalyptus grandis and A. thaliana, expression analysis of paralogous gene pairs revealed differential expressions between paralogs in organs, supporting the notion that sub-functionalization and neo-functionalization occurred after duplication [7,14,15,43].
Eighteen pairs of potato paralogs (StU-box 6 and StU-box 8, StU-box 6 and StU-box 9, StU-box 6 and [...]) These results were the outcome of a putative tandem duplication event. Our findings suggest that tandem gene duplication is the central cause of the expansion of the U-box gene family; similar findings have been reported in tomato and Arabidopsis [23]. Based on selective pressure analyses, most of the potato gene pairs were subjected to purifying (negative) selection, which removes deleterious mutations. Our findings also showed that the Arabidopsis and citrus genes were subject to purifying selection. These findings are consistent with the outcomes reported for many other plant species [25]. However, in this survey, most genes in cotton, rice, tomato, wheat, and maize appeared to be under positive selection.
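Selective pressure on a duplicated gene pair is conventionally read from the ratio of the nonsynonymous to the synonymous substitution rate (Ka/Ks): a ratio below 1 indicates purifying selection, a ratio near 1 neutral evolution, and a ratio above 1 positive selection. The following is a minimal sketch of that classification only. The function name, the tolerance, and the numeric values are hypothetical and are not taken from this study.

```python
def classify_selection(ka, ks, tol=1e-9):
    """Classify a gene pair by its Ka/Ks ratio (conventional thresholds).

    ka, ks: nonsynonymous and synonymous substitution rates (assumed inputs).
    tol: hypothetical tolerance for treating the ratio as exactly 1.
    """
    if ks <= 0:
        # The ratio is undefined when no synonymous substitutions are observed.
        raise ValueError("Ks must be positive to compute Ka/Ks")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"


# Illustrative values only; they are not results from the paper.
example_pairs = {"pair_A": (0.10, 0.50), "pair_B": (0.60, 0.30)}
labels = {name: classify_selection(ka, ks) for name, (ka, ks) in example_pairs.items()}
print(labels)
```

In practice the Ka and Ks estimates would come from a codon-aware alignment of each paralogous or orthologous pair; the sketch covers only the final interpretation step.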
Gene duplication and syntenic analyses indicated that segmental and tandem duplications are the main forces driving diversity in the potato U-box genes. The syntenic analysis showed the structural and functional conservation of the genes, underlying the origins of evolutionary novelty. Based on the evolutionary history of genes, orthologs have similar functions, reflecting their conserved domains [2]. In the current survey, we found that StU-box genes could be functionally similar to their homologs in Arabidopsis. The analysis separated the U-box proteins into four groups. StU-box 15 clustered with AtU-box 37, AtU-box 55, and AtU-box 58, and StU-box 27 clustered with AtU-box 62. The StU-box 15 gene was assigned to class II, which includes the U-box and ARM domains. Further, StU-box 27 was assigned to class IV, which includes the U-box, kinase, and USP domains. U-box genes with similar functions and structural domains tended to cluster in the same subfamilies. Genomic comparison with orthologous genes from well-studied plant species may provide a valuable reference for newly detected genes. Therefore, the functions of StU-box genes were inferred by comparative genomic analyses with the U-box genes from Arabidopsis. Four orthologous gene pairs between Arabidopsis and potato were detected, suggesting that these genes may share a common ancestor and that their functions have been conserved during evolution. However, further investigation is required to determine the particular function of each gene.
Gene duplication is a major factor in the formation of domains, providing new opportunities for an organism to gain new gene functions. New domain arrangements may arise through fusion, terminal domain loss, and duplication, likely driven by non-allelic homologous recombination, exon shuffling, and transposon events. These kinds of rearrangements are overrepresented among duplicated genes, indicating that duplications influence domain rearrangement rates. The arrangement and organization of the genes likewise reflect the diversity of a gene family among species. This organization is associated with the evolution and functional features of the gene family. Several U-box genes were intronless, whereas others contained varying numbers of introns. A similar pattern of intronless genes in the U-box gene family was also reported in grapevine and tomato. In U-box genes bearing many introns, the introns could act as a mutational buffer, protecting coding sequences from randomly occurring harmful mutations. The existence of the intronless genes shows the organizational integrity among the members of the U-box family. The distribution of the 10 recognized motifs among the potato U-box gene family suggests structural and functional similarity among potato U-box genes. Motif 1 was found to be conserved and showed homology with the U-box domain. The motif distribution likewise demonstrates the existence of further domains that may contribute to the critical structural organization of the U-box gene family. Motifs 2 and 3 were restricted to the class II genes and resemble the armadillo-like fold structure.
Tissue expression profile analysis provided valuable clues about the roles of StU-box genes in potato growth and developmental stages. For example, StU-box 51 was exclusively expressed in tuber, root, stem, and leaf, and StU-box 27 was expressed in leaf. Four StU-box genes were upregulated in the leaf. Our findings indicate that the U-box genes control several processes, namely root and shoot development, stolon growth, and tuber development. In S. lycopersicum, high U-box gene expression was observed in reproductive tissues, namely fruit and flower, suggesting a role for U-box ligases in critical stages of plant development [24]. Qian et al. suggested that expression reduction, as a particular type of sub-functionalization, might assist the maintenance of duplicates and the conservation of their parental function [18].
The gene expression profiles of orthologous gene pairs, detected by syntenic analysis, were investigated to gain insight into functional consistency across developmental stages and stress conditions. StU-box 3 was expressed in leaf. StU-box 3 is orthologous to SlU-box 4, which was expressed under heat shock conditions in flower pollen tissue [24]. Further, SlU-box 4 was expressed in the leaf in tomato. However, StU-box 3 was downregulated under drought stress in potato. In Arabidopsis, AtPUB60 (At2g33340) and AtPUB49 (At5g67530) were highly expressed in leaf [32], similar to StU-box 3 (class I) in potato. These results suggest that the abovementioned potato, tomato, and Arabidopsis orthologs have similar functions. In rice, the spotted leaf gene Spl7 encodes a heat shock protein, and its mutation is responsible for lesion formation in the leaves [35].
Gene expression analysis under drought stress indicated that StU-box 51 was upregulated. AtPUB55 (AT5G51270) includes a universal stress protein domain (USPD), which in Escherichia coli mediates survival under different stresses such as toxic chemicals, osmotic stress, UV light damage, and nutrient starvation [12]. At5g61560 (AtPUB58), the ortholog of StU-box 15, is a receptor-like protein kinase with crucial regulatory roles in many aspects of plant growth and development. Further, this gene is involved in abiotic stress responses, namely the abscisic acid response, calcium signaling, antioxidant defense, and responses to drought, salt, cold, and toxic metals/metalloids [36]. In this study, StU-box 15 was downregulated under drought. This finding disagrees with previous studies in Arabidopsis and tomato [24,32]. The expectation that orthologous gene pairs share a function can break down after a sub-functionalization event, so that the two members of an orthologous pair come to have different functions.
StU-box 51, a class IV gene, was expressed in root, stem, leaf, and tubers in potato, whereas the other class IV genes showed weak expression in most tissues. The upregulation of StU-box 51 may reflect the many TFBS in its promoter region, including those for MYB, WRKY, bZIP, and NAC. MYB, bZIP, and NAC function in tuber development and play key roles in the upregulation of genes in potato stolons. StU-box 3 and StU-box 27 were upregulated in leaf tissue, possibly owing to a high number of MYB, bZIP, and WRKY binding sites. MYB and bZIP are expressed under environmental stress and in storage roots [19,20,34]. WRKY has an important role in leaf tissues and is expressed in response to stresses such as wounding, drought, salt, and virus invasion [10,21,22]. StU-box 15, whose promoter contains a high number of DOF, AP2, and NAC binding sites, was upregulated in root tissues. Dof is one of the most important TF families and was upregulated in root, shoot, leaf, and stolons. In previous studies, some Dof genes were expressed in all potato tissues, while the expression levels of individual genes varied among tissues [5,21,28]. In Brassica, AP2/ERF genes showed specifically high expression in the roots, although a few of these TFs were expressed in both root and leaf [16,40]. The gene expression profiles support the cis-regulatory element predictions and clarify the contribution of the U-box gene family to potato development.
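The comparison above rests on counting predicted binding sites per transcription factor family in each gene's promoter. A minimal sketch of such a tally is given below. The input format (one gene and TF family tuple per predicted site), the function name, and all counts are assumptions made for illustration; the toy data are not the TFBS predictions from this study.

```python
from collections import Counter

# Hypothetical predicted sites: one (gene, TF family) tuple per binding site.
# The counts are placeholders, not results reported in the paper.
hits = [
    ("StU-box 51", "MYB"), ("StU-box 51", "MYB"), ("StU-box 51", "WRKY"),
    ("StU-box 51", "bZIP"), ("StU-box 51", "NAC"),
    ("StU-box 15", "DOF"), ("StU-box 15", "AP2"), ("StU-box 15", "NAC"),
]


def tfbs_counts_per_gene(site_hits):
    """Return {gene: Counter({TF family: number of predicted sites})}."""
    counts = {}
    for gene, family in site_hits:
        counts.setdefault(gene, Counter())[family] += 1
    return counts


counts = tfbs_counts_per_gene(hits)
for gene, family_counts in counts.items():
    # most_common() lists TF families from most to least frequent.
    print(gene, family_counts.most_common())
```

A table of this kind can then be set beside the tissue expression profiles to check whether genes enriched for a given TF family share an expression pattern.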
The regulatory mechanisms controlling StU-box gene expression were evaluated at the transcriptional level using TFBS in the promoter regions of StU-box genes. A total of 14,508 putative TFBS were detected, and these are involved in multiple biological processes. The expansion of the gene family occurred over the course of evolution, in which gene duplication and sub-functionalization of the native U-box domain in higher eukaryotes played a major role. Sub-functionalization is another mechanism that leads to the maintenance of duplicated genes while partitioning the ancestral function. The increase in gene family members could be the result of a neutral process of sub-functionalization. Together, our findings suggest that sub-functionalization of expression evolves relatively slowly. To better understand why, we examined which genomic features are correlated with divergent expression profiles of the duplicate genes.
According to the structure analysis results, motif identification, gene duplication, gene expression, syntenic analysis, analysis of TFBS, diverse members of the identical subfamily, and group had similar gene structure and conserved