Growth hormone cDNA subcloning in TZ57R/T vector
The electrophoretic patterns of the As_GH-TZ57R/T and Bo_GH-TZ57R/T constructs confirmed the successful ligation of the As_GH and Bo_GH cDNA sequences into the TZ57R/T cloning vector (Figs. 3 and 4, respectively).
Growth hormone cDNA sequence annotation
Assaf and Boer growth hormone SNPs
Single nucleotide polymorphisms (SNPs) in different candidate genes and their association with animal performance have been investigated in several animal species: cattle [15,16,17], sheep [18, 19], and goats [20]. Growth hormone gene SNPs have been widely used as genetic markers to aid genetic selection in farm animals: cattle [17, 21,22,23], sheep [24], and goats [25, 26].
Pairwise alignment of As_GH and Bo_GH cDNA sequences
Assaf sheep carry a unique three-nucleotide insertion (A637G638G639) that encodes an arginine residue (Arg, R194); this insertion (AGG) is absent from the growth hormone cDNA sequences of the Boer goat in the current study and of all breeds in the GenBank database. This insertion may be used to develop a genetic marker for the Assaf sheep breed.
Multiple alignments of As_GH sequence vs. GenBank sheep database
The unique three-nucleotide insertion (A637G638G639) in the Assaf growth hormone was also confirmed by the multiple alignment against all breeds in the GenBank database. Ossimi sheep have three unique SNPs (A12A13G14) that encode lysine (Lys2, K) vs. alanine (Ala3, A) in all GenBank sequences. Because Lys and Ala differ in their physicochemical properties [27], the substitution of Ala with Lys in Ossimi sheep may adversely affect growth hormone function and thereby animal performance.
Ossimi, Afghani, and Kazakh sheep carry the most SNPs relative to the GenBank database, which may be due to random crossbreeding within each breed. Two-nucleotide substitutions in Ossimi and Afghani sheep yield new codons that encode different amino acids. In Ossimi sheep, Gg472c473 encodes glycine (Gly, G) compared to GTT for valine (Val, V) in the GenBank database. In Afghani sheep, Ca628g629 encodes glutamine (Gln, Q) compared to CGC for arginine (Arg, R), and Gg389c390 encodes glycine (Gly, G) compared to GTT for valine (Val, V) in the GenBank database. Glycine and glutamine do not share the physicochemical properties of valine and arginine, respectively; therefore, these substitutions may impair protein function [27].
Multiple alignments of Bo_GH sequence vs. GenBank goat database
Pakistani Kamori, Tharri, and Beetal goats and Indian Beetal goat carry the most SNPs relative to the GenBank goat database. Most goat breeds in Pakistan and India are multipurpose, used for meat and milk production and sometimes also for hair [28]. Random crossbreeding within each breed may therefore explain the variety of SNPs in the growth hormone cDNA sequences of these breeds.
Pakistani Kamori, Tharri, and Beetal goats share the same SNPs, which encode the same amino acids. These single nucleotide polymorphisms can be used to develop genetic markers for each breed. An A/G single nucleotide polymorphism was observed in Afghani sheep (A) and Indian Beetal goat vs. G in the GenBank sheep and goat databases. Pakistani Beetal goat has a G/A SNP: g358GC (Gly120) compared to A78GC (Ser120) in the GenBank goat database. The effect of this Gly/Ser substitution is difficult to predict, and modeling studies suggest that it would not be large [18]. The multiple alignments of the growth hormone of sheep and goats in the present results agree with previous reports of two genotypes, G/G and A/G, in sheep and goat [29, 30]. The A/G genotype is associated with superior growth traits such as birth chest girth, weaning weight, large litter size, and good body conformation. This may be due to the presence of both glycine and serine residues in heterozygous animals; these residues are interconvertible, so when the body needs one of them and it is not available from the feed, it uses serine to produce glycine and vice versa. Serine is involved in the metabolic processes that burn glucose and fatty acids for energy and is also used to make creatine, which combines with water to “pump up” muscle mass.
Moreover, glycine is required for the synthesis of proteins, DNA, RNA, bile acids, and other amino acids in the body, and it helps to retard muscle degeneration [31]. Therefore, production can be improved using these growth hormone SNPs by selecting animals with the AG genotype and including them in breeding programs for Egyptian sheep and goats as a way to increase their productivity. An SNP (g55G and a55G) with two genotypes (G/G and A/G) has been detected in the growth hormone sequence of Egyptian sheep and goats. The GG and AG genotype frequencies were 35.56 and 64.44% in Barki sheep, 19.23 and 80.77% in Rahmani sheep, and 76.67 and 23.33% in Ossimi sheep, respectively. In goats, the GG and AG genotype frequencies were 0 and 100% for Baladi, 13.33 and 86.67% for Barki, and 23.53 and 76.47% for Zaraibi, respectively [32].
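The genotype frequencies quoted from [32] can be converted to allele frequencies by simple allele counting. The following is a minimal sketch, assuming that only the GG and AG genotypes occur (no AA genotype is reported above); the function name `allele_frequencies` is hypothetical and used only for illustration.

```python
def allele_frequencies(gg_pct, ag_pct):
    """Return (G, A) allele frequencies in percent.

    Assumption: only GG and AG genotypes are present, so every GG animal
    contributes two G alleles and every AG animal one G and one A allele.
    """
    g = gg_pct + ag_pct / 2.0
    a = ag_pct / 2.0
    return round(g, 2), round(a, 2)


# Genotype percentages (GG, AG) as quoted in the text from [32].
genotypes = {
    "Barki sheep": (35.56, 64.44),
    "Rahmani sheep": (19.23, 80.77),
    "Ossimi sheep": (76.67, 23.33),
    "Baladi goat": (0.0, 100.0),
    "Barki goat": (13.33, 86.67),
    "Zaraibi goat": (23.53, 76.47),
}

for breed, (gg, ag) in genotypes.items():
    g, a = allele_frequencies(gg, ag)
    print(f"{breed}: G = {g}%, A = {a}%")
```

Under this assumption, the A allele frequency is always half of the AG genotype frequency, so it cannot exceed 50% in any of the listed breeds.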
In the present study, the multiple alignments showed that a C/T SNP was observed in both sheep and goat: TCc248 (Ser80) in Ossimi sheep and TCc254 (Ser81) in China sheep vs. TCC251 (Ser81) in all GenBank sheep sequences, and TCc246 and TCc168 for Ser81 in Indian Tibetan, Lazhi, and Beetal goats vs. TCT263 for Ser81 in all GenBank goat sequences. The present results agree with those reported previously [33], where heterozygotes for the C1763T and A1780G SNPs in the GH gene sequence exhibited heavier body weights (p < 0.05) in Indian Osmanabadi and Sangamneri goat breeds.
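Whether a codon change alters the encoded residue can be checked against the standard genetic code. The sketch below is illustrative only: it covers just the codons quoted in this section, and the helper name `classify` is an assumption, not part of the original analysis.

```python
# Partial codon table restricted to the codons quoted in the text;
# the assignments follow the standard genetic code.
CODON_TO_AA = {
    "TCC": "Ser", "TCT": "Ser",
    "GGC": "Gly", "AGC": "Ser",
    "GTT": "Val",
    "CGC": "Arg", "CAG": "Gln",
}


def classify(ref_codon, alt_codon):
    """Return (reference residue, variant residue, substitution type)."""
    ref_aa = CODON_TO_AA[ref_codon.upper()]
    alt_aa = CODON_TO_AA[alt_codon.upper()]
    kind = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return ref_aa, alt_aa, kind


# Lower-case letters mark the variant nucleotides, as in the text.
examples = [
    ("TCT", "TCc"),  # C/T SNP at the Ser81 codon in goats
    ("AGC", "gGC"),  # G/A SNP at codon 120 in Pakistani Beetal goat
    ("GTT", "Ggc"),  # Val to Gly codon change in Ossimi and Afghani sheep
    ("CGC", "Cag"),  # Arg to Gln codon change in Afghani sheep
]
for ref, alt in examples:
    print(ref, "->", alt, classify(ref, alt))
```

The C/T change at the third position of the serine codon leaves the residue unchanged, whereas the other three changes alter the encoded amino acid.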
The prediction of growth hormone promoter
The promoter controls and regulates the first step of gene expression and is therefore the most important regulatory sequence in the gene [34]. The promoter sequences in As_GH and Bo_GH were completely identical and of the same length, although they differed in position.
Protein structure annotation
Signal peptide sequence
Signal peptides are unique sequences, usually 16 to 30 residues long, located at the N-terminus of newly synthesized secretory proteins; they are involved in transporting the protein to or across cell membranes and in targeting it to the endoplasmic reticulum (ER) membrane for translation initiation [35, 36]. Signal peptides are removed by a specific peptidase during protein transport across the cell membrane [37]. They have a tripartite structure: (1) a region of hydrophilic residues, (2) a region of hydrophobic residues, and (3) a cleavage site for signal peptidase (SPase) [37].
Conserved domain and motifs
In the conserved domain of the sheep breeds, the residues Ser, Gly, and Glu have the same physicochemical properties; therefore, substitution between them may not affect protein function [27]. Likewise, in the conserved domain of the goat breeds, the residues Ser, Gly, Leu, and Val have the same physicochemical properties, so substitution between them may not affect protein function [27]. Other residue substitutions in both conserved domains may affect protein function because of differences in physicochemical properties [27].
The GH protein sequence is strongly conserved in most mammals, but there are differences in biological and receptor-binding properties due to the species specificity of receptor binding [38]. A protein domain is a conserved, distinct unit of molecular evolution, usually associated with a specific molecular function such as protein folding, and it can function independently of the rest of the protein. Significant conserved protein domains are often required for basic cellular function, stability, or reproduction [39]. Detection of the conserved domain in the growth hormone sequences of Assaf sheep and Boer goat may indicate that the isolation and sequencing of growth hormone were carried out correctly.
Protein motifs are consecutive, conserved amino acid sequences of protein families (called motif signatures) and can often be used as a prediction tool for protein function [40]. The Bo_GH protein sequence and all sheep and goat breeds in the GenBank database had two common motif signatures: Somatotropin_1 (CFSETIPAPTGKNEAQQKSDLELLRISLLLIQSW) and Somatotropin_2 (CFRKDLHKTETYLRVMKC). Any change in these consecutive residues prevents prediction of the motif signature. Interestingly, the As_GH protein sequence had only one motif signature, Somatotropin_1 (CFSETIPAPTGKNEAQQKSDLELLRISLLLIQSW), and the second motif signature (Somatotropin_2) could not be predicted. Three novel nucleotides (AGG) that encode arginine (R194) were observed inside the consecutive sequence of the Somatotropin_2 signature (CFRKRDLHKTETYLRVMKC), and this insertion makes the signature undetectable. No growth hormone cDNA sequences for Assaf sheep are available in the GenBank database; therefore, it could not be confirmed whether this mutation is specific to the breed or is an individual mutation. Further studies are needed to confirm whether this insertion is a stable mutation in the Assaf sheep breed or a transient one.
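The effect of the inserted arginine on signature detection can be illustrated with the two fragments quoted above. This is a simplified sketch: it treats the Somatotropin_2 signature as a literal string, whereas motif databases typically use more flexible patterns, and the helper names `has_motif` and `single_insertion` are hypothetical.

```python
# Fragments copied from the text; treating the signature as an exact string
# is a simplifying assumption made for illustration.
SOMATOTROPIN_2 = "CFRKDLHKTETYLRVMKC"
bo_gh_fragment = "CFRKDLHKTETYLRVMKC"
as_gh_fragment = "CFRKRDLHKTETYLRVMKC"


def has_motif(sequence, motif):
    """Return True if the motif occurs as an uninterrupted substring."""
    return motif in sequence


def single_insertion(motif, fragment):
    """Return (0-based index, residue) if the fragment equals the motif
    plus exactly one inserted residue; otherwise return None."""
    if len(fragment) != len(motif) + 1:
        return None
    for i in range(len(fragment)):
        if fragment[:i] + fragment[i + 1:] == motif:
            return i, fragment[i]
    return None


print(has_motif(bo_gh_fragment, SOMATOTROPIN_2))
print(has_motif(as_gh_fragment, SOMATOTROPIN_2))
print(single_insertion(SOMATOTROPIN_2, as_gh_fragment))
```

The Assaf fragment fails the exact match only because of one extra residue, which is consistent with the signature being lost through an insertion and not through multiple substitutions.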
Cysteine bridge and disulfide bonds
Disulfide bonds play a crucial role in the folding and stability of most extracellular secreted proteins [41] and protect protein integrity from oxidants and proteolytic enzymes in the extracellular milieu, thereby increasing the half-life of the protein [42]. Through the oxidative folding process, four of the five cysteine residues form two disulfide bonds between their thiol groups to stabilize the folded form of the protein [43]. In the intracellular environment, the sulfhydryl side chain of cysteine is also well suited to binding metals such as zinc [44].
Protein alignments of As_GH and Bo_GH predicted protein sequence
Pairwise alignments
The As_GH protein sequence has a unique residue (arginine, R194) that is absent from the Bo_GH protein sequence. Arginine (Arg) is a polar, positively charged amino acid; it prefers to be on the protein surface and is frequently involved in salt bridges, where it pairs with a negatively charged residue (aspartate or glutamate) to create stabilizing hydrogen bonds. Therefore, the presence of arginine (insertion mutation) in the As_GH sequence may be important for increasing protein stability [45] and may also be involved in growth hormone-receptor binding [46].
Multiple alignments of growth hormone protein residues
Gene families arise by gene duplication and natural selection. Multiple alignment is essential for understanding evolutionary events between species [47]. In most mammals, the GH sequence is strongly conserved, but differences in biological and receptor-binding properties arise from the species specificity of receptor binding [38]. Because Lys and Ala differ in their physicochemical properties [27], the substitution of the Lys2 residue in the GH signal peptide of Ossimi sheep for the Ala3 residue in the GenBank database may reduce GH secretion in Ossimi sheep. Although the Pro6 residue (small) and the Thr5 residue (polar) differ in their properties, both facilitate intracellular signal transduction of proteins, so they can be substituted without a negative effect on protein function [9, 48, 49].
The sheep GenBank database has a dominant SNP at AC526A that encodes Thr173, compared to Ag518A that encodes Arg173 in Chinese Tibetan sheep. The C allele in the growth hormone sequence showed a positive association with growth rate [50]. Therefore, the substitution of C with G may have a negative effect on the growth rate of Chinese Tibetan sheep.
The substitutions of Val156 with Gly156 and of Gly35 with Ser5 in the Afghani sheep breed may not affect GH function because Gly and Val are small, hydrophobic residues and Gly and Ser are tiny, small residues; hence, the substituted protein remains functional [49]. In contrast, the substitution of the Glu154 residue found in all breeds with Gly154 in the Tibetan sheep breed may reduce growth hormone function because of their different physicochemical properties [48]: Gly is a small, hydrophobic residue, whereas Glu is a polar, negatively charged residue [51]. The Tyr168 residue is present in the growth hormone protein sequence of GenBank sheep breeds vs. the His168 residue in the Pakistani Latti sheep breed. Tyrosine and histidine are both polar (neutral) amino acids; hence, the Tyr residue may be substituted with His without an adverse effect on GH protein function [48].
Pakistani goat breeds (Tharri, Kamori, and Beetal) share the same residue substitution of Ala15 vs. Thr15 in the GenBank goat database; this substitution may not affect GH protein function because both residues are small [49], and it may be used as a genetic marker for these goat breeds. The substitution of Gly156 in Pakistani Kamori and Beetal goat breeds for Val156 in the GenBank goat database may be acceptable because Gly and Val are small, hydrophobic residues; hence, the substituted protein remains functional [49, 51]. Leucine and proline are very non-reactive residues and are rarely directly involved in protein active or binding sites. Leucine can be substituted by other hydrophobic, particularly aliphatic, residues. Because leucine and proline are both small, aliphatic residues, leucine can be substituted by proline in Lazhi and Tharri goats [48, 51].
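The reasoning applied to these substitutions, namely that a substitution is considered tolerable when the two residues share at least one physicochemical property, can be sketched as a set intersection. The property labels below are only those stated in this section and are not a complete or authoritative classification; the names `PROPERTIES` and `shared_properties` are assumptions made for illustration.

```python
# Property labels taken only from the statements in this section.
PROPERTIES = {
    "Gly": {"tiny", "small", "hydrophobic"},
    "Val": {"small", "hydrophobic"},
    "Ser": {"tiny", "small"},
    "Glu": {"polar", "negative"},
    "Tyr": {"polar"},
    "His": {"polar"},
}


def shared_properties(res_a, res_b):
    """Return the property labels common to both residues."""
    return PROPERTIES[res_a] & PROPERTIES[res_b]


# Substitutions discussed in the text (GenBank residue, breed residue).
for ref, alt in [("Val", "Gly"), ("Gly", "Ser"), ("Glu", "Gly"), ("Tyr", "His")]:
    common = shared_properties(ref, alt)
    verdict = "likely tolerated" if common else "may affect function"
    print(f"{ref} -> {alt}: shared = {sorted(common)}; {verdict}")
```

With these labels, only the Glu/Gly pair shares no property, which matches the interpretation given for the Tibetan sheep breed.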