The present investigation studied polymorphism in the LEP gene, which is related to fertility, in 81 female Egyptian river buffaloes. The PCR-RFLP pattern of the target fragment digested with Eco91I showed that all the investigated buffaloes had the CC genotype, which was confirmed by direct sequencing. In agreement with the current results, other investigators reported CC as the only genotype detected among several river buffalo breeds [3, 8,9,10,11,12]. In cattle, by contrast, the frequencies of the CT and TT genotypes were 34% and 8% (Liefers et al. [4]), 47.03% and 9.13% [7], 38.8% and 2.4% [5], and 33% and 6% [6], respectively. Some studies also found that SNPs related to reproduction traits in cattle were monomorphic in Egyptian buffaloes [23, 24].
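The cattle genotype frequencies quoted above can be converted into T allele frequencies, which makes the contrast with the monomorphic (all CC) buffalo sample easier to see. The following is a minimal sketch, not part of the original analysis; the allele frequencies are derived here from the quoted genotype frequencies and are not values reported by the cited studies.

```python
# Genotype frequencies (CT, TT) in cattle, as quoted in the text, keyed by citation.
cattle_genotypes = {
    "[4]": (0.34, 0.08),
    "[7]": (0.4703, 0.0913),
    "[5]": (0.388, 0.024),
    "[6]": (0.33, 0.06),
}


def t_allele_frequency(ct, tt):
    """T allele frequency: every TT animal carries two T alleles, every CT animal one."""
    return tt + ct / 2


t_freqs = {ref: t_allele_frequency(ct, tt) for ref, (ct, tt) in cattle_genotypes.items()}

for ref, freq in t_freqs.items():
    print(f"{ref}: T allele frequency = {freq:.3f}")

# In the buffalo sample of this study all animals were CC, so the same formula gives 0.
buffalo_t_freq = t_allele_frequency(ct=0.0, tt=0.0)
```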
Interestingly, neither Eco91I [7] nor HphI [4], the enzymes used to detect the C to T SNP (A59V mutation) in cattle, is suitable for detecting the A59V mutation in future studies on Egyptian buffalo. Sequencing of the LEP amplicon revealed a G140A SNP (Fig. 1c), and the A allele of this SNP disrupts the Eco91I and HphI restriction sites. This finding means that Eco91I and HphI will not be able to differentiate the two alleles of the A59V mutation. We suggest that, in future studies on river buffalo, the two alleles of the LEP gene C to T SNP (A59V) could be differentiated by a PCR that combines the forward primer designed in this study with one of the mutated primers 5′ CTGGTGAGGATCTGTTGGTCGATC 3′ or 5′ CTGGTGAGGATCTGTTGGTTGATC 3′. The resulting amplicon would carry the restriction site of PvuI (CGATCG, which will cut the C allele) or BclI (TGATCA, which will cut the T allele), respectively.
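The logic of the two mutated primers can be checked with a short sketch. It assumes, as an interpretation not spelled out in the text, that each mutated primer anneals as a reverse primer with its 3′ end immediately next to the SNP, so that the first base added after the primer is the complement of the C or T allele. The function name `cuts` and the dictionaries are illustrative only.

```python
# Recognition sites and mutated primers as given in the text.
SITES = {"PvuI": "CGATCG", "BclI": "TGATCA"}
PRIMERS = {
    "PvuI": "CTGGTGAGGATCTGTTGGTCGATC",
    "BclI": "CTGGTGAGGATCTGTTGGTTGATC",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def cuts(enzyme, allele):
    """Return True if the primer plus the first extended base contains the enzyme site.

    Assumption: the primer is on the strand opposite to the one used to name the
    alleles, so the base added after its 3' end is the complement of the allele.
    """
    next_base = allele.translate(COMPLEMENT)
    extended = PRIMERS[enzyme] + next_base
    return SITES[enzyme] in extended


for enzyme in SITES:
    for allele in "CT":
        print(enzyme, allele, cuts(enzyme, allele))
```

Under this assumption each primer creates a site for only one allele, which is why the two enzymes would give complementary digestion patterns.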
Four SNPs (C32T, G140A, G263A, and G379A) were detected within the Egyptian buffalo samples. The alleles of the G140A, G263A, and G379A SNPs, in addition to 21 further SNPs, were found when the sequenced leptin gene fragment of the Egyptian buffalo was compared with the records of other buffalo breeds for the same gene in GenBank. The T allele of the intronic C32T SNP was unique: it was detected only in the Egyptian buffaloes of this investigation and was not found in any buffalo record in GenBank or in previous studies on river buffalo [3, 8,9,10,11,12].
The Brazilian buffaloes in the study of Vallinoto et al. [3] had both alleles of the G263A and G278A SNPs but, unlike the Egyptian buffaloes, had only the G allele of the G140A and G379A SNPs; both the Brazilian and Egyptian breeds were monomorphic for the G386A SNP. Both the Italian [8] and Egyptian buffaloes were polymorphic for the G140A and G379A SNPs, but in contrast to the Egyptian buffaloes, the Italian buffaloes were monomorphic for the G263A SNP and polymorphic for the G278A SNP. The Philippine buffaloes [9] had the two alleles of the G263A and G379A SNPs, like the Egyptian buffaloes.
Like the Egyptian buffaloes, the Indian buffaloes [10] were polymorphic for the G379A SNP, but they were monomorphic for the G140A and G263A SNPs (carrying the G allele of both SNPs). In addition, the Egyptian buffaloes had the C allele of the C242T SNP, while the Indian buffaloes had the T allele. In the studied sample of Italian buffaloes [12], in contrast to the Egyptian buffaloes, the animals had the G alleles of the G140A, G263A, and G379A SNPs in addition to the C allele of the T360C SNP.
A BLAST search of the nucleotide sequence of the amplified region of the Egyptian river buffalo LEP gene against the GenBank database showed that cattle had the highest homology score (99%), compared to sheep (97%), goat (97%), human (87%), and mouse (79%). The homology between the amino acid sequence translated from the full coding region of exon 3 of the Egyptian river buffalo LEP gene and the corresponding sequence in cattle, sheep, goat, human, and mouse was 100%, 98%, 98%, 85%, and 82%, respectively. The homology of the translated amino acid sequence was therefore higher than that of the DNA sequence in all the compared species except human (85% versus 87%).
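For readers unfamiliar with how such percentages arise, the sketch below computes percent identity over two pre-aligned sequences. It is a simplified illustration and not the BLAST procedure used in the study, which reports identity over its own local alignments. The function name `percent_identity` and the toy sequences are hypothetical and are not taken from any LEP record.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned, non-gap columns in which both sequences carry the same residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Skip columns where either sequence has a gap character.
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not compared:
        raise ValueError("no aligned positions to compare")
    matches = sum(a == b for a, b in compared)
    return 100.0 * matches / len(compared)


# Hypothetical toy alignment (one mismatch in ten columns), for illustration only.
toy_a = "ATGCGTACGT"
toy_b = "ATGCGTACCT"
toy_identity = percent_identity(toy_a, toy_b)
print(f"{toy_identity:.1f}% identity")
```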
Using the SplicePort software, the score of the 3′ splicing site at the end of intron 2 (position 47) was calculated to be 0.28, which was lower than the score of the 3′ splicing site of intron 2 in human (0.31). The human site can be replaced by another 3′ splicing site located 3 bp downstream of it, which leads to a protein isoform that lacks glutamine at position 49 of the mature peptide [25]. This weak splicing site could be affected by flanking cis-acting splicing regulatory elements [26]. SplicePort also detected potential 3′ splicing sites with higher scores than the regular 3′ splicing site at positions 266 (0.93) and 469 (0.87), in addition to a 5′ splicing site (0.42) at position 267. The G330C SNP generated a 3′ splicing site at position 332 with a score of 0.96. These sites may act as cryptic splicing sites that could be activated by some mutations, or naturally without any mutation, leading to changes in the final transcription products [27,28,29,30,31].
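The comparison of candidate 3′ splicing sites with the regular site can be laid out as a small ranking. This is only a bookkeeping sketch over the scores quoted above; it does not reproduce the SplicePort scoring itself, and the variable names are illustrative.

```python
# Score of the regular 3' splicing site at the end of intron 2 (position 47).
REGULAR_SITE_SCORE = 0.28

# Candidate 3' splicing sites (position -> SplicePort score) quoted in the text.
# Position 332 exists only in the presence of the G330C SNP.
candidate_3prime_sites = {266: 0.93, 469: 0.87, 332: 0.96}

# Keep the candidates that outscore the regular site, strongest first.
stronger_sites = sorted(
    (pos for pos, score in candidate_3prime_sites.items() if score > REGULAR_SITE_SCORE),
    key=lambda pos: candidate_3prime_sites[pos],
    reverse=True,
)

for pos in stronger_sites:
    print(f"position {pos}: score {candidate_3prime_sites[pos]:.2f}")
```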
Moreover, six SNPs were predicted to have different effects on RNA cis-regulatory elements. Table 2 shows that the T27C and C32T SNPs are very close to the 3′ splicing site of intron 2; the former disrupts an intron splicing silencer, while the latter creates a new intron splicing silencer. In exon 3, the A114G and A310G SNPs each disrupt two exonic splicing enhancers, whereas the G263A and G379A SNPs each disrupt two exonic splicing enhancers and at the same time create new exonic splicing enhancers. Disrupting, creating, or changing the number of cis-acting splicing regulatory elements could change the splicing efficiency and affect downstream processes such as alternative splicing and intron retention, which in turn affect the final protein sequence and structure [32, 33].
Ten non-synonymous SNPs were found among the detected SNPs, and their potential effects on LEP protein function were predicted (Fig. 4). All the programs classified the E129K and R159Q mutations as neutral, with a total PredictSNP expected accuracy of 83%. The S71G and T87N mutations were classified as neutral by PHD-SNP, PolyPhen-1, PolyPhen-2, and SNAP and as deleterious by MAPP and SIFT, so PredictSNP classified them as neutral but with accuracies of only 65% and 63%, respectively. Furthermore, N103S, Y140C, E143Q, and R149W were evaluated as deleterious by all six programs, with a combined PredictSNP accuracy of 87%. The E136G and S153P mutations were predicted to be deleterious by all the programs except PHD-SNP in the case of E136G and except PHD-SNP and SIFT in the case of S153P, with PredictSNP accuracies of 65% and 55%, respectively.
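The per-tool calls described above can be tabulated and reduced to a simple vote count, as in the following sketch. The unweighted majority vote is an assumption made here for illustration; the text does not describe how PredictSNP combines the individual tools, and the vote is not meant to reproduce its reported accuracies.

```python
TOOLS = {"MAPP", "PHD-SNP", "PolyPhen-1", "PolyPhen-2", "SIFT", "SNAP"}

# Tools that called each mutation deleterious, as described in the text.
deleterious_calls = {
    "E129K": set(),
    "R159Q": set(),
    "S71G": {"MAPP", "SIFT"},
    "T87N": {"MAPP", "SIFT"},
    "N103S": set(TOOLS),
    "Y140C": set(TOOLS),
    "E143Q": set(TOOLS),
    "R149W": set(TOOLS),
    "E136G": TOOLS - {"PHD-SNP"},
    "S153P": TOOLS - {"PHD-SNP", "SIFT"},
}


def majority_call(callers):
    """Unweighted majority vote (an assumption, not PredictSNP's own consensus rule)."""
    return "deleterious" if len(callers) > len(TOOLS) / 2 else "neutral"


consensus = {mut: majority_call(callers) for mut, callers in deleterious_calls.items()}

for mut, call in consensus.items():
    print(f"{mut}: {len(deleterious_calls[mut])}/{len(TOOLS)} deleterious votes -> {call}")
```

With these inputs the simple vote agrees with the classifications reported in the text: four mutations come out neutral and six deleterious.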
The amino acid conservation analysis showed that the amino acids S71, T87, E129, E136, S153, and R159 are not conserved in the LEP polypeptide. The amino acids Y140 and E143 had the highest conservation scores, followed by N103 and R149. These findings suggest that the conserved amino acids have an important functional or structural role in the native LEP protein [22], so mutations in these amino acids could have a damaging effect on the LEP protein.
Finally, the effect of the detected non-synonymous SNPs on the stability of the 3D tertiary structure of the river buffalo mature leptin peptide was predicted (Fig. 6). Only S71G increased the stability of the leptin protein, by 0.05 kcal/mol. T87N, N103S, E129K, E136G, Y140C, E143Q, R149W, S153P, and R159Q lowered the stability of the mature leptin peptide tertiary structure, with changes of −0.41, −0.19, −0.01, −0.72, −1.63, −0.77, −0.45, −1.10, and −0.25 kcal/mol, respectively. The Y140C and S153P mutations reduced the stability of the mature leptin peptide tertiary structure by more than 1 kcal/mol, which adds further evidence for the damaging effect of these mutations [34]. In addition, the amino acid N103 in the leptin polypeptide is conserved in human and river buffalo, and the N to K mutation at this amino acid has a damaging effect on the physiological function of the human LEP protein [35].
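The stability changes listed above can be sorted against the 1 kcal/mol cutoff mentioned in the text, as in the following sketch. The values are those quoted in this paragraph; the variable names and the sign convention (negative means destabilizing) are stated here for illustration.

```python
# Predicted stability change in kcal/mol for each mutation (negative = destabilizing).
stability_change = {
    "S71G": 0.05,
    "T87N": -0.41,
    "N103S": -0.19,
    "E129K": -0.01,
    "E136G": -0.72,
    "Y140C": -1.63,
    "E143Q": -0.77,
    "R149W": -0.45,
    "S153P": -1.10,
    "R159Q": -0.25,
}

CUTOFF = -1.0  # destabilization of more than 1 kcal/mol, as discussed in the text

stabilizing = [mut for mut, change in stability_change.items() if change > 0]
strongly_destabilizing = [mut for mut, change in stability_change.items() if change < CUTOFF]

print("stabilizing:", stabilizing)
print("destabilizing by more than 1 kcal/mol:", strongly_destabilizing)
```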