Molecular diversity analysis in hexaploid wheat (Triticum aestivum L.) and two Aegilops species (Aegilops crassa and Aegilops cylindrica) using CBDP and SCoT markers

Background Evaluation of genetic diversity and relationships among crop wild relatives is an important task in crop improvement. The main objective of the current study was to estimate molecular variability within a set of 91 samples from Triticum aestivum, Aegilops cylindrica, and Aegilops crassa using 30 CAAT box-derived polymorphism (CBDP) and start codon targeted (SCoT) markers. Results Fifteen SCoT and fifteen CBDP primers produced 262 and 298 fragments, respectively, all of which were polymorphic. The number of polymorphic bands (NPB), polymorphism information content (PIC), resolving power (Rp), and marker index (MI) for SCoT primers ranged from 14 to 23, 0.31 to 0.39, 2.55 to 7.49, and 7.56 to 14.46 with an average of 17.47, 0.34, 10.44, and 5.69, respectively, whereas these values for CBDP primers were 15 to 26, 0.28 to 0.36, 3.82 to 6.94, and 4.74 to 7.96 with a mean of 19.87, 0.31, 5.35, and 6.24, respectively. Based on both marker systems, analysis of molecular variance (AMOVA) indicated that the portion of genetic diversity within species was greater than that among them. In both analyses, the highest values of the number of observed (Na) and effective alleles (Ne), Nei's gene diversity (He), and Shannon's information index (I) were estimated for Ae. cylindrica. Conclusion The results of cluster analysis and population structure showed that SCoT and CBDP markers grouped all samples based on their genomic constitutions. In conclusion, the markers used are very effective tools for the evaluation of genetic diversity in wild relatives of wheat.


Background
Based on the International Grains Council's report (2019) [1], the world will need more than one billion tons of wheat over the next 4 years (~2024). It seems that this demand can be met through conventional breeding programs [2]. However, a main concern among breeders is that the genetic background of cultivated genotypes has been narrowed by consecutive breeding cycles and that the remaining variability in the gene pool is inadequate for future breeding programs [3]. Therefore, the expansion of the genetic base of this important cereal is necessary. The genus Aegilops, as the most important wheat gene pool, can contribute favorable traits to breeding programs. This genus includes 22 species at the di-, tetra-, and hexaploid levels with various genome constitutions, such as the U, M, S, B, D, N, X, and T genomes [4]. Numerous reports have revealed that different Aegilops species can contribute desirable agronomic properties and breeding potential, including resistance to various biotic and abiotic stresses [5][6][7][8][9][10][11][12][13][14][15][16][17][18][19]. One of the first fundamental steps in any breeding program is the estimation of genetic diversity. Indeed, an accurate assessment of the level of genetic diversity is important for identifying ideal parental plants that provide segregating progenies for further analysis and selection [20].
Genetic diversity in wheat and its wild relatives has been investigated through agro-morphological characters and properties as well as molecular marker techniques. Although botanical characters and agro-morphological traits are commonly used to dissect genetic diversity, they are not completely successful because of environmental influences. In contrast, molecular markers provide important information regarding the genetic structure and phylogenetic relationships among different plant species. These molecular tools reveal genetic differences or similarities more informatively and without interference from environmental factors [21].
Various molecular marker techniques such as AFLP (amplified fragment length polymorphism), RAPD (randomly amplified polymorphic DNA), SSR (simple sequence repeat), ISSR (inter simple sequence repeat), and DArT (diversity arrays technology) are currently available for population genetic analysis, association analysis, and QTL mapping studies. CAAT box-derived polymorphism (CBDP) and start codon targeted (SCoT) polymorphism are two new reproducible marker systems that are based on short conserved regions in plant genes [22,23]. These techniques have been successfully used in genetic analyses of different plant species [24][25][26][27][28][29][30][31][32]. The present research focuses on the estimation of genetic diversity and population structure in a set of Iranian bread wheat genotypes and two Aegilops species using SCoT and CBDP markers.

Plant materials and DNA extraction
The plant materials consisted of 91 accessions belonging to Ae. cylindrica, Ae. crassa, and T. aestivum L. The genetic resources of the wild species were collected from natural habitats in Iran, and the seeds of all accessions were deposited in the Ilam University Gene Bank with specific voucher numbers. The genetic composition and gene bank codes are presented in Table 1. Total genomic DNA of the investigated accessions was isolated from fresh leaves using the CTAB protocol [33]. The quality of the extracted DNA was tested by 0.8% agarose gel electrophoresis.

Polymerase chain reaction using SCoT and CBDP markers
A total of 15 SCoT primers (Table 2) were selected based on [21]. Polymerase chain reactions (PCRs) were conducted in a volume of 20 μL and consisted of 2 μL of DNA, 2 μL of each primer, 10 μL of PCR master mix (ready-to-use PCR master mix 2×), and 6 μL of ddH2O. The amplification conditions included an initial denaturation step of 5 min at 94°C, followed by 45 cycles of 45 s at 94°C, 1 min at 45°C, and 3 min at 72°C, with a final extension at 72°C for 7 min. The amplified fragments were separated by gel electrophoresis in 1.5% agarose. In the CBDP analysis, 15 primers were designed based on [23] (Table 2). Similar to the SCoT assay, each PCR was performed in a volume of 20 μL and contained 2 μL of DNA, 2 μL of each primer, 6 μL of double-distilled water, and 10 μL of master mix. All reactions were carried out as follows: an initial denaturation step at 94°C for 5 min, followed by 45 cycles of denaturation at 94°C for 45 s, primer annealing at 56°C for 45 s, and primer elongation at 72°C for 90 s; the final extension at 72°C was held for 10 min. All products were run on a 1.5% agarose gel. In both systems, all amplified fragments were stained with Safestaine-II and finally photographed using a gel documentation device.
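The two cycling programs and the reaction mix described above can be written down as data and checked arithmetically. The following is only a minimal sketch: the dictionary layout and the function name `programmed_minutes` are hypothetical, ramp times between temperatures are ignored, and a single primer per reaction is assumed because the stated volumes add up to 20 μL only in that case.

```python
# Cycling programs transcribed from the text; the data layout is hypothetical.
PROGRAMS = {
    "SCoT": {"initial_s": 5 * 60, "cycles": 45, "steps_s": (45, 60, 180), "final_s": 7 * 60},
    "CBDP": {"initial_s": 5 * 60, "cycles": 45, "steps_s": (45, 45, 90), "final_s": 10 * 60},
}

# Reaction mix in microlitres (assumption: one primer per reaction).
REACTION_UL = {"DNA": 2, "primer": 2, "master_mix_2x": 10, "ddH2O": 6}


def programmed_minutes(program):
    """Sum of the programmed hold times in minutes (ramp times are ignored)."""
    per_cycle_s = sum(program["steps_s"])
    total_s = program["initial_s"] + program["cycles"] * per_cycle_s + program["final_s"]
    return total_s / 60


for name, program in PROGRAMS.items():
    print(name, programmed_minutes(program), "min")
print("reaction volume:", sum(REACTION_UL.values()), "uL")
```

Under these assumptions the SCoT program amounts to 225.75 min of programmed hold time and the CBDP program to 150 min, and the components sum to the stated 20 μL.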

Data analysis
All the observed bands in the SCoT and CBDP profiles were scored as 1 and 0 on the basis of the presence and absence of the amplified fragments, respectively. To determine the efficiency of the selected primers, informativeness indices, including the number of polymorphic bands (NPB), polymorphism information content (PIC), resolving power (Rp), and marker index (MI), were estimated. The genetic diversity was partitioned among and within species through analysis of molecular variance (AMOVA). Genetic variation parameters, including the number of observed (Na) and effective alleles (Ne), Shannon's information index (I), Nei's gene diversity (He), and percentage of polymorphic loci (PPL), were calculated to compare the level of genetic diversity among the different species. All genetic parameters were calculated using GenAlEx software [34]. Cluster analysis based on Jaccard's dissimilarity matrix was used to group the investigated accessions using DARwin software ver. 6.0.13 [35]. Population structure analysis was carried out using STRUCTURE software [36]. To obtain the optimum number of subpopulations, seven independent runs were performed, and in each run both the burn-in period and the number of MCMC replicates were set to 50,000. The results of the structure analysis were then used to estimate the number of subpopulations (ΔK) using the STRUCTURE HARVESTER software [37].
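As an illustration of how the genetic variation parameters named above can be derived from a 0/1 band matrix, the sketch below uses formulas commonly applied to dominant markers: the null-allele frequency is taken as q = sqrt(1 - band frequency) under Hardy-Weinberg equilibrium, and p = 1 - q. Whether GenAlEx applies exactly these expressions in the authors' analysis is an assumption, and the function name `diversity_parameters` is hypothetical.

```python
import math


def diversity_parameters(matrix):
    """Mean Na, Ne, I, He and PPL for one group of accessions.

    matrix: list of rows (accessions), each a list of 0/1 band scores.
    Assumes a dominant marker, diploid inheritance and Hardy-Weinberg
    equilibrium (an assumption of this sketch, not stated in the text).
    """
    n_accessions = len(matrix)
    n_loci = len(matrix[0])
    na = ne = he = shannon = 0.0
    polymorphic = 0
    for j in range(n_loci):
        band_freq = sum(row[j] for row in matrix) / n_accessions
        q = math.sqrt(1.0 - band_freq)  # null (band-absent) allele frequency
        p = 1.0 - q                     # band-present allele frequency
        if 0.0 < band_freq < 1.0:
            polymorphic += 1
        na += sum(1 for f in (p, q) if f > 0)                       # observed alleles
        ne += 1.0 / (p * p + q * q)                                 # effective alleles
        he += 2.0 * p * q                                           # Nei's gene diversity
        shannon += -sum(f * math.log(f) for f in (p, q) if f > 0)   # Shannon's index
    return {
        "Na": na / n_loci,
        "Ne": ne / n_loci,
        "I": shannon / n_loci,
        "He": he / n_loci,
        "PPL": 100.0 * polymorphic / n_loci,
    }


# Toy example: four accessions scored at two loci (hypothetical data).
toy = [[1, 1], [1, 1], [1, 1], [0, 1]]
print(diversity_parameters(toy))
```

In the toy matrix the first locus is polymorphic and the second is fixed, so half of the loci contribute to PPL; running the function once per species gives values that can be compared across species.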

SCoT and CBDP polymorphism
All tested SCoT and CBDP primers were polymorphic and reproducible. The summary of estimated with the highest value was recognized from others as the informativeness primer, whereas primers SCoT-6, SCoT-20, and SCoT-21 showed the lowest values.
In the CBDP assay, 15 polymorphic primers amplified 298 fragments. Primers CBDP-12 and CBDP-10 amplified the maximum (26) and minimum (15) numbers of polymorphic fragments, respectively. The Rp index ranged from 3.82 to 6.94 with an average of 5.35 per primer. CBDP-8 and CBDP-11 showed the highest and lowest values for this index, respectively. The MI index varied between 4.74 and 7.96 with an average of 6.24 per primer, and the highest and lowest values were estimated for the CBDP-7 and CBDP-10 primers, respectively. The mean of the PIC index was 0.31, and it ranged from 0.28 (CBDP-2 and CBDP-11) to 0.36 (CBDP-4, CBDP-6, and CBDP-7) (Table 2).
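The primer informativeness indices reported above can be sketched from a binary band matrix as follows. The formulas are the ones widely used for dominant markers and are assumed here, since the text does not state which definitions were applied: PIC = 2f(1 - f) per band, averaged over the bands of a primer; Rp as the sum of band informativeness Ib = 1 - 2|0.5 - f|; and MI = PIC × EMR, with EMR the number of polymorphic bands multiplied by the polymorphic fraction. The function name `primer_indices` is hypothetical.

```python
def primer_indices(bands):
    """NPB, PIC, Rp and MI for a single primer.

    bands: list of band columns; each column is a list of 0/1 scores
    across all accessions. Formulas are assumed (see lead-in).
    """
    n_bands = len(bands)
    polymorphic = 0
    pic_sum = 0.0
    rp = 0.0
    for column in bands:
        f = sum(column) / len(column)      # frequency of band presence
        if 0.0 < f < 1.0:
            polymorphic += 1
        pic_sum += 2.0 * f * (1.0 - f)     # PIC of a dominant band
        rp += 1.0 - 2.0 * abs(0.5 - f)     # band informativeness Ib
    pic = pic_sum / n_bands
    emr = polymorphic * (polymorphic / n_bands)  # effective multiplex ratio
    return {"NPB": polymorphic, "PIC": pic, "Rp": rp, "MI": pic * emr}


# Toy primer with two bands scored in four accessions (hypothetical data).
print(primer_indices([[1, 1, 0, 0], [1, 0, 0, 0]]))
```

When every band is polymorphic, as reported for both marker systems, EMR equals NPB and MI reduces to PIC × NPB. For the CBDP means given above, 0.31 × 19.87 ≈ 6.16, which is close to the reported mean MI of 6.24; an exact match is not expected because the reported mean is averaged per primer.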

Genetic diversity in Aegilops species
The results of the AMOVA are shown in Fig. 1. Based on both marker systems, a significant difference within species was observed. Based on SCoT data, the portions of genetic variance within and between species were 78 and 22%, respectively, while based on CBDP data these portions were 80 and 20%, respectively. Moreover, based on both marker systems, there was a high level of genetic differentiation (GST) among the studied populations (Table 3). In addition, the values of the gene flow (Nm) parameter for SCoT and CBDP markers were less than 1, indicating genetic isolation among the different species. A summary of the estimated genetic variation parameters based on SCoT and CBDP markers is presented in

Grouping of Aegilops accessions and population structure analysis
The dendrogram constructed using the neighbor-joining (NJ) algorithm based on the SCoT data set clustered all investigated samples into three main groups. The first, second, and third clusters consisted of 58, 29, and 4 samples, respectively (pairwise genetic distance coefficients are not shown). The first cluster (GI) was further divided into two sub-clusters (sub-I and sub-II). Sub-I included 21 accessions from Ae. crassa and 9 accessions from Ae. cylindrica, while sub-II consisted of 9 and 22 accessions from Ae. cylindrica and Ae. crassa, respectively. All T. aestivum accessions (except no. 1) were placed in the second cluster (GII). Three samples from Ae. cylindrica (nos. 89, 90, and 91) along with one sample of T. aestivum (no. 1) formed the third cluster (GIII) (Fig. 2a). The dendrogram obtained using the CBDP data set indicated that all investigated samples were grouped into three main clusters. The first cluster (GI) comprised all bread wheat accessions. The second cluster (GII) consisted of 22 samples from Ae. cylindrica along with 21 samples from Ae. crassa. The remaining samples from Ae. cylindrica and Ae. crassa were grouped in the third cluster (GIII) (Fig. 2b).
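The dendrograms above rest on a Jaccard dissimilarity matrix computed from the 0/1 band profiles. A minimal sketch of that distance is given below; the function names are hypothetical, the handling of two profiles without any band is a choice made here, and the neighbor-joining step performed in DARwin is not reproduced.

```python
def jaccard_dissimilarity(a, b):
    """1 - (shared bands / bands present in at least one profile)."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    if either == 0:
        return 0.0  # two empty profiles are treated as identical (assumption)
    return 1.0 - shared / either


def dissimilarity_matrix(profiles):
    """Symmetric matrix of pairwise Jaccard dissimilarities."""
    n = len(profiles)
    return [[jaccard_dissimilarity(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]


# Three hypothetical accessions scored for four bands.
profiles = [[1, 1, 0, 0], [1, 0, 1, 0], [1, 1, 0, 0]]
for row in dissimilarity_matrix(profiles):
    print([round(value, 3) for value in row])
```

Shared absences (0/0) do not count as similarity under this coefficient, which is one reason it is often preferred for dominant markers where band absence can have several causes.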
In the population structure analysis, the maximum ΔK for both data sets was observed at K = 3, with accessions falling into three subpopulations (Fig. 3a). In both analyses, the membership threshold for assigning a sample to a subpopulation was set at ≥ 0.5. Based on SCoT data, 20 samples of T. aestivum formed the first subpopulation. All accessions of Ae. crassa except nos. 59 and 60, along with nine samples belonging to T. aestivum, were clustered into the second subpopulation. Two samples from Ae. crassa (nos. 59 and 60) and all Ae. cylindrica accessions formed the third subpopulation. One sample (accession no. 2 from T. aestivum) was categorized into an admixed group (Fig. 3a). In the CBDP analysis, the optimum number of subpopulations was also K = 3, which indicated that all of the samples can be grouped into three main subpopulations with an admixed group. Out of 30 samples from T. aestivum, 22 samples were placed in subpopulation I, five samples (nos. 20, 22, 28, 29, and 30) fell into subpopulation II, and three samples (nos. 21, 24, and 25) along with one sample from Ae. crassa (no. 59) were placed in the admixed group. All samples from Ae. crassa were separated from the other samples and formed subpopulation II. However, one sample from Ae. cylindrica was also categorized in this subpopulation. Finally, the remaining Ae. cylindrica accessions (30 samples) were assigned to subpopulation III (Fig. 3b). The results obtained by cluster analysis and population structure are generally supported by the principal coordinate analysis (PCoA). As shown in Fig. 4, all investigated samples were grouped into two main clusters using SCoT and CBDP markers. In both biplots, all accessions belonging to T. aestivum were placed into one cluster, whereas all Ae. cylindrica and Ae. crassa accessions fell together into the other cluster.
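The two decision rules used in this analysis, choosing K by the largest ΔK and assigning a sample to a subpopulation only when its membership coefficient reaches 0.5, can be sketched as follows. ΔK is computed here in the Evanno form, as the absolute second difference of the mean log-likelihood L(K) divided by the standard deviation of L(K) across replicate runs; that this matches the exact computation in STRUCTURE HARVESTER is an assumption. The function names and all numeric values are hypothetical.

```python
from statistics import mean, stdev


def assign_subpopulation(q_row, threshold=0.5):
    """Return the 1-based subpopulation with the largest membership, or 'admixed'."""
    best = max(range(len(q_row)), key=lambda k: q_row[k])
    return best + 1 if q_row[best] >= threshold else "admixed"


def delta_k(lnp_by_k):
    """Delta K for every K that has both neighbours K-1 and K+1.

    lnp_by_k: dict mapping K to the list of L(K) values from replicate runs.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue
        spread = stdev(lnp_by_k[k])
        if spread == 0:
            continue  # Delta K is undefined when replicate runs are identical
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / spread
    return result


# Hypothetical log-likelihoods from two replicate runs per K.
example_lnp = {1: [-100.0, -102.0], 2: [-80.0, -82.0],
               3: [-60.0, -61.0], 4: [-59.0, -60.0]}
scores = delta_k(example_lnp)
best_k = max(scores, key=scores.get)
print(best_k, scores)
print(assign_subpopulation([0.7, 0.2, 0.1]), assign_subpopulation([0.4, 0.35, 0.25]))
```

With three subpopulations, a sample whose largest membership coefficient stays below 0.5 is not forced into any group, which is how the admixed group described above arises.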

Discussion
Increasing crop adaptability to climate change and ensuring food security for the next century are two critical challenges that reveal the importance of genetic diversity in crop wild relatives. Among cereal crops, wheat has a rich gene pool, including many wild relatives with various genomic constitutions. This feature has enabled wheat breeders to use them as a main source of important agronomic characters and ideal genes involved in tolerance to different biotic and abiotic stresses [38]. Therefore, investigation of molecular variability in wild relatives of wheat is a key task in exploring novel genes or even alleles for future breeding programs [39]. Molecular analysis using DNA-based markers is an efficient approach to estimating genome diversity and population structure and has been used repeatedly in many plants. In the current study, CBDP and SCoT marker techniques served to investigate genetic diversity between and within two Aegilops species along with local bread wheat genotypes. All fragments amplified by both marker systems were polymorphic, which confirmed that these markers are powerful tools for further genetic diversity analyses and for classifying the investigated samples (Table 2). Pour-Aboughadareh et al. [40] used SCoT markers to analyze polymorphism in four Triticum species, T. aestivum, T. durum, T. urartu, and T. boeoticum, and obtained 97.59% polymorphic fragments. Analogous works on wild relatives of bread wheat and different Aegilops species were conducted by Pour-Aboughadareh et al. [29] and Etminan et al. [41], who reported a high level of polymorphism using SCoT and CBDP markers.
In addition to the percentage of polymorphism, resolving power (Rp) and polymorphism information content (PIC) are important indices of marker usefulness that are used to compare the efficiency of markers for genetic analyses [29]. PIC depends on the number of detectable alleles and is described as the probability that a primer identifies polymorphism between samples. Rp, on the other hand, shows the discriminatory ability of the primers used to produce informative fragments [42]. The average values of PIC and Rp confirmed the usefulness of the selected primers for the analysis of genetic diversity and for grouping the samples belonging to the different species of the Triticum and Aegilops genera (Table 2). Likewise, Etminan et al. [43] investigated accessions of T. durum and obtained a PIC of 0.31 and an Rp of 9.16 using CBDP markers. In another study, these authors used fifteen CBDP primers to dissect molecular variability in different Aegilops and Triticum species and reported a high level of polymorphism and discriminatory power of the markers used (PIC = 0.47 and Rp = 11.19). Nowak et al. [2] analyzed three Aegilops species (Ae. crassa, Ae. neglecta, and Ae. juvenalis) using REMP and ISSR markers. The authors indicated that the markers used were efficient systems for evaluating genetic diversity and also reported that the Aegilops species have a high level of genome variability, which can serve as an ideal gene pool for discovering useful genes.
The result of AMOVA revealed that the genetic diversity observed within species (SCoT = 78% and CBDP = 80%) is greater than that found among them (SCoT = 22% and CBDP = 20%), suggesting that accessions within each species are widely differentiated genetically (Fig. 1). This finding is in accordance with previous reports that showed a high level of diversity in Aegilops species through different DNA markers [2,29,41,44,45,46,47,48]. Our results indicate that the accessions from the three different species are genetically different from each other. As shown in Table 3, the highest values of the genetic variation indices (especially Na, Ne, I, and He) were observed for Ae. cylindrica using both marker systems. The higher level of diversity in this species might be attributed to its allelic variation being affected by different climatic conditions [2]. Several studies have considered Ae. cylindrica a novel source of tolerance to abiotic stresses for further wheat breeding programs [49]. Pour-Aboughadareh et al. [29] reported that Ae. cylindrica has the highest level of genetic diversity among the evaluated Aegilops species, whereas the lowest level of diversity belonged to Ae. crassa, which is in accordance with our findings in this study. However, this result disagrees with Etminan et al. [41], who reported a higher level of genetic diversity in T. aestivum and Ae. crassa than in Ae. cylindrica using CBDP markers. These contradictions could be attributed to the primer sequences, the geographical origins, or the sample size of the tested accessions. Khodaee et al. [50] also reported a high level of genetic diversity among Iranian Ae. triuncialis accessions using ISSR, CBDP, and SCoT molecular markers and confirmed that all three marker systems can provide a comprehensive pattern of the genetic diversity in Ae. triuncialis germplasm.
In the SCoT and CBDP analyses, clustering patterns were consistent with the results of the population structure analysis. In both analyses, all investigated accessions were clustered based on their genomic structure with minor admixture (Figs. 2, 3, and 4). Previously, a similar grouping pattern was observed for accessions from different Aegilops and Triticum species by Pour-Aboughadareh et al. [29] and Etminan et al. [41]. These authors reported that SCoT and CBDP markers group species based on their genetic backgrounds and that the obtained groups or subpopulations approximately confirm their taxonomic classification.

Conclusion
Preservation of the highest possible level of genetic diversity is one of the main goals of genetic resource conservation programs, and assessment of genetic diversity using reliable methods provides useful information for the management of genetic resources and crop improvement programs. Our results revealed high polymorphism in the investigated Iranian wheat germplasm from different Triticum and Aegilops species. The molecular analysis of genetic diversity in the tested species showed a high level of genome variability in Ae. cylindrica. Based on the results of AMOVA, the genetic diversity observed within species was greater than that found among them, suggesting that accessions within each species are widely differentiated genetically. In addition, SCoT and CBDP markers were very effective techniques for the evaluation of genetic diversity and for phylogenetic studies in wheat germplasm. These results revealed that these two gene-targeted molecular marker systems can be used as reliable techniques for detecting the levels of DNA polymorphism and genetic relationships.