Complete DNA sequencing provides extensive information about the entire genomic DNA sequence of an organism; however, basic techniques such as convenient genome walking methods are still needed [27]. Genome walking is performed to identify unknown sequences adjacent to known sequences [32] and is particularly useful when sequence information for an organism is limited [27]. The technique can be applied to the identification of transposable elements, insertional mutagenesis sites [16, 19], and retroviruses, and to the cloning of multiple genes in order to study their functions. Other applications include identifying promoters and regulatory elements in genomic DNA [9], filling gaps in genomic sequences, and mapping introns and exons [3]. The only requirement for starting genome walking is the availability of part of the known nucleotide sequence of the genome [14]. Genome walking can use the polymerase chain reaction (PCR) to identify unknown regions [31]; therefore, different PCR variants are used depending on the purpose of the research.
Genome walking can rely on restriction and ligation reactions, as in inverse PCR [29], rolling circle inverse PCR [30], step-down PCR [35], cassette ligation [22], and rapid amplification of genomic ends (RAGE) [4]. Alternatively, it can be primer-based, as in site-finding PCR [26], thermal asymmetric interlaced (TAIL) PCR [17], semi-random primer PCR [10], and linear and exponential TAIL (LE-TAIL) PCR [12]. Flanking-sequence exponential anchored (FLEA) PCR [20] uses random and degenerate primers together with gene-specific primers [11, 33]. Single long primer PCR (SLRA PCR) uses gene-specific primers and a random amplified polymorphic DNA primer for genome walking [15], and the stepwise partially overlapping primer (SWPOP-POP) method relies on a set of primers in which part of each later primer is identical to the 5′ part of the preceding one [2].
The restriction-based method of genome walking requires a preliminary digestion of genomic DNA by restriction enzymes. The cut site should lie at an appropriate distance between the known and unknown regions [14]. A suitable enzyme has a restriction site some distance away from the gene-specific primer, but not so far that subsequent PCR amplification becomes impossible. It is sometimes difficult to choose the correct restriction enzyme because of the lack of sufficient sequence information [21]. The restriction fragments can subsequently be either self-circularized or ligated to designed adaptors. These adaptors are created by modifying the DNA termini and/or by using a double-stranded cassette for connection to the genome [14]. T-linkers are an example of the first case, in which the DNA termini are modified [34]. Adaptors that are ligated separately to the genomic fragments include a blunt-ended double-stranded adaptor whose shorter strand is blocked with an amine group [24]. Other double-stranded adaptors are the vectorette adaptor (a double-stranded adaptor with a mismatched region in the center) [1], the splinkerette adaptor (the same as the vectorette adaptor but with a hairpin structure) [7], an adaptor consisting of a hairpin structure and an A-tail (linked by a phosphorothioate linkage) [18], and the phosphorylated excess base adaptor (a double-stranded adaptor with a sticky end and one excess base) [28]. Despite the wide application of genome walking, the availability and cost of adaptors and the need to import genome walking kits can be barriers for researchers.
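The enzyme-selection criterion described above can be illustrated with a minimal Python sketch. It scans a sequence for the recognition sites of a few common enzymes and keeps those whose nearest site downstream of the gene-specific primer falls inside an amplifiable window. The toy sequence, the primer position, the window limits (`min_dist`, `max_dist`), and the function names are hypothetical and chosen only for illustration; the recognition sequences of EcoRI, BamHI, and HindIII are the standard ones. Because the flanking region is unknown in a real experiment, such a scan can only be run on a reference or test sequence, for example a fully sequenced plasmid.

```python
# Illustrative sketch only: function names, window limits, and the toy
# sequence are assumptions made for this example.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def find_sites(seq, site):
    """Return the 0-based start position of every occurrence of `site` in `seq`."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def candidate_enzymes(seq, primer_end, min_dist, max_dist):
    """Map each enzyme to the distance of its nearest downstream site,
    keeping only enzymes whose nearest site lies within [min_dist, max_dist]."""
    result = {}
    for enzyme, site in RECOGNITION_SITES.items():
        downstream = [p - primer_end for p in find_sites(seq, site) if p >= primer_end]
        if downstream and min_dist <= min(downstream) <= max_dist:
            result[enzyme] = min(downstream)
    return result


# Toy sequence: BamHI site close to the primer, EcoRI site at a moderate
# distance, HindIII site far away.
toy = "ATGC" * 5 + "GGATCC" + "ATGC" * 10 + "GAATTC" + "ATGC" * 100 + "AAGCTT"
print(candidate_enzymes(toy, primer_end=10, min_dist=20, max_dist=200))
```

In this toy case only EcoRI is retained: the nearest BamHI site is too close to the primer and the HindIII site is too far for amplification under the assumed window.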
Recently, several protocols have been developed that combine next-generation sequencing (NGS) technologies with genome walking. These protocols are used to study the insertion mechanisms of retroviruses, gene therapy, and functional genomics [31]. In virtual genome walking, gene models are generated from limited Illumina data, and algorithms can be used to walk through a large number of repeats [8]. Illumina sequencing has also been used for comparative analysis of the chloroplast genome of Stryphnodendron adstringens with related Mimosoid species [5].
More than 53 modified strategies for performing genome walking have been described. Each of these methods relies on a particular design of primers (random primers, specific primers, etc.), ligation steps, and adaptors. Some of these methods are not efficient enough to identify unknown nucleotide sequences and confirm the PCR products. In addition, some methods suffer from low precision caused by low specificity, high cost, long processing times, and limitations in walking across different sequences. In the restriction-based method, the adaptors of common genome walking kits carry only a few restriction sites, which limits the researcher's choice of a suitable enzyme for the genome of interest. Furthermore, the adaptor must be modified by adding 5′ phosphate and 3′ amine groups for the ligation step, which increases handling costs.
Restriction enzymes cut double-stranded DNA; therefore, in this research, a double-stranded adaptor with multiple restriction sites was designed. This makes it possible to select the desired enzyme for the genome, and researchers can also substitute their own restriction sites for the multiple restriction sites of the present adaptor. Additionally, because restriction enzyme digestion of the double-stranded adaptor leaves a 5′ phosphate group, there is no need to add phosphate, amine, or other groups to the adaptor after digestion; the digested adaptor, carrying a 5′ phosphate group, is ready for ligation with any DNA fragment without further modification. In the present study, this strategy was investigated by convenient PCR for the identification of a Momordica charantia target fragment as a model. In addition, the correct performance of the adaptor was verified using the pTZ57R plasmid sequence. The availability of the complete pTZ57R sequence allowed the performance of the designed double-stranded adaptor to be determined accurately.
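The benefit of an adaptor carrying several restriction sites can be sketched as a simple selection rule. The Python example below lists the enzymes whose recognition site is present on an adaptor but absent from the known region, on the assumption that an enzyme cutting inside the known sequence is undesirable. The adaptor sequence, the known-region sequence, and the function name are invented for illustration and are not the adaptor designed in this study; only the enzyme recognition sequences are standard.

```python
# Illustrative sketch: the adaptor and known-region sequences are invented
# for this example and are NOT the sequences used in this study.
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "XbaI": "TCTAGA",
}


def usable_enzymes(adaptor, known_region):
    """Return enzymes whose site is on the adaptor but not in the known region.

    All four recognition sites are palindromic, so checking one strand
    is sufficient here.
    """
    adaptor = adaptor.upper()
    known_region = known_region.upper()
    return sorted(
        name
        for name, site in SITES.items()
        if site in adaptor and site not in known_region
    )


hypothetical_adaptor = "CTAATACGACTCACGAATTCGGATCCAAGCTTCAGG"
hypothetical_known_region = "ATGGCTGGATCCTTGACCGTA"
print(usable_enzymes(hypothetical_adaptor, hypothetical_known_region))
```

With these assumed sequences, BamHI is excluded because its site also occurs in the known region, and XbaI is excluded because the adaptor carries no XbaI site; a researcher could substitute other sites into the adaptor and rerun the same check.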