Primer Design Workflow Using PABLOG

Shashank Saini; Barbara Micalizzi; Paola Frazzetto; Michele Ferrigno; Andrey Prjibelski; Giuseppe Diego Puglia

Jul 01, 2025

DOI

https://dx.doi.org/10.17504/protocols.io.ewov1mwmovr2/v1

Shashank Saini¹,
Barbara Micalizzi²,
Paola Frazzetto²,
Michele Ferrigno²,
Andrey Prjibelski³,
Giuseppe Diego Puglia⁴

¹CNR;
²University of Catania;
³University of Helsinki;
⁴National Research Council of Italy

Barbara Micalizzi

University of Catania

External link: https://github.com/tools4plant-omics/PABLOG

Protocol Citation: Shashank Saini, Barbara Micalizzi, Paola Frazzetto, Michele Ferrigno, Andrey Prjibelski, Giuseppe Diego Puglia 2025. Primer Design Workflow Using PABLOG. protocols.io https://dx.doi.org/10.17504/protocols.io.ewov1mwmovr2/v1

Manuscript citation:

Ferrigno, M., Frazzetto, P., Prjibelski, A., Tomescu, A.I. & Puglia, G.D. (2024) PABLOG: a Primer Analysis tool using a Bee-Like approach on Orthologous Genes. Physiologia Plantarum, 176(3), e14398. https://doi.org/10.1111/ppl.14398

License: This is an open access protocol distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited

Protocol status: Working

We use this protocol and it's working

Created: May 21, 2025

Last Modified: July 01, 2025

Protocol Integer ID: 218669

Keywords: non-model species, oligos from RNA-seq, primer design workflow, design accurate primer, using orthologous gene sequence, target orthologous gene sequence, pcr reaction, orthologous gene sequence, primer, accurate primer, primers at the exon, gene sequence to the transcriptome, transcriptome, transcriptome data, pcr, gene sequence, using pablog, dna gene sequence

Abstract

With PABLOG, you can effortlessly design primers at the exon-junction regions for accurate qRT-PCR reactions in non-model species using orthologous gene sequences. It requires a target orthologous gene sequence (.fasta) and transcriptome data (.fastq) and that's it! Through this protocol, you will be guided to (1) install PABLOG and the accessory tools, (2) fetch the DNA gene sequence, (3) fetch and prepare transcriptome data, (4) align the gene sequence to the transcriptome, and (5) design accurate primers and inspect the results.

Guidelines

PABLOG (Primer Analysis tool using a Bee-Like approach on Orthologous Genes) is a customizable primer design tool that is especially well suited to non-model organisms. It offers built-in integration with BLAST to ensure primer specificity; support for annotation files (GFF/GTF) and sequence files (FASTA); targeting of specific gene features such as exons, introns, or coding sequences (CDS); and flexible parameter settings (e.g., primer length, Tm, GC content). PABLOG is ideal when working with partially annotated genomes or transcriptomes, where commercial software often falls short or lacks specificity.

Materials

To run this workflow, you need a Unix-like shell (via WSL for Windows users) and a genome visualization tool such as IGV.

Windows Subsystem for Linux (WSL), for Windows users

If you are using Windows, install the Windows Subsystem for Linux (WSL), which lets you run a Linux terminal directly in Windows. Installation instructions: https://learn.microsoft.com/en-us/windows/wsl/install
After installing WSL, open Ubuntu (or your chosen Linux distribution) from the terminal and install tools such as minimap2 and samtools using Conda or apt.

Integrative Genomics Viewer (IGV)

IGV is a desktop application for exploring large genomic datasets interactively. It is useful for visualizing alignments (e.g., BAM files), annotations, and the locations of designed primers. Download IGV: https://software.broadinstitute.org/software/igv/download
Once downloaded, launch the application and load your reference FASTA and alignment (e.g., BAM) files for interactive inspection.

Optional but recommended

A modern text editor such as VS Code (https://code.visualstudio.com/) or Sublime Text for editing FASTA/GTF/GFF files.
Python 3.8+ and Jupyter Notebook (or JupyterLab) installed in a Conda environment.

1. Pre-installation requirements

Download the Anaconda installer for your operating system (Linux or WSL, macOS) from https://repo.anaconda.com/archive/

Windows users can first install WSL from cmd or PowerShell and then install Anaconda inside it by running the following commands.
Note: If you run the installer interactively (without the -b option), answer "yes" at the final prompt asking whether to initialize Anaconda, so that the (base) environment is activated whenever a terminal is opened.

$ wsl --install

$ wget https://repo.anaconda.com/archive/Anaconda3-2023.09-0-Linux-x86_64.sh

$ bash Anaconda3-2023.09-0-Linux-x86_64.sh -b -p $HOME/anaconda3

Clone the PABLOG GitHub repository

$ git clone https://github.com/tools4plant-omics/PABLOG.git

$ cd PABLOG              

Note: Make sure you are in the PABLOG directory before creating the conda environment for PABLOG.

Create the conda environment and activate it

$ conda create -n PABLOG python=3.9
# You can name the environment as you like; here it is called PABLOG

$ conda activate PABLOG

Install the required packages for PABLOG to work

$ pip install -r requirements.txt

# The requirements file is in the cloned directory. It installs the following packages:

# HTSeq (2.0.4)
# numpy (1.26.1)
# pandas (2.1.1)
# primer3-py (2.0.1)
# pysam (0.22.0)
# tqdm (4.65.0)
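
After installation, it can be useful to confirm that the environment contains the versions listed above. The following is a minimal sketch, not part of PABLOG: the names `EXPECTED`, `installed_version` and `check_requirements` are hypothetical, and the version numbers are copied from the list in this protocol.

```python
from importlib import metadata

# Versions listed in this protocol for requirements.txt.
EXPECTED = {
    "HTSeq": "2.0.4",
    "numpy": "1.26.1",
    "pandas": "2.1.1",
    "primer3-py": "2.0.1",
    "pysam": "0.22.0",
    "tqdm": "4.65.0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_requirements(expected, lookup=installed_version):
    """Map each package to (expected, found) wherever the two differ."""
    problems = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found != wanted:
            problems[name] = (wanted, found)
    return problems


if __name__ == "__main__":
    for name, (wanted, found) in check_requirements(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

Run it inside the activated PABLOG environment; no output means every listed package matches.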

2. Obtain the Gene Sequence

1. Download the CDS sequence of your target gene(s) from the nearest phylogenetic neighbour. In this case, the candidates are Tagetes erecta, Chrysanthemum morifolium, Rudbeckia hirta or Helianthus annuus.

2. Run BLASTn against the reference genome available on the NCBI website (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch&PROG_DEF=blastn&BLAST_SPEC=GDH_GCA_025667525.1) and download the genomic sequence of the resulting hits (check the E-value, similarity and alignment).

3. Inspect the gene sequence in the MSA viewer or graphics viewer and download the sequence in the visible range.
Alternatively, when multiple alignments to the same CDS region are shown, the gene sequence can be downloaded manually: note the aligned coordinates on the linkage group (LG), sort them in ascending order in Excel, take the first and last coordinates, click on the accession, enter them in the coordinate fields and download the .fasta file.
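
The manual coordinate step can also be done without a spreadsheet. The sketch below assumes each BLAST hit has been noted as a (start, end) pair on the same linkage group; the function name `region_span` and the example coordinates are hypothetical.

```python
def region_span(hits):
    """Return (start, end) covering all hits; each hit is a (start, end) pair.

    Hits on the minus strand may be reported with start > end, so every
    pair is normalised before sorting.
    """
    if not hits:
        raise ValueError("no BLAST hits supplied")
    normalised = sorted((min(s, e), max(s, e)) for s, e in hits)
    start = normalised[0][0]
    end = max(e for _, e in normalised)
    return start, end


# Made-up subject coordinates of three hits on one linkage group.
example_hits = [(1200, 1650), (300, 820), (2400, 2050)]
print(region_span(example_hits))  # (300, 2400)
```

The two returned values are the ones to enter in the coordinate fields before downloading the .fasta file.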

3. Transcriptome Preparation

Step 1: Open a terminal in your home directory (Windows users: type wsl to enter Linux) and download the SRA Toolkit.

$ wget https://ftp-trace.ncbi.nlm.nih.gov/sra/sdk/current/sratoolkit.current-ubuntu64.tar.gz

Step 2: Extract the downloaded tar file and install the SRA Toolkit

$ tar -vxzf sratoolkit.current-ubuntu64.tar.gz

$ sudo apt install sra-toolkit

Step 3: Add the bin directory of the extracted toolkit to your PATH (make sure you are in the directory where you downloaded the SRA Toolkit)

$ export PATH="$PATH:$PWD/sratoolkit.<version>-ubuntu64/bin"     # Replace <version> with the version number in the extracted directory name

Step 4: Configure the toolkit

$ vdb-config --interactive

Step 5: Download your sequence from NCBI

$ prefetch SRR343122

Step 6: Convert your sequence to FASTQ

$ fasterq-dump SRR343122                                    # For paired-end data you will get two files

Step 7: Join the two files with the cat command

$ cat SRR343122_1.fastq SRR343122_2.fastq > SRR343122.fastq

Step 8: Make a .tar.gz archive of the joined file

$ tar -czvf SRR343122.fastq.tar.gz SRR343122.fastq
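
Before archiving, it may be worth checking that the joined file really contains the reads of both input files. This is a minimal sketch assuming uncompressed FASTQ files with four lines per read; the function names are hypothetical and not part of PABLOG or the SRA Toolkit.

```python
def count_fastq_records(path):
    """Count reads in an uncompressed FASTQ file (four lines per read)."""
    with open(path, "r") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


def joined_is_complete(part_paths, joined_path):
    """True if the joined file holds as many reads as the parts combined."""
    expected = sum(count_fastq_records(p) for p in part_paths)
    return count_fastq_records(joined_path) == expected
```

For the files in this protocol, the call would be `joined_is_complete(["SRR343122_1.fastq", "SRR343122_2.fastq"], "SRR343122.fastq")`.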

4. Arrange the files

Place the archived transcriptome file (SRR343122.fastq.tar.gz) and the downloaded gene sequence in the same folder.

5. Activate the conda environment

$ conda activate PABLOG

Align the transcriptome reads to the gene sequence (.fasta) with minimap2, which writes the alignment to a .sam file:

$ minimap2 -ax splice -k14 -w 4 input_seq.fasta SRR343122.fastq.tar.gz > output_alignment.sam

Convert the .sam alignment to a sorted .bam file using samtools

$ samtools view -bS output_alignment.sam | samtools sort -o output_alignment.bam

Create the .bai index from the .bam file alignment

$ samtools index output_alignment.bam output_alignment.bam.bai
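
A quick look at the .sam file can show whether any reads aligned to the gene before moving on. The sketch below is not part of PABLOG; it counts alignment records using only the FLAG field defined in the SAM format, and the function name `count_sam_records` is hypothetical. Secondary and supplementary records are counted as mapped.

```python
def count_sam_records(lines):
    """Return (mapped, unmapped) counts from an iterable of SAM lines."""
    mapped = unmapped = 0
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        flag = int(line.split("\t")[1])
        if flag & 0x4:  # bit 0x4 marks an unmapped read in the SAM format
            unmapped += 1
        else:
            mapped += 1
    return mapped, unmapped
```

Usage: `with open("output_alignment.sam") as handle: print(count_sam_records(handle))`. A mapped count of zero suggests that the gene sequence and the transcriptome do not match.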

6. View the files in IGV to inspect the alignment. IGV can also be used to see the positions of the primers generated by PABLOG.

# Open IGV
$ igv

In IGV:
Open the Genomes menu, choose Load Genome from File and select the gene file (the full gene .fasta sequence, including introns, for which primers will be designed).
Open the File menu, choose Load from File and select the .bam file.

7. Copy the files output_alignment.bam and input_seq.fasta to the 'PABLOG' directory.

$ cp output_alignment.bam /path/to/PABLOG_directory 
$ cp input_seq.fasta /path/to/PABLOG_directory

8. Open a terminal in the PABLOG directory and activate the PABLOG environment

$ conda activate PABLOG

Run PABLOG

$ python pablog.py output_alignment.bam input_seq.fasta result_primers.txt 60

# The final argument is set to 60 by default. You can lower it to make PABLOG's CIGAR score verification more relaxed.
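
How PABLOG scores CIGAR strings is not described in this protocol. As background only, the sketch below illustrates how a spliced alignment produced by minimap2 in splice mode encodes introns as N operations in the CIGAR string, which is where exon junctions come from. The function name `splice_junctions` is hypothetical and this is not PABLOG's implementation.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference sequence.
REF_CONSUMING = set("MDN=X")


def splice_junctions(pos, cigar):
    """Return (intron_start, intron_end) pairs, 1-based inclusive, per N op.

    pos is the 1-based leftmost reference position from the SAM POS field.
    """
    junctions = []
    ref = pos
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            junctions.append((ref, ref + length - 1))
        if op in REF_CONSUMING:
            ref += length
    return junctions


print(splice_junctions(100, "50M200N30M"))  # [(150, 349)]
```

In this example the read matches 50 bases from position 100, skips a 200-base intron spanning positions 150 to 349, and then matches 30 more bases.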

Results

Target Region absolute coordinates identified by PABLOG: CM046877.1:108370460-108374317

Local nucleotide positions selected for designing primers:
Start: 552
End: 2001
Goodness Score: 68.81

Oligos details:
OLIGO         Start  Length  Tm (°C)  GC%    Any_th  3’_th  Hairpin  Sequence
LEFT PRIMER   533    24      56.51    37.50  14.55   0.00   0.00     CTTCAGAAGGAGCTCATTAGATTT
RIGHT PRIMER  658    23      59.48    47.83  0.00    0.00   0.00     CTGGAGGAGGGTGTTGATTCTAA
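
The Length and GC% columns can be recomputed directly from the primer sequences as a sanity check. This is a minimal sketch using the two sequences reported above; the helper name `gc_percent` is hypothetical.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a primer sequence."""
    seq = seq.upper()
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)


primers = {
    "LEFT PRIMER": "CTTCAGAAGGAGCTCATTAGATTT",
    "RIGHT PRIMER": "CTGGAGGAGGGTGTTGATTCTAA",
}
for name, seq in primers.items():
    # Prints the name, length and GC% rounded to two decimals.
    print(name, len(seq), round(gc_percent(seq), 2))
```

The printed lengths (24 and 23) and GC percentages (37.5 and 47.83) agree with the values in the table.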
 

Final check

Review primer alignment in IGV.

Check this article for more information: https://doi.org/10.1111/ppl.14398

Protocol references

Ferrigno, M., Frazzetto, P., Prjibelski, A., Tomescu, A.I. & Puglia, G.D. (2024) PABLOG: a Primer Analysis tool using a Bee-Like approach on Orthologous Genes. Physiologia Plantarum, 176(3), e14398. Available from: https://doi.org/10.1111/ppl.14398
