Jun 03, 2025

MAPIT-seq protocol V.1

  • Qixuan Cheng1,
  • gang xie1,
  • Xiangyu Zhang1,
  • Jie Wang1,
  • Shuangjin Ding1,
  • Yi-Xia Wu1,
  • Ming Shi1,
  • Fei-Fei Duan1,
  • Zi-Li Wan1,
  • Junyu Xiao1,
  • Yangming Wang1
  • 1Peking University
Protocol Citation: Qixuan Cheng, gang xie, Xiangyu Zhang, Jie Wang, Shuangjin Ding, Yi-Xia Wu, Ming Shi, Fei-Fei Duan, Zi-Li Wan, Junyu Xiao, Yangming Wang 2025. MAPIT-seq protocol V.1. protocols.io https://dx.doi.org/10.17504/protocols.io.q26g79ekkvwz/v1
License: This is an open access protocol distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Protocol status: Working
We use this protocol and it's working
Created: June 02, 2025
Last Modified: June 03, 2025
Protocol Integer ID: 219355
Keywords: RNA-binding protein; dual omics; tissue; single cell; RNA isoform
Abstract
RNA-binding proteins (RBPs) are essential regulators of RNA fate and function. A long-standing challenge in studying RBP regulation has been mapping their RNA interactomes within the dynamic transcriptomic landscape, especially in single-cell contexts and primary tissues. Here, we introduce MAPIT-seq (modification added to RBP interacting transcript-sequencing), an antibody-directed RNA editing technology utilizing a fusion protein (pAG-hADAR2dd-rAPOBEC1) to label RBP targets in situ within fixed samples. This methodology integrates formaldehyde fixation, antibody incubation, and RNA deamination, enabling co-profiling of endogenous RBP-RNA interactions without genetic manipulation requirements. We optimize the protocol of MAPIT-seq and demonstrate its robustness and efficiency across multiple RBPs. MAPIT-seq achieves isoform resolution and is readily applied to frozen tissue sections, low-input and single cells, enabling the elucidation of cell stage-specific regulation of RBP. In summary, MAPIT-seq extends the dimensions of multi-omics profiling, offering an effective framework to dissect post-transcriptional regulation across dynamic biological processes and clinically relevant scenarios.
Materials
Instruments

Centrifuge (5810R) Eppendorf
Vortex Mixer (MX-S) Scilogex
Mini Centrifuge (D1008) Scilogex
1.5 mL Centrifuge Tubes (MCT-150-C) Axygen
Thermal Cycler (GE4852T) Biogener
0.2 mL PCR tubes (PCR-0208-C) Axygen
15/50 mL Centrifuge Tubes (352096, 352070) Falcon
PCR Magnet (QYM07A) Quayad

Reagents

1x PBS Phosphate Buffered Saline (Cat. CC008) Macgene
DPBS (Cat. 14190144) Thermo Fisher Scientific
Formaldehyde (Cat. 47608) Sigma-Aldrich, Merck
Protease Inhibitor Cocktail (Cat. B14002) Selleck Chemicals
Dithiobis (succinimidyl propionate) (DSP) (Cat. 22586) Thermo Fisher Scientific
Dimethyl Sulfoxide (DMSO) (Cat. D10999) PSAITONG
Glycine (Cat. G0167) VWR Life Science
5% Digitonin (Cat. N253-YH01-01A) Novoprotein
1 M Dithiothreitol (DTT) (Cat. C6014) Bioss
100 mM PMSF (Cat. P0100) Solarbio
Recombinant RNase Inhibitor (Cat. 2313A) Takara
0.5 M EDTA, pH 8.0 (Cat. SL3069) Coolaber
30% (wt/vol) BSA (Cat. A8577) Sigma-Aldrich
Magzol RNA Extraction Reagent (Cat. R4801-02) Macgene
RiboLock RNase Inhibitor (Cat. EO0382) Thermo Fisher Scientific
BioMag® Plus Concanavalin A (Cat. BP531) Bangs Laboratories
Glycerol (Cat. M152) VWR Life Science
Proteinase K (Cat. 2546) Ambion
VAHTS Universal V6 RNA-seq Library Prep Kit (Cat. NR604) Vazyme
Normal Rabbit IgG (Cat. 2729S) Cell Signaling Technology
Guinea Pig anti-Rabbit IgG Antibody (Cat. ABIN101961) Antibodies-Online
Anti-G3BP1 (Cat. 13057-2-AP) Proteintech
rAPOBEC1-pAG-hADAR2dd (E488Q) fusion protein: The pFastBac-rAPOBEC1-pAG-hADAR2dd plasmid sequences are available in the article. Fusion proteins were stored in a buffer containing 25 mM HEPES, pH 7.5, and 150 mM NaCl supplemented with 5% glycerol, snap-frozen in liquid nitrogen, and kept at -80°C.

Biological materials

Cell suspension. We have used human HeLa (CCL-2, ATCC) and HEK293T (CRL-3216, ATCC) cell lines.

Software

miniconda (https://www.anaconda.com/)
trim-galore (https://github.com/FelixKrueger/TrimGalore): apply adapter and quality trimming to FastQ files
MAPIT (https://github.com/wanglabpku/MAPIT-seq): MAPIT-seq data analysis and processing

MAPIT dependencies

- bwa (v0.7.17)
- hisat2 (v2.2.1)
- python (at least v3.8)
- picard (at least v2.27)
- samtools (at least v1.14)
- gatk4 (at least v4.2)
- bedtools (at least v2.30.0)
- gffread
- reditools2 (https://github.com/BioinfoUNIBA/REDItools2)
- pip (at least v21.2)
- python packages (pybedtools, pysam (at least v0.16.0.1), pandas (at least v1.4.3), numpy (at least v1.22.3), and scipy)

FLARE (https://github.com/YeoLab/FLARE): determine regions of enriched RNA editing
snakemake (v8.27.1): workflow management system
homer2 (v4.11): de novo motif finding

Required data

MAPIT-seq paired-end fastq files can be downloaded from the SRA archive under accession BioProject PRJNA1166181
Human GRCh38 reference genome (https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_40/GRCh38.p13.genome.fa.gz)
Human GRCh38 transcript annotation GTF file (https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_40/gencode.v40.chr_patch_hapl_scaff.annotation.gtf.gz)
Human GRCh38 transcript annotation GFF3 file (https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_40/gencode.v40.chr_patch_hapl_scaff.annotation.gff3.gz)
Human GRCh38 RepeatMasker annotation for interspersed repeats and low complexity DNA sequences (http://hgdownload.cse.ucsc.edu/goldenPath/hg38/database/rmsk.txt.gz)
Human GRCh38 dbSNP VCF file (https://ftp.ncbi.nih.gov/snp/organisms/human_9606_b151_GRCh38p7/VCF/GATK/All_20180418.vcf.gz)
Human GRCh38 Exome Variant Server (EVS) VCF file (http://evs.gs.washington.edu/evs_bulk_data/ESP6500SI-V2-SSA137.GRCh38-liftover.snps_indels.vcf.tar.gz)
Human GRCh38 1000 Genomes Project VCF file (http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000_genomes_project/release/20181203_biallelic_SNV/ALL.wgs.shapeit2_integrated_v1a.GRCh38.20181129.sites.vcf.gz)
Reagent setup
DSP Fixation Buffer
50× DSP Stock Solution (50 mg/mL): Equilibrate DSP to room temperature for 30 min, dissolve 50 mg DSP in 1 mL 100% anhydrous DMSO, aliquot, and store at -80°C.
DSP Fixation Solution: Thaw the stock solution at room temperature for 10 min. Using a P200 pipette, add 490 μL PBS dropwise to 10 μL DSP stock in a 1.5 mL tube while vortexing (important). This gives a 1× working solution of 1 mg/mL DSP.
Note: Filter off any precipitates and use freshly within 5 min.
Critical
Glycine Solution (2.5 M Glycine)
Dissolve 9.38 g glycine in DEPC-treated water to a total volume of 50 mL. Filter with a 0.22 μm membrane.
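The glycine amount above follows from mass = molarity × volume × molar mass. The short Python sketch below reproduces that arithmetic so the recipe can be rescaled to other volumes. It assumes a glycine molar mass of 75.07 g/mol, which the protocol does not state, and the function name is illustrative only.

```python
# Molar mass of glycine (C2H5NO2) in g/mol; assumed value, not given in the protocol.
GLYCINE_G_PER_MOL = 75.07


def grams_needed(molarity_mol_per_l: float, volume_ml: float, molar_mass: float) -> float:
    """Mass of solute (g) needed for a solution of the given molarity and volume."""
    return molarity_mol_per_l * (volume_ml / 1000.0) * molar_mass


if __name__ == "__main__":
    mass = grams_needed(2.5, 50, GLYCINE_G_PER_MOL)
    print(f"{mass:.2f} g glycine for 50 mL of 2.5 M solution")
```

For 50 mL of 2.5 M glycine this evaluates to about 9.38 g, in agreement with the recipe.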
ConA Beads Binding buffer
1 M HEPES-KOH, pH 7.5: 1 mL
1 M KCl: 500 μL
1 M CaCl2: 50 μL
1 M MnCl2: 50 μL
DEPC-treated water: to 50 mL
Store at 4°C.
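To sanity-check a recipe such as the binding buffer above, the final concentration of each component can be derived from its stock concentration and the volume added. The following is a minimal Python sketch; the dictionary simply restates the stock molarities and volumes listed above, and the helper name is hypothetical.

```python
def final_conc_mM(stock_M: float, stock_ul: float, total_ml: float) -> float:
    """Final concentration in mM: (mol/L x uL) gives umol, and umol per mL equals mM."""
    return stock_M * stock_ul / total_ml


# (stock molarity in M, volume added in uL), restated from the recipe above.
BINDING_BUFFER = {
    "HEPES-KOH, pH 7.5": (1.0, 1000.0),
    "KCl": (1.0, 500.0),
    "CaCl2": (1.0, 50.0),
    "MnCl2": (1.0, 50.0),
}

TOTAL_ML = 50.0

if __name__ == "__main__":
    for name, (stock_M, vol_ul) in BINDING_BUFFER.items():
        print(f"{name}: {final_conc_mM(stock_M, vol_ul, TOTAL_ML):g} mM")
```

Under these inputs the buffer works out to 20 mM HEPES-KOH, 10 mM KCl, 1 mM CaCl2, and 1 mM MnCl2.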
Antibody Buffer 1
100 mM PMSF: 5 μL
Protease Inhibitor Cocktail: 5 μL
RiboLock RNase Inhibitor: 6.25 μL
Recombinant RNase Inhibitor: 6.25 μL
5% Digitonin: 1 μL
1 M DTT: 0.5 μL
0.5 M EDTA: 2 μL
30% BSA: 1.67 μL
DPBS: to 500 μL
Prepare fresh. Adjust the digitonin concentration for different cell lines if necessary.
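Because this buffer is made fresh and used at 50 μL per sample, it can be convenient to scale the 500 μL recipe to the number of samples at hand. The sketch below is one way to do that in Python; the 10% pipetting overage is an assumption of this example rather than part of the protocol, and the function name is hypothetical.

```python
# Volumes (uL) per 500 uL batch, restated from the Antibody Buffer 1 recipe above.
ANTIBODY_BUFFER_1_UL = {
    "100 mM PMSF": 5.0,
    "Protease Inhibitor Cocktail": 5.0,
    "RiboLock RNase Inhibitor": 6.25,
    "Recombinant RNase Inhibitor": 6.25,
    "5% Digitonin": 1.0,
    "1 M DTT": 0.5,
    "0.5 M EDTA": 2.0,
    "30% BSA": 1.67,
}
BATCH_UL = 500.0


def scale_recipe(n_samples: int, per_sample_ul: float = 50.0, overage: float = 0.10) -> dict:
    """Scale the recipe to n_samples; DPBS makes up the remaining volume.

    overage is an assumed pipetting excess (10% by default), not a protocol value.
    """
    total_ul = n_samples * per_sample_ul * (1.0 + overage)
    factor = total_ul / BATCH_UL
    scaled = {name: round(vol * factor, 2) for name, vol in ANTIBODY_BUFFER_1_UL.items()}
    additives = sum(vol * factor for vol in ANTIBODY_BUFFER_1_UL.values())
    scaled["DPBS"] = round(total_ul - additives, 2)
    return scaled


if __name__ == "__main__":
    for name, vol in scale_recipe(4).items():
        print(f"{name}: {vol} uL")
```

Setting overage to 0 with 10 samples returns the original 500 μL recipe, which is a quick way to confirm the scaling.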
Antibody Buffer 2
Similar to Antibody Buffer 1, but exclude 0.5 M EDTA and 30% BSA. Make fresh.
High-salt buffer
1 M HEPES, pH 7.5: 1 mL
5 M NaCl: 3 mL
DEPC-treated water: to 50 mL
Storage: Store at 4 °C.
Deaminase-incubation buffer
100 mM PMSF: 5 μL
Protease inhibitor cocktail: 5 μL
RiboLock RNase Inhibitor: 6.25 μL
Recombinant RNase Inhibitor: 6.25 μL
5% Digitonin: 1 μL
1 M DTT: 0.5 μL
High-salt buffer: to 500 μL
Make fresh.
Deamination buffer
1 M HEPES, pH 7.9: 150 μL
1 M KCl: 600 μL
5 M NaCl: 30 μL
80% Glycerol: 625 μL
0.1 mM ZnCl2: 10 μL
DEPC-treated water: to 10 mL
Storage: Store at 4 °C.
Proteinase K digestion buffer
1 M Tris-HCl: 500 μL
5 M NaCl: 1 mL
0.5 M EDTA: 100 μL
10% SDS: 2.5 mL
DEPC-treated water: to 50 mL
Storage: Store at RT.
Equipment setup
Downloading and installing software and dependencies
Most software in the MAPIT pipeline is available through conda, and we recommend installing it that way. Some tools must be installed from source code in their GitHub repositories.
Miniconda (see https://www.anaconda.com/docs/getting-started/miniconda/install)
MAPIT and dependencies
    git clone https://github.com/WangLabPKU/MAPIT-seq
    cd MAPIT-seq
    conda env create -f env.yml -c bioconda -c conda-forge
    conda activate Mapit-seq
    chmod +x Mapit
    conda install trim_galore -c conda-forge
Reditools2 and dependencies
    cd ..
    git clone https://github.com/BioinfoUNIBA/REDItools2
    cd REDItools2
    pip install -r requirements.txt
FLARE and dependencies
    cd ..
    git clone https://github.com/YeoLab/FLARE
    conda install snakemake
Downloading and preparing the required data
Downloading reference sequence and annotation
    cd "your_ref_path"
    wget https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_40/GRCh38.p13.genome.fa.gz
    wget https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_40/gencode.v40.chr_patch_hapl_scaff.annotation.gtf.gz
    wget https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_40/gencode.v40.chr_patch_hapl_scaff.annotation.gff3.gz
RepeatMasker
    wget http://hgdownload.cse.ucsc.edu/goldenPath/hg38/database/rmsk.txt.gz
    gzip -d *
Downloading and processing known variants (SNP) annotation files: dbSNP, 1000Genome, and EVS
    cd "your_ref_path"
    mkdir "your_ref_path"/"genomeVersion_SNP"
    cd "your_ref_path"/"genomeVersion_SNP"
dbSNP
    wget https://ftp.ncbi.nih.gov/snp/organisms/human_9606_b151_GRCh38p7/VCF/GATK/All_20180418.vcf.gz
EVS
    mkdir EVS_split_chr
    wget http://evs.gs.washington.edu/evs_bulk_data/ESP6500SI-V2-SSA137.GRCh38-liftover.snps_indels.vcf.tar.gz
    tar -xvf ESP6500SI-V2-SSA137.GRCh38-liftover.snps_indels.vcf.tar.gz
    for i in {1..22} X Y; do
        awk '{if(substr($1, 1, 1) == "#"){print $0}else if((length($4) == 1) && (length($5) == 1)) {gsub("MT","M");{if($1 ~ "chr") print $0; else print "chr"$0 }}}' ESP6500SI-V2-SSA137.GRCh38-liftover.chr${i}.snps_indels.vcf | gzip > EVS_split_chr/chr${i}.gz
    done
1000Genome
    wget http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000_genomes_project/release/20181203_biallelic_SNV/ALL.wgs.shapeit2_integrated_v1a.GRCh38.20181129.sites.vcf.gz
    mkdir 1000genomes_split_chr
    zcat ALL.wgs.shapeit2_integrated_v1a.GRCh38.20181129.sites.vcf.gz | awk -v dir_SNP=1000genomes_split_chr '{if(substr($1, 1, 1) == "#"){print $0 > "1000genomes_header"}else if((length($4) == 1) && (length($5) == 1)) {gsub("MT","M");{if($1 ~ "chr") print $0 > dir_SNP"/"$1; else print "chr"$0 > dir_SNP"/chr"$1 }}}'
    gzip 1000genomes_split_chr/chr*
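The awk filters used for the EVS and 1000 Genomes files apply the same per-line rule: header lines pass through, only records with single-base REF and ALT alleles are kept, and chromosome names are given a "chr" prefix. For readers less familiar with awk, the rule can be sketched in Python as below. This is an illustration, not part of the MAPIT pipeline, and the function name is hypothetical. One deliberate difference: the awk gsub("MT","M") acts on the whole line, whereas this sketch renames MT only in the chromosome column.

```python
from typing import Optional


def filter_vcf_line(line: str) -> Optional[str]:
    """Return the normalized VCF line to keep, or None if the record is dropped."""
    line = line.rstrip("\n")
    if line.startswith("#"):
        return line  # header lines pass through unchanged
    fields = line.split("\t")
    if len(fields) < 5:
        return None  # malformed record
    ref, alt = fields[3], fields[4]
    if len(ref) != 1 or len(alt) != 1:
        return None  # indels and multi-allelic records are dropped
    chrom = fields[0]
    if chrom == "MT":
        chrom = "M"  # mitochondrial contig becomes chrM after prefixing
    if not chrom.startswith("chr"):
        chrom = "chr" + chrom
    fields[0] = chrom
    return "\t".join(fields)


if __name__ == "__main__":
    print(filter_vcf_line("1\t100\trs1\tA\tG\t.\t.\t."))
```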
Creating configuration file
MAPIT configuration
After downloading the genome sequence (FASTA), genome annotation (GTF and GFF3), RepeatMasker annotation, and known SNP datasets (split by chromosomes), the following command can be used to complete the configuration process. This step includes generating the genome sequence index, extracting gene element annotations, and creating chromosome-split dbSNP VCF files. Upon completion, a configuration file named “GenomeVersion.json” will be generated in the “conf” directory within the MAPIT-seq working directory, which will be used for downstream analyses.
    Mapit config \
        --genomeVersion GRCh38 \
        --genomeFasta "full_path"/GRCh38.p13.genome.fa \
        --species human \
        --outdir "full_path" \
        --gff3 "full_path"/gencode.v40.chr_patch_hapl_scaff.annotation.gff3 \
        --rmsk "full_path"/rmsk.txt \
        --dbSNP "full_path"/GRCh38_SNP/All_20180418.vcf.gz \
        --1000genomesDir "full_path"/GRCh38_SNP/1000genomes_split_chr \
        --EVSEVADir "full_path"/GRCh38_SNP/EVS_split_chr \
        --Reditools "full_path_to_Reditools" \
        --FLARE "full_path_to_FLARE"
FLARE configuration
To run the FLARE pipeline, you will first need a set of files specifying the genomic regions in which cluster identification will occur. Once the script finishes running, you will see a new folder called "genome_name"_regions containing a series of files with increasing indices, i.e., GRCh38_regions_0, GRCh38_regions_1...
    "full_path_to_FLARE"/workflow_FLARE/scripts/generate_regions.py <full/path/to/your/genome/gtf/file> <genome_name>_regions
Homer configuration
To keep analyses standardized, HOMER organizes promoters, genome sequences, and annotation into packages. See http://homer.ucsd.edu/homer/introduction/configure.html for details.
    perl configureHomer.pl -install hg38
Procedure
Cell fixation (40 min–1 h)
Collect 5×10^2–5×10^5 cells or nuclei in centrifuge tubes.
Wash cells with PBS by centrifuging at 250g at RT for 5 min.
Centrifugation
For formaldehyde fixation, dilute 37% formaldehyde in room-temperature PBS to 0.5%.
Fix cells for 5 min at room temperature. Quench with 0.125 M glycine for 5 min.
(Optional) For DSP fixation, add 500 μL DSP Fixation Solution to cells, rotate gently at room temperature for 15 min, mix with pipettes, and incubate for another 15 min. Quench with 1 M Tris-HCl, pH 7.4, for 5 min.
Optional
Centrifuge cells at 4°C, 450g, for 5 min. Wash cells twice with pre-cooled DPBS. Then resuspend the fixed cells in 500 μL DPBS. Fixed cells can be cryopreserved at -80°C with 10% DMSO or processed immediately.
Pause
Note: After fixation, transfer cells to centrifuge tubes pre-washed with 1% BSA. Usage of a swing-bucket rotor is highly recommended to reduce cell loss.
Critical
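The formaldehyde dilution and glycine quench in this section are both C1·V1 = C2·V2 calculations. The Python sketch below works them through. The 1 mL volumes are hypothetical, since the protocol does not fix the fixation volume, and both function names are illustrative. The second function accounts for the volume added by the quench itself.

```python
def stock_for_total(stock_conc: float, final_conc: float, total_ul: float) -> float:
    """Volume of stock (uL) to dilute into a fixed total volume."""
    if final_conc > stock_conc:
        raise ValueError("final concentration cannot exceed stock concentration")
    return final_conc * total_ul / stock_conc


def spike_in(stock_conc: float, final_conc: float, sample_ul: float) -> float:
    """Volume of stock (uL) to add to an existing sample to reach final_conc.

    Solves stock_conc * x / (sample_ul + x) = final_conc for x.
    """
    if final_conc >= stock_conc:
        raise ValueError("final concentration must be below stock concentration")
    return sample_ul * final_conc / (stock_conc - final_conc)


if __name__ == "__main__":
    # 37% formaldehyde diluted to 0.5% in a hypothetical 1 mL of PBS.
    print(f"{stock_for_total(37, 0.5, 1000):.2f} uL of 37% formaldehyde per 1 mL")
    # 2.5 M glycine added to a hypothetical 1 mL fixation to reach 0.125 M.
    print(f"{spike_in(2.5, 0.125, 1000):.2f} uL of 2.5 M glycine per 1 mL")
```

Under these assumed volumes, roughly 13.5 μL of 37% formaldehyde and 52.6 μL of 2.5 M glycine are needed per mL.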
Cell Binding to ConA Beads (20 min)
(Optional) For single-cell MAPIT-seq, the ConA beads-cells coupling step was omitted. Instead, formaldehyde or DSP fixed cells were collected by centrifuging at 650g, 4°C for 5 min in the following experiments.
Optional
Activate ConA beads by introducing 5–10 μL ConA beads to a volume of binding buffer equivalent to 10 times the bead volume. Ensure thorough mixing and subsequently place the tubes on a magnetic stand for 30 s to 2 min to remove the supernatant. Repeat this activation step, and resuspend the beads in their initial volume of binding buffer.
Add the activated ConA beads to the cell suspension and incubate at room temperature under continuous rotation for 10–20 min.
Wash the cell-bead mixture once with DPBS. Transfer the beads to PCR strip tubes.
Note: To mitigate the adhesion of beads to the walls of the centrifuge tubes, it is advisable to pretreat the tubes with 1% BSA.
Critical
Cell permeabilization and antibody Incubation (4.5 h)
Resuspend the cells in 50 μL of pre-mixed Antibody Buffer 1 containing the RBP antibody (or the IgG control antibody for the IgG-MAPIT sample) at a 1:100 dilution.
Note: The concentration of the antibodies can be adjusted following immunofluorescence assays. Mix gently and avoid repetitive pipetting to prevent material loss.
Critical
Incubate the mixture at 4°C under continuous rotation for 3 h.
Incubation
Note: Avoid inversion of the tubes during rotation. It is recommended to mix the suspension intermittently to prevent bead aggregation.
Critical
After a quick spin, place the tubes on a magnetic stand for 30 s to 2 min, then carefully remove the supernatant. Wash the beads once with pre-chilled DPBS on a roller at 4°C for 5 min.
Resuspend the beads in 50 μL Antibody Buffer 2 containing the secondary antibody at a 1:100 dilution, and incubate at 4°C under continuous rotation for 1 h.
Incubation
Deaminase Incubation (1 h)
After a quick spin, place the tubes on a magnetic stand for 30 s to 2 min, then carefully remove the supernatant. Wash the beads once with pre-chilled DPBS on a roller at 4°C for 5 min.
Dilute 1 μg of rAPOBEC1-pAG-hADAR2dd in 50 μL Deaminase Incubation Buffer.
Resuspend the beads in this 50 μL of pre-mixed Deaminase Incubation Buffer and incubate at 4°C under continuous rotation for 1 h.
Incubation
Deamination Reaction (3.5–5 h)
After a quick spin, place the tubes on a magnetic stand for 30 s to 2 min, then carefully remove the supernatant. Wash the beads twice with 100 μL pre-chilled Washing Buffer, which includes High-Salt Buffer with 0.005% Digitonin, on a roller at 4°C for 5 min per wash.
Prepare 100 μL Deamination Buffer supplemented with the following components: 1 μL Protease Inhibitor Cocktail, 1.25 μL RiboLock RNase Inhibitor, 1.25 μL Recombinant RNase Inhibitor, and 0.1 μL 1 M DTT.
Resuspend the beads in 40 μL pre-mixed Deamination Buffer and incubate at 30°C for 3–4 h.
Incubation
Note: To ensure specificity and minimize substantial RNA degradation, maintain a processing time of no more than 5 hours; optimal editing is typically achieved within 4 h.
Critical
Optional: For cells fixed with DSP, reverse crosslinking can be performed by incubating with 50 mM DTT at room temperature for 30 min.
Optional
Following the reaction, place the tubes on a magnetic stand for 30 s to 2 min, then carefully remove the supernatant.
RNA Extraction and Library Construction (10 h)
(Optional) For single-cell MAPIT-seq, cells were washed once, resuspended in DPBS with 0.05% BSA and 1 U/μl RiboLock RNase Inhibitor and subsequently loaded onto the 10x Genomics Chromium platform.
Optional
Resuspend the beads in 100 μL Proteinase K Digestion Buffer and add 1 μL 20 mg/mL Proteinase K. Vortex the mixture briefly and, after a quick spin, digest at 56°C for 1 h.
Note: To prevent bead aggregation, resuspend the samples in Proteinase K digestion buffer and transfer them to a 1.5 mL tube before adding Proteinase K.
Critical
For large samples (> 100,000 cells): Extract total RNA following the standard TRIzol protocol. Use glycogen to facilitate RNA precipitation. For small samples (500–50,000 cells): Isolate mRNA using Oligo d(T)25 Magnetic Beads and proceed with the Smart-seq2 protocol.
Construct sequencing libraries from 50–500 ng of RNA using the VAHTS Universal V6 RNA-seq Library Prep Kit for Illumina. Quantify the concentration of the libraries using Qubit and select 10–100 ng of cDNA for sequencing on an Illumina NovaSeq 6000 platform, employing 150 nt paired-end reads.
MAPIT-seq data quality control (15 min)
Computational step
FASTQC: quality control
    mkdir ./fastqFileQC
    fastqc -t 2 -o ./fastqFileQC "fastq_file_path"/*.fq.gz
Apply adapter and quality trimming
    mkdir ./clean
    trim_galore -j 20 -q 20 --phred33 \
        --stringency 3 --length 20 -e 0.1 \
        --paired "fastq_file_path"/xxx_R1.fq.gz \
        "fastq_file_path"/xxx_R2.fq.gz \
        --gzip -o ./clean --basename ${samplename}_${replicate}
"fastq_file_path"/xxx_R1.fq.gz and "fastq_file_path"/xxx_R2.fq.gz denote the paired-end FASTQ files generated from MAPIT-seq experiments, corresponding to Read 1 and Read 2, respectively. The placeholders ${samplename} and ${replicate} indicate the abbreviated sample name and replicate identifier, respectively.
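When many samples are trimmed, it can help to generate the trim_galore call and the expected output names programmatically. The sketch below builds the same argument list as the command above and derives the `_val_1`/`_val_2` file names that the mapping step takes as input. The sample name, replicate label, and input paths are hypothetical, and the helper only constructs the command; it does not run it.

```python
import shlex
from pathlib import PurePosixPath
from typing import List, Tuple


def trim_galore_cmd(r1: str, r2: str, samplename: str, replicate: str,
                    outdir: str = "./clean", cores: int = 20) -> Tuple[List[str], List[str]]:
    """Return (argument list, expected trimmed FASTQ paths) for one paired-end sample."""
    basename = f"{samplename}_{replicate}"
    cmd = [
        "trim_galore", "-j", str(cores), "-q", "20", "--phred33",
        "--stringency", "3", "--length", "20", "-e", "0.1",
        "--paired", r1, r2,
        "--gzip", "-o", outdir, "--basename", basename,
    ]
    # Paired-end outputs follow the <basename>_val_1 / <basename>_val_2 pattern.
    expected = [str(PurePosixPath(outdir) / f"{basename}_val_{i}.fq.gz") for i in (1, 2)]
    return cmd, expected


if __name__ == "__main__":
    # Hypothetical sample: G3BP1, replicate rep1.
    cmd, outputs = trim_galore_cmd("raw/G3BP1_R1.fq.gz", "raw/G3BP1_R2.fq.gz", "G3BP1", "rep1")
    print(shlex.join(cmd))
    print(outputs)
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once trim_galore is on the PATH.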
Aligning and processing MAPIT-seq reads to reference genome sequence (3 h)
Computational step
Abundant RNA (rRNA, tRNA, and mt-tRNA) filtering and two-round unique mapping
Abundant RNA species, including rRNA, tRNA, and mitochondrial RNA, are computationally removed prior to genome alignment to optimize sequencing read utilization, improve target gene detection sensitivity, and ensure high-quality clean data for downstream analyses. To further enhance the sensitivity and specificity of RNA editing site detection, a two-round unique mapping strategy is employed. HISAT2 is used in the first round to capture transcriptomic complexity, such as splicing events and RNA modifications, while BWA-MEM is applied in the second round to map genomic regions of editing sites using the reads left unmapped in the first round. The alignments from both rounds are then merged to generate a final BAM file for downstream analysis.
    Mapit mapping -v $genomeVersion \
        --fq clean/${samplename}_${replicate}_val_1.fq.gz \
        --fq2 clean/${samplename}_${replicate}_val_2.fq.gz \
        --rna-strandness {F,R,FR,RF} \
        -n ${samplename} -r ${replicate} -o ${outpath} -t ${threads}
"$genomeVersion" refers to the configuration file for the specific genome assembly version created in the “Equipment setup” section. "--fq" and "--fq2" specify the trimmed paired-end FASTQ files used as input for mapping. The "--rna-strandness" parameter defines the strand-specific information required by HISAT2 during alignment (refer to https://ccb.jhu.edu/software/hisat2/manual.shtml for details). This step uses the specified number of computational threads (${threads}) and automatically generates four sub-directories within the designated output directory (${outpath}). Intermediate mapping files and the final merged BAM files are stored within these sub-directories during the alignment process.
Fine-tuning alignments for RNA editing discovery
Following the GATK Best Practices workflow for RNA-seq short variant discovery (https://gatk.broadinstitute.org/hc/en-us/articles/360035531192-RNAseq-short-variant-discovery-SNPs-Indels), we customized and integrated key pre-processing steps (MarkDuplicates, SplitNCigarReads, and Base Quality Score Recalibration (BQSR)) into a single streamlined command.
    Mapit finetuning -v $genomeVersion \
        -n ${samplename} -r ${replicate} -o ${outpath}
This integrated procedure enables efficient data cleanup and fine-tuning for RNA editing detection. It operates directly on the designated output directory (${outpath}) specified in Step 26 to process the final merged BAM file of "${samplename}_${replicate}".
RNA editing analysis (4–6 h)
Computational step
Calling editing sites and filtering out known SNPs
This step uses the fine-tuned BAM files to call single nucleotide variants (SNVs) with GATK HaplotypeCaller, followed by filtering out known SNPs based on previously curated SNP datasets. The remaining A-to-G or C-to-U editing sites are then annotated using genome annotation files.
    Mapit callediting -v $genomeVersion \
        --sampleList ${sample1},${sample2},…,${sampleN} \
        -o ${outpath} --prefix ${prefix} \
        --enzyme {ADAR,APOBEC,Both} -t ${threads}
This step allows for the simultaneous analysis of multiple samples. By providing a comma-separated list of sample names in the "--sampleList" parameter, replicate IDs from multiple samples in the designated output directory (${outpath}) specified in Step 26–27 can be processed together. The "--enzyme" parameter accepts one of three values (ADAR, APOBEC, or Both) and specifies the editing enzyme introduced in the MAPIT-seq samples. After filtering, editing information is written to the "${outpath}/6-Edit_calling" directory with ${prefix} as the file prefix.
Differential editing analysis between RBP-MAPIT and IgG-MAPIT
To identify RBP targets, this step applies a Wilcoxon signed-rank test to the ‘editing index’ within continuous, non-overlapping 50-nt windows of transcripts, comparing RBP-MAPIT and IgG-MAPIT samples.
    Mapit calltargets -v $genomeVersion \
        -i ${outpath}/6-Edit_calling/${prefix}/${prefix}_Edit_DATA.tsv \
        --treatName ${treatSampleName} --controlName ${ctrlSampleName} \
        -o ${outpath} -c ${coverage} -l ${level}
"${treatSampleName}" and "${ctrlSampleName}" are the abbreviated sample names of RBP-MAPIT and IgG-MAPIT in the designated output directory (${outpath}) specified in Step 26–28. "${coverage}" is the minimum read coverage for an effective editing site (default 10), and "${level}" is the level at which the differential editing analysis is performed. The step writes the "Differential editing analysis" results to the "${outpath}/6-Edit_calling/${prefix}_${treatSampleName}/" directory.
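To make the windowing idea above concrete, the following Python sketch bins editing sites into non-overlapping 50-nt windows along each transcript and computes a per-window editing index. It is an illustration only, not the Mapit implementation. Two things are assumptions of this sketch: the editing index is taken as edited reads divided by covered reads, pooled over the sites in a window, and windows are anchored at transcript position 0.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (transcript_id, 0-based position, edited_reads, total_reads)
Site = Tuple[str, int, int, int]


def window_editing_index(sites: Iterable[Site], window: int = 50,
                         min_coverage: int = 10) -> Dict[Tuple[str, int], float]:
    """Pool sites into fixed windows and return edited/total reads per window.

    Sites with fewer than min_coverage reads are ignored (default 10,
    matching the default coverage threshold mentioned above).
    """
    edited = defaultdict(int)
    total = defaultdict(int)
    for transcript, pos, n_edited, n_total in sites:
        if n_total < min_coverage:
            continue
        key = (transcript, pos // window)  # window index along the transcript
        edited[key] += n_edited
        total[key] += n_total
    return {key: edited[key] / total[key] for key in total}


if __name__ == "__main__":
    demo = [("tx1", 10, 2, 20), ("tx1", 40, 3, 30), ("tx1", 60, 10, 50), ("tx1", 70, 1, 5)]
    print(window_editing_index(demo))
```

Per-window values from matched RBP-MAPIT and IgG-MAPIT samples could then be compared with a paired test such as `scipy.stats.wilcoxon`. How Mapit pairs windows and replicates is not described here, so that part is left out of the sketch.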
Identifying high-confidence editing clusters and RBP binding motifs (6–8 h)
Computational step
Identifying A-to-G and C-to-U editing using a modified SAILOR workflow for MAPIT
    Mapit prepare -v $genomeVersion -n ${samplename} -r ${replicate} \
        -o ${outpath}
This step prepares the files needed for the SAILOR and FLARE workflows.
    Mapit SAILOR -v $genomeVersion -n ${samplename} -r ${replicate} \
        -c ${coverage} -o ${outpath} -t ${threads}
This step integrates the SAILOR workflow with the MAPIT data and result directories. "${coverage}" is the minimum read coverage for an effective editing site (default 10).
Identifying A-to-G or C-to-U editing clusters using FLARE
    Mapit FLARE -v $genomeVersion -e {AG,CT} -n ${samplename} \
        -r ${replicate} --regions ${regions} -o ${outpath} -t ${threads}
This step integrates the FLARE workflow with the MAPIT data and result directories. "${regions}", created in the configuration step, is the directory of genomic regions in which cluster identification will occur.
Identifying high-confidence editing clusters
    Mapit hc_cluster -v $genomeVersion -n ${samplename} -o ${outpath} \
        -s ${sloplength}
This step identifies C-to-U or A-to-G editing clusters shared between two replicates. High-confidence MAPIT editing clusters are the intersection of the two types of editing clusters and are extended or shrunk to windows of "${sloplength}" nt.
De novo RBP binding motif finding
    hc_cluster_path=${outpath}/7-FLARE_for_MAPIT/confident_clusters
    hc_cluster_bed=${hc_cluster_path}/${samplename}/${samplename}_${slop}.bed
    output_motif=${hc_cluster_path}/${samplename}/${samplename}.motif
    findMotifsGenome.pl $hc_cluster_bed hg38 $output_motif \
        -noknown -rna -size given -len 5,6,7,8 -p ${threads}
This step uses the high-confidence MAPIT editing clusters for de novo RBP binding motif discovery.
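The high-confidence cluster step described above combines an interval intersection between replicates with a resize to a fixed window length. The Python sketch below illustrates both operations on BED-like intervals. It is not the Mapit hc_cluster implementation: the center-anchored resize is an assumption of this example, and the function names are hypothetical.

```python
from typing import List, Tuple

# (chrom, start, end, strand), 0-based half-open coordinates as in BED.
Interval = Tuple[str, int, int, str]


def intersect(rep1: List[Interval], rep2: List[Interval]) -> List[Interval]:
    """Return the overlapping segments of same-chromosome, same-strand intervals."""
    shared = []
    for chrom1, start1, end1, strand1 in rep1:
        for chrom2, start2, end2, strand2 in rep2:
            if chrom1 != chrom2 or strand1 != strand2:
                continue
            start, end = max(start1, start2), min(end1, end2)
            if start < end:  # keep only non-empty overlaps
                shared.append((chrom1, start, end, strand1))
    return shared


def resize(interval: Interval, length: int) -> Interval:
    """Extend or shrink an interval to a fixed length around its midpoint (assumed rule)."""
    chrom, start, end, strand = interval
    center = (start + end) // 2
    new_start = max(0, center - length // 2)
    return (chrom, new_start, new_start + length, strand)


if __name__ == "__main__":
    rep1 = [("chr1", 100, 200, "+")]
    rep2 = [("chr1", 150, 260, "+"), ("chr1", 300, 350, "+")]
    shared = intersect(rep1, rep2)
    print(shared)
    print([resize(iv, 30) for iv in shared])
```

In practice bedtools, which is already a MAPIT dependency, performs these operations far more efficiently; the nested loop here is only meant to show the logic.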