MAPIT-seq protocol V.1

Qixuan Cheng; Gang Xie; Xiangyu Zhang; Jie Wang; Shuangjin Ding; Yi-Xia Wu; Ming Shi; Fei-Fei Duan; Zi-Li Wan; Junyu Xiao; Yangming Wang

Jun 02, 2025

DOI

dx.doi.org/10.17504/protocols.io.q26g79ekkvwz/v1

Qixuan Cheng¹,
Gang Xie¹,
Xiangyu Zhang¹,
Jie Wang¹,
Shuangjin Ding¹,
Yi-Xia Wu¹,
Ming Shi¹,
Fei-Fei Duan¹,
Zi-Li Wan¹,
Junyu Xiao¹,
Yangming Wang¹

¹Peking University

Protocol Citation: Qixuan Cheng, Gang Xie, Xiangyu Zhang, Jie Wang, Shuangjin Ding, Yi-Xia Wu, Ming Shi, Fei-Fei Duan, Zi-Li Wan, Junyu Xiao, Yangming Wang 2025. MAPIT-seq protocol V.1. protocols.io https://dx.doi.org/10.17504/protocols.io.q26g79ekkvwz/v1

License: This is an open access protocol distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited

Protocol status: Working

We use this protocol and it's working

Created: June 02, 2025

Last Modified: June 02, 2025

Protocol Integer ID: 219355

Keywords: RNA-binding protein; dual omics; tissue; single cell; RNA isoform, rna interactomes within the dynamic transcriptomic landscape, essential regulators of rna fate, dynamic transcriptomic landscape, rna editing technology, directed rna editing technology, rna interactions without genetic manipulation requirement, rna interactome, rna fate, rna interaction, utilizing fusion protein, rna, fusion protein, rna deamination, interacting transcript, rbp targets in situ, specific regulation of rbp, sequencing, binding protein, rbp target, rbp regulation, antibody incubation, studying rbp regulation, multiple rbp, antibody, protein, mapit, omics profiling

Abstract

RNA-binding proteins (RBPs) are essential regulators of RNA fate and function. A long-standing challenge in studying RBP regulation has been mapping their RNA interactomes within the dynamic transcriptomic landscape, especially in single-cell contexts and primary tissues. Here, we introduce MAPIT-seq (modification added to RBP interacting transcript-sequencing), an antibody-directed RNA editing technology that uses a fusion protein (pAG-hADAR2dd-rAPOBEC1) to label RBP targets in situ within fixed samples. This methodology integrates formaldehyde fixation, antibody incubation, and RNA deamination, enabling co-profiling of endogenous RBP-RNA interactions without requiring genetic manipulation. We optimize the MAPIT-seq protocol and demonstrate its robustness and efficiency across multiple RBPs. MAPIT-seq achieves isoform resolution and is readily applied to frozen tissue sections, low-input samples, and single cells, enabling the elucidation of cell stage-specific regulation of RBP. In summary, MAPIT-seq extends the dimensions of multi-omics profiling, offering an effective framework to dissect post-transcriptional regulation across dynamic biological processes and clinically relevant scenarios.

Materials

Instruments

Centrifuge (5810R) Eppendorf
Vortex Mixer (MX-S) Scilogex
Mini Centrifuge (D1008) Scilogex
1.5 mL Centrifuge Tubes (MCT-150-C) Axygen
Thermal Cycler (GE4852T) Biogener
0.2 mL PCR tubes (PCR-0208-C) Axygen
15/50 mL Centrifuge Tubes (352096, 352070) Falcon
PCR Magnet (QYM07A) Quayad

Reagents

1x PBS Phosphate Buffered Saline (Cat. CC008) Macgene
DPBS (Cat. 14190144) Thermo Fisher Scientific
Formaldehyde (Cat. 47608) Sigma-Aldrich, Merck
Protease Inhibitor Cocktail (Cat. B14002) Selleck Chemicals
Dithiobis(succinimidyl propionate) (DSP) (Cat. 22586) Thermo Fisher Scientific
Dimethyl Sulfoxide (DMSO) (Cat. D10999) PSAITONG
Glycine (Cat. G0167) VWR Life Science
5% Digitonin (Cat. N253-YH01-01A) Novoprotein
1 M Dithiothreitol (DTT) (Cat. C6014) Bioss
100 mM PMSF (Cat. P0100) Solarbio
Recombinant RNase Inhibitor (Cat. 2313A) Takara
0.5 M EDTA, pH 8.0 (Cat. SL3069) Coolaber
30% (wt/vol) BSA (Cat. A8577) Sigma-Aldrich
Magzol RNA Extraction Reagent (Cat. R4801-02) Macgene
RiboLock RNase Inhibitor (Cat. EO0382) Thermo Fisher Scientific
BioMag® Plus Concanavalin A (Cat. BP531) Bangs Laboratories
Glycerol (Cat. M152) VWR Life Science
Proteinase K (Cat. 2546) Ambion
VAHTS Universal V6 RNA-seq Library Prep Kit (Cat. NR604) Vazyme
Normal Rabbit IgG (Cat. 2729S) Cell Signaling Technology
Guinea Pig anti-Rabbit IgG Antibody (Cat. ABIN101961) Antibodies-Online
Anti-G3BP1 (Cat. 13057-2-AP) Proteintech
rAPOBEC1-pAG-hADAR2dd (E488Q) fusion protein: The pFastBac-rAPOBEC1-pAG-hADAR2dd plasmid sequences are available in the article. Fusion proteins were stored in a buffer containing 25 mM HEPES, pH 7.5, and 150 mM NaCl supplemented with 5% glycerol, snap-frozen in liquid nitrogen, and kept at -80°C.

Biological materials

Cell suspension. We have used human HeLa (CCL-2, ATCC) and HEK293T (CRL-3216, ATCC) cell lines.

Software

miniconda (https://www.anaconda.com/)
trim-galore (https://github.com/FelixKrueger/TrimGalore): apply adapter and quality trimming to FastQ files
MAPIT (https://github.com/wanglabpku/MAPIT-seq): MAPIT-seq data analysis and processing

MAPIT dependencies

- bwa (v0.7.17)
- hisat2 (v2.2.1)
- python (at least v3.8)
- picard (at least v2.27)
- samtools (at least v1.14)
- gatk4 (at least v4.2)
- bedtools (at least v2.30.0)
- gffread
- reditools2 (https://github.com/BioinfoUNIBA/REDItools2)
- pip (at least v21.2)
- python packages (pybedtools, pysam (at least v0.16.0.1), pandas (at least v1.4.3), numpy (at least v1.22.3), and scipy)

FLARE (https://github.com/YeoLab/FLARE): determine regions of enriched RNA editing
snakemake (v8.27.1): workflow management system
homer2 (v4.11): de novo motif finding

Required data

MAPIT-seq paired-end fastq files can be downloaded from the SRA archive under accession BioProject PRJNA1166181
Human GRCh38 reference genome (https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_40/GRCh38.p13.genome.fa.gz)
Human GRCh38 transcript annotation GTF file (https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_40/gencode.v40.chr_patch_hapl_scaff.annotation.gtf.gz)
Human GRCh38 transcript annotation GFF3 file (https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_40/gencode.v40.chr_patch_hapl_scaff.annotation.gff3.gz)
Human GRCh38 RepeatMasker annotation for interspersed repeats and low complexity DNA sequences (http://hgdownload.cse.ucsc.edu/goldenPath/hg38/database/rmsk.txt.gz)
Human GRCh38 dbSNP VCF file (https://ftp.ncbi.nih.gov/snp/organisms/human_9606_b151_GRCh38p7/VCF/GATK/All_20180418.vcf.gz)
Human GRCh38 Exome Variant Server (EVS) VCF file (http://evs.gs.washington.edu/evs_bulk_data/ESP6500SI-V2-SSA137.GRCh38-liftover.snps_indels.vcf.tar.gz)
Human GRCh38 1000 Genomes Project VCF file (http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000_genomes_project/release/20181203_biallelic_SNV/ALL.wgs.shapeit2_integrated_v1a.GRCh38.20181129.sites.vcf.gz)

Reagent setup

DSP Fixation Buffer

50× DSP Stock Solution (50 mg/mL): Equilibrate DSP to room temperature for 30 min, dissolve 50 mg DSP in 1 mL 100% anhydrous DMSO, aliquot, and store at -80°C.

DSP Fixation Solution: Thaw the stock solution at room temperature for 10 min. Add 490 μL PBS dropwise to 10 μL DSP stock in a 1.5 mL tube using a P200 pipette while vortexing (important).

Note: Filter out any precipitate and use the solution within 5 min of preparation.

Glycine Solution (2.5 M Glycine)

Dissolve 9.38 g glycine in DEPC-treated water to a total volume of 50 mL. Filter with a 0.22 μm membrane.
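
The mass above follows from concentration × volume × molar mass. A minimal sketch of that arithmetic, assuming a molar mass of 75.07 g/mol for glycine (a value not stated in this protocol); the function name `grams_needed` is hypothetical:

```shell
#!/bin/sh
# grams_needed MOLARITY(mol/L) VOLUME_ML MOLAR_MASS(g/mol)
# Prints the mass of solute to weigh out, rounded to two decimals.
grams_needed() {
    awk -v m="$1" -v v="$2" -v mw="$3" \
        'BEGIN { printf "%.2f\n", m * (v / 1000) * mw }'
}

# 2.5 M glycine in 50 mL; 75.07 g/mol is an assumed molar mass for glycine.
grams_needed 2.5 50 75.07   # prints 9.38
```

The same helper can be reused when a different final volume is needed.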

ConA Beads Binding buffer

 
Component	Volume
1 M HEPES-KOH, pH 7.5	1 mL
1 M KCl	500 μL
1 M CaCl2	50 μL
1 M MnCl2	50 μL
DEPC-treated water	to 50 mL
Store at 4°C.
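
For checking a recipe such as this one, the final concentration of each component is stock concentration × volume added ÷ total volume. The sketch below is a convenience calculation, not part of the published protocol; the function name `final_mM` is hypothetical, and the inputs are the ConA Beads Binding buffer volumes listed above.

```shell
#!/bin/sh
# final_mM STOCK_M VOLUME_UL TOTAL_ML
# Final concentration in mM: (mol/L * uL) / mL simplifies to mM.
final_mM() {
    awk -v c="$1" -v ul="$2" -v ml="$3" 'BEGIN { printf "%g\n", c * ul / ml }'
}

final_mM 1 1000 50   # HEPES-KOH: prints 20
final_mM 1 500 50    # KCl:       prints 10
final_mM 1 50 50     # CaCl2:     prints 1
final_mM 1 50 50     # MnCl2:     prints 1
```

The same function applies to the other molar stocks in this section; percentage stocks such as digitonin need a separate ratio calculation.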
     

Antibody Buffer 1

 
Component	Volume
100 mM PMSF	5 μL
Protease Inhibitor Cocktail	5 μL
RiboLock RNase Inhibitor	6.25 μL
Recombinant RNase Inhibitor	6.25 μL
5% Digitonin	1 μL
1 M DTT	0.5 μL
0.5 M EDTA	2 μL
30% BSA	1.67 μL
DPBS	to 500 μL
Prepare fresh. Adjust the Digitonin concentration for different cell lines if necessary.
 

Antibody Buffer 2

Similar to Antibody Buffer 1, but exclude 0.5 M EDTA and 30% BSA. Make fresh.

High-salt buffer

 
Component	Volume
1 M HEPES, pH 7.5	1 mL
5 M NaCl	3 mL
DEPC-treated water	to 50 mL
Store at 4°C.
 

Deaminase-incubation buffer

 
Component	Volume
100 mM PMSF	5 μL
Protease Inhibitor Cocktail	5 μL
RiboLock RNase Inhibitor	6.25 μL
Recombinant RNase Inhibitor	6.25 μL
5% Digitonin	1 μL
1 M DTT	0.5 μL
High-salt buffer	to 500 μL
Prepare fresh.
 

Deamination buffer

 
Component	Volume
1 M HEPES, pH 7.9	150 μL
1 M KCl	600 μL
5 M NaCl	30 μL
80% Glycerol	625 μL
0.1 mM ZnCl2	10 μL
DEPC-treated water	to 10 mL
Store at 4°C.
 

Proteinase K digestion buffer

 
Component	Volume
1 M Tris-HCl	500 μL
5 M NaCl	1 mL
0.5 M EDTA	100 μL
10% SDS	2.5 mL
DEPC-treated water	to 50 mL
Store at RT.
 

Equipment setup

Downloading and installing software and dependencies

Most of the software used in the MAPIT pipeline is available through conda, and we recommend installing it that way. Some tools must be installed from source from their GitHub repositories.

Miniconda (see https://www.anaconda.com/docs/getting-started/miniconda/install)

MAPIT and dependencies
git clone https://github.com/WangLabPKU/MAPIT-seq
cd MAPIT-seq
conda env create -f env.yml -c bioconda -c conda-forge
conda activate Mapit-seq
chmod +x Mapit
conda install trim_galore -c conda-forge

Reditools2 and dependencies
cd ..
git clone https://github.com/BioinfoUNIBA/REDItools2
cd REDItools2
pip install -r requirements.txt

FLARE and dependencies
cd ..
git clone https://github.com/YeoLab/FLARE
conda install snakemake
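
After installation, it can help to confirm that the command-line tools are visible in the active environment before moving on. The following is a minimal sketch, not part of the MAPIT-seq repository: `check_tools` is a hypothetical helper, and the executable names passed to it are assumptions about how the dependencies listed under "MAPIT dependencies" appear on PATH.

```shell
#!/bin/sh
# check_tools TOOL...
# Prints "missing: NAME" for each tool not found on PATH.
# Returns 1 if any tool is missing, 0 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Executable names below are assumed; adjust them to your installation.
check_tools bwa hisat2 samtools gatk bedtools gffread trim_galore snakemake \
    || echo "Install the tools listed above before continuing."
```

Run it inside the activated conda environment so that the environment's own binaries are the ones being checked.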

Downloading and preparing the required data

Downloading reference sequence and annotation
cd "your_ref_path"
wget https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_40/GRCh38.p13.genome.fa.gz
wget https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_40/gencode.v40.chr_patch_hapl_scaff.annotation.gtf.gz
wget https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_40/gencode.v40.chr_patch_hapl_scaff.annotation.gff3.gz

RepeatMasker
wget http://hgdownload.cse.ucsc.edu/goldenPath/hg38/database/rmsk.txt.gz
gzip -d *

Downloading and processing known variants (SNP) annotation files: dbSNP, 1000Genome, and EVS
cd "your_ref_path"
mkdir "your_ref_path"/"genomeVersion_SNP"
cd "your_ref_path"/"genomeVersion_SNP"

dbSNP
wget https://ftp.ncbi.nih.gov/snp/organisms/human_9606_b151_GRCh38p7/VCF/GATK/All_20180418.vcf.gz

EVS
mkdir EVS_split_chr
wget http://evs.gs.washington.edu/evs_bulk_data/ESP6500SI-V2-SSA137.GRCh38-liftover.snps_indels.vcf.tar.gz
tar -xvf ESP6500SI-V2-SSA137.GRCh38-liftover.snps_indels.vcf.tar.gz
for i in {1..22} X Y; do
    awk '{if(substr($1, 1, 1) == "#"){print $0}else if((length($4) == 1) && (length($5) == 1)) {gsub("MT","M");{if($1 ~ "chr") print $0; else print "chr"$0 }}}' \
        ESP6500SI-V2-SSA137.GRCh38-liftover.chr${i}.snps_indels.vcf | gzip > EVS_split_chr/chr${i}.gz
done

1000Genome
wget http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000_genomes_project/release/20181203_biallelic_SNV/ALL.wgs.shapeit2_integrated_v1a.GRCh38.20181129.sites.vcf.gz
mkdir 1000genomes_split_chr
zcat ALL.wgs.shapeit2_integrated_v1a.GRCh38.20181129.sites.vcf.gz | \
    awk -v dir_SNP=1000genomes_split_chr '{if(substr($1, 1, 1) == "#"){print $0 > "1000genomes_header"}else if((length($4) == 1) && (length($5) == 1)) {gsub("MT","M");{if($1 ~ "chr") print $0 > dir_SNP"/"$1; else print "chr"$0 > dir_SNP"/chr"$1 }}}'
gzip 1000genomes_split_chr/chr*
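
Because the per-chromosome SNP files are consumed later by the configuration step, a quick completeness check can catch an interrupted download or split. This is a sketch under the assumption that both directories end up with non-empty files named chr1.gz to chr22.gz, chrX.gz, and chrY.gz; `list_missing_chr` is a hypothetical helper, not part of MAPIT.

```shell
#!/bin/sh
# list_missing_chr DIR
# Prints each expected chromosome (chr1..chr22, chrX, chrY) that has
# no non-empty chrN.gz file in DIR. Prints nothing if all are present.
list_missing_chr() {
    dir=$1
    for c in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y; do
        [ -s "$dir/chr$c.gz" ] || echo "chr$c"
    done
}

# Directory names as created by the commands above.
list_missing_chr EVS_split_chr
list_missing_chr 1000genomes_split_chr
```

An empty output for both directories means every expected file exists and has content; it does not validate the VCF records themselves.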

Creating configuration file

MAPIT configuration
After downloading the genome sequence (FASTA), genome annotation (GTF and GFF3), RepeatMasker annotation, and known SNP datasets (split by chromosomes), the following command can be used to complete the configuration process. This step includes generating the genome sequence index, extracting gene element annotations, and creating chromosome-split dbSNP VCF files. Upon completion, a configuration file named “GenomeVersion.json” will be generated in the “conf” directory within the MAPIT-seq working directory, which will be used for downstream analyses.
Mapit config \
    --genomeVersion GRCh38 \
    --genomeFasta "full_path"/GRCh38.p13.genome.fa \
    --species human \
    --outdir "full_path" \
    --gff3 "full_path"/gencode.v40.chr_patch_hapl_scaff.annotation.gff3 \
    --rmsk "full_path"/rmsk.txt \
    --dbSNP "full_path"/GRCh38_SNP/All_20180418.vcf.gz \
    --1000genomesDir "full_path"/GRCh38_SNP/1000genomes_split_chr \
    --EVSEVADir "full_path"/GRCh38_SNP/EVS_split_chr \
    --Reditools "full_path_to_Reditools" \
    --FLARE "full_path_to_FLARE"

FLARE configuration
To run the FLARE pipeline, you first need a set of files specifying the genomic regions in which cluster identification will occur. Once the script below finishes running, you will see a new folder called "genome_name"_regions containing a series of files with increasing indices, e.g., GRCh38_regions_0, GRCh38_regions_1...
"full_path_to_FLARE"/workflow_FLARE/scripts/generate_regions.py <full/path/to/your/genome/gtf/file> <genome_name>_regions

Homer configuration
To keep analyses standardized, HOMER organizes promoters, genome sequences, and annotation into packages. See http://homer.ucsd.edu/homer/introduction/configure.html for details.
perl configureHomer.pl -install hg38

Procedure

Cell fixation (40 min–1 h)

Collect 5×10²–5×10⁵ cells or nuclei in centrifuge tubes.

Wash cells with PBS by centrifuging at 250g at RT for 5 min.

For formaldehyde fixation, dilute 37% formaldehyde in room-temperature PBS to 0.5%.

Fix cells for 5 min at room temperature. Quench with 0.125 M glycine for 5 min.
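
The volumes for these two steps depend on the working volume, which the protocol does not fix. The sketch below works through the arithmetic for an assumed 1 mL fixation volume, and assumes the quench is made by spiking in the 2.5 M Glycine Solution from the Reagent setup. Both function names are hypothetical.

```shell
#!/bin/sh
# stock_volume STOCK_CONC FINAL_CONC FINAL_VOLUME_UL
# Volume of stock to make FINAL_VOLUME_UL at FINAL_CONC (C1*V1 = C2*V2).
stock_volume() {
    awk -v cs="$1" -v cf="$2" -v v="$3" 'BEGIN { printf "%.1f\n", cf * v / cs }'
}

# spike_volume STOCK_CONC FINAL_CONC EXISTING_VOLUME_UL
# Volume of stock to ADD to an existing volume so the mixture reaches
# FINAL_CONC, accounting for the volume the spike itself contributes.
spike_volume() {
    awk -v cs="$1" -v cf="$2" -v v0="$3" 'BEGIN { printf "%.1f\n", cf * v0 / (cs - cf) }'
}

# 0.5% formaldehyde from a 37% stock, 1000 uL total (assumed volume).
stock_volume 37 0.5 1000     # prints 13.5

# 0.125 M glycine from a 2.5 M stock, added to 1000 uL of fixed cells.
spike_volume 2.5 0.125 1000  # prints 52.6
```

For the formaldehyde step, the remainder of the 1000 μL is PBS (986.5 μL in this example).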

(Optional) For DSP fixation, add 500 μL DSP Fixation Solution to cells, rotate gently at room temperature for 15 min, mix with pipettes, and incubate for another 15 min. Quench with 1 M Tris-HCl, pH 7.4, for 5 min.

Centrifuge cells at 4°C, 450g, for 5 min. Wash cells twice with pre-cooled DPBS. Then resuspend the fixed cells in 500 μL DPBS. Fixed cells can be cryopreserved at -80°C with 10% DMSO or processed immediately.

Note: After fixation, transfer cells to centrifuge tubes pre-washed with 1% BSA. Usage of a swing-bucket rotor is highly recommended to reduce cell loss.

Cell Binding to ConA Beads (20 min)

(Optional) For single-cell MAPIT-seq, omit the ConA bead-cell coupling step. Instead, collect the formaldehyde- or DSP-fixed cells by centrifuging at 650g, 4°C for 5 min in each of the subsequent steps.

Activate ConA beads by adding 5–10 μL ConA beads to a volume of binding buffer equal to 10 times the bead volume. Mix thoroughly, then place the tubes on a magnetic stand for 30 s to 2 min and remove the supernatant. Repeat this activation step, and resuspend the beads in their initial volume of binding buffer.

Add the activated ConA beads to the cell suspension and incubate at room temperature under continuous rotation for 10–20 min.

Wash the cell-bead mixture once with DPBS. Transfer the beads to PCR strip tubes.

Note: To mitigate the adhesion of beads to the walls of the centrifuge tubes, it is advisable to pretreat the tubes with 1% BSA.

Cell permeabilization and antibody Incubation (4.5 h)

Resuspend the cells in 50 μL of Antibody Buffer 1 pre-mixed with the primary antibody (the RBP antibody or the IgG control) at a 1:100 dilution.

Note: The concentration of the antibodies can be adjusted following immunofluorescence assays. Mix gently and avoid repetitive pipetting to prevent material loss.

Incubate the mixture at 4°C under continuous rotation for 3 h. 

Note: Avoid inversion of the tubes during rotation. It is recommended to mix the suspension intermittently to prevent bead aggregation.

After a quick spin, place the tubes on a magnetic stand for 30 s to 2 min, then carefully remove the supernatant. Wash the beads once with pre-chilled DPBS on a roller at 4°C for 5 min.

Resuspend the beads in 50 μL Antibody Buffer 2 containing the secondary antibody at a 1:100 dilution, and incubate at 4°C under continuous rotation for 1 h.

Deaminase Incubation (1 h)

After a quick spin, place the tubes on a magnetic stand for 30 s to 2 min, then carefully remove the supernatant. Wash the beads once with pre-chilled DPBS on a roller at 4°C for 5 min.

Dilute 1 μg of rAPOBEC1-pAG-hADAR2dd in 50 μL Deaminase Incubation Buffer.

Resuspend the beads in the 50 μL of Deaminase Incubation Buffer containing the fusion protein and incubate at 4°C under continuous rotation for 1 h.

Deamination Reaction (3.5–5 h)

After a quick spin, place the tubes on a magnetic stand for 30 s to 2 min, then carefully remove the supernatant. Wash the beads twice with 100 μL pre-chilled Washing Buffer, which includes High-Salt Buffer with 0.005% Digitonin, on a roller at 4°C for 5 min per wash.

Prepare 100 μL Deamination Buffer supplemented with the following components: 1 μL Protease Inhibitor Cocktail, 1.25 μL RiboLock RNase Inhibitor, 1.25 μL Recombinant RNase Inhibitor, and 0.1 μL 1 M DTT.

Resuspend the beads in 40 μL pre-mixed Deamination Buffer and incubate at 30°C for 3–4 h.

Note: To ensure specificity and minimize substantial RNA degradation, maintain a processing time of no more than 5 hours; optimal editing is typically achieved within 4 h.
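
When several samples are processed in parallel, the supplemented Deamination Buffer can be prepared as a master mix. The sketch below scales the per-100 μL supplement amounts given above to the 40 μL used per sample. The function name `scale_mix` and the default 10% overage are assumptions, not part of the protocol.

```shell
#!/bin/sh
# scale_mix N_SAMPLES [OVERAGE_FACTOR]
# Prints a tab-separated table of volumes (uL) for a master mix that
# provides 40 uL per sample. OVERAGE_FACTOR defaults to 1.1 (assumed).
scale_mix() {
    awk -v n="$1" -v over="${2:-1.1}" 'BEGIN {
        buffer = n * 40 * over   # uL of Deamination Buffer
        f = buffer / 100         # scale relative to the 100 uL recipe
        printf "Deamination Buffer\t%.2f\n", buffer
        printf "Protease Inhibitor Cocktail\t%.2f\n", 1 * f
        printf "RiboLock RNase Inhibitor\t%.2f\n", 1.25 * f
        printf "Recombinant RNase Inhibitor\t%.2f\n", 1.25 * f
        printf "1 M DTT\t%.2f\n", 0.1 * f
    }'
}

# Example: 8 samples with the default overage.
scale_mix 8
```

Sub-microliter volumes such as the DTT are easier to pipette accurately at the master-mix scale than per tube.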

Optional: For cells fixed with DSP, reverse crosslinking can be performed by incubating with 50 mM DTT at room temperature for 30 min.

Following the reaction, place the tubes on a magnetic stand for 30 s to 2 min, then carefully remove the supernatant.

RNA Extraction and Library Construction (10 h)

(Optional) For single-cell MAPIT-seq, wash the cells once, resuspend them in DPBS with 0.05% BSA and 1 U/μL RiboLock RNase Inhibitor, and load them onto the 10x Genomics Chromium platform.

Resuspend the beads in 100 μL Proteinase K Digestion Buffer and add 1 μL 20 mg/mL Proteinase K. Vortex the mixture briefly and, after a quick spin, digest the beads at 56°C for 1 h.

Note: To prevent bead aggregation, resuspend the samples in Proteinase K digestion buffer and transfer them to a 1.5 mL tube before adding Proteinase K.

For large samples (> 100,000 cells): Extract total RNA following the standard TRIzol protocol. Use glycogen to facilitate RNA precipitation. For small samples (500–50,000 cells): Isolate mRNA using Oligo d(T)25 Magnetic Beads and proceed with the Smart-seq2 protocol.

Construct sequencing libraries from 50–500 ng of RNA using the VAHTS Universal V6 RNA-seq Library Prep Kit for Illumina. Quantify the concentration of the libraries using Qubit and select 10–100 ng of cDNA for sequencing on an Illumina NovaSeq 6000 platform, employing 150 nt paired-end reads.

MAPIT-seq data quality control (15 min)

FASTQC: quality control
mkdir ./fastqFileQC
fastqc -t 2 -o ./fastqFileQC "fastq_file_path"/*.fq.gz

Apply adapter and quality trimming
mkdir ./clean
trim_galore -j 20 -q 20 --phred33 \
    --stringency 3 --length 20 -e 0.1 \
    --paired "fastq_file_path"/xxx_R1.fq.gz \
    "fastq_file_path"/xxx_R2.fq.gz \
    --gzip -o ./clean --basename ${samplename}_${replicate}
"fastq_file_path"/xxx_R1.fq.gz and "fastq_file_path"/xxx_R2.fq.gz denote the paired-end FASTQ files generated from MAPIT-seq experiments, corresponding to Read 1 and Read 2, respectively. The placeholders ${samplename} and ${replicate} indicate the abbreviated sample name and replicate identifier, respectively.
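
With several samples and replicates, the trimming command can be generated per pair rather than typed by hand. The following dry-run sketch only prints the commands so they can be inspected before running. It reuses the trim_galore options shown above; the helper name `trim_cmd`, the input naming scheme `<sample>_<replicate>_R1.fq.gz`, the `raw` directory, and the sample names are all hypothetical.

```shell
#!/bin/sh
# trim_cmd SAMPLE REPLICATE FASTQ_DIR
# Prints (does not run) the trim_galore command for one paired-end sample.
# Assumes inputs are named FASTQ_DIR/SAMPLE_REPLICATE_R1.fq.gz and _R2.fq.gz.
trim_cmd() {
    sample=$1; rep=$2; dir=$3
    printf 'trim_galore -j 20 -q 20 --phred33 --stringency 3 --length 20 -e 0.1 --paired %s/%s_%s_R1.fq.gz %s/%s_%s_R2.fq.gz --gzip -o ./clean --basename %s_%s\n' \
        "$dir" "$sample" "$rep" "$dir" "$sample" "$rep" "$sample" "$rep"
}

# Hypothetical sample sheet: one "sample replicate" pair per line.
while read -r sample rep; do
    trim_cmd "$sample" "$rep" raw
done <<'EOF'
G3BP1 rep1
G3BP1 rep2
IgG rep1
IgG rep2
EOF
```

Piping the printed lines to `sh` would execute them; keeping generation and execution separate makes it easy to review file names first.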

Aligning and processing MAPIT-seq reads to reference genome sequence (3 h)

Abundant RNA (rRNA, tRNA, and mt-tRNA) filtering and two-round unique mapping
Abundant RNA species, including rRNA, tRNA, and mitochondrial RNA, are computationally removed prior to genome alignment to optimize sequencing read utilization, improve target gene detection sensitivity, and ensure high-quality clean data for downstream analyses. To further enhance the sensitivity and specificity of RNA editing site detection, a two-round unique-mapping strategy is used. HISAT2 is used in the first round to capture transcriptomic complexity, such as splicing events and RNA modifications, and BWA-MEM is applied in the second round to map the reads left unmapped by the first round. The alignments from both rounds are then merged to generate a final BAM file for downstream analysis.
Mapit mapping -v $genomeVersion \
    --fq clean/${samplename}_${replicate}_val_1.fq.gz \
    --fq2 clean/${samplename}_${replicate}_val_2.fq.gz \
    --rna-strandness {F,R,FR,RF} \
    -n ${samplename} -r ${replicate} -o ${outpath} -t ${threads}
"$genomeVersion" refers to the configuration file for the specific genome assembly version created in the “Equipment setup” section. "--fq" and "--fq2" specify the trimmed paired-end FASTQ files used as input for mapping. The "--rna-strandness" parameter defines the strand-specific information required by HISAT2 during alignment (refer to https://ccb.jhu.edu/software/hisat2/manual.shtml for details). This step uses the specified number of computational threads (${threads}) and automatically generates four sub-directories within the designated output directory (${outpath}). Intermediate mapping files and the final merged BAM file are stored within these sub-directories during the alignment process.

Fine-tuning alignments for RNA editing discovery

Following the GATK Best Practices workflow for RNA-seq short variant discovery (https://gatk.broadinstitute.org/hc/en-us/articles/360035531192-RNAseq-short-variant-discovery-SNPs-Indels), we customized and integrated key pre-processing steps (MarkDuplicates, SplitNCigarReads, and Base Quality Score Recalibration (BQSR)) into a single streamlined command.
Mapit finetuning -v $genomeVersion \
    -n ${samplename} -r ${replicate} -o ${outpath}
This integrated procedure enables efficient data cleanup and fine-tuning for RNA editing detection, directly using the designated output directory (${outpath}) specified in Step 26 to process the final merged BAM file of "${samplename}_${replicate}".

RNA editing analysis (4–6 h)

Calling editing sites and filtering out known SNPs
This step uses the fine-tuned BAM files to call single nucleotide variants (SNVs) with GATK HaplotypeCaller, followed by filtering out known SNPs based on previously curated SNP datasets. The remaining A-to-G or C-to-U editing sites are then annotated using genome annotation files.
Mapit callediting -v $genomeVersion \
    --sampleList ${sample1},${sample2},…,${sampleN} \
    -o ${outpath} --prefix ${prefix} \
    --enzyme {ADAR,APOBEC,Both} -t ${threads}
This step allows multiple samples to be analyzed simultaneously. When a comma-separated list of sample names is given to the "--sampleList" parameter, the replicates of those samples in the designated output directory (${outpath}) specified in Step 26–27 are processed together. The "--enzyme" parameter accepts one of three values and indicates the editing enzyme introduced in the MAPIT-seq samples. After filtering, editing information is written to the "${outpath}/6-Edit_calling" directory with ${prefix} as the file prefix.
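
The "--sampleList" value is a single comma-separated string, which is easy to mistype when many samples are involved. A small sketch for building it from a list of names, one per line; the helper name `join_by_comma` and all concrete values in the printed command (sample names, output directory, prefix, thread count) are hypothetical placeholders.

```shell
#!/bin/sh
# join_by_comma
# Reads one sample name per line on stdin and prints them comma-separated.
join_by_comma() {
    paste -sd, -
}

# Hypothetical sample names; these must match the -n values used in mapping.
samples=$(printf '%s\n' G3BP1 IgG | join_by_comma)

# Print the resulting command for review rather than executing it.
echo "Mapit callediting -v GRCh38 --sampleList $samples -o ./mapit_out --prefix G3BP1_vs_IgG --enzyme Both -t 8"
```

The same helper works with a file of names, for example `join_by_comma < samples.txt`.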

Differential editing analysis between RBP-MAPIT and IgG-MAPIT
To identify RBP targets, this step applies a Wilcoxon signed-rank test to the ‘editing index’ within continuous, non-overlapping 50-nt windows of transcripts, comparing RBP-MAPIT and IgG-MAPIT samples.
Mapit calltargets -v $genomeVersion \
    -i ${outpath}/6-Edit_calling/${prefix}/${prefix}_Edit_DATA.tsv \
    --treatName ${treatSampleName} --controlName ${ctrlSampleName} \
    -o ${outpath} -c ${coverage} -l ${level}
"${treatSampleName}" and "${ctrlSampleName}" are the abbreviated sample names of RBP-MAPIT and IgG-MAPIT in the designated output directory (${outpath}) specified in Step 26–28. "${coverage}" is the minimum read coverage for an effective editing site (default 10), and "${level}" is the level at which the differential editing analysis is performed. The step writes the "Differential editing analysis" results to the "${outpath}/6-Edit_calling/${prefix}_${treatSampleName}/" directory.
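
To make the windowing concrete, the sketch below bins editing sites into non-overlapping 50-nt windows and computes a per-window editing index. It is an illustration only, not the Mapit implementation: the input layout (transcript, 1-based position, edited reads, total reads) is hypothetical, and the index is assumed here to be edited reads divided by total reads summed over the sites in a window. The statistical test itself is not reproduced.

```shell
#!/bin/sh
# window_index [MIN_COVERAGE]
# stdin:  TSV with transcript, 1-based position, edited reads, total reads
# stdout: TSV with transcript, window start, window end, editing index
# Sites with total reads below MIN_COVERAGE (default 10) are skipped.
window_index() {
    awk -F'\t' -v w=50 -v mincov="${1:-10}" '
        $4 >= mincov {
            start = int(($2 - 1) / w) * w + 1      # first position of the window
            key = $1 "\t" start "\t" (start + w - 1)
            edited[key] += $3
            total[key] += $4
        }
        END {
            for (key in total)
                printf "%s\t%.4f\n", key, edited[key] / total[key]
        }' | sort -t "$(printf '\t')" -k1,1 -k2,2n
}

# Toy input: the last site has coverage 5 and is filtered out.
printf 'TX1\t10\t2\t20\nTX1\t45\t6\t20\nTX1\t51\t1\t50\nTX1\t60\t3\t5\n' | window_index
```

In the toy input, positions 10 and 45 fall in window 1–50 and position 51 opens window 51–100, which shows how adjacent sites are pooled before any comparison between RBP-MAPIT and IgG-MAPIT.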

Identifying high-confidence editing clusters and RBP binding motifs (6–8 h)

Identifying A-to-G and C-to-U editing using a modified SAILOR workflow for MAPIT
Mapit prepare -v $genomeVersion -n ${samplename} -r ${replicate} \
    -o ${outpath}
This step prepares the files needed by the SAILOR and FLARE workflows.
Mapit SAILOR -v $genomeVersion -n ${samplename} -r ${replicate} \
    -c ${coverage} -o ${outpath} -t ${threads}
This step runs the SAILOR workflow as integrated for MAPIT data and its result directory. "${coverage}" is the minimum read coverage for an effective editing site (default 10).

Identifying A-to-G or C-to-U editing clusters using FLARE
Mapit FLARE -v $genomeVersion -e {AG,CT} -n ${samplename} \
    -r ${replicate} --regions ${regions} -o ${outpath} -t ${threads}
This step runs the FLARE workflow as integrated for MAPIT data and its result directory. "${regions}", created in the configuration step, is the directory of genomic regions in which cluster identification will occur.

Identifying high-confidence editing clusters
Mapit hc_cluster -v $genomeVersion -n ${samplename} -o ${outpath} \
    -s ${sloplength}
This step identifies the C-to-U or A-to-G editing clusters shared between two replicates. High-confidence MAPIT editing clusters are the intersection of the two types of editing clusters and are extended or shrunk into "${sloplength}"-nt windows.
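
Resizing clusters to a fixed window length can be pictured with a small BED example. The sketch below is not how Mapit hc_cluster is implemented: it assumes the window is centered on each cluster's midpoint, keeps only the first three BED columns, and clamps the start at 0. The function name `resize_bed` is hypothetical.

```shell
#!/bin/sh
# resize_bed LENGTH
# stdin:  BED intervals (chrom, start, end; extra columns are dropped)
# stdout: intervals of exactly LENGTH nt centered on each midpoint
resize_bed() {
    awk -F'\t' -v OFS='\t' -v len="$1" '{
        mid = int(($2 + $3) / 2)          # midpoint of the original cluster
        start = mid - int(len / 2)
        if (start < 0) start = 0          # do not run off the chromosome start
        print $1, start, start + len
    }'
}

# A 20-nt cluster is extended and a 200-nt cluster is shrunk, both to 50 nt.
printf 'chr1\t100\t120\nchr1\t500\t700\n' | resize_bed 50
```

Fixed-length windows matter for the motif search that follows, since `-size given` makes HOMER use the interval sizes exactly as supplied.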

De novo RBP binding motif finding
hc_cluster_path=${outpath}/7-FLARE_for_MAPIT/confident_clusters
hc_cluster_bed=${hc_cluster_path}/${samplename}/${samplename}_${slop}.bed
output_motif=${hc_cluster_path}/${samplename}/${samplename}.motif
findMotifsGenome.pl $hc_cluster_bed hg38 $output_motif \
    -noknown -rna -size given -len 5,6,7,8 -p ${threads}
This step uses the high-confidence MAPIT editing clusters to perform de novo RBP binding motif discovery.


