Jul 29, 2025

MPRA RNA-seq and DNA-seq library preparation

  • Jessica McAfee (1)
  • Hyejung Won (1)
  • (1) UNC Chapel Hill
  • MPRA
Protocol Citation: Jessica McAfee, Hyejung Won 2025. MPRA RNA-seq and DNA-seq library preparation. protocols.io https://dx.doi.org/10.17504/protocols.io.e6nvw4pjzlmk/v1
Manuscript citation:
Sool Lee*, Jessica C. McAfee*, Jiseok Lee*, Alejandro Gomez, Austin T. Ledford, Declan Clarke, Hyunggyu Min, Mark B. Gerstein, Alan P. Boyle, Patrick F. Sullivan, Adriana S. Beltran, Hyejung Won. Massively parallel reporter assay investigates shared genetic variants of eight psychiatric disorders. Cell. (2024) https://doi.org/10.1016/j.cell.2024.12.022 *Co-first authors

Nana Matoba*, Jessica C. McAfee*, Oleh Krupa, Jess Bell, Brandon D. Le, Jordan M. Valone, Gregory E. Crawford, Hyejung Won, Jason L. Stein. Massively parallel assessment of gene regulatory activity at human cortical structure associated variants. Biorxiv. (2025) https://doi.org/10.1101/2025.02.08.635393 *Co-first authors
License: This is an open access protocol distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Protocol status: Working
We use this protocol and it's working
Created: June 30, 2025
Last Modified: July 29, 2025
Protocol Integer ID: 221314
Keywords: MPRA, library Prep, PJD MPRA, qPCR, Massively Parallel Reporter Assay, DNA-seq, RNA-seq, generating mpra rna, mpra rna, transfectable mpra backbone, seq library preparation this protocol, donor-egp2ap-rc, seq library preparation, seq library, seq, dna
Funders Acknowledgements:
NIMH
Grant ID: DP2MH122403
IGVF Consortium
Grant ID: UM1HG012003, U01HG011952
PsychENCODE Consortium
Grant ID: R01MH122509
NICHD
Grant ID: T32HD040127
NIGMS
Grant ID: 5T32GM067553, 5T32GM135128
University Cancer Research Fund, Comprehensive Cancer Center Core support grant
Grant ID: P30CA016086
UNC Center for Mental Health and Susceptibility grant
Grant ID: P30ES010126
Abstract
This protocol is for generating MPRA RNA-seq and DNA-seq libraries for sequencing. This protocol uses a transfectable MPRA backbone: Donor_eGP2AP_RC (Addgene: ID#133784). For more detail, see methods section of: https://www.cell.com/cell/fulltext/S0092-8674(24)01435-1
Guidelines
All temperatures are in Celsius.
Materials
Critical Commercial Assays | Brand | Cat. Number
DNA Clean and Concentrator Kit-25 | Zymo Research | Cat#D4033
Zymo DNA Clean and Concentrator Kit-5 | Zymo Research | Cat#D4033
Zymo Gel DNA Recovery Kit | Zymo Research | Cat#D4008

Reagents | Brand | Cat. Number
AMPure XP Beads | Beckman Coulter | Cat#A63881
NEBNext 2X Q5 Hifi HS Master Mix | New England Biolabs (NEB) | Cat#M0453S
SparQ beads | VWR | Cat#76302-834
SuperScript IV Reverse Transcriptase | Invitrogen | Cat#18090050
Before start
Extract RNA and DNA from your cells that have been transfected with your MPRA library.
RNA and DNA extraction
DNA and RNA were extracted from the cells with the Quick-DNA/RNA Miniprep Plus Kit (Zymo Research, cat. no. D7003) using 600 uL of shield buffer and 600 uL of lysis buffer per well.
RNA and DNA were eluted with water at ~55°C.
Reverse transcription of RNA
Run reverse transcription (RT) to create cDNA from RNA
Run one RT reaction per sample. Use up to 5ug (5000ng) of RNA or up to 11uL, whichever limit is reached first.
For each RNA sample, make the following mastermix:
Reagent | 1X (uL)
2uM Lib_Hand_RT | 1
10mM dNTP | 1
Total RNA | up to 5ug, max 11
Water | to 13
Total | 13
Heat at 65°C for 5 min, then place on ice for 1 min.
Add the following mixture to each 13 uL reaction (20 uL final volume) to finish the RT.
Reagent | 1X (uL)
5X SSIV buffer | 4
100mM DTT | 1
RNase OUT | 1
SSIV Transcriptase | 1
Total | 7
Incubate at 55°C for 1 hour, then at 80°C for 10 min.
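The input limits above (5ug by mass, 11uL by volume) translate into pipetting volumes once the RNA concentration is known. The following is a minimal sketch only: the function name `rt_input_volumes` and the example concentrations are hypothetical, and it assumes the RNA concentration is given in ng/uL.

```python
# Limits and fixed volumes taken from the first RT mix (13 uL total).
MAX_RNA_NG = 5000.0   # mass cap: 5 ug
MAX_RNA_UL = 11.0     # volume cap
PRIMER_UL = 1.0       # 2 uM Lib_Hand_RT
DNTP_UL = 1.0         # 10 mM dNTP
FIRST_MIX_UL = 13.0   # total volume before the 65 C step


def rt_input_volumes(rna_ng_per_ul):
    """Return (rna_ul, water_ul, rna_ng) for one RT reaction.

    The RNA volume is capped by whichever limit is hit first:
    5000 ng of RNA or 11 uL of volume.
    """
    if rna_ng_per_ul <= 0:
        raise ValueError("RNA concentration must be positive")
    rna_ul = min(MAX_RNA_NG / rna_ng_per_ul, MAX_RNA_UL)
    water_ul = FIRST_MIX_UL - PRIMER_UL - DNTP_UL - rna_ul
    rna_ng = rna_ul * rna_ng_per_ul
    return round(rna_ul, 2), round(water_ul, 2), round(rna_ng, 1)


if __name__ == "__main__":
    # Hypothetical concentrations: one concentrated and one dilute sample.
    for conc in (1000.0, 200.0):
        print(conc, rt_input_volumes(conc))
```

For a concentrated sample the mass cap applies and water fills the rest; for a dilute sample the full 11uL is RNA and no water is added.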
Create qPCR standards
Generate qPCR standards from cDNA to determine transfection efficiency.
Run the following PCR for one of the samples.
Reagent | 1X (uL)
cDNA | 1
Water | 19
10uM Forward Primer: Lib_Hand | 2.5
10uM Reverse Primer: Lib_Seq_Luc_R | 2.5
PCR MM 2X (NEBNext Q5 Hot Start HiFi) | 25
Total | 50
Step | Temp (°C) | Time
Initial denaturation | 98 | 30 sec
Denaturation | 98 | 10 sec
Annealing | 63 | 15 sec
Extension | 72 | 25 sec
Cycles | Go to step 2 (Denaturation) 24 times
Final extension | 72 | 5 min
Hold | 4
Extract the band at 205 bp using gel extraction (1.8% agarose gel, Zymo Gel DNA Recovery Kit, Cat#D4008)
Measure the concentration using Qubit.
Perform 10-fold serial dilutions to obtain stocks at 1ng, 100pg, 10pg, 1pg, 100fg, and 10fg per uL. These will serve as standards.
*Once standards are made, this step does not need to be repeated for each batch.
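The dilution arithmetic for the standards can be sketched as follows. This is an illustration under assumptions: the 100 uL working volume, the function names, and the example Qubit reading are hypothetical and not part of the protocol. Only the six target concentrations come from the step above.

```python
def first_dilution(stock_ng_per_ul, final_ul=100.0, target_ng_per_ul=1.0):
    """Volumes of gel-purified stock and water needed to reach the top
    standard (1 ng/uL). final_ul is an assumed working volume."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already more dilute than the target")
    stock_ul = target_ng_per_ul * final_ul / stock_ng_per_ul
    return round(stock_ul, 2), round(final_ul - stock_ul, 2)


def label(fg_per_ul):
    """Human-readable label for a concentration given in fg/uL."""
    if fg_per_ul >= 1_000_000:
        return f"{fg_per_ul // 1_000_000} ng"
    if fg_per_ul >= 1_000:
        return f"{fg_per_ul // 1_000} pg"
    return f"{fg_per_ul} fg"


def standard_series(top_fg_per_ul=1_000_000, n_standards=6):
    """Concentrations (fg/uL) of a 10-fold series starting at 1 ng/uL."""
    return [top_fg_per_ul // 10 ** i for i in range(n_standards)]


if __name__ == "__main__":
    # Hypothetical Qubit reading of 25 ng/uL.
    print(first_dilution(25.0))
    print([label(c) for c in standard_series()])
```

Each subsequent standard is then one part of the previous standard in ten parts total (for example 10 uL into 90 uL of water, an assumed volume).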
qPCR to estimate transfection efficiency
Use qPCR to estimate the concentration of the MPRA RNA barcodes.
Master mix for cDNA and standards:
Reagents | 1X (uL)
cDNA library | 1
NFW | 7
10uM Lib_Hand | 1
10uM Lib_Seq_Luc_R | 1
SYBR Green 2x MM | 10
Total | 20
Temp (°C) | Time | Cycle
50 | 2 min
95 | 2 min
95 | 1 sec | X45 cycles
60 | 30 sec | (in cycle)
Master mix for DNA samples:
Reagents | 1X (uL)
DNA library | 1
Water | 7
10uM Lib_Hand_RT | 1
10uM Lib_Seq_Luc_R | 1
SYBR Green 2x MM | 10
Total | 20
Temp (°C) | Time | Cycle
50 | 2 min
95 | 2 min
95 | 1 sec | X45 cycles
60 | 30 sec | (in cycle)
Run the qPCR with all of these together on the same plate.
qPCR Data Analysis:
[Figure: example of data analysis]
Plot the average CT (y axis) against the log concentration (x axis) of your standards. For the x axis, use log10 of the concentration in fg: 100pg is 5, 10pg is 4, 1pg is 3, 100fg is 2, and 10fg is 1. Fit a line and record its equation and R squared value.
For each of your samples, average the CT values of the two replicates.
Solve your standards' line equation for X (log concentration): substitute the average sample CT for Y and solve for X.
Retrieve the concentration (in fg) from X as 10^X. We generally target >700fg per replicate.
If your sample has a high transfection efficiency, continue to the next step. If the sample has low transfection efficiency, go back and optimize the transfection.
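The standard-curve steps above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis script: the CT values are made-up numbers, the function names are hypothetical, and an ordinary least-squares line is assumed for the fit.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept.
    Returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


def ct_to_fg(avg_ct, slope, intercept):
    """Solve CT = slope * X + intercept for X, then return 10^X (fg)."""
    log_conc = (avg_ct - intercept) / slope
    return 10 ** log_conc


if __name__ == "__main__":
    # x axis: log10(fg) for the 100pg, 10pg, 1pg, 100fg, 10fg standards.
    log_fg = [5, 4, 3, 2, 1]
    # Hypothetical average CT values for those standards.
    std_ct = [23.5, 26.8, 30.1, 33.4, 36.7]
    slope, intercept, r2 = fit_line(log_fg, std_ct)

    # Hypothetical sample: average the two replicate CT values first.
    replicate_ct = (30.0, 30.2)
    avg_ct = sum(replicate_ct) / len(replicate_ct)
    fg = ct_to_fg(avg_ct, slope, intercept)
    print(f"slope={slope:.2f} intercept={intercept:.2f} R2={r2:.4f}")
    print(f"estimated {fg:.0f} fg; passes >700 fg target: {fg > 700}")
```

The slope of a qPCR standard curve is negative (higher concentration gives a lower CT), so dividing by the slope when solving for X is what maps a low CT back to a high concentration.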
PCR 1
Amplify your RNA and DNA library.
Repeat the RT (step 2) for the rest of your RNA samples. Convert all of your RNA to cDNA.
PCR 1 RNA (cDNA): For RNA samples, use 5uL of your cDNA per 1 rxn. Generally, you will do 18 rxns per sample. Note that this uses Lib_Hand primer.
PCR 1 cDNA Master Mix | 1X (uL) | 18X (uL)
cDNA | 5 | 90
10uM Forward: Lib_Hand | 2.5 | 45
10uM Reverse: Lib_Seq_Luc_R | 2.5 | 45
PCR MM 2X (NEBNext Q5 HS HiFi) | 25 | 450
NFW | 15 | 270
Total | 50 | 900
Temp (°C) | Time | Cycles
98 | 30 sec | 1
98 | 10 sec | X9 cycles
63 | 15 sec | (in cycle)
72 | 25 sec | (in cycle)
72 | 5 min | 1
4 | forever | 1
PCR clean up: Split each sample across 2 Zymo DNA Clean and Concentrator-25 columns and elute each in 25uL water, for 50uL total per sample.
PCR 1 for DNA:
This reaction uses the Lib_Hand_RT primer.
Run 4 reactions per sample.
PCR 1 DNA Master Mix | 1X (uL) | 4X (uL)
DNA | 0.5 | 2
10uM Forward: Lib_Hand_RT | 2.5 | 10
10uM Reverse: Lib_Seq_Luc_R | 2.5 | 10
PCR MM 2X (NEBNext Q5 HS HiFi) | 25 | 100
NFW | 19.5 | 78
Total | 50 | 200
Temp (°C) | Time | Cycles
98 | 30 sec | 1
98 | 10 sec | X8 cycles
63 | 15 sec | (in cycle)
72 | 25 sec | (in cycle)
72 | 5 min | 1
4 | forever
PCR clean up: Use 1 Zymo DNA Clean and Concentrator-25 column per sample and elute in 25uL water.
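The 18X and 4X columns in the PCR 1 master mix tables are the 1X volumes multiplied by the number of reactions. A small sketch of that scaling follows, in case a different reaction count is needed. The dictionary and function names are hypothetical, and no pipetting overage is added because the protocol does not specify one.

```python
# 1X recipes (uL per 50 uL reaction) copied from the PCR 1 tables.
PCR1_CDNA_1X = {
    "cDNA": 5.0,
    "10uM Lib_Hand": 2.5,
    "10uM Lib_Seq_Luc_R": 2.5,
    "PCR MM 2X (NEBNext Q5 HS HiFi)": 25.0,
    "NFW": 15.0,
}
PCR1_DNA_1X = {
    "DNA": 0.5,
    "10uM Lib_Hand_RT": 2.5,
    "10uM Lib_Seq_Luc_R": 2.5,
    "PCR MM 2X (NEBNext Q5 HS HiFi)": 25.0,
    "NFW": 19.5,
}


def scale_mix(recipe_1x, n_reactions):
    """Multiply every component of a 1X recipe by the reaction count."""
    if n_reactions <= 0:
        raise ValueError("n_reactions must be positive")
    return {name: round(ul * n_reactions, 2) for name, ul in recipe_1x.items()}


if __name__ == "__main__":
    for title, recipe, n in (("cDNA", PCR1_CDNA_1X, 18), ("DNA", PCR1_DNA_1X, 4)):
        mix = scale_mix(recipe, n)
        print(title, mix, "total:", sum(mix.values()))
```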

qPCR 2
The purpose of qPCR 2 is to estimate the number of PCR cycles for PCR 2. PCR 2 is used to add sequencing adapters and sample indexes.

qPCR 2 is optional (apply this step only when samples are overamplified at a given cycle number).

Do two reactions for each sample. Pick an index primer at random for qPCR2 and use it for all the samples.
qPCR 2 Master mix:
Reagent | 1X (uL)
cDNA or DNA from PCR 1 | 1
Water | 7
10uM F Primer: P5_Seq_Luc_F | 1
10uM R Primer: P7_Ind_#_Han | 1
SYBR Green 2X MM | 10
Total | 20
Temp (°C) | Time | Cycles
50 | 2 min | 1
95 | 2 min | 1
95 | 1 sec | X45 cycles
60 | 30 sec | (in cycle)
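On the multicomponent plot, find the fluorescence level that is ¼ of the way from the bottom (baseline) to the top (plateau) of each amplification curve. From the point where the curve crosses this ¼ line, read down to the x axis to get the number of PCR cycles to use.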
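The ¼ rule can also be read off numerically from exported fluorescence values. The sketch below is one interpretation under stated assumptions: the threshold is taken as baseline plus ¼ of the distance to the plateau, baseline and plateau are approximated by the minimum and maximum of the trace, and the function name and example trace are hypothetical.

```python
def cycles_at_quarter(fluorescence):
    """Return the first cycle (1-based) at which the signal reaches
    baseline + 1/4 * (plateau - baseline).

    fluorescence[i] is the reading at cycle i + 1. Baseline and plateau
    are approximated by the minimum and maximum of the trace.
    """
    baseline = min(fluorescence)
    plateau = max(fluorescence)
    if plateau == baseline:
        raise ValueError("flat trace: no amplification detected")
    threshold = baseline + 0.25 * (plateau - baseline)
    for cycle, value in enumerate(fluorescence, start=1):
        if value >= threshold:
            return cycle
    raise RuntimeError("threshold not reached")  # not expected: max >= threshold


if __name__ == "__main__":
    # Hypothetical trace for one qPCR 2 reaction.
    trace = [100, 100, 100, 110, 150, 300, 600, 1000, 1300, 1400, 1400]
    print(cycles_at_quarter(trace))
```

Because two reactions are run per sample, the two cycle numbers can be compared and a single value chosen for PCR 2.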
PCR 2
PCR 2 adds Illumina sequencing adapters and sample indexes.

Assign indexes to each sample (each RNA (cDNA) and DNA library should get a unique index).
Use all of the product you get from the cleanup steps of PCR 1.

DNA samples should have 1.5 rxns per sample. RNA should have 3 rxns per sample.

PCR 2 Master Mixes | 1X (uL) | RNA: 3X (uL) | DNA: 1.5X (uL)
Library | 16.6 | 49.8 | 24.9
NFW | 3.4 | 10.2 | 5.1
10uM F Primer: P5_Seq_Luc_F | 2.5 | 7.5 | 3.75
10uM R Primer: P7_Ind_#_Han | 2.5 | 7.5 | 3.75
PCR MM 2X HIFI HS Q5 (NEB) | 25 | 75 | 37.5
Total | 50 | 150 | 75

Temp (°C) | Time | Cycles
98 | 30 sec | 1
98 | 10 sec | X PCR cycles determined by qPCR 2
69 | 15 sec | (in cycle)
72 | 25 sec | (in cycle)
72 | 5 min | 1
4 | forever
Expected band size: 273bp

The PCR product is cleaned up and size-selected using AMPure XP Beads (Beckman Coulter, cat. no. A63881) with 0.7X and 0.9X ratios to select for a DNA fragment at 273 bp.
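The protocol gives only the two bead ratios, not the pipetting volumes. The sketch below assumes the conventional double-sided SPRI workflow: add beads at 0.7X to bind larger fragments, keep the supernatant, then add more beads to bring the cumulative ratio to 0.9X of the original sample volume. That workflow, the function name, and the example volumes are assumptions, not statements from the protocol.

```python
def double_sided_bead_volumes(sample_ul, first_ratio=0.7, final_ratio=0.9):
    """Bead volumes (uL) for an assumed double-sided size selection.

    first_ul:  beads added to the sample (ratio first_ratio).
    second_ul: extra beads added to the supernatant so the cumulative
               bead-to-original-sample ratio reaches final_ratio.
    """
    if not 0 < first_ratio < final_ratio:
        raise ValueError("ratios must satisfy 0 < first_ratio < final_ratio")
    first_ul = sample_ul * first_ratio
    second_ul = sample_ul * (final_ratio - first_ratio)
    return round(first_ul, 2), round(second_ul, 2)


if __name__ == "__main__":
    # Pooled PCR 2 volumes from the table above: 150 uL (RNA), 75 uL (DNA).
    print(double_sided_bead_volumes(150.0))
    print(double_sided_bead_volumes(75.0))
```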
Sequence your samples
The RNA and DNA library samples are pooled together in a 2:1 concentration ratio.
Use NovaSeq SP with custom cycles of 35 x 8 x 0, 20% PhiX, and custom sequencing primers.

Sequencing Primers:
Index primer: TCGGCAGTTGGGAAGAGCATAGTCGTAGAGCACGC
R1 primer: CCAAGAAGGGCGGCAAGATCGCCGTGTAATAATTCTAGA
Primers:
Primer Name | Sequence
Lib_Hand_RT | ATGCTCTTCCCAACTGCCGACGGGGAGTGTACTAGT
Lib_Hand | TGCTCTTCCCAACTGCCGA
Lib_Seq_Luc_R | TACAACCGCCAAGAAGCTGC
P5_Seq_Luc_F | AATGATACGGCGACCACCGAGATCTACACTACAACCGCCAAGAAGCTGC
P7_Ind_#_Han | CAAGCAGAAGACGGCATACGAGATNNNNNNNNGCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA
Index primer | TCGGCAGTTGGGAAGAGCATAGTCGTAGAGCACGC
R1 primer | CCAAGAAGGGCGGCAAGATCGCCGTGTAATAATTCTAGA
P7_Ind_#_Han Index primers:
Index Primer Name | Index Sequence in Primer 5'-3' | Index Sequence after Sequencing 5'-3' (for demultiplexing) | Nextera Name | Full Primer Sequence
P7_Ind_1_Han | TCGCCTTA | TAAGGCGA | [H/N]701 | CAAGCAGAAGACGGCATACGAGATTCGCCTTAGCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA
P7_Ind_2_Han | CTAGTACG | CGTACTAG | [H/N]702 | CAAGCAGAAGACGGCATACGAGATCTAGTACGGCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA
P7_Ind_3_Han | TTCTGCCT | AGGCAGAA | [H/N]703 | CAAGCAGAAGACGGCATACGAGATTTCTGCCTGCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA
P7_Ind_4_Han | GCTCAGGA | TCCTGAGC | [H/N]704 | CAAGCAGAAGACGGCATACGAGATGCTCAGGAGCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA
P7_Ind_5_Han | AGGAGTCC | GGACTCCT | [H/N]705 | CAAGCAGAAGACGGCATACGAGATAGGAGTCCGCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA
P7_Ind_6_Han | CATGCCTA | TAGGCATG | [H/N]706 | CAAGCAGAAGACGGCATACGAGATCATGCCTAGCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA
P7_Ind_7_Han | GTAGAGAG | CTCTCTAC | [H/N]707 | CAAGCAGAAGACGGCATACGAGATGTAGAGAGGCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA
P7_Ind_8_Han | CCTCTCTG | CAGAGAGG | [H/N]708 | CAAGCAGAAGACGGCATACGAGATCCTCTCTGGCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA
P7_Ind_9_Han | AGCGTAGC | GCTACGCT | [H/N]709 | CAAGCAGAAGACGGCATACGAGATAGCGTAGCGCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA
P7_Ind_10_Han | CAGCCTCG | CGAGGCTG | [H/N]710 | CAAGCAGAAGACGGCATACGAGATCAGCCTCGGCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA
P7_Ind_11_Han | TGCCTCTT | AAGAGGCA | [H/N]711 | CAAGCAGAAGACGGCATACGAGATTGCCTCTTGCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA
P7_Ind_12_Han | TCCTCTAC | GTAGAGGA | [H/N]712 | CAAGCAGAAGACGGCATACGAGATTCCTCTACGCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA
P7_Ind_13_Han | TCATGAGC | GCTCATGA | [H/N]714 | CAAGCAGAAGACGGCATACGAGATTCATGAGCGCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA
P7_Ind_14_Han | CCTGAGAT | ATCTCAGG | [H/N]715 | CAAGCAGAAGACGGCATACGAGATCCTGAGATGCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA
P7_Ind_15_Han | TAGCGAGT | ACTCGCTA | [H/N]716 | CAAGCAGAAGACGGCATACGAGATTAGCGAGTGCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA
P7_Ind_16_Han | GTAGCTCC | GGAGCTAC | [H/N]718 | CAAGCAGAAGACGGCATACGAGATGTAGCTCCGCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA
P7_Ind_17_Han | TACTACGC | GCGTAGTA | [H/N]719 | CAAGCAGAAGACGGCATACGAGATTACTACGCGCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA
P7_Ind_18_Han | AGGCTCCG | CGGAGCCT | [H/N]720 | CAAGCAGAAGACGGCATACGAGATAGGCTCCGGCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA
P7_Ind_19_Han | GCAGCGTA | TACGCTGC | [H/N]721 | CAAGCAGAAGACGGCATACGAGATGCAGCGTAGCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA
P7_Ind_20_Han | CTGCGCAT | ATGCGCAG | [H/N]722 | CAAGCAGAAGACGGCATACGAGATCTGCGCATGCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA
P7_Ind_21_Han | GAGCGCTA | TAGCGCTC | N723 | CAAGCAGAAGACGGCATACGAGATGAGCGCTAGCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA
P7_Ind_22_Han | CGCTCAGT | ACTGAGCG | N724 | CAAGCAGAAGACGGCATACGAGATCGCTCAGTGCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA
P7_Ind_23_Han | GTCTTAGG | CCTAAGAC | N726 | CAAGCAGAAGACGGCATACGAGATGTCTTAGGGCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA
P7_Ind_24_Han | ACTGATCG | CGATCAGT | N727 | CAAGCAGAAGACGGCATACGAGATACTGATCGGCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA
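In the index table, the sequence read during sequencing is the reverse complement of the index as written in the primer, and each full primer is the P7_Ind_#_Han template with the eight N positions replaced by the index. The sketch below checks both relationships so that a sample sheet can be verified before demultiplexing. The function names are hypothetical; the sequences are copied from the tables above.

```python
# P7_Ind_#_Han template from the primer table; N marks the 8 bp index.
P7_TEMPLATE = (
    "CAAGCAGAAGACGGCATACGAGAT" "NNNNNNNN" "GCGTGCTCTACGACTATGCTCTTCCCAACTGCCGA"
)
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def build_p7_primer(index_in_primer):
    """Insert an 8 bp index into the P7_Ind_#_Han template."""
    if len(index_in_primer) != 8:
        raise ValueError("index must be 8 bp long")
    return P7_TEMPLATE.replace("N" * 8, index_in_primer.upper())


if __name__ == "__main__":
    # First three rows of the index table: (index in primer, index after sequencing).
    rows = [("TCGCCTTA", "TAAGGCGA"), ("CTAGTACG", "CGTACTAG"), ("TTCTGCCT", "AGGCAGAA")]
    for in_primer, after_seq in rows:
        assert reverse_complement(in_primer) == after_seq
        print(in_primer, "->", after_seq, build_p7_primer(in_primer))
```

The same helper shows that the custom index read primer is the reverse complement of the constant region that follows the index in the P7 primer, which is consistent with the index being read from that primer.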
Protocol references
Nana Matoba*, Jessica McAfee*, Oleh Krupa, Jess Bell, Brandon D. Le, Jordan M. Valone, Gregory E. Crawford, Hyejung Won, Jason L. Stein. Massively parallel assessment of gene regulatory activity at human cortical structure associated variants. Biorxiv. (2025) https://doi.org/10.1101/2025.02.08.635393 *Co-first authors

Sool Lee*, Jessica C. McAfee*, Jiseok Lee*, Alejandro Gomez, Austin T. Ledford, Declan Clarke, Hyunggyu Min, Mark B. Gerstein, Alan P. Boyle, Patrick F. Sullivan, Adriana S. Beltran, Hyejung Won. Massively parallel reporter assay investigates shared genetic variants of eight psychiatric disorders. Cell. (2024) https://doi.org/10.1016/j.cell.2024.12.022 *Co-first authors
Acknowledgements
We thank members of the Won lab, Dr. Michael Love, Ariana Marquez Gonzalez, and Rachel Sharp for helpful discussions and comments about this paper. This research was supported by the NIH New Innovator Award from the NIMH (DP2MH122403, H.W.), the IGVF Consortium (UM1HG012003, H.W. and U01HG011952, A.P.B.), the PsychENCODE Consortium (R01MH122509, H.W.), NICHD (T32HD040127, J.L.), NIGMS (5T32GM067553, S.L. and 5T32GM135128, J.C.M. and A.G.), and Genomics of ASD: Pathways to Genetic Therapies award from the Simons Foundation Autism Research Initiative (H.W.). We also acknowledge the technical support from the UNC High Throughput Sequencing Facility (University Cancer Research Fund, Comprehensive Cancer Center Core support grant [P30CA016086], and UNC Center for Mental Health and Susceptibility grant [P30ES010126]), as well as UNC Advanced Analytics Core (Center for Gastrointestinal Biology and Disease grant [P30DK034987]), and UNC Lenti-shRNA Core. Lastly, we would like to send our utmost gratitude toward our pets—Achilles, Daisy, Nahla, and Jiji—for their unconditional love and moral support.