WORKFLOW FOR THE NUCLEIC ACID BASED IDENTIFICATION OF INSECTS USING WHOLE GENOME AMPLIFICATION AND NANOPORE SEQUENCING - KAWA

Jürg E Frey; Beatrice Frey; Daniel Frei; Morgan Gueuning; Simon Blaser; Andreas Bühlmann

Feb 06, 2022

Version 1

DOI

dx.doi.org/10.17504/protocols.io.bwpepdje

Jürg E Frey¹,
Beatrice Frey²,
Daniel Frei²,³,
Morgan Gueuning²,⁴,
Simon Blaser⁵,
Andreas Bühlmann⁶

¹Agroscope;
²Agroscope, Department of Method Development and Analytics, Research Group Molecular Diagnostics, Genomics and Bioinformatics, Wädenswil, Switzerland;
³Current address: Qiagen Instruments AG, Hombrechtikon, Switzerland;
⁴Current address: Department of Research and Development, Blood Transfusion Service Zurich, Swiss Red Cross, Schlieren, Switzerland;
⁵Agroscope, Department of Plants and Plant Products, Agroscope Phytosanitary Service;
⁶Agroscope, Department of Plants and Plant Products, Research Group Product Quality and Innovation

Protocol Citation: Jürg E Frey, Beatrice Frey, Daniel Frei, Morgan Gueuning, Simon Blaser, Andreas Bühlmann 2022. WORKFLOW FOR THE NUCLEIC ACID BASED IDENTIFICATION OF INSECTS USING WHOLE GENOME AMPLIFICATION AND NANOPORE SEQUENCING - KAWA. protocols.io https://dx.doi.org/10.17504/protocols.io.bwpepdje

License: This is an open access protocol distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited

Protocol status: Working

We use this protocol and it’s working

Created: July 18, 2021

Last Modified: February 08, 2022

Protocol Integer ID: 51654

Keywords: insects, identification of insects, whole genome amplification, nanopore sequencing, nucleic acid

Abstract

Note
This protocol uses proteinase K containing KAWA buffer for DNA extraction.
An alternative version that uses the Monarch® Genomic DNA Purification Kit for DNA extraction and purification is available as a separate protocol: "WORKFLOW FOR THE NUCLEIC ACID BASED IDENTIFICATION OF INSECTS USING WHOLE GENOME AMPLIFICATION AND NANOPORE SEQUENCING - Monarch®" (created by Julia Rossmanith).


BACKGROUND
World-wide trade in plant material has increased dramatically over the past decades, and with it the risk of accidentally introducing plant pests and diseases. Rapid and accurate nucleic acid based identification of such quarantine organisms has become an important tool to minimize their dispersal.
Nucleic acid based identification exploits genetic diversity. A basic tenet holds that species generally do not interbreed and hence the level of genetic differentiation within species is generally lower than between species. For insects, DNA barcoding uses this pattern of genetic diversity with great success. A fragment of approximately 600 base pairs in the first half of the mitochondrially encoded cytochrome c oxidase I gene (COI) serves as a reference sequence to identify insects at the species level. As of June 2021, reference sequences for ca. 231’000 different insect species are deposited in the publicly available “Barcode of Life Database” BOLD (http://www.barcodinglife.org/). Another important database, which specifically targets phytosanitary purposes and holds DNA barcodes of vouchered reference specimens, is the EPPO Q-Bank (European and Mediterranean Plant Protection Organization, https://qbank.eppo.int/).

The methodology of DNA barcoding generally relies on PCR amplification of the diagnostic COI gene fragment using a pair of primers, for which the exact DNA sequence of the primer binding sites must be known. However, this information is not always available, for example for so far undescribed species or where genetic variation within a species affects the primer sites. Furthermore, although the COI marker sequence shows an impressive degree of differentiation among species, this does not hold for all species, and hence a number of important pest species cannot be differentiated based on this marker alone.

An ideal method for species identification should therefore obtain information from the best discriminating genetic region or from several genetic regions. Here, we describe a state-of-the-art method to achieve this task.

PURPOSE
The purpose of this workflow is to provide a generic method for genetic identification of potential insect quarantine species and of other especially dangerous pest species in support of the Swiss Federal Plant Protection Service. The method is marker independent and may be used with reference databases of any genetic fragment. It is based on whole genome amplification, followed by single strand nanopore sequencing and DNA barcoding based identification.

Attachments

LB_MM2_PcRefSeq_NrSe...

45KB

RefDB_BOLD59kGBCocc7...

40.7MB

Guidelines

SCOPE
This method is suitable for the qualitative identification of DNA (deoxyribonucleic acid) or cDNA (DNA reverse-transcribed from RNA, e.g. of viruses) of potentially all organisms. It has been tested against a broad taxonomic range of pest species. The workflow is designed to work with fresh, frozen and ethanol-preserved (EtOH; preferably 70%) samples. The workflow presented here is established for insect species identification, but it has also been applied successfully to the identification of fungi and bacteria using the appropriate reference databases.

DEFINITIONS & ABBREVIATION
DNA: Deoxyribonucleic acid  
PCR: Polymerase Chain Reaction  
SOP: Standard Operating Procedure  
UV: Ultraviolet  
WGA: Whole Genome Amplification 
CO1/COI: Mitochondrial cytochrome c oxidase 1 gene 
bp: Base pairs

PRINCIPLE OF THE METHOD
The workflow starts with nucleic acid extraction (DNA and/or RNA followed by cDNA production), followed by a WGA step of the extracted DNA/cDNA, then a clean-up step before producing the nanopore sequencing library which finally is loaded on the MinION nanopore sequencing device. 

The methods chosen for this workflow aim at flexibility of input material (gDNA, cDNA) and optimal output to enable multiple use of single flowcells. The workflow should in principle enable at least 10 individual runs of ca. 2-4 hours data collection on a single MinION flowcell. 

The resulting sequence data, ideally > 200’000 reads per individual sample, are loaded into the software ‘Geneious Prime’ and analyzed by a custom-made workflow using the proper reference library (see section "Materials"). 



7 Raw Data Processing and Analysis

7.1 Primary Data Acquisition and Basecalling

Primary data acquisition is performed with MinKNOW GUI v.4.2.8, the Oxford Nanopore Technologies (ONT) operating software that drives nanopore sequencing devices. Basecalling of the raw nanopore read data is required to generate the fastq sequence files needed for further analysis. Basecalling is time-consuming, so it is beneficial to run it on graphics processing units (GPUs). GPU basecalling generally requires access to a Linux-based operating system with sufficient memory and storage capacity. We use a Dell Precision 7920 Tower XCTO Base with 256 GB RAM and an Nvidia Quadro RTX6000 graphics card (24 GB) running Ubuntu 18.04. For basecalling we use Guppy v.4.5.4 on the GPU with a parameter set established by Miles Benton (https://gist.github.com/sirselim/2ebe2807112fae93809aa18f096dbb94#file-basecalling_notes-md), run from a terminal window:
/guppy_basecaller --disable_pings --input_path /{location_of_data_folder}/ --save_path /{location_of_save_data_folder}/ -c dna_r9.4.1_450bps_hac.cfg -x 'auto' --recursive --num_callers 4 --gpu_runners_per_device 8 --chunks_per_runner 1024 --chunk_size 1000

The process generates fastq data files and places all reads with a minimum quality into a folder named “passed”.
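The basecalling call above can also be assembled programmatically, which helps avoid typing errors in the long option list. The following is a minimal sketch in Python that only builds and prints the command; the function name and the two example paths are hypothetical, and the options are exactly those listed in the command above.

```python
import shlex


def build_guppy_command(input_dir, save_dir, executable="guppy_basecaller"):
    """Return the Guppy basecalling command as an argument list.

    The options mirror the parameter set quoted in this section;
    input_dir and save_dir are placeholders chosen by the user.
    """
    return [
        executable,
        "--disable_pings",
        "--input_path", input_dir,
        "--save_path", save_dir,
        "-c", "dna_r9.4.1_450bps_hac.cfg",
        "-x", "auto",
        "--recursive",
        "--num_callers", "4",
        "--gpu_runners_per_device", "8",
        "--chunks_per_runner", "1024",
        "--chunk_size", "1000",
    ]


# Hypothetical example paths, for illustration only.
cmd = build_guppy_command("/data/run01/fast5", "/data/run01/basecalled")
print(shlex.join(cmd))
# To start basecalling, the list could be passed to subprocess.run(cmd, check=True).
```

Passing the arguments as a list rather than as one string avoids shell quoting problems, for example with folder names that contain spaces.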

7.2 Data analysis

To store and analyze the basecalled fastq nanopore read sequences obtained by sequencers from Oxford Nanopore Technologies (MinION, PromethION, GridION, Flongle) we use the software Geneious Prime v.21.1.1 or greater.

We developed an automated workflow that combines several steps including mapping reads to a reference database (in our case containing ca. 600 base pairs 5’ of the mitochondrial cytochrome oxidase I gene of insects, downloaded from the ‘Barcoding of Life Database’ BOLD), establishing a consensus sequence and running a BLAST search on a local copy of the GenBank database. The BLAST results which generally enable species identification of the sample are stored in a sub-folder.

Open the Geneious Prime software by clicking on the corresponding icon 
In the left panel, go to your working folder and right-click to generate a new folder for storage and analysis of your new nanopore data files. We recommend using a new folder for each run and, within that folder, a sub-folder for each barcoded sample. The nanopore fastq data files are stored on the computer system running the software ‘MinKNOW’ (which performs nanopore sequencing on the MinION as mentioned above), in a folder with a name similar to ‘passed’, and have the extension ‘.fastq’. For each barcoded sample, copy the corresponding fastq files into the newly established data sub-folder in Geneious, for example via drag and drop.
Select the desired number of read packets (by default, 4000 reads per packet), ideally ca. 200’000 in total, by activating them in the right-hand window, then select “Workflows – LB_MM2_PcRefSeq_NrSeq_ret30_210212” from the Geneious Prime menu. If that workflow (incl. in Appendix) does not appear on the menu you have to import it first. 
In the new pop-up window, select the proper reference database. The default is the custom-made 59k-entry database (RefDB_BOLD59kGBCocc755_60625Seq_211213.fasta; incl. in Appendix), extracted from BOLD and modified to exclude duplicates and to maintain a minimum distance of 3% (i.e. at most 97% similarity) between branches. Upon confirming with ‘ok’, the workflow performs the following steps using the parameters indicated below:

    a) Mapper: Minimap2 v.2.17 with the following parameters: ‘Dissolve contigs and re-assemble’ selected; reference sequence: RefDB_BOLD59kGBCocc755_60625Seq_211213; data type: Oxford Nanopore; include secondary alignments; maximum secondary alignments per read: 5; minimum secondary to primary alignment score ratio: 0.8; no trimming (remove existing trim regions from sequences); and, under advanced options, an additional command line option: -t 8. Also, in the Results panel, select ‘Save assembly report’, ‘Save in sub-folder’ and ‘Save contigs’.
    b) Sort Documents: Field to sort by: ‘% of Ref Seq’; select ‘Reverse Sort’ and ‘After sorting, only keep at most 30 documents’.
    c) Mask Alignment: Eliminates alignment columns with <35% entries; in the Options panel, select ‘Expose no options’; in the Results panel, select ‘Save a copy with sites stripped’; in the What to mask or strip panel, select ‘Sites containing: Gaps (%) 35%’; in the All Operation Options panel, select ‘Don’t append ‘Stripped’ to sequence names’.
    d) Save Documents: Select ‘Save these documents as output from workflow’, ‘Save in subfolder called: {Folder Name}’, ‘Select these documents when the operation completes’, and ‘Continue’.
    e) Generate Consensus Sequence: Establishes a consensus sequence with a 0% majority threshold. Uses the following parameters: in the Options panel, select ‘Expose no options’; in the All Operation Options panel, select ‘Threshold: 0% - Majority’, ‘Ignore Gaps’, ‘Assign Quality Total’, ‘Trim to reference sequence’, and ‘Append text to name of alignment consensus sequence’.
    f) Save Documents: Select ‘Save these documents as output from workflow’, ‘Save in subfolder called: {Folder Name}’, ‘Select these documents when the operation completes’, and ‘Continue’.
    g) BLAST: Performs a BLAST search for the ‘contig consensus sequence’ on a local GenBank database copy using the program ‘Megablast’, places results into a ‘Hit table’, retrieves ‘Matching regions with annotations’, and saves the 10 best hits in a new folder. Uses the following parameters: in the Options panel, select ‘Expose no options’; in the All Operation Options panel, select ‘Nucleotide Query Option’, Database: ‘Nucleotide collection (nr/nt)’, Program: ‘Megablast’, Results: ‘Hit table’, Retrieve: ‘Matching region with annotations’, Maximum Hits: ‘10’. ‘Low Complexity Filter’ and ‘Mask for lookup table’ are also selected, with all other parameters left at their default values.
    h) Save Documents: Select ‘Save these documents as output from workflow’, ‘Save in subfolder called: {Folder Name}’, and ‘Branch from 2 Operations Ago’.
    i) Sort Documents: In the Options panel, select ‘Expose no options’; in the Options panel, select Field to sort by: ‘Bit-Score’, and select ‘After sorting, only keep at most 30 documents’.
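For readers who want to reproduce steps a) and b) outside Geneious Prime, the mapping settings can be expressed as a standalone minimap2 call. This is a hedged sketch, not the Geneious implementation: it assumes that the Geneious options correspond to the minimap2 flags `-x map-ont` (Oxford Nanopore data), `-N 5` (maximum secondary alignments), `-p 0.8` (minimum secondary-to-primary score ratio) and `-t 8` (threads). The reads file name, the function names and the example records are hypothetical.

```python
import shlex


def build_minimap2_command(reference_fasta, reads_fastq, threads=8):
    """Return a minimap2 call that approximates step a) of the workflow."""
    return [
        "minimap2",
        "-a",               # write alignments in SAM format
        "-x", "map-ont",    # preset for Oxford Nanopore reads
        "-N", "5",          # keep at most 5 secondary alignments per read
        "-p", "0.8",        # min secondary-to-primary alignment score ratio
        "-t", str(threads),
        reference_fasta,
        reads_fastq,
    ]


def keep_top(documents, key, limit=30):
    """Step b): sort in descending order by `key` and keep at most `limit`."""
    return sorted(documents, key=lambda doc: doc[key], reverse=True)[:limit]


cmd = build_minimap2_command(
    "RefDB_BOLD59kGBCocc755_60625Seq_211213.fasta",
    "barcode01.fastq",  # hypothetical reads file
)
print(shlex.join(cmd))

# Hypothetical contig records; only the sorting logic is illustrated here.
contigs = [{"name": f"contig{i}", "pct_of_ref_seq": float(i)} for i in range(40)]
best = keep_top(contigs, "pct_of_ref_seq")
print(len(best), best[0]["name"])
```

The output of such a call would still need the masking, consensus and BLAST steps c) to i), which are not covered by this sketch.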

The BLAST results in the new folder are sorted according to the highest Bit-Score for the hit which in most cases will present the highest % Pairwise Identity and the lowest probability value (E-Value) on top of the list. In addition, Geneious Prime adds other information for proper qualification of the BLAST hits. 

Generally, the results obtained with this workflow provide ‘% Pairwise Identity’ hits of >99% and hence strong evidence for an unambiguous species identification of the sample for which the consensus sequence was established.


7.3 Additional Resources

Geneious Prime: Training videos and other resources for all steps outlined above for the software Geneious Prime are available on the Geneious homepage: Resources | Geneious Prime 
Barcoding of Life Database: The Barcode of life systems page (http://www.boldsystems.org/) is an excellent resource for barcoding based identifications and it allows searching their database with the proper marker sequence (e.g., COI for insects and ITS for fungi; see detailed instructions below).


8 Barcode-based identification on the NCBI GenBank Database 

As an alternative to the identification using local BLAST implemented in the Geneious Prime workflow, the consensus sequence may also be BLASTed directly on the GenBank Database of NCBI:

Open the BLAST homepage of NCBI (https://blast.ncbi.nlm.nih.gov/Blast.cgi) and choose the web BLAST option ‘Nucleotide BLAST’.
Paste your consensus sequence into the window ‘Enter Query Sequence’.
Check that the following parameters are chosen:
    a. Under ‘Choose Search Set’, ‘Database’: choose ‘Standard’ and ‘Nucleotide collection (nr/nt)’ in the dropdown menu; leave the other fields empty.
    b. Under ‘Program selection’, ‘Optimize for’: choose ‘Highly similar sequences (megablast)’.
    c. Select ‘Show results in a new window’ before you select ‘BLAST’ at the bottom left of the web page.
    d. It may take a short while before GenBank returns the BLAST results in a new window.
Criteria for correct species allocation are, among others:
    a. The top 10 or more entries carry the same species name (note, however, that synonymous names exist for many insects).
    b. The identity of the query sequence with the GenBank entry (column ‘Ident’) is >97%.
    c. The alignment between your consensus sequence and the GenBank entry covers >80% of the sequence (at least 300 bp).
    d. The E-value is very close to 0.
Note
GenBank does contain erroneous entries that may, for example, originate from an error in allocating the sample to the correct species based on morphological characters. Such cases are easy to identify if there is only a single such entry in a long list of identical species names.
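The allocation criteria listed above can be written down as a simple check on an exported hit table. The sketch below is illustrative only: the `BlastHit` fields, the function name and the example hits are hypothetical, and the E-value cut-off of 1e-50 is an assumed stand-in for "very close to 0". Synonymous species names and isolated erroneous entries still require manual review.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    species: str         # species name of the database entry
    identity_pct: float  # 'Ident' column, in percent
    coverage_pct: float  # share of the query covered by the alignment, in percent
    aligned_bp: int      # alignment length in base pairs
    evalue: float        # BLAST E-value


def check_species_call(hits, top_n=10, min_identity=97.0, min_coverage=80.0,
                       min_aligned_bp=300, max_evalue=1e-50):
    """Return (ok, problems) for a list of hits sorted best-first."""
    problems = []
    top = hits[:top_n]
    if len(top) < top_n:
        problems.append(f"only {len(top)} hits, expected at least {top_n}")
    if len({hit.species for hit in top}) > 1:
        problems.append("top hits name more than one species (check synonyms)")
    if top:
        best = top[0]
        if best.identity_pct <= min_identity:
            problems.append("identity of best hit is not above threshold")
        if best.coverage_pct <= min_coverage or best.aligned_bp < min_aligned_bp:
            problems.append("coverage of best hit is too low")
        if best.evalue > max_evalue:
            problems.append("E-value of best hit is not close to 0")
    return (not problems, problems)


# Made-up example data: ten identical hits for one species.
hits = [BlastHit("Drosophila suzukii", 99.4, 98.0, 640, 0.0) for _ in range(10)]
ok, problems = check_species_call(hits)
print(ok, problems)
```

Returning the list of failed criteria, rather than only a yes/no answer, makes it easier to see why a sample needs a closer look.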


9 Barcode-based identification on the Barcode of Life Database

In order to identify what species your consensus, full-length COI sequence originates from it is necessary to utilize reference databases. One example for such reference database is the Barcode of Life Database (BOLD). This database includes a comprehensive set of COI sequence data that has been collected by individuals and organizations across the globe and is constantly being updated with new data. 

Start by navigating to the BOLD Systems webpage (http://www.boldsystems.org/) and select the “Identification” tab at the top of the webpage. 
Using the default settings, select the “Animal Identification (COI)” tab (for arthropod identification) and the “Species Level Barcode Records” database. Paste the consensus sequence obtained from the sample into the search box at the bottom of the page. 
The browser will eventually update and return the results of the search, revealing the records contained in the database that yield the closest match in terms of sequence similarity. The result screen contains a lot of information that may be explored to establish a confident identification for the query sequence. Often the search results will all originate from a single species allowing an unambiguous identification to be made for the sample. In some cases, two or more species are returned, prompting BOLD to display the message “A species match could not be made, the queried specimen is likely to be one of the following”. This may, for example, happen when highly similar reference sequences were entered with different names (synonyms) for the same species.


10 Barcode-based identification on the EPPO-Q-Bank Database 

Open the EPPO-Q-Bank homepage (https://qbank.eppo.int/blast?db=arthropods), check that ‘Arthropods’ are selected and paste your consensus sequence into the window ‘Paste sequence to align:’ and select ‘Start alignment’ 
EPPO-Q-Bank will return its results in the lower part of the same window: 
The percent identity of your query sequence to the best match entries in the EPPO-Q-Bank Database is indicated in the column ‘Similarity %’, the percent overlap of both sequences in the column ‘Overlap %’. If both values are close to 100% then the species allocation is reliable.
Note
The three databases mentioned above are basically independent, although there is much overlap between GenBank and BOLD. This means that not all species are represented in all databases but rather in only one or two of them. Furthermore, the set of individual references for a species is mostly different in each database. Therefore, BLAST results will generally differ among the databases.

Materials

MATERIALS & EQUIPMENT

The sections below report all the equipment and materials required to apply this protocol. N.B. Batch numbers of kits used must be recorded.

Water
General use: Double-distilled water, preferably from a Milli-Q ultraclean water device 
PCR procedures: Sterile, DNase-, RNase- and protease-free water, e.g. Fisher Scientific DNA free water, product code: BPE2470-1.
Reagent: Water DNA Grade DNASE Protease free, Fisher Scientific, Catalog #BP24701

Solutions, standards and reference materials
Solutions: All solutions should be of molecular grade purity and only be used to the expiry date indicated on the package. Repeated freeze-thaw cycles should be avoided. Pipetting should be performed with utmost care using filter tips to avoid contamination. Where appropriate aliquots may be used to minimize contamination.

Standards: All standards should only be used to the expiry date indicated on the package. Repeated freeze-thaw cycles should be avoided. Pipetting should be performed with utmost care using filter tips to avoid contamination.

Reference materials: The reference library may be established with any collection of nucleic acid sequences that are useful to discriminate among the taxa of interest. For example, in the case of insects, the reference library is composed of all insect standard barcode entries (a 648 base-pair region of the mitochondrial cytochrome c oxidase 1 gene, “CO1” or “COI”) downloaded from the ‘Barcoding of Life Database’ (http://www.barcodinglife.org/; downloaded in May 2021), with identical entries removed and a minimum difference between tree tips of 3%. This was done with the utility “Dedupe” from BBTools (DOE Joint Genome Institute; as implemented in Geneious Prime), which removes duplicate entries and all sequences with a similarity of >97% to each other.

The present SOP was successfully used, partly with minor variations, on a total of 67 samples covering 14 insect families and 26 species.

Commercial kits
Nucleic acid extraction: The method described here uses a custom made proteinase K buffer and mechanical disruption for DNA extraction. However, it has been shown to work well using the following commercial kits: ‘GenElute™ Mammalian Genomic DNA Miniprep Kit’ (Sigma-Aldrich Chemie GmbH, Buchs, Switzerland; Product code G1N350); the ‘DNeasy Blood & Tissue Kit’ (QIAGEN AG, Basel, Switzerland; Product code 69506); and the ‘Monarch® Genomic DNA Purification Kit’ (Bioconcept AG, Allschwil, Switzerland; Product code NEB T3010S). DNA extraction kits from other suppliers must be shown to be appropriate before use.
Reagent: GenElute™ Mammalian Genomic DNA Miniprep Kit, Sigma Aldrich, Catalog #G1N350
Reagent: DNeasy Blood & Tissue Kits, Qiagen, Catalog #69506
Reagent: Monarch® Genomic DNA Purification Kit, New England Biolabs, Catalog #T3010S

Whole Genome Amplification (WGA): GenomePlex® Complete Whole Genome Amplification Kit WGA2 (Sigma-Aldrich Chemie GmbH, Buchs, Switzerland; Product code WGA2-50RXN)
Reagent: GenomePlex® Whole Genome Amplification (WGA) Kit, Sigma Aldrich, Catalog #WGA2-50RXN

cDNA Production: LunaScript® RT SuperMix Kit (Bioconcept AG, Allschwil, Switzerland; Product code NEB E3010S)
Reagent: LunaScript® RT SuperMix Kit, New England Biolabs, Catalog #E3010S

Nanopore Sequencing by MinION: Amplicons by Ligation (SQK-LSK109) with native barcode ligation: Ligation Sequencing Kit SQK-LSK109 with Native Barcoding Expansion 1-12 Kit (EXP-NBD104).
Reagent: Native Barcoding Expansion 1-12 (PCR-free), Oxford Nanopore Technologies, Catalog #EXP-NBD104
NEBNext® Companion Module for Oxford Nanopore Technologies® Ligation Sequencing (Bioconcept AG, Allschwil, Switzerland; Product code NEB E7180S)
Reagent: NEBNext® Companion Module for Oxford Nanopore Technologies® Ligation Sequencing, New England Biolabs, Catalog #E7180S

Reaction Cleanup: MinElute Reaction Cleanup Kit of Qiagen (QIAGEN AG, Basel, Switzerland; Product code 28206)
Reagent: MinElute Reaction Cleanup Kit, Qiagen, Catalog #28206

Plastic ware and other disposable material
Note
It is essential that all plastic-ware is sterile before use.

| Item | Detail | Example supplier | Product code / Remarks |
|---|---|---|---|
| Pipette tips (filtered) | 10, 100, 200 & 1000 μl | Fisher Scientific | |
| PCR tubes | single, strip or 96-well | Fisher Scientific | |
| Eppendorf tubes | 0.2, 0.5, 1.5 ml | Fisher Scientific | lowBind |
| Qubit assay tubes | | Invitrogen (Fisher Scientific) | lowBind |
| SeqStudio Microplates | MicroAmp Optical 96 well Reaction Plate | Applied Biosystems by life technologies (Fisher Scientific) | |
| Retsch Collection Microtubes | 2 ml | Qiagen | |
| Retsch Beads (stainless steel grinding beads) | 3 mm | Fisher Scientific | 11758414 |

Equipment
The following items of equipment are required to undertake the analysis. Several alternative suppliers/models are available for each item. These must be shown to be appropriate before use. 
| Item | Detail | Example supplier | Product code / Remarks |
|---|---|---|---|
| Precision pipettes | 1-1000 µl | Fisher Scientific | |
| Bench top vortex | | Labnet | VX-100 |
| Thermocycler | SensoQuest | Witec | |
| Thermal mixer | to hold 1.5 ml tubes | Eppendorf | 5355 |
| DNA quantifier - photospectrometric | Accurate to +/- 1 ng | Witec (Thermo Scientific) | Nanodrop ONE |
| DNA quantifier - fluorometric | | Invitrogen (Fisher Scientific) | QuBit 3 |
| Microcentrifuge | to hold 1.5 ml tubes | Eppendorf | 5452 |
| Desktop centrifuge with microwell plate carrier | | Sigma | 4-15C |
| Power supply for electrophoresis | | BioRad | 1000/500 |
| Electrophoresis equipment (trays, combs) | | BioRad | |
| UV documentation system | | Witec | E-Box |
| Vacuum pump for microwell plate 96 cleanup steps | ≥20 inHg vacuum | Millipore | |
| Sequencing device | MinION FLO-Min106D | Oxford Nanopore Technologies ONT | |
| Sequencing flowcell | R.9.4.1 | Oxford Nanopore Technologies ONT | |
| Laminar flow hood | With UV light | SKAN | MonMouth VFLT1000 |
| Tissue homogenizer with adaptors | TissueLyser II | Qiagen | |
| Magnet | MagJET Separation | Thermo Scientific™ (Thermo Fisher Scientific) | 15265126 |
| Tube rotator (HULA mixer) | RotoFlex® Plus bench top tube rotator | Merck (Sigma-Aldrich) | Z740290 |

Other materials
Disposable plastic gloves, sterile dissection equipment. 

Electronic files / computer software
The software Geneious Prime (Biomatters; https://www.geneious.com/prime/) v.21.1.1 or above with a single user license.
Internet access is required to use NCBI’s BLAST (Basic Local Alignment Search Tool; nih.gov). Alternatively, a local installation of a BLAST database is required (this may be set up with assistance from Geneious Prime after downloading the necessary nt files from the GenBank site).

Protocol materials

Reagent: Water DNA Grade DNASE Protease free, Fisher Scientific, Catalog #BP24701
Reagent: Monarch® Genomic DNA Purification Kit, New England Biolabs, Catalog #T3010S
Reagent: GenomePlex® Whole Genome Amplification (WGA) Kit, Merck MilliporeSigma (Sigma-Aldrich), Catalog #WGA2-50RXN
Reagent: LunaScript® RT SuperMix Kit, New England Biolabs, Catalog #E3010S
Reagent: GenElute™ Mammalian Genomic DNA Miniprep Kit, Merck MilliporeSigma (Sigma-Aldrich), Catalog #G1N350
Reagent: DNeasy Blood & Tissue Kits, Qiagen, Catalog #69506
Reagent: MinElute Reaction Cleanup Kit, Qiagen, Catalog #28206
Reagent: Native Barcoding Expansion 1-12 (PCR-free), Oxford Nanopore Technologies, Catalog #EXP-NBD104
Reagent: NEBNext® Companion Module for Oxford Nanopore Technologies® Ligation Sequencing, New England Biolabs, Catalog #E7180S
Reagent: QIAGEN Protease and Proteinase K, Qiagen, Catalog #19131
Reagent: GenElute™ Total RNA Purification Kit, Merck MilliporeSigma (Sigma-Aldrich), Catalog #RNB100
Reagent: Ligation Sequencing Kit, Oxford Nanopore Technologies, Catalog #SQK-LSK109

Safety warnings

For hazard information and safety warnings, please refer to the SDS (Safety Data Sheet).

Before start

Note
It is essential to wear disposable plastic gloves during all laboratory procedures and to use pipette tips that are sterile and fitted with filters.

1 Sample preparation

All samples should be stored frozen at -20 °C or in 70 % EtOH (for a few days at room temperature or at 4 °C in the refrigerator) until processed. Samples can be stored frozen indefinitely. Use sterile dissection equipment where appropriate. Because we mostly use crude DNA extracts, with or without further cleanup, for the WGA step, the amount of proteinase K buffer solution needs to be matched to the tissue size. The following table lists empirically established conditions:

| Homogenization Buffer (μl) | Organism (or parts thereof) |
|---|---|
| 50 | Thrips larva 1st and 2nd instar; small whitefly larvae (1st or 2nd instar) or eggs; very small tissue sample |
| 100 | Thrips adult or 3rd instar; large whitefly larvae (3rd instar) or pupae; tissue samples of corresponding size; antenna or tarsus (or part of leg) of adult tephritid fruit fly; 0.5x0.5x0.5 mm sample of insect larvae |
| 200 | Drosophila adult; 1x1x1 mm sample of larger insect larvae or of caterpillars |

2 DNA Extraction

Materials:  

The extraction is carried out with the proteinase K containing KAWA buffer (section 2.1 / step 3). We use Qiagen Proteinase K, 20 mg/ml (QIAGEN AG, Basel, Switzerland; Product code 19131).
Reagent: QIAGEN Protease and Proteinase K, Qiagen, Catalog #19131

Alternatively, a commercial kit may be used as mentioned in Section Materials, following the manufacturer’s  protocols. It is recommended that the manufacturer’s guidelines are checked each time kits are ordered to ensure any updates/changes made since development of this SOP are incorporated. 

2.1 Fast Extraction by Proteinase K Buffer (KAWA extraction)

20m

Add 100-200 µL KAWA Buffer *) to Collection Microtubes (Qiagen).

*) KAWA Buffer: 10 mM Tris HCl, 1 mM EDTA, 0.5 % Tween 20, 50 μg/ml Proteinase K, pH 8.0.

Preparation of 100 ml KAWA Buffer:

| Nr | Amount | Material |
|---|---|---|
| 1 | 0.121 g | Tris Base |
| 2 | | adjust to pH 8.0 with 1M HCl (ca. 0.5 ml) |
| 3 | 0.037 g | EDTA |
| 4 | 0.5 ml | Tween 20 |
| 5 | 5 mg (250 µl of Qiagen Proteinase K, 20 mg/ml) | Proteinase K |

Aliquot in 1.5 ml tubes. 

Store at -20 °C.
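The amounts in the recipe follow directly from the target concentrations of the KAWA Buffer, so the recipe can be rescaled to other batch volumes. The sketch below reproduces the arithmetic in Python. It assumes a molar mass of 121.14 g/mol for Tris base and 372.24 g/mol for EDTA, the latter corresponding to the disodium salt dihydrate; the protocol does not state the EDTA salt form, but this value matches the 0.037 g given above. The function and key names are hypothetical.

```python
TRIS_BASE_G_PER_MOL = 121.14
EDTA_G_PER_MOL = 372.24  # assumption: EDTA disodium salt dihydrate


def kawa_recipe(volume_ml, pk_stock_mg_per_ml=20.0):
    """Return component amounts for `volume_ml` of KAWA Buffer.

    Targets: 10 mM Tris, 1 mM EDTA, 0.5 % (v/v) Tween 20,
    50 µg/ml Proteinase K. The pH is adjusted separately with 1 M HCl.
    """
    volume_l = volume_ml / 1000.0
    pk_mg = 0.05 * volume_ml  # 50 µg/ml equals 0.05 mg/ml
    return {
        "tris_base_g": 0.010 * volume_l * TRIS_BASE_G_PER_MOL,
        "edta_g": 0.001 * volume_l * EDTA_G_PER_MOL,
        "tween20_ml": 0.005 * volume_ml,
        "proteinase_k_mg": pk_mg,
        "proteinase_k_stock_ul": pk_mg / pk_stock_mg_per_ml * 1000.0,
    }


recipe = kawa_recipe(100)
for name, amount in recipe.items():
    print(f"{name}: {amount:.3f}")
```

For a 100 ml batch the rounded values agree with the table above; a smaller batch, for example `kawa_recipe(25)`, scales all amounts linearly.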

Add a 3 mm stainless steel ball.

Add tissue sample.

Homogenize on the Retsch Mixer Mill (Qiagen) for 2 x 3 min at 25 Hz, turning the plate after the first period, then centrifuge.

Heat tubes at 95 °C for 20 min.

20m

Centrifuge, then transfer to 0.5 ml lowbind Eppendorf tubes.

Store frozen at -20 °C (avoid freeze-thaw cycles; homogenates may be stored for several weeks at 4 °C).

Note
Because the extract resulting from a KAWA extraction contains RNA and contaminating proteins, its DNA content cannot be accurately quantified. In our experience, using 1 µl of KAWA extract for WGA works in the vast majority of cases. However, we also found that using a cleanup kit, such as the Qiagen MinElute Reaction Cleanup Kit, after KAWA extraction (see section 2.2) leads to a better balance between different extractions in the final sequencing libraries.

2.2 Reaction cleanup

Materials:  

We use the MinElute Reaction Cleanup Kit of Qiagen (QIAGEN AG, Basel, Switzerland; Product code 28206). 
Reagent: MinElute Reaction Cleanup Kit, Qiagen, Catalog #28206

A cleanup step using the MinElute Reaction Cleanup Kit of Qiagen (QIAGEN AG, Basel, Switzerland; Product code 28206) is recommended after the proteinase K extraction as this seems to reduce the variance in the number of sequencing reads between individual libraries in multiplex runs. 

This step is not necessary if using a commercial nucleic acid extraction kit. However, a reverse transcription step is required if using an RNA extraction kit. For example, using the GenElute™ Total RNA Purification Kit from Sigma-Aldrich (Merck, Sigma-Aldrich Chemie GmbH, Buchs, Switzerland; Product code: RNB100), the following cDNA production kit was successfully used: LunaScript® RT SuperMix Kit (Bioconcept AG, Allschwil, Switzerland; Product code NEB E3010S). Also, if RNA is extracted, it may be beneficial to omit the DNase step to maximize the yield of nucleic acids.
Reagent: GenElute™ Total RNA Purification Kit, Sigma Aldrich, Catalog #RNB100
Reagent: LunaScript RT SuperMix Kit, New England Biolabs, Catalog #E3010S

Procedure for reaction cleanup using the Qiagen MinElute Reaction Cleanup Kit:

Prepare the MinElute column setup on a 2 ml collection tube.

Add 300 µL Buffer ERC to 75 µL KAWA product.

Load the 375 µL of mixture onto the column setup.

Centrifuge for 1 min.
Discard the flow-through and re-assemble.

Add 750 µL Buffer PE to the column setup.

Centrifuge for 1 min.
Discard the flow-through and re-assemble.

Centrifuge for 2 min.
Place the MinElute column in a NEW 1.5 ml Eppendorf tube.

Add 12 µL Elution Buffer to the centre of the MinElute column.
Note
Deviation from supplier’s protocol.

Incubate for 1 min at room temperature.

Centrifuge for 1 min.
Note
DNA quantification at this step may be advisable if the expected yield is below 10 ng.

2.3 DNA extract quantification

DNA extracted with a commercial kit or after clean-up may be quantified to assess the extraction process and enable normalisation of DNA concentration. One common method is to use a Nanodrop ND 1000 Spectrophotometer. DNA should be diluted to 10-50ng/μl using DNA-free water. Negative controls should read ~0 ng/μl.
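Diluting an extract into the 10-50 ng/μl range uses the standard relation C1 x V1 = C2 x V2. The sketch below shows the calculation in Python; the function name and the example concentrations are hypothetical and not taken from the protocol.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul, final_volume_ul):
    """Return (extract_ul, water_ul) needed to reach the target concentration.

    Based on C1 * V1 = C2 * V2. Raises ValueError if the extract is already
    more dilute than the requested target.
    """
    if stock_ng_per_ul <= 0 or target_ng_per_ul <= 0 or final_volume_ul <= 0:
        raise ValueError("concentrations and volume must be positive")
    if target_ng_per_ul > stock_ng_per_ul:
        raise ValueError("target concentration exceeds extract concentration")
    extract_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return extract_ul, final_volume_ul - extract_ul


# Hypothetical example: a 120 ng/µl extract diluted to 30 ng/µl in 20 µl.
extract_ul, water_ul = dilution_volumes(120.0, 30.0, 20.0)
print(f"{extract_ul:.1f} µl extract + {water_ul:.1f} µl DNA-free water")
```

Extracts that already read below the target range are rejected by the function rather than silently returned undiluted, so that low-yield samples are noticed.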

Controls:  
A negative extraction control (with no tissue) should be run in parallel with all batches of sample extraction and quantified alongside all tissue extractions

3 Whole Genome Amplification (WGA)

Materials:  

We use the GenomePlex® Complete Whole Genome Amplification Kit WGA2 (Sigma-Aldrich  Chemie GmbH, Buchs, Switzerland; Product code WGA2-50RXN).
Reagent: GenomePlex® Complete Whole Genome Amplification (WGA) Kit, Sigma-Aldrich, Catalog #WGA2-50RXN

Procedure for whole genome amplification using the Sigma GenomePlex® Complete Whole Genome  Amplification Kit WGA2: 

3.1 WGA Step 1: Fragmentation

Run Thermocycler program (Program: incubation at 95 °C for 30 min) to ensure the Thermocycler is ready when needed.

Use DNA/cDNA sample: transfer 10 µL DNA (≥ 10 ng) from section 2.2/step 19 into new 8-strip microtubes.

Add 1 µL Fragmentation Buffer to each DNA tube from the previous step.

Heat for 4 min at 95 °C in the Thermocycler. Immediately cool on ice.
Note
Alternatively, a tabletop Mini Cooler may be used.

3.2 WGA Step 2: Library Preparation

Add 2 µL Library Preparation Buffer (green) to the DNA from the previous step.

Add 1 µL Library Stabilization Solution (yellow) to the DNA from the previous step. Vortex and centrifuge.

Heat for 2 min at 95 °C in the Thermocycler. Immediately cool on ice.

Add 1 µL Library Preparation Enzyme (red) to the DNA from the previous step. Vortex and centrifuge.

Run Thermocycler program with WGA Library Prep Rxn.

Program:
incubation at 16 °C for 20 min;
incubation at 24 °C for 20 min;
incubation at 37 °C for 20 min;
incubation at 75 °C for 5 min;
cool to and hold at 4 °C

3.3 WGA Step 3: Amplification

Add 48 µL MH2O to each reaction tube from the previous step (WGA Library Prep Rxn).

Add 7.5 µL Amplification Master Mix to each reaction tube from the previous step.

Add 5 µL WGA DNA Polymerase to each reaction tube from the previous step. Vortex and centrifuge.
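
The protocol adds these three components to each tube individually. When many samples are processed, some laboratories prefer to premix them; the sketch below scales the per-reaction volumes given above for such a premix. The premix approach itself, the function name and the 10% pipetting overage are assumptions, not part of the protocol.

```python
# Per-reaction volumes (µL) taken from the three steps above
AMPLIFICATION_MIX_UL = {
    "MH2O": 48.0,
    "Amplification Master Mix": 7.5,
    "WGA DNA Polymerase": 5.0,
}


def premix_volumes(n_reactions, overage=0.10):
    """Scale the per-reaction volumes to n reactions plus a pipetting overage."""
    if n_reactions < 1:
        raise ValueError("need at least one reaction")
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 1) for name, vol in AMPLIFICATION_MIX_UL.items()}


# Hypothetical example: one 8-strip of reactions with 10% overage
for name, vol in premix_volumes(8).items():
    print(f"{name}: {vol} µL")

# Volume of premix to dispense into each tube
per_tube_ul = sum(AMPLIFICATION_MIX_UL.values())
print(f"Dispense {per_tube_ul} µL per tube")
```

Dispensing 60.5 µL of premix onto the 15 µL library preparation reaction gives about 75 µL per tube, which matches the volume carried into the reaction cleanup.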

Run Thermocycler program.

Program:
Initial incubation: 3 min at 95 °C;
17 cycles of 15 s at 94 °C and 5 min at 65 °C;
cool to and hold at 4 °C
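
For planning bench time, the programmed run time of this cycling profile can be estimated from the step durations. The sketch below is a simple sum and assumes instantaneous temperature changes; real thermocyclers add ramp time, so the result is a lower bound. The function name is hypothetical.

```python
def program_minutes(initial_s, cycles, cycle_steps_s):
    """Programmed run time in minutes, ignoring ramping and the final hold."""
    return (initial_s + cycles * sum(cycle_steps_s)) / 60.0


# 3 min at 95 °C, then 17 cycles of 15 s at 94 °C + 5 min at 65 °C
total_min = program_minutes(initial_s=3 * 60, cycles=17, cycle_steps_s=[15, 5 * 60])
print(f"Amplification program: {total_min:.2f} min")  # 92.25 min
```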

Store short term at 4 °C, long term at -20 °C.

OPTIONAL: Check on gel: run 5 µL on a 1.4% TBE gel, with 4 µL marker and 4 µL loading dye, at 70 V for 30 min.

3.4 WGA Step 4: Reaction Cleanup

Prepare MinElute column setup on 2ml collection tube.

Add 300 µL Buffer ERC to 75 µL WGA product (the amplification reaction from section 3.3).

Load 375 µL of the mixture onto the column setup.

Centrifuge for 1 min.
Discard flow-through, re-assemble.

Add 750 µL Buffer PE to the column setup.

Centrifuge for 1 min.
Discard flow-through, re-assemble.

Centrifuge for 2 min.
Place the MinElute column in a NEW 1.5 ml Eppendorf tube.

Add 10 µL Elution Buffer to the centre of the MinElute column.
Note
Deviation from supplier’s protocol.

Incubate for 1 min at room temperature.

Centrifuge for 1 min.
Note
DNA quantification at this step may be advisable if the expected yield is below 10 ng.

4 WGA Product Check by Gel Electrophoresis

Gel electrophoresis of DNA in an agarose gel is a standard technique in molecular biology, but equipment, reagents, staining and visualization vary considerably between laboratories and according to local health & safety controls. Therefore, this SOP suggests general conditions that need to be adapted to each laboratory.

4.1 Make a 1.2% TBE agarose gel (1xTBE pH: 9.0) containing 0.0001% Ethidium Bromide*)

Place 80 mL 1xTBE buffer + 1 g agarose in a 500 ml Erlenmeyer flask.

Heat in a microwave at maximum intensity for 2 min, interrupting intermittently to shake (take care not to overheat).

Add 8 µL Ethidium Bromide*) and cast the gel.
Safety information
*) Ethidium Bromide is a carcinogenic chemical; use nitrile gloves and consult the safety regulations.

Wait 30 min at room temperature or store in the fridge at 4 °C.
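
If a different gel tray is used, the recipe above can be scaled linearly by buffer volume. The following sketch keeps the protocol's ratios (1 g agarose and 8 µL Ethidium Bromide per 80 mL 1xTBE); the function name and the 50 mL example are hypothetical.

```python
# Reference recipe from section 4.1
BASE_TBE_ML = 80.0
BASE_AGAROSE_G = 1.0
BASE_ETBR_UL = 8.0


def gel_recipe(volume_ml):
    """Scale the section 4.1 gel recipe to another buffer volume."""
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    scale = volume_ml / BASE_TBE_ML
    return {
        "tbe_ml": volume_ml,
        "agarose_g": BASE_AGAROSE_G * scale,
        "etbr_ul": BASE_ETBR_UL * scale,
        # weight/volume percentage implied by the reference recipe
        "agarose_percent_w_v": 100.0 * BASE_AGAROSE_G / BASE_TBE_ML,
    }


# Hypothetical example: a smaller 50 mL gel
print(gel_recipe(50.0))
```

Note that 1 g in 80 mL corresponds to 1.25% (w/v), close to the nominal 1.2% in the section heading.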

4.2 Gel loading and running

Prepare the Size Standard (e.g. Thermo Scientific™ GeneRuler DNA Ladder Mix, ready-to-use; Order Nr. 10181070) and samples (whole genome amplification products) for gel loading by mixing 3 µL of each sample and of the Size Standard with 3 µL of loading buffer, prepared as follows:

Loading Buffer Preparation for PCR Amplification Product Electrophoresis (10 ml):

Add 1.5 g Ficoll 400 to 10 mL 1xTBE, adjust to pH 9.0.

Add 5 mg Bromophenol Blue (adjust amount visually, may be too high).

Carefully pipette the 6 µL loading mix into individual wells of the gel, beginning with the Size Standard at the leftmost well.

Run at 70 V for approximately 30 min (depending on size of gel), ensuring the DNA does not run off the gel.

4.3 Visualization

Visualize your DNA fragments in UV light (with appropriate safety precautions); if the WGA reaction has been successful it shows as a smear of approximately 400 - 1000 base pairs in length. Your negative controls should not contain bands.

Note
If no WGA amplification signal is obtained after several attempts, it may be advisable to run a positive control using a previously successful PCR. In rare cases more DNA extract may be needed. Alternatively, there may be inhibitors for the PCR in the crude extract, such as in aphids, where the high sugar content inhibits PCR. In such cases, the crude extract needs to be cleaned up with a commercial kit such as the Sigma ‘GenElute™ Mammalian Genomic DNA Miniprep Kit’.

4.4 Recording

Keep a permanent record of your gel (electronic and/or hard copy) as proof that the WGA reaction was successful and contaminant free.

5 Sequencing Library Preparation

Materials:  

The library for nanopore sequencing is produced with the Ligation Sequencing Kit SQK-LSK109 of Oxford Nanopore Technologies for sequencing on the flow cell type R9.4.1 (flow cell product code: FLO-MIN106D), following the manufacturer’s recommendations with some minor modifications.
Reagent: Ligation Sequencing Kit, Oxford Nanopore Technologies, Catalog #SQK-LSK109

5.1 Library Preparation Step 1: DNA End-Prep

Transfer 120 ng total DNA of section 3 into new 8-strip Microtubes. 

Add MH2O to a total of 54 µL.
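
The volume of WGA product that contains 120 ng, and the water needed to reach 54 µL, follow directly from the measured concentration. The sketch below illustrates this; the function name and the 40 ng/µl reading are hypothetical.

```python
def end_prep_input(conc_ng_per_ul, target_ng=120.0, total_ul=54.0):
    """Return (dna_ul, water_ul) delivering target_ng of DNA in total_ul."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    dna_ul = target_ng / conc_ng_per_ul
    if dna_ul > total_ul:
        # The sample is too dilute to supply the target mass in this volume
        raise ValueError("sample too dilute for the requested input")
    return dna_ul, total_ul - dna_ul


# Hypothetical example: cleaned-up WGA product measured at 40 ng/µl
dna_ul, water_ul = end_prep_input(40.0)
print(f"{dna_ul:.1f} µL DNA + {water_ul:.1f} µL MH2O")
```

With the default values, any sample below about 2.2 ng/µl (120 ng / 54 µL) cannot supply the full input and triggers the error.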

Add 3.5 µL Ultra II End-prep reaction buffer.

Add 3 µL Ultra II End-prep enzyme mix.

Mix by flicking, spin down.

Run Thermocycler program.

Program:
incubate 30 min at 20 °C, then 20 min at 65 °C;
transfer contents to a 1.5 ml Eppendorf tube.

Add 60 µL AMPure XP beads (resuspended). Mix by flicking the tube.

Incubate for 10 min at room temperature on a HULA mixer.

Spin down and pellet on magnet until clear. Pipette off supernatant, keep on magnet.

Wash beads on magnet with 200 µL 70% EtOH (fresh). Pipette off supernatant, do not disturb pellet.

Wash beads on magnet with 200 µL 70% EtOH (fresh). Pipette off supernatant, do not disturb pellet.

On magnet, pipette off residual EtOH.

Dry for 30 s.

Resuspend in 61 µL nuclease-free water.

Incubate for 5 min at room temperature.

Pellet on magnet until clear.

Collect 61 µL eluate. May be stored overnight at 4 °C.

OPTIONAL: Quantify 1 µL of product on Qubit. Note DNA concentration (ng/μl).

5.2 Library Preparation Step 2: Native Barcode Ligation

Add 22.5 µL of the eluted end-prepped DNA into a new 1.5 ml Eppendorf tube; mix by pipetting. Use a total of 100-200 fmol (= ca. 35-60 ng).
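
The conversion between fmol and ng depends on the mean fragment length. The sketch below uses the common approximation of about 660 g/mol per base pair of double-stranded DNA; this constant and the example fragment lengths are assumptions chosen to be consistent with the 400-1000 bp WGA smear, not values stated in the protocol.

```python
AVG_G_PER_MOL_PER_BP = 660.0  # approximate mass of one dsDNA base pair (assumption)


def fmol_to_ng(fmol, mean_length_bp):
    """Mass in ng of `fmol` femtomoles of dsDNA with the given mean length."""
    # fmol * (g/mol) gives femtograms; 1 ng = 1e6 fg
    return fmol * mean_length_bp * AVG_G_PER_MOL_PER_BP / 1e6


def ng_to_fmol(ng, mean_length_bp):
    """Inverse conversion: femtomoles contained in `ng` of dsDNA."""
    return ng * 1e6 / (mean_length_bp * AVG_G_PER_MOL_PER_BP)


# Hypothetical mean fragment lengths within the WGA smear
print(f"100 fmol at 530 bp = {fmol_to_ng(100, 530):.1f} ng")
print(f"60 ng at 455 bp = {ng_to_fmol(60, 455):.0f} fmol")
```

Under these assumptions, 100 fmol of roughly 530 bp fragments is about 35 ng, in line with the range quoted in the step above.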

Add 2.5 µL Native Barcode to each reaction tube from the previous step. Note the barcode numbers!

Add 25 µL Blunt/TA Ligase Master Mix to each reaction tube from the previous step; mix by pipetting.

Incubate for 10 min at room temperature.

Add 50 µL AMPure XP beads (resuspended). Mix by flicking.

Incubate for 5 min at room temperature on a HULA mixer. Spin down.

Pellet on magnet until clear. Pipette off supernatant, keep on magnet.

Wash beads on magnet with 200 µL 70% EtOH (fresh). Pipette off supernatant, do not disturb pellet.

Wash beads on magnet with 200 µL 70% EtOH (fresh). Pipette off supernatant, do not disturb pellet.

Spin down briefly, return to the magnet, and pipette off residual EtOH.

Dry for 30 s.

Resuspend in 26 µL nuclease-free water.

Incubate for 2 min at room temperature.

Pellet on magnet until clear.

Collect 26 µL eluate and transfer to a 1.5 ml Eppendorf tube.

MUST DO: Quantify 1 µL of product on Qubit. Note DNA concentration (ng/μl).

Pool equimolar amounts of each barcoded sample in a 1.5 ml Eppendorf tube, to 100-200 fmol total (= ca. 60 ng).
Note
Do NOT scale up low-concentration samples linearly! For a sample with 10x lower concentration, use at most 5x more DNA.

Dilute the single pooled barcode ligation product to 65 µL.
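
The pooling and dilution steps above can be sketched as a small calculation. This is one possible reading of the note, not the authors' stated method: it assumes all barcoded samples have similar fragment lengths (so equimolar is treated as equal mass) and caps each sample's volume at 5x the volume of the most concentrated sample. The function name and example concentrations are hypothetical.

```python
def pooling_volumes(conc_ng_per_ul, total_ng=60.0, max_fold=5.0, final_ul=65.0):
    """Return (volumes_ul, water_ul) for an approximately equimolar pool."""
    if not conc_ng_per_ul or any(c <= 0 for c in conc_ng_per_ul):
        raise ValueError("need positive concentrations for every sample")
    per_sample_ng = total_ng / len(conc_ng_per_ul)
    # Volume each sample would need for a strictly equal-mass pool
    ideal = [per_sample_ng / c for c in conc_ng_per_ul]
    # Cap low-concentration samples instead of scaling them linearly
    cap = max_fold * min(ideal)
    volumes = [min(v, cap) for v in ideal]
    water_ul = final_ul - sum(volumes)
    if water_ul < 0:
        raise ValueError("pooled volume exceeds the final volume")
    return volumes, water_ul


# Hypothetical Qubit readings (ng/µl) for three barcoded samples
volumes, water_ul = pooling_volumes([10.0, 8.0, 1.0])
print(volumes, water_ul)
```

In this example the third sample would need 20 µL for an equal-mass pool but is capped at 10 µL, so it remains under-represented by design.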

5.3 Library Preparation Step 3: Adaptor Ligation and Clean-up

Use 60 µL of the single pooled barcode ligation product.

Add 25 µL Ligation Buffer (LNB).

Add 10 µL NEBNext Quick T4 DNA Ligase.

Add 5 µL Adapter Mix (AMII for multiplex). Mix by flicking, spin down.

Incubate for 10 min at room temperature.

Add 40 µL AMPure XP beads (resuspended). Mix by flicking.

Incubate for 15 min at room temperature on a HULA mixer. Spin down.

Pellet on magnet until clear. Pipette off supernatant, keep on magnet.

Wash beads with 250 µL Short Fragment Buffer (SFB). Wait 3 min on magnet. Resuspend by flicking, pellet, remove supernatant.

Wash beads with 250 µL Short Fragment Buffer (SFB). Wait 3 min on magnet. Resuspend by flicking, pellet, remove supernatant.

Spin down briefly and remove residual buffer.

Dry for 30 s.

Resuspend in 15 µL Elution Buffer (EB).

Incubate for 10 min at 37 °C.

Pellet on magnet until clear.

Collect the library (15 µL) from the previous step into a new Eppendorf tube.

Quantify 1 µL eluted DNA on Qubit. Use the appropriate amount for 16 ng of library in the next step; dilute with EB buffer.

6 Nanopore Sequencing

6.1 Nanopore Sequencing Step 1: Priming and Loading the Flowcell

Prepare the flow cell (perform QC on MinION). Record the number of active pores.

Prepare the flow cell priming mix: add 30 µL Flush Tether (FLT) directly to the Flush Buffer (FB) tube. Mix.

Load the flow cell with 800 µL priming mix via the priming port. Keep the SpotON port closed!

Wait for 5 min.

Prepare the library for loading: add 38 µL Sequencing Buffer (SQB) to a 1.5 ml Eppendorf tube.

Add 26 µL Loading Beads (LB) to the 1.5 ml Eppendorf tube (mix the beads immediately before use).

Add 12 µL diluted DNA library from section 5.3 to the 1.5 ml Eppendorf tube.
Note
We used up to 16 ng total DNA but obtained best results with 12 ng.

Load the flow cell with 200 µL priming mix via the priming port. Keep the SpotON port closed!

Add 75 µL of the library mix prepared above via the SpotON sample port. Add drop by drop!

Close the priming port and the SpotON port, then perform sequencing on a MinION (Flongle, GridION, PromethION) using the protocols.io method https://www.protocols.io/view/starting-a-minion-sequencing-run-using-minknow-7q6hmze; make sure to select sequencing kit SQK-LSK109 and barcode kit EXP-NBD104 (option now available).

6.2 Nanopore Sequencing Step 2: Flow Cell Storage

Prepare wash mix: add 20 µL Wash Solution A to a 1.5 ml Eppendorf tube.

Add 380 µL Wash Solution B to the same 1.5 ml Eppendorf tube. Vortex.

Open the inlet port and add 400 µL wash mix via the priming port. Close the priming port after loading (keep the SpotON port closed!).

Wait for 30 min at room temperature.

Add 500 µL Storage Buffer S via the priming port. Close the priming port after adding (keep the SpotON port closed!).

Remove spare contents from the flow cell: aspirate 1000 µL air from the empty flow cell via the trash removal port (top left). Keep the SpotON port closed!

Store in fridge.

7 Raw Data Processing and Analysis

For Raw Data Processing and Analysis, please see section "Guidelines".

Item	Detail	Example Supplier	Product code / Remarks
Pipette tips (filtered)	10, 100, 200 & 1000 μl	Fisher Scientific
PCR tubes	single, strip or 96-well	Fisher Scientific
Eppendorf tubes	0.2, 0.5, 1.5 ml	Fisher Scientific	lowBind
Qubit assay tubes		Invitrogen (Fisher Scientific)	lowBind
SeqStudio Microplates	MicroAmp Optical 96 well Reaction Plate	Applied Biosystems by Life Technologies (Fisher Scientific)
Retsch Collection Microtubes	2 ml	Qiagen
Retsch Beads (stainless steel grinding beads)	3 mm	Fisher Scientific	11758414

Homogenization Buffer (μl)	Organism (or parts thereof)
50	Thrips larva 1st and 2nd instar; small Whitefly larvae (1st or 2nd instar) or eggs; very small tissue sample
100	Thrips adult or 3rd instar; large Whitefly larvae (3rd instar) or pupae; tissue samples of corresponding size; antenna or tarsus (or part of leg) of adult tephritid fruit fly; 0.5x0.5x0.5 mm sample of insect larvae
200	Drosophila adult; 1x1x1 mm sample of larger insect larvae or of caterpillars

Nr	Amount	Material
1	0.121 g	Tris Base
2		adjust to pH 8.0 with 1 M HCl (ca. 0.5 ml)
3	0.037 g	EDTA
4	0.5 ml	Tween 20
5	5 mg (250 µl of Qiagen Proteinase K, 20 mg/ml)	Proteinase K
