Feb 05, 2022

WORKFLOW FOR THE NUCLEIC ACID BASED IDENTIFICATION OF INSECTS USING WHOLE GENOME AMPLIFICATION AND NANOPORE SEQUENCING - KAWA V.1

  • Jürg E Frey1,
  • Beatrice Frey2,
  • Daniel Frei2,3,
  • Morgan Gueuning2,4,
  • Simon Blaser5,
  • Andreas Bühlmann6
  • 1Agroscope;
  • 2Agroscope, Department of Method Development and Analytics, Research Group Molecular Diagnostics, Genomics and Bioinformatics, Wädenswil, Switzerland;
  • 3Current address: Qiagen Instruments AG, Hombrechtikon, Switzerland;
  • 4Current address: Department of Research and Development, Blood Transfusion Service Zurich, Swiss Red Cross, Schlieren, Switzerland;
  • 5Agroscope, Department of Plants and Plant Products, Agroscope Phytosanitary Service;
  • 6Agroscope, Department of Plants and Plant Products, Research Group Product Quality and Innovation
Protocol Citation: Jürg E Frey, Beatrice Frey, Daniel Frei, Morgan Gueuning, Simon Blaser, Andreas Bühlmann 2022. WORKFLOW FOR THE NUCLEIC ACID BASED IDENTIFICATION OF INSECTS USING WHOLE GENOME AMPLIFICATION AND NANOPORE SEQUENCING - KAWA. protocols.io https://dx.doi.org/10.17504/protocols.io.bwpepdje
License: This is an open access protocol distributed under the terms of the Creative Commons Attribution License,  which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
Protocol status: Working
We use this protocol and it’s working
Created: July 18, 2021
Last Modified: February 07, 2022
Protocol Integer ID: 51654
Keywords: insects, identification of insects, whole genome amplification, nanopore sequencing, nucleic acid
Abstract

Note
This protocol uses proteinase K containing KAWA buffer for DNA extraction.
An alternative version that uses Monarch® Genomic DNA Purification Kit for DNA extraction and purification can be found here:
WORKFLOW FOR THE NUCLEIC ACID BASED IDENTIFICATION OF INSECTS USING WHOLE GENOME AMPLIFICATION AND NANOPORE SEQUENCING - Monarch® (protocol created by Julia Rossmanith)

BACKGROUND
Worldwide trade in plant material has increased dramatically over the past decades, and with it the risk of accidentally introducing plant pests and diseases. Rapid and accurate nucleic acid based identification of such quarantine organisms has become an important tool to minimize their dispersal.
Nucleic acid based identification exploits genetic diversity. A basic tenet holds that species generally do not interbreed and hence the level of genetic differentiation within species is generally lower than between species. For insects, DNA barcoding uses this pattern of genetic diversity with great success. An approximately 600 base pair long fragment in the first half of the mitochondrially encoded cytochrome c oxidase I gene (COI) is used as a reference sequence to identify insects at the species level. As of June 2021, reference sequences for ca. 231’000 different insect species are deposited in the publicly available “Barcode of Life Data System” BOLD (http://www.barcodinglife.org/). Another important database, which specifically targets phytosanitary purposes and holds DNA barcodes of vouchered reference specimens, is the EPPO Q-Bank (European and Mediterranean Plant Protection Organization, https://qbank.eppo.int/).

The methodology of DNA barcoding generally relies on PCR amplification of the diagnostic COI gene fragment using a pair of primers for which the exact DNA sequence must be known. However, this information is not always available, for example in the case of so far undescribed species or where genetic variation within a species affects the primer sites. Furthermore, although the COI marker sequence shows an impressive degree of differentiation among species, this does not hold for all species, and hence a number of important pest species cannot be differentiated based on this marker alone.

An ideal method for species identification should therefore obtain information from the best discriminating genetic region or from several genetic regions. Here, we describe a state-of-the-art method to achieve this.

PURPOSE
The purpose of this workflow is to provide a generic method for genetic identification of potential insect quarantine species and of other especially dangerous pest species in support of the Swiss Federal Plant Protection Service. The method is marker independent and may be used with reference databases of any genetic fragment. It is based on whole genome amplification, followed by single strand nanopore sequencing and DNA barcoding based identification.
Guidelines
SCOPE
This method is suitable for the qualitative identification of DNA (deoxyribonucleic acid) or cDNA (reverse-transcribed DNA of RNA of, e.g., viruses) of potentially all organisms. It has been tested against a broad taxonomic range of pest species. The workflow is designed to work with fresh, ethanol (EtOH; preferably 70%) preserved and frozen samples. The workflow presented here is established for insect species identification, but it was successfully applied to the identification of fungi and bacteria using the proper reference databases.

DEFINITIONS & ABBREVIATION
DNA: Deoxyribonucleic acid
PCR: Polymerase Chain Reaction
SOP: Standard Operating Procedure
UV: Ultraviolet
WGA: Whole Genome Amplification
CO1/COI: Mitochondrial cytochrome c oxidase 1 gene
bp: Base pairs

PRINCIPLE OF THE METHOD
The workflow starts with nucleic acid extraction (DNA and/or RNA followed by cDNA production), followed by a WGA step of the extracted DNA/cDNA, then a clean-up step before producing the nanopore sequencing library which finally is loaded on the MinION nanopore sequencing device.

The methods chosen for this workflow aim at flexibility of input material (gDNA, cDNA) and optimal output to enable multiple use of single flowcells. The workflow should in principle enable at least 10 individual runs of ca. 2-4 hours data collection on a single MinION flowcell.

The resulting sequence data, ideally > 200’000 reads per individual sample, are loaded into the software ‘Geneious Prime’ and analyzed by a custom-made workflow using the proper reference library (see section "Materials").
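Whether a sample reaches the read count mentioned above can be checked before loading the data into Geneious Prime. The following is a minimal sketch, assuming plain or gzip-compressed FASTQ files with the usual four lines per record; the function names and the folder layout are hypothetical and not part of the published workflow.

```python
import gzip
from pathlib import Path

# Target taken from the text: ideally more than 200'000 reads per sample.
TARGET_READS = 200_000


def count_fastq_reads(path):
    """Count records in one FASTQ file, assuming 4 lines per record."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


def count_sample_reads(folder):
    """Sum the read counts of all .fastq and .fastq.gz files in a folder."""
    folder = Path(folder)
    files = sorted(folder.glob("*.fastq")) + sorted(folder.glob("*.fastq.gz"))
    return sum(count_fastq_reads(f) for f in files)


def enough_reads(n_reads, target=TARGET_READS):
    """True if the sample exceeds the recommended number of reads."""
    return n_reads > target
```

Calling `count_sample_reads` on the folder of one barcoded sample and passing the result to `enough_reads` gives a quick yes/no answer before the analysis is started.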



7 Raw Data Processing and Analysis

7.1 Primary Data Acquisition and Basecalling

Primary data acquisition is performed with the Oxford Nanopore Technologies (ONT) data acquisition software MinKNOW GUI v.4.2.8, the operating software that drives nanopore sequencing devices. Basecalling of the raw nanopore read data is required to generate the fastq sequence data files needed for further analysis. Basecalling is time-consuming, so it is beneficial to run it on graphics processing units (GPUs). GPU basecalling generally requires access to a Linux-based operating system with ample memory and storage capacity. We use a Dell Precision 7920 Tower XCTO Base with 256 GB RAM and an Nvidia Quadro RTX6000 graphics card (24 GB) running Ubuntu 18.04. We use the software Guppy v.4.5.4 for GPU basecalling with a parameter set established by Miles Benton (https://gist.github.com/sirselim/2ebe2807112fae93809aa18f096dbb94#file-basecalling_notes-md), to be run from a terminal window:
/guppy_basecaller --disable_pings --input_path /{location_of_data_folder}/ --save_path /{location_of_save_data_folder}/ -c dna_r9.4.1_450bps_hac.cfg -x 'auto' --recursive --num_callers 4 --gpu_runners_per_device 8 --chunks_per_runner 1024 --chunk_size 1000

The process generates fastq data files and places all reads with a minimum quality into a folder named “passed”.
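For repeated runs it can be convenient to assemble the basecalling call in a script rather than typing it each time. The sketch below only rebuilds the command given above as an argument list; the example paths are hypothetical, and the executable is assumed to be reachable as `guppy_basecaller` (adjust `executable` to your installation).

```python
import shlex


def build_guppy_command(input_path, save_path, executable="guppy_basecaller"):
    """Return the basecalling command from the text as an argument list."""
    return [
        executable,
        "--disable_pings",
        "--input_path", str(input_path),
        "--save_path", str(save_path),
        "-c", "dna_r9.4.1_450bps_hac.cfg",
        "-x", "auto",
        "--recursive",
        "--num_callers", "4",
        "--gpu_runners_per_device", "8",
        "--chunks_per_runner", "1024",
        "--chunk_size", "1000",
    ]


if __name__ == "__main__":
    # Hypothetical folders, for illustration only.
    cmd = build_guppy_command("/data/run01", "/data/run01_basecalled")
    print(shlex.join(cmd))
    # To launch the run: import subprocess; subprocess.run(cmd, check=True)
```

Passing the list to `subprocess.run` avoids shell quoting problems with folder names that contain spaces.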

7.2 Data analysis

To store and analyze the basecalled fastq nanopore read sequences obtained by sequencers from Oxford Nanopore Technologies (MinION, PromethION, GridION, Flongle) we use the software Geneious Prime v.21.1.1 or greater.

We developed an automated workflow that combines several steps including mapping reads to a reference database (in our case containing ca. 600 base pairs 5’ of the mitochondrial cytochrome oxidase I gene of insects, downloaded from the ‘Barcoding of Life Database’ BOLD), establishing a consensus sequence and running a BLAST search on a local copy of the GenBank database. The BLAST results which generally enable species identification of the sample are stored in a sub-folder.

  1. Open the Geneious Prime software by clicking on the corresponding icon
  2. In the left panel, go to your working folder and right-click to generate a new folder for storage and analysis of your new nanopore data files. We recommend using a new folder for each run and, within that folder, a sub-folder for each barcoded sample. The nanopore fastq data files are stored in a folder on the computer system running the software ‘MinKNOW’ (which enables performing nanopore sequencing on the MinION as mentioned above) with a name similar to ‘passed’ and have the extension ‘.fastq’. For each barcoded sample, the corresponding fastq files have to be copied into the newly established data sub-folder in Geneious, for example via drag and drop.
  3. Select the desired number of read packets (by default, 4000 reads per packet), ideally ca. 200’000 in total, by activating them in the right-hand window, then select “Workflows – LB_MM2_PcRefSeq_NrSeq_ret30_210212” from the Geneious Prime menu. If that workflow (incl. in Appendix) does not appear on the menu you have to import it first.
  4. In the new pop-up window, select the proper reference database – default is the custom made 59k entry database (RefDB_BOLD59kGBCocc755_60625Seq_211213.fasta; incl. in Appendix) extracted from BOLD and modified to exclude duplicates and to maintain a 97% minimal distance between branches. Upon confirming with ‘ok’ the workflow will perform the following steps using the parameters indicated below:

a) Mapper: Minimap2 v.2.17 with the following parameters: Dissolve contigs and re-assemble selected; reference sequence: RefDB_BOLD59kGBCocc755_60625Seq_211213; data type: Oxford Nanopore; include secondary alignments, with maximum secondary alignments per read: 5; minimum secondary to primary alignment score ratio: 0.8; no trimming (remove existing trim regions from sequences); and under advanced options, an additional command line option: -t 8. Also, in the results panel, select ‘Save assembly report’, ‘Save in sub-folder’ and ‘Save contigs’.
b) Sort Documents: Field to sort by: % of Ref Seq; select ‘Reverse Sort’ and ‘After sorting, only keep at most 30 documents’.
c) Mask Alignment: Eliminates alignment columns with <35% entries; in the Options panel, select ‘Expose no options’; in the Results panel, select ‘Save a copy with sites stripped’; in the What to mask or strip panel, select ‘Sites containing: Gaps (%) 35%’; in the All Operation Options panel, select ‘Don’t append ‘Stripped’ to sequence names’.
d) Save Documents: Select ‘Save these documents as output from workflow’, ‘Save in subfolder called: {Folder Name}’, ‘Select these documents when the operation completes’, and ‘Continue’.
e) Generate Consensus Sequence: Establishes a consensus sequence with a 0% majority threshold. Uses the following parameters: In the Options panel, select ‘Expose no options’; in the All Operation Options panel, select ‘Threshold: 0% - Majority’, ‘Ignore Gaps’, ‘Assign Quality Total’, ‘Trim to reference sequence’, and ‘Append text to name of alignment consensus sequence’.
f) Save Documents: Select ‘Save these documents as output from workflow’, ‘Save in subfolder called: {Folder Name}’, ‘Select these documents when the operation completes’, and ‘Continue’.
g) BLAST: Performs a BLAST search for the ‘contig consensus sequence’ on a local GenBank database copy using the program ‘Megablast’, places results into a ‘Hit table’, retrieving ‘Matching regions with annotations’, and saves the top 10 best hits in a new folder. Uses the following parameters: In the Options panel, select ‘Expose no options’; in the All Operation Options panel, select ‘Nucleotide Query Option’, Database ‘Nucleotide collection (nr/nt)’, Program: ‘Megablast’, Results: ‘Hit table’, Retrieve ‘Matching region with annotations’, Maximum Hits ‘10’. Also selected are ‘Low Complexity Filter’ and ‘Mask for lookup table’, with other parameters being default values.
h) Save Documents: Select ‘Save these documents as output from workflow’, ‘Save in subfolder called: {Folder Name}’, and ‘Branch from 2 Operations Ago’.
i) Sort Documents: In the Options panel, select ‘Expose no options’; in the Options panel, select Field to sort by: ‘Bit-Score’, and select ‘After sorting, only keep at most 30 documents’.
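To illustrate what steps c) and e) do conceptually, the sketch below strips gap-rich columns from a small alignment and then builds a majority consensus that ignores gaps. It is not the Geneious Prime implementation: the function names are hypothetical, the alignment is assumed to be a list of equal-length strings with `-` as the gap character, and the 35% setting is interpreted here as "remove columns in which 35% or more of the rows are gaps", which is an assumption.

```python
from collections import Counter

GAP = "-"


def strip_gappy_columns(alignment, max_gap_fraction=0.35):
    """Drop alignment columns whose gap fraction is >= max_gap_fraction."""
    n_rows = len(alignment)
    keep = [
        col
        for col in range(len(alignment[0]))
        if sum(row[col] == GAP for row in alignment) / n_rows < max_gap_fraction
    ]
    return ["".join(row[col] for col in keep) for row in alignment]


def majority_consensus(alignment):
    """Most frequent non-gap base per column (0% threshold, gaps ignored)."""
    consensus = []
    for column in zip(*alignment):
        bases = [base for base in column if base != GAP]
        if bases:  # skip columns that contain only gaps
            consensus.append(Counter(bases).most_common(1)[0][0])
    return "".join(consensus)


if __name__ == "__main__":
    toy_alignment = ["ACGT-A", "ACGTTA", "AC-T-A", "ACGT-C"]
    stripped = strip_gappy_columns(toy_alignment)
    print(stripped)
    print(majority_consensus(stripped))
```

With a 0% threshold the most frequent base always wins, however small its margin, which is why the workflow restricts the input to the best-matching contigs before the consensus is called.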

The BLAST results in the new folder are sorted according to the highest Bit-Score for the hit which in most cases will present the highest % Pairwise Identity and the lowest probability value (E-Value) on top of the list. In addition, Geneious Prime adds other information for proper qualification of the BLAST hits.

Generally, the results obtained with this workflow provide ‘% Pairwise Identity’ hits of >99% and hence strong evidence for an unambiguous species identification of the sample for which the consensus sequence was established.


7.3 Additional Resources

Geneious Prime: Training videos and other resources for all steps outlined above for the software Geneious Prime are available on the Geneious homepage: Resources | Geneious Prime
Barcoding of Life Database: The Barcode of life systems page (http://www.boldsystems.org/) is an excellent resource for barcoding based identifications and it allows searching their database with the proper marker sequence (e.g., COI for insects and ITS for fungi; see detailed instructions below).


8 Barcode-based identification on the NCBI GenBank Database

As an alternative to the identification using local BLAST implemented in the Geneious Prime workflow, the consensus sequence may also be BLASTed directly on the GenBank Database of NCBI:

  1. Open the BLAST homepage of NCBI (https://blast.ncbi.nlm.nih.gov/Blast.cgi) and choose the web BLAST option ‘Nucleotide BLAST’.
  2. Paste your consensus sequence into the window ‘Enter Query Sequence’
  3. Check that the following parameters are chosen:
     a. Under ‘Choose Search Set’, ‘Database’: Choose ‘Standard’, and ‘Nucleotide collection (nr/nt)’ in the dropdown menu; leave the other fields empty.
     b. Under ‘Program selection’, ‘Optimize for’: Choose ‘Highly similar sequences (megablast)’.
     c. Select ‘Show results in a new window’ before you select ‘BLAST’ at the bottom left of the web page.
     d. It may take a short while before GenBank returns the BLAST results in a new window.
  4. Criteria for correct species allocation are, among others:
     a. The top 10 or more entries carry the same species name; note, however, that synonymous names exist for many insects.
     b. The identity of the query sequence with the GenBank entry (column ‘Ident’) is >97%.
     c. The coverage region of the alignment between your consensus sequence and the GenBank entry is >80% (at least 300 bp).
     d. The E-Value is very close to 0.
Note
GenBank does contain erroneous entries that may, for example, originate from an error in allocating the sample to the correct species based on morphological characters. Such cases can be identified easily if there is only a single such entry in a long list of identical species names.
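The criteria listed under step 4 can be applied programmatically to an exported hit table. The following is a minimal sketch under stated assumptions: the `BlastHit` fields and the function name are hypothetical, hits are assumed to be sorted best first, the E-value cut-off of 1e-50 is an arbitrary stand-in for "very close to 0", and synonymous species names are not resolved.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    species: str
    percent_identity: float  # column 'Ident', in percent
    query_coverage: float    # percent of the query covered by the alignment
    alignment_length: int    # aligned length in bp
    e_value: float


def supports_species_call(hits, top_n=10, min_identity=97.0,
                          min_coverage=80.0, min_length=300,
                          max_e_value=1e-50):
    """Apply the four criteria to a hit list sorted best first."""
    top = hits[:top_n]
    if len(top) < top_n:
        return False  # too few hits to judge criterion a
    same_species = len({hit.species for hit in top}) == 1
    best = top[0]
    return (
        same_species
        and best.percent_identity > min_identity
        and best.query_coverage > min_coverage
        and best.alignment_length >= min_length
        and best.e_value <= max_e_value
    )
```

A `False` result does not mean the identification is wrong; it only flags the sample for manual inspection, for example to check for synonyms or a single erroneous GenBank entry.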


9 Barcode-based identification on the Barcode of Life Database

To identify the species from which your full-length COI consensus sequence originates, the sequence must be compared against a reference database. One example of such a reference database is the Barcode of Life Database (BOLD). This database includes a comprehensive set of COI sequence data that has been collected by individuals and organizations across the globe and is constantly being updated with new data.

  1. Start by navigating to the BOLD Systems webpage (http://www.boldsystems.org/) and select the “Identification” tab at the top of the webpage.
  2. Using the default settings, select the “Animal Identification (COI)” tab (for arthropod identification) and the “Species Level Barcode Records” database. Paste the consensus sequence obtained from the sample into the search box at the bottom of the page.
  3. The browser will eventually update and return the results of the search, revealing the records contained in the database that yield the closest match in terms of sequence similarity. The result screen contains a lot of information that may be explored to establish a confident identification for the query sequence. Often the search results will all originate from a single species allowing an unambiguous identification to be made for the sample. In some cases, two or more species are returned, prompting BOLD to display the message “A species match could not be made, the queried specimen is likely to be one of the following”. This may, for example, happen when highly similar reference sequences were entered with different names (synonyms) for the same species.


10 Barcode-based identification on the EPPO-Q-Bank Database

  1. Open the EPPO-Q-Bank homepage (https://qbank.eppo.int/blast?db=arthropods), check that ‘Arthropods’ are selected and paste your consensus sequence into the window ‘Paste sequence to align:’ and select ‘Start alignment’
  2. EPPO-Q-Bank will return its results in the lower part of the same window.
  3. The percent identity of your query sequence to the best match entries in the EPPO-Q-Bank Database is indicated in the column ‘Similarity %’, the percent overlap of both sequences in the column ‘Overlap %’. If both values are close to 100% then the species allocation is reliable.
Note
The three databases mentioned above are basically independent, although there is much overlap between GenBank and BOLD. This means that not all species are represented in all databases but rather in only one or two of them. Furthermore, the set of individual references for a species is mostly different in each database. Therefore, BLAST results will generally be different among the databases.

Materials
MATERIALS & EQUIPMENT

The sections below report all the equipment and materials required to apply this protocol. N.B. Batch numbers of kits used must be recorded.

Water
General use: Double-distilled water, preferably from a Milli-Q ultraclean water device
PCR procedures: Sterile, DNase-, RNase- and Protease-free water e.g. Fisher Scientific DNA free water, product code: BPE2470-1.
Reagent: Water DNA Grade DNASE Protease free, Fisher Scientific, Catalog #BP24701

Solutions, standards and reference materials
Solutions: All solutions should be of molecular grade purity and only be used to the expiry date indicated on the package. Repeated freeze-thaw cycles should be avoided. Pipetting should be performed with utmost care using filter tips to avoid contamination. Where appropriate aliquots may be used to minimize contamination.

Standards: All standards should only be used to the expiry date indicated on the package. Repeated freeze-thaw cycles should be avoided. Pipetting should be performed with utmost care using filter tips to avoid contamination.

Reference materials: The reference library may be established with any collection of nucleic acid sequences that are useful to discriminate among the taxa of interest. For example, in the case of insects, the reference library is composed of all insect standard barcode entries (a 648 base-pair region of the mitochondrial cytochrome c oxidase 1 gene, “CO1” or “COI”) downloaded from the ‘Barcoding of Life Database’ (http://www.barcodinglife.org/; downloaded in May 2021), with identical entries removed and a minimum difference between tree tips of 3%. The utility “Dedupe” from BBTools (DOE Joint Genome Institute; as implemented in Geneious Prime) was used to remove duplicate entries and all sequences with a similarity of >97% to each other.
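The effect of thinning a reference library at a 97% similarity limit can be illustrated with a simple greedy filter. This is not how Dedupe from BBTools works internally; it is a minimal sketch that assumes the sequences are already aligned to the same non-zero length, and all names in it are hypothetical.

```python
def pairwise_identity(seq_a, seq_b):
    """Fraction of identical positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches / len(seq_a)


def thin_reference_library(records, max_identity=0.97):
    """Greedily keep (name, sequence) records that are at most
    max_identity similar to every record kept so far."""
    kept = []
    for name, seq in records:
        if all(pairwise_identity(seq, kept_seq) <= max_identity
               for _, kept_seq in kept):
            kept.append((name, seq))
    return kept
```

The result of a greedy filter depends on the input order, and the number of comparisons grows quadratically with the number of retained sequences, so a dedicated tool is preferable for a library of tens of thousands of entries.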

The present SOP was successfully used, partly with minor variations, on a total of 67 samples covering 14 insect families and 26 species.

Commercial kits
Nucleic acid extraction: The method described here uses a custom made proteinase K buffer and mechanical disruption for DNA extraction. However, it has been shown to work well using the following commercial kits: ‘GenElute™ Mammalian Genomic DNA Miniprep Kit’ (Sigma-Aldrich Chemie GmbH, Buchs, Switzerland; Product code G1N350); the ‘DNeasy Blood & Tissue Kit’ (QIAGEN AG, Basel, Switzerland; Product code 69506); and the ‘Monarch® Genomic DNA Purification Kit’ (Bioconcept AG, Allschwil, Switzerland; Product code NEB T3010S). DNA extraction kits from other suppliers must be shown to be appropriate before use.
Reagent: GenElute™ Mammalian Genomic DNA Miniprep Kit, Sigma Aldrich, Catalog #G1N350
Reagent: DNeasy Blood & Tissue Kits, Qiagen, Catalog #69506
Reagent: Monarch® Genomic DNA Purification Kit, New England Biolabs, Catalog #T3010S

Whole Genome Amplification (WGA): GenomePlex® Complete Whole Genome Amplification Kit WGA2 (Sigma-Aldrich Chemie GmbH, Buchs, Switzerland; Product code WGA2-50RXN)
Reagent: GenomePlex® Complete Whole Genome Amplification (WGA) Kit, Sigma Aldrich, Catalog #WGA2-50RXN

cDNA Production: LunaScript® RT SuperMix Kit (Bioconcept AG, Allschwil, Switzerland; Product code NEB E3010S)
Reagent: LunaScript® RT SuperMix Kit, New England Biolabs, Catalog #E3010S

Nanopore Sequencing by MinION: Amplicons by Ligation (SQK-LSK109) with native barcode ligation: Ligation Sequencing Kit SQK-LSK109 with Native Barcoding Expansion 1-12 Kit (EXP-NBD104).
Reagent: Native Barcoding Expansion 1-12 (PCR-free), Oxford Nanopore Technologies, Catalog #EXP-NBD104
NEBNext® Companion Module for Oxford Nanopore Technologies® Ligation Sequencing (Bioconcept AG, Allschwil, Switzerland; Product code NEB E7180S)
Reagent: NEBNext® Companion Module for Oxford Nanopore Technologies® Ligation Sequencing, New England Biolabs, Catalog #E7180S

Reaction Cleanup: MinElute Reaction Cleanup Kit of Qiagen (QIAGEN AG, Basel, Switzerland; Product code 28206)
Reagent: MinElute Reaction Cleanup Kit, Qiagen, Catalog #28206

Plastic ware and other disposable material
Note
It is essential that all plastic-ware is sterile before use.

Item | Detail | Example supplier | Product code / Remarks
Pipette tips (filtered) | 10, 100, 200 & 1000 μl | Fisher Scientific |
PCR tubes | single, strip or 96-well | Fisher Scientific |
Eppendorf tubes | 0.2, 0.5, 1.5 ml | Fisher Scientific | lowBind
Qubit assay tubes | | Invitrogen (Fisher Scientific) | lowBind
SeqStudio Microplates | MicroAmp Optical 96 well Reaction Plate | Applied Biosystems by life technologies (Fisher Scientific) |
Retsch Collection Microtubes | 2 ml | Qiagen |
Retsch Beads (stainless steel grinding beads) | 3 mm | Fisher Scientific | 11758414
Equipment
The following items of equipment are required to undertake the analysis. Several alternative suppliers/models are available for each item. These must be shown to be appropriate before use.
Item | Detail | Example supplier | Product code / Remarks
Precision pipettes | 1-1000 µl | Fisher Scientific |
Bench top vortex | | Labnet | VX-100
Thermocycler | SensoQuest | Witec |
Thermal mixer | to hold 1.5 ml tubes | Eppendorf | 5355
DNA quantifier - spectrophotometric | Accurate to +/- 1 ng | Witec (Thermo Scientific) | Nanodrop ONE
DNA quantifier - fluorometric | | Invitrogen (Fisher Scientific) | QuBit 3
Microcentrifuge | to hold 1.5 ml tubes | Eppendorf | 5452
Desktop centrifuge | with microwell plate carrier | Sigma | 4-15C
Power supply for electrophoresis | | BioRad | 1000/500
Electrophoresis equipment (trays, combs) | | BioRad |
UV documentation system | | Witec | E-Box
Vacuum pump for microwell plate 96 cleanup steps | ≥20 inHg vacuum | Millipore |
Sequencing device | MinION FLO-Min106D | Oxford Nanopore Technologies | ONT
Sequencing flowcell | R.9.4.1 | Oxford Nanopore Technologies | ONT
Laminar flow hood | With UV light | SKAN | MonMouth VFLT1000
Tissue Homogenizer with Adaptors | TissueLyser II | Qiagen |
Magnet | MagJET Separation | Thermo Scientific™ (Thermo Fisher Scientific) | 15265126
Tube rotator (HULA mixer) | RotoFlex® Plus bench top tube rotator | Merck (Sigma-Aldrich) | Z740290
Other materials
Disposable plastic gloves, sterile dissection equipment.

Electronic files / computer software
The software Geneious Prime (Biomatters; https://www.geneious.com/prime/) v. 21.1.1. or above with a single user license.
Internet access is required to utilize NCBI’s BLAST (BLAST: Basic Local Alignment Search Tool (nih.gov)). Alternatively, a local installation of a BLAST database is required (may be performed with assistance from Geneious Prime after downloading the necessary nt files from the GenBank site).
Protocol materials
Reagent: Water DNA Grade DNASE Protease free, Fisher Scientific, Catalog #BP24701
Reagent: Monarch® Genomic DNA Purification Kit, New England Biolabs, Catalog #T3010S
Reagent: GenomePlex® Complete Whole Genome Amplification (WGA) Kit, Merck MilliporeSigma (Sigma-Aldrich), Catalog #WGA2-50RXN
Reagent: LunaScript® RT SuperMix Kit, New England Biolabs, Catalog #E3010S
Reagent: GenElute™ Mammalian Genomic DNA Miniprep Kit, Merck MilliporeSigma (Sigma-Aldrich), Catalog #G1N350
Reagent: DNeasy Blood & Tissue Kits, Qiagen, Catalog #69506
Reagent: MinElute Reaction Cleanup Kit, Qiagen, Catalog #28206
Reagent: Native Barcoding Expansion 1-12 (PCR-free), Oxford Nanopore Technologies, Catalog #EXP-NBD104
Reagent: NEBNext® Companion Module for Oxford Nanopore Technologies® Ligation Sequencing, New England Biolabs, Catalog #E7180S
Reagent: QIAGEN Protease and Proteinase K, Qiagen, Catalog #19131
Reagent: GenElute™ Total RNA Purification Kit, Merck MilliporeSigma (Sigma-Aldrich), Catalog #RNB100
Reagent: Ligation Sequencing Kit, Oxford Nanopore Technologies, Catalog #SQK-LSK109
Safety warnings
For hazard information and safety warnings, please refer to the SDS (Safety Data Sheet).
Before start

Note
It is essential to wear disposable plastic gloves during all laboratory procedures and to use pipette tips that are sterile and fitted with filters.

1 Sample preparation
All samples should be stored frozen at -20 °C or in 70 % EtOH (a few days at room temperature or at 4 °C in the refrigerator) until processed. Samples can be stored frozen indefinitely. Use sterile dissection equipment where appropriate. Because we mostly use crude DNA extracts, with or without further cleanup, for the WGA step, the amount of proteinase K buffer solution needs to be matched to the tissue size. The following table lists empirically established conditions:

Homogenization Buffer (μl) | Organism (or parts thereof)
50 | Thrips larva 1st and 2nd instar; small Whitefly larvae (1st or 2nd instar) or eggs; very small tissue sample
100 | Thrips adult or 3rd instar; large Whitefly larvae (3rd instar) or pupae; tissue samples of corresponding size; antenna or tarsus (or part of leg) of adult tephritid fruit fly; 0.5x0.5x0.5 mm sample of insect larvae
200 | Drosophila adult; 1x1x1 mm sample of larger insect larvae or of caterpillars

2 DNA Extraction
Materials:

The extraction is carried out with the proteinase K containing KAWA buffer (section 2.1 / step 3). We use Qiagen Proteinase K, 20 mg/ml (QIAGEN AG, Basel, Switzerland; Product code 19131).
Reagent: QIAGEN Protease and Proteinase K, Qiagen, Catalog #19131

Alternatively, a commercial kit may be used as mentioned in Section Materials, following the manufacturer’s protocols. It is recommended that the manufacturer’s guidelines are checked each time kits are ordered to ensure any updates/changes made since development of this SOP are incorporated.
2.1 Fast Extraction by Proteinase K Buffer (KAWA extraction)
20m
Add 100-200 µL KAWA Buffer *) to Collection Microtubes (Qiagen).

*) KAWA Buffer: 10 mM Tris-HCl, 1 mM EDTA, 0.5 % Tween 20, 50 μg/ml Proteinase K, pH 8.0.

Preparation of 100 ml KAWA Buffer:

Nr | Amount | Material
1 | 0.121 g | Tris Base
2 | | adjust to pH 8.0 with 1 M HCl (ca. 0.5 ml)
3 | 0.037 g | EDTA
4 | 0.5 ml | Tween 20
5 | 5 mg (250 µl of Qiagen Proteinase K, 20 mg/ml) | Proteinase K

Aliquot in 1.5 ml tubes.

Store at -20 °C.
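The amounts in the recipe follow directly from the target concentrations, so the recipe can be scaled to other batch volumes. The sketch below reproduces the 100 ml figures under two assumptions that are not stated in the protocol: the EDTA weight corresponds to disodium EDTA dihydrate (372.24 g/mol), which is what 0.037 g per 100 ml at 1 mM implies, and the Tween 20 percentage is volume per volume. The function name is hypothetical.

```python
TRIS_BASE_MW = 121.14      # g/mol
EDTA_NA2_2H2O_MW = 372.24  # g/mol; assumes disodium EDTA dihydrate


def kawa_buffer_recipe(volume_ml, proteinase_k_stock_mg_per_ml=20.0):
    """Amounts needed for a given volume of KAWA buffer."""
    litres = volume_ml / 1000.0
    proteinase_k_mg = 0.050 * volume_ml  # 50 µg/ml equals 0.050 mg/ml
    return {
        "tris_base_g": 0.010 * litres * TRIS_BASE_MW,   # 10 mM
        "edta_g": 0.001 * litres * EDTA_NA2_2H2O_MW,    # 1 mM
        "tween20_ml": 0.005 * volume_ml,                # 0.5 % (v/v, assumed)
        "proteinase_k_mg": proteinase_k_mg,
        "proteinase_k_stock_ul":
            proteinase_k_mg / proteinase_k_stock_mg_per_ml * 1000.0,
    }


if __name__ == "__main__":
    for item, amount in kawa_buffer_recipe(100).items():
        print(f"{item}: {amount:.3f}")
```

The pH adjustment with 1 M HCl is done by titration and is therefore not part of the calculation.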
Add a 3 mm stainless steel ball.
Add tissue sample.
Homogenize on the Retsch Mixer Mill (Qiagen) for 2x3 min at 25 Hz, turning the plate after the first period, then centrifuge.
Heat tubes at 95 °C for 20 min.
Centrifuge, then transfer to 0.5 ml lowbind Eppendorf tubes.
Store frozen at -20 °C (avoid freeze-thaw cycles; homogenates may be stored for several weeks at 4 °C).

Note
As the extract resulting from a KAWA extraction contains RNA and contaminating proteins, its DNA content cannot be accurately quantified. In our experience, using 1 µl of KAWA extract for WGA works in the vast majority of cases. However, we also found that using a cleanup kit, such as the Qiagen MinElute Reaction Cleanup Kit, after KAWA extraction (see section 2.2) leads to an improved balance between different extractions in the final sequencing libraries.

2.2 Reaction cleanup
5m
Materials:

We use the MinElute Reaction Cleanup Kit of Qiagen (QIAGEN AG, Basel, Switzerland; Product code 28206).
Reagent: MinElute Reaction Cleanup Kit, Qiagen, Catalog #28206

A cleanup step using the MinElute Reaction Cleanup Kit of Qiagen (QIAGEN AG, Basel, Switzerland; Product code 28206) is recommended after the proteinase K extraction as this seems to reduce the variance in the number of sequencing reads between individual libraries in multiplex runs.

This step is not necessary if using a commercial nucleic acid extraction kit. However, a reverse transcription step is required if using an RNA extraction kit. For example, using the GenElute™ Total RNA Purification Kit from Sigma-Aldrich (Merck, Sigma-Aldrich Chemie GmbH, Buchs, Switzerland; Product code: RNB100), the following cDNA production kit was successfully used: LunaScript® RT SuperMix Kit (Bioconcept AG, Allschwil, Switzerland; Product code NEB E3010S). Also, if RNA is extracted it may be beneficial to omit the DNAse step to maximize the yield of nucleic acids.
Reagent: GenElute™ Total RNA Purification Kit, Sigma Aldrich, Catalog #RNB100
Reagent: LunaScript RT SuperMix Kit, New England Biolabs, Catalog #E3010S

Procedure for reaction cleanup using the Qiagen MinElute Reaction Cleanup Kit:
Prepare the MinElute column setup on a 2 ml collection tube.
Add 300 µL Buffer ERC to 75 µL KAWA product.
Load 375 µL of the mixture onto the column setup.
Centrifuge for 1 min. Discard flow-through, re-assemble.
Add 750 µL Buffer PE to the column setup.
Centrifuge for 1 min. Discard flow-through, re-assemble.
Centrifuge for 2 min. Place the MinElute column in a NEW 1.5 ml Eppendorf tube.
Add 12 µL Elution Buffer to the centre of the MinElute column.
Note
Deviation from supplier’s protocol.

Incubate for 1 min at room temperature.
Centrifuge for 1 min.
Note
DNA quantification at this step may be advisable if the expected yield is below 10 ng.
2.3 DNA extract quantification
DNA extracted with a commercial kit or after clean-up may be quantified to assess the extraction process and enable normalisation of DNA concentration. One common method is to use a Nanodrop ND 1000 Spectrophotometer. DNA should be diluted to 10-50ng/μl using DNA-free water. Negative controls should read ~0 ng/μl.
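The dilution to the 10-50 ng/μl range follows the usual C1·V1 = C2·V2 relationship. A minimal sketch, with a hypothetical function name and example values that are not taken from the protocol:

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul, final_volume_ul):
    """Return (stock volume, water volume) in µl for a C1*V1 = C2*V2 dilution."""
    if target_ng_per_ul <= 0 or final_volume_ul <= 0:
        raise ValueError("target concentration and final volume must be positive")
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already more dilute than the target")
    stock_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_ul, final_volume_ul - stock_ul


if __name__ == "__main__":
    # Example: a 120 ng/µl extract diluted to 30 ng/µl in 20 µl final volume.
    stock_ul, water_ul = dilution_volumes(120, 30, 20)
    print(f"{stock_ul:.1f} µl extract + {water_ul:.1f} µl DNA-free water")
```

Extracts that already read below the target concentration are rejected rather than "diluted", since they would have to be concentrated instead.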

Controls:
A negative extraction control (with no tissue) should be run in parallel with all batches of sample extraction and quantified alongside all tissue extractions
Analyze
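The dilution into the 10-50 ng/μl working range follows the usual C1·V1 = C2·V2 relation. The sketch below is an illustration only and not part of the original protocol; the function name `dilution_volumes` and the example reading of 180 ng/μl are hypothetical.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul, final_volume_ul):
    """Return (DNA volume, water volume) in µL for a C1*V1 = C2*V2 dilution."""
    if stock_ng_per_ul <= 0 or target_ng_per_ul <= 0 or final_volume_ul <= 0:
        raise ValueError("concentrations and volume must be positive")
    if target_ng_per_ul > stock_ng_per_ul:
        raise ValueError("cannot dilute to a higher concentration than the stock")
    # Volume of stock that carries the required mass of DNA.
    dna_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    # The remainder of the final volume is DNA-free water.
    return dna_ul, final_volume_ul - dna_ul


# Hypothetical example: a 180 ng/µL extract diluted to 30 ng/µL in 20 µL.
dna_ul, water_ul = dilution_volumes(180.0, 30.0, 20.0)
print(f"{dna_ul:.2f} µL DNA + {water_ul:.2f} µL water")
```

Extracts that already read below the target concentration raise an error here and would be used undiluted.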
3 Whole Genome Amplification (WGA)
Materials:

We use the GenomePlex® Complete Whole Genome Amplification Kit WGA2 (Sigma-Aldrich Chemie GmbH, Buchs, Switzerland; Product code WGA2-50RXN).
Reagent: GenomePlex® Complete Whole Genome Amplification (WGA) Kit, Sigma Aldrich, Catalog #WGA2-50RXN

Procedure for whole genome amplification using the Sigma GenomePlex® Complete Whole Genome Amplification Kit WGA2:
3.1 WGA Step 1: Fragmentation
34m
Run Thermocycler program (Program: incubation at 95 °C, runs 00:30:00).
(To ensure the Thermocycler is ready when needed.)
30m
Incubation
Use DNA/cDNA sample: Transfer 10 µL DNA (≥ 10 ng) from section 2.2/step 19 into new 8-strip microtubes.
Mix
Add 1 µL Fragmentation Buffer to each DNA tube of previous step.
Mix
Heat for 00:04:00 at 95 °C in Thermocycler. Immediately cool on ice.
Note
Alternatively, a tabletop Mini Cooler may be used.

4m
3.2 WGA Step 2: Library Preparation
2m
Add 2 µL Library Preparation Buffer (green) to DNA of previous step.
Mix
Add 1 µL Library Stabilization Solution (yellow) to DNA of previous step. Vortex and centrifuge.
Mix
Heat for 00:02:00 at 95 °C in Thermocycler. Immediately cool on ice.
2m
Add 1 µL Library Preparation Enzyme (red) to DNA of previous step. Vortex and centrifuge.

Mix
Run Thermocycler program with WGA Library Prep Rxn.

Program:
incubation at 16 °C, runs 00:20:00;
incubation at 24 °C, runs 00:20:00;
incubation at 37 °C, runs 00:20:00;
incubation at 75 °C, runs 00:05:00;
cool to and hold at 4 °C
1h 5m
Incubation
3.3 WGA Step 3: Amplification
30m
Add 48 µL MH2O to each reaction tube of previous step (WGA Library Prep Rxn).
Mix
Add 7.5 µL Amplification Master Mix to each reaction tube of previous step.
Mix
Add 5 µL WGA DNA Polymerase to each reaction tube of previous step. Vortex and centrifuge.
Centrifugation
Mix
Run Thermocycler program.

Program:
Initial incubation: 00:03:00 at 95 °C;
17 cycles of 00:00:15 at 94 °C and 00:05:00 at 65 °C;
cool to and hold at 4 °C

Store short term at 4 °C, long term at -20 °C.
8m 15s
OPTIONAL: Check on gel: Run 5 µL on 1.4% TBE gel, with 4 µL marker and 4 µL loading dye, at 70 V/cm for 00:30:00.
30m
Optional
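For planning bench time, the hold times of the two thermocycler programs in sections 3.2 and 3.3 can be summed. The sketch below is illustrative only: the helper `program_runtime` is hypothetical, and ramp times and the final 4 °C hold are ignored, so real runs take somewhat longer.

```python
from datetime import timedelta


def program_runtime(holds, cycles=1, initial=timedelta(0)):
    """Total hold time: an initial step plus `cycles` repeats of the listed holds."""
    per_cycle = sum(holds, timedelta(0))
    return initial + cycles * per_cycle


# Section 3.2: 16, 24 and 37 °C for 20 min each, then 75 °C for 5 min.
library_prep = program_runtime([timedelta(minutes=20)] * 3 + [timedelta(minutes=5)])

# Section 3.3: 3 min at 95 °C, then 17 cycles of 15 s at 94 °C and 5 min at 65 °C.
amplification = program_runtime(
    [timedelta(seconds=15), timedelta(minutes=5)],
    cycles=17,
    initial=timedelta(minutes=3),
)

print("library preparation:", library_prep)  # 1:05:00
print("amplification:", amplification)       # 1:32:15
```

Because the cycling block repeats 17 times, the amplification program occupies the thermocycler for about an hour and a half of hold time alone.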
3.4 WGA Step 4: Reaction Cleanup
6m
Prepare MinElute column setup on 2 ml collection tube.

Mix
Add 300 µL Buffer ERC to 75 µL WGA product.
Load 375 µL of mixture to column setup.
Centrifuge the sample of last step for 00:01:00.
Discard flow-through, re-assemble.
1m
Centrifugation
Add 750 µL Buffer PE to column setup.
Mix
Centrifuge the sample of last step for 00:01:00.
Discard flow-through, re-assemble.
1m
Centrifugation
Centrifuge the sample of last step for 00:02:00.
Place MinElute column in NEW 1.5 ml Eppendorf tube.
2m
Centrifugation
Add 10 µL Elution Buffer to centre of MinElute column.
Note
Deviation from supplier’s protocol.

Mix
Incubate for 00:01:00 at room temperature.
1m
Incubation
Centrifuge for 00:01:00.
Note
DNA quantification at this step may be advisable if the expected yield is below 10 ng.

1m
Centrifugation
4 WGA Product Check by Gel Electrophoresis
Gel electrophoresis of DNA in an agarose gel is a standard technique in molecular biology, but equipment, reagents, staining and visualization vary considerably between laboratories and according to local health & safety controls. Therefore, this SOP suggests general conditions that need to be adapted to each laboratory.
4.1 Make a 1.2% TBE agarose gel (1xTBE pH: 9.0) containing 0.0001% Ethidium Bromide*)
32m
Place 80 mL 1xTBE buffer + 1 g agarose in a 500 ml Erlenmeyer flask.
Mix
Heat in microwave at max intensity for 00:02:00, interrupting intermittently to shake (take care not to overheat).

2m
Add 8 µL Ethidium Bromide*) and cast the gel.
Safety information
*) Ethidium Bromide is a carcinogenic chemical; use nitrile gloves and consult the safety regulations.

Mix
Wait 00:30:00 at room temperature or store in the fridge at 4 °C.
30m
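The gel recipe can be cross-checked with a little arithmetic. The sketch below assumes that both percentages are weight per volume (w/v), which the protocol does not state explicitly, and derives the Ethidium Bromide stock concentration implied by adding 8 µL to an 80 mL gel. The function names are hypothetical.

```python
def percent_w_v(mass_g, volume_ml):
    """Percent weight/volume: grams of solute per 100 mL."""
    return 100.0 * mass_g / volume_ml


def implied_stock_mg_per_ml(final_percent_w_v, gel_volume_ml, stock_volume_ul):
    """Stock concentration at which `stock_volume_ul` yields the final % (w/v)."""
    final_mg_per_ml = final_percent_w_v * 10.0  # 1 % (w/v) equals 10 mg/mL
    total_mg = final_mg_per_ml * gel_volume_ml
    return total_mg / (stock_volume_ul / 1000.0)


agarose = percent_w_v(1.0, 80.0)                         # 1 g agarose in 80 mL buffer
etbr_stock = implied_stock_mg_per_ml(0.0001, 80.0, 8.0)  # 8 µL stock in 80 mL gel

print(f"agarose: {agarose:.2f} % (w/v)")
print(f"implied Ethidium Bromide stock: {etbr_stock:.1f} mg/mL")
```

Under this assumption the recipe gives a 1.25 % gel, in line with the nominal 1.2 %, and corresponds to a 10 mg/mL Ethidium Bromide stock; volumes should be adjusted if the laboratory's stock differs.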
4.2 Gel loading and running
30m
Prepare Size Standard (e.g. Thermo Scientific™ GeneRuler DNA Ladder Mix, ready-to-use; Order Nr. 10181070) and samples (whole genome amplification products) for gel loading by mixing 3 µL of each sample and of the Size Standard with 3 µL of Loading Buffer, prepared as follows:

Loading Buffer Preparation for PCR Amplification Product Electrophoresis (10 ml):

Mix
Add 1.5 g Ficoll 400 to 10 mL 1xTBE, adjust pH to 9.0.
Mix
Add 5 mg Bromophenol Blue (adjust amount visually, may be too high).
Mix
Carefully pipet the 6 µL loading mix into individual wells of the gel, beginning with the Size Standard in the leftmost well.

Pipetting
Run at 70 V for approximately 00:30:00 (depending on size of gel), ensuring the DNA does not run off the gel.

30m
4.3 Visualization
Visualize your DNA fragments in UV light (with appropriate safety precautions); if the WGA reaction has been successful it shows as a smear of approximately 400 - 1000 base pairs in length. Your negative controls should not contain bands.

Note
If no WGA amplification signal is obtained after several attempts, it may be advisable to run a positive control using a previously successful PCR. In rare cases more DNA extract may be needed. Alternatively, the crude extract may contain PCR inhibitors, as in aphids, where the high sugar content inhibits PCR. In such cases, the crude extract needs to be cleaned up with a commercial kit such as the Sigma GenElute™ Mammalian Genomic DNA Miniprep Kit.

4.4 Recording
Keep a permanent record of your gel (electronic and/or hard copy) as proof that the WGA reaction was successful and contaminant free.
5 Sequencing Library Preparation
Materials:

The library for nanopore sequencing is produced with the Ligation Sequencing Kit SQK-LSK109 from Oxford Nanopore Technologies for sequencing on flow cell type R9.4.1 (flow cell ID: FLO-MIN106D), following the manufacturer’s recommendations with some minor modifications.
Reagent: Ligation Sequencing Kit, Oxford Nanopore Technologies, Catalog #SQK-LSK109
5.1 Library Preparation Step 1: DNA End-Prep
1h
Transfer 120 ng total DNA from section 3 into new 8-strip microtubes.
Mix
Add MH2O to a total of 54 µL.
Mix
Add 3.5 µL Ultra II End-prep reaction buffer.
Mix
Add 3 µL Ultra II End-prep enzyme mix.
Mix
Mix by flicking, spin down.
Run Thermocycler program.

Program:
incubate 00:30:00 at 20 °C, then 00:20:00 at 65 °C;
transfer contents to 1.5 ml Eppendorf tube.
50m
Incubation
Add 60 µL AMPure XP beads (resuspended). Mix by flicking tube.
Mix
Incubate for 00:10:00 at room temperature on HULA mixer.
10m
Incubation
Spin and pellet on magnet until clear. Pipette off supernatant, keep on magnet.
Pipetting
Wash beads on magnet with 200 µL 70% EtOH (fresh). Pipette off supernatant, do not disturb pellet.
Wash
Wash beads on magnet with 200 µL 70% EtOH (fresh). Pipette off supernatant, do not disturb pellet.
Wash
On magnet, pipette off residual EtOH.
Pipetting
Dry for 00:00:30.
30s
Resuspend in 61 µL water (nuclease-free).
Incubate for 00:05:00 at room temperature.

5m
Incubation
Pellet on magnet until clear.
Collect 61 µL eluate. May be stored at 4 °C overnight.
5m
OPTIONAL: Quantify 1 µL of product on Qubit. Note DNA concentration (ng/μl).

Optional
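The end-prep input at the start of section 5.1 (120 ng DNA made up to 54 µL with water) depends on the concentration measured after the WGA clean-up. A minimal sketch of that calculation follows; the function name `fixed_mass_input` and the example concentration of 15 ng/µL are hypothetical.

```python
def fixed_mass_input(conc_ng_per_ul, target_ng=120.0, total_ul=54.0):
    """Return (DNA volume, water volume) in µL for a fixed DNA mass in a fixed volume."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    dna_ul = target_ng / conc_ng_per_ul
    if dna_ul > total_ul:
        # The sample is too dilute to deliver the target mass in the available volume.
        raise ValueError("required DNA volume exceeds the total volume")
    return dna_ul, total_ul - dna_ul


# Hypothetical example: cleaned WGA product measured at 15 ng/µL.
dna_ul, water_ul = fixed_mass_input(15.0)
print(f"{dna_ul:.1f} µL DNA + {water_ul:.1f} µL water")
```

With the defaults, any sample below about 2.2 ng/µL (120 ng / 54 µL) raises an error, which flags samples that cannot reach the stated input.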
5.2 Library Preparation Step 2: Native Barcode Ligation
17m 30s
Add 22.5 µL of eluted DNA of the end-prep product into a new 1.5 ml Eppendorf tube; mix by pipetting. Use a total of 100-200 fmol (= ca. 35-60 ng).
Mix
Add 2.5 µL Native Barcode to each reaction tube of previous step. Note barcode numbers!
Mix
Add 25 µL Blunt/TA Ligase Master Mix to each reaction tube of previous step; mix by pipetting.
Mix
Incubate for 00:10:00 at room temperature.
10m
Incubation
Add 50 µL AMPure XP beads (resuspended). Mix by flicking.
Mix
Incubate for 00:05:00 at room temperature on HULA mixer. Spin down.
5m
Incubation
Pellet on magnet until clear. Pipette off supernatant, keep on magnet.
Pipetting
Wash beads on magnet with 200 µL 70% EtOH (fresh). Pipette off supernatant, do not disturb pellet.
Wash
Wash beads on magnet with 200 µL 70% EtOH (fresh). Pipette off supernatant, do not disturb pellet.
Wash
Spin down and place back on magnet. Pipette off residual EtOH.
Pipetting
Dry for 00:00:30.
30s
Resuspend in 26 µL water (nuclease-free).
Incubate for 00:02:00 at room temperature.
2m
Incubation
Pellet on magnet until clear.
Collect 26 µL eluate and transfer to 1.5 ml Eppendorf tube.
MUST DO: Quantify 1 µL of product on Qubit. Note DNA concentration (ng/μl).
Analyze
Critical
Pool equimolar amounts of each barcoded sample in a 1.5 ml Eppendorf tube (100-200 fmol total, = ca. 60 ng).
Note
Do NOT scale up low-concentration samples linearly! For a sample with 10x lower concentration, use at most 5x more DNA.

Dilute the single pooled barcode ligation product to 65 µL.
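The fmol and ng figures quoted in section 5.2 are linked by the mean fragment length, and pooling volumes follow from the Qubit readings. The sketch below rests on several assumptions that are not in the protocol: an average mass of 660 g/mol per base pair, a hypothetical mean fragment length of 500 bp, equal-mass pooling as a stand-in for equimolar pooling (reasonable only if all samples have similar fragment-length distributions), and one possible reading of the note above, in which no sample contributes more than `max_fold` times the smallest volume.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair of double-stranded DNA (approximation)


def ng_to_fmol(mass_ng, mean_length_bp):
    """Convert a dsDNA mass in ng to fmol for a given mean fragment length."""
    return mass_ng * 1e6 / (mean_length_bp * AVG_BP_MASS)


def fmol_to_ng(amount_fmol, mean_length_bp):
    """Convert fmol of dsDNA to ng for a given mean fragment length."""
    return amount_fmol * mean_length_bp * AVG_BP_MASS / 1e6


def pooling_volumes(conc_ng_per_ul, total_ng, max_fold=5.0):
    """Equal-mass volume (µL) per barcode, capped at max_fold x the smallest volume."""
    per_sample_ng = total_ng / len(conc_ng_per_ul)
    volumes = [per_sample_ng / c for c in conc_ng_per_ul]
    cap = max_fold * min(volumes)
    return [min(v, cap) for v in volumes]


# 100-200 fmol at an assumed mean length of 500 bp.
print(fmol_to_ng(100, 500), fmol_to_ng(200, 500))

# Hypothetical Qubit readings (ng/µL) for three barcodes, pooled to 60 ng.
print(pooling_volumes([10.0, 5.0, 1.0], total_ng=60.0))
```

At 500 bp, 100-200 fmol corresponds to 33-66 ng, close to the ca. 35-60 ng stated above. In the pooling example the most dilute sample would need 20 µL for an equal share, but the cap limits it to 10 µL.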
5.3 Library Preparation Step 3: Adaptor Ligation and Clean-up
35m 30s
Use 60 µL single pooled barcode ligation product.
Add 25 µL Ligation Buffer (LNB).
Mix
Add 10 µL NEBNext Quick T4 DNA Ligase.

Mix
Add 5 µL Adapter Mix (AMII for multiplex). Mix by flicking, spin down.
Mix
Incubate for 00:10:00 at room temperature.
10m
Incubation
Add 40 µL AMPure XP beads (resuspended). Mix by flicking.
Mix
Incubate for 00:15:00 at room temperature on HULA mixer. Spin down.
15m
Incubation
Pellet on magnet until clear. Pipette off supernatant, keep on magnet.
Pipetting
Wash beads with 250 µL Short Fragment Buffer (SFB). Wait 00:03:00 on magnet. Resuspend by flicking, pellet, remove supernatant.

3m
Wash
Wash beads with 250 µL Short Fragment Buffer (SFB). Wait 00:03:00 on magnet. Resuspend by flicking, pellet, remove supernatant.

3m
Wash
Remove residual buffer by spinning down.
Dry for 00:00:30.
30s
Resuspend in 15 µL Elution Buffer (EB).
Incubate for 00:10:00 at 37 °C.
10m
Incubation
Pellet on magnet until clear.
Collect library (15 µL) from previous step into new Eppendorf tube.
Quantify 1 µL eluted DNA on Qubit. For the next step, use the volume corresponding to 16 ng of library; dilute with EB.
Analyze
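The three AMPure XP clean-ups in section 5 use different bead-to-sample ratios, which is what makes the last clean-up more selective against short fragments. The sketch below simply tallies the volumes listed in sections 5.1 to 5.3; the dictionary layout and the helper `bead_ratio` are hypothetical.

```python
# Reaction components (µL) and bead volumes (µL) as listed in sections 5.1 to 5.3.
cleanups = {
    "5.1 end-prep": {"sample_ul": [54.0, 3.5, 3.0], "beads_ul": 60.0},
    "5.2 barcode ligation": {"sample_ul": [22.5, 2.5, 25.0], "beads_ul": 50.0},
    "5.3 adapter ligation": {"sample_ul": [60.0, 25.0, 10.0, 5.0], "beads_ul": 40.0},
}


def bead_ratio(sample_ul, beads_ul):
    """Bead volume divided by the total reaction volume before bead addition."""
    return beads_ul / sum(sample_ul)


ratios = {name: bead_ratio(**step) for name, step in cleanups.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

The first two clean-ups come out at roughly 1x and the adapter-ligation clean-up at 0.4x, so deviations in reaction volume should be matched by a proportional change in bead volume.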
6 Nanopore Sequencing
6.1 Nanopore Sequencing Step 1: Priming and Loading the Flowcell
5m
Prepare flow cell (perform QC on MinION). Record number of active pores.
Prepare the flow cell priming mix: Add 30 µL Flush Tether (FLT) directly to the Flush Buffer (FB) tube. Mix.
Mix
Load flow cell with 800 µL priming mix via priming port. Keep the SpotON port closed!
Wait for 00:05:00.

5m
Pause
Prepare library for loading: Add 38 µL Sequencing Buffer (SQB) to a 1.5 ml Eppendorf tube.
Add 26 µL Loading Beads (LB) to the 1.5 ml Eppendorf tube (mix beads immediately before use).
Add 12 µL diluted DNA library from section 5.3 to the 1.5 ml Eppendorf tube.
Note
We used up to 16 ng total DNA but obtained best results with 12 ng.

Load flow cell with 200 µL priming mix via priming port. Keep the SpotON port closed!

Add 75 µL library from step #120 via SpotON sample port. Add drop by drop!

Close the priming port and the SpotON port, then perform sequencing on a MinION (Flongle, GridION, PromethION) using protocols.io method https://www.protocols.io/view/starting-a-minion-sequencing-run-using-minknow-7q6hmze; make sure to select kit SQK-LSK109 and barcode kit EXP-NBD104 (option now available).
6.2 Nanopore Sequencing Step 2: Flow Cell Storage
30m
Prepare wash mix: Add 20 µL Wash Solution A to a 1.5 ml Eppendorf tube.
Mix
Add 380 µL Wash Solution B to the same 1.5 ml Eppendorf tube. Vortex.
Mix
Open inlet port, add 400 µL wash mix via priming port. Close priming port after loading (keep the SpotON port closed!).
Wait for 00:30:00 at room temperature.
30m
Pause
Add 500 µL Storage Buffer S via priming port. Close priming port after adding (keep the SpotON port closed!).
Remove spare contents of the flow cell: aspirate 1000 µL of air from the empty flow cell via the trash removal port (top left). Keep the SpotON port closed!
Store in fridge.
7 Raw Data Processing and Analysis
For Raw Data Processing and Analysis, please see section "Guidelines".
Analyze