Oct 19, 2021

DRAGEN COVID Lineage App SARS-CoV-2 Strain Characterization on the Illumina BaseSpace Platform V.2

This protocol is a draft, published without a DOI.
  • Centers for Disease Control and Prevention
  • TOAST_public
    Tech. support email: toast@cdc.gov
Protocol Citation: Technical Outreach and Assistance for States Team 2021. DRAGEN COVID Lineage App SARS-CoV-2 Strain Characterization on the Illumina BaseSpace Platform. protocols.io https://protocols.io/view/dragen-covid-lineage-app-sars-cov-2-strain-charact-by78pzrw
Version created by Technical Outreach and Assistance for States Team
License: This is an open access protocol distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Protocol status: In development
We are still developing and optimizing this protocol
Created: October 19, 2021
Last Modified: October 19, 2021
Protocol Integer ID: 54240
Disclaimer
The opinions expressed here do not necessarily reflect the opinions of the Centers for Disease Control and Prevention or the institutions with which the authors are affiliated. The protocol content here is under development and is for informational purposes only and does not constitute legal, medical, clinical, or safety advice, or otherwise; content added to protocols.io is not peer reviewed and may not have undergone a formal approval of any kind. Information presented in this protocol should not substitute for independent professional judgment, advice, diagnosis, or treatment. Any action you take or refrain from taking using or relying upon the information presented here is strictly at your own risk. You agree that neither the Company nor any of the authors, contributors, administrators, or anyone else associated with protocols.io, can be held responsible for your use of the information contained in or linked to this protocol or any of our Sites/Apps and Services.
Abstract
This protocol provides instructions on how to run the DRAGEN COVID Lineage app on the Illumina BaseSpace Sequence Hub. The DRAGEN COVID Lineage app reads in FASTQ sequence files and produces Lineage, Clade, and Kmer Detection reports for each run. The information provided in these reports includes Pangolin lineage classification, variant calls, and the fraction of human and SARS-CoV-2 kmers detected. The DRAGEN COVID Lineage app also assembles a consensus sequence FASTA file for each sample. This document applies to all whole-genome sequencing runs on the Illumina platform and downstream bioinformatics for public health laboratories.

For technical assistance, please contact: TOAST@cdc.gov
Create BaseSpace Project
Log in to the Illumina BaseSpace Platform and create a new BaseSpace project
Note
Log in to the BaseSpace Sequence Hub at:

Note
Click the ‘Projects’ tab at the top of the Sequence Hub homepage and create a new project directory by going to FILE -> NEW -> PROJECT:
The Projects panel in the BaseSpace Sequence Hub


Add FASTQ files to the new project directory, either by uploading them from a local directory or by importing sequences from NCBI by their Sequence Read Archive (SRA) accession numbers with the ‘SRA Import’ app.

Note
The step case format does not translate well to a PDF file. The PDF presents each step case sequentially, which may be confusing when following the protocol.




Step case

Upload Fastq files from local directory

Go to FILE -> UPLOAD -> FILES to open the ‘Upload Files’ page
The BaseSpace Projects 'Upload Files' page
Click on the FASTQ tile and add the selected FASTQ files for upload. BaseSpace has specific upload requirements for FASTQ files, including file naming schemes and header formats. They are described here: https://support.illumina.com/help/BaseSpace_Sequence_Hub/Source/Informatics/BS/UploadFastqReq_swBS.htm?Highlight=fastq
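
Note
File names can be checked locally before upload. The following is a minimal Python sketch, assuming the common Illumina naming pattern SampleName_S1_L001_R1_001.fastq.gz. The regular expression and the function name check_fastq_name are illustrative assumptions and not part of BaseSpace; the Illumina page linked above remains the authoritative description of the requirements, including the header format, which this sketch does not check.

```python
import re

# Assumed pattern: SampleName_S1_L001_R1_001.fastq.gz
# (sample name, sample number, 3-digit lane, read 1 or 2, fixed "001" suffix).
# This is an illustration; consult the Illumina upload requirements page.
ILLUMINA_FASTQ_NAME = re.compile(
    r"^(?P<sample>[A-Za-z0-9_-]+)"
    r"_S(?P<sample_number>\d+)"
    r"_L(?P<lane>\d{3})"
    r"_R(?P<read>[12])"
    r"_001\.fastq\.gz$"
)


def check_fastq_name(filename):
    """Return the parsed name fields as a dict, or None if the name does not match."""
    match = ILLUMINA_FASTQ_NAME.match(filename)
    return match.groupdict() if match else None


if __name__ == "__main__":
    # Hypothetical file names used only to demonstrate the check.
    for name in ["SampleA_S1_L001_R1_001.fastq.gz", "sampleA_R1.fastq"]:
        status = "matches" if check_fastq_name(name) else "does NOT match"
        print(f"{name}: {status} the assumed naming pattern")
```

Files that do not match can be renamed before upload; paired-end runs need both the R1 and R2 file for each sample.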

Run DRAGEN COVID Lineage App

Run the DRAGEN COVID Lineage App

Note
Go to the Apps panel and search ‘lineage’. Click on the DRAGEN COVID Lineage app.

The 'DRAGEN COVID Lineage' BaseSpace app

Note
Click ‘LAUNCH APPLICATION’, then click ‘SELECT PROJECT’ and select the new project directory. Click ‘SELECT BIOSAMPLE(S)’. In the ‘SELECT BIOSAMPLE(S)’ window, click the ‘Filters’ button on the right and choose the new project directory.

The ‘SELECT BIOSAMPLE(S)’ window

Note
Select the samples to be used in the analysis and click the ‘SELECT’ button. At the bottom of the application page, click ‘LAUNCH APPLICATION’. This will bring you to a new ‘ANALYSIS’ page similar to the following:

The Analysis panel with a queued DRAGEN COVID Lineage run


Accessing the DRAGEN COVID Lineage app summary report

Note
The DRAGEN COVID Lineage App is finished when the status line says ‘Complete’. To view the ‘Summary Report,’ click on your analysis, navigate to the ‘REPORTS’ tab, and then click ‘Summary’ on the left side of the page.

The Analysis panel showing the DRAGEN COVID Lineage app Summary Report


Note
The ‘Summary Report’ combines three individual reports: ‘Lineage Report’, ‘Clade Report’, and ‘Kmer Detection Report’. The ‘Lineage Report’, which provides Pangolin lineage assignments, is shown below:

The lineage section of the DRAGEN COVID Lineage Summary Report showing Pangolin lineage assignments
The Clade section of the DRAGEN COVID Lineage Summary Report showing Nextclade clade assignments


Downloading consensus sequences generated by the DRAGEN COVID Lineage app

Note
To download the SARS-CoV-2 consensus sequences produced by the DRAGEN COVID Lineage program, click on the ‘FILES’ tab and then click on one of the samples. This will bring you to a new ‘FILES’ page similar to the following:

The FILES tab of the Analysis panel for this DRAGEN COVID Lineage app run



Note
Click on the ‘consensus’ directory and scroll down to the first file that has ‘TYPE’ ‘fa’. Clicking on this file will pull up a new window that looks similar to the following:

Window displaying the hard masked SARS-CoV-2 consensus sequence

The filename should end with ‘consensus_hard_masked_sequence.fa’. Click ‘DOWNLOAD’ to save the consensus sequence to a local directory. Note that there is an additional TYPE 'fa' file whose name ends with 'consensus_soft_masked_sequence.fa'. The soft masked consensus sequence should not be uploaded to a public database such as GenBank or GISAID.
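
Note
A downloaded file can be checked to confirm that it is the hard masked version. The following is a minimal Python sketch, assuming that soft masking is represented by lowercase bases and hard masking by 'N'. This is the usual FASTA convention, but it is an assumption here and should be confirmed against the app documentation. The function names are hypothetical.

```python
def count_soft_masked_bases(fasta_text):
    """Count lowercase bases in FASTA text, ignoring header lines."""
    count = 0
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            continue  # header line, not sequence
        count += sum(1 for base in line if base.islower())
    return count


def looks_hard_masked(fasta_text):
    """Return True when no lowercase (assumed soft masked) bases are present."""
    return count_soft_masked_bases(fasta_text) == 0


if __name__ == "__main__":
    import sys

    # Usage: python check_masking.py sample.consensus_hard_masked_sequence.fa
    with open(sys.argv[1]) as handle:
        text = handle.read()
    if looks_hard_masked(text):
        print("No lowercase bases found.")
    else:
        print(f"{count_soft_masked_bases(text)} lowercase bases found; "
              "this may be the soft masked file.")
```

The check complements, and does not replace, verifying that the filename ends with ‘consensus_hard_masked_sequence.fa’.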


Note
Additional documentation for the DRAGEN COVID Lineage app is available here:


Note
Before submitting the resulting SARS-CoV-2 consensus sequence assemblies to public repositories, such as NCBI GenBank or GISAID, refer to the following documentation describing submission criteria and minimum quality control thresholds:

GenBank Submission Criteria: About GenBank Submission (nih.gov)
GISAID Submission Criteria: Gisaid inclusion criteria.pdf
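
Basic statistics for comparison against the submission criteria can be computed from the downloaded consensus file. The following is a minimal Python sketch that reports the sequence length and the number and fraction of 'N' bases for each record. The function names are hypothetical, and no threshold is applied here: acceptable values must be taken from the GenBank and GISAID documents listed above.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def summarize(sequence):
    """Return length, count of N bases, and fraction of N bases for one sequence."""
    seq = sequence.upper()
    length = len(seq)
    n_count = seq.count("N")
    n_fraction = n_count / length if length else 0.0
    return {"length": length, "n_count": n_count, "n_fraction": n_fraction}


if __name__ == "__main__":
    import sys

    # Usage: python consensus_stats.py sample.consensus_hard_masked_sequence.fa
    with open(sys.argv[1]) as handle:
        for header, sequence in parse_fasta(handle.read()):
            stats = summarize(sequence)
            print(f"{header}\tlength={stats['length']}\t"
                  f"N={stats['n_count']}\tN_fraction={stats['n_fraction']:.4f}")
```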

Alternative Lineage Assignment
The SARS-CoV-2 consensus sequence assembly generated by the DRAGEN COVID Lineage app can also be uploaded to other lineage assignment software.
Upload the consensus sequence for each sample to the Pangolin COVID-19 Lineage Assigner at:



Click the 'Start analysis' button:

Pangolin COVID-19 Lineage Assigner example

The Pangolin COVID-19 Lineage Assigner returns the lineage classification and assignment probability:

Pangolin COVID-19 Lineage Assigner output

Or upload the consensus sequence for each sample to the Nextclade clade assignment web portal at:


Nextclade assignment web portal

The Nextclade server provides clade classification as well as QC metrics and a list of amino acid substitutions. A summary output file can be downloaded with the 'Export to CSV' button.

Nextclade clade assignment output
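
The exported CSV can be summarized per sample for record keeping. The following is a minimal Python sketch, assuming the export is semicolon-delimited and contains the columns seqName, clade, and qc.overallStatus. These names and the delimiter are assumptions and should be checked against the header line of the actual file. The function name read_nextclade_summary is hypothetical.

```python
import csv
import io


def read_nextclade_summary(csv_text, delimiter=";"):
    """Map each sequence name to its clade and overall QC status.

    Assumes columns named seqName, clade, and qc.overallStatus;
    clade and QC status are None if their column is absent.
    """
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    summary = {}
    for row in reader:
        summary[row["seqName"]] = {
            "clade": row.get("clade"),
            "qc_status": row.get("qc.overallStatus"),
        }
    return summary


if __name__ == "__main__":
    import sys

    # Usage: python nextclade_summary.py nextclade.csv
    with open(sys.argv[1], newline="") as handle:
        for name, info in read_nextclade_summary(handle.read()).items():
            print(f"{name}\t{info['clade']}\t{info['qc_status']}")
```

If the file uses a comma or tab instead, pass the appropriate delimiter argument.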