NCBI_rRNA_Submission

Avery S Hiley

Jun 14, 2025

DOI

https://dx.doi.org/10.17504/protocols.io.dm6gprqppvzp/v1

Avery S Hiley¹

¹UCSD - Scripps Institution of Oceanography

Rouse Lab

Protocol Citation: Avery S Hiley 2025. NCBI_rRNA_Submission. protocols.io https://dx.doi.org/10.17504/protocols.io.dm6gprqppvzp/v1

License: This is an open access protocol distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited

Protocol status: Other

This protocol was never completed and is thus not applicable for use.

Created: May 10, 2020

Last Modified: June 14, 2025

Protocol Integer ID: 36813

Keywords: uploading rrna sequence, rrna sequence, ncbi-rrna-submission, genbank

Abstract

A short tutorial on uploading rRNA sequences to GenBank.

Gather the rRNA sequences that need to be uploaded (e.g. mitochondrial 12S and 16S, nuclear 18S and 28S) and align them in Geneious, in Mesquite, or with the online MAFFT version 7 server: https://mafft.cbrc.jp/alignment/server/.

*Note: Each rRNA gene must be aligned and submitted separately.

BLAST one of the sequences and check that the first record it matches on NCBI is reported as ‘Plus/Plus’. This indicates that your sequence is in the same forward (5′ to 3′) orientation as the matched record. If it says ‘Plus/Minus’, reverse complement your entire alignment.
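If you would rather script the reverse complement than do it in Geneious or Mesquite, the operation can be sketched in Python as below. This is only an illustration: the function name `reverse_complement` is hypothetical, and the sketch assumes gaps are written as '-' and that any non-ACGT letters are standard IUPAC codes.

```python
# Complement table for ACGT/U plus IUPAC ambiguity codes.
# The gap character '-' maps to itself so aligned sequences keep their gaps.
_COMPLEMENT = str.maketrans(
    "ACGTURYKMSWBDHVNacgturykmswbdhvn-",
    "TGCAAYRMKSWVHDBNtgcaayrmkswvhdbn-",
)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of one (possibly gapped) sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Every row of the alignment must be flipped, not just one sequence.
    alignment = {"A9919": "AAC-G", "S25170": "ACGT-N"}
    flipped = {name: reverse_complement(s) for name, s in alignment.items()}
    for name, s in flipped.items():
        print(name, s)
```

Because every row is reversed as well as complemented, the columns of the alignment stay aligned with one another after the flip.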

Important note: It is normal for an rRNA gene alignment to have gaps in it, so don't worry if this is the case.

However, please make sure there are no ambiguities in your sequences. If a sequence has lots of bases that aren’t ACGT (e.g. N, K, M, other IUPAC codes, etc.):

Look at your original de novo assembly in Geneious and check whether you can make a nucleotide base call. Compare the amplitudes of the forward and reverse peaks at the corresponding nucleotide positions in the chromatograms (.ab1 files), and choose the nucleotide with the greater amplitude.

Check if the ends of the sequence assembly need to be trimmed due to low/poor quality.

Make any nucleotide edits and/or deletions as necessary before proceeding.
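To find the positions that need a second look in the chromatograms, a small script can list every character that is not A, C, G, T, or a gap. The following is a minimal sketch, not part of the original workflow; the function names `read_fasta` and `report_ambiguities` are hypothetical, and positions are counted from 1 along the aligned sequence.

```python
def read_fasta(text: str) -> dict:
    """Parse FASTA text into {header: sequence}, joining wrapped lines."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def report_ambiguities(fasta_text: str) -> dict:
    """Return {header: [(position, base), ...]} for sequences with non-ACGT bases.

    Gap characters ('-') are ignored because gaps are expected in an alignment.
    """
    report = {}
    for name, seq in read_fasta(fasta_text).items():
        hits = [(i + 1, b) for i, b in enumerate(seq) if b.upper() not in "ACGT-"]
        if hits:
            report[name] = hits
    return report


if __name__ == "__main__":
    example = ">A9919\nACGTN\nKA-T\n>S25170\nACGT\n"
    for name, hits in report_ambiguities(example).items():
        print(name, hits)
```

Sequences that do not appear in the report contain only A, C, G, T, and gaps.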

Once the rRNA sequences contain no ambiguities, export the final alignment from Mesquite as a FASTA file. Do not select 'include gaps': the exported sequences should contain no gap characters.
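If the exported file still contains gap characters (for example, because the alignment came from a different program), they can be removed with a few lines of Python. This is a minimal sketch under the assumption that gaps are written as '-'; the function name `strip_gaps` is hypothetical.

```python
def strip_gaps(aligned_fasta: str) -> str:
    """Remove '-' gap characters from sequence lines; headers are left untouched."""
    out = []
    for line in aligned_fasta.splitlines():
        if line.startswith(">"):
            out.append(line)
        else:
            cleaned = line.strip().replace("-", "")
            if cleaned:  # skip blank lines and lines that were all gaps
                out.append(cleaned)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    aligned = ">A9919\nAC--GT\n---\n>S25170\n-TTA-\n"
    print(strip_gaps(aligned), end="")
```

Header lines are passed through unchanged, so hyphens in a species name or catalog number are not affected.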

Open this FASTA file in TextWrangler or BBEdit. Edit all of your sequence headers to match this format:

>A9919 [organism=Peinaleopolynoe mineoi] Peinaleopolynoe mineoi voucher SIO:BIC:A9919 cytochrome c oxidase subunit I (COI) gene, partial cds; mitochondrial

In this example, “A9919” is the BIC number; use either the BIC number or a different institution’s catalog number here to represent the sequence ID. If catalog numbers are not assigned yet, you may use a lab code (e.g. S25170) as the sequence ID here.

“[organism=Peinaleopolynoe mineoi]”: Insert the species name for the corresponding sequence ID here.

“Peinaleopolynoe mineoi”: Repeat the species name here.

“voucher SIO:BIC:A9919”: Use this format to give the corresponding BIC number, preceded by our institution abbreviations. If the specimen is deposited at a different institution, use that institution’s abbreviation before the catalog number instead (e.g. MNHN for Muséum national d'Histoire naturelle). If your catalog numbers aren’t assigned yet, omit this section altogether.

“cytochrome c oxidase subunit I (COI) gene, partial cds; mitochondrial” should be listed at the end of the aforementioned information.

After all of your sequence headers follow this format, save the FASTA file again.
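For files with many sequences, editing headers by hand is error-prone, so the header format described above can also be assembled programmatically. The sketch below is illustrative only: `make_header` and its parameter names are hypothetical, and the trailing gene description is passed in as text rather than assumed.

```python
from typing import Optional


def make_header(seq_id: str, organism: str, description: str,
                voucher: Optional[str] = None) -> str:
    """Build one FASTA header line in the format described in the text.

    seq_id      -- catalog number or lab code (e.g. "A9919" or "S25170")
    organism    -- species name, used in the [organism=...] tag and repeated after it
    description -- gene description placed at the end of the header
    voucher     -- e.g. "SIO:BIC:A9919"; leave as None if no catalog number is assigned
    """
    parts = [f">{seq_id} [organism={organism}] {organism}"]
    if voucher:
        parts.append(f"voucher {voucher}")
    parts.append(description)
    return " ".join(parts)


if __name__ == "__main__":
    print(make_header(
        "A9919",
        "Peinaleopolynoe mineoi",
        "cytochrome c oxidase subunit I (COI) gene, partial cds; mitochondrial",
        voucher="SIO:BIC:A9919",
    ))
```

Leaving `voucher` unset drops the voucher section entirely, which matches the instruction for specimens without assigned catalog numbers.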

Create an online account for NCBI’s Submission Portal platform: https://submit.ncbi.nlm.nih.gov/.

Start a new submission.

Submission Type: ‘Metazoan (multicellular animal) Mitochondrial COX1’.

Submitter: Fill out the corresponding information (see screenshot below for our lab details). Make sure to check ‘Update my contact information in profile’ in order for future submissions to use this specific info by default.

Sequencing Technology: Select ‘Sanger dideoxy sequencing’ & ‘Assembled sequences (each sequence was assembled from two or more overlapping sequence reads)’.

Sequences: Select ‘Release on specified date or upon publication, whichever is first’, or release immediately if you are late in uploading your sequences (this should almost never be the case). To be safe, you will typically want to choose a release date about a year in the future. Next, upload your COI FASTA file that was completed in step 16.

Source Info: Under ‘Do your sequence IDs represent one of these?’, select ‘Specimen-Voucher’ if you followed the headers format in step 15. If catalog numbers aren’t assigned yet and your sequence IDs were listed as lab codes (e.g. S25147), select ‘Isolate’ here.

Source Modifiers: This is the section where you add key details that you would like to be attached to the sequences (e.g. locality and depth). However, only the organism name and specimen-voucher are required for submission. You should have already included these in the previous steps, so you may continue if you do not wish to apply additional source modifiers.

*Since all the specifics will be in your publication, you may typically stick with locality (column name = ‘Country’) and depth (column name = ‘Altitude’). 

*If you would like to add a couple key source modifiers, then choose one of the following two options under ‘How do you want to apply source modifiers?’:

Option 1. ‘Upload a tab-delimited table’: Proceed to download the source modifier template table. You can edit this file in several programs (TextEdit, TextWrangler, BBEdit, or even Microsoft Excel). In a text edit program, separate the information in each column by inserting 1 tab. The locality information in the ‘Country’ column must go from broad to specific. For example, “Costa Rica: Mound 12” follows this format. The depths entered in the ‘Altitude’ column must be negative (e.g. “-1800 m”). An example of a tab-delimited table is attached below:

Option 2. ‘Use an editable table’: This is self-explanatory. You can manually add columns with the corresponding information for each sequence ID via the online interface. However, copy the table contents and save them somewhere else for your records.
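For Option 1, the tab-delimited table can also be generated with a script instead of typed by hand. The sketch below is a hedged illustration: the function name `build_modifier_table` is hypothetical, the first column name `Sequence_ID` is an assumption (use whatever the downloaded template actually contains), and the row values are placeholders that only reuse the format examples from the text.

```python
import csv
import io


def build_modifier_table(rows) -> str:
    """Return tab-delimited text with one row per sequence ID.

    rows -- iterable of (sequence_id, locality, depth_in_metres), where depth is
            given as a positive number and written out as a negative altitude.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    # 'Sequence_ID' is an assumed column name; check it against the template.
    writer.writerow(["Sequence_ID", "Country", "Altitude"])
    for seq_id, locality, depth_m in rows:
        # Locality goes from broad to specific, e.g. "Costa Rica: Mound 12".
        writer.writerow([seq_id, locality, f"-{abs(depth_m)} m"])
    return buf.getvalue()


if __name__ == "__main__":
    # Placeholder values for illustration only.
    table = build_modifier_table([("S25170", "Costa Rica: Mound 12", 1800)])
    print(table, end="")
```

Saving the returned text to a file also gives you the local copy of the source modifiers recommended for your records.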

References: Under ‘Sequence authors’, add yourself (unless someone else did the lab work). Select the corresponding ‘Publication status’ that applies to you. Add the title of your paper and ‘Specify new authors’ to add the names of all authors on your paper.

Review & Submit: Check the details of your submission here. If everything looks good, submit your sequences.