License: This is an open access protocol distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Protocol status: Working
We use this protocol and it's working
Created: August 16, 2021
Last Modified: June 12, 2025
Protocol Integer ID: 52360
Keywords: mitogenome annotation in geneious, finalizing mitogenome annotation, geneious, annotation
Disclaimer
Our protocols are constantly evolving and old versions will be deleted.
The documents here are not intended to be cited in publications
Abstract
Troubleshooting
BEFORE you start annotating your mitogenomes, you want to run them through another annotating software, MITOS.
Enter your name and email; if you want, you can add a job identifier (i.e. which mitogenome you're uploading for annotation). MAKE SURE to change the genetic code to the appropriate choice (in our lab it's almost always "05 - Invertebrate"), then navigate to your assembled mitogenome in Finder. Expand the MitoFinder folder, then expand the "..._Final_Results" folder, and find the "Sample_MitoFinder_100p_mtDNA_contig.fasta" file.
Opening Geneious and Setting Up Workspace
Open Geneious using the floating license server (or, if you have your own licensed version, just open that). IT IS VERY IMPORTANT YOU DO NOT UPDATE GENEIOUS - the floating license will not work with newer versions.
When prompted, enter the following server IP address and port, and make sure to select "Use floating license server".
Server IP: 132.239.80.207
Port: 27001
Create A New Folder: When you first open Geneious, you should have a blank workspace with some sample documents.
You will want to create a new folder for the mitogenome you will work on. To do this, go to the tool bar at the top of your monitor and click "File", then "New", and "Folder..."
Name it something like "SpecimenFamily_Mitogenomes" (i.e. "Pilargidae_Mitogenomes"), or whatever is useful to you. It's good practice to not use any spaces in the folder and file names.
If you will be working on multiple specimens within the same family (or higher taxonomic category), rather than a single sample from one species, make another folder inside the folder you just made, named with your specific sample name.
Example - highlighted is the new folder.
Create another folder within the family or greater taxonomic category folder named "Genes", and within that folder, create 28 new folders. For each protein coding gene, you want 2 folders; one that is "ProteinCodingGene_AA" for the amino acid translation sequence, and the other that is "ProteinCodingGene_NT" for the nucleotide sequence (i.e. ATP6_AA and ATP6_NT). 12S (rrnS) and 16S (rrnL) are not protein coding genes, so for them, you only need one folder each: "rrnS_NT" and "rrnL_NT".
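If you'd rather not work out the 28 folder names by hand, here is a minimal Python sketch that lists them. The gene names used here (ATP6, COX1, CYTB, ND1, etc.) are an assumption based on the naming used elsewhere in this protocol; adjust them to match whatever your files use. It only prints the names - the folders themselves still have to be created inside Geneious.

```python
# The 13 protein coding genes each get an _AA and an _NT folder;
# the two rRNA genes only get an _NT folder. 13 * 2 + 2 = 28.
PROTEIN_CODING_GENES = [
    "ATP6", "ATP8", "COX1", "COX2", "COX3", "CYTB",
    "ND1", "ND2", "ND3", "ND4", "ND4L", "ND5", "ND6",
]
RRNA_GENES = ["rrnS", "rrnL"]


def gene_folder_names():
    """Return the list of folder names described in the step above."""
    names = []
    for gene in PROTEIN_CODING_GENES:
        names.append(f"{gene}_AA")  # amino acid translation
        names.append(f"{gene}_NT")  # nucleotide sequence
    for gene in RRNA_GENES:
        names.append(f"{gene}_NT")  # rRNAs have no translation
    return names


if __name__ == "__main__":
    for name in gene_folder_names():
        print(name)
```

Running it prints one folder name per line, starting with ATP6_AA and ATP6_NT and ending with rrnS_NT and rrnL_NT.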
Importing Sequences
Importing Your Data
Drag In Your "...contig.fasta" File: Go to wherever you have your assembled mitogenome in Finder. Expand the MitoFinder folder, then expand the "..._Final_Results" folder, and find the "Sample_MitoFinder_100p_mtDNA_contig.fasta" file.
Drag the "Sample_MitoFinder_100p_mtDNA_contig.fasta" file into Geneious, where it says "Drop files here to import". Your file should show up like this. If you accidentally placed it in the wrong folder, you can click on it and move it around, like in Finder.
Drag In Your "contig.gff" File: In the same folder as your "contig.fasta" file, locate the "Sample_MitoFinder_100p_mtDNA_contig.gff" file, and drag it into Geneious, into the same area where your "contig.fasta" file shows up. This will visually map the genes MitoFinder found.
Sometimes Geneious will ask you questions about this (i.e. what you want to map it to) - just say yes to everything.
Sometimes after importing, your data won't visually look like this - usually, deleting the files from your Geneious working folder and re-importing them fixes this issue (sometimes you have to re-import multiple times before it works).
Circularization: The mitogenome should be circular. However, MitoFinder most often does not find the circularization and leaves your genes linear. If you want to check, locate the "Sample_MitoFinder_100p.infos" file and open it in a text editor.
If circularization is not found (you can also tell because Geneious does not automatically display the sequence as a circle), we want to manually force it. The non-coding region (D-loop) can be very repetitive, which makes it hard for the program to decide where the genome ends.
Forcing Circularization: Right click on your sequence in Geneious and click on "Circularize Sequence".
The change should show up in your bottom part of Geneious, where the sequence is no longer mapped in a straight line, but now becomes a circle.
Save changes.
Note
NOTE: You want COX1 to be going from left to right (clockwise)! If it is not, you have to reverse complement the entire mitogenome! Do this by clicking "R.C.". (In this screenshot, COX1 is fine, but in the above one, it is not and must be reverse complemented).
Import Amino Acids and Nucleotides: In Finder, in the same folder as before, locate and drag into Geneious the "Sample_MitoFinder_100p_mtDNA_contig_genes_AA.fasta" and the "Sample_MitoFinder_100p_mtDNA_contig_genes_NT.fasta" files. You don't want to import the "..._final_genes_..." files, because they aren't necessarily correct.
Geneious will ask how you want the sequences from the files stored. Choose "Keep sequences separate".
Your Geneious workspace should then look something like this:
There should be 13 AA files, which are noted by the blue icons - these are the Amino Acid translations of the protein coding genes, and 15 NT files, which are noted by the orange double helix icons - these are the NucleoTide sequences of both the protein coding genes and the non-coding genes (12S/rrnS and 16S/rrnL).
HOWEVER, sometimes MitoFinder doesn't find all of the genes - that's okay; that's why we've run MITOS, and if even MITOS doesn't find them, then we can troubleshoot using steps outlined later on in the protocol.
First Pass Check
The first way we will check how accurate MitoFinder was is by looking for start and stop codons in the AA sequences of the protein coding genes.
Click on one of your AA files (blue icon). In the bottom part of Geneious, you should see a sequence of letters, which are the single letter codes for amino acids. Ideally, the sequence will start with the letter 'M', which signifies a start codon, and end with an asterisk '*', which signifies the stop codon.
Wherever you are keeping your notes, write down for each gene if it did or did not start with an M and end with an asterisk.
Do this for EACH AA file. It is very likely that many of your genes will not start and end with start and stop codons. This is how I like to write my notes, but organize your own however is most helpful to you:
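If you have a lot of genes to go through, the same first pass check can be sketched in Python. This is only an illustration, not part of the lab workflow: it assumes the AA FASTA text contains the trailing asterisk for genes that have a stop codon (as shown in Geneious), and the sequence headers in the example are made up.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a {header: sequence} dict."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def first_pass_check(aa_records):
    """Return {header: (starts_with_M, ends_with_asterisk)} for each AA sequence."""
    return {
        name: (seq.startswith("M"), seq.endswith("*"))
        for name, seq in aa_records.items()
    }


# Hypothetical example records; real headers will look different.
example = """>sample@COX1
MKWLFSTNHKDIGTLYF*
>sample@ND1
ILSLLMP
"""

for name, (has_start, has_stop) in first_pass_check(read_fasta(example)).items():
    print(f"{name}: starts with M = {has_start}, ends with * = {has_stop}")
```

To use it on a real file, read the file contents into a string and pass that to read_fasta instead of the example text.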
Now put away these AA and NT files into their corresponding folders within the "Genes" folder you made. For extra clarity, you can select all of them,
Manually Fixing Mitogenome Annotations with MITOS (default)
This is the default way you should do mitogenome annotation. If neither MitoFinder nor MITOS found all 15 genes, then you would refer to "Obtaining Reference Mitogenomes for Troubleshooting, if all 15 genes are not present" onward.
Importing MITOS results
Hopefully by now, MITOS should have gotten results back to you.
By clicking the links on the left, download the GFF file, FAS file, and the protein plot. Put them in a new corresponding folder somewhere that makes sense to you (i.e. a "MITOS" folder where you have the MitoFinder folder).
Scroll to the bottom of the page. If MITOS found anything unusual, it will tell you here. This is an example:
Sometimes MITOS will find 2 or more of a gene, or not find some, and that's when the protein plot comes in handy.
This is an example of a zoomed in protein plot:
Notice how, under the pink line (nad5), there are three other lines: orange, blue, and purple (atp8, nad2, and nad4). Some genes have regions that are somewhat similar to parts of other genes, so MITOS still picks up a weak signal for them. The line with the highest plateau is the one you want to trust. When MITOS reports, say, 2 atp8 genes, what usually happened is that a low orange line like the one here appeared somewhere with no stronger line above it, so MITOS reported it as a second copy. If you check the protein plot, you will know which copy is correct - the one that reaches the highest on the graph.
In Geneious, once again drag in the SAME "...contig.fasta" file as before.
Then, drag in the MITOS "___.gff" file.
It might say this:
If so, tell it to use what it found, and click "OK".
It might also say this:
Select "Continue".
This is what your Geneious should now look like:
Rename the sequence so you know it's from MITOS (i.e. change "MitoFinder_100p.1" to "MITOS").
Force circularize for this sequence, too, like before for the MitoFinder sequence.
Align the MitoFinder and MITOS sequences for manual annotation and comparison!
Select both the MitoFinder and MITOS files. Then click "Align/Assemble".
Choose "Pairwise Align...", and choose "OK".
Your Geneious workspace should look like this:
Now, it's time to scroll through the mitogenome and manually annotate it based on comparing it with MITOS.
Place your cursor at the front of the alignment (click there), and press the zoom in button.
Scroll to the right until you hit your first annotation. Once you deal with that one, keep scrolling until the next one, and then the next one, etc. until you get through your entire mitogenome!
If it's a tRNA, just check if the MITOS and MitoFinder annotations are the same (if they start and stop with the same nucleotide).
Also check for overlapping - you don't want tRNAs to overlap either with each other or with other genes. Once you make sure the gene a tRNA is overlapping with is correct, shorten the MitoFinder tRNA annotation by clicking and dragging the pink and green bars (each separately) until they no longer overlap.
Note
NOTE: You should have a tRNA-Leu1 and tRNA-Leu2, and a tRNA-Ser1 and tRNA-Ser2. When you are interested in gene order, you write down 1 or 2 based on MITOS, not MitoFinder! MITOS has historically been more correct with this.
So in this case, this gene is tRNA-Leu2.
If it's a protein coding gene or rRNA, stop before the beginning. These are the genes for which you noted whether they start with an M (the start codon) and end with an asterisk (the stop codon).
First, I like to turn on the Translation - it helps me visually.
1. Go to the second right sidebar tab that looks like a screen
2. Check the box by "Translation"
3. MAKE SURE to change the "Genetic Code" to whatever is relevant for you - in our lab, that's mostly "Invertebrate Mitochondrial"
You can check the translation of a part of sequence by clicking where you want to start checking and then dragging to where you want to see - but MAKE SURE you drag from left to right! Drag direction matters!
Note
HOWEVER - just because you have the translation, doesn't mean you can blindly trust the colors - sometimes some codons can be interpreted as an M or something else, and if it's at the beginning of the annotated gene, Geneious likes to show it as an M, even if you wrote down that the gene does NOT start with an M.
For annotation, you are changing ONLY the MitoFinder sequence, but you are looking at MITOS to help you.
A guiding principle is that you want MINIMAL OVERLAP between genes - if you can entirely avoid it, do. (But at least in annelids, ND4 and ND4L will always overlap ~10 bp!)
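To double check the "minimal overlap" principle once you've finished editing, you could write down the start and end of each annotation and run them through something like the sketch below. It is a simplified illustration: the coordinates in the example are invented, it expects one interval per gene (not separate gene and CDS bars for the same gene), and it does not handle an annotation that wraps around the origin of the circular genome.

```python
def find_overlaps(annotations):
    """Find overlapping annotations.

    annotations: list of (name, start, end) with 1-based inclusive coordinates.
    Returns a list of (name_a, name_b, overlap_length_in_bp).
    """
    overlaps = []
    ordered = sorted(annotations, key=lambda a: a[1])
    for i, (name_a, start_a, end_a) in enumerate(ordered):
        for name_b, start_b, end_b in ordered[i + 1:]:
            if start_b > end_a:
                break  # sorted by start, so nothing further right can overlap a
            overlaps.append((name_a, name_b, min(end_a, end_b) - start_b + 1))
    return overlaps


# Hypothetical coordinates, for illustration only.
genes = [
    ("ND4L", 100, 390),
    ("ND4", 381, 1700),
    ("tRNA-His", 1701, 1765),
]
print(find_overlaps(genes))  # [('ND4L', 'ND4', 10)]
```

Here only ND4L and ND4 overlap (by 10 bp), which is the one overlap you would expect to keep in annelids.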
To change a MitoFinder annotation, click on the front or back end of the specific green gene bar, and drag it to where you want it (this may take a bit of practice to click correctly for dragging and not just selecting). Then, do the same for the specific yellow CDS bar (coding sequence) or red rrna bar (16S and 12S), and drag that to where you want it, as well.
Note
BE EXTREMELY CAREFUL NOT TO ACCIDENTALLY ADD A STOP CODON ANYWHERE in the middle of a gene! This is the bit that the "Translation" function really helps me with.
At the beginning of a gene, look for the start codon M - the corresponding nucleotides are ATA or ATG.
(Rarely, it could be another set of nucleotides. For Nephtyidae, I've run into COI starting with GTG instead.)
a.) If the gene starts with an M and the MITOS annotation is either the same, or starts further to the right, keep the MitoFinder annotation as is.
b.) If the gene starts with an M but the MITOS annotation is further to the left and either the MITOS annotation starts with an M or there is an M even further to the left that does not overlap with other genes, change the MitoFinder annotation to whatever the furthest to the left non-overlapping start codon (M) would be.
c.) If the gene does not start with an M, but MITOS' annotation does start with an M, either to the right or to the left, change the MitoFinder annotation to what MITOS found.
d.) If neither MitoFinder nor MITOS annotations start with an M: search for one. Start with the left and go to the right if no luck on the left. Compare gene length to other known genes of the taxonomic group (i.e. if a gene is usually 1500 bp, you'll know something is weird if your gene has 800 bp.).
Below are some examples of editing the beginning of a gene. Feel free to add your own if you don't think they fit into one of the already added examples!
Note
Example 1: Starts with an M, but MITOS found more that doesn't start with an M.
Here, I've run into ND1. It starts with an M.
However, MITOS found more of the gene further back to the left. But, if I click at the beginning of the MITOS annotation in the mitofinder track and drag it to the right to the beginning of the MitoFinder annotation, I can see that it does NOT start with an M (even though Geneious says I/M, it is not ATA or ATG).
But, if I start checking before then, I can see that there is a start codon right after the end of the previous gene (in this case, a tRNA). So, I will change it to that! But only because MITOS found more gene to the left.
This is what my ND1 now looks like:
At the end of a gene, look for the stop codon * - the corresponding nucleotides are TAA or TAG.
Obtaining Reference Mitogenomes for Troubleshooting, if all 15 genes are not present
Obtaining Reference Mitogenomes: When manually fixing genes, we help ourselves with reference mitogenomes. We want to import and simultaneously look at our new mitogenome and already published mitogenomes, because those help us make informed decisions when it comes to expanding or shortening a gene.
Go to GenBank and search for the mitogenomes of closely related species by typing in the following:
Search for mitogenomes of related species:
"SampleFamily" & "mitogenome"
Make sure to replace "SampleFamily" with whatever your family of organisms is, so you would search for i.e. " "Pilargidae" & "mitogenome" ". Include the quotation marks, because that means GenBank will show you only exact matches.
If your search gives no results, try searching by your sample's suborder or order, etc. until you get results. You can use WoRMS to help you.
Once you have the results, select all of them by clicking the boxes before them and choose "Send to", select "Gene Features", keep the default "FASTA Nucleotide" as the format, and click "Create File".
Rename the file accordingly and move it into your sample's folder. Then, drag it into Geneious, again keeping the sequences separate. A lot of new NT files should appear in your workspace, making it look like this:
Sometimes, the uploaded mitogenomes won't have rrnS/12S and rrnL/16S included in the FASTA download. In those cases, you'll have to specifically search for both rrnS and rrnL FASTA files on GenBank of each of the mitogenomes you imported. Download, rename, and drag in those files, too, if you don't already have them.
Manually Checking and Fixing Protein Coding Genes when all 15 genes are not present
The first pass check gave us a general idea of how complete our genes are. Now, we want to carefully check each one (even those that do start and stop with start and stop codons!) and fix those that need fixing.
Aligning the Sequences: Start by typing "COX1" into the search bar in Geneious. Then, select all of the COI sequences that were uploaded from GenBank. You want to make sure those are the only sequences selected - you can check this by looking at the upper right hand corner below the search bar (I've noted it with the arrow) - the number of selected sequences should be the same as the number of imported mitogenomes.
Note
Sometimes, uploaders to GenBank will use COI/CO1 instead of COX1 as the gene name. If you're not getting the right number of genes given the number of mitogenomes you uploaded for reference, try spelling it differently. Other examples of differently spelled genes are:
NAD1-6 instead of ND1-6
COB instead of CYTB
When aligning these, make sure when you change the search from say ND5 to NAD5, you click directly on the box where the check mark shows up to let you know you've selected the sequence. This keeps all of the sequences you've selected so far selected, even if you're in a different search result. Then, click the 'x' in the search bar, and proceed with aligning. If you click anywhere else when selecting a sequence, it will select only that specific sequence and deselect the rest.
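If you end up keeping your own notes or spreadsheets of gene names from GenBank, a small lookup like the sketch below can map the alternative spellings onto one name. It covers only the spellings mentioned in the note above, plus NAD4L, which I'm assuming follows the same NAD/ND pattern; anything else is returned unchanged (just uppercased).

```python
import re


def normalize_gene_name(name):
    """Map common GenBank spellings onto the gene names used in this protocol."""
    upper = name.strip().upper()
    if upper in {"COI", "CO1"}:
        return "COX1"
    if upper == "COB":
        return "CYTB"
    # NAD1-6 (and, by assumption, NAD4L) become ND1-6 / ND4L
    match = re.fullmatch(r"NAD([1-6]|4L)", upper)
    if match:
        return "ND" + match.group(1)
    return upper


for raw in ["COI", "co1", "NAD5", "NAD4L", "COB", "ATP6"]:
    print(raw, "->", normalize_gene_name(raw))
```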
Aligning the Sequences: Then click "Align/Assemble" in the tool bar, and choose "Map to Reference...".
This window will pop up:
Aligning the Sequences: To specify the reference sequence, which we want to be your sample, click "Choose..." and navigate to your circularized file (green circle icon), then click "Select".
Aligning the Sequences: The rest of the default options are fine, so click "OK". Once Geneious is done, clear the search, and at the top of your files, you should see an alignment (orange icon of three parallel lines). Your workspace should look like this:
In the bottom part of Geneious, you now have the alignment. The black bars are where the GenBank sequences aligned with your sequences that MitoFinder found, which are the green and yellow bars.
Check the Beginning of the Gene: First zoom in to the front of your gene. Do this by clicking on the house icon on the bottom right side of Geneious. Then, place your cursor close to the beginning of the gene you are looking at and click on the two blue arrows pointing towards each other. You can zoom out by clicking on the same icon.
This will give you such a view:
The invertebrate start codons are ATG and ATA. However, just because MitoFinder decided your gene starts with a start codon, this DOES NOT MEAN that this is the FIRST start codon - make sure to check to the left of the gene to see if there are any more ATG/ATA codons. Take care to go by three nucleotides at a time, otherwise you shift the frame of translation.
Editing the Beginning: If you found that the current way MitoFinder annotated the beginning is incorrect (either no start codon or not the right one), click on the green and yellow bars and drag them one at a time to the nucleotide they should really start with. Make sure you extend/shorten the gene only by a number of nucleotides divisible by three!
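The "go by three nucleotides at a time" search to the left can be sketched in Python like this. It is a rough illustration under a few assumptions: the gene is on the forward strand, positions are 0-based, the genome is treated as linear (no wrapping around the origin), and the search stops at the first in-frame stop codon because a gene can't contain one. The function name and the lower_limit parameter (e.g. where the previous gene ends) are made up for this example.

```python
START_CODONS = {"ATG", "ATA"}  # invertebrate mitochondrial start codons
STOP_CODONS = {"TAA", "TAG"}   # invertebrate mitochondrial stop codons


def in_frame_start_candidates(genome, gene_start, lower_limit=0):
    """List in-frame start codon positions at or to the left of gene_start.

    genome: nucleotide string (forward strand).
    gene_start: 0-based index of the first base of the current annotation.
    lower_limit: leftmost 0-based index you are willing to extend to.
    Returns positions ordered from nearest to farthest left.
    """
    genome = genome.upper()
    candidates = []
    pos = gene_start
    while pos >= lower_limit:
        codon = genome[pos:pos + 3]
        if codon in STOP_CODONS:
            break  # an in-frame stop means the gene cannot start further left
        if codon in START_CODONS:
            candidates.append(pos)
        pos -= 3  # stay in frame by moving one codon at a time
    return candidates


# Toy sequence: TAA ATG AAA ATA GGG CCC, with the annotation starting at ATA.
toy = "TAAATGAAAATAGGGCCC"
print(in_frame_start_candidates(toy, 9))  # [9, 3]
```

In the toy example, moving the start from position 9 to position 3 extends the gene by 6 nucleotides, which is divisible by three. Whether the farthest-left candidate is really the right one still depends on MITOS, the reference sequences, and overlap with neighboring genes.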
Note
For later reference, I like having "before" and "after" screenshots of the sequences, so I'll definitely know what I changed where, even months/years later.
Editing the Beginning Example:
MitoFinder didn't find the start codon. When I zoomed into the beginning of the gene, I began to look for start codons. I actually found six potential start codons (underlined red)! However, the one farthest to the left is the correct option.
Note
Note: The top few sequences I aligned look different than the bottom few because the bottom ones are from GenBank and the top ones with the graphic representations of genes are from Geneious - sequences I have already manually checked and annotated.
I then dragged the gene annotations so they now look like this when corrected:
I made sure the number of nucleotides I added was divisible by three.
Check the End of the Gene: Either zoom out and place your cursor at the end of the gene and zoom in again, or just scroll to the end of the gene.
The stop codons you are looking for are TAA or TAG. Sometimes, when MitoFinder finds the stop codon, your sequence will end like the others you are aligning it to, and other times it won't. However, it can still be correct, like this:
There's no obvious way to change this sequence, it doesn't align with the other reference sequences, but there is a stop codon, so I left it as is.
Editing the End: If you found that the current way MitoFinder annotated the end is incorrect (no stop codon), click on the green and yellow bars and drag them one at a time to the nucleotide they should really end with. Make sure you extend/shorten the gene by a number of nucleotides divisible by three!
Note
For later reference, I like having "before" and "after" screenshots of the sequences, so I'll definitely know what I changed where, even months/years later.
Editing the End Example:
MitoFinder didn't find the stop codon. When I zoomed into the end of the gene, I found two:
HOWEVER! Stop codons are different than start codons. There can't be a stop codon somewhere within the gene, so even if there is another potential stop codon farther to the right, that is not a stop codon for this gene. You have to choose the stop codon closest to the end of the gene you're fixing. Therefore, I edited this gene like this:
I made sure I extended the gene by a number of nucleotides divisible by three - in this case 45.
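The rule above (take the first in-frame stop codon, not one further to the right) can be sketched in Python as follows. This is a simplified illustration assuming a forward-strand gene, 0-based positions, and a linear sequence; the function names are made up for this example.

```python
STOP_CODONS = {"TAA", "TAG"}  # invertebrate mitochondrial stop codons


def first_in_frame_stop(genome, gene_start):
    """Return the 0-based index of the first base of the first in-frame
    stop codon at or after gene_start, or None if there isn't one."""
    genome = genome.upper()
    for pos in range(gene_start, len(genome) - 2, 3):
        if genome[pos:pos + 3] in STOP_CODONS:
            return pos
    return None


def extension_needed(current_end, stop_pos):
    """Number of nucleotides to add so the gene ends with the stop codon.

    current_end: 0-based index just past the last annotated base.
    """
    return stop_pos + 3 - current_end


# Toy sequence: ATG AAA CCC TAA GGG TAG
toy = "ATGAAACCCTAAGGGTAG"
stop = first_in_frame_stop(toy, 0)
print(stop)                       # 9 (the TAA, not the TAG further right)
print(extension_needed(6, stop))  # 6, divisible by three
```

If the annotation currently stops after the second codon (index 6), extending it to include the TAA adds 6 nucleotides. The TAG further to the right is ignored because the gene would already have ended.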
I Can't Find a Stop Codon?
AA residue thing
Checking the Validity of Edits: After you've edited the annotations of a protein coding gene, there is one more way to check that you have done so correctly. This makes sure you've edited the nucleotides comprising the gene in numbers divisible by three, and also ensures you didn't accidentally add any extra stop codons anywhere.
First, click on the green annotation (the gene), and then click 'Extract'.
You can add something to the end of the name, i.e. "...AnnotationCheck" so that you know where the sequence came from later on, then click "OK".
Checking the Validity of Edits: Then click on the sequence it gives you as an output and click on the "Translate" button. Change the Genetic code to "Invertebrate Mitochondrial", leave the translation frame as 1, and make sure to UNCHECK both of the checked boxes - we don't want to force Geneious to translate the first three nucleotides as a start codon and we don't want it to remove the final stop codon.
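For reference, the same validity check can be sketched outside of Geneious in plain Python. This is an illustration, not a replacement for the Geneious step: it hard-codes NCBI translation table 5 (Invertebrate Mitochondrial), translates in frame 1, does not force the first codon to M, and keeps the final stop codon, which mirrors unchecking both boxes. Unknown or ambiguous codons are shown as X.

```python
BASES = "TCAG"
# NCBI translation table 5 (Invertebrate Mitochondrial),
# codons in the order TTT, TTC, TTA, TTG, TCT, ... GGG.
TABLE_5 = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    a + b + c: TABLE_5[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nt):
    """Translate in frame 1; incomplete trailing codons are ignored."""
    nt = nt.upper()
    return "".join(
        CODON_TO_AA.get(nt[i:i + 3], "X") for i in range(0, len(nt) - 2, 3)
    )


def check_gene(nt):
    """Return (protein, problems) for an extracted gene sequence."""
    problems = []
    if len(nt) % 3 != 0:
        problems.append("length is not divisible by three")
    protein = translate(nt)
    if "*" in protein[:-1]:
        problems.append("internal stop codon")
    if not protein.endswith("*"):
        problems.append("no stop codon at the end")
    return protein, problems


print(check_gene("ATAAAATGATAA"))  # ('MKW*', [])
print(check_gene("ATGTAAAAATAG"))  # ('M*K*', ['internal stop codon'])
```

Note that a gene with a rare start codon such as GTG will not show an M at the first position here, because nothing is being forced.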