Now we're going to calculate the diversity at each of your sample sites. These calculations will be done entirely in Excel rather than on the command line. mothur can do this using OTUs, but the OTUs we generated in mothur were at too fine a resolution to be useful for our purposes. We want to compare diversity at a higher taxonomic level, so we're calculating it by hand based on our taxonomic classifications.
We'll calculate diversity using two measures: species richness (r) and the Shannon-Wiener Index (H'). H' is meant to take into account both taxon richness (how many different taxa there are) and evenness (does one taxon dominate, or are the taxa evenly distributed?).
We are going to calculate the diversity of each of your sample sites at taxonomic level 2. This means that each entry at level 2 (e.g. 'Euryarchaeota,' 'Thaumarchaeota,' 'Proteobacteria') will count as one taxon. The number of sequences assigned to a taxon in a sample serves as that taxon's abundance in that sample. Calculate the following for each of your sites (i.e. you should have one H' value for the deep chlorophyll max, one for the surface, and one for the mesopelagic zone).
Species richness = r = number of taxa in your sample (count only taxa with at least one sequence in that sample)
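The equation for the Shannon-Wiener index is:
H' = -Σ (Pi ln(Pi))
where:
H' = index of taxonomic diversity, the Shannon-Wiener Index
Σ = sum over all taxa in the sample
Pi = proportion of the total sample belonging to the ith taxon, expressed as a fraction between 0 and 1 (not as a percentage)
ln = natural log (log base e; not the same as log, which is base 10 by default in Excel)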
This index takes into account not just the number of taxa (richness) in a sample but also how evenly the sequences are distributed among those taxa (evenness). The index increases with greater richness or with greater evenness.
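To see how richness and evenness each affect the index, here is a small Python sketch. It is only an illustration: the three communities are made-up proportions, not data from this lab, and the function name shannon_index is my own.

```python
import math

def shannon_index(proportions):
    """Return H' = -sum(p * ln(p)), skipping taxa with a proportion of 0."""
    return -sum(p * math.log(p) for p in proportions if p > 0)

# Hypothetical communities, each given as proportions that sum to 1.
even_4 = [0.25, 0.25, 0.25, 0.25]    # 4 taxa, perfectly even
uneven_4 = [0.97, 0.01, 0.01, 0.01]  # 4 taxa, one taxon dominates
even_8 = [0.125] * 8                 # 8 taxa, perfectly even

for name, community in [("even, 4 taxa", even_4),
                        ("uneven, 4 taxa", uneven_4),
                        ("even, 8 taxa", even_8)]:
    print(f"{name}: H' = {shannon_index(community):.3f}")
```

The even four-taxon community gives H' = ln(4), about 1.386. The dominated community has the same richness but a much lower H' (about 0.168), and doubling the number of evenly distributed taxa raises H' to ln(8), about 2.079.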
Hint: -Σ (Pi ln(Pi)) = -((Ptaxon1*ln(Ptaxon1)) + (Ptaxon2*ln(Ptaxon2)) + (Ptaxon3*ln(Ptaxon3)) + …)
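As a worked illustration of the hint, take a hypothetical sample (made-up numbers, not from your data) with 100 sequences split among three taxa with 50, 30, and 20 sequences. Then r = 3, and H' works out as follows:

```latex
\begin{align*}
P_1 &= 50/100 = 0.5, \quad P_2 = 30/100 = 0.3, \quad P_3 = 20/100 = 0.2 \\
H' &= -\left(0.5\ln 0.5 + 0.3\ln 0.3 + 0.2\ln 0.2\right) \\
   &\approx -\left(-0.347 - 0.361 - 0.322\right) \\
   &\approx 1.03
\end{align*}
```

Each term Pi ln(Pi) is negative because Pi is less than 1, which is why the minus sign in front of the sum makes H' positive.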
I suggest that you start by calculating the total number of sequences for each sample site across all of taxonomic level 2. Then, for each taxon within taxlevel2, calculate the proportion of sequences belonging to that taxon (Pi). Then calculate H'.
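If you would like to cross-check your spreadsheet, the same three steps can be sketched in Python. This is an optional sketch, not part of the assignment: the counts below are invented placeholders, the names counts_by_site and summarize_site are my own, and I am assuming that richness counts only taxa with at least one sequence.

```python
import math

# Hypothetical sequence counts per level-2 taxon; replace with your own values.
counts_by_site = {
    "surface": {"Proteobacteria": 60, "Thaumarchaeota": 0, "Euryarchaeota": 40},
    "deep chlorophyll max": {"Proteobacteria": 40, "Thaumarchaeota": 35, "Euryarchaeota": 25},
    "mesopelagic": {"Proteobacteria": 10, "Thaumarchaeota": 70, "Euryarchaeota": 20},
}

def summarize_site(taxon_counts):
    """Return (total sequences, richness r, Shannon index H') for one site."""
    # Step 1: total number of sequences at this site.
    total = sum(taxon_counts.values())
    # Taxa with zero sequences are dropped, since ln(0) is undefined.
    present = [n for n in taxon_counts.values() if n > 0]
    richness = len(present)
    # Step 2: proportion Pi for each taxon that is present.
    proportions = [n / total for n in present]
    # Step 3: H' = -sum(Pi * ln(Pi)).
    h_prime = -sum(p * math.log(p) for p in proportions)
    return total, richness, h_prime

summary = {site: summarize_site(counts) for site, counts in counts_by_site.items()}
for site, (total, r, h) in summary.items():
    print(f"{site}: total = {total}, r = {r}, H' = {h:.3f}")
```

With these placeholder counts, the surface site has a total of 100 sequences, r = 2 (one taxon has no sequences), and H' of about 0.673. Your Excel values for your own data should match what this sketch gives when you substitute your counts.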
Excel command for natural log: LN()
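Note that in Excel, LN(0) returns an error (#NUM!), so skip the cells with a value of 0. A taxon with no sequences in a sample contributes nothing to the sum, so skipping it does not change H'.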
Pay attention to your use of parentheses!
Please lay out these calculations clearly so that I can follow them. You will be submitting these Excel spreadsheets as part of your lab assignment for this week. Please compile the total number of sequences, the species richness, and the Shannon-Wiener index for each of your sample depths in a table. Save this as Table 1 and provide a caption.