Taxonomic Bias

Datasets produced by different analysts are incompatible (Lee et al. 2019). Figure 1 shows an ordination of a diatom survey in which each point represents a sample and each color represents an analyst. If identifications were consistent, the colors would be intermixed. Instead, samples cluster by analyst, which indicates that different analysts did not identify the same taxa in the same way. This taxonomic bias is caused by inconsistent naming practices among analysts.
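One rough way to check for this kind of clustering is to compare how dissimilar samples are within an analyst versus between analysts. The sketch below is not the NMDS analysis of Lee et al. 2019; it is a simpler stand-in that uses Bray-Curtis dissimilarity on made-up counts. The taxon names, counts, and function names are all hypothetical.

```python
from itertools import combinations

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count dictionaries (taxon -> count)."""
    taxa = set(a) | set(b)
    diff = sum(abs(a.get(t, 0) - b.get(t, 0)) for t in taxa)
    total = sum(a.values()) + sum(b.values())
    return diff / total if total else 0.0

def mean_dissimilarity(samples, same_analyst):
    """Mean dissimilarity over sample pairs that share (or do not share) an analyst."""
    values = [
        bray_curtis(x["counts"], y["counts"])
        for x, y in combinations(samples, 2)
        if (x["analyst"] == y["analyst"]) == same_analyst
    ]
    return sum(values) / len(values) if values else float("nan")

# Made-up counts: analysts A and B use different names for what may be one taxon.
samples = [
    {"analyst": "A", "counts": {"Taxon 1": 30, "Taxon 2": 10}},
    {"analyst": "A", "counts": {"Taxon 1": 25, "Taxon 2": 15}},
    {"analyst": "B", "counts": {"Taxon 1b": 28, "Taxon 2": 12}},
    {"analyst": "B", "counts": {"Taxon 1b": 32, "Taxon 2": 8}},
]

within = mean_dissimilarity(samples, same_analyst=True)
between = mean_dissimilarity(samples, same_analyst=False)
```

If the between-analyst value is much larger than the within-analyst value, as it is for these invented counts, samples are grouping by who identified them. In a real survey, environmental differences between sites also contribute to dissimilarity, so this comparison is only a first look.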

Failure to Correct Bias

Most projects implement quality assurance and quality control (QA/QC) procedures to achieve taxonomic consistency. However, post-hoc taxonomic correction is time-consuming and ineffective: it sacrifices data resolution and cannot reduce the "analyst signal" without also reducing the environmental signal (Lee et al. 2019).

The Solution

One solution to taxonomic bias is to use a voucher flora, combined with randomized assignment of samples to analysts and a multi-party system of QA/QC (Tyree et al., in prep).
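Randomized sample assignment could look something like the sketch below. It is only an illustration: the sample IDs, analyst labels, and the assign_samples function are hypothetical, and the protocol cited above may handle details such as blocking by site or workload differently.

```python
import random

def assign_samples(sample_ids, analysts, seed=0):
    """Randomly assign samples to analysts in near-equal numbers."""
    rng = random.Random(seed)  # fixed seed so the assignment can be reproduced
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    # Deal the shuffled samples out round-robin so workloads differ by at most one.
    return {
        sample: analysts[i % len(analysts)]
        for i, sample in enumerate(shuffled)
    }

# Hypothetical project with ten samples and three analysts.
samples = [f"S{n:03d}" for n in range(1, 11)]
analysts = ["analyst_A", "analyst_B", "analyst_C"]
assignment = assign_samples(samples, analysts)
```

Shuffling before dealing the samples out means that no analyst systematically receives samples from one region, season, or collection batch, so any remaining analyst effect is not confounded with the environmental gradient.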

What is a Voucher Flora?

A voucher flora is a collection of light microscope images for a defined project.

A voucher flora is created before identification and enumeration of diatoms begins. Prepared slides from the entire project are examined, and images of all diatom taxa are collected. The images are then grouped into species, or operational taxonomic units (OTUs). Analysts working on the project discuss species boundaries together and agree on the final OTUs in the voucher flora. Then, as each analyst identifies and enumerates samples, the voucher flora coordinates the analysts so that all participants work from the same morphological understanding of species boundaries. Finally, the voucher flora is archived as a public, permanent record.

Steps to Eliminate Taxonomic Bias

1. A priori, images are collected and organized into morphological OTUs.

2. Each OTU is assigned a provisional name (GOM01, GOM02, etc.).

3. Taxonomists add images to the voucher flora when new taxa are encountered during analysis.

4. Identification of species is delayed until the final step of analysis.

5. The voucher flora is made publicly available and serves as a permanent record of the study.

6. Permanent slides are deposited in a public herbarium to support the voucher.
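Steps 2 and 4 can be sketched in a few lines of code. This is a minimal illustration, assuming that a prefix such as GOM stands for a genus (for example Gomphonema); the counts, the final names, and both function names are hypothetical placeholders, not data from any study.

```python
def provisional_codes(prefix, n_otus):
    """Return provisional codes such as GOM01, GOM02, ... for n_otus groups."""
    return [f"{prefix}{i:02d}" for i in range(1, n_otus + 1)]

def apply_names(counts, final_names):
    """Translate counts keyed by provisional code into counts keyed by final name."""
    named = {}
    for code, n in counts.items():
        name = final_names[code]
        # OTUs that turn out to be the same species are summed together.
        named[name] = named.get(name, 0) + n
    return named

codes = provisional_codes("GOM", 3)

# During enumeration, counts are recorded against provisional codes only.
counts = {"GOM01": 42, "GOM02": 7, "GOM03": 13}

# Hypothetical lookup applied at the final step of analysis.
final_names = {
    "GOM01": "Gomphonema sp. A",
    "GOM02": "Gomphonema sp. B",
    "GOM03": "Gomphonema sp. A",
}

named_counts = apply_names(counts, final_names)
```

Because counts are stored against the provisional codes, a later change in how an OTU is named only requires editing the lookup table, not recounting or re-identifying any sample.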



Bishop, I.W., Esposito, R.M., Tyree, M. and Spaulding, S.A. 2017. A diatom voucher flora from selected southeast rivers (USA). Phytotaxa 332: 101-140.




2008 NRSA Analyst Comparison
Image Credit: Sylvia Lee
Figure 1. Non-metric multidimensional scaling (NMDS) ordination of 2008 EPA samples. Each analyst is shown in a different color. From Lee et al. 2019, "Taxonomic harmonization may reveal a stronger association between diatom assemblages and total phosphorus in large datasets."
Bishop et al. 2017
Image Credit: Bishop et al. 2017
Figure 2. An example plate of Gomphonema OTUs from a voucher flora (Bishop et al. 2017).