Taxonomy, classification,
and specimens
3/28/11
Taxonomy vs. classification
Taxonomy is the practice and science of classification
It is usually organized by supertype-subtype relationships
(generalization-specialization relationships or parent-child
relationships)
A hierarchical taxonomy is a tree structure of classifications
for a given set of objects. At the top of this structure is a
single classification, the root node, that applies to all objects.
Nodes below this root are more specific classifications that
apply to subsets of the total set of classified objects
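The tree structure described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the class name TaxonNode, its methods, and the example lineage are assumptions chosen for the sketch.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class TaxonNode:
    """One classification (node) in a hierarchical taxonomy."""
    name: str
    parent: Optional[TaxonNode] = field(default=None, repr=False)
    children: list[TaxonNode] = field(default_factory=list, repr=False)

    def add_child(self, name: str) -> TaxonNode:
        """Attach a more specific classification below this node."""
        child = TaxonNode(name, parent=self)
        self.children.append(child)
        return child

    def lineage(self) -> list[str]:
        """Return the names from the root node down to this node."""
        path = []
        node = self
        while node is not None:
            path.append(node.name)
            node = node.parent
        return path[::-1]


# Illustrative example: the root applies to all classified objects,
# each level below applies to a subset of them.
root = TaxonNode("Animalia")
arthropoda = root.add_child("Arthropoda")
insecta = arthropoda.add_child("Insecta")
orthoptera = insecta.add_child("Orthoptera")

print(" > ".join(orthoptera.lineage()))
```

Storing a parent link on every node makes it cheap to walk from any classification back to the root, which is the usual direction of lookup when a specimen has been identified to a low-level taxon.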
Biological taxonomy
Alpha taxonomy: the science of defining and
naming organisms; it is the alphabet of biology
Beta taxonomy (systematics): the science of
understanding the relationships among taxa; it
is the grammar of biology
Taxonomy provides a relational link between
and amongst biological phenomena
Why alpha taxonomy matters
Taxonomic name is the unique ID of a taxon
Facilitates communication about taxa e.g.,
Identifying and describing species
Biodiversity mapping and cataloging life
Standardization of model organisms
Classification of organisms according to a variety of criteria
(evolutionary, utilitarian, geographic etc.)
Was a prerequisite for evolutionary thought
Taxonomy provides a stable
and universal vocabulary of
organisms
Paul Ehrlich: a
cautionary tale
Criteria of a good taxonomy
Stability (ICZN, ICBN etc.)
Uniformity (using a dead language)
Traceability (taxonomic changes leave a documented trail)
Logical hierarchy (a difficult transition from scala naturae to
phylogenies)
Pre-Linnean taxonomy
Shen Nung, Emperor of China around 3000 BC,
is known as the Father of Chinese medicine and is
believed to have introduced acupuncture
His pharmacopoeia, Divine Husbandman's Materia
Medica, included 365 medicines derived from
minerals, plants, and animals
Around 1500 BC medicinal plants were
illustrated on wall paintings in Egypt
In one of the oldest and largest papyrus rolls,
Ebers Papyrus, plants are included as
medicines for different diseases. They have
local names such as "celery of the hill country"
and "celery of the delta" (species of Apiaceae)
Pre-Linnean taxonomy
Biological taxonomy as a branch of Western
science emerged with Aristotle (384-322 BC)
In Historia Animalium he introduced the
concept of scala naturae (Ladder of Life)
according to which organisms were
classified
Aristotle recognized 520 species of animals,
which he divided into those with blood
(vertebrates) and without blood
(invertebrates)
Animals were arranged according to their
vitality and ability to move
Pre-Linnean taxonomy
Picture of a Violet in De Materia
Medica by Dioscorides
Theophrastus (370-285 BC) wrote a classification of all known
plants, De Historia Plantarum, which contained 480 species. His
classification was based on growth form, and we still recognize
many of his plant genera e.g., Narcissus, Crocus and Cornus.
Dioscorides (40-90 AD), a Greek physician, wrote De Materia
Medica, which contained around 600 species
The classification in his work is based on the medicinal properties
of the species.
Plinius (23-79 AD), in Naturalis Historia, a work of 37 books,
described plants and gave them Latin names
Many of these names are still in use e.g., Populus alba and Populus
nigra
The Father of Botanical Latin
Pre-Linnean taxonomy
Gaspard Bauhin (1560-1624) was a Swiss
botanist who wrote Pinax theatri botanici (1623).
He introduced many names of genera that were
later adopted by Linnaeus, and remain in use.
For species he carefully pruned the descriptions
down to as few words as possible e.g., Plantago
media = Plantago foliis ovato-lanceolatis
pubescentibus, spica cylindrica, scapo tereti
The single-word description was still a
description intended to be diagnostic, not an
arbitrarily-chosen name to serve as a unique
identifier
Joseph Pitton de Tournefort (1656-1708), a
French botanist, introduced the concept of a
genus, which can have multiple species
Carl Linnaeus (1707-1778)
Swedish botanist who introduced the now
accepted hierarchical classification of living
organisms and binomial nomenclature of
species (for this Linnaeus was designated
the lectotype of Homo sapiens [in Stearn
1959: 4])
First presented in Leiden in 1735, Systema
Naturae was based on Aristotle's system of
progressive subdivision of groupings of
organisms
Introduced the concepts of kingdoms,
classes, orders, genera, and species
Published in 1753, Species Plantarum is
internationally accepted as the beginning of
modern botanical nomenclature; it
described over 7,300 species
International Code of Botanical
Nomenclature (ICBN)
The formal starting date of nomenclature is 1 May 1753,
the publication of Species Plantarum by Linnaeus (or at
later dates for specified groups and ranks)
ICBN applies not only to plants, as they are now defined,
but also to other organisms traditionally studied by
botanists and mycologists. This includes Cyanobacteria;
fungi, including chytrids, oomycetes, and slime moulds;
photosynthetic protists and taxonomically related non-photosynthetic groups (bacteria were excluded in 1990)
Taxonomic ranks recognized by ICBN
kingdom (regnum)
subregnum
division or phylum (divisio, phylum)
subdivisio or subphylum
class (classis)
subclassis
order (ordo)
subordo
family (familia)
subfamilia
tribe (tribus)
subtribus
genus (genus)
subgenus
section (sectio)
subsectio
series (series)
subseries
species (species)
subspecies
variety (varietas)
subvarietas
form (forma)
subforma
Naming rules of ICBN
The name of a taxon above the rank of family is treated as a noun in the plural and
is written with an initial capital letter
A name of a division or phylum should end in -phyta unless the taxon is a division
or phylum of fungi, in which case its name should end in -mycota
A name of a subdivision or subphylum should end in -phytina, unless it is a
subdivision or subphylum of fungi, in which case it should end in -mycotina
A name of a class or of a subclass should end as follows:
In the algae: -phyceae (class) and -phycidae (subclass);
In the fungi: -mycetes (class) and -mycetidae (subclass);
In other groups of plants: -opsida (class) and -idae, but not -viridae (subclass)
The name of a family is a plural adjective used as a noun with the termination
-aceae
For the naming of cultivated plants there is a separate code, the
International Code of Nomenclature for Cultivated Plants (ICNCP)
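The ending rules listed above lend themselves to a simple lookup table. The sketch below is only an illustration of those rules as stated on this slide: the function names, the group labels ("algae", "fungi", "other") and the table layout are assumptions, and a real validator would need the full text of the Code.

```python
# Endings by rank and group, transcribed from the rules listed above.
# "other" stands for groups of plants other than algae and fungi.
ENDINGS = {
    "division": {"fungi": "mycota", "other": "phyta"},
    "subdivision": {"fungi": "mycotina", "other": "phytina"},
    "class": {"algae": "phyceae", "fungi": "mycetes", "other": "opsida"},
    "subclass": {"algae": "phycidae", "fungi": "mycetidae", "other": "idae"},
    "family": {"other": "aceae"},
}


def expected_ending(rank: str, group: str = "other") -> str:
    """Return the ending expected for a rank in a given group."""
    by_group = ENDINGS[rank]
    return by_group.get(group, by_group["other"])


def follows_ending_rule(name: str, rank: str, group: str = "other") -> bool:
    """Check initial capital and rank ending for a suprageneric name."""
    if not name[:1].isupper():
        return False
    if not name.endswith(expected_ending(rank, group)):
        return False
    # Subclasses of other plant groups end in -idae, but not -viridae.
    if rank == "subclass" and group == "other" and name.endswith("viridae"):
        return False
    return True


for name, rank, group in [
    ("Magnoliophyta", "division", "other"),
    ("Ascomycota", "division", "fungi"),
    ("Chlorophyceae", "class", "algae"),
    ("Rosaceae", "family", "other"),
]:
    print(name, rank, follows_ending_rule(name, rank, group))
```

A check like this only tests the form of a name. It says nothing about whether the name has been validly published.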
Carl Linnaeus (1707-1778)
Systema Naturae: the 10th edition, released
in 1758, is the starting point for
zoological nomenclature
Names published before that date are
unavailable, even if they would otherwise
satisfy the rules. The only work which
takes priority over the 10th edition is Carl
Alexander Clerck's Aranei Suecici, which
was published in 1757
International Code of Zoological
Nomenclature (ICZN)
Rules the naming and classification of the Metazoa and
protistan taxa whenever they are or have been treated
as animals for nomenclatural purposes
Scope: independent of botanical nomenclature, no
name is to be rejected because it is identical with the
name of a plant (homonymy a problem for databases)
Basic principles
Law of Priority
Law of Proscription
Law of Type Fixation
Naming rules of ICZN
Nominate subtaxa: if a taxon is divided into subtaxa, the name of one must be the
same as, or derived from, that of the taxon (except for ending), e.g., Blaberinae
(subfamily) in Blaberidae (family)
Endings: Family-group names: superfamily -oidea, family -idae, subfamily -inae,
tribe -ini, subtribe -ina; no rules for higher taxa
Family- and Genus-group names are always capitalized
Genus always a noun
Species usually an adjective that must agree in gender with the genus
Species-group names are never capitalized
No Species-group name alone constitutes the name of a species, it must be used in
combination with a Genus-group name
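The family-group endings and the nominate-subtaxon rule above can be illustrated with a short sketch. The function names are assumptions, and matching on the ending alone is only a heuristic: a genus name can coincidentally end the same way (the snail genus Littorina ends in -ina), so the rank of a name should come from a catalog, not from its spelling.

```python
from typing import Optional

# Family-group endings as listed above, ordered so each name matches once.
FAMILY_GROUP_ENDINGS = [
    ("oidea", "superfamily"),
    ("idae", "family"),
    ("inae", "subfamily"),
    ("ini", "tribe"),
    ("ina", "subtribe"),
]


def family_group_rank(name: str) -> Optional[str]:
    """Guess the family-group rank of a name from its ending."""
    for ending, rank in FAMILY_GROUP_ENDINGS:
        if name.endswith(ending):
            return rank
    return None


def stem(name: str) -> str:
    """Strip a recognized family-group ending, leaving the stem."""
    for ending, _rank in FAMILY_GROUP_ENDINGS:
        if name.endswith(ending):
            return name[: -len(ending)]
    return name


def is_nominate_subtaxon(subtaxon: str, taxon: str) -> bool:
    """True when both names share a stem and differ only in the ending."""
    return subtaxon != taxon and stem(subtaxon) == stem(taxon)


print(family_group_rank("Blaberinae"))
print(family_group_rank("Blaberidae"))
print(is_nominate_subtaxon("Blaberinae", "Blaberidae"))
```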
Naming rules of ICZN
Words: uninominal for supraspecific taxa, binominal for species, trinominal for
subspecies (quadrinomials, forms, and variants are not covered by the ICZN)
Author: author's name follows scientific name without punctuation, in
parentheses if combined with a generic name different from the original
combination e.g., Redtenbacheriella maculata Karny 1910 becomes
Pseudosaga maculata (Karny 1910)
Author of combination is not cited (in botany the author of the new
combination is cited)
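The zoological author-citation rule above reduces to one conditional. The sketch below reproduces the Karny example from this slide; the function name and its parameters are assumptions made for illustration.

```python
def cite_zoological_name(genus: str, epithet: str, author: str,
                         year: int, original_genus: str) -> str:
    """Format a species name with its author in the zoological style.

    The author and year go in parentheses when the species is now
    combined with a genus other than the one it was described in.
    The author of the new combination is not cited.
    """
    authority = f"{author} {year}"
    if genus != original_genus:
        authority = f"({authority})"
    return f"{genus} {epithet} {authority}"


# Original combination: no parentheses.
print(cite_zoological_name("Redtenbacheriella", "maculata", "Karny", 1910,
                           original_genus="Redtenbacheriella"))
# Transferred to another genus: author and year in parentheses.
print(cite_zoological_name("Pseudosaga", "maculata", "Karny", 1910,
                           original_genus="Redtenbacheriella"))
```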
Problems with binomial (Species-group)
nomenclature
Is typological and inadequate to circumscribe genetic and
morphological diversity of species
Attempts to overcome the limitations by introducing
subspecific taxa (subspecies, variety, form)
There is still no central, authoritative repository of ALL
names for ALL organisms (but we are getting there)
Post-Linnean taxonomy
Jean-Baptiste de Lamarck (1744-1829)
launched an evolutionary theory including
inheritance of acquired characters, known as
"Lamarckism".
First example of using data interpreted within
an evolutionary framework to classify
organisms; Scala Naturae was no longer the leading
principle of classification.
Willi Hennig (1913-1976) founded the era of
cladistics
Only similarities grouping species
(synapomorphies) should be used in
classification
Taxa should include all descendants from one
single ancestor (the rule of monophyly)
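The rule of monophyly can be tested mechanically on a rooted tree: a group of terminal taxa is monophyletic when some node has exactly that group as its set of descendants. The sketch below is a minimal illustration under assumptions: the tree is a plain dictionary mapping each internal node to its children, and the taxa A to D and node names are placeholders, not a real phylogeny.

```python
# Hypothetical rooted tree: ((A, B), C), D
TREE = {
    "root": ["n1", "D"],
    "n1": ["n2", "C"],
    "n2": ["A", "B"],
}


def leaves_under(tree: dict, node: str) -> set:
    """Return the set of terminal taxa descending from a node."""
    children = tree.get(node, [])
    if not children:
        return {node}
    result = set()
    for child in children:
        result |= leaves_under(tree, child)
    return result


def is_monophyletic(tree: dict, root: str, taxa) -> bool:
    """True if some node's descendants are exactly the given taxa."""
    target = set(taxa)
    stack = [root]
    while stack:
        node = stack.pop()
        if leaves_under(tree, node) == target:
            return True
        stack.extend(tree.get(node, []))
    return False


print(is_monophyletic(TREE, "root", {"A", "B"}))       # a clade
print(is_monophyletic(TREE, "root", {"A", "B", "C"}))  # a clade
print(is_monophyletic(TREE, "root", {"A", "C"}))       # leaves out B
```

Recomputing the leaf set at every node is wasteful on large trees, but it keeps the definition visible.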
Rank-free classification: PhyloCode
Kevin de Queiroz and Jacques Gauthier started the
discussions in the 1990s and laid the theoretical foundation
for a new nomenclatural code for all organisms, the
PhyloCode
The first draft was published on the web in 2000
PhyloCode reflects a philosophical shift from naming species
and subsequently classifying them into higher taxa to naming
both species and clades.
Only species and clades should have names; all
ranks above species are excluded from nomenclature.
Why specimens matter
The specimens contained in museum collections represent the
totality of our current understanding of the world's biodiversity
Specimens in collections reveal polymorphisms, help reconstruct
historical distributions, develop models of seasonal phenology, and
identify potential hotspots of diversity and endemism which may
be crucial to regional conservation efforts
Collections-based research forms the foundation of all
phylogenetic and systematic treatments, including molecular-based research
The process of managing a specimen
collection
1. Collecting and preparing the specimens: focus on the
preservation of maximum number of characters and
specimen longevity
2. Accessioning: a process whereby a group of specimens
entering the collection are recognized as a group united by
their origin, and all associated information is recorded
(permits, collectors etc.)
3. Determination: specimens are identified to the lowest level of
taxonomic hierarchy possible; identification and its accuracy
can be refined with time
The process of managing a specimen
collection
4. Cataloging: the assignment of a unique, institution- and
collection-specific identifier; only after a specimen has been
cataloged is it considered fully curated
5. Data capture and management: data associated with
each specimen are captured in a database; these data may
be linked to other, related data (e.g., a database of host
plants collected during the same expedition, but not curated
with the insect collection). From this point on the history
and use of the specimen will be tracked, and its associated
data can be disseminated.
Data that should accompany each specimen
record
Specimen unique ID
Information in bold must be on
the physical specimen label
Lot ID/Accession ID/Catalog ID etc.
Specimen location (institution, collection, drawer, vial etc)
Collecting/observation event data
GPS coordinates
Locality names
Date/time
Collector
Collecting method
Habitat/behavior/association data
Data that should accompany each specimen
record
Specimen attributes
Sex/stage
Type status
Morphometrics
Media
Condition, notes etc.
Identification data
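The data elements listed on these two slides map naturally onto a record type. The sketch below is one possible layout, not a standard schema: the class name, field names and the specimen ID are assumptions, and the locality, date and host are borrowed from the bee example used elsewhere in these slides.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class SpecimenRecord:
    """One specimen-level record; only the unique ID is mandatory here."""
    specimen_id: str
    catalog_id: Optional[str] = None          # lot/accession/catalog ID
    storage_location: Optional[str] = None    # institution, drawer, vial
    latitude: Optional[float] = None
    longitude: Optional[float] = None
    locality: Optional[str] = None
    collected_on: Optional[str] = None        # ISO date, e.g. 1962-05-22
    collector: Optional[str] = None
    collecting_method: Optional[str] = None
    habitat_notes: Optional[str] = None       # habitat/behavior/association
    sex: Optional[str] = None
    stage: Optional[str] = None
    type_status: Optional[str] = None
    condition_notes: Optional[str] = None
    identification: Optional[str] = None


def missing_fields(record: SpecimenRecord) -> list:
    """List the names of fields that have not been filled in yet."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


record = SpecimenRecord(
    specimen_id="EXAMPLE-0000001",  # hypothetical identifier
    locality="9 mi NW Fandango Pass, CA",
    collected_on="1962-05-22",
    habitat_notes="on Artemisia tridentata",
    identification="Bombus bifarius nearcticus",
)
print(missing_fields(record))
```

Listing the empty fields makes it easy to see what still has to be transcribed from the label or looked up before the record is complete.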
Tracking the specimen
Each specimen in a collection/database should be tagged
with a unique ID
ID should be both machine- and human-readable
ID must be unique within the collection/database
ID may contain additional information (e.g., coden, species
etc.)
Specimen barcodes
Linear barcodes
Code 93 (up to 43 characters)
Code 128
Stacked code (multiples of linear codes)
Information encoded by a combination of widths of bars
and spaces (e.g., 3 bars and 3 spaces per character in Code 128)
Readable by older generation readers
Numerical or alphanumerical
Large and limited in information content
Specimen barcodes
Matrix (2D) barcodes
Data matrix code (up to 2,335 characters)
QR code (up to 7,089 characters)
Information encoded by clustering and position of blocks
Require high resolution readers
Numerical or alphanumerical
Smaller and capable of large information content
ID data that should accompany each specimen record
Identification
Species or morphospecies name e.g.,
Gryllus campestris L., Gryllus cf.
campestris, Gryllus sp. 1
Identifier's name
Date of identification
History of identification
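The three name forms given above (a full name with author, a "cf." determination, and a numbered morphospecies) can be separated before they go into a database. The parser below is a rough sketch that handles only these three patterns; the function name and the dictionary keys are assumptions.

```python
def parse_identification(text: str) -> dict:
    """Split an identification string into its parts.

    Handles 'Genus epithet Author', 'Genus cf. epithet' and
    'Genus sp. N'. Assumes a non-empty string starting with the genus.
    """
    tokens = text.split()
    genus, rest = tokens[0], tokens[1:]
    parsed = {
        "genus": genus,
        "epithet": None,
        "qualifier": None,       # e.g. "cf." for an uncertain match
        "authority": None,       # author of the species name
        "morphospecies": None,   # informal label such as "sp. 1"
    }
    if rest and rest[0] == "cf.":
        parsed["qualifier"] = "cf."
        parsed["epithet"] = rest[1] if len(rest) > 1 else None
    elif rest and rest[0] == "sp.":
        parsed["morphospecies"] = " ".join(rest)
    elif rest:
        parsed["epithet"] = rest[0]
        parsed["authority"] = " ".join(rest[1:]) or None
    return parsed


for name in ["Gryllus campestris L.", "Gryllus cf. campestris", "Gryllus sp. 1"]:
    print(parse_identification(name))
```

Keeping the qualifier and the morphospecies label in separate fields prevents an uncertain determination from being counted as a confirmed record of the species.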
What is specimen identification
Assigning an individual specimen to a species is a
hypothesis that the unknown is conspecific with
the type specimen of the species and NOT that it
fits into a typological circumscription of that
species (this concept is often misunderstood in
real life)
How to confirm a name/identification
Consult a specialist
Use peer-reviewed printed publications (monographs, keys
etc.)
Compare with reference specimens (including types)
Use online resources
How to confirm a name/identification
Online resources
Taxonomic catalogs
Type specimen databases
Other specimen databases (e.g., virtual herbaria)
Other online identification resources
Type specimens
Types are onomatophores: they
provide a physical reference point
for a specific, named, operational
taxonomic unit
Type specimens are not typical
representatives of a species
They provide a historical reference
point for a species diagnosis and
are cornerstones of nomenclatorial
stability
Type specimens
Holotype - A single physical example (or illustration) of an organism used to formally
describe a species. A name-bearing type (onomatophore, the primary type).
Syntype - Any of two or more specimens listed in a species description where a
holotype was not designated; term no longer in use.
Paratype - Any additional specimen other than the holotype, listed in the type series,
where the original description designated a holotype.
Neotype - A specimen later selected to serve as the single type specimen when an
original holotype has been lost or destroyed, or never designated.
Lectotype - A specimen later selected to serve as the single type specimen for species
originally described from a set of syntypes.
Paralectotype - Any additional specimen from among a set of syntypes, after a
lectotype has been designated from among them. These are not name-bearing types.
Type specimens online
Universal access to type information (negative
identification as the primary function)
Permanent type documentation
Error correction/type designation
Repatriation of information, other buzzwords
Online type data and image access
First online type image collection in 1995 (Venezuelan
butterflies)
First taxonomists not necessarily first to be online
(most of the Linnaeus and Fabricius types have never been
photographed)
Types imaged: 2,372
A hybrid approach: DORSA & FoCol
Smithsonian Institution Department of
Entomology
Museum of Comparative Zoology,
Harvard University
Canthon vigilans LeConte, 1858
(Coleoptera: Scarabaeidae)
MCZ Type Number: 3701
Type status: Type
Stage: Adult
Medium: Mounted
Insect types in major collections
Type specimens of 434,367 species in 13 major online collections
Images available for 35,320 species
Images available for 8.13% of species in these collections
Images available for ~4% of described species of insects (at the
most)
Taxonomic authority files
Taxon-specific authority files (bottom-up
approach)
Aggregate authority file portals and federated
biological databases (top-down)
TETTIGONIOIDEA
Species: 8,310
Types imaged: 5,269 (63%)
http://orthoptera.speciesfile.org/
FORMICIDAE
Species: 14,097
Types imaged: 801 (5.7%)
http://antweb.org
PHASMIDA
Species: 2,960
Types imaged: 1,935 (65%)
http://phasmida.speciesfile.org
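The coverage percentages quoted above follow directly from the counts. The sketch below recomputes them from the figures on these slides; the function name and the dictionary layout are assumptions.

```python
def coverage_percent(imaged: int, total: int) -> float:
    """Share of species whose types have been imaged, in percent."""
    return 100.0 * imaged / total


# (species with type images, species) as given on the slides
FIGURES = {
    "13 major online collections": (35_320, 434_367),
    "Tettigonioidea": (5_269, 8_310),
    "Formicidae": (801, 14_097),
    "Phasmida": (1_935, 2_960),
}

for label, (imaged, total) in FIGURES.items():
    print(f"{label}: {coverage_percent(imaged, total):.2f}%")
```

The taxon-specific files for Tettigonioidea and Phasmida reach roughly 63% and 65%, against about 8% for the aggregated collections and under 6% for Formicidae.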
http://www.tropicos.org/
http://plants.usda.gov
http://www.fishbase.org/
http://mczbase.mcz.harvard.edu/SpecimenSearch.cfm
http://www.sp2000.org/
http://www.itis.gov/
http://www.gbif.org/
Examples of specimen data entry protocols: Digital Bee Collection
Network (AMNH, UC Riverside, UC Davis, UC Berkeley, CSCA, Cornell, UConn, Rutgers, Vermont, USDA Bee
Systematics Lab)
TAXONOMIST:
1. Identification
Check all prior identifications
Identify as much as possible from among undetermined specimens prior
to data entry
If necessary, change header labels where epithet has been changed
2. Gender Determination
Orient males upside-down so they are obvious
Pass to data entry technician
3. Proof all entered data (later stage)
Correct errors, fill in blanks for difficult localities
Examples of specimen data entry protocols: Digital Bee Collection
Network (AMNH, UC Riverside, UC Davis, UC Berkeley, CSCA, Cornell, UConn, Rutgers, Vermont, USDA Bee
Systematics Lab)
DATA ENTRY TECHNICIAN:
4. Sorting
Organizes specimens first by locality, then by date (if multiple dates from same
locality), then by host (if multiple hosts for same date/locality); this maximizes
overlap of data elements between successive records during data entry
Secondary organization by gender where possible, to maximize number of
successive records all of the same gender (e.g., if 10 males and 10 females, each from
a unique locality, then group males and females)
5. Labeling and Transcription
Pre-printed serialized unique labels with codens applied sequentially
Essential data elements transcribed on paper worksheets
6. Data Entry
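The sorting step can be expressed as a multi-key sort. The sketch below uses made-up specimens and counts how many fields change between successive records, as a rough stand-in for typing effort; the field names, the data and the helper functions are all assumptions for illustration.

```python
FIELDS = ("locality", "date", "host", "sex")

# Hypothetical specimens in the order they come out of the drawer.
specimens = [
    {"id": 1, "locality": "Site B", "date": "1962-05-22", "host": "Artemisia", "sex": "F"},
    {"id": 2, "locality": "Site A", "date": "1962-05-22", "host": "Salix", "sex": "M"},
    {"id": 3, "locality": "Site B", "date": "1962-05-22", "host": "Artemisia", "sex": "M"},
    {"id": 4, "locality": "Site A", "date": "1962-05-21", "host": "Salix", "sex": "F"},
    {"id": 5, "locality": "Site B", "date": "1962-05-23", "host": "Artemisia", "sex": "F"},
]


def sort_key(specimen: dict) -> tuple:
    """Locality first, then date, then host, then gender."""
    return tuple(specimen[name] for name in FIELDS)


def field_changes(records: list) -> int:
    """Count fields that differ between each pair of successive records."""
    changes = 0
    for previous, current in zip(records, records[1:]):
        changes += sum(previous[name] != current[name] for name in FIELDS)
    return changes


ordered = sorted(specimens, key=sort_key)
print([s["id"] for s in ordered])
print(field_changes(specimens), "field changes before sorting")
print(field_changes(ordered), "field changes after sorting")
```

ISO-formatted dates sort correctly as plain strings, which is why the date is stored that way here. Dates transcribed as on the label (e.g., 5/22/62) would have to be parsed first.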
Examples of specimen data entry protocols: Digital Bee Collection
Network (AMNH, UC Riverside, UC Davis, UC Berkeley, CSCA, Cornell, UConn, Rutgers, Vermont, USDA Bee
Systematics Lab)
EXAMPLE
Unit tray labeled Bombus bifarius nearcticus from CSCA
Specimen labeled 9 mi NW Fandango Pass, CA, 5/22/62, on Artemisia tridentata
Determiner label reads B. nearcticus, det. R. Snelling 1962
ID already confirmed (but this taxon name no longer valid)
Gender determined
Specimen numbers already in database, a few default fields (including source
institution) auto-entered, record otherwise blank
First steps: needs species number, locality number
Secondary: various manual-entry data fields, including gender, host plant
Examples of specimen data entry protocols:
The MCZ Rhopalocera (Lepidoptera) Rapid Data Capture Project
(Museum of Comparative Zoology, Harvard University)*
MCZ holdings:
Natural history specimens: 21 million
Entomological specimens: 7.5 million
Lepidoptera specimens: 600,000
Pinned butterflies: 200,000
Separate data entry from specimen handling in a three-step
process
1. Prepare the collection
2. Photograph specimens and data labels
3. Transcribe data from high resolution images
*Based on the ECN 2010 presentation by Morris, P. J., Eastwood, R., Ford, L., Haley, B., Pierce, N.
Examples of specimen data entry protocols:
The MCZ Rhopalocera (Lepidoptera) Rapid Data Capture Project
(Museum of Comparative Zoology, Harvard University)
Protocol:
1. Identify specimens and
record in data sheet
2. Ensure only a single
species in each unit tray
3. Expand spacing so that
staff can remove and
replace specimens
without damage
Examples of specimen data entry protocols:
The MCZ Rhopalocera (Lepidoptera) Rapid Data Capture Project
(Museum of Comparative Zoology, Harvard University)
4. Taxonomic information & drawer numbers
entered into spreadsheet from which 2D
barcode labels will be generated
5. Barcoded taxonomic data labels generated from
data spreadsheet (encoded in QRCode 2D
Barcode)
Examples of specimen data entry protocols:
The MCZ Rhopalocera (Lepidoptera) Rapid Data Capture Project
(Museum of Comparative Zoology, Harvard University)
Drawer prepared for imaging, with Taxon
ID barcode labels positioned above unit
trays
Individual specimen prepared for imaging
and assigned a unique Specimen ID
barcode
Machine-read data
Human-read data
Examples of specimen data entry protocols:
The MCZ Rhopalocera (Lepidoptera) Rapid Data Capture Project
(Museum of Comparative Zoology, Harvard University)
1. Collection preparation (identification, sorting, label printing)
Average of 60 seconds per specimen (timing is highly variable depending on the state of
curation)
Done by entomologist
2. Specimen/data imaging
Average of 60 seconds per specimen
Performed by unskilled personnel
3. Data entry
Is slower but can be done simultaneously by multiple personnel
Specimens are not handled
Basic data entry done by unskilled personnel
Quality control by specialist/entomologist
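The per-specimen averages above can be turned into a rough workload estimate. The sketch below is back-of-the-envelope arithmetic under stated assumptions: it applies the 60-second averages to all 200,000 pinned butterflies from the holdings slide and assumes one person per stage. Data entry is left out because no average time is given for it.

```python
SECONDS_PER_HOUR = 3600


def stage_hours(specimens: int, seconds_per_specimen: float) -> float:
    """Total working hours for one stage done by a single person."""
    return specimens * seconds_per_specimen / SECONDS_PER_HOUR


PINNED_BUTTERFLIES = 200_000   # from the MCZ holdings slide
PREPARATION_SECONDS = 60       # average; varies with state of curation
IMAGING_SECONDS = 60           # average

preparation = stage_hours(PINNED_BUTTERFLIES, PREPARATION_SECONDS)
imaging = stage_hours(PINNED_BUTTERFLIES, IMAGING_SECONDS)

print(f"Preparation: {preparation:.0f} hours")
print(f"Imaging:     {imaging:.0f} hours")
print(f"Both stages: {preparation + imaging:.0f} hours")
```

Because imaging needs no specialist, that half of the total can be spread across several people without taking up the entomologist's time.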
Dealing with specimen backlog
The example of NC State Insect Museum
Specimens are not identified/curated a priori
All specimens are photographed and the images are made
publicly available
Specialists will (hopefully) identify the specimens and data
will be entered into a specimen-level database
http://insectmuseum.org/specimens.php
Exercises
1. Confirm the identification of specimens provided using
available online resources
2. Place the identified specimens into most recent taxonomic
hierarchy (Class: Order: Family: Subfamily: Genus)
3. Confirm the validity of the species names and their
authorship
4. Place provided barcodes on the specimens and enter them
into a simple spreadsheet