Tuesday, 29 December 2015

A common plant survey of the vice counties of Durham and South Northumberland

Many people remark on the changes that are occurring in the countryside, the disappearance of some species and the spread of others. Yet these anecdotes cannot substitute for hard facts. There are also many suggested causes for all these changes: a warmer climate, different agricultural practices, eutrophication, alien species, etc. Botanical observations tend to be biased. For example, people often note the exceptional species but ignore the common ones. So it is difficult to draw conclusions about plant abundance from casual observations. What was needed was a dedicated survey with a clear, repeatable methodology.

Common plant species are the mainstay of habitats, they create our woodlands, hedgerows and meadows; they provide the food for herbivores and pollinators and they create homes for birds and mammals. Changes in the abundance of rare species have little impact on other species, but change in the abundance of common species can have cascading effects on whole ecosystems of which we are a part.

Figure: The distribution of heather in Durham and South Northumberland predicted from the common plant survey data.

For these reasons volunteer botanists in the north-east of England conducted a four-year survey to benchmark the abundance of common plants. Led by the Botanical Society's vice county recorders, John Durkin, John Richards and Quentin Groom, they surveyed the plants in a randomly selected sample of 1 km² grid squares in the vice counties of Durham and South Northumberland. They have created a solid foundation that can be used to quantify the abundance of common species and be compared against previous and future studies. The project required volunteers to go to all sorts of places. Some people surveyed post-industrial brown-field sites, while others walked for miles across bleak moorland to reach sites high in the hills. Although these moors are arguably more wild and natural, the industrial wastelands are far more biodiverse.

The results of this survey have just been openly published, contributing an additional 35,000 observations to the 200,000 observations collected by local recorders since the turn of the millennium (http://bdj.pensoft.net/articles.php?id=7318).

Botanical surveying continues in the region despite the end of this project. Volunteers continue to monitor rare plants in the region (https://dx.doi.org/10.6084/m9.figshare.1480492; http://www.bsbi.org.uk/County_Durham_Rare_Plants_Register_2013.pdf) and are currently working towards the next atlas of Britain and Ireland coordinated by the Botanical Society of Britain and Ireland (http://bsbi.org.uk/).

Good biological conservation in the 21st century will have as much to do with sensitive adaptation to change as with preserving what we have. Human memory is short and fickle, and it is only with benchmark surveys such as this that we can hope to understand and manage that change.

Tuesday, 30 June 2015

We should drop the use of the word tetrad or at least learn what its area is!

The word tetrad is used to describe a square area of land with sides of 2 km, so it has an area of 4 km². The term is widely used in biological recording in the UK, but it is often misunderstood both in the UK and abroad. Logically, it is derived from the ancient Greek word for four. Likewise, a monad has 1 km sides and an area of 1 km². A pentad, however, has sides of 5 km and an area of 25 km², so its name is based on the length of the side, not the area, whereas a hectad has sides of 10 km and an area of 100 km², so its name, from the Greek for a hundred, follows the area again. Outside the UK many people assume a tetrad has sides of 4 km, because this follows logically from names such as pentad, but the lack of a logical series is perhaps why there are so many mistakes.
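The arithmetic is easy to check. The short Python sketch below uses only the side lengths given above; the names `SIDE_KM`, `area_km2` and `monads_in` are made up for this illustration and do not come from any recording software.

```python
# Side length in km of each named grid square, as given in the text above.
SIDE_KM = {"monad": 1, "tetrad": 2, "pentad": 5, "hectad": 10}


def area_km2(name: str) -> int:
    """Return the area in km² of a named square grid cell."""
    side = SIDE_KM[name]
    return side * side


def monads_in(name: str) -> int:
    """Return the number of 1 km squares (monads) contained in the named cell."""
    return area_km2(name) // area_km2("monad")


if __name__ == "__main__":
    for name, side in SIDE_KM.items():
        print(f"{name}: {side} km sides, {area_km2(name)} km²")
    # The common misreading: a tetrad with 4 km sides would cover 16 km².
    print("tetrad misread as having 4 km sides:", 4 * 4, "km²")
```

A tetrad therefore contains four monads, while the misreading with 4 km sides would describe a square four times too large.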

Mistakes are not hard to find. Below are some examples from official publications where the area of a tetrad is wrong...
Perhaps it is time to drop the use of this ambiguous term!

Saturday, 18 April 2015

Biological Observation Analysis and Repeatability

There has been considerable interest in the repeatability of scientific research. Ioannidis (2005) argued that most claimed research findings are false, and Simmons et al. (2011) neatly demonstrated how simple methodological decisions made by an experimenter can increase the number of false-positive errors. Consider the case of an experimentalist who has various ‘degrees of freedom’ in the way that they conduct and analyse their experiment. These freedoms have a large influence on the conclusions drawn from the results, particularly on the proportion of false-positive errors. For example, an experimenter can decide to increase the number of replicates if they do not have a significant result in their original experiment. They also have the flexibility to either include or exclude outlying values and set arbitrary thresholds for the inclusion of observations.
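The effect of adding replicates after a non-significant result can be illustrated with a small simulation. This is only a sketch under assumptions of my own, not the analysis of Simmons et al.: the data are pure noise with known unit variance, a simple two-sided z-test is used, and the sample sizes (20 replicates, then 10 more) and the function names are invented for the example.

```python
import math
import random


def p_value(sample):
    """Two-sided z-test p-value for a true mean of 0, assuming unit variance."""
    z = sum(sample) / math.sqrt(len(sample))
    return math.erfc(abs(z) / math.sqrt(2))


def simulate(n_experiments=5000, n_start=20, n_extra=10, alpha=0.05, seed=1):
    """Return (fixed_rate, flexible_rate) of false positives when no effect exists."""
    rng = random.Random(seed)
    fixed = 0
    flexible = 0
    for _ in range(n_experiments):
        sample = [rng.gauss(0, 1) for _ in range(n_start)]
        first_look = p_value(sample) < alpha
        # The flexible experimenter adds replicates only if the first test failed.
        if not first_look:
            sample += [rng.gauss(0, 1) for _ in range(n_extra)]
        second_look = p_value(sample) < alpha
        fixed += first_look
        flexible += first_look or second_look
    return fixed / n_experiments, flexible / n_experiments


if __name__ == "__main__":
    fixed_rate, flexible_rate = simulate()
    print(f"fixed sample size:   {fixed_rate:.3f}")
    print(f"optional replicates: {flexible_rate:.3f}")
```

Because the flexible experimenter gets a second chance at significance on the same null data, their false-positive rate can never be lower than that of the experimenter who fixes the sample size in advance.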

Observers of organisms have similar freedoms that contribute to their errors. For example, if an observer expects to find an organism in a habitat they may continue surveying only until they find those organisms that they anticipate will be present. Furthermore, if they expect certain organisms they are perhaps more likely to misidentify similar-looking organisms as the organism they expect to see, thus simultaneously creating a false-positive and a false-negative observation. There are many other observer behaviors that can lead to errors. Observers do not disclose their taxonomic biases, such as ignoring grasses and sedges. Observers vary in their treatment of dead organisms, with some people treating them as a sign of occupancy and others not. Observers also vary in their methodology, both in what they accept as evidence of occupancy and in the equipment they use. Such choices introduce biases even when the observers are diligently trying to conduct a survey to the best of their ability, though there are undoubtedly also cases where observers consciously manipulate their findings or carelessly report them wrongly (Sabbagh, 2001; John et al., 2012).

The generation of false positives among experimentalists also parallels biological observation in another characteristic. There is a well-known and widespread publication bias in the scientific literature towards positive results (Francis, 2012). Similarly, in my experience observers of biodiversity are much more likely to report observations of an unusual species than of a ubiquitous species, particularly if that species is easily identifiable.

There are many ways that field recording could be improved, but we will always rely on clever analysis techniques to extract information from our data. The question is, can we peel off the layers of bias to ever know anything biologically relevant from our observations? In the analysis of botanical records there is a tendency to include every observation, but if doing that only increases the biases, then ignoring biased observations can improve repeatability. It might seem rash to ignore data, but we should be skeptical of methods that use all the available data, particularly where the data is aggregated to disguise the biases.


Francis, G. (2012) Too good to be true: Publication bias in two prominent studies from experimental psychology. Psychonomic Bulletin & Review, 19(2), 151–156. doi: 10.3758/s13423-012-0227-9
Ioannidis, J.P.A. (2005) Why most published research findings are false. PLoS Medicine, 2(8), e124. doi: 10.1371/journal.pmed.0020124
John, L.K., Loewenstein, G. & Prelec, D. (2012) Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23(5), 524–532. doi: 10.1177/0956797611430953
Sabbagh, K. (2001) A Rum Affair: A True Story of Botanical Fraud. 1–276. Da Capo Press.
Simmons, J.P., Nelson, L.D. & Simonsohn, U. (2011) False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22(11), 1359–1366. doi: 10.1177/0956797611417632

This work by Quentin Groom is licensed under a Creative Commons Attribution 3.0 Unported License.

Saturday, 17 January 2015

Why is it worth reading botany from 200 years ago?

George Orwell wrote that “Each generation imagines itself to be more intelligent than the one that went before it, and wiser than the one that comes after it.”

And so it is with scientists. Until recently I was largely oblivious of the enormous corpus of botanical literature from the 18th and 19th centuries. Of course I was fully aware of the thousands of dusty volumes in libraries, but I stupidly assumed that they didn't contain much of worth. It wasn't until the Biodiversity Heritage Library made these volumes digitally available and searchable that I realized what a precious mine of information they contained.

Early Floras and Faunas are not just of historical interest; they contain descriptions of species, habitats and landscapes that give unique information on the biota of their time. This information is practically impossible to gather from any other source. The growth of trees can be examined using the width of their rings, and sometimes preserved remains can be found in bogs and in sediments, but for many plants there are no recent remains, except for what was written about them in books.

Europe in particular has the most to gain from the digitization of this literature. It has, by far, the biggest legacy of biodiversity literature, starting with the ancient Greeks and herbalists through to Linnaeus and the great plant hunters.

Fully digitizing a book requires scanning, optical character recognition (OCR), manual correction of the OCR results, and semantic enhancement of the text with annotations, such as the modern names of species, the coordinates of localities and links from references to their original sources. This can take a lot of effort, but digitization is work that only needs to be done once, particularly if the results are made openly available for everyone to share.

I recently published a paper in which I use biodiversity literature from over 200 years to examine habitat and distribution changes in a small, smelly weed, stinking goosefoot (Chenopodium vulvaria) (https://peerj.com/articles/723/). It was possible to show that the descriptions of this plant's habitat have changed with time. I look forward to the time when so much of our legacy literature is digitized that the same exercise can be conducted on hundreds of species, so that the last two hundred years' changes in biodiversity can be reconstructed from what has been written. Obviously, those changes will be viewed through the filter of those hundreds of authors' words, but it will nonetheless be a unique resource for environmental history.

If you’re interested in contributing to the digitization and transcription of biodiversity literature, investigate the Biodiversity Heritage Library, the transcription of books on Wikisource, and the full semantic publishing of legacy literature conducted by Pensoft.

Below are graphs showing change in the use of various habitat categories that were used to describe the habitat of stinking goosefoot in literature and on specimen labels. The top four categories show significant changes, but the bottom four show no significant change.