Tuesday 29 December 2015

A common plant survey of the vice counties of Durham and South Northumberland

Many people remark on the changes that are occurring in the countryside, the disappearance of some species and the spread of others. Yet these anecdotes cannot substitute for hard facts. There are also many suggested causes for these changes: a warmer climate, different agricultural practices, eutrophication, alien species, etc. Botanical observations tend to be biased. For example, people often note the exceptional species but ignore the common ones. So it is difficult to draw conclusions about plant abundance from casual observations. What was needed was a dedicated survey with a clear, repeatable methodology.

Common plant species are the mainstay of habitats, they create our woodlands, hedgerows and meadows; they provide the food for herbivores and pollinators and they create homes for birds and mammals. Changes in the abundance of rare species have little impact on other species, but change in the abundance of common species can have cascading effects on whole ecosystems of which we are a part.

The distribution of heather in Durham and South Northumberland predicted from the common plant survey data.

For these reasons volunteer botanists in the north-east of England conducted a four-year survey to benchmark the abundance of common plants. Led by the Botanical Society's vice-county recorders, John Durkin, John Richards and Quentin Groom, they surveyed the plants in a randomly selected sample of 1 km² grid squares in the vice counties of Durham and South Northumberland. They have created a solid foundation that can be used to quantify the abundance of common species and be compared against previous and future studies. The project required volunteers to go to all sorts of places. Some people surveyed post-industrial brown-field sites, while others walked for miles across bleak moorland to reach sites high in the hills. Although these moors are arguably more wild and natural, the industrial wastelands are far more biodiverse.

The results of this survey have just been openly published, contributing an additional 35,000 observations to the 200,000 observations collected by local recorders since the turn of the millennium (http://bdj.pensoft.net/articles.php?id=7318).

Botanical surveying continues in the region despite the end of this project. Volunteers continue to monitor rare plants in the region (https://dx.doi.org/10.6084/m9.figshare.1480492; http://www.bsbi.org.uk/County_Durham_Rare_Plants_Register_2013.pdf) and are currently working towards the next atlas of Britain and Ireland coordinated by the Botanical Society of Britain and Ireland (http://bsbi.org.uk/).

Good biological conservation in the 21st century will have as much to do with sensitive adaptation to change as with preserving what we have. Human memory is short and fickle, and it is only with benchmark surveys such as this that we can hope to understand and manage that change.

Tuesday 30 June 2015

We should drop the use of the word tetrad or at least learn what its area is!

The word tetrad is used to describe a square area of land with sides of 2 km; it has an area of 4 km². The term is widely used in biological recording in the UK, but it is often misunderstood both in the UK and abroad. Logically, it is derived from the ancient Greek word for four. Likewise, a monad has 1 km sides and an area of 1 km². A pentad has sides of 5 km but an area of 25 km², so its name is based on the length of its side, not its area. A hectad has sides of 10 km and an area of 100 km², so its name, like that of the tetrad, follows its area. Outside the UK many people assume a tetrad has sides of 4 km, because that seems to follow logically, and this lack of a logical series is perhaps why there are so many mistakes.
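The arithmetic behind the confusion is easy to check. The short sketch below only squares the side lengths quoted above; the names `SIDE_KM`, `area_km2` and `misread_tetrad_km2` are made up for this illustration.

```python
# Side lengths in km, as quoted in the text above.
SIDE_KM = {"monad": 1, "tetrad": 2, "pentad": 5, "hectad": 10}


def area_km2(side_km):
    """Area of a square grid cell with the given side length in km."""
    return side_km ** 2


AREAS_KM2 = {name: area_km2(side) for name, side in SIDE_KM.items()}

for name, side in SIDE_KM.items():
    print(f"{name}: {side} km x {side} km = {AREAS_KM2[name]} km2")

# The common misreading: taking "tetrad" to mean sides of 4 km
# gives an area four times too large.
misread_tetrad_km2 = area_km2(4)
print(f"tetrad misread as 4 km sides: {misread_tetrad_km2} km2")
```

A factor of four in area is not a rounding error, which is why the misreading matters when densities or frequencies are reported per tetrad.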

Mistakes are not hard to find. Below are some examples from official publications where the area of a tetrad is wrong...
Perhaps it is time to drop the use of this ambiguous term!

Saturday 18 April 2015

Biological Observation Analysis and Repeatability

There has been considerable interest in the repeatability of scientific research. Ioannidis (2005) argued that most claimed research findings are false, and Simmons et al. (2011) neatly demonstrated how simple methodological decisions made by an experimenter can increase the number of false-positive errors. Consider the case of an experimentalist who has various ‘degrees of freedom’ in the way that they conduct and analyse their experiment. These freedoms have a large influence on the conclusions drawn from the results, particularly on the proportion of false-positive errors. For example, an experimenter can decide to increase the number of replicates if they do not have a significant result in their original experiment. They also have the flexibility to either include or exclude outlying values and to set arbitrary thresholds for the inclusion of observations.
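The effect of adding replicates after a non-significant result can be seen in a small simulation. This is only a sketch under assumptions of my own, not the analysis of Simmons et al.: both groups are drawn from the same normal distribution (so every significant result is a false positive), the test is a simple two-sided z-test with a known standard deviation, and the sample sizes (20 per group, plus 10 more if the first test fails) are arbitrary.

```python
import random
from statistics import NormalDist, mean

STD_NORMAL = NormalDist()


def p_value(a, b, sigma=1.0):
    """Two-sided z-test for a difference in means, assuming known sigma."""
    se = sigma * (1 / len(a) + 1 / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))


def simulate(n_sims=10000, n_start=20, n_extra=10, alpha=0.05, seed=1):
    """Return false-positive rates for a fixed design and for a design
    where replicates are added only if the first test is not significant."""
    rng = random.Random(seed)
    fixed_hits = 0
    flexible_hits = 0
    for _ in range(n_sims):
        # No true difference exists: both groups come from N(0, 1).
        a = [rng.gauss(0, 1) for _ in range(n_start)]
        b = [rng.gauss(0, 1) for _ in range(n_start)]
        first_look = p_value(a, b) < alpha
        fixed_hits += first_look
        if first_look:
            flexible_hits += 1
            continue
        # The experimenter's 'degree of freedom': collect more and retest.
        a += [rng.gauss(0, 1) for _ in range(n_extra)]
        b += [rng.gauss(0, 1) for _ in range(n_extra)]
        flexible_hits += p_value(a, b) < alpha
    return fixed_hits / n_sims, flexible_hits / n_sims


fixed_rate, flexible_rate = simulate()
print(f"fixed design:      {fixed_rate:.3f}")
print(f"optional stopping: {flexible_rate:.3f}")
```

The fixed design should give a rate close to the nominal 5%, while the flexible design can only be higher, because it keeps every first-look 'success' and adds a second chance for the rest.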

Observers of organisms have similar freedoms that contribute to their errors. For example, if an observer expects to find an organism in a habitat they may continue surveying only until they find those organisms that they anticipate will be present. Furthermore, if they expect certain organisms they are perhaps more likely to misidentify similar-looking organisms as the organism they expect to see, thus simultaneously creating a false-positive and a false-negative observation. There are many other observer behaviors that can lead to errors. Observers often do not disclose their taxonomic biases, such as ignoring grasses and sedges. Observers vary in their treatment of dead organisms, with some treating them as a sign of occupancy and others not. Observers also vary in their methodology, both in what they accept as evidence of occupancy and in the equipment they use. Such choices introduce biases even when observers are diligently trying to conduct a survey to the best of their ability, though there are undoubtedly also cases where observers consciously manipulate their findings or carelessly report them wrongly (Sabbagh, 2001; John et al., 2012).
The generation of false positives among experimentalists also parallels that among biological observers in another characteristic. There is a well-known and widespread publication bias in the scientific literature towards positive results (Francis, 2012). Similarly, in my experience observers of biodiversity are much more likely to report observations of an unusual species than of a ubiquitous one, particularly if that species is easily identifiable.

There are many ways that field recording could be improved, but we will always rely on clever analysis techniques to extract information from our data. The question is, can we peel off the layers of bias to learn anything biologically relevant from our observations? In the analysis of botanical records there is a tendency to include every observation, but if doing that only increases the biases, then ignoring biased observations can improve repeatability. It might seem rash to ignore data, but be skeptical of methods that use all the available data, particularly where the data are aggregated in a way that disguises the biases.


Francis, G. (2012) Too good to be true: Publication bias in two prominent studies from experimental psychology. Psychonomic Bulletin & Review, 19(2), 151–156. doi: 10.3758/s13423-012-0227-9
Ioannidis, J.P.A. (2005) Why most published research findings are false. PLoS Medicine, 2(8), e124. doi: 10.1371/journal.pmed.0020124
John, L.K., Loewenstein, G. & Prelec, D. (2012) Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23(5), 524–532. doi: 10.1177/0956797611430953
Sabbagh, K. (2001) A Rum Affair. A True Story of Botanical Fraud. 1–276. Da Capo Press.
Simmons, J.P., Nelson, L.D. & Simonsohn, U. (2011) False-positive psychology: undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22(11), 1359–1366. doi: 10.1177/0956797611417632

This work by Quentin Groom is licensed under a Creative Commons Attribution 3.0 Unported License.

Saturday 17 January 2015

Why is it worth reading botany from 200 years ago?

George Orwell wrote that “Each generation imagines itself to be more intelligent than the one that went before it, and wiser than the one that comes after it.”

And so it is with scientists. Until recently I was largely oblivious of the enormous corpus of botanical literature from the 18th and 19th centuries. Of course I was fully aware of the thousands of dusty volumes in libraries, but I stupidly assumed that they didn't contain much of worth. It wasn't until the Biodiversity Heritage Library made these volumes digitally available and searchable that I realized what a precious mine of information they contained.

Early Floras and Faunas are not just of historical interest; they contain descriptions of species, habitats and landscapes that give unique information on the biota of their time. This information is practically impossible to gather from any other source. The growth of trees can be examined using the width of their rings, and sometimes preserved remains can be found in bogs and in sediments, but for many plants there are no recent remains, except for what was written about them in books.

Europe in particular has the most to gain from the digitization of this literature. It has, by far, the biggest legacy of biodiversity literature, starting with the ancient Greeks and herbalists through to Linnaeus and the great plant hunters.

Fully digitizing a book requires scanning, optical character recognition (OCR), manual correction of the OCR results, and semantic enhancement of the text with annotations such as the modern names of species, the coordinates of localities and links from references to their original sources. This can take a lot of effort, but digitization is work that only needs to be done once, particularly if the results are made openly available for everyone to share.

I recently published a paper in which I used biodiversity literature from over 200 years to examine habitat and distribution changes in the small, smelly weed stinking goosefoot (Chenopodium vulvaria) (https://peerj.com/articles/723/). It was possible to show that the descriptions of this plant's habitat have changed with time. I look forward to the time when so much of our legacy literature is digitized that the same exercise can be conducted on hundreds of species, so that the last two hundred years' changes in biodiversity can be reconstructed from what has been written. Obviously, those changes will be viewed through the filter of those hundreds of authors' words, but it will nonetheless be a unique resource for environmental history.

If you’re interested in contributing to the digitization and transcription of biodiversity literature, investigate the Biodiversity Heritage Library, the transcription of books on Wikisource and the full semantic publishing of legacy literature conducted by Pensoft.

Below are graphs showing change in the use of various habitat categories that were used to describe the habitat of stinking goosefoot in literature and on specimen labels. The top four categories show significant changes, but the bottom four show no significant change.

Sunday 17 August 2014

The problem of data discovery for invasive species

At a recent conference on invasive species someone from the audience requested that there should be fewer databases, to which there was a general muttering of agreement. Can you imagine if someone had said the same thing about books? Surely only an anti-intellectual would want fewer books. However, I perfectly understand the frustrations that lead to this request. New databases and websites on alien species are launched all the time. They never have precisely the same remit, but they always have overlapping interests with other databases. For a time-pressed user, who is scouring the internet for a simple answer to a simple question, there is a bewildering array of data sources. Some are highly visible, others are hidden; some are nice looking but superficial, while others are mines of information but hard to navigate. The same thing could be said of books, but people expect more of the internet, where they were promised a bright future of connectedness and interoperability.

Yet we are not going to get fewer databases any time soon, because current funding models and the missions of providers restrict us. Funders want to see clear results from their investment and providers want to create something new for their effort. It is currently hard to achieve those aims if they are tied to a single product. There are other problems, too. Single products don't handle differences of opinion well, and local issues related to culture and language are not well suited to a monolithic approach. Furthermore, data is best managed by the people with most interest in it. They have the most incentive to gather new data and they are the experts in their specialisms.

We all want discoverable, accurate, up-to-date and sustainably managed information on alien species, but for all the reasons I've mentioned we can't yet have fewer databases. Nevertheless, there are many things we can do to improve the situation, which include federation, openness and standardisation.


There is duplication of effort in our invasive species databases. Taxonomic names, common names, references, observation data and specimen data are repeatedly entered and curated by each database independently. This does not have to be the case. We can federate out some of our work to providers who specialise in those data. For example, most modern scientific publications have a Digital Object Identifier (DOI). This simple but unique identifier is the key to all the bibliographic information on a publication. DOIs are maintained by the publishing industry, who will do a better job of looking after this domain of data than we will. In our databases it is only necessary to store the DOI and derive other information from the DOI resolvers. ORCIDs are another example; they are an open, self-administered system for uniquely identifying a scientist. They are administered by the scientists themselves, who are best placed to do the job, and we potentially only need to store the ORCID in our database.
Federation has the potential to reduce costs, while at the same time improving standards, improving sustainability and helping us to concentrate on our core interest of invasion biology.
Nevertheless, if we federate some services we need to trust those services to provide the information we need, reliably, for the long term and at a price we can afford. DOIs and ORCIDs are supported by the publishing industry and by large academic institutions. Other infrastructures to which we could federate responsibilities might be the Global Names Architecture and GBIF. These infrastructures need communities such as ours to justify their existence, but similarly we might benefit considerably from their domain expertise and investment.
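As a rough illustration of storing only a DOI and deriving the rest, the sketch below builds a request to the doi.org resolver that asks for bibliographic metadata in CSL JSON via HTTP content negotiation. The record layout and the function names (`build_request`, `fetch_metadata`) are hypothetical, the DOI is the one for Groom (2013) cited elsewhere on this blog, and which fields come back depends on the registration agency, so treat it as a starting point rather than a finished client.

```python
import json
import urllib.request

DOI_RESOLVER = "https://doi.org/"
# Media type for citation metadata in CSL JSON, requested via content negotiation.
CSL_JSON = "application/vnd.citationstyles.csl+json"


def build_request(doi):
    """Build (but do not send) a metadata request for a DOI."""
    return urllib.request.Request(DOI_RESOLVER + doi, headers={"Accept": CSL_JSON})


def fetch_metadata(doi, timeout=10):
    """Resolve a DOI to a dictionary of bibliographic metadata (needs network)."""
    with urllib.request.urlopen(build_request(doi), timeout=timeout) as response:
        return json.load(response)


# A hypothetical database row: only the identifier is stored locally.
record = {"taxon": "Cochlearia danica", "source_doi": "10.7717/peerj.77"}

request = build_request(record["source_doi"])
print(request.full_url)
```

Calling `fetch_metadata(record["source_doi"])` on a machine with internet access should return the title, authors and journal as the registry currently holds them, so corrections made upstream arrive without any curation on our side. The cost is the dependency discussed above: if the resolver is unavailable, so is the citation.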


Working within a framework of standards for data quality can be frustrating, particularly in an emerging discipline where standards often seem to be unnecessarily constraining. There is a temptation for everyone to invent their own ‘standard’, yet the advantages of standardization are numerous. We should always be looking to other disciplines to reuse and build upon their standards. Standards that are extensible can provide a flexible approach.
The ability to combine digital resources is a fine goal of standardisation, but to do it we need to understand each other’s data. Wherever possible, we need to explain and annotate our data. Using common standards is a good first step, but we also need to ensure that the metadata is kept up-to-date and accurate. Many people will have noticed the problem of data aggregators where the meaning of data can subtly change as it is transferred from one database to another. The creation of domain ontologies can clarify the meaning of terms without necessarily constraining the development of new data sources. This is a comparatively new field within computer science, but one that should be explored for invasion biology.


Even if we don't want it, we get copyright automatically and are stuck with it for many years after our death, unless we ensure that each of our works is openly licensed. You can't deny the usefulness of open resources such as Wikipedia, yet it is perhaps most remarkable for its success in mobilizing data providers. Scientists are often afraid of openness, thinking that others will ‘steal’ their work and not give them sufficient credit. However, often the reverse is true. Open licensing promotes data discovery, and experts can use it to promote themselves through their expertise rather than through the data they hold. Work still needs to be done on providing traceable citations for data, but scientists already have mechanisms for doing this, such as so-called data publications. Scientists also need to become more educated about copyright, as data per se can’t be copyrighted.

To conclude, making data more accessible and discoverable is not an easy task, yet the tools and practices to do it are available to us. What is needed is a change in culture, not necessarily towards having monolithic databases, but towards sharing, openness and connectivity. It will take some investment of resources and progress might seem slow at first, but eventually we can build a global infrastructure for invasive species that satisfies our needs, and we don’t necessarily need to have fewer databases to do it.

This work by Quentin Groom is licensed under a Creative Commons Attribution 3.0 Unported License.

Wednesday 25 June 2014

A third reason for the success of invasive species

A recent paper by Colautti et al. (2014) stated that the success of invasive species can be explained by two views.
  1. intrinsic factors make some species inherently good invaders
  2. species become invasive as a result of extrinsic ecological and genetic influences such as release from natural enemies, hybridization or other novel ecological and evolutionary interactions
This is not an unusual view of invasion biology, but it is one that is based on a static view of the world. This philosophy views habitats as fixed entities and the invaders as being imposed upon them. There is a third reason for the success of invasive species: that new habitat has been created for them. Take for example the simple case of the oak gall wasps Andricus aries, A. corruptrix, A. grossulariae, A. lignicolus, A. lucidus and A. quercuscalicis, which have all invaded northern Europe in the past 60 years. They all have a complex alternating life cycle requiring two oak species, Quercus cerris and Q. robur (Ozaki et al., 2006). However, only Q. robur is native to northern Europe. Q. cerris was introduced to northern Europe in the 18th century and was first found in the wild in the mid-19th century. It has, since then, continually been planted and has naturalised in many places. Thus, mankind created a habitat for these Andricus species where both oaks lived side-by-side, and after a short lag the wasps came and occupied it. There is no reason to think these wasps are particularly good invaders, though some have spread more quickly than others. Nor have they managed to escape their parasitoids (Ellis, 2005). It is not known if some evolutionary process aided their spread, but why invoke such a mechanism when a more parsimonious one exists?
Another, much less obscure, example is roadside halophytes. In recent years in the UK, roadside halophytes were amongst the plants with the largest increase in range (Groom, 2013). Some plants, such as Danish scurvy-grass (Cochlearia danica), reflexed saltmarsh grass (Puccinellia distans) and lesser sea-spurrey (Spergularia marina), have increased the most quickly, but many different native saltmarsh plants can also be found by roadsides somewhere in Britain. There is no reason to think that halophytes are better invaders than other plants, nor is there any reason to believe that they have escaped natural enemies, as these are native plants. It is unlikely that all of them have undergone evolutionary changes or that there are “novel ecological and evolutionary interactions”. The main thing that has changed is the creation of large amounts of interconnected saline habitat at the side of roads.
Many other examples exist. C4 grasses are spreading in northern Europe due to changes in farming practices (Hoste & Verloove, 2001). Chasmophytes such as ivy-leaved toadflax (Cymbalaria muralis), buddleia (Buddleja davidii) and wall-rue (Asplenium ruta-muraria) are increasing in cities because of the large number of walls on which they can grow. In numerous cases of invasion the creation of new habitat precedes the invasion. Climate change, atmospheric nitrogen deposition, pollution, forestry, agricultural and cultural changes all alter habitats, potentially priming them for invasion. If a new habitat is being created one can be fairly sure that some organism will come along and occupy it. I do not discount the other two reasons for invasions, but if we ignore mankind's role in priming the environment for invasion we misunderstand the invasion process of many species.

Cited Literature

  1. Colautti, R.I., Parker, J.D., Cadotte, M.W., Pyšek, P., Brown, C.S., Sax, D.F. and Richardson, D.M. (2014) Quantifying the invasiveness of species. In: Capdevila-Argüelles L, Zilletti B (Eds) Proceedings of 7th NEOBIOTA conference, Pontevedra, Spain. NeoBiota 21: 7–27 doi:10.3897/neobiota.21.5310
  2. Ellis, H.A. (2005) Observations of the Agamic (Knopper Gall) of Andricus quercuscalicis and the associated inquilines and parasitoids in Northumberland. Cecidology 20: 12–27
  3. Groom, Q.J. (2013) Some poleward movement of British native vascular plants is occurring, but the fingerprint of climate change is not evident. PeerJ 1:e77 http://dx.doi.org/10.7717/peerj.77
  4. Hoste, I. & Verloove, F. (2001) De opgang van C4-grassen (Poaceae, Paniceae) in de snel evoluerende onkruidvegetaties in maïsakkers tussen Brugge en Gent (Vlaanderen, België). Dumortiera 78: 2–11
  5. Ozaki, K., Yukawa, J., Ohgushi, T. and Price, P.W. (2006) Galling Arthropods and Their Associates. Springer-Verlag, Tokyo
Knopper gall on oak (Quercus robur) induced by Andricus quercuscalicis (gall wasp) by Bj.schoenmakers

This work by Quentin Groom is licensed under a Creative Commons Attribution 3.0 Unported License.

Sunday 22 June 2014

A request for material of Oxalis corniculata

Oxalis corniculata has become an almost ubiquitous weed of plant pots and borders. Its explosive capsules and sticky seeds let it jump, like a vegetable flea, from pot to pot. This phenomenon is not unique to Britain and Ireland. All across Europe O. corniculata can be found in similar situations.
Linnaeus first described the species from Europe, but it is not clear if it is native here. Close relatives exist in North America, Asia and Australasia. It has an extremely plastic phenotype depending on habitat. Characters such as hairiness, leaf size and habit all overlap between species in this group, even though these species do not hybridise readily. It is for this reason my colleagues and I at the Botanic Garden Meise (Belgium) are trying a molecular genetic approach to understanding the O. corniculata group. We are hoping to be able to unravel the phylogeny of these taxa and more precisely define the taxon boundaries. Perhaps we will even get indications of its geographic origins.
We are looking for specimens (fresh or rapidly dried) of plants in the O. corniculata group from as many places as possible. In addition to O. corniculata the corniculata group includes O. corniculata var. atropurpurea, O. dillenii, O. exilis and O. stricta. It doesn't matter if you can’t identify it with certainty, but it would help us match molecular and physical traits if you are able to provide a specimen with fruits and flowers. Nevertheless, even non-fruiting material will help.

Contact me at quentin.groom@br.fgov.be

This work by Quentin Groom is licensed under a Creative Commons Attribution 3.0 Unported License.