Georeferencing
John Acocks’ herbarium specimens
HESTER M. STEYN1 & IAN A. ENGELBRECHT2
1 National
Herbarium, South African National Biodiversity Institute, Private Bag X101,
Pretoria 0001, South Africa.
h.steyn@sanbi.org.za,
https://orcid.org/0000-0001-9243-5965
2 Department of Zoology and Entomology, University of Pretoria, Lynnwood Road, Pretoria, South Africa.
JPH Acocks (portrait)
Abstract
John Phillip Harison Acocks (7 April 1911–20 May 1979) made significant contributions to the field of South African botany. He was one of the most prolific South African collectors, collecting over 25 000 herbarium specimens from southern Africa between 1936 and 1977. The digitisation of herbarium specimens has revolutionized botanical research and conservation efforts, as information previously ‘locked up’ in specimen labels has become accessible without having to see the physical specimens. Since the 1970s staff members of the National Herbarium in Pretoria have been capturing information from herbarium specimen labels into various electronic databases. Most of Acocks’ specimen labels have been digitized over the years, but despite his precise locality notes on most labels, no coordinates were available to facilitate mapping, modelling and analysis of the occurrence data.
Recently, a project was launched to georeference the >20 600 Acocks specimens in the botanical database of the South African National Biodiversity Institute (SANBI). This was done using various digitized maps in QGIS, a database of most of Acocks’ collecting localities and notes, and a custom-built software tool for managing and reusing georeference data. As a result, the coordinates of around 20 000 specimen records were updated in the database. The updated records enable more accurate mapping of species’ distributions, enhance spatial and temporal analysis of the valuable Acocks collection, and support long-term monitoring of biodiversity by allowing researchers to revisit collection sites and compare historical data with current observations. The project also added thousands of new georeferenced localities to the gazetteer within the georeferencing software for future use.
Keywords: BODATSA, digitization, geographical
coordinates, georeferencing protocol, georeferencing tool, mapping, NSCF,
point-radius method, SANBI.
Introduction
John Phillip Harison
Acocks (Glen & Germishuizen 2010) not only collected more than 25 000 herbarium
specimens (Killick 1980; Glen & Germishuizen 2010) in South Africa, Namibia
and Eswatini (then Swaziland), but he also meticulously documented the plant
species, their distributions and habitat preferences. Acocks collected species
presence and abundance data at ca. 3 000 sites (Rutherford et al. 2003). These
sites were of varying size and shape, and time spent at the sites varied from
around 20 minutes to three or four hours, depending on the diversity of the
vegetation (Acocks 1953; Rutherford et al. 2003). Acocks’ field notes served as
the basis for a database of plant species distributions and abundance (ACKDAT, O’Callaghan
et al. 1994; Rutherford et al. 2003). These comprehensive botanical surveys
helped with the classification and documentation of the region’s flora, land
management, conservation efforts, identifying rare and endemic species,
understanding plant diversity patterns, and assessing the impacts of
environmental changes on vegetation (Rutherford et al. 2003). Acocks is also
renowned for his milestone publication – Veld
Types of South Africa (Acocks 1953, reprinted in 1975). This publication,
which categorizes vegetation based on ecological characteristics, provided
valuable insights into the diverse plant communities across South Africa and
how and why the vegetation changed. Acocks’ specimens are mainly housed in the
National Herbarium (PRE), Selmar Schonland Herbarium (GRA), Alexander McGregor
Museum Herbarium (KMG) and Natal Herbarium (NH) (Glen & Germishuizen 2010).
John Acocks is honored through the naming of several plant species, such as Selago acocksii Hilliard, Carex acocksii C. Archer, Lotononis acocksii B.-E. van Wyk, Moraea acocksii Goldblatt & J.C.
Manning, and Trachyandra acocksii
Oberm. (SANBI 2024). His legacy also extends to the animal kingdom, with a
genus of grasshoppers, Acocksacris,
named after him, comprising five species (Glen & Germishuizen 2010).
Herbarium specimens provide verifiable and citable evidence of the occurrence of plants at a specific point in time and space (Lughadha et al. 2019). An essential piece of data captured by a herbarium specimen is the locality where the plant was found. The locality data of millions of specimens collected before the use of Global Positioning System (GPS) devices remain trapped in the description of the locality on the specimen label, unavailable for mapping or modelling (Wieczorek et al. 2004; Bloom et al. 2017; Unknown 2020). Because specimens have been collected over many years or even centuries, their geographical data are often ambiguous or imprecise (Jonathan 2016; Lughadha et al. 2019). This is why georeferencing, the determination of geographical coordinates from textual descriptions of localities so that they can be plotted on a map, is an essential step in digitization (Nelson et al. 2012; Jonathan 2016). The process of georeferencing requires consideration of geography, topography, collection practices and historical context to determine the location described on the specimen label as precisely as possible. The resulting data have a wealth of possible uses: they empower novel research with occurrence data and help fill the gaps in our understanding of plant life (Jonathan 2016; Wolf et al. 2016; Bloom et al. 2017; Unknown 2020).
Although the coordinates of Acocks’ approximately 3 000 sampling sites had been available in ACKDAT since 1994 (Rutherford et al. 2003), his specimens were primarily assigned to quarter-degree squares (QDS) (Edwards & Leistner 1971), with the coordinates taken as the centroid of the QDS. This method resulted in mapping inaccuracies, with an uncertainty radius of ±12 km.
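To make the QDS convention concrete, the following minimal Python sketch decodes a QDS code into the coordinates of its centroid. It assumes the layout of the degree reference system as commonly applied in southern Africa: the four digits give the degrees south and east of the north-west corner of the degree square, and each of the two letters selects a quadrant (A = north-west, B = north-east, C = south-west, D = south-east). The function name `qds_centroid` is hypothetical and is not taken from the authors' scripts.

```python
def qds_centroid(code):
    """Return (latitude, longitude) in decimal degrees for the centre of a QDS.

    Assumes a six-character code such as '2823AB' for a square south of the
    equator and east of Greenwich, so the returned latitude is negative.
    """
    code = code.strip().upper()
    if len(code) != 6 or not code[:4].isdigit() or any(c not in "ABCD" for c in code[4:]):
        raise ValueError(f"not a valid QDS code: {code!r}")

    south = float(code[:2])   # degrees south of the north-west corner
    east = float(code[2:4])   # degrees east of the north-west corner
    size = 0.5                # side of a half-degree square, in degrees

    for letter in code[4:]:
        # row 0 is the northern half, col 0 is the western half
        row, col = divmod("ABCD".index(letter), 2)
        south += row * size
        east += col * size
        size /= 2

    # After two letters, `size` is 0.125 degrees: half the side of a QDS.
    return (-(south + size), east + size)


print(qds_centroid("2823AB"))  # (-28.125, 23.375)
```

Every specimen assigned to 2823AB would be plotted at this single point, whatever its true position within the quarter-degree square, which is why a centroid-based coordinate carries a large uncertainty.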
Materials and methods
In 2021 a new software
tool for georeferencing museum and herbarium specimen records was launched by
the Natural Science Collections Facility (NSCF, https://georef.nscf.org.za, Engelbrecht
2021). This tool simplifies the georeferencing process by grouping similar
localities together and allowing for reuse of georeference data across
different datasets. It also facilitates groups of people working together to
georeference a particular dataset, and conforms to Darwin Core standards for
georeference metadata (Darwin Core Quick Reference Guide, https://dwc.tdwg.org/terms/).
Acocks’ specimen records in the Botanical Database
for Southern Africa (BODATSA) were georeferenced using this tool.
Locality coordinates were primarily sourced from several spatial data layers using the free and open-source QGIS (https://qgis.org/), including 1:50 000 and 1:250 000 topo-cadastral maps (maps with topographical detail and additional names, numbers and boundaries of original farms, the boundaries of magisterial districts, and provincial and international boundaries), as well as the 1:500 000 Irrigation Series maps used by Acocks. Acocks annotated his maps with red dots, crosses or pencil marks indicating his sample sites (Figure 1), which greatly facilitated locating their coordinates. A shapefile of Acocks’ georeferenced sampling sites (based on ACKDAT) was also used to pinpoint his collecting localities. The workflow involved locating the coordinates for a particular locality in QGIS and then estimating an uncertainty radius around those coordinates that would cover all possible areas where the specimen may have been collected. This information, together with other metadata, was then captured in the georeferencing tool.
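The point-radius workflow described above can be illustrated with a short Python sketch. It is not the georeferencing tool's code: the helper names, the example coordinates, the assumed WGS84 datum and the rounding to 100 m are all hypothetical. Only the field names of the record are real, as they follow Darwin Core terms. The idea is to pick a point, list the extreme positions where the specimen could plausibly have been collected, and take the largest distance from the point as the uncertainty radius.

```python
import math
from dataclasses import dataclass, asdict

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; adequate for uncertainty estimates


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two decimal-degree points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def covering_radius_m(centre, candidates):
    """Smallest radius around `centre` that contains every candidate point."""
    return max(haversine_m(centre[0], centre[1], lat, lon) for lat, lon in candidates)


@dataclass
class Georeference:
    # Field names follow Darwin Core terms; default values are assumptions.
    decimalLatitude: float
    decimalLongitude: float
    coordinateUncertaintyInMeters: int
    geodeticDatum: str = "WGS84"
    georeferenceSources: str = ""
    georeferencedBy: str = ""
    georeferenceRemarks: str = ""


# Hypothetical locality: a chosen point and the extremes of where the
# specimen could have been collected according to the label text.
centre = (-28.20, 23.50)
candidates = [(-28.19, 23.50), (-28.21, 23.51), (-28.20, 23.48)]

# Round the radius up to the next 100 m so that it never understates uncertainty.
radius = math.ceil(covering_radius_m(centre, candidates) / 100) * 100

record = Georeference(
    decimalLatitude=centre[0],
    decimalLongitude=centre[1],
    coordinateUncertaintyInMeters=radius,
    georeferenceSources="1:50 000 topo-cadastral map",
)
print(asdict(record))
```

Storing the radius alongside the coordinates lets later users filter records by the precision their analysis needs.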
Figure 1. An example of a map annotated by Acocks with red dots, crosses and pencil marks to indicate his sample sites.
Figure 2. Screenshot of the georeferencing tool showing the locality group being georeferenced (left), candidate georeferences (middle) and the detail of a georeference (right).
After the records were georeferenced, the data set was extracted from the georeferencing tool and cleaned in OpenRefine (version 3.7.4, http://openrefine.org), a free, open-source tool for cleaning and transforming data. Python scripts were used to calculate the QDS (Edwards & Leistner 1971) from the new coordinates, to update the province names and to select a class of coordinate uncertainty from the lookup list used in BODATSA (SANBI 2024). The records in BODATSA were subsequently updated with the georeferenced data by linking the two datasets with a unique identifier for each record.
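The post-processing steps above might look roughly like the Python sketch below. The authors' scripts are not reproduced here, so everything in it is an assumption made for illustration: the function names, the record layout, and in particular the uncertainty classes, which are hypothetical stand-ins for the BODATSA lookup list. Updating province names is left out because it requires province boundary polygons.

```python
import bisect


def qds_from_coords(lat, lon):
    """Return the QDS code (e.g. '2823AB') containing a decimal-degree point.

    Assumes a point south of the equator and east of Greenwich, with quadrant
    letters A = north-west, B = north-east, C = south-west, D = south-east.
    """
    if not (-90 < lat < 0 and 0 <= lon < 100):
        raise ValueError("expected southern latitude and eastern longitude")
    south = -lat
    deg_s, deg_e = int(south), int(lon)
    frac_s, frac_e = south - deg_s, lon - deg_e
    letters = ""
    size = 0.5
    for _ in range(2):
        row = int(frac_s >= size)   # 1 = southern half
        col = int(frac_e >= size)   # 1 = eastern half
        letters += "ABCD"[2 * row + col]
        frac_s -= row * size
        frac_e -= col * size
        size /= 2
    return f"{deg_s:02d}{deg_e:02d}{letters}"


# Hypothetical uncertainty classes: upper bounds in metres and their labels.
CLASS_BOUNDS_M = [100, 250, 1000, 2000, 5000, 10000]
CLASS_LABELS = ["<=100 m", "<=250 m", "<=1 km", "<=2 km", "<=5 km", "<=10 km", ">10 km"]


def uncertainty_class(radius_m):
    """Pick the smallest class whose upper bound covers the radius."""
    return CLASS_LABELS[bisect.bisect_left(CLASS_BOUNDS_M, radius_m)]


def apply_georeferences(database, georeferences):
    """Update database records in place, matching on the unique identifier.

    Returns the identifiers that had no match so they can be checked by hand.
    """
    unmatched = []
    for geo in georeferences:
        record = database.get(geo["id"])
        if record is None:
            unmatched.append(geo["id"])
            continue
        record.update(
            lat=geo["lat"],
            lon=geo["lon"],
            qds=qds_from_coords(geo["lat"], geo["lon"]),
            uncertainty=uncertainty_class(geo["radius_m"]),
        )
    return unmatched


# Hypothetical records keyed by unique identifier.
database = {"rec-1": {"locality": "example locality"}}
georeferences = [
    {"id": "rec-1", "lat": -28.1, "lon": 23.3, "radius_m": 1500},
    {"id": "rec-9", "lat": -25.7, "lon": 28.2, "radius_m": 300},
]
print(apply_georeferences(database, georeferences))  # ['rec-9']
print(database["rec-1"]["qds"])                      # 2823AB
```

Returning the unmatched identifiers, rather than silently skipping them, makes it easier to spot records that were renamed or deleted between the export and the update.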
The georeferenced records are currently being mapped by collection date, using a Python script, so that outliers can be identified manually. This check was deferred until after the main database had been updated because of the time it requires and so that the georeference data could be made available as soon as possible.
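The date-based check is done visually on maps, but an automated pre-filter along the same lines could shortlist the dates worth inspecting. The sketch below is an assumption about how this might be done, not the authors' script. It groups records by collection date and flags any record lying further than a chosen distance from the median position for that date. The record layout, the identifiers and the 150 km threshold are all hypothetical.

```python
import math
from collections import defaultdict
from statistics import median


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two decimal-degree points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def flag_date_outliers(records, max_km=150.0):
    """Return ids of records far from the median position of their collection date."""
    by_date = defaultdict(list)
    for rec in records:
        by_date[rec["date"]].append(rec)

    flagged = []
    for group in by_date.values():
        if len(group) < 3:
            continue  # too few points for the median to be informative
        mid_lat = median(r["lat"] for r in group)
        mid_lon = median(r["lon"] for r in group)
        for r in group:
            if haversine_km(r["lat"], r["lon"], mid_lat, mid_lon) > max_km:
                flagged.append(r["id"])
    return flagged


# Hypothetical records: two near Kimberley and one far to the west on the same date.
records = [
    {"id": "a", "date": "1936-03-13", "lat": -28.74, "lon": 24.76},
    {"id": "b", "date": "1936-03-13", "lat": -28.70, "lon": 24.80},
    {"id": "c", "date": "1936-03-13", "lat": -30.00, "lon": 18.00},
]
print(flag_date_outliers(records))  # ['c']
```

A flagged record is only a candidate for review: the coordinates may be correct and the captured date wrong, so each case still needs to be checked against the label and the collecting number.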
Results and Discussion
Over 20 000 (97%) of Acocks’ specimen records in BODATSA were successfully georeferenced, and most of the records were assigned an uncertainty radius of 2 km or less (also see Rutherford et al. 2003). The primary sources of uncertainty were the precision of the maps he used compared with more modern maps, and not knowing whether distances reported in the locality descriptions were measured from the centre or the edges of towns.
As a result of the georeferencing process, these specimen records can be plotted and the data spatially analysed. Acocks collected plant specimens from all over South Africa, with a few specimens from neighbouring countries (Figure 3). He did not collect any specimens from Lesotho and only a few specimens were collected in Botswana, Eswatini, Mozambique, southern Namibia, and Zimbabwe (Table 1). Specimens were collected from various biomes and vegetation types, but large areas of the Kalahari and Limpopo Province were not sampled (Figure 3). The South African provinces with the most specimens are the Western Cape and Northern Cape, with 5 334 and 4 823 specimens respectively, and those with the fewest are Gauteng and North West, with 267 and 310 respectively.
Acocks did not collect herbarium specimens from all his sampling sites and if he collected specimens, they were only from certain taxa, as can be seen in the example from the Ghaap Plateau below. Furthermore, specimens were often collected while travelling between two sampling sites (L.W. Powrie, pers. comm., 2021). Therefore, dots representing the sampling sites and those representing the collection sites are not always exact matches (Figure 3).
Figure 3. Distribution of Acocks’ sampling sites and of his plant specimens housed in SANBI herbaria, each plotted with a different dot symbol.
Acocks (1988) noted 302 plant species in a single survey in the Asbestos Hills between Daniëlskuil and Kuruman on the Ghaap Plateau (Northern Cape) – the highest number of species recorded at any of his sampling sites (Zietsman & Zietsman 2021). This is not reflected in his specimen collection, as the highest number of species per QDS collected from the Ghaap Plateau was 39 from 2823AB (west of Daniëlskuil, Northern Cape). Over 200 species per QDS were collected in 3326DA (248 species) (Kenton-on-Sea, Eastern Cape), 2929BB (464 species) (Estcourt, KwaZulu-Natal), 3227CB (470 species) (Stutterheim, Eastern Cape) and 3119CA (501 species) (Lokenburg south of Nieuwoudtville, Northern Cape) (Figure 4).
Mapping the records by
collection date to detect outliers produced 2 450 maps (Figure 5). Outliers were
nearly always the result of incorrect dates captured in the database. From the
10th to the 16th of March 1936 Acocks was collecting in
the Kimberley area. The outlier (westernmost dot) in Figure 5D must therefore
be incorrect. The record was checked and the coordinates are correct, but based
on the locality information and Acocks’ collecting number, this specimen was
collected on 13 March 1937 and not in 1936 as originally captured in the
database. These errors are being corrected in the main database as they are
located.
Figure 4. Number of species collected by Acocks per quarter degree square (QDS), based on specimens in SANBI herbaria.
Figure 5. Maps showing specimen collections per date: all Acocks specimens and the specimen(s) collected on a specific date, each plotted with a different dot symbol.
Conclusions
The NSCF georeferencing tool is a valuable and easy-to-use online tool because it groups similar localities and allows reliable existing georeferences to be reused. Now that Acocks’ specimen records have been georeferenced, they are assigned to the correct current provinces and can be mapped with relatively low uncertainty. As Acocks’ work continues to be referenced and utilized by botanists, ecologists, land managers and policymakers, these georeferenced records are not only of better quality but can also be used for long-term monitoring of biodiversity and for preparing Species Distribution Models (SDMs) that predict past, present and future suitable habitats for species (Jonathan 2016; Wolf et al. 2016; Bloom et al. 2017; Unknown 2020). The project has also added thousands of new georeferenced localities to the gazetteer within the georeferencing tool for future use.
Acknowledgements
The authors would like to
acknowledge Mike O’Callaghan and Barry Jagger who did most of the
georeferencing of the Acocks sampling sites for the ACKDAT database, and Les
Powrie for his monumental work in making the ‘Acocks maps’ electronically
available, leading the digitization of the Acocks field notes and for all his
information regarding this extraordinary botanist.
References
Acocks, J.P.H., 1953, ‘Veld types of South Africa’, Memoirs of the Botanical Survey of South Africa, 28, 1–192. Department of Agriculture and Water Supply, Pretoria.
Acocks, J.P.H., 1975, ‘Veld types of South Africa, 2nd edition’, Memoirs of the Botanical Survey of South Africa, 40, 1–128. Department of Agriculture and Water Supply, Pretoria.
Acocks, J.P.H., 1988, ‘Veld types of South Africa, 3rd edition’, Memoirs of the Botanical Survey of South Africa, 57, 1–146. Department of Agriculture and Water Supply, Pretoria.
Bloom, T.D.S., Flower, A., & DeChaine, E.G., 2017, ‘Why georeferencing matters: Introducing a practical protocol to prepare species occurrence records for spatial analysis’, Ecology and Evolution, 8(1), 765–777. https://onlinelibrary.wiley.com/doi/full/10.1002/ece3.3516
Edwards, D. & Leistner, O.A., 1971, ‘A degree reference system for citing biological records in southern Africa’, Mitteilungen der Botanischen Staatssammlung München, 10, 501–509. https://www.biodiversitylibrary.org/page/15185301#page/509/mode/1up
Engelbrecht, I., 2021, ‘Fun and easy georeferencing with a new online tool from South Africa’, Biodiversity Information Science and Standards, 5, e73572. https://doi.org/10.3897/biss.5.73572.
Jonathan, T., 2016, ‘Why georeferencing is the most important thing for the museum since sliced bread’, Digital Collections Programme, [Blog] https://naturalhistorymuseum.blog/2016/01/25/why-georeferencing-is-the-most-important-thing-for-the-museum-since-sliced-bread-digital-collections/.
Killick, D.J.B., 1980, ‘Obituaries: John Phillip Harison Acocks (1911–1979)’, Bothalia, 13, 239–244.
Lughadha, E.N., Walker, B.E., Canteiro, C., Chadburn, H., Davis, A.P., Hargreaves, S., Lucas, E.J., Schuiteman, A., Williams, E., Bachman, S.P., Baines, D., Barker, A., Budden, A.P., Carretero, J., Clarkson, J.J., Roberts, A. & Rivers, M.C., 2019, ‘The use and misuse of herbarium specimens in evaluating plant extinction risks’, Philosophical Transactions of the Royal Society B, 374, 20170402. https://doi.org/10.1098/rstb.2017.0402.
Nelson, G., Paul, D., Riccardi, G. & Mast, A.R., 2012, ‘Five task clusters that enable efficient and effective digitization of biological collections’, ZooKeys, 209, 19–45. https://doi.org/10.3897/zookeys.209.3135
O’Callaghan, M., Powrie, L.W., Hurford, J.L. & Rutherford, M.C., 1994, ‘ACKDAT: the national plant ecological database’, Unpublished paper presented at the Fynbos Forum, 13–15 July 1994, Bien Donne, Stellenbosch, South Africa.
Rutherford, M.C., Powrie, L.W. & Midgley, G.F., 2003, ‘ACKDAT: a digital spatial database of distributions of South African plant species and species assemblages’, South African Journal of Botany, 69, 99–104.
SANBI, 2024, ‘Botanical Database of Southern Africa (BODATSA)’, Electronic version available from https://posa.sanbi.org/ [accessed continuously]
Unknown, 2020, ‘Mapping the world of plants’, [Blog] https://www.capturingcaliforniasflowers.org/blog/mapping-the-world-of-plants.
Wieczorek, J., Guo, Q. & Hijmans, R., 2004, ‘The point-radius method for georeferencing locality descriptions and calculating associated uncertainty’, International Journal of Geographical Information Science, 18(8), 745–767. https://doi.org/10.1080/13658810412331280211.
Wolf, A., Zimmerman, N.B., Anderegg, W.R.I., Busby, P.E. & Christensen, J., 2016, ‘Altitudinal shifts of the native and introduced flora of California in the context of 20th-century warming’, Global Ecology and Biogeography, 25, 418–429. https://onlinelibrary.wiley.com/doi/full/10.1111/geb.12423.
Zietsman, P.J. & Zietsman, L., 2021, ‘Floristic diversity
at Kolomela mine on the Ghaap Plateau, Postmasburg, Northern Cape Province’, Indago March 2021.
https://nationalmuseumpublications.co.za/floristic-diversity-at-kolomela-mine-on-the-ghaap-plateau-postmasburg-northern-cape-province/.
About the authors:
Hester Steyn is based at the National Herbarium in Pretoria where she curates the Acanthaceae, Campanulaceae, Lobeliaceae and Rubiaceae. Her interests include managing collection databases and doing fieldwork in the dry northwestern areas of South Africa.
Ian Engelbrecht is a specialist in collection digitization and works with several institutions in South Africa through the Natural Science Collections Facility. His taxonomic interests include mygalomorph spiders and scorpions.