Historical Maps: Machine learning helps us over the map vectorisation crux

Geoff Groom, Gregor Levin, Stig Roar Svenningsen, Mads Linnet Perner

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review


Modern geography is massively digital with respect to both map data production and map data analysis. When
we consider historical maps, as a key resource for historical geography studies, the situation is different. There
are many historical maps available as hardcopy, some of which are scanned to raster data. However, relatively
few historical maps are truly digital, i.e. available as machine-readable geo-data layers. The Danish “Høje
Målebordsblade” (HMB) map set is a case in point: it comprises approximately 1100 geometrically correct,
topographic sheets at 1:20,000, surveyed between 1842 and 1899, with national coverage (i.e. Denmark 1864-1920). Having the
HMB maps as vector geo-data has a high priority for Danish historical landscape, environmental and cultural
studies. We present progress made, during 2019, in forming vector geo-data of key land categories (water bodies,
wetland, forest, heath, sand dune) from scanned HMB printed map sheets. The focus here is on the role of
machine learning in that work, specifically the use of convolutional neural networks (CNN), a deep learning tool, to
map occurrences of specific map symbols associated with the target land categories. We demonstrate
how machine learning is applied in conjunction with pixel-based and object-based analyses, rather than in isolation.
In this way, the strengths of machine learning are utilised, while its weaknesses are
acknowledged and worked around. Symbols detected by machine learning serve as guidance for appropriate values
to apply in pixel based image data thresholding. The resulting map products for two study areas (450 and 300
) have overall false-positive and false-negative levels of around 10% for all target categories. The ability to
utilise the cartographic symbols of the HMB maps enabled production of higher quality vector geo-data of the
target land categories than would otherwise have been possible. The methods were developed and applied in
commercial software (Trimble eCognition©), a choice that reflects the value of a tried-and-tested,
easy-to-use graphical user interface and a fast, versatile processing architecture for developing new,
complex digital solutions. In principle, the components of the resulting workflow could also be implemented in
various free and open source software environments.
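
The idea that detected symbols guide the thresholding values can be illustrated with a minimal sketch. It is not the authors' eCognition workflow: the function names, the sampling window, the quantile rule and the definition of the false-positive and false-negative levels are all assumptions made for illustration. Grey values are sampled around symbol locations (assumed to come from a CNN detector), a threshold is derived from those samples, and the resulting mask is compared with a reference mask.

```python
# Illustrative sketch only; all names and parameters here are hypothetical.

def threshold_from_symbols(gray, symbol_rc, half_window=1, quantile=0.9):
    """Derive a grey-value threshold from pixels around detected symbols.

    gray      : 2D list of grey values (0 = dark ink, 255 = white paper)
    symbol_rc : list of (row, col) symbol centres, assumed to be CNN output
    """
    height, width = len(gray), len(gray[0])
    samples = []
    for r, c in symbol_rc:
        # Collect the grey values in a small window, clipped to the image.
        for rr in range(max(r - half_window, 0), min(r + half_window + 1, height)):
            for cc in range(max(c - half_window, 0), min(c + half_window + 1, width)):
                samples.append(gray[rr][cc])
    if not samples:
        raise ValueError("no symbol detections to sample from")
    samples.sort()
    # Assumed rule: take an upper quantile of the sampled grey values.
    index = min(len(samples) - 1, int(quantile * len(samples)))
    return samples[index]


def apply_threshold(gray, threshold):
    """Mark every pixel at least as dark as the threshold."""
    return [[value <= threshold for value in row] for row in gray]


def error_levels(predicted, reference):
    """Return (false-positive level, false-negative level).

    Assumed definitions: false positives as a share of predicted pixels,
    false negatives as a share of reference pixels.
    """
    fp = fn = n_pred = n_ref = 0
    for pred_row, ref_row in zip(predicted, reference):
        for p, r in zip(pred_row, ref_row):
            if p:
                n_pred += 1
                if not r:
                    fp += 1
            if r:
                n_ref += 1
                if not p:
                    fn += 1
    return fp / max(n_pred, 1), fn / max(n_ref, 1)


# Synthetic 8 x 8 "scan": light paper (200) with a dark 3 x 3 patch (50).
gray = [[200] * 8 for _ in range(8)]
for row in range(2, 5):
    for col in range(2, 5):
        gray[row][col] = 50

threshold = threshold_from_symbols(gray, [(3, 3)])
mask = apply_threshold(gray, threshold)
```

In a real workflow the sampled values would come from scanned map rasters and the reference mask from manually digitised data. The quantile is a tuning choice and is not a value reported in the paper.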

Original language: English
Title of host publication: Proceedings of the ICA Workshop on Automatic Vectorisation of Historical Maps
Number of pages: 10
Place of publication: Budapest
Publication date: 2020
Publication status: Published - 2020
Externally published: Yes