Features

Geological mapping in the age of artificial intelligence

Two centuries on from the publication of Britain’s first geological maps, developments in artificial intelligence are presenting new opportunities in mapping practice. Charlie Kirkwood discusses what’s new and why, as geologists, we should be interested

Words by Charlie Kirkwood
1 September 2022

A geological map of potassium (red), iron (green) and calcium (blue) distributions in the UK generated using an artificial intelligence system. (Image credit and copyright of Charlie Kirkwood)

In 1815, during a time of industrial revolution, William Smith published the first geological map of Britain. This was followed in 1820 by the map by George Bellas Greenough, President of the Geological Society of London. These two pioneers disagreed on the best mapping approach: Smith used fossils to recognise and map different strata, while Greenough is said to have favoured ‘mineralogical views’. Such is the nature of science.

Despite their considerable age, the maps of both Smith (Fig. 1) and Greenough still look entirely familiar to us – bodies of similar rock are classified as distinct units and mapped as coloured polygons (or as three-dimensional volumes using cross sections). Smith’s map emphasises biostratigraphy, while Greenough’s map highlights mineralogy – which map to prefer depends on the particular purposes of the user.

Over two centuries later, the traditional hand-drawn mapping approach remains widely used. However, with this approach it is difficult to quantify and communicate uncertainty, and the classification process, by which geology is represented using discrete units, means that some information is inevitably lost. Today, we can address these limitations by developing geological artificial intelligence (AI), which has the potential to revolutionise geological mapping.

Differing approaches

Hypothetically, the ‘ultimate’ geological map would be able to correctly tell us the exact values of any geological properties at any location, thereby serving perfect information to every end user. We would know the exact position and grade of all Earth’s resources and could optimise how we use them. Of course, we can never observe all the geological properties we are interested in at every point, so we must use some form of modelling – mental or computational – to fill in the gaps between our observations. There are multiple ways to interpret what the geology might be doing in between our observations, so there will always be uncertainty about the true form of the geology that we are attempting to map. It is important to quantify and communicate this uncertainty so that end users can factor it into their decision making.

Traditionally, geological maps have been produced using a ‘classification first’ approach (Fig. 2, right-hand side). With this approach, bodies of similar rock are manually delineated and classified into discrete units that are described as possessing their own distinct geological properties. However, many geological properties, such as age, composition and texture (on which traditional classifications are often based), are fundamentally continuous variables, and attempting to model them as discrete classes introduces quantisation error (Fig. 3, top panel). This means that less information can be recovered from the map than goes into making it, with fidelity being limited by the number of unique classes used. At the same time, with no clear way to convey the uncertainty about the position of class boundaries, these traditional maps are inherently overconfident in their linework. The traditional mapping approach can also introduce spatial inconsistencies, with mismatches at map-tile boundaries due to disagreements between individual geologists. In addition, the production process itself – decades of sparsely documented mental modelling that can easily be lost to history – is far from transparent. Overall, these limitations make it difficult to directly compare traditional ‘classification first’ geological maps to reality, which in turn makes it difficult to know what constitutes their improvement.
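The quantisation error described above can be illustrated with a small numerical sketch. The property profile, the equal-width class scheme and the function names are all illustrative assumptions rather than anything taken from a real map: a continuous property is binned into classes, every location is then described by the mean value of its class, and the error of that description is measured as the number of classes grows.

```python
import numpy as np

def classify_then_reconstruct(values, n_classes):
    """Bin a continuous property into equal-width classes and describe
    every location by the mean value of its class."""
    edges = np.linspace(values.min(), values.max(), n_classes + 1)
    # Use interior edges only, so labels run from 0 to n_classes - 1.
    labels = np.digitize(values, edges[1:-1])
    sums = np.bincount(labels, weights=values, minlength=n_classes)
    counts = np.bincount(labels, minlength=n_classes)
    class_means = sums / np.maximum(counts, 1)
    return class_means[labels]

def quantisation_rmse(values, n_classes):
    """Root-mean-square difference between the property and its class means."""
    reconstructed = classify_then_reconstruct(values, n_classes)
    return float(np.sqrt(np.mean((values - reconstructed) ** 2)))

# Hypothetical continuous property along a 1D transect.
x = np.linspace(0.0, 10.0, 1001)
true_property = np.sin(x) + 0.3 * x

errors = {n: quantisation_rmse(true_property, n) for n in (2, 4, 8, 16)}
for n, err in errors.items():
    print(f"{n:2d} classes: RMSE = {err:.3f}")
```

In this toy case the error shrinks as classes are added but never reaches zero, which is the sense in which fidelity is limited by the number of unique classes.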

By adopting a ‘properties first’ approach, on the other hand, we can aim to model geological properties directly as continuous variables (Fig. 3, bottom panel). This change in mapping philosophy brings the crucial benefit of enabling direct comparison between our maps and reality. That is, predictions of measurable geological properties (the output of the ‘properties first’ map, for example chemical composition as in figure 2) can simply be compared to observations of those properties across the map. This makes ‘properties first’ maps much more straightforward to evaluate than ‘classification first’ maps, and therefore provides a clearer path towards their improvement. If desired, we can still derive classified maps from maps of continuous geological properties simply by applying appropriate classification schemes afterwards – so in principle no capability is lost by adopting the ‘properties first’ approach.

Figure 1: William Smith’s geological map. First imprint, issued c. September 1815. (Credit: The Geological Society of London, UK. Archive ref: LDGSL/22).

The challenge of the ‘properties first’ approach is that it cannot reasonably be implemented by hand, but this is where AI comes in (see boxes 1 & 2). By developing suitable AI methods, we can aim to map any (or all) geological properties of interest as continuous-valued predictions through continuous space, free from the limits on fidelity that traditional ‘classification first’ mapping approaches impose (see, e.g., figure 2). At the same time, uncertainties can be quantified via Bayesian inference: a statistical method that allows us to model a distribution of plausible explanations for the generation of observed data, rather than over-confidently committing to a single explanation. In mapping terms, this means we can model a distribution of possible maps, rather than just one. Doing so allows uncertainty to be conveyed as predictive probability distributions of geological properties for any point in space (Fig. 3, bottom panel), which allows map users to account for the different geological conditions they may encounter. Having maps produced by a unified AI system, rather than by separate geologists, also means that the output is spatially consistent, with no map-tile mismatches, and the production process is transparent – the data are kept separate from the model, and both are well documented and shareable as digital objects.

While the ‘ultimate’ perfectly informative (zero error and uncertainty) geological map will never actually be attainable, by developing AI for geological mapping we can specifically aim to minimise the discrepancy between our maps and reality while remaining honest about our uncertainty. By doing so, we can maximise our knowledge of the state of the lithosphere given our observations, in an approach that is more analogous to ensemble weather forecasting than to traditional hand-drawn geological maps.

BOX 1: Creating an AI map

Using AI allows us to directly map the individual geological properties that underlie our traditional classifications (as well as any properties that our classifications may not consider), and to quantify uncertainty in the process. To achieve this, our AI systems must learn to provide a predictive distribution for the values of geological properties as a function of some inputs that define the ‘feature space’ of the problem.

Thinking in terms of position in 3D space is the bare minimum when it comes to geological mapping. In AI terms, this is a 3D feature space: we would be learning to predict geological properties based purely on where we are (figure 3, bottom panel provides an example of what this can look like for a 1D feature space). However, geologists consider many more factors (or features) than just position in space; they scan the landscape for geologically informative clues as part of the mapping process. These features may include the general topography, breaks of slope, and changes in vegetation. In order to produce convincing outputs, our AI mapping systems should also be capable of identifying and using such features.

The breakthrough of deep learning is that it can ingest unstructured data, such as images, and learn to extract any features that are useful for improving performance on the task at hand – in this case mapping geological properties. To harness these capabilities, we can supplement our observations of geological properties with images of the surrounding terrain centred on each observation, and provide these as an additional input (along with position in space) to our deep learning architecture (Fig. 4).
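As a rough sketch of the two-input idea, the snippet below passes a terrain image and a position vector through separate branches, merges them, and outputs a predictive mean and standard deviation. Everything here is an illustrative assumption: the layer sizes, the patch size, the use of plain dense layers in place of image-specific layers, and the untrained random weights. It is not the architecture of figure 4.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(n_in, n_out):
    """Create one fully connected layer with random (untrained) weights."""
    return {"W": rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_in, n_out)),
            "b": np.zeros(n_out)}

def apply_dense(layer, x, activate=True):
    z = x @ layer["W"] + layer["b"]
    return np.maximum(z, 0.0) if activate else z  # ReLU on hidden layers

PATCH = 16  # hypothetical terrain patch of 16 x 16 elevation cells
terrain_branch = dense(PATCH * PATCH, 32)   # stands in for Input A
position_branch = dense(3, 32)              # easting, northing, elevation (Input B)
combined = dense(64, 32)
head = dense(32, 2)                         # outputs: mean and raw spread

def predict(terrain_patches, positions):
    """Return a predictive mean and standard deviation per location."""
    flat = terrain_patches.reshape(len(terrain_patches), -1)
    a = apply_dense(terrain_branch, flat)
    b = apply_dense(position_branch, positions)
    h = apply_dense(combined, np.concatenate([a, b], axis=1))
    out = apply_dense(head, h, activate=False)
    mean = out[:, 0]
    sd = np.logaddexp(0.0, out[:, 1])  # softplus keeps the spread positive
    return mean, sd

patches = rng.normal(size=(5, PATCH, PATCH))   # synthetic stand-in data
positions = rng.normal(size=(5, 3))
mean, sd = predict(patches, positions)
print(mean.shape, sd.shape)
```

Training would then adjust the weights so that these predictive distributions agree with the observations; that step is omitted here.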

In the case of the AI map in figure 2, our deep learning system learns to predict relative concentrations of potassium, iron, and calcium (as centred log-ratios) based on chemical assay measurements from over 100,000 stream sediment samples collected across the British Isles by the British Geological Survey, Geological Survey Ireland, and Geological Survey of Northern Ireland (see methodology of Johnson et al., 2005). Our AI system learns how these element concentrations relate not just to position in space, as provided by the easting, northing, and elevation of the geochemical observations (INPUT B, Fig. 4), but also how they relate to more complex features of the landscape, which are learned automatically from images of elevation data (from the EU’s Copernicus Land Monitoring Service) centred on each geochemical observation (INPUT A, Fig. 4).
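For readers unfamiliar with centred log-ratios, a minimal sketch of the transform follows. The concentration values are invented for illustration, and which elements enter the ratio in the published workflow is not stated here, so the three-part composition is an assumption.

```python
import numpy as np

def centred_log_ratio(composition):
    """Log of each part minus the mean log of all parts in the same sample."""
    parts = np.asarray(composition, dtype=float)
    if np.any(parts <= 0):
        raise ValueError("centred log-ratios need strictly positive parts")
    log_parts = np.log(parts)
    return log_parts - log_parts.mean(axis=-1, keepdims=True)

# Hypothetical K, Fe and Ca concentrations for two samples (any common unit).
samples = np.array([[2.1, 4.5, 0.8],
                    [1.0, 3.0, 12.0]])
clr = centred_log_ratio(samples)
print(clr)
```

Each transformed row sums to zero and is unchanged if the whole sample is rescaled, which is why the transform expresses relative rather than absolute concentrations.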

This setup allows our AI mapping system to predict geological properties by interpolating between their observations in a much richer self-learned feature space than simply geographic space itself, in a procedure that is closer to the thought processes of a human geologist. In addition, by adopting a Bayesian approach, our AI system models a distribution of possible maps, rather than just a single best map, in order to quantify uncertainties and provide well-calibrated probabilistic predictions. Figure 2 simply displays the mean of this distribution of possible maps.

Details of the full methodology, including simulation of different possible maps, are available in Kirkwood et al., 2022, Math Geosci 54, 507–531; https://doi.org/10.1007/s11004-021-09988-0

Traditional geostatistics

The ‘properties first’ approach is not a new idea in itself. For example, the mineral exploration industry has been using geostatistical interpolation techniques (i.e., kriging) to map ‘properties first’ for decades. This is because it is much easier to find metal deposits by observing and mapping the concentrations of those metals directly, rather than only attempting to infer their distributions from traditional classified geological maps, which may contain very little information alluding to mineralisation (or other phenomena that are of interest to end users).

In simple terms, kriging can be thought of as fitting a smooth function (or surface) through our observations in two or three spatial dimensions. Kriging traditionally relies on an assumption that the effect that the distance between observations has on their similarity will be consistent everywhere. It also often assumes that this spatial autocorrelation will be the same in all directions. While these assumptions may be reasonable at the scale of individual mines, when mapping at regional or national scales the maps produced by kriging can seem unconvincing because they fail to capture geological structure, such as that of differing terranes and faulting, which geologists know to be important.
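The stationarity and isotropy assumptions can be made concrete with a minimal simple-kriging sketch in one dimension. The exponential covariance model, its parameters, the known-mean assumption and the data values are all illustrative choices, not a recommendation for real data.

```python
import numpy as np

def exponential_covariance(xa, xb, sill, length_scale):
    """Covariance depends only on separation distance: the same everywhere
    (stationary) and, in more than one dimension, in every direction."""
    distance = np.abs(xa[:, None] - xb[None, :])
    return sill * np.exp(-distance / length_scale)

def simple_krige(x_obs, y_obs, x_new, mean, sill=1.0, length_scale=2.0):
    """Simple kriging with an assumed known mean; returns prediction and variance."""
    k_obs = exponential_covariance(x_obs, x_obs, sill, length_scale)
    k_new = exponential_covariance(x_obs, x_new, sill, length_scale)
    weights = np.linalg.solve(k_obs, k_new)          # shape (n_obs, n_new)
    prediction = mean + weights.T @ (y_obs - mean)
    variance = sill - np.sum(weights * k_new, axis=0)
    return prediction, variance

# Hypothetical observations along a transect.
x_obs = np.array([0.0, 1.0, 3.0, 4.5, 7.0])
y_obs = np.array([1.2, 0.9, 1.8, 2.4, 1.1])
x_new = np.linspace(0.0, 10.0, 101)

prediction, variance = simple_krige(x_obs, y_obs, x_new, mean=y_obs.mean())
```

Because a single sill and length scale apply across the whole domain, nothing in this model can represent an abrupt change across a fault or terrane boundary.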

Widespread adoption of the ‘properties first’ approach to geological mapping therefore requires the development of a new generation of geostatistical modelling methods which do away with the need to make geologically unconvincing assumptions – this is where AI comes in. The process of developing AI mapping systems presents a fresh opportunity for geologists and geostatisticians to collectively agree on what we should consider to be reasonable assumptions behind the maps we produce. This is an exciting time for geological mapping.

Figure 2: (Left of diagonal) An example AI-generated ‘properties first’ geological map, which provides probabilistic predictions of geological properties through space. The properties mapped here are centred log-ratios of potassium (K; red), iron (Fe; green), and calcium (Ca; blue) concentrations as observed in stream sediments. This map displays the mean of a distribution of possible maps (the Bayesian posterior predictive distribution), from which distributions of possible values can be obtained for any point in space. The AI system was trained using G-BASE and Tellus geochemical survey data provided by the BGS, GSI, and GSNI, with auxiliary information provided by the Copernicus terrain elevation grid; see Box 1. (Right of diagonal) A traditional ‘classification first’ geological map whereby rock units are manually delineated and coloured according to convention, based on rock types and ages. Source: The British Geological Survey’s Geology of Britain Viewer (http://mapapps.bgs.ac.uk/geologyofbritain3d/). Contains British Geological Survey materials © UKRI 2022. (Image credit and copyright of Charlie Kirkwood)

BOX 2: Different applications

Potassium, iron, and calcium are chosen to illustrate the AI mapping approach in this article because much of the familiar geology of the British Isles is revealed in the relative contrasts of these major elements, as can be seen by comparison with the traditional geological map (which is not a surprise, since composition is a key part of traditional classification schemes). The relative concentrations of these elements can tell us a lot. For example, while calcium is abundant in limestones and chalk (e.g. the Chiltern hills, coloured blue), it can be almost completely absent from deep marine sediments deposited below the calcite compensation depth (e.g. much of Wales, coloured yellow). Meanwhile, among other things, iron is abundant in mafic intrusions, such as the Antrim basalts of Northern Ireland (turquoise), while potassium is relatively concentrated in felsic intrusions, such as the granites of the Cornubian batholith in the south west of England (red). In order to produce such a detailed map, our AI mapping system has had to learn the unique ways in which each element relates to the landscape itself; different chemical compositions allude to different weathering properties, which in turn have different surface expressions in the terrain’s topography, which our neural network learns to ‘read’.

Different end-use applications require different geological properties to be mapped. For example, for mineral exploration we may want maps of commodity element concentrations; for engineering projects we may want maps of mechanical properties such as shear strengths and joint kinematics. More generally, structural geologists may want maps of the orientations of bedding planes and foliation, and stratigraphers may want maps of ages of formation. Increasingly we are collecting quantitative observations of all these properties and more; the challenge is to design suitable AI systems that can model and map them in geologically sensible ways, so that our maps can become increasingly accurate and precise as more observations are collected, while remaining honest about their uncertainty.

Figure 3: Conceptual comparison of ‘classification first’ mapping (top) and ‘properties first’ mapping using Bayesian AI (bottom). The black line shows the true value of some geological property, y, through space, x. Black crosses are our observations of this property, with some error, which the two mapping approaches utilise. Note that regardless of their number, or the exact position of their boundaries, a single set of discrete classes provides a poor representation of continuous geological properties and cannot convey uncertainty. These issues are exacerbated in the real-world mapping case of attempting to summarise multiple properties at once using the same class boundaries. Conversely, mapping ‘properties first’ using Bayesian AI methods brings the potential to directly obtain skilful probabilistic maps of as many geological properties as are of interest. (Image credit and copyright of Charlie Kirkwood)

Deep learning and Bayesian AI

The mathematical foundations of how to learn from data can be traced back to at least Pierre-Simon Laplace and Thomas Bayes in the 1700s, but it is only now that our computers are becoming sufficiently powerful to realise the wider potential of what can be achieved. A significant breakthrough came ten years ago in 2012, when Alex Krizhevsky and colleagues demonstrated that deep neural networks (neural networks with multiple stacked layers; e.g. Fig. 4) trained using powerful graphics processing units (GPUs) were able to outperform all previous computational approaches in the task of classifying images. What was revolutionary was that this ‘deep learning’ approach could simply be provided with the raw images as input, with no need to manually apply things like edge-detection filters in advance to make the classification problem easier to solve (e.g., the right kind of filter can highlight the presence of cat ears in an image, thus making it easier to identify a cat). Instead, deep learning can automatically learn relevant features for itself.

Deep learning has emerged as an effective way to utilise recent advances in computing power because it allows us to pose complex modelling problems as optimisation problems that can be solved relatively efficiently. A neural network is a system of interconnected parameters, whose values define a function that dictates what the neural network’s output will be for a given input. By optimising the parameter values we can learn functions to perform specific tasks. In the context of geological mapping, we can design neural networks that learn to output accurate predictions of geological properties, informed not just by position in space, but also by the same kinds of features we look for in the field as geologists, such as breaks of slope and differences in terrain textures. In concept, this process is comparable to that of training a human geologist with the necessary skills to produce good geological maps. The more prior knowledge a geologist has, the better job they are likely to do when mapping a new area. The same applies to AI – we can incorporate our geological wisdom into the design of AI mapping systems so that they are equipped to recognise the types of relationships that we believe are important (as in figure 4). Therefore, a transition to AI mapping does not mean a move away from the importance of geological expertise. Instead, it is an opportunity for us to collate our expertise together inside the digital domain.

Figure 4: Architecture of the deep neural network that produced the AI map of figure 2. The neural network learns by optimising its output – predictive distributions of geological properties – in relation to field observations (points on the map on the left, shown coloured by calcium concentration) and to our prior beliefs. Our architecture combines parallel information processing pathways in order to learn both local contextual features from gridded terrain elevation data (Input A) and global position features (Input B), as well as the interactions between these two feature types. The ‘thought processes’ involved are not unlike those of a field geologist. (Modified from Kirkwood, et al., 2022, Math Geosci 54, 507–531; https://doi.org/10.1007/s11004-021-09988-0, published under a Creative Commons CC BY license. Image credit and copyright of Charlie Kirkwood)

When comparing geological maps to reality, it makes sense to use the statistical concept of likelihood – the probability that the scenario depicted by the map would produce the observations we see. In deep learning we can directly maximise the likelihood using optimisation, which would result in a single map with the maximum likelihood of having produced our field observations. It seems we instinctively aim to do the same thing as mapping geologists, albeit in a more organic way. However, without infinite geological observations there will always be room for multiple possible geological interpretations because of the uncertainty about what is going on in the gaps between our observations. Therefore, the act of producing only a single best-fit geological map (the maximum likelihood approach) is inherently overconfident – we are putting all our eggs into one basket despite logic dictating that this basket is just one of many possibilities. Using a single ‘best’ map for important decision making could therefore lead to unwanted outcomes, because by doing so we are blind to the unavoidable uncertainties involved in the production of the map, which may be large enough to derail our projects in the real world.
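To make the idea of likelihood concrete, the sketch below scores two candidate ‘maps’ of a one-dimensional property against the same observations. The straight-line model, the Gaussian noise assumption with a known standard deviation, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def gaussian_log_likelihood(observed, predicted, noise_sd):
    """Log-probability of the observations if the map's prediction were
    true and measurement noise were Gaussian with known spread."""
    residuals = observed - predicted
    n = observed.size
    return float(-0.5 * np.sum((residuals / noise_sd) ** 2)
                 - n * np.log(noise_sd)
                 - 0.5 * n * np.log(2.0 * np.pi))

rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 50)
observed = 2.0 + 0.5 * x + rng.normal(0.0, 0.3, x.size)  # synthetic observations

# With Gaussian noise, the least-squares fit is the maximum likelihood 'map'.
design = np.column_stack([np.ones_like(x), x])
best_params, *_ = np.linalg.lstsq(design, observed, rcond=None)
ll_best = gaussian_log_likelihood(observed, design @ best_params, 0.3)

# Any other choice of parameters explains the same observations less well.
other_params = best_params + np.array([0.2, -0.05])
ll_other = gaussian_log_likelihood(observed, design @ other_params, 0.3)

print(f"best fit: {ll_best:.2f}, alternative: {ll_other:.2f}")
```

The alternative scores lower but is not impossible, which is exactly the information a single best-fit map throws away.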

To deal with uncertainty, we need to shift our goal away from producing a single best geological map and instead aim to model all of the maps that are possible given our observations. This is the Bayesian approach (named after Thomas Bayes, 1702 – 1761). We can think of this as creating a digital flipbook – infinitely long – where each page shows a realisation of a different possible geological map. As a collective, the book describes all the geological scenarios that could plausibly result in the observations we see. By collecting more observations, we can increase our knowledge and reduce the spread (uncertainty) between the possible geological scenarios. We can achieve this in AI mapping by modelling a distribution over parameter values according to Bayes’ rule, rather than simply optimising to achieve a single best map.
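The flipbook idea can be sketched with a model simple enough for Bayes’ rule to be applied exactly. The example below uses a linear model on radial basis features with Gaussian prior and noise, which is a deliberately simplified stand-in for a neural network; the feature choice, prior and noise settings, and synthetic ‘truth’ are all assumptions.

```python
import numpy as np

def rbf_features(x, centres, width=1.0):
    """Smooth bumps centred along the transect, used as model inputs."""
    return np.exp(-0.5 * ((x[:, None] - centres[None, :]) / width) ** 2)

def posterior_over_weights(features, y, prior_sd=1.0, noise_sd=0.2):
    """Bayes' rule for a linear-Gaussian model: posterior mean and covariance."""
    n_features = features.shape[1]
    precision = np.eye(n_features) / prior_sd**2 + features.T @ features / noise_sd**2
    covariance = np.linalg.inv(precision)
    mean = covariance @ features.T @ y / noise_sd**2
    return mean, covariance

rng = np.random.default_rng(2)
centres = np.linspace(0.0, 10.0, 12)
x_grid = np.linspace(0.0, 10.0, 200)
grid_features = rbf_features(x_grid, centres)

def flipbook(x_obs, n_pages=100):
    """Draw possible maps given observations at x_obs; also return their spread."""
    y_obs = np.sin(x_obs) + rng.normal(0.0, 0.2, x_obs.size)  # synthetic data
    mean, cov = posterior_over_weights(rbf_features(x_obs, centres), y_obs)
    pages = rng.multivariate_normal(mean, cov, size=n_pages) @ grid_features.T
    spread = np.sqrt(np.sum((grid_features @ cov) * grid_features, axis=1))
    return pages, spread

x_many = rng.uniform(0.0, 10.0, 40)
pages_few, spread_few = flipbook(x_many[:5])    # 5 observations
pages_many, spread_many = flipbook(x_many)      # the same 5 plus 35 more

print(f"mean spread with 5 observations:  {spread_few.mean():.3f}")
print(f"mean spread with 40 observations: {spread_many.mean():.3f}")
```

Each row of `pages_few` or `pages_many` is one page of the flipbook, and the spread between pages narrows as observations are added.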

Figure 5: AI map quality checks are made against test observations (n = 11,700) excluded from the modelling process so that they are akin to unobserved locations. Far left: scatter plot of observed and predicted element concentrations (taking the mean of the AI’s predictive distribution as a point prediction). High R² values show good fit between the map and unobserved locations. Left: quantile-quantile plot assessing probabilistic calibration of the AI’s predictive distribution against reality. Calibration is near-perfect, suggesting that element concentrations at unseen locations will be observed with the frequencies implied by the AI map. Continuous ranked probability score (CRPS) approaches zero as the AI’s predictive distribution matches the distribution of observations. (Image credit and copyright of Charlie Kirkwood)

By planning our activities and policies around probabilities rather than potentially incorrect absolutes, we can optimise the outcomes of our geology-related endeavours (as well as optimise the collection of new data). Of course, probabilities need to be calibrated against reality in order to be useful, but this is all part of the workflow of developing AI systems (Fig. 5). Interestingly, much of the work in assessing the skill of probabilistic predictions has been developed in the context of weather forecasting, which is in many ways the spatio-temporal cousin of geological mapping, and also had its early roots in hand-drawn maps.
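Two of the checks borrowed from forecast verification can be sketched as follows, using synthetic data in place of held-out observations. The ensemble sizes, the Gaussian toy distributions and the function names are assumptions; the sample-based CRPS formula used is a standard estimator, not necessarily the one behind figure 5.

```python
import numpy as np

def ensemble_crps(samples, observed):
    """Sample-based CRPS per observation; lower is better.
    samples has shape (n_obs, n_members), observed has shape (n_obs,)."""
    accuracy = np.mean(np.abs(samples - observed[:, None]), axis=1)
    spread = np.mean(np.abs(samples[:, :, None] - samples[:, None, :]), axis=(1, 2))
    return accuracy - 0.5 * spread

def pit_values(samples, observed):
    """Fraction of predictive samples at or below each observation.
    These are roughly uniform on [0, 1] when predictions are calibrated."""
    return np.mean(samples <= observed[:, None], axis=1)

rng = np.random.default_rng(3)
n_obs, n_members = 500, 100
observed = rng.normal(0.0, 1.0, n_obs)                        # synthetic 'reality'
calibrated = rng.normal(0.0, 1.0, (n_obs, n_members))         # honest spread
overconfident = rng.normal(0.0, 0.2, (n_obs, n_members))      # spread too narrow

crps_calibrated = ensemble_crps(calibrated, observed).mean()
crps_overconfident = ensemble_crps(overconfident, observed).mean()

def tail_fraction(pit):
    """Share of observations falling in the outer 10% of the predictions."""
    return float(np.mean((pit < 0.05) | (pit > 0.95)))

tails_calibrated = tail_fraction(pit_values(calibrated, observed))
tails_overconfident = tail_fraction(pit_values(overconfident, observed))

print(f"CRPS: calibrated {crps_calibrated:.3f}, overconfident {crps_overconfident:.3f}")
print(f"tail share: calibrated {tails_calibrated:.2f}, overconfident {tails_overconfident:.2f}")
```

An overconfident predictor is penalised on both measures: its CRPS is higher, and far more observations land in the tails of its predictive distributions than its stated probabilities imply.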

The closest thing we have to a Bayesian approach in the world of traditional (non-computational) geological mapping practice is ‘the undergraduate mapping trip’, on which a cohort of student geologists are each tasked with producing their own map of the same study area. The result is a collection, or ensemble, of possible geological maps. This ensemble conveys uncertainty (“it could be this, or it could be that”) better than any single map could, which is the same reason that weather forecasters have been using ensemble models for about the last 30 years. However, in the case of the undergraduate mapping trip we may question the overall skill of the resultant ensemble on the grounds that it has been produced by inexperienced geologists whose individual interpretations may not always be sensible. Imagine if instead the ensemble of maps was produced by infinitely many world-leading geological mapping experts – in essence this is what AI has the potential to provide us with, if we design it well.

Where the future leads

Bayesian geological mapping can only be achieved in a rigorous way using powerful computers. So, we must port our mapping procedures into the digital domain – and that doesn’t mean drawing maps by hand on a computer, it means designing AI systems that can themselves conduct the task of geological mapping. Not only will this allow us to produce geological models and maps at new levels of fidelity, and with quantified uncertainties, but in the process we would be ‘laying out on the table’ the necessary intellectual machinery required to produce geological maps, so that this knowledge can be communally improved upon over the years.

It is difficult to predict just how drastic the progress could be if we – the geoscience community – adopt AI as an integral part of geological mapping. Improving our ability to distil and convey geological information can only bring benefits, particularly as we face the pressing challenges of the climate crisis. Now more than ever we need to be in tune with our planet, and that means going beyond the opaque polygons of our traditional mapping practices.

Dedication

I would like to dedicate this article to Hazel Pritchard, my supervisor during my time at Cardiff University, and to Barry Rawlins, a colleague and mentor during my time at the British Geological Survey. Both were inspirational geoscientists who encouraged open‑mindedness and discussion of new ideas. Both sadly lost their lives to illness in 2017, but the enthusiasm and ideas they helped to cultivate still thrive.

Charlie Kirkwood
Charlie is a geologist and data scientist working towards his mathematics PhD at the University of Exeter, UK, in partnership with the Met Office.

Further reading

Kirkwood et al. (2022) Bayesian deep learning for spatial interpolation in the presence of auxiliary information. Mathematical Geosciences 54, 507–531; https://doi.org/10.1007/s11004-021-09988-0
Bergen et al. (2019) Machine learning for data-driven discovery in solid Earth geoscience. Science 363, eaau0323; https://doi.org/10.1126/science.aau0323
McGovern et al. (2017) Using artificial intelligence to improve real-time decision-making for high-impact weather. Bulletin of the American Meteorological Society 98, 2073-2090; https://doi.org/10.1175/BAMS-D-16-0123.1
Kirkwood et al. (2016) A machine learning approach to geochemical mapping. Journal of Geochemical Exploration 167, 49-61; https://doi.org/10.1016/j.gexplo.2016.05.003
Kirkwood et al. (2015) Geological mapping using high resolution regression modelled soil geochemistry. Poster presented at The Geological Society of London’s William Smith Meeting 2015 (Part 2) – 200 Years and Beyond: The Future of Geological Mapping; http://nora.nerc.ac.uk/id/eprint/512664
Krizhevsky et al. (2012) ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems 25; https://proceedings.neurips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf
Haupt et al. eds. (2008) Artificial intelligence methods in the environmental sciences. Springer Science & Business Media.
Gneiting et al. (2007) Probabilistic forecasts, calibration and sharpness. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 69, 243-268; https://doi.org/10.1111/j.1467-9868.2007.00587.x
Johnson et al. (2005) G-BASE: baseline geochemical mapping of Great Britain and Northern Ireland. Geochemistry: Exploration, Environment, Analysis 5, 347-357; https://doi.org/10.1144/1467-7873/05-070