Drivers of imbalance in machine learning uptake across geology and geophysics

Our Data Science Director, Samuel Fielding, and Managing Director, Lorin Davies, recently authored an article for the Machine Learning special issue of First Break this September. In it, they use machine learning to interrogate the factors driving its uptake in various geoscience domains.

September 2019

Topics covered include visualizing discipline-specific drivers of machine learning uptake (using machine learning), and exploring different ways to facilitate machine learning in undersaturated disciplines within the geosciences.

Machine learning augments how we acquire, prepare and interpret data. Our next great challenge is to unlock these advances and realise the next leap in productivity in data analysis.

EAGE members can read the full article here, and anyone can read the abstract here.


Source-to-sink workflows for de-risking the oil and gas reservoir

Over the last few weeks we have been spending lots of time talking to clients about some of the datasets offered by Petryx. In a few cases we have been talking to sedimentologists and geoscientists who are experienced with manipulating and interrogating datasets like ours. However, even experienced geoscientists still have questions about how to get the most out of these datasets, particularly when doing source-to-sink work. This blog explains the approach we take at Petryx when puzzling out sediment routing and prediction of reservoir quality. We outline a generalised approach, followed by some of the outputs you would expect to see from a workflow like this.

Source to sink: Detailed study of processes and products relating to clastic sediments, from their origins in the hinterland, to the sedimentary basin.

Normally, the quickest route to good interpretation is through good scientific method. In source-to-sink questions this means observing the data, be that subsurface and/or hinterland data, and formulating a question: “where and when are the good reservoir sands being deposited?”, or “does my block contain good quality sand?”. Maybe we have questions about seal capacity, or maybe a group of people have a set of questions that really need quantifiable answers.

The next step is to build a hypothesis based on the data we have observed. This may mean integrating drainage-basin polygons with hinterland geology and geochemical data. We can take thermochronological data and further refine our drainage basins, or adjust the expected sediment generation based upon these inputs. Paleocurrent data help us to verify where clastic systems were flowing, and further refine the drainage story. We bring in any climatic data we have and use this along with total drainage length to quantify sediment breakdown, and therefore the cleaning potential of our sands. Plate models and paleogeographic interpretations can be significant, but throughout this process it is paramount to be able to segregate the interpretative inputs from the hard data, and weight accordingly. Once we have mashed together all these data and interpretations we can make a prediction: “Given what I have observed, I think that there is lots of good quality sand going from A to B at this time”. If we have enough data, and time to work with it, we might even quantify this hypothesis with some degree of certainty (see this great book for an explanation of why you should be sceptical of predictions that people won’t put a number on).
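One way to put a number on a prediction like this is a simple Monte Carlo simulation. The sketch below is an illustration only, not the workflow used at Petryx: the two hinterlands, their quartz fractions, the uniform range for hinterland A's share of the sediment, and the 55% quartz threshold are all hypothetical values chosen for the example.

```python
import random

# Hypothetical quartz fractions of sediment shed from two hinterlands.
QUARTZ_A = 0.60
QUARTZ_B = 0.40


def probability_quartz_exceeds(threshold, share_a_min, share_a_max,
                               draws=10_000, seed=42):
    """Estimate P(quartz fraction > threshold) when hinterland A's share is uncertain.

    The share of sediment from hinterland A is drawn uniformly between
    share_a_min and share_a_max; the remainder comes from hinterland B.
    """
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    hits = 0
    for _ in range(draws):
        share_a = rng.uniform(share_a_min, share_a_max)
        quartz = share_a * QUARTZ_A + (1.0 - share_a) * QUARTZ_B
        if quartz > threshold:
            hits += 1
    return hits / draws


certainty = probability_quartz_exceeds(0.55, share_a_min=0.5, share_a_max=1.0)
print(f"Chance of more than 55% quartz: {certainty:.0%}")
```

With these made-up inputs the estimate should come out close to one half, because the threshold is only crossed when hinterland A supplies more than three quarters of the sediment. In a real study the uniform range would be replaced by whatever distribution the hard data and the interpretative inputs justify.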

Now the fun bit! We take some data which we have held in reserve and test our hypotheses. If you are asking a service provider for a bespoke source-to-sink interpretation, a great game is to keep a few petrographic analyses in reserve to test some of their predictions. We sometimes get a bit upset about not getting all the data, but it is a fantastic way of testing the veracity of the predictions being made. After all, it’s better to test a hypothesis before you drill the well, rather than afterwards.

Above: One output of a hypothetical source-to-sink workflow, with mineral composition predictions from given hinterland zones. Multiple datasets can be used to give a qualitative estimate of reservoir potential, and then mixed into sediment packages. Image credit: Petryx Ltd (CC BY-SA 4.0)
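The mixing step described in the caption can be sketched as a weighted average. This is a minimal illustration under assumed inputs rather than Petryx's actual method: the hinterland zone names, their mineral fractions and the 75/25 contribution split are hypothetical, picked so that the result matches the illustrative composition quoted for package α later in this post.

```python
# Hypothetical mineral compositions (fractions summing to 1) for each hinterland zone.
HINTERLAND_COMPOSITION = {
    "A": {"quartz": 0.60, "plagioclase": 0.25, "k_feldspar": 0.15},
    "B": {"quartz": 0.40, "plagioclase": 0.25, "k_feldspar": 0.35},
}


def mix_package(contributions, compositions):
    """Return the contribution-weighted mineral composition of a sediment package.

    contributions maps each hinterland zone to its relative share of the
    sediment; shares are normalised, so they need not sum to 1.
    """
    total = sum(contributions.values())
    if total <= 0:
        raise ValueError("contributions must sum to a positive value")
    mixed = {}
    for zone, share in contributions.items():
        weight = share / total
        for mineral, fraction in compositions[zone].items():
            mixed[mineral] = mixed.get(mineral, 0.0) + weight * fraction
    return mixed


# Assumed split: three quarters of the sediment from zone A, one quarter from zone B.
package_alpha = mix_package({"A": 0.75, "B": 0.25}, HINTERLAND_COMPOSITION)
for mineral, fraction in package_alpha.items():
    print(f"{mineral}: {fraction:.0%}")
```

A linear mix like this ignores breakdown during transport, so in practice the hinterland compositions would first be adjusted for climate and drainage length, as described above.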

Now we can start to iterate. It is OK to get it wrong as long as we find out why. Often the predictions will need to be refined and improved to reflect new data. Once we have a model which we are happy with we can take our outputs and make some quantitative statements about the subsurface. Often a report or study will stop here and not fully translate a working model into useful quantitative data, as if it is enough to say, “We have a model which explains the data well”. Let’s take that hypothetical model and share our insights with our colleagues:

  • Thermochronological data suggest up to 6 km of exhumed lithosphere over our target interval.
  • Uplift rates around our target interval suggest a significant influx of clastic material into the area around our block (15 km³) derived from hinterland A (65% certainty).
  • Siliciclastic packages in package α are likely to be composed of 55% quartz, 25% plagioclase and 20% K-feldspar.
  • Package β, whilst relatively significant in the sediment pile, is likely to be largely (85%) derived from hinterland C, so could be problematic. This may mean that where it is interleaved with package α it risks washing in diagenetic fluids.
  • Package γ appears to be a distal equivalent of package α, but also contains chemical sediments. It is likely to contain up to 6% chert.
  • Package ζ is a regional seal with an expected clay content of 30%.

Finally, we want to say loud and clear what the sensitivities and testable parts of the model are, so that if someone finds data which doesn’t fit, we can iterate our model further.

  • Paleoclimatic conditions indicate high potential for sand cleaning and improved reservoir quality. Lower weathering rates and stable climatic conditions will result in lower chemical abrasion and reduced sand cleaning.
  • Little compositional data was available for the upper part of our target zone, resulting in heavy reliance on U-Pb data, which can be blind to mafic hinterlands.
  • We would expect few or no metamorphic minerals in package α.
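Testable statements like these lend themselves to a simple check against data held in reserve. The sketch below is one possible approach, not a description of any particular provider's workflow: it compares a predicted composition (the package α figures above) with held-back petrographic analyses and flags minerals whose mean absolute error exceeds a tolerance. The held-back values and the 2.5 percentage-point tolerance are invented for illustration.

```python
# Predicted composition for package alpha, in percent (from the model summary above).
PREDICTED = {"quartz": 55.0, "plagioclase": 25.0, "k_feldspar": 20.0}

# Hypothetical held-back petrographic analyses, in percent.
HELD_BACK = [
    {"quartz": 58.0, "plagioclase": 24.0, "k_feldspar": 18.0},
    {"quartz": 52.0, "plagioclase": 27.0, "k_feldspar": 21.0},
]


def mean_absolute_error(predicted, samples):
    """Mean absolute difference (percentage points) per mineral across samples."""
    return {
        mineral: sum(abs(sample[mineral] - value) for sample in samples) / len(samples)
        for mineral, value in predicted.items()
    }


def minerals_outside_tolerance(errors, tolerance):
    """Return the minerals whose error exceeds the tolerance, sorted by name."""
    return sorted(mineral for mineral, error in errors.items() if error > tolerance)


errors = mean_absolute_error(PREDICTED, HELD_BACK)
flagged = minerals_outside_tolerance(errors, tolerance=2.5)
print("Mean absolute error by mineral:", errors)
print("Predictions to revisit:", flagged)
```

Any mineral that is flagged points to the part of the model to revisit in the next iteration, which is exactly the feedback the held-back data is meant to provide.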

To quote statistician George Box, who we think had it about right: “All models are wrong, but some are useful”. Hopefully this idealised workflow can help you get closer to that useful model and know what these datasets should be able to tell us. If you are interested in source-to-sink modelling, or want to discuss more, drop a comment below or head over to a short survey which we will be promoting in the coming weeks. We think we have some great workflows, but your opinions can really help us make them more relevant to you.

George Box. Image credit: DavidMCEddy at en.wikipedia (CC BY-SA 3.0).

By Dr Lorin Davies

PRESS RELEASE: New international geoscience consultancy announced

Petryx Ltd is a new company set up to meet the growing need for modern data science capabilities in the Oil & Gas industry. The company is based in North Wales in the UK and serves Oil & Gas operators worldwide.

Dr Lorin Davies, Managing Director of Petryx Ltd, said: “The Oil & Gas industry has fundamentally changed since the heyday of the 1990s and 2000s. Our customers are modern organisations who look to the service sector for innovative answers to their challenges. We launched Petryx to meet these challenges and to bring technologies developed for the internet software industry to the upstream industry.

“The products and services in development at Petryx are truly exciting and will allow our clients to better understand their own data, as well as the answers buried in academic research.”

Petryx offers built-for-purpose data products and services. Petryx data products are designed to be light, responsive and perfectly optimised for clients’ needs. As scientists in the Oil & Gas industry tackle new hurdles with more marginal fields, in deeper water and with smaller teams, the skills and expertise offered by Petryx will help them to make better decisions.