Drivers of imbalance in machine learning uptake across geology and geophysics

Our Data Science Director, Samuel Fielding, and our Managing Director, Lorin Davies, recently authored an article for the Machine Learning special issue of First Break this September. In it they use machine learning to interrogate the factors driving its uptake across geoscience domains.

September 2019

Topics covered include visualising discipline-specific drivers of machine learning uptake (using machine learning) and exploring different ways to facilitate machine learning in undersaturated disciplines within the geosciences.

Machine learning augments our acquisition, preparation, and interpretation of data, so our next challenge is to unlock these advances and realise the next great productivity leap in data analysis.

EAGE members can read the full article here, and anyone can read the abstract here.


Tales from Texas

Back in May, Lorin and I headed across the Atlantic to visit friends (new and old) over in Texas. We planned to spend a week in Houston in the run-up to AAPG ACE 2019, followed by a week in San Antonio at the conference.

It was a little different to your usual business trip. As a start-up, we had much more flexibility in terms of travel and accommodation. For starters, we were able to choose from hundreds of Airbnbs and take a few diversions, adding in a little road trip here and there.

Nonetheless we had a few hiccups along the way. I had the bright idea of printing my AAPG poster while we were in Houston to save bringing it over on the plane. Apparently this isn’t so straightforward (even if you put your order through and everything looks hunky-dory!) and I strongly advise against trying to print an 8x4 ft poster at Office Depot! We ended up having to adapt the design and print it in three 2 ft-wide panels. It actually worked out as quite a nice way to break up the poster and guide the reader through the content – every cloud and all that.

Watercolour by Jim Koehn

After a week spent in Houston visiting potential clients and sampling a local ‘Ice House’ (above), we drove west to San Antonio. We arrived at our lovely little Airbnb a few days before the conference and gave ourselves some time off to relax (below), or should I say explore.

Our San Antonio Airbnb

After contemplating numerous road trip options (a particularly adventurous one of Lorin’s was to attempt the 16-hour drive to the Grand Canyon), we decided on a relatively short jaunt to the Mexican border city of Laredo.

Here, we thought we would just nip into Mexico, get a stamp in our passports, and head home after sampling some authentic Mexican food and culture. We intended to walk over the Rio Grande bridge into Mexico as foot passengers rather than drive, which is not uncommon for tourists.

Our first port of call when we arrived in Laredo was lunch. We stopped at ‘El Maison De San Agustin‘ and I proceeded to order quesadillas for the millionth time on our trip thus far and drank a vat of hibiscus iced tea. Following said lunch, we were both stuffed and ready for our walk across the bridge.

El Maison De San Agustin

Crossing over into Mexico was predictably a lot faster than on the way back. As soon as we were over the border/bridge we had numerous offers of dental services!? We had a bit of a wander around, took some photos, and then began to head back over the bridge. We didn’t have to queue for long to get back across, but were a little disappointed that we didn’t actually get a stamp in our passports!

The Mexico-US border in Laredo over the Rio Grande.

Back in San Antonio a few days later, we kicked off the first day of the AAPG ACE conference at the Henry B. Gonzalez Convention Center with the icebreaker reception, where we bumped into many old friends and colleagues and put some names to new faces.

AAPG ACE icebreaker reception

The conference had a packed schedule, with topics ranging from unconventional reservoir characterisation and deepwater sedimentology to machine learning, plus not one but two sessions on source-to-sink (a personal favourite of mine!). I spent Monday afternoon listening to talks from the “Fluvial and Deltaic Depositional Environments: Reservoir Characterisation and Prediction From Multiple Scale Analyses” theme. John Holbrook touched upon braided versus meandering systems and whether it is actually that black and white when it comes down to it. James Mullins discussed automated workflows for reservoir modelling using drone data and libraries of photo-analogue systems, and Margaret Pataki presented her work on Rapid Reservoir Modelling (RRM) using sketch-based models.

AAPG ACE 2019 Exhibition Hall and Poster Sessions

Tuesday morning was spent switching between the source-to-sink sessions and “Multi-Disciplinary Integration for Subsurface Efforts in the Age of Big Data”. The source-to-sink sessions featured lots of interesting provenance and sediment supply studies interspersed with work on sediment routing and recycling, including talks from Peter Burgess and Jinyo Zhang. The integration and big data session touched upon multi-disciplinary approaches to subsurface studies, including a talk by Eugene Syzmanski on using petrography, geochronology, and biostratigraphy data to produce more accurate chronostratigraphic interpretations when correlating wells. Geochronology seems to be gaining momentum in the oil and gas industry as people realise that multi-disciplinary approaches lead to more robust interpretations.

That afternoon was our poster session, where we presented “A Source-to-Sink Reservoir Quality Prediction Workflow: The Offshore Nile Delta”. It was great to see such a busy poster session (I’m guessing partly due to the proximity of the exhibition hall and free beer at the end of the day!) and we ended up chatting to lots of new and interesting folks about the work we are doing at Petryx and how it can add value to real-world geological problems – particularly with regards to taking the legwork out of data collection and standardisation. All in all, we stood by the poster for about four hours and received lots of positive and constructive feedback.

Extended abstracts to be published online soon!

I also had judging commitments that afternoon and headed over to the poster session on “The Digital Transformation in the Geosciences”. There were loads of fantastic posters, all presented exceptionally well, with some great visualisations of some pretty cutting-edge data science.

Subjects included petrophysical facies classification using neural networks, the use of clustering techniques to define chemofacies, and the application of decision trees to determine failure modes.

In addition to all the technical talks, there were also some great sessions focussing on sustainability, and a DEG special session on the environmental impact of the oil and gas industry. Discussions centred on how geology will remain key in the transition to a more carbon-neutral society, with the increasing importance of practices such as carbon capture and storage and geothermal energy. The consensus was that we need to stop thinking of ourselves as the bad guys and start realising the potential that we have as an industry to help with the climate change challenges that lie ahead. This will also be crucial in continuing to attract bright and ambitious talent to the oil industry, helping us adapt to the digital transformation culture and the ‘big crew change’ that we see on the horizon.

Unfortunately, Tuesday was also the last day of the conference for us, as we had to head back to Houston for a last-minute meeting before catching our flight home. All in all, Lorin and I had a great trip, gathering lots of feedback and meeting lots of new faces, and we are looking forward to heading back to AAPG next year in Houston.

Next week Lorin will be writing an article all about our week in the Start-up area of this year’s EAGE Annual in London.

PRESS RELEASE: Petryx and Getech announce database partnership

Petryx Ltd have partnered with Getech Plc, allowing customers in the Oil & Gas industry to view Petryx Database coverage alongside Getech data products. Covering every major continental mass in the world, the Petryx Database offers hinterland data essential to data-driven source-to-sink interpretations.  

Managing Director of Petryx Ltd, Lorin Davies says: “We are delighted to have partnered with Getech. This brokerage deal provides much higher visibility of Petryx datasets to Oil & Gas explorers and demonstrates Getech’s commitment to innovation. Deals like this emphasise the many ways in which the Oil & Gas industry can support innovative start-ups like ours.” 

Senior GIS Consultant, Thierry Gregorius says: “We are excited to add this valuable new data resource to the extensive product range that we offer our customers. Explorers and geoscientists will now be able to access global datasets from Petryx via Getech, including geochemistry, geo- and thermochronology, and more. With this step we are offering our customers a growing one-stop shop.” 

Petryx Ltd is an innovative digital geoscience start-up that has revolutionised data integration and acquisition processes, giving Oil & Gas explorers fast access to cleaned and standardised datasets. The Petryx Database is a multi-disciplinary geoscience dataset compiled from numerous sources with the collaboration of industry and academic partners. It offers an unrivalled view of the composition of the Earth from a single unified platform.

Petryx at PETEX

This week we have been showing off our wares at the PESGB YP Summit and the PETEX conference. If you haven’t had the opportunity to chat to us already, or you’ve found one of our web-app postcards, then drop me a line.

We are showcasing two unique offerings this week. The first is our web-app, a demonstration tool that shows the kind of bespoke coded solution we can provide. It is particularly useful for quick-look analyses or routine plotting and charting tasks, and can probably save you a load of time. We dropped in some XRD data, but we can build tools like this for you quickly and easily for virtually any geoscience dataset. Get to the web-app by clicking here.

Secondly, we are showing the Petryx Database. With an estimated 80% of data science project time devoted to aggregating and cleaning data, the Petryx Database offers a global geoscience data resource unequalled in the market. What’s more, Petryx offers a one-off purchase commercial model, so you needn’t buy the same data from anyone year after year.

If we missed you, or you want to hear more about our data or services then drop us a line!

Lorin Davies

Source-to-sink workflows for de-risking the oil and gas reservoir

Over the last few weeks we have been spending lots of time talking to clients about some of the datasets offered by Petryx. In a few cases we have been talking to sedimentologists and geoscientists who are experienced in manipulating and interrogating datasets like ours. However, even experienced geoscientists still have questions about how to get the most out of these datasets, particularly when doing source-to-sink work. This blog explains the approach we take at Petryx when puzzling out sediment routing and predicting reservoir quality. We outline a generalised approach, followed by some of the outputs you would expect to see from a workflow like this.

Source-to-sink: the detailed study of processes and products relating to clastic sediments, from their origins in the hinterland to the sedimentary basin.

Normally, the quickest route to good interpretation is through good scientific method. In source-to-sink questions this means observing the data, be that subsurface and/or hinterland data, and formulating a question: “Where and when are the good reservoir sands being deposited?”, or “Does my block contain good quality sand?”. Maybe we have questions about seal capacity, or maybe a group of people has a set of questions that would really benefit from quantifiable answers.

The next step is to build a hypothesis based on the data we have observed. This may mean integrating drainage-basin polygons with hinterland geology and geochemical data. We can take thermochronological data and further refine our drainage basins, or adjust the expected sediment generation based upon these inputs. Paleocurrent data help us to verify where clastic systems were flowing, and further refine the drainage story. We bring in any climatic data we have and use this along with total drainage length to quantify sediment breakdown, and therefore the cleaning potential of our sands. Plate models and paleogeographic interpretations can be significant, but throughout this process it is paramount to be able to segregate the interpretative inputs from the hard data, and weight accordingly. Once we have mashed together all these data and interpretations we can make a prediction: “Given what I have observed, I think that there is lots of good quality sand going from A to B at this time”. If we have enough data, and time to work with it, we might even quantify this hypothesis with some degree of certainty (see this great book for an explanation of why you should be sceptical of predictions that people won’t put a number on).
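
To make that mixing step concrete, here is a minimal sketch in Python of how hinterland compositions can be combined into a predicted sand composition, weighted by the relative sediment flux from each drainage zone. The zone names, compositions, and fluxes below are entirely hypothetical illustrations, not real Petryx data.

```python
# Modal mineral composition of sediment shed from each hinterland zone
# (fractions summing to 1.0 per zone). Hypothetical values.
hinterland_composition = {
    "A": {"quartz": 0.70, "plagioclase": 0.20, "k_feldspar": 0.10},
    "B": {"quartz": 0.40, "plagioclase": 0.35, "k_feldspar": 0.25},
}

# Relative sediment flux from each zone, e.g. scaled from drainage-basin
# area, uplift rate, and climate-driven denudation estimates.
sediment_flux = {"A": 0.65, "B": 0.35}

def predict_mix(compositions, fluxes):
    """Flux-weighted average composition of the mixed sediment package."""
    total = sum(fluxes.values())
    minerals = {m for comp in compositions.values() for m in comp}
    return {
        m: sum(comp.get(m, 0.0) * fluxes[z] for z, comp in compositions.items()) / total
        for m in sorted(minerals)
    }

print(predict_mix(hinterland_composition, sediment_flux))
# -> {'k_feldspar': 0.1525, 'plagioclase': 0.2525, 'quartz': 0.595}
```

In practice, each flux weight would itself be derived from drainage-basin area, uplift, and climate, and would carry its own uncertainty to be propagated through the mix.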

Now the fun bit! We take some data which we have held in reserve and test our hypotheses. A great game, if you are asking a service provider to give you a bespoke source-to-sink interpretation, is to keep a few petrographic analyses in reserve to test some of their predictions. We sometimes get a bit upset about not getting all the data, but it is a fantastic way of testing the veracity of the predictions being made. After all, it’s better to test a hypothesis before you drill the well than afterwards.
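
As a sketch of what that hold-out test might look like, the snippet below (again with hypothetical numbers) compares a predicted modal composition against point-count data kept in reserve, using a simple mean absolute misfit as the pass/fail criterion:

```python
# Predicted composition from the mixing model, and petrographic point
# counts held back for testing. Hypothetical values.
predicted = {"quartz": 0.595, "plagioclase": 0.2525, "k_feldspar": 0.1525}
observed = {"quartz": 0.62, "plagioclase": 0.24, "k_feldspar": 0.14}

def mean_absolute_misfit(pred, obs):
    """Average absolute difference in modal fraction across minerals."""
    return sum(abs(pred[m] - obs[m]) for m in pred) / len(pred)

misfit = mean_absolute_misfit(predicted, observed)
print(f"mean absolute misfit: {misfit:.3f}")  # 0.017 with these numbers

# A crude acceptance threshold: if the held-back data sit too far from
# the prediction, go back and iterate on the drainage and flux inputs.
if misfit > 0.05:
    print("Prediction fails the hold-out test: iterate the model.")
```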

Above: One output of a hypothetical source-to-sink workflow, with mineral composition predictions from given hinterland zones. Multiple datasets can be used to give a qualitative estimate of reservoir potential, and then mixed into sediment packages. Image credit: Petryx Ltd (CC BY-SA 4.0)

Now we can start to iterate. It is OK to get it wrong as long as we find out why. Often the predictions will need to be refined and improved to reflect new data. Once we have a model which we are happy with we can take our outputs and make some quantitative statements about the subsurface. Often a report or study will stop here and not fully translate a working model into useful quantitative data, as if it is enough to say, “We have a model which explains the data well”. Let’s take that hypothetical model and share our insights with our colleagues:

  • Thermochronological data suggest up to 6 km of exhumed lithosphere over our target interval.
  • Uplift rates around our target interval suggest a significant influx of clastic material into the area around our block (15 km³) derived from hinterland A (65% certainty).
  • Siliciclastic packages in package α are likely to be composed of 55% quartz, 25% plagioclase and 20% K-feldspar.
  • Package β, whilst relatively significant in the sediment pile, is likely to be largely (85%) derived from hinterland C, so could be problematic. This may mean that where it is interleaved with package α it risks washing in diagenetic fluids.
  • Package γ appears to be a distal equivalent of package α, but also contains chemical sediments. It is likely to contain up to 6% chert.
  • Package ζ is a regional seal with an expected clay content of 30%.
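
Numbers like the “65% certainty” above have to come from somewhere. One simple, illustrative way to produce them (our own hypothetical sketch, not a description of a specific Petryx workflow) is to propagate the uncertainty on the hinterland flux estimates with a Monte Carlo draw:

```python
import random

def simulate_dominance(n=10_000):
    """Fraction of trials in which hinterland A supplies >50% of sediment."""
    hits = 0
    for _ in range(n):
        # Best-estimate fluxes with assumed Gaussian uncertainty; clamped
        # to keep the draws physically meaningful (non-negative).
        flux_a = max(random.gauss(0.65, 0.10), 1e-6)
        flux_b = max(random.gauss(0.35, 0.10), 1e-6)
        if flux_a / (flux_a + flux_b) > 0.5:
            hits += 1
    return hits / n

# With these illustrative spreads the answer is a high confidence (~98%);
# wider spreads on the inputs would pull the certainty down accordingly.
print(f"P(hinterland A dominates) = {simulate_dominance():.0%}")
```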

Finally, we want to say loud and clear what the sensitivities and testable parts of the model are. So if someone finds some data which doesn’t fit, we can iterate our model further.

  • Paleoclimatic conditions indicate high potential for sand cleaning and improved reservoir quality. Lower weathering rates and more stable climatic conditions would result in less chemical abrasion and reduced sand cleaning.
  • Little compositional data were available for the upper part of our target zone, resulting in a heavy reliance on U-Pb data, which can be blind to mafic hinterlands.
  • We would expect few or no metamorphic minerals in package α.

To quote the statistician George Box, who we think had it about right: “All models are wrong, but some are useful”. Hopefully this idealised workflow can help you get closer to that useful model and understand what these datasets should be able to tell us. If you are interested in source-to-sink modelling, or want to discuss further, drop a comment below or head over to the short survey we will be promoting in the coming weeks. We think we have some great workflows, but your opinions can really help us make them more relevant to you.

George Box. Image credit: DavidMCEddy at en.wikipedia (CC BY-SA 3.0).

By Dr Lorin Davies

PRESS RELEASE: New international geoscience consultancy announced

Petryx Ltd is a new company set up to meet the growing need for modern data science capabilities in the Oil & Gas industry. The company is based in North Wales in the UK and serves Oil & Gas operators worldwide.

Dr Lorin Davies, Managing Director of Petryx Ltd, said: “The Oil & Gas industry has fundamentally changed since the heyday of the 1990s and 00s. Our customers are modern organisations who look to the service sector for innovative answers to their challenges. We launched Petryx to meet these challenges and to bring technologies developed for the internet software industry to the upstream industry.

The products and services in development at Petryx are truly exciting and will allow our clients to better understand their own data, as well as the answers buried in academic research.”

Petryx offers built-for-purpose data products and services. Petryx data products are designed to be light, responsive and perfectly optimised for clients’ needs. As scientists in the Oil & Gas industry tackle new hurdles with more marginal fields, in deeper water and with smaller teams, the skills and expertise offered by Petryx will help them to make better decisions.