Drivers of imbalance in machine learning uptake across geology and geophysics

Our Data Science Director, Samuel Fielding, and Managing Director, Lorin Davies, recently authored an article for the Machine Learning special issue of First Break this September. In it, they use machine learning to interrogate the factors driving its uptake in various geoscience domains.

September 2019

Topics covered include visualizing discipline-specific drivers of machine learning uptake (using machine learning), and exploring different ways to facilitate machine learning in undersaturated disciplines within the geosciences.

Machine Learning augments our acquisition, preparation, and interpretation of the data, so our next great challenge is to unlock these advancements to realise the next great productivity leap in data analysis.

EAGE members can read the full article here or anyone can read the abstract here.


Tales from Texas

Back in May, Lorin and I headed across the Atlantic to visit friends (new and old) over in Texas. We planned to spend a week in Houston in the run-up to AAPG ACE 2019, followed by a week in San Antonio at the conference.

It was a little different to your usual business trip. Being a start-up meant we had much more flexibility in terms of our travel and accommodation. For starters, we were able to choose from hundreds of Airbnbs and take a few diversions, adding in a little road trip here and there.

Nonetheless, we had a few hiccups along the way. I had the bright idea of printing my AAPG poster while we were in Houston to save bringing it over on the plane. Apparently this isn’t so straightforward (even if you put your order through and everything looks hunky-dory!) and I strongly advise against trying to print an 8x4ft poster at Office Depot! We ended up having to adapt the design and print it in three 2ft wide panels. It actually worked out as quite a nice way to break up the poster and guide the reader through the content – every cloud and all that.

Watercolour by Jim Koehn

After a week spent in Houston visiting potential clients and sampling a local ‘Ice House’ (above), we drove west over to San Antonio. We arrived at our lovely little Airbnb a few days before the conference and gave ourselves some time off to relax (below) or should I say explore.

Our San Antonio Airbnb

After contemplating numerous road trip options (a particularly adventurous one of Lorin’s was to attempt to make the 16 hour drive to the Grand Canyon) we decided on a relatively short jaunt to the Mexican border city of Laredo.

Here, we thought we would just nip in to Mexico, get a stamp in our passports and head home after sampling some authentic Mexican food and culture. It was our intention to walk over the Rio Grande bridge into Mexico as foot passengers instead of going by car, which is not uncommon for tourists to do.

Our first port of call when we arrived at Laredo was lunch. We stopped at ‘El Maison De San Agustin’ and I proceeded to order quesadillas for the millionth time on our trip thus far and drank a vat of hibiscus ice tea. Following said lunch, we were both stuffed and ready for our walk across the bridge.

El Maison De San Agustin

Crossing over into Mexico was predictably a lot faster than on the way back. As soon as we were over the border/bridge we had numerous offers of dentistry services!? We had a bit of a wander around, took some photos and then began to head back over the bridge. We didn’t have to queue for long to get back across but were a little disappointed that we didn’t actually get a stamp in our passports!

The Mexico-US border in Laredo over the Rio Grande.

Back in San Antonio a few days later, we kicked off the first day of the AAPG ACE conference at the Henry B. Gonzalez Convention Centre with the ice breaker reception where we bumped into many old friends and colleagues and put some names to new faces.

AAPG ACE Ice breaker reception

The conference had a packed schedule with topics ranging from unconventional reservoir characterisation, deepwater sedimentology and machine learning to not one but two sessions on source-to-sink (a personal favourite of mine!). I spent Monday afternoon listening to talks from the “Fluvial and Deltaic Depositional Environments: Reservoir Characterisation and Prediction From Multiple Scale Analyses” theme. John Holbrook touched upon braided versus meandering systems and whether or not it is actually that black and white when it comes down to it. James Mullins discussed automated workflows for reservoir modelling using drone data and libraries of photo analogue systems, and Margaret Pataki presented her work on Rapid Reservoir Modelling (RRM) using sketch-based models.

AAPG ACE 2019 Exhibition Hall and Poster Sessions

Tuesday morning was spent switching between the source-to-sink sessions and “Multi-Disciplinary Integration for Subsurface Efforts in the Age of Big Data”. There were lots of interesting provenance and sediment supply studies interspersed with sediment routing and recycling in the source-to-sink sessions, involving talks from Peter Burgess and Jinyo Zhang. The integration and big data session touched upon multi-disciplinary approaches to subsurface studies, including the use of petrography, geochronology and biostratigraphy data in order to produce more accurate chronostratigraphic interpretations when correlating wells, presented by Eugene Syzmanski. It seems that the value of geochronology is gaining more and more momentum in the oil and gas industry as people realise that multi-disciplinary approaches lead to more robust interpretations.

That afternoon was our poster session where we presented on “A Source-to-Sink Reservoir Quality Prediction Workflow: The Offshore Nile Delta”. It was great to see such a busy poster session (I’m guessing partly due to the proximity of the exhibition hall and free beer at the end of the day!) and we ended up chatting to lots of new and interesting folks about the work that we are doing at Petryx and how it can add value to real world geological problems – particularly with regards to taking the leg work out of data collection and standardisation. All in all, we were stood by the poster for about 4 hours and received lots of positive and constructive feedback.

Extended abstracts to be published online soon!

I also had judging commitments that afternoon and headed over to the poster session on “The Digital Transformation in the Geosciences”. There were loads of fantastic posters, all presented exceptionally well, with some great visualisations of some pretty cutting-edge data science.

Subjects included petrophysical facies classifications using neural networks, the use of clustering techniques to define chemofacies and the application of decision trees to determine failure modes.

In addition to all the technical talks, there were also some great sessions focussing on sustainability and a DEG special session on the environmental impact of the oil and gas industry. Discussions centred on how geology will remain key in the transition to a more carbon-neutral society, with the increasing importance of practices such as carbon capture and storage and geothermal energy. Consensus was that we need to stop thinking of ourselves as the bad guys and to start realising the potential that we have as an industry to help with the climate change challenges that lie ahead. This will also be crucial in continuing to attract bright and ambitious talent to the oil industry in order to help us adapt to the digital transformation culture and the ‘big crew change’ that we see on the horizon.

Unfortunately, Tuesday was also the last day of the conference for us as we had to head back to Houston for a last-minute meeting before catching our flight home. All in all, Lorin and I had a great trip gathering lots of feedback and meeting lots of new faces, and we are looking forward to heading back to AAPG next year in Houston.

Next week Lorin will be writing an article all about our week in the Start-up area of this year’s EAGE Annual in London.

PRESS RELEASE: Petryx and Getech announce database partnership

Petryx Ltd have partnered with Getech Plc, allowing customers in the Oil & Gas industry to view Petryx Database coverage alongside Getech data products. Covering every major continental mass in the world, the Petryx Database offers hinterland data essential to data-driven source-to-sink interpretations.  

Managing Director of Petryx Ltd, Lorin Davies says: “We are delighted to have partnered with Getech. This brokerage deal provides much higher visibility of Petryx datasets to Oil & Gas explorers and demonstrates Getech’s commitment to innovation. Deals like this emphasise the many ways in which the Oil & Gas industry can support innovative start-ups like ours.” 

Senior GIS Consultant, Thierry Gregorius says: “We are excited to add this valuable new data resource to the extensive product range that we offer our customers. Explorers and geoscientists will now be able to access global datasets from Petryx via Getech, including geochemistry, geo- and thermochronology, and more. With this step we are offering our customers a growing one-stop shop.” 

Petryx Ltd is an innovative digital geoscience start-up who have revolutionised data integration and acquisition processes, giving Oil & Gas explorers fast access to cleaned and standardised datasets. The Petryx Database is a multi-disciplinary geoscience dataset compiled from numerous sources with the collaboration of industry and academic partners. It offers an unrivalled view of the composition of the earth from a single unified platform. 

Petryx at PETEX

This week we have been showing off our wares at the PESGB YP Summit and PETEX conference. If you haven’t had the opportunity to chat to us already, or found one of our web-app postcards then drop me a line.

We are showcasing two unique offerings this week. The first is our web-app. The web-app is a demonstration tool which shows the kind of bespoke coded solution we can provide. It is particularly useful for quick-look analyses or when conducting routine plotting or charting tasks and can probably save you a load of time. We dropped in some XRD data, but we can build tools like this for you quickly and easily for virtually any geoscience dataset. Get to the web-app by clicking here.

Secondly, we are showing the Petryx Database. With an estimated 80% of data science projects devoted to aggregating and cleaning data, the Petryx Database offers a global geoscience data resource unequalled in the market. What’s more, Petryx offer a one-off purchase commercial model, so you needn’t buy the same data from anyone year after year.

If we missed you, or you want to hear more about our data or services then drop us a line!

Lorin Davies

Less isn’t always more. Sometimes it’s less.

Welcome to the third instalment of the Petryx Blog series! After @SamFielding confessed that his programming know-how and obsession with automation were fuelled by his laziness, I will be discussing how programming isn’t just a way to save time, but a fantastic tool when it comes to making sense of large unwieldy compilations of geoscience data.

Large, multi-technique datasets (notice how I didn’t use the term big data, such a wildly overused and little understood phrase in our book) are becoming more common throughout geoscience. This is partly due to the integration of an ever-increasing number of disciplines, but mostly due to the sheer amount of raw data available to us when carrying out analyses using several different techniques on a single project.

So why exactly do we need multi-technique studies? Sometimes a single dataset will give you the specific answer you require. However, if you have a small sample size or if you are faced with a particularly complex problem, you’ll likely need to incorporate data from other methods to meet these requirements. This is where a multi-technique approach can really add value.

Let’s take provenance studies for example. In my own work I have often found U-Pb zircon geochronology to be invaluable due to zircon’s robust nature and ubiquity regardless of tectonic setting. However, when I was faced with a reduced sample size, multiple episodes of recycling and the need to record multiple geothermal events, this method on its own was not enough. The same is true for other methods too, like heavy mineral analyses or fission track, which, while great under certain circumstances, can both prove insufficient depending on what information you are trying to glean from your data.

 

Using a multi-technique dataset can be an art form, but like any other art, it’s quality, not quantity, that counts. More is not necessarily better in all cases. Choosing which techniques to use and in which combination to use them requires careful consideration and knowledge of what each technique brings to the table; something we at Petryx pride ourselves on. When done right this delicate balancing act can reveal vital information and shed light on areas that would have been otherwise obscured.

Working on multi-disciplinary projects in both academia and industry over the past 10 years has exposed me to an ever-changing array of data generated from numerous different techniques. Dealing with this constant stream of datasets often means folders full of spreadsheets which must be organised and worked through before any useful analyses can take place or any meaningful conclusions can be drawn. Simply handling multi-technique data can prove a mammoth and intimidating task which gets in the way of what really needs doing.

Here at Petryx we want to make analysing data, and getting the answers you need, easier. Our Petryx Database provides an extensive, ever-growing, compilation of data which can be easily accessed and queried from the Petryx Data Lens web front-end. Designed and developed by geoscientists and data science experts, the Database and Data Lens provide valuable tools which aim to minimise the amount of time our clients need to spend on data management. The Data Lens allows users to query, visualise, and pull together their own data alongside those from peer-reviewed datasets. Most importantly, our Data Lens isn’t a generic data handling platform, but is instead a specialist tool for geoscientists and the specific applications and problems they encounter.

In the early stages of any project, knowing what types of data are available for your area of interest can help you to plan more effectively. You may have a penchant for thermochronology, but if fission track data is looking a bit thin on the ground in your study area, and you have a limited budget, then you might want to consider either tapping into some other methods that have better coverage or enquiring about our data acquisition service.

Perhaps you are further along in your research and find yourself needing to compare the results of several different datasets to a multitude of signatures from potential source areas. Instead of laboriously plotting them individually using different tools, why not check out the different signatures on the fly, or filter through thousands of rows of data in seconds to find similar samples that could be the source of your clastic reservoir at that crucial interval? Here at Petryx, we can help you do it. By having a small team of geoscientists and data scientists working closely together, we’ve created a product that takes the best bits from both worlds and puts you in control without all the hard work, allowing you to make the most out of your all-encompassing multi-technique dataset.

By Laura Fielding, Geoscience Director

Data science for better living

I have a confession to make. I’m lazy. Most of my working day is dedicated to finding neat ways of making my life easier. That can’t be a bad thing, right? Being lazy seems to go hand in hand with a desire to learn computer programming. And programming can certainly make your life easier, often by several orders of magnitude!

As Data Science Director at Petryx my mission is to lighten the load; to let you breeze through some of the everyday things that are necessary but, let’s face it, often stressful, repetitive, or dull. Humans have so much more to offer than being stuck doing the repetitive tasks a computer could easily perform. Also, life’s too short, and we want you to relax a little and enjoy it.

We want to help you work smart, not hard by:

Making the things you need to do easier and faster. Do boring but important things in a fraction of the time, then do something more important, or more fun, instead.
Helping the things you do to be more reproducible. Make it easy to perform exactly the same process over and over again.
Improving the usability and accuracy of results. Make your analyses more useful and allow them to stand the test of time.

At Petryx, we are striving to address some key problems faced by Oil & Gas and other data-reliant industries, starting with…

“Where is the data and why is it in such a mess?”

Our Petryx Database is designed from the ground up to be a single point of entry to cleaned and standardised data that was previously poorly connected and poorly constrained. These data can be mashed up from a mix of both non-exclusive and proprietary sources so can be used by academic institutions and businesses alike. The creation of the Petryx Database seeks to address one of the largest problems in many industries today by bringing our clients data that is ready for their needs and in a clean, easy-to-digest format.

The Petryx Data Lens web tool has been developed to let users more easily access, query, and analyse that data. It is a natural complement to the underlying database and provides access for users not wanting to query data programmatically (e.g. using SQL).
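For anyone curious what querying data programmatically looks like in practice, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names and rows are invented purely for illustration; they are an assumption and do not reflect the actual Petryx Database schema or contents.

```python
import sqlite3

# Hypothetical table layout and made-up rows, for illustration only.
ROWS = [
    ("S1", "U-Pb zircon", "Basin A"),
    ("S2", "fission track", "Basin A"),
    ("S3", "U-Pb zircon", "Basin B"),
    ("S4", "heavy minerals", "Basin A"),
]


def count_by_technique(rows, region):
    """Return {technique: sample count} for one region using a SQL query."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        conn.execute(
            "CREATE TABLE samples (sample_id TEXT, technique TEXT, region TEXT)"
        )
        conn.executemany("INSERT INTO samples VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT technique, COUNT(*) FROM samples "
            "WHERE region = ? GROUP BY technique ORDER BY technique",
            (region,),
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()


if __name__ == "__main__":
    print(count_by_technique(ROWS, "Basin A"))
```

A question like "which techniques have coverage in my area?" becomes a single query that can be rerun whenever the data changes. A point-and-click front-end aims to give the same kind of answer without writing the SQL.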

Simply having all of your data in the same place and in a ready-to-use format is a necessity before you can start automating any analysis or other downstream data-centric tasks. With the Petryx Database and Data Lens we aim to provide this essential but often overlooked platform.

Talking of automating tasks brings us to the next problem we want to address…

“I have to do that how many times!?”

Almost all the tasks we perform throughout the working day – moving data around, making a chart, writing a report, etc. – have a repetitive element which could be done more easily by a computer. To someone who isn’t tuned to thinking like a computer these programmable elements may not be readily apparent. However, not only could a computer perform any of these tasks for you, it can often perform them almost instantaneously.

Performing tasks manually has another downside – it’s not easily reproducible. If you want someone else to pick up where you left off, you’d likely write down some instructions – more manual work that we could do without! Writing code so that a computer can perform these tasks for you can almost completely eliminate this problem.
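As a small illustration of that idea, the sketch below turns a routine "summarise this column" job into a function that gives the same answer every time it is run. The CSV content and the column name quartz_pct are made up for the example and stand in for whatever spreadsheet export you might have.

```python
import csv
import io
import statistics

# Made-up CSV content standing in for a spreadsheet export (illustration only).
EXAMPLE_CSV = """sample_id,quartz_pct
S1,62.0
S2,58.5
S3,71.5
"""


def summarise_column(csv_file, column):
    """Read a CSV file object and return count, mean, min and max of one column."""
    values = [float(row[column]) for row in csv.DictReader(csv_file)]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
    }


if __name__ == "__main__":
    print(summarise_column(io.StringIO(EXAMPLE_CSV), "quartz_pct"))
```

The code doubles as the instructions: a colleague can rerun it on a new file by passing in a file opened with open() instead of the in-memory example.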

At Petryx we build products and offer services that make everyday tasks more efficient and more reproducible. We want to allow the computer to pick up some of the slack, removing headaches and bottlenecks from your workflows.

The philosophy of using computing power to take the burden off performing manual tasks is central to how we run our business, and to the skills we develop and promote. We use programming languages like R, Python and SQL to help make our workflows more efficient and more reproducible on a daily basis, and we want to help others do the same.

Finally, even if you’ve gathered all your data into the same place, and automated all of your repetitive processes, you’re going to want to analyse it. This brings us to the third problem…

“I don’t have a program that can do that”

Often, many of us don’t have the tools to get the most out of our data. For instance, if you don’t have access to the most appropriate statistical or charting techniques, and instead have to rely on analysing your data in spreadsheets, you are at a significant disadvantage. Even a simple statistical analysis (e.g. a t-test) is remarkably difficult using just a spreadsheet. Don’t get me wrong, spreadsheets are great for many things, but for doing data analysis they are only one step up from a pen, paper, and calculator. Having to use suboptimal methods can lead to erroneous or poorly defined conclusions. This can result not only in direct financial loss but may also force you to redo analyses. More time wasted doing things none of us should have to do!
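To give a flavour of how little code a t-test needs, here is a rough sketch of Welch's two-sample t statistic written with only the Python standard library. The function name and the two samples are hypothetical, chosen for illustration, and the sketch stops short of computing a p-value.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and approximate degrees of freedom."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # Sample variances (n - 1 denominator) divided by the sample sizes.
    se_a = statistics.variance(sample_a) / len(sample_a)
    se_b = statistics.variance(sample_b) / len(sample_b)
    t_stat = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    dof = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(sample_a) - 1) + se_b ** 2 / (len(sample_b) - 1)
    )
    return t_stat, dof


if __name__ == "__main__":
    # Made-up measurements for two groups of samples (illustration only).
    group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
    group_b = [2.0, 3.0, 4.0, 5.0, 6.0]
    print(welch_t(group_a, group_b))
```

In day-to-day work you would more likely reach for a library routine, such as scipy.stats.ttest_ind with equal_var=False in Python or t.test in R, which also report the p-value.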

As regular users of open source programming languages R & Python, Petryx has access to the cutting edge in statistical analysis, data manipulation, data visualisation, and machine learning techniques. For example, many statistical libraries are only available in R or Python. Having powerful tools at our disposal ensures any analysis and interpretation we do maximises the value we get out of our data. It also allows us to generate the most accurate reflection of reality possible with the data available.

Not only are we using tools like R to help us internally with the work we do for others, but we’re also always on the lookout for ways to help you make the most of these tools as well.

The future

Over the last 10 years the Oil & Gas and related industries have certainly moved further down the road to enlightenment. Databases are now often used instead of storing data in files. Computing power is preferred over human power for mundane, repetitive tasks.

However, we still have a way to go.

Petryx are ready to help our partners take the next step in that journey. By using tools at the cutting edge of data storage, processing, visualisation and analysis, we want to give humans more time to do what they’re good at.

By Sam Fielding, Data Science Director

A new dining experience for geoscience data consumers

This week we launched Petryx Ltd, a new company with a new vision for digital services for the Oil & Gas industry. This blog post is my introduction to the company and an opportunity for me to articulate what we are doing. It will be followed in the coming days by blog posts from my fellow directors Laura Fielding and Sam Fielding who will be giving their perspectives on what we have to offer the geoscience domain.

After a challenging few years, many Oil & Gas companies are emerging from the downturn as smaller, more responsive organisations. Whereas before a team of dozens or hundreds of explorers would have worked up prospects across the globe, now just a few skilled geoscientists must assess where opportunities lie, and work to understand the subsurface. Whilst all of this has been happening, products offered by the service industry have moved ever closer to long-term subscription models. These streamlined enterprises face many challenges, and we at Petryx are here to meet them.

In order to do the job of exploration, operators therefore have to work differently, and whilst much of this change in pace and tactics must come from within, we believe that it is the role of service companies to reflect the new face of geoscience exploration in the solutions we deliver. A trend towards episodic projects has taken precedence over multi-year programmes, and now more than ever our clients must answer their technical challenges in less time. As a company, Petryx offers a portfolio of innovative new data solutions and services, which, quite simply, help our clients to get to the answers they need faster.

In a few days’ time, Laura Fielding, Geoscience Director at Petryx, will be discussing how multidisciplinary datasets are invaluable in cracking subsurface puzzles, and this is exactly what we will be helping our clients to do with our first commercial offering. We are designing a single, connected, multidisciplinary database of geoscience data, the Petryx Database. Hosted in the cloud, this database may be mirrored by our clients directly into their private data clouds, so they can leverage the data themselves; or, if they prefer, accessed through a built-for-purpose web application which allows the data to be queried and interpreted on the fly. Our clients will be able to cross-examine their own data directly against the Petryx Database, using the web application as an interpretative tool, or upload newly acquired datasets directly into their own instance of the database.

Perhaps most importantly, the web applications and interpretative tools offered with the Petryx Database are free, and the data in our database is offered as a one-off purchase rather than on a subscription basis, so that our clients have more control and are better able to manage their budgets, focusing resources on the areas which count. We understand the needs of our clients and believe that the tools we offer and the infrastructure for serving Petryx data are our responsibility, just like a restaurant is responsible for paying the rent and providing diners with a knife and fork. As long as our customers keep buying meals, we’ll keep our side of the bargain and keep the lights on.

By Lorin Davies, Managing Director