Big data has become a major topic in many industries. More recently, the oil and gas industry has taken a particular interest in data science as public-domain and commercial databases have become increasingly available. Collecting and processing such data can support better decision-making. The aim of this work is to demonstrate, by example, methodologies for collecting and utilizing big data to improve decisions in the oil and gas industry.
This paper was motivated by a review of papers and books on the applications of data analysis in the oil and gas industry and in other industries. Drawing on the authors' expertise in data analysis, it demonstrates real examples of data processing and validation workflows. The work is intended to fill a gap in the literature, where many publications discuss only the importance of data-driven analytics.
This paper provides an overview of the diverse, high-volume data-generating sources in the oil and gas industry, from the exploration phase to the end of the well's lifecycle. It uses a public-domain database (FracFocus) as an example and demonstrates a step-by-step workflow for collecting and processing the data according to the objective of the analytics. Two real examples, one of descriptive and one of predictive analytics, are also presented to show the value of combining diverse databases from multiple sources. A data validation and preparation framework is also shown, illustrating data quality checks combined with best practices for data cleansing and outlier detection.
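As a rough illustration of what a data quality check followed by outlier detection can look like, the sketch below rejects records with missing or non-positive values and then applies Tukey's interquartile-range (IQR) rule. It is a minimal sketch under stated assumptions and not the framework used in this paper: the function names, the field name "water_volume", and the sample values are hypothetical and do not reflect the actual FracFocus schema.

```python
from statistics import quantiles


def iqr_fences(values, k=1.5):
    """Return the (lower, upper) Tukey fences for a sequence of numbers."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def validate_records(records, field):
    """Split records into (clean, rejected) using a quality check and an IQR rule."""
    valid, rejected = [], []
    for rec in records:
        value = rec.get(field)
        # Quality check: the field must be present, numeric, and positive.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if is_number and value > 0:
            valid.append(rec)
        else:
            rejected.append(rec)
    if len(valid) < 2:
        # Too few points to estimate quartiles; skip the outlier step.
        return valid, rejected
    low, high = iqr_fences([rec[field] for rec in valid])
    clean = []
    for rec in valid:
        # Outlier check: keep values inside the fences, reject the rest.
        (clean if low <= rec[field] <= high else rejected).append(rec)
    return clean, rejected


# Hypothetical records; "water_volume" is an illustrative field name only.
sample_values = [100, 110, 120, 130, 140, 10000, None, -5]
records = [{"well": i, "water_volume": v} for i, v in enumerate(sample_values)]
clean, rejected = validate_records(records, "water_volume")
```

In this sketch the quality check runs before the outlier step so that missing or invalid entries do not distort the quartile estimates. The multiplier k = 1.5 is the conventional default for Tukey fences and would normally be tuned to the data at hand.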
This paper provides a clear methodology for applying data analysis successfully, which can serve as a guide for future data analysis applications in the oil and gas industry.