Unlike flow-physics-based prediction methods such as reservoir simulation, reservoir analogue studies rely on the experience and knowledge of skilled reservoir engineers. In this work, we present a new workflow that replaces human knowledge-based analogue studies with data analytics and machine learning techniques.
First, we collect reservoir properties, development parameters, and historical recovery data for 1381 actual U.S. oilfields from the Tertiary Oil Recovery Information System (TORIS) of the U.S. Department of Energy. We then clean the data extensively to handle outliers and missing values. Next, we identify the factors that most strongly determine recovery factor, and we use single-variable and bi-variable analysis to understand the relationship between recovery factor and these factors. Finally, we train an Artificial Neural Network (ANN) model to predict recovery factors.
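The cleaning step can be illustrated with a minimal sketch. The text does not state which outlier or missing-value rules were applied, so the median imputation and the 1.5 x IQR fence below are assumptions chosen for illustration, and the porosity values are hypothetical rather than TORIS data.

```python
import statistics

def impute_median(values):
    """Replace None entries with the median of the observed values (assumed rule)."""
    observed = [v for v in values if v is not None]
    med = statistics.median(observed)
    return [med if v is None else v for v in values]

def iqr_bounds(values, k=1.5):
    """Return (low, high) fences placed k * IQR beyond the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def flag_outliers(values, k=1.5):
    """Return a list of booleans, True where a value falls outside the IQR fences."""
    low, high = iqr_bounds(values, k)
    return [not (low <= v <= high) for v in values]

# Hypothetical porosity values (fraction); None marks a missing entry.
porosity = [0.12, 0.18, None, 0.21, 0.15, 0.95, 0.17, 0.19]
filled = impute_median(porosity)
flags = flag_outliers(filled)
```

Here the missing entry is filled with the median of the observed values, and only the implausible 0.95 is flagged. In practice, a flagged value would be reviewed or dropped according to whatever rule the study actually adopted.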
We find that recovery factor depends mostly on 19 principal factors, reduced from more than 70 properties originally in TORIS. We randomly select 90% of the oilfields as the training set for machine learning. The predictive accuracy of the methodology is tested by forecasting recovery for the remaining 10% of the oilfields and comparing the forecasts with the actual recovery factors. The average error in recovery factor predicted by the trained ANN model is about 10%. Overall, the methodology performs well as a computer-assisted analogue study and requires minimal human knowledge and hands-on work to study these 1381 oilfields.
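The hold-out evaluation can be sketched as follows. This is a minimal illustration, not the study's code: the function names, the fixed seed, and the choice of mean absolute error as the "average error" are assumptions, since the text does not define the error metric. The ANN itself is omitted; any trained model would supply the predicted values.

```python
import random

def split_fields(field_ids, test_fraction=0.10, seed=0):
    """Randomly split field identifiers into (train, test) lists."""
    ids = list(field_ids)
    random.Random(seed).shuffle(ids)  # seeded for reproducibility (assumption)
    n_test = round(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]

def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted recovery factors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# 1381 is the number of oilfields stated in the text; the IDs are placeholders.
train_ids, test_ids = split_fields(range(1381))

# Hypothetical actual vs. predicted recovery factors (fractions) for two fields.
example_error = mean_absolute_error([0.30, 0.40], [0.25, 0.45])
```

With 1381 fields, this split holds out 138 fields for testing and keeps 1243 for training. If the reported 10% is a relative error instead, the metric would divide each absolute difference by the actual recovery factor before averaging.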
This work provides a new workflow that uses data analytics and machine learning techniques for reservoir analogue studies. Reservoir engineering software built systematically on this methodology can serve as a more efficient and accurate predictive tool for studying reservoir analogues.