Analyzing databases using data analytics





Lee, Boum Hee




Many public and private databases of oil field properties exist, and their analysis could yield insights in several areas. The recent rise of Big Data has produced novel analytic methods for handling multidimensional data effectively and for visualizing it to discover new patterns. The main objective of this research is to apply some of these data analytics methods to datasets of reservoir data.

Using a commercial reservoir properties database, we created and tested three models to predict ultimate oil and gas recovery efficiencies, each based on a method borrowed from data analytics: linear regression, linear regression with feature selection, and a Bayesian network. We also adopted similarity ranking with principal component analysis to create a reservoir analog recommender system, which recognizes and ranks reservoir analogs from the database.

Among the models designed to estimate recovery factors, the linear regression model built on variables chosen by sequential feature selection performed best, showing a strong positive correlation between actual and predicted reservoir recovery efficiencies. By comparison, the Bayesian network model and the simple linear regression model performed poorly.
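To make the idea of sequential feature selection concrete, the following is a minimal sketch of a greedy forward variant wrapped around ordinary least squares. It is an illustration only, not the procedure used in this research: the data are synthetic, the function names (`forward_select`, `cv_mse`, `fit_ols`) are hypothetical, and the stopping rule (stop when cross-validated error no longer improves) is an assumption.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_ols(X, y):
    """Least-squares fit with an intercept, via the normal equations."""
    Z = [[1.0] + row for row in X]
    p = len(Z[0])
    A = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    b = [sum(z[i] * t for z, t in zip(Z, y)) for i in range(p)]
    return solve(A, b)

def predict(beta, row):
    return beta[0] + sum(w * v for w, v in zip(beta[1:], row))

def cv_mse(X, y, cols, k=5):
    """k-fold cross-validated mean squared error using only `cols`."""
    n = len(y)
    err = 0.0
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        beta = fit_ols([[X[i][c] for c in cols] for i in train],
                       [y[i] for i in train])
        err += sum((predict(beta, [X[i][c] for c in cols]) - y[i]) ** 2
                   for i in test)
    return err / n

def forward_select(X, y, k=5):
    """Add one feature at a time while cross-validated error improves."""
    remaining = list(range(len(X[0])))
    chosen, best = [], float("inf")
    while remaining:
        score, col = min((cv_mse(X, y, chosen + [c], k), c) for c in remaining)
        if score >= best:
            break
        chosen.append(col)
        remaining.remove(col)
        best = score
    return chosen, best

# Synthetic data (not reservoir data): the target depends only on
# columns 0 and 2; columns 1 and 3 are pure noise.
rng = random.Random(0)
X = [[rng.uniform(0, 1) for _ in range(4)] for _ in range(60)]
y = [3 * r[0] - 2 * r[2] + rng.gauss(0, 0.1) for r in X]

chosen, best = forward_select(X, y)
print("selected columns:", chosen, "CV MSE:", round(best, 4))
```

On data like this, the two informative columns are picked first; whether a noise column is also added depends on chance improvements in the cross-validated error, which is one reason the stopping rule matters in practice.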

For the reservoir analog recommender system, an arbitrary reservoir was selected and different distance metrics were used to rank its analog reservoirs. Because no single distance metric (and hence no single list of reservoir analogs) is superior to the others, the reservoirs in the recommended lists are compared alongside the characteristics of the distance metrics.
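The ranking step can be sketched as follows. This is a simplified illustration under stated assumptions, not the system described here: the reservoir names and property values are made up, the three metrics are chosen only as common examples, and the principal component projection is omitted so that distances are computed directly on standardized properties.

```python
import math

def standardize(rows):
    """Scale each column to zero mean and unit standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine(a, b):
    """Cosine distance: 1 minus the cosine of the angle between a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)

def rank_analogs(names, rows, target, metric, top=3):
    """Return the `top` reservoirs closest to `target` under `metric`."""
    z = dict(zip(names, standardize(rows)))
    others = [n for n in names if n != target]
    return sorted(others, key=lambda n: metric(z[target], z[n]))[:top]

# Hypothetical reservoirs: (porosity, permeability in mD, depth in m).
names = ["A", "B", "C", "D", "E"]
rows = [
    [0.20, 100.0, 2000.0],
    [0.21, 110.0, 2050.0],
    [0.10, 5.0, 3500.0],
    [0.30, 900.0, 1200.0],
    [0.15, 40.0, 2800.0],
]

metrics = {"euclidean": euclidean, "manhattan": manhattan, "cosine": cosine}
rankings = {name: rank_analogs(names, rows, "A", fn)
            for name, fn in metrics.items()}
for name, ranked in rankings.items():
    print(name, ranked)
```

Euclidean and Manhattan distance respond to the size of the differences between properties, whereas cosine distance responds only to the direction of the standardized property vector, so the lists they produce need not agree.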

