# Browsing by Subject "Data analysis"

Now showing 1 - 9 of 9


## A data analytics framework for analyzing the effect of frac hits on parent well production (2021-07-24)
Authors: Guo, Yifei; van Oort, Eric; Ashok, Pradeepkumar

Well interference, commonly referred to as frac hits, has become a significant factor affecting production in fractured horizontal shale wells as infill drilling has increased in recent years. There is still no clear understanding of how frac hits affect production. This work develops a process to automatically identify the different types of frac hits and to determine the effects of stage-to-well distance and frac hit intensity on long-term parent well production. First, child well completions data and parent well pressure data are processed by a frac hit detection algorithm that automatically identifies frac hit intensities and durations within each stage; the algorithm classifies frac hits by the magnitude of the differential pressure spikes. The frac stage to parent well distance is also calculated. The daily production trend before and after each frac hit is then compared to determine the severity of its influence on production. Finally, any evident correlations between stage-to-well distance, frac hit intensity, and production change are identified and investigated. This work also introduces a novel pressure integration approach to better understand the prolonged production effects of sustained frac hits. The methodology was to first characterize every induced pressure change in a parent well (i.e., to determine frac-hit intensity) and subsequently to estimate the average daily production change after each frac hit, accomplished with a frac hit auto-detection algorithm and a log-log analysis of daily production. To add the effect of interference time (i.e., the time of the pressure change caused by the frac hit), a "PCI" metric was calculated for each parent well.
Finally, the effects of frac-hit intensity and interference time on parent well production were quantified through correlation analysis. This work uses three datasets covering 13 horizontal wells in the Bakken formation and 37 horizontal wells in the Eagle Ford Shale formation. The datasets include well trajectories, child well completions data, parent well pressure data, and parent well production data. The data analysis results include factors influencing frac hit intensity and production responses to different types of frac hits. The results also show that both frac hit intensity and interference time impact parent well production. This work not only adds new understanding to the study of frac hits, but also challenges some of the previous work.

## Application of space time concept in GIS for visualizing and analyzing travel survey data (2006-05)
Authors: Lu, Xiaoyun; Zhang, Ming

The classic time geography concept (the space-time path) provides a powerful framework for studying travel survey data, an important source for travel behavior studies. Based on the space-time concept, this research presents a visualization approach to analyzing travel survey data. By inputting the data into GIS software such as TransCAD and ArcGIS and editing the needed information, this study explains how to create 3D images of travel paths that show the variation of trip distribution in relation to different socio-economic factors deemed the driving forces of such patterns. The report also addresses the technical challenges involved in this kind of study and discusses directions for future research.

## Assessing the performance of a machine learning algorithm in identifying bubbles in dust emission (2019-01-30)
Authors: Xu, Duo; Offner, Stella

Stellar feedback created by radiation and winds from massive stars plays a significant role in both the physical and chemical evolution of molecular clouds.
This energy and momentum leave an identifiable signature ("bubbles") that affects the dynamics and structure of the cloud. Most bubble searches are performed "by eye", which is usually time-consuming, subjective, and difficult to calibrate. Automatic classifications based on machine learning make it possible to perform systematic, quantifiable, and repeatable searches for bubbles. We employ a previously developed machine learning algorithm, Brut, and quantitatively evaluate its performance in identifying bubbles using synthetic dust observations. We adopt magneto-hydrodynamic simulations, which model stellar winds launching within turbulent molecular clouds, as input for generating synthetic images. We use a publicly available three-dimensional dust continuum Monte Carlo radiative transfer code, HYPERION, to generate synthetic images of bubbles in three Spitzer bands (4.5 μm, 8 μm, and 24 μm). We designate half of our synthetic bubbles as a training set, which we use to train Brut along with citizen-science data from the Milky Way Project. We then assess Brut's accuracy using the remaining synthetic observations. We find that after retraining, Brut's performance increases significantly, and it is able to identify yellow bubbles, which are likely associated with B-type stars. Brut continues to perform well on previously identified high-score bubbles, and over 10% of the Milky Way Project bubbles are reclassified as high-confidence bubbles that were previously marginal or ambiguous detections in the Milky Way Project data.
We also investigate the effects of training set size, dust model, evolutionary stage, and background noise on bubble identification.

## Characterizing emerging urban transportation modes: models and methods (2020-12-07)
Authors: Zuniga Garcia, Natalia; Machemehl, Randy B.; Scott, James G.; Kockelman, Kara M.; Ruiz-Juri, Natalia; Claudel, Christian G.

The introduction of emerging transportation technologies, such as mobility-on-demand and shared modes, has caused disruptions in urban transportation systems. These services brought multiple challenges, including a lack of infrastructure, arbitrary pricing schemes, deficient operating rules and regulations, and safety concerns. Furthermore, the deployment of these technologies has increased the need and demand for improved management of the associated data. In particular, the volume of the collected information, the variability of data sources, the heterogeneous structure, and the inherent spatio-temporal nature pose challenges for finding spatial and temporal relationships, dealing with computational complexity, and integrating or fusing data from various sources. This research is motivated by the need for models and methods that can handle large-scale, diverse, spatio-temporal datasets in order to adequately characterize emerging mobility technologies and their potential impacts on urban environments. Specifically, it addresses three main points: (1) the impact of emerging mobility modes on urban areas is still unknown; (2) the effect of shared mobility services on public transit usage is unclear; and (3) when available, the data may present several challenges. This dissertation designs and applies models and methods to evaluate emerging mobility services' impacts on different aspects of urban areas. The impacts are analyzed using four distinct techniques based on advanced statistics and data analysis models and methods.
These techniques are applied to several data sources describing ridesourcing (i.e., ride-hailing via transportation network companies, or TNCs), microtransit (i.e., privately owned and operated shared transportation systems that can have fixed or flexible routes and schedules), micromobility (e.g., bikesharing and dockless electric scooters, or e-scooters), and public transit trips from Austin, Texas. The results show that the current fare system and pricing strategies can lead to disparities in TNC driver earnings. Temporal and spatial demand variations can exacerbate search frictions, which can cause an overall market failure. The results suggest that new pricing strategies are required and that there is a need for pricing regulations. A further examination of the ridesourcing effect on airport ground access using Intelligent Transportation Systems (ITS) data showed that the average airport-accessing speed decreases in the presence of TNCs; the use of ITS data is proposed to support airport decision-making. Finally, this study analyzed the integration of shared modes with the public transit system. Shared modes can complement public transportation systems (such as bus, train, and air) and solve first-mile-last-mile (FMLM) accessibility issues. However, the results suggest that this integration is not yet happening for TNC and microtransit modes. An analysis of the use of public-private partnerships (PPPs) to introduce shared modes in areas with low public transit demand suggests the service was mainly used for intrazonal trips rather than for FMLM access. Further analysis of the relationship between e-scooters and public transit identified areas with potential e-scooter and bus interaction.
The results suggest that future collaborations and PPPs should focus on integrating these mobility services into the public transit system.

## Classification of Hydrocarbon Recovery Factor Based on Reservoir Databases (2008-08)
Authors: Sharma, Aviral; Lake, Larry W.; Srinivasan, Sanjay

In this thesis, data analyses of oil and gas reservoir datasets are performed to derive deterministic and probabilistic values of the ultimate recovery factor for both oil and gas reservoirs. This is of interest for exploration because knowledge of a newly discovered reservoir is limited, and a proxy value of the recovery factor can serve as a guide during later flow simulations and help project the revenues the reservoir could generate. The deterministic models are based on multivariate linear regression. The probabilistic models calibrate the likelihood of the recovery factor using naïve Bayesian classification. For the oil reservoirs, classification accuracies of the recovery factor were compared using geological and engineering parameters. For the gas reservoirs, the Bayesian classifier was implemented by fitting a multivariate Gaussian distribution to the predictor variables. The linear regression model performed well compared to the empirical correlations given by Arps et al. (1967) and Guthrie et al. (1995). In gas reservoirs, good prediction was achieved by using recovery instead of recovery factor as the response function in the regression. The likelihood functions of the recovery factor for both gas and oil reservoirs are multimodal and non-Gaussian.
For the oil reservoirs, both geological and engineering parameters played an important role in predicting the recovery factor, which leads to the conclusion that the engineering and geological parameters are not independent.

## Data-driven methodologies for supporting decision-making in roadway safety and pavement management (2023-08)
Authors: Xu, Yang; Bhasin, Amit; Li, Jenny; Caldas, Carlos H.; Boyles, Stephen D.

There has been a significant rise in the use of data-driven methods in contemporary transportation engineering. This trend is primarily attributed to the limitations of experience-based methods, such as subjectivity and non-reproducibility. In contrast, data-driven methods offer a more objective and effective approach to problem analysis, providing decision-makers with a reliable basis for informed decisions. This research focuses on two types of data-driven methodologies: geostatistical analyses using geographic information systems (GIS) and cutting-edge algorithms associated with artificial intelligence (AI). In numerical analysis, data provide a means to gain valuable insights into a problem of interest. While AI-oriented methods have been shown in many studies to be more effective than traditional approaches, the accuracy of the analysis still depends heavily on the quality of the data. This dissertation sheds light on the pivotal role that data play in both roadway safety analysis and pavement management. To accomplish this, four distinct studies are proposed that examine different aspects of data-driven methods.
The studies encompass an evaluation of data consistency in motor vehicle crash databases, the identification of crash hot spots within a road network, a synthesis of advancements in the application of AI algorithms to various activities of pavement management, and an exploration of the relationship between pavement conditions and roadway safety using AI-oriented methods. The knowledge acquired from these studies serves as a foundation for future research, advancements, and the adoption of innovative approaches to improve the efficiency of safety analysis and pavement management. This research ultimately facilitates informed decision-making, effective resource allocation, and the implementation of cost-effective interventions to enhance roadway safety and optimize pavement management practices.

## Equipment data analysis study: failure time data modeling and analysis (2012-05)
Authors: Zhu, Chen; Popova, Elmira; Bickel, J. Eric

This report presents descriptive data analysis and failure time modeling that can be used to characterize the patterns of failure times. The descriptive data analysis includes the mean, median, 1st quartile, 3rd quartile, frequency, standard deviation, skewness, kurtosis, minimum, maximum, and range. Models including the exponential, gamma, normal, lognormal, Weibull, and log-logistic distributions have been studied for the failure time data. The data in this report come from the South Texas Project (STP) and were collected over the last 40 years. More than 1000 groups of STP failure time data were generated based on Mfg Part Number, and the top twelve groups were selected as the study group. For each group, the different models were fitted and their parameters obtained.
The significance level and p-value were obtained with the Kolmogorov-Smirnov test, a goodness-of-fit test that measures how well a distribution fits the data. In this report, the Weibull distribution proved to be the most appropriate model for the STP dataset: among the twelve groups, eight are best fit by a Weibull distribution. In general, the Weibull distribution is powerful for failure time modeling.

## Parameter selection in seismic data analysis problems (2021-05-10)
Authors: Decker, Luke Adam; Fomel, Sergey B.; Arbogast, Todd; Ghattas, Omar; Foster, Douglas; Wheeler, Mary

Seismic imaging is an essential tool for non-invasive subsurface evaluation. It enables Earth scientists to create a picture of the planet's interior, predicting the rocks and structures that lie below. This enables characterization of tectonic margins to better understand the deep history of the planet, delineation of aquifers to provide water, and safe and economic exploration for commercial oil and gas accumulations for energy production. To generate these images, numerous observations of the subsurface are taken and transformed to a common domain where observations of the same point in the subsurface overlay. These transformations are typically linear in the observed data, but usually depend in a non-linear manner on a parameter related to seismic wave propagation, such as the speed at which a seismic wave travels through the subsurface. Selecting and determining these parameters is a crucial step in the generation of seismic images: using inaccurate parameters in the transformations involved in seismic data processing yields seismic images that are distorted, inaccurate representations of the subsurface. Because these parameters are related to seismic wave propagation, their values can provide insight into the composition of the Earth's interior, including the rocks or fluids present.
In this dissertation, I present methods for accurately determining those parameters and show how they may be used to efficiently generate accurate, well-resolved images of the Earth's interior. I show how dynamic time warping may be used to create an operator that efficiently corrects for the blurring and distortion in seismic images caused by seismic anisotropy (wave propagation speed changing with the direction of travel), while simultaneously characterizing and quantifying that anisotropy. I demonstrate how slope-decomposed seismic images may be transported along their characteristics in a process called oriented velocity continuation to efficiently generate a suite of images over a range of plausible migration velocities, and how oriented velocity continuation may be combined with seismic diffraction imaging to determine migration velocity. Oriented velocity continuation is further extended into a framework for probabilistic diffraction imaging that uses a collection of weights, computed from slope-decomposed images, representing the probability of a correctly imaged diffraction existing at a point in space for a given migration velocity, while simultaneously outputting the most likely migration velocity at each point in space. This method generates seismic images with significantly improved signal-to-noise ratios compared to conventional approaches. Finally, I formulate a variational method for picking an optimal surface, representing how a parameter evolves in space, from a volume representing the quality of fit for different parameter values, based on iteratively minimizing a functional. I prove that minimizers for the functional exist and that an iterative method converges to a minimizer in an infinite-dimensional setting.
The method is applied using continuation (graduated optimization) to avoid local minima, and is used to determine seismic velocities as a component of seismic processing workflows and to perform automatic interpretation of a seismic horizon.

## Production analysis of oil production from unconventional reservoirs using bottom hole pressures entirely in the Laplace space (2015-05)
Authors: La, Natalie-Nguyen; Lake, Larry W.; Mohanty, Kishore K.

Laplace transforms are a powerful mathematical tool for solving many problems that describe fluid flow in unconventional reservoirs. However, for the solutions to be useful in applications such as history matching, they must be converted from the Laplace space into the real-time domain; a common practice is to numerically invert the transformed Laplace solution. We find substantial benefits if the data sets are instead handled entirely in the Laplace domain and fitted to models expressed in Laplace space rather than in the time domain. The data set used in this work consists of oil production rates and bottom hole pressures (BHP) from a liquid-rich shale play in North America, which we study to understand the decline of production from a tight formation produced by a fractured horizontal well. Since the BHP is relatively constant in the long run, a constant-BHP solution is appropriate for inflow performance analysis for most wells. In some cases, however, as a result of operational changes, mainly periodic shut-ins, the production rate experiences isolated pressure build-ups. Both the production rate and BHP are transformed into the Laplace domain and accounted for in the model. Ours is the first analysis that combines rate and BHP entirely in the Laplace domain; there is no need for a Laplace transform inversion. Two models whose Laplace solutions are readily available are studied side by side: a single-compartment model versus a dual-compartment model.
We fit the transformed production data of hundreds of wells to the Laplace models. The algorithm to transform the data is fairly simple and computationally inexpensive, and since the Laplace transformation smooths the data, the fits are consistently good. Both models yield realistic and similar estimates of ultimate recovery. In most cases the effect of the second compartment in the dual-compartment model can be ignored, i.e., the fracture-well interaction can be neglected, so the single-compartment model seems adequate for modeling unconventional reservoir performance. Knowledge of the reservoir model parameters provides an estimate of the drainage volume and a forecast of future production. One of the main advantages of this novel history matching method is its ability to eliminate noise from data scatter without losing important information; as a result, data can be matched more easily. Moreover, real-time solutions to many fluid flow problems in porous media often cannot be obtained analytically and must instead be computed numerically; our method eliminates the need to invert to real-time solutions. Additionally, these solutions often assume simple closed forms in the Laplace domain even for very complex geometry (a higher number of compartments), facilitating the task of history matching.
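The Laplace-domain fitting idea described in the last abstract can be illustrated with a minimal sketch. This is not the authors' code and uses a hypothetical single-compartment stand-in with exponential decline, q(t) = q₀·exp(−t/τ), whose Laplace transform is q₀/(s + 1/τ); all parameter values are invented for illustration. Noisy sampled rate data are numerically transformed via trapezoidal quadrature and fitted directly in the Laplace domain, so no inverse transform is needed:

```python
import numpy as np
from scipy.integrate import trapezoid
from scipy.optimize import curve_fit

# Hypothetical synthetic production-rate data (exponential decline + noise).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1000.0, 2001)               # time, days
q_true = 500.0 * np.exp(-t / 200.0)              # "true" rate, q0=500, tau=200
q_obs = q_true + rng.normal(0.0, 10.0, t.size)   # noisy measurements

def laplace_transform(t, q, s_values):
    """Numerical Laplace transform of sampled data via trapezoidal quadrature."""
    return np.array([trapezoid(q * np.exp(-s * t), t) for s in s_values])

def model_laplace(s, q0, tau):
    """Single-compartment exponential decline in the Laplace domain:
    L{q0 * exp(-t/tau)} = q0 / (s + 1/tau)."""
    return q0 / (s + 1.0 / tau)

s = np.linspace(0.005, 0.1, 40)                  # Laplace variable samples, 1/day
Q_data = laplace_transform(t, q_obs, s)          # the transform smooths the noise

# Fit the model to the transformed data entirely in the Laplace domain.
params, _ = curve_fit(model_laplace, s, Q_data, p0=(100.0, 50.0))
q0_fit, tau_fit = params
```

Because the transform is an integral over the whole record, measurement noise largely averages out, which is the smoothing property the abstract credits for consistently good fits.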