Browsing by Subject "R"
Item: Applying Classification and Regression Trees to manage financial risk (2012-05)
Martin, Stephen Fredrick; Scott, James (Statistician); Carvalho, Carlos M.; Marti, Nathan C.
The goal of this project is to develop a set of business rules to mitigate risk related to a specific financial decision within the prepaid debit card industry. Under certain circumstances, issuers of prepaid debit cards may need to decide whether funds on hold can be released early for use by card holders prior to the final transaction settlement. After a brief introduction to the prepaid card industry and the financial risk associated with the early release of funds on hold, the paper presents the motivation to apply the CART (Classification and Regression Trees) method. The paper provides a tutorial of the CART algorithms formally developed by Breiman, Friedman, Olshen and Stone in the monograph Classification and Regression Trees (1984), as well as a detailed explanation of the R programming code to implement the RPART function (Therneau 2010). Special attention is given to parameter selection and the process of finding an optimal solution that balances complexity against predictive classification accuracy when measured against an independent data set through a cross-validation process. Lastly, the paper presents an analysis of the financial risk mitigation based on the resulting business rules.

Item: Detecting Structural Variation in Evolved Bacterial Genomes Using Paired-End DNA Sequencing Data (2022-05)
Reitman, Joseph; Barrick, Jeffrey E.
Structural variants are large-scale genome rearrangement events, such as chromosomal inversions, duplications, and deletions, that can lead to innovative evolution that is not possible with point mutations. They can be especially important for microbial speciation and pathogenesis. We developed and tested an algorithm to detect newly evolved structural variants in microbial genomes from paired-end sequencing data. The method looks for read pairs with anomalous distances or orientations between where the two reads map to a reference genome. Then, it scores putative predictions using a statistical model trained on read pairs spanning normal positions in the chromosome. A computational pipeline for carrying out this analysis was implemented in R. The code was tested on genome sequencing data from a population of bacteria from the Lenski Long-Term Evolution Experiment with Escherichia coli that evolved to colonize a new nutrient niche through tandemly duplicating a region of the chromosome. This code could be integrated into the open-source breseq mutation prediction pipeline in the future to improve its ability to detect structural variants.

Item: Intro to R and R Studio (2022-09-30)
Brodsky, Meryl; Chapman Tripp, Hannah

Item: Macroscale modeling linking energy and debt : a missing linkage (2017-06-29)
Jayaswal, Harshit; King, Carey Wayne
What if we realized that the fundamental economic framework of models that are meant to guide a low-carbon energy transition prevents them from actually answering the question they are supposed to answer? Instead of assuming a series of energy investments and then estimating the economic impacts of those choices, they do the exact opposite. They assume economic growth and then make a series of investments to meet emissions targets without factoring in how the energy systems themselves feed back into economic growth. The research here seeks to understand how energy and resource extraction are linked with long-term economic outcomes, specifically addressing the accumulation of debt in the economy. Many economic models implicitly assume that energy resources are not constraints on the economy. These energy-related constraints have to be introduced if we are to effectively understand long-term debt and natural resource interactions. The same is true of various biophysical models, which do not consider economic parameters such as debt, employment, and wages while modeling population growth and resources in the system. The research objective is to develop a consistently merged model combining both a biophysical and an economic model to describe the industrial transition to the contemporary macroeconomic state. The research approach would be to integrate macro-scale system dynamics models of money, debt, and employment (specifically the Goodwin and Minsky models of Keen, 1995 and Keen, 2013) with system dynamics models of biophysical quantities (specifically population and natural resources, as in Meadows et al., 1972; Meadows et al., 1974; Motesharrei et al., 2014). The proposed research concept is critical to link biophysical modeling concepts with those economic models that specifically include the link of debt to employment and economic growth. This type of modeling is anticipated to help answer important questions for a low-carbon transition, for example: how does the rate of investment in “energy” feed back into growth of population, economic output, and debt; and how does the capital structure (e.g. fixed costs vs. variable costs) of fossil and renewable energy systems relate to, and affect, economic outcomes?

Item: Mapping COVID Data with R (2020-09-25)
Javan, Emily

Item: Open source data analysis and visualization with R (2024-01-26)
Marden, Alex