Estimating Rainfall with Neural Networks and Conditional Random Fields
Over the past 30 years, an average of 85 people have died each year in the US due to flash floods, making them the deadliest severe weather condition. Accurate rainfall prediction is particularly important in Central Texas, the "most flash-flood prone area in the United States." However, meteorologists continue to manually adjust state-of-the-art physical models based on experience. The National Weather Service issues flash-flood warnings based on Doppler radar station estimates of the rainfall that occurred over the past hour. As such, estimating rainfall directly impacts public safety: overestimates cause extraneous warnings that are easily ignored, while underestimates fail to warn those in danger. Unfortunately, Doppler radar can miss the mark, yielding misleading results. If a method existed to make these estimates more accurate, meteorologists could make better flash-flood predictions, saving lives. Furthermore, if this method were robust and based on data rather than heuristics, it could be trusted as a post-processing step for radar scans, integrating seamlessly with existing systems. This project uses neural networks and conditional random fields, two tools from the machine learning toolbox, to create a data-based model for updating Doppler radar rainfall estimates. To do this, the neural network is "trained" (that is, it uses actual observations to learn patterns) on data from the Lower Colorado River Authority's network of rain sensors. These rain sensors provide a ground-truth value for the rainfall in the Central Texas area. The neural network compares the Doppler radar estimates with the rain sensor ground truth to learn how to better predict rainfall from radar scans. The neural network can also use a rough estimate of the true rainfall, taken from a subset of the rain sensors, to make even better overall rainfall estimates.
Furthermore, conditional random fields provide a method of smoothing these predictions, leveraging the fact that drastic changes in rainfall between nearby locations are not physically reasonable (at least, in the general case). Based on the machine learning techniques referenced above, I sought to create a system that could estimate rainfall more accurately (as measured against the ground-truth rain sensors) than the naive estimate provided directly by Doppler radar. To do this, I implemented neural network and conditional random field algorithms under many different experimental configurations. Each configuration was used to create a candidate system, which was then compared against the Doppler radar estimates. Although testing more configurations would provide additional statistical certainty, the system "in the middle" of those tested produced the best results. Specifically, the neural network model that used a reasonably large number of neighboring cells in its calculation, with no conditional random field algorithm applied, gave the smallest error under both error metrics used, meaning that it produced the best rainfall estimates. Furthermore, each tested configuration either matches or outperforms the standard Doppler radar estimate. As such, any of the tested network configurations seems to be a valid post-processing tool for creating better rainfall estimates, providing a simple avenue to more accurate flash-flood predictions in Central Texas.
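To make the evaluation setup concrete, the following is a minimal sketch rather than the project's actual code. It shows how a window of neighboring radar cells around a rain sensor might be collected as input features, and how the naive radar estimate could be scored against the sensor readings. The grid values, sensor locations, zero padding at the grid edge, and the choice of root-mean-square error and mean absolute error as the two metrics are all assumptions made for illustration; the text above does not name the metrics or the data layout.

```python
import math

def neighborhood(grid, row, col, radius):
    """Collect radar values in the (2*radius+1)^2 window centred on (row, col).

    Cells that fall outside the grid are filled with 0.0 (an assumed
    padding choice, not taken from the original project).
    """
    rows, cols = len(grid), len(grid[0])
    features = []
    for r in range(row - radius, row + radius + 1):
        for c in range(col - radius, col + radius + 1):
            inside = 0 <= r < rows and 0 <= c < cols
            features.append(grid[r][c] if inside else 0.0)
    return features

def rmse(pred, truth):
    """Root-mean-square error (assumed metric)."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))

def mae(pred, truth):
    """Mean absolute error (assumed metric)."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)

# Hypothetical radar rainfall grid and rain sensor readings (made-up numbers).
radar = [
    [0.0, 1.0, 2.0],
    [1.0, 4.0, 2.0],
    [0.0, 1.0, 1.0],
]
sensors = {(1, 1): 3.0, (0, 2): 1.5}  # (row, col) -> measured rainfall

# Naive baseline: read the radar value at each sensor's own cell.
naive = [radar[r][c] for (r, c) in sensors]
truth = list(sensors.values())

# Feature vectors a network could take as input, one per sensor location.
inputs = [neighborhood(radar, r, c, radius=1) for (r, c) in sensors]

print("baseline RMSE:", rmse(naive, truth))
print("baseline MAE:", mae(naive, truth))
```

In this sketch, `radius` controls how many neighboring cells enter the calculation, which is the kind of setting that would vary between experimental configurations. A trained model's predictions would be scored with the same two functions so that they can be compared directly against the baseline.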