Assessing the performance of a machine learning algorithm in identifying bubbles in dust emission

Xu, Duo, M.A.

Date

2019-01-30

Abstract

Stellar feedback from the radiation and winds of massive stars plays a significant role in both the physical and chemical evolution of molecular clouds. This injection of energy and momentum leaves identifiable signatures (“bubbles”) that affect the dynamics and structure of the cloud. Most bubble searches are performed by eye, which is usually time-consuming, subjective, and difficult to calibrate. Automatic classification based on machine learning makes it possible to perform systematic, quantifiable, and repeatable searches for bubbles. We employ a previously developed machine learning algorithm, Brut, and quantitatively evaluate its performance in identifying bubbles using synthetic dust observations. We adopt magnetohydrodynamic simulations, which model stellar winds launched within turbulent molecular clouds, as input for generating synthetic images. We use a publicly available three-dimensional dust continuum Monte Carlo radiative transfer code, HYPERION, to generate synthetic images of bubbles in three Spitzer bands (4.5 μm, 8 μm, and 24 μm). We designate half of our synthetic bubbles as a training set, which we use to train Brut along with citizen-science data from the Milky Way Project. We then assess Brut's accuracy using the remaining synthetic observations. We find that, after retraining, Brut's performance increases significantly, and it is able to identify yellow bubbles, which are likely associated with B-type stars. Brut continues to perform well on previously identified high-score bubbles, and over 10% of the Milky Way Project bubbles, previously marginal or ambiguous detections in the Milky Way Project data, are reclassified as high-confidence bubbles. We also investigate the effects of training set size, dust model, evolutionary stage, and background noise on bubble identification.