Optimizing visual grounding of latent representations of speech from distant language groups
dc.contributor.advisor | Harwath, David | |
dc.creator | Crabtree, Christopher Edwin | |
dc.date.accessioned | 2022-11-08T23:58:10Z | |
dc.date.available | 2022-11-08T23:58:10Z | |
dc.date.created | 2021-12 | |
dc.date.issued | 2021-12-03 | |
dc.date.submitted | December 2021 | |
dc.date.updated | 2022-11-08T23:58:11Z | |
dc.description.abstract | Recent years have seen growing research interest in using multi-modal grounding techniques to bolster classic natural language processing (NLP) and automated speech recognition (ASR) tasks. Previous work by Harwath et al. [5] demonstrated that visual grounding approximately doubled their model's bilingual utterance retrieval performance and, similarly, that image retrieval was substantially improved by adding an alignment objective between languages. However, much remains unknown about the exact mechanism by which grounding is used in modern neural network systems. In this work, we extend the line of research pioneered by Harwath et al. by empirically exploring several contrastive learning frameworks and objectives designed to align input from different modalities (i.e., visual and speech input). Through analysis of our two best-performing loss functions, our experiments indicate potential avenues for improvement over the current best-performing loss objective. We also find that, in our trilingual setting, cross-lingual learning objectives can be removed to both improve image retrieval performance and reduce hyperparameter complexity. | |
dc.description.department | Computer Science | |
dc.format.mimetype | application/pdf | |
dc.identifier.uri | https://hdl.handle.net/2152/116594 | |
dc.identifier.uri | http://dx.doi.org/10.26153/tsw/43489 | |
dc.subject | Image retrieval | |
dc.subject | Loss functions | |
dc.title | Optimizing visual grounding of latent representations of speech from distant language groups | |
dc.type | Thesis | |
dc.type.material | text | |
thesis.degree.department | Computer Sciences | |
thesis.degree.discipline | Computer Science | |
thesis.degree.grantor | The University of Texas at Austin | |
thesis.degree.level | Masters | |
thesis.degree.name | Master of Science in Computer Sciences |