Phoneme segmentation using self-supervised speech models

dc.contributor.advisor: Harwath, David
dc.creator: Strgar, Luke Vincent
dc.date.accessioned: 2024-04-22T20:10:54Z
dc.date.available: 2024-04-22T20:10:54Z
dc.date.issued: 2023-08
dc.date.submitted: August 2023
dc.date.updated: 2024-04-22T20:10:55Z
dc.description.abstract: We apply transfer learning to phoneme segmentation and demonstrate the utility of representations learned in self-supervised pre-training for this task. Our model extends transformer-style encoders with strategically placed convolutions that manipulate features learned in pre-training. Using the TIMIT and Buckeye corpora, we train and test the model in both supervised and unsupervised settings. In the unsupervised setting, the training labels are a noisy label set formed from the predictions of a separate model that was itself trained without supervision. Results indicate that our model surpasses previous state-of-the-art performance in both settings and on both datasets. Finally, after reviewing published code and attempting to reproduce past segmentation results, we find a need to disambiguate the definition and implementation of widely used evaluation metrics. We resolve this ambiguity by delineating two distinct evaluation schemes and describing their nuances. We provide a publicly available implementation of our work on GitHub.
dc.description.department: Computer Sciences
dc.format.mimetype: application/pdf
dc.identifier.uri: https://hdl.handle.net/2152/124891
dc.identifier.uri: https://doi.org/10.26153/tsw/51493
dc.language.iso: en
dc.subject: Transfer learning
dc.subject: Machine learning for speech processing
dc.subject: Deep learning
dc.subject: Phoneme boundary detection
dc.subject: Speech segmentation
dc.subject: Self-supervised pre-training
dc.subject: Automatic speech processing
dc.subject: Phoneme segmentation
dc.subject: Self-supervised learning
dc.title: Phoneme segmentation using self-supervised speech models
dc.type: Thesis
dc.type.material: text
thesis.degree.department: Computer Sciences
thesis.degree.grantor: The University of Texas at Austin
thesis.degree.name: Master of Science in Computer Sciences

Access full-text files

Original bundle

Name: STRGAR-PRIMARY-2024.pdf
Size: 909 KB
Format: Adobe Portable Document Format
