Enabling accurate and private machine learning systems

dc.contributor.advisor: Vikalo, Haris
dc.contributor.committeeMember: De Veciana, Gustavo
dc.contributor.committeeMember: Mokhtari, Aryan
dc.contributor.committeeMember: Shakkottai, Sanjay
dc.contributor.committeeMember: Williamson, Sinead
dc.creator: Ribero Díaz, Mónica
dc.creator.orcid: 0000-0002-9634-0880
dc.date.accessioned: 2022-09-13T22:50:59Z
dc.date.available: 2022-09-13T22:50:59Z
dc.date.created: 2022-05
dc.date.issued: 2022-04-28
dc.date.submitted: May 2022
dc.date.updated: 2022-09-13T22:51:00Z
dc.description.abstract: Machine learning applications in fields where data is sensitive, such as healthcare and banking, face challenges due to the need to protect the privacy of participating users. Tools developed in the past decades that aim to address this challenge include differential privacy and federated learning. Yet maintaining performance while protecting sensitive data poses a trilemma between accuracy, privacy, and efficiency. In this thesis, we aim to address these fundamental challenges and take a step towards enabling machine learning under privacy and resource constraints. On the differential privacy front, we develop in Chapter 2 an algorithm that addresses the efficiency and accuracy of differentially private empirical risk minimization. We provide a dimension-independent excess risk bound and show that the algorithm converges to this bound at the same rate as AdaGrad. In Chapter 3 we introduce an algorithm for differentially private Top-k selection, a problem that often arises as a building block of large-scale data analysis tasks such as NLP and recommender systems. The algorithm samples from a distribution with exponentially large support using only polynomial time and space, and improves on existing pure differential privacy methods. On the federated learning front, locality of data imposes various system design challenges due to resource constraints. In Chapter 4, we propose federated and differentially private algorithms for matrix factorization tasks that arise when training recommender systems in settings where data is distributed across different silos (e.g., hospitals or banks). Chapter 5 introduces a client selection strategy that reduces communication in federated learning while maintaining the accuracy of the model. Finally, in Chapter 6 we conclude by presenting F3AST, a novel algorithm that addresses user intermittency in federated learning under an unknown and time-varying system configuration.
dc.description.department: Electrical and Computer Engineering
dc.format.mimetype: application/pdf
dc.identifier.uri: https://hdl.handle.net/2152/115689
dc.identifier.uri: http://dx.doi.org/10.26153/tsw/42587
dc.language.iso: en
dc.subject: Differential privacy
dc.subject: Federated learning
dc.subject: Machine learning
dc.subject: Communication efficiency
dc.title: Enabling accurate and private machine learning systems
dc.type: Thesis
dc.type.material: text
thesis.degree.department: Electrical and Computer Engineering
thesis.degree.discipline: Electrical and Computer Engineering
thesis.degree.grantor: The University of Texas at Austin
thesis.degree.level: Doctoral
thesis.degree.name: Doctor of Philosophy
Files
Original bundle
Name: RIBERODIAZ-DISSERTATION-2022.pdf
Size: 2.69 MB
Format: Adobe Portable Document Format
License bundle
Name: PROQUEST_LICENSE.txt
Size: 4.46 KB
Format: Plain Text

Name: LICENSE.txt
Size: 1.85 KB
Format: Plain Text