Enabling accurate and private machine learning systems
dc.contributor.advisor | Vikalo, Haris | |
dc.contributor.committeeMember | De Veciana, Gustavo | |
dc.contributor.committeeMember | Mokhtari, Aryan | |
dc.contributor.committeeMember | Shakkottai, Sanjay | |
dc.contributor.committeeMember | Williamson, Sinead | |
dc.creator | Ribero Díaz, Mónica | |
dc.creator.orcid | 0000-0002-9634-0880 | |
dc.date.accessioned | 2022-09-13T22:50:59Z | |
dc.date.available | 2022-09-13T22:50:59Z | |
dc.date.created | 2022-05 | |
dc.date.issued | 2022-04-28 | |
dc.date.submitted | May 2022 | |
dc.date.updated | 2022-09-13T22:51:00Z | |
dc.description.abstract | Machine learning applications in fields where data is sensitive, such as healthcare and banking, face challenges due to the need to protect the privacy of participating users. Tools developed over the past decades to address this challenge include differential privacy and federated learning. Yet maintaining performance while protecting sensitive data poses a trilemma among accuracy, privacy, and efficiency. In this thesis, we aim to address these fundamental challenges and take a step towards enabling machine learning under privacy and resource constraints. On the differential privacy front, in Chapter 2 we develop an algorithm that addresses the efficiency and accuracy of differentially private empirical risk minimization. We provide a dimension-independent excess risk bound and show that the algorithm converges to this bound at the same rate as AdaGrad. In Chapter 3 we introduce an algorithm for differentially private Top-k selection, a problem that often arises as a building block of large-scale data analysis tasks such as NLP and recommender systems. The algorithm samples from a distribution with exponentially large support using only polynomial time and space, and improves on existing pure differential privacy methods. On the federated learning front, the locality of data imposes various system design challenges due to resource constraints. In Chapter 4, we propose federated and differentially private algorithms for the matrix factorization tasks that arise when training recommender systems in settings where data is distributed across different silos (e.g., hospitals or banks). Chapter 5 introduces a client selection strategy that reduces communication in federated learning while maintaining model accuracy. Finally, in Chapter 6 we present F3AST, a novel algorithm that addresses user intermittency in federated learning under an unknown and time-varying system configuration. | |
dc.description.department | Electrical and Computer Engineering | |
dc.format.mimetype | application/pdf | |
dc.identifier.uri | https://hdl.handle.net/2152/115689 | |
dc.identifier.uri | http://dx.doi.org/10.26153/tsw/42587 | |
dc.language.iso | en | |
dc.subject | Differential privacy | |
dc.subject | Federated learning | |
dc.subject | Machine learning | |
dc.subject | Communication efficiency | |
dc.title | Enabling accurate and private machine learning systems | |
dc.type | Thesis | |
dc.type.material | text | |
thesis.degree.department | Electrical and Computer Engineering | |
thesis.degree.discipline | Electrical and Computer Engineering | |
thesis.degree.grantor | The University of Texas at Austin | |
thesis.degree.level | Doctoral | |
thesis.degree.name | Doctor of Philosophy |
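
For orientation only: the abstract mentions differentially private Top-k selection (Chapter 3) without detailing the mechanism. The short Python sketch below shows the standard "peeling" exponential mechanism for Top-k under pure epsilon-differential privacy, a common baseline that such work typically improves on; it is not the thesis's algorithm, and the function name, parameters, and example counts are illustrative assumptions.

```python
import numpy as np

def dp_top_k_peeling(scores, k, epsilon, sensitivity=1.0, seed=None):
    """Select k indices with high scores under pure epsilon-differential privacy.

    Classic 'peeling' exponential mechanism: the budget is split evenly over k
    rounds, and in each round one remaining index is sampled with probability
    proportional to exp(eps_round * score / (2 * sensitivity)), then removed.
    Basic composition over the k rounds gives epsilon-DP overall.
    """
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float)
    eps_round = epsilon / k
    remaining = list(range(len(scores)))
    chosen = []
    for _ in range(k):
        s = scores[remaining]
        # Shift by the max before exponentiating for numerical stability;
        # the shift cancels after normalization.
        logits = eps_round * (s - s.max()) / (2.0 * sensitivity)
        probs = np.exp(logits)
        probs /= probs.sum()
        pick = rng.choice(len(remaining), p=probs)
        chosen.append(remaining.pop(pick))
    return chosen


# Example: privately select the 3 highest-count items from a small histogram.
counts = np.array([120, 95, 40, 38, 12, 7, 3], dtype=float)
print(dp_top_k_peeling(counts, k=3, epsilon=1.0))
```

Note that this baseline sidesteps the exponentially large space of k-subsets by picking one index per round; the abstract's claim of sampling from a distribution with exponentially large support in polynomial time and space presumably refers to a different, subset-level construction developed in the thesis itself.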
Access full-text files
Original bundle
- Name: RIBERODIAZ-DISSERTATION-2022.pdf
- Size: 2.69 MB
- Format: Adobe Portable Document Format