Online learning and decision-making from implicit feedback

dc.contributor.advisor: Shakkottai, Sanjay
dc.contributor.advisor: Vishwanath, Sriram
dc.contributor.committeeMember: Baccelli, Francois
dc.contributor.committeeMember: Zitkovic, Gordon
dc.contributor.committeeMember: Srikant, Rayadurgam
dc.creator: Krishnasamy, Subhashini
dc.date.accessioned: 2017-06-20T20:33:42Z
dc.date.available: 2017-06-20T20:33:42Z
dc.date.issued: 2017-05
dc.date.submitted: May 2017
dc.date.updated: 2017-06-20T20:33:43Z
dc.description.abstract: This thesis focuses on designing learning and control algorithms for emerging resource allocation platforms such as recommender systems, 5G wireless networks, and online marketplaces. These systems operate in environments that are only partially known, so the controllers must make resource allocation decisions based on implicit feedback that the environment provides in response to past actions. The goal is to sequentially select actions using incremental feedback so as to optimize performance while simultaneously learning about the environment. We study three problems that exemplify this setting. The first is an inference problem that requires identification of sponsored content in recommender systems. Specifically, we ask whether it is possible to detect the existence of sponsored content disguised as genuine recommendations using implicit feedback from a subset of users of the recommender system. The second problem is the design of scheduling algorithms for switch networks when the user-server link statistics are unknown (e.g., in wireless networks and online marketplaces). The scheduling algorithm must trade off scheduling the optimal links against obtaining sufficient feedback about all the links for accurate estimates. We observe that this problem is closely connected to the stochastic multi-armed bandit problem and analyze bandit-style explore-exploit algorithms for learning the statistical parameters while simultaneously assigning servers to users. The third is the joint problem of base station (BS) activation and rate allocation in an energy-efficient wireless network when the channel statistics are unknown. The controller observes instantaneous channel rates of activated BSs, and thereby sequentially obtains implicit feedback about the channel. Here again, there is a trade-off between learning the channel and optimizing the operation cost based on estimated parameters.
For each of these systems, we propose algorithms with provable asymptotic guarantees. These learning algorithms highlight the use of implicit feedback in online decision making and control.
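To illustrate the bandit-style explore-exploit idea the abstract describes, below is a minimal sketch (not from the thesis itself) of an epsilon-greedy scheduler for a single user choosing among candidate links with unknown Bernoulli success rates. All names, parameters, and the epsilon-greedy rule are illustrative assumptions; the `rates` vector is hidden ground truth used only to simulate the implicit feedback.

```python
import random

def epsilon_greedy_scheduler(rates, horizon, epsilon=0.1, seed=0):
    """Sketch of explore-exploit link scheduling (illustrative, not the
    thesis algorithm). `rates` holds the unknown per-link success
    probabilities, used here only to simulate feedback."""
    rng = random.Random(seed)
    n = len(rates)
    counts = [0] * n        # times each link has been scheduled
    estimates = [0.0] * n   # empirical success-rate estimates
    successes = 0
    for _ in range(horizon):
        if rng.random() < epsilon:
            # Explore: schedule a uniformly random link to gather feedback.
            link = rng.randrange(n)
        else:
            # Exploit: schedule the link with the best current estimate.
            link = max(range(n), key=lambda i: estimates[i])
        # Implicit feedback: observe only whether this schedule succeeded.
        reward = 1 if rng.random() < rates[link] else 0
        counts[link] += 1
        estimates[link] += (reward - estimates[link]) / counts[link]
        successes += reward
    return estimates, successes

est, total = epsilon_greedy_scheduler([0.3, 0.8, 0.5], horizon=5000)
```

The key tension the abstract refers to is visible here: exploit-only scheduling can lock onto a suboptimal link whose estimate happened to look good early, while the exploration steps keep refining the estimates of every link.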
dc.description.department: Electrical and Computer Engineering
dc.format.mimetype: application/pdf
dc.identifier: doi:10.15781/T2P26Q88C
dc.identifier.uri: http://hdl.handle.net/2152/47285
dc.language.iso: en
dc.subject: Online learning
dc.subject: Resource allocation
dc.subject: Learning algorithm design
dc.subject: Implicit feedback
dc.subject: Incremental feedback
dc.subject: Sponsored content
dc.subject: Sponsored content detection
dc.subject: Scheduling algorithm design
dc.subject: Stochastic multi-armed bandit
dc.subject: Base station activation
dc.subject: Learning algorithms
dc.subject: Online decision-making
dc.subject: Online control
dc.title: Online learning and decision-making from implicit feedback
dc.type: Thesis
dc.type.material: text
thesis.degree.department: Electrical and Computer Engineering
thesis.degree.discipline: Electrical and Computer Engineering
thesis.degree.grantor: The University of Texas at Austin
thesis.degree.level: Doctoral
thesis.degree.name: Doctor of Philosophy

Access full-text files

Original bundle
Name: KRISHNASAMY-DISSERTATION-2017.pdf
Size: 1.65 MB
Format: Adobe Portable Document Format

License bundle
Name: LICENSE.txt
Size: 1.85 KB
Format: Plain Text