Using Supervised Learning Techniques to Predict Television Ratings
Abstract
A TV show's performance is measured by its “rating,” the percentage of households watching live TV at a given time that are tuned in to that show. Maximizing ratings requires being able to predict them reliably. For my thesis, in collaboration with Austin’s public-television station KLRU-TV, I tested a variety of techniques to identify the most accurate model for predicting the rating of a television-show airing. To accomplish this, I created nine regression models, each using a different algorithm that has proven effective across many kinds of problems: a linear regression model, a k-nearest-neighbors model, a support vector machine (SVM) model, a decision tree model, a bagging ensemble model, a gradient boosting ensemble model, two kinds of fully connected neural networks (multilayer perceptrons, or MLPs), and a recurrent neural network. I also created several feature sets, which included Nielsen, IMDb, and engineered features. Each model was tested on every combination of feature sets, and its hyperparameters were exhaustively tuned to find which method produced the best results. Most models performed similarly well under at least one combination of hyperparameters and feature sets; the only exception was the linear regression model, which performed poorly across the board. The k-nearest-neighbors model and the bagging ensemble model tied for the best result, each achieving an R² score of 0.64 when run on all features. Though this is not a perfect score, the corresponding mean absolute error was just 0.2, which is small enough to be useful when optimizing program schedules and selling ad space.
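The evaluation loop described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the thesis's actual code: the column names, the toy rows, and the choice of k are hypothetical placeholders, and a small hand-written k-nearest-neighbors regressor stands in for the nine models. The sketch enumerates every non-empty combination of three feature sets and reports R² and mean absolute error for each.

```python
from itertools import combinations

# Hypothetical feature sets; the column names are placeholders, not the thesis's features.
FEATURE_SETS = {
    "nielsen": ["lead_in_rating"],
    "imdb": ["imdb_score"],
    "engineered": ["hour_of_day"],
}

def feature_set_combinations(names):
    """Return every non-empty combination of feature-set names."""
    names = sorted(names)
    return [c for r in range(1, len(names) + 1) for c in combinations(names, r)]

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def knn_predict(train_X, train_y, x, k=2):
    """Average the targets of the k training rows closest to x (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), y)
        for row, y in zip(train_X, train_y)
    )
    nearest = dists[:k]
    return sum(y for _, y in nearest) / len(nearest)

def evaluate(combo, train_rows, test_rows, k=2):
    """Fit k-NN on the columns selected by `combo` and return (R^2, MAE) on the test rows."""
    cols = [col for name in combo for col in FEATURE_SETS[name]]
    train_X = [[row[c] for c in cols] for row in train_rows]
    train_y = [row["rating"] for row in train_rows]
    y_true = [row["rating"] for row in test_rows]
    y_pred = [knn_predict(train_X, train_y, [row[c] for c in cols], k) for row in test_rows]
    return r2_score(y_true, y_pred), mean_absolute_error(y_true, y_pred)

# Toy rows invented for illustration only; they are not KLRU-TV or Nielsen data.
TRAIN = [
    {"lead_in_rating": 0.8, "imdb_score": 6.5, "hour_of_day": 18.0, "rating": 0.9},
    {"lead_in_rating": 1.1, "imdb_score": 7.2, "hour_of_day": 19.0, "rating": 1.3},
    {"lead_in_rating": 1.5, "imdb_score": 8.0, "hour_of_day": 20.0, "rating": 1.6},
    {"lead_in_rating": 0.6, "imdb_score": 6.0, "hour_of_day": 22.0, "rating": 0.5},
]
TEST = [
    {"lead_in_rating": 1.0, "imdb_score": 7.0, "hour_of_day": 19.0, "rating": 1.1},
    {"lead_in_rating": 0.7, "imdb_score": 6.2, "hour_of_day": 21.0, "rating": 0.7},
]

# Score every feature-set combination with the same model and data split.
RESULTS = {
    combo: evaluate(combo, TRAIN, TEST)
    for combo in feature_set_combinations(FEATURE_SETS)
}

for combo, (r2, mae) in RESULTS.items():
    print(f"{'+'.join(combo):28s} R2={r2:6.3f}  MAE={mae:.3f}")
```

In a fuller experiment, the same loop would presumably also iterate over models and hyperparameter settings and use a proper train/validation split; features on different scales (such as an hour of day next to a rating) would normally be standardized before a distance-based model like k-nearest-neighbors is applied.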