Upgrades management and performance diagnosis in cellular networks




Qureshi, Mubashir Adnan




Today, cellular networks are one of the top drivers for all forms of communication, Internet access, and multimedia applications. Smartphone users rely heavily on them and expect high availability at all times. Cellular service providers aim to provide an excellent quality of experience for millions to billions of smartphone users by continuously monitoring network and service performance. They introduce upgrades to their networks on a regular basis in order to improve service experience, roll out new service functionality, fix software bugs, or patch security issues. Network upgrades typically involve new software releases, firmware upgrades, hardware modifications, configuration parameter changes, and topology changes. Several major technical difficulties arise when applying upgrades in a cellular network. First, before a nationwide upgrade is applied, the changes need to be tested in the field to detect any undesirable effects. Because of the limited testing budget and the complex nature of cellular networks, it is difficult to specify representative testing locations. Moreover, when testing nodes show a negative performance impact, diagnosing the issue is challenging because cellular networks are diverse: different network locations can have different configuration settings. Second, after field testing, the upgrades need to be applied to nodes nationwide in the cellular service provider’s network. During this nationwide rollout, service disruption can occur because a network node may have to be temporarily powered off, so an efficient strategy is needed to make sure that users remain connected while upgrades are being applied. Lastly, when the network state changes after an upgrade, performance anomalies may arise that can be quite subtle. Detecting and diagnosing them is another daunting task associated with this process.
In this dissertation, we show how to address the above-mentioned challenges. First, we develop a framework for field testing of upgrades that takes into account the complex configurations of network nodes and produces a multi-phase model for applying field tests. The objective is to expose different configurations during testing to maximize the chances of catching any problem. Moreover, if performance degradation is observed, we develop an efficient diagnosis tool that extracts the root-cause configuration responsible for the problem. In addition, we develop a scheduling framework that takes the network topology and traffic profile as input and produces a schedule detailing the order in which nodes should be upgraded. Because network nodes may need to be taken down for an upgrade to be applied, users in the vicinity of a node can be affected. The schedule therefore takes congestion and capacity constraints into account while also ensuring that the upgrade process is completed quickly. Lastly, we present an anomaly detection and root cause analysis engine. Sometimes the effect of a problematic configuration may not be visible at the network edge (base stations) or at a coarse aggregation level (for example, call drop rate monitored at the market level). In such cases, aggregating performance statistics over the configuration subset experiencing the issue exposes the problem. This problem suffers from the curse of dimensionality, as there is an exponential number of subsets. We design a scalable approach that detects anomalies at the proper aggregation level without exploring the whole search space and localizes the configuration subset causing the issue.

