Knowledge transfer techniques for dynamic environments
The expense involved in obtaining class labels for data has led to the emergence of semi-supervised learning techniques that attempt to use both labeled and unlabeled data to obtain classifiers with better generalization capabilities. Most existing semi-supervised methods assume that the unlabeled data have the same underlying distribution as the training data. However, data acquired for real-world problems often suffer from population drift over time or space, and consequently classifiers learned from existing labeled data tend to become obsolete over time or across extended geographic areas. In this dissertation, semi-supervised techniques are considered for updating existing classifiers, while allowing for the possibility of population drift in the incoming data. The proposed techniques make use of meta-information that is not explicitly provided by the data to aid in semi-supervision.
First, a framework that exploits the contextual information in an existing hierarchical binary classifier is presented to rapidly construct a new classifier for a new but related classification problem. The knowledge transfer technique is augmented with active learning to efficiently update the classifier using far fewer data points than simple semi-supervised methods. The proposed technique is shown to be well-suited for adapting classifiers, even when there is a significant difference between the labeled and unlabeled data.
The knowledge transfer approach detailed in this dissertation assumes the existence of a pre-defined hierarchy of classes. However, it is possible that several different class hierarchies are defined or obtained for the same domain. A maximum likelihood framework is proposed for integrating available hierarchies into a single ‘master hierarchy’. The taxonomy integration method is shown to result in more natural mappings between existing taxonomies than alternative approaches that do not exploit the class hierarchy information.
A technique that automatically generates n-ary class hierarchies is also presented. The n-ary trees are shown to better reflect the inter-class relationships and are generally more effective for knowledge transfer than binary trees. The efficacy of the new techniques is evaluated in the domain of hyperspectral data, on the problem of classifying spatially/temporally varying hyperspectral images. The empirical results clearly demonstrate the utility of exploiting ‘contextual’ information for the problem of knowledge transfer in dynamic environments.