Knowledge transfer techniques for dynamic environments
Abstract
The expense involved in obtaining class labels for data has led to the emergence of
semi-supervised learning techniques that use both labeled and unlabeled data
to obtain classifiers with better generalization capabilities. Most
existing semi-supervised methods assume that the unlabeled data have the same
underlying distribution as the training data. However, data acquired for actual
problems often suffer from population drift over time or space, and consequently
classifiers learned from existing labeled data tend to become obsolete over time or
across extended geographic areas.
In this dissertation, semi-supervised techniques are considered for updating
existing classifiers, while allowing for the possibility of population drift in the
incoming data. The proposed techniques make use of meta-information that is not
explicitly provided by the data to aid in semi-supervision.
First, a framework that exploits the contextual information in an existing
hierarchical binary classifier is presented to rapidly construct a new classifier for a
new but related classification problem. The knowledge transfer technique is augmented
with active learning to efficiently update the classifier using far fewer data
points than simple semi-supervised methods. The proposed technique is shown to
be well-suited for adapting classifiers, even when there is a significant difference
between the labeled and unlabeled data.
The knowledge transfer approach detailed in this dissertation assumes the existence
of a pre-defined hierarchy of classes. However, it is possible that several different
class hierarchies are defined or obtained for the same domain. A maximum likelihood
framework is proposed for integrating available hierarchies into a single ‘master
hierarchy’. The taxonomy integration method is shown to result in more natural
mappings between existing taxonomies compared to alternative approaches that do
not exploit the class hierarchy information. A technique that automatically generates
n-ary class hierarchies is also presented. The n-ary trees are shown to better
reflect the inter-class relationships and are in general more effective for knowledge
transfer than binary trees.
Focusing on the domain of hyperspectral data, the efficacy of the new techniques
is evaluated for the problem of classifying spatially/temporally varying
hyperspectral images. The empirical results clearly demonstrate the utility of exploiting
‘contextual’ information for the problem of knowledge transfer in dynamic environments.