Coding mechanisms for communication and compression : analysis of wireless channels and DNA sequencing
This thesis comprises two related but distinct components: coding arguments for communication channels and an information-theoretic analysis of haplotype assembly. The common thread is the use of information- and coding-theoretic principles to understand the underlying mechanisms of both problems. For the first class of problems, I study two practical challenges that prevent optimal discrete codes from being used in real communication and compression systems, namely coding over analog alphabets and fading. In particular, I use an expansion coding scheme to convert the original analog channel coding and source coding problems into a set of independent discrete subproblems. By adopting optimal discrete codes over the expanded levels, this low-complexity coding scheme can approach the Shannon limit, either perfectly or in ratio. Meanwhile, I design a polar coding scheme to deal with the unstable state of fading channels. This novel coding mechanism, which hierarchically uses different types of polar codes, is proved to achieve the ergodic capacity of several fading systems without channel state information at the transmitter. For the second class of problems, I build an information-theoretic view of haplotype assembly. More precisely, the recovery of the target pair of haplotype sequences from short reads is rephrased as a joint source-channel coding problem. Two binary messages, representing haplotypes and chromosome memberships of reads, are encoded and transmitted over a channel with erasures and errors, where the channel model reflects salient features of high-throughput sequencing. The focus is on determining the number of reads required for reliable haplotype reconstruction.
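As an illustration of the haplotype channel view described in the abstract, the following minimal Python sketch simulates reads under simplifying assumptions that are not stated in the abstract itself: the two haplotypes are taken to be bitwise complements at the sites of interest, each site of a read is observed independently with probability `cover_prob` (otherwise it is erased), and each observed bit is flipped with probability `flip_prob`. The function name `simulate_reads` and all parameter names are hypothetical and chosen only for this example.

```python
import random

def simulate_reads(haplotype, num_reads, cover_prob, flip_prob, seed=0):
    """Simulate reads of a haplotype pair through an erasure-and-error channel.

    Assumption (for illustration only): the second haplotype is the bitwise
    complement of `haplotype`, so a read from chromosome c sees bit h[j] XOR c.
    Erased (unobserved) positions are represented by None.
    """
    rng = random.Random(seed)
    reads = []
    memberships = []
    for _ in range(num_reads):
        c = rng.randint(0, 1)          # chromosome membership of this read
        memberships.append(c)
        read = []
        for bit in haplotype:
            if rng.random() >= cover_prob:
                read.append(None)      # erasure: site not covered by this read
                continue
            observed = bit ^ c         # message bit sent over the channel
            if rng.random() < flip_prob:
                observed ^= 1          # sequencing error flips the bit
            read.append(observed)
        reads.append(read)
    return reads, memberships

# Example: 6 sites, 4 reads, 50% coverage per site, 5% error rate.
reads, memberships = simulate_reads([0, 1, 1, 0, 1, 0], 4, 0.5, 0.05, seed=42)
for r, c in zip(reads, memberships):
    print(c, r)
```

In this toy model, the question the abstract raises corresponds to how large `num_reads` must be, for given `cover_prob` and `flip_prob`, before both the haplotype and the memberships can be recovered reliably.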