Data mining techniques for classifying RNA folding structures
RNA is a biological molecule that is essential for protein synthesis. Significant research has been done on folding algorithms for RNA, in particular for the 16S rRNA of bacteria and archaea. Rather than modifying existing work on these folding algorithms, this report presents pioneering work on data mining the same 16S rRNA. Initial work was based on a single complex helix across seven organisms. However, the classification analysis proved inaccurate because of severe multicollinearity in the data set. A second data mining analysis was performed on the entire RNA sequence of the same seven organisms and was successfully used to make sequential, categorical predictions of the characteristic of a given nucleotide in the RNA sequence.