Publications - Working papers
Please find below working papers of our group. Currently, we list 58 working papers.
This list contains only unpublished papers. If you are looking for a preprint of a paper that has already been published, please see the "Published papers" section.
If you have problems accessing electronic information, please let us know:
© NOTICE: All working papers are copyrighted by the authors. If you would like to use all or a portion of any paper, please contact the author.
Supervised and Unsupervised Classification of lncRNA Subtypes
Rituparno Sen, Jörg Fallmann, Maria Emília M. T. Walter, Peter F. Stadler
Abstract
Many small nucleolar RNAs and many of the hairpin precursors of
miRNAs are processed from long non-protein-coding (lncRNA) host genes. In
contrast to their highly conserved and heavily structured payload, the
host genes feature poorly conserved sequences. Nevertheless, there is
mounting evidence that the host genes have biological functions. No
obvious connections between the function of the host genes and the
function of their payloads have been reported. Here we investigate
whether there is an association of host gene function or mechanisms with
the type of payload. To assess this hypothesis we test whether the miRNA
host genes (MIRHGs), snoRNA host genes (SNHGs), and other lncRNA host
genes can be distinguished based on sequence and structure features. A
positive answer would imply a correlation between host genes and their
payload. While the three classes can be distinguished reliably when the
classifier is allowed to extract features from the payloads, this is no
longer the case when only the sequence and structure of parts of the host
gene distal from the snoRNA or miRNA payload are used for
classification. We therefore conclude that MIRHGs and SNHGs do not form
separate non-coding RNA classes that are distinct from each other or from
lncRNAs without payloads.
Keywords
host gene, miRNA, snoRNA, k-mers, secondary structure, random forest
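As an illustration of the kind of sequence feature named in the keywords, the following is a minimal sketch of a k-mer frequency vector that could serve as input to a classifier such as a random forest. It is not the authors' implementation: the function name `kmer_frequencies`, the default `k=3`, and the handling of ambiguous bases (windows containing characters outside the alphabet are ignored) are assumptions made for this example.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=3, alphabet="ACGU"):
    """Return relative k-mer frequencies in lexicographic order of the alphabet.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    # Treat DNA and RNA input alike by mapping T to U.
    seq = sequence.upper().replace("T", "U")
    # Enumerate all possible k-mers so every sequence yields a vector of equal length.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only windows made entirely of alphabet characters contribute to the total.
    total = sum(counts[km] for km in kmers)
    return [counts[km] / total if total else 0.0 for km in kmers]
```

For example, `kmer_frequencies("ACGU", k=2)` yields a vector of length 16 in which the entries for AC, CG, and GU are each one third. Computing such vectors only from host gene regions distal to the payload would correspond to the restricted setting described in the abstract.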