Published January 1, 2017 | Version v1
Journal article Open

Categorization of species based on their microRNAs employing sequence motifs, information-theoretic sequence feature extraction, and k-mers

  • 1. Zefat Acad Coll, Community Informat Syst, IL-13206 Safed, Israel
  • 2. Jacobs Univ Bremen, Transmiss Syst Grp TrSys, Bremen, Germany

Description

Background: Diseases such as cancer can manifest through changes in protein abundance, and microRNAs (miRNAs) play a key role in modulating protein levels. MiRNAs are found throughout all kingdoms, and viruses have been shown to exploit them to modulate their host environment. Because the experimental detection of miRNAs is difficult, computational detection methods have been developed. Many of these tools use machine learning to detect pre-miRNAs, and many features for parameterizing miRNAs have been proposed. Training machine learning models requires negative data, which is hard to come by; we therefore recently began using pre-miRNAs from one species as positive examples and pre-miRNAs from another species as negative examples, with features based on sequence motifs and k-mers. Here, we additionally introduce information-theoretic (IT) features.
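To illustrate the kind of sequence features mentioned above, the following is a minimal Python sketch, not the authors' implementation. It computes normalized k-mer frequencies for an RNA sequence and appends one simple information-theoretic quantity, the Shannon entropy of the k-mer distribution. The function names (`kmer_frequencies`, `shannon_entropy`, `feature_vector`), the choice of Shannon entropy as the IT feature, and the default `k=2` are assumptions made for illustration; the description does not specify which IT measures or k values were used.

```python
from collections import Counter
from itertools import product
from math import log2

ALPHABET = "ACGU"  # RNA alphabet; DNA-style "T" is mapped to "U" below


def kmer_frequencies(seq, k):
    """Return the normalized frequency of every possible k-mer in seq."""
    seq = seq.upper().replace("T", "U")
    all_kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    # Sliding window of width k; windows with ambiguous bases are skipped.
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(w for w in windows if set(w) <= set(ALPHABET))
    total = sum(counts.values())
    return {km: (counts[km] / total if total else 0.0) for km in all_kmers}


def shannon_entropy(freqs):
    """Shannon entropy (in bits) of a frequency distribution.

    Hypothetical stand-in for an information-theoretic feature.
    """
    return -sum(p * log2(p) for p in freqs.values() if p > 0)


def feature_vector(seq, k=2):
    """k-mer frequencies in sorted k-mer order, followed by their entropy."""
    freqs = kmer_frequencies(seq, k)
    return [freqs[km] for km in sorted(freqs)] + [shannon_entropy(freqs)]
```

In a two-species setup like the one described, vectors of this kind computed from one species' pre-miRNAs would be labeled positive and those from the other species negative before being passed to a classifier of one's choice.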

Files

bib-31a1fec9-5c14-4a47-b73f-9950ebd4f8a2.txt
