Published January 1, 2013 | Version v1
Journal article Open

TRAINER: A General-Purpose Trainable Short Biosequence Classifier

  • 1. Baskent Univ, Dept Comp Engn, TR-06490 Ankara, Turkey
  • 2. Middle E Tech Univ, Dept Chem, TR-06531 Ankara, Turkey

Description

Classifying sequences is one of the central problems in computational biosciences. Several tools have been released that map an unknown molecular entity to one of the known classes using solely its sequence data. However, all of the existing tools are problem-specific and restricted to an alphabet constrained by the relevant biological structure. Here, we introduce TRAINER, a new online tool designed to serve as a generic sequence classification platform that lets users provide their own training data, defined over any alphabet. TRAINER allows users to select among several feature representation schemes and supervised machine learning methods, along with their relevant parameters. Trained models can be saved so that other users can reuse them without retraining. Two case studies demonstrate effective use of the system on DNA and protein sequences: candidate effector prediction and nucleolar localization signal prediction. The biological relevance of the results is discussed.
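
As an illustration of the general idea described above (a feature representation computed over a user-supplied alphabet, followed by a supervised classifier), the following is a minimal sketch. It is not TRAINER's implementation: the k-mer frequency features, the nearest-centroid classifier, the function names, and the toy sequences and labels are all assumptions chosen for brevity.

```python
from collections import Counter
from itertools import product
import math


def kmer_features(seq, alphabet, k=2):
    """Return a normalized k-mer frequency vector over a user-supplied alphabet."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only k-mers made entirely of alphabet symbols contribute to the total.
    total = max(sum(counts[m] for m in kmers), 1)
    return [counts[m] / total for m in kmers]


def train_centroids(sequences, labels, alphabet, k=2):
    """Average the feature vectors of each class (hypothetical stand-in for a real learner)."""
    sums, sizes = {}, Counter(labels)
    for seq, label in zip(sequences, labels):
        vec = kmer_features(seq, alphabet, k)
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
    return {label: [v / sizes[label] for v in acc] for label, acc in sums.items()}


def classify(seq, centroids, alphabet, k=2):
    """Assign the class whose centroid is closest in Euclidean distance."""
    vec = kmer_features(seq, alphabet, k)
    return min(centroids, key=lambda label: math.dist(vec, centroids[label]))


# Toy training data (invented for illustration only).
alphabet = "ACGT"
train_seqs = ["ATATATAT", "TATATATA", "GCGCGCGC", "CGCGCGCG"]
train_labels = ["AT-rich", "AT-rich", "GC-rich", "GC-rich"]

model = train_centroids(train_seqs, train_labels, alphabet)
print(classify("ATATTATA", model, alphabet))
```

Because the alphabet is passed in as a parameter, the same sketch could be applied to protein sequences or any other symbol set by changing `alphabet` and the training data; the `model` dictionary could likewise be serialized and reused without retraining.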

Files

bib-8bf65fe6-916b-435b-99f4-7b382a0777f3.txt

Files (164 Bytes)

md5:50c7244928dd6f21e1cc275fbfad3953