GenomeRunner web

dbcreator_ucsc and dbcreator_encodeDCC, modules for creating organism-specific genome annotation databases

Create the database(s) first, before running GenomeRunner.

GenomeRunner started with the UCSC genome annotation database. This database remains the main source of well-organized and validated genome annotation data. Other data sources, such as pilot data from the ENCODE Data Coordination Center or from the Cistrome database, are in the process of being incorporated into GenomeRunner's framework.

Currently, GenomeRunner has two submodules for processing genome annotation data:

dbcreator_ucsc - processes the UCSC genome annotation database
dbcreator_encodeDCC - processes data from the ENCODE Data Coordination Center (experimental)

The main goal of both submodules is to convert different data formats into standard .BED files and organize them into tree-like categories (subfolders). Because the two submodules process different data and use different categorization schemes, keep their output separate by giving each submodule its own [dir] folder.
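
To make the conversion step concrete, here is a minimal Python sketch of writing annotation records as a 6-column .BED file inside a category subfolder. It is an illustration only, not the actual dbcreator code: the function name write_bed, its parameters, and the example category names are hypothetical, and the input records are assumed to already use BED coordinates (0-based start, end not included).

```python
import os


def write_bed(records, out_dir, category, track_name):
    """Write records as a 6-column BED file under out_dir/<category...>/.

    Hypothetical helper for illustration; not part of GenomeRunner.
    records:  iterable of (chrom, start, end, name, score, strand) tuples,
              assumed to be in BED coordinates already.
    category: list of subfolder names, e.g. ["genes", "demo"].
    """
    # Build the tree-like category path and create it if needed.
    category_dir = os.path.join(out_dir, *category)
    os.makedirs(category_dir, exist_ok=True)

    bed_path = os.path.join(category_dir, track_name + ".bed")

    # Sort by chromosome name (as text), then start, then end.
    ordered = sorted(records, key=lambda r: (r[0], r[1], r[2]))

    with open(bed_path, "w") as handle:
        for chrom, start, end, name, score, strand in ordered:
            fields = [chrom, str(start), str(end), name, str(score), strand]
            handle.write("\t".join(fields) + "\n")

    return bed_path
```

Calling write_bed once per track, with a separate out_dir for each data source, would keep the two categorization schemes from mixing, in line with the recommendation above.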