Recent data generated by the ENCODE project[Birney, Nature, 2007] and others[McGaughey, Genome Res, 2008] have revealed that sequence alignment algorithms detect functional regulatory DNA sequence elements with low accuracy (~50%). To detect functional DNA sequence elements more accurately, we have developed an alignment-independent, Gibbs-sampling-based algorithm that weights over-representation and evolutionary conservation equally to detect conserved groups of short DNA regulatory elements in orthologous regions. We have optimized this algorithm on synthetic sequence data, where we have shown that it can identify seeded motif sites with up to 25% higher sensitivity and accuracy than alignment-based approaches[Siddharthan, PLoS Comput Biol, 2005;Sinha, BMC Bioinformatics, 2004;Wang, Bioinformatics, 2003], especially when the compared species are as diverged as C. elegans, C. remanei, C. briggsae, and C. brenneri. Our algorithm is also more sensitive in detecting shared regulatory elements in genes that are bound by a known transcription factor (TF)[Harbison, Nature, 2004;Li, PLoS Biol, 2008]. Our approach joins similar methods advocating an alignment-free approach for assessing conservation[Ward, Bioinformatics, 2008], with the distinction that our algorithm is capable of de novo motif discovery without prior knowledge of TF specificity. We have identified four significantly conserved motifs in the ribosomal protein (RP) promoters of C. elegans, C. remanei, C. briggsae, and C. brenneri. Using promoter::mCherry reporters, we show that these motifs are necessary for proper RP expression, not only in C. elegans but also in C. briggsae. While often considered to be housekeeping genes, RPs are tightly regulated temporally and in a tissue-specific manner in worms, mice, and humans, but the pathways and TFs that regulate RP biosynthesis are unknown.
The tight regulatory control of RPs is corroborated by recent findings that improper expression of RPs and ribosomal RNA leads to diverse developmental defects: 1) heterozygous RP mutant flies exhibit a "minute" phenotype[Marygold, Genome Biol, 2007], 2) reduced rRNA expression leads to defects in glp-1/Notch signaling in the C. elegans gonad[Voutev, Dev Biol, 2006], 3) decreased levels of rRNA cause increased apoptosis in the zebrafish CNS and embryonic lethality[Azuma, J Biol Chem, 2006], and 4) decreased levels of RPS3 due to a promoter mutation result in defects in gonadogenesis in flies[Saeboe-Larssen, Genetics, 1998]. We hypothesize that defects in ribosomal protein regulation will have diverse phenotypes in a wide range of translationally challenged cells.