[WormBook, 2006]
Throughout the C. elegans sequencing project, Genefinder was the primary protein-coding gene prediction program. These initial predictions were manually reviewed by curators as part of a "first-pass annotation" and continue to be actively curated by WormBase staff using a variety of data and information. In the WormBase data release WS133 there are 22,227 protein-coding genes, including 2,575 alternatively spliced forms. Twenty-eight percent of these have every base of every exon confirmed by transcription evidence, while an additional 51% have some bases confirmed. Most of the genes are relatively small, covering a genomic region of about 3 kb. The average gene contains 6.4 coding exons; coding exons account for about 26% of the genome. Most exons are small and separated by small introns. The median size of exons is 123 bases, while the most common size for introns is 47 bases. Protein-coding genes are denser on the autosomes than on chromosome X, and denser in the central region of the autosomes than on the arms. There are only 561 annotated pseudogenes, but several estimates put this number much higher.
[1985]
Myosins from slime molds to brain cells show a remarkable commonality of general molecular properties. These characteristics include two globular domains or heads that contain ATPase and actin-binding sites and the fibrous, coiled-coil α-helical rod that interacts with other molecules in assembly. Two heavy chains (m.w. 200,000) contribute to both heads, whereas two kinds of light chains bind to each head. In this paper, we consider striated muscles and their myosins. The phylogenetically distant nematode body-wall muscles and rabbit fast skeletal muscles produce myosin heavy chains, with about 47% of the amino acid sequences in the heads and 37% of the amino acids in the rod being identical (Karn et al. 1984). Myosin heavy chains are therefore highly conserved proteins. Contrasting with the phylogenetic conservation of myosin structure and sequence is the diversity of supramolecular arrangements of myosin assemblies in striated muscles, the so-called thick filaments. The lengths of thick filaments range from 1.55 µm in vertebrates, 2-4 µm in insect flight muscles, 10 µm in the nematode to 40 µm in certain mollusks. The average diameters of these filaments range from about 15 nm in vertebrates, 20 nm in insects, 25 nm in nematodes to 50-100 nm in some molluscan muscles. The surface arrangements of the myosin heads also vary in these different species. The lattice arrangements between thick filaments and the interdigitating, actin-containing thin filaments differ in terms of symmetry and thick:thin stoichiometry between these muscles. It appears likely that other protein components of these muscles interact with the very similar myosins to produce this structural diversity. The relatively subtle differences between myosin isoforms may also be important in these interactions.
We define an isoform, taking myosin as the example, as a protein that qualifies as a myosin by biochemical criteria but that can be distinguished on the basis of intrinsic molecular structure from another myosin within the same organism. In this paper, we describe experiments suggesting that two genetically different isoforms of myosin play distinct roles in concert with other proteins during the assembly of thick filaments in