Test Case Title

Discover non-coding RNA in yeast

Test Case Acronym


Test Case Class


Contact person

Stefan Astrom and Feng Lin



Test Case Description

We sequenced the genome of Kluyveromyces dobzhanskii, a non-model organism. As part of this endeavor, we have also very recently finished sequencing the polyA+ transcriptomes of K. dobzhanskii and K. lactis. When we compared the whole-genome sequences of the two species, we found more than fifty conserved sequences longer than 200 bp located in intergenic regions, whose function was unknown. We call these loci conserved non-coding sequences (NCSs). We deleted 10 of these NCSs in K. lactis and found that 5 of the resulting strains showed a phenotype different from wild type. Hence, several of the NCSs are functionally important. Further analysis showed that several NCSs define extended 3’ UTRs of the flanking genes. However, others do not, and we hope to find non-coding RNAs among these elements. So what exactly are these NCSs, and how many functional elements exist in the intergenic regions of the whole genome?

Finally, translational frameshifting is an alternative mode of translation in which the ribosome slides back one nucleotide (-1 frameshifting) or skips one nucleotide (+1 frameshifting). We have identified a gene in K. lactis that contains a programmed frameshift of a novel type. Programmed frameshifts require a so-called slippery site, and the slippery site we identified differs from the previously published sites.
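
The effect of a -1 or +1 frameshift on the translated product can be illustrated with a small sketch. It assumes the standard genetic code and a made-up toy sequence; the function name `translate` and its parameters `shift_at` and `shift` are hypothetical and not part of any existing tool or of our data.

```python
# Standard genetic code in the conventional T, C, A, G ordering (assumption:
# the standard nuclear code applies).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq, shift_at=None, shift=0):
    """Translate seq starting at position 0 until a stop codon or the end.

    shift_at: 0-based position (a multiple of 3, at least 3) at which the
              ribosome changes frame instead of reading the next codon.
    shift:    -1 to slide back one nucleotide, +1 to skip one nucleotide.
    """
    if shift_at is not None and (shift_at < 3 or shift_at % 3 != 0):
        raise ValueError("shift_at must be a multiple of 3 and at least 3")
    protein = []
    pos = 0
    shifted = False
    while pos + 3 <= len(seq):
        if shift_at is not None and not shifted and pos == shift_at:
            pos += shift          # change the reading frame exactly once
            shifted = True
            continue
        amino_acid = CODON_TABLE[seq[pos:pos + 3]]
        if amino_acid == "*":     # stop codon ends translation
            break
        protein.append(amino_acid)
        pos += 3
    return "".join(protein)


toy = "ATGAAATAGGCTTAA"                          # made-up sequence, not real data
zero_frame = translate(toy)                      # stops at the in-frame TAG
minus_one = translate(toy, shift_at=6, shift=-1)
plus_one = translate(toy, shift_at=6, shift=+1)
```

In this toy example the unshifted reading frame terminates early, while both shifted frames read through the stop codon and yield longer products. This is the kind of outcome one would check for when asking whether slippage can generate a novel gene product.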

Background knowledge

K. dobzhanskii is a species closely related to Kluyveromyces lactis. K. lactis, also called milk yeast, must be considered a model organism, as many groups worldwide work with it. The K. lactis genome is relatively well annotated. The general idea of our work is to establish K. lactis and K. dobzhanskii as a pair of organisms that can be compared to learn more about how yeast genomes evolved. As an example, conserved regulatory sequences can be identified by comparing these two genomes (i.e. phylogenetic footprinting). Similar work has been performed with S. cerevisiae and close relatives (e.g. S. bayanus), but since the Kluyveromyces yeasts are separated from the Saccharomyces yeasts by >100 × 10^6 years of evolution, we argue that novel and interesting discoveries can be made.

Initial state of the Test case

To address these issues we hope to obtain bioinformatics help. One issue is to separate 5’ and 3’ UTRs from distinct intergenic transcripts. Next, the distinct intergenic transcripts must be filtered by BLAST and Pfam searches to exclude the short protein-coding transcripts that undoubtedly exist among them. We also need help estimating the quality of the RNA-seq data. Part of the data has been mapped back to the genome sequences. We noticed that some sequences in the intergenic regions are expressed, at variable levels. We need bioinformatic methods for a whole-genome analysis that predicts non-coding RNAs or other functional elements. We are also interested in general differences in expression patterns between the two genomes.
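
The first step, separating UTR extensions from distinct intergenic transcripts, could be approached purely by coordinates. The following is a minimal sketch under several assumptions: expressed intergenic blocks have already been called and carry a strand, a block counts as a UTR candidate when it directly abuts an annotated gene on the same strand (within `max_gap` bases), and all names (`Feature`, `classify_block`, `geneA`, `geneB`) and coordinates are hypothetical.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    name: str
    start: int    # 1-based, inclusive
    end: int      # 1-based, inclusive
    strand: str   # "+" or "-"


def classify_block(block, genes, max_gap=0):
    """Label an expressed intergenic block relative to annotated genes.

    Returns (label, gene_name): label is "3' UTR", "5' UTR" or
    "distinct intergenic"; gene_name is None for distinct blocks.
    """
    for gene in genes:
        if gene.strand != block.strand:
            continue
        # Block starts just after the gene's highest coordinate.
        after_gene = 0 <= block.start - gene.end - 1 <= max_gap
        # Block ends just before the gene's lowest coordinate.
        before_gene = 0 <= gene.start - block.end - 1 <= max_gap
        if after_gene:
            return ("3' UTR" if gene.strand == "+" else "5' UTR"), gene.name
        if before_gene:
            return ("5' UTR" if gene.strand == "+" else "3' UTR"), gene.name
    return "distinct intergenic", None


# Made-up annotation and expressed blocks, for illustration only.
genes = [Feature("geneA", 100, 500, "+"), Feature("geneB", 2000, 2600, "-")]
blocks = [
    Feature("b1", 501, 700, "+"),     # directly downstream of geneA
    Feature("b2", 2601, 2700, "-"),   # directly upstream of geneB
    Feature("b3", 1000, 1300, "+"),   # isolated
    Feature("b4", 1900, 1999, "-"),   # directly downstream of geneB
]
labels = {b.name: classify_block(b, genes) for b in blocks}
```

Only the blocks labelled "distinct intergenic" would then go on to the BLAST and Pfam filtering. If the RNA-seq libraries are not strand-specific, the strand test would have to be dropped or replaced by other evidence.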

Further, we would like to analyze this novel slippery site with respect to its presence in genes and whether slippage can generate a novel gene product.
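
A genome-wide search for occurrences of a slippery site in coding sequences could start from a simple motif scan. The sketch below is only an illustration: the sequence of the novel site is not given here, so `PLACEHOLDER_MOTIF` is an arbitrary stand-in pattern, and the function name `find_motif_sites` and the toy coding sequence are likewise hypothetical.

```python
import re

# Arbitrary stand-in pattern, NOT the slippery site identified in K. lactis.
PLACEHOLDER_MOTIF = "A{3}T{3}[ACT]"


def find_motif_sites(cds, motif, frame=None):
    """Return 0-based start positions of all motif hits in a coding sequence.

    Overlapping hits are reported. If frame is 0, 1 or 2, only hits whose
    start position falls in that codon position are kept.
    """
    # The lookahead lets the regex engine report overlapping matches; the
    # non-capturing group keeps any group numbering inside motif intact.
    pattern = re.compile("(?=(?:" + motif + "))")
    hits = [match.start() for match in pattern.finditer(cds.upper())]
    if frame is not None:
        hits = [hit for hit in hits if hit % 3 == frame]
    return hits


toy_cds = "ATGGCAAATTTCGGAAATTTATAA"   # made-up sequence, not real data
all_hits = find_motif_sites(toy_cds, PLACEHOLDER_MOTIF)
hits_frame2 = find_motif_sites(toy_cds, PLACEHOLDER_MOTIF, frame=2)
hits_frame0 = find_motif_sites(toy_cds, PLACEHOLDER_MOTIF, frame=0)
```

Restricting hits to one codon position matters because a slippery site is only expected to act when it sits in a particular register relative to the reading frame. Each hit could then be passed to a shifted-frame translation to see what product would result.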

Desired final state of the Test Case


Test Case Work Plan



LF: complex case mixing different questions; it needs to be more clearly defined or split into several individual test cases.

public/loadedtestcases/tc8.txt · Last modified: 2012/09/28 16:36 by lfalquet