The principle of EST-CDS-based prediction of splice junctions (SJs)

After splicing, the junctions between joined exons in the EST sequences of an organism can be deduced by aligning those cDNA sequences to the corresponding homologous genes of a related, annotated organism.
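The idea can be illustrated with a minimal Python sketch. It is not the PSJ implementation: the function names are hypothetical, the exon lengths and coordinates are made up, and the alignment is assumed to be ungapped and on the plus strand, with 1-based inclusive coordinates as reported by BLAST.

```python
from itertools import accumulate


def cds_junctions(exon_lengths):
    """Return junction positions in an annotated CDS.

    A junction at position j lies between CDS base j and base j + 1
    (1-based). The end of the last exon is not a junction.
    """
    return list(accumulate(exon_lengths))[:-1]


def project_junctions(junctions, q_start, s_start, s_end):
    """Map reference junctions onto the aligned query (EST).

    Assumes an ungapped, plus-strand alignment in which query base
    q_start pairs with subject base s_start. A junction is kept only
    if both flanking bases (j and j + 1) lie inside the aligned
    subject interval [s_start, s_end].
    """
    offset = q_start - s_start
    return [j + offset for j in junctions if s_start <= j < s_end]


# Hypothetical annotated CDS with three exons of 120, 85 and 200 bp.
ref_junctions = cds_junctions([120, 85, 200])

# Hypothetical hit: EST base 10 aligns to CDS base 100, up to CDS base 250.
est_junctions = project_junctions(ref_junctions, q_start=10, s_start=100, s_end=250)
```

Here the reference junctions fall after CDS bases 120 and 205, and both are covered by the hit, so they are transferred to the EST with the alignment offset. Real alignments contain gaps, so a production version would have to walk the alignment column by column rather than apply a single offset.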


The flow chart of PSJ

  • The first part is to extract the splice junctions (SJs) of 21 model plants from their annotation files, and then to build an SJ database file that serves as the reference file and contains the annotated SJs in the CDS sequences of these 21 model plants.
  • The second part is to align the CDS sequences of the model plants against the reference file using BLASTN. Using a Perl script, we retained the target segments with a match rate greater than 80% and an alignment length greater than the average exon length. In total, 8,337,648 SJs were predicted, an abundance of 9.56 SJs/kb, whereas only 3,199,967 SJs were annotated, far fewer than predicted. Detailed information on these predicted SJs can be retrieved by clicking the phylogenetic tree on the PSJ statistics page or search page.
  • The third part is to predict the SJs of 242 non-model plants. With the reference file as the subject sequences and the EST sequences of the 242 non-model plants as the query sequences, alignment was performed using BLASTN. Alignments were retained if the match rate was greater than 80% and the match length was longer than 200 bp. A total of 37,263,648 SJs were predicted in the non-model plants, an abundance of 7.96 SJs/kb. An overview of the results is displayed on the statistics page of the PSJ database.
  • The fourth part is to analyze the accuracy and coverage of the SJs, summarize them, and provide an online SJ mining and analysis tool.
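
The screening step described above can be sketched as follows. This is a Python stand-in, not the Perl script used for PSJ. It assumes BLASTN was run with the standard 12-column tabular output (`-outfmt 6`) and that "match rate" corresponds to the percent-identity column; both are assumptions, as are the function names and the sample hits.

```python
def parse_hits(lines):
    """Parse BLAST tabular lines (-outfmt 6, default 12 columns)."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        f = line.split("\t")
        yield {
            "qseqid": f[0],
            "sseqid": f[1],
            "pident": float(f[2]),  # percent identity
            "length": int(f[3]),    # alignment length
            "qstart": int(f[6]),
            "qend": int(f[7]),
            "sstart": int(f[8]),
            "send": int(f[9]),
        }


def filter_hits(hits, min_identity=80.0, min_length=200):
    """Keep hits strictly above both thresholds.

    The defaults follow the thresholds quoted for the non-model plants;
    for the model plants, min_length would be the average exon length.
    """
    return [h for h in hits
            if h["pident"] > min_identity and h["length"] > min_length]


def sj_abundance_per_kb(n_junctions, total_bases):
    """Number of junctions per 1,000 bases of sequence."""
    return n_junctions / (total_bases / 1000)


# Made-up example hits, for illustration only.
sample = [
    "est1\tcds1\t95.0\t300\t15\t0\t1\t300\t50\t349\t1e-100\t400",
    "est2\tcds1\t79.5\t400\t82\t0\t1\t400\t10\t409\t1e-50\t250",
    "est3\tcds2\t90.0\t150\t15\t0\t1\t150\t20\t169\t1e-40\t180",
]
kept = filter_hits(parse_hits(sample))
```

Of the three sample hits, only `est1` passes: `est2` falls below the identity threshold and `est3` is shorter than 200 bp. The abundance helper only shows how a per-kb figure of the kind quoted above is formed; the total sequence lengths behind the reported values are not given here.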

Copyright©2014 : Shandong Agricultural University PostCode: 271018