FAQ


What is the difference between Trinity and Trans-ABySS?
Here is a paper from 2011 comparing the two: Optimizing de novo transcriptome assembly from short-read RNA-Seq data: a comparative study

The paper pointed out the pros and cons of using Trinity, Trans-ABySS, and other assemblers. The authors compiled a list of guidelines at the end of their paper that sums up their comparison of the programs:

1) Generally, the MK (multiple k-mer) approach should be considered to achieve better assembly results.

2) Trinity is the best SK (single k-mer) assembler for transcriptome assembly for both small and large data sets across various conditions. But don’t choose Trinity if a long running time is to be avoided. ["Whereas ABySS showed a good balance between memory usage and runtime"]

3) Oases-MK and Trans-ABySS produce the most diverse long transcripts. But one must avoid Oases if machine memory is limited.

4) [don't use SOAPdenovo]

5) A large data set can be divided into a series of 0.5, 1, 3G subsets to test for the optimal conditions for assembly.

6) When designing a transcriptome study, 100× average coverage of the estimated size of the expressed transcripts is usually recommended as a starting point for de novo assembly.
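Guidelines 5 and 6 come down to simple arithmetic on read counts and base totals. The sketch below is only an illustration: it assumes "G" in guideline 5 means gigabases (the list does not say), and the function names and the in-memory list of `(header, sequence, quality)` tuples are hypothetical choices, not anything from the paper.

```python
import random


def reads_needed(transcriptome_size_bp, read_length, coverage=100):
    """Number of reads required to reach the given average coverage."""
    total_bases = transcriptome_size_bp * coverage
    # Ceiling division so the target coverage is met, not undershot.
    return -(-total_bases // read_length)


def subsample_reads(records, target_bases, seed=0):
    """Randomly pick reads until roughly target_bases bases are collected.

    records: list of (header, sequence, quality) tuples (assumed layout).
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    picked, total = [], 0
    for record in shuffled:
        if total >= target_bases:
            break
        picked.append(record)
        total += len(record[1])
    return picked


# Assumed subset sizes for guideline 5, in bases (0.5, 1 and 3 gigabases).
SUBSET_SIZES = [int(0.5e9), int(1e9), int(3e9)]
```

For example, `reads_needed(30_000_000, 100)` gives the read count for 100× coverage of an estimated 30 Mb of expressed transcripts with 100 bp reads. A real run would stream FASTQ files instead of holding all reads in memory.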

'''Boiled down, it seems that we should use Trinity and, if its longer runtime is an issue, Trans-ABySS. Others have suggested doing both and pooling the results.''' The paper also noted room to optimize Trinity further:

1) "Jobs from Butterfly module could be distributed in clusters using a job array, which could greatly reduce the running time for this step."

2) "Trinity had a “--jaccard_clip” option that was recommended for gene dense genome with lots of transcripts overlapping on the same strand. ...The option [can] significantly reduced the number of fused genes "

3) "There can be further improvement if MK strategy is applied to Trinity. However, the application is limited to its long runtime and fixed k-mer value, so it is impractical to apply MK strategy to Trinity with the current version."

4) Including strand-specific information decreased the number of fused genes reported.
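Point 1 above (spreading Butterfly jobs over a job array) amounts to giving each array task its own slice of a command list. A minimal sketch of that idea follows. The environment variable name `ARRAY_TASK_ID` is a placeholder, since each scheduler uses its own name, and the command strings are made up for illustration; nothing here is Trinity's own mechanism.

```python
import os


def commands_for_task(commands, task_index, n_tasks):
    """Return the commands assigned to one array task (round-robin split)."""
    if n_tasks < 1 or not 0 <= task_index < n_tasks:
        raise ValueError("task_index must be in the range [0, n_tasks)")
    # Task i takes commands i, i + n_tasks, i + 2 * n_tasks, ...
    return commands[task_index::n_tasks]


def task_index_from_env(var_name="ARRAY_TASK_ID", default=0):
    """Read this task's index from the environment (placeholder variable name)."""
    return int(os.environ.get(var_name, default))


if __name__ == "__main__":
    # Hypothetical command list; in practice this would be read from a file.
    all_commands = ["butterfly_job_%d" % i for i in range(7)]
    mine = commands_for_task(all_commands, task_index_from_env(), n_tasks=3)
    for command in mine:
        print(command)
```

Round-robin assignment keeps the tasks roughly balanced even when the command list is ordered by size.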

Blast2GO: Its shortcomings and advantages
An introduction: Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research

In that paper (2005), they concluded that "by joining annotation to function analysis B2G provides a powerful data mining tool ideally suited to support genomic research in non-model species. Its species-independent character and different data input fronts makes it a valuable mining resource for potentially any organism. B2G combines high-throughput analysis, statistical evaluation and biology framed visualization with a high degree of user interaction... "

However, researchers at the Hollings Marine Laboratory have expressed disappointment with the program, saying that it is "fundamentally broken". Judging from online forums, some other users seem to share that sentiment. There do not seem to be many alternatives to Blast2GO, except for InterProScan and Apollo. Apollo works better with genome-wide annotation and B2G works better per protein.

If we decide to stick with Blast2GO, we could configure B2G on our cluster and have it run via our Galaxy instance. By running on the cluster, the BLAST searches can be run in parallel. Additionally, we can have a dedicated B2G database, which will work a lot faster than accessing the B2G database in Spain. According to a Galaxy developer, wrapper files for B2G have already been written.
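Running the BLAST searches in parallel usually means splitting the query FASTA into chunks and searching each chunk as a separate job. The sketch below shows only the splitting step; the function names are hypothetical, and how the chunks get submitted (Galaxy, a scheduler, or something else) is left open.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, seq_lines = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(seq_lines)))
            header, seq_lines = line[1:], []
        elif header is not None:
            seq_lines.append(line)
    if header is not None:
        records.append((header, "".join(seq_lines)))
    return records


def split_into_chunks(records, n_chunks):
    """Distribute records round-robin over up to n_chunks non-empty chunks."""
    chunks = [[] for _ in range(n_chunks)]
    for i, record in enumerate(records):
        chunks[i % n_chunks].append(record)
    return [chunk for chunk in chunks if chunk]


def to_fasta(records):
    """Serialize (header, sequence) tuples back to FASTA text."""
    return "".join(">%s\n%s\n" % (header, seq) for header, seq in records)
```

Each chunk can then be written to its own file and searched independently, with the result files concatenated afterwards.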

Annotation: dolphin or non-model organisms? BLAST? Blast2GO?

Gene IDs, gene symbols, ontologies, pathways: the answer might be R or Python scripts that automate this.
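As a rough idea of what such a script could look like in Python: take the best BLAST hit per query and look up its gene symbol. This sketch assumes BLAST's standard 12-column tabular output and a lookup table from subject ID to gene symbol that we would have to build ourselves; both function names and the example IDs are hypothetical.

```python
import csv
import io

# Column order of BLAST's standard 12-column tabular output.
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(blast_tsv_text):
    """Map each query ID to the subject ID of its highest-bitscore hit."""
    best = {}
    for row in csv.reader(io.StringIO(blast_tsv_text), delimiter="\t"):
        if len(row) < len(BLAST_COLUMNS) or row[0].startswith("#"):
            continue  # skip blank, comment, or malformed lines
        hit = dict(zip(BLAST_COLUMNS, row))
        score = float(hit["bitscore"])
        query = hit["qseqid"]
        if query not in best or score > best[query][1]:
            best[query] = (hit["sseqid"], score)
    return {query: subject for query, (subject, _) in best.items()}


def annotate(query_to_subject, subject_to_symbol):
    """Attach a gene symbol to each query; None when the subject is unmapped."""
    return {
        query: subject_to_symbol.get(subject)
        for query, subject in query_to_subject.items()
    }
```

Ontology and pathway lookups would follow the same pattern, each with its own mapping table.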

SNPs, genome-wide association studies: software/tools/algorithms

Jill – 48 samples

Long range equilibrium??

Alternative splicing: Cufflinks, but is there something else?

RSEM

Blast2GO annotation is crappy

Modified Modularity Clustering

Using dolphin genome as scaffold?

Run dolphin multiple ways

Send Jill e-mail with e-mails of students

Dropbox storage data