There will be limited user support for Galaxy-qld during the workshop.
On November 4, 2017 we are holding a Galaxy training at UQCCR, Herston, for participants of the Practical Microbial Genomics Workshop. The training is based on the "Public data → assembly, annotation, MLST" tutorial created by Melbourne Bioinformatics.
Slides for the workshop: pdf
Because of time constraints we will use files from a data library available on the server. Instructions for data import: pdf
We noticed that with the default settings RNA_STAR cannot map the majority of reads in some RNA-Seq datasets from Arabidopsis. Here is an extract from the log file:
Uniquely mapped reads % | 3.27%
% of reads mapped to multiple loci | 10.06%
% of reads unmapped: too short | 86.64%
Read a detailed explanation of ‘too short’ classification from Alexander Dobin.
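To compare runs before and after changing the settings, it can help to pull these percentages out of the log programmatically. The following is a minimal sketch, assuming the log lines have the "name | value%" shape shown in the extract above; the function name `parse_star_summary` is made up for this example and is not part of RNA_STAR or Galaxy.

```python
def parse_star_summary(text):
    """Return {metric name: percentage as float} for lines shaped like 'name | 12.34%'."""
    metrics = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # skip anything that is not a 'name | value' line
        name, _, value = line.partition("|")
        value = value.strip()
        if value.endswith("%"):
            metrics[name.strip()] = float(value[:-1])
    return metrics


# The extract quoted in this post, one metric per line.
extract = """\
Uniquely mapped reads % | 3.27%
% of reads mapped to multiple loci | 10.06%
% of reads unmapped: too short | 86.64%
"""

summary = parse_star_summary(extract)
mapped = (summary["Uniquely mapped reads %"]
          + summary["% of reads mapped to multiple loci"])
print(f"mapped: {mapped:.2f}%")
print(f"unmapped (too short): {summary['% of reads unmapped: too short']:.2f}%")
```

Running the same function on the log from a re-run with relaxed settings gives a quick before/after comparison.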
The proportion of mapped reads can be increased by modifying the alignment settings. Note that the procedure is described for RNA STAR Gapped-read mapper for RNA-seq data (Galaxy Version 2.4.0d-2). For additional information, check relevant RNA_STAR threads, such as this one.
Set Would you like to set output parameters (formatting and filtering)? to Yes.
Set Would you like to set additional output parameters (formatting and filtering)? to Yes.
Reduce the default 0.66 value for the following filter options:
Minimum alignment score, normalized to read length (--outFilterScoreMinOverLread)
Minimum number of matched bases, normalized to read length (--outFilterMatchNminOverLread)
(can be 0)
Set Other parameters (seed, alignment, and chimeric alignment) to Yes
Set Would you like to set alignment parameters? to Yes
Reduce the value for Minimum mapped length for a read mate that is spliced, normalized to mate length (--alignSplicedMateMapLminOverLmate) from the default 0.66 to something smaller.
Inspect the alignment to make sure you are happy with the mapping.
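For readers who run STAR outside Galaxy, the three form fields above correspond to the STAR options named in parentheses. The sketch below only assembles those options into an argument list. The helper name `relaxed_filter_args` and the example value 0.3 are illustrative assumptions, not recommended settings, and the exact command line the Galaxy wrapper builds may differ.

```python
def relaxed_filter_args(score_min, match_min, spliced_mate_min):
    """Build the STAR options discussed above; each value must lie in [0, 1]."""
    values = {
        "--outFilterScoreMinOverLread": score_min,
        "--outFilterMatchNminOverLread": match_min,
        "--alignSplicedMateMapLminOverLmate": spliced_mate_min,
    }
    args = []
    for option, value in values.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{option} must be between 0 and 1, got {value}")
        args += [option, str(value)]
    return args


# 0.3 is an arbitrary example below the 0.66 default, not a recommendation.
args = relaxed_filter_args(0.3, 0.3, 0.3)
print(" ".join(args))
```

The resulting list can be appended to an existing STAR command. Lower thresholds admit shorter alignments, so inspecting the result remains important.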
Another Galaxy workshop from QFAB Bioinformatics is scheduled for October 11-12, 2017.
Title: Variant detection using Galaxy
Venue: Room 3.141, Queensland Bioscience Precinct, The University of Queensland, St Lucia
Starts Wed, 11/10/2017, 09:00; ends Thu, 12/10/2017, 12:30.
Registration is essential.
Our IT provider requested a temporary shutdown of Galaxy-qld on Monday, September 11, 2017, around 9 am Brisbane time, to fix a hardware fault. We understand the repair may take hours, but the exact duration is unknown. Updates on the situation are available through the GVL-Qld Twitter account @GVL_QLD.
Galaxy-qld will not accept new jobs from September 9.
The event will not affect user data.
UPDATE. September 11, 3:05 pm. Galaxy-qld is back online. Initial tests indicate the server is fully functional.
When: Wed, 13/09/2017 – 09:00 to Thu, 14/09/2017 – 12:30.
Where: MultiMedia Room 3.141, Queensland Bioscience Precinct (building 80), The University of Queensland, St Lucia.
Cost: $25. Registration is essential.
Participants need to bring a wi-fi-enabled laptop capable of connecting to eduroam. The room is equipped with power points for every participant.
Open the file in a new tab by clicking on the link above. Switch to the new tab and download the file to your computer.
A new annotation of the Arabidopsis thaliana genes, Araport11, was recently added to the Arabidopsis thaliana gene annotations data library on Galaxy-qld. The dataset was imported from ARAPORT and modified for compatibility with the existing Arabidopsis assemblies. This post provides an overview of A. thaliana resources on Galaxy-qld.
The very first version of Galaxy-qld had the TAIR9 assembly represented by five chromosomes, with the following contig names: chr1, chr2, chr3, chr4 and chr5. It does not include the mitochondrial or chloroplast genomes.
Later, at the request of our users, we added the TAIR10 gene annotation to the Arabidopsis thaliana gene annotations data library. This annotation includes genes from Mt and Pt. It uses plain numbers (1, 2, 3, 4, 5) for chromosome names. The TAIR10 genomic sequence is identical to TAIR9 (link). To provide our users with greater flexibility we added TAIR10 aligner indices to Galaxy-qld. The TAIR10 assembly contains the following contigs: 1, 2, 3, 4, 5, Mt and Pt.
The Araport11 gene annotation is based on the TAIR10 genome assembly (link), which is identical to the TAIR9 assembly. The original annotation comes with the following contig names: Chr1, Chr2, Chr3, Chr4, Chr5, ChrC, ChrM. [no comments on standard nomenclature here] To make the Araport11 annotation compatible with the TAIR9 assembly available on Galaxy-qld we replaced 'Chr' with 'chr'. To make it compatible with TAIR10, we removed 'Chr' from the contig names, replaced ChrC and ChrM with Pt and Mt, respectively, and sorted the records in the same order as in the TAIR10 assembly: 1, 2, 3, 4, 5, Mt, Pt. The modified annotation is available in the Arabidopsis thaliana gene annotations data library under the name Araport11_GFF3_genes_transposons.201606.modified.gtf.
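The renaming described above can be sketched in a few lines of Python. This is an illustration of the idea, not the script used to produce the file in the data library: the function names and the demo records are made up, and it assumes a tab-separated GTF/GFF3 file with the contig name in the first column.

```python
TAIR10_ORDER = ["1", "2", "3", "4", "5", "Mt", "Pt"]
ORGANELLES = {"ChrC": "Pt", "ChrM": "Mt"}


def to_tair9_name(contig):
    """Chr1 -> chr1 (replace the 'Chr' prefix with 'chr')."""
    return "chr" + contig[3:] if contig.startswith("Chr") else contig


def to_tair10_name(contig):
    """Chr1 -> 1, ChrC -> Pt, ChrM -> Mt."""
    if contig in ORGANELLES:
        return ORGANELLES[contig]
    return contig[3:] if contig.startswith("Chr") else contig


def convert_for_tair10(lines):
    """Rename contigs and sort records in TAIR10 contig order; comment lines stay on top."""
    comments = [line for line in lines if line.startswith("#")]
    records = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.split("\t")
        fields[0] = to_tair10_name(fields[0])
        records.append(fields)
    rank = {name: i for i, name in enumerate(TAIR10_ORDER)}
    # The sort is stable, so records keep their original order within a contig.
    # Contigs not listed in TAIR10_ORDER are placed last.
    records.sort(key=lambda fields: rank.get(fields[0], len(rank)))
    return comments + ["\t".join(fields) for fields in records]


# Made-up records for demonstration only; they are not taken from Araport11.
demo = [
    "ChrC\tdemo\tgene\t1\t10\t.\t+\t.\tgene_id \"a\";",
    "Chr2\tdemo\tgene\t1\t10\t.\t+\t.\tgene_id \"b\";",
    "ChrM\tdemo\tgene\t1\t10\t.\t+\t.\tgene_id \"c\";",
    "Chr1\tdemo\tgene\t1\t10\t.\t+\t.\tgene_id \"d\";",
]
converted = convert_for_tair10(demo)
print([line.split("\t")[0] for line in converted])
```

Matching contig names between annotation and assembly matters because tools that combine the two generally compare the names as literal strings.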