We noticed that, with the default settings, RNA_STAR cannot map the majority of reads in some Arabidopsis RNA-Seq datasets. Here is an extract from the log file:
Uniquely mapped reads % | 3.27%
% of reads mapped to multiple loci | 10.06%
% of reads unmapped: too short | 86.64%
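Read a detailed explanation of the ‘too short’ classification from Alexander Dobin. In short, it means that the aligned portion of a read was too short relative to the read length, not that the read itself was short.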
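To compare runs before and after changing the settings, it can help to pull these percentages out of the log programmatically. The following is a minimal sketch, assuming the log lines follow the "name | value" layout shown in the extract above; the function names parse_star_log and percent are made up for this illustration and are not part of RNA_STAR or Galaxy.

```python
def parse_star_log(text):
    """Collect 'name | value' lines from a STAR log extract into a dict."""
    stats = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # skip blank lines and section headers
        key, value = line.split("|", 1)
        stats[key.strip()] = value.strip()
    return stats


def percent(value):
    """Convert a string such as '3.27%' to the float 3.27."""
    return float(value.rstrip("%"))


# The extract quoted above, one field per line.
log_extract = """\
Uniquely mapped reads % | 3.27%
% of reads mapped to multiple loci | 10.06%
% of reads unmapped: too short | 86.64%
"""

stats = parse_star_log(log_extract)
mapped = (percent(stats["Uniquely mapped reads %"])
          + percent(stats["% of reads mapped to multiple loci"]))
print(f"Mapped (unique + multiple loci): {mapped:.2f}%")
```

For the extract above, only about 13% of the reads are mapped in total, which is the figure to watch when the filters are relaxed.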
The proportion of mapped reads can be increased by modifying the alignment settings. Note that the procedure is described for RNA STAR Gapped-read mapper for RNA-seq data (Galaxy Version 2.4.0d-2). For additional information, check the relevant RNA_STAR threads, such as this one.
Set "Would you like to set output parameters (formatting and filtering)?" to Yes.
Set "Would you like to set additional output parameters (formatting and filtering)?" to Yes.
Reduce the default value of 0.66 for the following filter options:
Minimum alignment score, normalized to read length (--outFilterScoreMinOverLread)
Minimum number of matched bases, normalized to read length (--outFilterMatchNminOverLread)
(The value can be set as low as 0.)
Set "Other parameters (seed, alignment, and chimeric alignment)" to Yes.
Set "Would you like to set alignment parameters?" to Yes.
Reduce the value of "Minimum mapped length for a read mate that is spliced, normalized to mate length" (--alignSplicedMateMapLminOverLmate) from the default 0.66 to something smaller.
Inspect the resulting alignment to make sure you are happy with the mapping.
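For readers who run STAR from the command line rather than through Galaxy, the same three options can be passed as flags. The following is a rough sketch that only assembles and prints the command; it does not run STAR. The helper name relaxed_star_args, the value 0.3, the genome directory, the read file names, and the output prefix are all placeholders chosen for illustration, not recommendations from the procedure above.

```python
import shlex


def relaxed_star_args(genome_dir, reads, min_fraction=0.3, out_prefix="relaxed_"):
    """Build a STAR argument list with the three normalized thresholds lowered.

    min_fraction replaces the 0.66 default of all three options; 0.3 is an
    arbitrary illustrative value, not a tested recommendation.
    """
    if not 0.0 <= min_fraction <= 0.66:
        raise ValueError("min_fraction should lie between 0 and the 0.66 default")
    value = str(min_fraction)
    return [
        "STAR",
        "--runMode", "alignReads",
        "--genomeDir", genome_dir,
        "--readFilesIn", *reads,
        "--outFileNamePrefix", out_prefix,
        # Filter options from the output-parameters step
        "--outFilterScoreMinOverLread", value,
        "--outFilterMatchNminOverLread", value,
        # Alignment option from the alignment-parameters step
        "--alignSplicedMateMapLminOverLmate", value,
    ]


# Placeholder paths: substitute your own genome index and FASTQ files.
args = relaxed_star_args("arabidopsis_index", ["reads_1.fastq", "reads_2.fastq"])
print(shlex.join(args))
```

Lowering all three thresholds to the same value is a simplification made here; in Galaxy each option is set separately, so they can differ if the alignment inspection suggests it.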