Single-end alignment

Single-end (SE) alignment maps each read in a single FASTQ file independently — there’s no mate pair to constrain the alignment.

rustar-aligner \
--genomeDir /path/to/genome_index \
--readFilesIn reads.fq \
--outSAMtype SAM \
--outFileNamePrefix sample_
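
When several FASTQ files need the same treatment, the command above can be wrapped in a loop that derives the output prefix from each file name. The following is a minimal dry-run sketch, not part of rustar-aligner itself: the helper name `prefix_for` is hypothetical, and the loop only prints the commands it would run.

```shell
# Hypothetical helper: derive an output prefix such as "sampleA_" from a
# FASTQ path by stripping the directory and common FASTQ extensions.
prefix_for() {
    name="$(basename "$1")"
    name="${name%.gz}"
    name="${name%.fastq}"
    name="${name%.fq}"
    printf '%s_\n' "$name"
}

# Dry run: print one command per .fq file in the current directory.
# Remove the leading `echo` to actually run the aligner.
for fq in *.fq; do
    [ -e "$fq" ] || continue   # skip the unexpanded pattern when nothing matches
    echo rustar-aligner \
        --genomeDir /path/to/genome_index \
        --readFilesIn "$fq" \
        --outSAMtype SAM \
        --outFileNamePrefix "$(prefix_for "$fq")"
done
```

Deriving the prefix from the input name keeps the outputs of different samples from overwriting each other.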

For unstranded RNA-seq libraries, add the strand field to the output so that downstream tools (Cufflinks, StringTie, etc.) can pick up the strand assignment inferred from junction motifs:

--outSAMstrandField intronMotif

For gzip-compressed FASTQ input, pass --readFilesCommand zcat so that reads are decompressed on the fly:

rustar-aligner \
--genomeDir /path/to/genome_index \
--readFilesIn reads.fq.gz \
--readFilesCommand zcat \
--outSAMtype BAM SortedByCoordinate \
--outFileNamePrefix sample_

On macOS, use gzcat instead of zcat.
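
A script that must run on both Linux and macOS can pick the decompression command at run time. This is a small sketch under two assumptions: the helper name `pick_decompressor` is hypothetical, and `uname -s` reports `Darwin` on macOS.

```shell
# Hypothetical helper: choose the value for --readFilesCommand.
# Takes a kernel name as an optional argument (defaults to `uname -s`).
pick_decompressor() {
    kernel="${1:-$(uname -s)}"
    case "$kernel" in
        Darwin) echo gzcat ;;   # macOS
        *)      echo zcat ;;    # Linux and other systems
    esac
}

# Print the flag as it would be passed to the aligner on this machine.
echo "--readFilesCommand $(pick_decompressor)"
```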

Use --runThreadN to parallelise the alignment:

--runThreadN 16

The alignment phase scales nearly linearly with thread count until it reaches the I/O limit of the storage holding the input and output files.
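
One way to choose a thread count is to start from the number of online CPU cores and apply an upper bound, since extra threads stop helping once storage becomes the bottleneck. The sketch below assumes `getconf _NPROCESSORS_ONLN` is available (it is on Linux and macOS); the cap of 16 is an arbitrary illustrative value, not a recommendation from the aligner.

```shell
# Number of online CPU cores; fall back to 1 if getconf cannot report it.
threads="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"

# Illustrative upper bound (assumption): tune it to your storage.
max_threads=16
if [ "$threads" -gt "$max_threads" ]; then
    threads="$max_threads"
fi

# Print the flag as it would be passed to the aligner.
echo "--runThreadN $threads"
```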

The defaults match STAR. Three parameters are particularly useful to know:

  • --outFilterMultimapNmax (default 10): maximum number of alignments per read. A read that maps to more loci than this is discarded as MultiMapTooMany.
  • --outFilterMismatchNoverLmax (default 0.3): maximum ratio of mismatches to mapped length. Tightening this to 0.04 is a common choice for human RNA-seq.
  • --outFilterScoreMinOverLread (default 0.66): minimum alignment score relative to read length. Reads scoring below this threshold are reported as unmapped.
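
To get a feel for what the two ratio filters mean for a given read length, the limits can be estimated with a one-line calculation. This is a back-of-the-envelope sketch: the helper name `filter_limit` is hypothetical, and it assumes each limit is simply the ratio times the length, rounded down. The exact length definition and rounding used by the aligner may differ.

```shell
# Hypothetical helper: approximate per-read limit implied by a ratio filter.
#   $1 = ratio (e.g. 0.04), $2 = length in bases
# Assumes limit = floor(ratio * length); the tiny epsilon guards against
# floating-point results that land just below a whole number.
filter_limit() {
    awk -v ratio="$1" -v len="$2" 'BEGIN { printf "%d\n", int(ratio * len + 1e-9) }'
}

read_len=100
echo "mismatch limit at 0.3 (default): $(filter_limit 0.3 "$read_len")"
echo "mismatch limit at 0.04:          $(filter_limit 0.04 "$read_len")"
echo "score threshold at 0.66:         $(filter_limit 0.66 "$read_len")"
```

Under these assumptions, a 100-base read is allowed up to 30 mismatches by default but only 4 with the tighter 0.04 setting, and needs a score of about 66 to be kept.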

See the CLI parameters reference for the full list.

Multi-mapping reads emit one SAM record per locus by default (up to --outFilterMultimapNmax). To cap the number of alignments written per read:

--outSAMmultNmax 5
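
To check the effect of the cap, count how many records each read has in the resulting SAM file. The sketch below uses only `awk` and `sort`; the helper name `records_per_read` is hypothetical, and the file name in the usage comment is a placeholder for whatever SAM file your run produced.

```shell
# Hypothetical helper: print "read_name<TAB>record_count" for a SAM file.
# SAM header lines start with '@' and are skipped.
records_per_read() {
    awk -F '\t' '!/^@/ { n[$1]++ } END { for (r in n) print r "\t" n[r] }' "$@" | sort
}

# Usage (placeholder file name):
#   records_per_read alignments.sam | sort -k2,2nr | head
```

With the cap in place, no read should show more records than the value passed to --outSAMmultNmax.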

When two alignments tie on score, rustar-aligner uses a seeded RNG (StdRng) to pick the primary. Set the seed for reproducible runs:

--runRNGseed 42

The default seed is 777.
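
A simple way to confirm that two runs with the same seed agree is to compare their alignment records. The sketch below is one possible check, with assumptions stated up front: the helper name `same_alignments` is hypothetical, header lines are ignored because they may record run-specific details, and records are sorted first in case the order of unsorted output varies between multithreaded runs.

```shell
# Hypothetical helper: succeed if two SAM files contain the same alignment
# records, ignoring header lines ('@...') and record order.
same_alignments() {
    a="$(grep -v '^@' "$1" | LC_ALL=C sort | cksum)"
    b="$(grep -v '^@' "$2" | LC_ALL=C sort | cksum)"
    [ "$a" = "$b" ]
}

# Usage (placeholder file names):
#   same_alignments run1.sam run2.sam && echo "runs agree"
```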