Update - Alternate Version
An alternate version of pBWA is provided. Aside from some minor stability enhancements, pBWA alternate provides faster input file reading for large input files and more efficiently outputs alignments into a single SAM file. Thanks to Rob Egan for the enhancements.
Introduction
Requirements
How to Get
How to Use
pBWA can be executed as long as there is a pre-existing index created by BWA's index command. Below are generic commands, followed by examples that use a parallel scheduler. The scheduling commands match those used on SHARCNET.
./pBWA aln -f SaiPrefix /path/to/Index.fa /path/to/Read_1.fq NumReads
./pBWA aln -f SaiPrefix2 /path/to/Index.fa /path/to/Read_2.fq NumReads
./pBWA sampe -f SamPrefix /path/to/Index.fa SaiPrefix SaiPrefix2 /path/to/Read_1.fq /path/to/Read_2.fq NumReads
A list of parameters available for the above commands can be found on the BWA sourceforge page.
Running without multithreading is best for systems with sufficient RAM (>=4GB/core). This ensures that all stages (aln, sampe) receive the maximum amount of parallelism. Note that each stage of pBWA MUST be executed with the same number of parallel processes, although the number of threads can differ.
sqsub -q mpi -n 240 -r 1h --mpp 4G ./pBWA aln -n 2 -f Aln1 Index/hg18.fa Reads/Read_1.fq 100000000
sqsub -q mpi -n 240 -r 1h --mpp 4G ./pBWA aln -n 2 -f Aln2 Index/hg18.fa Reads/Read_2.fq 100000000
sqsub -q mpi -n 240 -r 1h --mpp 4G ./pBWA sampe -f Aln Index/hg18.fa Aln1 Aln2 Reads/Read_1.fq Reads/Read_2.fq 100000000
The -q flag tells the scheduler that this is an MPI program. The -n flag before ./pBWA requests 240 parallel processes for pBWA. The -r flag sets the requested run-time limit, and the --mpp flag requests 4GB of RAM for each parallel process. The -n 2 that follows aln is an alignment option passed to pBWA itself, not a scheduler flag.
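Because every stage must run with the same number of processes, it can help to keep the scheduler options in one place. The following is a minimal dry-run sketch, not part of pBWA: the variable names (NPROCS, WALLTIME, MEM, NUM_READS, INDEX) and the submit helper are hypothetical, and the values are copied from the example above. It only prints the sqsub lines so they can be checked before submitting.

```shell
#!/bin/sh
# Dry-run sketch: print the three submissions so all stages share one process count.
# All variable names and the submit helper are illustrative assumptions.
NPROCS=240            # number of parallel processes, identical for aln and sampe
WALLTIME=1h           # requested run-time limit
MEM=4G                # memory per process
NUM_READS=100000000   # number of reads, as in the example above
INDEX=Index/hg18.fa

submit() {
    # Prefix a pBWA stage with the shared scheduler options and print it.
    echo "sqsub -q mpi -n $NPROCS -r $WALLTIME --mpp $MEM ./pBWA $*"
}

submit aln -n 2 -f Aln1 "$INDEX" Reads/Read_1.fq "$NUM_READS"
submit aln -n 2 -f Aln2 "$INDEX" Reads/Read_2.fq "$NUM_READS"
submit sampe -f Aln "$INDEX" Aln1 Aln2 Reads/Read_1.fq Reads/Read_2.fq "$NUM_READS"
```

Since sampe reads the Aln1 and Aln2 outputs, it should be submitted only after both aln jobs have finished.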
Using a combination of multithreading and parallelism is best for systems lacking sufficient RAM. The number of processes and threads to use depends on the system. Play around and see what works best for yours! Below is an example for the Orca cluster (24 cores/node and 1.33GB RAM/core) on SHARCNET.
sqsub -q mpi -n 40 -N 10 --mpp 8G -r 1h -f noaffinity ./pBWA aln -n 2 -f Aln1 -t 6 Index/hg18.fa ...
sqsub -q mpi -n 40 -N 10 --mpp 8G -r 1h -f noaffinity ./pBWA aln -n 2 -f Aln2 -t 6 Index/hg18.fa ...
sqsub -q mpi -n 40 --mpp 4G -r 1h ./pBWA sampe...
The -N flag requests that the 40 processes be spread evenly over 10 nodes, giving 4 processes per node. Requesting 8G of memory per process (more than pBWA needs) means those 4 processes claim all of the RAM on their node (4 x 8GB = 32GB, about 24 cores x 1.33GB), which leaves the remaining 20 cores free for threading. This allows each process to spawn 6 threads, for a total of 240 threads of execution. Note that sampe can only run with 40 single-threaded processes, because it does not yet support multithreading. Future releases of pBWA are expected to introduce multithreading for samse/sampe.
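The layout arithmetic for the Orca example can be reproduced with a few lines of shell, which may be useful when adapting the numbers to another cluster. This is only a sketch; the variable names are hypothetical, and the inputs are the figures quoted above (24 cores per node, 10 nodes, 40 processes, 8G per process).

```shell
#!/bin/sh
# Sketch of the process/thread layout for the Orca example.
# Variable names are illustrative; adjust the four inputs for your own cluster.
CORES_PER_NODE=24
NODES=10      # value given to -N
PROCS=40      # value given to the scheduler's -n
MPP_GB=8      # value given to --mpp, in GB

procs_per_node=$((PROCS / NODES))                      # processes placed on each node
threads_per_proc=$((CORES_PER_NODE / procs_per_node))  # cores available to each process (-t)
total_threads=$((PROCS * threads_per_proc))            # threads of execution across the job
mem_per_node_gb=$((procs_per_node * MPP_GB))           # RAM claimed on each node

echo "processes per node:  $procs_per_node"
echo "threads per process: $threads_per_proc"
echo "total threads:       $total_threads"
echo "RAM claimed per node: ${mem_per_node_gb}GB"
```

With the Orca inputs this gives 4 processes per node, 6 threads per process, 240 threads in total, and 32GB claimed per node.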