Trimming with Trimmomatic

Trimming Exercise

Please copy the trimming directory from our shared folder:

cd $SCRATCH
cp -r /scratch/courses/HITS-2018/trimming .
cd trimming

Although we have a shell script that we can simply submit using sbatch, we will perform this analysis in interactive mode. The script is useful as a reference for future use.
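For later reference, a batch version of this exercise might look roughly like the sketch below. This is not the script shipped in the trimming directory: the file name trim_sketch.sh and the job name are assumptions made for illustration, and the resource requests, module and Trimmomatic command simply mirror what we run interactively in this exercise. The snippet only writes the script to disk; nothing is submitted.

```shell
# Write a hypothetical batch script (file name and job name are assumptions).
# The quoted 'EOF' keeps $TRIMMOMATIC_JAR and the trailing \ characters literal.
cat > trim_sketch.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=trimming
#SBATCH --cpus-per-task=1
#SBATCH --time=3:00:00
#SBATCH --mem=4000

module purge
module load trimmomatic/0.36

java -jar $TRIMMOMATIC_JAR PE -phred33 \
sequence_1.fastq.gz sequence_2.fastq.gz \
sequence_1_trimmed.fq sequence_1_unpair_trimmed.fq \
sequence_2_trimmed.fq sequence_2_unpair_trimmed.fq \
HEADCROP:15 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36
EOF
```

Once you are happy with it, a script like this could be submitted with sbatch trim_sketch.sh.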

To start, we must first enter an interactive session:

srun -c1 -t3:00:00 --mem=4000 --pty /bin/bash

In the above command, I have requested one CPU (-c1) on one node for three hours (-t3:00:00), with 4 GB of memory (--mem=4000, given in megabytes).

Now I will load the Trimmomatic module. Remember, to check which versions are available, you can simply type module avail trimmomatic

module purge
module load trimmomatic/0.36

Trimmomatic is a Java application, so it needs to be executed with the command java -jar followed by the actual application .jar file. Luckily for us, once we loaded the Trimmomatic module, a new variable was placed in our environment with the path to the .jar file. To find this path, type the following:

env | grep TRIM

TRIMMOMATIC_HOME=/share/apps/trimmomatic/0.36
TRIMMOMATIC_ROOT=/share/apps/trimmomatic/0.36
LMOD_FAMILY_TRIMMOMATIC=trimmomatic
TRIMMOMATIC_JAR=/share/apps/trimmomatic/0.36/trimmomatic-0.36.jar
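Before going further, it can be worth confirming that the variable really points at a file. The sketch below assumes the module defines TRIMMOMATIC_JAR as in the output above; check_jar is a hypothetical helper written for this exercise, not something provided by the module.

```shell
# check_jar (hypothetical helper): succeeds only if the path it is given
# is non-empty and points at an existing regular file.
check_jar() {
    [ -n "$1" ] && [ -f "$1" ]
}

# ${TRIMMOMATIC_JAR:-} expands to an empty string if the variable is unset.
if check_jar "${TRIMMOMATIC_JAR:-}"; then
    echo "Trimmomatic jar found: $TRIMMOMATIC_JAR"
else
    echo "TRIMMOMATIC_JAR is unset or not a file; try loading the module again"
fi
```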

The TRIMMOMATIC_JAR variable holds the path to the .jar file. Now, to execute Trimmomatic, let’s write out the entire command. Since this is going to be an extremely long line, we can end each line with a \ to tell the bash interpreter that, although I have pressed enter, the command is not finished and continues on the next line. YOU MUST MAKE SURE THERE IS NO SPACE AFTER THE \
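To see the continuation character in action before running the real command, here is a small, harmless sketch using echo. The variable name joined is only there for illustration.

```shell
# Each \ is the very last character on its line, so bash reads the
# three physical lines as the single command: echo one two three
joined=$(echo one \
two \
three)
echo "$joined"   # prints: one two three
```

The Trimmomatic command below is split across lines in the same way.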

java -jar $TRIMMOMATIC_JAR PE -phred33 \
sequence_1.fastq.gz sequence_2.fastq.gz \
sequence_1_trimmed.fq sequence_1_unpair_trimmed.fq \
sequence_2_trimmed.fq sequence_2_unpair_trimmed.fq \
HEADCROP:15 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36

You should now have four new files in your directory: a paired and an unpaired trimmed file for each of the two input files.
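A quick way to check the result is to count the reads in each output file. This is a minimal sketch: count_reads is a hypothetical helper, not part of Trimmomatic, and it assumes each FASTQ record occupies exactly four lines, with sequences not wrapped over several lines.

```shell
# count_reads (hypothetical helper): number of reads in an uncompressed
# FASTQ file, assuming four lines per record.
count_reads() {
    echo $(( $(wc -l < "$1") / 4 ))
}

# Report each expected output file, or say that it is missing.
for f in sequence_1_trimmed.fq sequence_1_unpair_trimmed.fq \
         sequence_2_trimmed.fq sequence_2_unpair_trimmed.fq; do
    if [ -f "$f" ]; then
        echo "$f: $(count_reads "$f") reads"
    else
        echo "$f: not found"
    fi
done
```

The two paired files (sequence_1_trimmed.fq and sequence_2_trimmed.fq) should report the same number of reads, since they only keep pairs in which both reads survived trimming.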