Trimming Exercise
Please copy the trimming directory from our shared folder:
cd $SCRATCH
cp -r /scratch/courses/HITS-2018/trimming .
cd trimming
Although we have a shell script that we can simply submit using sbatch, we will perform this analysis in interactive mode. The script is useful as a reference for future use.
To start, we must first enter an interactive session:
srun -c1 -t3:00:00 --mem=4000 --pty /bin/bash
In the above command, I have requested one CPU (-c1) on one node for three hours (-t3:00:00) with 4000 MB, roughly 4 GB, of memory (--mem=4000).
Now I will load the Trimmomatic module. Remember, to see which versions are installed, you can simply type module avail trimmomatic
module purge
module load trimmomatic/0.36
Trimmomatic is a Java application, so it must be executed with the command java -jar followed by the path to the application's .jar file. Luckily for us, loading the Trimmomatic module placed a new variable in our environment that holds the path to the .jar file. To find this path, type the following:
env | grep TRIM
TRIMMOMATIC_HOME=/share/apps/trimmomatic/0.36
TRIMMOMATIC_ROOT=/share/apps/trimmomatic/0.36
LMOD_FAMILY_TRIMMOMATIC=trimmomatic
TRIMMOMATIC_JAR=/share/apps/trimmomatic/0.36/trimmomatic-0.36.jar
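If the grep prints nothing, the module was most likely not loaded in this session. As a minimal sketch of a guard you could run before calling java, the helper below checks that TRIMMOMATIC_JAR is set and points at a readable file. The function name check_trimmomatic_jar is hypothetical and not part of the course material.

```shell
# Hypothetical helper (not part of the course material): succeed only if
# TRIMMOMATIC_JAR is set and points at a readable file.
check_trimmomatic_jar() {
  if [ -z "${TRIMMOMATIC_JAR:-}" ]; then
    echo "TRIMMOMATIC_JAR is not set; load the trimmomatic module first" >&2
    return 1
  fi
  if [ ! -r "$TRIMMOMATIC_JAR" ]; then
    echo "Cannot read $TRIMMOMATIC_JAR" >&2
    return 1
  fi
  echo "Using $TRIMMOMATIC_JAR"
}

# Report the result without aborting the interactive shell.
check_trimmomatic_jar || echo "Trimmomatic is not ready yet"
```

After a successful module load, the helper should simply echo the path to the .jar file.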
The TRIMMOMATIC_JAR variable provides us with this path. Now, to execute Trimmomatic, let's write out the entire command. Since this is going to be an extremely long line, we can end each line with a \ to tell the bash interpreter that the command continues on the next line: although I have pressed enter, bash should not treat it as the end of the command. YOU MUST MAKE SURE THERE IS NO SPACE AFTER THE \
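Before running the full command, the continuation rule can be tried on a harmless echo. This is only an illustration; the variable name joined is made up for the example.

```shell
# Two physical lines, one logical command: bash removes the trailing
# backslash and the newline before it runs anything.
joined=$(echo one two \
  three)
echo "$joined"
```

If a space followed the backslash, bash would escape the space instead of the newline, and the command would end at the first line.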
java -jar $TRIMMOMATIC_JAR PE -phred33 \
    sequence_1.fastq.gz sequence_2.fastq.gz \
    sequence_1_trimmed.fq sequence_1_unpair_trimmed.fq \
    sequence_2_trimmed.fq sequence_2_unpair_trimmed.fq \
    HEADCROP:15 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36
You should now have 4 new files in your directory.
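One way to confirm this is to check for each expected output by name. The sketch below assumes the four output file names used in the Trimmomatic command above. The helper name check_trimmed_outputs is hypothetical.

```shell
# Hypothetical helper: report which of the four expected Trimmomatic
# outputs exist in the given directory (default: the current one).
check_trimmed_outputs() {
  dir=${1:-.}
  missing=0
  for f in sequence_1_trimmed.fq sequence_1_unpair_trimmed.fq \
           sequence_2_trimmed.fq sequence_2_unpair_trimmed.fq; do
    if [ -e "$dir/$f" ]; then
      echo "ok       $f"
    else
      echo "missing  $f"
      missing=$((missing + 1))
    fi
  done
  # Succeed only when every expected file was found.
  [ "$missing" -eq 0 ]
}

check_trimmed_outputs . || echo "some outputs are missing"
```

The check only tests that the files exist, not that they contain reads, so it is a quick sanity check rather than a quality check.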