Seurat part 2 – Cell QC

Now that we have loaded our data into Seurat (using CreateSeuratObject), we want to perform some initial QC on our cells.

While CreateSeuratObject imposes a basic minimum gene cutoff, you may want to filter out more cells at this stage based on technical or biological parameters. Seurat lets you explore QC metrics and filter cells on any user-defined criteria. In the example below, we visualize gene and molecule counts per cell, plot their relationship, and exclude cells in which a clearly outlying number of genes was detected, since these are potential multiplets. This is not a guaranteed way to remove cell doublets, but it serves as an example of filtering user-defined outlier cells. We also filter cells based on the percentage of their counts that come from mitochondrial genes.
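The two per-cell metrics described above can be sketched in base R, without Seurat, to show what is being counted. The count matrix, the gene and cell names, and the variable names n_genes and n_umis are all made up for illustration; Seurat computes and stores its own versions of these metrics.

```r
# Toy count matrix: rows are genes, columns are cells, values are molecule counts.
# All names and values are hypothetical.
counts <- matrix(
  c(3, 0, 1, 0,
    0, 0, 2, 5,
    1, 4, 0, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("GeneA", "GeneB", "GeneC"),
                  c("cell1", "cell2", "cell3", "cell4"))
)

# Genes detected per cell: how many genes have at least one count.
n_genes <- colSums(counts > 0)

# Molecules per cell: the sum of counts over all genes.
n_umis <- colSums(counts)

qc <- data.frame(n_genes = n_genes, n_umis = n_umis)
print(qc)
```

Plotting one column against the other, for example with plot(qc$n_umis, qc$n_genes), gives the kind of scatter plot in which outlier cells stand apart from the main cloud.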

Note: To detect mitochondrial genes, we need to tell Seurat how to recognize them. We do this with a regular expression that matches the gene names, as in mito.genes <- grep(pattern = "^MT-", ...). If your mitochondrial genes are named differently, you will need to adjust this pattern accordingly (e.g. "mt-", "mt.", or "MT_").
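As a minimal sketch of how such a pattern is used, the base R example below selects mitochondrial genes from the row names of a small count matrix and computes the percentage of each cell's counts that they account for. The matrix, its values, and the variable name percent.mito are assumptions made for illustration; in practice the counts come from your Seurat object.

```r
# Hypothetical toy data: 4 genes x 3 cells.
counts <- matrix(
  c(10, 0,  5,
     2, 8,  1,
     1, 1,  4,
     7, 1, 10),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("ACTB", "CD3E", "MT-CO1", "MT-ND1"),
                  c("cell1", "cell2", "cell3"))
)

# "^" anchors the match to the start of the gene name, so only names
# that begin with "MT-" are returned.
mito.genes <- grep(pattern = "^MT-", x = rownames(counts), value = TRUE)

# Percentage of each cell's total counts that come from mitochondrial genes.
# drop = FALSE keeps a matrix even if only one gene matches.
percent.mito <- 100 * colSums(counts[mito.genes, , drop = FALSE]) / colSums(counts)
print(percent.mito)
```

When adapting the pattern, keep in mind that "." matches any character in a regular expression, so a literal dot has to be escaped, as in "^mt\\.". It is also worth printing mito.genes once to confirm that the pattern picked up the genes you expected and nothing else.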

Questions:

  1. What is the difference between nGenes and nUMIs?
  2. Can you detect the potential outliers in each plot?
  3. How do you feel about the quality of the cells at this initial QC step?

Now, based on our observations, we can filter out what we see as clear outliers. We will keep cells that have a minimum of 200 and a maximum of 2500 detected genes. Again, these parameters should be adjusted according to your own data and observations. For example, with very high sequencing coverage more genes are detected per cell, so you might want to raise the upper threshold. As this is a guided approach, the earlier plots will give you a good idea of what these parameters should be.
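The filtering logic can be sketched in base R on a small table of per-cell metrics. The table and its values are invented for illustration. The mitochondrial cutoff of 5% is an assumption, since no value is given here, and treating the gene window as inclusive is also an assumption.

```r
# Hypothetical per-cell QC table; values are made up.
qc <- data.frame(
  n_genes      = c(150, 800, 1200, 3100, 2400),
  percent.mito = c(2, 3, 12, 4, 1),
  row.names    = paste0("cell", 1:5)
)

# Gene window from the text: 200 to 2500 detected genes per cell.
min_genes <- 200
max_genes <- 2500
# Assumed cutoff for mitochondrial percentage (not specified in the text).
max_percent_mito <- 5

# A cell is kept only if it passes every criterion.
keep <- qc$n_genes >= min_genes &
  qc$n_genes <= max_genes &
  qc$percent.mito <= max_percent_mito

filtered <- qc[keep, ]

cat("Cells before filtering:", nrow(qc), "\n")
cat("Cells after filtering: ", nrow(filtered), "\n")
cat("Cells removed:         ", sum(!keep), "\n")
```

Comparing the number of cells before and after filtering in this way is a quick check that the thresholds are not removing far more cells than the plots suggested. The same conditions can then be applied to the Seurat object itself; the function used for this depends on your Seurat version.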

Questions:

  1. How many cells did we filter out using the thresholds specified above?