Data Analysis 2: Immunobiology - Sample data analysis
There are three workshops:
Week 2 DA 2: Immunobiology - Sample data analysis
Week 4 DA 3: Immunobiology - Analysis of your own data
Week 6 DA 4: Immunobiology - Customising figures
These slides:
Prepare you for the sample data analysis workshop; your own data will be like the sample data
Summarise the experimental design and aims
Explain what the data are
Go through the analytical steps conceptually
Explain what tools we will use in the workshop to do the analysis
Macrophages produce TNF-α in response to bacterial infection
Question: Does the production of TNF-α by macrophages require live bacteria, or is the cell wall component sufficient?
Therefore we need 3 treatments: media (control), lipopolysaccharide (LPS, a cell wall component of E. coli) and live E. coli
We measure TNF-α with a TNF-α antibody conjugated to Allophycocyanin (APC)
Therefore we need a control for antibody binding, for which we use an isotype antibody
Macrophages receive one of three treatments: media, LPS, or E. coli, which are NeonGreen fluorescent
Each treatment is stained with one of two antibodies: the isotype antibody or the TNF-α antibody conjugated to allophycocyanin (APC)
Thus there are 3 x 2 = 6 combinations
Two variables of interest: red fluorescence and green fluorescence
We also measure forward scatter (cell size) and side scatter (cell granularity), which can be used for quality control of the cells
We only expect to see red fluorescence (APC) if the treatment induces TNF-α production in macrophages and the TNF-α antibody is used
We only expect to see green fluorescence (detected in the FITC channel) if the treatment is E. coli
This is summarised in the figure on the next page
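The same design can also be written out as a small table. This is a minimal base R sketch, not part of the workshop materials; the column names `green_expected` and `red_possible` are made up for illustration. Whether LPS or E. coli actually induce TNF-α is the experimental question, so `red_possible` only records that red signal needs the TNF-α antibody.

```r
# Cross every treatment with every antibody to list the six samples
design <- expand.grid(
  treatment = c("Media", "LPS", "E. coli"),
  antibody  = c("Isotype", "TNF-a"),
  stringsAsFactors = FALSE
)

# Green fluorescence is only expected when the treatment is E. coli
design$green_expected <- design$treatment == "E. coli"

# Red fluorescence is only possible when the TNF-a antibody is used
# (it additionally requires that the treatment induces TNF-a)
design$red_possible <- design$antibody == "TNF-a"

print(design)
```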
The data are in a flow cytometry standard format (FCS) file
Each FCS file contains data from one sample
You will have 6 FCS files, one for each combination of treatment and antibody
Each file has 22 variables in columns and up to 50000 cells in rows
The 22 columns are: TIME, Time MSW, Pulse Width, FS Lin, FS Area, FS Log, SS Lin, SS Area, SS Log, FL 1 Lin, FL 1 Area, FL 1 Log, FL 2 Lin, FL 2 Area, FL 2 Log, FL 3 Lin, FL 3 Area, FL 3 Log, FL 8 Lin, FL 8 Area, FL 8 Log, Event Count
FS is forward scatter, SS is side scatter and FL is a fluorescence channel
FL 1 is the green fluorescence channel and we will rename it E_coli_FITC
FL 8 is the red fluorescence channel and we will rename it TNFa_APC
FL 2 and FL 3 are not used in this experiment and we will delete those columns
We will use just four columns: E_coli_FITC_Lin, TNFa_APC_Lin, FS Lin, and SS Lin
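The renaming and subsetting logic can be sketched on the column names alone. This base R example works on a plain character vector rather than on FCS data, so it only illustrates the naming scheme; in the workshop the equivalent step is done with flowCore.

```r
# The 22 column names found in each FCS file
cols <- c("TIME", "Time MSW", "Pulse Width",
          "FS Lin", "FS Area", "FS Log",
          "SS Lin", "SS Area", "SS Log",
          "FL 1 Lin", "FL 1 Area", "FL 1 Log",
          "FL 2 Lin", "FL 2 Area", "FL 2 Log",
          "FL 3 Lin", "FL 3 Area", "FL 3 Log",
          "FL 8 Lin", "FL 8 Area", "FL 8 Log",
          "Event Count")

# Give the two fluorescence channels used here meaningful names
renamed <- sub("^FL 1 ", "E_coli_FITC_", cols)
renamed <- sub("^FL 8 ", "TNFa_APC_", renamed)

# Keep only the four columns needed for the analysis
keep <- c("E_coli_FITC_Lin", "TNFa_APC_Lin", "FS Lin", "SS Lin")
kept <- intersect(keep, renamed)
print(kept)
```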
The analysis of flow cytometry data is relatively simple conceptually
We apply several quality control steps to the data to remove anomalous signals, dead cells and debris
We use scatter plots, calculate means, and find percentages of cells in different regions of the scatter plots
Import the data into R, improve the column names and remove unwanted columns
Apply automated quality control
Apply a “logicle” transformation (Parks, Roederer, and Moore 2006) to the fluorescence channels (similar to a log transformation)
Explore the data with scatter plots and histograms/density plots
Use FS Lin and SS Lin to determine which cells (rows) to remove as debris
Determine cut-offs for cells being positive for TNF-α and E. coli
Calculate the percentage of cells that are positive for TNF-α for each treatment
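The last three steps (remove debris, set a cut-off, calculate the percentage positive) can be sketched in base R. Everything here is an assumption for illustration: the data are simulated, the cut-off values `fs_min`, `ss_min` and `apc_cutoff` are invented, and only two treatments are shown. In the workshop the cut-offs are chosen by looking at the plots.

```r
set.seed(42)

# Simulated stand-in for transformed data: 2 treatments x 1000 cells
n <- 1000
cells <- data.frame(
  treatment = rep(c("Media", "LPS"), each = n),
  FS_Lin    = runif(2 * n, min = 0, max = 1000),
  SS_Lin    = runif(2 * n, min = 0, max = 1000),
  TNFa_APC  = c(rnorm(n, mean = 1, sd = 0.3),  # simulated low signal
                rnorm(n, mean = 3, sd = 0.5))  # simulated high signal
)

# Remove debris: drop rows with low forward or side scatter
# (hypothetical cut-offs)
fs_min <- 100
ss_min <- 50
clean <- subset(cells, FS_Lin > fs_min & SS_Lin > ss_min)

# Call a cell TNF-a positive if it is above a hypothetical cut-off
apc_cutoff <- 2
clean$tnfa_pos <- clean$TNFa_APC > apc_cutoff

# Percentage of TNF-a positive cells for each treatment
pct_pos <- tapply(clean$tnfa_pos, clean$treatment, mean) * 100
print(round(pct_pos, 1))
```

The mean of a logical vector is the proportion of `TRUE` values, which is why `mean()` multiplied by 100 gives the percentage of positive cells.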
Import, rename and subset columns using the flowCore package (Ellis et al. 2024)
Automated quality control with the flowAI package (Monaco et al. 2016)
Apply a “logicle” transformation using the flowCore package
Put the data into a dataframe to make it easy to use tidyverse (Wickham et al. 2019) tools like group_by(), summarise(), ggplot(), filter()
The flowCore package imports each FCS file as a flowFrame object
The flowFrame object contains the data from the FCS file and metadata about the experiment
A collection of related flowFrames is stored in a flowSet object
flowAI and flowCore functions work with flowSet objects
After that we can convert the flowSet to a dataframe to use tidyverse tools
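The path from flowFrame to flowSet to dataframe can be sketched as follows. This is a rough illustration under several assumptions, not the workshop script: the flowFrames are built from simulated numbers instead of FCS files (with real files, `read.flowSet()` would create the flowSet), the sample names `MEDIA_ISO` and `ECOLI_TNFa` are invented, the logicle transformation uses flowCore's default settings, and the flowAI quality control step is left out because it needs real acquisition data.

```r
library(flowCore)  # Bioconductor package; assumed to be installed

set.seed(1)

# Simulated stand-in for the contents of one FCS file (made-up values)
make_frame <- function(n = 500) {
  m <- cbind(
    "FS Lin"   = runif(n, min = 0, max = 1000),
    "SS Lin"   = runif(n, min = 0, max = 1000),
    "FL 1 Lin" = rlnorm(n, meanlog = 3),
    "FL 8 Lin" = rlnorm(n, meanlog = 3)
  )
  flowFrame(m)
}

# A named list of flowFrames becomes a flowSet
frames <- list(MEDIA_ISO = make_frame(), ECOLI_TNFa = make_frame())
fs <- as(frames, "flowSet")

# Rename the fluorescence channels in every flowFrame at once
colnames(fs) <- c("FS Lin", "SS Lin", "E_coli_FITC_Lin", "TNFa_APC_Lin")

# Logicle-transform the two fluorescence channels (default parameters)
fl_cols <- c("E_coli_FITC_Lin", "TNFa_APC_Lin")
fs_t <- transform(fs, transformList(fl_cols, logicleTransform()))

# Convert to one dataframe with a column recording the sample
to_df <- function(nm) {
  data.frame(sample = nm, exprs(fs_t[[nm]]), check.names = FALSE)
}
cells <- do.call(rbind, lapply(sampleNames(fs_t), to_df))
str(cells)
```

Once the data are in a single dataframe with a `sample` column, tidyverse verbs such as `group_by()` and `summarise()` can work per sample.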
Sample data are like the data you will produce in your own experiment
3 treatments x 2 antibodies = 6 combinations, each with 4 variables and up to 50000 cells
The analysis is conceptually simple: quality control, transformation, scatter plots, and calculating percentages
The week 2 workshop analyses the sample data; in the week 4 workshop you will analyse your own data
We will use the flowCore, flowAI and tidyverse packages to do the analysis