Independent Study to prepare for workshop

Data Analysis 2: Immunobiology - Sample data analysis

Emma Rand

Overview

There are three workshops:

  1. Week 2 DA 2: Immunobiology - Sample data analysis

  2. Week 4 DA 3: Immunobiology - Analysis of your own data

  3. Week 6 DA 4: Immunobiology - Customising figures

Overview

These slides:

  • Prepare you for the sample data analysis workshop; your own data will be like the sample data

  • Summarise the experimental design and aims

  • Explain what the data are

  • Go through the analytical steps conceptually

  • Explain what tools we will use in the workshop to do the analysis

Experimental design and aims

  • Macrophages produce TNF-α in response to bacterial infection

  • Question: Does the production of TNF-α by macrophages require live bacteria, or is the cell wall component sufficient?

  • Therefore we need 3 treatments: Media (control), lipopolysaccharide (LPS, a cell wall component of E. coli) and live E. coli

  • We measure TNF-α with a TNF-α antibody conjugated to Allophycocyanin (APC)

  • We therefore also need a control for antibody binding, for which we use an isotype antibody

Experimental design and aims

  • Macrophages are treated with one of three treatments: Media, LPS, or E. coli, which are NeonGreen fluorescent

  • Two antibodies are used for each treatment: an isotype antibody and a TNF-α antibody conjugated to allophycocyanin (APC)

  • Thus there are 3 x 2 = 6 combinations

  • Two variables of interest: red fluorescence and green fluorescence

  • We also measure forward scatter (cell size) and side scatter (cell granularity) which can be used to quality control the cells
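
The 3 x 2 design above can be written out as a small table. This is a minimal sketch in base R; the labels "Isotype" and "TNF-a APC" are illustrative names chosen here, not names taken from the workshop files.

```r
# The three treatments and two antibodies described above
treatments <- c("Media", "LPS", "E. coli")
antibodies <- c("Isotype", "TNF-a APC")  # illustrative labels (assumption)

# expand.grid() builds every treatment x antibody combination
design <- expand.grid(treatment = treatments,
                      antibody = antibodies,
                      stringsAsFactors = FALSE)

# One row per combination, so one row per sample
print(design)
nrow(design)
```

Each row corresponds to one sample, which is why there is one data file per combination.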

Experimental design and aims

  • We only expect to see red fluorescence (APC) if the treatment induces TNF-α production in macrophages and the TNF-α antibody is used.

  • We only expect to see green fluorescence (FITC) if the treatment is E. coli

  • This is summarised in the figure on the next page

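Experimental design and aims

[Figure: expected red (APC) and green (FITC) fluorescence for each combination of treatment and antibody]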

The data

  • The data are in a flow cytometry standard format (FCS) file

  • Each FCS file contains data from one sample

  • You will have 6 FCS files, one for each combination of treatment and antibody

  • Each file has 22 variables in columns and up to 50,000 cells in rows

The data

  • The 22 columns are: TIME, Time MSW, Pulse Width, FS Lin, FS Area, FS Log, SS Lin, SS Area, SS Log, FL 1 Lin, FL 1 Area, FL 1 Log, FL 2 Lin, FL 2 Area, FL 2 Log, FL 3 Lin, FL 3 Area, FL 3 Log, FL 8 Lin, FL 8 Area, FL 8 Log, Event Count

  • FS is Forward scatter, SS is Side scatter, FL is fluorescence channel

  • FL 1 is the green fluorescence channel and we will rename it E_coli_FITC

  • FL 8 is the red fluorescence channel and we will rename it TNFa_APC

  • FL 2 and FL 3 are not used in this experiment and we will delete those columns

  • We will use just four columns: E_coli_FITC_Lin, TNFa_APC_Lin, FS Lin, and SS Lin
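
The renaming and column selection described above can be sketched in base R on a small made-up matrix. This is only an illustration of the idea: the values are random numbers, the object names (`fcs_cols`, `cells`, `analysis_cols`) are hypothetical, and in the workshop the equivalent steps are done on the imported FCS data rather than on a plain matrix.

```r
set.seed(1)

# The 22 original column names listed above
fcs_cols <- c("TIME", "Time MSW", "Pulse Width",
              "FS Lin", "FS Area", "FS Log",
              "SS Lin", "SS Area", "SS Log",
              "FL 1 Lin", "FL 1 Area", "FL 1 Log",
              "FL 2 Lin", "FL 2 Area", "FL 2 Log",
              "FL 3 Lin", "FL 3 Area", "FL 3 Log",
              "FL 8 Lin", "FL 8 Area", "FL 8 Log",
              "Event Count")

# Made-up values: 5 cells (rows) by 22 variables (columns)
cells <- matrix(runif(5 * length(fcs_cols)), nrow = 5,
                dimnames = list(NULL, fcs_cols))

# Delete the unused FL 2 and FL 3 channels
keep <- !grepl("^FL [23] ", colnames(cells))
cells <- cells[, keep]

# Rename FL 1 (green) and FL 8 (red) to more meaningful names
colnames(cells) <- sub("^FL 1 ", "E_coli_FITC_", colnames(cells))
colnames(cells) <- sub("^FL 8 ", "TNFa_APC_", colnames(cells))

# Keep just the four columns used in the analysis
analysis_cols <- c("E_coli_FITC_Lin", "TNFa_APC_Lin", "FS Lin", "SS Lin")
cells <- cells[, analysis_cols]
colnames(cells)
```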

Analytical steps

Overview

  • The analysis of flow cytometry data is relatively simple conceptually

  • We apply several quality control steps to the data to remove anomalous signals, dead cells and debris

  • We use scatter plots, calculate means, and find percentages of cells in different regions of the scatter plots

Analytical steps

  • Import the data into R, improve the column names and remove unwanted columns

  • Apply automated quality control

  • Apply a “logicle” transformation (Parks, Roederer, and Moore 2006) to the fluorescence channels (similar to logging)

  • Explore the data with scatter plots and histograms/density plots

  • Use FS Lin and SS Lin to determine which cells (rows) to remove as debris

  • Determine cut-offs for cells being positive for TNF-α and E. coli

  • Calculate the percentage of cells that are positive for TNF-α for each treatment
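
The later steps above (transform, remove debris, apply a cut-off, calculate percentages) can be sketched on simulated data in base R. Everything numeric here is invented for illustration and is not a prediction of the experimental result: the simulated means, the debris thresholds and the cut-off are arbitrary assumptions. `asinh()` is used only as a simple stand-in for a log-like transformation that copes with zero and negative values; it is not the logicle transformation itself.

```r
set.seed(42)
n <- 1000  # simulated cells per treatment (real files hold up to 50000)

# Simulate cells stained with the TNF-a antibody; all values are made up
simulate <- function(treatment, mean_red) {
  data.frame(
    treatment    = treatment,
    FS_Lin       = rnorm(n, mean = 500, sd = 150),
    SS_Lin       = rnorm(n, mean = 400, sd = 120),
    TNFa_APC_Lin = rnorm(n, mean = mean_red, sd = 50)
  )
}
cells <- rbind(simulate("Media", 20),
               simulate("LPS", 400),
               simulate("E. coli", 400))

# Log-like transformation (asinh stand-in; cofactor 150 is arbitrary)
cells$TNFa_APC_trans <- asinh(cells$TNFa_APC_Lin / 150)

# Remove debris: rows with low forward or side scatter (arbitrary thresholds)
is_debris <- cells$FS_Lin < 200 | cells$SS_Lin < 150
clean <- cells[!is_debris, ]

# Call a cell TNF-a positive if it is above the cut-off (arbitrary value)
tnfa_cutoff <- asinh(150 / 150)
clean$tnfa_positive <- clean$TNFa_APC_trans > tnfa_cutoff

# Percentage of TNF-a positive cells for each treatment
pct_positive <- aggregate(tnfa_positive ~ treatment, data = clean,
                          FUN = function(x) 100 * mean(x))
print(pct_positive)
```

Taking the mean of a TRUE/FALSE column gives the proportion of TRUE values, so multiplying by 100 gives the percentage of positive cells in each group.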

Tools

  • Import, rename and subset columns using the flowCore package (Ellis et al. 2024)

  • Automated quality control with the flowAI package (Monaco et al. 2016)

  • Apply a “logicle” transformation using the flowCore package

  • Put the data into a dataframe to make it easy to use tidyverse (Wickham et al. 2019) tools like group_by(), summarise(), ggplot(), filter()

The data in R

  • The flowCore package imports each FCS file as a flowFrame object

  • The flowFrame object contains the data from the FCS file and metadata about the experiment

  • A collection of related flowFrames is stored in a flowSet object

  • flowAI and flowCore functions work with flowSet objects

  • After that we can convert the flowSet to a dataframe to use tidyverse tools

Summary

  • Sample data are like the data you will produce in your own experiment

  • 3 treatments x 2 antibodies = 6 combinations; 4 variables, up to 50,000 cells each

  • The analysis is conceptually simple: quality control, transformation, scatter plots, and calculating percentages

  • The week 2 workshop analyses the sample data; in the week 4 workshop you will analyse your own data

  • We will use the flowCore, flowAI and tidyverse packages to do the analysis

References

Ellis, B, Perry Haaland, Florian Hahne, Nolwenn Le Meur, Nishant Gopalakrishnan, Josef Spidlen, Mike Jiang, and Greg Finak. 2024. “flowCore: Basic Structures for Flow Cytometry Data.”
Monaco, Gianni, Hao Chen, Michael Poidinger, Jinmiao Chen, Joao Pedro de Magalhaes, and Anis Larbi. 2016. “flowAI: Automatic and Interactive Anomaly Discerning Tools for Flow Cytometry Data.” Bioinformatics 32. https://doi.org/10.1093/bioinformatics/btw191.
Parks, David R., Mario Roederer, and Wayne A. Moore. 2006. “A New Logicle Display Method Avoids Deceptive Effects of Logarithmic Scaling for Low Signals and Compensated Data.” Cytometry Part A 69A (6): 541–51. https://doi.org/10.1002/cyto.a.20258.
Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the Tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.