RNA-seq sample preparation
γδ T cells were isolated from tumor-bearing lungs or spleens of KP mice by FACS. Approximately 1000 cells per condition were sorted directly into TCL buffer (QIAGEN), and sequencing libraries were prepared using a modified Smart-seq2 protocol as previously described (Picelli et al., 2014; Singer et al., 2017). A total of 9 spleen and 15 lung samples from three independent experiments were sequenced on the Illumina HiSeq 2000 platform.
The graphical abstract image is created with BioRender.
QUANTIFICATION AND STATISTICAL ANALYSIS
GraphPad Prism 7 was used for statistical analysis of the histology, IHC, 16S quantification, and flow cytometry data. P values from unpaired two-tailed Student’s t tests were used for comparisons between two groups, and one-way ANOVA with Bonferroni’s post hoc test was used for multiple comparisons. For the correlation analysis of tumor burden and bacterial load, linear regression was performed with GraphPad Prism 7. Figure legends specify the statistical analysis used.
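As an illustration of the two-group comparison and the multiple-comparison correction named above, the following is a minimal Python sketch using only the standard library. It is not the GraphPad Prism procedure: it computes the pooled-variance Student's t statistic and its degrees of freedom (without a p value), and it applies the simple multiply-by-m form of the Bonferroni correction, whereas a post hoc test run after ANOVA may use a pooled error term that is not reproduced here. The function names and example values are hypothetical.

```python
import math
from statistics import mean, variance


def student_t(group_a, group_b):
    """Return (t, df) for an unpaired, equal-variance Student's t test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled sample variance across both groups.
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, df


def bonferroni(p_values):
    """Multiply each p value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical measurements for two groups.
t_stat, dof = student_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
adjusted = bonferroni([0.01, 0.04, 0.5])
```

The t statistic would then be compared against a two-tailed t distribution with `dof` degrees of freedom to obtain the p value.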
Illumina HiSeq 2000 40-nt single-ended reads were mapped to the UCSC mm9 mouse genome build (genome.ucsc.edu) using RSEM (Li and Dewey, 2011). Raw estimated expression counts were upper-quartile normalized to a count of 1000 (Bullard et al., 2010). Genes with an upper-quartile expression distribution value less than 20 in both conditions were considered lowly expressed and dropped from downstream analyses. The expression dataset was log (base 2) transformed to stabilize variance. KP lung (n = 15) and KP spleen (n = 9) samples were jointly analyzed to derive a murine signature of lung γδ T cell gene expression. A high-resolution signature discovery approach (Independent Component Analysis, ICA) was employed to characterize changes in gene expression profiles as described previously (Romero et al., 2017). Briefly, this unsupervised blind source separation technique was used on this discrete count-based expression dataset to elucidate statistically independent and biologically relevant signatures. ICA is a signal processing and multivariate data analysis technique in the category of unsupervised matrix factorization methods (Hyvärinen and Oja, 2000). The R implementation of the core JADE algorithm (Joint Approximate Diagonalization of Eigenmatrices) (Biton et al., 2014) was used along with custom R utilities. Statistical significance of biologically relevant signatures was assessed using the Mann-Whitney-Wilcoxon test (alpha = 0.05). Signature genes with |z-score| > 3 and |fold-change| > 2 were used to generate a heatmap illustrating changes in gene expression, with the Heatplus package in R. Gene set enrichment analyses (GSEA) were carried out using the signature scores per gene (z-scores) in pre-ranked mode with default settings (Subramanian et al., 2005). A volcano plot was used to illustrate the magnitude of fold-change for top-scoring (z-scores) genes in the signature.
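The preprocessing steps described above (upper-quartile normalization to 1000, removal of lowly expressed genes, and log2 transformation) can be sketched as follows. This is an illustrative Python sketch, not the authors' R code. It makes several assumptions that the text does not state: the upper quartile of each sample is taken over nonzero counts, the quartile is computed with the "inclusive" interpolation method, and a pseudocount of 1 is added before taking logarithms. All function and gene names are hypothetical.

```python
import math
from statistics import quantiles


def upper_quartile(values):
    """75th percentile; requires at least two values."""
    return quantiles(values, n=4, method="inclusive")[2]


def uq_normalize(sample_counts, target=1000.0):
    """Scale one sample so the upper quartile of its nonzero counts equals target."""
    nonzero = [c for c in sample_counts.values() if c > 0]  # assumption: zeros excluded
    scale = target / upper_quartile(nonzero)
    return {gene: c * scale for gene, c in sample_counts.items()}


def is_expressed(gene, groups, threshold=20.0):
    """Keep a gene unless its upper quartile is below threshold in every condition.

    groups maps a condition name to a list of normalized samples (gene -> value).
    """
    return any(
        upper_quartile([sample[gene] for sample in samples]) >= threshold
        for samples in groups.values()
    )


def log2_transform(value, pseudocount=1.0):
    """Log base 2 with a pseudocount (assumed) so that zero stays finite."""
    return math.log2(value + pseudocount)


# Hypothetical raw counts for a single sample.
raw = {"g1": 10, "g2": 20, "g3": 30, "g4": 40, "g5": 50, "g6": 0}
normalized = uq_normalize(raw)
```

In this toy sample the upper quartile of the nonzero counts is 40, so every count is multiplied by 25 and `g4` lands exactly on the target of 1000.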
The expression levels of 8 biologically relevant top-scoring genes in the signature were illustrated using a pairwise dot-plot between lung and spleen samples. False discovery rate (FDR) values for differential expression status of these genes were calculated using a pairwise comparison in EBSeq (Leng et al., 2013). All RNA-seq analyses were conducted in the R statistical programming language (http://www.r-project.org/).
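The signature-gene thresholds used for the heatmap in this analysis (absolute z-score above 3 and absolute fold-change above 2) amount to a simple two-condition filter. A minimal Python sketch follows. It assumes that each gene carries a signature z-score and a log2 fold-change, so the fold-change cutoff of 2 becomes a cutoff of 1 on the log2 scale. That representation, the function name, and the gene names are assumptions for illustration only.

```python
import math


def select_signature_genes(scores, z_cutoff=3.0, fc_cutoff=2.0):
    """Return genes passing both cutoffs, sorted by name.

    scores maps gene -> (z_score, log2_fold_change); the log2 scale is assumed.
    """
    log2_cutoff = math.log2(fc_cutoff)
    return sorted(
        gene
        for gene, (z, log2_fc) in scores.items()
        if abs(z) > z_cutoff and abs(log2_fc) > log2_cutoff
    )


# Hypothetical per-gene scores.
example_scores = {
    "gene_a": (4.0, 1.5),    # passes both cutoffs
    "gene_b": (-3.5, -2.0),  # passes both cutoffs (downregulated)
    "gene_c": (3.2, 0.5),    # fold-change too small
    "gene_d": (1.0, 3.0),    # z-score too small
}
selected = select_signature_genes(example_scores)
```

Because both conditions use absolute values, strongly upregulated and strongly downregulated genes are retained alike.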
Clinical data analysis
IL-22 receptor (IL22RA1) expression levels were compared between normal and tumor tissue within the TCGA (https://cancergenome.nih.gov/) LUAD cohort. Matched normal and tumor tissue IL22RA1 standardized expression levels were illustrated using an empirical cumulative distribution function (ECDF) plot, and significance was assessed using a Kolmogorov-Smirnov test. Similarly, the set of normal samples was compared with the entire set of tumor samples (either with or lacking matched normal samples). For survival analysis, RNA-seq gene expression profiles of primary tumors and relevant clinical data of 515 LUAD patients were obtained from TCGA, and survival analyses were conducted as described previously (Romero et al., 2017; Tammela et al., 2017).
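To make the ECDF comparison concrete, the sketch below builds an empirical cumulative distribution function for each group and computes the two-sample Kolmogorov-Smirnov statistic, which is the largest vertical gap between the two ECDFs. It is a standard-library Python illustration with hypothetical values, not the analysis code used for the TCGA data, and it returns only the statistic, not a p value.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return a function F where F(x) is the fraction of values <= x."""
    xs = sorted(sample)
    n = len(xs)
    return lambda x: bisect_right(xs, x) / n


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: maximum absolute difference between the ECDFs."""
    f_a, f_b = ecdf(sample_a), ecdf(sample_b)
    # The maximum gap is always attained at one of the observed values.
    return max(abs(f_a(x) - f_b(x)) for x in set(sample_a) | set(sample_b))


# Hypothetical standardized expression values for two groups.
normal = [1.0, 2.0, 3.0, 4.0]
tumor = [3.0, 4.0, 5.0, 6.0]
d = ks_statistic(normal, tumor)
```

A statistic of 0 means the two ECDFs coincide at every observed value, and a statistic of 1 means the two samples do not overlap at all.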