Bioinfo | 003. Reference Review - 2
This post reviews in detail the analysis section of "Significance of Immunogenic Cell Death-Related Prognostic Gene Signature in Cervical Cancer Prognosis and Anti-Tumor Immunity", tracing its analytical workflow as groundwork for a hands-on implementation.
In summary, the study concludes from a series of analyses that high PDIA3 expression is a significant prognostic predictor in cervical cancer: patients with high PDIA3 expression tend to have poorer outcomes and therefore warrant closer attention. (What counts as high expression, though?)
Data Sources
- Cervical cancer databases from TCGA (gene expression data for 309 patients, clinical-pathological data for 323 patients)
- GSE44001 (300 samples)
- GSE16852
- GSE6791
Data Processing
- Incomplete samples were excluded, retaining only those with complete data for subsequent analysis.
- Transcripts per million (TPM) values were calculated from FPKM (Fragments Per Kilobase of transcript per Million mapped fragments) values to make expression comparable across samples. After conversion, every sample's values sum to the same total (one million), which reduces inaccuracies in cross-sample quantification.
- tpm = (fpkm / sum(fpkm)) * 1e6
- 49 genes related to Immunogenic Cell Death (ICD) were identified from previous literature and studies.
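The FPKM-to-TPM formula listed above can be tried on a toy sample. This is a minimal sketch in plain Python, not the paper's code: the function name `fpkm_to_tpm`, the gene `GENE_X`, and all expression values are made up for illustration.

```python
def fpkm_to_tpm(fpkm):
    """Convert one sample's FPKM values to TPM.

    Implements tpm = (fpkm / sum(fpkm)) * 1e6 for a single sample.
    """
    total = sum(fpkm)
    if total == 0:
        raise ValueError("FPKM values sum to zero; TPM is undefined")
    return [value / total * 1e6 for value in fpkm]


# Hypothetical FPKM values for one sample (illustrative numbers only).
sample = {"PDIA3": 30.0, "GAPDH": 60.0, "GENE_X": 10.0}
tpm = dict(zip(sample, fpkm_to_tpm(list(sample.values()))))

for gene, value in tpm.items():
    print(gene, round(value, 1))
```

The conversion is applied per sample, so each sample's TPM values add up to one million regardless of sequencing depth.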
Gene Set Variation Analysis (GSVA)
pkg:
- RCircos
- maftools
- Use “RCircos” to analyze how often the 49 Immunogenic Cell Death (ICD)-related genes show copy number variation (CNV) gains or losses across the 23 pairs of human chromosomes in cervical cancer. GSVA is used to analyze the CNV and mutation status of ICD-related genes in cervical cancer.
- Use “maftools” to detect the mutation frequencies, mutation types, and base changes of ICD-related genes in the TCGA cohort.
Copy Number Variation (CNV) refers to the phenomenon where the number of copies of certain segments in the genome differs from the reference genome. These variations include gains (copy number gain) or losses (copy number loss) of gene or genomic segments. CNV can affect gene dosage, thereby influencing gene expression and function, and is an important genetic factor in many diseases, including cancer.
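To make the idea of gain and loss frequency concrete, here is a minimal sketch that tallies per-gene frequencies from discretized copy-number calls. The encoding (+1 gain, -1 loss, 0 unchanged), the gene names, and the data are assumptions for illustration; this is not what RCircos does internally, since RCircos only draws the result.

```python
def cnv_frequencies(calls):
    """Return (gain_fraction, loss_fraction) for one gene.

    `calls` holds one discretized copy-number call per sample:
    +1 = gain, -1 = loss, 0 = no change (an assumed encoding).
    """
    n = len(calls)
    if n == 0:
        raise ValueError("no samples provided")
    gains = sum(1 for call in calls if call > 0)
    losses = sum(1 for call in calls if call < 0)
    return gains / n, losses / n


# Hypothetical calls for two genes across five samples.
cnv_calls = {
    "GENE_A": [1, 1, 0, -1, 0],
    "GENE_B": [0, -1, -1, -1, 0],
}
freqs = {gene: cnv_frequencies(calls) for gene, calls in cnv_calls.items()}

for gene, (gain, loss) in freqs.items():
    print(f"{gene}: gain={gain:.0%} loss={loss:.0%}")
```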
Clustering Analysis Based on IRG Expression
pkg:
- ConsensusClusterPlus
- clusterProfiler
- Using “ConsensusClusterPlus” for clustering analysis of tumor samples (TCGA cohort plus the 3 GEO datasets?).
- Kaplan-Meier analysis to assess the prognosis of tumor subtypes identified through consensus clustering.
- Conducting functional enrichment analysis (GO and KEGG).
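The Kaplan-Meier step in the list above compares survival curves between the clusters. As a rough illustration of how one such curve is estimated, here is a plain-Python sketch of the product-limit estimator. The follow-up times and event flags are invented, and a real analysis would more likely use an established survival package.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    `events` uses 1 for an observed death and 0 for a censored patient.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set.
        at_risk -= removed
    return curve


# Hypothetical follow-up times (months) and event flags for one cluster.
km_curve = kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 1, 0])
for t, s in km_curve:
    print(t, round(s, 3))
```

Running the estimator once per cluster gives the curves that a log-rank test would then compare.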
Construction of ICD-related Risk Scoring Model
pkg:
- limma
- glmnet
- maxstat
- Use “limma” to identify differentially expressed genes related to Immunogenic Cell Death (ICD).
- Further narrow down the ICD-related genes using LASSO regression.
- Use Cox regression analysis to calculate the risk index (IPGs model). The model coefficients guide the subsequent analysis, which focuses on PDIA3.
- Plot ROC curves and calculate AUC.
- Combine clinical data with IPGs scores in univariate and multivariate Cox analyses to determine whether the risk score can serve as an independent predictor of prognosis in cervical cancer patients.
- Use “maxstat” to find the cutoff that splits patients into high-risk and low-risk groups.
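The scoring steps above boil down to a weighted sum of gene expression followed by a split at a cutoff. The sketch below shows that arithmetic together with a simple rank-based AUC. Every number in it is hypothetical: the coefficients, the expression values, the gene `GENE_B`, and the cutoff of 1.0 are placeholders, not values from the paper. The cutoff in particular is hard-coded here rather than selected by maxstat, and the AUC treats the outcome as binary rather than time-dependent.

```python
def risk_score(expression, coefficients):
    """Linear risk score: sum of coefficient * expression over model genes."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def auc(scores, labels):
    """AUC as the chance that an event patient outscores a non-event one."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one patient in each outcome class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical Cox coefficients and per-patient expression values.
coefficients = {"PDIA3": 0.5, "GENE_B": -0.2}
patients = [
    {"PDIA3": 4.0, "GENE_B": 1.0},
    {"PDIA3": 2.0, "GENE_B": 3.0},
    {"PDIA3": 3.0, "GENE_B": 0.0},
    {"PDIA3": 1.0, "GENE_B": 2.0},
]
died = [1, 0, 1, 0]  # hypothetical binary outcome per patient

scores = [risk_score(p, coefficients) for p in patients]
cutoff = 1.0  # placeholder; the paper selects its cutoff with maxstat
groups = ["high" if s > cutoff else "low" for s in scores]

print(scores, groups, auc(scores, died))
```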
Analysis of IRG Correlation with Immune Infiltrating Cells and Prospects for Immunotherapy
pkg:
- ESTIMATE
- pRRophetic
- Calculate ImmuneScore, StromalScore, and ESTIMATEScore, which represent immune cell infiltration, stromal cell infiltration, and a combination of the two, respectively.
- Perform single-sample gene set enrichment analysis (ssGSEA).
- Use “pRRophetic” to assess differences in drug sensitivity.
- Utilize the interface provided by Tumor Immune Dysfunction and Exclusion (TIDE) (http://tide.dfci.harvard.edu/) to predict responses to immunotherapy.
The Mann–Whitney test was used to compare immune cell ssGSEA scores and immune-related pathways between groups. Two-tailed p-values < 0.05 were considered statistically significant.
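For intuition about the Mann–Whitney comparison, here is a minimal plain-Python sketch of the U statistic with a two-tailed p-value from the normal approximation. It applies no tie or continuity correction, so it is only a rough stand-in for a statistics library, and the ssGSEA scores for the two groups are invented.

```python
import math


def mann_whitney_u(a, b):
    """Return (U for group a, two-tailed p-value by normal approximation).

    No tie or continuity correction is applied; this is a teaching sketch.
    """
    n1, n2 = len(a), len(b)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups need at least one value")
    # U counts pairs where a's value exceeds b's; ties count one half.
    u = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u, p_value


# Hypothetical ssGSEA scores of one immune cell type in two risk groups.
low_risk = [0.9, 0.8, 0.7, 0.6]
high_risk = [0.5, 0.4, 0.3, 0.2]
u_stat, p_val = mann_whitney_u(low_risk, high_risk)
print(u_stat, p_val < 0.05)
```

With groups this small an exact test would be preferable; the normal approximation is used here only to keep the example short.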
ssGSEA (single-sample Gene Set Enrichment Analysis) and GSEA (Gene Set Enrichment Analysis) are both enrichment methods based on gene expression data, but they differ in purpose and in the input they require:
- GSEA (Gene Set Enrichment Analysis)
- Definition: GSEA is a method to discover enrichment of gene sets under two different biological conditions, typically comparing disease group versus control group.
- Principle: GSEA ranks all genes by a measure of differential expression between the two conditions, walks down the ranked list, and calculates an enrichment score that reflects whether the genes of a gene set concentrate at the top or bottom of the list.
- Applicability: Suitable for comparing gene expression differences between two or more conditions (e.g., disease group vs control group) to explore biological processes and pathways related to disease.
- ssGSEA (single-sample Gene Set Enrichment Analysis)
- Definition: ssGSEA evaluates the enrichment of gene sets in individual samples based on gene expression data without the need for predefined sample groups.
- Principle: ssGSEA ranks genes by their expression within each sample and calculates an enrichment score for each gene set from where its member genes fall in that ranking.
- Applicability: Suitable for assessing the activity levels of specific biological processes or pathways within individual samples, such as evaluating immune cell infiltration or metabolic pathway activity.
Summary of Differences:
- Objective: GSEA is used for inter-group comparisons to find biologically relevant pathways with differential expression. ssGSEA evaluates the enrichment of gene sets within single samples.
- Input: GSEA requires grouping information to divide gene expression data into at least two groups. ssGSEA operates directly on single samples without predefined groups.
- Application: GSEA is appropriate for comparing gene expression under different conditions (e.g., disease states, pre- and post-treatment). ssGSEA is suitable for assessing the enrichment of gene sets within individual samples.
In conclusion, the choice between GSEA and ssGSEA depends on the research question and the type of data: GSEA is chosen for comparing gene set enrichment across different conditions, while ssGSEA is chosen for evaluating gene set enrichment within individual samples.
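To illustrate the "single sample, no groups" point, here is a simplified sketch of an ssGSEA-style score for one sample. It follows the general recipe of ranking genes within the sample and summing the gap between the running weighted fraction of set genes and the running fraction of non-set genes. The exponent `alpha=0.25`, the gene names, and the expression values are assumptions, and the result is not normalized, so it will not reproduce the numbers from any particular package.

```python
def ssgsea_score(expression, gene_set, alpha=0.25):
    """Simplified ssGSEA-style enrichment score for a single sample.

    `expression` maps gene -> expression value for one sample.
    """
    genes = sorted(expression, key=expression.get, reverse=True)
    n = len(genes)
    in_set = [gene in gene_set for gene in genes]
    n_in = sum(in_set)
    if n_in == 0 or n_in == n:
        raise ValueError("gene set must cover some, but not all, genes")
    # Rank-based weights: the top-expressed gene has rank n, the last rank 1.
    weights = [(n - i) ** alpha if hit else 0.0 for i, hit in enumerate(in_set)]
    total_weight = sum(weights)
    n_out = n - n_in
    p_in = p_out = score = 0.0
    for weight, hit in zip(weights, in_set):
        if hit:
            p_in += weight / total_weight
        else:
            p_out += 1 / n_out
        score += p_in - p_out  # accumulate the gap between the two curves
    return score


# Two hypothetical samples scored against the same two-gene set.
gene_set = {"g1", "g2"}
sample_a = {"g1": 9.0, "g2": 8.0, "g3": 2.0, "g4": 1.0}  # set genes on top
sample_b = {"g1": 1.0, "g2": 2.0, "g3": 8.0, "g4": 9.0}  # set genes at bottom
score_a = ssgsea_score(sample_a, gene_set)
score_b = ssgsea_score(sample_b, gene_set)
print(score_a > 0, score_b < 0)
```

Each sample gets its own score without reference to any other sample, which is why ssGSEA scores can later be compared between groups with a test such as Mann–Whitney.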
Experimental Materials?
The human CC-related cell lines HeLa and C33a were purchased from Meisen (Zhejiang, China), and SiHa and ME180 were purchased from Pricella (Wuhan, China). All cultures were supplemented with RPMI 1640 medium, 0.25% trypsin ethylenediamine tetraacetic acid, and fetal bovine serum (Gibco, NY, USA) and maintained below 5% CO2 at 37 °C. PDIA3 antibody (Proteintech, Wuhan, China), GAPDH antibody (CST, Shanghai, China), goat anti-rabbit IgG antibody secondary antibody (Zsbio, Beijing, China), and goat anti-mouse IgG H&L/Cy3 (Beyotime, Shanghai, China).
Ignored.
Immunofluorescence
For PDIA3 co-localization analysis, the CC cells were treated with a PDIA3 antibody (1:2000 dilution) overnight at 4 °C. The cells were then restained with goat anti-rabbit IgG H&L/FITC (1:2000 dilution).
Processing methods (this belongs to the wet-lab experiments; not sure why wet-lab experiments are needed here?)
Real-Time Quantitative PCR
We collected 8 samples of normal ovarian tissue and 24 samples of CESC tissue for real-time quantitative PCR (qPCR) detection. Total RNA was extracted from tissue samples (Shanghai, China) using the Promega total RNA extraction kit. After RNA extraction, cDNA was synthesized from total RNA using the Transcriptor First Strand cDNA Synthesis Kit (Shanghai, China). qPCR was performed using SuperReal PreMix Plus (Tiangen Biotech, Beijing, China) to determine the expression levels of IPGs (identified in Supplementary Table 1) purchased from Shanghai Shenggong Bioengineering Co., Ltd.
I am not very familiar with the experimental details, so is the purpose to validate the model's effectiveness through experiments?