TCGA Explorer — Pan-Cancer Gene Analysis Tool

Free, browser-based analysis of gene expression, survival, immune infiltration, and mutational signatures across 30 TCGA cancer types. Enter a gene symbol, select analyses, and generate PDF reports. For research use only.


About this tool

TCGA Explorer uses RNASeq STAR TPM expression data from The Cancer Genome Atlas (TCGA) to generate on-demand analysis reports for any protein-coding gene. Analyses cover pan-cancer expression profiles, tumour vs normal comparisons, stage-stratified expression, Kaplan–Meier survival curves, Cox regression heatmaps, immune cell infiltration scores, COSMIC v2/v3 mutational signatures, DESeq2 differential expression, and GSEA pathway enrichment.

Reports are generated as PDF files in background processes (10–90 min depending on scope). Save your Session ID to retrieve completed reports after closing the browser. Data source: TCGA via GDC. Expression data: STAR alignment, TPM normalisation. 30 cancer cohorts including BRCA, LUAD, COAD, GBM, OV, PRAD, KIRC, and more.

Author: Zsofia Sztupinszki. Copyright © 2023–2026.

Please cite: Sztupinszki Z, Fiam R, Pipek O, Diossy M, Börcsök J, Prosz A, Csabai I & Szallasi Z. Polymerase theta expression is correlated with proliferative capacity but not with DNA repair deficiency status in solid tumors. npj Precision Oncology 9, 200 (2025). https://doi.org/10.1038/s41698-025-01000-w

Select a gene above, navigate to an analysis tab, choose your analyses, and click Generate. Reports are created in the background (10–90 min). Save your Session ID to retrieve results later.


📊 Gene Expression

Pan-cancer overview, normal vs tumour, stage, proliferation index.

📈 Survival

Pan-cancer survival heatmap (Cox, median/best cutoff) plus per-cancer KM curves and breast subtype survival.

🔬 Molecular Features

Immune infiltration and mutational signatures (COSMICv2, v3, Indel).

🧬 Differential Expression

DESeq2 differential expression and GSEA pathway enrichment.

🔗 Pathway Enrichment

Interactive ssGSEA pan-cancer explorer.

ℹ️ About

Analysis descriptions, example reports, abbreviations, data sources.


Gene Expression Analyses

Pancancer overview of TCGA
Boxplots of TPM and log2(TPM+1) expression across all 30 TCGA cohorts and GTEx healthy tissue. Includes primary tumour, adjacent normal, and metastasis where available.
Normal vs Tumor, Met
Pairwise comparisons of tumour-adjacent normal vs primary vs metastatic samples. Includes paired-sample analysis.
Stage
Expression by tumour stage (I–IV) per cancer type.
Expression vs Proliferation Index
Scatter plots of gene expression vs tumour proliferation index.

Survival Analyses

Survival, all patients
Kaplan-Meier curves for OS, splitting patients into high/low expression groups using the maxstat optimal cutpoint. Results per selected cancer type.
Survival, treatment included platina
Same as above, restricted to platinum-treated patients.
Survival, treatment included paclitaxel
Same as above, restricted to paclitaxel-treated patients.
Breast cancer subtypes
Survival stratified by PAM50 molecular subtype (Luminal A/B, HER2-enriched, Basal-like, Normal-like).

Molecular Feature Analyses

Expression vs Immune infiltration, all patients
Correlation of gene expression with deconvoluted immune cell fractions (TIMER2.0) across selected cancer types.
Expression vs Mutational Signatures (COSMICv2 / v3)
Correlation with COSMIC SNV mutational signature activities (v2 and v3 separately).
Expression vs Indel Mutational Signatures
Correlation with COSMIC indel (ID) signature activities.
Immune infiltration, breast subtypes
Immune cell composition by PAM50 subtype in BRCA.

Differential Gene Expression & GSEA

This analysis:

  1. Splits TCGA patients for the selected cancer into top-N% (high) and bottom-N% (low) expressors of the selected gene.
  2. Runs DESeq2 to identify differentially expressed genes between the two groups.
  3. Runs fgsea on the ranked gene list against MSigDB HALLMARK + KEGG + GO gene sets.
  4. Produces volcano plots, heatmaps, and GSEA enrichment plots.
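
Step 1 above can be sketched as follows. This is a minimal illustration, not the tool's actual code: the function name `split_high_low`, the tie-breaking rule, and the rounding of the group size are assumptions, and the patient IDs and TPM values are made up.

```python
def split_high_low(expression, percent):
    """Return (high, low) patient-ID lists for the top and bottom `percent`.

    expression: dict mapping patient ID -> expression of the selected gene.
    percent: group size as a percentage of the cohort, 0 < percent <= 50.
    """
    if not 0 < percent <= 50:
        raise ValueError("percent must be in (0, 50]")
    # Sort ascending by expression; ties are broken by patient ID (assumption)
    ranked = sorted(expression, key=lambda pid: (expression[pid], pid))
    # Group size is rounded down, with at least one patient per group (assumption)
    n_group = max(1, int(len(ranked) * percent / 100))
    low = ranked[:n_group]
    high = ranked[-n_group:]
    return high, low


# Made-up TPM values for eight hypothetical patients
tpm = {"P1": 0.5, "P2": 3.2, "P3": 12.0, "P4": 7.1,
       "P5": 1.8, "P6": 25.4, "P7": 4.4, "P8": 9.9}
high, low = split_high_low(tpm, 25)
print("high:", high)
print("low:", low)
```

With a 25% limit and eight patients, each group holds two patients; the middle 50% are left out of the comparison.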

Note: The GSEA results from this analysis are also available in the Pathway Enrichment tab after the job completes.



Important

After saving your Session ID, you can leave the page and return later. Be aware that some reports take a long time to generate: for example, selecting all cancer types and all analyses (except GSEA) takes approximately 90 minutes.
If you want to perform additional analysis, always use “Restart Analysis”.
To view sample reports, visit the “Examples” tab.
Each report can be downloaded individually by clicking “Download…” or all at once in a zip file by clicking “Download All Reports.”
Except for the GSEA analysis, p-values are not corrected for multiple testing.
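
Since the reported p-values are unadjusted, readers who compare many cancer types or features may want to adjust them on their own. The sketch below shows the Benjamini-Hochberg procedure, which is one common choice and not something the tool itself applies; the function name and the example p-values are illustrative.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values, e.g. one per cancer type
raw = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(raw)
for p, q in zip(raw, adjusted):
    print(f"p = {p:.3f}  adjusted = {q:.3f}")
```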

Pancancer overview of TCGA

This report provides an overview of the expression of the selected gene in various cancer types, using data from the TCGA and GTEx datasets. It includes figures for TPM and log2(TPM+1).
The TCGA dataset contains three source types (though not for all cancers): Primary Tumor, Metastasis, and Tumor-adjacent Normal Tissue. The GTEx dataset represents healthy normal tissue.
Detailed results are available in the “Normal vs Tumor, Met” report.
Note that every sample is represented in these plots. The “Normal vs Tumor, Met” report (available on the main site) offers separate plots: one with all samples included and another with only paired samples (i.e., tumor and normal tissue from the same patient).
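
For readers unfamiliar with the log2(TPM+1) scale used in these figures, here is a minimal sketch of the transform. The function name `log2_tpm` is illustrative and the input values are made up.

```python
import math


def log2_tpm(tpm, pseudocount=1.0):
    """Transform a TPM value to the log2(TPM + 1) scale."""
    if tpm < 0:
        raise ValueError("TPM values cannot be negative")
    # The pseudocount keeps zero expression defined: log2(0 + 1) = 0
    return math.log2(tpm + pseudocount)


values = [0.0, 1.0, 3.0, 1023.0]
transformed = [log2_tpm(v) for v in values]
for raw, logged in zip(values, transformed):
    print(f"TPM = {raw:7.1f}  log2(TPM+1) = {logged:.1f}")
```

The log scale compresses the large dynamic range of TPM values, which is why differences between low-expressing cohorts are easier to see in the log2(TPM+1) figures.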

Normal vs Tumor, Met

This report compares the expression of the selected gene in TCGA across Primary Tumor, Metastasis, and Tumor-adjacent Normal Tissue.
The first column of figures contains all samples for the cancer type; the second column contains only those samples where Tumor-adjacent Normal Tissue and Primary Tumor RNA-Seq data are available from the same patient.
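
The paired-sample selection can be sketched from TCGA barcodes, where the first 12 characters identify the patient and the two-digit sample type code follows (01 for Primary Solid Tumor, 11 for Solid Tissue Normal). This is an illustration only: the function name `paired_patients` is an assumption, the barcodes are made up, and the tool may select pairs differently.

```python
from collections import defaultdict

PRIMARY_TUMOR = "01"  # TCGA sample type code: Primary Solid Tumor
SOLID_NORMAL = "11"   # TCGA sample type code: Solid Tissue Normal


def paired_patients(barcodes):
    """Return sorted patient IDs that have both a primary tumor and a normal sample."""
    types_by_patient = defaultdict(set)
    for barcode in barcodes:
        patient = barcode[:12]        # e.g. "TCGA-AA-0001"
        sample_type = barcode[13:15]  # two digits after the patient ID
        types_by_patient[patient].add(sample_type)
    required = {PRIMARY_TUMOR, SOLID_NORMAL}
    return sorted(p for p, types in types_by_patient.items() if required <= types)


# Made-up barcodes, truncated to the sample portion
barcodes = [
    "TCGA-AA-0001-01A", "TCGA-AA-0001-11A",  # tumor + normal: paired
    "TCGA-AA-0002-01A",                      # tumor only
    "TCGA-AA-0003-11A", "TCGA-AA-0003-06A",  # normal + metastasis, no primary
    "TCGA-AA-0004-11A", "TCGA-AA-0004-01A",  # paired
]
paired = paired_patients(barcodes)
print(paired)
```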

Stage

This report compares the expression of the selected gene across cancer stages. Each row of figures represents a different cancer type. The first column of figures uses the AJCC Pathological Stage, the second column uses the Clinical Stage, and the third column uses the “Consensus Stage,” which takes the AJCC Pathological Stage and, where that is unavailable, falls back to the Clinical Stage.
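
The Consensus Stage rule can be sketched as a simple fallback. The function name and the way missing values are coded (None or an empty string) are assumptions for illustration; the stage labels are made up.

```python
def consensus_stage(pathological, clinical):
    """Prefer the AJCC pathological stage; fall back to the clinical stage."""
    def clean(value):
        # Treat None and blank strings as missing (assumption)
        if value is None:
            return None
        value = value.strip()
        return value or None

    return clean(pathological) or clean(clinical)


# (pathological stage, clinical stage) for four hypothetical patients
records = [
    ("Stage II", "Stage I"),  # both present: pathological wins
    (None, "Stage III"),      # pathological missing: use clinical
    ("", "Stage IV"),         # blank pathological: use clinical
    (None, None),             # neither available: stays missing
]
consensus = [consensus_stage(p, c) for p, c in records]
print(consensus)
```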

Expression vs Proliferation Index

The expression of many, perhaps most, “cancer genes” is associated with proliferation. Few genes are upregulated in cancer tissue compared to normal tissue without also being associated with proliferation. For background on this topic, see: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5748209/
The “general level of proliferation” in a sample can be estimated from RNA-Seq data; these estimates are usually referred to as proliferation indexes. The proliferation index was determined using two methods:

  • Venet D, Dumont JE, Detours V. Most random gene expression signatures are significantly associated with breast cancer outcome. PLoS Comput Biol. 2011; 7. doi: 10.1371/journal.pcbi.1002240.
  • Locard-Paulet M, Palasca O, Jensen LJ. Identifying the genes impacted by cell proliferation in proteomics and transcriptomics studies.

Furthermore, the correlation with cell cycle-related genes is also represented in this report.

Survival, all patients

Survival analysis for patients with the selected cancer type. Methods used: Kaplan-Meier and Cox proportional hazards regression. Patients are divided into two groups based on either median expression or optimal cutoff. Survival times analyzed: Overall Survival (OS), Progression-Free Survival (PFS), Disease-Specific Survival (DSS), Disease-Free Interval (DFI), and Progression-Free Interval (PFI). Refer to the Abbreviations section for detailed definitions. To generate this analysis for all cancer types, use the “Select All” option.
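
To make the median split and the Kaplan-Meier estimate concrete, here is a minimal sketch in plain Python. It is not the tool's implementation: the function names are illustrative, patients with expression exactly at the median are placed in the low group by assumption, and the data are made up.

```python
from statistics import median


def split_by_median(expression):
    """Return True for patients above the median expression, False otherwise."""
    cutoff = median(expression)
    return [value > cutoff for value in expression]


def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival) at each event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored patients leave the risk set
    return curve


# Made-up data: expression, follow-up in days, event (1 = event, 0 = censored)
expression = [2.0, 9.0, 4.0, 7.0, 1.0, 8.0]
times = [5, 8, 12, 12, 20, 30]
events = [1, 0, 1, 1, 0, 1]

is_high = split_by_median(expression)
high_curve = kaplan_meier([t for t, h in zip(times, is_high) if h],
                          [e for e, h in zip(events, is_high) if h])
low_curve = kaplan_meier([t for t, h in zip(times, is_high) if not h],
                         [e for e, h in zip(events, is_high) if not h])
print("high:", high_curve)
print("low:", low_curve)
```

The optimal-cutoff variant replaces the median with the cutpoint that best separates the two groups, which tends to give smaller p-values than a fixed median split and should be interpreted with that in mind.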

Survival, treatment included platina

Survival analysis for patients with documented treatment that included platinum-based drugs (e.g., cisplatin, carboplatin). More patients probably received this treatment than the records show, but every patient included here has it documented.

Survival, treatment included paclitaxel

Survival analysis for patients with documented treatment that included paclitaxel or other taxanes. More patients probably received this treatment than the records show, but every patient included here has it documented.

Expression vs Immune infiltration, all patients

The gene expression is compared to the sample purity and immune composition, which are deconvoluted using bulk RNA-Seq data. The methods used include CIBERSORT, TIMER, the approach presented by Danaher et al., MCP-counter, and xCell.
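
One way to quantify such a comparison is a rank correlation between expression and an immune cell fraction. The sketch below uses Spearman correlation as an assumption; the page does not state which correlation measure the reports use, and the function names and data are illustrative.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up values: gene expression and an estimated immune cell fraction
gene_expression = [1.2, 3.4, 2.2, 8.9, 5.0]
cell_fraction = [0.02, 0.10, 0.05, 0.30, 0.08]
rho = spearman(gene_expression, cell_fraction)
print(f"Spearman rho = {rho:.2f}")
```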

Expression vs Mutational Signatures (COSMICv2)

The gene expression is compared to the contribution (weight) and number of mutations corresponding to COSMICv2 single nucleotide mutational signatures (https://cancer.sanger.ac.uk/signatures/signatures_v2/), using whole-exome sequencing data.

Expression vs Mutational Signatures (COSMICv3)

The gene expression is compared to the contribution (weight) and number of mutations corresponding to COSMICv3 single nucleotide mutational signatures (https://cancer.sanger.ac.uk/signatures/sbs/), using whole-exome sequencing data.

Expression vs Indel Mutational Signatures

The gene expression is compared to the contribution (weight) and number of mutations corresponding to COSMICv3 indel mutational signatures (https://cancer.sanger.ac.uk/signatures/id/), using whole-exome sequencing (WES) data. These signatures are less reliable using WES data (compared to WGS).

Expression vs Immune infiltration, in breast cancer subtypes

Same as “Expression vs Immune infiltration, all patients”, except that the breast cancer cases are classified into PAM50 subtypes using RNA-Seq data and the analysis is carried out separately for each subtype.

Survival, breast cc. patients per subtype

Same as “Survival, all patients”, except that the breast cancer cases are classified into PAM50 subtypes using RNA-Seq data and the analysis is carried out separately for each subtype.

Survival, breast cc. patients per subtype, treatment included paclitaxel

Same as “Survival, treatment included paclitaxel”, except that the breast cancer cases are classified into PAM50 subtypes using RNA-Seq data and the analysis is carried out separately for each subtype.

TCGA, Differential gene expression analysis of high vs low expressing group and GSEA

To view the tables and some of the figures, please download this report and open it on your PC.
In the first step, patients are divided into two groups based on the limit set on the scale. These groups are then compared to identify differentially expressed genes. Subsequently, Gene Set Enrichment Analysis (GSEA) is performed. For example, if you choose the gene BRCA1, ovarian cancer (OV), and a limit of 25%, the analysis compares the highest BRCA1-expressing patients (top 25%) to the lowest 25%. The differentially expressed genes between these groups are then analyzed further.

TCGA Cancer Types

Abbreviation  Cancer Type
ACC           Adrenocortical carcinoma
BLCA          Bladder Urothelial Carcinoma
BRCA          Breast invasive carcinoma
CESC          Cervical squamous cell carcinoma and endocervical adenocarcinoma
CHOL          Cholangiocarcinoma
COAD          Colon adenocarcinoma
DLBC          Lymphoid Neoplasm Diffuse Large B-cell Lymphoma
ESCA          Esophageal carcinoma
GBM           Glioblastoma multiforme
HNSC          Head and Neck squamous cell carcinoma
KICH          Kidney Chromophobe
KIRC          Kidney renal clear cell carcinoma
KIRP          Kidney renal papillary cell carcinoma
LGG           Brain Lower Grade Glioma
LIHC          Liver hepatocellular carcinoma
LUAD          Lung adenocarcinoma
LUSC          Lung squamous cell carcinoma
MESO          Mesothelioma
OV            Ovarian serous cystadenocarcinoma
PAAD          Pancreatic adenocarcinoma
PCPG          Pheochromocytoma and Paraganglioma
PRAD          Prostate adenocarcinoma
READ          Rectum adenocarcinoma
SARC          Sarcoma
SKCM          Skin Cutaneous Melanoma
STAD          Stomach adenocarcinoma
TGCT          Testicular Germ Cell Tumors
THCA          Thyroid carcinoma
THYM          Thymoma
UCEC          Uterine Corpus Endometrial Carcinoma
UCS           Uterine Carcinosarcoma

Survival Abbreviations

OS (Overall Survival)
Event: 1 for death from any cause, 0 for alive.
Time: Overall survival time in days (last_contact_days_to or death_days_to, whichever is larger).
DSS (Disease-Specific Survival)
Event: 1 for patient whose vital_status was Dead and tumor_status was WITH TUMOR (indicating death from disease). 0 for patient whose vital_status was Alive or whose vital_status was Dead and tumor_status was TUMOR FREE. Note: This is an approximation based on available data; some patients with tumor may have died from other causes.
Time: Disease-specific survival time in days (last_contact_days_to or death_days_to, whichever is larger).
DFI (Disease-Free Interval)
Event: 1 for patients having new tumor event (local recurrence, distant metastasis, or new primary tumor), including cases with unknown type. Disease-free was defined by: (1) treatment_outcome_first_course = 'Complete Remission/Response'; or (2) residual_tumor = 'R0'; or (3) margin_status = 'negative'. If none of these fields were available, DFI was NA.
Time: Disease-free interval time in days (new_tumor_event_dx_days_to for events; for censored cases, last_contact_days_to or death_days_to, whichever is applicable).
PFI (Progression-Free Interval)
Event: 1 for patients having any new tumor event (progression, local recurrence, distant metastasis, new primary tumors), or died with cancer without new tumor event, including cases with unknown event type.
Time: Progression-free interval time in days (for events, either new_tumor_event_dx_days_to or death_days_to, whichever is applicable; for censored cases, last_contact_days_to or death_days_to, whichever is applicable).
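
The OS and DSS definitions above can be sketched as small functions over the clinical fields they name. This is an illustration under stated assumptions: the function names are hypothetical, missing values are coded as None, and returning None for a deceased patient with unknown tumor_status is a choice made here, not something the definitions specify.

```python
def overall_survival(vital_status, last_contact_days_to, death_days_to):
    """Return (event, time_in_days) for OS."""
    event = 1 if vital_status == "Dead" else 0
    # Time is the larger of the two fields, ignoring missing values
    days = [d for d in (last_contact_days_to, death_days_to) if d is not None]
    time = max(days) if days else None
    return event, time


def disease_specific_event(vital_status, tumor_status):
    """Return the DSS event: 1, 0, or None when it cannot be determined."""
    if vital_status == "Dead" and tumor_status == "WITH TUMOR":
        return 1
    if vital_status == "Alive":
        return 0
    if vital_status == "Dead" and tumor_status == "TUMOR FREE":
        return 0
    return None  # e.g. dead with unknown tumor status (assumption)


# Made-up patients
print(overall_survival("Dead", 300, 412))
print(overall_survival("Alive", 950, None))
print(disease_specific_event("Dead", "WITH TUMOR"))
print(disease_specific_event("Dead", "TUMOR FREE"))
```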

Source of data: