The prognosis of patients with metastatic melanomas is extremely heterogeneous. Therefore, identifying high-risk subgroups by using innovative prediction models would help to improve selection of appropriate management options.
In this study, two datasets (GSE7929 and GSE7956) of mRNA expression microarray in an animal melanoma model were normalized by frozen Robust Multi-Array Analysis and then combined by the distance-weighted discrimination method to identify time course-dependent metastasis-related gene signatures by Biometric Research Branch-ArrayTools (BRB)-ArrayTools. Then two datasets (GSE8401 and GSE19234) of clinical melanoma samples with relevant clinical and survival data were used to validate the prognosis signature.
A novel 192-gene set that varies significantly in parallel with the increasing of metastatic potentials was identified in the animal melanoma model. Further, this gene signature was validated to correlate with poor prognosis of human metastatic melanomas but not of primary melanomas in two independent datasets. Furthermore, multivariate Cox proportional hazards regression analyses demonstrated that the prognostic value of the 192-gene set is independent of the TNM stage and has higher areas under the receiver operating characteristic curve than stage information in both validation datasets.
Our findings suggest that a time course-dependent metastasis-related gene expression signature is useful in predicting survival of malignant melanomas and might be useful in informing treatment decisions for these patients.
Keywords:Melanomas; Metastasis; Prognosis; Prediction; Gene signature
The incidence of malignant melanoma, an aggressive form of skin cancer, is increasing rapidly globally, including in China . It has been estimated that approximately 15% of patients with primary melanoma will develop distant metastases . Because metastatic melanoma responds poorly to conventional therapy, its outcome is generally very poor ,. However, a proportion of patients do survive for prolonged periods following development of metastasis. Therefore, developing innovative prediction models for stratifying metastatic melanoma further would improve planning of appropriate managements.
Gene expression profiling has been used to establish molecular signatures for classifying the subtypes of primary tumors and predicting the clinical outcome of multiple cancers including malignant melanoma ,. Previous studies have identified several panels of gene expression signatures with clinical relevance to malignant melanoma. In this study, we identified a novel gene signature based on time course mRNA expression microarray data in an animal melanoma model that can be used to predict survival in patients with malignant melanomas.
The gene expression profiles of four independent datasets deposited in the Gene Expression Omnibus (GEO) (accession numbers GSE7929, GSE7956, GSE8401, and GSE19234) were analyzed ,. Microarray platforms used in these datasets were HU133A Affymetrix DNA chips (GSE7929, GSE7956, GSE8401) and Illumina HumanWG-6 v2 E/HumanWG HT12 Expression Beadchip (GSE19234).
The GSE7929 comprises the gene expression microarray data of 32 human melanoma cells samples including 11 replicas of parental melanoma cell line A375 (LeiATCC) and 21 derived cell lines with different metastatic potentials selected based on a xenotransplant metastasis model . The GSE7956 dataset comprises whole-genome mRNA expression data of 39 human melanoma cell samples, including 10 replicas of parental melanoma cell line A375 (LeiFidler) and 29 derived cell lines with different metastatic potentials generated by two or three rounds of selection in immunodeficient mice .
The GSE8401 dataset was generated from 83 melanoma samples (31 primary and 52 metastatic tumors) that were collected from patients undergoing surgery were collected from 1992 to 2001 at the Massachusetts General Hospital and Harvard Medical School as a part of the diagnostic workup or planning of therapy . Fifty-two of these metastatic melanoma samples were of superficially spreading type, 19 nodular, and 5 acral lentiginous; the types are not available for the remaining. Most of the primary melanomas were TNM stage I-II, the metastatic melanoma group includes 17 stage III and 35 stage IV tumors .
The GSE19234 dataset comprises 44 melanoma tumors with metastasis were collected at the New York University Medical Center from 1988 to 1997 . Five cases received immunotherapy. Fifteen cases showed CD3 positive. Among these tumors, 4 cases were at stage IIIA tumors,23 at stage IIIB, 12 at stage IIIC, and 5 at stage IV .
As claimed in the original publications, all the studies on these datasets had been approved by the local ethical committees and samples were collected with written informed consent from the patients.
Microarray data merging
Two human melanoma cell line datasets GSE7929 and GSE7956 were used to identifying time course-dependent metastasis-related gene signatures. All gene expression profiles were normalized with frozen Robust Multi-Array Analysis (fRMA), a procedure in which allows arrays are not longer clustered per study but per platform . The two datasets were combined using the distance-weighted discrimination (DWD) method . DWD is a multivariate analysis tool that is able to adjust experiments batch effects to compensate for biases present in different gene expression datasets. The integration effect was inspected by a box-plot of gene expression in each array using Biometric Research Branch (BRB)-ArrayTools 4.3.1 .
Time course analysis of metastasis-related gene signature on merged dataset
A time course analysis of the combined datasets of GSE7929 and GSE7956 was performed with the expression profiles of different generations of metastatic cell lines set as time-course data. A BRB-ArrayTool plug-in was used to perform regression analysis of time course expression data to identify the genes whose expression varies over time. The round numbers of different generations of selected cells were set as different time points. The tests were performed at a false discovery rate (FDR) threshold of 0.05. A list of genes that satisfied this threshold was produced and used as gene sets to assess correlations with poor survival of patients with melanomas in the validation datasets.
Validation of the time course prediction gene signature in independent datasets
Two clinical melanoma sample datasets (GSE8401 and GSE19234) were used to validate the prognostic significance of the time course-dependent metastasis-related gene signature. For the validated datasets, series matrix files containing normalized data were obtained from the GEO. Survival risk prediction of the gene expression data by the time course-dependent metastatic gene signature was performed based on principal components with BRB-ArrayTools software. Survival risk prediction was based on 10-fold cross-validation and classified patients into low- and high-risk groups according to the prognostic indexes.
Distributions of overall survival were assessed using the Kaplan-Meier curve method and log-rank statistics based on 100 permutations. Multivariate analyses of prognostic factors were based on the Cox proportional hazards model. Area under the receiver operating characteristic (ROC) curves were used as a performance measure for prognostic factors. Permutated Kaplan-Meier curve analysis was performed using BRB-ArrayTools. All other statistical analyses were performed with Medcalc Software. P values less than 0.05 were considered statistically significant.
In this study, a gene set that correlated with the metastatic potential of melanoma cell lines was first identified from the gene expression profiling of two datasets (GSE7929 and GSE7956). To increase power, these two data sets were merged by the DWD method after fRMA normalization. As seen in Figure 1, after merging, the relative log gene expression plots indicated a good batch effect removal.
Figure 1. A Box-plot of gene expression in each metastatic melanoma array of the combined dataset (GSE7929 and GSE7956). Using distance-weighted discrimination (DWD) method after frozen Robust Multi-Array Analysis (fRMA) normalization indicates a good batch effect removal.
The merged dataset comprised of 71 samples among which the parental lines samples were assigned as “time course 0” (n = 21); metastasis-derived cell lines established after one round of selection as “time course 1” (n = 24); derived cell lines after two or three rounds of selection as “time course 2” (n = 26). Cells at different metastatic stages had distinct expression patterns. As seen in Figure 2 and in the Additional file 1, 192 genes were identified by BRB-ArrayTools as changing significantly over the time course with an FDR threshold of 0.05. Furthermore, this 192-gene signature was compared with a previously identified 150-gene signature in GSE8401 and 266-gene signature in GSE19234. There was little gene overlap among these three gene sets, which indicates that our 192-gene signature is novel for predicting survival of patients with metastatic melanoma.
Figure 2. Hierarchical clustering of samples of the combined datasets (GSE7929 and GSE7956) by the time course gene set. Parental lines samples were assigned as “time course 0”; metastatic derived cell lines established after one round of selection as “time course 1” (n = 24); derived cell lines after two or three rounds of selection as “time course 2”.
Next, the robustness with which the 192 metastasis-related gene expression signature in predicts survival in patients with melanomas was further tested with whole-genome mRNA expression profiling data obtained from two completely independent datasets (GSE8401 and GSE19234). Using the prognostic indexes based on our gene signature, patients were partitioned into high- and low-risk groups. As seen in Figure 3 the high-risk groups was observed to have a significantly shorter overall survival time than the low-risk group in 55 metastatic melanomas of GSE8401, and 44 metastatic melanomas of GSE19234. However, no significant difference in survival between high- and low-risk groups was observed for primary melanoma samples of GSE8401 Figure 4.
Figure 3. Survival prediction using a 192-gene signature in two validation datasets. High-risk group defined by a 192 gene set had a significantly shorter overall survival time than low-risk group in GSE8401 (A), GSE19234 (B), and metastatic tumor subgroup of GSE8401 (C), but not in primary tumor subgroup of GSE8401 (D). (By Kaplan-Meier curve method and log-rank statistics based on 100 permutations).
Figure 4. Comparison of ROC curves between 192-gene set signature and TNM stage. A 192-gene signature gave a predictive performance with higher areas under the ROC curve than those with stage information in both two validation datasets of metastatic melanoma (A: GSE840; B: GSE19234).
Multivariate Cox proportional hazards regression analyses showed that the prognostic value of our 192-gene signature was independent of TNM stage in both validation datasets (Table 1). The Cox model with the 192-gene signature was also found to provide a predictive performance with higher areas under the ROC curve than TNM stage.
Table 1. Multivariate Cox proportional hazards regression analyses on survival time in two metastatic melanoma datasets
Because metastatic melanomas are heterogeneous, it is critical to develop reliable clinical or molecular predictors to identify those with a relatively worse prognosis and thus select those that could benefit from more aggressive therapies. Previous studies have identified two survival-related gene signatures for patients with metastatic melanoma. The first signature, a 266-gene signature, was generated using the Significant Analysis of Microarrays (SAM) method and could enhance prediction of survival of metastatic melanoma patients by clinical staging . A more earlier study by Xu et al.  compared the gene expression patterns of highly metastatic human melanoma cell lines derived from poorly metastatic parental lines in an animal metastasis model using the Gene Set Enrichment Analysis (GSEA) method and thus identified another survival-related 150-gene set for predicting prognosis of metastatic melanomas.
In this study, we re-analyzed the same dataset of animal melanoma models as that assessed by Xu et al. . Unlike the original study, we used a time-course strategy to isolate metastasis-related genes. We identified 192 genes that varied significantly in parallel with increasing metastatic potential; this set differs from the previously identified gene sets. We confirmed that this time-course gene expression changes in an animal melanoma model correlate with survival of patients with melanoma metastases in two independent datasets. Moreover, our novel time-course gene signature predictor is independent of stage, and has better predictive performance than stage. Our findings suggest that this 192-gene signature might improve the ability of TNM staging to predict survival of patients with metastatic melanoma. Further, we also found that gene expression signatures from animal melanoma model and human melanoma datasets could be reconciled.
Interpretation of gene function showed that the 192 genes are enriched significantly in extracellular matrix receptor interaction, focal adhesion and the β-transforming growth factor signaling pathway. Many of these genes have previously been implicated in melanoma metastasis and prognosis prediction, including MYC, GPNMB, GAB2, and SATB1 -. Therefore, the functional significance of this 192-gene pathway also warrants further investigation.
In conclusion, the time course-dependent gene signature we have developed is useful for predicting survival of patients with metastatic melanomas and might be useful for informing treatment decisions.
DWD: Distance-weighted discrimination
fRMA: Frozen robust multi-array analysis
ROC: Receiver operating characteristic
SAM: Significant analysis of microarrays
GSEA: Gene set enrichment analysis
The authors declare that they have no competing interest.
RC and JL put forward the idea and design the analysis. RC was responsible for the manuscript writing. RC, GZ, and YZ were responsible for the data collecting and analysis. NL participated in the data processing. All authors read and approved the final manuscript.
This work was supported by National natural science foundation of china (81102066); Natural science foundation of Guangdong Province of China (S2011040003129); Medical Research Foundation of Guangdong Province of China (A2011441).
Huang YS, Chen XX, Yang SX, Wu LS, Zhao JY, Li XY, Tu P, Li H: Preliminary exploration of the clinical features of Chinese patients with skin malignancies and premalignancies: a retrospective study of 1420 cases from Peking University First Hospital.
Nodin B, Fridberg M, Jonsson L, Bergman J, Uhlen M, Jirstrom K: High MCM3 expression is an independent biomarker of poor prognosis and correlates with reduced RBM3 expression in a prospective cohort of malignant melanoma.
Xu L, Shen SS, Hoshida Y, Subramanian A, Ross K, Brunet J-P, Wagner SN, Ramaswamy S, Mesirov JP, Hynes RO: Gene expression changes in an animal melanoma model correlate with aggressiveness of human melanoma metastases.
Bogunovic D, O’Neill DW, Belitskaya-Levy I, Vacic V, Yu Y-L, Adams S, Darvishian F, Berman R, Shapiro R, Pavlick AC: Immune profile and mitotic index of metastatic melanoma lesions enhance clinical staging in predicting patient survival.
J Am Stat Assoc 2007, 102:1267-1271. Publisher Full Text