Differential diagnosis of pulmonary nodules identified by computed tomography (CT) remains a challenge in clinical practice. Here, we characterize the global metabolome of 480 serum samples, including healthy controls, benign lung nodules, and stage I lung adenocarcinoma. Adenocarcinomas exhibit unique metabolomic profiles, whereas benign nodules and healthy individuals have high similarity in metabolomic profiles. In the discovery group (n = 306), a set of 27 metabolites was identified to differentiate between benign and malignant nodules. The AUC of the discriminant model in the internal validation (n = 104) and external validation (n = 111) groups was 0.915 and 0.945, respectively. Pathway analysis revealed increased glycolytic metabolites associated with decreased tryptophan in lung adenocarcinoma serum compared with benign nodules and healthy controls, and suggested that tryptophan uptake promotes glycolysis in lung cancer cells. Our study highlights the value of serum metabolite biomarkers in assessing the risk of pulmonary nodules detected by CT.
Early diagnosis is critical to improve survival rates for cancer patients. Results from the US National Lung Cancer Screening Trial (NLST) and the European NELSON Study have shown that screening with low-dose computed tomography (LDCT) can significantly reduce lung cancer mortality in high-risk groups1,2,3. Since the widespread use of LDCT for lung cancer screening, the incidence of incidental radiographic findings of asymptomatic pulmonary nodules has continued to increase 4 . Pulmonary nodules are defined as focal opacities up to 3 cm in diameter 5 . We face difficulties in assessing the likelihood of malignancy and dealing with the large number of pulmonary nodules detected incidentally on LDCT. Limitations of CT can lead to frequent follow-up examinations and false-positive results, leading to unnecessary intervention and overtreatment6. Therefore, there is a need to develop reliable and useful biomarkers to correctly identify lung cancer in the early stages and differentiate most benign nodules at initial detection 7 .
Comprehensive molecular analysis of blood (serum, plasma, peripheral blood mononuclear cells), including genomics, proteomics or DNA methylation8,9,10, has led to growing interest in the discovery of diagnostic biomarkers for lung cancer. Meanwhile, metabolomics approaches measure cellular end products that are influenced by endogenous and exogenous actions and are therefore applied to predict disease onset and outcome. Liquid chromatography-tandem mass spectrometry (LC-MS) is a widely used method for metabolomics studies due to its high sensitivity and large dynamic range, which can cover metabolites with different physicochemical properties11,12,13. Although global metabolomic analysis of plasma/serum has been used to identify biomarkers associated with lung cancer diagnosis14,15,16,17 and treatment efficacy18, serum metabolite classifiers that distinguish between benign and malignant lung nodules remain largely unexplored.
Adenocarcinoma and squamous cell carcinoma are the two main subtypes of non-small cell lung cancer (NSCLC). Various CT screening tests indicate that adenocarcinoma is the most common histological type of lung cancer1,19,20,21. In this study, we used ultra-performance liquid chromatography-high-resolution mass spectrometry (UPLC-HRMS) to perform metabolomics analysis on a total of 695 serum samples, including healthy controls, benign pulmonary nodules, and CT-detected stage I lung adenocarcinoma ≤3 cm. We identified a panel of serum metabolites that distinguish lung adenocarcinoma from benign nodules and healthy controls. Pathway enrichment analysis revealed that abnormal tryptophan and glucose metabolism are common alterations in lung adenocarcinoma compared with benign nodules and healthy controls. Finally, we established and validated a serum metabolic classifier with high specificity and sensitivity to distinguish between malignant and benign pulmonary nodules detected by LDCT, which may aid in early differential diagnosis and risk assessment.
In the current study, sex- and age-matched serum samples were retrospectively collected from 174 healthy controls, 292 patients with benign pulmonary nodules, and 229 patients with stage I lung adenocarcinoma. Demographic characteristics of the 695 subjects are shown in Supplementary Table 1.
As shown in Figure 1a, a total of 480 serum samples, including 174 healthy control (HC), 170 benign nodule (BN), and 136 stage I lung adenocarcinoma (LA) samples, were collected at Sun Yat-sen University Cancer Center as the discovery cohort for untargeted metabolomic profiling using ultra-performance liquid chromatography-high-resolution mass spectrometry (UPLC-HRMS). As shown in Supplementary Figure 1, differential metabolites between LA and HC and between LA and BN were identified to establish a classification model and to further explore differential pathways. For internal and external validation, 104 samples collected at Sun Yat-sen University Cancer Center and 111 samples collected at two other hospitals were analyzed, respectively.
a Study population in the discovery cohort that underwent global serum metabolomics analysis using ultra-performance liquid chromatography-high-resolution mass spectrometry (UPLC-HRMS). b Partial least squares discriminant analysis (PLS-DA) of the total metabolome of 480 serum samples from the study cohort, including healthy controls (HC, n = 174), benign nodules (BN, n = 170), and stage I lung adenocarcinoma (LA, n = 136). +ESI, positive electrospray ionization mode; -ESI, negative electrospray ionization mode. c–e Volcano plots in which metabolites with significantly different abundances between two given groups (two-tailed Wilcoxon signed rank test, false discovery rate adjusted p value, FDR <0.05) are shown in red (fold change > 1.2) and blue (fold change < 0.83). f Hierarchical clustering heat map of the significantly differential annotated metabolites between LA and BN. Source data is provided in the form of source data files.
The total serum metabolome of 174 HC, 170 BN and 136 LA samples in the discovery group was analyzed using UPLC-HRMS. We first show that quality control (QC) samples cluster tightly at the center of an unsupervised principal component analysis (PCA) model, confirming the analytical stability of the current study (Supplementary Figure 2).
As shown in the partial least squares-discriminant analysis (PLS-DA) in Figure 1b, LA was clearly separated from both BN and HC in positive (+ESI) and negative (−ESI) electrospray ionization modes. However, no significant differences were found between BN and HC in either +ESI or −ESI mode.
We found 382 differential features between LA and HC, 231 differential features between LA and BN, and 95 differential features between BN and HC (Wilcoxon signed rank test, FDR <0.05 and fold change >1.2 or <0.83) (Figure 1c-e). Peaks were further annotated (Supplementary Data 3) against a database (mzCloud/HMDB/Chemspider library) by m/z value, retention time and fragmentation mass spectrum search (details described in Methods section) 22 . Finally, 33 and 38 annotated metabolites with significant differences in abundance were identified for LA versus BN (Figure 1f and Supplementary Table 2) and LA versus HC (Supplementary Figure 3 and Supplementary Table 2), respectively. In contrast, only 3 metabolites with significant differences in abundance were identified between BN and HC (Supplementary Table 2), consistent with the overlap between BN and HC in PLS-DA. These differential metabolites cover a wide range of biochemicals (Supplementary Figure 4). Taken together, these results demonstrate significant changes in the serum metabolome that reflect malignant transformation of early-stage lung cancer compared with benign lung nodules or healthy subjects. Meanwhile, the similarity of the serum metabolome of BN and HC suggests that benign pulmonary nodules may share many biological characteristics with healthy individuals.
Given that epidermal growth factor receptor (EGFR) gene mutations are common in the lung adenocarcinoma subtype 23 , we sought to determine the impact of driver mutations on the serum metabolome. We analyzed the overall metabolomic profile of 72 cases with known EGFR status in the lung adenocarcinoma group. Interestingly, we found comparable profiles between EGFR mutant patients (n = 41) and EGFR wild-type patients (n = 31) in PCA analysis (Supplementary Figure 5a). However, we identified 7 metabolites whose abundance was significantly altered in patients with EGFR mutation compared to patients with wild-type EGFR (t test, p < 0.05 and fold change > 1.2 or < 0.83) (Supplementary Figure 5b). The majority of these metabolites (5 out of 7) are acylcarnitines, which play an important role in fatty acid oxidation pathways.
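The differential-feature criterion used in this section (adjusted p value below 0.05 together with a fold change above 1.2 or below 0.83, which is roughly 1/1.2) can be illustrated with a minimal sketch. The text does not state which FDR procedure was used, so the Benjamini-Hochberg adjustment below is an assumption, and the function names, tuple layout, and toy values are hypothetical rather than taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order.

    Assumption: the study's exact FDR procedure is not stated in the text.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_differential(features, fdr_cutoff=0.05, fc_up=1.2, fc_down=0.83):
    """Keep features passing both the FDR and the fold-change thresholds.

    features: list of (name, raw_p_value, fold_change) tuples (hypothetical layout).
    """
    adj = benjamini_hochberg([p for _, p, _ in features])
    selected = []
    for (name, _, fc), q in zip(features, adj):
        if q < fdr_cutoff and (fc > fc_up or fc < fc_down):
            direction = "up" if fc > fc_up else "down"
            selected.append((name, q, direction))
    return selected


# Toy values for illustration only; they are not measurements from the study.
toy = [("m1", 0.001, 1.8), ("m2", 0.004, 0.60), ("m3", 0.010, 1.05), ("m4", 0.60, 2.5)]
result = select_differential(toy)
```

In this toy input, `m3` passes the FDR threshold but not the fold-change threshold, and `m4` shows a large fold change without statistical support, so only `m1` and `m2` are retained.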
As illustrated in the workflow shown in Figure 2a, biomarkers for nodule classification were obtained by selecting the best combination of variables from the 33 differential metabolites identified between LA (n = 136) and BN (n = 170) using a least absolute shrinkage and selection operator (LASSO) binary logistic regression model. Ten-fold cross-validation was used to test the reliability of the model. Variable selection and parameter regularization were tuned by penalized maximum likelihood with the parameter λ 24 . Global metabolomics analysis was further performed independently in the internal validation (n = 104) and external validation (n = 111) groups to test the classification performance of the discriminant model. As a result, a panel of 27 metabolites in the discovery set was identified as the best discriminant model with the largest mean AUC value (Fig. 2b), among which 9 were increased and 18 were decreased in LA compared to BN (Fig. 2c).
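To make the role of the penalty parameter λ concrete, the following is a minimal sketch of L1-penalized (LASSO) logistic regression fitted by proximal gradient descent. It is not the software used in the study, it omits the ten-fold cross-validation over λ, and the function names, step size, iteration count, and toy data are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero, clip small values to 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(X, y, lam, step=0.1, n_iter=500):
    """Minimise mean logistic loss + lam * sum(|w_j|) by proximal gradient descent.

    The intercept b is left unpenalised.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad_w)]
    return w, b


# Toy data (not from the study): feature 0 tracks the label,
# feature 1 is an oscillating nuisance variable.
y = [i % 2 for i in range(40)]
X = [[(1.0 if yi else -1.0) + 0.5 * math.sin(i), math.cos(3 * i)] for i, yi in enumerate(y)]

w_strong, _ = fit_l1_logistic(X, y, lam=2.0)   # heavy penalty
w_weak, _ = fit_l1_logistic(X, y, lam=0.01)    # light penalty
n_selected_strong = sum(1 for wj in w_strong if wj != 0.0)
n_selected_weak = sum(1 for wj in w_weak if wj != 0.0)
```

A heavy penalty drives every coefficient to exactly zero, while a light penalty lets informative variables enter the model. Scanning λ and scoring each fit by cross-validated AUC is the kind of selection summarized in Fig. 2b.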
a Workflow for building a pulmonary nodule classifier, including selecting the best panel of serum metabolites in the discovery set using a binary logistic regression model via ten-fold cross-validation and evaluating predictive performance in internal and external validation sets. b Cross-validation statistics of the LASSO regression model for metabolic biomarker selection. The numbers given above represent the average number of biomarkers selected at a given λ. The red dotted line represents the average AUC value at the corresponding λ. Gray error bars represent the minimum and maximum AUC values. The dotted line indicates the best model with the 27 selected biomarkers. AUC, area under the receiver operating characteristic (ROC) curve. c Fold changes of the 27 selected metabolites in the LA group compared with the BN group in the discovery group. Red bars, increased; blue bars, decreased. d–f Receiver operating characteristic (ROC) curves showing the power of the discriminant model based on the 27-metabolite panel in the discovery, internal, and external validation sets. Source data is provided in the form of source data files.
A prediction model was created based on the weighted regression coefficients of these 27 metabolites (Supplementary Table 3). ROC analysis based on these 27 metabolites yielded an area under the curve (AUC) value of 0.933, with a sensitivity of 0.868 and a specificity of 0.859 in the discovery group (Fig. 2d). Meanwhile, among the 38 annotated differential metabolites between LA and HC, a set of 16 metabolites achieved an AUC of 0.902 with a sensitivity of 0.801 and specificity of 0.856 in discriminating LA from HC (Supplementary Figure 6a-c). AUC values based on different fold change thresholds for differential metabolites were also compared. We found that the classification model performed best in discriminating between LA and BN (or HC) when the fold change threshold was set to 1.2 rather than 1.5 or 2.0 (Supplementary Figure 7a,b). The classification model, based on the panel of 27 metabolites, was further validated in internal and external cohorts. The AUC was 0.915 (sensitivity 0.867, specificity 0.811) for internal validation and 0.945 (sensitivity 0.810, specificity 0.979) for external validation (Fig. 2e, f). To assess interlaboratory efficiency, 40 samples from the external cohort were analyzed in an external laboratory as described in the Methods section. The classification achieved an AUC of 0.925 (Supplementary Figure 8). Because lung squamous cell carcinoma (LUSC) is the second most common subtype of non-small cell lung cancer (NSCLC) after lung adenocarcinoma (LUAD), we also tested the potential utility of the validated metabolic panel in 74 cases of BN and 16 cases of LUSC. The AUC of discrimination between LUSC and BN was 0.776 (Supplementary Figure 9), indicating poorer ability compared to discrimination between LUAD and BN.
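The performance figures reported above (AUC, sensitivity, specificity) follow standard definitions, sketched below under simple assumptions. AUC is computed here as the probability that a randomly chosen malignant case receives a higher score than a randomly chosen benign case, with ties counted as one half. The scores, labels, and the 0.5 cutoff are toy values, not study data.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores, labels, cutoff):
    """Treat score >= cutoff as a positive (malignant) call."""
    tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= cutoff)
    fn = sum(1 for s, l in zip(scores, labels) if l == 1 and s < cutoff)
    tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < cutoff)
    fp = sum(1 for s, l in zip(scores, labels) if l == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Toy predicted probabilities: 1 = malignant, 0 = benign (illustrative only).
scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]

auc = roc_auc(scores, labels)                       # 8 of 9 pairs ranked correctly
sens, spec = sensitivity_specificity(scores, labels, cutoff=0.5)
```

The pairwise formulation is quadratic in sample size, which is acceptable for cohorts of a few hundred samples. A rank-based computation would give the same value more efficiently.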
Studies have shown that the size of nodules on CT images is positively correlated with the likelihood of malignancy and remains a major determinant of nodule management25,26,27. Analysis of data from the large cohort of the NELSON screening study showed that the risk of malignancy in subjects with nodules <5 mm was similar to that in subjects without nodules 28 . Therefore, the minimum size requiring regular CT monitoring is 5 mm, as recommended by the British Thoracic Society (BTS), and 6 mm, as recommended by the Fleischner Society 29 . However, nodules larger than 6 mm and without obvious benign features, called indeterminate pulmonary nodules (IPN), remain a major challenge in evaluation and management in clinical practice30,31. We next examined whether nodule size influenced metabolomic signatures using pooled samples from the discovery and internal validation cohorts. Focusing on the 27 validated biomarkers, we first compared the PCA profiles of the HC and BN <6 mm metabolomes. We found that most of the data points for HC and BN overlapped, demonstrating that serum metabolite levels were similar in both groups (Fig. 3a). The feature maps across different size ranges remained conserved within BN and within LA (Fig. 3b, c), whereas a separation was observed between malignant and benign nodules in the 6–20 mm range (Fig. 3d). The classifier had an AUC of 0.927, specificity of 0.868, and sensitivity of 0.820 for predicting the malignancy of nodules measuring 6 to 20 mm (Fig. 3e, f). Our results show that the classifier can capture metabolic changes caused by early malignant transformation, regardless of nodule size.
a–d Comparison of PCA profiles between specified groups based on the metabolic classifier of 27 metabolites. a HC vs BN < 6 mm. b BN < 6 mm vs BN 6–20 mm. c LA 6–20 mm vs LA 20–30 mm. d BN 6–20 mm vs LA 6–20 mm. HC, n = 174; BN < 6 mm, n = 153; BN 6–20 mm, n = 91; LA 6–20 mm, n = 89; LA 20–30 mm, n = 77. e Receiver operating characteristic (ROC) curve showing discriminant model performance for nodules 6–20 mm. f Probability values calculated from the logistic regression model for nodules measuring 6–20 mm. The gray dotted line represents the optimal cutoff value (0.455). The numbers above represent the percentage of cases predicted as LA. A two-tailed Student’s t test was used. PCA, principal component analysis. AUC, area under the curve. Source data is provided in the form of source data files.
Four cases (aged 44–61 years) with similar pulmonary nodule sizes (7–9 mm) were further selected to illustrate the performance of the proposed malignancy prediction model (Fig. 4a, b). On initial screening, Case 1 presented as a solid nodule with calcification, a feature associated with benignity, whereas Case 2 presented as an indeterminate partially solid nodule with no obvious benign features. Three rounds of follow-up CT scans showed that these cases remained stable over a 4-year period and were therefore considered benign nodules (Fig. 4a). Compared with clinical evaluation by serial CT scans, a single serum metabolite analysis with the current classifier model quickly and correctly identified these benign nodules based on the probability cutoff (Table 1). Case 3 in Figure 4b shows a nodule with signs of pleural retraction, which is most often associated with malignancy32. Case 4 presented as an indeterminate partially solid nodule with no evidence of a benign cause. Both of these cases were predicted as malignant by the classifier model (Table 1). The diagnosis of lung adenocarcinoma was confirmed by histopathological examination after lung resection surgery (Fig. 4b). For the external validation set, the metabolic classifier accurately predicted two cases of indeterminate lung nodules larger than 6 mm (Supplementary Figure 10).
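How a logistic regression classifier turns a serum profile into a benign or malignant call can be sketched as follows. The coefficients, intercept, metabolite names, and input levels below are hypothetical placeholders (the actual weights are in Supplementary Table 3 and are not reproduced here). The cutoff of 0.455 is the value reported for 6–20 mm nodules in the Fig. 3f caption, and treating a probability at or above the cutoff as malignant is an assumption.

```python
import math


def malignancy_probability(levels, coefficients, intercept):
    """Logistic model: p = 1 / (1 + exp(-(intercept + sum(coef * level))))."""
    z = intercept + sum(coefficients[name] * levels[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-z))


def classify(probability, cutoff=0.455):
    """Assumption: probabilities at or above the cutoff are called malignant."""
    return "malignant" if probability >= cutoff else "benign"


# Hypothetical two-metabolite model; the real panel has 27 metabolites.
coefficients = {"metabolite_A": 1.2, "metabolite_B": -0.8}
intercept = -0.1

# Hypothetical standardized serum levels for two illustrative cases.
case_x = {"metabolite_A": 1.0, "metabolite_B": -0.5}   # z = 1.5
case_y = {"metabolite_A": -1.0, "metabolite_B": 1.0}   # z = -2.1

p_x = malignancy_probability(case_x, coefficients, intercept)
p_y = malignancy_probability(case_y, coefficients, intercept)
call_x, call_y = classify(p_x), classify(p_y)
```

Because the output is a probability rather than a hard label, the cutoff can be moved to trade sensitivity against specificity, for example toward higher sensitivity if the assay is intended as a rule-out test.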
a Axial lung-window CT images of two cases of benign nodules. In case 1, CT scan after 4 years showed a stable solid nodule measuring 7 mm with calcification in the right lower lobe. In case 2, CT scan after 5 years revealed a stable, partially solid nodule with a diameter of 7 mm in the right upper lobe. b Axial lung-window CT images and corresponding pathological findings of two cases of stage I adenocarcinoma before lung resection. Case 3 revealed a nodule with a diameter of 8 mm in the right upper lobe with pleural retraction. Case 4 revealed a partially solid ground-glass nodule measuring 9 mm in the left upper lobe. Hematoxylin and eosin (H&E) staining of resected lung tissue (scale bar = 50 μm) demonstrates the acinar growth pattern of lung adenocarcinoma. Arrows indicate nodules detected on CT images. H&E images are representative of multiple (>3) microscopic fields examined by the pathologist.
Taken together, our results demonstrate the potential value of serum metabolite biomarkers in the differential diagnosis of pulmonary nodules, which can be challenging to evaluate in CT screening.
Based on the validated differential metabolite panel, we sought to identify biological correlates of major metabolic changes. KEGG pathway enrichment analysis by MetaboAnalyst identified 6 significantly altered pathways shared by the two comparisons (LA vs. HC and LA vs. BN, adjusted p ≤ 0.001, effect > 0.01). These changes were characterized by disturbances in pyruvate metabolism, tryptophan metabolism, nicotinate and nicotinamide metabolism, glycolysis, the TCA cycle, and purine metabolism (Fig. 5a). We then performed targeted metabolomics to verify the major changes by absolute quantification, determining common metabolites in the commonly altered pathways by triple quadrupole mass spectrometry (QQQ) using authentic metabolite standards. Demographic characteristics of the targeted metabolomics study samples are included in Supplementary Table 4. Consistent with our global metabolomics results, quantitative analysis confirmed that hypoxanthine, xanthine, pyruvate, and lactate were increased in LA compared to BN and HC (Fig. 5b, c, p <0.05). However, no significant differences in these metabolites were found between BN and HC.
a KEGG pathway enrichment analysis of significantly different metabolites in the LA group compared to the BN and HC groups. A two-tailed Globaltest was used, and p values were adjusted using the Holm-Bonferroni method (adjusted p ≤ 0.001 and effect size > 0.01). b–d Violin plots showing hypoxanthine, xanthine, lactate, pyruvate, and tryptophan levels in the serum of HC, BN, and LA determined by LC-MS/MS (n = 70 per group). White and black dotted lines indicate the median and quartile, respectively. e Violin plot showing normalized Log2TPM (transcripts per million) mRNA expression of SLC7A5 and QPRT in lung adenocarcinoma (n = 513) compared to normal lung tissue (n = 59) in the LUAD-TCGA dataset. The white box represents the interquartile range, the horizontal black line in the center represents the median, and the vertical black line extending from the box represents the 95% confidence interval (CI). f Pearson correlation plot of SLC7A5 and GAPDH expression in lung adenocarcinoma (n = 513) and normal lung tissue (n = 59) in the TCGA dataset. The gray area represents the 95% CI. r, Pearson correlation coefficient. g Normalized cellular tryptophan levels in A549 cells transfected with nonspecific shRNA control (NC) and shSLC7A5 (Sh1, Sh2) determined by LC-MS/MS. Statistical analysis of five biologically independent samples in each group is presented. h Cellular levels of NADt (total NAD, including NAD+ and NADH) in A549 cells (NC) and SLC7A5 knockdown A549 cells (Sh1, Sh2). Statistical analysis of three biologically independent samples in each group is presented. i Glycolytic activity of A549 cells before and after SLC7A5 knockdown was measured by extracellular acidification rate (ECAR) (n = 4 biologically independent samples per group). 2-DG, 2-deoxy-D-glucose. Two-tailed Student’s t test was used in (b–h). In (g–i), error bars represent the mean ± SD, each experiment was performed three times independently and the results were similar.
Source data is provided in the form of source data files.
Considering the significant impact of altered tryptophan metabolism in the LA group, we also assessed serum tryptophan levels in the HC, BN, and LA groups using QQQ. We found that serum tryptophan was reduced in LA compared with HC or BN (p < 0.001, Figure 5d), which is consistent with previous findings that circulating tryptophan levels are lower in patients with lung cancer than in healthy controls33,34,35. Another study using the PET/CT tracer 11C-methyl-L-tryptophan found that the tryptophan signal retention time in lung cancer tissue was significantly increased compared to benign lesions or normal tissue36. We hypothesize that the decrease in tryptophan in LA serum may reflect active tryptophan uptake by lung cancer cells.
It is also known that the end product of the kynurenine pathway of tryptophan catabolism is NAD+37,38, which is an important substrate for the conversion of glyceraldehyde-3-phosphate to 1,3-bisphosphoglycerate in glycolysis39. While previous studies have focused on the role of tryptophan catabolism in immune regulation, we sought to elucidate the interplay between tryptophan dysregulation and the glycolytic pathways observed in the current study. Solute carrier family 7 member 5 (SLC7A5) is known to be a tryptophan transporter43,44,45. Quinolinic acid phosphoribosyltransferase (QPRT) is an enzyme located downstream in the kynurenine pathway that converts quinolinic acid to NAMN46. Inspection of the LUAD TCGA dataset revealed that both SLC7A5 and QPRT were significantly upregulated in tumor tissue compared to normal tissue (Fig. 5e). This increase was observed in stages I and II as well as stages III and IV of lung adenocarcinoma (Supplementary Figure 11), indicating early disturbances in tryptophan metabolism associated with tumorigenesis.
Additionally, the LUAD-TCGA dataset showed a positive correlation between SLC7A5 and GAPDH mRNA expression in cancer patient samples (r = 0.45, p = 1.55E-26, Figure 5f). In contrast, no significant correlation was found between these genes in normal lung tissue (r = 0.25, p = 0.06, Figure 5f). Knockdown of SLC7A5 (Supplementary Figure 12) in A549 cells significantly reduced cellular tryptophan and NAD(H) levels (Figure 5g,h), resulting in attenuated glycolytic activity as measured by extracellular acidification rate (ECAR) (Figure 5i). Thus, based on the metabolic changes in serum and the in vitro experiments, we hypothesize that tryptophan metabolism may produce NAD+ through the kynurenine pathway and play an important role in promoting glycolysis in lung cancer.
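The Pearson correlation coefficient r reported for SLC7A5 and GAPDH expression is the covariance of the two variables divided by the product of their standard deviations. A minimal sketch follows, using made-up expression values rather than TCGA data. It does not compute the accompanying p value, and the function name is an assumption.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: covariance scaled by the two standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Made-up log-scale expression values for illustration only.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b_up = [2.0, 4.0, 6.0, 8.0, 10.0]     # perfectly co-varying with gene_a
gene_b_down = [5.0, 4.0, 3.0, 2.0, 1.0]    # perfectly anti-correlated with gene_a

r_up = pearson_r(gene_a, gene_b_up)
r_down = pearson_r(gene_a, gene_b_down)
```

Values of r near +1 or -1 indicate a strong linear relationship, whereas a moderate value such as the reported 0.45 indicates a positive but far from deterministic association.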
Studies have shown that a large number of indeterminate pulmonary nodules detected by LDCT may lead to the need for additional testing such as PET-CT, lung biopsy, and overtreatment due to a false-positive diagnosis of malignancy.31 As shown in Figure 6, our study identified a panel of serum metabolites with potential diagnostic value that may improve risk stratification and subsequent management of pulmonary nodules detected by CT.
Pulmonary nodules are evaluated using low-dose computed tomography (LDCT) with imaging features suggestive of benign or malignant causes. The uncertain outcome of nodules can lead to frequent follow-up visits, unnecessary interventions, and overtreatment. The inclusion of serum metabolic classifiers with diagnostic value may improve risk assessment and subsequent management of pulmonary nodules. PET, positron emission tomography.
Data from the US NLST study and the European NELSON study suggest that screening high-risk groups with low-dose computed tomography (LDCT) may reduce lung cancer mortality1,3. However, risk assessment and subsequent clinical management of the large numbers of incidental pulmonary nodules detected by LDCT remain challenging. The main goal is to improve the classification accuracy of existing LDCT-based protocols by incorporating reliable biomarkers.
Certain molecular biomarkers, such as blood metabolites, have been identified by comparing lung cancer with healthy controls15,17. In the current study, we focused on the application of serum metabolomics analysis to distinguish between benign and malignant pulmonary nodules incidentally detected by LDCT. We compared the global serum metabolome of healthy control (HC), benign lung nodules (BN), and stage I lung adenocarcinoma (LA) samples using UPLC-HRMS analysis. We found that HC and BN had similar metabolic profiles, whereas LA showed significant changes compared to HC and BN. We identified two sets of serum metabolites that differentiate LA from HC and BN.
The current LDCT-based identification scheme for benign and malignant nodules is mainly based on the size, density, morphology and growth rate of nodules over time30. Previous studies have shown that the size of nodules is closely related to the likelihood of lung cancer. Even in high-risk patients, the risk of malignancy in nodules <6 mm is <1%. The risk of malignancy for nodules measuring 6 to 20 mm ranges from 8% to 64%30. Therefore, the Fleischner Society recommends a cutoff diameter of 6 mm for routine CT follow-up 29 . However, risk assessment and management of indeterminate pulmonary nodules (IPN) larger than 6 mm remain inadequate 31 . Current management of IPNs is usually based on watchful waiting with frequent CT monitoring.
Based on the validated metabolite panel, we demonstrated for the first time the overlap of metabolomic signatures between healthy individuals and benign nodules <6 mm. The biological similarity is consistent with previous CT findings that the risk of malignancy for nodules <6 mm is as low as for subjects without nodules30. It should be noted that our results also demonstrate that benign nodules <6 mm and ≥6 mm have high similarity in metabolomic profiles, suggesting that the functional definition of benign etiology is consistent regardless of nodule size. Thus, the current diagnostic serum metabolite panel may provide a single assay as a rule-out test when nodules are initially detected on CT and potentially reduce serial monitoring. At the same time, the same panel of metabolic biomarkers distinguished malignant nodules ≥6 mm in size from benign nodules and provided accurate predictions for IPNs of similar size and ambiguous morphological features on CT images. This serum metabolic classifier performed well in predicting the malignancy of nodules ≥6 mm with an AUC of 0.927. Taken together, our results indicate that unique serum metabolomic signatures may specifically reflect early tumor-induced metabolic changes and have potential value as risk predictors, independent of nodule size.
Notably, lung adenocarcinoma (LUAD) and squamous cell carcinoma (LUSC) are the main types of non-small cell lung cancer (NSCLC). Given that LUSC is strongly associated with tobacco use47 and LUAD is the most common histology of incidental lung nodules detected on CT screening48, our classifier model was specifically built for stage I adenocarcinoma samples. Wang and colleagues also focused on LUAD and identified nine lipid signatures using lipidomics to distinguish early-stage lung cancer from healthy individuals17. We tested the current classifier model on 16 cases of stage I LUSC and 74 benign nodules and observed low LUSC prediction accuracy (AUC 0.776), suggesting that LUAD and LUSC may have their own metabolomic signatures. Indeed, LUAD and LUSC have been shown to differ in etiology, biological origin and genetic aberrations49. Therefore, other types of histology should be included in training models for population-based detection of lung cancer in screening programs.
Here, we identified the six most frequently altered pathways in lung adenocarcinoma compared with healthy controls and benign nodules. Xanthine and hypoxanthine are common metabolites of the purine metabolic pathway. Consistent with our results, intermediates associated with purine metabolism were significantly increased in the serum or tissues of patients with lung adenocarcinoma compared with healthy controls or patients at the preinvasive stage15,50. Elevated serum xanthine and hypoxanthine levels may reflect the anabolism required by rapidly proliferating cancer cells. Dysregulation of glucose metabolism is a well-known hallmark of cancer metabolism51. Here, we observed a significant increase in pyruvate and lactate in the LA group compared with the HC and BN groups, which is consistent with previous reports of glycolytic pathway abnormalities in the serum metabolome profiles of non-small cell lung cancer (NSCLC) patients compared with healthy controls52,53.
Importantly, we observed an inverse correlation between pyruvate and tryptophan metabolism in the serum of lung adenocarcinoma patients. Serum tryptophan levels were reduced in the LA group compared with the HC or BN group. Interestingly, a previous large-scale study using a prospective cohort found that low levels of circulating tryptophan were associated with an increased risk of lung cancer 54 . Tryptophan is an essential amino acid that is obtained entirely from food. We infer that serum tryptophan depletion in lung adenocarcinoma may reflect rapid consumption of this metabolite. It is well known that the end product of tryptophan catabolism via the kynurenine pathway is the source of de novo NAD+ synthesis. Because NAD+ is produced primarily through the salvage pathway, the contribution of tryptophan metabolism to NAD+ in health and disease remains to be determined46. Our analysis of the TCGA database showed that the expression of the tryptophan transporter solute carrier family 7 member 5 (SLC7A5) was significantly increased in lung adenocarcinoma compared with normal controls and was positively correlated with the expression of the glycolytic enzyme GAPDH. Previous studies have mainly focused on the role of tryptophan catabolism in suppressing the antitumor immune response40,41,42. Here we demonstrate that inhibition of tryptophan uptake by knockdown of SLC7A5 in lung cancer cells results in a subsequent decrease in cellular NAD levels and a concomitant attenuation of glycolytic activity. In summary, our study provides a biological basis for changes in serum metabolism associated with malignant transformation of lung adenocarcinoma.
EGFR mutations are the most common driver mutations in patients with NSCLC. In our study, we found that patients with EGFR mutation (n = 41) had overall metabolomic profiles similar to patients with wild-type EGFR (n = 31), although we found decreased serum levels of some acylcarnitines in EGFR mutant patients. The established function of acylcarnitines is to transport acyl groups from the cytoplasm into the mitochondrial matrix, leading to the oxidation of fatty acids to produce energy 55 . Consistent with our findings, a recent study also identified similar metabolome profiles between EGFR mutant and EGFR wild-type tumors by analyzing the global metabolome of 102 lung adenocarcinoma tissue samples50. Interestingly, altered acylcarnitine content was also found in the EGFR mutant group in that study. Therefore, whether changes in acylcarnitine levels reflect EGFR-induced metabolic changes, and which molecular pathways underlie them, may merit further study.
In conclusion, our study establishes a serum metabolic classifier for the differential diagnosis of pulmonary nodules and proposes a workflow that can optimize risk assessment and facilitate clinical management based on CT scan screening.
This study was approved by the Ethics Committee of Sun Yat-sen University Cancer Center, the First Affiliated Hospital of Sun Yat-sen University, and the Ethics Committee of Zhengzhou University Cancer Hospital. In the discovery and internal validation cohorts, 174 sera from healthy individuals and 244 sera from individuals with benign nodules were collected from people undergoing annual medical examinations at the Department of Cancer Control and Prevention, Sun Yat-sen University Cancer Center, and 166 sera from patients with stage I lung adenocarcinoma were collected at Sun Yat-sen University Cancer Center. The external validation cohort comprised 48 cases of benign nodules and 39 cases of stage I lung adenocarcinoma from the First Affiliated Hospital of Sun Yat-sen University, and 24 cases of stage I lung adenocarcinoma from Zhengzhou University Cancer Hospital. Sun Yat-sen University Cancer Center also collected 16 cases of stage I squamous cell lung cancer to test the diagnostic ability of the established metabolic classifier (patient characteristics are shown in Supplementary Table 5). Samples for the discovery and internal validation cohorts were collected between January 2018 and May 2020. Samples for the external validation cohort were collected between August 2021 and October 2022. To minimize gender bias, approximately equal numbers of male and female cases were assigned to the discovery and internal validation cohorts. Participant gender was determined by self-report. Informed consent was obtained from all participants, and no compensation was provided. Subjects with benign nodules were those whose CT findings had remained stable for 2 to 5 years at the time of analysis, except for 1 case from the external validation sample, which was collected preoperatively and diagnosed by histopathology. With the exception of chronic bronchitis. Lung adenocarcinoma samples were collected before lung resection and confirmed by pathological diagnosis.
Fasting blood samples were collected in serum separation tubes without any anticoagulants. Blood samples were allowed to clot for 1 hour at room temperature and then centrifuged at 2851 × g for 10 minutes at 4°C to collect the serum supernatant. Serum aliquots were frozen at -80°C until metabolite extraction. The Department of Cancer Prevention and Medical Examination of the Sun Yat-sen University Cancer Center collected a pool of serum from 100 healthy donors, including equal numbers of men and women aged 40 to 55 years. Equal volumes of each donor sample were mixed, and the resulting pool was aliquoted and stored at -80°C. The serum pool was used as reference material for quality control and data normalization.
Reference serum and test samples were thawed, and metabolites were extracted using a combined extraction method (MTBE/methanol/water)56. Briefly, 50 μl of serum was mixed with 225 μl of ice-cold methanol and 750 μl of ice-cold methyl tert-butyl ether (MTBE). The mixture was vortexed and incubated on ice for 1 hour. The samples were then mixed with 188 μl of MS-grade water containing internal standards (13C-lactate, 13C3-pyruvate, 13C-methionine, and 13C6-isoleucine, purchased from Cambridge Isotope Laboratories) and vortexed. The mixture was then centrifuged at 15,000 × g for 10 min at 4 °C, and the lower phase was transferred into two tubes (125 μl each) for LC-MS analysis in positive and negative modes. Finally, the samples were evaporated to dryness in a high-speed vacuum concentrator.
The dried metabolites were reconstituted in 120 μl of 80% acetonitrile, vortexed for 5 min, and centrifuged at 15,000 × g for 10 min at 4°C. Supernatants were transferred into amber glass vials with microinserts for metabolomics analysis. Untargeted metabolomics analysis was performed on an ultra-performance liquid chromatography-high-resolution mass spectrometry (UPLC-HRMS) platform. Metabolites were separated using a Dionex Ultimate 3000 UPLC system and an ACQUITY BEH Amide column (2.1 × 100 mm, 1.7 μm, Waters). In positive ion mode, mobile phases A and B were 95% and 50% acetonitrile, respectively, each containing 10 mmol/L ammonium acetate and 0.1% formic acid. In negative ion mode, mobile phases A and B were 95% and 50% acetonitrile, respectively, each containing 10 mmol/L ammonium acetate at pH 9. The gradient program was as follows: 0–0.5 min, 2% B; 0.5–12 min, 2–50% B; 12–14 min, 50–98% B; 14–16 min, 98% B; 16–16.1 min, 98–2% B; 16.1–20 min, 2% B. The column was maintained at 40°C and the samples at 10°C in the autosampler. The flow rate was 0.3 ml/min and the injection volume was 3 μl. A Q-Exactive Orbitrap mass spectrometer (Thermo Fisher Scientific) with an electrospray ionization (ESI) source was operated in full scan mode coupled with ddMS2 monitoring mode for data acquisition. The MS parameters were set as follows: spray voltage +3.8 kV/-3.2 kV, capillary temperature 320°C, sheath gas 40 arb, auxiliary gas 10 arb, probe heater temperature 350°C, scan range 70–1050 m/z, and resolution 70,000. Data were acquired using Xcalibur 4.1 (Thermo Fisher Scientific).
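To make the gradient program easier to check, the following Python sketch returns the percentage of mobile phase B at any time point. The breakpoints are transcribed from the program above; the function name `percent_b` and the assumption that each ramp segment is linear are illustrative and are not stated in the Methods.

```python
from bisect import bisect_right

# Breakpoints (time in min, %B) transcribed from the gradient program in the text.
GRADIENT = [
    (0.0, 2.0), (0.5, 2.0), (12.0, 50.0), (14.0, 98.0),
    (16.0, 98.0), (16.1, 2.0), (20.0, 2.0),
]


def percent_b(t_min: float) -> float:
    """Return %B at t_min, assuming linear ramps between breakpoints."""
    times = [t for t, _ in GRADIENT]
    if not times[0] <= t_min <= times[-1]:
        raise ValueError("time is outside the 0-20 min program")
    # Index of the last breakpoint at or before t_min.
    i = bisect_right(times, t_min) - 1
    if i == len(GRADIENT) - 1:
        return GRADIENT[-1][1]
    (t0, b0), (t1, b1) = GRADIENT[i], GRADIENT[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (0.25, 6.25, 13.0, 15.0, 20.0):
        print(f"{t:5.2f} min -> {percent_b(t):.1f}% B")
```

Under the linear-ramp assumption, the midpoint of the 12–14 min segment (13 min) corresponds to 74% B.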
To assess data quality, pooled quality control (QC) samples were generated by combining 10 μl aliquots of the supernatant from each sample. Six QC sample injections were analyzed at the beginning of the analytical sequence to assess the stability of the UPLC-MS system. QC samples were then injected periodically throughout each batch. The serum samples in this study were analyzed by LC-MS in 11 batches. Aliquots of the serum pool from 100 healthy donors were used as reference material in each batch to monitor the extraction process and to correct for batch-to-batch effects. Untargeted metabolomics analysis of the discovery cohort, internal validation cohort, and external validation cohort was performed at the Metabolomics Center of Sun Yat-sen University. An external laboratory, the Analysis and Testing Center of Guangdong University of Technology, also analyzed 40 samples from the external cohort to test the performance of the classifier model.
After extraction and reconstitution, absolute quantitation of serum metabolites was performed using ultra-high performance liquid chromatography-tandem mass spectrometry (Agilent 6495 triple quadrupole) with an electrospray ionization (ESI) source in multiple reaction monitoring (MRM) mode. An ACQUITY BEH Amide column (2.1 × 100 mm, 1.7 μm, Waters) was used to separate metabolites. Mobile phases A and B were 90% and 5% acetonitrile, respectively, with 10 mmol/L ammonium acetate and 0.1% ammonia solution. The gradient program was as follows: 0–1.5 min, 0% B; 1.5–6.5 min, 0–15% B; 6.5–8 min, 15% B; 8–8.5 min, 15–0% B; 8.5–11.5 min, 0% B. The column was maintained at 40 °C and the samples at 10 °C in the autosampler. The flow rate was 0.3 mL/min and the injection volume was 1 μL. MS parameters were set as follows: capillary voltage ±3.5 kV, nebulizer pressure 35 psi, sheath gas flow 12 L/min, sheath gas temperature 350°C, drying gas temperature 250°C, and drying gas flow 14 L/min. The MRM transitions of tryptophan, pyruvate, lactate, hypoxanthine, and xanthine were 205.0–187.9, 87.0–43.4, 89.0–43.3, 135.0–92.3, and 151.0–107.9, respectively. Data were collected using MassHunter B.07.00 (Agilent Technologies). For serum samples, tryptophan, pyruvate, lactate, hypoxanthine, and xanthine were quantified using calibration curves of standard mixture solutions. For cell samples, tryptophan content was normalized to the internal standard and cell protein mass.
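Quantification against a calibration curve can be illustrated with a short Python sketch. It assumes an unweighted linear fit of peak area against standard concentration, which the Methods do not specify; the function names and the calibration points are hypothetical and are not data from this study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantify(peak_area, slope, intercept):
    """Invert the calibration line to get concentration from peak area."""
    return (peak_area - intercept) / slope


if __name__ == "__main__":
    # Hypothetical standard series: concentration (umol/L) vs. MRM peak area.
    conc = [1.0, 5.0, 10.0, 50.0, 100.0]
    area = [250.0, 1050.0, 2050.0, 10050.0, 20050.0]
    slope, intercept = fit_line(conc, area)
    print(f"slope={slope:.2f}, intercept={intercept:.2f}")
    print(f"sample concentration: {quantify(6050.0, slope, intercept):.2f} umol/L")
```

In practice, a weighted fit or an internal-standard ratio on the y-axis may be preferable; the sketch only shows the inversion step.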
Peak extraction (m/z and retention time (RT)) was performed using Compound Discoverer 3.1 and TraceFinder 4.0 (Thermo Fisher Scientific). To eliminate potential differences between batches, each characteristic peak of a test sample was divided by the corresponding peak of the reference material from the same batch to obtain the relative abundance. The relative standard deviations of internal standards before and after normalization are shown in Supplementary Table 6. Differences between two groups were characterized by false discovery rate (FDR < 0.05, Wilcoxon signed rank test) and fold change (>1.2 or <0.83). Raw MS data of the extracted features and reference serum-corrected MS data are shown in Supplementary Data 1 and Supplementary Data 2, respectively. Peak annotation was performed based on four defined levels of identification: identified metabolites, putatively annotated compounds, putatively characterized compound classes, and unknown compounds22. Based on database searches in Compound Discoverer 3.1 (mzCloud, HMDB, Chemspider), biological compounds with MS/MS matching validated standards or exact match annotations in mzCloud (score > 85) or Chemspider were ultimately selected for the differential metabolome. Peak annotations for each feature are included in Supplementary Data 3. MetaboAnalyst 5.0 was used for univariate analysis of sum-normalized metabolite abundance. KEGG pathway enrichment analysis based on significantly different metabolites was also performed in MetaboAnalyst 5.0. Principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) were performed using the ropls software package (v.1.26.4) with sum normalization and autoscaling. The optimal metabolite biomarker model for predicting nodule malignancy was generated using binary logistic regression with least absolute shrinkage and selection operator (LASSO, R package v.4.1-3).
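The reference-serum correction and the differential filter described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: fold change is taken as the ratio of group means, and the FDR is computed with the Benjamini-Hochberg procedure, neither of which is specified in the Methods. All function names and example values are hypothetical.

```python
from statistics import mean


def relative_abundance(sample_peaks, reference_peaks):
    """Divide each feature's peak area by the same feature's area in the
    reference serum that was run in the same batch."""
    return {feature: area / reference_peaks[feature]
            for feature, area in sample_peaks.items()}


def fold_change_hit(case_values, control_values, up=1.2, down=0.83):
    """True if the ratio of group means exceeds `up` or falls below `down`
    (mean ratio is an assumption)."""
    fc = mean(case_values) / mean(control_values)
    return fc > up or fc < down


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    sample = {"feature_1": 300.0, "feature_2": 50.0}      # hypothetical peak areas
    reference = {"feature_1": 150.0, "feature_2": 100.0}  # same batch
    print(relative_abundance(sample, reference))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5]))
```

A feature would then be kept as differential only if its adjusted p-value is below 0.05 and `fold_change_hit` returns True.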
The performance of the discriminant model in the discovery and validation sets was characterized by estimating the AUC from ROC analysis using the pROC package (v.1.18.0). The optimal probability cutoff was obtained from the maximum Youden index of the model (sensitivity + specificity – 1). Samples with values below or above the cutoff were predicted as benign nodules and lung adenocarcinoma, respectively.
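The cutoff selection can be illustrated with a small pure-Python sketch. It is not the pROC implementation; it scans each observed score as a candidate cutoff and keeps the one that maximizes sensitivity + specificity – 1. Treating a score equal to the cutoff as positive is an assumption, and the scores and labels in the example are hypothetical.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing the Youden index.

    labels: 1 = lung adenocarcinoma, 0 = benign nodule.
    A sample is called positive when score >= cutoff (assumption for ties).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes are required")
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


if __name__ == "__main__":
    # Hypothetical predicted probabilities and true labels.
    scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.8, 0.9]
    labels = [0, 0, 0, 1, 0, 1, 1]
    cutoff, j = youden_cutoff(scores, labels)
    print(f"cutoff={cutoff}, Youden J={j:.2f}")
```

With these example values the selected cutoff is 0.4, where sensitivity is 3/3 and specificity is 3/4.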
A549 cells (#CCL-185, American Type Culture Collection) were grown in F-12K medium containing 10% FBS. Short hairpin RNA (shRNA) sequences targeting SLC7A5 and a nontargeting control (NC) were inserted into the lentiviral vector pLKO.1-puro. The antisense sequences of shSLC7A5 are as follows: Sh1 (5′-GGAGAAACCTGATGAACAGTT-3′), Sh2 (5′-GCCGTGGACTTCGGGAACTAT-3′). Antibodies to SLC7A5 (#5347) and tubulin (#2148) were purchased from Cell Signaling Technology. Antibodies to SLC7A5 and tubulin were used at a dilution of 1:1000 for Western blot analysis.
The extracellular acidification rate (ECAR) was measured using the Seahorse XF Glycolysis Stress Test. In the assay, glucose, oligomycin A, and 2-DG were administered sequentially to assess cellular glycolytic capacity as measured by ECAR.
A549 cells transfected with non-targeting control (NC) and shSLC7A5 (Sh1, Sh2) were plated overnight in 10 cm diameter dishes. Cell metabolites were extracted with 1 ml of ice-cold 80% aqueous methanol. Cells in the methanol solution were scraped off, collected into a new tube, and centrifuged at 15,000 × g for 15 min at 4°C. Then 800 µl of supernatant was collected and dried using a high-speed vacuum concentrator. The dried metabolite pellets were then analyzed for tryptophan levels using LC-MS/MS as described above. Cellular NAD(H) levels in A549 cells (NC and shSLC7A5) were measured using a quantitative NAD+/NADH colorimetric kit (#K337, BioVision) according to the manufacturer’s instructions. Protein levels were measured for each sample to normalize the amount of metabolites.
No statistical methods were used to predetermine the sample size. Previous metabolomics studies aimed at biomarker discovery15,18 were used as benchmarks for sample size determination, and compared with these reports, our sample size was adequate. No samples were excluded from the study cohort. Serum samples were randomly assigned to a discovery group (306 cases, 74.6%) and an internal validation group (104 cases, 25.4%) for untargeted metabolomics studies. We also randomly selected 70 cases from each group in the discovery set for targeted metabolomics studies. The investigators were blinded to group assignment during LC-MS data collection and analysis. Statistical analyses of metabolomics data and cell experiments are described in the respective Results, Figure Legends, and Methods sections. Quantification of cellular tryptophan, NAD(H), and glycolytic activity was performed three times independently with similar results.
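A random assignment of this kind can be sketched in Python as below. The sketch assumes a simple, unstratified shuffle with a fixed seed; the Methods do not state whether the split was stratified by group, and the sample identifiers and the function name `split_cohort` are hypothetical.

```python
import random


def split_cohort(sample_ids, n_discovery, seed=0):
    """Shuffle sample IDs reproducibly and split them into two groups."""
    ids = list(sample_ids)
    if not 0 <= n_discovery <= len(ids):
        raise ValueError("n_discovery must be between 0 and the cohort size")
    random.Random(seed).shuffle(ids)
    return ids[:n_discovery], ids[n_discovery:]


if __name__ == "__main__":
    ids = [f"S{i:03d}" for i in range(1, 411)]  # 410 hypothetical sample IDs
    discovery, validation = split_cohort(ids, 306)
    total = len(ids)
    print(len(discovery), f"{100 * len(discovery) / total:.1f}%")
    print(len(validation), f"{100 * len(validation) / total:.1f}%")
```

The proportions printed by the example, 74.6% and 25.4%, match the split reported above for 306 and 104 of 410 samples.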
For more information about the study design, see the Nature Portfolio Reporting Summary linked to this article.
The raw MS data of the extracted features and the reference serum-normalized MS data are provided in Supplementary Data 1 and Supplementary Data 2, respectively. Peak annotations for differential features are presented in Supplementary Data 3. The LUAD TCGA dataset can be downloaded from https://portal.gdc.cancer.gov/. The input data for plotting the figures are provided in the Source Data file. Source data are provided with this article.
National Lung Screening Trial Research Team et al. Reduced lung-cancer mortality with low-dose computed tomographic screening. N. Engl. J. Med. 365, 395–409 (2011).
Kramer, B. S., Berg, C. D., Aberle, D. R. & Prorok, P. C. Lung cancer screening with low-dose helical CT: results from the National Lung Screening Trial (NLST). J. Med. Screen. 18, 109–111 (2011).
de Koning, H. J. et al. Reduced lung-cancer mortality with volume CT screening in a randomized trial. N. Engl. J. Med. 382, 503–513 (2020).
Post time: Sep-18-2023