Abstract:
Objective To screen molecular biomarkers for the diagnosis of early-stage lung cancer in elderly patients using plasma lipidomics combined with machine learning, and to evaluate their diagnostic performance.
Methods This was a retrospective diagnostic study consisting of two parts. The first part was molecular biomarker screening. Elderly patients with early-stage lung cancer (early lung cancer group), patients with benign pulmonary nodules (benign nodule group), and contemporaneous healthy individuals undergoing physical examinations (healthy control group) were enrolled at Peking University People's Hospital between November 2023 and November 2024. In addition, patients with early-stage lung cancer and healthy controls who met the inclusion criteria in a previous study by our research group were included as an independent validation cohort. Plasma samples were collected from all subjects, and untargeted lipidomics analysis was performed using high-performance liquid chromatography-mass spectrometry. Principal component analysis and orthogonal partial least squares discriminant analysis were used to evaluate metabolic differences between groups. An L1-regularized support vector machine combined with incremental feature selection was used to screen diagnostic biomarkers for early-stage lung cancer. Model performance was assessed using receiver operating characteristic curves, calibration curves, Brier scores, and decision curve analysis. The second part was functional validation in the human lung adenocarcinoma cell line A549: palmitoylcarnitine (CAR 16:0) was selected as a representative biomarker and tested using CCK-8 and cell scratch assays.
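As a minimal illustration of two of the performance measures named above, the following Python sketch computes a rank-based area under the ROC curve and a Brier score. The labels and predicted probabilities are invented toy values, not study data, and the function names `roc_auc` and `brier_score` are hypothetical helpers rather than the software used in this study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def brier_score(labels, probs):
    """Mean squared difference between predicted probability and outcome;
    lower values indicate better calibrated, sharper predictions."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


# Toy data (assumption, for illustration only): 1 = cancer, 0 = control.
labels = [1, 1, 1, 0, 0, 0]
probs = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]

auc = roc_auc(labels, probs)
brier = brier_score(labels, probs)
print(f"AUC = {auc:.3f}, Brier score = {brier:.3f}")
```

On real data, these measures would normally be reported with confidence intervals, for example from bootstrap resampling.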
Results A total of 36 patients in the early lung cancer group, 35 patients in the benign nodule group, and 41 healthy controls were enrolled, along with an independent validation cohort of 110 individuals (59 patients with early-stage lung cancer and 51 healthy controls). Principal component analysis showed that quality control samples clustered tightly at the centroid of all samples, indicating stable instrument performance and reliable data quality. Orthogonal partial least squares discriminant analysis revealed significant metabolic differences between the early lung cancer group and the control group (benign nodule group + healthy control group) (R²X = 0.406, R²Y = 0.529, Q²Y = 0.44). The L1-regularized support vector machine identified five carnitine-related lipids as diagnostic biomarkers for early-stage lung cancer: palmitoleoylcarnitine (CAR 16:1), palmitoylcarnitine (CAR 16:0), α-linolenoylcarnitine (CAR 18:3), linoleoylcarnitine (CAR 18:2), and oleoylcarnitine (CAR 18:1), all with stability values >98%. In the screening scenario (early lung cancer group vs. benign nodule group + healthy control group), the model based on these five biomarkers achieved an area under the curve (AUC) of 0.895 (95% CI: 0.700-1.000) for diagnosing early-stage lung cancer, with a sensitivity of 98.4%, specificity of 63.9%, and accuracy of 75.0%. For differentiating early-stage lung cancer from benign pulmonary nodules, the model yielded an AUC of 0.877 (95% CI: 0.797-0.965), sensitivity of 86.1%, and specificity of 80.0%. For differentiating early-stage lung cancer from healthy controls, the model yielded an AUC of 0.929 (95% CI: 0.877-0.988), sensitivity of 94.4%, and specificity of 85.4%. Calibration and decision curve analyses demonstrated good model calibration and an overall net benefit for patients with early-stage lung cancer. In the independent validation cohort, the model achieved an AUC of 0.874 (95% CI: 0.781-0.940) for diagnosing early-stage lung cancer, with a sensitivity of 86.4%, specificity of 82.4%, and accuracy of 86.4%. In vitro experiments showed that palmitoylcarnitine inhibited the proliferation and migration of A549 cells, with a half-maximal inhibitory concentration of 55.04 μmol/L.
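For readers less familiar with how sensitivity, specificity, and accuracy relate to classification counts, a short Python sketch follows. The confusion-matrix counts are assumptions chosen for illustration; the abstract reports only percentages, so these counts are not reported data, although they are arithmetically consistent with the group sizes (36 and 35) and with the 86.1% sensitivity and 80.0% specificity given for the comparison of early-stage lung cancer with benign pulmonary nodules. The helper `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of true cancers detected
    specificity = tn / (tn + fp)  # fraction of non-cancers correctly ruled out
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Assumed counts (not reported in the source): 36 cancers, 35 benign nodules.
sens, spec, acc = diagnostic_metrics(tp=31, fn=5, tn=28, fp=7)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}, accuracy = {acc:.1%}")
```

Because accuracy depends on the mix of cases and controls, it can differ markedly between cohorts even when sensitivity and specificity are similar.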
Conclusions The five plasma carnitine-related lipids identified by untargeted lipidomics and machine learning may serve as potential molecular biomarkers for the diagnosis of early-stage lung cancer in elderly patients. The model's high sensitivity makes it particularly suitable for early-stage lung cancer screening.