Lipid metabolic gene-wide profile and signature of lung adenocarcinoma

Lung cancer is a common cancer worldwide with high morbidity and mortality. Increasing evidence shows that disordered lipid metabolism plays a key role in cancer development, and analysis of lipid-related genes may yield diagnostic and prognostic biomarkers for lung cancer. In this study, we performed differential expression analysis of 1045 lipid metabolism-related genes between lung adenocarcinoma (LUAD) tumors and normal tissues in the TCGA-LUAD cohort. Bioinformatic analysis of the differentially expressed genes (DEGs) was then performed. Hub genes were determined using protein-protein interaction (PPI) networks and the cytoHubba app. The association between hub genes and overall survival (OS) was evaluated with Kaplan-Meier Plotter. To predict the prognosis of LUAD patients, a nomogram was built and then validated in another cohort (GSE13213).

DGAT1, HPGDS, and LPL were associated with worse OS in 1925 LUAD patients. Based on the nomogram, the high-risk score group had a worse OS, and the validation cohort showed the same result.

Conclusion
In conclusion, we generated a lipid metabolic transcriptome-wide profile of LUAD patients and found that significant lipid metabolic pathways were correlated with LUAD. Furthermore, we constructed a signature of six lipid metabolic genes, which was significantly associated with the diagnosis and prognosis of LUAD patients. The gene signature can be used as a biomarker for LUAD.

Background
Lung cancer is the most commonly diagnosed cancer (11.6% of the total cases) and the leading cause of cancer death (18.4% of the total cancer deaths) in the world [1]. Among lung cancers, adenocarcinoma has been the most common histologic subtype in both men and women since the 1990s [2]. A 2005-2014 epidemiological survey from China showed that the proportion of adenocarcinoma increased from 36.4% to 53.5%, while the proportion of squamous carcinoma decreased from 45.4% to 34.4% [3]. The increasing incidence of lung adenocarcinoma (LUAD) has also been reported in association with air pollution-related factors [4][5][6].
The pathways driving cancer vary among patients [7]. Cancers caused by different pathways may require different treatments. One of the hallmarks of cancer is metabolic reprogramming. The importance of alterations related to lipid metabolism is starting to be recognized, and the increase in de novo lipogenesis is considered a new hallmark in many aggressive cancers [8]. Epidemiological data indicate that lung cancer patients with high levels of high-density lipoprotein cholesterol (HDL-C) and low levels of low-density lipoprotein (LDL) and low-density lipoprotein receptor (LDLR) have better survival [9,10]. Compared with healthy subjects, non-small cell lung cancer (NSCLC) patients showed significant increases in phosphatidylcholines (PCs) and phosphatidylethanolamines (PEs) [11]. Other lipid metabolism indicators associated with NSCLC include sphingomyelins, phosphatidylinositols, phosphatidylserines, phosphatidylethanolamines, phospholipids, and phosphatidylcholines [12]. Cancer cells have an overwhelming requirement for metabolic intermediates for macromolecule production. Fatty acid oxidation (FAO) helps generate ATP to support membrane formation, energy storage, and the production of signaling molecules by coordinating the activation of lipid anabolic metabolism [13]. During this process, de novo adipogenesis in cancer increases. Lipid metabolism-related genes including FASN [14], ABCA1 [15], ACLY [16], FABP4 [17], CD36 [18], and SCD1 [19] have been reported to be associated with the morbidity, prognosis, and treatment resistance of NSCLC.
Gene expression profiling is an essential part of bioinformatics, with broad application prospects and powerful functions in oncology. It has excellent clinical potential in molecular diagnosis, tumor molecular screening, new target discovery, tumor response prediction, patient classification, and prognosis prediction [20]. To further explore the regulatory networks and pathways related to lipid metabolism, we used an integrated bioinformatic approach to construct a gene-wide expression profile and a lipid metabolism signature of LUAD. We explored potential biomarkers for the diagnosis and prognostic guidance of LUAD caused by lipid metabolism disorder.

Patients and datasets
We downloaded mRNA expression data for 519 lung adenocarcinoma (LUAD) tissues and 58 normal tissues from The Cancer Genome Atlas (TCGA, https://cancergenome.nih.gov/) database using the R package TCGAbiolinks [21]. The Ensembl IDs of the TCGA samples were annotated against the human genome build GRCh38/hg38.

Bioinformatic analysis
We used the R package clusterProfiler to further explore the biological significance of the lipid metabolism-related DEGs [25]. In the Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses, FDR < 0.05 was considered significant enrichment. We then uploaded the DEGs, with their gene identifiers and corresponding FDR and log2FC values, into the IPA software (Qiagen). The "core analysis" function included in the software was used to interpret the DEGs.
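The FDR < 0.05 enrichment criterion can be illustrated with a minimal sketch. It assumes the enrichment is an over-representation test (hypergeometric upper tail) followed by Benjamini-Hochberg adjustment, which is a common way to carry out this kind of analysis. It is written in Python with the standard library rather than the R packages used in the study, and the gene names and gene sets are hypothetical placeholders, not data from the paper.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) when `draws` genes are sampled without replacement from
    `population` genes, of which `successes` belong to the term."""
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(population, draws)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values (FDR) in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def enrich(degs, background, term_to_genes, fdr_cutoff=0.05):
    """Return {term: (p_value, fdr)} for terms with fdr < fdr_cutoff."""
    degs = degs & background
    terms = list(term_to_genes)
    pvalues = []
    for term in terms:
        members = term_to_genes[term] & background
        overlap = len(degs & members)
        pvalues.append(
            hypergeom_upper_tail(overlap, len(background), len(members), len(degs))
        )
    fdrs = benjamini_hochberg(pvalues)
    return {t: (p, q) for t, p, q in zip(terms, pvalues, fdrs) if q < fdr_cutoff}


# Hypothetical toy input: 20 background genes, 5 DEGs, 2 gene sets.
background = {f"g{i}" for i in range(20)}
degs = {"g0", "g1", "g2", "g3", "g4"}
gene_sets = {
    "term_A": {"g0", "g1", "g2", "g3"},       # fully covered by the DEGs
    "term_B": {f"g{i}" for i in range(10, 16)},  # no overlap with the DEGs
}
significant = enrich(degs, background, gene_sets)
```

In this toy input only `term_A` passes the cut-off, because all four of its genes are among the five DEGs while `term_B` shares none.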

Protein-protein interaction network generation and hub genes analysis
We built a protein-protein interaction (PPI) network of the differentially expressed lipid metabolism-related genes using the Search Tool for the Retrieval of Interacting Genes (STRING, http://string-db.org/) database [26]. A combined score of ≥ 0.4 was used as the cut-off value. Cytoscape software (version 3.6.0) was used to visualize the PPI networks [27]. Using the 12 ranking methods in cytoHubba [28], a Cytoscape app, the top ten genes from each method were selected for overlap analysis, and the genes that overlapped across the most methods were used as hub genes.
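The overlap step can be sketched as a simple count of how many ranking methods place each gene in their top N. This is only an illustration in Python: the text does not state the exact overlap threshold, so the sketch assumes that hub genes are those tied for the highest count, and the rankings and gene names below are hypothetical.

```python
from collections import Counter


def hub_genes_by_overlap(rankings, top_n=10):
    """Count, for each gene, how many methods rank it in their top `top_n`,
    and return the genes tied for the highest count plus all counts."""
    counts = Counter()
    for ranked_genes in rankings.values():
        counts.update(set(ranked_genes[:top_n]))
    best = max(counts.values())
    hubs = sorted(gene for gene, c in counts.items() if c == best)
    return hubs, counts


# Hypothetical rankings from three methods (the study used 12), best first.
toy_rankings = {
    "Degree": ["G1", "G2", "G3"],
    "MCC": ["G2", "G1", "G4"],
    "Closeness": ["G2", "G5", "G1"],
}
hubs, counts = hub_genes_by_overlap(toy_rankings, top_n=2)
```

With `top_n=2`, `G2` appears in all three top lists and is the only gene returned as a hub in this toy case.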

Survival analysis
The overall survival (OS) analysis of the hub genes was performed with Kaplan-Meier Plotter (http://kmplot.com/analysis/), which includes clinical data and gene expression information for 1925 lung cancer patients [29]. We then extracted the number of cases, the median mRNA expression levels, hazard ratios (HR) with 95% confidence intervals (CI), and log-rank P-values from the KM Plotter webpage. Log-rank P-values < 0.05 were considered significant.
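For readers unfamiliar with the log-rank P-value, the following Python sketch computes a two-group log-rank statistic after splitting patients at the median expression. It illustrates the statistic only and is not the implementation used by Kaplan-Meier Plotter. The median split is an assumption, and the survival times, event flags, and expression values are hypothetical.

```python
from math import erfc, sqrt
from statistics import median


def split_by_median(expression):
    """True for samples whose expression is above the cohort median."""
    cut = median(expression)
    return [value > cut for value in expression]


def logrank_test(times, events, in_group_a):
    """Two-group log-rank test. `events` is 1 for death, 0 for censoring.
    Returns (chi_square, p_value) with 1 degree of freedom."""
    data = list(zip(times, events, in_group_a))
    event_times = sorted({t for t, e, _ in data if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for ti, _, _ in data if ti >= t)            # at risk, all
        n_a = sum(1 for ti, _, a in data if ti >= t and a)    # at risk, group A
        d = sum(1 for ti, e, _ in data if ti == t and e)      # deaths, all
        d_a = sum(1 for ti, e, a in data if ti == t and e and a)
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi_square = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi_square / 2))
    return chi_square, p_value


# Hypothetical cohort: high expressers die early, low expressers late.
toy_expression = [9, 8, 7, 1, 2, 3]
toy_times = [2, 3, 4, 8, 9, 10]
toy_events = [1, 1, 1, 1, 1, 1]
high = split_by_median(toy_expression)
chi_square, p_value = logrank_test(toy_times, toy_events, high)
```

The same function handles censored patients, since a censored sample contributes to the at-risk counts until its last follow-up time but never to the death counts.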

Prediction model
Based on the selected hub genes, we used the R package rms [30] to build a nomogram for evaluating the prognosis of LUAD patients. Using the formula of the nomogram, we calculated the prognosis score of each patient. According to this score, patients were divided into a low-risk score group and a high-risk score group using the median as the cut-off. The prognosis score was validated against the patients' actual prognosis outcomes. We then repeated the same analyses on an external set (117 LUAD patients from GSE13213) to validate this 6-gene-based risk model.
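The scoring and grouping step can be sketched as follows. The sketch assumes the prognosis score is a weighted sum of gene expression values (a Cox-type linear predictor, which a nomogram rescales into points) and that patients above the median score form the high-risk group. The gene names, coefficients, and expression values are hypothetical placeholders and are not the fitted values from the study.

```python
from math import fsum
from statistics import median


def risk_scores(patients, coefficients):
    """Weighted sum of expression values for each patient."""
    return [
        fsum(coefficients[gene] * patient[gene] for gene in coefficients)
        for patient in patients
    ]


def assign_risk_groups(scores):
    """Label each patient 'high' if above the median score, else 'low'."""
    cut = median(scores)
    return ["high" if score > cut else "low" for score in scores]


# Hypothetical coefficients and expression values (not from the paper).
toy_coefficients = {"gene_a": 0.5, "gene_b": -0.25}
toy_patients = [
    {"gene_a": 2, "gene_b": 4},
    {"gene_a": 4, "gene_b": 0},
    {"gene_a": 0, "gene_b": 4},
    {"gene_a": 6, "gene_b": 2},
]
scores = risk_scores(toy_patients, toy_coefficients)
groups = assign_risk_groups(scores)
```

For external validation, one reasonable design is to reuse the coefficients fitted on the training cohort and recompute the median within the validation cohort, since expression scales often differ between platforms; the text does not state which cut-off was applied to GSE13213.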

Identification and functional analysis of lipid metabolism-related DEGs
A total of 217 lipid metabolism-related DEGs were identified from the TCGA-LUAD cohort. A volcano plot was constructed to show the significant DEGs (Fig. 1A), and a heatmap was generated to show the hierarchical clustering of the DEGs (Fig. 1B). To gain an overall understanding of the 217 lipid metabolism-related DEGs, we performed GO term and KEGG pathway enrichment with the clusterProfiler package and canonical pathway analysis with IPA. The KEGG pathway enrichment showed that the DEGs were significantly enriched in glycerophospholipid metabolism, arachidonic acid metabolism, and metabolism of xenobiotics by cytochrome P450. Among GO terms, they were significantly enriched in fatty acid metabolic process, glycerolipid metabolic process, and steroid metabolic process (Fig. 1C). IPA identified significant canonical networks associated with the DEGs; the top canonical pathways associated with the common DEGs were Eicosanoid Signaling, FXR/RXR Activation, and Atherosclerosis Signaling (Fig. 1D).

PPI network construction and cytoHubba analysis
The lipid metabolism-related DEGs were analyzed with the STRING tool. Ultimately, a PPI network with 216 nodes and 1140 edges was established and visualized in Cytoscape (Fig. 2). A total of 6 hub genes were then identified from the overlap of the top 10 genes according to the 12 ranking methods in cytoHubba (Table S1).

Prediction model based on survival-related hub genes and validation
Based on the Cox regression model, a nomogram was built to predict the prognosis of LUAD patients using the mRNA expression of the six survival-related hub genes (Fig. 4A). We then calculated the prognosis score of each patient and found that the high-risk score group had worse 3-year OS [HR = 1.51 (1.07-2.13), P = 0.018] (Fig. 4B). In the validation cohort, the high-risk score group likewise had worse OS [HR = 1.84 (1.00-3.37), P = 0.047] (Fig. 4C).
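As a rough consistency check on the reported statistics, a hazard ratio and its 95% CI can be converted back into an approximate two-sided P-value. The Python sketch below assumes the interval is symmetric on the log scale (a Wald-type interval). Because the published values are rounded and the reported P-values may come from a log-rank test instead, the result is only expected to be close to the reported P, not identical.

```python
from math import erfc, log, sqrt


def wald_p_from_hr(hr, ci_low, ci_high, z_crit=1.959964):
    """Approximate z statistic and two-sided p-value from a hazard ratio
    and its 95% confidence interval, assuming a Wald interval on log(HR)."""
    standard_error = (log(ci_high) - log(ci_low)) / (2 * z_crit)
    z = log(hr) / standard_error
    p_value = erfc(abs(z) / sqrt(2))  # two-sided normal tail probability
    return z, p_value


# Values as reported in the text for the training and validation cohorts.
z_train, p_train = wald_p_from_hr(1.51, 1.07, 2.13)
z_valid, p_valid = wald_p_from_hr(1.84, 1.00, 3.37)
```

Both recomputed values fall close to the reported P = 0.018 and P = 0.047, which suggests that the hazard ratios, intervals, and P-values in the text are mutually consistent.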

Discussion
Metabolic change has been widely observed in cancer cells [31]. Among metabolic processes, lipid metabolism participates widely in the regulation of many cellular processes such as cell growth, proliferation, differentiation, survival, apoptosis, inflammation, motility, membrane homeostasis, chemotherapy response, and drug resistance [32]. Recent studies have reported that some components of PM2.5 promote pulmonary injury by modifying lipid metabolism [33]. However, few studies have examined the association between transcriptome-wide lipid metabolism and lung cancer.
Therefore, this study used the LUAD cohort to generate a transcriptome-wide profile of lipid-related genes. Similar to previous studies of other cancers [34], fatty acid, glycerolipid, and glycerophospholipid metabolism were the primary enriched biological functions. In addition, arachidonic acid metabolism, the PPAR signaling pathway, insulin resistance, eicosanoid signaling, and other pathways and GO terms have also been reported in cancer [35][36][37]. In the network of these biological function modules, which were connected by genes shared between modules, the lipid metabolism of LUAD was associated with nicotine, estrogen biosynthesis, melatonin, and atherosclerosis. Nicotine may promote LUAD development by regulating lipid metabolism. The interaction between estrogen biosynthesis and lipid metabolism is one of the high-risk factors for LUAD, which is consistent with the trend that lung cancer incidence is rising in women and has, in fact, more than doubled since the mid-1970s [38]. Atherosclerosis and cancer have many similarities [39]. Patients with atherosclerotic disease are prone to repeated episodes of ischemia/reperfusion, which induce oxidative stress through the formation of oxygen free radicals. Endogenous exposure to free radicals increases the risk of cancer in individuals with atherosclerotic diseases [40]. In addition, retinoic acid-induced RAR-beta/RXR activation, which promotes tumor progression, may be a potential mechanism promoting LUAD. RXR can activate FXR/RXR and LXR/RXR, and the two activations also overlap. FXR has been reported as a tumor suppressor [41]. A possible mechanism is that FXR activates CCND1 expression and promotes cell proliferation. To activate the expression of its target genes, FXR forms a heterodimer with retinoid X receptor alpha (RXRα). After activation by a specific agonist, it binds to FXR response elements (FXREs), mainly IR-1 [42].
FXR has also recently been found to be closely related to the immunotherapy microenvironment. Through FXR/RXR and LXR/RXR, lipid metabolism may influence the development of LUAD by regulating the immune system.
To find potential interventional targets for LUAD patients, we constructed a network of the genes related to lipid metabolism and LUAD and found six hub genes. CYP2C9, a cytochrome P450 enzyme, is a drug target in lung cancer and is involved in the regulation of tumorigenesis [43,44]. LUAD patients with lower expression of CYP2C9 have a better prognosis. UGT1A variants may play only a minor role in lung cancer risk [45]. LUAD patients with lower expression of UGT1A have a better prognosis (Fig. 3). DGAT1 and LPL are lipid metabolic genes, and both are involved in fatty acid synthesis. HPGDS has therapeutic potential in allergic inflammation [46]. These three genes were positively related to survival time. INS encodes insulin and plays a vital role in the regulation of carbohydrate and lipid metabolism. LUAD patients with lower expression of INS have a better prognosis. Regulating fatty acid synthesis and insulin and controlling inflammation may be potential treatment strategies for LUAD patients. Based on these six genes, a risk model was constructed. In both cohorts, LUAD patients with a lower risk score had a better prognosis.

Conclusions
In summary, we generated a lipid metabolic transcriptome-wide profile of LUAD patients and found that significant lipid metabolic pathways were correlated with LUAD. Our findings suggest that lipid metabolism is a route through which exogenous substances can affect the development of cancer. A signature of six lipid metabolic genes was significantly associated with the diagnosis and prognosis of LUAD patients. The gene signature can be used as a biomarker for LUAD.

Fig. 2 The PPI network of lipid metabolism-related DEGs. Red, white, and blue nodes represent upregulated genes, genes with no expression difference, and downregulated genes, respectively. Node size is positively correlated with degree.