Identification of prognostic lipid droplet-associated genes in pancreatic cancer patients via bioinformatics analysis

Background Pancreatic cancer is the fourth leading cause of cancer deaths in the United States both in females and in males, and is projected to become the second deadliest cancer by 2030. The overall 5-year survival rate remains at around 10%. Cancer metabolism and specifically lipid metabolism plays an important role in pancreatic cancer progression and metastasis. Lipid droplets can not only store and transfer lipids, but also act as molecular messengers, and signaling factors. As lipid droplets are implicated in reprogramming tumor cell metabolism and in invasion and migration of pancreatic cancer cells, we aimed to identify lipid droplet-associated genes as prognostic markers in pancreatic cancer. Methods We performed a literature search on review articles related to lipid droplet-associated proteins. To select relevant lipid droplet-associated factors, bioinformatics analysis on the GEPIA platform (data are publicly available) was carried out for selected genes to identify differential expression in pancreatic cancer versus healthy pancreatic tissues. Differentially expressed genes were further analyzed regarding overall survival of pancreatic cancer patients. Results 65 factors were identified as lipid droplet-associated factors. Bioinformatics analysis of 179 pancreatic cancer samples and 171 normal pancreatic tissue samples on the GEPIA platform identified 39 deferentially expressed genes in pancreatic cancer with 36 up-regulated genes (ACSL3, ACSL4, AGPAT2, BSCL2, CAV1, CAV2, CAVIN1, CES1, CIDEC, DGAT1, DGAT2, FAF2, G0S2, HILPDA, HSD17B11, ICE2, LDAH, LIPE, LPCAT1, LPCAT2, LPIN1, MGLL, NAPA, NCEH1, PCYT1A, PLIN2, PLIN3, RAB5A, RAB7A, RAB8A, RAB18, SNAP23, SQLE, VAPA, VCP, VMP1) and 3 down-regulated genes (FITM1, PLIN4, PLIN5). Among 39 differentially expressed factors, seven up-regulated genes (CAV2, CIDEC, HILPDA, HSD17B11, NCEH1, RAB5A, and SQLE) and two down-regulation genes (BSCL2 and FITM1) were significantly associated with overall survival of pancreatic cancer patients. Multivariate Cox regression analysis identified CAV2 as the only independent prognostic factor. Conclusions Through bioinformatics analysis, we identified nine prognostic relevant differentially expressed genes highlighting the role of lipid droplet-associated factors in pancreatic cancer.


Background
Pancreatic cancer is the fourth leading cause of cancer deaths in the United States both in females and males [1], and is predicted to be the second most common cancer by 2030 [2]. Although recent therapeutic advances such as more effective adjuvant and neo-adjuvant chemotherapies, together with more radical and safer surgery, pancreatic cancer prognosis is very poor and the overall 5-year survival rate is still 10% [1,3]. Metabolic reprogramming has been recognized as a hallmark of cancer [4]. Aberrant lipid synthesis and reprogrammed lipid metabolism are both associated with the development and progression of pancreatic cancer [5]. This can be because lipids such as phospholipid bilayers are fundamental structural components enabling cellular proliferation [5]. Lipid droplets (LDs) are ubiquitous intracellular organelles that store neutral lipids, such as, triacylglycerides, and cholesterol esters [6,7]. The storage of neutral lipids in LDs is important for protecting cells from lipotoxicity due to the buildup of excess lipids in cell membranes [7]. Some cancer cells accumulate massive amount of LDs [7]. LDs are further implicated to mediate the proliferation, invasion, metastasis, as well as chemotherapy resistance in several types of cancer [8]. Oncogenic KRAS, which is the most important driver for pancreatic cancer development, controls the storage and utilization of LD supporting reprogramming of tumor cell metabolism, invasion and migration [9]. LDs are composed of a monolayer of phospholipids together with a variety of proteins such as structural proteins, membrane transport proteins, and enzymes [8]. LDs can associate with most other cellular organelles through membrane contact sites mediated by a set of proteins [10]. Since LDassociated proteins play an important role in dynamics of LD, we hypothesized that LD-associated factors may be associated with the outcome of pancreatic cancer patients.

Literature search for selecting lipid droplet-associated factors
To identify lipid droplet-associated factors, we performed a literature search in the PubMed database related to lipid droplet-associated proteins (last search date: July 2020). All publications with keywords "lipid droplet-associated protein" and the category "review" were collected. There was no restriction on the publication period. We analyzed all retrieved articles , and selected factors that were described to localize on lipid droplet including factors on contact sites to other organelles, such as endoplasmic reticulum and mitochondria.

Gene expression profiling interactive analysis (GEPIA) bioinformatics analysis
For gene expression profiling and overall survival analysis, we conducted bioinformatics analysis on the GEPIA platform (http://gepia.cancer-pku.cn/) [40]. GEPIA is an online analysis tool for processing highthroughput RNA sequencing expression data of bulk tumorous and normal samples based on the Cancer Genome Atlas (TCGA) (https://portal.gdc.cancer.gov/) and the Genotype-Tissue Expression (GTEx, https://www. gtexportal.org/) databases. Our analysis included 179 pancreatic cancer samples and 171 normal pancreatic tissue samples. Dot maps of selected genes were generated. Furthermore, the GEPIA differential analysis module was used for analyzing gene expression profiles of pancreatic cancer and paired normal samples, and for screening differentially expressed genes (DEGs) between tumor and normal tissues. DEGs were further analyzed for overall survival (OS) on the GEPIA platform. Median expression was used as the threshold between high expression and low expression cohorts. The Human Protein Atlas version 20.1 (https://www.proteinatlas.org) [41] was used to analyze DEGs at the protein level by immunohistochemistry.

Statistical analysis
Statistical analysis was performed through the plug-in units of GEPIA. Analysis of variance (ANOVA) and Limma package plug-in were performed for screening DEGs. The data analysis function of Limma package is for the construction of linear models and differential expression for RNA-seq data. Genes with log2 (fold change) > 1 or < − 1, and a P value < 0.05 were considered DEGs. P-values were calculated as false discovery rate (FDR)-adjusted P-values with the Limma packages. OS analysis was performed using the Kaplan-Meier survival plots tool in the GEPIA platform. Via Log-rank test, log-rank P-values, hazards ratios (HR), and Cox Pvalues were obtained. Univariate and multivariate Cox regression analysis was carried out via Cox Proportional-Hazards (CoxPH) function in R. P < 0.05 was considered statistically significant.

Expression of a large number of lipid droplet-associated genes is significantly altered in pancreatic cancer patient specimens
We  Table 2.
Lipid droplet-associated gene expression of CAV2, CIDEC, HILPDA, HSD17B11, NCEH1, RAB5A, and SQLE is associated with poor survival while expression of BSCL2 and FITM1 is associated with longer overall survival of pancreatic cancer patients Next, overall survival (OS) analysis of 39 DEGs was performed on the GEPIA platform. The results revealed that up-regulation of seven genes (CAV2, CIDEC, HILP DA, HSD17B11, NCEH1, RAB5A, and SQLE) and downregulation of FITM1 were significantly associated with a decrease in OS. Interestingly, up-regulation of BSCL2 was associated with favorable OS. Hence, through the above analysis, nine prognostic DEGs of lipid dropletassociated factors were identified in pancreatic cancer (Fig. 2, 3a). To further verify, whether these nine genes have prognostic power, we performed univariate Cox proportional hazards regression analysis and calculated hazard ratios (HRs) and 95% confidence intervals (CIs , P = 0.007) (Fig. 3b). To identify, whether these genes were independent prognostic factors, we performed multivariate Cox proportional hazards regression analysis. Among the prognostic DEGs, only CAV2 was an independent prognostic factor (coefficient: 0.4, HR: 1.5, P = 0.005) (Fig. 3c). We further analyzed protein expression of the nine DEGs on the Human Protein Atlas database. Representative immunohistochemistry pictures of the DEGs are shown in Fig. 4. Taken together, LD-associated factors seem to be relevant in pancreatic cancer since there are linked by bioinformatics analyses to overall survival of pancreatic cancer patients.

Discussion
LD accumulation in non-adipose tissues has been recognized as a new hallmark of cancer cells [8]. Higher LD content has been reported in colorectal, breast, prostate cancer, hepatocellular carcinoma, renal cell carcinoma and glioblastoma [8]. LDs have been implicated to mediate aspects of proliferation, invasion, metastasis, as well as chemotherapy resistance in several types of cancer [8]. Increased storage of lipids in LDs has been suggested to be beneficial for the survival of cancer cells. Increased LD contents may expand the energy source for the metabolic need of proliferative cancer cells [8]. Storage of excess FAs and cholesterol in LDs can prevent lipotoxicity and endoplasmic reticulum (ER) stress [8]. Crucial regulators in LD homeostasis include structural proteins, membrane transport proteins and enzymes [8]. We hypothesized that expression of LD-associated factors is relevant in pancreatic cancer, because LDassociated factors support the storage of neutral lipids in lipid droplets providing energy source for cancer cells and potentially protecting cancer cells from lipotoxicity. Furthermore, LDs can associate with most other cellular organelles such as ER, nucleus, mitochondria, peroxisomes, and lysosomes through membrane contact sites [10]. It is not known whether LDs can be transferred between cells, or can be secreted into the blood stream. It has been shown that tumors signal over long distances to sites of future metastases to promote formation of a pre-metastatic niche that potentially supports growth of disseminated tumor cells upon their arrival [42]. If lipid droplets act as "transporters" of lipids and signaling molecules between cells, LDs and LD-associated factors might be involved in metastasis. The molecular mechanisms, however, need to be clarified in future.
In the current study, we identified 36 upregulated and 3 downregulated genes characterized as LD-associated factors in pancreatic cancer (Table 2). We further identified that enhanced expression of 7 genes namely CAV2, CIDEC, HILPDA, HSD17B11, NCEH1, RAB5A, and SQLE are associated with significantly shorter overall survival whereas elevated expression of BSCL2 and FITM1 correlates with longer overall survival of pancreatic cancer patients (Fig. 3a). To verify those findings also on the protein level, normal pancreatic tissue samples, in addition to cancerous and para-cancerous tissues, would be required. However, we confirmed protein expression of the DEGs by using the Human Protein Atlas database. Although gene transcripts levels and protein concentrations do not always correspond to each other, we identified prognostic LD-associated factors at the transcriptome level, providing clues for future proteomics, metabolomics and downstream functional analysis.
CAV-2 is a caveolin family member and CAV-2 expression is increased during the accumulation of intracellular LDs and the adipogenic differentiation [43]. CAV-2 plays a key role in intracellular cell transport and higher level of CAV-2 is associated with different types of cancer progression [44]. It has been demonstrated that higher expression of CAV-2 and its upstream regulator bromodomain containing 4 (BRD4) is associated with shorter overall survival of 76 pancreatic cancer patients [44]. In our study, both CAV1 and CAV2 were identified as DEGs, but only CAV2 was a prognostic DEG associated with shorter OS in pancreatic cancer patients. CIDEC (also known as fat-specific protein of 27 kDa, FSP27) has been suggested to mediate LD-LD contact for promoting LD fusion [6,10]. CIDE proteins are enriched at LD-LD contact sites and physically chaining the adjacent organelles [10]. So far it has been demonstrated that CIDEC promotes development of hepatic steatosis and steatohepatitis [45,46]. In our study, only CIDEC but not CIDEA or CIDEB was identified as a prognostic DEG. The precise and specific role of CIDEC in cancer needs to be clarified.
In various cancer cells including renal cell carcinoma, ovarian clear cell carcinoma, colorectal adenoma and carcinoma, upregulation of HILPDA has been observed [47][48][49]. Hypoxia-inducible factor 1 (HIF1) regulates the expression of HILPDA [11]. HILPDA preferentially  accumulates in LDs undergoing remodeling (e.g. expansion). HILPDA has been shown to co-localize with the lipogenic enzymes DGAT1 and DGAT2 [11], which were both identified as DEGs in our study (Table 2). DGATs catalyze the final rate-limiting step in the formation of triglycerides. In line with this, HILPDA has been shown to promote intracellular lipid accumulation by enhancing triglyceride synthesis [11]. Overexpression of HILPDA, but not DGAT1/2, is associated with shorter overall survival in pancreatic cancer patients (Fig. 3a), suggesting that additional roles of HILPDA, other than regulating triacylglyceride (TAG) synthesis, may influence the outcome of pancreatic cancer patients. Indeed, HILPDA inhibits the ratelimiting enzyme of TAG hydrolysis PNPLA2 (ATGL), leading to inhibition of lipolysis, attenuated fatty acid oxidation and ROS production [11].
HSD17B11 is known to convert 5α-androstan-3α, 17βdiol (3α-diol) to androsterone [50]. HSD17B11 regulates size of LDs, LD distribution and TAG content [51]. The role of HSD17B11 in pancreatic cancer has not been elucidated. It has been shown that HSD17B13 variants are Fig. 4 Representative immunohistochemistry of nine lipid droplet-associated genes between pancreatic cancer and normal pancreas tissues in the Human Protein Atlas database associated with nonalcoholic fatty liver disease (NAFLD) and HSD17B13 expression is elevated in nonalcoholic steatohepatitis (NASH) patients, or with risk of cirrhosis and hepatocellular carcinoma (HCC), while a HSD17B13 variant has been demonstrated to protect from HCC development [52][53][54]. In our study, HSD17B13 was not identified as DEG. Regarding NCEH1, which hydrolyzes 2-acetyl monoalkylglycerol in the metabolism of ether lipids, it has been suggested as a prognostic marker for pancreatic cancer [55], supporting our analysis.
Rab family protein RAB5A belongs to the Ras family of G-proteins, which regulate membrane vesicle trafficking. In our study, several Rab family members namely RAB5A, RAB7A, RAB8A, and RAB18 were identified as DEGs ( Table 2). Among these four genes, expression of RAB5A was associated with shorter OS of pancreatic cancer patients. Expression of RAB5A correlated with pancreatic tumor progression in another study of 111 patients as well [56]. Increased expression of RAB5A predicts metastasis and shorter OS in colorectal cancer patients [57]. It has been suggested that RAB5A regulates Wnt signaling, proliferation, invasion and 5-FU drug resistance [56]. Further, RAB5A activates IRS1, Akt and mTOR signaling [58], suggesting that LD-associated factors are also involved in regulating signaling pathways. SQLE is a key enzyme in the cholesterol synthesis pathway and converts squalene to 2,3-epoxysqualene [59]. SQLE increase epigenetic silencing of PTEN, leading to activation of the Akt-mTOR pathway and NAFL D-induced HCC growth. High expression of SQLE was associated with shorter OS of HCC patients [60]. On the other hand, it has been demonstrated that reduction of SQLE mRNA and protein expression is associated with shortened survival of colorectal cancer patients. SQLE reduction aggravates colorectal cancer progression via the activation of the β-catenin pathway and deactivation of p53 pathway [61]. In our study with pancreatic cancer patient databases, SQLE was identified as a DEG and higher expression of SQLE was associated with shorter OS of pancreatic cancer patients. The precise and cancer type-specific role of SQLE has to be further clarified.

Study strengths and limitations
Bioinformatics analysis revealed that expression of LDassociated factors is associated with overall survival in pancreatic cancer patients. Although aberrant lipid synthesis and reprogrammed lipid metabolism are both associated with the development and progression of pancreatic cancer, LD and LD-associated factors have not been considered in this disease. In the current study, we identified 65 factors as LD-associated factors. We identified 39 DEGs with 36 up-regulated and 3 downregulated genes. Among 39 DEGs, 7 up-regulated genes and 2 down-regulated genes were significantly associated with overall survival of pancreatic cancer patients. Cox regression analysis further validated 8 factors as prognostic factors.
There are also several limitations of the study. Cofounding factors such as body-mass index (BMI) and others could not be considered, since the data of normal tissue from the GTEx database do not include these information. Furthermore, findings were not validated on the protein level. Further studies must include functional analysis using knockout animal model of the prognostic candidate genes to analyze whether deletion/ inhibition of the lipid droplet-associated factors can change the metabolic profile of the cells, proliferation, and the ability to metastasize. It would also be of interest to clarify whether lipid droplets can be detected in liquid biopsies (blood), and whether deletion/ inhibition of LDassociated factors reduce lipid droplet contents in the blood.

Conclusions
LDs are ubiquitous cellular organelles, involved not only in lipid metabolism but also in diverse biological functions such as regulating signaling pathways. LDs mediate proliferation, invasion, metastasis, as well as chemotherapy resistance in several types of cancer. LD-associated proteins play an important role in dynamics of LD and it is now evident that expression of several LD-associated genes are associated with overall survival in pancreatic cancer patients. The current study identified prognostic LD-associated factors at the transcriptome level, providing clues for future proteomics and downstream functional and pathway analysis. It is important to increase our understanding of cancer type-specific roles of LDassociated factors, which may help to develop more specific and personalized therapies for pancreatic cancer patients in the future. Authors' contributions RB performed the gene bioinformatics analysis and wrote the manuscript. AR gave administrative support. JK supervised the study. YS designed the study, performed the gene selection analysis, and wrote the manuscript. The author(s) read and approved the final manuscript.

Funding
There is no funding obtained for the current manuscript.

Availability of data and materials
All data generated or analyzed during this study are included in this article.

Declarations
Ethics approval and consent to participate Not applicable.

Consent for publication
The Human Protein Atlas version 20.1 (https://www.proteinatlas.org) was used to analyze differentially expressed genes at the protein level by immunohistochemistry. Image credit: Human Protein Atlas.