LncRNA SNHG8 is identified as a key regulator of acute myocardial infarction by RNA-seq analysis

Background Long noncoding RNAs (lncRNAs) are involved in numerous physiological functions. However, their mechanisms in acute myocardial infarction (AMI) are not well understood. Methods We performed an RNA-seq analysis to explore the molecular mechanism of AMI by constructing a lncRNA-miRNA-mRNA axis based on the ceRNA hypothesis. The target microRNA data were used to design a global AMI triple network. Thereafter, a functional enrichment analysis and clustering topological analyses were conducted using the triple network. The expression of lncRNA SNHG8, SOCS3 and ICAM1 was measured by qRT-PCR. The diagnostic values of lncRNA SNHG8, SOCS3 and ICAM1 were evaluated using a receiver operating characteristic (ROC) curve. Results An AMI lncRNA-miRNA-mRNA network was constructed that included two mRNAs, one miRNA and one lncRNA. After qRT-PCR validation of lncRNA SNHG8, SOCS3 and ICAM1 in the AMI and normal samples, only lncRNA SNHG8 showed significant diagnostic value. The ROC curve showed that SNHG8 presented an AUC of 0.850, while the AUC of SOCS3 was 0.633 and that of ICAM1 was 0.594. In a pairwise comparison, the AUC of SNHG8 differed significantly from those of the two mRNAs (P SNHG8-ICAM1 = 0.002; P SNHG8-SOCS3 = 0.031). The results of a functional enrichment analysis of the interacting genes and microRNAs showed that the shared lncRNA SNHG8 may be a new factor in AMI. Conclusions Our investigation of the lncRNA-miRNA-mRNA regulatory networks in AMI revealed a novel lncRNA, lncRNA SNHG8, as a risk factor for AMI and expanded our understanding of the mechanisms involved in the pathogenesis of AMI.


Background
Acute myocardial infarction (AMI), one of the leading causes of mortality and morbidity worldwide, is a manifestation of acute coronary syndrome (ACS) [1]. ACS is a group of clinical syndromes that includes unstable angina pectoris, non-ST-segment elevation AMI (NSTEMI), ST-segment elevation AMI (STEMI) and sudden death [2]. Many risk factors, such as age, sex, lifestyle, hypertension, diabetes, atherosclerosis, dyslipidemia and genetic factors, are significantly associated with AMI [3][4][5][6]. However, the exact molecular mechanisms underlying the pathophysiology of AMI have not been completely elucidated. Identifying hub molecules in AMI is therefore essential for developing effective treatment and prevention measures.
Several previous studies have determined that a large fraction of noncoding RNAs (ncRNAs) have a crucial role in the modulation of biological processes and have essential functions in disease development [7,8]. Based on transcript size, noncoding RNAs can be divided into two classes: (1) short ncRNAs (< 200 nt), which include transcription initiation RNAs, PIWI-interacting RNAs and microRNAs, and (2) long ncRNAs (lncRNAs) (> 200 nt), which include natural antisense transcripts, transcribed ultraconserved regions, long intergenic ncRNAs (lincRNAs) and enhancer-like ncRNAs [9,10]. In contrast to short ncRNAs, which are highly conserved and function mainly in posttranscriptional repression, lncRNAs are not well conserved, and their roles are variable [11].
Recently, increasing evidence has suggested that many ncRNAs participate in specific pathological and physiological processes of AMI [12]. ncRNAs are stable in plasma and other body fluids, and they can regulate target mRNA translation or promote mRNA degradation. Many studies have shown that both lncRNAs and miRNAs are closely associated with the development of AMI [13]. In this study, we constructed an AMI-related lncRNA-miRNA-mRNA network by analyzing RNA-seq data and established a global triple network via the Gene Expression Omnibus (GEO) repository to explore the potential molecular mechanisms of AMI. The specific workflow is shown in Fig. 1.

Gene expression profile probe reannotation
An RNA-seq profile dataset (GSE65705) [14], generated on the GPL11154 Illumina HiSeq 2000 (Homo sapiens) platform, was downloaded from the Gene Expression Omnibus database (https://www.ncbi.nlm.nih.gov/geo/). GSE65705 contains 32 platelet samples from AMI patients (16 STEMI and 16 NSTEMI) and 2 platelet samples from normal individuals. Expression was quantified as Reads Per Kilobase per Million mapped reads (RPKM) and quantile normalized using the robust multiarray average (RMA) method. The probes were then annotated using Bioconductor in R [15]. If one gene had more than one probe, the mean expression value across those probes was used for that gene.
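The two numerical steps in this paragraph, the RPKM calculation and the averaging of multiple entries per gene, can be sketched as follows. This is a minimal Python illustration, not the R/Bioconductor code used in the study; the function names and the toy numbers are assumptions introduced here.

```python
from collections import defaultdict


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase per Million mapped reads for one gene in one sample."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def collapse_by_gene(rows):
    """Average the expression vectors of all entries that map to the same gene.

    rows: iterable of (gene_symbol, [value per sample]) pairs (hypothetical layout).
    Returns a dict mapping gene_symbol to the per-sample mean vector.
    """
    grouped = defaultdict(list)
    for gene, values in rows:
        grouped[gene].append(values)
    return {
        gene: [sum(column) / len(column) for column in zip(*vectors)]
        for gene, vectors in grouped.items()
    }


# Toy example: two entries for one gene are averaged sample by sample.
toy_rows = [("GENE_A", [2.0, 4.0]), ("GENE_A", [4.0, 8.0]), ("GENE_B", [1.0, 1.0])]
collapsed = collapse_by_gene(toy_rows)
```

Averaging is only one way to resolve duplicates; it is used here because it is the rule stated in the text.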

Differentially expressed mRNAs (DEGs) and lncRNAs (DELs) and functional enrichment analysis
The limma package [16] in R was used to select the DEGs and DELs in the AMI samples compared with the normal samples. |log2 fold-change| ≥ 2 and adjusted P < 0.05 were set as the thresholds for DEGs and DELs. The clusterProfiler and DOSE packages in R [17] were used to perform the Gene Ontology (GO), Disease Ontology (DO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses for DEGs. In all of the analyses, an adjusted P-value (Q-value) of < 0.05 was regarded as statistically significant.
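The selection rule above can be sketched in a few lines of Python. The text does not say which multiple-testing correction produced the adjusted P-values, so the Benjamini-Hochberg procedure used below is an assumption, as are the function names and the record layout; the study itself relied on limma in R.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_differential(records, lfc_cutoff=2.0, alpha=0.05):
    """Split records into up- and downregulated names.

    records: list of (name, log2_fold_change, raw_p) tuples (hypothetical layout).
    A record passes if adjusted P < alpha and |log2 fold-change| >= lfc_cutoff.
    """
    adjusted = benjamini_hochberg([p for _, _, p in records])
    up, down = [], []
    for (name, lfc, _), q in zip(records, adjusted):
        if q < alpha and abs(lfc) >= lfc_cutoff:
            (up if lfc > 0 else down).append(name)
    return up, down


# Toy input: C fails the fold-change cutoff, D fails the adjusted P cutoff.
toy = [("A", 2.5, 0.001), ("B", -3.0, 0.002), ("C", 1.0, 0.001), ("D", 4.0, 0.9)]
up_genes, down_genes = select_differential(toy)
```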

Protein-protein interaction (PPI) network construction and module analysis
The STRING database (version 11) [18] provides information on predicted and experimental protein interactions. Neighborhoods, text mining, gene fusion, databases, cooccurrence and coexpression experiments are used as the prediction methods for the database. In addition, the interactions of protein pairs in the database are presented as a combined score. In the current study, all DEGs were mapped to PPIs. To explore the hub genes in the network, the cutoff value was set at a combined score > 0.9 [19]. The role of protein nodes in the network was described by degrees. Network modules are one of the cores of protein networks and may have specific biological implications. The Cytoscape software package (version 3.71) was used to analyze the major clustering modules, and the most notable clustering modules were examined with Molecular Complex Detection (MCODE) [20,21]. Subsequently, EASE ≤ 0.05 and a count ≥ 2 were set as the cutoff values and an MCODE score > 10 as the threshold for subsequent analyses.
Fig. 1 A flowchart of data analysis
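The PPI filtering step, keeping pairs with a combined score > 0.9 and describing each node by its degree, can be illustrated with a short Python sketch. The edge layout and function name are assumptions; scores are taken on a 0 to 1 scale here, whereas STRING's downloadable tables store the same score multiplied by 1000 (so 0.9 corresponds to 900).

```python
from collections import Counter


def high_confidence_degrees(edges, min_score=0.9):
    """Count the degree of every protein over edges with score > min_score.

    edges: iterable of (protein_a, protein_b, combined_score) with the score
    on a 0-1 scale (hypothetical layout). Duplicate and reversed pairs are
    counted once, and self-loops are ignored.
    """
    degree = Counter()
    seen = set()
    for a, b, score in edges:
        pair = frozenset((a, b))
        if score > min_score and a != b and pair not in seen:
            seen.add(pair)
            degree[a] += 1
            degree[b] += 1
    return degree


# Toy network with placeholder protein names.
toy_edges = [
    ("P1", "P2", 0.95),
    ("P2", "P1", 0.95),  # same pair reported in the other direction
    ("P1", "P3", 0.97),
    ("P3", "P4", 0.50),  # below the cutoff
]
degrees = high_confidence_degrees(toy_edges)
```

Nodes with the highest degree in such a table are the candidates usually referred to as hub genes.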

Prediction of miRNAs and construction of the ceRNA network
The relationships among lncRNA, miRNA, and mRNA are essential elements for the construction of the ceRNA network. We employed miRwalk and DIANA TOOLS [22,23] to predict the miRNAs that target the mRNAs. After the GO, DO, KEGG, PPI and MCODE analyses had been completed, several meaningful DEGs were picked and mapped to their targeting miRNAs. The lncRNA-miRNA interactions were predicted with the miRcode and starBase databases [24,25], using the same mapping method as for the miRNA-mRNA interactions. Cytoscape was used to construct and visualize the lncRNA-miRNA-mRNA ceRNA network.
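The merging logic can be sketched as follows: a lncRNA-miRNA-mRNA triple is formed whenever the same miRNA appears in both prediction sets. This Python sketch assumes the database predictions have already been exported into dictionaries keyed by miRNA; the function name, the data layout and the placeholder identifiers are illustrative and not part of the original workflow.

```python
def cerna_triples(mirna_to_mrnas, mirna_to_lncrnas):
    """Return sorted (lncRNA, miRNA, mRNA) triples that share a miRNA.

    mirna_to_mrnas:   dict mapping a miRNA to the set of mRNAs it is predicted to target
    mirna_to_lncrnas: dict mapping a miRNA to the set of lncRNAs predicted to bind it
    """
    triples = []
    shared = set(mirna_to_mrnas) & set(mirna_to_lncrnas)
    for mirna in sorted(shared):
        for lncrna in sorted(mirna_to_lncrnas[mirna]):
            for mrna in sorted(mirna_to_mrnas[mirna]):
                triples.append((lncrna, mirna, mrna))
    return triples


# Toy prediction sets with placeholder identifiers.
toy_mrna_side = {"miR-a": {"MRNA_2", "MRNA_1"}, "miR-b": {"MRNA_1"}}
toy_lncrna_side = {"miR-a": {"LNC_1"}, "miR-c": {"LNC_2"}}
network = cerna_triples(toy_mrna_side, toy_lncrna_side)
```

Each triple corresponds to two edges (lncRNA to miRNA, miRNA to mRNA) that can then be loaded into a network viewer such as Cytoscape.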

Study population
A total of 230 participants were recruited at the Liuzhou People's Hospital, with patients drawn from the inpatient treatment facility for chest complaints, from January 1 to December 31, 2018. This study included 115 healthy participants and 115 AMI patients; the sample size was sufficient to have adequate power. The patients in the AMI group received treatment with percutaneous coronary intervention (PCI). (1) The inclusion criteria were as follows [26]: diagnosis with reference to the 2018 diagnostic guidelines for AMI patients, that is, elevation of a cardiac biomarker (cTnT) above the 99th percentile upper limit of the reference value, accompanied by at least one of the following pieces of evidence of myocardial ischemia: an electrocardiogram revealing new ischemic changes; X-ray imaging evidence suggesting a new localized ventricular wall dysplasia or loss of viable myocardium. (2) The exclusion criteria were as follows: myocardial infarction complicated by other organ failure or serious lesions (such as lung cancer, liver and kidney failure, etc.), except for patients with allergic diseases and autoimmune diseases. The healthy control volunteers were selected by the following criteria: (1) Inclusion criteria: no history of cardiovascular disease, normal chest radiography, and normal liver and kidney function, excluding infections, tumors, etc. (2) Exclusion criteria: myocardial infarction (MI), use of thrombolytic drugs to treat cardiomyopathy, and cardiogenic shock. Clinical data were gathered for all volunteers and included baseline clinical features, angiography, and laboratory test results. For the AMI group, blood samples were the first samples obtained from patients after admission. The blood samples were collected before PCI and within a few hours of the onset of chest pain, which makes them suitable for the development of early diagnostic biomarkers.
Clinical data collection, biochemical measurements and diagnostic criteria followed previous studies [27,28]. The Declaration of Helsinki of 1975 (http://www.wma.net/en/30publications/10policies/b3/), which was revised in 2008, was followed, and the Ethics Committee of Liuzhou People's Hospital approved the study design (No: Lunshen-2017-KY; Mar. 7, 2017). Informed consent was obtained from all subjects after they had received a full explanation of the study.

RNA isolation, reverse transcription (RT) and quantitative PCR (qPCR)
A 5 ml venous blood sample was collected from each participant into an EDTA-coated tube. The blood was centrifuged at 3000 g for 15 min. A NanoDrop ND-1000 spectrophotometer (NanoDrop Thermo, Wilmington, DE) was used to examine RNA quantity and quality. Subsequently, cDNA was synthesized through reverse transcription of RNA by using a reverse transcriptase kit (TIANGEN; catalog number: KR211, China). The reaction mixture included 10 μL of miRNA RT reaction buffer, 2 μL of enzyme mixture, RNase-free water (up to 20 μL) and 2 μg of total RNA. The mixture was incubated at 42°C for 60 min, at 95°C for 3 min, and then held at 4°C. Our quantitative RT-PCR analysis included 1 lncRNA (SNHG8) and 2 genes (SOCS3 and ICAM1), and the location and amplification of primers are shown in Additional file 1: Table S1. The primers were designed using Primer 5.0 (Shanghai Sheng Gong, China). First-strand cDNA was synthesized by reverse transcribing 1 μg of total RNA using a cDNA transcription kit (Thermo) with Oligo (dT) primer or random primer. Quantitative RT-PCR was performed by applying SYBR Green PCR MasterMix (Applied Biosystems, USA) with the 7500 HT Fast Real-Time PCR system (Bio-Rad, USA). After the threshold cycle (Ct) value of each sample was calculated, quantitative expression results were obtained according to the 2^−ΔΔCt method. PCR was performed in 10 μl reaction volumes, which included 2 μl of cDNA, 5 μl of 2× Master Mix, 0.5 μl of Forward Primer (10 μM), 0.5 μl of Reverse Primer (10 μM) and 2 μl of double distilled water. The cycling program was 95°C for 15 min, followed by amplification cycles of 95°C for 28 s, 61°C for 30 s, and 72°C for 35 s. All reactions were performed in duplicate. GAPDH was used as the internal control [29].
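For readers unfamiliar with the 2^−ΔΔCt calculation, the arithmetic can be sketched as below. This is a minimal Python illustration with hypothetical function names and made-up Ct values; the reference gene plays the role that GAPDH plays in the protocol above, and duplicate wells are averaged before the calculation.

```python
def mean_ct(replicate_cts):
    """Average the Ct values of replicate wells for one sample."""
    return sum(replicate_cts) / len(replicate_cts)


def relative_expression(ct_target_case, ct_ref_case, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression of a target in a case sample versus a control sample.

    delta Ct       = Ct(target) - Ct(reference gene), computed per sample
    delta delta Ct = delta Ct(case) - delta Ct(control)
    fold change    = 2 ** (-delta delta Ct)
    """
    delta_case = ct_target_case - ct_ref_case
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl
    return 2.0 ** -(delta_case - delta_ctrl)


# Made-up Ct values: the target crosses the threshold two cycles earlier
# in the case sample (relative to the reference), giving a 4-fold increase.
fold = relative_expression(
    ct_target_case=mean_ct([24.1, 23.9]),
    ct_ref_case=mean_ct([18.0, 18.0]),
    ct_target_ctrl=mean_ct([26.0, 26.0]),
    ct_ref_ctrl=mean_ct([18.0, 18.0]),
)
```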

Statistical analysis
The statistical software package SPSS 22.0 (SPSS Inc., Chicago, IL, USA) was used in this study. A chi-square analysis was used to assess the differences in percentages between the groups. Quantitative variables were expressed as means ± standard deviation; TG levels were not normally distributed and are therefore shown as medians and interquartile ranges and were analyzed by the Wilcoxon-Mann-Whitney test. The AMI risk score was calculated for each patient as a linear combination of selected predictors weighted by their respective coefficients. The 'rms' package was used for the AMI prediction nomogram. The predictive accuracy of the risk model was assessed by discrimination, measured by the C-statistic, and by calibration, evaluated by the Hosmer-Lemeshow χ² statistic. To compare the plasma mRNAs and lncRNAs between the control and case groups, receiver operating characteristic (ROC) curve analysis was conducted. The diagnostic value of the mRNAs and lncRNAs was evaluated by the area under the curve (AUC). All tests were two-sided, and P < 0.05 was considered statistically significant.
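As an illustration of what the AUC measures, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen case has a higher marker value than a randomly chosen control, with ties counted as one half. This is a plain Python sketch with a hypothetical function name and toy values; it is not the SPSS procedure used in the study, and it does not perform the significance test for comparing two AUCs.

```python
def roc_auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison of cases and controls."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties contribute half a concordant pair
    return wins / (len(case_scores) * len(control_scores))


# Toy marker values: 8.5 of the 9 case-control pairs are concordant.
auc = roc_auc([3.0, 4.0, 5.0], [1.0, 2.0, 3.0])
```

An AUC of 0.5 corresponds to a marker that does not separate the groups, and 1.0 to perfect separation.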

Data preprocessing and identified differentially expressed genes
After quality control, we found that AMI samples 4, 11, 22 and 31 could not be normalized (Fig. 2a), so we removed them from the analysis (Fig. 2b). The remaining samples were well normalized (Fig. 2c). Subsequently, we eliminated many incorrect expression levels and identified a total of 3127 items with adjusted P < 0.05 when comparing the AMI and control samples, of which 762 were DEGs with |log2(fold change)| ≥ 2, including 488 upregulated and 274 downregulated DEGs. In addition, a total of 98 DELs with |log2(fold change)| ≥ 2 were identified, including 55 upregulated and 43 downregulated DELs. The heatmap and volcano plot are shown in Fig. 3.

Functional annotation, PPI network construction and identification of hub genes
We used the clusterProfiler package in R to carry out the KEGG pathway enrichment, DO functional and GO analyses to elucidate the role of DEGs (Fig. 4). In the analysis of GO functions, approximately 329 biological processes, 104 cellular components, and 45 molecular functions were identified with an adjusted P < 0.05. Table 1 shows the top 10 terms. Approximately 21 pathways were enriched in the KEGG pathway analysis and 20 DO terms were identified for the screened DEGs at adjusted P < 0.05 (false discovery rate, FDR set at < 0.05). Table 2 shows the top 15 items.
To generate a PPI network for these DEGs, data analysis was performed using the STRING database, which revealed 7544 protein pairs and 591 nodes with a combined score > 0.9. Figure 5a shows the network analysis in Cytoscape. Using the Molecular Complex Detection (MCODE) application, three modules with a score > 10 were detected and are presented in Fig. 5b-d. These three modules included a total of 300 genes. Finally, after a comprehensive analysis of the GO, DO, and KEGG data, we selected 2 DEGs related to the onset of AMI that also showed a high degree in the network and in the submodule analysis. These two genes were suppressor of cytokine signaling 3 (SOCS3) and intercellular adhesion molecule 1 (ICAM1), and both were located in module 2 (Fig. 5c).

Construction of the lncRNA-miRNA-mRNA regulatory network
First, considering the interaction between mRNAs (SOCS3 and ICAM1) and miRNAs, the miRwalk databases and DIANA TOOLS were searched for miRNA-mRNA interactions. Subsequently, we predicted lncRNAs (DELs) that can bind with miRNAs to design the lncRNA-miRNA regulatory network using the starBase (v3.0) and miRcode databases. Then, we matched the predicted miRNAs to build a network and found that lncRNA SNHG8, hsa-miR-411-5p, SOCS3 and ICAM1 were hub items in this triple regulatory network. Ultimately, a lncRNA-miRNA-mRNA network was formed by merging the two sets of data and was visualized by Cytoscape (version 3.71) (Fig. 5e).

The validation of expression profiles
After a comprehensive analysis, lncRNA SNHG8, SOCS3 and ICAM1 were validated in our samples, which included 115 controls and 115 patients who suffered from AMI. The demographic results are shown in Table 3. After validation, we found that the relative expression of lncRNA SNHG8, SOCS3 and ICAM1 was significantly increased in AMI compared with the normal controls (Fig. 6a-c). This result matched the trend obtained from the RNA-seq data.

Expression level biomarker sensitivity for AMI
Considering the above-mentioned observations, we further assessed these two genes (SOCS3 and ICAM1) and lncRNA SNHG8 as potential biomarkers for AMI. ROC curve analysis showed that lncRNA SNHG8 presented an AUC of 0.85, while the AUC of ICAM1 was 0.594 and that of SOCS3 was 0.633. In the pairwise comparison, the AUC of lncRNA SNHG8 differed significantly from those of ICAM1 and SOCS3 (P SNHG8-ICAM1 = 0.002; P SNHG8-SOCS3 = 0.031, Fig. 6d).

Nomogram prediction model development to estimate individual AMI probability
We selected gender, age, smoking, drinking, BMI, SBP, DBP, serum glucose, TC, TG, HDL-C, LDL-C, ApoA1, ApoB, heartbeat, creatinine, uric acid, troponin T, CK, and CKMB, together with the relative expression of lncRNA SNHG8, SOCS3, and ICAM1, as the best subset of risk factors to develop an AMI risk score and risk model (nomogram) (Fig. 7). In this analysis, male was coded as 1 and female as 2; for smoking and drinking, yes was coded as 2 and no as 1. The nomogram had excellent discriminative power based on the C-statistic and was well calibrated according to the Hosmer-Lemeshow χ² statistic. The predicted probabilities of developing AMI ranged from 0.00002 to 99.9%. In the model, lncRNA SNHG8, ApoB, ApoA1, LDL-C, serum glucose and smoking were statistically significantly related to the risk of AMI.
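The link between the risk score and the predicted probability can be sketched as follows, assuming a standard logistic regression model (the Fig. 7 legend states that the nomogram is based on logistic regression). The coefficients, intercept and predictor names below are placeholders invented for illustration; they are not the fitted values from this study.

```python
import math


def risk_score(intercept, coefficients, predictors):
    """Linear predictor: intercept plus the coefficient-weighted predictor values."""
    return intercept + sum(
        coefficients[name] * predictors[name] for name in coefficients
    )


def predicted_probability(score):
    """Logistic transform of the risk score, written to avoid overflow."""
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    z = math.exp(score)
    return z / (1.0 + z)


# Placeholder model: coding follows the text (male = 1, female = 2; no = 1, yes = 2).
toy_coefficients = {"sex": 0.5, "smoking": 0.25, "marker": 2.0}
toy_patient = {"sex": 1, "smoking": 2, "marker": 0.5}
score = risk_score(-2.0, toy_coefficients, toy_patient)  # -2.0 + 0.5 + 0.5 + 1.0
probability = predicted_probability(score)
```

A nomogram presents the same calculation graphically: each predictor's contribution is rescaled to points, and the total points map to the probability on the bottom axis.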

Discussion
Cardiovascular disease is currently the leading cause of death. The number of cardiovascular patients, especially those with acute myocardial infarction, will continue to grow steadily in the next 10 years [30]. Traditional prevention of myocardial infarction includes anti-platelet and lipid regulation. Unfortunately, statins are frequently not available for several reasons. Nutraceuticals and functional food ingredients that are beneficial to vascular health may represent useful compounds that are able to reduce the overall cardiovascular risk induced by dyslipidemia by acting parallel to statins or as adjuvants in case of failure or in situations where statins cannot be used [31]. In-depth study of the occurrence and development mechanism of AMI and a search for reliable predictive biomarkers will have a great impact on the treatment and prevention of AMI. In recent years, with the continuous progress of technology, high-throughput sequencing technology has been widely used to analyze the causes of diseases and to find reliable predictive biomarkers.
[Figure legend fragment] A degree was used to describe the importance of protein nodes in the network, with a dark color filling denoting a high degree and light color a low degree.
Recently, an increasing number of studies have demonstrated that lncRNAs participate in fundamental cellular processes, such as gene transcription, post-transcriptional gene regulation, RNA processing and chromatin modification [32]. In addition, lncRNAs can also act as competing endogenous RNAs (ceRNAs) by sponging specific miRNAs to release their target mRNAs [33]. Moreover, many lncRNAs have been found to be modulators in the progression of cardiovascular diseases (CVDs), such as cardiovascular aging, myocardial infarction and cardiac hypertrophy [34,35], and several previous studies have reported that lncRNAs can regulate biological processes that are associated with myocardial infarction [36]. In our current study, we found that the relative expression of lncRNA SNHG8 was upregulated in AMI. SNHG8, located on 4q26, is thought to encode small nucleolar RNAs (snoRNAs) that function as targets of transcription factor FoxM1 in the regulation of muscle satellite cell proliferation and survival [37]. SNHG8 may be related to the regulation of myocardial muscle cell necrosis after acute myocardial ischemia.
MicroRNAs (miRNAs), which are conserved 19-25 nucleotide noncoding RNAs that regulate gene expression at the post-transcriptional level, have become a focus of translational research [38]. Recently, several studies have found that miR-411 is associated with metabolic diseases and atherosclerosis-related diseases. Zhao et al. found that hsa-miR-411-5p was associated with high-fat diet-induced hepatic insulin resistance in mice [39]. Stather et al. demonstrated that hsa-miR-411 was related to peripheral arterial disease and atherosclerosis [40]. According to database prediction, miR-411-5p may bind to SOCS3 and ICAM1. These two genes have been confirmed to be associated with myocardial infarction [41,42].
[Table footnote] HDL-C high-density lipoprotein cholesterol, LDL-C low-density lipoprotein cholesterol, Apo Apolipoprotein, CK creatine kinase, CKMB creatine kinase-myocardial band. a Mean ± SD determined by t-test. b Because it was not normally distributed, the value of triglyceride content was presented as median (interquartile range), and the difference between the two groups was determined by the Wilcoxon-Mann-Whitney test.
Recently, lncRNA-miRNA-mRNA axes have been shown to be unique regulatory mechanisms that are closely related to cardiovascular diseases (CVDs). For instance, the lncRNA RP5-833A20.1 sponges miR-382-5p, which targets nuclear factor IA (NFIA); through this axis, NFIA regulates cholesterol homeostasis in the body and influences the progression of atherosclerosis [43]. These findings suggest that lncRNAs could become candidate clinical diagnostic and prognostic markers, providing new therapeutic targets for CVDs and future insights into the prevention and treatment of other diseases. In our current study, we also found that the relative expression of lncRNA SNHG8 was significantly elevated in AMI and could be used as a promising biomarker for the diagnosis and treatment of AMI. The specific mechanism may be that lncRNA SNHG8 regulates SOCS3 or ICAM1 expression by sponging hsa-miR-411-5p in AMI.
This study has limitations. First, the patients enrolled to validate the relative expression were from only one hospital, and the sample size was relatively small. Whether there are differences among patients from different regions and ethnic groups is not known. Therefore, the validity of the results should be tested further in additional prospective cohorts. Second, the specific mechanism by which the lncRNA-miRNA-mRNA axis regulates the pathogenesis of AMI has not been validated in vivo or in vitro.

Conclusion
In conclusion, we explored the molecular mechanism of AMI by constructing the lncRNA-miRNA-mRNA axis based on the ceRNA hypothesis. After analyzing RNA-seq data, we combined differentially expressed mRNAs and lncRNAs. After functional analysis and predictive ncRNA network construction, we identified the lncRNA SNHG8-miR-411-5p-SOCS3/ICAM1 regulatory network. LncRNA SNHG8 may regulate SOCS3 or ICAM1 expression by sponging hsa-miR-411-5p in AMI and may serve as a diagnostic or prognostic biomarker of AMI.

Additional file
Additional file 1: Table S1. The primers used in qPCR of the lncRNA and mRNAs.
Fig. 7 Nomogram to estimate individual AMI probability. The nomogram was constructed based on a logistic regression model for the outcome of definite AMI. Each predictor variable characteristic had a corresponding point value based on its position on the top point scale and contribution to the model. The probability of AMI for each subject was calculated by adding the points for each variable to obtain a total point value that corresponded to a probability of AMI from the scale presented on the bottom line. *P < 0.05, **P < 0.01