Breast cancer, a molecularly heterogeneous disease, is classified into hormone receptor-positive luminal breast cancer (LBC), human epidermal growth factor receptor 2-positive breast cancer, and triple-negative breast cancer (TNBC). Precursor microRNAs (pre-miRNAs), which typically form hairpin structures 65 to 80 bases long, have been shown to play crucial roles in breast carcinogenesis. We hypothesized that these pre-miRNAs are captured by total RNA sequencing (RNA-seq) and developed a novel algorithm to profile pre-miRNAs from raw total RNA-seq data. A total of 907 breast cancer samples curated by the Malaysian Breast Cancer Genetic Study (MyBrCa) were profiled using this algorithm, and the resulting pre-miRNA profiles were compared with mature miRNA profiles obtained from The Cancer Genome Atlas (TCGA) dataset. We explored pre-miRNAs differentially expressed in TNBC relative to LBC and conducted downstream functional analyses of their target genes. A prognostic signature was built by LASSO-Cox regression on selected pre-miRNAs and validated internally in the MyBrCa dataset and externally in the TCGA dataset. We identified 10 common differentially expressed pre-miRNAs. Functional analyses of these pre-miRNAs captured certain aggressive TNBC behaviors. Importantly, a signature composed of 4 of these 10 pre-miRNAs was significantly prognostic for breast cancer patients in both the MyBrCa and TCGA cohorts, independent of conventional prognostic factors. In conclusion, this novel algorithm allows pre-miRNAs to be profiled from raw total RNA-seq data, and the resulting profiles can be cross-validated against mature miRNA profiles for cross-platform comparison. The findings of this study underscore the importance of pre-miRNAs in breast carcinogenesis and as prognostic factors.
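A signature fitted by LASSO-Cox regression, as described above, is commonly applied as a linear risk score: each retained feature's expression is multiplied by its fitted coefficient and the products are summed. The minimal Python sketch below illustrates that general idea only. The feature names, coefficient values, and the median cutoff are hypothetical placeholders, since the abstract reports neither the four pre-miRNAs nor their coefficients nor the stratification rule used.

```python
from statistics import median

# Hypothetical coefficients for a 4-feature signature. The names and values
# are placeholders for illustration and are NOT taken from the study.
SIGNATURE = {
    "pre_mir_A": 0.42,
    "pre_mir_B": -0.31,
    "pre_mir_C": 0.18,
    "pre_mir_D": -0.07,
}


def risk_score(expression, coefficients=SIGNATURE):
    """Return the linear predictor: sum of coefficient * expression."""
    return sum(coef * expression[name] for name, coef in coefficients.items())


def stratify(samples, coefficients=SIGNATURE):
    """Label each sample 'high' or 'low' risk relative to the cohort median.

    The median split is an assumed convention, not the study's stated rule.
    Samples scoring exactly at the median are labelled 'low'.
    """
    scores = {sid: risk_score(expr, coefficients) for sid, expr in samples.items()}
    cutoff = median(scores.values())
    return {sid: ("high" if s > cutoff else "low") for sid, s in scores.items()}
```

In practice, the high-risk and low-risk groups produced this way would then be compared by survival analysis, with conventional prognostic factors included as covariates to test independence.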