Abstract:
Non-canonical proteins are translation products that arise from genomic regions historically classified as “non-coding” or situated within the so-called “dark genome”. They also encompass proteins or variants produced through atypical translational mechanisms within classical protein-coding genes, such as alternative open reading frames (aORFs), upstream open reading frames (uORFs), non-AUG initiation, and ribosomal frameshifting. For a long time, these translation products were largely overlooked because they tend to be short, expressed at low abundance, and lacking in recognizable conserved domains, and therefore evade detection by standard gene-prediction algorithms. Over the past decade, methodological advances have dramatically altered this view. High-resolution ribosome profiling, long-read and full-length translatome sequencing, single-cell transcriptomic and translatomic approaches, improvements in high-sensitivity mass spectrometry and de novo peptide sequencing, and integrated proteogenomic pipelines have collectively enabled robust identification and confident peptide assignment for numerous dark-proteome members and non-canonical isoforms. Experimental validation, ranging from targeted proteomics and immunopeptidomics to functional perturbation assays, now supports the biological existence and functional relevance of many such peptides. A growing body of evidence indicates that non-canonical proteins are not mere translational noise but can exert substantial biological effects: they rewire cellular metabolism, modulate canonical signalling cascades, influence proteostasis and stress responses, remodel the tumour immune microenvironment, and contribute to resistance against chemotherapy, radiotherapy and targeted agents. Importantly, many non-canonical products exhibit tumour-restricted or tumour-enriched expression patterns and include sequence elements absent from normal tissues.
These properties make them compelling candidates for tumour-specific biomarkers, neoantigens for personalized immunotherapy, and novel molecular targets for precision interventions. Despite this promise, major challenges remain for their routine clinical translation. These include heterogeneous and incomplete annotation criteria, variable detection sensitivity and reproducibility across platforms, high false-positive rates in peptide identification, a lack of scalable functional assays to assign mechanistic roles, and unresolved questions about the immunogenicity of these molecules and the safety of targeting them. To move the field forward, we advocate coordinated efforts to establish community standards for annotation and reporting, to integrate complementary multi-omics datasets, to develop high-throughput functional screening pipelines, and to design early-phase clinical studies prioritizing safety and translational feasibility. This review synthesizes the technological milestones and current mechanistic insights into non-canonical proteins in cancer, evaluates their translational potential as biomarkers and therapeutic targets, and outlines strategic priorities to accelerate their responsible integration into precision oncology practice.