Developing a computational pipeline to identify cis-regulatory signatures in breast cancer using SLIC-CAGE data

Supervision team information: Computational Biology & Gene Regulation group, Norwegian Centre for Molecular Biosciences and Medicine (NCMBM)
UiO supervisor: Anthony Mathelier
IBV supervisor: Pierre Chymkovitch
Co-supervisors: Dina Aronsen, Katalin Ferenc, Roza Berhanu Lemma
E-mail address: anthony.mathelier@ncmbm.uio.no

 

Keywords: cis-regulatory signatures, SLIC-CAGE, promoters, enhancers, transcription factors, gene regulatory networks, breast cancer

Project description

Gene regulation is orchestrated by promoters and enhancers that shape transcriptional programs in health and disease. SLIC-CAGE provides base-pair resolution maps of transcription start sites (TSSs) and enhancer activity, offering unique opportunities to study regulatory mechanisms in large patient cohorts [1]. We recently profiled >200 breast cancer patient samples with SLIC-CAGE. This Master’s project aims to develop and implement a computational pipeline to uncover the cis-regulatory signatures that drive tumor heterogeneity. The candidate will preprocess the SLIC-CAGE data to quantify promoter and enhancer activities, apply clustering strategies (e.g., MOFA+ [2]) to stratify patients by cis-regulatory profiles, and build enhancer-mediated gene regulatory networks using GRaNIE [3] and SCENIC+ [4]. Motif enrichment analyses leveraging resources such as JASPAR [5] will be used to link transcription factors (TFs) to subtype-specific cis-regulatory programs. The pipeline will be implemented in Snakemake or Nextflow with containerized environments to ensure reproducibility and scalability. Applied to our breast cancer SLIC-CAGE cohort, this project will reveal promoter/enhancer usage patterns, subtype-specific regulatory programs, and candidate TF drivers with potential clinical implications. Beyond the immediate application, the pipeline will serve as a general framework for integrative cis-regulatory analyses of large-scale CAGE datasets.
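To give a flavor of the preprocessing step, the sketch below normalizes raw CAGE tag counts to tags per million (TPM) and keeps regions that are active in enough samples. It is only an illustration under stated assumptions: the project does not prescribe a normalization scheme, the thresholds are arbitrary, and the sample names, region names, counts, and function names are all hypothetical.

```python
def tags_per_million(counts):
    """Scale each sample's tag counts so that they sum to one million."""
    tpm = {}
    for sample, region_counts in counts.items():
        total = sum(region_counts.values())
        if total == 0:
            raise ValueError(f"sample {sample!r} has no mapped tags")
        tpm[sample] = {
            region: 1e6 * n / total for region, n in region_counts.items()
        }
    return tpm


def filter_active_regions(tpm, min_tpm=1.0, min_samples=2):
    """Return regions reaching min_tpm in at least min_samples samples.

    Both thresholds are illustrative assumptions, not project settings.
    """
    regions = set()
    for region_values in tpm.values():
        regions.update(region_values)
    return sorted(
        region
        for region in regions
        if sum(tpm[s].get(region, 0.0) >= min_tpm for s in tpm) >= min_samples
    )


# Hypothetical toy counts: tags per promoter/enhancer region and sample.
counts = {
    "patient_A": {"promoter_1": 900, "enhancer_1": 100, "promoter_2": 0},
    "patient_B": {"promoter_1": 500, "enhancer_1": 499, "promoter_2": 1},
}

tpm = tags_per_million(counts)
active = filter_active_regions(tpm, min_tpm=1.0, min_samples=2)
print(active)
```

In a real pipeline, a step like this would typically be one rule or process in the Snakemake or Nextflow workflow, and its output matrix would feed the clustering and network inference steps.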

References

1. Cvetesic N, et al. SLIC-CAGE: high-resolution transcription start site mapping using nanogram-levels of total RNA. Genome Res. 2018.

2. Argelaguet R, et al. MOFA+: a statistical framework for comprehensive integration of multi-modal single-cell data. Genome Biol. 2020.

3. Kamal A, Arnold C, et al. GRaNIE and GRaNPA: inference and evaluation of enhancer-mediated gene regulatory networks. Mol Syst Biol. 2023.

4. Bravo González-Blas C, et al. SCENIC+: single-cell multiomic inference of enhancers and gene regulatory networks. Nat Methods. 2023.

5. Rauluseviciute I, Puig RR, et al. JASPAR 2024: 20th anniversary of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 2024.

Published 3 Sep. 2025 11:08 - Last modified 3 Sep. 2025 11:08

Scope (credits)

60