Decoding Enhancer-Mediated Gene Regulation with CAGE-seq and Deep Learning

March 23, 14:00 – 15:30

Organizer

SciLifeLab Event
events@scilifelab.se

Venue

  • Air&Fire, SciLifeLab Stockholm
  • Tomtebodavägen 23A
    Solna, Sweden

March 23, 2026 @ 14:00 – 15:30 CET

Michiel de Hoon, RIKEN Center for Integrative Medical Sciences, Japan

Abstract

Precise regulation of gene expression is fundamental to cellular identity and behavior, and gene dysregulation is a common feature across a wide variety of diseases. At the molecular level, this regulation is orchestrated by transcription factors acting on non-coding regulatory elements at promoters and distal enhancers. Genome-wide association studies have shown that disease-associated variants are enriched in enhancer elements, and somatic mutations in these regulatory regions can contribute to diseases such as cancer. As it is often unclear how such variants influence gene expression, connecting enhancers to their target promoters is a key step toward understanding the regulatory basis of pathogenic gene expression.

Cap Analysis of Gene Expression (CAGE-seq), originally developed at RIKEN, provides an unbiased and quantitative genome-wide view of transcription initiation at single-nucleotide resolution. Beyond quantifying mRNAs and long non-coding RNAs, CAGE-seq revealed widespread transcription at enhancers, enabling their systematic identification and quantitative characterization. While traditional bioinformatics approaches have focused on genome annotation and expression analysis, emerging neural network frameworks now enable inference of enhancer–promoter interactions directly from sequencing data.

In this presentation, I will highlight the use of AI-based sequence-to-function frameworks that can predict transcriptional activity directly from genomic sequence. When combined with CAGE-seq, these approaches enable the construction of context-specific regulatory maps linking enhancer activity to gene expression. Using examples from leukemia, I will illustrate how CAGE-seq data, analyzed using deep learning, support a systematic interpretation of inter-patient heterogeneity in gene regulation.

Biography

I entered computational biology during my Ph.D. in Physics at the University of California, Berkeley. After postdoctoral training in bioinformatics at the University of Tokyo and Columbia University, I joined RIKEN in Yokohama, where I became a core contributor to the FANTOM consortium, an international effort to functionally annotate the mammalian genome.

Between 2014 and 2021, I led the computational biology program of FANTOM6, coordinating my research group and consortium-wide efforts to functionally annotate non-coding RNA. I am currently a tenured Senior Research Scientist at the RIKEN Center for Integrative Medical Sciences, investigating gene regulatory mechanisms in disease, particularly cancer.

More recently, my research has focused on developing and applying AI-based sequence-to-function models to infer long-range gene regulatory interactions from genome and transcriptome data, with the aim of understanding the regulatory basis of inter-patient heterogeneity in disease.

Host: Pelin Sahlen (pelin.akan@scilifelab.se)

Last updated: 2026-02-06

Content Responsible: Isolde Palombo (isolde.palombo@scilifelab.se)