Posts by Collection

Using statistical profiling to decipher hidden chromatin contacts resulting from repeated sequences

Published in HAL Thesis, 2024

Genomes are not static entities; they evolve and acquire various repeated elements. These repetitive elements are crucial yet challenging for NGS analysis. Hicberg is an algorithm that uses statistical inference to accurately map their reads in diverse sequencing data. Applied to yeast, Hicberg reveals novel interactions between retrotransposons and the enigmatic 2-micron plasmid, unveils the influence of retrotransposons on chromatin organization, and identifies other new genomic interactions, offering a more comprehensive understanding of genome structure and dynamics.

Recommended citation: Sébastien Gradit. Using statistical profiling to decipher hidden chromatin contacts resulting from repeated sequences. Genetics. Sorbonne Université, 2024. English. ⟨NNT : 2024SORUS494⟩. ⟨tel-05010244⟩
Download Paper

Hicberg: Reconstruction of genomic signals from repeated elements

Published in Pending, 2025

This paper introduces Hicberg, a new method to more accurately map the 3D organization of genomes by resolving ambiguities caused by repetitive DNA sequences in Hi-C data, revealing potentially hidden structural features in various organisms.

Download Paper

Statistical inference of repeated sequence contacts in Hi-C maps

Published: July 25, 2023

Increasingly detailed investigations of the spatial organization of genomes reveal that chromosome folding influences or regulates dynamic processes such as transcription, DNA repair and segregation. The Hi-C approach is commonly used to characterize genome architecture by quantifying the frequency of physical contacts between pairs of loci through high-throughput sequencing. Repeated sequences cause challenges during the alignment step of the analysis, because sequencing reads can be assigned to multiple plausible positions. These unknown parts of the genome architecture, which may contain biological information, remain hidden throughout downstream functional analysis. To overcome these limitations, we have developed HiC-BERG, a method combining statistical inference with input from DNA polymer behavior characteristics and features of the Hi-C protocol to assign repeated reads in a genome with robust confidence and “fill in” empty vectors in contact maps. HiC-BERG is intended to be applicable to different types of organisms. We will present the program and key validation tests, before applying it to unveil hidden parts of the genomes of E. coli, Saccharomyces cerevisiae and Plasmodium falciparum. HiC-BERG shows that repeated sequences may be involved in singular genomic architectures. Our method can provide an alternative visualization of genomic contacts under a wide variety of biological conditions, allowing a more complete view of genome plasticity.
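As a rough illustration of the "empty vectors" mentioned in this abstract, the minimal Python sketch below flags the bins of a contact map that received no contacts at all, which is what typically happens when reads from repeated regions are discarded. The function name `empty_bins` and the toy matrix are hypothetical and are not taken from HiC-BERG itself.

```python
def empty_bins(contact_map):
    """Return indices of bins whose row holds no contacts.

    Assumes a square, symmetric contact map given as a list of lists,
    so an empty row implies an empty column as well.
    """
    return [i for i, row in enumerate(contact_map) if sum(row) == 0]


# Toy 4-bin map: bin 2 has no contacts, as if it covered a repeated region
# whose reads were filtered out before the map was built.
toy_map = [
    [5, 2, 0, 1],
    [2, 6, 0, 3],
    [0, 0, 0, 0],
    [1, 3, 0, 4],
]

missing = empty_bins(toy_map)
print(missing)
```

Bins reported this way are the ones a reconstruction method would aim to fill in.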

Hicberg – Hi-C Biological Estimation of Repeated elements in Genomes

Published: April 22, 2024

During their evolution, the genomes of micro-organisms can acquire quantities of different repeated elements such as retrotransposons, duplicated genes or tandem repeats. Such sequences cannot be processed directly by NGS technologies, because these technologies generate short reads that cannot be located unambiguously on reference genomes. This information is filtered out by most current pipelines, leading to incomplete genomic tracks and a significant loss of information on biological functions, processes and genomic structures involving repeated elements. We developed Hicberg, an algorithm that uses statistical inference and pseudo-random generators to predict the positions of repeated sequence reads from different omics paired-end data (including Hi-C, MNase-seq or ChIP-seq). After developing the method and calibrating it on a test bench, we explored during this PhD project how it improves genomic data interpretability of various species, starting with microbial ones such as Saccharomyces cerevisiae. Reconstruction of Hi-C and ChIP-seq genomic tracks with Hicberg revealed how some retrotransposons in this model organism contribute to the positioning of cohesin, a molecular motor involved in the formation of chromatin loops. A new role for retrotransposon sequences as contact points for the elusive yeast 2-micron episomal molecule was also identified. Overall, these results underline the power of the approach to discover novel molecular relationships, and the interest in applying this tool more widely to larger genomes with greater quantities of repeats. The proposed method can therefore provide an alternative visualization of genomic signals in a wide variety of biological conditions and allow a more comprehensive view of genome organization and plasticity. Importantly, existing datasets can be revisited using the approach to unveil overlooked features.

Hicberg: Prediction of omics signals from repeated elements

Published: June 26, 2024

Repeated genomic elements (like retrotransposons) are poorly mapped by standard NGS pipelines, resulting in a significant loss of information regarding biological functions and genomic structures. To overcome this, we developed Hicberg, an algorithm that uses statistical inference trained on unambiguous genome regions to predict the positions of reads from these repeated sequences. Hicberg successfully reconstructs complete genomic tracks from various paired-end omics data, including Hi-C and ChIP-seq, significantly improving data completeness and interpretability.
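The general idea described above, learning statistics from unambiguous regions and using them to place ambiguous reads, can be sketched in a few lines of Python. This is only an illustrative sketch under strong assumptions, not Hicberg's actual implementation: it uses a single feature (the genomic distance between the two mates of a pair), and the function names `distance_weights` and `choose_position`, the bin size and the pseudo-count are all hypothetical.

```python
import random
from collections import Counter


def distance_weights(unambiguous_distances, bin_size=1000):
    """Empirical frequency of each distance bin among uniquely mapped pairs."""
    counts = Counter(d // bin_size for d in unambiguous_distances)
    total = sum(counts.values())
    return {b: c / total for b, c in counts.items()}


def choose_position(anchor, candidates, weights, rng, bin_size=1000, pseudo=1e-6):
    """Draw one candidate position for an ambiguous mate.

    Each candidate is scored by how often its distance to the uniquely
    mapped mate (anchor) was observed in unambiguous data. The pseudo-count
    keeps unseen distances possible, and the draw uses the supplied
    pseudo-random generator so results are reproducible for a given seed.
    """
    scores = [weights.get(abs(c - anchor) // bin_size, 0.0) + pseudo
              for c in candidates]
    return rng.choices(candidates, weights=scores, k=1)[0]


# Toy data: most uniquely mapped pairs are separated by less than 1 kb.
weights = distance_weights([100, 200, 300, 1500])

# One mate maps uniquely at 10,000; the other has two plausible positions.
rng = random.Random(42)
pick = choose_position(10_000, [10_200, 500_000], weights, rng)
print(pick)
```

Drawing a position at random in proportion to its score, instead of always taking the best-scoring candidate, avoids piling every ambiguous read onto a single copy of a repeat.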

Using statistical profiling to decipher hidden chromatin contacts resulting from repeated sequences

Published: November 26, 2024

During their evolution, the genomes of micro-organisms can acquire quantities of different repeated elements such as retrotransposons, duplicated genes or tandem repeats. Such sequences cannot be processed directly using high-throughput sequencing technologies, because these technologies generate short sequences that cannot be located unambiguously on reference genomes. This type of data is filtered out by most current standard pipelines, resulting in a significant loss of information on biological functions, processes and genomic structures involving repeated elements. We developed Hicberg, an algorithm that uses statistical inference and pseudo-random generators to predict the positions of repeated sequence reads from different omics paired-end data (including contact technologies like Hi-C, and nucleosome or protein positioning technologies like MNase-seq or ChIP-seq). After developing the method and calibrating it on a test bench, we explored during this PhD project how it improves genomic data interpretability of various species, starting with microbial ones such as Saccharomyces cerevisiae.

Sébastien Gradit, PhD

Posts by Collection

publications

Using statistical profiling to decipher hidden chromatin contacts resulting from repeated sequences

Hicberg: Reconstruction of genomic signals from repeated elements

softwares

talks

Statistical inference of repeated sequence contacts in Hi-C maps

Hicberg – Hi-C Biological Estimation of Repeated elements in Genomes

Hicberg: Prediction of omics signals from repeated elements

Using statistical profiling to decipher hidden chromatin contacts resulting from repeated sequences
