The research area spans a wide range of biological problems, organisms, and data types: from genome variation to chromatin architecture, gene expression, and metabolic evolution; from unicellular organisms to humans; and from the genome to the metabolome. The main focus is on in-depth bioinformatics analysis of various types of ‘omics data generated by wet-lab collaborators, and on the integration of biological data sets to uncover mechanisms of cell and organ function.
Chromatin organization changes between life stages of the soil-living amoeba Dictyostelium discoideum. Recent advances enabled by the Hi-C technique have unraveled many principles of chromosomal folding that have subsequently been linked to disease and gene regulation. However, we still know remarkably little about chromatin architecture in organisms other than mammals and fruit flies. To explore how chromosomal folding changes during the life cycle of Dictyostelium discoideum, we performed Hi-C in two biological replicates at 0, 2, 5, and 8 h, and constructed high-resolution interaction maps that revealed the presence of loops. Loops often form regular patterns and do not change their position in the genome between life cycle stages. Interestingly, the orientation of genes near loops is not random: housekeeping genes clearly tend to have convergent orientation, while differentially expressed genes show a weaker tendency. Moreover, genes inside loops have higher expression than genes outside loops. Thus, gene orientation and expression are important for loop formation, but the exact mechanism remains to be determined. In addition, we explored how different groups of histone modifications are distributed and whether the active marks follow the transcription pattern. Further, we studied elongated loops observed on Hi-C maps. We developed an algorithm for their annotation and showed that their transcription is highly symmetrical for loops elongated to the right and to the left, while histone marks of active transcription do not follow this pattern. Moreover, these transcription patterns cannot be explained by gene density or gene size. This is a joint project with the Institute of Gene Biology, Moscow.
Histone acetylation level regulates the formation of TADs. The Hi-C technique revealed that chromosomes of mammals and fruit flies are organized into spatially compact Topologically Associating Domains (TADs). In fruit flies, the mechanism of TAD formation is not yet clear. In this project, we test the hypothesis that TAD self-assembly is based on the ability of nucleosomes from inactive chromatin to aggregate, and on the lack of this ability in acetylated nucleosomal arrays. We analyzed Hi-C and ChIP-seq data (with antibodies against pan-acetylated histone H3) from control D. melanogaster late embryonic (Schneider-2) cells, as well as from HDAC1-depleted cells and cells treated with the histone acetyltransferase inhibitor curcumin or the histone deacetylase inhibitor trichostatin A. We studied changes in acetylation level in relation to differences in TAD position and density. Inhibition of HDAC1 led to an increase in acetylation level in inter-TAD regions and to coordinated changes in TAD structure. Thus, histone acetylation plays a key role in the mechanism of TAD formation in Drosophila. This is a joint project with the Institute of Gene Biology, Moscow.
Neural network applications for chromatin 3D structure analysis. The Hi-C method allows one to analyze the three-dimensional structure of chromosomes with high accuracy; however, it has technical drawbacks. One of them is the presence of regions with missing values in the resulting Hi-C maps, which complicates downstream analysis. We developed a procedure that infers missing values from the surrounding map segments using neural networks as the predictive model. Training and test sets were formed from Drosophila embryo Hi-C data, excluding empty or low-quality map regions. We trained and tested three predictive models: a shallow neural network with two features (upper and lower pixels), a medium-sized neural network with eight features (diagonal pixels), and a deep neural network with twenty features (the surrounding square of pixels). We calculated RMSE and R^2 values and reconstructed Hi-C map regions using these three models. The deep neural network with twenty features demonstrated the best performance.
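The evaluation step described above can be illustrated with a minimal sketch. It assumes a toy contact matrix and uses the two-feature setup (upper and lower pixels); a plain average of the two neighbors stands in for the trained neural network, which is not reproduced here. The names `neighbor_features`, `rmse`, `r_squared`, and `toy_map` are hypothetical and chosen only for illustration.

```python
import math

def neighbor_features(hic, i, j):
    """Return the (upper, lower) pixels around hic[i][j]: the two-feature setup."""
    return hic[i - 1][j], hic[i + 1][j]

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted pixel values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy symmetric contact map (assumption: real maps are far larger and normalized).
toy_map = [
    [9.0, 6.0, 3.0, 1.0],
    [6.0, 9.0, 6.0, 3.0],
    [3.0, 6.0, 9.0, 6.0],
    [1.0, 3.0, 6.0, 9.0],
]

# Pretend every pixel in the interior rows is missing and predict it from its neighbors.
targets, predictions = [], []
for i in range(1, len(toy_map) - 1):
    for j in range(len(toy_map[i])):
        upper, lower = neighbor_features(toy_map, i, j)
        targets.append(toy_map[i][j])
        # Placeholder model (assumption): neighbor average instead of a neural network.
        predictions.append((upper + lower) / 2.0)

print("RMSE:", rmse(targets, predictions))
print("R^2:", r_squared(targets, predictions))
```

Any of the three models could be dropped in where the placeholder average is computed; the two metrics are then compared across models on the same held-out pixels.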
Investigation of the role of SIRT6 in the molecular mechanisms of gene expression regulation, metabolism, and aging. Sirtuins are a family of proteins that have mono-ADP-ribosyltransferase or NAD+-dependent deacetylase activity and modulate multiple biological processes. Among them, SIRT6 has one of the strongest links to healthy aging and protection from neurodegenerative diseases. It was previously shown to play an important role in DNA repair, telomere maintenance, gene expression regulation, and metabolism. Although the importance of SIRT6 for brain aging is clear, the exact molecular mechanisms of its function are still poorly understood. In our study, we use multilayer bioinformatic analysis to examine SIRT6-induced changes from gene expression to metabolism and chromatin architecture. We analyzed transcriptomic, lipidomic, and metabolomic profiles of brain-specific SIRT6-KO mice versus wild-type mice. Preliminary analysis of mass spectrometry and transcriptomic data from SIRT6-KO mouse brains revealed differentially expressed genes and differentially accumulated metabolites and lipids in SIRT6-KO mice compared to the control group. This is a joint project with the Ben-Gurion University of the Negev, Israel.
De novo transcriptome assembly of the Siberian wood frog Rana amurensis. Rana amurensis is widespread in eastern Asia and has attracted great scientific interest due to its unusual tolerance to hypoxia. However, neither a reference genome nor a transcriptome has been described for this species. Thus, we built the first version of the Rana amurensis transcriptome assembly using RNA-seq reads from brain (normoxia and hypoxia, 2 replicates) and heart (normoxia, 1 replicate) samples. We also provided functional annotation for 62,235 transcripts that clustered into 19,094 genes. In the future, we will improve the assembly quality by increasing the number of replicates. This is a joint project with the Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences, Russia.
Chromatin architecture changes in mental disorders. Mental disorders directly or indirectly affect a significant proportion of the world’s population. Despite their high social significance, effective therapies have not yet been found, and the mechanisms by which these diseases develop are not fully understood. Most of the genetic variants associated with the risk of developing mental disorders are located in the non-coding part of the genome. These variants are thought to act indirectly through gene regulation, so interpreting them requires an understanding of the regulatory landscape in the human brain. Previous research highlights the important role of chromosome architecture in the development of mental disorders. However, all of these studies used healthy brain samples as the source of chromatin conformation data. Our study will be the first to directly compare chromatin architecture between brain samples from individuals with mental disorders and control brain samples. A further challenge is that cellular heterogeneity complicates the description of gene regulation in the human brain: neurons and glia show different profiles of gene expression and chromatin accessibility. Therefore, analysis of chromatin conformation at cell-type resolution is necessary to capture how chromatin structure affects expression profiles in cells. We will use fluorescence-activated cell sorting (FACS) to separate neurons (NeuN+ cells) and glia (NeuN- cells), the two main types of cells in the brain, and determine the chromosome conformation separately for each cell type in mental disorder and control brain samples. Comparing chromatin conformation data with RNA-seq and ATAC-seq data from the same brain samples and cell types will give a more complete picture of changes in the regulatory landscape in mental disorders, combining information on transcription, chromatin accessibility, and chromosome conformation.
Three-dimensional genome organization in the sea sponge Halisarca dujardini. The Hi-C technique also makes it possible to study the spatial organization of the genomes of non-model organisms, which is important for tracing how chromatin structure became progressively more complex in the course of evolution. Species of the phylum Porifera (sponges) are among the most ancient multicellular organisms that exist today. Another important aspect of sponge biology is the ability to reaggregate, that is, to regenerate the whole body of a sponge from a cell suspension after mechanical or other types of dissociation. Such a process, accompanied by significant changes in gene expression, is also expected to involve significant changes in the three-dimensional structure of chromatin, since numerous previous studies indicate a strong relationship between gene expression and chromatin folding in all well-studied organisms. Studying the spatial organization of the sponge genome will extend existing ideas about the evolutionary diversity of chromatin architectures, their functional significance, and the mechanisms of their formation at the early stages of evolution during the transition to multicellularity. Integrating data on the dynamics of these changes with transcriptome data will add to our knowledge of cell differentiation and help explore the relationship between the spatial organization of chromatin and the regulation of gene expression. This is a joint project with the Koltzov Institute of Developmental Biology, Moscow.
Close to completion:
Chromatin structure changes during Drosophila spermatogenesis. Spermatogenesis is accompanied by dramatic changes in gene expression. The spatial organization of chromatin can affect gene expression, but the extent of chromatin structure changes during the development of sperm cells remains unclear. To investigate links between chromatin architecture and gene expression changes during spermatogenesis, we analyzed RNA-seq data and high-resolution Hi-C interaction maps of Drosophila testis at two stages of spermatogenesis, spermatogonia and spermatocytes. The study covered chromatin interactions at different scales, from interactions of single genes to chromatin compartments. Preliminary results show a positive correlation (Spearman’s R = 0.27) between chromatin compartmentalization and expression changes: the transition from an inactive to an active compartment is associated with an increase in total expression level. Moreover, genes that are active in spermatocytes tend to have less compact chromatin around their TSSs (p < 10^-9), in line with the current understanding of the interplay between chromatin folding and transcription. This is a joint project with the Institute of Gene Biology, Moscow.
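As a reminder of what the Spearman coefficient reported above measures, here is a minimal sketch: it is the Pearson correlation computed on ranks, with tied values sharing their average rank. The per-gene values below are invented toy numbers, not project data, and the variable names are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy example (assumption): change in a compartment score and in log expression per gene.
compartment_shift = [-0.8, -0.3, 0.1, 0.4, 0.9]
expression_change = [-1.2, 0.2, -0.1, 0.8, 1.5]
print("Spearman correlation:", spearman(compartment_shift, expression_change))
```

Because only ranks enter the calculation, the coefficient is insensitive to the scale of either quantity and to monotonic transformations such as taking logarithms.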
A comprehensive map of the human brain lipidome and its evolution. The lipid composition of brain anatomical structures remains poorly understood, particularly in humans and closely related non-human primates. We describe the generation and analysis of a lipidome atlas of the adult human brain, comprising large-scale mass spectrometry-based lipidome profiling of 75 anatomically precise subdivisions in four individuals. To explore the evolution of the human brain lipidome, we additionally produce lipidome atlases of adult chimpanzee, bonobo, and macaque brains in three individuals per species, as well as transcriptome atlases for the same 597 samples of these primate species and humans. Lipidome profiles show striking anatomical specificity in non-cortical brain regions, while in the neocortex lipidome composition is strongly associated with the functional networks of the brain structures. By contrast, at the transcriptome level, anatomical specificity of gene expression prevails in both non-cortical and cortical structures. Prefrontal cortical regions, together with the white matter, show the highest human specificity of lipid intensities, supported by observations at the gene expression level.
Single-cell resolution analysis of transcription and chromatin accessibility in threespine sticklebacks. Marine and freshwater sticklebacks are an excellent model for studying the epigenetic components of the phenotypic plasticity that allows fish to inhabit waters of different salinity. We applied single-cell RNA sequencing (scRNA-seq), single-cell assay for transposase-accessible chromatin sequencing (scATAC-seq), and whole-genome bisulfite sequencing to characterize intercellular variability in transcription, the abundance of open chromatin regions, and CpG methylation level in the gills of marine and freshwater sticklebacks. We found little difference in overall transcriptional variance between the morphs. However, genetic divergence islands (DIs) coincided with regions of increased methylation entropy in freshwater fish. Moreover, analysis of transcription factor binding sites within DIs revealed that CTCF motifs around marker SNPs were significantly enriched within these regions. Thus, our data underline the role of epigenetics in the adaptation of marine sticklebacks to freshwater. This is a joint project with the Institute of Bioengineering, Moscow.
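The text above does not say how methylation entropy was computed. As an illustration only, the sketch below uses one common formulation: the Shannon entropy of read-level methylation patterns observed in a window of CpGs, divided by the number of CpGs so that the value lies between 0 and 1. The function name and the toy reads are hypothetical.

```python
import math
from collections import Counter

def methylation_entropy(patterns):
    """Normalized Shannon entropy of read-level methylation patterns.

    Each pattern is a string over {'0', '1'} giving the methylation state of
    the same CpGs on one read (assumption: all reads cover the same window).
    Returns 0 when all reads agree and 1 when all patterns are equally frequent.
    """
    n_cpg = len(patterns[0])
    if any(len(p) != n_cpg for p in patterns):
        raise ValueError("all patterns must cover the same number of CpGs")
    counts = Counter(patterns)
    total = len(patterns)
    entropy_bits = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return entropy_bits / n_cpg

# Toy reads over a window of four CpGs (invented values, not project data).
ordered_region = ["1111"] * 8
mixed_region = ["1111", "0000"] * 4
print("ordered:", methylation_entropy(ordered_region))
print("mixed:", methylation_entropy(mixed_region))
```

Under this definition, a region where cells carry many different methylation patterns scores higher than one where all reads agree, even if the average methylation level is the same.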
optimalTAD: a novel algorithm predicting an optimal set of topologically associating domains. Topologically Associating Domains (TADs) are fundamental features of chromatin folding, recently discovered thanks to the development of the Hi-C method. However, computational identification of domain boundaries remains a nontrivial problem, because the nested structure of TADs produces non-unique TAD sets at multiple length scales. To address this problem, we propose a novel algorithm for finding the optimal set of TADs based on a combination of Hi-C and epigenetic data. Our algorithm maximizes the difference in median histone acetylation level between TADs and inter-TADs to optimize the resolution parameter ‘gamma’ in the Armatus tool, which is frequently used for TAD calling. We have successfully validated our algorithm on publicly available Drosophila melanogaster datasets and plan to publish the paper in the near future.
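The selection criterion described above can be sketched as follows. This is not the optimalTAD implementation: it assumes that TAD sets have already been called at several gamma values (for example with Armatus, which is not invoked here) and that acetylation is given as one value per Hi-C bin. The sign convention (inter-TAD median minus TAD median) and the names `acetylation_contrast`, `choose_gamma`, and `tad_sets` are assumptions made for illustration.

```python
from statistics import median

def acetylation_contrast(tads, acetylation):
    """Median acetylation of inter-TAD bins minus median acetylation of TAD bins.

    `tads` is a list of half-open (start_bin, end_bin) intervals;
    `acetylation` holds one signal value per bin.
    """
    in_tad = [False] * len(acetylation)
    for start, end in tads:
        for b in range(start, end):
            in_tad[b] = True
    tad_values = [v for v, inside in zip(acetylation, in_tad) if inside]
    inter_values = [v for v, inside in zip(acetylation, in_tad) if not inside]
    if not tad_values or not inter_values:
        # A segmentation with no TADs or no inter-TADs cannot be scored.
        return float("-inf")
    return median(inter_values) - median(tad_values)

def choose_gamma(tad_sets, acetylation):
    """Return the gamma whose TAD set gives the largest acetylation contrast."""
    return max(tad_sets, key=lambda g: acetylation_contrast(tad_sets[g], acetylation))

# Toy data (invented): acetylation per bin and TAD calls at three gamma values.
acetylation = [0.1, 0.2, 0.1, 0.9, 0.8, 0.2, 0.1, 0.1, 0.9, 0.2]
tad_sets = {
    0.5: [(0, 10)],                  # one large domain, no inter-TAD bins
    1.0: [(0, 3), (5, 8)],           # boundaries fall on acetylated bins
    2.0: [(0, 2), (3, 5), (6, 8)],   # over-fragmented domains
}
best_gamma = choose_gamma(tad_sets, acetylation)
print("selected gamma:", best_gamma)
```

Using medians rather than means keeps the score robust to a few bins with extreme signal, which matters when domains are short and contain only a handful of bins.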
Evolution of the human brain transcriptome at single-cell resolution. While our understanding of human brain evolution is advancing, our knowledge of expression differences unique to particular brain areas and cell types is still incomplete. In this project, we present an analysis of gene expression differences between humans and age-matched chimpanzees, bonobos, and rhesus monkeys conducted in 33 brain regions using conventional RNA sequencing. For three of these regions, we further analyzed uniquely human expression differences at the single-cell level, generating data from more than 100,000 cell nuclei. We show that gene expression evolves rapidly within cell types, with more than two-thirds of cell-type-specific differences not detected using conventional RNA sequencing. Neurons tend to evolve faster in all hominids, but astrocytes show more differences in the human lineage, including alterations of spatial distribution across neocortical layers. Integration of human-specific differences across the 33 brain regions further reveals co-evolving, anatomically distributed regional modules that coincide with functional networks defined by functional brain imaging.
Supervisor: Ekaterina Khrameeva
Team: Irina Zhegalova, Anna Kononkova, Anastasiia Golova, Artemiy Golden, Alexander Cherkasov
PhD students: Dmitrii Smirnov, Victoria Kobets
Metabolome signature of autism in the human prefrontal cortex. Kurochkin I, Khrameeva E, Tkachev E, Stepanova V, Vanyushkina A, Stekolshchikova E, Li Q, Zubkov D, Shichkova P, Halene T, Willmitzer L, Giavalisco P, Akbarian S, Khaitovich P. Commun Biol. 2019 Jun 21;2:234. doi: 10.1038/s42003-019-0485-4