The research in the laboratory spans a wide range of biological problems, organisms, and types of biological data: from genome variation to chromatin architecture, gene expression, and metabolic evolution, from unicellular organisms to humans, and from the genome to the metabolome. The main focus is on in-depth bioinformatics analysis of various types of ’omics data generated by wet-lab collaborators, and on the integration of biological data sets to gain insight into the mechanisms of cell and organ function.
Chromatin organization changes between life stages of the soil-living amoeba Dictyostelium discoideum
Recent advances enabled by the Hi-C technique have unraveled many principles of chromosomal folding that have subsequently been linked to disease and gene regulation. However, we still know remarkably little about chromatin architecture in organisms other than mammals and fruit flies. To explore changes in chromosomal folding during the life cycle of the soil-living amoeba Dictyostelium discoideum, we performed Hi-C in two biological replicates at 0, 2, 5, and 8 h, and constructed high-resolution interaction maps that revealed the presence of loops. Loops often form regular patterns and do not change their position in the genome between life cycle stages. Interestingly, the orientation of genes near loops is not random: housekeeping genes clearly tend to have convergent orientation, while differentially expressed genes show a weaker tendency. Moreover, genes inside loops have higher expression than genes outside loops. Thus, gene orientation and expression are important for loop formation, but the exact mechanism remains to be unraveled.
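The notion of convergent gene orientation can be illustrated with a minimal sketch. It assumes that orientation is classified for adjacent gene pairs along a chromosome (the upstream gene on the "+" strand and the downstream gene on the "-" strand counts as convergent); the function names and the toy gene list are hypothetical and are not taken from the analysis itself.

```python
def pair_orientation(left_strand, right_strand):
    """Classify a pair of neighboring genes by their strands."""
    if left_strand == "+" and right_strand == "-":
        return "convergent"   # genes transcribed toward each other
    if left_strand == "-" and right_strand == "+":
        return "divergent"    # genes transcribed away from each other
    return "tandem"           # both genes on the same strand


def orientation_counts(genes):
    """Count pair orientations for genes given as (start, strand) tuples."""
    ordered = sorted(genes)  # order genes by start coordinate
    counts = {"convergent": 0, "divergent": 0, "tandem": 0}
    for (_, left), (_, right) in zip(ordered, ordered[1:]):
        counts[pair_orientation(left, right)] += 1
    return counts


# Hypothetical genes near a loop: (start coordinate, strand)
toy_genes = [(100, "+"), (300, "-"), (500, "+"), (700, "+")]
print(orientation_counts(toy_genes))
```

Comparing such counts between housekeeping and differentially expressed genes, against a background of all gene pairs, would be one way to quantify the tendency described above.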
Histone acetylation level regulates formation of TADs
The Hi-C technique revealed that the chromosomes of mammals and fruit flies are organized into spatially compact Topologically Associating Domains (TADs). In fruit flies, the mechanism of TAD formation is not yet clear. In this project, we test the hypothesis that the mechanism of TAD self-assembly is based on the ability of nucleosomes from inactive chromatin to aggregate, and on the lack of this ability in acetylated nucleosomal arrays. We analyzed Hi-C and ChIP-seq data (with antibodies against pan-acetylated histone H3) from control D. melanogaster late embryonic (Schneider-2) cells, as well as from HDAC1-depleted cells and from cells treated with the histone acetyltransferase inhibitor curcumin or the histone deacetylase inhibitor trichostatin A. We studied changes in acetylation level in association with differences in TAD position and density. Inhibition of HDAC1 was found to lead to an increase in acetylation level in inter-TAD regions and to coordinated changes in TAD structure. Thus, histone acetylation plays a key role in the mechanism of TAD formation in Drosophila.
Chromatin structure changes during Drosophila spermatogenesis
Spermatogenesis is accompanied by dramatic changes in gene expression. The spatial organization of chromatin can impact gene expression, but the extent of chromatin structure changes during the development of sperm cells remains unclear. To investigate links between chromatin architecture and gene expression changes during spermatogenesis, we analyzed RNA-seq data and high-resolution Hi-C interaction maps of Drosophila testis at two spermatogenesis stages, spermatogonia and spermatocytes. The study included chromatin interactions at different scales, from interactions of single genes to chromatin compartments. Preliminary results show a clear correlation (Spearman’s R = 0.27) between chromatin compartmentalization and expression changes. Namely, transition from an inactive to an active compartment is associated with an increase in total expression level. Moreover, genes that are active in spermatocytes tend to have less compact chromatin around their TSSs (p < 10^-9), in line with the current understanding of the interplay between chromatin folding and transcription.
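As an illustration of the kind of statistic reported above, the following sketch computes a Spearman rank correlation in plain Python, as the Pearson correlation of average ranks. The per-gene values are invented for the example: `compartment_change` and `expression_change` are hypothetical names and do not come from the study's data.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-gene values (assumption, for illustration only).
compartment_change = [-0.8, -0.1, 0.0, 0.3, 0.9, 0.4]
expression_change = [-1.2, 0.5, -0.3, 0.1, 2.0, -0.6]
print(spearman(compartment_change, expression_change))
```

In practice, a library routine such as `scipy.stats.spearmanr` would also return a p-value; the hand-written version is used here only to keep the example dependency-free.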
Neural network applications for chromatin 3D structure analysis
The Hi-C method allows one to analyze the three-dimensional structure of chromosomes with high accuracy; however, it has technical drawbacks. One of them is the presence of regions with missing values in the resulting Hi-C maps, which complicates downstream analysis. We developed a procedure for inferring missing values from the surrounding map segments using neural networks as the predictive model. Training and test sets were formed from Drosophila embryo Hi-C data, excluding empty or low-quality map regions. We trained and tested three predictive models: a shallow neural network with two features (upper and lower pixels), a medium-sized neural network with eight features (diagonal pixels), and a deep neural network with twenty features (the surrounding square of pixels). We calculated RMSE and R^2 values and reconstructed Hi-C map regions using these three models. The deep neural network with twenty features demonstrated the best performance.
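The two-feature setup and the evaluation metrics can be sketched as follows. This is a minimal illustration under stated assumptions, not the trained models: a neighbor-mean baseline stands in for the shallow network, the Hi-C map is a toy matrix with distance decay, and all function names are hypothetical.

```python
import math


def two_pixel_features(hic, i, j):
    """Features for pixel (i, j): the pixel above and the pixel below."""
    return hic[i - 1][j], hic[i + 1][j]


def build_dataset(hic):
    """Collect (features, target) pairs, skipping pixels or neighbors that are NaN."""
    samples = []
    for i in range(1, len(hic) - 1):
        for j in range(len(hic[i])):
            feats = two_pixel_features(hic, i, j)
            target = hic[i][j]
            if math.isnan(target) or any(math.isnan(f) for f in feats):
                continue
            samples.append((feats, target))
    return samples


def baseline_predict(feats):
    """Stand-in predictor (assumption): the mean of the neighboring pixels."""
    return sum(feats) / len(feats)


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy 5x5 contact map with distance decay and one missing pixel.
toy = [[1.0 / (1 + abs(i - j)) for j in range(5)] for i in range(5)]
toy[2][3] = float("nan")

samples = build_dataset(toy)
y_true = [target for _, target in samples]
y_pred = [baseline_predict(feats) for feats, _ in samples]
print(len(samples), rmse(y_true, y_pred), r_squared(y_true, y_pred))

# Infer the missing pixel from its upper and lower neighbors.
filled = baseline_predict(two_pixel_features(toy, 2, 3))
print(filled)
```

Replacing `baseline_predict` with a trained regressor, and `two_pixel_features` with the eight- or twenty-pixel neighborhoods, would give the same evaluation loop for the larger models.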
Close to completion:
Evolution of the human brain transcriptome at the single-cell resolution
While our understanding of human brain evolution is advancing, our knowledge of expression differences unique to particular brain areas and cell types is still incomplete. In this project, we present an analysis of gene expression differences between humans and age-matched chimpanzees, bonobos, and rhesus monkeys conducted in 33 brain regions using conventional RNA sequencing. For three of these regions, we further analyzed uniquely human expression differences at the single-cell level, generating data from more than 100,000 cell nuclei. We show that gene expression evolves rapidly within cell types, with more than two-thirds of cell-type-specific differences not detected using conventional RNA sequencing. Neurons tend to evolve faster in all hominids, but astrocytes show more differences on the human lineage, including alterations of spatial distribution across neocortical layers. Integration of human-specific differences across the 33 brain regions further reveals co-evolving, anatomically distributed regional modules that coincide with functional networks defined by functional brain imaging.
A comprehensive map of the human brain lipidome and its evolution
The lipid composition of brain anatomical structures remains poorly understood, particularly in humans and closely related non-human primates. We describe the generation and analysis of a lipidome atlas of the adult human brain, comprising large-scale mass spectrometry-based lipidome profiling of 75 anatomically precise subdivisions in four individuals. To explore the evolution of the human brain lipidome, we additionally produce lipidome atlases of adult chimpanzee, bonobo, and macaque brains in three individuals per species, as well as transcriptome atlases for the same 597 samples of these primate species and human. Lipidome profiles show striking anatomical specificity in non-cortical brain regions, while in the neocortex lipidome composition is strongly associated with the functional networks of the brain structures. By contrast, at the transcriptome level, anatomical specificity of gene expression levels prevails in both non-cortical and cortical structures. Prefrontal cortical regions, together with the white matter, show the highest human specificity of lipid intensities, supported by observations at the gene expression level.