Computational Molecular Science

The Computational Molecular Science group is led by Prof. Maxim Fedorov, the director of CDISE.

Recent growth in the amount of digital data available for computer processing in chemistry and the life sciences has enabled a wide range of data mining and machine learning tools to be applied in these fields. Our main research interest is in finding new approaches in chemical informatics that combine physical chemistry methods with machine learning techniques to predict the properties of organic compounds. Our primary goal is to develop methods that are both accurate and universal, with wide applicability domains. The group has several principal research directions.

OUR TEAM

Dr., Prof. Maxim Fedorov, ORCID: 0000-0003-3901-3565; Scopus Author ID: 8659113600

Dr., Assistant Prof. Maria Pukalchik, ORCID: 0000-0001-7996-642X; Scopus Author ID: 56368460100

Research scientist Dr. Dmitriy Karlov, ORCID: 0000-0002-7194-1081; Scopus Author ID: 55208518000

PhD students:

Aleksei Bochkarev
Sergey Sosnin, ORCID: 0000-0002-3042-7369
Ekaterina Sosnina, ORCID: 0000-0002-6764-755X
Viacheslav Pronin
Dmitriy Shadrin

OUR RESEARCH PROJECTS

1. Machine learning for drug discovery.

1.1. Predicting

Current research topics include:

Toxicity is one of the most critical properties to assess when an organic compound is considered as a drug candidate. Approximately 30% of novel drugs do not pass the first stage of clinical trials, resulting in heavy losses for the pharmaceutical industry. This indicates that current in silico toxicity estimation methods, as well as animal testing, do not meet the strict safety criteria applied to modern medicines. Recent advances in deep learning may help address this difficulty. We are designing multi-output neural network architectures that are able to predict the toxicity profiles of organic compounds for different animal species and humans. Our models will help the industrial and scientific communities to be more efficient in drug development.

Prediction of antiviral activity profiles for small-molecule compounds is another research topic of interest. It aims to improve the efficiency of the search for new antivirals and the repurposing of previously studied compounds. Our research goal here is to analyze the efficiency of different machine learning techniques as applied to the new ViralChEMBL database, which contains 650K activity entries for more than 220K compounds against 160 viral species. Such datasets are naturally sparse: generally, nobody measures everything against everything. This makes profile prediction especially complicated. We approach this class of problems using contemporary machine learning techniques, such as multi-output neural networks, multi-output decision trees, recommender systems, and other tools that allow reliable predictions based on sparse data sets.
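One common way to train or score a multi-output model on a sparse activity matrix is to compute the error only over the entries that were actually measured. The sketch below illustrates that idea in plain Python. It is not the group's code: the function name `masked_mse`, the use of `None` for missing measurements, and the toy numbers are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def masked_mse(
    predicted: Sequence[Sequence[float]],
    measured: Sequence[Sequence[Optional[float]]],
) -> float:
    """Mean squared error over measured entries only (None = not measured)."""
    total = 0.0
    count = 0
    for pred_row, true_row in zip(predicted, measured):
        for pred, true in zip(pred_row, true_row):
            if true is None:
                # No measurement for this compound/target pair: skip it,
                # so the missing value contributes nothing to the loss.
                continue
            total += (pred - true) ** 2
            count += 1
    if count == 0:
        raise ValueError("no measured entries to score")
    return total / count


# Hypothetical toy data: 3 compounds x 3 targets, mostly unmeasured.
measured = [
    [5.0, None, None],
    [None, 6.0, None],
    [4.0, None, 7.0],
]
predicted = [
    [5.5, 1.0, 2.0],
    [3.0, 6.0, 9.0],
    [4.0, 0.0, 6.0],
]

loss = masked_mse(predicted, measured)
print(f"masked MSE over measured entries: {loss}")
```

Only four of the nine entries are measured here, so predictions for the other five do not affect the score, however far off they are.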

Project leader: Sergey Sosnin

1.2. Search for optimal molecular and pharmacological properties

The multi-parameter assessment approach improves the quality and effectiveness of the drug discovery process by allowing us to optimize diverse and often conflicting molecular and pharmacological properties simultaneously and to reach the desired balance among them. The goal of our research is to analyze how the multi-parameter assessment profile is influenced by the descriptor calculation methods, the continuity or discreteness of descriptor values, their applicability domain, and the nature of the desirability functions.
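To make the notion of a desirability function concrete, the following sketch maps each property onto a score between 0 and 1 and combines the scores with a geometric mean, a widely used aggregation in multi-parameter optimization. It is an illustration under stated assumptions, not the group's method: the function names, the linear and triangular shapes, and the example thresholds for potency and lipophilicity are all hypothetical.

```python
import math
from typing import Sequence


def desirability_larger_is_better(value: float, low: float, high: float) -> float:
    """Linear ramp: 0 at or below `low`, 1 at or above `high`."""
    if high <= low:
        raise ValueError("high must exceed low")
    return min(1.0, max(0.0, (value - low) / (high - low)))


def desirability_target(value: float, low: float, target: float, high: float) -> float:
    """Triangular shape: 1 at `target`, falling to 0 at `low` and `high`."""
    if not low < target < high:
        raise ValueError("expected low < target < high")
    if value <= low or value >= high:
        return 0.0
    if value <= target:
        return (value - low) / (target - low)
    return (high - value) / (high - target)


def overall_desirability(scores: Sequence[float]) -> float:
    """Geometric mean of the scores; 0 if any single property is unacceptable."""
    if not scores:
        raise ValueError("at least one score is required")
    if any(score <= 0.0 for score in scores):
        return 0.0
    return math.exp(sum(math.log(score) for score in scores) / len(scores))


# Hypothetical compound: potency (higher is better) and lipophilicity (target value).
potency_score = desirability_larger_is_better(7.0, low=5.0, high=9.0)
lipophilicity_score = desirability_target(2.0, low=1.0, target=3.0, high=5.0)
combined = overall_desirability([potency_score, lipophilicity_score])
print(potency_score, lipophilicity_score, combined)
```

The geometric mean is chosen here because a single unacceptable property drives the overall score to zero, which an arithmetic mean would not do.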

Project leader: Ekaterina Sosnina

2. Machine learning for agro-science.

2.1. Fertilizer plays an important role in maintaining soil fertility, increasing yields, and improving harvest quality.

However, a significant portion of applied fertilizer is lost, which increases agricultural costs, wastes energy, and pollutes the environment (especially when the fertilizer is applied in inappropriate doses). These losses are challenges for the sustainability of modern agriculture. Numerous studies have demonstrated contradictory effects of fertilizers on soil properties and crop yields.

This project aims to create a state-of-the-art multi-output system that will be able to predict the best management practice (the set of the best fertilizers) for different soil and crop types. Our system will help the industrial and scientific communities to be more efficient in agro-farming.

Project leader: Mariia Pukalchik

2.2. Mathematical modeling and optimization of plant growth dynamics in closed artificial controlled systems.

Modeling plant growth dynamics with modern mathematical approaches is highly relevant because of the strong demand for precise and robust models that can be used to solve optimization problems for greenhouses. Engineering innovations are widely adopted in the food production industry and, in particular, have affected greenhouse design, so the traditional methods for describing plant growth in greenhouses have to be adapted to these new technologies and improved. Our main research goals are to apply machine learning techniques, improve existing parametric models, verify them on field data, and develop a predictive model for more efficient greenhouse management, in particular for assessing yield, energy consumption, fertilizer consumption, and optimal plant growth conditions. The outcome of our research will be an important part of greenhouse APC systems.
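As a simple illustration of what a parametric growth model and its verification against data can look like, the sketch below uses a logistic growth curve and selects the growth rate that best matches a set of observations. The logistic form is a textbook choice assumed here for illustration and is not necessarily the model used in the project. The function names, parameter values, and synthetic observations are hypothetical.

```python
import math
from typing import Sequence


def logistic_growth(day: float, max_biomass: float, rate: float, midpoint: float) -> float:
    """Logistic curve: biomass approaches `max_biomass`, half of it at `midpoint`."""
    return max_biomass / (1.0 + math.exp(-rate * (day - midpoint)))


def sum_squared_error(
    days: Sequence[float],
    observed: Sequence[float],
    max_biomass: float,
    rate: float,
    midpoint: float,
) -> float:
    """Sum of squared differences between the model curve and the observations."""
    return sum(
        (logistic_growth(day, max_biomass, rate, midpoint) - value) ** 2
        for day, value in zip(days, observed)
    )


# Synthetic "observations" generated from the model itself (rate = 0.2),
# standing in for field data that is not available here.
days = [0, 10, 20, 30, 40]
observed = [logistic_growth(day, 100.0, 0.2, 20.0) for day in days]

# Crude one-parameter search: pick the candidate rate with the smallest error.
candidate_rates = [0.1, 0.2, 0.3]
best_rate = min(
    candidate_rates,
    key=lambda rate: sum_squared_error(days, observed, 100.0, rate, 20.0),
)
print(f"best-fitting growth rate: {best_rate}")
```

In practice the search would cover all parameters at once with a proper optimizer, and the observations would come from greenhouse measurements rather than from the model.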

Project leader: Dmitry Shadrin

3. Industrial projects

3.1. Searching for an effective solution to the cooling problem in the supercomputer segment

The supercomputer industry is developing at a rapid pace and has become more sophisticated and demanding in terms of the quantity and complexity of electronic components, which brings the need for more efficient heat sinks to the forefront. The existence of a thermodynamic cooling limit causes serious heat removal problems. Consequently, many current cooling practices seem to have reached their capacity and are unable to cool newer generations of electronics. Our research goals are to analyze state-of-the-art cooling technologies, discuss their development prospects, work out appropriate heat removal solutions over various time horizons, and substantiate the possibility of implementing a system that is based on the effective use of the Kapitza temperature jump as a thermodynamic method in the cooling cycle.
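For reference, the Kapitza temperature jump is usually written in terms of the thermal boundary (Kapitza) resistance. The relation below is the standard textbook definition rather than a result of this project, and the symbols are chosen here for illustration:

```latex
\Delta T = R_K \, q
```

Here $\Delta T$ is the temperature discontinuity across the interface, $q$ is the heat flux density through the interface (in W/m$^2$), and $R_K$ is the Kapitza resistance (in K·m$^2$/W).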

Project leader: Viacheslav Pronin

3.2. Investigating the interrelation of the Kapitza jump and interfacial properties of liquids for oil quality evaluation

Despite the complexity and uncertainty of the modern global economy, oil remains an energy resource that is crucial for the economic development of many countries, including Russia. Numerous studies and forecasts confirm that its significance will not decrease in the short and medium term. Consequently, ensuring effective field development and correct reservoir engineering is of utmost importance for the oil industry. The majority of fields demand enhanced or tertiary oil recovery techniques, which increases the complexity of oil extraction. Our project includes studying the properties of liquids at the interfacial boundary, with particular attention to the case of injecting fluid into oil wells in order to increase the efficiency of extraction. We aim to prove that the magnitude of the temperature discontinuity is a parameter that can serve as an objective basis for evaluating the properties of liquids at the interfacial boundary and that it does not demand any extreme experiments or techniques. This methodology could remove the need for thorough protection of measurement devices and become an effective instrument for rapid oil analysis.

Project leader: Viacheslav Pronin

OUR COMMERCIAL PROJECTS

The Syntelly project: state-of-the-art chemical synthesis planning

Organic synthesis is a branch of science where technology meets art. On the one hand, synthetic chemists have a pool of known chemical reactions; on the other hand, there are many uncertain processes and transformations that can be deduced only through the expertise and intuition of the chemist. This hinders the development of computer-aided organic synthesis, even though the first attempt dates back to 1969 and the work that led to the LHASA program by E.J. Corey, a Nobel Prize winner and one of the best-known chemists. The main reason for past failures is the difficulty of formalizing chemistry into a defined set of rules: to succeed, a breakthrough system has to capture intangible things such as chemical intuition. The situation is now starting to change. Recent advances in deep learning can notably improve the quality of synthesis planning, and the "artificial chemist" may become a reality in the near future. Our group has launched the Syntelly project to create a computer program that will be able to plan chemical syntheses better than an experienced chemist can.

Please contact us if you are interested in Syntelly.