Computational Molecular Science

The Computational Molecular Science group is led by Prof. Maxim Fedorov, the director of CDISE.
Recent growth in the amount of digital data available for computer processing in chemistry and the life sciences has enabled a wide range of data mining and machine learning tools to be applied in these fields. Our main research interest is in finding new approaches in chemical informatics that combine physical chemistry methods with machine learning techniques to predict the properties of organic compounds. Our primary goal is to develop methods that are both accurate and universal, with wide applicability domains. The group has several principal research directions.

 

OUR TEAM

Prof. Dr. Maxim V. Fedorov, ORCID: 0000-0003-3901-3565, Scopus Author ID: 8659113600

Research scientist Dr. Maria Pukalchik, ORCID: 0000-0001-7996-642X, Scopus Author ID: 56368460100

Research scientist Dr. Dmitriy Karlov, ORCID: 0000-0002-7194-1081, Scopus Author ID: 55208518000

PhD students:

  • Aleksei Bochkarev
  • Sergey Sosnin, ORCID: 0000-0002-3042-7369
  • Ekaterina Sosnina, ORCID: 0000-0002-6764-755X
  • Viacheslav Pronin
  • Dmitriy Shadrin

 

OUR RESEARCH PROJECTS

1. Machine learning for drug discovery.

  • 1.1. Predicting

Current research topics include:

Toxicity is one of the most critical properties to assess when an organic compound is considered as a drug candidate. Approximately 30% of novel drugs do not pass the first stage of clinical trials, resulting in heavy losses for the pharmaceutical industry. This means that current methods for in silico toxicity estimation, as well as animal testing, do not satisfy the strict safety criteria applied to modern medicines. Recent advances in deep learning can help to overcome this difficulty. We are designing multi-output neural network architectures that can predict toxicity profiles of organic compounds for different animal species and humans. Our models will help the industrial and scientific communities to be more efficient in drug development.

Prediction of antiviral activity profiles for small-molecule compounds is another research topic of interest. It aims to make the search for new antivirals and the repurposing of previously studied compounds more efficient. Our research goal here is to analyze the efficiency of different machine learning techniques as applied to the new ViralChEMBL database, which contains 650K activity entries for more than 220K compounds against 160 viral species. Such datasets are naturally sparse: generally, nobody measures everything against everything. This makes profile prediction especially complicated. We approach this class of problems using contemporary machine learning techniques, such as multi-output neural networks, multi-output decision trees, recommender systems and other tools that allow reliable predictions based on sparse data sets.
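One common way to train or score a multi-output model on a sparse activity matrix is to count only the compound-virus pairs that were actually measured. The sketch below illustrates this idea with a masked mean squared error. It is a minimal illustration, not the group's implementation: the function name `masked_mse`, the use of `None` for unmeasured pairs and the toy values are all assumptions made for this example.

```python
def masked_mse(predicted, measured):
    """Mean squared error over measured entries only.

    Both arguments are row-major tables (lists of lists) with one row per
    compound and one column per viral species. Unmeasured compound-virus
    pairs in `measured` are marked with None (an assumption of this sketch).
    """
    total, count = 0.0, 0
    for pred_row, meas_row in zip(predicted, measured):
        for pred, meas in zip(pred_row, meas_row):
            if meas is None:
                continue  # no assay value: this pair does not affect the loss
            total += (pred - meas) ** 2
            count += 1
    if count == 0:
        raise ValueError("no measured entries to score")
    return total / count


# Toy data (hypothetical values): 3 compounds x 2 viral species.
measured = [[6.0, None],
            [None, 5.0],
            [7.0, 4.0]]
predicted = [[5.0, 9.9],
             [0.1, 5.0],
             [7.0, 6.0]]

loss = masked_mse(predicted, measured)
print(loss)
```

Predictions for unmeasured pairs (9.9 and 0.1 in the toy data) are ignored, so the model is not penalized for values that nobody has tested.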

Project leader: Sergey Sosnin

  • 1.2. Search for optimal molecular and pharmacological properties

The multi-parameter assessment approach improves the quality and effectiveness of the drug discovery process by allowing us to optimize diverse and often conflicting molecular and pharmacological properties simultaneously and to reach the desired balance between them. The goal of our research is to analyze how the multi-parameter assessment profile is influenced by the methods of descriptor calculation, the continuity or discreteness of descriptor values, their applicability domain, and the nature of the desirability functions.
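To make the notion of a desirability function concrete, the sketch below maps each property onto a score between 0 and 1 and combines the scores with a geometric mean. The text does not specify which functions the project uses, so the linear ramps, the geometric-mean aggregation, the property ranges and the example compound are all assumptions chosen for illustration.

```python
import math


def ramp_up(value, low, high):
    """Desirability rising linearly from 0 at `low` to 1 at `high`."""
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)


def ramp_down(value, low, high):
    """Desirability falling linearly from 1 at `low` to 0 at `high`."""
    return 1.0 - ramp_up(value, low, high)


def overall_desirability(scores):
    """Geometric mean of individual scores in [0, 1].

    A single zero score makes the overall desirability zero, so a compound
    cannot compensate for one unacceptable property with the others.
    """
    if not scores:
        raise ValueError("at least one score is required")
    if any(score == 0.0 for score in scores):
        return 0.0
    return math.exp(sum(math.log(score) for score in scores) / len(scores))


# Hypothetical compound: higher potency is better, lower lipophilicity is better.
potency_score = ramp_up(7.0, low=5.0, high=9.0)          # pIC50 of 7 on a 5..9 scale
lipophilicity_score = ramp_down(3.0, low=1.0, high=5.0)  # logP of 3 on a 1..5 scale
overall = overall_desirability([potency_score, lipophilicity_score])
print(potency_score, lipophilicity_score, overall)
```

Replacing the continuous ramps with step functions, or changing the aggregation rule, changes how compounds are ranked. That sensitivity is what the analysis described above sets out to study.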

Project leader: Ekaterina Sosnina

 

2. Machine learning for agro-science.

  • 2.1. Fertilizers play an important role in maintaining soil fertility, increasing yields and improving harvest quality.

However, a significant portion of applied fertilizer is lost, especially when it is added in inappropriate doses. These losses increase agricultural costs, waste energy and pollute the environment, all of which challenge the sustainability of modern agriculture. Numerous studies have demonstrated contradictory effects of fertilizers on soil properties and crop yields.

This project aims to create a state-of-the-art multi-output system that can predict the best management practice (the set of the best fertilizers) for different soil and crop types. Our system will help the industrial and scientific communities to be more efficient in agro-farming.

Project leader: Mariia Pukalchik

  • 2.2. Mathematical modeling and optimization of plant growth dynamics in closed artificial controlled systems.

Modeling plant growth dynamics with modern mathematical approaches is highly relevant because of the strong demand for precise and robust models that can be used to solve optimization problems for greenhouses. Engineering innovations are widely applied in the food production industry and have, in particular, affected greenhouse design, so the traditional methods for describing plant growth in greenhouses have to be adapted to these new technologies and improved. The main research goal is to implement machine learning techniques, improve existing parametric models, verify them on field data, and develop a predictive model for more efficient greenhouse management, in particular for assessing yield, energy consumption, fertilizer consumption, and optimal plant growth conditions. The outcome of our research will be an important part of greenhouses' APC systems.

Project leader: Dmitry Shadrin

 

3. Industrial projects

  •  3.1. Searching for an effective solution to the cooling problem in the supercomputer segment

Developing at a rapid pace, the supercomputer industry has become more sophisticated and demanding in terms of the quantity and complexity of electronic components, which brings the need to increase heat sink efficiency to the forefront. The existence of a thermodynamic cooling limit causes serious heat removal problems. Consequently, many current cooling practices seem to have reached their capacity and are unable to cool newer generations of electronics. Our research goal is to analyze state-of-the-art cooling technologies and their development prospects, to elaborate appropriate solutions to the heat removal problem over various time horizons, and to substantiate the possibility of implementing a system based on the effective use of the Kapitza temperature jump as a thermodynamic method in the cooling cycle.

Project leader: Viacheslav Pronin

  •  3.2. Investigating the interrelation of the Kapitza jump and interfacial properties of liquids for oil quality evaluation

Despite the complexity and uncertainty of the modern global economy, oil remains an energy resource that is crucial for the economic development of many countries, including Russia. Numerous studies and forecasts confirm that its significance will not decrease in the short and medium term. Consequently, ensuring effective field development and correct reservoir engineering is of utmost importance for the oil industry. The majority of fields demand enhanced or tertiary oil recovery techniques, which increases the complexity of oil extraction. Our project includes studying the properties of liquids at the interfacial boundary, with particular attention to the case of injecting fluid into oil wells in order to increase the efficiency of extraction. We aim to prove that the magnitude of the temperature discontinuity is a parameter that can serve as an objective basis for evaluating all the properties of liquids at the interfacial boundary and does not demand any extreme experiments or techniques. This methodology could remove the need for thorough protection of measurement devices and become an effective instrument for rapid oil analysis.

Project leader: Viacheslav Pronin

  

OUR COMMERCIAL PROJECTS

The Syntelly project: state-of-the-art chemical synthesis planning

Organic synthesis is a branch of science where technology meets art. On one hand, synthetic chemists have a pool of known chemical reactions; on the other hand, there are many uncertain processes and transformations that can be deduced only through the expertise and intuition of the chemist. Unfortunately, this hinders the development of computer-aided organic synthesis, even though the first attempt was made in 1969 (the LHASA program) by one of the most well-known chemists, Nobel Prize winner E.J. Corey. The main reason for past failures is the difficulty of formalizing chemistry into a defined set of rules; to succeed, a breakthrough system has to capture intangible things such as chemical intuition. However, the situation is now changing. Recent advances in deep learning can notably improve the quality of synthesis planning, and the “artificial chemist” may become a reality in the near future. Our group has launched the Syntelly project to create a computer program that will be able to plan chemical syntheses better than an experienced chemist can.

Please contact us if you are interested in Syntelly.