dc.contributor.advisor: Kristian Berland
dc.contributor.author: Dey, Aditya
dc.date.accessioned: 2023-07-18T16:27:30Z
dc.date.available: 2023-07-18T16:27:30Z
dc.date.issued: 2023
dc.identifier: no.nmbu:wiseflow:6839577:54592307
dc.identifier.uri: https://hdl.handle.net/11250/3079871
dc.description.abstract: Accurate prediction of the melting point of oral drugs is crucial for understanding their chemical properties. Early identification of these properties aids in the screening of potential drugs, thereby saving resources in the pharmaceutical industry's discovery and manufacturing processes. Predicting the melting point of organic molecules is a complex task because many factors affect the entropic and enthalpic contributions, which in turn depend on molecular properties such as shape, electronegativity, flexibility, rotatability, and intermolecular bonding. In this study, we curated a combined dataset of organic molecules extracted from the Open Notebook Science Dataset and the Cambridge Structural Database. The dataset consists of molecules composed of carbon, oxygen, nitrogen, sulfur, phosphorus, and halogens; it covers a wide range of melting points and includes molecules with complex structures. To gain insight into the significance of each feature and its contribution to melting point prediction, we divided the combined dataset into four subsets based on the number of bonds an atom can form. We performed feature engineering on these datasets by studying the physical and chemical properties known to affect melting points, and derived numerical features from the molecules to capture the relevant information. Additionally, we used embedding features without any modifications. Machine learning models were trained on both the numerical and the embedding features, and their accuracy was evaluated using R$^2$ scores and root mean squared error values. The model trained on embedding features served as the benchmark that our models and engineered features aimed to surpass. Our machine learning models outperformed this benchmark and achieved good prediction accuracy. Furthermore, we conducted an in-depth analysis of the results to assess the impact of individual features on the models. We observed that physical shape features and the presence of specific substructural groups exhibited a strong correlation with the melting point. To explore the relationships between features, we performed a principal component analysis. The findings of this study have important implications for drug development, formulation, and the optimization of manufacturing processes. Accurate prediction of melting points enhances drug screening procedures and aids in the design of effective pharmaceutical products.
dc.language: eng
dc.publisher: Norwegian University of Life Sciences
dc.title: Prediction of Melting Temperature of Organic Molecules using Machine Learning
dc.type: Master thesis