In drug discovery, a central challenge is to efficiently explore the vast drug-like chemical space for novel, synthesizable chemical structures with desired biological properties. To address this challenge, we created the DrugSpaceX database by applying expert-defined transformations to approved drug molecules. The current version of DrugSpaceX contains more than 100 million transformed chemical products for virtual screening and is characterized by structural novelty, diversity and broad coverage of three-dimensional chemical space.

To support ligand identification and optimization, DrugSpaceX also provides several subsets for download: a 10% diversity subset, an extended drug-like subset, a drug-like subset, a lead-like subset and a fragment-like subset. In addition to chemical properties and transformation instructions, DrugSpaceX can also locate the position of each transformation, which makes it easier for medicinal chemists to integrate this information into strategy planning and protection design.

The Nova and BIOSTER cheminformatics libraries within the StarDrop software platform were used to broaden the search by taking each ‘Drug set’ molecule as a starting point and generating successive generations of related compounds (1-3). For reference, we also collected existing databases as of January 7, 2020, including DrugBank (4), the PDB (5), BindingDB (6), ChEMBL (7) and the CSD (8). To represent the properties of the molecules, we calculated a set of chemical descriptors for each molecular entity with the RDKit library. The descriptors were molecular weight (MW), octanol-water partition coefficient (logP) (9), number of hydrogen bond acceptors (HBA), number of hydrogen bond donors (HBD), topological polar surface area (TPSA), number of rotatable bonds (RotB), the quantitative estimate of drug-likeness (QED) (10), the synthetic accessibility (SA) score (11) and MCE-18 (12).
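As an illustration of the descriptor step, the following is a minimal sketch of how most of the listed descriptors can be computed with RDKit for a single molecule. The function name `compute_descriptors` and the aspirin example are hypothetical, and the text does not state which RDKit functions or settings were actually used, so the calls below are one reasonable choice rather than the exact procedure. The SA score and MCE-18 are left out: the SA score implementation ships in RDKit's Contrib directory rather than in the core package, and MCE-18 is not, to our knowledge, a core RDKit function.

```python
from rdkit import Chem
from rdkit.Chem import Crippen, Descriptors, Lipinski, QED, rdMolDescriptors


def compute_descriptors(smiles: str) -> dict:
    """Return a dictionary of descriptors for one SMILES string.

    Hypothetical helper: the choice of RDKit calls is an assumption,
    not the documented DrugSpaceX pipeline.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        # RDKit returns None when the SMILES cannot be parsed.
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    return {
        "MW": Descriptors.MolWt(mol),                          # average molecular weight
        "logP": Crippen.MolLogP(mol),                          # Wildman-Crippen logP
        "HBA": Lipinski.NumHAcceptors(mol),                    # hydrogen bond acceptors
        "HBD": Lipinski.NumHDonors(mol),                       # hydrogen bond donors
        "TPSA": rdMolDescriptors.CalcTPSA(mol),                # topological polar surface area
        "RotB": rdMolDescriptors.CalcNumRotatableBonds(mol),   # rotatable bonds
        "QED": QED.qed(mol),                                   # quantitative estimate of drug-likeness
    }


if __name__ == "__main__":
    aspirin = "CC(=O)Oc1ccccc1C(=O)O"  # example input, not taken from the database
    for name, value in compute_descriptors(aspirin).items():
        print(f"{name}: {value:.2f}")
```

Applied over a file of SMILES strings, the same function yields one descriptor row per molecule, which can then be used to filter or compare compound sets.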

The DrugSpaceX website is licensed under a Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license, which permits reusers to distribute, remix, adapt, and build upon the material in any medium or format, for noncommercial purposes only. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit https://creativecommons.org/licenses/by-nc/4.0/.

The authors would like to thank ChemAxon (13) and StarDrop for their support, as well as the many users of DrugSpaceX for their valuable feedback and suggestions.

REFERENCES

  1. StarDrop, version 6.6; Optibrium Ltd., 2019.
  2. Ujváry, I. and Hayward, J. (2012) BIOSTER: a database of bioisosteres and bioanalogues. In Brown, N. (ed.), Bioisosteres in Medicinal Chemistry, Wiley-VCH, pp. 53-74.
  3. Segall, M., Champness, E., Leeding, C., Lilien, R., Mettu, R. and Stevens, B. (2011) Applying medicinal chemistry transformations and multiparameter optimization to guide the search for high-quality leads and candidates. J. Chem. Inf. Model., 51, 2967-2976.
  4. Wishart, D.S., Feunang, Y.D., Guo, A.C., Lo, E.J., Marcu, A., Grant, J.R., Sajed, T., Johnson, D., Li, C., Sayeeda, Z. et al. (2018) DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res., 46, D1074-D1082.
  5. Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N. and Bourne, P.E. (2000) The Protein Data Bank. Nucleic Acids Res., 28, 235-242.
  6. Gilson, M.K., Liu, T., Baitaluk, M., Nicola, G., Hwang, L. and Chong, J. (2016) BindingDB in 2015: A public database for medicinal chemistry, computational chemistry and systems pharmacology. Nucleic Acids Res., 44, D1045-D1053.
  7. Mendez, D., Gaulton, A., Bento, A.P., Chambers, J., De Veij, M., Felix, E., Magarinos, M.P., Mosquera, J.F., Mutowo, P., Nowotka, M. et al. (2019) ChEMBL: towards direct deposition of bioassay data. Nucleic Acids Res., 47, D930-D940.
  8. Groom, C.R., Bruno, I.J., Lightfoot, M.P. and Ward, S.C. (2016) The Cambridge Structural Database. Acta Crystallogr., Sect. B: Struct. Sci., Cryst. Eng. Mater., 72, 171-179.
  9. Wildman, S.A. and Crippen, G.M. (1999) Prediction of Physicochemical Parameters by Atomic Contributions. J. Chem. Inf. Comput. Sci., 39, 868-873.
  10. Bickerton, G.R., Paolini, G.V., Besnard, J., Muresan, S. and Hopkins, A.L. (2012) Quantifying the chemical beauty of drugs. Nat. Chem., 4, 90-98.
  11. Ertl, P. and Schuffenhauer, A. (2009) Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. J. Cheminform., 1, 8.
  12. Ivanenkov, Y.A., Zagribelnyy, B.A. and Aladinskiy, V.A. (2019) Are We Opening the Door to a New Era of Medicinal Chemistry or Being Collapsed to a Chemical Singularity? J. Med. Chem., 62, 10026-10043.
  13. Marvin was used for drawing, displaying and characterizing chemical structures, substructures and reactions, Marvin 20.14.0, ChemAxon.