A central challenge in drug discovery is to explore the vast drug-like chemical space efficiently and to find novel, synthesizable chemical structures with desired biological properties. To address this challenge, we created the DrugSpaceX database, which is built from expert-defined transformations of approved drug molecules. The current version of DrugSpaceX contains more than 100 million transformed chemical products for virtual screening and is characterized by structural novelty, diversity and broad coverage of three-dimensional chemical space.
To support ligand identification and optimization, DrugSpaceX also provides several subsets for download: a 10% diversity subset, an extended drug-like subset, a drug-like subset, a lead-like subset and a fragment-like subset. In addition to chemical properties and transformation instructions, DrugSpaceX can locate the position of each transformation, which makes it easier for medicinal chemists to integrate this information into strategy planning and protection design.
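As a rough illustration of how property-based subsets such as these can be defined, the sketch below applies two widely cited rule sets: a strict reading of Lipinski's rule of five for "drug-like" and the rule of three for "fragment-like". The cut-offs DrugSpaceX actually uses are not stated here, so these thresholds, the `Props` record and the function names are assumptions made for illustration only. The example property values are hypothetical and are not DrugSpaceX data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Props:
    """Hypothetical record holding the four properties the rules need."""
    mw: float    # molecular weight (Da)
    logp: float  # octanol-water partition coefficient
    hba: int     # number of hydrogen bond acceptors
    hbd: int     # number of hydrogen bond donors


def is_drug_like(p: Props) -> bool:
    # Assumption: strict rule of five, with no violation allowed.
    return p.mw <= 500 and p.logp <= 5 and p.hbd <= 5 and p.hba <= 10


def is_fragment_like(p: Props) -> bool:
    # Assumption: rule of three.
    return p.mw < 300 and p.logp <= 3 and p.hbd <= 3 and p.hba <= 3


# Hypothetical property values, chosen only to exercise the filters.
small = Props(mw=180.2, logp=1.3, hba=3, hbd=1)
large = Props(mw=650.0, logp=6.2, hba=12, hbd=6)

for name, props in (("small", small), ("large", large)):
    print(name, is_drug_like(props), is_fragment_like(props))
```

A lead-like filter is left out on purpose, because published definitions of lead-likeness differ and the source does not say which one applies.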
The Nova and BIOSTER cheminformatics libraries were used within the StarDrop software platform to exponentially broaden the search by taking each ‘Drug set’ molecule and creating new generations of related compounds (1-3). For reference, we also collected existing databases dated January 7, 2020, e.g. DrugBank (4), the PDB (5), BindingDB (6), ChEMBL (7) and the CSD (8). To represent the properties of the molecules, we calculated a set of chemical descriptors for each molecular entity with the RDKit library. The descriptors were molecular weight (MW), octanol-water partition coefficient (logP) (9), number of hydrogen bond acceptors (HBA), number of hydrogen bond donors (HBD), topological polar surface area (TPSA), number of rotatable bonds (RotB), the quantitative estimate of drug-likeness (QED) (10), the synthetic accessibility (SA) score (11) and MCE-18 (12).
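The descriptor calculation can be sketched with RDKit as follows. This is a minimal example, not the DrugSpaceX pipeline: the source does not say which RDKit functions were used, so the specific calls below are one reasonable choice, and the helper name `describe` is hypothetical. The SA score and MCE-18 are omitted because they are not provided by the core RDKit modules imported here.

```python
from rdkit import Chem
from rdkit.Chem import QED, Crippen, Descriptors, rdMolDescriptors


def describe(smiles: str) -> dict:
    """Return seven of the descriptors listed above for one SMILES string."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        # MolFromSmiles returns None when the input cannot be parsed.
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    return {
        "MW": Descriptors.MolWt(mol),                         # average molecular weight
        "logP": Crippen.MolLogP(mol),                         # Crippen logP estimate
        "HBA": rdMolDescriptors.CalcNumHBA(mol),              # hydrogen bond acceptors
        "HBD": rdMolDescriptors.CalcNumHBD(mol),              # hydrogen bond donors
        "TPSA": rdMolDescriptors.CalcTPSA(mol),               # topological polar surface area
        "RotB": rdMolDescriptors.CalcNumRotatableBonds(mol),  # rotatable bonds
        "QED": QED.qed(mol),                                  # drug-likeness estimate
    }


# Aspirin, used here only as a familiar test molecule.
aspirin = describe("CC(=O)Oc1ccccc1C(=O)O")
for name, value in aspirin.items():
    print(f"{name}: {value}")
```

Acceptor and donor counts depend on the definition used, so values from other RDKit functions or other toolkits may differ slightly from these.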
The DrugSpaceX website is licensed under a Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license, which permits reusers to distribute, remix, adapt, and build upon the material in any medium or format, for noncommercial purposes only. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit https://creativecommons.org/licenses/by-nc/4.0/. The authors would like to thank ChemAxon (13) and StarDrop for their support, as well as the many users of DrugSpaceX for their valuable feedback and suggestions.