The current version of DrugSpaceX contains more than 100 million transformed chemical products for virtual screening, characterized by structural novelty, diversity, and broad coverage of three-dimensional chemical space. To support ligand identification and optimization, DrugSpaceX also provides several subsets for download: a 10% diversity subset, an extended drug-like subset, a drug-like subset, a lead-like subset, and a fragment-like subset. Besides chemical properties and transformation instructions, DrugSpaceX can also locate the position of each transformation, which helps medicinal chemists systematically integrate strategy planning and protection design.
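Once a subset archive has been downloaded, its molecules can be streamed without unpacking the whole file to disk. The following is a minimal sketch using only the Python standard library. It assumes that each archive member is a plain-text SMILES file with one record per line and the SMILES string as the first whitespace-separated field; the actual file layout, including any header line, should be checked against a downloaded file. The function name `iter_smiles` is hypothetical.

```python
import tarfile
from typing import Iterator


def iter_smiles(archive_path: str) -> Iterator[str]:
    """Yield the first field of every non-empty line in a .tar.gz archive.

    Assumption: members are plain-text SMILES files, one record per line,
    with the SMILES string as the first whitespace-separated field.
    """
    with tarfile.open(archive_path, mode="r:gz") as archive:
        for member in archive:
            # Skip directories and other non-file entries.
            if not member.isfile():
                continue
            handle = archive.extractfile(member)
            if handle is None:
                continue
            for raw_line in handle:
                fields = raw_line.decode("utf-8", errors="replace").split()
                if fields:
                    yield fields[0]
```

For example, `sum(1 for _ in iter_smiles("DrugSpaceX-FL.smi.tar.gz"))` would count the records in a locally saved copy of the fragment-like subset, provided the layout assumption holds.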


Table 1. Detailed view of the DrugSpaceX Web page showing the representative samples.

Type                       Name     Number     Filtering Criteria  File
DrugSpaceX                 D        2215       Note 1              DrugSpaceX-Drug-set-smiles.smi.tar.gz
DrugSpaceX                 S        937230     Note 1              DrugSpaceX-S-sample-smiles.smi.tar.gz
DrugSpaceX                 A        100946534  Note 1              DrugSpaceX-Total.smi.tar.gz
Extended Drug-like Subset  DSX-EL   75146964   Note 2              DrugSpaceX-EL.smi.tar.gz
Drug-like Subset           DSX-DL   22556593   Note 3              DrugSpaceX-DL.smi.tar.gz
Lead-like Subset           DSX-LL   7038707    Note 4              DrugSpaceX-LL.smi.tar.gz
Fragment-like Subset       DSX-FL   2348639    Note 5              DrugSpaceX-FL.smi.tar.gz
10% Subset                 DSX-10%  10094653   Note 6              DrugSpaceX-10S.smi.tar.gz

1. DrugSpaceX: D (the drug set), S (the S sample), and A (all data).
2. Extended drug-like subset (DSX-EL): MW ≤ 700 Da, 0 ≤ logP ≤ 7.5, HBD ≤ 5, TPSA ≤ 200 Å², RotB ≤ 20.
3. Drug-like subset (DSX-DL): MW ≤ 500 Da, logP ≤ 5, HBD ≤ 5, HBA ≤ 10, TPSA ≤ 150 Å², RotB ≤ 7.
4. Lead-like subset (DSX-LL): 250 Da ≤ MW ≤ 350 Da, logP ≤ 3.5, RotB ≤ 7.
5. Fragment-like subset (DSX-FL): MW ≤ 250 Da, logP ≤ 3.5, RotB ≤ 5.
6. 10% subset (DSX-10%): a random 10% sample of the full set.
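
The filtering criteria in notes 2 to 5 can be written as simple threshold checks. The sketch below is illustrative only and is not the DrugSpaceX implementation: it assumes the descriptors (MW, logP, HBD, HBA, TPSA, RotB) have already been computed with a cheminformatics toolkit, and the `Descriptors` class, its field names, and the function `subset_labels` are hypothetical. It also assumes each rule is applied independently, so one molecule may match several subsets.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties of one molecule (field names are illustrative)."""
    mw: float    # molecular weight, Da
    logp: float  # calculated logP
    hbd: int     # hydrogen-bond donors
    hba: int     # hydrogen-bond acceptors
    tpsa: float  # topological polar surface area, Å²
    rotb: int    # rotatable bonds


def is_extended_drug_like(d: Descriptors) -> bool:
    # Note 2: MW ≤ 700, 0 ≤ logP ≤ 7.5, HBD ≤ 5, TPSA ≤ 200, RotB ≤ 20
    return (d.mw <= 700 and 0 <= d.logp <= 7.5 and d.hbd <= 5
            and d.tpsa <= 200 and d.rotb <= 20)


def is_drug_like(d: Descriptors) -> bool:
    # Note 3: MW ≤ 500, logP ≤ 5, HBD ≤ 5, HBA ≤ 10, TPSA ≤ 150, RotB ≤ 7
    return (d.mw <= 500 and d.logp <= 5 and d.hbd <= 5 and d.hba <= 10
            and d.tpsa <= 150 and d.rotb <= 7)


def is_lead_like(d: Descriptors) -> bool:
    # Note 4: 250 ≤ MW ≤ 350, logP ≤ 3.5, RotB ≤ 7
    return 250 <= d.mw <= 350 and d.logp <= 3.5 and d.rotb <= 7


def is_fragment_like(d: Descriptors) -> bool:
    # Note 5: MW ≤ 250, logP ≤ 3.5, RotB ≤ 5
    return d.mw <= 250 and d.logp <= 3.5 and d.rotb <= 5


def subset_labels(d: Descriptors) -> List[str]:
    """Return the names of all subsets whose criteria the molecule meets."""
    rules = [
        ("DSX-EL", is_extended_drug_like),
        ("DSX-DL", is_drug_like),
        ("DSX-LL", is_lead_like),
        ("DSX-FL", is_fragment_like),
    ]
    return [name for name, rule in rules if rule(d)]
```

With hypothetical descriptor values, `subset_labels(Descriptors(mw=300.0, logp=2.0, hbd=2, hba=5, tpsa=80.0, rotb=5))` returns `["DSX-EL", "DSX-DL", "DSX-LL"]`.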


Yang T, Li Z, Chen Y, Feng D, Wang G, Fu Z, Ding X, Tan X, Zhao J, Luo X, Chen K, Jiang H, Zheng M. DrugSpaceX: a large screenable and synthetically tractable database extending drug space. Nucleic Acids Res. 2020 Oct 26:gkaa920. doi: 10.1093/nar/gkaa920. Epub ahead of print. PMID: 33104791.