Surflex Platform
The Surflex Platform consists of the five modules described below. The Surflex Manual contains details of all computational procedures and options within each command-line module.
We support Linux (most common variants), Windows, and macOS. All of the modules are multi-core capable, with substantial speed-ups on modern multi-core laptops, workstations, and HPC clusters.
Tools Module
Fast and Accurate Small Molecule Processing
The Tools module addresses the most common aspects of small-molecule preparation:
- 2D to 3D conversion (from SMILES or SDF)
- Chirality detection and enumeration
- Protonation
- Conformer generation
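As a rough illustration of what chirality enumeration involves, the sketch below expands every unspecified stereocenter into both R and S assignments. The function name `enumerate_stereo` and the dictionary representation are hypothetical and are not part of the Surflex command-line interface. A real tool would also detect the centers from the structure and remove duplicates arising from molecular symmetry (for example, meso forms).

```python
from itertools import product


def enumerate_stereo(centers):
    """Expand unspecified stereocenters into all R/S assignments.

    `centers` maps a (hypothetical) atom index to "R", "S", or None,
    where None means the configuration is unspecified.
    Returns a list of fully specified assignments: 2**n of them for
    n unspecified centers (symmetry duplicates are NOT removed).
    """
    unspecified = sorted(idx for idx, label in centers.items() if label is None)
    results = []
    for labels in product("RS", repeat=len(unspecified)):
        assignment = dict(centers)                   # keep the defined centers
        assignment.update(zip(unspecified, labels))  # fill in the undefined ones
        results.append(assignment)
    return results


if __name__ == "__main__":
    # One defined center (atom 3) and two undefined centers (atoms 7 and 12).
    for assignment in enumerate_stereo({3: "R", 7: None, 12: None}):
        print(assignment)
```

The count doubles with each unspecified center, which is why enumeration is usually paired with detection so that only centers that are actually ambiguous get expanded.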
Features and benefits:
- Template-free and non-stochastic
- Relies on the MMFF94sf force field for structure derivation
- Fast and accurate on typical drug-like ligands, with better coverage of diverse conformations
- Fastest and most accurate method for macrocyclic ligands
- Capable of incorporating NMR restraints, which is particularly useful for large peptidic macrocycles
Selected Publications
Jain, A.N., Brueckner, A.C., Jorge, C., Cleves, A.E., Khandelwal, P., Caceres-Cortes, J., and Mueller, L. (2023). Complex peptide macrocycle optimization: Combining NMR restraints with conformational analysis to guide structure-based and ligand-based design. JCAMD. Open Access
Jain, A.N., Brueckner, A.C., Cleves, A.E., Reibarkh, M.Y., and Sherer, E.C. (2023). A Distributional Model of Bound Ligand Conformational Strain: From Small Molecules up to Large Peptidic Macrocycles. JMC. Open Access
Jain, A. N., Cleves, A. E., Gao, Q., Wang, X., Liu, Y., Sherer, E. C., and Reibarkh, M. Y. (2019). Complex macrocycle exploration: Parallel, heuristic, and constraint-based conformer generation using ForceGen. JCAMD, 33(6), 531-558. Open Access
Cleves, A. E. and Jain, A. N. (2017). ForceGen 3D structure and conformer generation: From small lead-like molecules to macrocyclic drugs. JCAMD, 31(5), 419-439. Open Access
The following recent studies all depended on the FGen3D and ForceGen Methods:
Cleves, A.E., Tandon, H., and Jain, A.N. (2024). Structure-Based Pose Prediction: Non-Cognate Docking Extended to Macrocyclic Ligands. JCAMD. Open Access
Cleves, A.E., Jain, A.N., Demeter, D.A., Buchan, Z.A., Wilmot, J., and Hancock, E.N. (2024). From UK-2A to florylpicoxamid: Active learning to identify a mimic of a macrocyclic natural product. JCAMD. Open Access
Cleves, A. E., Johnson, S. R., and Jain, A. N. (2021). Synergy and Complementarity between Focused Machine Learning and Physics-Based Simulation in Affinity Prediction. JCIM. Open Access
Cleves, A. E. and Jain, A. N. (2020). Structure-Based and Ligand-Based Virtual Screening on DUD-E+: Performance Dependence on Approximations to the Binding-Pocket. JCIM. Open Access
Cleves, A. E., Johnson, S. R., and Jain, A. N. (2019). Electrostatic-field and surface-shape similarity for virtual screening and pose prediction. JCAMD, 33, 865-886. Open Access
Cleves, A. E. and Jain, A. N. (2018). Quantitative Surface Field Analysis: Learning Causal Models to Predict Ligand Binding Affinity and Pose. JCAMD, 32, 731-757. Open Access
Similarity Module
State-of-the-Art 3D Molecular Similarity
The Similarity module implements ligand similarity operations using the eSim method:
- Virtual screening
- Pose prediction
- Multiple ligand alignment
The core eSim methodology is also integrated into the Docking and QuanSA modules.
Features and benefits:
- Virtual screening enrichment is better than alternative methods by a margin that is both practically and statistically significant
- Virtual screening speeds of over 20 million compounds per day on a single computing core
- Databases of billions of molecules can be screened in hours using cloud-based computing resources
- Pose prediction accuracy is substantially better than alternative approaches
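The throughput figures above imply a simple back-of-the-envelope calculation for sizing a screening run. The sketch below is only that arithmetic: it takes the quoted rate of 20 million compounds per core per day as a lower bound and assumes perfectly linear scaling across cores, which is an idealization. The function names are illustrative and not part of any Surflex interface.

```python
import math

# Rate quoted in the text ("over 20 million compounds per day" per core).
COMPOUNDS_PER_CORE_PER_DAY = 20_000_000


def cores_needed(library_size, hours):
    """Cores required to screen `library_size` compounds within `hours`,
    assuming ideal linear scaling (no I/O or scheduling overhead)."""
    per_core = COMPOUNDS_PER_CORE_PER_DAY * hours / 24.0
    return math.ceil(library_size / per_core)


def hours_needed(library_size, cores):
    """Wall-clock hours to screen `library_size` compounds on `cores` cores,
    under the same ideal-scaling assumption."""
    return library_size * 24.0 / (COMPOUNDS_PER_CORE_PER_DAY * cores)


if __name__ == "__main__":
    one_billion = 1_000_000_000
    print(cores_needed(one_billion, hours=6))    # cores for a 6-hour run
    print(hours_needed(one_billion, cores=1000)) # hours on 1000 cores
```

Under these assumptions, one billion compounds in six hours requires about 200 cores, and 1000 cores finish the same library in a little over an hour.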
Selected Publications
Cleves, A.E., Tandon, H., and Jain, A.N. (2024). Structure-Based Pose Prediction: Non-Cognate Docking Extended to Macrocyclic Ligands. JCAMD. Open Access
Cleves, A. E. and Jain, A. N. (2020). Structure-Based and Ligand-Based Virtual Screening on DUD-E+: Performance Dependence on Approximations to the Binding-Pocket. JCIM. Open Access
Cleves, A. E., Johnson, S. R., and Jain, A. N. (2019). Electrostatic-field and surface-shape similarity for virtual screening and pose prediction. JCAMD, 33, 865-886. Open Access
Cleves, A. E. and Jain, A. N. (2018). Quantitative Surface Field Analysis: Learning Causal Models to Predict Ligand Binding Affinity and Pose. JCAMD, 32, 731-757. Open Access
Cleves, A. E. and Jain, A. N. (2015). Chemical and Protein Structural Basis for Biological Crosstalk Between PPAR-alpha and COX Enzymes. JCAMD, 29(2), 101-112. Open Access
Yera, E. R., Cleves, A. E., & Jain, A. N. (2014). Prediction of off-target drug effects through data fusion. In Pacific Symposium on Biocomputing (Vol. 19, pp. 160-171). Open Access
Yera, E.R., Cleves, A.E., and Jain, A.N. (2011). Chemical Structural Novelty: On-Targets and Off-Targets. J Med Chem, 54: 6771-6785. Open Access
Cleves, A. E. and Jain, A.N. (2006). Robust Ligand-Based Modeling of the Biological Targets of Known Drugs. J Med Chem, 49, 2921-2938.
Cleves, A.E. and Jain, A.N. (2008). Effects of Inductive Bias on Computational Evaluations of Ligand-Based Modeling and on Drug Discovery. JCAMD, 22, 147-159.
Jain, A.N. (2004). Ligand-Based Structural Hypotheses for Virtual Screening. J Med Chem. 47, 947-961.
Jain, A.N. (2000). Morphological similarity: A 3D molecular similarity method correlated with protein-ligand recognition. JCAMD 14, 199-213.
Docking and xGen Modules
Top-Tier Solution for Virtual Screening and Pose Prediction + Real-Space X-ray Density Modeling of Ligands
The Docking module addresses all aspects of ensemble docking:
- Large-scale PDB retrieval and processing
- Surface-based binding site alignment using the PSIM method
- Fully automatic pocket variant selection to cover the relevant protein conformational variation
- Virtual screening
- Pose prediction
Features and benefits:
- Automated alignment and selection of appropriate binding site variants
- Robust and fully automatic modes for virtual screening and pose prediction
- Very extensive validation
- Highly accurate non-cognate ligand docking
- Directly applicable to synthetic macrocycles, with accuracy equivalent to non-macrocycles
The xGen module implements a novel method for real-space refinement and de novo fitting of ligand ensembles into X-ray density maps:
- Models ligand density using conformational ensembles
- Avoids atom-specific B-factors as X-ray model parameters
- Produces chemically sensible conformers with low strain energy; applicable to complex macrocycles
- Yields a better fit to X-ray density than standard fitting approaches
- Accessible to non-crystallographers and as part of crystallographic workflows
Selected Publications
Cleves, A.E., Tandon, H., and Jain, A.N. (2024). Structure-Based Pose Prediction: Non-Cognate Docking Extended to Macrocyclic Ligands. JCAMD. Open Access
Jain, A.N., Brueckner, A.C., Jorge, C., Cleves, A.E., Khandelwal, P., Caceres-Cortes, J., and Mueller, L. (2023). Complex peptide macrocycle optimization: Combining NMR restraints with conformational analysis to guide structure-based and ligand-based design. JCAMD. Open Access
Jain, A.N., Brueckner, A.C., Cleves, A.E., Reibarkh, M.Y., and Sherer, E.C. (2023). A Distributional Model of Bound Ligand Conformational Strain: From Small Molecules up to Large Peptidic Macrocycles. JMC. Open Access
Brueckner, A.C., Deng, Q., Cleves, A.E., Lesburg, C.A., Alvarez, J.C., Reibarkh, M.Y., Sherer, E.C., and Jain, A.N. (2021). Conformational Strain of Macrocyclic Peptides in Ligand–Receptor Complexes Based on Advanced Refinement of Bound-State Conformers. JMC. Open Access
Jain, A.N., Cleves, A.E., Brueckner, A.C., Lesburg, C.A., Deng, Q., Sherer, E.C., and Reibarkh, M.Y. (2020). XGen: Real-Space Fitting of Complex Ligand Conformational Ensembles to X‐ray Electron Density Maps. JMC. Open Access
Cleves, A. E. and Jain, A. N. (2020). Structure-Based and Ligand-Based Virtual Screening on DUD-E+: Performance Dependence on Approximations to the Binding-Pocket. JCIM. Open Access
Cleves, A. E. and Jain, A. N. (2015). Knowledge-Guided Docking: Accurate Prospective Prediction of Bound Configurations of Novel Ligands using Surflex-Dock. JCAMD, 29(6), 485-509. Open Access
Cleves, A. E. and Jain, A. N. (2015). Chemical and Protein Structural Basis for Biological Crosstalk Between PPAR-alpha and COX Enzymes. JCAMD, 29(2), 101-112. Open Access
Spitzer, R., Cleves, A. E., Varela, R., and Jain, A. N. (2013). Protein function annotation by local binding site surface similarity. Proteins: Structure, Function, and Bioinformatics. Open Access
Spitzer, R., and Jain, A.N. (2012). Surflex-Dock: Docking Benchmarks and Real-World Application. JCAMD, 26: 687-699.
Spitzer, R., Cleves, A.E., and Jain, A.N. (2011) Surface-Based Protein Binding Pocket Similarity. Proteins, 79: 2746-2763.
Jain, A.N. (2009). Effects of Protein Conformation in Docking: Improved Pose Prediction Through Protein Pocket Adaptation. JCAMD, 23: 355-374.
Jain, A.N. (2008). Bias, Reporting, and Sharing: Computational Evaluations of Docking Methods. JCAMD, 22, 201-212.
Pham, T. A. and Jain, A.N. (2008). Customizing Scoring Functions for Docking. JCAMD, 22, 269-286.
Ruppert, J., Welch, W. & Jain, A.N. (1997). Automatic identification and representation of protein binding sites for molecular docking. Protein Sci 6, 524-33.
Welch, W., Ruppert, J. & Jain, A.N. (1996). Hammerhead: Fast, fully automated docking of flexible ligands to protein binding sites. Chem Biol 3, 449-62.
Jain, A.N. (1996). Scoring noncovalent protein-ligand interactions: A continuous differentiable function tuned to compute binding affinities. J Comput Aided Mol Des 10, 427-40.
Affinity Module
Unique Machine-Learning Approach for Predicting Binding Affinity and Pose
The Affinity module implements the QuanSA (Quantitative Surface-field Analysis) method, which builds physically meaningful models that approximate the causal basis of protein-ligand interactions. The module implements integrated procedures for quantitative prediction of both binding affinity and ligand pose, with or without protein structural information:
- Multiple ligand alignment for molecular series that include multiple scaffolds
- Incorporation of known binding site information
- Machine-learning approach to physical binding site model induction using a multiple-instance approach
- Prediction of both binding affinity and binding mode of new ligands
- Iterative refinement of models with new data
Features and benefits:
- Fully automatic model building, including all aspects of ligand conformation and alignment
- The binding site model (a “pocket-field”) is analogous to a protein binding site, including aspects of flexibility
- The pocket-field identifies which pose a new molecule must adopt, and ligand strain is directly modeled
- Measurements of prediction confidence and molecular novelty guide user interpretation
- Very detailed aspects of molecular surface shape, directional hydrogen bonding preferences, and Coulombic electrostatics are learned
- Requires as few as 20 molecules for model induction and is capable of modeling series of hundreds of molecules
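The multiple-instance idea mentioned above can be sketched in a few lines: each molecule is treated as a bag of candidate poses, and the molecule's predicted score is that of its best-scoring pose, which also serves as the predicted binding mode. Everything in the block is a hypothetical stand-in, not the QuanSA pocket-field or its interface. The linear scoring function, the two-element feature vectors (an interaction term and a strain term), and the weights are invented for illustration.

```python
def score_pose(features, weights):
    """Toy linear score for one pose; a stand-in for a learned model."""
    return sum(f * w for f, w in zip(features, weights))


def predict(poses, weights):
    """Multiple-instance prediction over a bag of candidate poses.

    Returns (index_of_best_pose, best_score): the molecule is scored
    by its single best pose, which doubles as the predicted pose.
    """
    scores = [score_pose(p, weights) for p in poses]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


if __name__ == "__main__":
    # Each pose: [interaction term, strain term]; strain is penalized.
    weights = [1.0, -0.5]
    poses = [
        [2.0, 1.0],  # moderate interactions, some strain
        [3.0, 4.0],  # strong interactions, high strain
        [2.5, 0.0],  # good interactions, no strain
    ]
    best_index, best_score = predict(poses, weights)
    print(best_index, best_score)
```

In this toy setting the pose with the strongest raw interactions loses to a lower-strain alternative, which mirrors the point that pose choice and strain enter the prediction together.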
Selected Publications
Cleves, A.E., Jain, A.N., Demeter, D.A., Buchan, Z.A., Wilmot, J., and Hancock, E.N. (2024). From UK-2A to florylpicoxamid: Active learning to identify a mimic of a macrocyclic natural product. JCAMD. Open Access
Cleves, A. E., Johnson, S. R., and Jain, A. N. (2021). Synergy and Complementarity between Focused Machine Learning and Physics-Based Simulation in Affinity Prediction. JCIM. Open Access
Cleves, A. E. and Jain, A. N. (2018). Quantitative Surface Field Analysis: Learning Causal Models to Predict Ligand Binding Affinity and Pose. JCAMD, 32, 731-757. Open Access
Cleves, A. E. and Jain, A. N. (2016). Extrapolative prediction using physically-based QSAR. JCAMD, 30(2), 127-152. Open Access
Varela, R., Walters, W. P., Goldman, B. B., and Jain, A. N. (2012). Iterative Refinement of a Binding Pocket Model: Active Computational Steering of Lead Optimization. Journal of Medicinal Chemistry, 55(20), 8926-8942. Open Access
Varela, R., Cleves, A. E., Spitzer, R., and Jain, A. N. (2013). A structure-guided approach for protein pocket modeling and affinity prediction. JCAMD, 27(11), 917-934. Open Access
Jain, A.N., and Cleves, A.E. (2012). Does Your Model Weigh the Same as a Duck? JCAMD, 26, 57-67.
Jain, A.N. (2010). QMOD: Physically Meaningful QSAR. JCAMD, 24, 865-878. Open Access
Langham, J.J., Cleves, A.E., Spitzer, R., Kirshner, D., and Jain, A.N. (2009). Physical Binding Pocket Induction for Affinity Prediction. J Med Chem, 52: 6107-6125.
Cleves, A.E. and Jain, A.N. (2008). Effects of Inductive Bias on Computational Evaluations of Ligand-Based Modeling and on Drug Discovery. JCAMD, 22, 147-159.
Cleves, A. E. and Jain, A.N. (2006). Robust Ligand-Based Modeling of the Biological Targets of Known Drugs. J Med Chem, 49, 2921-2938.
Jain, A.N., Harris, N.L. & Park, J.Y. (1995). Quantitative binding site model generation: Compass applied to multiple chemotypes targeting the 5-HT1A receptor. J Med Chem 38, 1295-308.
Jain, A.N., Koile, K. & Chapman, D. (1994). Compass: Predicting biological activities from molecular surface properties. Performance comparisons on a steroid benchmark. J Med Chem 37, 2315-27.
Jain, A.N., Dietterich, T.G., Lathrop, R.H., Chapman, D., Critchlow, R.E., Jr., Bauer, B.E., Webster, T.A. & Lozano-Perez, T. (1994). A shape-based machine learning tool for drug design. JCAMD 8, 635-52.