Surflex Platform

The Surflex Platform consists of the five modules described below. The Surflex Manual contains details of all computational procedures and options within each command-line module.

We support Linux (most common variants), Windows, and macOS. All of the modules are multi-core capable, with substantial speed-ups on modern multi-core laptops, workstations, and HPC clusters.

Tools Module

Fast and Accurate Small Molecule Processing

The Tools module addresses the most common aspects of small-molecule preparation:

  • 2D to 3D conversion (from SMILES or SDF)
  • Chirality detection and enumeration
  • Protonation
  • Conformer generation
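
As a rough illustration of what chirality enumeration involves, the sketch below lists every R/S assignment for a set of unspecified stereocenters while keeping the specified ones fixed. It is a generic combinatorial sketch, not the Tools module's procedure: the function name `enumerate_stereo` and the atom labels are hypothetical, and it ignores meso symmetry, double-bond stereochemistry, and geometric feasibility.

```python
from itertools import product

def enumerate_stereo(centers, known=None):
    """Yield one {center: 'R' or 'S'} mapping per stereoisomer candidate.

    centers: labels of all stereocenters (hypothetical labels such as "C2").
    known:   centers whose configuration is already specified.
    """
    known = dict(known or {})
    # Only centers without a specified configuration are enumerated.
    free = [c for c in centers if c not in known]
    for labels in product("RS", repeat=len(free)):
        assignment = dict(known)
        assignment.update(zip(free, labels))
        yield assignment

# Three stereocenters, one already specified: 2**2 = 4 candidates.
for candidate in enumerate_stereo(["C2", "C5", "C7"], known={"C2": "R"}):
    print(candidate)
```

The number of candidates doubles with each unspecified center, which is why enumeration is usually capped or restricted in practice.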

Features and benefits:

  • Template-free and non-stochastic
  • Relies on MMFF94sf forcefield for structure derivation
  • Fast and accurate on typical drug-like ligands, with better coverage of diverse conformations
  • Fastest and most accurate method for macrocyclic ligands
  • Capable of incorporating NMR restraints, which is particularly useful for large peptidic macrocycles

Selected Publications

Jain, A.N., Brueckner, A.C., Jorge, C., Cleves, A.E., Khandelwal, P., Caceres-Cortes, J., and Mueller, L. (2023). Complex peptide macrocycle optimization: Combining NMR restraints with conformational analysis to guide structure-based and ligand-based design. JCAMD. Open Access

Jain, A.N., Brueckner, A.C., Cleves, A.E., Reibarkh, M.Y., and Sherer, E.C. (2023). A Distributional Model of Bound Ligand Conformational Strain: From Small Molecules up to Large Peptidic Macrocycles. JMC. Open Access

Jain, A. N., Cleves, A. E., Gao, Q., Wang, X., Liu, Y., Sherer, E. C., and Reibarkh, M. Y. (2019). Complex macrocycle exploration: Parallel, heuristic, and constraint-based conformer generation using ForceGen. JCAMD, 33(6), 531-558. Open Access

Cleves, A. E. and Jain, A. N. (2017). ForceGen 3D structure and conformer generation: From small lead-like molecules to macrocyclic drugs. JCAMD, 31(5), 419-439. Open Access

The following recent studies all depended on the FGen3D and ForceGen methods:

Cleves, A.E., Tandon, H., and Jain, A.N. (2024). Structure-Based Pose Prediction: Non-Cognate Docking Extended to Macrocyclic Ligands. JCAMD. Open Access

Cleves, A.E., Jain, A.N., Demeter, D.A., Buchan, Z.A., Wilmot, J., and Hancock, E.N. (2024). From UK-2A to florylpicoxamid: Active learning to identify a mimic of a macrocyclic natural product. JCAMD. Open Access

Cleves, A. E., Johnson, S. R., and Jain, A. N. (2021). Synergy and Complementarity between Focused Machine Learning and Physics-Based Simulation in Affinity Prediction. JCIM. Open Access

Cleves, A. E. and Jain, A. N. (2020). Structure-Based and Ligand-Based Virtual Screening on DUD-E+: Performance Dependence on Approximations to the Binding-Pocket. JCIM. Open Access

Cleves, A. E., Johnson, S. R., and Jain, A. N. (2019). Electrostatic-field and surface-shape similarity for virtual screening and pose prediction. JCAMD, 33, 865-886. Open Access

Cleves, A. E. and Jain, A. N. (2018). Quantitative Surface Field Analysis: Learning Causal Models to Predict Ligand Binding Affinity and Pose. JCAMD, 32, 731-757. Open Access

Similarity Module

State-of-the-Art 3D Molecular Similarity

The Similarity module implements ligand similarity operations using the eSim method:

  • Virtual screening
  • Pose prediction
  • Multiple ligand alignment

The core eSim methodology is also integrated into the Docking and QuanSA modules.

Features and benefits:

  • Virtual screening enrichment is both practically and statistically significantly better than alternative methods
  • Virtual screening speeds of over 20 million compounds per day on a single computing core
  • Databases of billions of molecules can be screened in hours using cloud-based computing resources
  • Pose prediction accuracy is substantially better than alternative approaches
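
To make the throughput figures concrete, the following back-of-the-envelope sketch estimates how many cores a screen would need to finish within a given time. It uses only the quoted lower bound of 20 million compounds per core per day and assumes perfectly linear scaling across cores, with no allowance for I/O, scheduling, or startup overhead. The function name `cores_needed` is illustrative and not part of any Surflex interface.

```python
import math

# Lower-bound single-core rate quoted in the feature list above.
COMPOUNDS_PER_CORE_PER_DAY = 20_000_000

def cores_needed(library_size, hours, rate_per_core_per_day=COMPOUNDS_PER_CORE_PER_DAY):
    """Cores required to screen library_size compounds within the given hours.

    Assumes ideal linear scaling (an optimistic simplification).
    """
    compounds_per_core = rate_per_core_per_day * hours / 24.0
    return math.ceil(library_size / compounds_per_core)

# One billion compounds in six hours under these assumptions.
print(cores_needed(1_000_000_000, 6))  # prints 200
```

Under these assumptions a billion-compound library fits in a six-hour window on about 200 cores; real deployments would need headroom for overhead.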

Selected Publications

Cleves, A.E., Tandon, H., and Jain, A.N. (2024). Structure-Based Pose Prediction: Non-Cognate Docking Extended to Macrocyclic Ligands. JCAMD. Open Access

Cleves, A. E. and Jain, A. N. (2020). Structure-Based and Ligand-Based Virtual Screening on DUD-E+: Performance Dependence on Approximations to the Binding-Pocket. JCIM. Open Access

Cleves, A. E., Johnson, S. R., and Jain, A. N. (2019). Electrostatic-field and surface-shape similarity for virtual screening and pose prediction. JCAMD, 33, 865-886. Open Access

Cleves, A. E. and Jain, A. N. (2018). Quantitative Surface Field Analysis: Learning Causal Models to Predict Ligand Binding Affinity and Pose. JCAMD, 32, 731-757. Open Access

Cleves, A. E. and Jain, A. N. (2015). Chemical and Protein Structural Basis for Biological Crosstalk Between PPAR-alpha and COX Enzymes. JCAMD, 29(2), 101-112. Open Access

Yera, E. R., Cleves, A. E., and Jain, A. N. (2014). Prediction of off-target drug effects through data fusion. In Pacific Symposium on Biocomputing (Vol. 19, pp. 160-171). Open Access

Yera, E.R., Cleves, A.E., and Jain, A.N. (2011). Chemical Structural Novelty: On-Targets and Off-Targets. J Med Chem, 54: 6771-6785. Open Access

Cleves, A. E. and Jain, A.N. (2006). Robust Ligand-Based Modeling of the Biological Targets of Known Drugs. J Med Chem, 49, 2921-2938.

Cleves, A.E. and Jain, A.N. (2008). Effects of Inductive Bias on Computational Evaluations of Ligand-Based Modeling and on Drug Discovery. JCAMD, 22, 147-159.

Jain, A.N. (2004). Ligand-Based Structural Hypotheses for Virtual Screening. J Med Chem, 47, 947-961.

Jain, A.N. (2000). Morphological similarity: A 3D molecular similarity method correlated with protein-ligand recognition. JCAMD 14, 199-213.

Docking and xGen Modules

Top-Tier Solution for Virtual Screening and Pose Prediction + Real-Space X-ray Density Modeling of Ligands

The Docking module addresses all aspects of ensemble docking:

  • Large-scale PDB retrieval and processing
  • Surface-based binding site alignment using the PSIM method
  • Fully automatic pocket variant selection to cover the relevant protein conformational variation
  • Virtual screening
  • Pose prediction

Features and benefits:

  • Automated alignment and selection of appropriate binding site variants
  • Robust and fully automatic modes for virtual screening and pose prediction
  • Very extensive validation
  • Highly accurate non-cognate ligand docking
  • Directly applicable to synthetic macrocycles, with accuracy equivalent to non-macrocycles

The xGen module implements a novel method for real-space refinement and de novo fitting of ligand ensembles into X-ray density maps:

  • Models ligand density using conformational ensembles
  • Avoids atom-specific B-factors as X-ray model parameters
  • Produces chemically sensible conformers with low strain energy; applicable to complex macrocycles
  • Yields a better fit to X-ray density than standard fitting approaches
  • Accessible to non-crystallographers and as part of crystallographic workflows

Selected Publications

Cleves, A.E., Tandon, H., and Jain, A.N. (2024). Structure-Based Pose Prediction: Non-Cognate Docking Extended to Macrocyclic Ligands. JCAMD. Open Access

Jain, A.N., Brueckner, A.C., Jorge, C., Cleves, A.E., Khandelwal, P., Caceres-Cortes, J., and Mueller, L. (2023). Complex peptide macrocycle optimization: Combining NMR restraints with conformational analysis to guide structure-based and ligand-based design. JCAMD. Open Access

Jain, A.N., Brueckner, A.C., Cleves, A.E., Reibarkh, M.Y., and Sherer, E.C. (2023). A Distributional Model of Bound Ligand Conformational Strain: From Small Molecules up to Large Peptidic Macrocycles. JMC. Open Access

Brueckner, A.C., Deng, Q., Cleves, A.E., Lesburg, C.A., Alvarez, J.C., Reibarkh, M.Y., Sherer, E.C., and Jain, A.N. (2021). Conformational Strain of Macrocyclic Peptides in Ligand–Receptor Complexes Based on Advanced Refinement of Bound-State Conformers. JMC. Open Access

Jain, A.N., Cleves, A.E., Brueckner, A.C., Lesburg, C.A., Deng, Q., Sherer, E.C., and Reibarkh, M.Y. (2020). XGen: Real-Space Fitting of Complex Ligand Conformational Ensembles to X-ray Electron Density Maps. JMC. Open Access

Cleves, A. E. and Jain, A. N. (2020). Structure-Based and Ligand-Based Virtual Screening on DUD-E+: Performance Dependence on Approximations to the Binding-Pocket. JCIM. Open Access

Cleves, A. E. and Jain, A. N. (2015). Knowledge-Guided Docking: Accurate Prospective Prediction of Bound Configurations of Novel Ligands using Surflex-Dock. JCAMD, 29(6), 485-509. Open Access

Cleves, A. E. and Jain, A. N. (2015). Chemical and Protein Structural Basis for Biological Crosstalk Between PPAR-alpha and COX Enzymes. JCAMD, 29(2), 101-112. Open Access

Spitzer, R., Cleves, A. E., Varela, R., and Jain, A. N. (2013). Protein function annotation by local binding site surface similarity. Proteins: Structure, Function, and Bioinformatics. Open Access

Spitzer, R., and Jain, A.N. (2012). Surflex-Dock: Docking Benchmarks and Real-World Application. JCAMD, 26: 687-699.

Spitzer, R., Cleves, A.E., and Jain, A.N. (2011). Surface-Based Protein Binding Pocket Similarity. Proteins, 79: 2746-2763.

Jain, A.N. (2009). Effects of Protein Conformation in Docking: Improved Pose Prediction Through Protein Pocket Adaptation. JCAMD, 23: 355-374.

Jain, A.N. (2008). Bias, Reporting, and Sharing: Computational Evaluations of Docking Methods. JCAMD, 22, 201-212.

Pham, T. A. and Jain, A.N. (2008). Customizing Scoring Functions for Docking. JCAMD, 22, 269-286.

Ruppert, J., Welch, W. & Jain, A.N. (1997). Automatic identification and representation of protein binding sites for molecular docking. Protein Sci 6, 524-33.

Welch, W., Ruppert, J. & Jain, A.N. (1996). Hammerhead: Fast, fully automated docking of flexible ligands to protein binding sites. Chem Biol 3, 449-62.

Jain, A.N. (1996). Scoring noncovalent protein-ligand interactions: A continuous differentiable function tuned to compute binding affinities. J Comput Aided Mol Des 10, 427-40.

Affinity Module

Unique Machine-Learning Approach for Predicting Binding Affinity and Pose

The Affinity module implements the QuanSA (Quantitative Surface-field Analysis) method, which builds physically meaningful models that approximate the causal basis of protein-ligand interactions. The module implements integrated procedures for quantitative prediction of both binding affinity and ligand pose, with or without protein structural information:

  • Multiple ligand alignment for molecular series that include multiple scaffolds
  • Incorporation of known binding site information
  • Machine-learning approach to physical binding site model induction using a multiple-instance approach
  • Prediction of both binding affinity and binding mode of new ligands
  • Iterative refinement of models with new data

Features and benefits:

  • Fully automatic model building, including all aspects of ligand conformation and alignment
  • The binding site model (a “pocket-field”) is analogous to a protein binding site, including aspects of flexibility
  • The pocket-field identifies which pose a new molecule must adopt, and ligand strain is directly modeled
  • Measurements of prediction confidence and molecular novelty guide user interpretation
  • Very detailed aspects of molecular surface shape, directional hydrogen bonding preferences, and Coulombic electrostatics are learned
  • Requires as few as 20 molecules for model induction and is capable of modeling series of hundreds of molecules

Selected Publications

Cleves, A.E., Jain, A.N., Demeter, D.A., Buchan, Z.A., Wilmot, J., and Hancock, E.N. (2024). From UK-2A to florylpicoxamid: Active learning to identify a mimic of a macrocyclic natural product. JCAMD. Open Access

Cleves, A. E., Johnson, S. R., and Jain, A. N. (2021). Synergy and Complementarity between Focused Machine Learning and Physics-Based Simulation in Affinity Prediction. JCIM. Open Access

Cleves, A. E. and Jain, A. N. (2018). Quantitative Surface Field Analysis: Learning Causal Models to Predict Ligand Binding Affinity and Pose. JCAMD, 32, 731-757. Open Access

Cleves, A. E. and Jain, A. N. (2016). Extrapolative prediction using physically-based QSAR. JCAMD, 30(2), 127-152. Open Access

Varela, R., Walters, W. P., Goldman, B. B., and Jain, A. N. (2012). Iterative Refinement of a Binding Pocket Model: Active Computational Steering of Lead Optimization. Journal of Medicinal Chemistry, 55(20), 8926-8942. Open Access

Varela, R., Cleves, A. E., Spitzer, R., and Jain, A. N. (2013). A structure-guided approach for protein pocket modeling and affinity prediction. JCAMD, 27(11), 917-934. Open Access

Jain, A.N., and Cleves, A.E. (2012). Does Your Model Weigh the Same as a Duck? JCAMD, 26, 57-67.

Jain, A.N. (2010). QMOD: Physically Meaningful QSAR. JCAMD, 24, 865-878. Open Access

Langham, J.J., Cleves, A.E., Spitzer, R., Kirshner, D., and Jain, A.N. (2009). Physical Binding Pocket Induction for Affinity Prediction. J Med Chem, 52: 6107-6125.

Cleves, A.E. and Jain, A.N. (2008). Effects of Inductive Bias on Computational Evaluations of Ligand-Based Modeling and on Drug Discovery. JCAMD, 22, 147-159.

Cleves, A. E. and Jain, A.N. (2006). Robust Ligand-Based Modeling of the Biological Targets of Known Drugs. J Med Chem, 49, 2921-2938.

Jain, A.N., Harris, N.L. & Park, J.Y. (1995). Quantitative binding site model generation: Compass applied to multiple chemotypes targeting the 5-HT1A receptor. J Med Chem 38, 1295-308.

Jain, A.N., Koile, K. & Chapman, D. (1994). Compass: Predicting biological activities from molecular surface properties. Performance comparisons on a steroid benchmark. J Med Chem 37, 2315-27.

Jain, A.N., Dietterich, T.G., Lathrop, R.H., Chapman, D., Critchlow, R.E., Jr., Bauer, B.E., Webster, T.A. & Lozano-Perez, T. (1994). A shape-based machine learning tool for drug design. JCAMD 8, 635-52.