Host–Guest Docking Tutorial
This tutorial demonstrates how to use ChemRefine for a host–guest docking workflow, followed by machine-learning refinement, DFT validation, and explicit solvation.
We will start with an initial structure (step1.xyz) and progressively refine docking poses through MLFF and DFT optimization.
Overview
- Step 1 – Docking (DFT)
  Generate 5 initial docking poses of the guest molecule in the host cavity using xTB-level scoring.
- Step 2 – MLFF Optimization
  Refine the docked structures using the UMA-S-1 MLFF model (omol task).
  - GPU acceleration is enabled (device: cuda).
  - Structures within 10 kcal/mol of the lowest energy are retained.
- Step 3 – DFT Re-optimization
  Re-optimize the lowest-energy MLFF structure at the DFT level for accuracy.
- Step 4 – Solvation
  Add explicit solvent molecules around the final optimized host–guest complex for solvation analysis.
- Step 5 – DFT Calculations
  Run DFT calculations for each solvent molecule to obtain solvation free energies.
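The energy-window filter in Step 2 can be illustrated with a short sketch. This is not ChemRefine's implementation: the function name, the assumption that energies are in Hartree, and the sample energies are all hypothetical and serve only to show what "within 10 kcal/mol of the lowest energy" means in practice.

```python
# Minimal sketch of an energy-window filter (not ChemRefine's own code).
# Assumption: energies are total energies in Hartree, one per structure.

HARTREE_TO_KCAL_PER_MOL = 627.509474


def select_within_window(energies_hartree, window_kcal=10.0):
    """Return indices of structures within window_kcal of the lowest energy."""
    if not energies_hartree:
        return []
    e_min = min(energies_hartree)
    return [
        i
        for i, energy in enumerate(energies_hartree)
        if (energy - e_min) * HARTREE_TO_KCAL_PER_MOL <= window_kcal
    ]


# Hypothetical energies for five docked poses (illustrative values only).
energies = [-1250.4321, -1250.4300, -1250.4100, -1250.4318, -1250.3900]
kept = select_within_window(energies, window_kcal=10.0)
print(kept)  # [0, 1, 3]
```

Poses 2 and 4 lie roughly 14 and 26 kcal/mol above the minimum, so they fall outside the 10 kcal/mol window and are dropped.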
Input Files
We start with an initial structure, step1.xyz, located in the templates folder.
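Before launching a workflow, it can be useful to check that the starting geometry is a well-formed XYZ file. The sketch below parses the standard single-structure XYZ layout (atom count, comment line, then one atom per line). The function name is hypothetical, and the water geometry is only a stand-in, not the tutorial's host–guest structure.

```python
# Minimal sketch of a single-structure XYZ parser (hypothetical helper).

def parse_xyz(text):
    """Parse XYZ text into (comment, [(symbol, x, y, z), ...])."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0].split()[0])
    comment = lines[1] if len(lines) > 1 else ""
    atom_lines = lines[2:2 + n_atoms]
    if len(atom_lines) != n_atoms:
        raise ValueError(
            f"expected {n_atoms} atoms, found {len(atom_lines)}"
        )
    atoms = []
    for line in atom_lines:
        symbol, x, y, z = line.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))
    return comment, atoms


# Stand-in geometry for illustration only.
example = """3
water (illustrative, not the tutorial structure)
O  0.000  0.000  0.117
H  0.000  0.757 -0.469
H  0.000 -0.757 -0.469
"""
comment, atoms = parse_xyz(example)
print(len(atoms), atoms[0][0])
```

For the real file, the same function could be applied to the contents of ./templates/step1.xyz, for example via `open(path).read()`.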
ORCA Input Files
You can find the ORCA input files here
1. Input File
Below is a complete example of an input file (input.yaml) for a docking study:
template_dir: ./templates
scratch_dir: /scratch/ganymede2/dal950773/orca_files/
output_dir: ./fixed_charge
orca_executable: /mfs/io/groups/sterling/software-tools/orca/orca_6_1_0_avx2/orca

# Global system settings
charge: 0
multiplicity: 1

# Optional: Override default initial structure
initial_xyz: ./templates/step1.xyz

# === Step-by-step workflow ===
steps:
  # Step 1: Perform docking with DFT
  - step: 1
    operation: "DOCKER"
    engine: "DFT"
    sample_type:
      method: "integer"
      parameters:
        num_structures: 5  # Generate 5 docked structures

  # Step 2: Refine docking poses with MLFF
  - step: 2
    operation: "OPT+SP"
    engine: "MLFF"
    charge: -1
    multiplicity: 1
    mlff:
      model_name: "uma-s-1"
      task_name: "omol"
      device: "cuda"
    sample_type:
      method: "energy_window"
      parameters:
        energy: 10
        unit: kcal/mol

  # Step 3: Validate best candidates with DFT
  - step: 3
    operation: "OPT+SP"
    engine: "DFT"
    charge: -1
    multiplicity: 1
    sample_type:
      method: "integer"
      parameters:
        num_structures: 1

  # Step 4: Solvation refinement
  - step: 4
    operation: "SOLVATOR"
    engine: "DFT"
    sample_type:
      method: "integer"
      parameters:
        num_structures: 0

  - step: 5
    engine: "DFT"
    operation: "OPT+SP"
    charge: -1
    multiplicity: 1
    sample_type:
      method: "energy_window"
      parameters:
        energy: 10
        unit: kcal/mol
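The input file sets a global charge and multiplicity and then repeats those keys inside some steps. The sketch below shows one plausible reading, in which step-level values take precedence over the global ones. This precedence rule is an assumption made for illustration and is not confirmed here against ChemRefine's behavior. The dictionary mirrors only the relevant keys of the input file, and the function name is hypothetical.

```python
# Sketch of per-step charge/multiplicity resolution.
# Assumption (not verified against ChemRefine): step-level keys override
# the global settings when present.

config = {
    "charge": 0,
    "multiplicity": 1,
    "steps": [
        {"step": 1, "operation": "DOCKER", "engine": "DFT"},
        {"step": 2, "operation": "OPT+SP", "engine": "MLFF",
         "charge": -1, "multiplicity": 1},
        {"step": 3, "operation": "OPT+SP", "engine": "DFT",
         "charge": -1, "multiplicity": 1},
        {"step": 4, "operation": "SOLVATOR", "engine": "DFT"},
        {"step": 5, "operation": "OPT+SP", "engine": "DFT",
         "charge": -1, "multiplicity": 1},
    ],
}


def resolve_charge_multiplicity(cfg):
    """Map each step number to its effective (charge, multiplicity)."""
    resolved = {}
    for step in cfg["steps"]:
        charge = step.get("charge", cfg["charge"])
        multiplicity = step.get("multiplicity", cfg["multiplicity"])
        resolved[step["step"]] = (charge, multiplicity)
    return resolved


effective = resolve_charge_multiplicity(config)
for number, (charge, multiplicity) in sorted(effective.items()):
    print(f"step {number}: charge={charge}, multiplicity={multiplicity}")
```

Under this assumption, Steps 1 and 4 would run with the global charge of 0, while Steps 2, 3, and 5 would use a charge of -1.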
2. Running the Workflow
From the command line:
chemrefine input.yaml --maxcores 16
This runs the workflow locally with up to 16 parallel jobs.
On an HPC cluster with SLURM:
sbatch ./Examples/templates/chemrefine.slurm
4. Expected Outputs
- Docked poses from Step 1 in outputs/step1/
- Refined MLFF structures with energies in outputs/step2/
- Validated DFT structures in outputs/step3/
- Final solvated complex in outputs/step4/
- Solvation free energies in outputs/step5/
Each step directory contains .out logs, .xyz geometries, and summary files.
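To check quickly which steps have produced results, the step directories can be scanned for geometries and logs. The sketch below assumes the outputs/stepN/ layout listed above. The function name is hypothetical, and the names of the summary files are not assumed, so only .xyz and .out files are collected.

```python
# Sketch: list .xyz and .out files per step directory (hypothetical helper).
from pathlib import Path


def summarize_outputs(output_root, n_steps=5):
    """Return {"stepN": {"xyz": [...], "out": [...]}} or None if missing."""
    summary = {}
    for number in range(1, n_steps + 1):
        step_dir = Path(output_root) / f"step{number}"
        if not step_dir.is_dir():
            summary[f"step{number}"] = None  # step not run yet
            continue
        summary[f"step{number}"] = {
            "xyz": sorted(p.name for p in step_dir.glob("*.xyz")),
            "out": sorted(p.name for p in step_dir.glob("*.out")),
        }
    return summary


for name, files in summarize_outputs("outputs").items():
    if files is None:
        print(f"{name}: no directory found")
    else:
        print(f"{name}: {len(files['xyz'])} geometries, {len(files['out'])} logs")
```

If the run used a different output_dir, pass that path in place of "outputs".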
5. Notes & Tips
- Adjust num_structures in Step 1 to explore more docking poses.
- Use MLFF refinement for speed, then confirm results with DFT.
- The solvation step can be skipped by removing Step 4.
- Large jobs should always be submitted via SLURM.