Host–Guest Docking Tutorial

This tutorial demonstrates how to use ChemRefine for a host–guest docking workflow, followed by machine-learned force field (MLFF) refinement, DFT validation, and explicit solvation.

We will start with an initial structure (step1.xyz) and progressively refine docking poses through MLFF and DFT optimization.

Overview

  • Step 1 – Docking (DFT)
    Generate five initial docking poses of the guest molecule in the host cavity using xTB-level scoring.

  • Step 2 – MLFF Optimization
    Refine docked structures using the UMA-S-1 MLFF model (omol task).
      • GPU acceleration is enabled (device: cuda).
      • Structures within 10 kcal/mol of the lowest energy are retained.

  • Step 3 – DFT Re-optimization
    The lowest-energy MLFF structure is re-optimized at the DFT level for accuracy.

  • Step 4 – Solvation
    Add explicit solvent molecules around the final optimized host–guest complex for solvation analysis.

  • Step 5 – DFT Calculations
    Run DFT calculations for each solvent molecule to obtain solvation free energies.

Input Files

We start with an initial structure, step1.xyz, located in the templates folder.

ORCA Input Files

You can find the ORCA input files here


1. Input File

Below is a complete example of an input file (input.yaml) for a docking study:

template_dir: ./templates
scratch_dir: /scratch/ganymede2/dal950773/orca_files/
output_dir: ./fixed_charge
orca_executable: /mfs/io/groups/sterling/software-tools/orca/orca_6_1_0_avx2/orca

# Global system settings
charge: 0
multiplicity: 1

# Optional: Override default initial structure
initial_xyz: ./templates/step1.xyz

# === Step-by-step workflow ===
steps:
  # Step 1: Perform docking with DFT
  - step: 1
    operation: "DOCKER"
    engine: "DFT"
    sample_type:
      method: "integer"
      parameters:
        num_structures: 5   # Generate 5 docked structures

  # Step 2: Refine docking poses with MLFF
  - step: 2
    operation: "OPT+SP"
    engine: "MLFF"
    charge: -1
    multiplicity: 1
    mlff:
      model_name: "uma-s-1"
      task_name: "omol"
      device: "cuda"
    sample_type:
      method: "energy_window"
      parameters:
        energy: 10
        unit: kcal/mol

  # Step 3: Validate best candidates with DFT
  - step: 3
    operation: "OPT+SP"
    engine: "DFT"
    charge: -1
    multiplicity: 1
    sample_type:
      method: "integer"
      parameters:
        num_structures: 1

  # Step 4: Solvation refinement
  - step: 4
    operation: "SOLVATOR"
    engine: "DFT"
    sample_type:
      method: "integer"
      parameters:
        num_structures: 0

  # Step 5: DFT calculations for solvation free energies
  - step: 5
    operation: "OPT+SP"
    engine: "DFT"
    charge: -1
    multiplicity: 1
    sample_type:
      method: "energy_window"
      parameters:
        energy: 10
        unit: kcal/mol
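
To make the energy_window setting above concrete, here is a minimal Python sketch of what a 10 kcal/mol window means in practice. It is an illustration only, not ChemRefine's actual implementation: the function name energy_window, the assumption that energies arrive in Hartree, and the example energies are all hypothetical.

```python
# Illustrative sketch only; not ChemRefine's actual selection code.
HARTREE_TO_KCAL_PER_MOL = 627.509474  # standard unit conversion factor


def energy_window(energies_hartree, window_kcal=10.0):
    """Return indices of structures within `window_kcal` of the lowest energy."""
    if not energies_hartree:
        return []
    e_min = min(energies_hartree)
    kept = []
    for index, energy in enumerate(energies_hartree):
        # Energy relative to the minimum, converted to kcal/mol.
        relative_kcal = (energy - e_min) * HARTREE_TO_KCAL_PER_MOL
        if relative_kcal <= window_kcal:
            kept.append(index)
    return kept


# Hypothetical energies (Hartree) for five docked poses.
example_energies = [-1250.4321, -1250.4300, -1250.4105, -1250.4280, -1250.3900]
selected = energy_window(example_energies, window_kcal=10.0)
print(selected)
```

With these made-up numbers, poses 0, 1, and 3 lie within the window and would be passed to the next step, while poses 2 and 4 would be discarded.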


2. Running the Workflow

From the command line:

chemrefine input.yaml --maxcores 16

This runs the workflow locally with up to 16 parallel jobs.

On an HPC cluster with SLURM:

sbatch ./Examples/templates/chemrefine.slurm

4. Expected Outputs

  • Docked poses from Step 1 in outputs/step1/
  • Refined MLFF structures with energies in outputs/step2/
  • Validated DFT structures in outputs/step3/
  • Final solvated complex in outputs/step4/
  • Solvation free energies in outputs/step5/

Each step directory contains .out logs, .xyz geometries, and summary files.
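
The .xyz geometries can be inspected with a few lines of Python. The sketch below assumes the standard XYZ layout (atom count, comment line, then one "element x y z" line per atom, possibly repeated for several frames). The helper name read_xyz_frames and the sample coordinates are hypothetical, and the exact file names ChemRefine writes are not assumed here.

```python
import io


def read_xyz_frames(handle):
    """Yield (comment, atoms) per frame; atoms is a list of (element, x, y, z)."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # skip blank lines between frames
        n_atoms = int(header.split()[0])
        comment = handle.readline().rstrip("\n")
        atoms = []
        for _ in range(n_atoms):
            fields = handle.readline().split()
            element = fields[0]
            x, y, z = (float(value) for value in fields[1:4])
            atoms.append((element, x, y, z))
        yield comment, atoms


# Hypothetical single-frame XYZ content used in place of a real output file.
sample = io.StringIO("""3
water, hypothetical coordinates
O 0.000 0.000 0.117
H 0.000 0.757 -0.471
H 0.000 -0.757 -0.471
""")
frames = list(read_xyz_frames(sample))
print(len(frames), len(frames[0][1]))
```

To read a real geometry, pass an open file handle instead of the StringIO object, for example the handle returned by open() on one of the .xyz files in a step directory.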


5. Notes & Tips

  • Adjust num_structures in Step 1 to explore more docking poses.
  • Use MLFF refinement for speed, then confirm results with DFT.
  • Solvation step can be skipped by removing Step 4.
  • Large jobs should always be submitted via SLURM.