scan3d

Perform a three-dimensional (d1, d2, d3) grid scan with harmonic restraints and ML/MM relaxations on a layered enzyme PDB, mapping a 3D potential energy surface (PES) across three coupled distances.

mlmm scan3d nests loops over d1, d2, and d3 and relaxes each grid point with the ML/MM calculator (mlmm.backends.mlmm_calc.mlmm) under the restraints active at that loop level. The ML region comes from --model-pdb, and Amber parameters are read from --parm. The MLIP backend is selected via -b/--backend (default: uma), and the optimizer is PySisyphus LBFGS.

Pass the scan targets with -s/--scan-lists, either as a YAML/JSON spec file (recommended) or as an inline Python literal. To re-plot a precomputed surface without re-running the scan, load it with --csv.

Examples

# Minimal: run a 3D scan from a YAML spec
mlmm scan3d -i input.pdb --parm real.parm7 --model-pdb ml_region.pdb \
 -q 0 -s scan3d.yaml -o ./result_scan3d/

(Add --print-parsed to validate the parsed scan spec and exit without running the GPU calculation.)

# Recommended: YAML/JSON spec
cat > scan3d.yaml << 'YAML'
one_based: true
pairs:
 - [12, 45, 1.30, 3.10]
 - [10, 55, 1.20, 3.20]
 - [15, 60, 1.10, 3.00]
YAML
mlmm scan3d -i input.pdb --parm real.parm7 --model-pdb ml_region.pdb \
 -q 0 -s scan3d.yaml --print-parsed

# Inline Python literal with pre-optimization, --dump, and a custom output directory
mlmm scan3d -i input.pdb --parm real.parm7 --model-pdb ml_region.pdb \
 -q 0 -s '[(12,45,1.30,3.10),(10,55,1.20,3.20),(15,60,1.10,3.00)]' \
 --max-step-size 0.20 --dump -o ./result_scan3d/ \
 --preopt --baseline min
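
A spec file can also be generated from a script. The sketch below builds a JSON spec with the same one_based and pairs keys as the YAML example above. It assumes the JSON form mirrors the YAML layout one-to-one; the helper name build_scan3d_spec is hypothetical and not part of mlmm.

```python
import json


def build_scan3d_spec(pairs, one_based=True):
    """Return a spec dict with the keys used in the YAML example (assumed layout)."""
    if len(pairs) != 3:
        raise ValueError("scan3d needs exactly three (i, j, low, high) quadruples")
    for quad in pairs:
        if len(quad) != 4:
            raise ValueError(f"expected 4 entries per quadruple, got {quad!r}")
    return {"one_based": one_based, "pairs": [list(q) for q in pairs]}


spec = build_scan3d_spec(
    [(12, 45, 1.30, 3.10), (10, 55, 1.20, 3.20), (15, 60, 1.10, 3.00)]
)
# Serialize; write this text to a file such as scan3d.json and pass it to -s.
spec_text = json.dumps(spec, indent=2)
print(spec_text)
```

Running the result through --print-parsed before the real scan is a cheap way to confirm that the file was interpreted as intended.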

Workflow

  1. Load the structure through geom_loader, resolve charge/spin from CLI, and optionally run an unbiased preoptimization when --preopt.

  2. Parse targets from -s/--scan-lists (YAML/JSON spec file or inline literal; default 1-based indices unless --zero-based is passed) into three quadruples. For PDB inputs, each atom entry can be an integer index or a selector string like "TYR,285,CA"; delimiters may be spaces, commas, slashes, backticks, or backslashes.

  3. Outer loop over d1[i]: relax with only the d1 restraint active, starting from the previously scanned geometry whose d1 value is closest.

  4. Middle loop over d2[j]: relax with d1 and d2 restraints, starting from the closest (d1, d2) geometry.

  5. Inner loop over d3[k]: relax with all three restraints, measure the unbiased energy (bias removed for evaluation), and write the constrained geometry and convergence flag.

  6. After the scan completes, assemble surface.csv, apply the kcal/mol baseline shift (--baseline {min|first}), and generate a 3D RBF-interpolated isosurface plot (scan3d_density.html) honoring --zmin/--zmax.
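
The loop structure in steps 3 to 6 can be sketched in plain Python. This is a toy illustration, not the mlmm implementation: relax and energy are hypothetical callables standing in for the restrained ML/MM relaxation and the unbiased energy evaluation, the grid spacing rule in axis_values is an assumption about how --max-step-size is applied, and the inner loop is assumed to start each d3 point from the previous one.

```python
import math


def axis_values(low, high, max_step):
    """Evenly spaced targets from low to high with spacing <= max_step (assumed rule)."""
    n = max(1, math.ceil(abs(high - low) / max_step - 1e-9))
    return [low + (high - low) * t / n for t in range(n + 1)]


def nearest(done, target):
    """Return the stored geometry whose restraint targets are closest to target."""
    key = min(done, key=lambda t: math.dist(t, target))
    return done[key]


def scan3d_sketch(start, axes, relax, energy, baseline="min"):
    d1s, d2s, d3s = axes
    outer, middle, rows = {}, {}, []
    for a in d1s:  # outer loop: only the d1 restraint is active
        seed = nearest(outer, (a,)) if outer else start
        outer[(a,)] = g1 = relax(seed, (a,))
        for b in d2s:  # middle loop: d1 and d2 restraints
            seed = nearest(middle, (a, b)) if middle else g1
            middle[(a, b)] = g2 = relax(seed, (a, b))
            g3 = g2
            for c in d3s:  # inner loop: all three restraints
                g3 = relax(g3, (a, b, c))
                rows.append((a, b, c, energy(g3)))  # energy without the bias term
    # Baseline shift: global minimum, or the first grid point.
    ref = min(r[3] for r in rows) if baseline == "min" else rows[0][3]
    return [(a, b, c, e - ref) for a, b, c, e in rows]


# Toy stand-ins: a "geometry" is just a tuple of three distances.
def toy_relax(geom, targets):
    return tuple(targets) + tuple(geom[len(targets):])


def toy_energy(geom):
    return sum((x - 2.0) ** 2 for x in geom)


axes = [axis_values(1.30, 3.10, 0.20),
        axis_values(1.20, 3.20, 0.20),
        axis_values(1.10, 3.00, 0.20)]
surface = scan3d_sketch((1.3, 1.2, 1.1), axes, toy_relax, toy_energy)
print(len(surface), "grid points")
```

The number of relaxations grows as the product of the three axis lengths, so halving --max-step-size increases the cost roughly eightfold.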

Outputs

out_dir/ (default: ./result_scan3d/)
 surface.csv # Grid metadata (d1, d2, d3, energy, convergence)
 scan3d_density.html # 3D energy isosurface visualization
 grid/point_i###_j###_k###.xyz # Relaxed geometry for each grid point
 grid/point_i###_j###_k###.pdb # PDB companions (B-factors: ML=0, Movable-MM=10, Frozen=20)
 grid/inner_path_d1_###_d2_###_trj.xyz # Written only when --dump is set

Filename tags i###_j###_k### are integer hundredths of an angstrom (d1×100, d2×100, d3×100), not step indices.
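
The following sketch shows how such a tag maps to distances and back. The rounding and three-digit zero padding are assumptions made for illustration; both helper names are hypothetical.

```python
import re


def point_tag(d1, d2, d3):
    """Build an i###_j###_k### tag from distances in angstrom (hundredths, rounded)."""
    i, j, k = (int(round(d * 100)) for d in (d1, d2, d3))
    return f"i{i:03d}_j{j:03d}_k{k:03d}"


def distances_from_name(filename):
    """Recover (d1, d2, d3) in angstrom from a grid/point_*.xyz or .pdb file name."""
    m = re.fullmatch(r"point_i(\d+)_j(\d+)_k(\d+)\.(?:xyz|pdb)", filename)
    if m is None:
        raise ValueError(f"not a grid point file name: {filename!r}")
    return tuple(int(g) / 100 for g in m.groups())


print(point_tag(1.30, 2.05, 3.10))                      # i130_j205_k310
print(distances_from_name("point_i130_j205_k310.xyz"))  # (1.3, 2.05, 3.1)
```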

CLI options

| Option | Description | Default |
| --- | --- | --- |
| -i, --input PATH | Full enzyme PDB (no link atoms). | Required unless --csv |
| --parm PATH | Amber parm7 topology for the full enzyme. | Required unless --csv |
| --model-pdb PATH | PDB defining the ML region. | None |
| --model-indices TEXT | Explicit ML-region atom indices (alternative to --model-pdb). | None |
| --model-indices-one-based / --model-indices-zero-based | Indexing convention for --model-indices. | True (1-based) |
| --detect-layer / --no-detect-layer | Auto-detect ML/MM layers from B-factors. | True |
| -q, --charge INT | ML-region net charge. | None (required unless -l or --csv is given) |
| -l, --ligand-charge TEXT | Per-resname charge mapping (e.g., GPP:-3,SAM:1). Derives total charge when -q is omitted. | None |
| -m, --multiplicity INT | Spin multiplicity (2S+1). | 1 |
| --freeze-atoms TEXT | 1-based comma-separated frozen atom indices. | None |
| --hess-cutoff FLOAT | Distance cutoff (Å) from ML region for MM atoms to include in Hessian calculation. Can be combined with --detect-layer. | None |
| --movable-cutoff FLOAT | Distance cutoff (Å) from ML region for movable MM atoms. Providing this disables --detect-layer. | None |
| -s, --scan-lists TEXT | Scan targets: a YAML/JSON spec file path (auto-detected, with pairs containing 3 quadruples) or an inline Python literal with three quadruples (i,j,low,high). i/j can be integer indices or PDB atom selectors. | Required |
| --csv FILE | Load precomputed surface.csv and generate plot without running a scan. | None |
| --one-based / --zero-based | Interpret (i, j) indices as 1- or 0-based. | True (1-based) |
| --print-parsed / --no-print-parsed | Print parsed pair tuples after -s/--scan-lists resolution. | False |
| --max-step-size FLOAT | Maximum distance increment per step (Å). Controls grid density. | 0.20 |
| --bias-k FLOAT | Harmonic well strength k (eV/Å²). | 300.0 |
| --relax-max-cycles INT | Maximum optimizer cycles during each biased relaxation. | 10000 |
| --dump / --no-dump | Write inner d3 scan TRJs per (d1, d2) slice. | False |
| -o, --out-dir TEXT | Output directory root for grids and plots. | ./result_scan3d/ |
| --thresh TEXT | Convergence preset override (gau_loose, gau, gau_tight, gau_vtight, baker, never). | baker |
| --config FILE | Base YAML configuration file (applied first). | None |
| --ref-pdb FILE | Reference PDB topology for non-PDB inputs. | None |
| --preopt / --no-preopt | Run an unbiased optimization before scanning. | False |
| --baseline {min,first} | Shift kcal/mol energies so the global min or (i,j,k)=(0,0,0) is zero. | min |
| --zmin FLOAT | Manual lower limit for the isosurface color bands (kcal/mol). | Autoscaled |
| --zmax FLOAT | Manual upper limit for the isosurface color bands (kcal/mol). | Autoscaled |
| -b, --backend CHOICE | MLIP backend for the ML region: uma, orb, mace, aimnet2. | uma |
| --embedcharge / --no-embedcharge | Enable xTB point-charge embedding correction for MM-to-ML environmental effects (experimental). | False |
| --embedcharge-cutoff FLOAT | Cutoff radius (Å) for embed-charge MM atoms. | 12.0 |
| --cmap / --no-cmap | Enable CMAP (backbone cross-map dihedral correction) in model parm7. Default: disabled (consistent with Gaussian ONIOM). | --no-cmap |
| --mm-backend {hessian_ff,openmm} | MM backend (analytical Hessian vs OpenMM finite-difference). | hessian_ff |
| --link-atom-method {scaled,fixed} | Link-atom placement: scaled (g-factor) or fixed 1.09/1.01 Å. | scaled |
| --out-json / --no-out-json | Write machine-readable result.json to out_dir. | False |
| --convert-files / --no-convert-files | Toggle XYZ/TRJ to PDB companions when a PDB template is available. | True |

For the complete flag list, see the generated command reference.

Scan-list syntax

Inline literal format

When -s/--scan-lists receives a value that is not a file path, it is treated as a single Python literal string. Shell quoting matters.

The literal is a Python list of exactly three quadruples (atom1, atom2, low_A, high_A):

-s '[(atom1, atom2, low_A, high_A), (atom3, atom4, low_A, high_A), (atom5, atom6, low_A, high_A)]'
  • Wrap the entire literal in single quotes so the shell does not interpret parentheses or spaces.

  • Each quadruple defines one scan axis: the distance between atom1 and atom2 is scanned from low_A to high_A.

  • Unlike scan, only one literal is accepted (no multi-stage support).
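
To check a literal before passing it on the command line, the format described above can be validated with the standard library. This is a minimal sketch that mirrors the documented rules (one list, exactly three quadruples, atoms as integers or strings); it is not the parser mlmm uses, and parse_scan_literal is a hypothetical name.

```python
import ast


def parse_scan_literal(text):
    """Parse '[(a1, a2, low, high), ...]' into three validated quadruples."""
    value = ast.literal_eval(text)  # safe: evaluates literals only, no code
    if not isinstance(value, (list, tuple)) or len(value) != 3:
        raise ValueError("expected a list of exactly three quadruples")
    quads = []
    for item in value:
        if not isinstance(item, (list, tuple)) or len(item) != 4:
            raise ValueError(f"expected (atom1, atom2, low_A, high_A), got {item!r}")
        atom1, atom2, low, high = item
        for atom in (atom1, atom2):
            # bool is a subclass of int, so reject it explicitly
            if isinstance(atom, bool) or not isinstance(atom, (int, str)):
                raise ValueError(f"atom must be an index or selector string: {atom!r}")
        quads.append((atom1, atom2, float(low), float(high)))
    return quads


quads = parse_scan_literal(
    '[("TYR,285,CA", "MMT,309,C10", 1.30, 3.10), (2, 8, 1.20, 3.20), (3, 12, 1.10, 3.00)]'
)
for q in quads:
    print(q)
```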

Atoms can be given as integer indices or PDB selector strings:

| Method | Example | Notes |
| --- | --- | --- |
| Integer index | (1, 5, 1.30, 3.10) | 1-based by default (--one-based) |
| PDB selector | ("TYR,285,CA", "MMT,309,C10", 1.30, 3.10) | Residue name, residue number, atom name |

PDB selector tokens can be separated by any of: comma (,), space, slash (/), backtick (`), or backslash (\). Token order is flexible.

# All of these specify the same atom:
"TYR,285,CA"
"TYR 285 CA"
"TYR/285/CA"
"285,TYR,CA" # order is flexible

Quoting rules:

# Correct: single-quote the list, double-quote selector strings inside
-s '[("TYR,285,CA","MMT,309,C10",1.30,3.10),("TYR,285,CB","MMT,309,C11",1.20,3.20),("TYR,285,CG","MMT,309,C12",1.10,3.00)]'

# Correct: integer indices need no inner quotes
-s '[(1, 5, 1.30, 3.10), (2, 8, 1.20, 3.20), (3, 12, 1.10, 3.00)]'

# Avoid: double-quoting the outer literal requires escaping inner quotes
-s "[(\"TYR,285,CA\",\"MMT,309,C10\",1.30,3.10),...]"
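
The PDB selector strings used in this section can be tokenized with a single regular expression over the listed delimiters. The sketch below is illustrative only: it assumes the one all-digit token is the residue number and that, of the remaining two tokens, the residue name precedes the atom name. How mlmm actually resolves flexible token order is not specified here, and split_selector is a hypothetical name.

```python
import re

# Comma, whitespace, slash, backtick, or backslash (one or more in a row).
_DELIMS = re.compile(r"[,\s/`\\]+")


def split_selector(selector):
    """Split a selector like 'TYR,285,CA' into (resname, resnum, atom_name)."""
    tokens = [t for t in _DELIMS.split(selector.strip()) if t]
    if len(tokens) != 3:
        raise ValueError(f"expected three tokens, got {tokens!r}")
    number_positions = [n for n, t in enumerate(tokens) if t.isdigit()]
    if len(number_positions) != 1:
        raise ValueError(f"expected exactly one residue number in {tokens!r}")
    pos = number_positions[0]
    resnum = int(tokens[pos])
    # Assumption: remaining tokens keep their order, residue name first.
    resname, atom_name = tokens[:pos] + tokens[pos + 1:]
    return resname, resnum, atom_name


for s in ["TYR,285,CA", "TYR 285 CA", "TYR/285/CA", "285,TYR,CA"]:
    print(s, "->", split_selector(s))
```

All four inputs yield ('TYR', 285, 'CA') under these assumptions.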

YAML configuration

geom:
 coord_type: cart
 freeze_atoms: []
calc:
 charge: 0
 spin: 1
mlmm:
 real_parm7: real.parm7
 model_pdb: ml_region.pdb
opt:
 thresh: baker
 max_cycles: 10000
 dump: false
 out_dir: ./result_scan3d/
lbfgs:
 max_step: 0.3
 out_dir: ./result_scan3d/
bias:
 k: 300.0

See Also