scan¶
Overview¶
Summary: Drive a reaction coordinate by scanning bond distances with harmonic restraints. Use --scan-lists to define target distances. Multiple stages run sequentially, each starting from the previous stage’s relaxed result.
At a glance¶
- Use when: You have a single structure and want to push specific distances to explore a plausible path (often before path-search/path-opt).
- Input: One structure + one or more --scan-lists literals (each literal = one stage).
- Defaults: --opt-mode light (LBFGS), --preopt True, --endopt True, --max-step-size 0.20 Å.
- Outputs: Per-stage result.xyz (+ optional .pdb/.gjf), and optional concatenated trajectories when --dump True.
- Note: --scan-lists is parsed as a Python literal; quoting/escaping matters (see examples).
pdb2reaction scan performs a staged, bond-length–driven scan using the UMA calculator and harmonic restraints. At each step, the temporary targets are updated, restraint wells are applied, and the structure is relaxed with LBFGS (--opt-mode light) or RFOptimizer (--opt-mode heavy).
When you provide multiple --scan-lists literals after a single flag, stages run sequentially and each stage starts from the previous stage’s relaxed structure. After the biased walk, optional unbiased pre-/post-optimizations (--preopt, --endopt) can clean up geometries before writing result.* to disk.
For XYZ/GJF inputs, --ref-pdb supplies a reference PDB topology while keeping XYZ coordinates, enabling format-aware PDB/GJF output conversion.
Usage¶
pdb2reaction scan -i INPUT.{pdb|xyz|trj|...} [-q CHARGE] [--ligand-charge <number|'RES:Q,...'>] [-m MULT] \
--scan-lists '[(i,j,targetÅ), ...]' [options] \
[--convert-files {True|False}] [--ref-pdb FILE]
Examples¶
# Single-stage, minimal inputs
pdb2reaction scan -i input.pdb -q 0 --scan-lists '[("TYR,285,CA","MMT,309,C10",1.35)]'
# Two stages, LBFGS relaxations, and trajectory dumping
pdb2reaction scan -i input.pdb -q 0 --scan-lists \
'[("TYR,285,CA","MMT,309,C10",1.35)]' \
'[("TYR,285,CA","MMT,309,C10",2.20),("TYR,285,CB","MMT,309,C11",1.80)]' \
--max-step-size 0.20 --dump True --out-dir ./result_scan/ --opt-mode light \
--preopt True --endopt True
# Supply multiple stage literals after a single --scan-lists
pdb2reaction scan -i input.pdb -q 0 --scan-lists \
'[("TYR,285,CA","MMT,309,C10",1.35)]' \
'[("TYR,285,CA","MMT,309,C10",2.20),("TYR,285,CB","MMT,309,C11",1.80)]'
--scan-lists format¶
--scan-lists accepts Python literal strings evaluated by the CLI. Shell quoting matters.
Basic structure¶
Each literal is a Python list of triples (atom1, atom2, target_Å):
--scan-lists '[(atom1, atom2, target_Å), ...]'
Wrap the entire literal in single quotes so the shell does not interpret parentheses or spaces.
- Each triple drives the distance between atom1–atom2 toward target_Å.
- One literal = one stage. For multiple stages, pass multiple literals after a single --scan-lists flag (do not repeat the flag).
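The shape of a stage literal can be checked before launching a run. The following is a minimal sketch, assuming the literal is read with Python's standard `ast.literal_eval`; the helper name `parse_scan_stage` is hypothetical and is not part of pdb2reaction, whose own parser may differ.

```python
import ast

def parse_scan_stage(literal):
    """Parse one --scan-lists literal into a list of (atom1, atom2, target) triples."""
    # literal_eval only evaluates Python literals; it never executes code.
    stage = ast.literal_eval(literal)
    if not isinstance(stage, (list, tuple)):
        raise ValueError("a stage must be a list of triples")
    triples = []
    for entry in stage:
        if len(entry) != 3:
            raise ValueError(f"expected (atom1, atom2, target), got {entry!r}")
        atom1, atom2, target = entry
        target = float(target)
        if target <= 0:
            raise ValueError("target distances must be positive")
        triples.append((atom1, atom2, target))
    return triples

stage = parse_scan_stage('[("TYR,285,CA","MMT,309,C10",1.35)]')
print(stage)  # [('TYR,285,CA', 'MMT,309,C10', 1.35)]
```

Running a check like this on each literal catches quoting mistakes early, since a literal mangled by the shell usually fails to parse.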
Specifying atoms¶
Atoms can be given as integer indices or PDB selector strings:
| Method | Example | Notes |
|---|---|---|
| Integer index | 1 | 1-based by default |
| PDB selector | "TYR,285,CA" | Residue name, residue number, atom name |
PDB selector tokens can be separated by any of: comma ,, space, slash /, backtick `, or backslash \. Token order is flexible.
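As an illustration of the delimiter rules just described, here is a minimal tokenizer sketch. It assumes that the single integer token is the residue number and that the two remaining tokens appear in the order residue name, atom name; the function `split_selector` is hypothetical, and the real tool may instead resolve tokens against the atoms in the PDB file.

```python
import re

# Delimiters named in the text: comma, whitespace, slash, backtick, backslash.
_DELIMITERS = re.compile(r"[,\s/`\\]+")

def split_selector(selector):
    """Split a selector string into (resname, resseq, atom_name)."""
    tokens = [t for t in _DELIMITERS.split(selector.strip()) if t]
    if len(tokens) != 3:
        raise ValueError(f"expected three tokens, got {tokens!r}")
    # Assumption: exactly one token is an integer, and it is the residue number.
    numeric = [i for i, t in enumerate(tokens) if re.fullmatch(r"-?[0-9]+", t)]
    if len(numeric) != 1:
        raise ValueError(f"expected exactly one residue number in {tokens!r}")
    resseq = int(tokens[numeric[0]])
    # Assumption: remaining tokens are ordered residue name, then atom name.
    resname, atom_name = [t for i, t in enumerate(tokens) if i != numeric[0]]
    return resname, resseq, atom_name

for text in ("TYR,285,CA", "TYR 285 CA", "TYR/285/CA", "285,TYR,CA"):
    print(split_selector(text))  # ('TYR', 285, 'CA') each time
```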
# All of these specify the same atom:
"TYR,285,CA"
"TYR 285 CA"
"TYR/285/CA"
"285,TYR,CA" # order is flexible
Quoting rules¶
# Correct: single-quote the list, double-quote selector strings inside
--scan-lists '[("TYR,285,CA","MMT,309,C10",1.35)]'
# Correct: integer indices need no inner quotes
--scan-lists '[(1, 5, 2.0)]'
# Avoid: double-quoting the outer literal requires escaping inner quotes
--scan-lists "[(\"TYR,285,CA\",\"MMT,309,C10\",1.35)]"
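The effect of each quoting style can be previewed without running the tool. The sketch below uses Python's standard `shlex.split`, which approximates POSIX shell word splitting, to show what the program would receive as arguments. The third string is an extra illustration (not one of the forms above) of what happens when the inner quotes are left unescaped.

```python
import shlex

# Single-quoted outer literal, double-quoted selectors (recommended form).
single_quoted = """--scan-lists '[("TYR,285,CA","MMT,309,C10",1.35)]'"""
# Double-quoted outer literal with escaped inner quotes (works, but noisy).
escaped = r'''--scan-lists "[(\"TYR,285,CA\",\"MMT,309,C10\",1.35)]"'''
# Double quotes inside double quotes without escaping: the quotes are consumed.
unescaped = '''--scan-lists "[("TYR,285,CA","MMT,309,C10",1.35)]"'''

good_tokens = shlex.split(single_quoted)
escaped_tokens = shlex.split(escaped)
broken_tokens = shlex.split(unescaped)

print(good_tokens[1])    # the literal arrives intact, inner quotes included
print(escaped_tokens[1])  # same text as the single-quoted form
print(broken_tokens[1])   # inner quotes are gone, so it is no longer a list of strings
```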
Multiple stages¶
Pass multiple literals after a single --scan-lists flag. Each literal becomes one stage:
# Stage 1: drive one bond to 1.35 Å
# Stage 2: drive two bonds simultaneously
--scan-lists \
'[("TYR,285,CA","MMT,309,C10",1.35)]' \
'[("TYR,285,CA","MMT,309,C10",2.20),("TYR,285,CB","MMT,309,C11",1.80)]'
Stages run sequentially; each starts from the previous stage’s relaxed result. Do not repeat the --scan-lists flag — supply all stage literals after a single flag.
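When the command is launched from a script, shell quoting can be sidestepped by building the argument list directly. This is a sketch under the assumption that the Python `repr` of a list of tuples is an acceptable stage literal (it is a valid Python literal, using single-quoted strings); only flags shown in this page are used, and the command is printed rather than executed.

```python
import shlex

# One list per stage, mirroring the two-stage example above.
stages = [
    [("TYR,285,CA", "MMT,309,C10", 1.35)],
    [("TYR,285,CA", "MMT,309,C10", 2.20), ("TYR,285,CB", "MMT,309,C11", 1.80)],
]

# A single --scan-lists flag followed by one literal per stage.
argv = ["pdb2reaction", "scan", "-i", "input.pdb", "-q", "0", "--scan-lists"]
argv += [repr(stage) for stage in stages]

# shlex.join produces a copy-pasteable, correctly quoted shell command.
print(shlex.join(argv))
```

Passing `argv` to `subprocess.run(argv, check=True)` would start the scan without any shell involved, so no escaping is needed.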
Workflow¶
1. Load the structure through geom_loader, resolving charge/spin from the CLI overrides, the embedded Gaussian template (if present), or defaults. If -q is omitted but --ligand-charge is provided, the input is treated as an enzyme–substrate complex and extract.py’s charge summary derives the total charge before any scans.
2. Optionally run an unbiased preoptimization (--preopt True) before any biasing so the starting point is relaxed.
3. For each stage literal supplied via --scan-lists, parse and normalize the (i, j) indices (1-based by default). When the input is a PDB, each entry may be either an integer index or an atom selector string like 'TYR,285,CA'; selector fields can be separated by spaces, commas, slashes, backticks, or backslashes and may be in any order (fallback assumes resname, resseq, atom). Compute the per-bond displacement Δ = target − current and split it into N = ceil(max(|Δ|) / h) steps using h = --max-step-size. Every bond receives its own δ = Δ / N increment.
4. March through all steps, updating the temporary targets, applying the harmonic wells E = Σ ½ k (|ri − rj| − target)², and minimizing with UMA. Optimizer cycles are capped by --relax-max-cycles unless YAML specifies opt.max_cycles.
5. After the last step of each stage, optionally run an unbiased relaxation (--endopt True) before reporting covalent bond changes and writing the result.* files.
6. Repeat for every stage; optional trajectories are dumped only when --dump is True.
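The step-splitting arithmetic in the workflow (Δ = target − current, N = ceil(max|Δ| / h), δ = Δ / N) can be sketched as follows. The function `plan_stage` is hypothetical, and the `max(1, ...)` guard for the case where every bond is already at its target is an assumption, not documented behavior.

```python
import math

def plan_stage(current, targets, max_step=0.20):
    """Return (n_steps, increments, schedule) for one stage.

    current, targets: distances in Å, one entry per scanned bond.
    max_step: the h of the text, i.e. --max-step-size in Å.
    """
    deltas = [t - c for c, t in zip(current, targets)]
    largest = max(abs(d) for d in deltas)
    # Assumption: take at least one step even if nothing needs to move.
    n_steps = max(1, math.ceil(largest / max_step))
    # Every bond gets its own increment so all bonds arrive together.
    increments = [d / n_steps for d in deltas]
    schedule = [
        [c + step * inc for c, inc in zip(current, increments)]
        for step in range(1, n_steps + 1)
    ]
    return n_steps, increments, schedule

# Two bonds, shortened from 3.0 to 1.5 Å and from 2.5 to 2.0 Å, with h = 0.25 Å.
n_steps, increments, schedule = plan_stage([3.0, 2.5], [1.5, 2.0], max_step=0.25)
print(n_steps)       # 6, because the largest change is 1.5 Å and 1.5 / 0.25 = 6
print(schedule[0])   # temporary targets used in the first biased relaxation
```

Because N is set by the largest displacement, the bond with the smaller change moves in proportionally smaller increments and no bond exceeds h per step.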
CLI options¶
| Option | Description | Default |
|---|---|---|
| -i | Structure file accepted by geom_loader. | Required |
| -q/--charge | Total charge (CLI > template). When omitted, charge can be inferred from --ligand-charge. | Required unless a .gjf template or --ligand-charge is provided |
| --ligand-charge | Total charge or per-resname mapping used when -q is omitted. | None |
|  | UMA predictor parallelism (workers > 1 disables analytic Hessians). |  |
| -m | Spin multiplicity 2S+1. Inherits the .gjf metadata when available. | 1 |
| --scan-lists | Python literal with (i, j, target_Å) triples; one literal per stage. | Required |
|  | Interpret atom indices as 1- or 0-based. |  |
| --max-step-size | Maximum change in any scanned bond per step (Å). Controls the number of integration steps. | 0.20 |
|  | Harmonic bias strength k (eV·Å⁻²). |  |
| --relax-max-cycles | Cap on optimizer cycles during preopt, each biased step, and end-of-stage cleanups. Used unless YAML sets opt.max_cycles. |  |
| --opt-mode | light (LBFGS) or heavy (RFOptimizer). | light |
| --freeze-links | When the input is PDB, freeze the parents of link hydrogens. |  |
| --dump | Dump concatenated biased trajectories (scan.trj). |  |
| --convert-files | Toggle XYZ/TRJ → PDB/GJF companions for PDB/Gaussian inputs (trajectory conversion only writes PDB). |  |
| --ref-pdb | Reference PDB topology to use when the input is XYZ/GJF (keeps XYZ coordinates). | None |
| --out-dir | Output directory root. | ./result_scan/ |
|  | Convergence preset override. |  |
| --args-yaml | YAML overrides for geom, calc, opt, lbfgs, rfo, bias, and bond. | None |
| --preopt | Run an unbiased optimization before scanning. | True |
| --endopt | Run an unbiased optimization after each stage. | True |
Section bias¶
- k (300): Harmonic strength in eV·Å⁻².
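To make the role of k concrete, the restraint energy E = Σ ½ k (|ri − rj| − target)² from the workflow can be evaluated by hand. This is a minimal sketch with a hypothetical helper `harmonic_bias`; it only computes the bias term and says nothing about how the tool combines it with the UMA energy.

```python
import math

def harmonic_bias(coords, restraints, k=300.0):
    """Sum of harmonic wells over all restrained pairs.

    coords: list of (x, y, z) positions in Å.
    restraints: list of (i, j, target) with 0-based indices and target in Å.
    k: harmonic strength in eV·Å⁻², so the result is in eV.
    """
    energy = 0.0
    for i, j, target in restraints:
        distance = math.dist(coords[i], coords[j])
        energy += 0.5 * k * (distance - target) ** 2
    return energy

# Two atoms 2.0 Å apart, restrained toward 1.5 Å.
coords = [(0.0, 0.0, 0.0), (0.0, 0.0, 2.0)]
print(harmonic_bias(coords, [(0, 1, 1.5)]))  # 37.5, i.e. 0.5 * 300 * 0.5**2
```

With the default k, a 0.20 Å offset (one full step) costs 0.5 × 300 × 0.04 = 6 eV, which is why each step pulls the restrained distance close to its temporary target.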
Section bond¶
UMA-based bond-change detection shared with path-search:
- device ("cuda"): UMA device for graph analysis.
- bond_factor (1.20): Covalent-radius scaling for cutoff.
- margin_fraction (0.05): Fractional tolerance for comparisons.
- delta_fraction (0.05): Minimum relative change to flag formation/breaking.
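One plausible reading of bond_factor and delta_fraction is shown below as a purely geometric sketch. It is an assumption, not the tool's implementation: a pair counts as bonded when its distance is within bond_factor times the sum of covalent radii, and a change is flagged only if the bonded state flips and the distance changed by at least delta_fraction relative to its starting value. The radii table is illustrative, margin_fraction is not modelled, and the function names are hypothetical.

```python
# Illustrative covalent radii in Å for a few common elements.
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}

def is_bonded(elem_a, elem_b, distance, bond_factor=1.20):
    """Distance test against a scaled sum of covalent radii."""
    cutoff = bond_factor * (COVALENT_RADII[elem_a] + COVALENT_RADII[elem_b])
    return distance <= cutoff

def classify_change(elem_a, elem_b, d_before, d_after,
                    bond_factor=1.20, delta_fraction=0.05):
    """Return 'formed', 'broken', or 'unchanged' for one atom pair."""
    before = is_bonded(elem_a, elem_b, d_before, bond_factor)
    after = is_bonded(elem_a, elem_b, d_after, bond_factor)
    relative_change = abs(d_after - d_before) / d_before
    # Assumption: tiny moves across the cutoff are not reported.
    if before == after or relative_change < delta_fraction:
        return "unchanged"
    return "formed" if after else "broken"

print(classify_change("C", "C", 3.00, 1.55))  # formed
print(classify_change("C", "O", 1.43, 2.60))  # broken
```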
Outputs¶
out_dir/ (default: ./result_scan/)
├─ preopt/ # Present when --preopt is True
│ ├─ result.xyz
│ ├─ result.pdb # PDB companion for PDB inputs when conversion is enabled
│ └─ result.gjf # When a Gaussian template exists and conversion is enabled
└─ stage_XX/ # One folder per stage
  ├─ result.xyz
  ├─ result.pdb # PDB mirror of the final structure (conversion enabled)
  ├─ result.gjf # Gaussian mirror when templates exist and conversion is enabled
  ├─ scan.trj # Written when --dump is True
  └─ scan.pdb # Trajectory companion for PDB inputs when conversion is enabled (no scan.gjf is produced)
Console summaries of the resolved geom, calc, opt, bias, bond, and optimizer blocks, plus per-stage bond-change reports.
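For post-processing, the per-stage results in the layout above can be gathered with a short script. This sketch assumes only the folder pattern shown (stage_XX/result.xyz under the output directory); the helper `collect_stage_results` is hypothetical, and the lexicographic sort is correct only if the stage numbers are zero-padded.

```python
from pathlib import Path

def collect_stage_results(out_dir="./result_scan/"):
    """Return the result.xyz path of every stage folder, sorted by folder name."""
    root = Path(out_dir)
    if not root.is_dir():
        return []
    # Sorting is lexicographic, so it relies on zero-padded stage numbers.
    return sorted(root.glob("stage_*/result.xyz"))

for path in collect_stage_results():
    print(path)
```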
Notes¶
- Provide multiple literals after a single --scan-lists flag; repeated flags are not accepted. Tuples must have positive targets. Atom indices are normalized to 0-based internally. For PDB inputs, i/j can be selector strings with flexible delimiters (space/comma/slash/backtick/backslash) and unordered tokens.
- --freeze-links augments user freeze_atoms by adding parents of link-H atoms in PDB files so pockets stay rigid.
- Charge inherits Gaussian template metadata when available. For non-.gjf inputs, -q/--charge is required unless --ligand-charge is provided (supported for PDB inputs or XYZ/GJF with --ref-pdb); explicit -q still overrides. Multiplicity inherits .gjf metadata when available, otherwise defaults to 1.
- Stage results (result.xyz plus optional PDB/GJF companions) are written regardless of --dump; trajectories are written only when --dump is True and converted to scan.pdb (PDB inputs only) when conversion is enabled.
YAML configuration (--args-yaml)¶
The YAML root must be a mapping. YAML parameters override CLI values. Shared sections reuse the definitions documented in the YAML Reference.
geom:
  coord_type: cart # coordinate type: cartesian vs dlc internals
  freeze_atoms: [] # 0-based frozen atoms merged with CLI/link detection
calc:
  charge: 0 # total charge (CLI/template override)
  spin: 1 # spin multiplicity 2S+1
  model: uma-s-1p1 # UMA model tag
  task_name: omol # UMA task name
  device: auto # UMA device selection
  max_neigh: null # maximum neighbors for graph construction
  radius: null # cutoff radius for neighbor search
  r_edges: false # store radial edges
  out_hess_torch: true # request torch-form Hessian
  freeze_atoms: null # calculator-level frozen atoms
  hessian_calc_mode: FiniteDifference # Hessian mode selection
  return_partial_hessian: false # full Hessian (avoids shape mismatches)
opt:
  thresh: gau # convergence preset (Gaussian/Baker-style)
  max_cycles: 10000 # optimizer cycle cap
  print_every: 100 # logging stride
  min_step_norm: 1.0e-08 # minimum norm for step acceptance
  assert_min_step: true # stop if steps fall below threshold
  rms_force: null # explicit RMS force target
  rms_force_only: false # rely only on RMS force convergence
  max_force_only: false # rely only on max force convergence
  force_only: false # skip displacement checks
  converge_to_geom_rms_thresh: 0.05 # geom RMS threshold when converging to ref
  overachieve_factor: 0.0 # factor to tighten thresholds
  check_eigval_structure: false # validate Hessian eigenstructure
  line_search: true # enable line search
  dump: false # dump trajectory/restart data
  dump_restart: false # dump restart checkpoints
  prefix: "" # filename prefix
  out_dir: ./result_scan/ # output directory
lbfgs:
  thresh: gau # LBFGS convergence preset
  max_cycles: 10000 # iteration limit
  print_every: 100 # logging stride
  min_step_norm: 1.0e-08 # minimum accepted step norm
  assert_min_step: true # assert when steps stagnate
  rms_force: null # explicit RMS force target
  rms_force_only: false # rely only on RMS force convergence
  max_force_only: false # rely only on max force convergence
  force_only: false # skip displacement checks
  converge_to_geom_rms_thresh: 0.05 # RMS threshold when targeting geometry
  overachieve_factor: 0.0 # tighten thresholds
  check_eigval_structure: false # validate Hessian eigenstructure
  line_search: true # enable line search
  dump: false # dump trajectory/restart data
  dump_restart: false # dump restart checkpoints
  prefix: "" # filename prefix
  out_dir: ./result_scan/ # output directory
  keep_last: 7 # history size for LBFGS buffers
  beta: 1.0 # initial damping beta
  gamma_mult: false # multiplicative gamma update toggle
  max_step: 0.3 # maximum step length
  control_step: true # control step length adaptively
  double_damp: true # double damping safeguard
  mu_reg: null # regularization strength
  max_mu_reg_adaptions: 10 # cap on mu adaptations
rfo:
  thresh: gau # RFOptimizer convergence preset
  max_cycles: 10000 # iteration cap
  print_every: 100 # logging stride
  min_step_norm: 1.0e-08 # minimum accepted step norm
  assert_min_step: true # assert when steps stagnate
  rms_force: null # explicit RMS force target
  rms_force_only: false # rely only on RMS force convergence
  max_force_only: false # rely only on max force convergence
  force_only: false # skip displacement checks
  converge_to_geom_rms_thresh: 0.05 # RMS threshold when targeting geometry
  overachieve_factor: 0.0 # tighten thresholds
  check_eigval_structure: false # validate Hessian eigenstructure
  line_search: true # enable line search
  dump: false # dump trajectory/restart data
  dump_restart: false # dump restart checkpoints
  prefix: "" # filename prefix
  out_dir: ./result_scan/ # output directory
  trust_radius: 0.1 # trust-region radius
  trust_update: true # enable trust-region updates
  trust_min: 0.0 # minimum trust radius
  trust_max: 0.1 # maximum trust radius
  max_energy_incr: null # allowed energy increase per step
  hessian_update: bfgs # Hessian update scheme
  hessian_init: calc # Hessian initialization source
  hessian_recalc: 200 # rebuild Hessian every N steps
  hessian_recalc_adapt: null # adaptive Hessian rebuild factor
  small_eigval_thresh: 1.0e-08 # eigenvalue threshold for stability
  alpha0: 1.0 # initial micro step
  max_micro_cycles: 50 # micro-iteration limit
  rfo_overlaps: false # enable RFO overlaps
  gediis: false # enable GEDIIS
  gdiis: true # enable GDIIS
  gdiis_thresh: 0.0025 # GDIIS acceptance threshold
  gediis_thresh: 0.01 # GEDIIS acceptance threshold
  gdiis_test_direction: true # test descent direction before DIIS
  adapt_step_func: true # adaptive step scaling toggle
bias:
  k: 300 # harmonic bias strength (eV·Å⁻²)
bond:
  device: cuda # UMA device for bond analysis
  bond_factor: 1.2 # covalent-radius scaling
  margin_fraction: 0.05 # tolerance margin for comparisons
  delta_fraction: 0.05 # minimum relative change to flag bonds
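The precedence rule for this section (YAML values win over CLI values) can be pictured as a recursive merge of two nested mappings. The sketch below is an assumption about the general idea, not the tool's code: `deep_merge` is a hypothetical helper, and the dictionaries stand in for settings that would normally come from the command line and from a parsed --args-yaml file.

```python
def deep_merge(base, override):
    """Return a copy of base with override applied section by section."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections so untouched keys keep their base values.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

# Stand-ins for CLI-derived settings and a small YAML override file.
cli_settings = {"bias": {"k": 300}, "opt": {"max_cycles": 10000, "thresh": "gau"}}
yaml_overrides = {"bias": {"k": 150}, "opt": {"max_cycles": 500}}

resolved = deep_merge(cli_settings, yaml_overrides)
print(resolved)  # {'bias': {'k': 150}, 'opt': {'max_cycles': 500, 'thresh': 'gau'}}
```

In practice this means an override file only needs the keys that differ from the defaults, for example a two-line file that sets bias.k.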
See Also¶
- all: End-to-end workflow with --scan-lists for single-structure inputs
- path-search: MEP search using scan endpoints as intermediates
- extract: Generate pocket PDBs before scanning
- YAML Reference: Full bias and bond configuration options
- Glossary: Definitions of MEP, Segment