path-opt

Overview

Summary: Find a minimum-energy path (MEP) between exactly two structures with the Growing String Method (GSM, default) or Direct Max Flux (DMF, --mep-mode dmf). Writes the path trajectory and exports the highest-energy image (HEI) as a TS candidate.

At a glance

  • Use when: You have reactant and product endpoints (R → P) and want a first-pass MEP.

  • Method: GSM by default; switch to DMF with --mep-mode dmf.

  • Outputs: final_geometries_trj.xyz (path) and hei.xyz (HEI), plus optional .pdb/.gjf companions when conversion is enabled.

  • Defaults: --opt-mode grad (LBFGS), --climb, --max-nodes 20, --thresh gau, --thresh-stopt gau_loose.

  • Next step: Optimize the HEI with tsopt (includes imaginary-frequency check; expect one imaginary frequency) → irc.

pdb2reaction path-opt searches for a minimum-energy path (MEP) between two endpoints and reports the highest-energy image (HEI). Treat the HEI as a candidate transition state until it is validated with tsopt (which includes an imaginary-frequency check) and irc. For workflows that start from two or more structures and automatically refine only the reactive region, use path-search.

When to use path-opt vs path-search: Use path-opt when you have exactly 2 endpoint structures and want MEP optimization without recursive refinement. Use path-search when you have 2 or more structures and want automatic recursive refinement of regions with bond changes.

An MLIP backend (UMA by default; switch with -b/--backend to ORB, MACE, or AIMNet2) provides energies, gradients, and Hessians for every image. Before optimization starts, a rigid-body alignment step keeps the string stable; if you define freeze_atoms, only those atoms are used for the RMSD fit (the transform is still applied to all atoms).

Note

Frozen atoms in DMF mode use HarmonicFixAtoms (harmonic restraints with k=300 eV/Ų) instead of pysisyphus’s hard coordinate freeze used by GSM. This means frozen atoms in DMF can move slightly from their reference positions, which differs from the rigid freeze in GSM mode.
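
To get a feel for how far a restrained atom can drift, the sketch below evaluates a harmonic restraint with the k = 300 eV/Å² quoted above. It assumes the common form E = ½·k·|r − r0|², which this page does not confirm for HarmonicFixAtoms. The function names are hypothetical and not part of pdb2reaction.

```python
def harmonic_restraint(pos, ref, k=300.0):
    """Energy (eV) and force (eV/Å) on one restrained atom.

    ASSUMPTION: E = 0.5 * k * |pos - ref|^2. The exact prefactor used by
    HarmonicFixAtoms is not stated in this page.
    """
    disp = [p - r for p, r in zip(pos, ref)]
    energy = 0.5 * k * sum(d * d for d in disp)
    force = [-k * d for d in disp]
    return energy, force


def equilibrium_shift(external_force, k=300.0):
    """Displacement (Å) at which the restraint balances a constant force (eV/Å)."""
    return [f / k for f in external_force]


# A 0.1 Å displacement along x costs about 1.5 eV under this assumed form.
energy, force = harmonic_restraint([0.1, 0.0, 0.0], [0.0, 0.0, 0.0])

# A steady 0.3 eV/Å pull shifts the atom by about 0.001 Å.
shift = equilibrium_shift([0.3, 0.0, 0.0])
```

Under this assumption, typical residual forces near convergence move a restrained atom by thousandths of an ångström, which is small but not zero as it would be with a hard freeze.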

Minimal example

pdb2reaction path-opt -i reactant.pdb product.pdb -q 0 -m 1 \
 --out-dir ./result_path_opt

Output checklist

  • result_path_opt/final_geometries_trj.xyz

  • result_path_opt/hei.xyz

  • result_path_opt/hei.pdb (when PDB conversion is available)
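
The checklist above can be automated after a run. This is a minimal sketch that only tests for the file names listed on this page; the helper name check_outputs is hypothetical and not part of pdb2reaction.

```python
from pathlib import Path

# File names taken from the output checklist on this page.
REQUIRED = ("final_geometries_trj.xyz", "hei.xyz")
OPTIONAL = ("hei.pdb",)  # only written when PDB conversion is available


def check_outputs(out_dir):
    """Return (missing_required, present_optional) for an output directory."""
    out = Path(out_dir)
    missing = [name for name in REQUIRED if not (out / name).is_file()]
    present = [name for name in OPTIONAL if (out / name).is_file()]
    return missing, present
```

Calling check_outputs("./result_path_opt") after the minimal example should return an empty first list when the run has written both required files.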

Common examples

  1. Pre-optimize endpoints before MEP search.

pdb2reaction path-opt -i reactant.pdb product.pdb -q 0 -m 1 \
 --preopt --preopt-max-cycles 20000 --out-dir ./result_path_opt_preopt

  2. Use DMF mode instead of GSM.

pdb2reaction path-opt -i reactant.pdb product.pdb -q 0 -m 1 \
 --mep-mode dmf --max-nodes 12 --out-dir ./result_path_opt_dmf

  3. Freeze link parents and disable climb for a quick pass.

pdb2reaction path-opt -i reactant.pdb product.pdb -q 0 -m 1 \
 --freeze-links --no-climb --out-dir ./result_path_opt_fast

Usage

pdb2reaction path-opt -i REACTANT.{pdb|xyz} PRODUCT.{pdb|xyz} [-q CHARGE] [-l/--ligand-charge <number|'RES:Q,...'>] [-m MULT] \
 [-b/--backend uma|orb|mace|aimnet2] [--solvent SOLVENT] [--solvent-model alpb|cpcmx] \
 [--workers N] [--workers-per-node N] \
 [--mep-mode {gsm|dmf}] [--freeze-links/--no-freeze-links] [--freeze-atoms 'I,J,...'] [--max-nodes N] [--max-cycles N] \
 [--climb/--no-climb] [--dump/--no-dump] [--thresh PRESET] [--thresh-stopt PRESET] \
 [--preopt/--no-preopt] [--preopt-max-cycles N] [--opt-mode grad|hess] [--fix-ends/--no-fix-ends] \
 [--config FILE] [--show-config/--no-show-config] [--dry-run/--no-dry-run] \
 [--convert-files/--no-convert-files] [--ref-pdb FILE] [-o/--out-dir DIR] [--out-json/--no-out-json]

Workflow

  1. Pre-alignment & freeze resolution

  • The product is Kabsch-aligned to the reactant (the first structure). If either endpoint defines freeze_atoms, only those atoms participate in the RMSD fit, and the resulting transform is applied to every atom.

  • When --freeze-links is active, link-hydrogen parent atoms are automatically frozen (see Link hydrogen and frozen atoms).
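
The subset-fit behavior described above can be illustrated with a short NumPy sketch of the Kabsch algorithm. This is not the tool's own implementation: the function name kabsch_align and its fit_idx parameter are hypothetical, and indices are 0-based here.

```python
import numpy as np


def kabsch_align(mobile, reference, fit_idx=None):
    """Rigidly align `mobile` onto `reference` (both N x 3 arrays).

    The rotation and translation are fitted on `fit_idx` only (all atoms
    if None) and then applied to every atom of `mobile`.
    """
    mobile = np.asarray(mobile, dtype=float)
    reference = np.asarray(reference, dtype=float)
    idx = np.arange(len(mobile)) if fit_idx is None else np.asarray(fit_idx)

    # Centroids of the fitting subset only.
    mob_c = mobile[idx].mean(axis=0)
    ref_c = reference[idx].mean(axis=0)

    # Covariance matrix of the centered subset and its SVD.
    H = (mobile[idx] - mob_c).T @ (reference[idx] - ref_c)
    U, _, Vt = np.linalg.svd(H)

    # Guard against an improper rotation (reflection).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Apply the transform to all atoms, not just the fitting subset.
    return (mobile - mob_c) @ R.T + ref_c
```

Fitting on the frozen atoms alone keeps the reactive atoms from influencing the superposition, so the displacement that remains after alignment is concentrated where the chemistry happens.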

  2. String growth and HEI export

  • After the path is grown and refined, the tool searches for the highest-energy internal local maximum (preferred). If none exists, it falls back to the maximum among internal nodes; if no internal nodes are present, the global maximum is exported.

  • The highest-energy image (HEI) is always written as .xyz. A .pdb companion is added when a PDB reference exists, and a .gjf companion when a Gaussian template is available; both conversions honor --convert-files.
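
The HEI selection order described in this step can be sketched as a small function over the per-image energies. This is an illustration only: the name select_hei is hypothetical, and the use of strict comparisons for a local maximum is an assumption rather than the tool's documented behavior.

```python
def select_hei(energies):
    """Return the index of the image to export as the HEI."""
    n = len(energies)
    if n == 0:
        raise ValueError("empty path")
    internal = range(1, n - 1)

    # 1) Preferred: highest-energy internal local maximum.
    #    ASSUMPTION: a local maximum is strictly above both neighbors.
    local_max = [
        i for i in internal
        if energies[i] > energies[i - 1] and energies[i] > energies[i + 1]
    ]
    if local_max:
        return max(local_max, key=lambda i: energies[i])

    # 2) Fallback: highest-energy internal node.
    if n > 2:
        return max(internal, key=lambda i: energies[i])

    # 3) No internal nodes: global maximum.
    return max(range(n), key=lambda i: energies[i])
```

Preferring an internal local maximum means that a high-energy endpoint, for example on a strongly endothermic path, is not mistaken for a barrier top.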

Key behaviors

  • Endpoints: Exactly two structures are required. Formats follow geom_loader. PDB inputs (or XYZ/GJF with --ref-pdb) enable trajectory/HEI PDB exports.

  • Charge/spin: Charge is resolved via the standard priority chain (see CLI Conventions: Charge specification for details).

  • MEP segments: --max-nodes controls the number of internal nodes/images. For GSM, total images = max_nodes + 2 (including fixed endpoints). For DMF, max_nodes sets the number of movable images along the chain. GSM growth and optional climbing-image refinement use the StringOptimizer convergence preset from --thresh-stopt or stopt.thresh (gau_loose, gau, gau_tight, gau_vtight, baker, never).

  • Endpoint preoptimization: --thresh controls only the single-structure endpoint optimizer selected by --opt-mode (opt.lbfgs.thresh / opt.rfo.thresh).

  • Climbing image: --climb toggles both the standard climbing step and the Lanczos-based tangent refinement.

  • Dumping: --dump mirrors stopt.dump=True for the StringOptimizer, producing trajectory dumps inside out_dir. Restart YAML is written only when enabled in YAML.

  • Exit codes: See Exit codes in CLI Conventions.

CLI options

| Option | Description | Default |
| --- | --- | --- |
| -i, --input PATH PATH | Reactant and product structures (.pdb/.xyz). | Required |
| -q, --charge INT | Total charge (calc.charge). Required for non-.gjf inputs unless --ligand-charge/-l derivation succeeds (PDB inputs or XYZ/GJF with --ref-pdb). .gjf templates can supply it; if .gjf inputs lack charge metadata, the run aborts unless -q is provided. Overrides --ligand-charge/-l when both are set. | Required unless template/derivation applies |
| -l, --ligand-charge TEXT | Total charge or per-resname mapping used when -q is omitted. Triggers extract-style charge derivation on the full complex for PDB inputs (or XYZ/GJF when --ref-pdb is supplied). | None |
| --workers, --workers-per-node | MLIP predictor parallelism (workers > 1 disables analytic Hessians; UMA backend only; workers_per_node forwarded to the parallel predictor). See workers > 1 silent FD downgrade for diagnostic notes. | 1, 1 |
| -m, --multiplicity INT | Spin multiplicity (calc.spin). | Template/1 |
| --freeze-links/--no-freeze-links | PDB-only: freeze link-H parents (merged with YAML). See extract for link-hydrogen details. | True |
| --freeze-atoms TEXT | Comma-separated 1-based atom indices to freeze explicitly (e.g., '1,3,5'). Complements --freeze-links; applies to any input format. | None |
| --max-nodes INT | Number of internal nodes. GSM: total images = max_nodes + 2 (the two endpoints are fixed). DMF: number of movable images along the chain (no implicit endpoint expansion). | 20 |
| --mep-mode {gsm,dmf} | Select GSM (string-based) or DMF (Direct Max Flux) path generator. | gsm |
| --max-cycles INT | Optimizer macro-iteration cap (stopt.max_cycles). | 300 |
| --climb/--no-climb | Enable climbing-image refinement (and Lanczos tangent). | True |
| --dump/--no-dump | Dump MEP trajectories (GSM/DMF). Restart YAML is written only when enabled in YAML. | False |
| --opt-mode TEXT | Single-structure optimizer for endpoint preoptimization (grad = LBFGS, hess = RFO). | grad |
| --convert-files/--no-convert-files | Toggle XYZ/TRJ → PDB/GJF companions for PDB/Gaussian inputs. | True |
| --ref-pdb FILE | Reference PDB topology for XYZ/GJF inputs (keeps XYZ coordinates) to enable PDB conversions. | None |
| -o, --out-dir TEXT | Output directory. | ./result_path_opt/ |
| --thresh TEXT | Override convergence preset for endpoint preoptimization only (opt.lbfgs/rfo.thresh). | gau |
| --thresh-stopt TEXT | Override convergence preset for the string optimizer (stopt.thresh). | gau_loose |
| --config FILE | Base YAML configuration layer applied before explicit CLI values. | None |
| --show-config/--no-show-config | Print resolved configuration (including YAML layers) and continue. | False |
| -b, --backend {uma,orb,mace,aimnet2} | MLIP backend. | uma |
| --solvent TEXT | Implicit solvent name for xTB correction (e.g. water). none to disable. | none |
| --solvent-model {alpb,cpcmx} | xTB solvent model. | alpb |
| --dry-run/--no-dry-run | Validate options and print the execution plan without running optimization. | False |
| --preopt/--no-preopt | Pre-optimize each endpoint with the selected single-structure optimizer before alignment/MEP search (GSM/DMF). Scope-dependent default: False under standalone path-opt; flipped to True when invoked via pdb2reaction all (see all → MEP Search Options). | False |
| --preopt-max-cycles INT | Cap for endpoint preoptimization cycles. | 10000 |
| --fix-ends/--no-fix-ends | Keep the endpoint geometries fixed during GSM growth/refinement. | False |
| --out-json/--no-out-json | Write a machine-readable result.json to out_dir. See JSON Output Schema for the schema. | False |
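
As an illustration of the --freeze-atoms format ('1,3,5', 1-based), the sketch below parses such a string and merges it with indices coming from another source such as YAML freeze_atoms. The function name parse_freeze_atoms is hypothetical, and the conversion to 0-based indices is an assumption about internal conventions that this page does not state.

```python
def parse_freeze_atoms(text, yaml_freeze=()):
    """Parse a '1,3,5'-style string of 1-based atom indices.

    Returns a sorted list of 0-based indices merged with `yaml_freeze`
    (ASSUMPTION: `yaml_freeze` is already 0-based).
    """
    indices = set(yaml_freeze)
    for token in text.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate stray commas and whitespace
        value = int(token)
        if value < 1:
            raise ValueError(f"atom indices are 1-based, got {value}")
        indices.add(value - 1)
    return sorted(indices)
```

Using a set for the merge removes duplicates when the same atom is listed both explicitly and through another source.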

Outputs

out_dir/
├─ final_geometries_trj.xyz # XYZ path; comment line holds energies when provided
├─ final_geometries_trj.pdb # When a PDB reference is available (input PDB or --ref-pdb) and conversion enabled
├─ hei.xyz # Highest-energy image with its energy on the comment line
├─ hei.pdb # HEI converted to PDB when a PDB reference is available (conversion enabled)
├─ hei.gjf # HEI written using a detected Gaussian template (conversion enabled)
├─ align_refine/ # Intermediate files from the rigid alignment/refinement stage (created when alignment runs)
└─ <optimizer dumps> # Trajectory dumps when --dump (restart YAML only via YAML dump_restart)

Console output echoes the resolved YAML blocks and prints cycle-by-cycle MEP progress (GSM/DMF) with timing information.

See CLI Conventions: Configuration precedence for the full resolution order.
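
For quick post-processing, the energies stored on the comment lines of final_geometries_trj.xyz can be collected with a few lines of Python. This sketch assumes a standard multi-frame XYZ layout and takes the first token on the comment line that parses as a float; the exact comment-line format is not specified on this page, so treat that rule as an assumption. The function name read_xyz_energies is hypothetical.

```python
def read_xyz_energies(text):
    """Return one entry per frame: the comment-line energy, or None."""
    lines = text.splitlines()
    energies = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # skip blank lines between frames
            continue
        natoms = int(lines[i].split()[0])
        comment = lines[i + 1] if i + 1 < len(lines) else ""

        # ASSUMPTION: the energy is the first float-like token.
        energy = None
        for token in comment.split():
            try:
                energy = float(token)
                break
            except ValueError:
                continue
        energies.append(energy)

        # Jump over the count line, the comment line, and the atom lines.
        i += natoms + 2
    return energies
```

Reading the file with pathlib.Path("result_path_opt/final_geometries_trj.xyz").read_text() and passing the result to read_xyz_energies gives a list with one value per image. Frames without a parsable energy yield None.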

Note

Reference duplication. The YAML keys for geom, calc, gs, dmf, stopt, opt.lbfgs, and opt.rfo are defined canonically in YAML Reference. When the two pages disagree, the canonical YAML Reference entries (and pdb2reaction/defaults.py) take precedence; only path-opt-specific overrides are reproduced below.

YAML sections used by path-opt

See YAML Reference for full key listings:

  • geom: --freeze-links augments freeze_atoms for PDB inputs.

  • calc — MLIP backend setup.

  • gs — Growing String representation (GSM mode).

  • dmf — Direct Max Flux + (C)FB-ENM interpolation (DMF mode).

  • stopt — StringOptimizer settings.

  • opt.lbfgs / opt.rfo — Endpoint single-structure preoptimization. YAML overrides CLI --preopt-max-cycles.

path-opt-specific defaults

The following keys differ from the canonical defaults when invoked via path-opt:

stopt:
 out_dir: ./result_path_opt/ # output directory (path-opt default)
opt:
 lbfgs:
   out_dir: ./result_path_opt/ # output directory (path-opt default)
 rfo:
   out_dir: ./result_path_opt/ # output directory (path-opt default)

See Also

  • Common Error Recipes – Symptom-first failure routing

  • Troubleshooting – Detailed troubleshooting guide

  • path-search — Recursive MEP search with automatic refinement (for 2+ structures)

  • tsopt — Optimize the HEI as a TS candidate (includes imaginary-frequency check; follow with IRC)

  • extract — Generate active site model (binding pocket) PDBs for path-opt inputs

  • all — End-to-end workflow (uses recursive path-search by default; add --refine-path False for single-pass path-opt. The --refine-path flag lives on pdb2reaction all only — see all.md → MEP Search Options for its definition.)

  • YAML Reference — Full gs, dmf, stopt, opt configuration options

  • Glossary — Definitions of MEP, GSM, DMF, HEI