path-search
Builds a continuous minimum-energy path (MEP) from two or more structures in reaction order (R → … → P). Use it when you need a single stitched MEP with automatic refinement. For just two endpoints with no recursive refinement, path-opt is the simpler option.
The path is generated by one of two engines — GSM (default, --mep-mode gsm, string-based) or DMF (--mep-mode dmf, Direct Max Flux) — and refined only where covalent bond changes are detected. Refinement targets either the highest-energy image and its immediate neighbors (HEI±1, --refine-mode peak) or the nearest local minima on each side (--refine-mode minima); the default is peak for GSM and minima for DMF. The resolved subpaths are stitched into one trajectory, and the highest-energy image (HEI) of each segment is exported as a TS candidate (validate with tsopt + IRC).
Recursive decomposition automatically detects multistep reactions and builds a detailed MEP for each elementary step. Complex multistep mechanisms may require manual trial-and-error — adjusting input intermediates, scan specifications, or convergence thresholds — to obtain a satisfactory pathway.
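The default pairing of engine and refinement mode described above can be written as a small lookup. This is a minimal sketch, not the tool's code: `resolve_refine_mode` is a hypothetical helper name, and only the defaults stated above (peak for GSM, minima for DMF) are taken from this page.

```python
from typing import Optional

VALID_MEP_MODES = {"gsm", "dmf"}
VALID_REFINE_MODES = {"peak", "minima"}


def resolve_refine_mode(mep_mode: str, refine_mode: Optional[str] = None) -> str:
    """Return the effective refinement mode (hypothetical helper, for illustration)."""
    if mep_mode not in VALID_MEP_MODES:
        raise ValueError(f"unknown mep mode: {mep_mode!r}")
    if refine_mode is None:
        # Documented defaults: peak for GSM, minima for DMF.
        return "peak" if mep_mode == "gsm" else "minima"
    if refine_mode not in VALID_REFINE_MODES:
        raise ValueError(f"unknown refine mode: {refine_mode!r}")
    return refine_mode
```

An explicit value always wins over the engine-dependent default, so `resolve_refine_mode("dmf", "peak")` returns `"peak"`.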
Examples
Command form:
pdb2reaction path-search -i R.pdb [I.pdb ...] P.pdb [-q CHARGE] [-l, --ligand-charge <number|'RES:Q,...'>] [--multiplicity 2S+1]
[-b/--backend uma|orb|mace|aimnet2] [--solvent SOLVENT] [--solvent-model alpb|cpcmx]
[--workers N] [--workers-per-node N]
[--mep-mode {gsm|dmf}] [--freeze-links/--no-freeze-links] [--thresh PRESET] [--thresh-stopt PRESET]
[--refine-mode {peak|minima}]
[--max-nodes N] [--max-cycles N] [--climb/--no-climb]
[--opt-mode grad|hess] [--dump/--no-dump]
[--out-dir DIR] [--preopt/--no-preopt]
[--align/--no-align] [--ref-full-pdb FILE...] [--ref-pdb FILE...]
[--convert-files/--no-convert-files]
[--show-config/--no-show-config] [--dry-run/--no-dry-run]
Two endpoints (reactant → product):
pdb2reaction path-search -i reactant.pdb product.pdb -q 0 -m 1 \
--out-dir ./result_path_search
Provide explicit intermediates for a multistep path:
pdb2reaction path-search -i R.pdb IM1.pdb IM2.pdb P.pdb -q -1 -m 1 \
--out-dir ./result_path_search_multi
Enable merged full-system outputs with template references:
pdb2reaction path-search -i R.pdb IM1.pdb P.pdb -q 0 -m 1 \
--ref-full-pdb holo_template.pdb --out-dir ./result_path_search_merge
Use DMF mode with minima refinement:
pdb2reaction path-search -i reactant.pdb product.pdb -q 0 -m 1 \
--mep-mode dmf --refine-mode minima --out-dir ./result_path_search_dmf
Workflow
1. Initial segment per pair (GSM/DMF) – run GrowingString or DMF between each adjacent input (A→B) to obtain a coarse MEP and identify the highest-energy image (HEI).
2. Local relaxation around HEI – refine either HEI ± 1 (refine-mode=peak) or the nearest local minima on each side of the HEI (refine-mode=minima) with the chosen single-structure optimizer (opt-mode) to recover nearby minima (End1, End2). Default: when --refine-mode is omitted, it defaults to peak for GSM and minima for DMF.
3. Decide between kink vs. refinement:
   - If no covalent bond change is detected between End1 and End2, treat the region as a kink, a conformational rearrangement with no bond breaking or formation (see Glossary): insert search.kink_max_nodes linear nodes and optimize each individually.
   - Otherwise, the region is a reactive segment, one in which covalent bond changes are detected between the endpoints (see Glossary). Launch a refinement segment (GSM/DMF) between End1 and End2 to sharpen the barrier.
4. Selective recursion – compare bond changes for (A→End1) and (End2→B) using the bond thresholds. Recurse only on sub-intervals that still contain covalent updates. Recursion depth is capped by search.max_depth.
5. Stitching & bridging – concatenate resolved subpaths, dropping duplicate endpoints when RMSD ≤ search.stitch_rmsd_thresh. If the RMSD gap between two stitched pieces exceeds search.bridge_rmsd_thresh, insert a bridge segment (a connecting segment between two non-adjacent intermediates; see Glossary) using GSM/DMF. When the interface itself shows a bond change, a new recursive segment replaces the bridge.
6. Alignment & merging (optional) – with --align (default), pre-optimized structures are rigidly aligned to the first input and freeze_atoms are reconciled. Provide --ref-full-pdb to merge active site model trajectories back into full-size PDB templates (one template per input unless alignment allows reuse of the first file).
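The control flow of the selective recursion can be illustrated with a toy model. This is a sketch under strong assumptions, not the actual implementation: a structure is reduced to its set of bonds, `toy_segment` is a hypothetical stand-in for one GSM/DMF run plus HEI relaxation, and kinks, bridging, and energies are ignored entirely.

```python
from typing import FrozenSet, List, Tuple

Bond = Tuple[int, int]
Structure = FrozenSet[Bond]  # toy stand-in: a structure is just its set of bonds


def bond_changed(a: Structure, b: Structure) -> bool:
    """True when the two toy structures differ in at least one bond."""
    return a != b


def toy_segment(a: Structure, b: Structure) -> Tuple[Structure, Structure]:
    """Hypothetical stand-in for a segment search: return (End1, End2) that
    bracket exactly one of the bond changes separating a and b."""
    diff = sorted(a ^ b)
    k = len(diff) // 2
    end1 = a ^ frozenset(diff[:k])
    end2 = a ^ frozenset(diff[: k + 1])
    return end1, end2


def build_path(a: Structure, b: Structure, depth: int = 0,
               max_depth: int = 10) -> List[Tuple[Structure, Structure]]:
    """Return the ordered elementary steps between a and b."""
    if not bond_changed(a, b):
        return []  # no covalent change: nothing to recurse into in this toy
    if depth >= max_depth:
        return [(a, b)]  # depth cap reached: leave the interval unresolved
    end1, end2 = toy_segment(a, b)
    steps: List[Tuple[Structure, Structure]] = []
    if bond_changed(a, end1):  # recurse only where changes remain
        steps += build_path(a, end1, depth + 1, max_depth)
    steps.append((end1, end2))
    if bond_changed(end2, b):
        steps += build_path(end2, b, depth + 1, max_depth)
    return steps
```

For a reactant and product that differ by four bonds, `build_path` returns four contiguous steps, each changing a single bond. In the real workflow, the segments and their endpoints come from the MEP optimizers rather than from set arithmetic.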
Bond-change detection relies on bond_changes.compare_structures with thresholds surfaced under the bond YAML section. All MLIP backends are constructed once and shared across structures for efficiency.
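To make the role of bond_factor concrete, here is a minimal distance-based sketch. It does not reproduce bond_changes.compare_structures: the function names are hypothetical, the radii table is a small excerpt of standard covalent radii in Å, and margin_fraction and delta_fraction are left out because their exact semantics are not spelled out on this page.

```python
import math
from itertools import combinations
from typing import Dict, List, Set, Tuple

Atom = Tuple[str, Tuple[float, float, float]]  # (element, xyz in Å)

# Small excerpt of covalent radii in Å (illustrative subset only).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}


def bonded_pairs(atoms: List[Atom], bond_factor: float = 1.2) -> Set[Tuple[int, int]]:
    """Pairs (i, j) whose distance is within bond_factor * (r_i + r_j)."""
    pairs = set()
    for (i, (ei, pi)), (j, (ej, pj)) in combinations(enumerate(atoms), 2):
        cutoff = bond_factor * (COVALENT_RADII[ei] + COVALENT_RADII[ej])
        if math.dist(pi, pj) <= cutoff:
            pairs.add((i, j))
    return pairs


def compare_bonds(a: List[Atom], b: List[Atom],
                  bond_factor: float = 1.2) -> Dict[str, Set[Tuple[int, int]]]:
    """Report bonds formed and broken going from structure a to structure b."""
    before = bonded_pairs(a, bond_factor)
    after = bonded_pairs(b, bond_factor)
    return {"formed": after - before, "broken": before - after}
```

A proton moving from one oxygen to another (O–H···O to O···H–O) would show up as one broken and one formed bond. An interval in which both sets are empty corresponds to the "no covalent bond change" case in the workflow.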
Outputs
out_dir/ (default: ./result_path_search/)
├─ mep_trj.xyz # Primary MEP trajectory
├─ mep.pdb # PDB companion when inputs were PDB templates and conversion is enabled
├─ mep.gjf # Gaussian companion when a Gaussian template is detected
├─ mep_w_ref.pdb # Merged full-system MEP (requires ref PDB/template)
├─ mep_seg_XX_trj.xyz # Per-segment MEP trajectory (XYZ)
├─ mep_seg_XX.pdb # Per-segment PDB companion (when conversion is enabled)
├─ mep_seg_XX.gjf # Per-segment Gaussian companion (when a template is detected)
├─ mep_w_ref_seg_XX.pdb # Merged per-segment paths when covalent changes exist (requires ref PDB)
├─ hei_seg_XX.xyz # Per-segment highest-energy image
├─ hei_seg_XX.pdb # HEI PDB companion (when conversion is enabled)
├─ hei_seg_XX.gjf # HEI Gaussian companion (when a template is detected)
├─ hei_w_ref_seg_XX.pdb # Merged HEI in full-system context (requires ref PDB)
├─ summary.json # Barrier and classification summary for every recursive segment
├─ summary.log # Text summary
├─ mep_plot.png # ΔE profile generated via `trj2fig` (kcal/mol, reactant reference)
├─ energy_diagram_MEP.png # Static export of the MEP state-energy diagram (relative to reactant)
└─ seg_NNN_*/ # GSM/DMF dumps, HEI snapshots, kink/refinement diagnostics per segment
Console reports covering resolved configuration blocks (geom, calc, gs, stopt, opt.*, bond, search); see Verbosity levels.
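For post-processing, the primary trajectory can be read with a few lines of Python. The sketch below assumes only the generic multi-frame XYZ layout (atom count, comment line, then one line per atom). What pdb2reaction writes into the comment line is not documented here, so the comment is returned as raw text. `read_xyz_frames` is a hypothetical helper name.

```python
from typing import List, Tuple

Atom = Tuple[str, float, float, float]


def read_xyz_frames(text: str) -> List[Tuple[str, List[Atom]]]:
    """Split multi-frame XYZ text into (comment, atoms) tuples."""
    lines = text.splitlines()
    frames = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # tolerate blank lines between frames
            continue
        n_atoms = int(lines[i].split()[0])
        comment = lines[i + 1] if i + 1 < len(lines) else ""
        atoms = []
        for row in lines[i + 2 : i + 2 + n_atoms]:
            sym, x, y, z = row.split()[:4]
            atoms.append((sym, float(x), float(y), float(z)))
        if len(atoms) != n_atoms:
            raise ValueError("truncated XYZ frame")
        frames.append((comment, atoms))
        i += 2 + n_atoms
    return frames
```

Typical use would be to read `mep_trj.xyz` from the output directory as text and pass it to this function to count images or extract a single frame.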
CLI options
The full flag list is in the generated command reference; the table below covers only the options that need explanation. Do not hand-duplicate the full list here.
The table is grouped by purpose; within each group the most-used options come first.
| Option | Description | Default |
|---|---|---|
| Input & charge | | |
| | Two or more structures in reaction order (reactant → product). Pass all files after a single | Required |
| | Net charge. Required for non- | Required unless template/derivation applies |
| | Either a scalar integer (e.g., | None |
| | Spin multiplicity (2S+1). | |
| Backend & compute | | |
| | MLIP backend. | |
| | MLIP predictor parallelism (workers > 1 disables analytic Hessians; UMA backend only; | |
| | Implicit solvent name for xTB correction (e.g. | |
| | xTB solvent model. | |
| Active-region freezing | | |
| | When loading PDB active site models, freeze the parent atoms of cap hydrogens. See extract for cap-hydrogen details. | |
| | Comma-separated 1-based atom indices to freeze explicitly (e.g., | None |
| MEP search | | |
| | Segment generator: GSM (string-based) or DMF (Direct Max Flux). | |
| | DMF compute backend ( | |
| | Pre-optimize each endpoint with the selected single-structure optimizer (L-BFGS/RFO) before MEP search. | |
| | Internal nodes per MEP segment (GSM string images or DMF images). | |
| | Maximum MEP optimization cycles (GSM/DMF). | |
| | Enable climbing image for GSM segments (bridge segments always run without climbing). | |
| Refinement | | |
| | Seeds for refinement: | Auto |
| | Single-structure optimizer for HEI±1/kink nodes. | |
| Convergence thresholds | | |
| | Override convergence preset for single-structure optimizations only ( | |
| | Override convergence preset for the string optimizer ( | |
| Merge & alignment | | |
| | Align all inputs to the first structure before searching. | |
| | Full-size template PDBs (one per input, unless | None |
| | Active site model reference PDBs used for the final full-system merge when inputs are XYZ/GJF (one per input, matching input order). | None |
| Output & config | | |
| | Output directory. | |
| | Dump MEP (GSM/DMF) and single-structure trajectories. Restart YAML is written only when enabled in YAML. | |
| | Toggle XYZ/TRJ → PDB/GJF companions for PDB or Gaussian inputs. XYZ/GJF inputs do not produce a PDB companion of their own primary trajectory. | |
| | Base YAML configuration layer applied before explicit CLI values. | None |
| | Print resolved configuration (including YAML layer metadata) and continue. | |
| | Validate options and print the execution plan without running path search. | |
See CLI Conventions: Configuration precedence for the full resolution order.
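The layering idea (a YAML base layer that explicit CLI values then override) can be sketched as a recursive dictionary merge. This is an illustration only. It assumes built-in defaults sit below the YAML layer, and the mapping of a flag such as --refine-mode onto search.refine_mode is an assumption made for the example. `deep_merge` and `resolve_config` are hypothetical names, and the authoritative order is the one in CLI Conventions.

```python
from copy import deepcopy
from typing import Any, Dict

Config = Dict[str, Any]


def deep_merge(base: Config, override: Config) -> Config:
    """Return a new mapping where override wins, merging nested sections."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def resolve_config(defaults: Config, yaml_layer: Config, cli_explicit: Config) -> Config:
    """Apply layers lowest to highest: defaults, then YAML, then explicit CLI values."""
    return deep_merge(deep_merge(defaults, yaml_layer), cli_explicit)
```

With defaults `{"search": {"max_depth": 10, "refine_mode": None}}`, a YAML layer setting `search.max_depth` to 5, and an explicit CLI value setting `search.refine_mode` to `"minima"`, the resolved section keeps both overrides and leaves the inputs unmodified.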
YAML configuration
The YAML root must be a mapping. Shared sections reuse YAML Reference: geom/calc mirror single-structure options (with --freeze-links augmenting geom.freeze_atoms for PDBs), and stopt inherits the StringOptimizer knobs documented for path-opt (see path-opt.md).
bond and search are central to the recursion logic and shown below; gs, dmf, stopt, opt.lbfgs, and opt.rfo are reproduced only for the path-search-specific out_dir overrides.
bond carries the MLIP-based bond-change detection parameters shared with scan: device, bond_factor, margin_fraction, and delta_fraction.
path-search-specific overrides
stopt:
  out_dir: ./result_path_search/ # path-search override (canonical default: ./result_path_opt/)
opt:
  lbfgs:
    out_dir: ./result_path_search/ # path-search override (canonical default: ./result_opt/)
  rfo:
    out_dir: ./result_path_search/ # path-search override (canonical default: ./result_opt/)
bond and search are kept here because they are central to the path-search recursion logic:
bond:
  device: auto # MLIP device for bond analysis
  bond_factor: 1.2 # covalent-radius scaling
  margin_fraction: 0.05 # tolerance margin for comparisons
  delta_fraction: 0.05 # minimum relative change to flag bonds
search:
  max_depth: 10 # recursion depth limit
  stitch_rmsd_thresh: 0.0001 # RMSD threshold for stitching segments
  bridge_rmsd_thresh: 0.0001 # RMSD threshold for bridging nodes
  max_nodes_segment: 20 # max nodes per segment
  max_nodes_bridge: 5 # max nodes per bridge
  kink_max_nodes: 3 # max nodes for kink optimizations
  max_seq_kink: 2 # max sequential kinks
  refine_mode: null # optional refinement strategy (auto-chooses when null)
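How the two RMSD thresholds interact during stitching can be sketched as follows. This is a simplified illustration, not the tool's code: frames are plain coordinate lists assumed to be already aligned, RMSD is computed without superposition, and `stitch` only records where a bridge segment would be needed instead of running GSM/DMF. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Frame = Sequence[Tuple[float, float, float]]  # one xyz triple per atom


def rmsd(a: Frame, b: Frame) -> float:
    """Plain RMSD between two pre-aligned frames with identical atom order."""
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))


def stitch(segments: List[List[Frame]],
           stitch_rmsd_thresh: float = 1e-4,
           bridge_rmsd_thresh: float = 1e-4) -> Tuple[List[Frame], List[int]]:
    """Concatenate segments; return the path and the indices needing a bridge."""
    path = list(segments[0])
    bridge_at = []
    for seg in segments[1:]:
        gap = rmsd(path[-1], seg[0])
        if gap <= stitch_rmsd_thresh:
            seg = seg[1:]  # shared endpoint: drop the duplicate frame
        elif gap > bridge_rmsd_thresh:
            bridge_at.append(len(path))  # a bridge segment would be inserted here
        path.extend(seg)
    return path, bridge_at
```

With the default values shown in the YAML above, the two thresholds are equal, so every interface is either treated as a shared endpoint or flagged for bridging.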
See Also
Common Error Recipes: Symptom-first failure routing
Troubleshooting: Detailed troubleshooting guide
path-opt: Single-pass MEP optimization (no recursive refinement)
tsopt: Optimize the HEI as a transition state
extract: Generate active site model PDBs for path-search inputs
all: End-to-end workflow (defaults to single-pass path-opt; --refine-path True for recursive path-search)
YAML Reference: Full gs, dmf, bond, search configuration options
Glossary: Definitions of MEP, GSM, DMF, HEI