path-search¶
Overview¶
Summary: Build a continuous MEP from two or more structures with GSM (default) or DMF (--mep-mode dmf). The command automatically refines only regions with bond changes and exports the highest-energy image (HEI) as a TS candidate (validate with tsopt + IRC).
At a glance¶
Use when: You have R → … → P structures (2+ inputs) and want a single stitched MEP with automatic refinement.
Method: Chains GSM/DMF segments and recursively refines only sub-intervals that still contain covalent changes.
Outputs: mep_trj.xyz (main trajectory), summary.json (segment-by-segment results), and optional plots/merged PDBs when enabled.
Defaults: --mep-mode gsm, --opt-mode grad (LBFGS), --no-preopt, --align, --thresh gau, --thresh-stopt gau_loose.
Next step: HEI output alone does not validate a TS. Follow with tsopt (includes imaginary-frequency check) and irc.
pdb2reaction path-search builds a continuous minimum-energy path (MEP) across two or more structures using GSM (default) or DMF (--mep-mode dmf). It selectively refines only those regions where covalent bond changes are detected, then stitches the resolved subpaths into a single trajectory.
When --convert-files is enabled (default), the command mirrors trajectories into .pdb companions when PDB references exist, and writes .gjf companions for HEI snapshots when Gaussian templates exist. For XYZ/GJF inputs, --ref-pdb supplies a PDB topology at the active site model (binding pocket) level while keeping the XYZ coordinates, and --ref-full-pdb enables full-template merges (XYZ/GJF inputs still do not produce PDB companions).
The recursive decomposition automatically detects multistep reactions and builds a detailed MEP for each elementary step. However, complex multistep mechanisms may require manual trial-and-error—adjusting input intermediates, scan specifications, or convergence thresholds—to obtain a satisfactory pathway.
If you only have two endpoints and do not need recursive refinement, path-opt is the simpler option.
Minimal example¶
pdb2reaction path-search -i reactant.pdb product.pdb -q 0 -m 1 \
--out-dir ./result_path_search
Output checklist¶
result_path_search/mep_trj.xyz
result_path_search/summary.json
result_path_search/summary.log
result_path_search/mep_plot.png (when plotting succeeds)
Common examples¶
Provide explicit intermediates for a multistep path.
pdb2reaction path-search -i R.pdb IM1.pdb IM2.pdb P.pdb -q -1 -m 1 \
--out-dir ./result_path_search_multi
Enable merged full-system outputs with template references.
pdb2reaction path-search -i R.pdb IM1.pdb P.pdb -q 0 -m 1 \
--ref-full-pdb holo_template.pdb --out-dir ./result_path_search_merge
Use DMF mode with minima refinement.
pdb2reaction path-search -i reactant.pdb product.pdb -q 0 -m 1 \
--mep-mode dmf --refine-mode minima --out-dir ./result_path_search_dmf
Usage¶
pdb2reaction path-search -i R.pdb [I.pdb ...] P.pdb [-q CHARGE] [-l, --ligand-charge <number|'RES:Q,...'>] [--multiplicity 2S+1]
[-b/--backend uma|orb|mace|aimnet2] [--solvent SOLVENT] [--solvent-model alpb|cpcmx]
[--workers N] [--workers-per-node N]
[--mep-mode {gsm|dmf}] [--freeze-links/--no-freeze-links] [--thresh PRESET] [--thresh-stopt PRESET]
[--refine-mode {peak|minima}]
[--max-nodes N] [--max-cycles N] [--climb/--no-climb]
[--opt-mode grad|hess] [--dump/--no-dump]
[--out-dir DIR] [--preopt/--no-preopt]
[--align/--no-align] [--ref-full-pdb FILE...] [--ref-pdb FILE...]
[--convert-files/--no-convert-files]
[--show-config/--no-show-config] [--dry-run/--no-dry-run]
Examples¶
Active site model-only MEP between two endpoints:
pdb2reaction path-search -i reactant.pdb product.pdb -q 0
Multistep search with merged full-system output:
pdb2reaction path-search -i R.pdb IM1.pdb IM2.pdb P.pdb -q -1 \
--ref-full-pdb holo_template.pdb --out-dir ./run_ps
CLI options¶
| Option | Description | Default |
|---|---|---|
| `-i, --input` | Two or more structures in reaction order (reactant → product). Pass all files after a single `-i`. | Required |
| `-q` | Net charge. Required for non-… | Required unless template/derivation applies |
| `-l, --ligand-charge` | Per-residue charge mapping (format: `RES:Q,...`). | None |
| `--workers` | MLIP predictor parallelism (workers > 1 disables analytic Hessians; UMA backend only; …). | |
| `-m, --multiplicity` | Spin multiplicity (2S+1). | |
| `--freeze-links/--no-freeze-links` | When loading PDB active site models, freeze the parent atoms of link hydrogens. See extract for link-hydrogen details. | |
| | Comma-separated 1-based atom indices to freeze explicitly (e.g., …). | None |
| `--max-nodes` | Internal nodes per MEP segment (GSM string images or DMF images). | |
| `--max-cycles` | Maximum MEP optimization cycles (GSM/DMF). | |
| `--climb/--no-climb` | Enable climbing image for GSM segments (bridge segments always run without climbing). | |
| `--opt-mode` | Single-structure optimizer for HEI±1/kink nodes (`grad` or `hess`). | `grad` |
| `--mep-mode` | Segment generator: GSM (string-based) or DMF (Direct Max Flux). | `gsm` |
| `--refine-mode` | Seeds for refinement: `peak` (HEI ± 1) or `minima` (nearest local minima on each side of the HEI). | Auto (`peak` for GSM, `minima` for DMF) |
| `--dump/--no-dump` | Dump MEP (GSM/DMF) and single-structure trajectories. Restart YAML is written only when enabled in YAML. | |
| `--convert-files/--no-convert-files` | Toggle XYZ/TRJ → PDB/GJF companions for PDB or Gaussian inputs. | `--convert-files` |
| `--out-dir` | Output directory. | `./result_path_search/` |
| `--thresh` | Override convergence preset for single-structure optimizations only (…). | `gau` |
| `--thresh-stopt` | Override convergence preset for the string optimizer (…). | `gau_loose` |
| | Base YAML configuration layer applied before explicit CLI values. | None |
| `--show-config/--no-show-config` | Print resolved configuration (including YAML layer metadata) and continue. | |
| `-b, --backend` | MLIP backend (`uma`, `orb`, `mace`, or `aimnet2`). | |
| `--solvent` | Implicit solvent name for xTB correction (e.g. …). | |
| `--solvent-model` | xTB solvent model (`alpb` or `cpcmx`). | |
| `--dry-run/--no-dry-run` | Validate options and print the execution plan without running path search. | |
| `--preopt/--no-preopt` | Pre-optimize each endpoint before MEP search. Scope-dependent default: … | `--no-preopt` |
| `--align/--no-align` | Align all inputs to the first structure before searching. | `--align` |
| `--ref-full-pdb` | Full-size template PDBs (one per input, unless alignment allows reuse of the first file). | None |
| `--ref-pdb` | Active site model reference PDBs used for the final full-system merge when inputs are XYZ/GJF (one per input, matching input order). | None |
Workflow¶
1. Initial segment per pair (GSM/DMF) – run GrowingString or DMF between each adjacent input (A→B) to obtain a coarse MEP and identify the highest-energy image (HEI).
2. Local relaxation around HEI – refine either HEI ± 1 (refine-mode=peak) or the nearest local minima on each side of the HEI (refine-mode=minima) with the chosen single-structure optimizer (opt-mode) to recover nearby minima (End1, End2). Default: when --refine-mode is omitted, it defaults to peak for GSM and minima for DMF.
3. Decide between kink vs. refinement:
   - If no covalent bond change is detected between End1 and End2, treat the region as a kink – a conformational rearrangement with no bond breaking or formation (see Glossary): insert search.kink_max_nodes linear nodes and optimize each individually.
   - Otherwise, the region is a reactive segment – a segment in which covalent bond changes are detected between the endpoints (see Glossary). Launch a refinement segment (GSM/DMF) between End1 and End2 to sharpen the barrier.
4. Selective recursion – compare bond changes for (A→End1) and (End2→B) using the bond thresholds. Recurse only on sub-intervals that still contain covalent updates. Recursion depth is capped by search.max_depth.
5. Stitching & bridging – concatenate resolved subpaths, dropping duplicate endpoints when RMSD ≤ search.stitch_rmsd_thresh. If the RMSD gap between two stitched pieces exceeds search.bridge_rmsd_thresh, insert a bridge segment – a connecting segment between two non-adjacent intermediates (see Glossary) – using GSM/DMF. When the interface itself shows a bond change, a brand-new recursive segment replaces the bridge.
6. Alignment & merging (optional) – with --align (default), pre-optimized structures are rigidly aligned to the first input and freeze_atoms are reconciled. Provide --ref-full-pdb to merge active site model trajectories back into full-size PDB templates (one template per input unless alignment allows reuse of the first file).
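The HEI and seed selection in the first two workflow steps can be sketched on a plain list of image energies. This is a minimal illustration, not pdb2reaction's implementation: the function names are hypothetical, the HEI is taken over interior images only (an assumption), and "nearest local minima" is approximated by walking downhill from the HEI on each side.

```python
from typing import List, Optional, Tuple


def resolve_refine_mode(mep_mode: str, refine_mode: Optional[str]) -> str:
    """Documented default: 'peak' for GSM, 'minima' for DMF when omitted."""
    if refine_mode is not None:
        return refine_mode
    return "peak" if mep_mode == "gsm" else "minima"


def hei_index(energies: List[float]) -> int:
    """Index of the highest-energy interior image (assumption: endpoints excluded)."""
    if len(energies) < 3:
        raise ValueError("need at least one interior image")
    return max(range(1, len(energies) - 1), key=lambda i: energies[i])


def refinement_seeds(energies: List[float], mode: str) -> Tuple[int, int]:
    """Return the (left, right) image indices used as refinement seeds."""
    hei = hei_index(energies)
    if mode == "peak":
        # Immediate neighbours of the HEI.
        return hei - 1, hei + 1
    # 'minima': walk downhill on each side until the energy stops decreasing.
    left = hei
    while left > 0 and energies[left - 1] < energies[left]:
        left -= 1
    right = hei
    while right < len(energies) - 1 and energies[right + 1] < energies[right]:
        right += 1
    return left, right
```

In the real workflow the seed structures are then relaxed with the single-structure optimizer to obtain End1 and End2; that step is not modelled here.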
Bond-change detection relies on bond_changes.compare_structures with thresholds surfaced under the bond YAML section. MLIP backends are constructed once and shared across all structures for efficiency.
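To make the role of bond_factor concrete, here is a minimal sketch of distance-based bond detection. It assumes two atoms count as bonded when their distance is at most bond_factor times the sum of their covalent radii. The helper names and the small radii table are illustrative, the code does not call bond_changes.compare_structures, and margin_fraction and delta_fraction are left out because their exact use is not described on this page.

```python
import math
from typing import Dict, List, Set, Tuple

# Illustrative covalent radii in Å for a few elements (not the tool's table).
COVALENT_RADII: Dict[str, float] = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}

Atom = Tuple[str, float, float, float]  # (element, x, y, z)


def bonded_pairs(atoms: List[Atom], bond_factor: float = 1.2) -> Set[Tuple[int, int]]:
    """Pairs (i, j), i < j, closer than bond_factor * (r_i + r_j)."""
    pairs = set()
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            elem_i, *pos_i = atoms[i]
            elem_j, *pos_j = atoms[j]
            cutoff = bond_factor * (COVALENT_RADII[elem_i] + COVALENT_RADII[elem_j])
            if math.dist(pos_i, pos_j) <= cutoff:
                pairs.add((i, j))
    return pairs


def bond_changes(a: List[Atom], b: List[Atom], bond_factor: float = 1.2) -> Dict[str, list]:
    """Bonds formed and broken when going from structure a to structure b."""
    before = bonded_pairs(a, bond_factor)
    after = bonded_pairs(b, bond_factor)
    return {"formed": sorted(after - before), "broken": sorted(before - after)}
```

An empty result for both keys would correspond to the kink case in the workflow, while any formed or broken pair would mark a reactive segment.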
Outputs¶
out_dir/ (default: ./result_path_search/)
├─ mep_trj.xyz # Primary MEP trajectory
├─ mep.pdb # PDB companion when inputs were PDB templates and conversion is enabled
├─ mep_w_ref.pdb # Merged full-system MEP (requires ref PDB/template)
├─ mep_w_ref_seg_XX.pdb # Merged per-segment paths when covalent changes exist (requires ref PDB)
├─ summary.json # Barrier and classification summary for every recursive segment
├─ summary.log # Text summary
├─ mep_plot.png # ΔE profile generated via `trj2fig` (kcal/mol, reactant reference)
├─ energy_diagram_MEP.png # Static export of the MEP state-energy diagram (relative to reactant)
└─ seg_000_*/ # GSM/DMF dumps, HEI snapshots, kink/refinement diagnostics per segment
Console reports covering resolved configuration blocks (geom, calc, gs, stopt, opt.*, bond, search).
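For a quick look at mep_trj.xyz without the plotting step, the following sketch extracts per-frame energies and a relative profile in kcal/mol. It rests on an assumption this page does not confirm: that the comment line of each XYZ frame begins with the frame energy in Hartree. If the file stores energies differently, the parsing line needs to change. The function names are hypothetical.

```python
from typing import List

HARTREE_TO_KCAL = 627.509474  # kcal/mol per Hartree


def frame_energies(xyz_text: str) -> List[float]:
    """Read the first token of every frame's comment line as an energy."""
    lines = xyz_text.splitlines()
    energies = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # tolerate blank lines between frames
            continue
        natoms = int(lines[i].split()[0])
        # Assumption: comment line starts with the energy in Hartree.
        energies.append(float(lines[i + 1].split()[0]))
        i += natoms + 2  # atom count line + comment line + atom lines
    return energies


def relative_profile(energies: List[float]) -> List[float]:
    """Energies relative to the first frame (reactant), in kcal/mol."""
    ref = energies[0]
    return [(e - ref) * HARTREE_TO_KCAL for e in energies]
```

With those assumptions, passing the text of result_path_search/mep_trj.xyz to frame_energies and then to relative_profile gives a list whose maximum marks the HEI along the stitched path.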
Notes¶
For symptom-first diagnosis, start with Common Error Recipes, then use Troubleshooting for detailed fixes.
Provide at least two inputs; otherwise the command exits with an “invalid value” error for -i/--input.
Repeat --ref-full-pdb once per file when providing multiple templates; with --align, only the first template is reused for merges.
All MLIP backends are shared across structures for efficiency.
When --dump is set, MEP (GSM/DMF) and single-structure optimizations emit trajectories. Restart YAML is written only when dump_restart is enabled in YAML.
See CLI Conventions: Configuration precedence for the full resolution order.
The YAML root must be a mapping. Shared sections reuse YAML Reference: geom/calc mirror single-structure options (with --freeze-links augmenting geom.freeze_atoms for PDBs), and stopt inherits the StringOptimizer knobs documented for path-opt (see path-opt.md).
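The layering idea (a base YAML mapping applied before explicit CLI values) can be illustrated with a small deep-merge over plain dictionaries. This is only a sketch of the principle: the full resolution order lives in CLI Conventions, the function name is hypothetical, and the mapping of --max-nodes onto gs.max_nodes is an assumption made for the example.

```python
import copy
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return base updated by override, merging nested mappings key by key."""
    if not isinstance(override, dict):
        raise TypeError("configuration root must be a mapping")
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Defaults quoted from this page.
defaults = {
    "bond": {"bond_factor": 1.2},
    "search": {"max_depth": 10, "kink_max_nodes": 3},
}
# Values a user might put in the YAML layer (hypothetical run).
yaml_layer = {"search": {"max_depth": 4}, "gs": {"max_nodes": 8}}
# Explicit CLI value, assumed to map onto gs.max_nodes.
cli_layer = {"gs": {"max_nodes": 12}}

# YAML is applied first, then explicit CLI values win.
resolved = deep_merge(deep_merge(defaults, yaml_layer), cli_layer)
```

Here resolved keeps bond.bond_factor and search.kink_max_nodes from the defaults, takes search.max_depth from the YAML layer, and takes gs.max_nodes from the CLI layer.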
Note
Reference duplication. The YAML keys for geom, calc, gs, dmf, stopt, opt.lbfgs, and opt.rfo listed below mirror the canonical definitions in YAML Reference. When the two pages disagree, the canonical YAML Reference entries (and pdb2reaction/defaults.py) take precedence; the appendix on this page is reproduced inline only for path-search-specific defaults (e.g. out_dir: ./result_path_search/).
gs (Growing String) inherits defaults from pdb2reaction.path_opt.GS_KW with overrides for max_nodes (internal nodes per segment), climb behavior (climb, climb_rms, climb_fixed), and reparameterization cadence (reparam_every_full, reparam_check).
opt houses the single-structure optimizers used for HEI±1 and kink nodes, split into lbfgs and rfo subsections. Each subsection mirrors YAML Reference but defaults to out_dir: ./result_path_search/ and dump: False.
bond carries the MLIP-based bond-change detection parameters shared with scan: device, bond_factor, margin_fraction, and delta_fraction.
dmf bundles Direct Max Flux + (C)FB-ENM controls applied whenever --mep-mode dmf is selected. The defaults mirror the shared DMF_KW dictionary and can be overridden per run.
path-search-specific overrides¶
For full key listings of geom, calc, gs, dmf, stopt, opt.lbfgs, and opt.rfo, see YAML Reference. The defaults below differ from the canonical entries only in out_dir, which points to ./result_path_search/ instead of the per-section default:
stopt:
  out_dir: ./result_path_search/ # path-search override (canonical default: ./result_path_opt/)
opt:
  lbfgs:
    out_dir: ./result_path_search/ # path-search override (canonical default: ./result_opt/)
  rfo:
    out_dir: ./result_path_search/ # path-search override (canonical default: ./result_opt/)
bond and search are kept here because they are central to the path-search recursion logic:
bond:
  device: auto # MLIP device for bond analysis
  bond_factor: 1.2 # covalent-radius scaling
  margin_fraction: 0.05 # tolerance margin for comparisons
  delta_fraction: 0.05 # minimum relative change to flag bonds
search:
  max_depth: 10 # recursion depth limit
  stitch_rmsd_thresh: 0.0001 # RMSD threshold for stitching segments
  bridge_rmsd_thresh: 0.0001 # RMSD threshold for bridging nodes
  max_nodes_segment: 10 # max nodes per segment
  max_nodes_bridge: 5 # max nodes per bridge
  kink_max_nodes: 3 # max nodes for kink optimizations
  max_seq_kink: 2 # max sequential kinks
  refine_mode: null # optional refinement strategy (auto-chooses when null)
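How stitch_rmsd_thresh and bridge_rmsd_thresh interact can be sketched with a toy stitcher. The code below is an assumption-laden illustration rather than the tool's routine: frames are lists of (x, y, z) tuples in identical atom order, RMSD is computed without any alignment, and instead of running GSM/DMF it only records where a bridge segment would be needed. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Frame = Sequence[Tuple[float, float, float]]


def rmsd(a: Frame, b: Frame) -> float:
    """Plain RMSD between two frames with matching atom order (no alignment)."""
    if len(a) != len(b):
        raise ValueError("frames must have the same number of atoms")
    sq = sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb))
    return math.sqrt(sq / len(a))


def stitch(
    pieces: List[List[Frame]],
    stitch_rmsd_thresh: float = 1e-4,
    bridge_rmsd_thresh: float = 1e-4,
) -> Tuple[List[Frame], List[int]]:
    """Concatenate subpaths; return the path and indices where a bridge is needed."""
    path = list(pieces[0])
    bridges = []
    for piece in pieces[1:]:
        gap = rmsd(path[-1], piece[0])
        if gap <= stitch_rmsd_thresh:
            piece = piece[1:]  # duplicate endpoint: drop it
        elif gap > bridge_rmsd_thresh:
            bridges.append(len(path))  # a bridge would be inserted before this index
        path.extend(piece)
    return path, bridges
```

With both thresholds at the default 0.0001, any interface that is not an exact duplicate is flagged for bridging in this sketch.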
See Also¶
Common Error Recipes – Symptom-first failure routing
Troubleshooting – Detailed troubleshooting guide
path-opt – Single-pass MEP optimization (no recursive refinement)
tsopt – Optimize the HEI as a transition state
extract – Generate active site model PDBs for path-search inputs
all – End-to-end workflow (uses recursive path-search by default; --refine-path False for single-pass path-opt)
YAML Reference – Full gs, dmf, bond, search configuration options
Glossary – Definitions of MEP, GSM, DMF, HEI