Contents Menu Expand Light mode Dark mode Auto light/dark, in light mode Auto light/dark, in dark mode
pdb2reaction
pdb2reaction

Guides

  • Getting Started
  • Concepts & Workflow
  • Troubleshooting
  • CLI Conventions
  • はじめに
  • 概念とワークフロー
  • トラブルシューティング
  • CLI 規約

Commands

  • all
  • extract
  • fix-altloc
  • add-elem-info
  • opt
  • tsopt
  • path-opt
  • path-search
  • scan
  • scan2d
  • scan3d
  • freq
  • irc
  • dft
  • trj2fig
  • energy-diagram
  • all
  • extract
  • fix-altloc
  • add-elem-info
  • opt
  • tsopt
  • path-opt
  • path-search
  • scan
  • scan2d
  • scan3d
  • freq
  • irc
  • dft
  • trj2fig
  • energy-diagram

Reference

  • YAML Reference
  • uma_pysis calculator
  • Glossary
  • YAML 設定リファレンス
  • uma_pysis 計算機
  • 用語集

Language

  • pdb2reaction ドキュメント
    • はじめに
    • 概念とワークフロー
    • トラブルシューティング
    • CLI 規約
    • all
    • extract
    • fix-altloc
    • add-elem-info
    • opt
    • tsopt
    • path-opt
    • path-search
    • scan
    • scan2d
    • scan3d
    • freq
    • irc
    • dft
    • trj2fig
    • energy-diagram
    • YAML 設定リファレンス
    • uma_pysis 計算機
    • 用語集
Back to top
EN | JA

path-search¶

Overview¶

Summary: Build a continuous MEP from two or more structures with GSM (default) or DMF (--mep-mode dmf). Automatically refines only regions with bond changes and exports the highest-energy image (HEI) as a TS candidate (validate with freq/IRC).

At a glance¶

  • Use when: You have R → … → P structures (2+ inputs) and want a single stitched MEP with automatic refinement.

  • Method: Chains GSM/DMF segments and recursively refines only sub-intervals that still contain covalent changes.

  • Outputs: mep.trj (main trajectory), summary.yaml (segment-by-segment results), and optional plots/merged PDBs when enabled.

  • Defaults: --mep-mode gsm, --opt-mode light (LBFGS), --preopt True, --align True, --thresh gau.

  • Next step: HEI output alone does not validate a TS. Follow with tsopt, freq, and irc.

pdb2reaction path-search builds a continuous minimum-energy path (MEP) across two or more structures using GSM (default) or DMF (--mep-mode dmf). It selectively refines only those regions where covalent bond changes are detected, then stitches the resolved subpaths into a single trajectory.

When --convert-files is enabled (default), the command mirrors trajectories into .pdb companions when PDB references exist, and writes .gjf companions for HEI snapshots when Gaussian templates exist. For XYZ/GJF inputs, --ref-pdb supplies a pocket-level PDB topology while keeping XYZ coordinates, and --ref-full-pdb enables full-template merges (XYZ/GJF inputs still do not produce PDB companions).

If you only have two endpoints and do not need recursive refinement, path-opt is the simpler option.

Usage¶

pdb2reaction path-search -i R.pdb [I.pdb ...] P.pdb [-q CHARGE] [--ligand-charge <number|'RES:Q,...'>] [--multiplicity 2S+1]
                         [--workers N] [--workers-per-node N]
                         [--mep-mode {gsm|dmf}] [--freeze-links {True\|False}] [--thresh PRESET]
                         [--refine-mode {peak|minima}]
                         [--max-nodes N] [--max-cycles N] [--climb {True\|False}]
                         [--opt-mode light|heavy] [--dump {True\|False}]
                         [--out-dir DIR] [--preopt {True\|False}]
                         [--align {True\|False}] [--ref-full-pdb FILE ...] [--ref-pdb FILE ...]
                         [--convert-files {True\|False}]
                         [--args-yaml FILE]

Examples¶

  • Pocket-only MEP between two endpoints:

    pdb2reaction path-search -i reactant.pdb product.pdb -q 0
    
  • Multistep search with YAML overrides and merged full-system output:

    pdb2reaction path-search \
        -i R.pdb IM1.pdb IM2.pdb P.pdb -q -1 \
        --args-yaml params.yaml --ref-full-pdb holo_template.pdb --out-dir ./run_ps
    

CLI options¶

Option

Description

Default

-i, --input PATH...

Two or more structures in reaction order (reactant → product). Repeat -i or pass multiple paths after one flag.

Required

-q, --charge INT

Total charge. Required for non-.gjf inputs unless --ligand-charge derivation succeeds (PDB inputs). Overrides --ligand-charge when both are set.

Required unless template/derivation applies

--ligand-charge TEXT

Total charge or per-resname mapping used when -q is omitted. Triggers extract-style charge derivation on the full complex for PDB inputs.

None

--workers, --workers-per-node

UMA predictor parallelism (workers > 1 disables analytic Hessians; workers_per_node forwarded to the parallel predictor).

1, 1

-m, --multiplicity INT

Spin multiplicity (2S+1).

.gjf template value or 1

--freeze-links {True|False}

When loading PDB pockets, freeze the parent atoms of link hydrogens.

True

--max-nodes INT

Internal nodes per MEP segment (GSM string images or DMF images).

10

--max-cycles INT

Maximum MEP optimization cycles (GSM/DMF).

300

--climb {True|False}

Enable climbing image for GSM segments (bridge segments always run without climbing).

True

--opt-mode TEXT

Single-structure optimizer for HEI±1/kink nodes. light maps to LBFGS; heavy maps to RFO.

light

--mep-mode {gsm|dmf}

Segment generator: GSM (string-based) or DMF (direct flux).

gsm

--refine-mode {peak|minima}

Seeds for refinement: peak optimizes HEI±1; minima searches outward from the HEI toward the nearest local minima on each side. Defaults to peak for GSM and minima for DMF when omitted.

Auto

--dump {True|False}

Dump MEP (GSM/DMF) and single-structure trajectories. Restart YAML is written only when enabled in YAML.

False

--convert-files {True|False}

Toggle XYZ/TRJ → PDB/GJF companions for PDB or Gaussian inputs.

True

--out-dir TEXT

Output directory.

./result_path_search/

--thresh TEXT

Override convergence preset for GSM and per-image optimizations (gau_loose, gau, gau_tight, gau_vtight, baker, never).

gau

--args-yaml FILE

YAML overrides (see below).

None

--preopt {True|False}

Pre-optimize each endpoint before MEP search (recommended).

True

--align {True|False}

Align all inputs to the first structure before searching.

True

--ref-full-pdb PATH...

Full-size template PDBs (one per input, unless --align lets you reuse the first).

None

--ref-pdb PATH...

Pocket reference PDBs to use when inputs are XYZ/GJF (one per input; keeps XYZ coordinates).

None

Workflow¶

  1. Initial segment per pair (GSM/DMF) – run GrowingString or DMF between each adjacent input (A→B) to obtain a coarse MEP and identify the highest-energy image (HEI).

  2. Local relaxation around HEI – refine either HEI ± 1 (refine-mode=peak) or the nearest local minima on each side of the HEI (refine-mode=minima) with the chosen single-structure optimizer (opt-mode) to recover nearby minima (End1, End2).

  3. Decide between kink vs. refinement:

    • If no covalent bond change is detected between End1 and End2, treat the region as a kink: insert search.kink_max_nodes linear nodes and optimize each individually.

    • Otherwise, launch a refinement segment (GSM/DMF) between End1 and End2 to sharpen the barrier.

  4. Selective recursion – compare bond changes for (A→End1) and (End2→B) using the bond thresholds. Recurse only on sub-intervals that still contain covalent updates. Recursion depth is capped by search.max_depth.

  5. Stitching & bridging – concatenate resolved subpaths, dropping duplicate endpoints when RMSD ≤ search.stitch_rmsd_thresh. If the RMSD gap between two stitched pieces exceeds search.bridge_rmsd_thresh, insert a bridge MEP segment (GSM/DMF). When the interface itself shows a bond change, a brand-new recursive segment replaces the bridge.

  6. Alignment & merging (optional) – with --align (default), pre-optimized structures are rigidly aligned to the first input and freeze_atoms are reconciled. Provide --ref-full-pdb to merge pocket trajectories back into full-size PDB templates (one template per input unless alignment allows reuse of the first file).

Bond-change detection relies on bond_changes.compare_structures with thresholds surfaced under the bond YAML section. UMA calculators are constructed once and shared across all structures for efficiency.

Outputs¶

out_dir/ (default: ./result_path_search/)
├─ mep.trj                  # Primary MEP trajectory
├─ mep.pdb                  # PDB companion when inputs were PDB templates and conversion is enabled
├─ mep_w_ref.pdb            # Merged full-system MEP (requires ref PDB/template)
├─ mep_w_ref_seg_XX.pdb     # Merged per-segment paths when covalent changes exist (requires ref PDB)
├─ summary.yaml             # Barrier and classification summary for every recursive segment
├─ mep_plot.png             # ΔE profile generated via `trj2fig` (kcal/mol, reactant reference)
├─ energy_diagram_MEP.png   # Static export of the MEP state-energy diagram (relative to reactant)
└─ seg_000_*/               # GSM/DMF dumps, HEI snapshots, kink/refinement diagnostics per segment
  • Console reports covering resolved configuration blocks (geom, calc, gs, opt, sopt.*, bond, search).

Notes¶

  • Provide at least two inputs; click.BadParameter is raised otherwise.

  • --ref-full-pdb can be given once followed by multiple filenames; with --align, only the first template is reused for merges.

  • All UMA calculators are shared across structures for efficiency.

  • When --dump is set, MEP (GSM/DMF) and single-structure optimizations emit trajectories. Restart YAML is written only when dump_restart is enabled in YAML.

  • Charge/spin inherit .gjf template metadata when available. If -q is omitted but --ligand-charge is provided, the inputs are treated as an enzyme–substrate complex and extract.py’s charge summary computes the total charge when the inputs are PDBs; explicit -q still overrides. For non-.gjf inputs without a usable --ligand-charge, the command aborts; multiplicity defaults to 1 when omitted.

YAML configuration (--args-yaml)¶

The YAML root must be a mapping. YAML parameters override the CLI values. Shared sections reuse YAML Reference: geom/calc mirror single-structure options (with --freeze-links augmenting geom.freeze_atoms for PDBs), and opt inherits the StringOptimizer knobs documented for path-opt (see path_opt.md).

gs (Growing String) inherits defaults from pdb2reaction.path_opt.GS_KW with overrides for max_nodes (internal nodes per segment), climb behavior (climb, climb_rms, climb_fixed), and reparameterization cadence (reparam_every_full, reparam_check).

sopt houses the single-structure optimizers used for HEI±1 and kink nodes, split into lbfgs and rfo subsections. Each subsection mirrors YAML Reference but defaults to out_dir: ./result_path_search/ and dump: False.

bond carries the UMA-based bond-change detection parameters shared with scan: device, bond_factor, margin_fraction, and delta_fraction.

search governs the recursion logic: max_depth, stitch_rmsd_thresh, bridge_rmsd_thresh, max_nodes_segment, max_nodes_bridge, kink_max_nodes, max_seq_kink, and refine_mode (auto-selects peak for GSM or minima for DMF when left null). The legacy rmsd_align flag is ignored but kept for compatibility.

dmf bundles Direct Max Flux + (C)FB-ENM controls applied whenever --mep-mode dmf is selected. The defaults mirror the shared DMF_KW dictionary and can be overridden per run:

Example YAML (default value)¶

geom:
  coord_type: cart           # coordinate type: cartesian vs dlc internals
  freeze_atoms: []           # 0-based frozen atoms merged with CLI/link detection
calc:
  charge: 0                  # total charge (CLI/template override)
  spin: 1                    # spin multiplicity 2S+1
  model: uma-s-1p1           # UMA model tag
  task_name: omol            # UMA task name
  device: auto               # UMA device selection
  max_neigh: null            # maximum neighbors for graph construction
  radius: null               # cutoff radius for neighbor search
  r_edges: false             # store radial edges
  out_hess_torch: true       # request torch-form Hessian
  freeze_atoms: null         # calculator-level frozen atoms
  hessian_calc_mode: FiniteDifference   # Hessian mode selection
  return_partial_hessian: false         # full Hessian (avoids shape mismatches)
gs:
  fix_first: true            # keep the first endpoint fixed during optimization
  fix_last: true             # keep the last endpoint fixed during optimization
  max_nodes: 10              # maximum string nodes
  perp_thresh: 0.005         # perpendicular displacement threshold
  reparam_check: rms         # reparametrization check metric
  reparam_every: 1           # reparametrization stride
  reparam_every_full: 1      # full reparametrization stride
  param: equi                # parametrization scheme
  max_micro_cycles: 10       # micro-iteration limit
  reset_dlc: true            # rebuild delocalized coordinates each step
  climb: true                # enable climbing image
  climb_rms: 0.0005          # climbing RMS threshold
  climb_lanczos: true        # Lanczos refinement for climbing
  climb_lanczos_rms: 0.0005  # Lanczos RMS threshold
  climb_fixed: false         # keep climbing image fixed
  scheduler: null            # optional scheduler backend
opt:
  type: string               # optimizer type label
  stop_in_when_full: 300     # early stop threshold when string is full
  align: false               # alignment toggle (kept off)
  scale_step: global         # step scaling mode
  max_cycles: 300            # maximum optimization cycles
  dump: false                # dump trajectory/restart data
  dump_restart: false        # dump restart checkpoints
  reparam_thresh: 0.0        # reparametrization threshold
  coord_diff_thresh: 0.0     # coordinate difference threshold
  out_dir: ./result_path_search/   # output directory
  print_every: 10            # logging stride
dmf:
  max_cycles: 300            # maximum DMF/IPOPT iterations
  correlated: true           # correlated DMF propagation
  sequential: true           # sequential DMF execution
  fbenm_only_endpoints: false   # run FB-ENM beyond endpoints
  fbenm_options:
    delta_scale: 0.2         # FB-ENM displacement scaling
    bond_scale: 1.25         # bond cutoff scaling
    fix_planes: true         # enforce planar constraints
    two_hop_mode: sparse     # neighbor traversal strategy
  cfbenm_options:
    bond_scale: 1.25         # CFB-ENM bond cutoff scaling
    corr0_scale: 1.1         # correlation scale for corr0
    corr1_scale: 1.5         # correlation scale for corr1
    corr2_scale: 1.6         # correlation scale for corr2
    eps: 0.05                # correlation epsilon
    pivotal: true            # pivotal residue handling
    single: true             # single-atom pivots
    remove_fourmembered: true   # prune four-membered rings
    two_hop_mode: sparse     # neighbor traversal strategy
  dmf_options:
    remove_rotation_and_translation: false  # keep rigid-body motions
    mass_weighted: false     # toggle mass weighting
    parallel: false          # enable parallel DMF
    eps_vel: 0.01            # velocity tolerance
    eps_rot: 0.01            # rotational tolerance
    beta: 10.0               # beta parameter for DMF
    update_teval: false      # update transition evaluation
  k_fix: 300.0               # harmonic constant for restraints
sopt:
  lbfgs:
    thresh: gau                # LBFGS convergence preset
    max_cycles: 10000          # iteration limit
    print_every: 100           # logging stride
    min_step_norm: 1.0e-08     # minimum accepted step norm
    assert_min_step: true      # assert when steps stagnate
    rms_force: null            # explicit RMS force target
    rms_force_only: false      # rely only on RMS force convergence
    max_force_only: false      # rely only on max force convergence
    force_only: false          # skip displacement checks
    converge_to_geom_rms_thresh: 0.05   # RMS threshold when targeting geometry
    overachieve_factor: 0.0    # tighten thresholds
    check_eigval_structure: false   # validate Hessian eigenstructure
    line_search: true          # enable line search
    dump: false                # dump trajectory/restart data
    dump_restart: false        # dump restart checkpoints
    prefix: ""                 # filename prefix
    out_dir: ./result_path_search/   # output directory
    keep_last: 7               # history size for LBFGS buffers
    beta: 1.0                  # initial damping beta
    gamma_mult: false          # multiplicative gamma update toggle
    max_step: 0.3              # maximum step length
    control_step: true         # control step length adaptively
    double_damp: true          # double damping safeguard
    mu_reg: null               # regularization strength
    max_mu_reg_adaptions: 10   # cap on mu adaptations
  rfo:
    thresh: gau                # RFOptimizer convergence preset
    max_cycles: 10000          # iteration cap
    print_every: 100           # logging stride
    min_step_norm: 1.0e-08     # minimum accepted step norm
    assert_min_step: true      # assert when steps stagnate
    rms_force: null            # explicit RMS force target
    rms_force_only: false      # rely only on RMS force convergence
    max_force_only: false      # rely only on max force convergence
    force_only: false          # skip displacement checks
    converge_to_geom_rms_thresh: 0.05   # RMS threshold when targeting geometry
    overachieve_factor: 0.0    # tighten thresholds
    check_eigval_structure: false   # validate Hessian eigenstructure
    line_search: true          # enable line search
    dump: false                # dump trajectory/restart data
    dump_restart: false        # dump restart checkpoints
    prefix: ""                 # filename prefix
    out_dir: ./result_path_search/   # output directory
    trust_radius: 0.1          # trust-region radius
    trust_update: true         # enable trust-region updates
    trust_min: 0.0             # minimum trust radius
    trust_max: 0.1             # maximum trust radius
    max_energy_incr: null      # allowed energy increase per step
    hessian_update: bfgs       # Hessian update scheme
    hessian_init: calc         # Hessian initialization source
    hessian_recalc: 200        # rebuild Hessian every N steps
    hessian_recalc_adapt: null # adaptive Hessian rebuild factor
    small_eigval_thresh: 1.0e-08   # eigenvalue threshold for stability
    alpha0: 1.0                # initial micro step
    max_micro_cycles: 50       # micro-iteration limit
    rfo_overlaps: false        # enable RFO overlaps
    gediis: false              # enable GEDIIS
    gdiis: true                # enable GDIIS
    gdiis_thresh: 0.0025       # GDIIS acceptance threshold
    gediis_thresh: 0.01        # GEDIIS acceptance threshold
    gdiis_test_direction: true # test descent direction before DIIS
    adapt_step_func: true      # adaptive step scaling toggle
bond:
  device: cuda                # UMA device for bond analysis
  bond_factor: 1.2            # covalent-radius scaling
  margin_fraction: 0.05       # tolerance margin for comparisons
  delta_fraction: 0.05        # minimum relative change to flag bonds
search:
  max_depth: 10               # recursion depth limit
  stitch_rmsd_thresh: 0.0001  # RMSD threshold for stitching segments
  bridge_rmsd_thresh: 0.0001  # RMSD threshold for bridging nodes
  rmsd_align: true            # legacy alignment flag (ignored)
  max_nodes_segment: 10       # max nodes per segment
  max_nodes_bridge: 5         # max nodes per bridge
  kink_max_nodes: 3           # max nodes for kink optimizations
  max_seq_kink: 2             # max sequential kinks
  refine_mode: null           # optional refinement strategy (auto-chooses when null)

See Also¶

  • path-opt — Single-pass MEP optimization (no recursive refinement)

  • tsopt — Optimize the HEI as a transition state

  • extract — Generate pocket PDBs for path-search inputs

  • all — End-to-end workflow that calls path-search internally

  • YAML Reference — Full gs, dmf, bond, search configuration options

  • Glossary — Definitions of MEP, GSM, DMF, HEI

Copyright © 2025, Takuto Ohmura
Made with Sphinx and @pradyunsg's Furo
Contents
  • path-search
    • Overview
      • At a glance
    • Usage
      • Examples
    • CLI options
    • Workflow
    • Outputs
    • Notes
    • YAML configuration (--args-yaml)
      • Example YAML (default value)
    • See Also