all¶
Overview¶
pdb2reaction all runs the entire workflow end-to-end:
pocket extraction → (optional) staged UMA scan → recursive MEP search (path-search, GSM/DMF) → merge back into the full system → (optional) TS optimization + IRC (tsopt) → (optional) vibrational analysis / thermochemistry (freq) → (optional) single-point DFT (dft).
Important
--tsopt True produces TS candidates. Always validate them (imaginary mode + connectivity) with freq and irc before mechanistic interpretation.
It supports three common modes:
Multi-structure workflow: Provide ≥2 structures (PDB/GJF/XYZ) in reaction order plus a substrate definition. all extracts pockets, runs GSM/DMF MEP search, merges the optimized path back into the full-system template(s), and optionally runs TSOPT/freq/DFT per reactive segment.
Single-structure + staged scan: Provide one structure plus one or more --scan-lists. The scan generates an ordered set of intermediates that become MEP endpoints.
  One --scan-lists literal runs a single scan stage.
  Multiple stages are passed as multiple values after a single --scan-lists flag (the flag itself cannot be repeated).
TSOPT-only pocket TS optimization: Provide a single input structure, omit --scan-lists, and set --tsopt True. all extracts the pocket (if -c/--center is given) and runs TS optimization + IRC, with optional freq/DFT, on that single system.
PDB/GJF companion files are generated when templates are available, controlled by --convert-files {True|False} (enabled by default).
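The three modes can be told apart from the inputs alone. The sketch below restates that decision in Python; `select_mode` and its labels are hypothetical names chosen for illustration and are not taken from the pdb2reaction source.

```python
def select_mode(n_inputs, scan_lists=None, tsopt=False):
    """Return a label for the workflow mode implied by the inputs.

    Hypothetical helper that mirrors the rules stated in the text.
    The text documents staged scans for single-input runs only, so the
    combination of several inputs with scan stages is not modelled here.
    """
    if n_inputs < 1:
        raise ValueError("at least one input structure is required")
    if n_inputs >= 2:
        # Two or more structures in reaction order: MEP search between them.
        return "multi-structure"
    if scan_lists:
        # One structure plus one or more scan stages.
        return "single-structure + staged scan"
    if tsopt:
        # One structure, no scan stages, --tsopt True.
        return "tsopt-only"
    # A single structure needs either scan stages or --tsopt True.
    raise ValueError("a single input needs --scan-lists or --tsopt True")


print(select_mode(2))                # multi-structure
print(select_mode(1, tsopt=True))    # tsopt-only
```

A single input with neither scan stages nor `tsopt=True` raises an error in this sketch, matching the requirement that single-structure runs supply one of the two.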
Usage¶
pdb2reaction all -i INPUT1 [INPUT2 ...] -c SUBSTRATE [options]
Examples¶
# Multi-structure ensemble with explicit ligand charges and post-processing
pdb2reaction all -i reactant.pdb product.pdb -c 'GPP,MMT' \
--ligand-charge 'GPP:-3,MMT:-1' --mult 1 --freeze-links True \
--max-nodes 10 --max-cycles 100 --climb True --opt-mode light \
--out-dir ./result_all --tsopt True --thermo True --dft True
# Single-structure staged scan followed by GSM/DMF + TSOPT/freq/DFT
pdb2reaction all -i single.pdb -c '308,309' \
--scan-lists '[("TYR,285,CA","MMT,309,C10",2.20),("TYR,285,CB","MMT,309,C11",1.80)]' \
--opt-mode heavy --tsopt True --thermo True --dft True
# TSOPT-only workflow (no path search)
pdb2reaction all -i reactant.pdb -c 'GPP,MMT' \
--ligand-charge 'GPP:-3,MMT:-1' --tsopt True --thermo True --dft True
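When the command is launched from a script, it can help to assemble the argument list programmatically so that several scan stages end up after one `--scan-lists` flag. The following is a minimal sketch; `build_all_command` is a hypothetical helper, and only flags that appear in the examples above are used.

```python
import shlex


def build_all_command(inputs, center, scan_stages=(), extra=()):
    """Assemble an argv list for `pdb2reaction all` (hypothetical helper)."""
    argv = ["pdb2reaction", "all", "-i", *inputs, "-c", center]
    if scan_stages:
        # All stages follow a single flag; the flag itself is not repeated.
        argv += ["--scan-lists", *scan_stages]
    argv += list(extra)
    return argv


argv = build_all_command(
    ["single.pdb"],
    "308,309",
    scan_stages=[
        '[("TYR,285,CA","MMT,309,C10",2.20)]',
        '[("TYR,285,CB","MMT,309,C11",1.80)]',
    ],
    extra=["--tsopt", "True"],
)

# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(argv))
```

Passing `argv` as a list to `subprocess.run` avoids shell quoting issues with the brackets and quotes inside each stage literal.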
Workflow¶
Active-site pocket extraction (if -c/--center is provided)
  Substrates may be specified via PDB paths, residue IDs (123,124 or A:123,B:456), or residue names (GPP,MMT).
  Optional toggles forward to the extractor: --radius, --radius-het2het, --include-H2O, --exclude-backbone, --add-linkH, --selected-resn, and --verbose.
  Per-input pocket PDBs are saved under <out-dir>/pockets/. When multiple structures are supplied, their pockets are unioned per residue selection.
  The first pocket’s total charge is propagated to scan/MEP/TSOPT.
Optional staged scan (single-input only)
  Each --scan-lists argument is a Python-like list of (i,j,target_Å) tuples describing a UMA scan stage. Atom indices refer to the original input ordering (1-based) and are remapped to the pocket ordering. For PDB inputs, i/j can be integer indices or selector strings like 'TYR,285,CA'; selectors accept spaces/commas/slashes/backticks/backslashes (,/`\) as delimiters and allow unordered tokens (fallback assumes resname, resseq, atom).
  A single literal runs a one-stage scan; multiple literals run sequentially so stage 2 begins from stage 1’s result, and so on. Supply multiple literals after a single flag (repeated flags are not accepted).
  Scan inherits charge/spin, --freeze-links, the UMA optimizer preset (--opt-mode), --args-yaml, and --preopt. The --dump flag is forwarded to scan only when explicitly set on this command; otherwise scan uses its own default (False). Overrides such as --scan-out-dir, --scan-one-based, --scan-max-step-size, --scan-bias-k, --scan-relax-max-cycles, --scan-preopt, and --scan-endopt apply per run.
  Stage endpoints (stage_XX/result.pdb) become the ordered intermediates that feed the subsequent MEP step.
MEP search on pockets (recursive GSM/DMF)
  Executes path-search by default using the extracted pockets (or the original entire structures if extraction is skipped). Outputs are written under <out-dir>/path_search/. Relevant options: --mult, --freeze-links, --max-nodes, --max-cycles, --climb, --opt-mode, --dump, --preopt, --args-yaml, and --out-dir.
  Use --refine-path False to switch to a single-pass path-opt GSM/DMF chain without the recursive refiner.
  For multi-input PDB runs, the full-system templates are automatically passed to path-search for reference merging. Single-structure scan runs reuse the original full PDB template for every stage.
Merge pockets back to the full systems
  When reference PDB templates exist, merged mep_w_ref*.pdb and per-segment mep_w_ref_seg_XX.pdb files are emitted under <out-dir>/path_search/.
Optional per-segment post-processing
  --tsopt True: run TS optimization on each HEI pocket, follow with EulerPC IRC, and emit segment energy diagrams.
  --thermo True: call freq on (R, TS, P) to obtain vibrational/thermochemistry data and a UMA Gibbs diagram.
  --dft True: launch single-point DFT on (R, TS, P) and build a DFT diagram. When combined with --thermo True, a DFT//UMA Gibbs diagram (DFT energies + UMA thermal correction) is also produced.
  Shared overrides include --opt-mode, --opt-mode-post (overrides TSOPT/post-IRC optimization mode), --flatten-imag-mode, --hessian-calc-mode, --tsopt-max-cycles, --tsopt-out-dir, --freq-*, --dft-*, and --dft-engine (GPU-first by default).
  When you have ample VRAM available, setting --hessian-calc-mode to Analytical is strongly recommended.
TSOPT-only mode (single input, --tsopt True, no --scan-lists)
  Skips the MEP/merge stages. Runs tsopt on the pocket (or full input if extraction is skipped), performs EulerPC IRC, identifies the higher-energy endpoint as reactant (R), and generates the same set of energy diagrams plus optional freq/DFT outputs.
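Because each scan stage is described as a Python-like list of tuples, a stage literal can be checked before a long run is launched. The sketch below uses the standard-library `ast.literal_eval` and a regular expression built from the delimiters listed for selector strings. `parse_scan_stage` and `split_selector` are hypothetical helpers; how pdb2reaction itself parses these values is not shown in this document.

```python
import ast
import re

# Delimiters the text lists for selector strings: spaces, commas,
# slashes, backticks, and backslashes.
_DELIMS = re.compile(r"[\s,/`\\]+")


def split_selector(selector):
    """Split a selector such as 'TYR,285,CA' into its tokens."""
    return [tok for tok in _DELIMS.split(selector.strip()) if tok]


def parse_scan_stage(literal):
    """Parse one stage literal into a list of (i, j, target) tuples."""
    stage = ast.literal_eval(literal)  # evaluates literals only, no code
    parsed = []
    for i, j, target in stage:
        for end in (i, j):
            # Each end is an integer index or a selector string.
            if isinstance(end, bool) or not isinstance(end, (int, str)):
                raise TypeError(f"unsupported atom reference: {end!r}")
        parsed.append((i, j, float(target)))
    return parsed


stage = parse_scan_stage(
    '[("TYR,285,CA","MMT,309,C10",2.20),("TYR,285,CB","MMT,309,C11",1.80)]'
)
print(len(stage))                       # 2
print(split_selector("TYR 285/CA"))     # ['TYR', '285', 'CA']
```

The sketch only splits selectors into tokens; resolving unordered tokens into residue name, residue number, and atom name is left to the tool.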
Charge and spin precedence¶
Charge resolution (highest to lowest priority):
| Priority | Source | When Used |
|---|---|---|
| 1 |  | Explicit CLI override |
| 2 | Pocket extraction | When |
| 3 |  | Fallback when extraction fails or is skipped |
| 4 |  | Embedded charge/spin metadata |
| 5 | Default | None (unresolved charge is an error) |
Spin resolution: --mult (CLI) → .gjf template → default (1)
Tip: Always provide --ligand-charge for non-standard substrates to ensure correct charge propagation.
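The precedence rules above amount to "take the first source that yields a value". A minimal sketch of that pattern follows; the function names and source labels are hypothetical, and the caller supplies the charge sources in priority order rather than the sketch hard-coding them.

```python
def first_resolved(candidates):
    """Return (source, value) for the first candidate whose value is not None."""
    for source, value in candidates:
        if value is not None:
            return source, value
    return None


def resolve_mult(cli_mult=None, gjf_mult=None):
    """Spin: --mult (CLI), then the .gjf template, then the default of 1."""
    hit = first_resolved([("cli", cli_mult), ("gjf", gjf_mult)])
    return hit[1] if hit else 1


def resolve_charge(candidates):
    """Charge: the highest-priority source wins; no value at all is an error."""
    hit = first_resolved(candidates)
    if hit is None:
        raise ValueError("total charge could not be resolved")
    return hit


print(resolve_mult())                 # 1
print(resolve_mult(gjf_mult=3))       # 3
print(resolve_charge([
    ("explicit CLI override", None),
    ("pocket extraction", -4),
    ("embedded metadata", 0),
]))                                   # ('pocket extraction', -4)
```

Testing against `None` rather than truthiness matters here, because a total charge of 0 is a valid resolved value.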
Input expectations¶
Extraction enabled (-c/--center): inputs must be PDB files so residues can be located.
Extraction skipped: inputs may be PDB/XYZ/GJF.
Multi-structure runs require ≥2 structures.
CLI Options¶
Note: Default values shown are used when the option is not specified.
Input/Output Options¶
| Option | Description | Default |
|---|---|---|
|  | Two or more full structures in reaction order (single input allowed only with | Required |
|  | Top-level output directory. |  |
|  | Global toggle for XYZ/TRJ → PDB/GJF companions when templates are available. |  |
|  | Dump MEP (GSM/DMF) trajectories. Always forwarded to |  |
|  | YAML forwarded unchanged to all subcommands. | None |
Charge/Spin Options¶
| Option | Description | Default |
|---|---|---|
|  | Total charge or residue-specific mapping for unknown residues (recommended). | None |
|  | Force the total system charge (overrides | None |
|  | Spin multiplicity forwarded to all downstream steps. |  |
Extraction Options¶
| Option | Description | Default |
|---|---|---|
|  | Substrate specification (PDB path, residue IDs, or residue names). | Required for extraction |
|  | Pocket inclusion cutoff (Å). |  |
|  | Independent hetero–hetero cutoff (Å). |  |
|  | Include waters (HOH/WAT/TIP3/SOL). |  |
|  | Remove backbone atoms on non-substrate amino acids. |  |
|  | Add link hydrogens for severed bonds. |  |
|  | Residues to force include. |  |
|  | Freeze link parents in pocket PDBs. |  |
|  | Enable INFO-level extractor logging. |  |
MEP Search Options¶
| Option | Description | Default |
|---|---|---|
|  | MEP search algorithm: GSM (Growing String Method) or DMF (Direct Max Flux). |  |
|  | MEP internal nodes per segment. |  |
|  | MEP maximum optimization cycles. |  |
|  | Enable TS climbing for the first segment. |  |
|  | Optimizer preset (light → LBFGS/Dimer, heavy → RFO/RSIRFO). |  |
|  | Convergence preset ( |  |
|  | Pre-optimize pocket endpoints before MEP search. |  |
|  | If True, run recursive |  |
UMA Calculator Options¶
| Option | Description | Default |
|---|---|---|
|  | UMA predictor parallelism (workers > 1 disables analytic Hessians). |  |
|  | Shared UMA Hessian engine. |  |
Post-Processing Options¶
| Option | Description | Default |
|---|---|---|
|  | Run TS optimization + IRC per reactive segment. |  |
|  | Run vibrational analysis ( |  |
|  | Run single-point DFT on R/TS/P. |  |
|  | Optimizer preset for TSOPT and post-IRC optimization. | None |
|  | Convergence preset for post-IRC endpoint optimizations ( |  |
|  | Enable extra-imaginary-mode flattening in |  |
TSOPT optimizer selection order: --opt-mode-post (if set) → --opt-mode (only when explicitly provided) → TSOPT default (heavy).
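The selection order can be restated as a small function. This is an illustrative sketch only: `tsopt_optimizer_mode` and the `opt_mode_explicit` argument are hypothetical names standing in for however the wrapper tracks whether --opt-mode was given on the command line.

```python
def tsopt_optimizer_mode(opt_mode_post=None, opt_mode=None, opt_mode_explicit=False):
    """Pick the optimizer preset used for TSOPT, following the stated order."""
    if opt_mode_post is not None:
        # --opt-mode-post wins whenever it is set.
        return opt_mode_post
    if opt_mode_explicit and opt_mode is not None:
        # --opt-mode counts only when the user passed it explicitly.
        return opt_mode
    # Otherwise fall back to the TSOPT default.
    return "heavy"


print(tsopt_optimizer_mode())                                          # heavy
print(tsopt_optimizer_mode(opt_mode="light"))                          # heavy
print(tsopt_optimizer_mode(opt_mode="light", opt_mode_explicit=True))  # light
```

The second call shows why the "explicitly provided" condition matters: a value of --opt-mode that was never typed by the user does not displace the TSOPT default.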
TSOPT Overrides¶
| Option | Description | Default |
|---|---|---|
|  | Override |  |
|  | Custom tsopt subdirectory. | None |
Freq Overrides¶
| Option | Description | Default |
|---|---|---|
|  | Base directory override for freq outputs. | None |
|  | Maximum modes to write. |  |
|  | Mode animation amplitude (Å). |  |
|  | Frames per mode animation. |  |
|  | Mode sorting behavior. |  |
|  | Thermochemistry temperature (K). |  |
|  | Thermochemistry pressure (atm). |  |
DFT Overrides¶
| Option | Description | Default |
|---|---|---|
|  | Preferred backend ( |  |
|  | Base directory override for DFT outputs. | None |
|  | Functional/basis pair. |  |
|  | Maximum SCF iterations. |  |
|  | SCF convergence tolerance. |  |
|  | PySCF grid level. |  |
Scan Options (Single-Input Runs)¶
| Option | Description | Default |
|---|---|---|
|  | Staged scans: | None |
|  | Override the scan output directory. | None |
|  | Force scan indexing to 1-based or 0-based. |  |
|  | Maximum step size (Å). |  |
|  | Harmonic bias strength (eV/Ų). |  |
|  | Relaxation max cycles per step. |  |
|  | Override the scan preoptimization toggle. |  |
|  | Override the scan end-of-stage optimization toggle. |  |
Outputs¶
out_dir/ (default: ./result_all/)
├─ summary.log # formatted summary for quick inspection
├─ summary.yaml # YAML version of the summary
├─ pockets/ # Per-input pocket PDBs when extraction runs
├─ scan/ # Staged pocket scan results (present when --scan-lists is provided)
├─ path_search/ # MEP results (GSM/DMF): trajectories, merged PDBs, diagrams, summary.yaml, per-segment folders
├─ path_search/post_seg_XX/ # Post-processing outputs (TS optimization, IRC, freq, DFT, diagrams)
└─ tsopt_single/ # TSOPT-only outputs with IRC endpoints and optional freq/DFT directories
Console logs summarize pocket charge resolution, YAML contents, scan stages, MEP progress (GSM/DMF), and per-stage timing.
Reading summary.log¶
The log is organized into numbered sections:
[1] Global MEP overview – image/segment counts, MEP trajectory plot paths, and the aggregate MEP energy diagram.
[2] Segment-level MEP summary (UMA path) – per-segment barriers (ΔE‡), reaction energies (ΔE), and bond-change summaries.
[3] Per-segment post-processing (TSOPT / Thermo / DFT) – per-segment TS imaginary frequency checks, IRC outputs, and UMA/thermo/DFT energy tables.
[4] Energy diagrams (overview) – diagram tables for MEP/UMA/Gibbs/DFT series plus an optional cross-method summary table.
[5] Output directory structure – a compact tree of generated files with inline annotations.
Reading summary.yaml¶
The YAML is a compact, machine-readable summary. Common top-level keys include:
out_dir, n_images, n_segments – run metadata and total counts.
segments – list of per-segment entries with index, tag, kind, barrier_kcal, delta_kcal, and bond_changes.
energy_diagrams (optional) – diagram payloads with labels, energies_kcal, energies_au, ylabel, and image paths.
summary.yaml intentionally omits the formatted tables and filesystem tree that appear in summary.log.
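Since summary.yaml is meant to be machine-readable, a short script can pull out the rate-limiting segment. The sketch below assumes the file parses into a mapping with the keys listed above; the exact nesting is an assumption, the numbers in `example` are made up, and `load_summary` relies on the third-party PyYAML package.

```python
def load_summary(path):
    """Load summary.yaml from disk; requires the third-party PyYAML package."""
    import yaml  # imported lazily so the rest of the sketch runs without PyYAML

    with open(path, encoding="utf-8") as fh:
        return yaml.safe_load(fh)


def highest_barrier(summary):
    """Return (index, barrier_kcal) of the segment with the largest barrier."""
    segments = [
        seg for seg in summary.get("segments", [])
        if seg.get("barrier_kcal") is not None
    ]
    if not segments:
        return None
    top = max(segments, key=lambda seg: seg["barrier_kcal"])
    return top["index"], top["barrier_kcal"]


# Made-up values, shaped like the keys described in the text.
example = {
    "n_segments": 2,
    "segments": [
        {"index": 1, "barrier_kcal": 12.4, "delta_kcal": -3.1},
        {"index": 2, "barrier_kcal": 18.9, "delta_kcal": 4.0},
    ],
}
print(highest_barrier(example))  # (2, 18.9)
```

For a real run, `highest_barrier(load_summary("result_all/summary.yaml"))` would apply the same logic to the generated file, assuming the default output directory.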
Notes¶
Always provide --ligand-charge (numeric or per-residue mapping) when formal charges cannot be inferred so the correct total charge propagates to scan/MEP/TSOPT/DFT.
Reference PDB templates for merging are derived automatically from the original inputs; the explicit --ref-full-pdb option of path-search is intentionally hidden in this wrapper.
Convergence presets: --thresh defaults to gau; --thresh-post defaults to baker.
Extraction radii: passing 0 to --radius or --radius-het2het is internally clamped to 0.001 Å by the extractor.
Energies in diagrams are reported relative to the first state (reactant) in kcal/mol.
Omitting -c/--center skips extraction and feeds the entire input structures directly to the MEP/tsopt/freq/DFT stages; single-structure runs still require either --scan-lists or --tsopt True.
--args-yaml lets you coordinate all calculators from a single configuration file. YAML values override CLI flags.
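Two of the notes above are simple numeric rules: energies are shifted so the first state is zero and expressed in kcal/mol, and a zero extraction radius is raised to 0.001 Å. The sketch below illustrates both. The helper names are hypothetical, and the conversion factor is the standard hartree to kcal/mol value, which may differ in its last digits from what the tool uses.

```python
HARTREE_TO_KCAL = 627.509474  # kcal/mol per hartree (standard conversion)


def relative_kcal(energies_au):
    """Convert absolute energies in hartree to kcal/mol relative to the first state."""
    ref = energies_au[0]
    return [(e - ref) * HARTREE_TO_KCAL for e in energies_au]


def clamp_radius(radius, floor=0.001):
    """Raise a radius below the floor to the floor (the text documents only the 0 case)."""
    return max(radius, floor)


print(relative_kcal([-100.0])[0])   # 0.0
print(clamp_radius(0))              # 0.001
print(clamp_radius(2.6))            # 2.6
```

With made-up energies of -100.00 and -99.98 hartree for reactant and TS, `relative_kcal` gives a barrier of roughly 12.55 kcal/mol.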
YAML configuration (--args-yaml)¶
The same YAML file is forwarded unchanged to every invoked subcommand. Each tool reads the sections described in its own documentation:
Subcommand |
YAML Sections |
|---|---|
|
|
|
|
|
|
|
|
|
Note: YAML contents take precedence over CLI values when both are provided.
Minimal example:
calc:
  model: uma-s-1p1
  hessian_calc_mode: Analytical # recommended when VRAM permits
gs:
  max_nodes: 12
  climb: true
dft:
  grid_level: 6
For a complete reference of all YAML options, see YAML Configuration Reference.
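The precedence rule (YAML over CLI) can be pictured as a dictionary merge in which the YAML section is applied last. The sketch below is illustrative only: `effective_options` is a hypothetical helper, and the correspondence between CLI flags such as --max-nodes and YAML keys such as max_nodes is assumed from the minimal example rather than confirmed here.

```python
def effective_options(cli_values, yaml_section):
    """Merge option dictionaries so that YAML entries override CLI values."""
    merged = dict(cli_values)           # start from what the CLI provided
    merged.update(yaml_section or {})   # YAML is applied last and wins
    return merged


cli = {"max_nodes": 10, "max_cycles": 100, "climb": True}
gs_yaml = {"max_nodes": 12, "climb": True}

print(effective_options(cli, gs_yaml))
# {'max_nodes': 12, 'max_cycles': 100, 'climb': True}
```

Options that appear only on the command line (here `max_cycles`) pass through unchanged, so a YAML file needs to list only the values it is meant to override.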
See Also¶
Getting Started: Installation and first run guide
Concepts & Workflow: Mental model of pockets, segments, and stages
extract: Standalone pocket extraction (called internally by all)
path-search: Standalone MEP search (called internally by all)
tsopt: Standalone TS optimization
freq: Standalone vibrational analysis
dft: Standalone DFT calculations
Troubleshooting: Common errors and fixes
YAML Reference: Complete YAML configuration options
Glossary: Definitions of MEP, TS, IRC, GSM, DMF