tsopt
mlmm tsopt refines a transition-state candidate on a layered enzyme PDB into a first-order saddle point. Run it on a standalone transition-state (TS) guess, or on the highest-energy image (HEI) extracted by path-search.
Two optimizers are available, and you pick between them with --opt-mode:
- Restricted-Step Image-function Rational Function Optimization (RS-I-RFO) (--opt-mode hess) is the default and the conservative choice when you can afford the Hessian work. It runs with microiteration (--microiter, default on) that alternates a machine-learning (ML) 1-step RS-I-RFO move with a molecular-mechanics (MM) L-BFGS relaxation.
- Hessian-Guided Dimer (--opt-mode grad) is the lighter alternative, suited to a lower-cost search or quick iteration from several TS guesses. Add --ml-only-hessian-dimer to use only the ML-region Hessian for dimer orientation (faster).
After convergence, a surplus-imaginary-mode flatten loop (--flatten) removes extra negative modes via mass-scaled displacements. A validated TS should show exactly one imaginary frequency — always confirm the mode and connectivity with freq / irc.
Building a TS candidate first
tsopt refines a candidate — it does not find one from scratch. Pick the route that matches the information you already have, then feed the result into tsopt → irc → freq (or mlmm all --tsopt).
| Route | Subcommand | What it does | Use when |
|---|---|---|---|
| (a) MEP / path search | | Recursive GSM/DMF minimum-energy-path search; brackets the TS between endpoints, bridges gaps between segments, and emits one TS per segment. | You have a reactant (and optionally a product or intermediates) and want the path discovered. |
| (b) Distance-restrained build-up | | Adds a harmonic restraint | You have neither a usable second endpoint nor a TS guess: drive the reacting bond directly. |
# Route (a): discover the path, then refine its highest-energy image
mlmm path-search -i r.pdb p.pdb --parm enzyme.parm7 -l 'LIG:Q' -o result_mep
# Route (b): drive the reacting distance to build a TS candidate
mlmm scan -i r.pdb --parm enzyme.parm7 -l 'LIG:Q' \
--scan-lists '[(1,5,1.40)]' -o result_scan
Note
There is no opt --restraint flag. Plain opt is an un-restrained optimizer; the restrained build-up of a TS candidate is done with scan (drive the distance) or with path-search (route a).
Wrong number of imaginary frequencies
A clean first-order saddle has exactly one dominant imaginary mode along the reaction coordinate. Two common failures are a spurious second small imaginary mode, or no dominant reaction mode at all.
| Symptom | Fix |
|---|---|
| Spurious 2nd small imaginary mode, or no dominant reaction mode | Raise precision with |
| Still no clean saddle | Combine them, then verify in |
--flatten runs the surplus-imaginary-mode flattening loop (grad: dimer loop; hess: post-RS-I-RFO); --no-flatten forces flatten_max_iter=0. It is most useful when a dominant reaction mode survives alongside a tiny residual one — for example, a mutant chorismate-mutase TS converged to the Claisen mode at −223 cm⁻¹ plus a residual −12.5 cm⁻¹, and --flatten drives it to a clean single-imaginary saddle.
mlmm tsopt -i ts_guess.pdb --parm enzyme.parm7 -l 'LIG:Q' -b uma \
--precision fp64 --coord-type dlc -o result_ts
--coord-type selects the optimization coordinate system (cart | redund | dlc | tric; default cart). dlc (delocalized internal coordinates) is slower but converges more robustly on torsion-rich systems and is more likely to reach a clean first-order saddle.
Warning
--coord-type dlc needs a Hessian-based optimizer. On opt with the default L-BFGS (--opt-mode grad) it is silently forced back to cart; use it on tsopt (RFO / RS-I-RFO) or opt --opt-mode hess. path-opt / path-search accept only cart and dlc. DLC + link atom and DLC + 3-layer frozen MM are numerically unverified, so cart remains the default, and is the setting used to produce the published results.
See Common Error Recipes — Recipe 4 for symptom-first routing of the same failure.
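As a rough illustration of the control flow behind --flatten, the sketch below repeats a flatten step followed by a re-optimization until at most one imaginary mode is left or an iteration cap is reached. The function name and the three callables are hypothetical placeholders, not mlmm APIs. The default cap of 50 mirrors the flatten_max_iter value shown in the YAML section, and a cap of 0 corresponds to --no-flatten.

```python
def flatten_loop(count_imaginary, flatten_once, reoptimize, max_iter=50):
    """Repeat flatten + re-optimize until <= 1 imaginary mode or the cap is hit.

    count_imaginary, flatten_once and reoptimize are caller-supplied
    callables (hypothetical stand-ins for the real workflow steps).
    Returns (remaining_imaginary_modes, iterations_used).
    """
    iterations = 0
    n_imag = count_imaginary()
    while n_imag > 1 and iterations < max_iter:
        flatten_once()   # displace along the surplus imaginary modes
        reoptimize()     # rerun the TS optimizer from the flattened geometry
        n_imag = count_imaginary()
        iterations += 1
    return n_imag, iterations
```

A return value whose first element is still greater than 1 means the cap was reached before a clean saddle was obtained, which is the case where a follow-up check with freq / irc matters most.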
Controlled mutant-vs-WT (or mechanism-vs-mechanism) comparison
Important
For a mutant-versus-wild-type (or mechanism-versus-mechanism) barrier comparison, all compared models must use the same atom set — identical atom count and residues. Otherwise the energy reference differs and the comparison is not a controlled experiment. A geometrically re-derived ML/movable/frozen partition on the mutant also produces spurious soft modes (tsopt.n_imaginary ≥ 2, both tiny → the IRC aborts).
In mlmm, preserve the wild-type ML/MM layering by transplanting the WT B-factor layer encoding onto the mutated structure and running with --detect-layer:
| Step | Action |
|---|---|
| 1 | Build the mutant; keep the same residue set as WT (only the mutated residue’s identity differs). |
| 2 | Copy WT’s per-atom B-factor layer codes onto the mutant by |
| 3 | Run with |
mlmm all -i mutant_layered.pdb -l 'LIG:Q' \
--tsopt True --thermo True -o result_mutant
| Flag | Action | Why |
|---|---|---|
| | keep (default | reads the transplanted B-factors so the ML / movable / frozen layers are byte-identical to WT |
| | omit | geometric extraction would re-derive a different pocket on the mutant; omitting |
| | keep | a non-standard ligand’s charge is not in the standard amino-acid table; |
| | do not pass | it disables |
The same-atom-set principle applies equally when comparing two mechanisms on the same enzyme: keep an identical atom set across both models and vary only the reaction coordinate.
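Before launching a comparison, the same-atom-set requirement can be sanity-checked with a few lines of Python. The sketch below is a minimal, hypothetical helper (not part of mlmm): it reads ATOM/HETATM records using the standard fixed-column PDB layout and compares the atom count and the set of residue positions (chain, residue number, insertion code). Residue names are deliberately ignored, on the assumption that only the mutated residue's identity is allowed to differ.

```python
def pdb_atom_summary(pdb_text):
    """Return (atom_count, residue_positions) for the ATOM/HETATM records.

    Residue positions are (chain, resseq, insertion_code) tuples read from
    the fixed PDB columns 22, 23-26 and 27.
    """
    count = 0
    residues = set()
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            count += 1
            residues.add((line[21], line[22:26].strip(), line[26:27].strip()))
    return count, residues


def same_atom_set(wt_text, mutant_text):
    """True when both models have the same atom count and residue positions."""
    return pdb_atom_summary(wt_text) == pdb_atom_summary(mutant_text)
```

For example, `same_atom_set(open("wt_layered.pdb").read(), open("mutant_layered.pdb").read())` would flag a mismatch before any barrier is computed; both file names are placeholders.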
Examples
The command form is mlmm tsopt -i TS_GUESS --parm PARM7 --model-pdb ML_REGION -q CHARGE -m MULT [options]. mlmm tsopt --help shows core options; mlmm tsopt --help-advanced shows the full option list.
Default run:
mlmm tsopt -i ts_guess.pdb --parm real.parm7 --model-pdb ml_region.pdb \
-q 0 -m 1 --out-dir ./result_tsopt
Light mode (Dimer) with analytical Hessian:
# Light mode (Dimer) with analytical Hessian when VRAM allows
mlmm tsopt -i ts_guess.pdb --parm real.parm7 --model-pdb ml_region.pdb \
-q 0 -m 1 --opt-mode grad --hessian-calc-mode Analytical --out-dir ./result_tsopt_grad
Heavy mode (RS-I-RFO) with YAML overrides:
# Heavy mode (RS-I-RFO) with YAML overrides
mlmm tsopt -i ts_guess.pdb --parm real.parm7 --model-pdb ml_region.pdb \
-q 0 -m 1 --opt-mode hess --config tsopt.yaml --out-dir ./result_tsopt_hess
# --dump keeps the full optimization trajectory; --backend mace uses the MACE backend
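When several TS guesses need the same treatment, it can help to assemble the command lines programmatically. The sketch below is a hypothetical helper that only builds the argument list; it uses just the flags that appear in the command form above (-i, --parm, --model-pdb, -q, -m, --opt-mode, --out-dir) and does not run anything. The function name and the input file names are illustrative assumptions.

```python
import shlex


def build_tsopt_command(ts_guess, parm7, model_pdb, charge, mult,
                        opt_mode="hess", out_dir="./result_tsopt"):
    """Assemble an `mlmm tsopt` argument list from flags shown on this page."""
    return [
        "mlmm", "tsopt",
        "-i", str(ts_guess),
        "--parm", str(parm7),
        "--model-pdb", str(model_pdb),
        "-q", str(charge),
        "-m", str(mult),
        "--opt-mode", opt_mode,
        "--out-dir", str(out_dir),
    ]


# Print one shell-quoted command line per (placeholder) TS guess.
for i, guess in enumerate(["ts_guess_a.pdb", "ts_guess_b.pdb"]):
    cmd = build_tsopt_command(guess, "real.parm7", "ml_region.pdb", 0, 1,
                              opt_mode="grad", out_dir=f"./result_tsopt_{i}")
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` if the commands should be executed rather than printed.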
Workflow
1. Input handling: load the enzyme PDB, Amber topology, and ML-region definition. Resolve charge / spin. Frozen atoms from CLI and YAML are merged.
2. ML/MM calculator setup: build the ML/MM calculator (MLIP backend + hessian_ff). -b/--backend selects the MLIP (uma, orb, mace, or aimnet2; default uma). --hessian-calc-mode controls whether the ML backend evaluates Hessians analytically or by finite difference. With --embedcharge, xTB point-charge embedding provides MM-to-ML environmental corrections.
3. Light mode (Hessian-Guided Dimer): the Dimer stage periodically refreshes the dimer direction by evaluating an exact Hessian (active subspace, TR-projected). The mechanics:
   - During the loose / final Dimer loops the hessian_ff finite-difference Hessian is disabled (mm_fd=False). The ML backend Hessian is then embedded into the full 3N × 3N space with MM atoms zero-padded, giving a partial Hessian that still guides the Dimer direction updates.
   - When the flatten loop is enabled (--flatten), the stored active Hessian is updated via Bofill using displacements and gradient differences.
   - Each loop estimates imaginary modes, flattens once, refreshes the dimer direction, and runs a Dimer + L-BFGS micro-segment.
4. Heavy mode (RS-I-RFO): runs the RS-I-RFO optimizer with optional Hessian reference files and micro-cycle controls defined in the rsirfo YAML section. The flatten behaviour:
   - With --flatten, when more than one imaginary mode remains after convergence the workflow flattens extra modes and reruns RS-I-RFO until only one imaginary mode remains or the flatten-iteration cap is reached.
   - Each flatten iteration recomputes a fresh ML/MM Hessian (partial ML-only by default, or full per --partial-hessian-flatten) for imaginary-mode detection. There is no Bofill update in this path.
5. Mode export + conversion: the converged imaginary mode is always written to vib/imag_*_trj.xyz and mirrored to .pdb when the input was PDB and conversion is enabled. The optimization trajectory and final geometry are also converted to PDB via the input template when --dump is set.
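For reference, the textbook form of the Bofill update mentioned in the light-mode step is written out below. Here $\Delta x$ is the displacement between two geometries, $\Delta g$ the corresponding gradient difference, and $H_k$ the stored Hessian. Whether mlmm applies exactly this expression, and over which subspace, is an assumption that this page does not confirm.

```latex
\begin{aligned}
\xi &= \Delta g - H_k\,\Delta x,\\
\phi &= \frac{(\xi^{\mathsf T}\Delta x)^2}{(\xi^{\mathsf T}\xi)\,(\Delta x^{\mathsf T}\Delta x)},\\
\Delta H_{\mathrm{SR1}} &= \frac{\xi\,\xi^{\mathsf T}}{\xi^{\mathsf T}\Delta x},\\
\Delta H_{\mathrm{PSB}} &= \frac{\xi\,\Delta x^{\mathsf T} + \Delta x\,\xi^{\mathsf T}}{\Delta x^{\mathsf T}\Delta x}
  - \frac{(\xi^{\mathsf T}\Delta x)\,\Delta x\,\Delta x^{\mathsf T}}{(\Delta x^{\mathsf T}\Delta x)^2},\\
H_{k+1} &= H_k + \phi\,\Delta H_{\mathrm{SR1}} + (1-\phi)\,\Delta H_{\mathrm{PSB}}.
\end{aligned}
```

The mixing factor $\phi$ lies between 0 and 1, so the update interpolates between the symmetric rank-one and Powell-symmetric-Broyden corrections. Unlike BFGS, it does not force the Hessian to stay positive definite, which is why it is commonly used near saddle points.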
Outputs
Three artifacts are written to result_tsopt/: final_geometry.pdb (and .xyz) — the optimized first-order saddle point (3-layer B-factor encoding preserved for PDB); vib/imag_*_trj.xyz — animation of every detected imaginary mode (expect exactly one for a valid TS); and vib/imag_*.pdb — PDB companions of the imaginary modes (PDB inputs only).
out_dir/ (default: ./result_tsopt/)
├── final_geometry.xyz # Always written
├── final_geometry.pdb # When the input was PDB
├── optimization_all_trj.xyz # Concatenated Dimer segments (--dump)
├── optimization_all.pdb # PDB companion (--dump, PDB input)
├── vib/
│ ├── imag_NN_±XXXX.XXcm-1_trj.xyz # Imaginary-mode trajectory
│ └── imag_NN_±XXXX.XXcm-1.pdb # PDB companion
└── .dimer_mode.dat # Dimer orientation seed (grad mode)
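The vib/ directory gives a quick way to check how many imaginary modes were written. The sketch below is a hypothetical post-processing helper, not an mlmm tool. It assumes the file names follow the imag_NN_±XXXX.XXcm-1_trj.xyz layout shown in the tree, with an ASCII sign character and a decimal wavenumber; adjust the pattern if the actual names differ.

```python
import re
from pathlib import Path

# Assumed layout from the output tree: imag_NN_±XXXX.XXcm-1_trj.xyz
_IMAG_RE = re.compile(r"imag_(\d+)_([+-]?\d+(?:\.\d+)?)cm-1_trj\.xyz$")


def imaginary_modes(out_dir):
    """Return (index, wavenumber_cm1) pairs parsed from <out_dir>/vib/."""
    modes = []
    for path in sorted(Path(out_dir, "vib").glob("imag_*_trj.xyz")):
        match = _IMAG_RE.search(path.name)
        if match:
            modes.append((int(match.group(1)), float(match.group(2))))
    return modes


def has_single_imaginary_mode(out_dir):
    """True when exactly one imaginary-mode trajectory was found."""
    return len(imaginary_modes(out_dir)) == 1
```

A file count of one is only a first filter: the mode itself and its connectivity still need to be confirmed with freq / irc.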
CLI options
The full flag list is in the generated command reference; the table below covers the options that need explanation — do not hand-duplicate the exhaustive list.
| Option | Description | Default |
|---|---|---|
| Input & charge | | |
| | Starting geometry (PDB or XYZ). If XYZ, use | Required |
| | Reference PDB topology when input is XYZ. | None |
| | Amber parm7 topology for the whole enzyme. | Required |
| | PDB containing the ML-region atoms. Optional when | None |
| | Comma-separated atom indices for the ML region (ranges allowed). | None |
| | Interpret | |
| | Detect ML/MM layers from input PDB B-factors. | |
| | Net charge of the ML region. | None (required unless |
| | Per-resname charge mapping (e.g. | None |
| | Spin multiplicity (2S+1) for the ML region. | |
| Active-region freezing | | |
| | Comma-separated 1-based indices to freeze (merged with YAML | None |
| | Distance cutoff (Å) from the ML region for MM atoms to include in Hessian calculation. Applied to movable MM atoms. | |
| | Distance cutoff (Å) for movable MM atoms. | None |
| TS search & optimizer mode | | |
| | ML Hessian mode: | |
| | Maximum total optimizer cycles. | |
| | TS optimizer mode (Choice: | |
| | Microiteration: alternate ML 1-step (RS-I-RFO) + MM relaxation (L-BFGS). Only effective in | |
| | Use ML-region-only Hessian for dimer orientation in | |
| Convergence & flatten | | |
| | Convergence preset ( | None |
| | Extra-imaginary-mode flattening loop. | None → disabled by default (0 iterations); |
| | Use partial Hessian (ML only) for imaginary-mode detection in the flatten loop. | |
| | Active DOF for final frequency analysis: | |
| | Skip post-convergence frequency analysis and imaginary-mode flattening. Useful for large unfrozen systems where Hessian diagonalization is expensive. TS saddle-point order will NOT be verified. | |
| Backend & compute | | |
| | MLIP backend for the ML region: | |
| | MLIP backend precision; routed to backend-native kwarg (UMA | |
| | xTB point-charge embedding correction for MM-to-ML environmental effects (experimental). | |
| | Cutoff radius (Å) for embed-charge MM atoms. | |
| | CMAP (backbone cross-map dihedral correction) in the model parm7. Disabled by default, consistent with Gaussian ONIOM. | |
| | MM backend (analytical Hessian vs OpenMM finite-difference). | |
| | Link-atom placement: scaled (g-factor) or fixed 1.09 / 1.01 Å. | |
| Output & config | | |
| | Write the concatenated trajectory | |
| | Toggle XYZ / TRJ → PDB companions for PDB inputs. | |
| | Output directory. | |
| | Base YAML configuration applied before explicit CLI options. | None |
| | Print resolved config layers and continue execution. | |
| | Write a machine-readable | |
| | Validate inputs / config and print the execution plan without running TS optimization (shown in | |
YAML configuration
Settings are applied with defaults < config < explicit CLI < override. Shared sections reuse YAML Reference.
geom:
coord_type: cart
freeze_atoms: []
calc:
charge: 0
spin: 1
mlmm:
real_parm7: real.parm7
model_pdb: ml_region.pdb
backend: uma # uma | orb | mace | aimnet2
hessian_calc_mode: Analytical # or FiniteDifference
opt:
thresh: baker
max_cycles: 10000
out_dir: ./result_tsopt/
rsirfo: # --opt-mode hess
trust_max: 0.10 # bohr; tuned for ML/MM stability near the TS
hessian_recalc: 500 # lower (50-200) if the TS mode is lost
track_mode_by_overlap: false # set true if the TS mode switches root
hessian_dimer: # --opt-mode grad
flatten_max_iter: 50 # 0 with --no-flatten
microiter:
micro_thresh: null # MM relaxation preset; null -> same as macro
Full schema (every section, key, and default): YAML Reference.
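The precedence rule at the top of this section (defaults < config < explicit CLI < override) can be pictured as a layered dictionary merge in which later layers win. The sketch below is illustrative only: the function names are hypothetical, and the assumption that nested sections are merged key by key, rather than replaced wholesale, is not confirmed by this page.

```python
def deep_merge(base, extra):
    """Return a new dict where values in `extra` take priority over `base`."""
    merged = dict(base)
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)  # merge nested sections
        else:
            merged[key] = value  # later layer wins
    return merged


def resolve_settings(defaults, config, cli, override):
    """Apply the layers in increasing priority order."""
    resolved = {}
    for layer in (defaults, config, cli, override):
        resolved = deep_merge(resolved, layer)
    return resolved


settings = resolve_settings(
    {"geom": {"coord_type": "cart"}, "opt": {"thresh": "baker", "max_cycles": 10000}},
    {"opt": {"max_cycles": 500}},        # from the --config YAML file
    {"geom": {"coord_type": "dlc"}},     # explicit CLI flag
    {"opt": {"max_cycles": 200}},        # override layer
)
print(settings)
```

In this example the CLI layer changes only geom.coord_type, so opt.thresh keeps its default while opt.max_cycles ends up at the override value.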
Tip
Set rsirfo.track_mode_by_overlap: true if the TS mode switches root during optimization (e.g. when multiple imaginary frequencies are present). If TS convergence is slow or the TS mode is lost, lowering hessian_recalc (e.g. to 50–200) helps — more frequent exact Hessian recalculations improve robustness at the cost of additional Hessian evaluations.
Notes
Active-DOF projection and mass-weighted translation / rotation removal (PHVA + TR projection) mirror freq.py, ensuring consistent imaginary-mode analysis and mode writing.
Note
rsirfo.trust_max defaults to 0.10 bohr for improved ML/MM stability near the TS.
The shared opt block also provides an energy-plateau fallback (energy_plateau: true by default, energy_plateau_thresh: 1.0e-4 au over energy_plateau_window: 50 steps). If the MLIP force noise floor prevents the gradient-based thresh preset from being reached, the optimizer still exits cleanly once the energy itself has plateaued. See YAML Reference for full details.
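One plausible reading of the energy-plateau fallback is sketched below. The exact criterion mlmm applies is not stated on this page, so the rule used here (the spread between the highest and lowest energy over the last window steps falls below the threshold) is an assumption, and the function name is hypothetical. The default values mirror energy_plateau_window and energy_plateau_thresh.

```python
def energy_plateaued(energies, window=50, thresh=1.0e-4):
    """True when the last `window` energies (in au) vary by less than `thresh`.

    Assumed criterion: max - min over the trailing window. Returns False
    until at least `window` energies have been recorded.
    """
    if len(energies) < window:
        return False
    recent = energies[-window:]
    return max(recent) - min(recent) < thresh
```

Called once per optimizer cycle with the energy history so far, a check like this gives an exit condition that does not depend on the gradient norm.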
For --microiter, rsirfo.thresh controls the macro RS-I-RFO step. The MM relaxation threshold is set with microiter.micro_thresh; when it is null or omitted, the micro step uses the same preset as the macro step. There is no --micro-thresh CLI flag.
See Also
Common Error Recipes · Troubleshooting · path-search · opt · freq · irc · all · YAML Reference · Glossary.