YAML Reference

This document provides a comprehensive reference for all YAML configuration options used across pdb2reaction subcommands. Configuration files are passed via --args-yaml and follow the precedence: defaults → CLI → YAML (YAML has highest precedence).


Overview

All pdb2reaction subcommands that accept --args-yaml share common configuration sections:

Section

Description

Used by

geom

Geometry and coordinate settings

all, opt, scan, scan2d, scan3d, tsopt, freq, irc, path-opt, path-search

calc

UMA calculator configuration

all, opt, scan, scan2d, scan3d, tsopt, freq, irc, path-opt, path-search

opt

Shared optimizer settings

opt, scan, scan2d, scan3d, tsopt, path-opt, path-search

lbfgs

L-BFGS optimizer settings

opt, scan, scan2d, scan3d, path-search

rfo

RFO optimizer settings

opt, scan, scan2d, scan3d, path-search

gs

Growing String Method settings

path-opt, path-search

dmf

Direct Max Flux settings

path-opt, path-search

irc

IRC integration settings

irc

freq

Vibrational analysis settings

freq

thermo

Thermochemistry settings

freq

dft

DFT calculation settings

dft

bias

Harmonic bias settings

scan, scan2d, scan3d

bond

Bond-change detection settings

scan, path-search

search

Recursive path search settings

path-search

hessian_dimer

Hessian Dimer TS optimization

tsopt

rsirfo

RS-I-RFO TS optimization

tsopt

sopt

Single-structure optimizer for path-search

path-search


Shared Sections

geom

Geometry loading and coordinate handling.

geom:
  coord_type: cart           # Coordinate type: "cart" (Cartesian) or "dlc" (delocalized internals)
  freeze_atoms: []           # 0-based indices of atoms to freeze; merged with CLI --freeze-links detection

Notes:

  • freeze_atoms from YAML is merged with atoms detected via --freeze-links for PDB inputs

  • Frozen atoms have zeroed forces; their Hessian columns are also zeroed

  • For irc, geom.coord_type is forced to cart after YAML/CLI merging


calc

UMA machine-learning calculator configuration.

calc:
  charge: 0                            # Total system charge (overridden by CLI -q)
  spin: 1                              # Spin multiplicity 2S+1 (overridden by CLI -m)
  model: uma-s-1p1                     # UMA pretrained model name
  task_name: omol                      # Task tag recorded in UMA batches
  device: auto                         # Device: "cuda", "cpu", or "auto"
  max_neigh: null                      # Maximum neighbors for graph construction
  radius: null                         # Cutoff radius for neighbor search
  r_edges: false                       # Store radial edges
  workers: 1                           # UMA inference workers (workers>1 disables analytical Hessians)
  workers_per_node: 1                  # Workers per node for parallel predictor
  out_hess_torch: true                 # Return Hessian as torch.Tensor
  hessian_double: true                 # Assemble/return Hessian in float64
  # freeze_atoms: null                 # Inherited from geom.freeze_atoms; do not set directly
  hessian_calc_mode: FiniteDifference  # Hessian mode: "Analytical" or "FiniteDifference"
  return_partial_hessian: false        # Return only active-DOF Hessian block

Notes:

  • hessian_calc_mode: Analytical is recommended when sufficient VRAM is available

  • workers > 1 disables analytical Hessians

  • Charge/spin inherit .gjf template metadata when available

  • freq sets calc.return_partial_hessian = true by default (PHVA); YAML can override.

  • IRC forces geom.coord_type = cart and calc.return_partial_hessian = false regardless of YAML.

  • For irc, calc.return_partial_hessian is forced to false after YAML/CLI merging


opt

Shared optimizer controls used by both L-BFGS and RFO.

opt:
  type: string                         # StringOptimizer-only (path-opt/path-search): optimizer type label
  thresh: gau                          # Convergence preset: gau_loose, gau, gau_tight, gau_vtight, baker, never
  stop_in_when_full: 300               # StringOptimizer-only: early stop threshold when string is full
  align: false                         # StringOptimizer-only: alignment toggle
  scale_step: global                   # StringOptimizer-only: step scaling mode
  max_cycles: 10000                    # Maximum optimizer iterations
  print_every: 100                     # Logging stride
  min_step_norm: 1.0e-08               # Minimum step norm for acceptance
  assert_min_step: true                # Stop if steps fall below threshold
  rms_force: null                      # Explicit RMS force target
  rms_force_only: false                # Rely only on RMS force convergence
  max_force_only: false                # Rely only on max force convergence
  force_only: false                    # Skip displacement checks
  converge_to_geom_rms_thresh: 0.05    # RMS threshold when converging to reference geometry
  overachieve_factor: 0.0              # Factor to tighten thresholds
  check_eigval_structure: false        # Validate Hessian eigenstructure
  line_search: true                    # Enable line search
  dump: false                          # Dump trajectory/restart data
  dump_restart: false                  # Dump restart checkpoints
  reparam_thresh: 0.0                  # StringOptimizer-only: reparameterization threshold
  coord_diff_thresh: 0.0               # StringOptimizer-only: coordinate difference threshold
  prefix: ""                           # Filename prefix
  out_dir: ./result_opt/               # Output directory

Convergence Presets:

Preset

Max Force

RMS Force

Max Step

RMS Step

gau_loose

2.5e-3

1.7e-3

1.0e-2

6.7e-3

gau

4.5e-4

3.0e-4

1.8e-3

1.2e-3

gau_tight

1.5e-5

1.0e-5

6.0e-5

4.0e-5

gau_vtight

2.0e-6

1.0e-6

6.0e-6

4.0e-6

baker

3.0e-4

2.0e-4

3.0e-4

2.0e-4


lbfgs

L-BFGS optimizer settings (extends opt).

lbfgs:
  # Inherits all opt settings, plus:
  keep_last: 7                         # History size for L-BFGS buffers
  beta: 1.0                            # Initial damping beta
  gamma_mult: false                    # Multiplicative gamma update toggle
  max_step: 0.3                        # Maximum step length
  control_step: true                   # Control step length adaptively
  double_damp: true                    # Double damping safeguard
  mu_reg: null                         # Regularization strength
  max_mu_reg_adaptions: 10             # Cap on mu adaptations

rfo

Rational Function Optimizer settings (extends opt).

rfo:
  # Inherits all opt settings, plus:
  trust_radius: 0.1                    # Trust-region radius
  trust_update: true                   # Enable trust-region updates
  trust_min: 0.0                       # Minimum trust radius
  trust_max: 0.1                       # Maximum trust radius
  max_energy_incr: null                # Allowed energy increase per step
  hessian_update: bfgs                 # Hessian update scheme: bfgs, bofill, etc.
  hessian_init: calc                   # Hessian initialization: calc, unit, etc.
  hessian_recalc: 200                  # Rebuild Hessian every N steps
  hessian_recalc_adapt: null           # Adaptive Hessian rebuild factor
  small_eigval_thresh: 1.0e-08         # Eigenvalue threshold for stability
  alpha0: 1.0                          # Initial micro step
  max_micro_cycles: 50                 # Micro-iteration limit
  rfo_overlaps: false                  # Enable RFO overlaps
  gediis: false                        # Enable GEDIIS
  gdiis: true                          # Enable GDIIS
  gdiis_thresh: 0.0025                 # GDIIS acceptance threshold
  gediis_thresh: 0.01                  # GEDIIS acceptance threshold
  gdiis_test_direction: true           # Test descent direction before DIIS
  adapt_step_func: true                # Adaptive step scaling

Path Optimization Sections

gs

Growing String Method settings.

gs:
  fix_first: true                      # Keep first endpoint fixed
  fix_last: true                       # Keep last endpoint fixed
  max_nodes: 10                        # Maximum string nodes (internal images)
  perp_thresh: 0.005                   # Perpendicular displacement threshold
  reparam_check: rms                   # Reparametrization check metric
  reparam_every: 1                     # Reparametrization stride
  reparam_every_full: 1                # Full reparametrization stride
  param: equi                          # Parametrization scheme
  max_micro_cycles: 10                 # Micro-iteration limit
  reset_dlc: true                      # Rebuild delocalized coordinates each step
  climb: true                          # Enable climbing image
  climb_rms: 0.0005                    # Climbing RMS threshold
  climb_lanczos: true                  # Lanczos refinement for climbing
  climb_lanczos_rms: 0.0005            # Lanczos RMS threshold
  climb_fixed: false                   # Keep climbing image fixed
  scheduler: null                      # Optional scheduler backend

dmf

Direct Max Flux settings for MEP optimization.

dmf:
  max_cycles: 300                      # Maximum DMF/IPOPT iterations (overridden by --max-cycles)
  correlated: true                     # Correlated DMF propagation
  sequential: true                     # Sequential DMF execution
  fbenm_only_endpoints: false          # Run FB-ENM beyond endpoints
  fbenm_options:
    delta_scale: 0.2                   # FB-ENM displacement scaling
    bond_scale: 1.25                   # Bond cutoff scaling
    fix_planes: true                   # Enforce planar constraints
    two_hop_mode: sparse               # Neighbor traversal strategy
  cfbenm_options:
    bond_scale: 1.25                   # CFB-ENM bond cutoff scaling
    corr0_scale: 1.1                   # Correlation scale for corr0
    corr1_scale: 1.5                   # Correlation scale for corr1
    corr2_scale: 1.6                   # Correlation scale for corr2
    eps: 0.05                          # Correlation epsilon
    pivotal: true                      # Pivotal residue handling
    single: true                       # Single-atom pivots
    remove_fourmembered: true          # Prune four-membered rings
    two_hop_mode: sparse               # Neighbor traversal strategy
  dmf_options:
    remove_rotation_and_translation: false  # Keep rigid-body motions
    mass_weighted: false               # Toggle mass weighting
    parallel: false                    # Enable parallel DMF
    eps_vel: 0.01                      # Velocity tolerance
    eps_rot: 0.01                      # Rotational tolerance
    beta: 10.0                         # Beta parameter for DMF
    update_teval: false                # Update transition evaluation
  k_fix: 300.0                         # Harmonic constant for restraints


sopt

Single-structure optimizers for path-search (HEI±1 and kink nodes).

sopt:
  lbfgs:
    # Same keys as lbfgs section above
    thresh: gau
    max_cycles: 10000
    out_dir: ./result_path_search/
    dump: false
    # ... (see lbfgs section)
  rfo:
    # Same keys as rfo section above
    thresh: gau
    max_cycles: 10000
    out_dir: ./result_path_search/
    dump: false
    # ... (see rfo section)

TS Optimization Sections

hessian_dimer

Hessian Dimer TS optimization settings (tsopt –opt-mode light).

hessian_dimer:
  thresh_loose: gau_loose              # Loose convergence preset
  thresh: baker                        # Main convergence preset
  update_interval_hessian: 500         # Hessian rebuild cadence
  neg_freq_thresh_cm: 5.0              # Negative frequency threshold (cm⁻¹)
  flatten_amp_ang: 0.1                 # Flattening amplitude (Å)
  flatten_max_iter: 50                 # Flattening iteration cap
  flatten_sep_cutoff: 0.0              # Minimum distance between representative atoms
  flatten_k: 10                        # Representative atoms sampled per mode
  flatten_loop_bofill: false           # Bofill update for flatten displacements
  mem: 100000                          # Memory limit for solver
  device: auto                         # Device selection for eigensolver
  root: 0                              # Targeted TS root index
  dimer:
    length: 0.0189                     # Dimer separation (Bohr)
    rotation_max_cycles: 15            # Max rotation iterations
    rotation_method: fourier           # Rotation optimizer method
    rotation_thresh: 0.0001            # Rotation convergence threshold
    rotation_tol: 1                    # Rotation tolerance factor
    rotation_max_element: 0.001        # Max rotation matrix element
    rotation_interpolate: true         # Interpolate rotation steps
    rotation_disable: false            # Disable rotations entirely
    rotation_disable_pos_curv: true    # Disable when positive curvature detected
    rotation_remove_trans: true        # Remove translational components
    trans_force_f_perp: true           # Project forces perpendicular to translation
    bonds: null                        # Bond list for constraints
    N_hessian: null                    # Hessian size override
    bias_rotation: false               # Bias rotational search
    bias_translation: false            # Bias translational search
    bias_gaussian_dot: 0.1             # Gaussian bias dot product
    seed: null                         # RNG seed for rotations
    write_orientations: true           # Write rotation orientations
    forward_hessian: true              # Propagate Hessian forward
  lbfgs:
    # Same keys as lbfgs section
    thresh: baker
    max_cycles: 10000

rsirfo

RS-I-RFO TS optimization settings (tsopt –opt-mode heavy).

rsirfo:
  thresh: baker                        # RS-IRFO convergence preset
  max_cycles: 10000                    # Iteration cap
  print_every: 100                     # Logging stride
  min_step_norm: 1.0e-08               # Minimum accepted step norm
  assert_min_step: true                # Assert when steps stagnate
  roots: [0]                           # Target root indices
  hessian_ref: null                    # Reference Hessian
  rx_modes: null                       # Reaction-mode definitions
  prim_coord: null                     # Primary coordinates to monitor
  rx_coords: null                      # Reaction coordinates to monitor
  hessian_update: bofill               # Hessian update scheme
  hessian_recalc_reset: true           # Reset recalc counter after exact Hessian
  max_micro_cycles: 50                 # Micro-iterations per macro cycle
  augment_bonds: false                 # Augment reaction path based on bond analysis
  min_line_search: true                # Enforce minimum line-search step
  max_line_search: true                # Enforce maximum line-search step
  assert_neg_eigval: false             # Require negative eigenvalue at convergence
  # Also inherits rfo-like settings: trust_radius, trust_update, etc.

IRC Section

irc (section)

IRC integration settings.

irc:
  step_length: 0.1                     # Integration step length
  max_cycles: 125                      # Maximum steps along IRC
  downhill: false                      # Follow downhill direction only
  forward: true                        # Propagate in forward direction
  backward: true                       # Propagate in backward direction
  root: 0                              # Normal-mode root index
  hessian_init: calc                   # Hessian initialization source
  hessian_update: bofill               # Hessian update scheme
  hessian_recalc: null                 # Hessian rebuild cadence
  displ: energy                        # Displacement construction method
  displ_energy: 0.001                  # Energy-based displacement scaling
  displ_length: 0.1                    # Length-based displacement fallback
  rms_grad_thresh: 0.001               # RMS gradient convergence threshold
  hard_rms_grad_thresh: null           # Hard RMS gradient stop
  energy_thresh: 0.000001              # Energy change threshold
  imag_below: 0.0                      # Imaginary frequency cutoff
  force_inflection: true               # Enforce inflection detection
  check_bonds: false                   # Check bonds during propagation
  out_dir: ./result_irc/               # Output directory
  prefix: ""                           # Filename prefix
  dump_fn: irc_data.h5                 # IRC data filename
  dump_every: 5                        # Dump stride
  max_pred_steps: 500                  # Predictor-corrector max steps
  loose_cycles: 3                      # Loose cycles before tightening
  corr_func: mbs                       # Correlation function choice

Vibrational Analysis Sections

freq (section)

Vibrational frequency analysis settings.

freq:
  amplitude_ang: 0.8                   # Displacement amplitude for modes (Å)
  n_frames: 20                         # Number of frames per mode animation
  max_write: 10                        # Maximum number of modes to write
  sort: value                          # Sort order: "value" or "abs"

thermo

Thermochemistry settings.

thermo:
  temperature: 298.15                  # Thermochemistry temperature (K)
  pressure_atm: 1.0                    # Thermochemistry pressure (atm)
  dump: false                          # Write thermoanalysis.yaml

DFT Section

dft (section)

DFT calculation settings.

dft:
  func: wb97m-v                        # Exchange-correlation functional
  basis: def2-tzvpd                    # Basis set name
  func_basis: null                     # Combined "FUNC/BASIS" string (overrides func/basis)
  conv_tol: 1.0e-09                    # SCF convergence tolerance (Hartree)
  max_cycle: 100                       # Maximum SCF iterations
  grid_level: 3                        # PySCF grid level
  verbose: 0                           # PySCF verbosity (0-9)
  out_dir: ./result_dft/               # Output directory root

Scan Sections

bias

Harmonic bias settings for scans.

bias:
  k: 300.0                             # Harmonic bias strength (eV·Å⁻²)

bond

UMA-based bond-change detection.

bond:
  device: cuda                         # UMA device for bond analysis
  bond_factor: 1.2                     # Covalent-radius scaling for cutoff
  margin_fraction: 0.05                # Fractional tolerance for comparisons
  delta_fraction: 0.05                 # Minimum relative change to flag bond formation/breaking

Example: Complete Configuration File

Below is a comprehensive example combining multiple sections:

# pdb2reaction configuration example
# Use with: pdb2reaction all -i R.pdb P.pdb -c 'LIG' --args-yaml config.yaml

geom:
  coord_type: cart
  freeze_atoms: []

calc:
  charge: 0
  spin: 1
  model: uma-s-1p1
  device: auto
  hessian_calc_mode: Analytical        # Recommended when VRAM permits

gs:
  max_nodes: 12
  climb: true
  climb_lanczos: true

opt:
  thresh: gau
  max_cycles: 300
  dump: false
  out_dir: ./result_all/

sopt:
  lbfgs:
    thresh: gau
    max_cycles: 10000
  rfo:
    thresh: gau
    max_cycles: 10000

bond:
  bond_factor: 1.2
  delta_fraction: 0.05

search:
  max_depth: 10
  max_nodes_segment: 10

freq:
  max_write: 10
  amplitude_ang: 0.8

thermo:
  temperature: 298.15
  pressure_atm: 1.0

dft:
  func: wb97m-v
  basis: def2-tzvpd
  grid_level: 3

See Also

  • all - End-to-end workflow

  • opt - Single-structure optimization

  • tsopt - Transition state optimization

  • path-search - Recursive MEP search

  • freq - Vibrational analysis

  • dft - DFT calculations

  • uma_pysis - UMA calculator details