Architecture: mlmm-toolkit


1. Overview

mlmm-toolkit is a Python CLI that performs ML/MM (ONIOM) enzymatic reaction-path analysis on a complete protein environment. ML/MM here means a hybrid model in which a small reaction core is treated by a machine-learning interatomic potential (ML) and the surrounding protein by a molecular-mechanics (MM) force field, combined through the subtractive ONIOM (Our own N-layered Integrated molecular Orbital and molecular Mechanics) energy scheme.
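The subtractive combination named above can be written out as a short sketch, assuming a two-layer split in which the whole system ("real") is evaluated with MM and the reaction core ("model") is evaluated with both MM and ML. The function and argument names are illustrative only and are not the toolkit's API.

```python
def oniom_subtractive_energy(mm_real: float, ml_model: float, mm_model: float) -> float:
    """Two-layer subtractive ONIOM: E = E_MM(real) + E_ML(model) - E_MM(model)."""
    return mm_real + ml_model - mm_model


# Made-up energies, chosen only to exercise the arithmetic.
e_total = oniom_subtractive_energy(mm_real=-12.50, ml_model=-230.75, mm_model=-3.25)
print(e_total)  # -240.0
```

The MM energy of the core is subtracted so that the core is not counted twice: once at the MM level inside the real system and once at the ML level.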

The input is a PDB plus a substrate name. From these the tool automatically generates the parm7 topology and encodes the ONIOM region split (ML / Movable MM / Frozen MM) into B-factor channels. It then runs a full-system Hessian-based transition-state (TS) search via a macro/micro alternation scheme.

The result is a full reaction path produced by the stage pipeline extract → mm-parm → ONIOM model → MEP → tsopt → IRC → freq → dft, where MEP is the minimum-energy path and IRC the intrinsic reaction coordinate.

The package is laid out as 6 physical layer directories (cli/, workflows/, domain/, backends/, io/, core/). The role and dependency direction of each are summarized in the §2.1 layer table below.

External code imports directly from the layer directory (from mlmm.backends.mlmm_calc import MLMMCore, from mlmm.core.utils import x, import mlmm.io.trj2fig, etc.); the previous flat-top shim layer has been retired in this release. §2.4 details the two import surfaces this leaves.

Three bundled forks (pysisyphus/, thermoanalysis/, hessian_ff/) live at the repo top as repo-internal modules. They are deliberately not the upstream PyPI distributions (and hessian_ff/ has no upstream at all — bundling is mandatory). See §6.


2. Layered structure (6 physical directories)

2.1 Layer table

| layer | dir | responsibility | may depend on |
| --- | --- | --- | --- |
| L1 Interface | mlmm/cli/ | Click root group, decorator factories, --help-advanced, bool flag normalization, subcommand resolver, AmberTools preflight | workflows/, core/ |
| L2 Application | mlmm/workflows/ | per-subcommand orchestration; one file per stage runner (all.py, path_search.py, tsopt.py, extract.py, oniom_export.py, mm_parm.py, …) | domain/, backends/, io/, core/ |
| L3 Domain | mlmm/domain/ | chemistry-aware helper logic (bond change detection, bond summary, element-info propagation) | core/ |
| L4a Infra (MLIP + ONIOM) | mlmm/backends/ | MLIP backend dispatcher + per-backend adapter + ML/MM ONIOM calculator core | core/ |
| L4b Infra (I/O) | mlmm/io/ | output layout, summary, trajectory, PDB fix, energy diagram, Hessian cache, analytical-Hessian glue | core/ |
| L5 Foundation | mlmm/core/ | defaults (single source of truth), utils (PDB / XYZ / plot helpers), future errors.py / types.py | (none) |
| (bundle, not a layer) | <repo>/pysisyphus/, <repo>/thermoanalysis/, <repo>/hessian_ff/ | repo-internal forks (optimizer / thermochemistry / analytical MM Hessian) | (sibling, layer-external) |

Dependency direction (one-way): L1 → L2 → {L3, L4} → L5 (per the §2.1 layer table). The directional rule is enforced by CI marker coverage (.github/scripts/check_engineering_markers.py). Bundled forks sit outside the layer graph and may be imported from any layer through their absolute package path (from pysisyphus.X import Y, from hessian_ff.analytical_hessian import …).
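To make the one-way rule concrete, here is a minimal sketch of a layer-direction check built on the standard-library ast module. It is not the contents of check_engineering_markers.py: the ALLOWED mapping is transcribed from the "may depend on" column of the layer table, and the function name layer_violations is hypothetical.

```python
import ast

# Allowed targets per layer directory, transcribed from the layer table.
ALLOWED = {
    "cli": {"workflows", "core"},
    "workflows": {"domain", "backends", "io", "core"},
    "domain": {"core"},
    "backends": {"core"},
    "io": {"core"},
    "core": set(),
}


def layer_violations(layer: str, source: str) -> list[str]:
    """Return the mlmm.<layer> imports in `source` that `layer` may not use."""
    bad = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and node.level == 0:
            names = [node.module or ""]
        elif isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        else:
            continue
        for name in names:
            parts = name.split(".")
            # Only mlmm.<layer> imports are checked; bundled forks are layer-external.
            if len(parts) >= 2 and parts[0] == "mlmm" and parts[1] in ALLOWED:
                target = parts[1]
                if target != layer and target not in ALLOWED[layer]:
                    bad.append(name)
    return bad


print(layer_violations("domain", "from mlmm.workflows.all import cli"))
```

Imports of the bundled forks (for example pysisyphus) are ignored on purpose, matching the statement that they sit outside the layer graph.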

2.2 ASCII map of the package tree

mlmm_toolkit/ [GH: t-0hmura/mlmm_toolkit]
├── pyproject.toml packages.find = ["mlmm*",...] (glob, frozen)
├── README.md / CONTRIBUTING.md / CHANGELOG.md
├── docs/
│ ├── architecture.md ← this file
│ └──... (Sphinx site, unchanged)
├── mlmm/ ← package body, 6-layer physical dir
│ ├── __init__.py PEP 562 lazy: _LAZY_IMPORTS + __getattr__
│ ├── __main__.py `from mlmm.cli.app import cli`
│ ├── _version.py / py.typed
│ │
│ ├── cli/ # === L1 Interface ===
│ │ ├── app.py Click group + _LAZY_SUBCOMMANDS registry (absolute paths)
│ │ ├── common_options.py @add_precision_option / @add_backend_model_option / @add_ml_charge_spin_options et al.
│ │ ├── decorators.py make_is_param_explicit, bool/YAML helpers, render_cli_exception
│ │ ├── help_pages.py --help-advanced pager
│ │ ├── bool_compat.py --flag / --no-flag normalization
│ │ ├── default_group.py subcommand resolver, lazy module import
│ │ └── preflight.py AmberTools / conda env / GPU preflight
│ │
│ ├── workflows/ # === L2 Application ===
│ │ ├── all.py full pipeline orchestrator (extract → … → DFT)
│ │ ├── path_search.py / path_opt.py MEP search / COS wrapper
│ │ ├── tsopt.py / freq.py / irc.py / dft.py per-stage runners
│ │ ├── opt.py / scan.py / scan2d.py /
│ │ │ scan3d.py / scan_common.py ONIOM geometry opt / scans
│ │ ├── extract.py active-site extraction CLI
│ │ ├── define_layer.py ML / Movable MM / Frozen MM B-factor assignment
│ │ ├── mm_parm.py AmberTools-driven parm7 / rst7 generation
│ │ ├── oniom_export.py ONIOM input writer (Gaussian / ORCA)
│ │ ├── oniom_import.py ONIOM input reader (sanity / atom-name diff)
│ │ └── align_freeze.py Kabsch + frozen-subset rmsd
│ │
│ ├── domain/ # === L3 Domain ===
│ │ ├── bond_changes.py R↔P bond detection
│ │ ├── bond_summary.py post-IRC diagnostic
│ │ └── add_elem_info.py PDB element column normalizer
│ │
│ ├── backends/ # === L4a Infra (MLIP + ONIOM) ===
│ │ ├── __init__.py --precision routing (apply_precision_to_calc_cfg)
│ │ ├── mlmm_calc.py ML/MM ONIOM calculator core (4 MLIP backends UMA / Orb / MACE / AIMNet2
│ │ inline; CHEMISTRY-RULE:1 / 2 / 8 / 9 host)
│ │ │ Future: split into base.py + per-backend uma.py / orb.py
│ │ │ / mace.py / aimnet2.py + ONIOM subdir
│ │ └── xtb_embedcharge_correction.py xTB point-charge embedding correction (--embedcharge)
│ │
│ ├── io/ # === L4b Infra (I/O) ===
│ │ ├── summary.py summary.json / summary.md writer
│ │ ├── energy_diagram.py Plotly diagram
│ │ ├── trj2fig.py trajectory → PNG / HTML / SVG / PDF
│ │ ├── pdb_fix.py altloc resolution
│ │ ├── hessian_cache.py in-memory Hessian cache
│ │ └── hessian_calc.py numerical-Hessian build + frequency / vibrational I/O helpers
│ │
│ ├── core/ # === L5 Foundation ===
│ │ ├── defaults.py C1 single source of truth for every default
│ │ ├── utils.py PDB / XYZ / plot helpers
│ │ ├── logging.py -v / -vv logging wiring
│ │ ├── calc_eval.py per-stage calc evaluation
│ │ └── residue_data.py residue tables
│ │
│ └── mcp/ # non-layer subpackage: MCP server exposing every CLI subcommand
│   ├── server.py / _runner.py
│   └── _tools.py
│
├── tests/ smoke / unit
├── .github/ workflows/ + scripts/ (docs-quality lint helpers; CI-only)
└── (repo-top sibling, layer-external bundled forks)
 pysisyphus/ ~90 file, repo-internal fork (slimmed; CLI driver + QM backends + wavefunction + dead optimizers / IRC / NEB variants removed)
 thermoanalysis/ 5 file, repo-internal fork
 hessian_ff/ 19 file / 4.2k LOC, NO upstream PyPI, mandatory bundling

2.3 Per-layer responsibility detail

L1 cli/. Only this layer constructs Click commands and parses argv. app.py holds the root Click.Group plus the _LAZY_SUBCOMMANDS registry — every entry uses an absolute module path (mlmm.workflows.all, mlmm.io.trj2fig, …) so the resolver is independent of where default_group.py itself lives. The mlmm-specific preflight.py (AmberTools / conda env / GPU preflight) lives here because it runs during CLI startup before any L2 workflow is invoked.

L2 workflows/ (~21 files). One file per subcommand. Each file owns a single @click.command() named cli and its private helpers. Large stage runners (all.py = 4,414 LOC, path_search.py = 2,352 LOC, tsopt.py = 3,181 LOC, extract.py = 2,274 LOC, oniom_export.py = 2,027 LOC) remain as single files in the current layout; future work may split them into per-stage subdirectories, but this is opt-in and out of scope for this release line.

L3 domain/. Chemistry-aware helper logic that may import torch / numpy / pysisyphus.constants (numeric back-ends), but may not import machine-learning interatomic potential (MLIP) runtimes (fairchem, orb_models, mace, aimnet). Two distinct CI gates cover this, both in .github/scripts/check_engineering_markers.py:

  • The MLIP-runtime deny list (fairchem / orb_models / mace / aimnet) is enforced repo-wide by _check_external_library_scope, which forbids those imports in any module outside backends/.

  • The separate # DOMAIN_PURE module-docstring marker drives a second CI gate (_check_domain_pure). It flags the specific backend-agnostic modules that must stay MLIP-free: backends/mlmm_calc.py, workflows/tsopt.py, and workflows/freq.py (the marker is also present on workflows/sp.py). It is not itself the deny-list mechanism, and no domain/ file carries it.

Domain helpers are reusable by any L2 stage runner.
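The marker-based gate described above can be sketched as follows. This is an illustration under stated assumptions, not the real _check_domain_pure: the function name domain_pure_violations is hypothetical, and the sketch inspects only module-level import statements, since this document does not say how the real gate treats imports placed inside functions.

```python
import ast

# Deny list of MLIP runtimes named in the text.
MLIP_RUNTIMES = {"fairchem", "orb_models", "mace", "aimnet"}


def domain_pure_violations(source: str) -> set[str]:
    """Return denied MLIP runtimes imported at module level by a marked module."""
    tree = ast.parse(source)
    # The gate only applies to modules whose docstring carries the marker.
    if "# DOMAIN_PURE" not in (ast.get_docstring(tree) or ""):
        return set()
    roots = set()
    for node in tree.body:  # module-level statements only (assumption)
        if isinstance(node, ast.Import):
            roots.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            roots.add(node.module.split(".")[0])
    return roots & MLIP_RUNTIMES


marked = '"""Stage runner.\n\n# DOMAIN_PURE\n"""\nimport numpy\nfrom mace.calculators import mace_mp\n'
print(domain_pure_violations(marked))  # {'mace'}
```

Because ast.parse never executes the source, the check works even when none of the listed runtimes is installed.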

L4a backends/. The ML/MM ONIOM calculator core (mlmm_calc.py = 2,550 LOC) lives here together with the backend dispatch (__init__.py) and the standalone xTB point-charge embedding correction (xtb_embedcharge_correction.py, driven by --embedcharge). Today the 4 MLIP backends (UMA / Orb / MACE / AIMNet2) that evaluate the ML region and the OpenMM / hessian_ff coupling all sit inline inside mlmm_calc.py; future work may split this into backends/{base, uma, orb, mace, aimnet2}.py for the MLIP layer plus a backends/mlmm_calc/ subdir for the ONIOM core (core.py, ase_calc.py, embed_charge.py, hessianff_calc.py, openmm_calc.py, facade.py). The current single-file mlmm_calc.py carries chemistry rules #1 (subtractive ONIOM), #2 (link-atom Hessian B-matrix), #8 (3-layer 5-pass partial Hessian), and #9 (parm7 atom indexing) — see §5.1.

L4b io/ (7 files). Output-side I/O concerns: per-stage summary writer, energy diagram, trajectory rendering, PDB altloc fix, Hessian cache, numerical Hessian construction + frequency / vibrational I/O (hessian_calc.py). io/ never depends on workflows/; output format is owned here and consumed by stage runners.

L5 core/. The lowest layer. defaults.py is the single source of truth for every CLI default — grep here before adding a number anywhere else. utils.py is a ~3,200-LOC grab-bag of PDB / XYZ / plotting helpers; future work may split it into utils/{pdb,plot,coord,yaml,freeze,input_prep}.py. logging.py (-v / -vv wiring), calc_eval.py (per-stage calc evaluation) and residue_data.py (residue tables) also live here. The internal-only modules errors.py, types.py / _stage.py will be introduced here as they land.

2.4 Lazy-import mechanism (conceptual diagram)

External consumer Package root Layer dir
------------------ ---------------- -----------

from mlmm.core.utils import x ────────────────────────────────────► mlmm/core/utils.py

import mlmm.io.trj2fig ──────────────────────────────────────────► mlmm/io/trj2fig.py

from mlmm.backends.mlmm_calc import ─────────────────────────────► mlmm/backends/mlmm_calc.py
 MLMMCore

from mlmm import MLMMCore ─────► mlmm/__init__.py
 __getattr__("MLMMCore")
 └─► _LAZY_IMPORTS["MLMMCore"]
 = "mlmm.backends.mlmm_calc"
 └─► importlib.import_module(...)
 └─► getattr(module, "MLMMCore")

mlmm myaction ─────────────────► mlmm/cli/app.py
 _LAZY_SUBCOMMANDS["myaction"]
 = ("mlmm.workflows.myaction", "cli", "...")
 └─► importlib.import_module(absolute path)
 └─► getattr(module, "cli") → Click command

Two import surfaces (the flat-top shim layer was retired in this release; downstream code that used from mlmm.<oldmod> must migrate to the layered path):

  1. Layered import path: external code imports directly from the layer directory (see the §2.1 layer table; e.g. from mlmm.backends.mlmm_calc import MLMMCore).

  2. Root symbol attribute (from mlmm import MLMMCore) — handled by mlmm/__init__.py:_LAZY_IMPORTS + PEP 562 __getattr__. The five re-exported symbols (MLMMCore, MLMMASECalculator, mlmm, mlmm_ase, mlmm_mm_only) all resolve to mlmm.backends.mlmm_calc and are loaded on first access, so import mlmm stays cheap (only __version__ is eager). There is no root module-attribute surface — submodules are reached by their full path (import mlmm.io.trj2fig), not as attributes of the top-level package.

The CLI subcommand resolver (cli/app.py:_LAZY_SUBCOMMANDS) uses absolute module paths (e.g. "mlmm.workflows.all") so that moving default_group.py into cli/ does not silently break subcommand discovery (the registry no longer depends on __package__).
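The root-symbol surface relies on PEP 562, which lets a module define __getattr__ to resolve missing attributes on demand. The sketch below reproduces the mechanism on a throwaway module named demo_pkg (hypothetical), with standard-library targets standing in for the real registry entries; it is not a copy of mlmm/__init__.py.

```python
import importlib
import sys
import types

# Stand-in registry: public name -> module that defines it.
# (In the text, the real entries all point at "mlmm.backends.mlmm_calc".)
_LAZY_IMPORTS = {"sqrt": "math", "dumps": "json"}


def _lazy_getattr(name: str):
    """PEP 562 module-level __getattr__: import the target on first access."""
    try:
        module_path = _LAZY_IMPORTS[name]
    except KeyError:
        raise AttributeError(f"module 'demo_pkg' has no attribute {name!r}") from None
    return getattr(importlib.import_module(module_path), name)


# Build a throwaway module and register it so the import statement can find it.
demo_pkg = types.ModuleType("demo_pkg")
demo_pkg.__getattr__ = _lazy_getattr
sys.modules["demo_pkg"] = demo_pkg

from demo_pkg import sqrt  # triggers _lazy_getattr("sqrt")

print(sqrt(9.0))  # 3.0
```

Nothing listed in the registry is imported until a name is requested, which is the property that keeps the top-level import cheap.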


3. Fresh-eyes 5-step navigation (≈ 40 min total)

For a contributor opening the repo for the first time, follow this path top-to-bottom; each step closes one concern.

| step | minutes | open | what you learn |
| --- | --- | --- | --- |
| 1 | 3 | README.md | one-paragraph elevator pitch + single-command usage |
| 2 | 5 | this file (docs/architecture.md) §2 + §4 | 6-layer dir tree, dependency direction, where each concern lives |
| 3 | 5 | mlmm/cli/app.py | Click root group, _LAZY_SUBCOMMANDS registry (≈ 22 entries), absolute-path resolution |
| 4 | 20 | mlmm/workflows/all.py (4,414 LOC, skim) | one full subcommand top-to-bottom; trace extract → mm-parm → ONIOM model → MEP → tsopt → IRC → freq → dft |
| 5 | 7 | CONTRIBUTING.md §3 + §4 | 5 add-a-X recipes + the “do not touch” hidden constraints |

After step 5 you can read any other file by following the file index in §4. The package is intentionally flat-within-each-layer — there is no nested package below mlmm/<layer>/ (other than the future split of backends/mlmm_calc/ into per-backend modules), so you never need to navigate more than two directories deep.


4. File index — “where does this concern live?”

4.1 CLI / entry (L1 cli/)

| concern | file |
| --- | --- |
| Click root group + subcommand dispatch | mlmm/cli/app.py |
| Subcommand resolver (lazy import) | mlmm/cli/default_group.py |
| python -m mlmm shim | mlmm/__main__.py |
| Shared option decorator factories | mlmm/cli/common_options.py |
| Bool/YAML/exception CLI helpers | mlmm/cli/decorators.py |
| --help-advanced pager | mlmm/cli/help_pages.py |
| Bool flag compat (--flag / --no-flag + value style) | mlmm/cli/bool_compat.py |
| AmberTools / conda env / GPU preflight | mlmm/cli/preflight.py |

4.2 Workflow stage runners (L2 workflows/)

Acronyms used below: MEP = minimum-energy path; GSM = growing-string method; COS = chain-of-states; RSIRFO = restricted-step image-function rational-function optimization (also written RS-I-RFO); Bofill = the Bofill Hessian-update formula; PHVA = partial Hessian vibrational analysis; IRC = intrinsic reaction coordinate; Kabsch = the Kabsch rigid-body alignment algorithm.

| concern | file |
| --- | --- |
| Full pipeline orchestrator | mlmm/workflows/all.py |
| Geometry optimization (ONIOM macro/micro pre-opt) | mlmm/workflows/opt.py |
| 1D / 2D / 3D scans + shared | mlmm/workflows/scan{,2d,3d,_common}.py |
| MEP search (GSM) | mlmm/workflows/path_search.py |
| MEP optimizer core (pysisyphus COS) | mlmm/workflows/path_opt.py |
| TS optimization (RSIRFO + Bofill + macro/micro) | mlmm/workflows/tsopt.py |
| Vibrational analysis (PHVA + UMA active block) | mlmm/workflows/freq.py |
| IRC integration (macro / micro) | mlmm/workflows/irc.py |
| Single-point DFT (gpu4pyscf subprocess, ONIOM-embedded) | mlmm/workflows/dft.py |
| Active-site extraction (cluster cap) | mlmm/workflows/extract.py |
| ML / Movable MM / Frozen MM region assignment | mlmm/workflows/define_layer.py |
| AmberTools-driven MM parameter generation | mlmm/workflows/mm_parm.py |
| ONIOM input writer (Gaussian / ORCA) | mlmm/workflows/oniom_export.py |
| ONIOM input reader (sanity, atom-name diff) | mlmm/workflows/oniom_import.py |
| Kabsch / frozen-subset alignment | mlmm/workflows/align_freeze.py |

4.3 Chemistry helpers (L3 domain/)

| concern | file |
| --- | --- |
| R↔P bond change detection | mlmm/domain/bond_changes.py |
| Post-IRC bond summary | mlmm/domain/bond_summary.py |
| PDB element column normalizer | mlmm/domain/add_elem_info.py |

4.4 MLIP + ONIOM (L4a backends/)

| concern | file |
| --- | --- |
| ML/MM ONIOM calculator core + 4 inline MLIP backends + ONIOM coupling | mlmm/backends/mlmm_calc.py |
| --precision routing (apply_precision_to_calc_cfg / _PRECISION_DISPATCH) | mlmm/backends/__init__.py |
| Backend dispatch / factory (_create_ml_backend) | mlmm/backends/mlmm_calc.py |
| xTB point-charge embedding correction (--embedcharge) | mlmm/backends/xtb_embedcharge_correction.py |
| per-backend adapter split (planned, not yet present) | mlmm/backends/{base, uma, orb, mace, aimnet2}.py |
| ONIOM core subdir (planned, not yet present) | mlmm/backends/mlmm_calc/{core, ase_calc, embed_charge, hessianff_calc, openmm_calc, facade}.py |

See MLIP Backends for the add-a-backend recipe (currently scoped to the planned per-backend split; until that lands, backend additions touch mlmm_calc.py inline).

4.5 I/O (L4b io/)

| concern | file |
| --- | --- |
| summary.json / summary.md writer | mlmm/io/summary.py |
| Plotly energy diagram | mlmm/io/energy_diagram.py |
| Trajectory → PNG / HTML / SVG / PDF | mlmm/io/trj2fig.py |
| PDB altloc resolution | mlmm/io/pdb_fix.py |
| In-memory Hessian cache (per-run TTL) | mlmm/io/hessian_cache.py |
| Numerical Hessian build + frequency / vibrational I/O | mlmm/io/hessian_calc.py |
| Harmonic restraint setup | mlmm/workflows/restraints.py (L2 stage helper) |

4.6 Foundation (L5 core/)

| concern | file |
| --- | --- |
| Every CLI default (single source of truth) | mlmm/core/defaults.py |
| PDB / XYZ / plot helpers | mlmm/core/utils.py |
| -v / -vv logging wiring | mlmm/core/logging.py |
| Per-stage calc evaluation | mlmm/core/calc_eval.py |
| Residue tables | mlmm/core/residue_data.py |
| (future) internal Protocol / TypedDict | mlmm/core/types.py |

4.7 Repo-internal bundled forks

| dir | role | divergent files (do NOT replace with upstream) |
| --- | --- | --- |
| pysisyphus/ | optimizer / TS / IRC engine | irc/IRC.py, optimizers/hessian_updates.py, run.py, tsoptimizers/TSHessianOptimizer.py, calculators/* (5 files total) |
| thermoanalysis/ | thermochemistry (ΔG, ZPE, partition functions) | QCData.py (branding diff vs upstream) |
| hessian_ff/ | analytical Hessian on MM force field; PyPI 404, bundling is mandatory | analytical_hessian.py (sole entry consumed by mlmm/backends/mlmm_calc.py) |

See each dir’s README.md for the touch-restriction boundary.


5. Hidden constraints (read this before any patch)

5.1 Nine chemistry rules (grep recipe)

Nine correctness-critical rules are spread across backends/, workflows/, and core/defaults.py. They are not detected by smoke tests — silent drift here breaks reaction-path accuracy. Inline # CHEMISTRY-RULE:N markers and # DOMAIN_PURE module-docstring markers identify the rules; .github/scripts/check_engineering_markers.py enforces marker completeness in CI.

To find every chemistry rule before editing:

# List all 9 rule sites in the repo (host file + line)
grep -rnE '# CHEMISTRY-RULE:[0-9]+' mlmm/

# List every # DOMAIN_PURE marker (= chemistry-rule host modules)
grep -rn '# DOMAIN_PURE' mlmm/

All 9 rules apply to mlmm:

| # | rule | host file |
| --- | --- | --- |
| 1 | Subtractive ONIOM energy formula (E = mm_real + ml_model − mm_model) | mlmm/backends/mlmm_calc.py |
| 2 | Link-atom Hessian B-matrix projection | mlmm/backends/mlmm_calc.py |
| 3 | Macro / micro alternation (RS-I-RFO hess mode microiteration) | mlmm/workflows/tsopt.py |
| 4 | gpu4pyscf rks_lowmem triple-guard | mlmm/workflows/dft.py |
| 5 | def2 family auto-ECP injection | mlmm/workflows/dft.py |
| 6 | PHVA + UMA active-block partial Hessian | mlmm/workflows/freq.py |
| 7 | bofill_update advanced-indexing scatter | mlmm/workflows/tsopt.py |
| 8 | 3-layer 5-pass partial Hessian assembly | mlmm/backends/mlmm_calc.py |
| 9 | parm7 atom indexing (1-based / serial gap handling) | mlmm/backends/mlmm_calc.py |

Editing any of these requires a [CHEMISTRY-RULE:N] commit prefix and a HEAVY-tier numerical-golden gate pass (see CONTRIBUTING.md §1.1).
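As a companion to the grep recipe, marker completeness can be sketched in a few lines of Python. This is an illustration of the idea only, not the logic of check_engineering_markers.py; the function name missing_rules and the in-memory {path: text} input are assumptions made to keep the example self-contained.

```python
import re

RULE_MARKER = re.compile(r"# CHEMISTRY-RULE:([0-9]+)")
EXPECTED_RULES = set(range(1, 10))  # rules 1..9 from the table above


def missing_rules(sources: dict[str, str]) -> set[int]:
    """Return rule numbers that have no marker in any of the given sources."""
    found = set()
    for text in sources.values():
        found.update(int(number) for number in RULE_MARKER.findall(text))
    return EXPECTED_RULES - found


# Toy stand-ins for file contents; real usage would read files under mlmm/.
demo_sources = {
    "backends/mlmm_calc.py": "# CHEMISTRY-RULE:1\n# CHEMISTRY-RULE:2\n# CHEMISTRY-RULE:8\n# CHEMISTRY-RULE:9\n",
    "workflows/tsopt.py": "# CHEMISTRY-RULE:3\n# CHEMISTRY-RULE:7\n",
    "workflows/dft.py": "# CHEMISTRY-RULE:4\n# CHEMISTRY-RULE:5\n",
}
print(sorted(missing_rules(demo_sources)))  # [6]
```

In this toy input the freq.py marker is left out on purpose, so rule 6 is reported as missing.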

Recommended learning order (4 chemistry clusters):

| cluster | rules | shared concern | learn-first file |
| --- | --- | --- | --- |
| 5-pass Hessian set | #1, #2, #8, #9 | subtractive ONIOM + link-atom B-matrix + 3-layer assembly + parm7 indexing | mlmm/backends/mlmm_calc.py (host of 4 of the 9 rules) |
| TS optimization set | #3, #7 | macro / micro alternation + Bofill scatter | mlmm/workflows/tsopt.py |
| Vibrational set | #6 | PHVA + UMA active-block partial Hessian | mlmm/workflows/freq.py |
| DFT set | #4, #5 | gpu4pyscf low-memory + def2 ECP injection | mlmm/workflows/dft.py |

For mlmm the practical curriculum is the 5-pass Hessian set first (#1, #2, #8, #9 — all in mlmm_calc.py), then the TS set (#3, #7), then DFT (#4, #5), then vibrational (#6).

5.2 VRAM-management invariant (do not refactor del chains)

The IRC / TSopt / Freq stages explicitly del GPU-resident objects (calc, geom, hess) between stages to free CUDA memory; the all workflow additionally runs gc.collect() at stage boundaries. Do not refactor these del / gc.collect() statements out — long-running ML/MM all jobs on the full protein environment OOM without them.
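The pattern being protected looks roughly like the sketch below. It uses a plain Python object as a stand-in for a GPU-resident Hessian, so no CUDA memory is involved; the class and function names are hypothetical. The weak reference only serves to observe that the object is really gone before the next stage starts.

```python
import gc
import weakref


class FakeHessian:
    """Stand-in for a large GPU-resident object (no real CUDA memory here)."""


def run_pipeline() -> bool:
    hess = FakeHessian()
    probe = weakref.ref(hess)  # observes the object without keeping it alive

    # ... stage 1 would use `hess` here ...

    del hess      # drop the last strong reference at the stage boundary
    gc.collect()  # also clear any reference cycles left behind

    # Stage 2 would start here; stage 1's object must already be released.
    return probe() is None


print(run_pipeline())  # True
```

If the del were removed, hess would stay alive for the rest of the function, so memory from one stage would still be held while the next stage allocates its own.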

5.3 Bundled forks: do NOT install upstream alongside

The bundled pysisyphus/, thermoanalysis/, and hessian_ff/ packages are forks (and in the case of hessian_ff/, the only available distribution — PyPI returns 404). Reinstalling pip install pysisyphus or pip install thermoanalysis next to this package silently breaks:

  • pysisyphus/irc/IRC.py — initial-displacement memory hygiene

  • pysisyphus/optimizers/hessian_updates.py — Bofill scatter on advanced indices, CPU-only bofill_update path for GPU OOM avoidance

  • pysisyphus/tsoptimizers/TSHessianOptimizer.py — RSIRFO kwargs

  • pysisyphus/calculators/... — GPU-aware backend hooks

  • thermoanalysis/QCData.py — branding / I/O diff vs upstream

  • hessian_ff/analytical_hessian.py — sole entry consumed by backends/mlmm_calc.py; no upstream alternative exists
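One way to detect the conflict described above is to ask the installer metadata whether an upstream distribution is present. The sketch assumes that the bundled forks ship inside the toolkit's own distribution, so that a separately installed distribution literally named pysisyphus or thermoanalysis indicates an upstream install; that assumption and the function name installed_upstreams are not confirmed by this document.

```python
from importlib import metadata


def installed_upstreams(names=("pysisyphus", "thermoanalysis")) -> list[str]:
    """Return the names that resolve to a separately installed distribution."""
    found = []
    for name in names:
        try:
            metadata.distribution(name)
        except metadata.PackageNotFoundError:
            continue  # not installed as its own distribution
        found.append(name)
    return found


conflicts = installed_upstreams()
if conflicts:
    print("upstream distributions installed alongside the bundled forks:", conflicts)
```

A check of this kind could run during CLI preflight, but whether preflight.py does anything similar is not stated here.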

5.4 pyproject.toml arrays are 0-diff

[tool.setuptools.packages.find].include and dependencies are treated as 0-diff arrays during this release. The include glob (mlmm*) already auto-discovers any new layer subpackage; adding a vendor/ or internal/ container directory, or pinning a new runtime dependency, breaks the install contract and is forbidden by the release scope. Reflow / comment edits are fine; array contents are frozen.

5.5 _LAZY_SUBCOMMANDS registry must use absolute paths

mlmm/cli/app.py:_LAZY_SUBCOMMANDS resolves every subcommand through an absolute module path. Switching any entry back to a relative dotted import (".all" etc.) silently breaks subcommand discovery whenever default_group.py moves, because the resolver’s __package__ then drifts away from the package root. See internal design notes.


6. Bundled forks (repo-internal)

mlmm_toolkit ships three repo-internal modules at the repo top:

| dir | upstream PyPI? | purpose | scope of edits allowed |
| --- | --- | --- | --- |
| pysisyphus/ | NO: fork, do not pip install pysisyphus alongside | optimizer, TS, IRC, COS, calculators | annotation-only in this release (docstring + type hints); logic edits forbidden |
| thermoanalysis/ | NO: fork (branding diff) | ΔG, ZPE, partition functions, QCData | same as pysisyphus/ |
| hessian_ff/ | NO: PyPI 404, bundling mandatory | analytical Hessian on MM force field | same as pysisyphus/ |

Each dir carries its own README.md listing the divergent files and the touch-restriction boundary. In the layer model these forks live outside the L1..L5 graph: any layer may import them via the absolute package path (from pysisyphus.X import Y, from hessian_ff.analytical_hessian import …) without breaking the L1 → L2 → {L3, L4} → L5 direction.



8. ML/MM (ONIOM) scope

mlmm-toolkit operates on the full protein environment via ONIOM:

  • ML region: substrate + reaction-center residues, evaluated by one of 4 machine-learning interatomic potential (MLIP) backends (UMA / Orb / MACE / AIMNet2); an optional xTB point-charge embedding correction (--embedcharge) adds MM→ML environmental effects

  • Movable MM region: a shell around the ML region, free to move under the AMBER force field

  • Frozen MM region: the rest of the protein, held rigid

The split is encoded in B-factor channels of the input PDB and propagated through extract → mm-parm → ONIOM model → MEP → tsopt → IRC → freq → dft.
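To illustrate what "encoded in B-factor channels" can look like, the sketch below reads the temperature-factor field of a PDB atom record (columns 61-66 in the fixed-width format) and maps it to a region label. The numeric codes in LAYER_BY_BFACTOR are hypothetical, since the values the toolkit actually writes are not given in this document, and the helper names are illustrative.

```python
# Hypothetical B-factor codes; the values the toolkit really uses are not stated here.
LAYER_BY_BFACTOR = {0.0: "ML", 10.0: "Movable MM", 20.0: "Frozen MM"}


def atom_line(serial: int, b_factor: float) -> str:
    """Build a fixed-width PDB ATOM record (66 columns) for demonstration."""
    return (
        f"{'ATOM':<6}{serial:>5} {'CA':^4} {'ALA':>3} {'A'}{1:>4}{'':4}"
        f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{b_factor:6.2f}"
    )


def layer_of(pdb_line: str) -> str:
    """Read the B-factor field (PDB columns 61-66) and map it to a region label."""
    if not pdb_line.startswith(("ATOM", "HETATM")):
        raise ValueError("not an atom record")
    b_factor = float(pdb_line[60:66])
    return LAYER_BY_BFACTOR[b_factor]


for serial, code in enumerate((0.0, 10.0, 20.0), start=1):
    print(serial, layer_of(atom_line(serial, code)))
```

Storing the label in an existing PDB column keeps the region split attached to each atom, so any stage that reads the PDB can recover it without a side file.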