Troubleshooting¶
This page collects common failure modes and practical fixes. Search this page for the error message you encounter. If you want a symptom-first entry point, start with Common Error Recipes and then return here for details.
Preflight checklist¶
Before a long run, verify:
You can run `mlmm -h` and see the CLI help.
UMA can be downloaded (a Hugging Face login/token is available on the machine).
For enzyme workflows: your input PDB(s) contain hydrogens and element symbols.
When you provide multiple PDBs: they have the same atoms in the same order (only coordinates differ).
AmberTools is installed and `tleap` is on your PATH (required for `mm-parm`).
The `hessian_ff` C++ native extension is built (`cd hessian_ff/native && make`).
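The PDB checks in this list can be scripted. Below is a minimal, illustrative sketch (not part of the mlmm CLI) that verifies a PDB has hydrogens and a populated element column (columns 77-78 of standard ATOM/HETATM records); `preflight_pdb` and `atom_line` are hypothetical helper names.

```python
# Illustrative preflight helper: check hydrogens and element columns in a PDB.
def preflight_pdb(lines):
    elements = [line[76:78].strip()
                for line in lines if line.startswith(("ATOM", "HETATM"))]
    has_elements = bool(elements) and all(elements)  # no blank element fields
    has_hydrogens = "H" in elements
    return has_elements, has_hydrogens

def atom_line(serial, name, resname, resseq, xyz, element):
    # Build a fixed-column ATOM record (chain 'A', occupancy 1.00, B 0.00).
    x, y, z = xyz
    return (f"ATOM  {serial:5d} {name:<4s} {resname:<3s} A{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}          {element:>2s}")

pdb = [atom_line(1, "N", "ALA", 1, (11.1, 6.1, -6.5), "N"),
       atom_line(2, "H", "ALA", 1, (11.6, 5.4, -6.9), "H")]
print(preflight_pdb(pdb))  # (True, True)
```

If either flag comes back False, run `mlmm add-elem-info` or re-protonate the structure before starting a long run.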
Input / extraction problems¶
“Element symbols are missing… please run add-elem-info”¶
Typical message:
Element symbols are missing in '...'.
Please run `mlmm add-elem-info -i...` to populate element columns before running extract.
Fix:
Run:
mlmm add-elem-info -i input.pdb -o input_with_elem.pdb
Then re-run `extract`/`all` using the updated PDB.
Why it happens:
Many PDBs do not populate the element column consistently.
`extract` requires element symbols for reliable atom typing.
“[multi] Atom count mismatch…” or “[multi] Atom order mismatch…”¶
Typical messages:
[multi] Atom count mismatch between input #1 and input #2:...
[multi] Atom order mismatch between input #1 and input #2.
Fix:
Regenerate all structures with the same preparation workflow (same protonation tool, same settings).
If you add hydrogens, do it in a way that produces consistent ordering across all frames.
Tip:
For ensembles generated by MD, prefer extracting frames from the same trajectory/topology rather than mixing PDBs produced by different tools.
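The consistency check behind the "[multi]" messages can be reproduced with a short script. This is an illustrative sketch, not the toolkit's own code; `atom_signature` and `check_order` are hypothetical helper names, and only atom identity (name, residue, chain, number) is compared, since coordinates are allowed to differ.

```python
# Illustrative check: two PDBs must contain the same atoms in the same order.
def atom_signature(lines):
    # (atom name, residue name, chain, residue number) per ATOM/HETATM record
    return [(l[12:16].strip(), l[17:20].strip(), l[21], l[22:26].strip())
            for l in lines if l.startswith(("ATOM", "HETATM"))]

def check_order(pdb_a, pdb_b):
    a, b = atom_signature(pdb_a), atom_signature(pdb_b)
    if len(a) != len(b):
        return f"Atom count mismatch: {len(a)} vs {len(b)}"
    for i, (sa, sb) in enumerate(zip(a, b)):
        if sa != sb:
            return f"Atom order mismatch at index {i}: {sa} vs {sb}"
    return "OK"

# Two frames that differ only in coordinates pass the check.
frame1 = ["ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00           N"]
frame2 = ["ATOM      1  N   ALA A   1      12.000   6.000  -6.000  1.00  0.00           N"]
print(check_order(frame1, frame2))  # OK
```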
“My pocket is empty / missing important residues”¶
Symptoms:
The extracted pocket is unexpectedly small.
Key catalytic residues are missing.
Fixes to try:
Increase `--radius` (e.g., 2.6 -> 3.5 Angstrom).
Use `--selected-resn` to force-include residues (e.g., `--selected-resn 'A:123,B:456'`).
If backbone trimming is too aggressive, set `--no-exclude-backbone`.
Charge / spin problems¶
“Charge is required…” (non-GJF inputs)¶
Calculation subcommands require an explicit `-q/--charge`.
In `all`, charge is resolved in this order: `-q/--charge` override -> extraction summary -> `--ligand-charge` fallback when extraction is skipped.
Fix:
Provide charge and multiplicity explicitly:
mlmm path-search -i R.pdb P.pdb --parm real.parm7 --model-pdb model.pdb -q 0 -m 1
Or, when using extraction, provide a residue-name mapping and run through `all`:
mlmm -i R.pdb P.pdb -c 'SAM,GPP' -l 'SAM:1,GPP:-3'
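The resolution order above can be sketched as a small precedence function. This is an illustrative model, not the CLI's actual implementation; `resolve_charge` and its keyword names are hypothetical.

```python
# Illustrative sketch of the charge-resolution precedence described above.
def resolve_charge(cli_charge=None, summary_charge=None, ligand_charge=None):
    # 1) an explicit -q/--charge always wins (note: charge 0 is still explicit)
    if cli_charge is not None:
        return cli_charge, "cli"
    # 2) otherwise use the charge recorded in the extraction summary
    if summary_charge is not None:
        return summary_charge, "summary"
    # 3) fall back to --ligand-charge when extraction was skipped
    if ligand_charge is not None:
        return ligand_charge, "ligand-charge"
    raise ValueError("Charge is required: pass -q/--charge explicitly")

print(resolve_charge(cli_charge=0))       # (0, 'cli')
print(resolve_charge(summary_charge=-2))  # (-2, 'summary')
```

Note the `is not None` checks: a neutral system (`-q 0`) must still count as an explicit override.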
AmberTools / mm-parm problems¶
tleap not found¶
Typical message:
FileNotFoundError: tleap not found on PATH
or
mm-parm requires AmberTools (tleap, antechamber, parmchk2).
Fix:
Install AmberTools via conda:
conda install -c conda-forge ambertools -y
Or load the appropriate module on HPC:
module load ambertools
Verify availability:
which tleap
which antechamber
which parmchk2
Note: without AmberTools, you can still run `opt`, `tsopt`, `path-search`, etc. if you supply `--parm` manually.
antechamber fails for a ligand¶
Symptoms:
`mm-parm` fails during ligand parameterization.
Errors about atom type assignment or charge calculation.
Fixes to try:
Check that the ligand has correct element symbols and bond connectivity in the PDB.
Ensure `--ligand-charge` is specified correctly: `-l 'GPP:-3,SAM:1'`.
Use `--keep-temp` to preserve intermediate files and inspect `<resname>.antechamber.log`:
mlmm mm-parm -i input.pdb -l 'LIG:-1' --keep-temp
Check that hydrogen atoms are correctly added and TER records are appropriate.
Try running antechamber manually on the extracted ligand PDB to diagnose the issue:
antechamber -i ligand.pdb -fi pdb -o ligand.mol2 -fo mol2 -c bcc -nc -3 -at gaff2
parm7/rst7 mismatch errors¶
Typical messages:
Atom count in parm7 (...) does not match input PDB (...)
or
RuntimeError: parm7 topology does not match the input structure
or
Coordinate shape mismatch for... got (N, 3), expected (M, 3)
Fix:
The parm7 file must correspond to exactly the same atoms (in the same order) as the input PDB.
Re-run `mm-parm` to regenerate the parm7 from the current PDB.
Do not edit or reorder PDB atoms after running `mm-parm`.
When re-running `mm-parm`, use the output PDB (`<prefix>.pdb`) as the input for subsequent calculations, since tleap may add or remove hydrogens.
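A quick sanity check before a long run is to compare atom counts yourself. The sketch below reads NATOM from a parm7's POINTERS section (in the Amber prmtop format, NATOM is the first integer after `%FLAG POINTERS`) and counts ATOM/HETATM records in a PDB; `parm7_natom` and `pdb_natom` are hypothetical helper names, and tools like ParmEd do this far more robustly.

```python
# Illustrative check: does parm7 NATOM match the PDB atom count?
def parm7_natom(text):
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("%FLAG POINTERS"):
            # skip the %FORMAT line, then take the first integer (NATOM)
            return int(lines[i + 2].split()[0])
    raise ValueError("POINTERS section not found")

def pdb_natom(lines):
    return sum(1 for l in lines if l.startswith(("ATOM", "HETATM")))

parm_text = "%FLAG POINTERS\n%FORMAT(10I8)\n    2650      12       0       0\n"
print(parm7_natom(parm_text))  # 2650
```

If the two counts differ, regenerate the parm7 with `mm-parm` rather than editing either file by hand.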
parm7 element order does not match PDB¶
Symptoms:
`oniom-export` reports “Element sequence mismatch at atom index…”
Fix:
Use `--no-element-check` to disable the element check (verify results manually).
The correct fix is to use the same PDB for `-i` that was used when generating the parm7.
hessian_ff build problems¶
Build fails (“make” errors)¶
Typical symptoms:
`make` in `hessian_ff/native/` produces compilation errors.
ImportError: cannot import name 'ForceFieldTorch' from 'hessian_ff'
RuntimeError: hessian_ff build attempts failed: ...
Fixes to try:
Ensure you have a C++ compiler (g++ >= 9) installed:
g++ --version
Ensure PyTorch headers are available:
python -c "import torch; print(torch.utils.cmake_prefix_path)"
On HPC, load a compiler module:
module load gcc/11
Clean and rebuild:
conda install -c conda-forge ninja -y
cd hessian_ff/native && make clean && make
hessian_ff import errors¶
Typical message:
ImportError: cannot import name 'ForceFieldTorch' from 'hessian_ff'
or:
RuntimeError: hessian_ff build attempts failed: ...
To rebuild hessian_ff native extensions in this environment:
conda install -c conda-forge ninja -y
cd $(python -c "import hessian_ff; print(hessian_ff.__path__[0])")/native && make clean && make
Fix:
The C++ native extension needs to be built first:
cd hessian_ff/native && make
Ensure the `hessian_ff` package is in your Python path (it should be if you installed mlmm-toolkit with `pip install -e .`).
B-factor layer assignment problems¶
Wrong layer assignments¶
Symptoms:
Atoms are assigned to unexpected layers.
ML region is too small or too large.
Fixes to try:
Inspect the layer-assigned PDB visually (color by B-factor in your molecular viewer).
Check that `--model-pdb` correctly defines the ML region atoms.
Adjust the distance cutoffs in `define-layer`:
`--radius-freeze` (default 8.0 Angstrom) controls the Movable-MM/Frozen boundary.
If needed, control the Hessian-target MM region separately in calc options (`hess_cutoff`, `hess_mm_atoms`).
If using `use_bfactor_layers: true` in YAML, verify that B-factor values match the expected encoding (0.0, 10.0, 20.0 with tolerance 1.0).
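The B-factor encoding described above can be decoded with a few lines. This is an illustrative sketch of the mapping (values within tolerance 1.0 of 0/10/20 become ML/Movable/Frozen), not the calculator's own code; `layer_from_bfactor` is a hypothetical helper name.

```python
# Illustrative decoder for the B-factor layer encoding (tolerance 1.0).
LAYERS = {0.0: "ML", 10.0: "Movable", 20.0: "Frozen"}

def layer_from_bfactor(b, tol=1.0):
    for ref, name in LAYERS.items():
        if abs(b - ref) <= tol:
            return name
    raise ValueError(f"Unrecognized B-factor {b}: re-run define-layer")

print([layer_from_bfactor(b) for b in (0.0, 9.7, 20.3)])
# ['ML', 'Movable', 'Frozen']
```

A value like 15.0 falls outside every tolerance window and raises, which is why arbitrary hand-edited B-factors break layer detection.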
B-factor values are not recognized¶
Typical symptoms:
Calculator treats all atoms as frozen or all as ML.
B-factor values are not one of {0.0, 10.0, 20.0}.
Fix:
Re-run `define-layer` to ensure correct B-factor encoding.
A tolerance of 1.0 is applied: B-factors near 0/10/20 map to ML/Movable/Frozen.
Do not manually edit B-factors to arbitrary values.
--detect-layer does not work as expected¶
Symptoms:
Automatic layer detection from B-factors produces unexpected ML/Movable/Frozen splits.
Running with `--detect-layer` without `--model-pdb` fails.
Fixes to try:
Ensure the input is a PDB (or an XYZ with `--ref-pdb`).
Re-run `define-layer` to explicitly assign B-factors, then use the generated PDB.
For distance-based control, specify `hess_cutoff`/`movable_cutoff` and switch to `--no-detect-layer` if needed.
Note that supplying `--movable-cutoff` disables `--detect-layer`.
Installation / environment problems¶
UMA download/authentication errors¶
Symptoms:
Errors about missing Hugging Face authentication or being unable to download model weights.
Fix:
Log in once per environment/machine:
huggingface-cli login
On HPC, ensure your home directory (or HF cache directory) is writable from compute nodes.
CUDA / PyTorch mismatch¶
Symptoms:
`torch.cuda.is_available()` is False even though you have a GPU.
CUDA runtime errors at import time.
Fixes:
Install a PyTorch build matching your cluster CUDA runtime.
Confirm GPU visibility:
nvidia-smi
python -c "import torch; print(torch.version.cuda, torch.cuda.is_available())"
DMF mode fails (cyipopt missing)¶
If you use DMF (--mep-mode dmf) and see errors importing IPOPT/cyipopt:
Fix:
Install `cyipopt` from conda-forge (recommended) before installing `mlmm`:
conda install -c conda-forge cyipopt
Plot export fails (Chrome missing)¶
If figure export fails and you see Plotly/Chrome-related errors:
Fix:
Install a headless Chrome once:
plotly_get_chrome -y
Calculation / convergence problems¶
CUDA out of memory (VRAM)¶
Symptoms:
torch.cuda.OutOfMemoryError: CUDA out of memory
System hangs or crashes during Hessian calculation.
ML/MM systems are typically larger than pure cluster models, so VRAM pressure is higher.
Fixes to try:
Reduce ML region size: use a smaller extraction radius or manually trim `--model-pdb`.
Use a finite-difference ML Hessian: set `--hessian-calc-mode FiniteDifference` (uses less VRAM but is slower).
Move MM to CPU: set `mm_device: cpu` in YAML (the default).
Reduce the Hessian-target MM region: decrease `hess_cutoff` (YAML/CLI where available).
Use 3-layer + Hessian-target control: set `hess_cutoff` and `movable_cutoff` in YAML to limit the number of atoms included in the Hessian:
calc:
  hess_cutoff: 3.6
  movable_cutoff: 8.0
Pre-define layers with `define-layer` and use `use_bfactor_layers: true`.
Use a GPU with more VRAM: 24 GB+ recommended for systems with 500+ ML atoms; 48 GB+ for 1000+ ML atoms.
Reduce pocket size: use a smaller `--radius` during extraction.
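The effect of `hess_cutoff` can be visualized with a toy selection: an MM atom contributes to the Hessian only if it lies within the cutoff of some ML atom, so shrinking the cutoff directly shrinks the Hessian. This is an illustrative sketch of the idea, not the calculator's implementation; `hessian_target_atoms` is a hypothetical helper name.

```python
# Illustrative distance-based Hessian-target selection.
import math

def hessian_target_atoms(ml_coords, mm_coords, hess_cutoff=3.6):
    selected = []
    for i, mm in enumerate(mm_coords):
        # include this MM atom if it is within hess_cutoff of any ML atom
        if any(math.dist(mm, ml) <= hess_cutoff for ml in ml_coords):
            selected.append(i)
    return selected

ml = [(0.0, 0.0, 0.0)]
mm = [(1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
print(hessian_target_atoms(ml, mm))  # [0] -> only the nearby MM atom
```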
TS optimization fails to converge¶
Symptoms:
TS optimization runs for many cycles without converging.
Multiple imaginary frequencies remain after optimization.
Fixes to try:
Switch optimizer modes: `--opt-mode grad` (Dimer) or `--opt-mode hess` (RS-I-RFO).
Enable flattening of extra imaginary modes: `--flatten`.
Increase max cycles: `--max-cycles 20000`.
Use tighter convergence: `--thresh baker` or `--thresh gau_tight`.
Adjust `hess_cutoff` to expand the range of atoms included in the Hessian calculation.
IRC does not terminate properly¶
Symptoms:
IRC stops before reaching a clear minimum.
Energy oscillates or gradient remains high.
Fixes to try:
Reduce step size: `--step-size 0.05` (default is 0.10).
Increase max cycles: `--max-cycles 200`.
Check that the TS candidate has only one imaginary frequency before running IRC.
MEP search (GSM/DMF) fails or gives unexpected results¶
Symptoms:
Path search terminates with no valid MEP.
Bond changes are not detected correctly.
Fixes to try:
Increase `--max-nodes` (e.g., 15 or 20) for complex reactions.
Enable endpoint pre-optimization: `--preopt`.
Try the alternative MEP method: `--mep-mode dmf` (if GSM fails) or vice versa.
Adjust bond detection parameters in YAML (`bond.bond_factor`, `bond.delta_fraction`).
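To build intuition for `bond.bond_factor`: covalent-radius bond detection typically marks two atoms as bonded when their distance is below `bond_factor * (r_cov_i + r_cov_j)`. The sketch below illustrates that rule; the radii table and the `bonded` helper are illustrative assumptions, and the toolkit's own values and logic may differ.

```python
# Illustrative covalent-radius bond detection as tuned by bond_factor.
import math

COV_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}  # Angstrom (illustrative)

def bonded(elem_i, xyz_i, elem_j, xyz_j, bond_factor=1.2):
    cutoff = bond_factor * (COV_RADII[elem_i] + COV_RADII[elem_j])
    return math.dist(xyz_i, xyz_j) <= cutoff

# A C-H pair at 1.09 A is within 1.2 * (0.76 + 0.31) = 1.28 A; at 2.5 A it is not.
print(bonded("C", (0, 0, 0), "H", (1.09, 0, 0)))  # True
print(bonded("C", (0, 0, 0), "H", (2.50, 0, 0)))  # False
```

Raising `bond_factor` makes bond detection more permissive, which changes which bond changes the path search reports.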
Performance / stability tips¶
Out of memory (VRAM): reduce ML region size, reduce the Hessian-target MM region, reduce nodes (`--max-nodes`), or use lighter optimizer settings (`--opt-mode grad`).
Analytical ML Hessian is slow or OOM: use `--hessian-calc-mode FiniteDifference` for the ML region. Only use `Analytical` if you have ample VRAM (24 GB+ recommended for 300+ ML atoms).
MM Hessian: `mm_fd: true` (default) uses finite differences for the MM Hessian. The analytical MM Hessian (`mm_fd: false`) is faster for small systems but may require more memory.
MM Hessian is slow: set `hess_cutoff` to limit the number of Hessian-target MM atoms.
Large systems (2000+ atoms): ensure frozen atoms are properly set (Frozen layer, B=20) to reduce the movable DOF count. Use `define-layer` with appropriate cutoffs.
Multi-GPU: place ML on one GPU (`ml_cuda_idx: 0`) and MM on another (`mm_device: cuda`, `mm_cuda_idx: 1`) if available.
ML and MM parallel execution: by default, ML (GPU) and MM (CPU) run in parallel. Tune the CPU thread count with `mm_threads`.
Backend-specific issues¶
ImportError when using --backend orb/mace/aimnet2¶
Symptom: ImportError: orb-models is required for the ORB backend
Fix: Install the optional dependency for the chosen backend:
pip install "mlmm-toolkit[orb]" # ORB backend
pip install "mlmm-toolkit[aimnet2]" # AIMNet2 backend
# MACE: pip uninstall fairchem-core && pip install mace-torch (separate env required)
CUDA out of memory with non-UMA backends¶
Symptom: RuntimeError: CUDA out of memory when using ORB, MACE, or AIMNet2.
Fix: Non-UMA backends use finite-difference Hessians, which require more VRAM. Options:
Reduce `--radius-partial-hessian` to limit Hessian-target atoms.
Use `--hessian-calc-mode FiniteDifference` explicitly with a smaller `hess_cutoff`.
Use `ml_device: cpu` in YAML (slower but avoids VRAM limits).
xTB not found when using --embedcharge¶
Symptom: FileNotFoundError: xtb command not found
Fix: Install xTB and ensure it’s on $PATH:
conda install -c conda-forge xtb
How to report an issue¶
When asking for help, include:
The exact command line you ran
`summary.log` (or console output)
The smallest input files that reproduce the problem (if possible)
Your environment: OS, Python, CUDA, PyTorch versions
Whether AmberTools and hessian_ff are properly installed