# Device Configuration & HPC Setup

## Overview

Summary: How to configure GPU/CPU devices for the ML/MM calculator and submit jobs on HPC clusters.
## Device Parameters

The ML/MM calculator (`mlmm_calc.mlmm`) uses separate device settings for the ML and MM backends:
| Parameter | Default | Description |
|---|---|---|
| `ml_device` | – | Device for MLIP inference. |
| `ml_cuda_idx` | – | CUDA device index when `ml_device` is `cuda`. |
| `mm_backend` | `hessian_ff` | MM force field engine. |
| `mm_device` | – | Device for MM backend. |
| `mm_cuda_idx` | – | CUDA device index when `mm_device` is `cuda`. |
| `mm_threads` | `16` | Number of CPU threads for MM backend. |
### YAML configuration example

```yaml
calc:
  ml_device: cuda
  ml_cuda_idx: 0
  mm_backend: hessian_ff
  mm_device: cpu
  mm_threads: 16
```
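As a quick sanity check, a fragment like the one above can be parsed with PyYAML. This is only a sketch: mlmm-toolkit's own loader may apply additional validation, and the inline `text` string simply mirrors the example.

```python
# Sketch: parse the `calc` section of a config fragment with PyYAML.
# The keys mirror the YAML example above; this is not mlmm-toolkit's loader.
import yaml

text = """
calc:
  ml_device: cuda
  ml_cuda_idx: 0
  mm_backend: hessian_ff
  mm_device: cpu
  mm_threads: 16
"""

calc = yaml.safe_load(text)["calc"]
print(calc["mm_backend"], calc["mm_threads"])  # hessian_ff 16
```

Parsing your config this way before a long HPC run catches indentation and typo errors cheaply.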
### Using the OpenMM backend with CUDA

```yaml
calc:
  ml_device: cuda
  ml_cuda_idx: 0
  mm_backend: openmm
  mm_device: cuda
  mm_cuda_idx: 0
```
Note: When both ML and MM use CUDA, they share GPU memory. For large systems, consider using `mm_device: cpu` to reduce VRAM consumption.
## VRAM Management

### Hessian device (`--hess-device`)

The `freq` command supports `--hess-device` to control where Hessian assembly and diagonalization run:
```bash
# Default: same device as ml_device (typically CUDA)
mlmm freq -i input.pdb --parm real.parm7 -q -1

# Force CPU for Hessian assembly (saves VRAM for large systems)
mlmm freq -i input.pdb --parm real.parm7 -q -1 --hess-device cpu
```
Use `--hess-device cpu` when:

- The active region is large (> ~500 unfrozen atoms)
- You encounter CUDA out-of-memory errors during frequency calculations
- VRAM is limited (< 16 GB)
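For intuition about the ~500-atom guideline: the final dense Hessian itself is small, so the VRAM pressure typically comes from intermediate tensors during second-derivative evaluation, not from storing the matrix. A back-of-envelope sketch (the helper below is illustrative, not part of mlmm-toolkit):

```python
def hessian_bytes(n_unfrozen: int, dtype_bytes: int = 8) -> int:
    """Size in bytes of a dense 3N x 3N Cartesian Hessian in float64."""
    dim = 3 * n_unfrozen
    return dim * dim * dtype_bytes

# 500 unfrozen atoms -> a 1500 x 1500 matrix, only 18 MB in float64.
print(hessian_bytes(500))  # 18000000
```

Since the stored matrix is tiny even at 500 atoms, moving assembly to CPU costs little in wall time while freeing the GPU for the much larger intermediate derivative batches.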
### General VRAM tips

- **Reduce the ML region size:** Use `mlmm extract` with a smaller `--radius`, or `mlmm define-layer` with a tighter `--radius-freeze`.
- **Use `hessian_ff` (the default):** The `hessian_ff` backend is CPU-only, leaving all VRAM for MLIP inference.
- **Avoid OpenMM CUDA for large systems:** If both ML and MM use CUDA, VRAM pressure doubles.
- **Monitor VRAM:** Set `print_vram: True` in YAML to print VRAM usage after each ML evaluation.
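The `print_vram` report is implemented inside mlmm-toolkit; a minimal stand-in built on PyTorch's allocator counters (illustrative only, the toolkit's actual report may differ) looks like:

```python
# Sketch: report current/peak VRAM via PyTorch's allocator counters.
# Degrades gracefully when torch or a CUDA device is unavailable.
import importlib.util

def vram_report() -> str:
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch
    if not torch.cuda.is_available():
        return "no CUDA device"
    alloc = torch.cuda.memory_allocated() / 2**20
    peak = torch.cuda.max_memory_allocated() / 2**20
    return f"VRAM allocated: {alloc:.1f} MiB (peak {peak:.1f} MiB)"

print(vram_report())
```

Calling such a helper after each ML evaluation is the pattern behind per-step VRAM monitoring.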
## HPC Job Submission

### PBS example
```bash
#!/bin/bash
#PBS -N mlmm_opt
#PBS -q default
#PBS -l nodes=1:ppn=32:gpus=1,mem=120GB,walltime=72:00:00
#PBS -o ${PBS_JOBNAME}.o${PBS_JOBID}
#PBS -e ${PBS_JOBNAME}.e${PBS_JOBID}

set -euo pipefail
hostname
cd "${PBS_O_WORKDIR}"

# Load environment modules
. /home/apps/Modules/init/profile.sh
module load cuda/12.9

# Activate conda environment
source ~/miniconda3/etc/profile.d/conda.sh
conda activate mlmm

# Run optimization
mlmm opt \
  -i r_complex_layered.pdb \
  --parm p_complex.parm7 \
  -q -1 -m 1 \
  --opt-mode grad \
  --out-dir opt_result
```
### Slurm example

```bash
#!/bin/bash
#SBATCH --job-name=mlmm_opt
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=32
#SBATCH --mem=120G
#SBATCH --time=72:00:00
#SBATCH --output=%x_%j.out
#SBATCH --error=%x_%j.err

set -euo pipefail
hostname

module load cuda/12.9
source ~/miniconda3/etc/profile.d/conda.sh
conda activate mlmm

mlmm opt \
  -i r_complex_layered.pdb \
  --parm p_complex.parm7 \
  -q -1 -m 1 \
  --opt-mode grad \
  --out-dir opt_result
```
### Key points

- **Single GPU:** mlmm-toolkit uses one GPU per job. Request `gpus=1` (PBS) or `--gres=gpu:1` (Slurm).
- **CPU threads:** Request enough CPUs for the MM backend (`mm_threads`, default 16). Set `ppn=32` (PBS) or `--cpus-per-task=32` (Slurm) for a safety margin.
- **Memory:** 120 GB is typically sufficient for enzyme active-site models. Increase for very large systems.
- **CUDA module:** Load CUDA before activating conda to ensure PyTorch finds the correct CUDA runtime.
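A quick post-activation sanity check helps confirm the loaded CUDA module matches what PyTorch expects. The helper below is a sketch for interactive use, not part of mlmm-toolkit:

```python
# Sketch: verify PyTorch imports and sees the CUDA runtime after
# `module load cuda/...` and `conda activate mlmm`.
import importlib.util

def cuda_status() -> str:
    """Report whether PyTorch imports and sees a CUDA runtime."""
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch
    return f"torch {torch.__version__}, cuda available: {torch.cuda.is_available()}"

print(cuda_status())
```

Running this once at the top of a job script makes CUDA/PyTorch mismatches fail fast instead of surfacing hours into an optimization.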
### Specifying a GPU index

If you are allocated multiple GPUs or want to target a specific GPU on a multi-GPU node:
```bash
# Option A: Environment variable (affects all CUDA programs)
export CUDA_VISIBLE_DEVICES=0

# Option B: YAML configuration (mlmm-specific)
# In config.yaml:
#   calc:
#     ml_cuda_idx: 0
mlmm opt -i input.pdb --parm real.parm7 -q -1 --config config.yaml
```
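One subtlety with Option A: after masking, CUDA renumbers the visible devices starting from 0, so `ml_cuda_idx` indexes into the masked list, not the physical GPU ids. A small illustration of that remapping (the helper is hypothetical and only mirrors CUDA's documented semantics):

```python
# Sketch: how CUDA_VISIBLE_DEVICES remaps physical GPU ids to the
# indices a CUDA program (and hence ml_cuda_idx) actually sees.
import os

def visible_index(physical_idx: int) -> int:
    """Map a physical GPU id to its index under CUDA_VISIBLE_DEVICES.
    Returns -1 if the GPU is masked out."""
    mask = os.environ.get("CUDA_VISIBLE_DEVICES")
    if mask is None:          # no mask: indices are unchanged
        return physical_idx
    visible = [int(i) for i in mask.split(",") if i.strip() != ""]
    return visible.index(physical_idx) if physical_idx in visible else -1

os.environ["CUDA_VISIBLE_DEVICES"] = "2"
print(visible_index(2))  # physical GPU 2 becomes cuda index 0
```

In other words, with `CUDA_VISIBLE_DEVICES=2`, keep `ml_cuda_idx: 0` in the YAML.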
## Limitations

- **No multi-GPU parallelism:** Unlike pdb2reaction (which supports multi-worker inference via Ray), mlmm-toolkit runs on a single GPU. The `hessian_ff` backend does not support parallel predictors.
- **No distributed computing:** All calculations run within a single process on a single node.
- **`hessian_ff` is CPU-only:** The default MM backend always runs on CPU regardless of the `mm_device` setting.
## See Also

- Getting Started – Installation and CUDA setup
- ML/MM Calculator – Calculator architecture and parameters
- YAML Reference – Full configuration reference
- freq – `--hess-device` option details
- Troubleshooting – Common error fixes