dft
Overview
Summary: Run a single-point DFT calculation on the ML region using PySCF/GPU4PySCF, then recombine with MM energies to obtain the ML(dft)/MM total energy. Results include energy and population analysis (Mulliken, meta-Lowdin, IAO charges).
mlmm dft extracts the ML region from the full enzyme PDB, appends link hydrogens, and runs a single-point PySCF (or GPU4PySCF) calculation. After the DFT evaluation, the script recomputes the ML(dft)/MM total energy by combining the PySCF high-level energy with MM evaluations of the full system (REAL-low) and the ML subset (MODEL-low):
E_total = E_REAL_low + E_ML(DFT) - E_MODEL_low
The GPU4PySCF backend is activated automatically when available; otherwise PySCF CPU is used. The default functional/basis is wb97m-v/def2-tzvpd.
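The recombination formula above is plain arithmetic once the three energies are known. A minimal sketch follows; the function name, the placeholder energies, and the rounded Hartree-to-kcal/mol factor are assumptions made for illustration and are not taken from the mlmm source.

```python
# Conventional conversion factor (rounded); assumed here for illustration.
HARTREE_TO_KCAL_PER_MOL = 627.509474


def oniom_total_energy(e_real_low, e_model_high, e_model_low):
    """Subtractive two-layer recombination: E_REAL_low + E_ML(DFT) - E_MODEL_low."""
    return e_real_low + e_model_high - e_model_low


# Hypothetical energies in Hartree, chosen only to show the arithmetic.
e_real_low = -1.250      # MM energy of the full system (REAL-low)
e_model_high = -310.500  # DFT energy of the ML region with link hydrogens
e_model_low = -0.750     # MM energy of the ML subset (MODEL-low)

e_total = oniom_total_energy(e_real_low, e_model_high, e_model_low)
e_total_kcal = e_total * HARTREE_TO_KCAL_PER_MOL
print(f"E_total = {e_total:.6f} Ha ({e_total_kcal:.2f} kcal/mol)")
```

The MODEL-low term is subtracted so that the ML region is not counted twice: once at the MM level inside REAL-low and once at the DFT level.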
Minimal example
mlmm dft -i enzyme.pdb --parm real.parm7 --model-pdb ml_region.pdb \
-q 0 -m 1 --out-dir ./result_dft
Output checklist
result_dft/ml_region_with_linkH.xyz
result_dft/result.yaml
Standard output block with ML(dft)/MM combined energy
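Common examples
Set the functional/basis explicitly for the single point (the pair shown is also the default; substitute another FUNC/BASIS string to change the level of theory).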
mlmm dft -i enzyme.pdb --parm real.parm7 --model-pdb ml_region.pdb \
-q 0 -m 1 --func-basis "wb97m-v/def2-tzvpd" --out-dir ./result_dft_tz
Freeze selected atoms in the ML/MM setup before DFT.
mlmm dft -i enzyme.pdb --parm real.parm7 --model-pdb ml_region.pdb \
-q -1 -m 2 --freeze-atoms "1,3,5" --out-dir ./result_dft_freeze
Tighten SCF convergence and allow more cycles.
mlmm dft -i enzyme.pdb --parm real.parm7 --model-pdb ml_region.pdb \
-q 0 -m 1 --conv-tol 1e-10 --max-cycle 200 --out-dir ./result_dft_tight
Workflow
Input handling – The full enzyme PDB (-i), Amber topology (--parm), and ML-region definition (--model-pdb or --model-indices or B-factor detection via --detect-layer) are loaded. Link hydrogens are appended automatically (C/N parents within 1.7 Å) unless explicit link_mlmm pairs are provided via YAML.
SCF build – --func-basis is parsed into functional and basis. Density fitting is enabled automatically with PySCF defaults. The GPU4PySCF backend is used when available; otherwise CPU PySCF is used. When --embedcharge is enabled, MM point charges from the Amber topology are embedded into the QM Hamiltonian via pyscf.qmmm.mm_charge(), so the DFT wavefunction is self-consistently polarized by the MM environment.
ML(dft)/MM recombination – After the DFT converges, MM evaluations of the full system (REAL-low) and the ML subset (MODEL-low) are computed. The combined energy is reported in Hartree and kcal/mol.
Population analysis & outputs – Mulliken, meta-Lowdin, and IAO charges and spin densities (UKS only) are written alongside the combined energy block in result.yaml.
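The SCF build step starts by splitting the --func-basis string into its two parts. One way this could look is sketched below; the helper name split_func_basis and its error handling are hypothetical, and the actual parsing in mlmm may differ.

```python
def split_func_basis(func_basis: str) -> tuple[str, str]:
    """Split a 'FUNC/BASIS' string at the first slash and reject incomplete pairs."""
    functional, sep, basis = func_basis.strip().partition("/")
    functional, basis = functional.strip(), basis.strip()
    if not sep or not functional or not basis:
        raise ValueError(f"expected 'FUNC/BASIS', got {func_basis!r}")
    return functional, basis


# The documented default pair.
functional, basis = split_func_basis("wb97m-v/def2-tzvpd")
print(functional, basis)
```

Splitting at the first slash keeps hyphens inside either name (as in wb97m-v and def2-tzvpd) intact.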
CLI options
| Option | Description | Default |
|---|---|---|
| -i | Full enzyme structure (PDB or XYZ). If XYZ, use | Required |
|  | Reference PDB topology when input is XYZ. | None |
| --parm | Amber parm7 topology for the full system. | Required |
| --model-pdb | PDB defining the ML region (atom IDs must match the enzyme PDB). Optional when | None |
| --model-indices | Comma-separated atom indices for the ML region (ranges allowed, e.g. | None |
|  | Interpret |  |
| --detect-layer | Detect ML/MM layers from input PDB B-factors (B=0/10/20). |  |
| -q | Charge of the ML region. | Required |
| -m | Spin multiplicity (2S+1) for the ML region. |  |
| --freeze-atoms | Comma-separated 1-based indices to freeze (e.g. | None |
| --func-basis | Functional/basis pair as | wb97m-v/def2-tzvpd |
| --max-cycle | Maximum SCF iterations. | 100 |
| --conv-tol | SCF convergence tolerance (Hartree). | 1e-9 |
|  | DFT integration grid level (0=coarse, 3=default, 9=ultrafine). | 3 |
| --out-dir | Output directory. | ./result_dft/ |
| --config | Base YAML configuration file applied before explicit CLI options. | None |
|  | Print resolved configuration and continue execution. |  |
|  | MLIP backend used for the low-level ONIOM recombination: |  |
| --embedcharge | Enable electrostatic embedding: MM point charges from the Amber topology are added to the PySCF QM Hamiltonian so the DFT wavefunction is polarized by the MM environment. |  |
|  | Cutoff radius (Å) for embed-charge MM atoms. |  |
|  | Validate options and print execution plan without running DFT. Shown in |  |
|  | Toggle XYZ/TRJ to PDB companions when a PDB template is available. |  |
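Several options take comma-separated, 1-based atom indices. The sketch below shows one way such a string could be expanded. The helper name parse_indices, the hyphenated range form (2-4), and the conversion to 0-based output are all assumptions for illustration, not the documented behavior of mlmm.

```python
def parse_indices(spec: str) -> list[int]:
    """Expand '1,3,5' or '1-3,7' (1-based) into sorted, unique 0-based indices."""
    zero_based = set()
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate stray commas
        start, sep, end = token.partition("-")
        first = int(start)
        last = int(end) if sep else first
        if first < 1 or last < first:
            raise ValueError(f"invalid 1-based index or range: {token!r}")
        # Shift each 1-based index to 0-based for array-style access.
        zero_based.update(i - 1 for i in range(first, last + 1))
    return sorted(zero_based)


frozen = parse_indices("1,3,5")
print(frozen)
```

Collecting into a set before sorting means overlapping entries such as "1-3,2" do not produce duplicates.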
Outputs
out_dir/ (default: ./result_dft/)
├── ml_region_with_linkH.xyz # ML-region coordinates (with link-H) used for DFT
├── result.yaml # DFT + ML(dft)/MM energy summary, charges, spin densities
└── (stdout) # Pretty-printed configuration blocks and energies
result.yaml expands to:
energy: Hartree/kcal/mol values, convergence flag, wall time, backend info (gpu4pyscf vs pyscf(cpu)).
charges: Mulliken, meta-Lowdin, and IAO atomic charges (null when a method fails).
spin_densities: Mulliken, meta-Lowdin, and IAO spin densities (UKS only).
It also summarizes charge, multiplicity, spin (2S), functional, basis, convergence knobs, and resolved output directory.
YAML configuration
The configuration file must have a mapping at its root; the dft section (and optional geom, calc/mlmm) is applied when present. Merge order, with later sources overriding earlier ones:
defaults
--config
explicit CLI options
dft keys (defaults in parentheses):
func_basis ("wb97m-v/def2-tzvpd"): Combined FUNC/BASIS string.
conv_tol (1e-9): SCF convergence threshold (Hartree).
max_cycle (100): Maximum SCF iterations.
grid_level (3): PySCF grids.level.
verbose (4): PySCF verbosity (0-9).
out_dir ("./result_dft/"): Output directory root.
geom:
  coord_type: cart                 # optional geom_loader settings
calc:
  model_charge: 0                  # ML region charge
  model_mult: 1                    # spin multiplicity 2S+1
mlmm:
  real_parm7: real.parm7           # Amber parm7 topology
  model_pdb: ml_region.pdb         # ML-region definition
dft:
  func_basis: wb97m-v/def2-tzvpd   # exchange-correlation functional / basis set
  conv_tol: 1.0e-09                # SCF convergence tolerance (Hartree)
  max_cycle: 100                   # maximum SCF iterations
  grid_level: 3                    # PySCF grid level
  verbose: 4                       # PySCF verbosity (0-9)
  out_dir: ./result_dft/           # output directory root
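The merge order (defaults, then the YAML file, then explicit CLI options) can be pictured as successive dictionary updates. This is a sketch under stated assumptions: the function name resolve_dft_config is hypothetical, and treating a CLI value of None as "not given" is an assumption about how unset options are represented.

```python
# Defaults for the dft section, as listed in the keys above.
DFT_DEFAULTS = {
    "func_basis": "wb97m-v/def2-tzvpd",
    "conv_tol": 1e-9,
    "max_cycle": 100,
    "grid_level": 3,
    "verbose": 4,
    "out_dir": "./result_dft/",
}


def resolve_dft_config(yaml_dft=None, cli_overrides=None):
    """Layer YAML values over defaults, then explicit CLI values over both."""
    resolved = dict(DFT_DEFAULTS)
    resolved.update(yaml_dft or {})
    # Assumption: options not passed on the command line arrive as None.
    explicit = {k: v for k, v in (cli_overrides or {}).items() if v is not None}
    resolved.update(explicit)
    return resolved


config = resolve_dft_config(
    yaml_dft={"conv_tol": 1e-10, "max_cycle": 200},
    cli_overrides={"max_cycle": 300, "grid_level": None},
)
print(config)
```

In this example the YAML value of conv_tol survives, max_cycle comes from the CLI, and grid_level falls back to its default because it was not given explicitly.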
See Also
Common Error Recipes – Symptom-first failure routing
Troubleshooting – Detailed troubleshooting guide
freq – Vibrational frequency analysis (often precedes DFT refinement)
opt – Single-structure geometry optimization
all – End-to-end workflow with --dft
YAML Reference – Full dft configuration options
Glossary – Definitions of DFT, SP (Single Point)