dft
Runs a single-point DFT calculation with GPU4PySCF or CPU PySCF and reports the energy and a population analysis (Mulliken, meta-Löwdin, and IAO charges). The default functional/basis is ωB97M-V/def2-tzvpd. Use it on a small active-site model, typically to refine MLIP-optimized R/TS/P structures. Select the backend via --engine (default gpu); use cpu when no GPU is available or for portable/debug runs.
See --engine (dft) vs --dft-engine (all) for the --engine (standalone dft) vs --dft-engine (forwarded through pdb2reaction all) naming convention.
Prerequisites: DFT dependencies (PySCF, GPU4PySCF) are not included in the default install. Install them with
pip install "pdb2reaction[dft]".
Examples
Command form:
pdb2reaction dft -i INPUT.{pdb|xyz|gjf|...} [-q CHARGE] [-l, --ligand-charge <number|'RES:Q,...'>] [-m MULTIPLICITY] \
[--func-basis 'FUNC/BASIS'] \
[--max-cycle N] [--conv-tol Eh] [--grid-level L] \
[--out-dir DIR] [--engine gpu|cpu] [--convert-files/--no-convert-files] \
[--ref-pdb FILE] [--config FILE] [--show-config] [--dry-run]
Basic GPU single point.
pdb2reaction dft -i input.pdb -q 0 -m 1 --engine gpu --out-dir ./result_dft
Run with tighter SCF settings.
pdb2reaction dft -i input.pdb -q 0 -m 1 \
--func-basis 'wb97m-v/def2-tzvpd' --conv-tol 1e-10 --max-cycle 200 \
--engine gpu --out-dir ./result_dft_tight
Caveat: The tight def2-tzvpd setting above suits only small active-site models (≲150 atoms on a GPU with sufficient VRAM). See Notes for size / basis / backend thresholds.
Force CPU backend for portability.
pdb2reaction dft -i input.pdb -q 0 -m 1 --engine cpu --out-dir ./result_dft_cpu
Derive total charge from ligand mapping when -q is omitted.
pdb2reaction dft -i input.pdb -l 'LIG:0' -m 1 \
--engine gpu --out-dir ./result_dft_ligand
When -q is omitted but --ligand-charge/-l is provided, the input is treated as an enzyme–substrate complex and extract.py’s charge summary computes the total charge; an explicit -q still overrides. For non-.gjf inputs, omitting -q without --ligand-charge/-l aborts.
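The command form shows that --ligand-charge/-l accepts either a number or a 'RES:Q,...' mapping. A minimal sketch of how such a value could be parsed is shown below; parse_ligand_charge is a hypothetical helper written for illustration (not the tool's own code), the residue name SAM is made up, and how the total charge is then derived from the mapping is not covered here.

```python
from typing import Dict, Union


def parse_ligand_charge(spec: str) -> Union[int, Dict[str, int]]:
    """Parse a --ligand-charge style value (hypothetical helper).

    A bare integer such as "-1" is returned as an int; a mapping such as
    "LIG:0,SAM:1" is returned as {"LIG": 0, "SAM": 1}.
    """
    text = spec.strip()
    if ":" not in text:
        return int(text)  # scalar form
    charges: Dict[str, int] = {}
    for item in text.split(","):
        name, _, value = item.partition(":")
        name = name.strip()
        if not name or not value.strip():
            raise ValueError(f"malformed entry: {item!r}")
        charges[name] = int(value)  # int() tolerates surrounding whitespace
    return charges


if __name__ == "__main__":
    print(parse_ligand_charge("LIG:0"))
    print(parse_ligand_charge("-1"))
```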
Workflow
Input handling – Any file loadable by geom_loader (.pdb/.xyz/_trj.xyz/…) is accepted. Coordinates are re-exported as input_geometry.xyz. For XYZ/GJF inputs, --ref-pdb supplies a reference PDB topology for atom-count validation and (if you also use --ligand-charge/-l) charge derivation; the DFT stage itself does not emit PDB/GJF outputs.
SCF build – --func-basis is parsed into functional and basis. --engine controls GPU/CPU preference (gpu requires GPU4PySCF and raises an error if unavailable; cpu forces CPU). On the closed-shell GPU path with --lowmem (default), the SCF object is gpu4pyscf.dft.rks_lowmem.RKS, which uses a memory-efficient direct-JK pipeline (no density fitting); on the open-shell GPU, CPU, or --no-lowmem paths, density fitting is enabled automatically with PySCF defaults. Nonlocal corrections (e.g., VV10) are not configured explicitly beyond the backend defaults.
Population analysis & outputs – After convergence (or failure) the command writes result.yaml summarizing the energy (in hartree and kcal/mol), convergence metadata, backend info, and per-atom Mulliken/meta-Löwdin/IAO charges and spin densities (UKS only for spins). Any failed analysis column is set to null with a warning.
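Two small conversions are implied by the SCF build step: splitting the 'FUNC/BASIS' string into its two parts, and turning the spin multiplicity (2S+1) into the 2S value reported in result.yaml. The sketch below illustrates both under stated assumptions; split_func_basis and multiplicity_to_spin are hypothetical names, not functions from pdb2reaction.

```python
from typing import Tuple


def split_func_basis(func_basis: str) -> Tuple[str, str]:
    """Split a 'FUNC/BASIS' string at the first slash (hypothetical helper)."""
    func, sep, basis = func_basis.partition("/")
    if not sep or not func.strip() or not basis.strip():
        raise ValueError(f"expected 'FUNC/BASIS', got {func_basis!r}")
    return func.strip(), basis.strip()


def multiplicity_to_spin(multiplicity: int) -> int:
    """Return 2S (the number of unpaired electrons) for a multiplicity 2S+1."""
    if multiplicity < 1:
        raise ValueError("multiplicity must be a positive integer")
    return multiplicity - 1


if __name__ == "__main__":
    print(split_func_basis("wb97m-v/def2-tzvpd"))
    print(multiplicity_to_spin(1))  # closed-shell singlet
```

A singlet (multiplicity 1) gives 2S = 0 and a triplet gives 2S = 2, which matches the convention PySCF uses for the spin attribute of a molecule.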
Outputs
out_dir/ (default: ./result_dft/)
├─ input_geometry.xyz # Geometry snapshot sent to PySCF
└─ result.yaml # Energy/charge/spin summaries with convergence/engine metadata
result.yaml expands to:
energy: energy in hartree and kcal/mol, convergence flag, engine metadata (engine: gpu4pyscf(rks_lowmem) / gpu4pyscf / pyscf(cpu); used_gpu; used_lowmem).
charges: Mulliken, meta-Löwdin, and IAO atomic charges (null when a method fails).
spin_densities: Mulliken, meta-Löwdin, and IAO spin densities (UKS only for spins).
It also summarizes charge, multiplicity, spin (2S), functional, basis, convergence knobs, and resolved output directory.
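Because the energy is reported in both hartree and kcal/mol, it can help to see the conversion spelled out, for example when comparing single points for R/TS/P structures. The sketch below assumes the CODATA 2018 factor of 627.5094740631 kcal/mol per hartree; the exact constant used by the tool is not stated here, and the energies in the usage lines are made-up placeholders, not computed results.

```python
from typing import Dict

# Assumed conversion factor (CODATA 2018); the tool's own constant may differ
# in the trailing digits.
HARTREE_TO_KCAL_PER_MOL = 627.5094740631


def hartree_to_kcal(energy_hartree: float) -> float:
    """Convert an energy from hartree to kcal/mol."""
    return energy_hartree * HARTREE_TO_KCAL_PER_MOL


def relative_energies_kcal(
    energies_hartree: Dict[str, float], reference: str
) -> Dict[str, float]:
    """Return energies relative to `reference`, in kcal/mol."""
    e_ref = energies_hartree[reference]
    return {
        label: hartree_to_kcal(energy - e_ref)
        for label, energy in energies_hartree.items()
    }


if __name__ == "__main__":
    # Placeholder energies for illustration only.
    placeholder = {"R": -100.0, "TS": -99.98}
    print(relative_energies_kcal(placeholder, reference="R"))
```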
CLI options
The full flag list is in the generated command reference; the table below covers the options that need explanation.
| Option | Description | Default |
|---|---|---|
| -i | Structure file accepted by … | Required |
| -q | Total charge supplied to PySCF (… | Required unless template/derivation applies |
| -l, --ligand-charge | Either a scalar integer (e.g., … | None |
| -m | Spin multiplicity (2S+1). Converted to … | |
| --func-basis | Functional/basis pair in … | wb97m-v/def2-tzvpd |
| --max-cycle | Maximum SCF iterations (… | |
| --conv-tol | SCF convergence tolerance in hartree (… | |
| --grid-level | PySCF numerical integration grid level (… | |
| --out-dir | Output directory (… | ./result_dft/ |
| --engine | SCF backend: gpu (GPU4PySCF) or cpu (PySCF). See --engine (dft) vs --dft-engine (all) for the … | gpu |
| --lowmem/--no-lowmem | Use … | --lowmem |
| --convert-files/--no-convert-files | No-op on … | |
| --ref-pdb | Reference PDB topology to validate atom counts and enable ligand-charge derivation for XYZ/GJF inputs (no output conversion). | None |
| --config | Base YAML configuration file applied before explicit CLI options. | None |
| --show-config | Print resolved configuration and continue execution. | |
| | Write a machine-readable … | |
| --dry-run | Validate options and print execution plan without running DFT. | |
YAML configuration
Accepts a mapping root; the dft section (and optional geom) is applied when present. Merge order (later sources override earlier ones) is:
defaults
--config
explicit CLI options
geom:
  coord_type: cart # optional geom_loader settings
dft:
  func: wb97m-v # exchange–correlation functional
  basis: def2-tzvpd # basis set name (alternatively use func_basis: "FUNC/BASIS")
  conv_tol: 1.0e-09 # SCF convergence tolerance (hartree)
  max_cycle: 100 # maximum SCF iterations
  grid_level: 3 # PySCF grid level
  verbose: 0 # PySCF verbosity (0-9); CLI -v 2/3 raises runtime PySCF verbosity to >=4
  out_dir: ./result_dft/ # output directory root
Full schema (every key and default): YAML Reference.
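The merge order for this section (defaults, then --config, then explicit CLI options) can be sketched as a layered dictionary update. This is an illustration only: merge_settings is a hypothetical helper, and representing unset CLI options as None is an assumption about how "explicit" options might be distinguished, not a description of the actual implementation.

```python
from copy import deepcopy
from typing import Any, Dict, Mapping, Optional


def merge_settings(
    defaults: Mapping[str, Any],
    config: Optional[Mapping[str, Any]] = None,
    cli: Optional[Mapping[str, Any]] = None,
) -> Dict[str, Any]:
    """Layer settings so that later sources win: defaults < config < CLI.

    Keys whose value is None are treated as "not set" and do not override
    an earlier layer (an assumption made for this sketch).
    """
    merged: Dict[str, Any] = deepcopy(dict(defaults))
    for layer in (config or {}, cli or {}):
        for key, value in layer.items():
            if value is not None:
                merged[key] = value
    return merged


if __name__ == "__main__":
    base = {"conv_tol": 1.0e-09, "max_cycle": 100}  # values from the YAML block
    from_yaml = {"max_cycle": 200}                  # as if read from --config
    from_cli = {"conv_tol": 1.0e-10, "max_cycle": None}  # only --conv-tol given
    print(merge_settings(base, from_yaml, from_cli))
```

In this sketch the CLI value for conv_tol wins over both other layers, while max_cycle comes from the config layer because no CLI value was given.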
Exit codes
See Exit codes in CLI Conventions.
Notes
System size / basis cost: def2-tzvpd is expensive; on 16-24 GB GPUs it OOMs above ~150 atoms, and DFT single points are practical only up to ~300 atoms. Use --func-basis 'wb97m-v/def2-svp' (1-3 kcal/mol barrier-height error) or an external program (ORCA, Gaussian) for larger systems, and extract a small active-site model first.
Blackwell GPUs (RTX 50xx): GPU4PySCF may OOM even at ~100 atoms; use --engine cpu or an external DFT program.
CPU backend: --engine cpu is practical only for small models (<=150 atoms) and small basis sets; larger systems are prohibitively slow.
HPC scratch: PySCF / GPU4PySCF write to $PYSCF_TMPDIR (then $TMPDIR, /tmp); on nodes with a small or tmpfs /tmp, set PYSCF_TMPDIR to the job filesystem (e.g. export PYSCF_TMPDIR="$PBS_O_WORKDIR") before launching.
Compiled GPU4PySCF wheels may not support non-x86 systems; build from source in that case (see https://github.com/pyscf/gpu4pyscf).
No auxiliary basis guessing is implemented; density-fitting behavior is described under Workflow (SCF build) and the --lowmem CLI option.
The YAML input file must have a mapping root; the dft section is optional. Non-mapping roots raise an error via load_yaml_dict.
IAO spin/charge analysis may fail for challenging systems; corresponding columns in result.yaml become null and a warning is printed.
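Before launching on an HPC node, it can be useful to check which scratch directory would be picked up and how much space it has. The sketch below simply mirrors the lookup order given in the HPC scratch note ($PYSCF_TMPDIR, then $TMPDIR, then /tmp); resolve_scratch_dir and free_gib are hypothetical helpers, and the code is not taken from PySCF or GPU4PySCF.

```python
import os
import shutil
from typing import Mapping, Optional


def resolve_scratch_dir(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the first scratch location that is set, in the documented order."""
    env = os.environ if env is None else env
    for name in ("PYSCF_TMPDIR", "TMPDIR"):
        value = env.get(name)
        if value:  # skip unset or empty variables
            return value
    return "/tmp"


def free_gib(path: str) -> float:
    """Free space on the filesystem holding `path`, in GiB."""
    return shutil.disk_usage(path).free / 2**30


if __name__ == "__main__":
    scratch = resolve_scratch_dir()
    print(f"scratch directory: {scratch}")
    if os.path.isdir(scratch):
        print(f"free space: {free_gib(scratch):.1f} GiB")
```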
See Also
Common Error Recipes – Symptom-first failure routing
Troubleshooting – Detailed fixes for common failure modes
freq – MLIP-based vibrational analysis (often precedes DFT refinement)
all – End-to-end workflow with --dft
YAML Reference – Full dft configuration options
Glossary – Definitions of DFT, SP (Single Point)