fix-altloc¶
Remove alternate location (altLoc) indicators from PDB files by selecting the best conformer for each atom based on occupancy and dropping duplicates. Use it to clean a structure carrying altLoc characters before downstream ML/MM preparation, typically after repairing element columns via add-elem-info. The altLoc column (column 17, 1-based) is blanked with a single space (a 1-character replacement; no shifting or reformatting), and when the same atom appears in multiple altLoc states the highest-occupancy copy is retained (earliest in file on ties, or when occupancy is missing). ATOM / HETATM records undergo altLoc selection and blanking; ANISOU records are kept only if the corresponding ATOM/HETATM line (same serial) is kept.
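The single-character replacement described above can be sketched in a few lines. This is an illustrative sketch, not the tool's actual code: the helper names `has_altloc_line` and `blank_altloc` are hypothetical, and the only assumption is the fixed-column PDB layout in which altLoc is column 17 (index 16 in Python).

```python
ALTLOC_INDEX = 16  # column 17 in 1-based PDB numbering


def has_altloc_line(line: str) -> bool:
    """Return True if an ATOM/HETATM line carries a non-blank altLoc character."""
    return (
        line.startswith(("ATOM  ", "HETATM"))
        and len(line) > ALTLOC_INDEX
        and line[ALTLOC_INDEX] != " "
    )


def blank_altloc(line: str) -> str:
    """Replace the altLoc character with one space; every other column stays put."""
    if len(line) <= ALTLOC_INDEX:
        return line  # too short to have an altLoc column
    return line[:ALTLOC_INDEX] + " " + line[ALTLOC_INDEX + 1:]
```

Because exactly one character is swapped, the line length and all downstream columns (coordinates, occupancy, element) keep their positions.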
Examples¶
Command form:
mlmm fix-altloc -i INPUT [-o OUTPUT] [options]
Resolve altLocs in a single file (writes <input>_clean.pdb):
mlmm fix-altloc -i 1abc.pdb
Resolve altLocs in a single file with an explicit output name:
mlmm fix-altloc -i 1abc.pdb -o 1abc_fixed.pdb
Process a directory recursively into a new output directory:
mlmm fix-altloc -i ./structures -o ./cleaned --recursive
Process a directory recursively, overwriting files in place:
mlmm fix-altloc -i ./structures --inplace --recursive
Workflow¶
Check if the input file contains any non-blank altLoc characters (column 17).
If no altLoc is found and --force is not set, skip the file (left unchanged).
For each ATOM/HETATM record, build an identity key ignoring the altLoc field:
record name, atom name, residue name, chain ID, residue sequence, insertion code, segID
Among atoms with the same identity key, select the best one using the occupancy / earliest-appearance rule (highest occupancy; ties or missing occupancy keep the earliest in file; occupancy is read from columns 55–60).
Write output with:
Only the selected atoms retained
altLoc column (17) blanked to a single space
ANISOU records filtered to match retained atoms
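The workflow steps above can be sketched as a single pass over the lines of a PDB file. This is a minimal sketch under stated assumptions, not the implementation shipped with mlmm: the function names are hypothetical, segID is read from the legacy columns 73–76, a missing occupancy is treated as the lowest possible value, and ANISOU lines are passed through unmodified when their serial is kept.

```python
from typing import Dict, List, Optional, Tuple

ATOM_RECORDS = ("ATOM  ", "HETATM")


def occupancy(line: str) -> Optional[float]:
    """Occupancy from columns 55-60 (1-based); None if blank or unparsable."""
    try:
        return float(line[54:60])
    except ValueError:
        return None


def identity_key(line: str) -> Tuple[str, ...]:
    """Atom identity with the altLoc column (17) deliberately left out."""
    return (
        line[0:6],             # record name
        line[12:16],           # atom name
        line[17:20],           # residue name
        line[21:22],           # chain ID
        line[22:26],           # residue sequence number
        line[26:27],           # insertion code
        line[72:76].strip(),   # segID (assumed legacy columns 73-76)
    )


def resolve_altloc(lines: List[str], force: bool = False) -> List[str]:
    atom_idx = [i for i, l in enumerate(lines) if l.startswith(ATOM_RECORDS)]
    # Step 1-2: skip files without any altLoc character unless forced.
    if not force and not any(lines[i][16:17].strip() for i in atom_idx):
        return list(lines)

    # Step 3-4: per identity key, remember (score, line index) of the best atom.
    best: Dict[Tuple[str, ...], Tuple[float, int]] = {}
    for i in atom_idx:
        occ = occupancy(lines[i])
        score = occ if occ is not None else float("-inf")  # assumption
        key = identity_key(lines[i])
        # Strict '>' means ties keep the earliest line in the file.
        if key not in best or score > best[key][0]:
            best[key] = (score, i)

    kept = {i for _, i in best.values()}
    kept_serials = {lines[i][6:11] for i in kept}

    # Step 5: write kept atoms with altLoc blanked; filter ANISOU by serial.
    out: List[str] = []
    for i, line in enumerate(lines):
        if line.startswith(ATOM_RECORDS):
            if i in kept:
                out.append(line[:16] + " " + line[17:])
        elif line.startswith("ANISOU"):
            if line[6:11] in kept_serials:
                out.append(line)
        else:
            out.append(line)
    return out
```

All other record types (for example CRYST1 or END) pass through untouched in this sketch.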
Handling different atom counts between altLoc states¶
When different altLoc states contain different atoms (e.g., altLoc A has atoms
N, CA, CB, CG while altLoc B has N, CA, CB, CD), fix-altloc processes them as follows:
Duplicate atoms (same residue + atom name in multiple altLocs, e.g., N, CA, CB): The best one is selected using the same occupancy / earliest-appearance rule.
Unique atoms (only present in one altLoc, e.g., CG in A, CD in B): ALL unique atoms are preserved in the output.
Example:
Input:
ATOM 1 N AALA A 1... 0.50 # altLoc A
ATOM 2 CA AALA A 1... 0.50 # altLoc A
ATOM 3 CG AALA A 1... 0.50 # altLoc A only
ATOM 4 N BALA A 1... 0.40 # altLoc B
ATOM 5 CA BALA A 1... 0.40 # altLoc B
ATOM 6 CD BALA A 1... 0.40 # altLoc B only
Output:
ATOM 1 N ALA A 1... 0.50 # from A (higher occ)
ATOM 2 CA ALA A 1... 0.50 # from A (higher occ)
ATOM 3 CG ALA A 1... 0.50 # kept (A only)
ATOM 6 CD ALA A 1... 0.40 # kept (B only)
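The example above can be reproduced with a toy model that ignores PDB column layout entirely. The `Atom` tuple and `pick_atoms` function are hypothetical names used only for illustration; the sketch assumes all atoms belong to one residue and uses the serial number as a stand-in for file order.

```python
from typing import Dict, List, NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    altloc: str
    occupancy: float


def pick_atoms(atoms: List[Atom]) -> List[Atom]:
    """Keep one atom per atom name; atoms seen in only one altLoc always survive."""
    best: Dict[str, Atom] = {}
    for atom in atoms:
        current = best.get(atom.name)
        # Strict '>' keeps the earlier atom when occupancies are equal.
        if current is None or atom.occupancy > current.occupancy:
            best[atom.name] = atom
    return sorted(best.values(), key=lambda a: a.serial)


residue = [
    Atom(1, "N", "A", 0.50),
    Atom(2, "CA", "A", 0.50),
    Atom(3, "CG", "A", 0.50),  # altLoc A only
    Atom(4, "N", "B", 0.40),
    Atom(5, "CA", "B", 0.40),
    Atom(6, "CD", "B", 0.40),  # altLoc B only
]

print([a.serial for a in pick_atoms(residue)])  # [1, 2, 3, 6]
```

Note that the result can mix atoms from different altLoc states within one residue, which is exactly what the output listing above shows.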
Outputs¶
A PDB file with alternate locations removed:
File input: <input>_clean.pdb by default (when -o/--out is omitted)
Directory input: <input>_clean/ directory by default (mirrors subpaths)
OUTPUT.pdb if -o/--out is provided
Original file overwritten if --inplace is set (backup saved as <input>.pdb.bak)
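The default naming rules listed above can be expressed with `pathlib`. This is a hedged sketch: `default_output_path` and `backup_path` are hypothetical helpers, and placing the default output next to the input (rather than in the working directory) is an assumption the source does not confirm.

```python
from pathlib import Path


def default_output_path(input_path: Path, is_dir: bool) -> Path:
    """Default output location when -o/--out is omitted (assumed: beside the input)."""
    if is_dir:
        # Directory input: <input>_clean/
        return input_path.with_name(input_path.name + "_clean")
    # File input: <input>_clean.pdb (suffix preserved)
    return input_path.with_name(input_path.stem + "_clean" + input_path.suffix)


def backup_path(input_path: Path) -> Path:
    """Backup name used with --inplace: <input>.pdb.bak."""
    return input_path.with_name(input_path.name + ".bak")
```

For instance, `default_output_path(Path("1abc.pdb"), is_dir=False)` gives `1abc_clean.pdb`, and `backup_path(Path("1abc.pdb"))` gives `1abc.pdb.bak`.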
Python API¶
For programmatic use, the module exports:
from pathlib import Path
from mlmm.io.pdb_fix import has_altloc, clean_pdb_file
# Check if a file has altLoc
if has_altloc(Path("input.pdb")):
    # Resolve altLoc into a cleaned PDB (always overwrites output)
    clean_pdb_file(Path("input.pdb"), Path("output.pdb"))
CLI options¶
| Option | Description | Default |
|---|---|---|
| -i | Input PDB file or directory. | Required |
| -o/--out | Output file (if input is a file) or directory (if input is a directory). | File input: <input>_clean.pdb; directory input: <input>_clean/ |
| --recursive | Process directories recursively. | |
| --inplace | Overwrite input file(s) in-place (creates a .bak backup). | |
| | Allow overwriting existing output files. | |
| --force | Process files even if no altLoc is detected. | |
The full flag list is in the generated command reference.
Notes¶
Files with no altLoc characters are skipped unless --force is set.
See Also¶
Common Error Recipes — Symptom-first failure routing
Troubleshooting — Detailed troubleshooting guide
add-elem-info — Repair PDB element columns before altLoc fixing
extract — Extract active-site pocket after altLoc resolution
all — End-to-end ML/MM workflow (run fix-altloc beforehand if your inputs carry altLocs)