fix-altloc¶
Overview¶
Summary: Remove alternate location (altLoc) indicators from PDB files by selecting the best conformer for each atom based on occupancy and dropping duplicates.
What it does¶
Blank the PDB altLoc column (column 17, 1-based) with a single space.
This is a 1-character replacement (no shifting / no reformatting).
If the same atom appears multiple times due to alternate locations (altLoc like A/B/… or custom labels like H/L), keep the “best” one:
Highest occupancy first
If tied (or occupancy missing), keep the earliest one in the file
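The single-character replacement can be illustrated with a minimal sketch. The helper name `blank_altloc` and the sample record are hypothetical and are not part of the mlmm API; the only thing shown is that overwriting index 16 (column 17, 1-based) leaves every other column where it was.

```python
def blank_altloc(line: str) -> str:
    """Replace the altLoc character (column 17, 1-based) with a space."""
    if len(line) < 17:
        return line  # too short to carry an altLoc column; leave untouched
    # Column 17 (1-based) is index 16 (0-based); the line length is unchanged.
    return line[:16] + " " + line[17:]


# Hypothetical ATOM record assembled field by field so the columns are explicit.
record = (
    "ATOM  "      # cols 1-6   record name
    + "    2"     # cols 7-11  serial
    + " "         # col 12     blank
    + " CA "      # cols 13-16 atom name
    + "A"         # col 17     altLoc
    + "ALA"       # cols 18-20 residue name
    + " "         # col 21     blank
    + "A"         # col 22     chain ID
    + "   1"      # cols 23-26 residue sequence number
    + "    "      # col 27 insertion code, cols 28-30 blank
    + "  11.000" + "  12.000" + "  13.000"  # cols 31-54 x, y, z
    + "  0.50"    # cols 55-60 occupancy
    + " 20.00"    # cols 61-66 temperature factor
)
cleaned = blank_altloc(record)
```

Because the replacement is done by slicing rather than by re-formatting the record, coordinates and occupancy stay in their original columns.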
Handled records¶
ATOM/HETATM: altLoc selection and blanking
ANISOU: kept only if the corresponding ATOM/HETATM line (same serial) is kept
Automatic skip behavior¶
By default, if a file contains no altLoc characters (all column 17 positions
are blank), the file is skipped and no output is written. Use --force to
process files regardless of altLoc presence.
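The detection step can be sketched as follows. This is an illustration under stated assumptions, not the tool's actual implementation: the function name `contains_altloc` is hypothetical, and restricting the scan to ATOM/HETATM records is an assumption made here so that unrelated records (such as REMARK lines with text in column 17) do not trigger processing.

```python
from typing import Iterable

# Assumption: only coordinate records are inspected for altLoc characters.
ATOM_RECORDS = ("ATOM  ", "HETATM")


def contains_altloc(lines: Iterable[str]) -> bool:
    """Return True if any ATOM/HETATM line has a non-blank column 17."""
    for line in lines:
        if line[:6] in ATOM_RECORDS and len(line) > 16 and line[16] != " ":
            return True
    return False


# Two hypothetical records that differ only in the altLoc column.
with_altloc = "ATOM  " + "    1" + " " + " N  " + "A" + "ALA A   1"
without_altloc = "ATOM  " + "    1" + " " + " N  " + " " + "ALA A   1"
remark = "REMARK 350 BIOMOLECULE: 1"
```

A caller mimicking the default behavior would skip the file when this check returns False, and process it anyway when the equivalent of `--force` is requested.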
Minimal example¶
mlmm fix-altloc -i 1abc.pdb
Output checklist¶
<input>_clean.pdb – PDB with altLoc column blanked and duplicates removed
Original file unchanged (unless --inplace is set)
Common examples¶
Specify output file.
mlmm fix-altloc -i 1abc.pdb -o 1abc_fixed.pdb
Process a directory recursively.
mlmm fix-altloc -i ./structures -o ./cleaned --recursive
Overwrite input files in-place (creates .bak backups).
mlmm fix-altloc -i ./structures --inplace --recursive
Force processing even if no altLoc is detected.
mlmm fix-altloc -i 1abc.pdb -o 1abc_fixed.pdb --force
Workflow¶
Check if the input file contains any non-blank altLoc characters (column 17).
If no altLoc is found and --force is not set, skip the file.
For each ATOM/HETATM record, build an identity key ignoring the altLoc field:
record name, atom name, residue name, chain ID, residue sequence, insertion code, segID
Among atoms with the same identity key, select the one with:
Highest occupancy (columns 55–60)
If tied, the earliest appearance in the file
Write output with:
Only the selected atoms retained
altLoc column (17) blanked to a single space
ANISOU records filtered to match retained atoms
CLI options¶
| Option | Description | Default |
|---|---|---|
| -i | Input PDB file or directory. | Required |
| -o/--out | Output file (if input is a file) or directory (if input is a directory). | File input: <input>_clean.pdb; directory input: <input>_clean/ |
| --recursive | Process a directory recursively. | |
| --inplace | Overwrite input file(s) in-place (creates .bak backups). | |
| | Allow overwriting existing output files. | |
| --force | Process files even if no altLoc is detected. | |
Outputs¶
A PDB file with alternate locations removed:
File input:
<input>_clean.pdb by default (when -o/--out is omitted)
Directory input:
<input>_clean/ directory by default (mirrors subpaths)
OUTPUT.pdb if -o/--out is provided
Original file overwritten if --inplace is set (backup saved as <input>.pdb.bak)
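The naming rules above amount to simple path arithmetic, sketched here with `pathlib`. The function names are hypothetical and not part of mlmm, and the assumption that mirrored files keep their original file names inside the output directory is a guess from "mirrors subpaths" rather than documented behavior.

```python
from pathlib import Path


def default_file_output(input_file: Path) -> Path:
    """<input>.pdb -> <input>_clean.pdb, next to the input file."""
    return input_file.with_name(input_file.stem + "_clean" + input_file.suffix)


def default_dir_output(input_dir: Path) -> Path:
    """<input>/ -> <input>_clean/, next to the input directory."""
    return input_dir.with_name(input_dir.name + "_clean")


def mirrored_output(input_dir: Path, output_dir: Path, pdb_file: Path) -> Path:
    """Place a file under output_dir at the same relative subpath (assumption)."""
    return output_dir / pdb_file.relative_to(input_dir)


def backup_path(input_file: Path) -> Path:
    """<input>.pdb -> <input>.pdb.bak, as used with in-place overwriting."""
    return input_file.with_name(input_file.name + ".bak")


src_dir = Path("structures")
dst_dir = default_dir_output(src_dir)
nested = mirrored_output(src_dir, dst_dir, Path("structures/sub/1abc.pdb"))
```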
Integration with all workflow¶
When running the mlmm all workflow, fix-altloc can be automatically
invoked as a preflight step after add-elem-info (if element fields were
missing) and before pocket extraction. This ensures that:
Element symbols are populated first
Alternate locations are resolved to a single conformer
The extraction step receives clean, unambiguous coordinates
Files are only processed if altLoc characters are detected; otherwise, the original file is passed through unchanged.
Handling different atom counts between altLoc states¶
When altLoc states contain different atoms (e.g., altLoc A has atoms
N, CA, CB, CG while altLoc B has N, CA, CB, CD), fix-altloc distinguishes two cases:
Duplicate atoms (same residue + atom name in multiple altLocs, e.g., N, CA, CB): The best one is selected based on occupancy (highest first, then earliest in file).
Unique atoms (only present in one altLoc, e.g., CG in A, CD in B): ALL unique atoms are preserved in the output.
As a result, the output contains every distinct atom found in any altLoc state, and only true duplicates are reduced to a single copy.
Example:
Input:
ATOM 1 N AALA A 1... 0.50 # altLoc A
ATOM 2 CA AALA A 1... 0.50 # altLoc A
ATOM 3 CG AALA A 1... 0.50 # altLoc A only
ATOM 4 N BALA A 1... 0.40 # altLoc B
ATOM 5 CA BALA A 1... 0.40 # altLoc B
ATOM 6 CD BALA A 1... 0.40 # altLoc B only
Output:
ATOM 1 N ALA A 1... 0.50 # from A (higher occ)
ATOM 2 CA ALA A 1... 0.50 # from A (higher occ)
ATOM 3 CG ALA A 1... 0.50 # kept (A only)
ATOM 6 CD ALA A 1... 0.40 # kept (B only)
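The same example can be traced with simplified records. This sketch is illustrative only: `AtomRec` is a hypothetical stand-in for a parsed PDB line, and because all six atoms belong to the same residue, the atom name alone serves as the identity key here.

```python
from typing import Dict, NamedTuple


class AtomRec(NamedTuple):
    serial: int
    name: str
    altloc: str
    occupancy: float


records = [
    AtomRec(1, "N", "A", 0.50),
    AtomRec(2, "CA", "A", 0.50),
    AtomRec(3, "CG", "A", 0.50),  # present in altLoc A only
    AtomRec(4, "N", "B", 0.40),
    AtomRec(5, "CA", "B", 0.40),
    AtomRec(6, "CD", "B", 0.40),  # present in altLoc B only
]

best: Dict[str, AtomRec] = {}
for rec in records:
    current = best.get(rec.name)
    # Strict ">" means an equal-occupancy duplicate never displaces an earlier one.
    if current is None or rec.occupancy > current.occupancy:
        best[rec.name] = rec

kept_serials = sorted(rec.serial for rec in best.values())
print(kept_serials)  # [1, 2, 3, 6]
```

CG and CD each have no competitor under their key, so both survive even though they come from different altLoc states.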
API usage¶
For programmatic use, the module exports:
from pathlib import Path

from mlmm.fix_altloc import has_altloc, fix_altloc_file

# Check whether the file contains any altLoc characters
if has_altloc(Path("input.pdb")):
    # Resolve altLocs and write the cleaned file
    was_processed = fix_altloc_file("input.pdb", "output.pdb", overwrite=True)
See Also¶
Common Error Recipes – Symptom-first failure routing
Troubleshooting – Detailed troubleshooting guide
add-elem-info – Repair PDB element columns before altLoc fixing
extract – Extract active-site pocket after altLoc resolution
all – End-to-end workflow that includes automatic altLoc fixing