Python API
PoseBusters class
The PoseBusters class collects the molecules to test, runs the modules, and reports the test results.
- class posebusters.PoseBusters(config: str | dict[str, Any] = 'redock', top_n: int | None = None)
Class to run all tests on a set of molecules.
- bust(mol_pred: Iterable[Mol | Path] | Mol | Path, mol_true: Mol | Path | None = None, mol_cond: Mol | Path | None = None, full_report: bool = False) → pd.DataFrame
Run tests on one or more molecules.
- Parameters:
mol_pred – Generated molecule(s), e.g. de-novo generated molecule or docked ligand, with one or more poses.
mol_true – True molecule, e.g. crystal ligand, with one or more poses.
mol_cond – Conditioning molecule, e.g. protein.
full_report – Whether to include all columns in the output or only the boolean ones specified in the config.
Notes
Molecules can be provided as RDKit molecule objects or file paths.
- Returns:
Pandas dataframe with results.
- bust_table(mol_table: DataFrame, full_report: bool = False) → DataFrame
Run tests on molecules provided in a pandas dataframe as paths or RDKit molecule objects.
- Parameters:
mol_table – Pandas dataframe with columns “mol_pred”, “mol_true”, “mol_cond” containing paths to molecules.
full_report – Whether to include all columns in the output or only the boolean ones specified in the config.
- Returns:
Pandas dataframe with results.
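A minimal usage sketch, assuming the posebusters and pandas packages are installed. It relies only on the constructor and bust_table as documented above; the helper names build_mol_table_rows and run_posebusters are hypothetical, and the file paths are whatever the caller supplies.

```python
from pathlib import Path


def build_mol_table_rows(pred_paths, true_path, cond_path):
    """Build one row per predicted ligand, using the column names bust_table expects."""
    return [
        {"mol_pred": Path(p), "mol_true": Path(true_path), "mol_cond": Path(cond_path)}
        for p in pred_paths
    ]


def run_posebusters(pred_paths, true_path, cond_path, full_report=False):
    """Hypothetical helper: run the 'redock' configuration on several predictions."""
    # Imported lazily so this file can be imported without the dependencies installed.
    import pandas as pd
    from posebusters import PoseBusters

    buster = PoseBusters(config="redock")
    mol_table = pd.DataFrame(build_mol_table_rows(pred_paths, true_path, cond_path))
    return buster.bust_table(mol_table, full_report=full_report)
```

For a single prediction, calling bust(mol_pred, mol_true, mol_cond) on the same PoseBusters instance should be the more direct route; both methods return a pandas dataframe.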
Modules
A PoseBusters module is a function that takes one or more of mol_pred, mol_true, and mol_cond as input and returns one or more test results as a dictionary.
- Inputs
Must take at least one of mol_pred, mol_true, and mol_cond, provided as RDKit molecules. Other inputs are parameters for which default values must be specified.
- Outputs
The output must be a dictionary with at least the results entry.
The results entry contains a dictionary that maps test names to test outcomes. Other output entries may contain further results, e.g. lengths and bounds for all bonds in the ligand.
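The input and output contract above can be sketched with a toy module. The function name check_atom_count, the max_atoms parameter, the test name, and the extra "details" entry are all hypothetical; only the required "results" entry and the use of an RDKit molecule as input come from the description above.

```python
from typing import Any


def check_atom_count(mol_pred, max_atoms: int = 100) -> dict[str, Any]:
    """Hypothetical module: pass if the molecule has at most max_atoms atoms."""
    num_atoms = mol_pred.GetNumAtoms()  # method available on RDKit Mol objects
    return {
        # Required entry: test names mapped to test outcomes.
        "results": {"atom_count_within_limit": num_atoms <= max_atoms},
        # Optional extra entry with further details (key name is illustrative).
        "details": {"num_atoms": num_atoms},
    }
```

Every parameter other than the molecule has a default value, as the Inputs rule requires.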
Distance Geometry
- posebusters.modules.distance_geometry.check_geometry(mol_pred: Mol, threshold_bad_bond_length: float = 0.2, threshold_clash: float = 0.2, threshold_bad_angle: float = 0.2, bound_matrix_params: dict[str, Any] = {'doTriangleSmoothing': True, 'scaleVDW': True, 'set15bounds': True, 'useMacrocycle14config': False}, ignore_hydrogens: bool = True, sanitize: bool = True) → dict[str, Any]
Use RDKit distance geometry bounds to check the geometry of a molecule.
- Parameters:
mol_pred – Predicted molecule (docked ligand). Only the first conformer will be checked.
threshold_bad_bond_length – Relative bond length threshold. 0.2 means that bonds may be up to 20% longer than DG bounds. Defaults to 0.2.
threshold_clash – Threshold for how much overlap constitutes a clash. 0.2 means that two atoms may be as close as 80% of the lower DG bound before a clash is reported. Defaults to 0.2.
threshold_bad_angle – Relative bond angle threshold. 0.2 means that angles may be up to 20% outside the DG bounds. Defaults to 0.2.
bound_matrix_params – Parameters passed to RDKit’s GetMoleculeBoundsMatrix function.
ignore_hydrogens – Whether to ignore hydrogens. Defaults to True.
sanitize – Sanitize molecule before running DG module (recommended). Defaults to True.
- Returns:
PoseBusters results dictionary.
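The relative thresholds can be illustrated with plain arithmetic. This sketch is not the library's implementation: it only encodes the reading given in the parameter descriptions, namely that a bond fails when it exceeds the upper bound by more than the threshold fraction, and that two atoms clash when they are closer than (1 - threshold) times the lower bound. The function names and the bound values are made up for illustration.

```python
def bond_too_long(distance: float, upper_bound: float, threshold: float = 0.2) -> bool:
    """True if the distance exceeds the upper bound by more than the relative threshold."""
    return distance > upper_bound * (1.0 + threshold)


def atoms_clash(distance: float, lower_bound: float, threshold: float = 0.2) -> bool:
    """True if the distance is below (1 - threshold) times the lower bound."""
    return distance < lower_bound * (1.0 - threshold)


# Illustrative bounds in Angstrom (not taken from RDKit).
print(bond_too_long(1.75, upper_bound=1.50))  # limit is about 1.80 -> False
print(atoms_clash(2.30, lower_bound=3.00))    # limit is about 2.40 -> True
```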
Energy Ratio
- posebusters.modules.energy_ratio.check_energy_ratio(mol_pred: Mol, threshold_energy_ratio: float = 7.0, ensemble_number_conformations: int = 100)
Check whether the energy of the docked ligand is within a user-defined range.
- Parameters:
mol_pred – Predicted molecule (docked ligand) with exactly one conformer.
threshold_energy_ratio – Limit above which the energy ratio is deemed too high. Defaults to 7.0.
ensemble_number_conformations – Number of conformations to generate for the ensemble over which to average. Defaults to 100.
- Returns:
PoseBusters results dictionary.
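As a rough sketch of how such a threshold behaves, the snippet below assumes the ratio is the pose energy divided by the mean energy of the generated ensemble, with all energies positive and in the same units. That definition is an assumption inferred from the parameter descriptions, and energy_ratio_ok is a hypothetical helper, not part of the library.

```python
from statistics import mean


def energy_ratio_ok(pose_energy, ensemble_energies, threshold_energy_ratio=7.0):
    """Return (ratio, passes) for a pose energy against the ensemble average."""
    # Assumed definition: pose energy relative to the average ensemble energy.
    ratio = pose_energy / mean(ensemble_energies)
    return ratio, ratio <= threshold_energy_ratio


# Made-up energies: the ensemble mean is 30.0, so the ratio is 4.0 and the pose passes.
print(energy_ratio_ok(120.0, [20.0, 30.0, 40.0]))
```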
Flatness
- posebusters.modules.flatness.check_flatness(mol_pred: Mol, threshold_flatness: float = 0.1, flat_systems: dict[str, str] = {'aromatic_5_membered_rings_sp2': '[ar5^2]1[ar5^2][ar5^2][ar5^2][ar5^2]1', 'aromatic_6_membered_rings_sp2': '[ar6^2]1[ar6^2][ar6^2][ar6^2][ar6^2][ar6^2]1', 'trigonal_planar_double_bonds': '[C;X3;^2](*)(*)=[C;X3;^2](*)(*)'}) → dict[str, Any]
Check whether substructures of molecule are flat.
- Parameters:
mol_pred – Molecule with exactly one conformer.
threshold_flatness – Maximum distance from shared plane used as cutoff. Defaults to 0.1.
flat_systems – Patterns of flat systems provided as SMARTS. Defaults to 5- and 6-membered aromatic rings and trigonal planar carbon-carbon double bonds.
- Returns:
PoseBusters results dictionary.
Identity
- posebusters.modules.identity.check_identity(mol_pred: Mol, mol_true: Mol, inchi_options: str = '') → dict[str, Any]
Check that two molecules are identical (docking-relevant identity).
- Parameters:
mol_pred – Predicted molecule (docked ligand).
mol_true – Ground truth molecule (crystal ligand) with a conformer.
inchi_options – String of options to pass to the InChI module. Defaults to “”.
- Returns:
PoseBusters results dictionary.
Intermolecular Distance
- posebusters.modules.intermolecular_distance.check_intermolecular_distance(mol_pred: Mol, mol_cond: Mol, radius_type: str = 'vdw', radius_scale: float = 1.0, clash_cutoff: float = 0.75, ignore_types: set[str] = {'hydrogens'}, max_distance: float = 5.0, search_distance: float = 6.0) → dict[str, Any]
Check that the predicted molecule is neither too close to nor too far away from the conditioning molecule.
- Parameters:
mol_pred – Predicted molecule (docked ligand) with one conformer.
mol_cond – Conditioning molecule (protein) with one conformer.
radius_type – Type of atomic radius to use. Possible values are “vdw” (van der Waals) and “covalent”. Defaults to “vdw”.
radius_scale – Scaling factor for the atomic radii. Defaults to 1.0.
clash_cutoff – Threshold for how much the atoms may overlap before a clash is reported. Defaults to 0.75.
ignore_types – Which types of atoms to ignore in mol_cond. Possible values to include are “hydrogens”, “protein”, “organic_cofactors”, “inorganic_cofactors”, “waters”. Defaults to {“hydrogens”}.
max_distance – Maximum distance (in Angstrom) that the predicted and conditioning molecules may be apart to be considered valid. Defaults to 5.0.
- Returns:
PoseBusters results dictionary.
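To make the interplay of radius_scale and clash_cutoff concrete, the sketch below assumes that a pair of atoms clashes when its distance, divided by the scaled sum of the two atomic radii, falls below clash_cutoff. That rule is an assumption for illustration rather than a statement of the library's code. The radii are Bondi van der Waals values for three elements, and pair_clashes is a hypothetical helper.

```python
import math

# Bondi van der Waals radii in Angstrom for a few elements (illustrative subset).
VDW_RADII = {"C": 1.70, "N": 1.55, "O": 1.52}


def pair_clashes(atom_a, atom_b, radius_scale=1.0, clash_cutoff=0.75):
    """Each atom is (element, (x, y, z)). Assumed rule: relative distance below cutoff."""
    (elem_a, xyz_a), (elem_b, xyz_b) = atom_a, atom_b
    sum_radii = radius_scale * (VDW_RADII[elem_a] + VDW_RADII[elem_b])
    relative_distance = math.dist(xyz_a, xyz_b) / sum_radii
    return relative_distance < clash_cutoff


ligand_carbon = ("C", (0.0, 0.0, 0.0))
protein_oxygen = ("O", (2.0, 0.0, 0.0))
# 2.0 / (1.70 + 1.52) is about 0.62, which is below 0.75.
print(pair_clashes(ligand_carbon, protein_oxygen))  # True
```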
Loading
- posebusters.modules.loading.check_loading(mol_pred: Any = None, mol_true: Any = None, mol_cond: Any = None) → dict[str, dict[str, bool]]
Check that molecule files were loaded successfully.
- Parameters:
mol_pred – Predicted molecule. Defaults to None.
mol_true – Ground truth molecule. Defaults to None.
mol_cond – Conditioning molecule. Defaults to None.
- Returns:
PoseBusters results dictionary.
RMSD
- posebusters.modules.rmsd.check_rmsd(mol_pred: Mol, mol_true: Mol, rmsd_threshold: float = 2.0) → dict[str, dict[str, bool | float]]
Calculate RMSD and related metrics between predicted molecule and closest ground truth molecule.
- Parameters:
mol_pred – Predicted molecule (docked ligand) with exactly one conformer.
mol_true – Ground truth molecule (crystal ligand) with at least one conformer. If multiple conformers are present, the lowest RMSD will be reported.
rmsd_threshold – Threshold in angstrom for reporting whether RMSD is within threshold. Defaults to 2.0.
- Returns:
PoseBusters results dictionary.
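The "lowest RMSD over several reference conformers" behaviour can be sketched in plain Python. This simplified version assumes identical atom ordering, performs no alignment and no symmetry correction, and treats an RMSD equal to the threshold as passing; the library's own calculation may differ in all of these respects. Both function names are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Plain RMSD between two equally ordered coordinate lists (no alignment)."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    squared = [math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b)]
    return math.sqrt(sum(squared) / len(squared))


def best_rmsd(pred_coords, true_conformers, rmsd_threshold=2.0):
    """Return the lowest RMSD over all reference conformers and whether it passes."""
    lowest = min(rmsd(pred_coords, conf) for conf in true_conformers)
    return lowest, lowest <= rmsd_threshold


pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
conformers = [
    [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)],  # shifted by 3 Angstrom
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],  # shifted by 1 Angstrom
]
print(best_rmsd(pred, conformers))  # (1.0, True)
```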
Volume Overlap
- posebusters.modules.volume_overlap.check_volume_overlap(mol_pred: Mol, mol_cond: Mol, clash_cutoff: float = 0.05, vdw_scale: float = 0.8, ignore_types: set[str] = {'hydrogens'}, search_distance: float = 6.0) → dict[str, dict]
Check volume overlap between ligand and protein.
- Parameters:
mol_pred – Predicted molecule (docked ligand) with one conformer.
mol_cond – Conditioning molecule (protein) with one conformer.
clash_cutoff – Threshold for how much volume overlap is allowed. This is the maximum share of volume of mol_pred allowed to overlap with mol_cond. Defaults to 0.05.
vdw_scale – Scaling factor for the van der Waals radii which define the volume around each atom. Defaults to 0.8.
ignore_types – Which types of atoms in mol_cond to ignore. Possible values to include are “hydrogens”, “protein”, “organic_cofactors”, “inorganic_cofactors”, “waters”. Defaults to {“hydrogens”}.
- Returns:
PoseBusters results dictionary.