Python API

PoseBusters class

The PoseBusters class collects the molecules to test, runs the modules, and reports the test results.

class posebusters.PoseBusters(config: str | dict[str, Any] = 'redock', top_n: int | None = None)

Class to run all tests on a set of molecules.

bust(mol_pred: Iterable[Mol | Path] | Mol | Path, mol_true: Mol | Path | None = None, mol_cond: Mol | Path | None = None, full_report: bool = False) -> pd.DataFrame

Run tests on one or more molecules.

Parameters:
  • mol_pred – Generated molecule(s), e.g. de-novo generated molecule or docked ligand, with one or more poses.

  • mol_true – True molecule, e.g. crystal ligand, with one or more poses.

  • mol_cond – Conditioning molecule, e.g. protein.

  • full_report – Whether to include all columns in the output or only the boolean ones specified in the config.

Notes

  • Molecules can be provided as RDKit molecule objects or as file paths.

Returns:

Pandas dataframe with results.
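A minimal usage sketch of bust, based only on the signature above. The file names are placeholders rather than files shipped with the package, and the import is deferred into the function so that the snippet loads even where PoseBusters is not installed.

```python
from pathlib import Path


def run_redock_check(pred_file: Path, true_file: Path, cond_file: Path):
    """Run the 'redock' configuration on one predicted ligand file."""
    # Deferred import: requires the posebusters package to be installed.
    from posebusters import PoseBusters

    buster = PoseBusters(config="redock")
    # mol_pred accepts a single molecule or path, or an iterable of them.
    return buster.bust(
        mol_pred=[pred_file],
        mol_true=true_file,
        mol_cond=cond_file,
        full_report=False,
    )


# Hypothetical input files; replace with your own paths.
EXAMPLE_INPUTS = {
    "pred_file": Path("docked_ligand.sdf"),
    "true_file": Path("crystal_ligand.sdf"),
    "cond_file": Path("protein.pdb"),
}
# results = run_redock_check(**EXAMPLE_INPUTS)  # a pandas DataFrame
```

Setting full_report=True in the call would return all columns instead of only the boolean test outcomes.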

bust_table(mol_table: DataFrame, full_report: bool = False) -> DataFrame

Run tests on molecules provided in a pandas dataframe as paths or RDKit molecule objects.

Parameters:
  • mol_table – Pandas dataframe with columns “mol_pred”, “mol_true”, “mol_cond” containing paths to molecules or RDKit molecule objects.

  • full_report – Whether to include all columns in the output or only the boolean ones specified in the config.

Returns:

Pandas dataframe with results.
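The table passed to bust_table could be assembled as sketched below. The helper names and file names are hypothetical; only the three column names and the bust_table signature come from the description above. The pandas and PoseBusters imports are deferred so that the record-building part runs on its own.

```python
from pathlib import Path


def make_mol_records(pred_files, true_file, cond_file):
    """Build one record per predicted file, all sharing the same references."""
    return [
        {"mol_pred": Path(pred), "mol_true": Path(true_file), "mol_cond": Path(cond_file)}
        for pred in pred_files
    ]


def run_table_check(records, full_report=False):
    """Convert the records to a dataframe and run bust_table on it."""
    # Deferred imports: require pandas and posebusters to be installed.
    import pandas as pd
    from posebusters import PoseBusters

    mol_table = pd.DataFrame(records, columns=["mol_pred", "mol_true", "mol_cond"])
    return PoseBusters(config="redock").bust_table(mol_table, full_report=full_report)


# Hypothetical file names, for illustration only.
records = make_mol_records(["pose_1.sdf", "pose_2.sdf"], "crystal_ligand.sdf", "protein.pdb")
# results = run_table_check(records)
```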

Modules

A PoseBusters module is a function that takes one or more of mol_pred, mol_true, and mol_cond as input and returns one or more test results as a dictionary.

Inputs
  • Must take at least one of mol_pred, mol_true, and mol_cond, provided as RDKit molecules.

  • Other inputs are parameters for which default values must be specified.

Outputs
  • The output must be a dictionary with at least the results entry.

  • The results entry contains a dictionary whose keys are the test names and whose values are the test outcomes.

  • Other output entries may contain further results, e.g. lengths and bounds for all bonds in the ligand.

Distance Geometry

posebusters.modules.distance_geometry.check_geometry(mol_pred: Mol, threshold_bad_bond_length: float = 0.2, threshold_clash: float = 0.2, threshold_bad_angle: float = 0.2, bound_matrix_params: dict[str, Any] = {'doTriangleSmoothing': True, 'scaleVDW': True, 'set15bounds': True, 'useMacrocycle14config': False}, ignore_hydrogens: bool = True, sanitize: bool = True) -> dict[str, Any]

Use RDKit distance geometry bounds to check the geometry of a molecule.

Parameters:
  • mol_pred – Predicted molecule (docked ligand). Only the first conformer will be checked.

  • threshold_bad_bond_length – Bond length threshold in relative percentage. 0.2 means that bonds may be up to 20% longer than DG bounds. Defaults to 0.2.

  • threshold_clash – Threshold for how much overlap constitutes a clash. 0.2 means that two atoms may be as close as 80% of the DG lower bound before a clash is reported. Defaults to 0.2.

  • threshold_bad_angle – Bond angle threshold in relative percentage. 0.2 means that bond angles may be up to 20% outside the DG bounds. Defaults to 0.2.

  • bound_matrix_params – Parameters passed to RDKit’s GetMoleculeBoundsMatrix function.

  • ignore_hydrogens – Whether to ignore hydrogens. Defaults to True.

  • sanitize – Sanitize molecule before running DG module (recommended). Defaults to True.

Returns:

PoseBusters results dictionary.
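The relative thresholds can be read as the following arithmetic. This sketch assumes the tolerance is applied multiplicatively to the DG bounds and symmetrically to both the lower and the upper bound; the function names are hypothetical and the code is not taken from the module itself.

```python
def bond_length_ok(distance, lower_bound, upper_bound, threshold=0.2):
    """True if a bond length lies within the DG bounds relaxed by threshold."""
    relaxed_lower = lower_bound * (1 - threshold)
    relaxed_upper = upper_bound * (1 + threshold)
    return relaxed_lower <= distance <= relaxed_upper


def atoms_clash(distance, lower_bound, threshold=0.2):
    """True if two non-bonded atoms are closer than the relaxed lower bound."""
    return distance < lower_bound * (1 - threshold)


# Illustrative bounds in angstrom, not values computed by RDKit.
print(bond_length_ok(1.70, lower_bound=1.50, upper_bound=1.54))
print(bond_length_ok(1.90, lower_bound=1.50, upper_bound=1.54))
print(atoms_clash(2.3, lower_bound=3.0))
```

With a threshold of 0.2, an upper bound of 1.54 is relaxed to roughly 1.85, so a bond of 1.70 passes and one of 1.90 fails.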

Energy Ratio

posebusters.modules.energy_ratio.check_energy_ratio(mol_pred: Mol, threshold_energy_ratio: float = 7.0, ensemble_number_conformations: int = 100)

Check whether the energy of the docked ligand is within a user-defined range.

Parameters:
  • mol_pred – Predicted molecule (docked ligand) with exactly one conformer.

  • threshold_energy_ratio – Limit above which the energy ratio is deemed too high. Defaults to 7.0.

  • ensemble_number_conformations – Number of conformations to generate for the ensemble over which to average. Defaults to 100.

Returns:

PoseBusters results dictionary.
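A sketch of the comparison, assuming the energy ratio is the energy of the docked pose divided by the mean energy of the generated ensemble. The energies below are made-up numbers, and the function names are hypothetical; how the module computes the energies is not covered here.

```python
from statistics import mean


def energy_ratio(pose_energy, ensemble_energies):
    """Energy of the pose relative to the mean energy of the ensemble."""
    return pose_energy / mean(ensemble_energies)


def energy_ratio_ok(pose_energy, ensemble_energies, threshold_energy_ratio=7.0):
    """True if the energy ratio does not exceed the threshold."""
    return energy_ratio(pose_energy, ensemble_energies) <= threshold_energy_ratio


# Made-up energies in arbitrary but consistent units.
ensemble = [20.0, 30.0, 40.0]
print(energy_ratio(120.0, ensemble), energy_ratio_ok(120.0, ensemble))
print(energy_ratio(300.0, ensemble), energy_ratio_ok(300.0, ensemble))
```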

Flatness

posebusters.modules.flatness.check_flatness(mol_pred: Mol, threshold_flatness: float = 0.1, flat_systems: dict[str, str] = {'aromatic_5_membered_rings_sp2': '[ar5^2]1[ar5^2][ar5^2][ar5^2][ar5^2]1', 'aromatic_6_membered_rings_sp2': '[ar6^2]1[ar6^2][ar6^2][ar6^2][ar6^2][ar6^2]1', 'trigonal_planar_double_bonds': '[C;X3;^2](*)(*)=[C;X3;^2](*)(*)'}) -> dict[str, Any]

Check whether substructures of the molecule are flat.

Parameters:
  • mol_pred – Molecule with exactly one conformer.

  • threshold_flatness – Maximum distance from shared plane used as cutoff. Defaults to 0.1.

  • flat_systems – Patterns of flat systems provided as SMARTS. Defaults to 5- and 6-membered aromatic rings and trigonal planar carbon-carbon double bonds.

Returns:

PoseBusters results dictionary.

Identity

posebusters.modules.identity.check_identity(mol_pred: Mol, mol_true: Mol, inchi_options: str = '') -> dict[str, Any]

Check that two molecules are identical (docking-relevant identity).

Parameters:
  • mol_pred – Predicted molecule (docked ligand).

  • mol_true – Ground truth molecule (crystal ligand) with a conformer.

  • inchi_options – String of options to pass to the InChI module. Defaults to “”.

Returns:

PoseBusters results dictionary.

Intermolecular Distance

posebusters.modules.intermolecular_distance.check_intermolecular_distance(mol_pred: Mol, mol_cond: Mol, radius_type: str = 'vdw', radius_scale: float = 1.0, clash_cutoff: float = 0.75, ignore_types: set[str] = {'hydrogens'}, max_distance: float = 5.0, search_distance: float = 6.0) -> dict[str, Any]

Check that the predicted molecule is neither too close to nor too far away from the conditioning molecule.

Parameters:
  • mol_pred – Predicted molecule (docked ligand) with one conformer.

  • mol_cond – Conditioning molecule (protein) with one conformer.

  • radius_type – Type of atomic radius to use. Possible values are “vdw” (van der Waals) and “covalent”. Defaults to “vdw”.

  • radius_scale – Scaling factor for the atomic radii. Defaults to 1.0.

  • clash_cutoff – Threshold for how much the atoms may overlap before a clash is reported. Defaults to 0.75.

  • ignore_types – Which types of atoms to ignore in mol_cond. Possible values to include are “hydrogens”, “protein”, “organic_cofactors”, “inorganic_cofactors”, “waters”. Defaults to {“hydrogens”}.

  • max_distance – Maximum distance (in angstrom) that the predicted and conditioning molecules may be apart to be considered valid. Defaults to 5.0.

Returns:

PoseBusters results dictionary.
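One way to read radius_scale and clash_cutoff together is sketched below. It assumes that a pair of atoms clashes when their distance, divided by the scaled sum of their radii, falls below clash_cutoff, and that the pose is close enough when the smallest ligand-protein distance does not exceed max_distance. Both rules and all function names are assumptions for illustration. The radii are Bondi van der Waals values in angstrom.

```python
# Bondi van der Waals radii in angstrom for a few elements.
VDW_RADII = {"C": 1.70, "N": 1.55, "O": 1.52}


def relative_distance(distance, element_a, element_b, radius_scale=1.0):
    """Distance divided by the scaled sum of the two atomic radii."""
    sum_radii = VDW_RADII[element_a] + VDW_RADII[element_b]
    return distance / (radius_scale * sum_radii)


def is_clash(distance, element_a, element_b, radius_scale=1.0, clash_cutoff=0.75):
    """True if the relative distance falls below the clash cutoff."""
    return relative_distance(distance, element_a, element_b, radius_scale) < clash_cutoff


def within_reach(min_distance, max_distance=5.0):
    """True if the closest ligand-protein contact is not farther than max_distance."""
    return min_distance <= max_distance


# A carbon and an oxygen atom: the radii sum to 3.22 angstrom.
print(is_clash(3.0, "C", "O"))  # relative distance about 0.93
print(is_clash(2.0, "C", "O"))  # relative distance about 0.62
```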

Loading

posebusters.modules.loading.check_loading(mol_pred: Any = None, mol_true: Any = None, mol_cond: Any = None) -> dict[str, dict[str, bool]]

Check that molecule files were loaded successfully.

Parameters:
  • mol_pred – Predicted molecule. Defaults to None.

  • mol_true – Ground truth molecule. Defaults to None.

  • mol_cond – Conditioning molecule. Defaults to None.

Returns:

PoseBusters results dictionary.
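Given the return type dict[str, dict[str, bool]], the result plausibly holds one boolean per input. The sketch below assumes that a molecule that failed to load is represented by None; the function name and the result keys are hypothetical.

```python
from typing import Any


def check_loading_sketch(
    mol_pred: Any = None, mol_true: Any = None, mol_cond: Any = None
) -> dict[str, dict[str, bool]]:
    """Report, per input, whether a molecule object is present."""
    results = {
        # Key names are assumptions chosen for this illustration.
        "mol_pred_loaded": mol_pred is not None,
        "mol_true_loaded": mol_true is not None,
        "mol_cond_loaded": mol_cond is not None,
    }
    return {"results": results}


# Any non-None object counts as loaded in this sketch.
report = check_loading_sketch(mol_pred=object(), mol_cond=object())
```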

RMSD

posebusters.modules.rmsd.check_rmsd(mol_pred: Mol, mol_true: Mol, rmsd_threshold: float = 2.0) -> dict[str, dict[str, bool | float]]

Calculate RMSD and related metrics between predicted molecule and closest ground truth molecule.

Parameters:
  • mol_pred – Predicted molecule (docked ligand) with exactly one conformer.

  • mol_true – Ground truth molecule (crystal ligand) with at least one conformer. If multiple conformers are present, the lowest RMSD will be reported.

  • rmsd_threshold – Threshold in angstrom for reporting whether RMSD is within threshold. Defaults to 2.0.

Returns:

PoseBusters results dictionary.
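The "lowest RMSD over all ground truth conformers" rule can be sketched with plain coordinate lists. This simplified version assumes that atoms are listed in the same order in every structure and performs no alignment or symmetry handling; the function names and result keys are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally ordered lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b):
        raise ValueError("Both structures must have the same number of atoms.")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def check_rmsd_sketch(pred_coords, true_conformers, rmsd_threshold=2.0):
    """Report the lowest RMSD over all ground truth conformers."""
    best = min(rmsd(pred_coords, conformer) for conformer in true_conformers)
    return {"results": {"rmsd": best, "rmsd_within_threshold": best <= rmsd_threshold}}


# Two-atom toy structures; coordinates in angstrom.
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
true_conformers = [
    [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)],  # shifted by 3 along z
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],  # shifted by 1 along z
]
result = check_rmsd_sketch(pred, true_conformers)
```

Here the second conformer is the closest one, so the reported RMSD is 1.0 and lies within the default threshold of 2.0.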

Volume Overlap

posebusters.modules.volume_overlap.check_volume_overlap(mol_pred: Mol, mol_cond: Mol, clash_cutoff: float = 0.05, vdw_scale: float = 0.8, ignore_types: set[str] = {'hydrogens'}, search_distance: float = 6.0) -> dict[str, dict]

Check volume overlap between ligand and protein.

Parameters:
  • mol_pred – Predicted molecule (docked ligand) with one conformer.

  • mol_cond – Conditioning molecule (protein) with one conformer.

  • clash_cutoff – Threshold for how much volume overlap is allowed. This is the maximum share of volume of mol_pred allowed to overlap with mol_cond. Defaults to 0.05.

  • vdw_scale – Scaling factor for the van der Waals radii which define the volume around each atom. Defaults to 0.8.

  • ignore_types – Which types of atoms in mol_cond to ignore. Possible values to include are “hydrogens”, “protein”, “organic_cofactors”, “inorganic_cofactors”, “waters”. Defaults to {“hydrogens”}.

Returns:

PoseBusters results dictionary.