User API examples
Setup python environment and install posebusters to run this notebook.
conda create -n posebusters python=3.10 jupyter notebook
conda activate posebusters
pip install posebusters --upgrade
[1]:
from posebusters import PoseBusters
from pathlib import Path
[2]:
pred_file = Path("inputs/generated_molecules.sdf") # predicted or generated molecules
true_file = Path("inputs/crystal_ligand.sdf") # "ground truth" molecules
cond_file = Path("inputs/protein.pdb") # conditioning molecule
PoseBusters default configs
redock
The `redock’ mode is for ligands docked into their cognate receptor crystal structures.
[3]:
# by default only the binary test report columns are returned
buster = PoseBusters(config="redock")
df = buster.bust([pred_file], true_file, cond_file)
print(df.shape)
df
[20:53:44] WARNING: Problems/mismatches: Mobile-H( Hydrogens: Number; Mobile-H groups: Falsely present, Attachment points; Charge(s): Do not match)
(3, 25)
[3]:
mol_pred_loaded | mol_true_loaded | mol_cond_loaded | sanitization | all_atoms_connected | molecular_formula | molecular_bonds | double_bond_stereochemistry | tetrahedral_chirality | bond_lengths | ... | protein-ligand_maximum_distance | minimum_distance_to_protein | minimum_distance_to_organic_cofactors | minimum_distance_to_inorganic_cofactors | minimum_distance_to_waters | volume_overlap_with_protein | volume_overlap_with_organic_cofactors | volume_overlap_with_inorganic_cofactors | volume_overlap_with_waters | rmsd_≤_2å | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
file | molecule | |||||||||||||||||||||
inputs/generated_molecules.sdf | molecule_1 | True | True | True | True | True | False | False | True | False | False | ... | False | True | True | True | True | True | True | True | True | False |
molecule_2 | True | True | True | True | True | False | False | True | True | True | ... | False | True | True | True | True | True | True | True | True | False | |
molecule_3 | True | True | True | True | True | False | False | True | True | True | ... | False | True | True | True | True | True | True | True | True | False |
3 rows × 25 columns
dock
The dock
mode is for de-novo generated molecules for a given receptor or for ligands docked into a non-cognate receptor.
[4]:
buster = PoseBusters(config="dock")
df = buster.bust([pred_file], true_file, cond_file)
print(df.shape)
df
(3, 19)
[4]:
mol_pred_loaded | mol_cond_loaded | sanitization | all_atoms_connected | bond_lengths | bond_angles | internal_steric_clash | aromatic_ring_flatness | double_bond_flatness | internal_energy | protein-ligand_maximum_distance | minimum_distance_to_protein | minimum_distance_to_organic_cofactors | minimum_distance_to_inorganic_cofactors | minimum_distance_to_waters | volume_overlap_with_protein | volume_overlap_with_organic_cofactors | volume_overlap_with_inorganic_cofactors | volume_overlap_with_waters | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
file | molecule | |||||||||||||||||||
inputs/generated_molecules.sdf | molecule_1 | True | True | True | True | False | False | True | True | True | True | False | True | True | True | True | True | True | True | True |
molecule_2 | True | True | True | True | True | True | True | True | True | True | False | True | True | True | True | True | True | True | True | |
molecule_3 | True | True | True | True | True | True | True | True | True | True | False | True | True | True | True | True | True | True | True |
mol
The mol
mode is for de-novo generated molecules or for generated molecular conformations.
[5]:
buster = PoseBusters(config="mol")
df = buster.bust([pred_file], None, None)
print(df.shape)
df
(3, 9)
[5]:
mol_pred_loaded | sanitization | all_atoms_connected | bond_lengths | bond_angles | internal_steric_clash | aromatic_ring_flatness | double_bond_flatness | internal_energy | ||
---|---|---|---|---|---|---|---|---|---|---|
file | molecule | |||||||||
inputs/generated_molecules.sdf | molecule_1 | True | True | True | False | False | True | True | True | True |
molecule_2 | True | True | True | True | True | True | True | True | True | |
molecule_3 | True | True | True | True | True | True | True | True | True |
Output formatting
full report
The full_report
option of bust
will return all columns of the test reports, not only the binary columns. This is useful for debugging and for further analysis of the results.
[6]:
buster = PoseBusters(config="mol")
df = buster.bust([pred_file], None, None, full_report=True)
print(df.shape)
df
(3, 36)
[6]:
mol_pred_loaded | sanitization | all_atoms_connected | bond_lengths | bond_angles | internal_steric_clash | aromatic_ring_flatness | double_bond_flatness | internal_energy | mol_true_loaded | ... | number_valid_noncov_pairs | number_aromatic_rings_checked | number_aromatic_rings_pass | aromatic_ring_maximum_distance_from_plane | number_double_bonds_checked | number_double_bonds_pass | double_bond_maximum_distance_from_plane | ensemble_avg_energy | mol_pred_energy | energy_ratio | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
file | molecule | |||||||||||||||||||||
inputs/generated_molecules.sdf | molecule_1 | True | True | True | False | False | True | True | True | True | False | ... | 235 | 0 | 0 | NaN | 0 | 0 | NaN | 608.632249 | 1377.869730 | 2.263879 |
molecule_2 | True | True | True | True | True | True | True | True | True | False | ... | 260 | 1 | 1 | 0.078919 | 0 | 0 | NaN | 259.822517 | 253.508495 | 0.975699 | |
molecule_3 | True | True | True | True | True | True | True | True | True | False | ... | 257 | 1 | 1 | 0.091866 | 0 | 0 | NaN | 301.579457 | 341.205292 | 1.131394 |
3 rows × 36 columns