PoseBusters: Plausibility checks for generated molecule poses.

PoseBusters is a command line tool and Python library to check the output of molecular conformation generators, docking programs, or other programs that generate molecules in 3D.

Quick start

PoseBusters can be installed from PyPI.

>>> pip install posebusters

Use the bust command to check molecules, docked ligands or generated molecules conditioned on a protein.

>>> bust redocked_ligand.sdf -l crystal_ligand.sdf -p protein.pdb --outfmt short
redocked_ligand.sdf redocked_ligand  passes (25 / 25)
...
>>> bust redocked_ligand.sdf -l crystal_ligand.sdf -p protein.pdb --outfmt long
Long summary for redocked_ligand.sdf redocked_ligand
MOL_PRED loaded                          .   
MOL_TRUE loaded                          .   
MOL_COND loaded                          .   
Sanitization                             .   
All atoms connected                      .   
Molecular formula                        .   
Molecular bonds                          .   
Double bond stereochemistry              .   
Tetrahedral chirality                    .   
Bond lengths                             .   
Bond angles                              .   
...
>>> bust redocked_ligand.sdf -l crystal_ligand.sdf -p protein.pdb --outfmt csv
file,molecule,mol_pred_loaded,mol_true_loaded,mol_cond_loaded,sanitization,all_atoms_connected,molecular_formula,molecular_bonds,double_bond_stereochemistry,tetrahedral_chirality,bond_lengths,bond_angles,internal_steric_clash,aromatic_ring_flatness,double_bond_flatness,internal_energy,protein-ligand_maximum_distance,minimum_distance_to_protein,minimum_distance_to_organic_cofactors,minimum_distance_to_inorganic_cofactors,minimum_distance_to_waters,volume_overlap_with_protein,volume_overlap_with_organic_cofactors,volume_overlap_with_inorganic_cofactors,volume_overlap_with_waters,rmsd_≤_2å
redocked_ligand.sdf,redocked_ligand,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True
...

For more usage examples, bulk processing, and the Python API see the documentation for the command line tool and the Python library.

Docking methods comparison

For more detailed information about the tests and for a study using PoseBusters to compare docking methods, refer to our preprint:

@online{buttenschoen2023posebusters,
   title = {{{PoseBusters}}: {{AI-based}} Docking Methods Fail to Generate Physically Valid Poses or Generalise to Novel Sequences},
   shorttitle = {{{PoseBusters}}},
   author = {Buttenschoen, Martin and Morris, Garrett M. and Deane, Charlotte M.},
   date = {2023-08-10},
   eprint = {2308.05777},
   eprinttype = {arxiv}
}

Sample checks

For more information on the checks, see Checks.

Steric clash

Bad:

Internal clash fail

Molecule intertwined and atoms clashing

Good:

Internal clash true

Aromatic rings flatness

Bad:

Flat aromatics fail

Conjugated pi bond systems should be flat

Good:

Flat aromatics true

Volume overlap

Bad:

Volume overlap fail

Ligand and receptor clash

Good:

Volume overlap true

For more information on the checks, see Checks.

Contents

Checks

Example failure modes

In molecular conformation generation, docking, and de-novo molecular generation the generated molecules conformation should have a reasonable geometry including standard bond lengths and angles and no steric clash.

Bond lengths

Bond lengths fail

Bottom left carbon-oxygen bond too short

Bond lengths true

Bond angles

Bond angles fail

Bond angles off and atoms clashing

Bond angles true

Steric clash

Internal clash fail

Molecule intertwined and atoms clashing

Internal clash true

High energy conformation

Energy ratio fail

Twisted rings energetically unfavorable

Energy ratio true

Aromatic rings not flat

Flat aromatics fail

Conjugated pi bond systems should be flat

Flat aromatics true

In docking the molecular identity should be preserved including stereochemistry.

Tetrahedral stereochemistry changed

Tetrahedral stereochemistry fail

Top right oxygen facing the wrong way

Tetrahedral stereochemistry true

Double bond stereochemistry changed

Double bond stereochemistry fail

Right most double bond should be cis

Double bond stereochemistry true

In de-novo molecular generator or docking the generated molecule should be placed with in a receptor’s binding pocket without any steric clash.

Volume overlap

Volume overlap fail

Ligand and receptor clash

Volume overlap true

More details on tests and docking method comparison

For more detailed information about the tests and for a study using PoseBusters to compare docking methods, refer to our preprint:

@online{buttenschoen2023posebusters,
title = {{{PoseBusters}}: {{AI-based}} Docking Methods Fail to Generate Physically Valid Poses or Generalise to Novel Sequences},
shorttitle = {{{PoseBusters}}},
author = {Buttenschoen, Martin and Morris, Garrett M. and Deane, Charlotte M.},
date = {2023-08-10},
eprint = {2308.05777},
eprinttype = {arxiv}
}

Command line interface

PoseBusters provides the command bust for checking generated molecules and optionally taking a conditioning protein or ligands into account.

Use bust to check a series of molecules within one .sdf file.

>>> bust generated_molecules.sdf --outfmt long
Long summary for generated_molecules.sdf molecule_1
MOL_PRED loaded         .   
Sanitization            .   
All atoms connected     .   
Bond lengths            Fail
Bond angles             Fail
Internal steric clash   .   
Aromatic ring flatness  .   
Double bond flatness    .   
Internal energy         .   
Long summary for generated_molecules.sdf molecule_2
MOL_PRED loaded         .   
Sanitization            .   
All atoms connected     .   
Bond lengths            .   
Bond angles             .   
Internal steric clash   .   
Aromatic ring flatness  .   
Double bond flatness    .   
Internal energy         .   
Long summary for generated_molecules.sdf molecule_3
...

Check a docked ligand or generated molecule conditioned on a protein.

>>> bust generated_ligands.sdf -p protein.pdb --outfmt long
Long summary for generated_ligands.sdf conformation_1
MOL_PRED loaded                          .   
MOL_COND loaded                          .   
Sanitization                             .   
All atoms connected                      .   
Bond lengths                             .   
Bond angles                              .   
Internal steric clash                    .   
Aromatic ring flatness                   .   
Double bond flatness                     .   
Internal energy                          .   
Protein-ligand maximum distance          .   
Minimum distance to protein              .   
Minimum distance to organic cofactors    .   
Minimum distance to inorganic cofactors  .   
Minimum distance to waters               .   
Volume overlap with protein              .   
Volume overlap with organic cofactors    .   
Volume overlap with inorganic cofactors  .   
Volume overlap with waters               .   
Long summary for generated_ligands.sdf conformation_2
...

Check a series of re-docked ligands against the crystal ligand and protein.

>>> bust redocked_ligand.sdf -l crystal_ligand.sdf -p protein.pdb --outfmt long
Long summary for redocked_ligand.sdf redocked_ligand
MOL_PRED loaded                          .   
MOL_TRUE loaded                          .   
MOL_COND loaded                          .   
Sanitization                             .   
All atoms connected                      .   
Molecular formula                        .   
Molecular bonds                          .   
Double bond stereochemistry              .   
Tetrahedral chirality                    .   
Bond lengths                             .   
Bond angles                              .   
Internal steric clash                    .   
Aromatic ring flatness                   .   
Double bond flatness                     .   
Internal energy                          .   
Protein-ligand maximum distance          .   
Minimum distance to protein              .   
Minimum distance to organic cofactors    .   
Minimum distance to inorganic cofactors  .   
Minimum distance to waters               .   
...

Use the -t option to bulk check multiple sets of files.

>>> bust -t molecule_table.csv --outfmt long
Long summary for 1ia1/1ia1_ligand.sdf TQ3
MOL_PRED loaded                          .   
MOL_TRUE loaded                          .   
MOL_COND loaded                          .   
Sanitization                             .   
All atoms connected                      .   
Molecular formula                        .   
Molecular bonds                          .   
Double bond stereochemistry              .   
Tetrahedral chirality                    .   
Bond lengths                             .   
Bond angles                              .   
Internal steric clash                    .   
Aromatic ring flatness                   .   
Double bond flatness                     .   
Internal energy                          .   
Protein-ligand maximum distance          .   
Minimum distance to protein              .   
Minimum distance to organic cofactors    .   
Minimum distance to inorganic cofactors  .   
Minimum distance to waters               .   
...

Output format options

The short format is the default output format.

>>> bust generated_molecules.sdf --outfmt short
generated_molecules.sdf molecule_1  passes (7 / 9)
generated_molecules.sdf molecule_2  passes (9 / 9)
generated_molecules.sdf molecule_3  passes (9 / 9)

The long format lists each test result for each molecule/conformation.

>>> bust generated_molecules.sdf --outfmt long
Long summary for generated_molecules.sdf molecule_1
MOL_PRED loaded         .   
Sanitization            .   
All atoms connected     .   
Bond lengths            Fail
Bond angles             Fail
Internal steric clash   .   
Aromatic ring flatness  .   
Double bond flatness    .   
Internal energy         .   
Long summary for generated_molecules.sdf molecule_2
MOL_PRED loaded         .   
Sanitization            .   
All atoms connected     .   
Bond lengths            .   
Bond angles             .   
Internal steric clash   .   
Aromatic ring flatness  .   
Double bond flatness    .   
Internal energy         .   
Long summary for generated_molecules.sdf molecule_3
...

For copying and saving the output use the csv option.

>>> bust generated_molecules.sdf --outfmt csv
file,molecule,mol_pred_loaded,sanitization,all_atoms_connected,bond_lengths,bond_angles,internal_steric_clash,aromatic_ring_flatness,double_bond_flatness,internal_energy
generated_molecules.sdf,molecule_1,True,True,True,False,False,True,True,True,True
generated_molecules.sdf,molecule_2,True,True,True,True,True,True,True,True,True
generated_molecules.sdf,molecule_3,True,True,True,True,True,True,True,True,True

For the csv and long option, the --full-report option can be used to show extra information beyond the pass/fail status of each test.

>>> bust generated_molecules.sdf --outfmt csv --full-report
file,molecule,mol_pred_loaded,sanitization,all_atoms_connected,bond_lengths,bond_angles,internal_steric_clash,aromatic_ring_flatness,double_bond_flatness,internal_energy,mol_true_loaded,mol_cond_loaded,passes_valence_checks,passes_kekulization,number_bonds,shortest_bond_relative_length,longest_bond_relative_length,number_short_outlier_bonds,number_long_outlier_bonds,number_angles,most_extreme_relative_angle,number_outlier_angles,number_noncov_pairs,shortest_noncovalent_relative_distance,number_clashes,number_valid_bonds,number_valid_angles,number_valid_noncov_pairs,number_aromatic_rings_checked,number_aromatic_rings_pass,aromatic_ring_maximum_distance_from_plane,number_double_bonds_checked,number_double_bonds_pass,double_bond_maximum_distance_from_plane,ensemble_avg_energy,mol_pred_energy,energy_ratio
generated_molecules.sdf,molecule_1,True,True,True,False,False,True,True,True,True,False,False,True,True,27,0.8567948530292413,1.2797521607839486,0,1,38,1.3092992455795367,2,235,0.7738421740224927,0,26,36,235,0,0,,0,0,,608.5563186553628,1377.8697297363487,2.264161405440378
generated_molecules.sdf,molecule_2,True,True,True,True,True,True,True,True,True,False,False,True,True,27,0.8951618398686666,1.0410778462688173,0,0,38,1.1090396233249729,0,260,0.9498773674061802,0,27,38,260,1,1,0.10944595440461957,0,0,,259.7528747776052,253.50849461438324,0.9759603039290007
generated_molecules.sdf,molecule_3,True,True,True,True,True,True,True,True,True,False,False,True,True,28,0.8337212356590878,1.0705087667583508,0,0,40,1.1485967900513152,0,257,0.9115215657793475,0,28,40,257,1,1,0.09186620947420678,0,0,,301.49094062530395,341.2052918845499,1.131726516149629
...

Saving the output

The --out option can be used to save the output to a file:

bust generated_molecules.sdf --outfmt csv --out results.csv

Help

Running with the --help option prints information about the command line options.

>>> bust --help
usage: bust [-l MOL_TRUE] [-p MOL_COND] [-t TABLE] [--outfmt {short,long,csv}]
            [--output OUTPUT] [--full-report] [--no-header] [--config CONFIG]
            [--top-n TOP_N] [-v] [-h]
            [mol_pred [mol_pred ...]]

PoseBusters: Plausibility checks for generated molecule poses.

Input:
  mol_pred              molecule(s) to check
  -l MOL_TRUE           true molecule, e.g. crystal ligand
  -p MOL_COND           conditioning molecule, e.g. protein
  -t TABLE              run multiple inputs listed in a .csv file

Output:
  --outfmt {short,long,csv}
                        output format
  --output OUTPUT       output file (default: stdout)
  --full-report         print details for each test
  --no-header           print output without header

Configuration:
  --config CONFIG       configuration file
  --top-n TOP_N         run on TOP_N results in MOL_PRED only (default: all)

Information:
  -v, --version         show program's version number and exit
  -h, --help            show this help message and exit
>>> bust --version
bust 0.2.5

Python API

PoseBusters class

The PoseBusters class collects the molecules to test, runs the modules, and reports the test results.

class posebusters.PoseBusters(config: str | dict[str, Any] = 'redock', top_n: int | None = None)

Class to run all tests on a set of molecules.

bust(mol_pred: Iterable[Mol | Path] | Mol | Path, mol_true: Mol | Path | None = None, mol_cond: Mol | Path | None = None, full_report: bool = False) pd.DataFrame

Run tests on one or more molecules.

Parameters:
  • mol_pred – Generated molecule(s), e.g. de-novo generated molecule or docked ligand, with one or more poses.

  • mol_true – True molecule, e.g. crystal ligand, with one or more poses.

  • mol_cond – Conditioning molecule, e.g. protein.

  • full_report – Whether to include all columns in the output or only the boolean ones specified in the config.

Notes

  • Molecules can be provided as rdkit molecule objects or file paths.

Returns:

Pandas dataframe with results.

bust_table(mol_table: DataFrame, full_report: bool = False) DataFrame

Run tests on molecules provided in pandas dataframe as paths or rdkit molecule objects.

Parameters:
  • mol_table – Pandas dataframe with columns “mol_pred”, “mol_true”, “mol_cond” containing paths to molecules.

  • full_report – Whether to include all columns in the output or only the boolean ones specified in the config.

Returns:

Pandas dataframe with results.

Modules

A PoseBusters module is a function that takes one or more of mol_pred, mol_true, and mol_cond as input and returns one or more test results as a dictionary.

Inputs
  • Must take one of mol_pred, mol_true, and mol_cond provided as RDKit molecules.

  • Other inputs are parameters for which default values must be specified.

Outputs
  • The output must be a dictionary with at least the results entry.

  • The results entry contains a dictionary with keys corresponding to the test names and the test outcomes.

  • Other output entries to contain further results e.g. lengths and bound for all bonds in ligand.

Distance Geometry
posebusters.modules.distance_geometry.check_geometry(mol_pred: Mol, threshold_bad_bond_length: float = 0.2, threshold_clash: float = 0.2, threshold_bad_angle: float = 0.2, bound_matrix_params: dict[str, Any] = {'doTriangleSmoothing': True, 'scaleVDW': True, 'set15bounds': True, 'useMacrocycle14config': False}, ignore_hydrogens: bool = True, sanitize: bool = True) dict[str, Any]

Use RDKit distance geometry bounds to check the geometry of a molecule.

Parameters:
  • mol_pred – Predicted molecule (docked ligand). Only the first conformer will be checked.

  • threshold_bad_bond_length – Bond length threshold in relative percentage. 0.2 means that bonds may be up to 20% longer than DG bounds. Defaults to 0.2.

  • threshold_clash – Threshold for how much overlap constitutes a clash. 0.2 means that the two atoms may be up to 80% of the lower bound apart. Defaults to 0.2.

  • threshold_bad_angle – Bond angle threshold in relative percentage. 0.2 means that bonds may be up to 20% longer than DG bounds. Defaults to 0.2.

  • bound_matrix_params – Parameters passe to RDKit’s GetMoleculeBoundsMatrix function.

  • ignore_hydrogens – Whether to ignore hydrogens. Defaults to True.

  • sanitize – Sanitize molecule before running DG module (recommended). Defaults to True.

Returns:

PoseBusters results dictionary.

Energy Ratio
posebusters.modules.energy_ratio.check_energy_ratio(mol_pred: Mol, threshold_energy_ratio: float = 7.0, ensemble_number_conformations: int = 100)

Check whether the energy of the docked ligand is within user defined range.

Parameters:
  • mol_pred – Predicted molecule (docked ligand) with exactly one conformer.

  • threshold_energy_ratio – Limit above which the energy ratio is deemed to high. Defaults to 7.0.

  • ensemble_number_conformations – Number of conformations to generate for the ensemble over which to average. Defaults to 100.

Returns:

PoseBusters results dictionary.

Flatness
posebusters.modules.flatness.check_flatness(mol_pred: Mol, threshold_flatness: float = 0.1, flat_systems: dict[str, str] = {'aromatic_5_membered_rings_sp2': '[ar5^2]1[ar5^2][ar5^2][ar5^2][ar5^2]1', 'aromatic_6_membered_rings_sp2': '[ar6^2]1[ar6^2][ar6^2][ar6^2][ar6^2][ar6^2]1', 'trigonal_planar_double_bonds': '[C;X3;^2](*)(*)=[C;X3;^2](*)(*)'}) dict[str, Any]

Check whether substructures of molecule are flat.

Parameters:
  • mol_pred – Molecule with exactly one conformer.

  • threshold_flatness – Maximum distance from shared plane used as cutoff. Defaults to 0.1.

  • flat_systems – Patterns of flat systems provided as SMARTS. Defaults to 5 and 6 membered aromatic rings and carbon sigma bonds.

Returns:

PoseBusters results dictionary.

Identity
posebusters.modules.identity.check_identity(mol_pred: Mol, mol_true: Mol, inchi_options: str = '') dict[str, Any]

Check two molecules are identical (docking relevant identity).

Parameters:
  • mol_pred – Predicted molecule (docked ligand).

  • mol_true – Ground truth molecule (crystal ligand) with a conformer.

  • inchi_options – String of options to pass to the InChI module. Defaults to “”.

Returns:

PoseBusters results dictionary.

Intermolecular Distance
posebusters.modules.intermolecular_distance.check_intermolecular_distance(mol_pred: Mol, mol_cond: Mol, radius_type: str = 'vdw', radius_scale: float = 1.0, clash_cutoff: float = 0.75, ignore_types: set[str] = {'hydrogens'}, max_distance: float = 5.0, search_distance: float = 6.0) dict[str, Any]

Check that predicted molecule is not too close and not too far away from conditioning molecule.

Parameters:
  • mol_pred – Predicted molecule (docked ligand) with one conformer.

  • mol_cond – Conditioning molecule (protein) with one conformer.

  • radius_type – Type of atomic radius to use. Possible values are “vdw” (van der Waals) and “covalent”. Defaults to “vdw”.

  • radius_scale – Scaling factor for the atomic radii. Defaults to 0.8.

  • clash_cutoff – Threshold for how much the atoms may overlap before a clash is reported. Defaults to 0.05.

  • ignore_types – Which types of atoms to ignore in mol_cond. Possible values to include are “hydrogens”, “protein”, “organic_cofactors”, “inorganic_cofactors”, “waters”. Defaults to {“hydrogens”}.

  • max_distance – Maximum distance (in Angstrom) predicted and conditioning molecule may be apart to be considered as valid. Defaults to 5.0.

Returns:

PoseBusters results dictionary.

Loading
posebusters.modules.loading.check_loading(mol_pred: Any = None, mol_true: Any = None, mol_cond: Any = None) dict[str, dict[str, bool]]

Check that molecule files were loaded successfully.

Parameters:
  • mol_pred – Predicted molecule. Defaults to None.

  • mol_true – Ground truth molecule. Defaults to None.

  • mol_cond – Conditioning molecule. Defaults to None.

Returns:

PoseBusters results dictionary.

RMSD
posebusters.modules.rmsd.check_rmsd(mol_pred: Mol, mol_true: Mol, rmsd_threshold: float = 2.0) dict[str, dict[str, bool | float]]

Calculate RMSD and related metrics between predicted molecule and closest ground truth molecule.

Parameters:
  • mol_pred – Predicted molecule (docked ligand) with exactly one conformer.

  • mol_true – Ground truth molecule (crystal ligand) with at least one conformer. If multiple conformers are present, the lowest RMSD will be reported.

  • rmsd_threshold – Threshold in angstrom for reporting whether RMSD is within threshold. Defaults to 2.0.

Returns:

PoseBusters results dictionary.

Volume Overlap
posebusters.modules.volume_overlap.check_volume_overlap(mol_pred: Mol, mol_cond: Mol, clash_cutoff: float = 0.05, vdw_scale: float = 0.8, ignore_types: set[str] = {'hydrogens'}, search_distance: float = 6.0) dict[str, dict]

Check volume overlap between ligand and protein.

Parameters:
  • mol_pred – Predicted molecule (docked ligand) with one conformer.

  • mol_cond – Conditioning molecule (protein) with one conformer.

  • clash_cutoff – Threshold for how much volume overlap is allowed. This is the maximum share of volume of mol_pred allowed to overlap with mol_cond. Defaults to 0.05.

  • vdw_scale – Scaling factor for the van der Waals radii which define the volume around each atom. Defaults to 0.8.

  • ignore_types – Which types of atoms in mol_cond to ignore. Possible values to include are “hydrogens”, “protein”, “organic_cofactors”, “inorganic_cofactors”, “waters”. Defaults to {“hydrogens”}.

Returns:

PoseBusters results dictionary.

User API examples

Setup python environment and install posebusters to run this notebook.

conda create -n posebusters python=3.10 jupyter notebook
conda activate posebusters
pip install posebusters --upgrade
[1]:
from posebusters import PoseBusters
from pathlib import Path
[2]:
pred_file = Path("inputs/generated_molecules.sdf") # predicted or generated molecules
true_file = Path("inputs/crystal_ligand.sdf") # "ground truth" molecules
cond_file = Path("inputs/protein.pdb") # conditioning molecule

PoseBusters default configs

redock

The `redock’ mode is for ligands docked into their cognate receptor crystal structures.

[3]:
# by default only the binary test report columns are returned
buster = PoseBusters(config="redock")
df = buster.bust([pred_file], true_file, cond_file)
print(df.shape)
df
[20:53:44] WARNING:  Problems/mismatches: Mobile-H( Hydrogens: Number; Mobile-H groups: Falsely present, Attachment points; Charge(s): Do not match)

(3, 25)
[3]:
mol_pred_loaded mol_true_loaded mol_cond_loaded sanitization all_atoms_connected molecular_formula molecular_bonds double_bond_stereochemistry tetrahedral_chirality bond_lengths ... protein-ligand_maximum_distance minimum_distance_to_protein minimum_distance_to_organic_cofactors minimum_distance_to_inorganic_cofactors minimum_distance_to_waters volume_overlap_with_protein volume_overlap_with_organic_cofactors volume_overlap_with_inorganic_cofactors volume_overlap_with_waters rmsd_≤_2å
file molecule
inputs/generated_molecules.sdf molecule_1 True True True True True False False True False False ... False True True True True True True True True False
molecule_2 True True True True True False False True True True ... False True True True True True True True True False
molecule_3 True True True True True False False True True True ... False True True True True True True True True False

3 rows × 25 columns

dock

The dock mode is for de-novo generated molecules for a given receptor or for ligands docked into a non-cognate receptor.

[4]:
buster = PoseBusters(config="dock")
df = buster.bust([pred_file], true_file, cond_file)
print(df.shape)
df
(3, 19)
[4]:
mol_pred_loaded mol_cond_loaded sanitization all_atoms_connected bond_lengths bond_angles internal_steric_clash aromatic_ring_flatness double_bond_flatness internal_energy protein-ligand_maximum_distance minimum_distance_to_protein minimum_distance_to_organic_cofactors minimum_distance_to_inorganic_cofactors minimum_distance_to_waters volume_overlap_with_protein volume_overlap_with_organic_cofactors volume_overlap_with_inorganic_cofactors volume_overlap_with_waters
file molecule
inputs/generated_molecules.sdf molecule_1 True True True True False False True True True True False True True True True True True True True
molecule_2 True True True True True True True True True True False True True True True True True True True
molecule_3 True True True True True True True True True True False True True True True True True True True
mol

The mol mode is for de-novo generated molecules or for generated molecular conformations.

[5]:
buster = PoseBusters(config="mol")
df = buster.bust([pred_file], None, None)
print(df.shape)
df
(3, 9)
[5]:
mol_pred_loaded sanitization all_atoms_connected bond_lengths bond_angles internal_steric_clash aromatic_ring_flatness double_bond_flatness internal_energy
file molecule
inputs/generated_molecules.sdf molecule_1 True True True False False True True True True
molecule_2 True True True True True True True True True
molecule_3 True True True True True True True True True

Output formatting

full report

The full_report option of bust will return all columns of the test reports, not only the binary columns. This is useful for debugging and for further analysis of the results.

[6]:
buster = PoseBusters(config="mol")
df = buster.bust([pred_file], None, None, full_report=True)
print(df.shape)
df
(3, 36)
[6]:
mol_pred_loaded sanitization all_atoms_connected bond_lengths bond_angles internal_steric_clash aromatic_ring_flatness double_bond_flatness internal_energy mol_true_loaded ... number_valid_noncov_pairs number_aromatic_rings_checked number_aromatic_rings_pass aromatic_ring_maximum_distance_from_plane number_double_bonds_checked number_double_bonds_pass double_bond_maximum_distance_from_plane ensemble_avg_energy mol_pred_energy energy_ratio
file molecule
inputs/generated_molecules.sdf molecule_1 True True True False False True True True True False ... 235 0 0 NaN 0 0 NaN 608.632249 1377.869730 2.263879
molecule_2 True True True True True True True True True False ... 260 1 1 0.078919 0 0 NaN 259.822517 253.508495 0.975699
molecule_3 True True True True True True True True True False ... 257 1 1 0.091866 0 0 NaN 301.579457 341.205292 1.131394

3 rows × 36 columns