Command line interface
PoseBusters provides the command bust for checking generated molecules, optionally taking a conditioning protein or ligands into account.
Use bust to check a series of molecules within one .sdf file.
>>> bust generated_molecules.sdf --outfmt long
Long summary for generated_molecules.sdf molecule_1
MOL_PRED loaded .
Sanitization .
All atoms connected .
Bond lengths Fail
Bond angles Fail
Internal steric clash .
Aromatic ring flatness .
Double bond flatness .
Internal energy .
Long summary for generated_molecules.sdf molecule_2
MOL_PRED loaded .
Sanitization .
All atoms connected .
Bond lengths .
Bond angles .
Internal steric clash .
Aromatic ring flatness .
Double bond flatness .
Internal energy .
Long summary for generated_molecules.sdf molecule_3
...
Check a docked ligand or generated molecule conditioned on a protein.
>>> bust generated_ligands.sdf -p protein.pdb --outfmt long
Long summary for generated_ligands.sdf conformation_1
MOL_PRED loaded .
MOL_COND loaded .
Sanitization .
All atoms connected .
Bond lengths .
Bond angles .
Internal steric clash .
Aromatic ring flatness .
Double bond flatness .
Internal energy .
Protein-ligand maximum distance .
Minimum distance to protein .
Minimum distance to organic cofactors .
Minimum distance to inorganic cofactors .
Minimum distance to waters .
Volume overlap with protein .
Volume overlap with organic cofactors .
Volume overlap with inorganic cofactors .
Volume overlap with waters .
Long summary for generated_ligands.sdf conformation_2
...
Check a series of re-docked ligands against the crystal ligand and protein.
>>> bust redocked_ligand.sdf -l crystal_ligand.sdf -p protein.pdb --outfmt long
Long summary for redocked_ligand.sdf redocked_ligand
MOL_PRED loaded .
MOL_TRUE loaded .
MOL_COND loaded .
Sanitization .
All atoms connected .
Molecular formula .
Molecular bonds .
Double bond stereochemistry .
Tetrahedral chirality .
Bond lengths .
Bond angles .
Internal steric clash .
Aromatic ring flatness .
Double bond flatness .
Internal energy .
Protein-ligand maximum distance .
Minimum distance to protein .
Minimum distance to organic cofactors .
Minimum distance to inorganic cofactors .
Minimum distance to waters .
...
Use the -t option to bulk-check multiple sets of files listed in a .csv table.
>>> bust -t molecule_table.csv --outfmt long
Long summary for 1ia1/1ia1_ligand.sdf TQ3
MOL_PRED loaded .
MOL_TRUE loaded .
MOL_COND loaded .
Sanitization .
All atoms connected .
Molecular formula .
Molecular bonds .
Double bond stereochemistry .
Tetrahedral chirality .
Bond lengths .
Bond angles .
Internal steric clash .
Aromatic ring flatness .
Double bond flatness .
Internal energy .
Protein-ligand maximum distance .
Minimum distance to protein .
Minimum distance to organic cofactors .
Minimum distance to inorganic cofactors .
Minimum distance to waters .
...
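The table passed to -t can be produced with any CSV tool. The following is a minimal Python sketch. It assumes the header names mol_pred, mol_true and mol_cond, which mirror the MOL_PRED, MOL_TRUE and MOL_COND arguments in the help text; these names are an assumption and should be checked against the PoseBusters documentation. All file paths other than 1ia1/1ia1_ligand.sdf are hypothetical.

```python
import csv
import io

# Assumed header names, mirroring MOL_PRED, MOL_TRUE and MOL_COND in `bust --help`.
FIELDNAMES = ["mol_pred", "mol_true", "mol_cond"]


def build_table(rows):
    """Return CSV text with one row per set of input files."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical file paths; replace them with real files.
rows = [
    {
        "mol_pred": "1ia1/1ia1_ligand.sdf",
        "mol_true": "1ia1/1ia1_crystal_ligand.sdf",
        "mol_cond": "1ia1/1ia1_protein.pdb",
    },
    {
        "mol_pred": "1w1p/1w1p_ligand.sdf",
        "mol_true": "1w1p/1w1p_crystal_ligand.sdf",
        "mol_cond": "1w1p/1w1p_protein.pdb",
    },
]

table_text = build_table(rows)
print(table_text)
```

Writing table_text to molecule_table.csv would then allow the bust -t molecule_table.csv call shown above.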
Output format options
The short format is the default output format.
>>> bust generated_molecules.sdf --outfmt short
generated_molecules.sdf molecule_1 passes (7 / 9)
generated_molecules.sdf molecule_2 passes (9 / 9)
generated_molecules.sdf molecule_3 passes (9 / 9)
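The short format is easy to post-process. The following is a minimal Python sketch that extracts the pass counts, assuming every line has the form "file molecule passes (n / total)" as in the example above. The helper name parse_short is hypothetical, and file or molecule names containing spaces are not handled.

```python
import re

# Pattern for one line of short output, based on the example lines above.
SHORT_LINE = re.compile(
    r"^(?P<file>\S+) (?P<molecule>\S+) passes \((?P<passed>\d+) / (?P<total>\d+)\)$"
)


def parse_short(text):
    """Return (file, molecule, passed, total) tuples for each matching line."""
    results = []
    for line in text.splitlines():
        match = SHORT_LINE.match(line.strip())
        if match:
            results.append(
                (
                    match["file"],
                    match["molecule"],
                    int(match["passed"]),
                    int(match["total"]),
                )
            )
    return results


sample = """\
generated_molecules.sdf molecule_1 passes (7 / 9)
generated_molecules.sdf molecule_2 passes (9 / 9)
generated_molecules.sdf molecule_3 passes (9 / 9)
"""

results = parse_short(sample)
# Molecules that did not pass every test.
failing = [molecule for _, molecule, passed, total in results if passed < total]
print(failing)  # ['molecule_1']
```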
The long format lists each test result for each molecule/conformation.
>>> bust generated_molecules.sdf --outfmt long
Long summary for generated_molecules.sdf molecule_1
MOL_PRED loaded .
Sanitization .
All atoms connected .
Bond lengths Fail
Bond angles Fail
Internal steric clash .
Aromatic ring flatness .
Double bond flatness .
Internal energy .
Long summary for generated_molecules.sdf molecule_2
MOL_PRED loaded .
Sanitization .
All atoms connected .
Bond lengths .
Bond angles .
Internal steric clash .
Aromatic ring flatness .
Double bond flatness .
Internal energy .
Long summary for generated_molecules.sdf molecule_3
...
For copying and saving the output, use the csv option.
>>> bust generated_molecules.sdf --outfmt csv
file,molecule,mol_pred_loaded,sanitization,all_atoms_connected,bond_lengths,bond_angles,internal_steric_clash,aromatic_ring_flatness,double_bond_flatness,internal_energy
generated_molecules.sdf,molecule_1,True,True,True,False,False,True,True,True,True
generated_molecules.sdf,molecule_2,True,True,True,True,True,True,True,True,True
generated_molecules.sdf,molecule_3,True,True,True,True,True,True,True,True,True
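Because the csv format is machine-readable, the failed tests per molecule can be listed with a few lines of Python. This is a minimal sketch using only the standard library and the example output above; the helper name failed_tests is hypothetical.

```python
import csv
import io

# Example csv output copied from above.
sample = """\
file,molecule,mol_pred_loaded,sanitization,all_atoms_connected,bond_lengths,bond_angles,internal_steric_clash,aromatic_ring_flatness,double_bond_flatness,internal_energy
generated_molecules.sdf,molecule_1,True,True,True,False,False,True,True,True,True
generated_molecules.sdf,molecule_2,True,True,True,True,True,True,True,True,True
generated_molecules.sdf,molecule_3,True,True,True,True,True,True,True,True,True
"""


def failed_tests(csv_text):
    """Map (file, molecule) to the names of the tests reported as False."""
    failures = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["file"], row["molecule"])
        failures[key] = [
            name
            for name, value in row.items()
            if name not in ("file", "molecule") and value == "False"
        ]
    return failures


failures = failed_tests(sample)
for (file_name, molecule), names in failures.items():
    print(file_name, molecule, names if names else "all tests passed")
```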
For the csv and long formats, the --full-report option can be used to show extra information beyond the pass/fail status of each test.
>>> bust generated_molecules.sdf --outfmt csv --full-report
file,molecule,mol_pred_loaded,sanitization,all_atoms_connected,bond_lengths,bond_angles,internal_steric_clash,aromatic_ring_flatness,double_bond_flatness,internal_energy,mol_true_loaded,mol_cond_loaded,passes_valence_checks,passes_kekulization,number_bonds,shortest_bond_relative_length,longest_bond_relative_length,number_short_outlier_bonds,number_long_outlier_bonds,number_angles,most_extreme_relative_angle,number_outlier_angles,number_noncov_pairs,shortest_noncovalent_relative_distance,number_clashes,number_valid_bonds,number_valid_angles,number_valid_noncov_pairs,number_aromatic_rings_checked,number_aromatic_rings_pass,aromatic_ring_maximum_distance_from_plane,number_double_bonds_checked,number_double_bonds_pass,double_bond_maximum_distance_from_plane,ensemble_avg_energy,mol_pred_energy,energy_ratio
generated_molecules.sdf,molecule_1,True,True,True,False,False,True,True,True,True,False,False,True,True,27,0.8567948530292413,1.2797521607839486,0,1,38,1.3092992455795367,2,235,0.7738421740224927,0,26,36,235,0,0,,0,0,,608.5563186553628,1377.8697297363487,2.264161405440378
generated_molecules.sdf,molecule_2,True,True,True,True,True,True,True,True,True,False,False,True,True,27,0.8951618398686666,1.0410778462688173,0,0,38,1.1090396233249729,0,260,0.9498773674061802,0,27,38,260,1,1,0.10944595440461957,0,0,,259.7528747776052,253.50849461438324,0.9759603039290007
generated_molecules.sdf,molecule_3,True,True,True,True,True,True,True,True,True,False,False,True,True,28,0.8337212356590878,1.0705087667583508,0,0,40,1.1485967900513152,0,257,0.9115215657793475,0,28,40,257,1,1,0.09186620947420678,0,0,,301.49094062530395,341.2052918845499,1.131726516149629
...
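The extra columns can be used to rank molecules by a numeric value instead of a pass/fail flag. The following is a minimal Python sketch on a subset of the columns, with values copied from the example output above. In these rows, energy_ratio appears to equal mol_pred_energy divided by ensemble_avg_energy; this is an inference from the numbers shown, not a documented definition.

```python
import csv
import io
import math

# Subset of the --full-report columns; values copied from the example output above.
sample = """\
molecule,internal_energy,ensemble_avg_energy,mol_pred_energy,energy_ratio
molecule_1,True,608.5563186553628,1377.8697297363487,2.264161405440378
molecule_2,True,259.7528747776052,253.50849461438324,0.9759603039290007
molecule_3,True,301.49094062530395,341.2052918845499,1.131726516149629
"""

rows = list(csv.DictReader(io.StringIO(sample)))

# Check the inferred relationship between the three energy columns.
ratios_match = all(
    math.isclose(
        float(row["mol_pred_energy"]) / float(row["ensemble_avg_energy"]),
        float(row["energy_ratio"]),
        rel_tol=1e-6,
    )
    for row in rows
)

# Rank molecules from highest to lowest energy ratio.
ranked = sorted(
    ((row["molecule"], float(row["energy_ratio"])) for row in rows),
    key=lambda item: item[1],
    reverse=True,
)

print(ratios_match)
for molecule, ratio in ranked:
    print(f"{molecule}: {ratio:.3f}")
```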
Saving the output
The --output option can be used to save the output to a file:
>>> bust generated_molecules.sdf --outfmt csv --output results.csv
Help
Running bust with the --help option prints information about the command line options.
>>> bust --help
usage: bust [-l MOL_TRUE] [-p MOL_COND] [-t TABLE] [--outfmt {short,long,csv}]
            [--output OUTPUT] [--full-report] [--no-header] [--config CONFIG]
            [--top-n TOP_N] [-v] [-h]
            [mol_pred [mol_pred ...]]
PoseBusters: Plausibility checks for generated molecule poses.
Input:
  mol_pred              molecule(s) to check
  -l MOL_TRUE           true molecule, e.g. crystal ligand
  -p MOL_COND           conditioning molecule, e.g. protein
  -t TABLE              run multiple inputs listed in a .csv file
Output:
  --outfmt {short,long,csv}
                        output format
  --output OUTPUT       output file (default: stdout)
  --full-report         print details for each test
  --no-header           print output without header
Configuration:
  --config CONFIG       configuration file
  --top-n TOP_N         run on TOP_N results in MOL_PRED only (default: all)
Information:
  -v, --version         show program's version number and exit
  -h, --help            show this help message and exit
>>> bust --version
bust 0.2.5