Command line interface

PoseBusters provides the bust command for checking generated molecules. Optionally, the checks can take a conditioning molecule such as a protein (-p) and a true molecule such as a crystal ligand (-l) into account.

Use bust to check a series of molecules within one .sdf file.

>>> bust generated_molecules.sdf --outfmt long
Long summary for generated_molecules.sdf molecule_1
MOL_PRED loaded         .   
Sanitization            .   
All atoms connected     .   
Bond lengths            Fail
Bond angles             Fail
Internal steric clash   .   
Aromatic ring flatness  .   
Double bond flatness    .   
Internal energy         .   
Long summary for generated_molecules.sdf molecule_2
MOL_PRED loaded         .   
Sanitization            .   
All atoms connected     .   
Bond lengths            .   
Bond angles             .   
Internal steric clash   .   
Aromatic ring flatness  .   
Double bond flatness    .   
Internal energy         .   
Long summary for generated_molecules.sdf molecule_3
...

Check a docked ligand or generated molecule conditioned on a protein.

>>> bust generated_ligands.sdf -p protein.pdb --outfmt long
Long summary for generated_ligands.sdf conformation_1
MOL_PRED loaded                          .   
MOL_COND loaded                          .   
Sanitization                             .   
All atoms connected                      .   
Bond lengths                             .   
Bond angles                              .   
Internal steric clash                    .   
Aromatic ring flatness                   .   
Double bond flatness                     .   
Internal energy                          .   
Protein-ligand maximum distance          .   
Minimum distance to protein              .   
Minimum distance to organic cofactors    .   
Minimum distance to inorganic cofactors  .   
Minimum distance to waters               .   
Volume overlap with protein              .   
Volume overlap with organic cofactors    .   
Volume overlap with inorganic cofactors  .   
Volume overlap with waters               .   
Long summary for generated_ligands.sdf conformation_2
...

Check a series of re-docked ligands against the crystal ligand and protein.

>>> bust redocked_ligand.sdf -l crystal_ligand.sdf -p protein.pdb --outfmt long
Long summary for redocked_ligand.sdf redocked_ligand
MOL_PRED loaded                          .   
MOL_TRUE loaded                          .   
MOL_COND loaded                          .   
Sanitization                             .   
All atoms connected                      .   
Molecular formula                        .   
Molecular bonds                          .   
Double bond stereochemistry              .   
Tetrahedral chirality                    .   
Bond lengths                             .   
Bond angles                              .   
Internal steric clash                    .   
Aromatic ring flatness                   .   
Double bond flatness                     .   
Internal energy                          .   
Protein-ligand maximum distance          .   
Minimum distance to protein              .   
Minimum distance to organic cofactors    .   
Minimum distance to inorganic cofactors  .   
Minimum distance to waters               .   
...

Use the -t option to check multiple sets of files, listed in a .csv table, in one run.

>>> bust -t molecule_table.csv --outfmt long
Long summary for 1ia1/1ia1_ligand.sdf TQ3
MOL_PRED loaded                          .   
MOL_TRUE loaded                          .   
MOL_COND loaded                          .   
Sanitization                             .   
All atoms connected                      .   
Molecular formula                        .   
Molecular bonds                          .   
Double bond stereochemistry              .   
Tetrahedral chirality                    .   
Bond lengths                             .   
Bond angles                              .   
Internal steric clash                    .   
Aromatic ring flatness                   .   
Double bond flatness                     .   
Internal energy                          .   
Protein-ligand maximum distance          .   
Minimum distance to protein              .   
Minimum distance to organic cofactors    .   
Minimum distance to inorganic cofactors  .   
Minimum distance to waters               .   
...
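
The layout of the table file is not shown here. The sketch below writes one with Python's csv module under the assumption that the column names mirror the argument names in the help text (mol_pred, mol_true, mol_cond). Both the column names and the file paths are assumptions made for illustration, so check them against the PoseBusters version you have installed.

```python
import csv
import io

# ASSUMPTION: column names mirror the CLI argument names shown by bust --help.
COLUMNS = ["mol_pred", "mol_true", "mol_cond"]


def build_table(rows):
    """Return CSV text with one row per set of input files."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical file paths, for illustration only.
table_text = build_table(
    [
        {
            "mol_pred": "set_a/predicted.sdf",
            "mol_true": "set_a/crystal_ligand.sdf",
            "mol_cond": "set_a/protein.pdb",
        },
        {
            "mol_pred": "set_b/predicted.sdf",
            "mol_true": "set_b/crystal_ligand.sdf",
            "mol_cond": "set_b/protein.pdb",
        },
    ]
)
print(table_text)
```

Saving table_text to a file such as molecule_table.csv would give an input for the -t option, provided the assumed column names are correct.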

Output format options

The short format is the default output format. It prints one line per molecule with the number of tests passed out of the total.

>>> bust generated_molecules.sdf --outfmt short
generated_molecules.sdf molecule_1  passes (7 / 9)
generated_molecules.sdf molecule_2  passes (9 / 9)
generated_molecules.sdf molecule_3  passes (9 / 9)
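
If the short output needs to be processed further, a small parser can extract the pass counts. The sketch below is a minimal example: the regular expression is inferred from the three sample lines above rather than from a format specification, and it assumes file names without spaces.

```python
import re

# Pattern inferred from the sample output; not an official format description.
SHORT_LINE = re.compile(
    r"^(?P<file>\S+)\s+(?P<molecule>.+?)\s+passes \((?P<passed>\d+) / (?P<total>\d+)\)$"
)


def parse_short_line(line):
    """Split one short-format line into file, molecule name, and pass counts."""
    match = SHORT_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized short-format line: {line!r}")
    return (
        match["file"],
        match["molecule"],
        int(match["passed"]),
        int(match["total"]),
    )


sample = """\
generated_molecules.sdf molecule_1  passes (7 / 9)
generated_molecules.sdf molecule_2  passes (9 / 9)
generated_molecules.sdf molecule_3  passes (9 / 9)
"""

parsed = [parse_short_line(line) for line in sample.splitlines()]
# Keep only the molecules that failed at least one test.
incomplete = [(mol, passed, total) for _, mol, passed, total in parsed if passed < total]
print(incomplete)
```

For anything beyond a quick filter, the csv format is likely the more robust input because it does not depend on column spacing.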

The long format lists the result of every test for each molecule or conformation. A . marks a passed test and Fail marks a failed one.

>>> bust generated_molecules.sdf --outfmt long
Long summary for generated_molecules.sdf molecule_1
MOL_PRED loaded         .   
Sanitization            .   
All atoms connected     .   
Bond lengths            Fail
Bond angles             Fail
Internal steric clash   .   
Aromatic ring flatness  .   
Double bond flatness    .   
Internal energy         .   
Long summary for generated_molecules.sdf molecule_2
MOL_PRED loaded         .   
Sanitization            .   
All atoms connected     .   
Bond lengths            .   
Bond angles             .   
Internal steric clash   .   
Aromatic ring flatness  .   
Double bond flatness    .   
Internal energy         .   
Long summary for generated_molecules.sdf molecule_3
...

To copy or save the output for further processing, use the csv format.

>>> bust generated_molecules.sdf --outfmt csv
file,molecule,mol_pred_loaded,sanitization,all_atoms_connected,bond_lengths,bond_angles,internal_steric_clash,aromatic_ring_flatness,double_bond_flatness,internal_energy
generated_molecules.sdf,molecule_1,True,True,True,False,False,True,True,True,True
generated_molecules.sdf,molecule_2,True,True,True,True,True,True,True,True,True
generated_molecules.sdf,molecule_3,True,True,True,True,True,True,True,True,True
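
The csv output can be read with any CSV parser. As a minimal sketch, the snippet below uses Python's standard csv module to list which tests each molecule failed, with the sample output above pasted in as a string. It assumes the plain pass/fail columns, so it is not meant for output produced with --full-report.

```python
import csv
import io

# Sample output copied from the bust --outfmt csv example.
csv_text = """\
file,molecule,mol_pred_loaded,sanitization,all_atoms_connected,bond_lengths,bond_angles,internal_steric_clash,aromatic_ring_flatness,double_bond_flatness,internal_energy
generated_molecules.sdf,molecule_1,True,True,True,False,False,True,True,True,True
generated_molecules.sdf,molecule_2,True,True,True,True,True,True,True,True,True
generated_molecules.sdf,molecule_3,True,True,True,True,True,True,True,True,True
"""


def failed_checks(text):
    """Map (file, molecule) to the test columns whose value is not 'True'."""
    failures = {}
    for row in csv.DictReader(io.StringIO(text)):
        key = (row.pop("file"), row.pop("molecule"))
        failures[key] = [name for name, value in row.items() if value != "True"]
    return failures


failures = failed_checks(csv_text)
for (file_name, molecule), names in failures.items():
    print(file_name, molecule, names or "all tests passed")
```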

With the csv and long formats, the --full-report option adds detailed information for each test beyond its pass/fail status.

>>> bust generated_molecules.sdf --outfmt csv --full-report
file,molecule,mol_pred_loaded,sanitization,all_atoms_connected,bond_lengths,bond_angles,internal_steric_clash,aromatic_ring_flatness,double_bond_flatness,internal_energy,mol_true_loaded,mol_cond_loaded,passes_valence_checks,passes_kekulization,number_bonds,shortest_bond_relative_length,longest_bond_relative_length,number_short_outlier_bonds,number_long_outlier_bonds,number_angles,most_extreme_relative_angle,number_outlier_angles,number_noncov_pairs,shortest_noncovalent_relative_distance,number_clashes,number_valid_bonds,number_valid_angles,number_valid_noncov_pairs,number_aromatic_rings_checked,number_aromatic_rings_pass,aromatic_ring_maximum_distance_from_plane,number_double_bonds_checked,number_double_bonds_pass,double_bond_maximum_distance_from_plane,ensemble_avg_energy,mol_pred_energy,energy_ratio
generated_molecules.sdf,molecule_1,True,True,True,False,False,True,True,True,True,False,False,True,True,27,0.8567948530292413,1.2797521607839486,0,1,38,1.3092992455795367,2,235,0.7738421740224927,0,26,36,235,0,0,,0,0,,608.5563186553628,1377.8697297363487,2.264161405440378
generated_molecules.sdf,molecule_2,True,True,True,True,True,True,True,True,True,False,False,True,True,27,0.8951618398686666,1.0410778462688173,0,0,38,1.1090396233249729,0,260,0.9498773674061802,0,27,38,260,1,1,0.10944595440461957,0,0,,259.7528747776052,253.50849461438324,0.9759603039290007
generated_molecules.sdf,molecule_3,True,True,True,True,True,True,True,True,True,False,False,True,True,28,0.8337212356590878,1.0705087667583508,0,0,40,1.1485967900513152,0,257,0.9115215657793475,0,28,40,257,1,1,0.09186620947420678,0,0,,301.49094062530395,341.2052918845499,1.131726516149629
...
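
The extra columns can be cross-checked against one another. In the three sample rows above, energy_ratio equals mol_pred_energy divided by ensemble_avg_energy; the sketch below recomputes it from a trimmed excerpt of those rows. This is an observation about the sample data, not a statement of how PoseBusters defines the column.

```python
import csv
import io
import math

# Excerpt of the full report above, reduced to the energy-related columns.
excerpt = """\
molecule,internal_energy,ensemble_avg_energy,mol_pred_energy,energy_ratio
molecule_1,True,608.5563186553628,1377.8697297363487,2.264161405440378
molecule_2,True,259.7528747776052,253.50849461438324,0.9759603039290007
molecule_3,True,301.49094062530395,341.2052918845499,1.131726516149629
"""

ratios = {}
for row in csv.DictReader(io.StringIO(excerpt)):
    recomputed = float(row["mol_pred_energy"]) / float(row["ensemble_avg_energy"])
    reported = float(row["energy_ratio"])
    ratios[row["molecule"]] = (recomputed, reported)
    # Compare with a loose tolerance to allow for rounding in the printed values.
    agrees = math.isclose(recomputed, reported, rel_tol=1e-6)
    print(row["molecule"], round(recomputed, 4), "agrees" if agrees else "differs")
```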

Saving the output

The --output option saves the output to a file instead of printing it to stdout:

>>> bust generated_molecules.sdf --outfmt csv --output results.csv

Help

Running with the --help option prints information about the command line options.

>>> bust --help
usage: bust [-l MOL_TRUE] [-p MOL_COND] [-t TABLE] [--outfmt {short,long,csv}]
            [--output OUTPUT] [--full-report] [--no-header] [--config CONFIG]
            [--top-n TOP_N] [-v] [-h]
            [mol_pred [mol_pred ...]]

PoseBusters: Plausibility checks for generated molecule poses.

Input:
  mol_pred              molecule(s) to check
  -l MOL_TRUE           true molecule, e.g. crystal ligand
  -p MOL_COND           conditioning molecule, e.g. protein
  -t TABLE              run multiple inputs listed in a .csv file

Output:
  --outfmt {short,long,csv}
                        output format
  --output OUTPUT       output file (default: stdout)
  --full-report         print details for each test
  --no-header           print output without header

Configuration:
  --config CONFIG       configuration file
  --top-n TOP_N         run on TOP_N results in MOL_PRED only (default: all)

Information:
  -v, --version         show program's version number and exit
  -h, --help            show this help message and exit

The -v or --version option prints the program's version number.

>>> bust --version
bust 0.2.5
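
When bust is called from a script, it can help to assemble the argument list programmatically. The following sketch uses only options that appear in the help text above; the function name build_bust_command and its parameters are hypothetical helpers introduced for this example and are not part of PoseBusters.

```python
import shlex


def build_bust_command(mol_pred, mol_true=None, mol_cond=None, outfmt="short", full_report=False):
    """Assemble an argument list for bust from options listed in the help text."""
    if outfmt not in ("short", "long", "csv"):
        raise ValueError(f"unsupported output format: {outfmt!r}")
    command = ["bust", *mol_pred]
    if mol_true is not None:
        command += ["-l", mol_true]  # true molecule, e.g. crystal ligand
    if mol_cond is not None:
        command += ["-p", mol_cond]  # conditioning molecule, e.g. protein
    command += ["--outfmt", outfmt]
    if full_report:
        command.append("--full-report")
    return command


command = build_bust_command(
    ["redocked_ligand.sdf"],
    mol_true="crystal_ligand.sdf",
    mol_cond="protein.pdb",
    outfmt="csv",
)
print(shlex.join(command))

# To execute the command (requires PoseBusters to be installed):
#   import subprocess
#   result = subprocess.run(command, capture_output=True, text=True, check=True)
#   print(result.stdout)
```

Passing the arguments as a list avoids shell quoting issues with file names that contain spaces.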