From 76f3e75093c890e2f3760316c2cf819929530717 Mon Sep 17 00:00:00 2001
From: Amy He
Date: Wed, 6 Nov 2024 19:30:29 -0800
Subject: [PATCH 01/18] remove repeated meeko installation from env setup in tutorials; remove TOC from tutorials for heading formatting

---
 docs/source/tutorial1.rst | 11 -----------
 docs/source/tutorial2.rst | 11 -----------
 docs/source/tutorial3.rst | 4 ----
 3 files changed, 26 deletions(-)

diff --git a/docs/source/tutorial1.rst b/docs/source/tutorial1.rst
index 50acc810..019dd963 100644
--- a/docs/source/tutorial1.rst
+++ b/docs/source/tutorial1.rst
@@ -5,10 +5,6 @@ Basic Docking
 
 This tutorial provides practice examples and a step-by-step guide for the two basic procedures, **Ligand Preparation** and **Receptor Preparation**, with Meeko for molecular docking and virtual screening with `AutoDock Vina `_ and `AutoDock-GPU `_. It is based on, but not a full version of the tutorial materials in `Forlilab tutorials `_.
 
-.. contents::
-    :local:
-    :depth: 2
-
 Prerequisites and Environment Setup
 ===================================
 
@@ -32,13 +28,6 @@ Install the required Python packages through ``conda-forge``
 Install the additional packages and data from GitHub repositories
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-- (Python package) Meeko
-
-.. code-block:: bash
-
-    git clone --single-branch --branch develop https://github.com/forlilab/Meeko.git
-    cd Meeko; pip install --use-pep517 -e .; cd ..
-
 - (Python package) scrubber
 
 .. code-block:: bash
diff --git a/docs/source/tutorial2.rst b/docs/source/tutorial2.rst
index b65e6e7a..27739bb5 100644
--- a/docs/source/tutorial2.rst
+++ b/docs/source/tutorial2.rst
@@ -8,10 +8,6 @@ This is a reactive docking example that uses the AutoDock-GPU executable to gene
 
 Follow the instructions to set up the environment and run this command-line example on your own device (Linux, MacOS or WSL). To run this example in a Colab notebook, see :ref:`colab_examples`.
 
-.. contents::
-    :local:
-    :depth: 2
-
 Introduction
 ============
 
@@ -44,13 +40,6 @@ Install the required Python packages through ``conda-forge``
 Install the additional packages and data from GitHub repositories
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-- (Python package) Meeko
-
-.. code-block:: bash
-
-    git clone --single-branch --branch develop https://github.com/forlilab/Meeko.git
-    cd Meeko; pip install --use-pep517 -e .; cd ..
-
 - (Python package) scrubber
 
 .. code-block:: bash
diff --git a/docs/source/tutorial3.rst b/docs/source/tutorial3.rst
index ce4c602e..e2c0a1e5 100644
--- a/docs/source/tutorial3.rst
+++ b/docs/source/tutorial3.rst
@@ -8,10 +8,6 @@ This is a tethered (two-point attached covalent) docking example that uses the A
 
 Follow the instructions to set up the environment and run this command-line example on your own device (Linux, MacOS or WSL). To run this example in a Colab notebook, see :ref:`colab_examples`.
 
-.. contents::
-    :local:
-    :depth: 2
-
 Introduction
 ============
 

From cf2129864fe01059822910e4cb2d5c9dccd76719 Mon Sep 17 00:00:00 2001
From: Amy He
Date: Wed, 6 Nov 2024 20:56:58 -0800
Subject: [PATCH 02/18] add usage to mk_export.py

---
 docs/source/cli_export_result.rst | 63 +++++++++++++++++++++++++++----
 docs/source/tutorial2.rst | 6 +--
 docs/source/tutorial3.rst | 2 +-
 3 files changed, 60 insertions(+), 11 deletions(-)

diff --git a/docs/source/cli_export_result.rst b/docs/source/cli_export_result.rst
index 641fbd21..eb0847b2 100644
--- a/docs/source/cli_export_result.rst
+++ b/docs/source/cli_export_result.rst
@@ -1,8 +1,11 @@
 mk_export.py
 ============
 
+About
+-----
+
 Convert docking results to SDF
-------------------------------
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 AutoDock-GPU and Vina write docking results in the PDBQT format.
 The DLG output from AutoDock-GPU contains docked poses in PDBQT blocks, plus additional information.
@@ -13,12 +16,12 @@ from RDKit molecules.
 
 .. code-block:: bash
 
-    mk_export.py molecule.pdbqt -o molecule.sdf
-    mk_export.py vina_results.pdbqt -o vina_results.sdf
-    mk_export.py autodock-gpu_results.dlg -o autodock-gpu_results.sdf
+    mk_export.py molecule.pdbqt -s molecule.sdf
+    mk_export.py vina_results.pdbqt -s vina_results.sdf
+    mk_export.py autodock-gpu_results.dlg -s autodock-gpu_results.sdf
 
 Why this matters
-----------------
+~~~~~~~~~~~~~~~~
 
 Making RDKit molecules from SMILES is safer than guessing bond orders from
 the coordinates, specially because the PDBQT lacks hydrogens bonded
@@ -31,9 +34,8 @@ but because this is a nearly impossible task.
     obabel -:"C1C=CCO1" -o pdbqt --gen3d | obabel -i pdbqt -o smi
     [C]1=[C][C]=[C]O1
 
-
 Caveats
--------
+~~~~~~~
 
 If docking does not use explicit Hs, which it often does not, the
 exported positions of hydrogens are calculated from RDKit. This can
@@ -41,3 +43,50 @@ be annoying if a careful forcefield minimization is employed before
 docking, as probably rigorous Hs positions will be replaced by the
 RDKit geometry rules, which are empirical and much simpler than most
 force fields.
+
+Usage
+-----
+
+.. code-block:: bash
+
+    mk_export.py [OPTIONS] docking_results_filename(s)
+
+Positional Argument
+~~~~~~~~~~~~~~~~~~~
+
+.. option:: docking_results_filename
+
+    One or more docking output files in either PDBQT format (from Vina) or DLG format (from AD-GPU).
+
+Options
+~~~~~~~
+
+.. option:: -s, --write_sdf
+
+    Specify the output SDF filename. Defaults to the input filename with a suffix defined by ``--suffix``.
+
+.. option:: -p, --write_pdb
+
+    Specify the output PDB filename. Defaults to the input filename with a suffix defined by ``--suffix``.
+
+.. option:: --suffix
+
+    Set a suffix for output filenames that are not explicitly specified. The default suffix is ``_docked``.
+
+.. option:: -j, --read_json
+
+    Provide a receptor file generated by ``mk_prepare_receptor`` with the ``-j/--write_json`` option.
+
+.. option:: --all_dlg_poses
+
+    (Flag) Write all poses from AutoDock-GPU DLG output files, instead of only the lead of each cluster.
+
+.. option:: -k, --keep_flexres_sdf
+
+    (Flag) Include flexible residues, if any, in the SDF output.
+
+.. option:: -, --redirect_stdout
+
+    (Flag) Instead of writing an SDF file, print it directly to the standard output (STDOUT).
+
+
diff --git a/docs/source/tutorial2.rst b/docs/source/tutorial2.rst
index 27739bb5..59be7092 100644
--- a/docs/source/tutorial2.rst
+++ b/docs/source/tutorial2.rst
@@ -282,9 +282,9 @@ And to run the docking calculation, the ligand PDBQT file (``AMP.pdbqt``), the f
 
 If you're running these calculations on Google T4 backends, here are the pre-compiled executables of autogrid4 and adgpu specifically for T4:
 
-- autodock-gpu v1.5.3
-`autodock_gpu_128wi `_
-`adgpu_analysis `_
+- autodock-gpu v1.5.3-2e658c3
+`autodock_gpu_128wi `_
+`adgpu_analysis `_
 
 - autogrid v4.2.6
 `autogrid4 `_
diff --git a/docs/source/tutorial3.rst b/docs/source/tutorial3.rst
index e2c0a1e5..c830324c 100644
--- a/docs/source/tutorial3.rst
+++ b/docs/source/tutorial3.rst
@@ -193,4 +193,4 @@ It is also possible to export the docking poses to a multi-model PDB file with u
     --default_altloc A -f $flexres \
     --box_enveloping "LIG.pdb" --padding 8.0
 
-    mk_export.py HIE_AMP.dlg -s 3kgd_HIE_AMP_adgpu_out.sdf -k 3kgd_receptorH.json -p 3kgd_HIE_AMP_adgpu_out.pdb
\ No newline at end of file
+    mk_export.py HIE_AMP.dlg -s 3kgd_HIE_AMP_adgpu_out.sdf -j 3kgd_receptorH.json -p 3kgd_HIE_AMP_adgpu_out.pdb
\ No newline at end of file

From a731b350eef899a3a84877dd49728c277e1d2380 Mon Sep 17 00:00:00 2001
From: Amy He
Date: Wed, 6 Nov 2024 21:49:02 -0800
Subject: [PATCH 03/18] list options in cli docs

---
 docs/source/cli_export_result.rst | 18 +--
 docs/source/cli_lig_prep.rst | 85 ++++++++++++-
 docs/source/cli_rec_prep.rst | 193 ++++++++++++++++++++++++++++--
 3 files changed, 275 insertions(+), 21 deletions(-)

diff --git a/docs/source/cli_export_result.rst b/docs/source/cli_export_result.rst
index eb0847b2..1c35736f 100644
--- a/docs/source/cli_export_result.rst
+++ b/docs/source/cli_export_result.rst
@@ -61,25 +61,25 @@ Positional Argument
 Options
 ~~~~~~~
 
-.. option:: -s, --write_sdf
+.. option:: --suffix
 
-    Specify the output SDF filename. Defaults to the input filename with a suffix defined by ``--suffix``.
+    Set a suffix for output filenames that are not explicitly specified. The default suffix is ``_docked``.
 
-.. option:: -p, --write_pdb
+.. option:: -s, --write_sdf
 
-    Specify the output PDB filename. Defaults to the input filename with a suffix defined by ``--suffix``.
+    Specify the output SDF filename. Defaults to the input filename with a suffix defined by ``--suffix``.
 
-.. option:: --suffix
+.. option:: -j, --read_json
 
-    Set a suffix for output filenames that are not explicitly specified. The default suffix is ``_docked``.
+    Provide a receptor file generated by ``mk_prepare_receptor`` with the ``-j/--write_json`` option. Currently only effective when used with ``-p, --write_pdb``.
 
-.. option:: -j, --read_json
+.. option:: -p, --write_pdb
 
-    Provide a receptor file generated by ``mk_prepare_receptor`` with the ``-j/--write_json`` option.
+    Specify the output PDB filename. Defaults to the input filename with a suffix defined by ``--suffix``. Must be used together with ``-j, --read_json``.
 
 .. option:: --all_dlg_poses
 
-    (Flag) Write all poses from AutoDock-GPU DLG output files, instead of only the lead of each cluster.
+    (Flag) Write all poses from AutoDock-GPU DLG output files, instead of only the lead of each cluster. Currently only effective for ``-s, --write_sdf``.
 
 .. option:: -k, --keep_flexres_sdf
 
diff --git a/docs/source/cli_lig_prep.rst b/docs/source/cli_lig_prep.rst
index 91fe3224..b246c043 100644
--- a/docs/source/cli_lig_prep.rst
+++ b/docs/source/cli_lig_prep.rst
@@ -3,8 +3,11 @@ mk_prepare_ligand.py
 
 Command line tool to prepare small organic molecules.

+About
+-----
+
 Write PDBQT files
------------------
+~~~~~~~~~~~~~~~~~
 
 AutoDock-GPU and Vina read molecules in the PDBQT format. These can be prepared
 by Meeko from SD files, or from Mol2 files, but SDF is strongly preferred.
@@ -14,3 +17,83 @@ by Meeko from SD files, or from Mol2 files, but SDF is strongly preferred.
     mk_prepare_ligand.py -i molecule.sdf -o molecule.pdbqt
     mk_prepare_ligand.py -i multi_mol.sdf --multimol_outdir folder_for_pdbqt_files
 
+Usage
+-----
+
+.. code-block:: bash
+
+    python mk_prepare_ligand.py [OPTIONS]
+
+Positional Argument
+~~~~~~~~~~~~~~~~~~~
+
+.. option:: -i, --mol
+
+    The input molecule file, in formats such as MOL2, SDF, etc.
+
+Options
+~~~~~~~
+
+.. option:: -c, --config_file
+
+    Configure `MoleculePreparation` from a JSON file. Command-line arguments will override settings in the file.
+
+    **Example:**
+
+    .. code-block:: bash
+
+        python mk_prepare_ligand.py -c config.json -i ligand.sdf
+
+.. option:: -v, --verbose
+
+    (Flag) Print detailed information about the molecule setup process.
+
+.. option:: --name_from_prop
+
+    Set the molecule name using a specified RDKit or SDF property.
+
+    **Example:**
+
+    .. code-block:: bash
+
+        python mk_prepare_ligand.py -i ligand.sdf --name_from_prop compound_name
+
+.. option:: -o, --out
+
+    Specify the output PDBQT filename. Only compatible with single-molecule input.
+
+.. option:: --multimol_outdir
+
+    Specify the directory to write PDBQT output files for multi-molecule inputs. Incompatible with `-o/--out` and `-`/`--`.
+
+.. option:: --multimol_prefix
+
+    Replace the internal molecule name in multi-molecule input with the specified prefix. Incompatible with `-o/--out` and `-`/`--`.
+
+.. option:: -z, --multimol_targz
+
+    (Flag) Compress output files into a `.tar.gz` archive.
+
+.. option:: --multimol_targz_size
+
+    Define the number of PDBQT files per `.tar.gz` archive. Default is 10000.
+
+.. option:: -, --
+
+    (Flag) Redirect output to standard output (STDOUT) instead of writing a file. Ignored if `-o/--out` is specified. Only compatible with single-molecule input.
+
+Molecule Preparation Options
+----------------------------
+
+.. option:: --rigid_macrocycles
+
+    (Flag) Keep macrocycles rigid in their input conformation.
+
+.. option:: --macrocycle_allow_A
+
+    (Flag) Allow bond break with atom type A, which will be retyped as carbon (C).
+
+.. option:: --keep_chorded_rings
+
+    (Flag) Retain all rings from exhaustive ring perception.
+
diff --git a/docs/source/cli_rec_prep.rst b/docs/source/cli_rec_prep.rst
index 2fadd173..aa0ecae2 100644
--- a/docs/source/cli_rec_prep.rst
+++ b/docs/source/cli_rec_prep.rst
@@ -1,3 +1,9 @@
+mk_prepare_receptor
+===================
+
+About
+-----
+
 The input structure is matched against templates to guarantee chemical
 correctness and identify problems with the input structures. This allows
 the user to identify and fix problems, resulting in a molecular
@@ -27,28 +33,22 @@ Residue name is primary key unless user overrides.
 
 Currently not supported: capped residues from charmm-gui.
 
-mk_prepare_receptor
-===================
-
 Basic usage
------------
+~~~~~~~~~~~
 
 .. code-block:: bash
 
     mk_prepare_receptor -i examples/system.pdb --write_pdbqt prepared.pdbqt
 
-
-
-
 Protonation states
-------------------
+~~~~~~~~~~~~~~~~~~
 
 
 Adding templates
-----------------
+~~~~~~~~~~~~~~~~
 
 Write flags
------------
+~~~~~~~~~~~
 
 The option flags starting with ``--write`` in ``mk_prepare_receptor`` can
 be used both with an argument to specify the outpuf filename:
@@ -77,7 +77,7 @@ in which case the specified filenames have priority over the default basename.
 .. _templates:
 
 Templates
----------
+~~~~~~~~~
 
 The templates contain SMILES strings that are used to create the RDKit
 molecules that constitute every residue in the processed model. In this way,
@@ -102,3 +102,174 @@ is a single carbon atom. Our template SMILES would be "[H]C[H]". The RDKit
 molecule will have three atoms and the carbon will have two implicit hydrogens.
 The implicit hydrogens correspond to bonds to adjacent residues in the processed
 polymer.
+
+Usage
+-----
+
+.. code-block:: bash
+
+    python mk_prepare_receptor.py [OPTIONS]
+
+Options
+~~~~~~~
+
+Input/Output Options
+~~~~~~~~~~~~~~~~~~~~
+
+.. option:: --read_pdb
+
+    Read a PDB file directly (not in PDBQT format) without using ProDy.
+
+.. option:: -i, --read_with_prody
+
+    Read a PDB or mmCIF file using ProDy (if installed). ProDy can be installed from PyPI or conda-forge.
+
+.. option:: -o, --output_basename
+
+    Specify a default basename for output files created by `--write` options when no filename is specified.
+
+.. option:: -p, --write_pdbqt [*]
+
+    Output PDBQT files with `_rigid` or `_flex` suffixes for flexible residues. Defaults to `--output_basename` if no filename is provided.
+
+.. option:: -j, --write_json [*]
+
+    Save the receptor's parameterized configuration to JSON format. Defaults to `--output_basename` if unspecified.
+
+.. option:: --write_pdb [*]
+
+    Save the prepared receptor in PDB format. Must specify the filename.
+
+.. option:: -g, --write_gpf [*]
+
+    Output an AutoGrid input file (GPF). Defaults to `--output_basename` if not specified.
+
+.. option:: -v, --write_vina_box [*]
+
+    Generate a configuration file for Vina with grid box dimensions. Defaults to `--output_basename` if not specified.
+
+Receptor Perception Options
+---------------------------
+
+.. option:: -n, --set_template