diff --git a/.gitignore b/.gitignore index 47d9349..564013d 100644 --- a/.gitignore +++ b/.gitignore @@ -2,14 +2,37 @@ *.bpc *.c *.dacs -*.egg-info *.log *.pptx *.prefs -*.pyc -*.pyd *.txt *.zip -.coverage .db +*.py[cod] +*.egg-info +*.eggs +.ipynb_checkpoints +*.vtk + +build +dist +.cache +__pycache__ + +htmlcov +.coverage +coverage.xml +.pytest_cache + +docs/_build +docs/apidocs +playground/ + +# ide +.idea +.eclipse +.vscode + +# Mac +.DS_Store diff --git a/README.md b/README.md index c4f7a00..fe2bbdd 100644 --- a/README.md +++ b/README.md @@ -8,7 +8,9 @@ Collection of tools for automated processing and clustering of single-crystal electron diffraction data. -Install using `pip install edtools`. +Install using `pip install edtools`. Installation should take less than 20 seconds on a normal desktop. + +See the latest [releases](https://github.com/instamatic-dev/edtools/releases) for the versions that have been tested. [The source for this project is available here][src]. @@ -32,7 +34,7 @@ Looks files matching `CORRECT.LP` in all subdirectories and extracts unit cell/i ### find_cell.py -This program a cells.yaml file and shows histogram plots with the unit cell parameters. This program mimicks `CELLPARM` (http://xds.mpimf-heidelberg.mpg.de/html_doc/cellparm_program.html) and calculates the weighted mean lattice parameters, where the weight is typically the number of observed reflections (defaults to 1.0). For each lattice parameter, the mean is calculated in a given range (default range = median+-2). The range can be changed by dragging the cursor on the histogram plots. +This program reads a cells.yaml file and shows histogram plots with the unit cell parameters. This program mimics [`CELLPARM`](http://xds.mpimf-heidelberg.mpg.de/html_doc/cellparm_program.html) and calculates the weighted mean lattice parameters, where the weight is typically the number of observed reflections (defaults to 1.0). 
For each lattice parameter, the mean is calculated in a given range (default range = median+-2). The range can be changed by dragging the cursor on the histogram plots. Alternatively, the unit cells can be clustered by giving the `--cluster` command, in which a dendrogram is shown. The cluster cutoff can be selected by clicking in the dendrogram. The clusters will be written to `cells_cluster_#.yaml`. @@ -109,13 +111,19 @@ Usage: edtools.find_rotation_axis [XDS.INP] ``` +## OS Requirements + +The package has been developed and tested mainly on Windows 10. -## Requirements +## Software Requirements -- Python3.6 including `numpy`, `scipy`, `matplotlib`, and `pandas` libraries +- Python 3.6+ including `numpy`, `scipy`, `matplotlib`, and `pandas` libraries - `sginfo` or `cctbx.python` must be available on the system path for `edtools.make_shelx` -- (Windows 10) Access to [WSL](https://en.wikipedia.org/wiki/Windows_Subsystem_for_Linux) -- (Windows 10) XDS and related tools must be available under WSL +- (Windows 10 or newer) Access to [WSL](https://en.wikipedia.org/wiki/Windows_Subsystem_for_Linux) +- (Windows 10 or newer) XDS and related tools must be available under WSL + +## Package Dependencies +See [pyproject.toml](pyproject.toml) for the full dependency list and versions. 
[src]: https://github.com/instamatic-dev/edtools diff --git a/docs/conf.py b/docs/conf.py index 8fe2e3f..3f459eb 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -54,6 +54,7 @@ def setup(app): # 'nbsphinx_link', # 'sphinx.ext.todo', # 'sphinx.ext.viewcode', + 'nbsphinx', 'autodocsumm', ] diff --git a/docs/examples/edtools_demo.ipynb b/docs/examples/edtools_demo.ipynb new file mode 100644 index 0000000..5ef52d6 --- /dev/null +++ b/docs/examples/edtools_demo.ipynb @@ -0,0 +1,995 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "716302f5", + "metadata": {}, + "source": [ + "# *edtools* Demo\n", + "\n", + "**edtools** is a Python package for automated processing of a large number of 3D electron diffraction (3D ED) datasets. It can be downloaded from https://doi.org/10.5281/zenodo.5727189. \n", + "\n", + "Running *edtools* requires the *XDS* package for reduction of 3D ED datasets, which is available at https://xds.mr.mpg.de/html_doc/downloading.html.\n", + "\n", + "A typical cycle of using *edtools* for processing batch 3D ED datasets goes through the following steps:\n", + "\n", + "- `edtools.autoindex`\n", + "- `edtools.extract_xds_info`\n", + "- `edtools.find_cell`\n", + "- `edtools.update_xds`\n", + "- `edtools.make_xscale`\n", + "- `edtools.cluster`\n", + " \n", + "Here we demonstrate the processing of batch 3D ED datasets for phase analysis and structure determination using *edtools*. The datasets for the demo can be downloaded from...\n", + " \n", + "The datasets were collected on a zeolite mixture sample using the serial rotation electron diffraction (SerialRED) data collection technique implemented in the program **Instamatic** (available at https://doi.org/10.5281/zenodo.5175957), which runs on a JEOL JEM-2100-LaB6 at 200 kV equipped with a 512 x 512 Timepix hybrid pixel detector (55 x 55 µm pixel size, QTPX-262k, Amsterdam Scientific Instruments).\n", + "\n", + "The zeolite mixture sample contains phases **IWV**, **RTH**, and ***CTH**. 
Information on these three phases can be found in the zeolite structure database (https://europe.iza-structure.org/IZA-SC/ftc_table.php).\n", + "\n", + "This demo takes around 5-10 min to run on a normal desktop computer with all required packages installed beforehand.\n" + ] + }, + { + "cell_type": "markdown", + "id": "7e6d7456", + "metadata": {}, + "source": [ + "## 1. Indexing\n", + "\n", + "Automatically index the 3D ED datasets by running *XDS* in all subfolders (SMV) that contain the file `XDS.INP`, which is automatically generated during data collection using *Instamatic*." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "41309fa9-9722-4574-9f92-9e0eb3b2ca74", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "16 files named XDS.INP (subdir: None) found.\n", + "\n", + " 0: C:\\Users\\yluo\\demo\\data\\stagepos_0067\\crystal_0001\\SMV # Tue May 3 19:58:34 2022\n", + "Spgr 5 - Cell 26.93 14.05 5.36 90.00 90.89 90.00 - Vol 2027.80\n", + "\n", + " # dmax dmin ntot nuniq compl i/sig rmeas CC(1/2) ISa B(ov)\n", + " # dmax dmin ntot nuniq compl i/sig rmeas CC(1/2) ISa B(ov)\n", + "---------------------------------------------------------------------------------\n", + " 0 4.35 0.80 583 324 15.0 4.59 13.7 98.6 7.47 6.72\n", + " - 0.85 0.80 54 42 12.5 1.96 26.8 91.2\n", + "\n", + "\n", + " 1: C:\\Users\\yluo\\demo\\data\\stagepos_0164\\crystal_0000\\SMV # Tue May 3 19:58:35 2022\n", + "Spgr 1 - Cell 9.49 9.90 12.47 66.56 89.45 86.35 - Vol 1072.59\n", + "\n", + " # dmax dmin ntot nuniq compl i/sig rmeas CC(1/2) ISa B(ov)\n", + " 1 6.39 0.80 229 209 4.8 24.01 22.3 96.2 50.00 4.74\n", + " - 0.91 0.85 31 29 4.5 12.16 21.3 0.0\n", + "\n", + "\n", + " 3: C:\\Users\\yluo\\demo\\data\\stagepos_0299\\crystal_0001\\SMV # Tue May 3 19:58:38 2022\n", + "Spgr 1 - Cell 4.83 14.83 16.03 115.66 89.61 94.16 - Vol 1031.87\n", + "\n", + " # dmax dmin ntot nuniq compl i/sig rmeas CC(1/2) ISa B(ov)\n", 
+ " 3 2.05 0.80 400 312 7.5 2.44 20.6 95.4 4.24 6.11\n", + " - 0.84 0.80 27 26 4.7 1.79 11.2 0.0\n", + "\n", + "\n", + " 4: C:\\Users\\yluo\\demo\\data\\stagepos_0325\\crystal_0000\\SMV # Tue May 3 19:58:40 2022\n", + "Spgr 5 - Cell 13.69 25.42 14.90 90.00 115.84 90.00 - Vol 4666.87\n", + "\n", + " # dmax dmin ntot nuniq compl i/sig rmeas CC(1/2) ISa B(ov)\n", + " 4 11.09 0.79 3744 2147 42.8 3.44 13.1 99.6 13.90 8.10\n", + " - 0.97 0.90 623 336 47.9 1.32 68.9 84.2\n", + "\n", + "\n", + " 5: C:\\Users\\yluo\\demo\\data\\stagepos_0341\\crystal_0000\\SMV # Tue May 3 19:58:42 2022\n", + "Spgr 5 - Cell 25.67 13.50 17.73 90.00 132.44 90.00 - Vol 4534.43\n", + "\n", + " # dmax dmin ntot nuniq compl i/sig rmeas CC(1/2) ISa B(ov)\n", + " 5 6.86 0.80 2161 1081 21.8 4.23 10.5 100.0 33.30 8.74\n", + " - 0.97 0.90 342 159 23.3 0.83 130.3 69.8\n", + "\n", + "\n", + " 6: C:\\Users\\yluo\\demo\\data\\stagepos_0368\\crystal_0001\\SMV # Tue May 3 19:58:43 2022\n", + "Spgr 1 - Cell 10.17 10.36 12.16 93.71 113.40 98.01 - Vol 1154.16\n", + "\n", + " # dmax dmin ntot nuniq compl i/sig rmeas CC(1/2) ISa B(ov)\n", + " 6 10.17 0.80 611 443 9.4 3.17 14.9 97.5 5.07 4.64\n", + " - 0.85 0.80 56 53 7.0 1.96 73.1 6.4\n", + "\n", + "\n", + " 7: C:\\Users\\yluo\\demo\\data\\stagepos_0538\\crystal_0000\\SMV # Tue May 3 19:58:45 2022\n", + "Spgr 1 - Cell 10.55 10.52 11.81 80.39 66.60 75.74 - Vol 1162.33\n", + "\n", + " # dmax dmin ntot nuniq compl i/sig rmeas CC(1/2) ISa B(ov)\n", + " 7 5.10 0.80 443 330 7.0 3.80 10.7 99.4 8.61 5.62\n", + " - 0.85 0.80 38 36 4.8 1.80 76.5 0.0\n", + "\n", + "\n", + " 8: C:\\Users\\yluo\\demo\\data\\stagepos_0648\\crystal_0001\\SMV # Tue May 3 19:58:46 2022\n", + "Spgr 1 - Cell 13.82 14.32 16.18 86.20 111.75 116.39 - Vol 2645.41\n", + "\n", + " # dmax dmin ntot nuniq compl i/sig rmeas CC(1/2) ISa B(ov)\n", + " 8 6.37 0.80 1460 989 9.1 2.88 16.3 97.4 5.24 7.62\n", + " - 0.85 0.80 166 125 7.3 1.36 62.8 71.4\n", + "\n", + "\n", + " 9: 
C:\\Users\\yluo\\demo\\data\\stagepos_0849\\crystal_0000\\SMV # Tue May 3 19:58:48 2022\n", + "Spgr 5 - Cell 15.06 26.22 15.41 90.00 118.30 90.00 - Vol 5357.50\n", + "\n", + " # dmax dmin ntot nuniq compl i/sig rmeas CC(1/2) ISa B(ov)\n", + " 9 13.11 0.79 2063 1319 22.1 3.46 10.5 99.5 12.09 7.58\n", + " - 0.89 0.83 326 223 24.5 1.01 53.6 85.6\n", + "\n", + "\n", + " 10: C:\\Users\\yluo\\demo\\data\\stagepos_0905\\crystal_0000\\SMV # Tue May 3 19:58:49 2022\n", + "Spgr 3 - Cell 13.91 5.07 14.97 90.00 117.96 90.00 - Vol 932.53\n", + "\n", + " # dmax dmin ntot nuniq compl i/sig rmeas CC(1/2) ISa B(ov)\n", + " 10 12.33 0.80 479 300 13.8 3.68 13.4 99.5 16.07 9.46\n", + " - 1.20 1.07 58 35 14.6 4.72 24.9 91.2\n", + "\n", + "\n", + " 11: C:\\Users\\yluo\\demo\\data\\stagepos_0905\\crystal_0001\\SMV # Tue May 3 19:58:51 2022\n", + "Spgr 1 - Cell 13.71 14.57 15.77 83.07 68.29 62.34 - Vol 2587.36\n", + "\n", + " # dmax dmin ntot nuniq compl i/sig rmeas CC(1/2) ISa B(ov)\n", + " 11 11.49 0.80 1596 1144 10.7 3.30 12.4 98.5 7.24 7.18\n", + " - 0.85 0.80 124 121 7.0 0.94 22.6 83.4\n", + "\n", + "\n", + " 12: C:\\Users\\yluo\\demo\\data\\stagepos_0980\\crystal_0000\\SMV # Tue May 3 19:58:53 2022\n", + "Spgr 1 - Cell 14.56 15.00 15.27 97.22 105.97 120.36 - Vol 2621.77\n", + "\n", + " # dmax dmin ntot nuniq compl i/sig rmeas CC(1/2) ISa B(ov)\n", + " 12 7.54 0.80 1746 1222 11.3 4.00 13.3 98.5 8.77 5.85\n", + " - 0.85 0.80 164 146 8.4 1.48 36.5 88.0\n", + "\n", + "\n", + " 13: C:\\Users\\yluo\\demo\\data\\stagepos_1014\\crystal_0000\\SMV # Tue May 3 19:58:54 2022\n", + "Spgr 1 - Cell 5.30 14.56 15.04 112.06 93.44 86.65 - Vol 1072.87\n", + "\n", + " # dmax dmin ntot nuniq compl i/sig rmeas CC(1/2) ISa B(ov)\n", + " 13 5.01 0.81 447 328 7.5 4.11 10.9 98.5 6.65 6.67\n", + " - 0.85 0.80 51 44 6.3 2.11 18.7 92.7\n", + "\n", + "\n", + " 15: C:\\Users\\yluo\\demo\\data\\stagepos_1283\\crystal_0001\\SMV # Tue May 3 19:58:57 2022\n", + "Spgr 1 - Cell 13.64 15.02 25.09 93.07 91.13 114.33 - 
Vol 4672.25\n", + "\n", + " # dmax dmin ntot nuniq compl i/sig rmeas CC(1/2) ISa B(ov)\n", + " 15 6.60 0.80 3124 2149 11.3 3.54 8.4 99.5 12.64 6.94\n", + " - 0.85 0.80 346 280 9.2 1.24 56.2 85.0\n", + "\n" + ] + } + ], + "source": [ + "!edtools.autoindex" + ] + }, + { + "cell_type": "markdown", + "id": "3fa96297", + "metadata": {}, + "source": [ + "## 2. Extract cell\n", + "\n", + "Extract the determined unit cell parameters from the output files (`CORRECT.LP`) of *XDS* " + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "c4870424", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "14 files named CORRECT.LP (subdir: None) found.\n", + " 1: C:\\Users\\yluo\\demo\\data\\stagepos_0067\\crystal_0001\\SMV # Tue May 3 19:58:34 2022\n", + "Spgr 5 - Cell 26.93 14.05 5.36 90.00 90.89 90.00 - Vol 2027.80\n", + "\n", + " 2: C:\\Users\\yluo\\demo\\data\\stagepos_0164\\crystal_0000\\SMV # Tue May 3 19:58:35 2022\n", + "Spgr 1 - Cell 9.49 9.90 12.47 66.56 89.45 86.35 - Vol 1072.59\n", + "\n", + " 3: C:\\Users\\yluo\\demo\\data\\stagepos_0299\\crystal_0001\\SMV # Tue May 3 19:58:38 2022\n", + "Spgr 1 - Cell 4.83 14.83 16.03 115.66 89.61 94.16 - Vol 1031.87\n", + "\n", + " 4: C:\\Users\\yluo\\demo\\data\\stagepos_0325\\crystal_0000\\SMV # Tue May 3 19:58:40 2022\n", + "Spgr 5 - Cell 13.69 25.42 14.90 90.00 115.84 90.00 - Vol 4666.87\n", + "\n", + " 5: C:\\Users\\yluo\\demo\\data\\stagepos_0341\\crystal_0000\\SMV # Tue May 3 19:58:42 2022\n", + "Spgr 5 - Cell 25.67 13.50 17.73 90.00 132.44 90.00 - Vol 4534.43\n", + "\n", + " 6: C:\\Users\\yluo\\demo\\data\\stagepos_0368\\crystal_0001\\SMV # Tue May 3 19:58:43 2022\n", + "Spgr 1 - Cell 10.17 10.36 12.16 93.71 113.40 98.01 - Vol 1154.16\n", + "\n", + " 7: C:\\Users\\yluo\\demo\\data\\stagepos_0538\\crystal_0000\\SMV # Tue May 3 19:58:45 2022\n", + "Spgr 1 - Cell 10.55 10.52 11.81 80.39 66.60 75.74 - Vol 1162.33\n", + "\n", + " 8: 
C:\\Users\\yluo\\demo\\data\\stagepos_0648\\crystal_0001\\SMV # Tue May 3 19:58:46 2022\n", + "Spgr 1 - Cell 13.82 14.32 16.18 86.20 111.75 116.39 - Vol 2645.41\n", + "\n", + " 9: C:\\Users\\yluo\\demo\\data\\stagepos_0849\\crystal_0000\\SMV # Tue May 3 19:58:48 2022\n", + "Spgr 5 - Cell 15.06 26.22 15.41 90.00 118.30 90.00 - Vol 5357.50\n", + "\n", + " 10: C:\\Users\\yluo\\demo\\data\\stagepos_0905\\crystal_0000\\SMV # Tue May 3 19:58:49 2022\n", + "Spgr 3 - Cell 13.91 5.07 14.97 90.00 117.96 90.00 - Vol 932.53\n", + "\n", + " 11: C:\\Users\\yluo\\demo\\data\\stagepos_0905\\crystal_0001\\SMV # Tue May 3 19:58:51 2022\n", + "Spgr 1 - Cell 13.71 14.57 15.77 83.07 68.29 62.34 - Vol 2587.36\n", + "\n", + " 12: C:\\Users\\yluo\\demo\\data\\stagepos_0980\\crystal_0000\\SMV # Tue May 3 19:58:53 2022\n", + "Spgr 1 - Cell 14.56 15.00 15.27 97.22 105.97 120.36 - Vol 2621.77\n", + "\n", + " 13: C:\\Users\\yluo\\demo\\data\\stagepos_1014\\crystal_0000\\SMV # Tue May 3 19:58:54 2022\n", + "Spgr 1 - Cell 5.30 14.56 15.04 112.06 93.44 86.65 - Vol 1072.87\n", + "\n", + " 14: C:\\Users\\yluo\\demo\\data\\stagepos_1283\\crystal_0001\\SMV # Tue May 3 19:58:57 2022\n", + "Spgr 1 - Cell 13.64 15.02 25.09 93.07 91.13 114.33 - Vol 4672.25\n", + "\n", + " # dmax dmin ntot nuniq compl i/sig rmeas CC(1/2) ISa B(ov)\n", + "---------------------------------------------------------------------------------\n", + "\n", + " 1 4.35 0.80 583 324 15.0 4.59 13.7 98.6 7.47 6.72 # C:\\Users\\yluo\\demo\\data\\stagepos_0067\\crystal_0001\\SMV\\CORRECT.LP\n", + " - 0.85 0.80 54 42 12.5 1.96 26.8 91.2\n", + "\n", + " 2 6.39 0.80 229 209 4.8 24.01 22.3 96.2 50.00 4.74 # C:\\Users\\yluo\\demo\\data\\stagepos_0164\\crystal_0000\\SMV\\CORRECT.LP\n", + " - 0.91 0.85 31 29 4.5 12.16 21.3 0.0\n", + "\n", + " 3 2.05 0.80 400 312 7.5 2.44 20.6 95.4 4.24 6.11 # C:\\Users\\yluo\\demo\\data\\stagepos_0299\\crystal_0001\\SMV\\CORRECT.LP\n", + " - 0.84 0.80 27 26 4.7 1.79 11.2 0.0\n", + "\n", + " 4 11.09 0.79 3744 
2147 42.8 3.44 13.1 99.6 13.90 8.10 # C:\\Users\\yluo\\demo\\data\\stagepos_0325\\crystal_0000\\SMV\\CORRECT.LP\n", + " - 0.97 0.90 623 336 47.9 1.32 68.9 84.2\n", + "\n", + " 5 6.86 0.80 2161 1081 21.8 4.23 10.5 100.0 33.30 8.74 # C:\\Users\\yluo\\demo\\data\\stagepos_0341\\crystal_0000\\SMV\\CORRECT.LP\n", + " - 0.97 0.90 342 159 23.3 0.83 130.3 69.8\n", + "\n", + " 6 10.17 0.80 611 443 9.4 3.17 14.9 97.5 5.07 4.64 # C:\\Users\\yluo\\demo\\data\\stagepos_0368\\crystal_0001\\SMV\\CORRECT.LP\n", + " - 0.85 0.80 56 53 7.0 1.96 73.1 6.4\n", + "\n", + " 7 5.10 0.80 443 330 7.0 3.80 10.7 99.4 8.61 5.62 # C:\\Users\\yluo\\demo\\data\\stagepos_0538\\crystal_0000\\SMV\\CORRECT.LP\n", + " - 0.85 0.80 38 36 4.8 1.80 76.5 0.0\n", + "\n", + " 8 6.37 0.80 1460 989 9.1 2.88 16.3 97.4 5.24 7.62 # C:\\Users\\yluo\\demo\\data\\stagepos_0648\\crystal_0001\\SMV\\CORRECT.LP\n", + " - 0.85 0.80 166 125 7.3 1.36 62.8 71.4\n", + "\n", + " 9 13.11 0.79 2063 1319 22.1 3.46 10.5 99.5 12.09 7.58 # C:\\Users\\yluo\\demo\\data\\stagepos_0849\\crystal_0000\\SMV\\CORRECT.LP\n", + " - 0.89 0.83 326 223 24.5 1.01 53.6 85.6\n", + "\n", + " 10 12.33 0.80 479 300 13.8 3.68 13.4 99.5 16.07 9.46 # C:\\Users\\yluo\\demo\\data\\stagepos_0905\\crystal_0000\\SMV\\CORRECT.LP\n", + " - 1.20 1.07 58 35 14.6 4.72 24.9 91.2\n", + "\n", + " 11 11.49 0.80 1596 1144 10.7 3.30 12.4 98.5 7.24 7.18 # C:\\Users\\yluo\\demo\\data\\stagepos_0905\\crystal_0001\\SMV\\CORRECT.LP\n", + " - 0.85 0.80 124 121 7.0 0.94 22.6 83.4\n", + "\n", + " 12 7.54 0.80 1746 1222 11.3 4.00 13.3 98.5 8.77 5.85 # C:\\Users\\yluo\\demo\\data\\stagepos_0980\\crystal_0000\\SMV\\CORRECT.LP\n", + " - 0.85 0.80 164 146 8.4 1.48 36.5 88.0\n", + "\n", + " 13 5.01 0.81 447 328 7.5 4.11 10.9 98.5 6.65 6.67 # C:\\Users\\yluo\\demo\\data\\stagepos_1014\\crystal_0000\\SMV\\CORRECT.LP\n", + " - 0.85 0.80 51 44 6.3 2.11 18.7 92.7\n", + "\n", + " 14 6.60 0.80 3124 2149 11.3 3.54 8.4 99.5 12.64 6.94 # 
C:\\Users\\yluo\\demo\\data\\stagepos_1283\\crystal_0001\\SMV\\CORRECT.LP\n", + " - 0.85 0.80 346 280 9.2 1.24 56.2 85.0\n", + "\n", + "Wrote 14 cells to file cells.xlsx\n", + "Wrote 14 cells to file cells.yaml\n", + "Wrote 8 entries to file filelist.txt (completeness > 10.0%, CC(1/2) > 90.0%)\n", + "\n", + "Most likely lattice types:\n", + " 1 Lattice type `aP` (spgr: 1) was found 9 times (score: 10056)\n", + " 2 Lattice type `mC` (spgr: 5) was found 4 times (score: 8551)\n", + " 3 Lattice type `mP` (spgr: 3) was found 1 times (score: 479)\n", + "\n", + " ** the score corresponds to the total number of indexed reflections.\n" + ] + } + ], + "source": [ + "!edtools.extract_xds_info" + ] + }, + { + "cell_type": "markdown", + "id": "2738796d", + "metadata": {}, + "source": [ + "## 3. Unit-cell-based clustering for phase analysis" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5c5caa88", + "metadata": {}, + "outputs": [], + "source": [ + "!edtools.find_cell cells.yaml -s --cluster --metric lcv" + ] + }, + { + "attachments": { + "find_cell_step3.png": { + "image/png": "" + } + }, + "cell_type": "markdown", + "id": "b7020b02", + "metadata": {}, + "source": [ + "![find_cell_step3.png](attachment:find_cell_step3.png)" + ] + }, + { + "cell_type": "markdown", + "id": "2e15e134-bc0d-4bba-a0bb-49f385e26180", + "metadata": {}, + "source": [ + "Console Output\n", + "\n", + "```\n", + "Linkage method = average\n", + "Cutoff distance = 0.1735\n", + "Distance metric = lcv\n", + "\n", + "----------------------------------------\n", + "\n", + "Cluster #1 (6 items)\n", + " 4 [ 14.04 14.39 14.72 76.68 62.79 61.86] Vol.: 2331.3\n", + " 5 [ 13.50 14.38 14.63 75.73 64.60 63.07] Vol.: 2283.0\n", + " 8 [ 13.89 14.29 17.00 72.43 63.61 63.57] Vol.: 2684.8\n", + " 9 [ 14.81 15.07 15.52 62.45 74.78 62.16] Vol.: 2711.1\n", + " 11 [ 13.73 14.56 16.03 84.26 68.05 62.57] Vol.: 2629.5\n", + " 12 [ 14.43 14.90 15.40 81.24 74.01 61.15] Vol.: 2787.8\n", + " ---\n", + "Mean: [ 
14.07 14.60 15.55 75.46 67.97 62.40] Vol.: 2571.3\n", + " Min: [ 13.50 14.29 14.63 62.45 62.79 61.15] Vol.: 2283.0\n", + " Max: [ 14.81 15.07 17.00 84.26 74.78 63.57] Vol.: 2787.8\n", + "\n", + "Cluster #3 (4 items)\n", + " 1 [ 5.47 14.07 15.30 63.22 87.59 88.58] Vol.: 1050.9\n", + " 3 [ 5.33 14.99 16.06 64.44 89.16 82.51] Vol.: 1144.9\n", + " 10 [ 5.05 14.37 14.53 62.13 88.52 89.11] Vol.: 932.0\n", + " 13 [ 5.30 14.89 15.18 66.79 86.51 86.59] Vol.: 1098.1\n", + " ---\n", + "Mean: [ 5.29 14.58 15.27 64.15 87.95 86.70] Vol.: 1056.5\n", + " Min: [ 5.05 14.07 14.53 62.13 86.51 82.51] Vol.: 932.0\n", + " Max: [ 5.47 14.99 16.06 66.79 89.16 89.11] Vol.: 1144.9\n", + "\n", + "Cluster #4 (3 items)\n", + " 2 [ 9.52 9.98 12.85 65.60 87.80 85.43] Vol.: 1107.8\n", + " 6 [ 10.21 10.36 12.08 85.86 67.02 81.83] Vol.: 1165.3\n", + " 7 [ 10.55 10.75 11.75 80.34 66.42 75.73] Vol.: 1179.4\n", + " ---\n", + "Mean: [ 10.09 10.36 12.23 77.27 73.75 81.00] Vol.: 1150.9\n", + " Min: [ 9.52 9.98 11.75 65.60 66.42 75.73] Vol.: 1107.8\n", + " Max: [ 10.55 10.75 12.85 85.86 87.80 85.43] Vol.: 1179.4\n", + "\n", + "Wrote cluster 3 to file `cells_cluster_3_4-items.yaml`\n", + "Wrote cluster 4 to file `cells_cluster_4_3-items.yaml`\n", + "Wrote cluster 1 to file `cells_cluster_1_6-items.yaml`\n", + "```\n" + ] + }, + { + "cell_type": "markdown", + "id": "5bbf31db-6b98-47ba-8f00-f549c7603da5", + "metadata": {}, + "source": [ + "The three resulting clusters (1, 3, and 4) correspond to phases **IWV**, ***CTH**, and **RTH**, respectively.\n", + "\n", + "With the averaged primitive unit cell parameters of each cluster, one can use the online tool http://cci.lbl.gov/cctbx/lattice_symmetry.html to find a unit cell with higher symmetry within a preset tolerance.\n", + "\n", + "We take cluster 1 (phase **IWV**) as an example. 
The averaged unit cell parameters are:\n", + "14.07, 14.6, 15.55, 75.46, 67.97, 62.4\n", + " \n", + "The unit cell parameters with higher symmetry (space group: *Fmmm* (69)) are:\n", + "14.07, 25.8828, 28.9294, 90, 90, 90\n", + "\n", + "The same operation can be done for all the other clusters." + ] + }, + { + "cell_type": "markdown", + "id": "c59a8d2a", + "metadata": {}, + "source": [ + "## 4. Update the *XDS.INP* files\n", + "\n", + "This step uses `edtools.update_xds` to update the XDS input files with the determined unit cell parameters and space group." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "2b5b8420", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "16 files named XDS.INP (subdir: None) found.\n", + "\u001b[K C:\\Users\\yluo\\demo\\data\\stagepos_0067\\crystal_0001\\SMV\\XDS.INP\n", + "\u001b[K C:\\Users\\yluo\\demo\\data\\stagepos_0164\\crystal_0000\\SMV\\XDS.INP\n", + "\u001b[K C:\\Users\\yluo\\demo\\data\\stagepos_0290\\crystal_0002\\SMV\\XDS.INP\n", + "\u001b[K C:\\Users\\yluo\\demo\\data\\stagepos_0299\\crystal_0001\\SMV\\XDS.INP\n", + "\u001b[K C:\\Users\\yluo\\demo\\data\\stagepos_0325\\crystal_0000\\SMV\\XDS.INP\n", + "\u001b[K C:\\Users\\yluo\\demo\\data\\stagepos_0341\\crystal_0000\\SMV\\XDS.INP\n", + "\u001b[K C:\\Users\\yluo\\demo\\data\\stagepos_0368\\crystal_0001\\SMV\\XDS.INP\n", + "\u001b[K C:\\Users\\yluo\\demo\\data\\stagepos_0538\\crystal_0000\\SMV\\XDS.INP\n", + "\u001b[K C:\\Users\\yluo\\demo\\data\\stagepos_0648\\crystal_0001\\SMV\\XDS.INP\n", + "\u001b[K C:\\Users\\yluo\\demo\\data\\stagepos_0849\\crystal_0000\\SMV\\XDS.INP\n", + "\u001b[K C:\\Users\\yluo\\demo\\data\\stagepos_0905\\crystal_0000\\SMV\\XDS.INP\n", + "\u001b[K C:\\Users\\yluo\\demo\\data\\stagepos_0905\\crystal_0001\\SMV\\XDS.INP\n", + "\u001b[K C:\\Users\\yluo\\demo\\data\\stagepos_0980\\crystal_0000\\SMV\\XDS.INP\n", + "\u001b[K 
C:\\Users\\yluo\\demo\\data\\stagepos_1014\\crystal_0000\\SMV\\XDS.INP\n", + "\u001b[K C:\\Users\\yluo\\demo\\data\\stagepos_1261\\crystal_0001\\SMV\\XDS.INP\n", + "\u001b[K C:\\Users\\yluo\\demo\\data\\stagepos_1283\\crystal_0001\\SMV\\XDS.INP\n", + "\u001b[KUpdated 16 files\n" + ] + } + ], + "source": [ + "!edtools.update_xds -c 14.07 25.8828 28.9294 90 90 90 -s 69" + ] + }, + { + "cell_type": "markdown", + "id": "26c3ce5c", + "metadata": {}, + "source": [ + "## 5. Refine phases\n", + "\n", + "Rerun **autoindex**, **extract_xds_info** and **find_cell** so that the desired phase is indexed by *XDS* with the updated unit cell. Other phases are usually excluded, because a phase with a sufficiently different unit cell will not be indexed successfully. However, different phases with similar unit cells cannot be told apart at this step." + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "e49065f1", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "16 files named XDS.INP (subdir: None) found.\n", + "\n", + " 4: C:\\Users\\yluo\\demo\\data\\stagepos_0325\\crystal_0000\\SMV # Tue May 3 20:16:50 2022\n", + "Spgr 69 - Cell 13.88 25.44 27.26 90.00 90.00 90.00 - Vol 9625.70\n", + "\n", + " # dmax dmin ntot nuniq compl i/sig rmeas CC(1/2) ISa B(ov)\n", + " 4 9.30 0.80 3938 1852 69.0 3.24 20.2 99.3 11.45 8.21\n", + " - 0.91 0.85 614 290 74.4 0.86 109.5 80.9\n", + "\n", + "\n", + " 5: C:\\Users\\yluo\\demo\\data\\stagepos_0341\\crystal_0000\\SMV # Tue May 3 20:16:52 2022\n", + "Spgr 69 - Cell 13.52 24.94 27.07 90.00 90.00 90.00 - Vol 9127.70\n", + "\n", + " # dmax dmin ntot nuniq compl i/sig rmeas CC(1/2) ISa B(ov)\n", + " 5 10.88 0.80 2203 1029 40.4 3.84 11.7 99.9 27.38 9.78\n", + " - 1.07 0.98 299 135 41.8 1.04 107.2 75.6\n", + "\n", + "\n", + " 8: C:\\Users\\yluo\\demo\\data\\stagepos_0648\\crystal_0001\\SMV # Tue May 3 20:16:57 2022\n", + "Spgr 69 - Cell 14.01 25.97 29.04 90.00 90.00 90.00 - 
Vol 10565.90\n", + "\n", + " # dmax dmin ntot nuniq compl i/sig rmeas CC(1/2) ISa B(ov)\n", + " 8 7.14 0.80 1466 781 26.2 2.61 18.2 97.1 4.73 7.15\n", + " - 0.84 0.80 142 92 19.9 0.98 62.3 52.0\n", + "\n", + "\n", + " 9: C:\\Users\\yluo\\demo\\data\\stagepos_0849\\crystal_0000\\SMV # Tue May 3 20:16:59 2022\n", + "Spgr 69 - Cell 15.10 26.02 26.72 90.00 90.00 90.00 - Vol 10498.34\n", + "\n", + " # dmax dmin ntot nuniq compl i/sig rmeas CC(1/2) ISa B(ov)\n", + " 9 7.24 0.80 1994 1126 38.5 3.27 11.9 99.5 12.91 8.08\n", + " - 0.98 0.90 322 166 41.2 1.27 70.2 90.6\n", + "\n", + " 10: C:\\Users\\yluo\\demo\\data\\stagepos_0905\\crystal_0000\\SMV -> Error in IDXREF: RETURN CODE IS IER= 0\n", + "\n", + " 11: C:\\Users\\yluo\\demo\\data\\stagepos_0905\\crystal_0001\\SMV # Tue May 3 20:17:03 2022\n", + "Spgr 69 - Cell 13.83 25.80 28.73 90.00 90.00 90.00 - Vol 10251.27\n", + "\n", + " # dmax dmin ntot nuniq compl i/sig rmeas CC(1/2) ISa B(ov)\n", + " 11 7.08 0.80 1591 808 28.2 2.88 17.1 97.8 6.24 7.63\n", + " - 0.90 0.85 254 128 30.4 1.17 42.6 93.7\n", + "\n", + "\n", + " 12: C:\\Users\\yluo\\demo\\data\\stagepos_0980\\crystal_0000\\SMV # Tue May 3 20:17:05 2022\n", + "Spgr 69 - Cell 14.39 25.16 28.10 90.00 90.00 90.00 - Vol 10173.67\n", + "\n", + " # dmax dmin ntot nuniq compl i/sig rmeas CC(1/2) ISa B(ov)\n", + " 12 5.12 0.80 1669 851 30.2 3.75 16.8 97.9 6.26 5.76\n", + " - 0.85 0.80 153 109 25.3 1.34 46.1 75.6\n", + "\n", + "\n", + " 15: C:\\Users\\yluo\\demo\\data\\stagepos_1283\\crystal_0001\\SMV # Tue May 3 20:17:10 2022\n", + "Spgr 69 - Cell 13.54 25.23 27.30 90.00 90.00 90.00 - Vol 9326.07\n", + "\n", + " # dmax dmin ntot nuniq compl i/sig rmeas CC(1/2) ISa B(ov)\n", + " 15 5.97 0.80 1620 563 21.7 6.15 8.4 99.8 11.79 7.17\n", + " - 0.85 0.80 187 78 19.1 2.24 45.1 97.9\n", + "\n" + ] + } + ], + "source": [ + "!edtools.autoindex" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "1347b809", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + 
"output_type": "stream", + "text": [ + "7 files named CORRECT.LP (subdir: None) found.\n", + " 1: C:\\Users\\yluo\\demo\\data\\stagepos_0325\\crystal_0000\\SMV # Tue May 3 20:16:50 2022\n", + "Spgr 69 - Cell 13.88 25.44 27.26 90.00 90.00 90.00 - Vol 9625.70\n", + "\n", + " 2: C:\\Users\\yluo\\demo\\data\\stagepos_0341\\crystal_0000\\SMV # Tue May 3 20:16:52 2022\n", + "Spgr 69 - Cell 13.52 24.94 27.07 90.00 90.00 90.00 - Vol 9127.70\n", + "\n", + " 3: C:\\Users\\yluo\\demo\\data\\stagepos_0648\\crystal_0001\\SMV # Tue May 3 20:16:57 2022\n", + "Spgr 69 - Cell 14.01 25.97 29.04 90.00 90.00 90.00 - Vol 10565.90\n", + "\n", + " 4: C:\\Users\\yluo\\demo\\data\\stagepos_0849\\crystal_0000\\SMV # Tue May 3 20:16:59 2022\n", + "Spgr 69 - Cell 15.10 26.02 26.72 90.00 90.00 90.00 - Vol 10498.34\n", + "\n", + " 5: C:\\Users\\yluo\\demo\\data\\stagepos_0905\\crystal_0001\\SMV # Tue May 3 20:17:03 2022\n", + "Spgr 69 - Cell 13.83 25.80 28.73 90.00 90.00 90.00 - Vol 10251.27\n", + "\n", + " 6: C:\\Users\\yluo\\demo\\data\\stagepos_0980\\crystal_0000\\SMV # Tue May 3 20:17:05 2022\n", + "Spgr 69 - Cell 14.39 25.16 28.10 90.00 90.00 90.00 - Vol 10173.67\n", + "\n", + " 7: C:\\Users\\yluo\\demo\\data\\stagepos_1283\\crystal_0001\\SMV # Tue May 3 20:17:10 2022\n", + "Spgr 69 - Cell 13.54 25.23 27.30 90.00 90.00 90.00 - Vol 9326.07\n", + "\n", + " # dmax dmin ntot nuniq compl i/sig rmeas CC(1/2) ISa B(ov)\n", + "---------------------------------------------------------------------------------\n", + "\n", + " 1 9.30 0.80 3938 1852 69.0 3.24 20.2 99.3 11.45 8.21 # C:\\Users\\yluo\\demo\\data\\stagepos_0325\\crystal_0000\\SMV\\CORRECT.LP\n", + " - 0.91 0.85 614 290 74.4 0.86 109.5 80.9\n", + "\n", + " 2 10.88 0.80 2203 1029 40.4 3.84 11.7 99.9 27.38 9.78 # C:\\Users\\yluo\\demo\\data\\stagepos_0341\\crystal_0000\\SMV\\CORRECT.LP\n", + " - 1.07 0.98 299 135 41.8 1.04 107.2 75.6\n", + "\n", + " 3 7.14 0.80 1466 781 26.2 2.61 18.2 97.1 4.73 7.15 # 
C:\\Users\\yluo\\demo\\data\\stagepos_0648\\crystal_0001\\SMV\\CORRECT.LP\n", + " - 0.84 0.80 142 92 19.9 0.98 62.3 52.0\n", + "\n", + " 4 7.24 0.80 1994 1126 38.5 3.27 11.9 99.5 12.91 8.08 # C:\\Users\\yluo\\demo\\data\\stagepos_0849\\crystal_0000\\SMV\\CORRECT.LP\n", + " - 0.98 0.90 322 166 41.2 1.27 70.2 90.6\n", + "\n", + " 5 7.08 0.80 1591 808 28.2 2.88 17.1 97.8 6.24 7.63 # C:\\Users\\yluo\\demo\\data\\stagepos_0905\\crystal_0001\\SMV\\CORRECT.LP\n", + " - 0.90 0.85 254 128 30.4 1.17 42.6 93.7\n", + "\n", + " 6 5.12 0.80 1669 851 30.2 3.75 16.8 97.9 6.26 5.76 # C:\\Users\\yluo\\demo\\data\\stagepos_0980\\crystal_0000\\SMV\\CORRECT.LP\n", + " - 0.85 0.80 153 109 25.3 1.34 46.1 75.6\n", + "\n", + " 7 5.97 0.80 1620 563 21.7 6.15 8.4 99.8 11.79 7.17 # C:\\Users\\yluo\\demo\\data\\stagepos_1283\\crystal_0001\\SMV\\CORRECT.LP\n", + " - 0.85 0.80 187 78 19.1 2.24 45.1 97.9\n", + "\n", + "Wrote 7 cells to file cells.xlsx\n", + "Wrote 7 cells to file cells.yaml\n", + "Wrote 7 entries to file filelist.txt (completeness > 10.0%, CC(1/2) > 90.0%)\n", + "\n", + "Most likely lattice types:\n", + " 1 Lattice type `oF` (spgr: 22) was found 7 times (score: 14481)\n", + "\n", + " ** the score corresponds to the total number of indexed reflections.\n" + ] + } + ], + "source": [ + "!edtools.extract_xds_info" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ffba629d", + "metadata": {}, + "outputs": [], + "source": [ + "!edtools.find_cell cells.yaml --cluster --metric lcv" + ] + }, + { + "attachments": { + "find_cell_step5.png": { + "image/png": "" + } + }, + "cell_type": "markdown", + "id": "551ac6d9", + "metadata": {}, + "source": [ + "![find_cell_step5.png](attachment:find_cell_step5.png)" + ] + }, + { + "cell_type": "markdown", + "id": "a6d7a310-e206-480b-b729-8357742d09c0", + "metadata": {}, + "source": [ + "Console Output\n", + "\n", + "```\n", + "Linkage method = average\n", + "Cutoff distance = 0.0564\n", + "Distance metric = lcv\n", + "\n", + 
"----------------------------------------\n", + "\n", + "Cluster #1 (7 items)\n", + " 1 [ 13.97 25.49 27.12 90.00 90.00 90.00] Vol.: 9657.9\n", + " 2 [ 13.53 25.01 27.18 90.00 90.00 90.00] Vol.: 9195.6\n", + " 3 [ 14.03 26.02 29.55 90.00 90.00 90.00] Vol.: 10790.3\n", + " 4 [ 14.94 26.14 26.94 90.00 90.00 90.00] Vol.: 10522.3\n", + " 5 [ 13.85 25.79 29.03 90.00 90.00 90.00] Vol.: 10364.0\n", + " 6 [ 14.52 24.95 28.11 90.00 90.00 90.00] Vol.: 10184.6\n", + " 7 [ 13.53 25.13 27.15 90.00 90.00 90.00] Vol.: 9233.7\n", + " ---\n", + "Mean: [ 14.05 25.50 27.87 90.00 90.00 90.00] Vol.: 9992.6\n", + " Min: [ 13.53 24.95 26.94 90.00 90.00 90.00] Vol.: 9195.6\n", + " Max: [ 14.94 26.14 29.55 90.00 90.00 90.00] Vol.: 10790.3\n", + "\n", + "Wrote cluster 1 to file `cells_cluster_1_7-items.yaml`\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "aefbcba6", + "metadata": {}, + "source": [ + "## 6. Generate the input file for *XSCALE* \n", + "\n", + "This command generates the desired unit cell cluster for *XSCALE*." + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "ec35604a", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Loaded 7 cells\n", + "Lowest possible symmetry for 69 (oF): 22\n", + "\n", + "Using:\n", + " SPACE_GROUP_NUMBER= 69\n", + " UNIT_CELL_CONSTANTS= 14.050 25.500 27.870 90.000 90.000 90.000\n", + "\n", + "Wrote file XSCALE.INP\n", + "Wrote file XDSCONV.INP\n" + ] + } + ], + "source": [ + "!edtools.make_xscale cells_cluster_1_7-items.yaml -c 14.05 25.50 27.87 90.00 90.00 90.00 -s 69" + ] + }, + { + "cell_type": "markdown", + "id": "dc563af6", + "metadata": {}, + "source": [ + "## 7. Run *XSCALE* \n", + "\n", + "*XSCALE* calculates the correlation coefficients between different datasets." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 11, + "id": "ad7be0a7", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + " ***** XSCALE ***** (VERSION Jan 10, 2022 BUILT=20220220) 3-May-2022\n", + " Author: Wolfgang Kabsch\n", + " Copy licensed until 31-Mar-2023 to\n", + " academic users for non-commercial applications\n", + " No redistribution.\n", + "\n", + "\n", + " ******************************************************************************\n", + " CONTROL CARDS\n", + " ******************************************************************************\n", + "\n", + " SAVE_CORRECTION_IMAGES= FALSE \n", + " SPACE_GROUP_NUMBER= 69 \n", + " UNIT_CELL_CONSTANTS= 14.050 25.500 27.870 90.000 90.000 90.000 \n", + " \n", + " OUTPUT_FILE= MERGED.HKL \n", + " \n", + " INPUT_FILE= data/stagepos_0325/crystal_0000/SMV/XDS_ASCII.HKL \n", + " INCLUDE_RESOLUTION_RANGE= 20 0.8 \n", + " \n", + " INPUT_FILE= data/stagepos_0341/crystal_0000/SMV/XDS_ASCII.HKL \n", + " INCLUDE_RESOLUTION_RANGE= 20 0.8 \n", + " \n", + " INPUT_FILE= data/stagepos_0648/crystal_0001/SMV/XDS_ASCII.HKL \n", + " INCLUDE_RESOLUTION_RANGE= 20 0.8 \n", + " \n", + " INPUT_FILE= data/stagepos_0849/crystal_0000/SMV/XDS_ASCII.HKL \n", + " INCLUDE_RESOLUTION_RANGE= 20 0.8 \n", + " \n", + " INPUT_FILE= data/stagepos_0905/crystal_0001/SMV/XDS_ASCII.HKL \n", + " INCLUDE_RESOLUTION_RANGE= 20 0.8 \n", + " \n", + " INPUT_FILE= data/stagepos_0980/crystal_0000/SMV/XDS_ASCII.HKL \n", + " INCLUDE_RESOLUTION_RANGE= 20 0.8 \n", + " \n", + " INPUT_FILE= data/stagepos_1283/crystal_0001/SMV/XDS_ASCII.HKL \n", + " INCLUDE_RESOLUTION_RANGE= 20 0.8 \n", + " \n", + "\n", + " THE DATA COLLECTION STATISTICS REPORTED BELOW ASSUMES:\n", + " SPACE_GROUP_NUMBER= 69\n", + " UNIT_CELL_CONSTANTS= 14.05 25.50 27.87 90.000 90.000 90.000\n", + "\n", + " \n", + "\n", + " ALL DATA SETS WILL BE SCALED TO data/stagepos_0325/crystal_0000/SMV/XDS_ASCII.HKL \n", + "\n", + "\n", + " 
******************************************************************************\n", + " READING INPUT REFLECTION DATA FILES\n", + " ******************************************************************************\n", + "\n", + " DATA MEAN REFLECTIONS INPUT FILE NAME\n", + " SET# INTENSITY ACCEPTED REJECTED\n", + " 1 0.3010E+02 3938 0 data/stagepos_0325/crystal_0000/SMV/XDS_ASCII.HKL\n", + " 2 0.1368E+02 2205 0 data/stagepos_0341/crystal_0000/SMV/XDS_ASCII.HKL\n", + " 3 0.9168E+02 1453 0 data/stagepos_0648/crystal_0001/SMV/XDS_ASCII.HKL\n", + " 4 0.4279E+02 1931 0 data/stagepos_0849/crystal_0000/SMV/XDS_ASCII.HKL\n", + " 5 0.8542E+02 1590 0 data/stagepos_0905/crystal_0001/SMV/XDS_ASCII.HKL\n", + " 6 0.1676E+03 1662 0 data/stagepos_0980/crystal_0000/SMV/XDS_ASCII.HKL\n", + " 7 0.1915E+03 1620 0 data/stagepos_1283/crystal_0001/SMV/XDS_ASCII.HKL\n", + "\n", + "\n", + " ******************************************************************************\n", + " OVERALL SCALING AND CRYSTAL DISORDER CORRECTION\n", + " ******************************************************************************\n", + "\n", + " CORRELATIONS BETWEEN INPUT DATA SETS AFTER CORRECTIONS\n", + "\n", + " DATA SETS NUMBER OF COMMON CORRELATION RATIO OF COMMON B-FACTOR\n", + " #i #j REFLECTIONS BETWEEN i,j INTENSITIES (i/j) BETWEEN i,j\n", + "\n", + " 1 2 365 0.983 3.0178 0.0167\n", + " 1 3 239 0.980 0.5288 -0.4884\n", + " 2 3 426 0.941 0.1906 -0.5003\n", + " 1 4 548 0.906 1.1005 -0.6223\n", + " 2 4 343 0.972 0.3024 -0.3512\n", + " 3 4 359 0.900 1.6429 0.0764\n", + " 1 5 194 0.959 0.3817 0.1801\n", + " 2 5 412 0.966 0.1782 -0.2899\n", + " 3 5 496 0.985 0.9410 0.1110\n", + " 4 5 257 0.931 0.6070 0.0399\n", + " 1 6 533 0.941 0.1974 -0.6063\n", + " 2 6 219 0.896 0.1000 -0.8155\n", + " 3 6 201 0.939 0.6006 -0.2345\n", + " 4 6 247 0.844 0.2524 -0.1710\n", + " 5 6 168 0.878 0.5172 -0.1561\n", + " 1 7 65 0.968 0.4836 -1.1409\n", + " 2 7 317 0.978 0.1070 -0.9383\n", + " 3 7 348 0.984 0.5482 -0.3209\n", + " 4 7 122 
0.771 0.3567 -0.6073\n", + " 5 7 376 0.988 0.5696 -0.4315\n", + " 6 7 122 0.843 1.5030 -0.3918\n", + "\n", + "\n", + " K*EXP(B*SS) = Factor applied to intensities\n", + " SS = (2sin(theta)/lambda)^2\n", + "\n", + " K B DATA SET NAME\n", + " 1.000E+00 0.000 data/stagepos_0325/crystal_0000/SMV/XDS_ASCII.HKL\n", + " 2.939E+00 0.046 data/stagepos_0341/crystal_0000/SMV/XDS_ASCII.HKL\n", + " 5.399E-01 -0.443 data/stagepos_0648/crystal_0001/SMV/XDS_ASCII.HKL\n", + " 9.548E-01 -0.410 data/stagepos_0849/crystal_0000/SMV/XDS_ASCII.HKL\n", + " 5.007E-01 -0.266 data/stagepos_0905/crystal_0001/SMV/XDS_ASCII.HKL\n", + " 2.389E-01 -0.615 data/stagepos_0980/crystal_0000/SMV/XDS_ASCII.HKL\n", + " 3.345E-01 -0.846 data/stagepos_1283/crystal_0001/SMV/XDS_ASCII.HKL\n", + "\n", + " ******************************************************************************\n", + " CORRECTION PARAMETERS FOR THE STANDARD ERROR OF REFLECTION INTENSITIES\n", + " ******************************************************************************\n", + "\n", + " The variance v0(I) of the intensity I obtained from counting statistics is\n", + " replaced by v(I)=a*(v0(I)+b*I^2). The model parameters a, b are chosen to\n", + " minimize the discrepancies between v(I) and the variance estimated from\n", + " sample statistics of symmetry related reflections. 
This model implicates\n", + " an asymptotic limit ISa=1/SQRT(a*b) for the highest I/Sigma(I) that the\n", + " experimental setup can produce (Diederichs (2010) Acta Cryst D66, 733-740).\n", + " Often the value of ISa is reduced from the initial value ISa0 due to systematic\n", + " errors showing up by comparison with other data sets in the scaling procedure.\n", + " (ISa=ISa0=-1 if v0 is unknown for a data set.)\n", + "\n", + " a b ISa ISa0 INPUT DATA SET\n", + " 2.787E+00 1.140E-02 5.61 11.45 data/stagepos_0325/crystal_0000/SMV/XDS_ASCII.HKL\n", + " 2.085E+00 3.701E-03 11.38 27.38 data/stagepos_0341/crystal_0000/SMV/XDS_ASCII.HKL\n", + " 7.984E+00 2.322E-02 2.32 4.73 data/stagepos_0648/crystal_0001/SMV/XDS_ASCII.HKL\n", + " 5.360E+00 1.192E-02 3.96 12.91 data/stagepos_0849/crystal_0000/SMV/XDS_ASCII.HKL\n", + " 6.750E+00 2.043E-02 2.69 6.25 data/stagepos_0905/crystal_0001/SMV/XDS_ASCII.HKL\n", + " 1.412E+01 9.586E-03 2.72 6.26 data/stagepos_0980/crystal_0000/SMV/XDS_ASCII.HKL\n", + " 1.003E+00 1.391E-01 2.68 11.79 data/stagepos_1283/crystal_0001/SMV/XDS_ASCII.HKL\n", + " \n", + "\n", + " FACTOR TO PLACE ALL DATA SETS TO AN APPROXIMATE ABSOLUTE SCALE 0.145870E+03\n", + " (ASSUMING A PROTEIN WITH 50% SOLVENT)\n", + "\n", + "\n", + "\n", + " ******************************************************************************\n", + " STATISTICS OF SCALED OUTPUT DATA SET : MERGED.HKL\n", + " FILE TYPE: XDS_ASCII MERGE=FALSE FRIEDEL'S_LAW=TRUE \n", + "\n", + " 9 OUT OF 14399 REFLECTIONS REJECTED\n", + " 14390 REFLECTIONS ON OUTPUT FILE \n", + "\n", + " ******************************************************************************\n", + " DEFINITIONS:\n", + " R-FACTOR\n", + " observed = (SUM(ABS(I(h,i)-I(h))))/(SUM(I(h,i)))\n", + " expected = expected R-FACTOR derived from Sigma(I)\n", + "\n", + " COMPARED = number of reflections used for calculating R-FACTOR\n", + " I/SIGMA = mean of intensity/Sigma(I) of unique reflections\n", + " (after merging symmetry-related 
observations)\n", + " Sigma(I) = standard deviation of reflection intensity I\n", + " estimated from sample statistics\n", + "\n", + " R-meas = redundancy independent R-factor (intensities)\n", + " Diederichs & Karplus (1997), Nature Struct. Biol. 4, 269-275.\n", + "\n", + " CC(1/2) = percentage of correlation between intensities from\n", + " random half-datasets. Correlation significant at\n", + " the 0.1% level is marked by an asterisk.\n", + " Karplus & Diederichs (2012), Science 336, 1030-33\n", + " Anomal = percentage of correlation between random half-sets\n", + " Corr of anomalous intensity differences. Correlation\n", + " significant at the 0.1% level is marked.\n", + " SigAno = mean anomalous difference in units of its estimated\n", + " standard deviation (|F(+)-F(-)|/Sigma). F(+), F(-)\n", + " are structure factor estimates obtained from the\n", + " merged intensity observations in each parity class.\n", + " Nano = Number of unique reflections used to calculate\n", + " Anomal_Corr & SigAno. At least two observations\n", + " for each (+ and -) parity are required.\n", + "\n", + "\n", + " cpu time used by XSCALE 0.2 sec\n", + " elapsed wall-clock time 0.2 sec\n" + ] + } + ], + "source": [ + "!wsl xscale" + ] + }, + { + "cell_type": "markdown", + "id": "2d5e0af1", + "metadata": {}, + "source": [ + "## 8. Intensity-based clustering\n", + "\n", + "Run intensity-based clustering to further filter out datasets with low correlation (i.e. poor-quality datasets) or datasets from a different phase with a similar enough unit cell. The cut-off on the dendrogram is selected manually; a value below 0.4 can be a good starting choice.\n", + "\n", + "After clustering, the integration results of the datasets in each cluster are automatically copied to separate folders. The merged intensities in the file `shelx.hkl` can be used for structure determination." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5456d48e", + "metadata": {}, + "outputs": [], + "source": [ + "!edtools.cluster" + ] + }, + { + "attachments": { + "intensity_cluster.png": { + "image/png": "" + } + }, + "cell_type": "markdown", + "id": "7b61c2d0", + "metadata": {}, + "source": [ + "![intensity_cluster.png](attachment:intensity_cluster.png)" + ] + }, + { + "cell_type": "markdown", + "id": "4657e4fd-8c7d-4d07-8984-d28cb7acba32", + "metadata": {}, + "source": [ + "Console Output\n", + "\n", + "```\n", + "Running XSCALE on cluster 1\n", + "\n", + "Clustering results\n", + "\n", + "Cutoff distance: 0.259\n", + "Equivalent CC(I): 0.966\n", + "Method: average\n", + "\n", + " # N_clust CC(1/2) N_obs N_uniq N_poss Compl. N_comp R_meas d_min i/sigma | Lauegr. prob. conf. idx\n", + " 1*** 5 96.6* 10778 2522 2690 93.8* 10458 0.285* 0.80 3.23\n", + "(Sorted by 'Completeness')\n", + "\n", + "Cluster 1: [1, 2, 3, 5, 7]\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "2cb1d59a-8268-4bce-90a5-3b198791d3b1", + "metadata": {}, + "source": [ + "## Instructions for use on your own data\n", + "\n", + "- Install **edtools** and all software dependencies on your system\n", + "- Put all your 3D ED datasets in one folder. All datasets are expected to be in an *XDS*-readable image format, e.g. SMV. 
A correctly configured *XDS.INP* file is also expected for each dataset.\n", + "- Open a Windows command prompt in the root directory that contains all the datasets\n", + "- Follow the demo" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.11" + }, + "widgets": { + "application/vnd.jupyter.widget-state+json": { + "state": {}, + "version_major": 2, + "version_minor": 0 + } + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/docs/examples/find_cell_step3.png b/docs/examples/find_cell_step3.png new file mode 100644 index 0000000..98981e4 Binary files /dev/null and b/docs/examples/find_cell_step3.png differ diff --git a/docs/examples/find_cell_step5.png b/docs/examples/find_cell_step5.png new file mode 100644 index 0000000..495ef5a Binary files /dev/null and b/docs/examples/find_cell_step5.png differ diff --git a/docs/examples/intensity_cluster.png b/docs/examples/intensity_cluster.png new file mode 100644 index 0000000..5b42d1f Binary files /dev/null and b/docs/examples/intensity_cluster.png differ diff --git a/docs/index.rst b/docs/index.rst index 65115a1..46c8de1 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -12,6 +12,16 @@ API Reference edtools +Examples +======== +.. toctree:: + :maxdepth: 1 + :caption: Examples + :glob: + + examples/* + + Links ===== .. toctree:: diff --git a/docs/requirements.txt b/docs/requirements.txt index 7d7eb09..35e2727 100644 --- a/docs/requirements.txt +++ b/docs/requirements.txt @@ -1,5 +1,6 @@ sphinx sphinx_rtd_theme +nbsphinx readthedocs-sphinx-search autodocsumm lmfit