-
Notifications
You must be signed in to change notification settings - Fork 0
/
serial_crystallography.html
246 lines (208 loc) · 19.6 KB
/
serial_crystallography.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
<!DOCTYPE html>
<html lang="en" data-content_root="./">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" /><meta name="viewport" content="width=device-width, initial-scale=1" />
<title>Processing serial synchrotron crystallography (SSX) datasets with xia2 — xia2 documentation</title>
<link rel="stylesheet" type="text/css" href="_static/pygments.css?v=fa44fd50" />
<link rel="stylesheet" type="text/css" href="_static/alabaster.css?v=12dfc556" />
<link rel="stylesheet" href="_static/xia2.css" type="text/css" />
<script src="_static/documentation_options.js?v=5929fcd5"></script>
<script src="_static/doctools.js?v=9a2dae69"></script>
<script src="_static/sphinx_highlight.js?v=dc90522c"></script>
<link rel="index" title="Index" href="genindex.html" />
<link rel="search" title="Search" href="search.html" />
<link rel="next" title="Overcoming problems with the detector geometry for SSX data" href="xia2-ssx-geometry-refinement.html" />
<link rel="prev" title="Processing multi-crystal datasets with xia2" href="multi_crystal.html" />
<link rel="stylesheet" href="_static/custom.css" type="text/css" />
</head><body>
<!-- <div class="logoheader container"> -->
<!-- <a href="index.html"> -->
<!-- <img class="logoheader" alt="DIALS" src="_static/dials_header.png" /> -->
<!-- </a> -->
<!-- </div> -->
<div class="document">
<div class="documentwrapper">
<div class="bodywrapper">
<div class="body" role="main">
<section id="processing-serial-synchrotron-crystallography-ssx-datasets-with-xia2">
<h1>Processing serial synchrotron crystallography (SSX) datasets with xia2<a class="headerlink" href="#processing-serial-synchrotron-crystallography-ssx-datasets-with-xia2" title="Link to this heading">¶</a></h1>
<p>xia2 is able to processes serial synchrotron data through a SSX
pipeline built upon data analysis programs from DIALS.</p>
<p>To process serial data from images to a merged MTZ file, the
minimal recommended example command is:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">xia2</span><span class="o">.</span><span class="n">ssx</span> <span class="n">template</span><span class="o">=../</span><span class="n">lyso_1_</span><span class="c1">###.cbf space_group=P43212 \</span>
<span class="n">unit_cell</span><span class="o">=</span><span class="mf">79.1</span><span class="p">,</span><span class="mf">79.1</span><span class="p">,</span><span class="mf">38.2</span><span class="p">,</span><span class="mi">90</span><span class="p">,</span><span class="mi">90</span><span class="p">,</span><span class="mi">90</span> <span class="n">reference</span><span class="o">../</span><span class="n">lyso</span><span class="o">.</span><span class="n">mtz</span>
</pre></div>
</div>
<p>i.e. the minimal required information is the location of the images, the expected
space group and unit cell. A suitable reference file, containing a reference set of
intensities or pdb model (<code class="samp docutils literal notranslate"><span class="pre">.mtz,</span> <span class="pre">.cif,</span> <span class="pre">.pdb</span></code>) is also recommended to enable data
reduction using a reference.
Typically, a reference geometry and mask should also be provided, as described below.</p>
<p><strong>Table of contents of xia2 SSX processing documentation</strong></p>
<div class="line-block">
<div class="line"><a class="reference internal" href="#dataintegration"><span class="std std-ref">Data Integration</span></a></div>
<div class="line"><a class="reference internal" href="#datareduction"><span class="std std-ref">Data Reduction</span></a></div>
<div class="line"><a class="reference internal" href="#geometryrefinement"><span class="std std-ref">Overcoming difficulties in geometry refinement</span></a></div>
<div class="line"><a class="reference internal" href="#mergingingroups"><span class="std std-ref">Merging in groups</span></a></div>
</div>
<section id="data-integration">
<span id="dataintegration"></span><h2>Data Integration<a class="headerlink" href="#data-integration" title="Link to this heading">¶</a></h2>
<p>To integrate data, the images must be provided, plus the space group to use for
processing and an estimate of the unit cell.
If the space group or unit cell are not known, the pipeline can be initially
run without these; spotfinding and indexing will be performed, and the most
populous unit cell clusters will be presented to the user.</p>
<p>The location of the image data must be specified by using one of the options
<code class="samp docutils literal notranslate"><span class="pre">image=</span></code>, <code class="samp docutils literal notranslate"><span class="pre">template=</span></code> or <code class="samp docutils literal notranslate"><span class="pre">directory=</span></code> (multiple instances of
one option is allowed). Here <code class="samp docutils literal notranslate"><span class="pre">image</span></code> corresponds to an image file
(e.g. <code class="samp docutils literal notranslate"><span class="pre">.cbf</span></code> or <code class="samp docutils literal notranslate"><span class="pre">.h5</span></code>), <code class="samp docutils literal notranslate"><span class="pre">template</span></code> is a file template for a
sequence of images (e.g. for files <code class="samp docutils literal notranslate"><span class="pre">lyso_1_001.cbf</span></code> to <code class="samp docutils literal notranslate"><span class="pre">lyso_1_900.cbf</span></code>, the matching
template is <code class="samp docutils literal notranslate"><span class="pre">lyso_1_###.cbf</span></code>), and <code class="samp docutils literal notranslate"><span class="pre">directory</span></code> is a path to a
directory containing image files.
For data in h5 format, the option <code class="samp docutils literal notranslate"><span class="pre">image=/path/to/image_master.h5</span></code> is recommended.
For cbf data, the file template option is recommended, e.g.:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">xia2</span><span class="o">.</span><span class="n">ssx</span> <span class="n">template</span><span class="o">=../</span><span class="n">lyso_1_</span><span class="c1">###.cbf space_group=P43212 \</span>
<span class="n">unit_cell</span><span class="o">=</span><span class="mf">79.1</span><span class="p">,</span><span class="mf">79.1</span><span class="p">,</span><span class="mf">38.2</span><span class="p">,</span><span class="mi">90</span><span class="p">,</span><span class="mi">90</span><span class="p">,</span><span class="mi">90</span>
</pre></div>
</div>
<p>The overall sequence of the data integration part of the pipeline is as follows.
In the absence of an explicitly provided reference geometry, spotfinding and
indexing will be run on batches of 1000 images at a time, until a threshold number
of crystals have been indexed (250 by default, as specified by the
<code class="samp docutils literal notranslate"><span class="pre">geometry_refinement.n_crystals</span></code> parameter). Then joint refinement of the
detector geometry is run to determine an improved geometry model. Following this,
the refined geometry is used to perform spotfinding, indexing and integration of
all images, with feedback provided on batches of 1000 images at a time. This batch
size can be adjusted with the <code class="samp docutils literal notranslate"><span class="pre">batch_size</span></code> parameter. The number of available
processes will be determined automatically (optionally a value for <code class="samp docutils literal notranslate"><span class="pre">nproc</span></code> can be given),
and parallel processing will be performed within the batch.</p>
<p>A DIALS reference geometry file (<code class="samp docutils literal notranslate"><span class="pre">refined.expt</span></code>) can be provided as input
with the option <code class="samp docutils literal notranslate"><span class="pre">reference_geometry=</span></code>, which will be used instead of
performing a joint refinement on the data. A mask file created from the DIALS
image viewer can also be provided with the option <code class="samp docutils literal notranslate"><span class="pre">mask=</span></code>, which will be
used in spotfinding and integration. A few other common options to set are the
<code class="samp docutils literal notranslate"><span class="pre">max_lattices</span></code> parameter, which determines the number of lattices to search
for on each image in indexing, and <code class="samp docutils literal notranslate"><span class="pre">d_min</span></code> which controls the resolution
limit for spotfinding and integration.</p>
<p>To see the full list of options and their descriptions, run <code class="samp docutils literal notranslate"><span class="pre">xia2.ssx</span> <span class="pre">-ce2</span> <span class="pre">-a1</span></code>.
Change the number after <code class="samp docutils literal notranslate"><span class="pre">-ce</span></code> to a value from 0 to 3 to see different
“expert levels” of program parameters. Note that a phil options file can be
provided for each of the DIALS programs, to allow further customisation of the
options for the individual programs. Additionally, stepwise processing can be
performed by running the program multiple times with the option <code class="samp docutils literal notranslate"><span class="pre">steps=find_spots</span></code>,
then <code class="samp docutils literal notranslate"><span class="pre">steps=index</span></code> and finally <code class="samp docutils literal notranslate"><span class="pre">steps=integrate</span></code>.</p>
</section>
<section id="data-reduction">
<span id="datareduction"></span><h2>Data Reduction<a class="headerlink" href="#data-reduction" title="Link to this heading">¶</a></h2>
<p>Following data integration, data reduction (reindexing, scaling and merging) will
be performed. The data reduction can be run separately to the full pipeline through
the command <code class="samp docutils literal notranslate"><span class="pre">xia2.ssx_reduce</span></code>, taking integrated data as input, e.g.:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">xia2</span><span class="o">.</span><span class="n">ssx_reduce</span> <span class="o">../</span><span class="n">xia2_ssx</span><span class="o">/</span><span class="n">batch_</span><span class="o">*/</span><span class="n">integrated</span><span class="o">*.</span><span class="p">{</span><span class="n">expt</span><span class="p">,</span><span class="n">refl</span><span class="p">}</span>
</pre></div>
</div>
<p>To run only the data integration without reduction, use the option
<code class="samp docutils literal notranslate"><span class="pre">steps=find_spots+index+integrate</span></code> (i.e. omit <code class="samp docutils literal notranslate"><span class="pre">+reduce</span></code>) when running <code class="samp docutils literal notranslate"><span class="pre">xia2.ssx</span></code>.</p>
<p>The data reduction process consists of unit cell filtering, followed by indexing
ambiguity resolution in batches (if ambiguities are possible due to lattice
and space group symmetries), followed by scaling and merging.</p>
<p>If a reference dataset/PDB model is
provided with the option <code class="samp docutils literal notranslate"><span class="pre">reference=</span></code>, then reindexing and scaling is performed
in parallel in batches of at least <code class="samp docutils literal notranslate"><span class="pre">reduction_batch_size</span></code> crystals, using intensities
extracted/generated from the reference as a reference when reindexing and scaling.
It is recommended to use a high-quality reference set of intensities in preference to genering
a set of intensities from a PDB model, to give a higher accuracy. If generating intensities
from a PDB model, the default bulk solvent parameters (<code class="samp docutils literal notranslate"><span class="pre">k_sol</span></code> and <code class="samp docutils literal notranslate"><span class="pre">b_sol</span></code>) should
be adjusted to suitable values.
If there is no reference given, the scaling is not performed in parallel. Other important
options are setting <code class="samp docutils literal notranslate"><span class="pre">anomalous=True/False</span></code> and specifying a <code class="samp docutils literal notranslate"><span class="pre">d_min</span></code> value.
To evaluate the success of indexing ambiguity resolution, it is important to inspect
the html output from dials.cosym jobs in the <code class="samp docutils literal notranslate"><span class="pre">data_reduction\reindex</span></code> folder.
To see the full list of data reduction parameters and their descriptions,
run <code class="samp docutils literal notranslate"><span class="pre">xia2.ssx_reduce</span> <span class="pre">-ce3</span> <span class="pre">-a2</span></code>. The output of the data reduction pipeline
is a merged MTZ file which can be taken onwards for structure determination.</p>
<p><code class="docutils literal notranslate"><span class="pre">xia2.ssx_reduce</span></code> also supports quick remerging of already processed data by using the option
<code class="docutils literal notranslate"><span class="pre">steps=merge</span></code>. This can be useful for aggregating multiple processing jobs processed
using the same reference model, or generating MTZ files with specified resolution
cutoffs e.g.:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">xia2</span><span class="o">.</span><span class="n">ssx_reduce</span> <span class="o">../</span><span class="n">xia2_ssx_reduce</span><span class="o">/</span><span class="n">DataFiles</span><span class="o">/</span><span class="n">scaled</span><span class="o">*.</span><span class="p">{</span><span class="n">expt</span><span class="p">,</span><span class="n">refl</span><span class="p">}</span> <span class="n">steps</span><span class="o">=</span><span class="n">merge</span> <span class="n">d_min</span><span class="o">=</span><span class="mf">1.8</span>
<span class="n">xia2</span><span class="o">.</span><span class="n">ssx_reduce</span> <span class="n">steps</span><span class="o">=</span><span class="n">merge</span> <span class="o">../</span><span class="p">{</span><span class="n">chip1</span><span class="p">,</span><span class="n">chip2</span><span class="p">}</span><span class="o">/</span><span class="n">DataFiles</span><span class="o">/</span><span class="n">scaled</span><span class="o">*.</span><span class="p">{</span><span class="n">expt</span><span class="p">,</span><span class="n">refl</span><span class="p">}</span>
</pre></div>
</div>
</section>
<section id="overcoming-difficulties-in-geometry-refinement">
<span id="geometryrefinement"></span><h2>Overcoming difficulties in geometry refinement<a class="headerlink" href="#overcoming-difficulties-in-geometry-refinement" title="Link to this heading">¶</a></h2>
<p>The first step of processing with <code class="samp docutils literal notranslate"><span class="pre">xia2.ssx</span></code> is to improve the detector geometry in order to
improve the indexing rate. The starting assumption is that the initial geometry is good enough to
successfully index a fraction of the images. When this is not the case, it will be necessary to
change some of the program parameters. Examples and suggestions of how to tackle trickier
processing cases are described in the link below.</p>
<div class="toctree-wrapper compound">
<ul>
<li class="toctree-l1"><a class="reference internal" href="xia2-ssx-geometry-refinement.html">Overcoming problems with the detector geometry</a><ul>
<li class="toctree-l2"><a class="reference internal" href="xia2-ssx-geometry-refinement.html#setting-the-detector-distance-beam-centre">Setting the detector distance/beam centre</a></li>
<li class="toctree-l2"><a class="reference internal" href="xia2-ssx-geometry-refinement.html#relaxing-restraints-in-refinement">Relaxing restraints in refinement</a></li>
<li class="toctree-l2"><a class="reference internal" href="xia2-ssx-geometry-refinement.html#refining-multi-panel-detectors">Refining multi-panel detectors</a></li>
</ul>
</li>
</ul>
</div>
</section>
<section id="merging-in-groups">
<span id="mergingingroups"></span><h2>Merging in groups<a class="headerlink" href="#merging-in-groups" title="Link to this heading">¶</a></h2>
<p><code class="docutils literal notranslate"><span class="pre">xia2.ssx_reduce</span></code> also supports merging in groups to handle more complex experiments such
as dose series experiments. This is described in more detail in the links below.</p>
<div class="toctree-wrapper compound">
<ul>
<li class="toctree-l1"><a class="reference internal" href="xia2-ssx-dose-series.html">Dose-series experiments and merging in groups</a><ul>
<li class="toctree-l2"><a class="reference internal" href="xia2-ssx-dose-series.html#dose-series-the-dose-series-repeat-option">Dose series - the <em>dose_series_repeat</em> option</a></li>
<li class="toctree-l2"><a class="reference internal" href="xia2-ssx-dose-series.html#dose-series-using-a-grouping-yml-file">Dose series - using a <em>grouping.yml</em> file</a></li>
<li class="toctree-l2"><a class="reference internal" href="xia2-ssx-dose-series.html#generalised-merge-grouping-on-metadata">Generalised merge grouping on metadata</a></li>
</ul>
</li>
</ul>
</div>
</section>
</section>
</div>
</div>
</div>
<div class="sphinxsidebar" role="navigation" aria-label="Main">
<div class="sphinxsidebarwrapper">
<p class="logo">
<a href="index.html">
<img class="logo" src="_static/xia2-new-logo.gif" alt="Logo" />
</a>
</p>
<h3>Navigation</h3>
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="quick_start.html">Getting started</a></li>
<li class="toctree-l1"><a class="reference internal" href="using_xia2.html">Using xia2</a></li>
<li class="toctree-l1"><a class="reference internal" href="installation.html">Installation</a></li>
<li class="toctree-l1"><a class="reference internal" href="introductory_example.html">Introductory example</a></li>
<li class="toctree-l1"><a class="reference internal" href="insulin_tutorial.html">Insulin tutorial</a></li>
<li class="toctree-l1"><a class="reference internal" href="multi_crystal.html">Multi-crystal data</a></li>
<li class="toctree-l1 current"><a class="current reference internal" href="#">Serial-crystallography</a></li>
<li class="toctree-l1"><a class="reference internal" href="program_output.html">Program output</a></li>
<li class="toctree-l1"><a class="reference internal" href="parameters.html">Parameters</a></li>
<li class="toctree-l1"><a class="reference internal" href="comments.html">Comments</a></li>
<li class="toctree-l1"><a class="reference internal" href="background.html">History</a></li>
<li class="toctree-l1"><a class="reference internal" href="acknowledgements.html">Acknowledgements</a></li>
<li class="toctree-l1"><a class="reference internal" href="release_notes.html">Release notes</a></li>
<li class="toctree-l1"><a class="reference internal" href="license.html">License</a></li>
</ul>
</div>
</div>
<div class="clearer"></div>
</div>
<div class="footer container">
<a href="http://www.diamond.ac.uk/"><img class="logofooter" alt="Diamond" src="_static/diamond_logo.png" /></a>
<a href="http://dials.diamond.ac.uk/"><img class="logofooter" alt="DIALS" src="_static/dials_logo_fx_transparent_bg.png" /></a>
<a href="http://www.ccp4.ac.uk/"><img class="logofooter" alt="CCP4" src="_static/CCP4-logo-plain.png" /></a>
</div>
<div class="footer">
©2020, Diamond Light Source.
</div>
</body>
</html>