\section{Analysis of Image Differencing Performance \label{sec:imDiff}}
LSST will detect motion and flux variability by differencing each incoming image
against a deep template (built by combining multiple images of the same region).
Sources in difference images, called \DIASources, will be detected at a signal-to-noise
ratio (SNR) threshold of $\nu=5$. Up to about 1,000 deg$^{-2}$ real
astrophysical detections (e.g., variable stars) are expected in LSST image
differencing, including up to about 500 deg$^{-2}$ from asteroids near the
Ecliptic. In addition to real detections,
there are false detections due to imaging or processing artifacts and
false detections caused simply by statistical noise
fluctuations in the background. In a typical LSST difference image, the expected
density of false detections due to background fluctuations is about 60 deg$^{-2}$
(see below for details)---much lower than the expected rate of astrophysical
detections (see \S\ref{sec:kaiser} below).
Historically, surveys have reported much higher detection rates in image
differencing, with the exact rate depending on the survey \citep{denneau13, kessler15, goldstein15}.
For example, Pan-STARRS1 (PS1) reported a transient detection rate as high as 8,200 deg$^{-2}$
\citep{denneau13}. For a ``menagerie'' of PS1 artifacts (with memorable names such as
{\it chocolate chip cookies, frisbee, piano, arrowhead, UFO}), see Fig.~17 in \cite{denneau13}.
They reported that ``Many of the false detections are easily explained as internal reflections,
ghosts, or other well-understood image artifacts, ...''. As discussed in \S\ref{sec:tracklets},
such a high false detection rate is at the limit of what could be handled even
with the substantial computing power planned for LSST.
Over the past decade, second-generation surveys have learned a great deal from
the PS1 experience. Surveys running today, such as the Dark Energy Survey, have
largely solved the key problems that PS1 encountered. Major improvements to
hardware include CCDs with significantly fewer
artifacts (e.g. DECam, see below, and LSST) and optical systems designed to minimize ghosting
and internal reflections (e.g. in LSST). Improvements to the software include advanced image
differencing pipelines (e.g., PTFIDE for the Palomar Transient Factory and the Zwicky Transient
Facility) and various machine-learning classifiers for filtering false detections. For example,
\cite{goldstein15} used a Random Forest classifier with the Dark Energy Survey data
and cleaned their sample of transient detections from a raw false:true detection
rate ratio of 13:1 to a filtered rate of 1:3. The resulting false detections are
morphologically much simpler; for example, compare
Fig.~1 in \cite{goldstein15} to Fig.~17 in \cite{denneau13}.
Here we summarize an analysis of image differencing performance based on DECam
data and difference images produced and processed using prototype LSST
software \citep{DMTN-006}. This analysis demonstrates that the false
detection rate anticipated for LSST (without using any machine-learning classifiers for
filtering false detections) is significantly below the threshold for
successful deployment of MOPS, as will be discussed in \S\ref{sec:mops}.
\subsection{LSST Image Differencing Pipeline and Data Processing}
The LSST prototype image differencing and analysis code derives from the
algorithms employed in the HOTPANTS package \citep{becker15}, which were used for surveys such as SuperMACHO
\citep{becker05} and ESSENCE \citep{miknaitis07}. While this software is
functional as-is, it is expected that the ultimate LSST pipeline will include
improved methods for handling observations at high airmass and the effects of
differential chromatic refraction due to the Earth's atmosphere. In
this work, we conservatively assume that the pipeline used for LSST will have the
same performance as the current code.
\subsection{False Detections due to Background Fluctuations \label{sec:kaiser}}
Some false detections are expected simply due to background fluctuations, even
at a high SNR threshold. The number of
such detections, as a function of the SNR threshold, the number of pixels, and
the seeing, can be computed using the statistics of Gaussian random fields.
For an image with a Gaussian background noise, convolved with a Gaussian point
spread function (PSF) with width $\sigma_g$ (in pixels), the number of peaks, $N$, above a
given SNR threshold, $\nu$, is given by (N. Kaiser, priv. comm.)
\begin{equation}
N(>\nu) = \frac{n_{\rm row} \, n_{\rm col}}{2^{5/2} \, \pi^{3/2} \, \sigma_g^2} \, \nu \, e^{-\nu^2 /2},
\label{eq-theory}
\end{equation}
where $n_{\rm row}$ and $n_{\rm col}$ are the number of pixel rows and columns in the image.
This expression was verified empirically by the LSST data management team using
raytraced image simulations\footnote{See \url{https://github.com/lsst/W13report}}.
For the 4k by 4k LSST sensors the pixel size is 0.2 arcsec; for a nominal seeing
of 0.85 arcsec (FWHM, i.e., $\sigma_g \approx 1.8$ pixels) and $\nu=5$,
eq.~\ref{eq-theory} gives $N(>\nu) = 59$ deg$^{-2}$.
It is generally not well appreciated just how steep the dependence of $N(>\nu)$
on $\nu$ is, owing to the exponential term. Changing the threshold from 5 to 5.5
decreases the expected rate by a factor of 12, and the rate increases by a factor
of 9.7 when the threshold is changed from 5 to 4.5. In practice, an empirical estimate
of the background noise is used when computing the SNR for each detected source.
When this estimate is incorrect, e.g., for the reasons discussed below, the
effective detection threshold is wrong as well. For example, if the noise is
underestimated by only 10\%, the computed SNR will be too large by 10\%, and the
adopted threshold $\nu=5$ will actually correspond to $\nu=4.5$; the
sample will then include 9.7 times as many false detections due to background
fluctuations! Hence, the noise in difference images has to be estimated to
high accuracy.
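As a quick numerical check, the following Python sketch (function and variable
names are ours) evaluates eq.~\ref{eq-theory} for the nominal LSST numbers
quoted above, reproducing both the 59 deg$^{-2}$ rate and the sensitivity of
the rate to the threshold:
\begin{verbatim}
import numpy as np

def n_false(nu, n_pix, sigma_g):
    """Expected number of noise peaks above SNR threshold nu (Kaiser formula)."""
    return n_pix / (2**2.5 * np.pi**1.5 * sigma_g**2) * nu * np.exp(-nu**2 / 2)

# Nominal LSST numbers: 4k x 4k sensor, 0.2 arcsec pixels, 0.85 arcsec seeing
pixel_scale = 0.2                         # arcsec per pixel
sigma_g = 0.85 / 2.355 / pixel_scale      # PSF Gaussian width in pixels (~1.8)
n_pix = 4096 * 4096
area_deg2 = (4096 * pixel_scale / 3600.0)**2

print(n_false(5.0, n_pix, sigma_g) / area_deg2)                     # ~59 deg^-2
print(n_false(5.0, n_pix, sigma_g) / n_false(5.5, n_pix, sigma_g))  # ~12
print(n_false(4.5, n_pix, sigma_g) / n_false(5.0, n_pix, sigma_g))  # ~9.7
\end{verbatim}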
\subsubsection{The Impact and Treatment of Correlated Noise}
When the LSST pipeline convolves the science image to match the PSF of the template
image, the per-pixel variance in the image is reduced, and at the same
time correlations between neighboring pixels are introduced. This violates the
assumption made by standard image processing algorithms that each pixel is an
independent draw from a Poisson distribution. The per-pixel noise reduction is
reflected in the variance plane that accompanies each exposure during
processing, but the covariance between pixels is not tracked.
The significance of detections and the uncertainties on source measurements are
then estimated from this incomplete information provided by the variance
plane, leading to a biased detection threshold.
The magnitude of this effect can be large---using only the per-pixel variance
measurements can result in underestimating the true noise on PSF-size scales by
20\% or more. A detection threshold of $\nu=5$ thus actually corresponds to
$\nu=4$, and it is easy to see using eq.~\ref{eq-theory} that this error results
in an increase in the number of false detections by a factor of $\sim$70!
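The underestimate is easy to demonstrate on simulated noise. The following
Python sketch (the kernel widths are illustrative choices of ours, not LSST
values) convolves a white-noise image with a Gaussian matching kernel and
compares the empirical scatter of PSF-weighted flux measurements to the value
implied by the per-pixel variance alone:
\begin{verbatim}
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(12345)
sigma_match = 1.0   # width of the PSF-matching kernel, pixels (illustrative)
sigma_psf = 1.8     # width of the PSF used for flux measurement, pixels

# White Gaussian noise (sigma = 1), then the matching convolution
img = gaussian_filter(rng.normal(size=(2048, 2048)), sigma_match)
per_pixel_var = img.var()   # all that the variance plane records

# PSF-weighted flux measurements at well-separated (independent) positions
r = np.arange(-9, 10)
w = np.exp(-(r[:, None]**2 + r[None, :]**2) / (2 * sigma_psf**2))
fluxes = [np.sum(w * img[y - 9:y + 10, x - 9:x + 10])
          for y in range(16, 2032, 64) for x in range(16, 2032, 64)]

naive = np.sqrt(per_pixel_var * np.sum(w**2))   # assumes independent pixels
print(np.std(fluxes) / naive)   # > 1: per-pixel variance underestimates noise
\end{verbatim}
The size of the underestimate grows with the width of the matching kernel,
which is why the bias varies from image to image.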
A histogram of the number of sources detected in difference images, as a function
of SNR computed using forced photometry measurements, is shown in the right panel
of Figure~\ref{fig:snr_comparison}. The blue line shows the expected counts given
by eq.~\ref{eq-theory}, in good agreement with the data. When the SNR is estimated
incorrectly due to correlated noise (left panel), the distribution clearly ramps
up at a much higher SNR value and results in numerous false detections that are
mis-classified as $>5 \sigma$ detections.
Tracking the covariance caused by multiple convolutions is a planned feature for
the LSST software stack, but is not currently implemented. Previous surveys, such as
Pan-STARRS1, have used a small covariance ``pseudo-matrix'', which tracks the
covariance between a small region of neighboring pixels, and then assumed that
this relationship between pixels is constant across an image (Paul Price, priv. comm.).
This method avoids the creation of the full $N_{\rm pixels}$ by $N_{\rm pixels}$
covariance matrix, which is impractically large and mostly empty.
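A possible realization of this idea, sketched here on simulated noise with
sizes and variable names of our own choosing (this is not the Pan-STARRS1 or
LSST implementation), is to measure a small covariance stamp once and use it to
evaluate the variance of any PSF-weighted measurement $F = \sum_i w_i x_i$ via
$\mathrm{Var}(F) = \sum_{i,j} w_i w_j \, C(i-j)$:
\begin{verbatim}
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(2)
img = gaussian_filter(rng.normal(size=(1024, 1024)), 1.0)  # correlated noise

# Measure a (2*half+1)^2 covariance "stamp", assumed constant across the image
half = 3
cov = np.empty((2 * half + 1, 2 * half + 1))
core = img[half:-half, half:-half]
for dy in range(-half, half + 1):
    for dx in range(-half, half + 1):
        shifted = img[half + dy:1024 - half + dy, half + dx:1024 - half + dx]
        cov[dy + half, dx + half] = np.mean(core * shifted)

# Corrected variance of a PSF-weighted flux: sum over lags d of
# C(d) times the autocorrelation of the weights w at lag d
r = np.arange(-6, 7)
w = np.exp(-(r[:, None]**2 + r[None, :]**2) / (2 * 1.8**2))
var_f = sum(cov[dy + half, dx + half] *
            np.sum(w[max(0, dy):13 + min(0, dy), max(0, dx):13 + min(0, dx)] *
                   w[max(0, -dy):13 + min(0, -dy), max(0, -dx):13 + min(0, -dx)])
            for dy in range(-half, half + 1)
            for dx in range(-half, half + 1))
print(np.sqrt(var_f))  # vs. np.sqrt(img.var() * np.sum(w**2)), the naive value
\end{verbatim}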
In the interim, for this analysis we have mitigated the problem by using forced
photometry of \DIASources on the individual images (that is, before convolution
to match their PSFs). This produces flux measurements and associated
uncertainties that are not affected by covariance, enabling us to accurately set
an SNR threshold that recognizes and rejects all \DIASources with $\nu < 5$. This
mitigation step
will be unnecessary once the image covariance tracking is properly implemented
in the LSST stack. Alternative solutions such as image ``decorrelation''
\citep{DMTN-021}, or the \citet{zackay} image differencing algorithm, would also
alleviate the covariance problem. Tests of these methods in the LSST pipeline are ongoing.
\begin{figure}
\centering
\plotone{figures/snr_comparison.pdf}
\caption{
Histogram of the reported SNR of sources measured in the difference images,
using two different SNR estimates: SNR (incorrectly) estimated using the variance plane
of the difference images (left) and SNR (correctly) estimated from forced photometry on
the input images (right). The blue lines indicate the expected SNR distribution
based on Gaussian background noise. The difference between these two histograms
illustrates the strong impact of the SNR mis-estimation. Using the correct SNR values,
the vast majority of putative $>5 \sigma$ detections become $<5 \sigma$ detections
and can be disregarded.
}
\label{fig:snr_comparison}
\end{figure}
\subsection{Testing the LSST Pipeline with DECam}
The Dark Energy Survey (DES) is an optical/near-infrared survey that aims to probe the
dynamics of the expansion of the universe and the growth of large-scale structure by
imaging 5,000 deg$^2$ of the southern sky. DECam, the imaging camera developed for
DES, is sufficiently similar to the LSST camera to enable an informative study of false detection
rates: DECam includes 62 mosaicked deep-depletion CCDs, with a total pixel count of
520 Mpix over its 3 deg$^2$ field of view, and has a filter complement similar to LSST's \citep{DECam}.
The data we use here are a subset of a DECam NEO survey (PI: L. Allen, NOAO) conducted
in the first half of 2013. The data for a given field consist of 40-second exposures separated
by about five minutes. Due to the difference in telescope aperture, these images
are about 1 mag shallower than LSST's 30-second visits.
Throughout this section, we used data for five different fields, each with
between three and five visits, for a total of 15 ``science'' visits plus 5
template visits. We have subsequently verified the findings using a much larger
superset of 540 visits from the same survey. The enlarged set was bounded by
$28^\circ < b < 58^\circ$ and $315^\circ < l < 350^\circ$ (Galactic
coordinates), covering a range of object densities more representative of
LSST's ``Wide-Fast-Deep'' (WFD) survey (except for the depth, which we discuss
in \S\ref{sec:imDiffScale}). Using this set, we find false detection rates
broadly consistent with those from the smaller subset: the additionally
processed visits actually have {\em fewer} false detections, by about 30\% on
average.
From the NOAO archive we obtained images that had already had instrumental
signature removal applied by the DECam Community Pipeline. Each image was
processed through the initial LSST pipeline for background subtraction, PSF
determination, source detection and measurement (collectively
termed ``processCcd'' in the LSST pipeline). For each field, we arbitrarily
selected one of the visits to serve as the ``template'' exposure, against which
the other visits in the field are differenced. Sources are then detected in the
difference image to produce \DIASources, and forced photometry is performed in
both the original ``science'' and ``template'' exposures at the position of each
\DIASource.
In LSST operations, coadded prior exposures will be used
as templates for image differencing rather than single visits, which will reduce
the noise in template images. In this study, our use of single visits instead of
coadded templates implies that some moving objects or transients will appear as
negative sources in the difference images. We simply disregard these sources
since our goal is to mimic LSST operations rather than discover all possible
transients in this dataset.
\subsubsection{The Transient and False Detection Rates in DECam Images}
Using the $>5\sigma$ cut based on the SNR estimated from forced photometry,
the average number of positive \DIASources is $\sim 1000$ deg$^{-2}$,
with some fields having as few as $500$ deg$^{-2}$. A large fraction of
these detections result from poorly subtracted stars that leave
significant residuals in the difference image. Subtracted stars commonly
exhibit ``ringing'', with both positive and negative
excursions, and these images are no exception. Because the focus of this work
is on detecting moving objects rather than variable stars or transients, we have
not attempted to correct these subtraction artifacts. Instead, we simply exclude
difference image detections where there is significant ($>15 \sigma$) flux in both the
science and template images at the position of the \DIASource---that is, we exclude all \DIASources that
overlap with a static source (of course, some may be truly variable sources).
The area lost due to this masking is less than $1\%$ of the total sky. Again, this
is not the intended behavior of LSST during production, but a temporary
expedient we use to conservatively estimate the system's performance.
One could view this step as a ``poor man's'' machine-learning filter.
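In code, this veto is a one-line cut on the two forced-photometry
significances; a minimal sketch with synthetic numbers (the array names and toy
distributions are ours) is:
\begin{verbatim}
import numpy as np

# snr_sci, snr_tmpl: forced-photometry SNR of each DIASource measured on the
# science and template exposures (synthetic stand-ins for the real catalogs)
rng = np.random.default_rng(1)
snr_sci = rng.exponential(8.0, size=10000)
snr_tmpl = rng.exponential(8.0, size=10000)

# Exclude DIASources with significant flux in BOTH images, i.e. those that
# overlap a static source (some truly variable sources are lost as well)
keep = ~((snr_sci > 15) & (snr_tmpl > 15))
print(f"kept {keep.sum()} of {keep.size} DIASources")
\end{verbatim}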
After excluding all \DIASources associated with stationary objects, the
remaining candidate moving object detections number on average $\sim 350$
deg$^{-2}$. This sample includes asteroids, false positives, and possibly some
true astrophysical transients that are not associated with stationary objects
(gamma-ray burst afterglows, very faint variable stars with sharp light
curve maxima, etc.). To improve the estimate of the fraction of these remaining
objects that are false, we visually classified the detections from one focal plane
as obvious imaging artifacts, obvious PSF-like detections, or unidentifiable
detections. Approximately 25\% of the reported detections were clearly some
sort of uncorrected artifact (we did not pursue the cause of individual artifacts),
25\% appeared to be acceptable PSF-like features, and the remaining 50\% were
ambiguous or had too low a signal-to-noise ratio to classify.
Therefore, a conservative upper limit on the fraction of false detections is 75\%,
obtained by assuming that only the 25\% of detections with acceptable PSF-like features were real objects,
corresponding to a rate of 263 deg$^{-2}$. Given the size of DECam pixels
(0.263 arcsec) and typical seeing of about 1.1 arcsec ($\sigma_g$ = 1.8 pix),
the expected rate due to background fluctuations is 33 deg$^{-2}$, leaving
a rate of 230 deg$^{-2}$ as ``unexplained'' false detections.
The SNR distribution of this sample is proportional to SNR$^{-2.5}$, which
is similar to distributions expected for astrophysical objects. This fact implies
that the sample might be dominated by true astrophysical transients and
asteroids; nevertheless, we adopt the above conservative upper limit of 75\%.
\subsubsection{Scaling DECam Results to LSST Performance \label{sec:imDiffScale}}
The LSST false detection rate due to background fluctuations will be about twice
as large as for DECam because of LSST's smaller pixels and better seeing. The scaling
with pixel size and seeing for ``unexplained'' false detections is not obvious
because their cause is unknown. For example, if they are a pixel-induced effect,
their rate should be scaled up by the square of the ratio of angular pixel sizes, or
a factor of 1.72. If they are instead dominated by true astrophysical transients,
they should not be scaled at all. Since our dataset contains an unknown
mixture of detections following these two types of scaling, in addition to a
significant number of true detections, we cannot derive a precise scaling for the
false detection rate. Instead, we adopt a {\it conservative} option by assuming
that {\it all of our detections are false} and behave like pixel-induced effects,
which scales up the DECam rate for ``unexplained'' false detections to 396 deg$^{-2}$.
In addition, there will be 60 deg$^{-2}$ false detections due to background fluctuations
(eq.~\ref{eq-theory}, referenced to the median seeing of 0.85 arcsec).
The total false detection rate of $\sim450$ deg$^{-2}$ anticipated for LSST is thus
comparable to the rate of astrophysical transients. Again, this estimate of the false
detection rate is conservative, and it would not be very surprising if the true
rate turned out to be much smaller.
In addition to the uncertainties mentioned above, it is hard to precisely account for
the fact that LSST images will be about one magnitude deeper than the analyzed DECam
images. The contribution to false detections from background fluctuations
(60 deg$^{-2}$) should not be changed because it depends on SNR, not the
specific magnitude limit. If all the remaining
false detections are due to pixel-induced effects, they are also dependent on SNR
(i.e. counts) and not on magnitude. In this scenario, the number of false detections
would not vary even though LSST images would be deeper. If instead all remaining
false detections are astrophysical in nature, the DECam rate (230 deg$^{-2}$) should
{\it not} be multiplied by 1.72 for pixel-scaling, and instead should be corrected for the
difference in image depth. Since the observed differential
\DIASource distribution scales approximately as SNR$^{-2.5}$, the cumulative
counts scale as SNR$^{-1.5}$; one magnitude corresponds to a factor of 2.512 in
flux and thus increases the sample size by a factor of $2.512^{1.5} \approx 4$.
More precisely, we know
that one third of the ``unexplained'' false detections are likely pixel-induced effects
(the 25\% that were clearly uncorrected artifacts above),
and no more than two thirds can be astrophysical, so the scaled rate expected
for LSST would be $(1/3 \times 1.72 + 2/3 \times 4) \times 230 + 60 \approx 805$ deg$^{-2}$, or about
a factor of 1.8 higher than the adopted rate of $\sim450$ deg$^{-2}$. We emphasize
that this estimate corresponds to the worst case scenario that is rather unlikely
to be correct.
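For transparency, the scaling arithmetic in this subsection can be reproduced
in a few lines of Python (all inputs are the numbers quoted in the text):
\begin{verbatim}
unexplained_decam = 230.0            # "unexplained" DECam rate, deg^-2
background_lsst = 60.0               # LSST background-fluctuation rate, deg^-2
pixel_ratio_sq = (0.263 / 0.2)**2    # (DECam/LSST pixel size)^2, ~1.72
depth_factor = 2.512**1.5            # 1 mag deeper, cumulative N ~ SNR^-1.5

adopted = unexplained_decam * pixel_ratio_sq + background_lsst
worst = (1/3 * pixel_ratio_sq + 2/3 * depth_factor) * unexplained_decam \
        + background_lsst
print(adopted, worst, worst / adopted)   # ~456, ~805, ~1.8
\end{verbatim}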
\subsubsection{Spatially Correlated Transients}
\begin{figure}
\centering
\plotone{figures/stacked_tracklets.pdf}
\caption{
``Stacked'' tracklets of three or more detections surrounding bright stars are shown in the left panel
as lines. $\Delta \textrm{Dec}$ and $\Delta \textrm{RA}$ are the tracklet coordinates relative to a nearby
bright star, such that any correlated features due to, e.g., bleed trails or
diffraction spikes, will show a grouping of lines with similar positions and
orientations. No such features are seen and the vast majority of
tracklets are consistent with real moving objects (and on average aligned with
the Ecliptic). The ``hole'' in the center of the plot comes from the rejection of \DIASources
in regions with significant flux in the template image. The right panel shows
the position angle of the tracklets (degrees East of North), while the dashed
vertical line indicates the angle that would be seen if the tracklets were on
lines of constant Ecliptic latitude. Again the vast majority appear to be
aligned with the Ecliptic.
}
\label{fig:stacked_tracklets}
\end{figure}
We have checked for correlations in tracklets around bright stars, which
might arise if either optical or processing artifacts exhibited some preferred
alignment. Such an alignment could create false tracklets at a rate greater than
that expected from an overdensity of uncorrelated detections. To investigate these
correlations we generated tracklets using the current prototype version of MOPS.
For this test tracklets are required to have three or more detections (out of five visits
at each pointing) that align with velocities less than $2\deg$/day. For each
star brighter than 14th magnitude in the UCAC4 catalog \citep{UCAC4}, we
identified all tracklets within 4 arcminutes of the star, and ``stacked'' the
tracklets surrounding each star onto a single plot. The resulting tracklet
distribution can be seen in Figure~\ref{fig:stacked_tracklets}, where each black
line corresponds to a single tracklet. If, for example, there was an excess of
tracklets along CCD bleed trails from bright stars, these would appear as a set
of lines along the $\pm\Delta \textrm{RA}$ direction, or a peak at $90\deg$ or
$270\deg$ in the histogram (right panel of Figure~\ref{fig:stacked_tracklets}).
We do not see any such correlated detections after inspecting approximately
5,000 tracklets (the number in the plot is limited for legibility).
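A minimal sketch of this stacking test, using synthetic positions and data
structures of our own choosing (the actual analysis uses MOPS tracklets and
UCAC4 stars), is:
\begin{verbatim}
import numpy as np

# Positions in degrees on a small tangent plane (cos(Dec) factors omitted)
rng = np.random.default_rng(7)
stars = rng.uniform(0, 1, size=(50, 2))          # bright-star (RA, Dec)
tracklets = [rng.uniform(0, 1, 2) +
             np.cumsum(rng.normal(0, 0.01, (3, 2)), axis=0)
             for _ in range(200)]                # toy 3-detection tracklets

stacked, angles = [], []
for trk in tracklets:
    dist = np.hypot(*(stars - trk.mean(axis=0)).T)
    for star in stars[dist < 4.0 / 60.0]:        # stars within 4 arcmin
        stacked.append(trk - star)               # (dRA, dDec) relative to star
        dra, ddec = trk[-1] - trk[0]
        angles.append(np.degrees(np.arctan2(dra, ddec)) % 360)  # PA, E of N

# Artifact alignments (e.g. bleed trails) would pile up near 90 or 270 deg
print(np.histogram(angles, bins=12, range=(0, 360))[0])
\end{verbatim}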
We also investigated whether \DIASources from multiple visits are correlated
in pixel coordinates, which might arise from uncorrected detector anomalies.
We analyzed a 4-visit subset of the 15 science visits used in the rest of this
section, and for each \DIASource computed the distance in pixels to any neighboring
\DIASources, even if they appeared on different visits or different fields
(similar to the 2-point correlation function). From this we identified 24
\DIASources that were located within two pixels of another \DIASource. We did
not find any correlation at larger radii. Visual inspection shows that many are
near parts of an image where a defect (such as a cosmic ray, bad column, bleed
trail, etc.) had been interpolated over, though for some the cause is unclear.
The implied density of correlated \DIASources is about 2.3 deg$^{-2}$, rendering
this effect relatively unimportant.
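The pair search itself is straightforward; a sketch with synthetic coordinates
(the real test pools \DIASource pixel positions across visits and fields) is:
\begin{verbatim}
import numpy as np
from scipy.spatial import cKDTree

# Synthetic stand-in for DIASource pixel coordinates pooled across visits
rng = np.random.default_rng(3)
xy = rng.uniform(0, 4096, size=(5000, 2))

# All pairs of DIASources separated by less than two pixels; an excess of
# such pairs at fixed pixel locations would flag detector anomalies
tree = cKDTree(xy)
print(len(tree.query_pairs(r=2.0)), "close pairs")
\end{verbatim}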
We note that the number and implied sky density of asteroids bright enough to produce
scattered-light and diffraction-spike artifacts, which would appear to move at solar-system
rates, are at least two orders of magnitude smaller than the sky density of asteroids down
to LSST depth on the Ecliptic. Therefore, such artifacts will be unimportant as a contributor
to false tracks.