forked from nservant/HiC-Pro
-
Notifications
You must be signed in to change notification settings - Fork 0
/
NEWS
433 lines (226 loc) · 12.6 KB
/
NEWS
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
***********************************
CHANGES IN VERSION 2.9.0
NEW FEATURES
o update iced_0.4.2
SIGNIFICANT USER-VISIBLE CHANGES
o Fix issues with floating values in hicpro2fithic.py (F. Ay)
o Update hicpro2fithic.py script. New '-r' option to set up the data resolution (F. Ay)
o samtools >1.0 is now required
o The -s build_contact_maps option now directly creates matrix files from .allValidPairs file. It does not merge the .validPairs anymore. Please use -s merge_persample to merge .validPairs files and remove duplicates.
o Update manual and correct errors
BUG FIXES
o Fix bug for SLURM support
o Fix python path in merge_valid_interaction.sh
o Fix bug in hicpro2juicebox when chromosome names have "_"
o Fix a bug in detection of religation events
o Fix some issue with sort -T. This option is no longer used and replaced by setting the TMPDIR variable
***********************************
CHANGES IN VERSION 2.8.0
NEW FEATURES
o Return bias vector after IC normalization
o New utils hicpro2fithic.py to convert HiC-Pro output into fitHiC input
SIGNIFICANT USER-VISIBLE CHANGES
o Update iced version (-> 0.4.0). This version includes a major changes on matrix filtering before iced normalization
o sparseToDense.py utils. Create output file in the current folder by default
BUG FIXES
o Fix bug in the help message of digest_genome.py
***********************************
CHANGES IN VERSION 2.7.9
NEW FEATURES
o Update for capture Hi-C. New variable CAPTURE_TARGET allowing to restrict the analysis to the captured region(s)
SIGNIFICANT USER-VISIBLE CHANGES
o Update iced version (-> 0.3.0)
o Exit with an error if the restriction fragment file is set but not found. In the previous version, HiC-Pro automatically switched to DNAse mode.
o Optimisation of python scripts
BUG FIXES
***********************************
CHANGES IN VERSION 2.7.8
NEW FEATURES
o Religation are now discarded and considered as invalid pairs. Religation are defined as FR read pairs involving contiguous restriction fragments
SIGNIFICANT USER-VISIBLE CHANGES
o Installation process has been changed ! Please read the manual !
o Update of sparseTodense utils. It now provides a ---perchr option to generate intrachromosomal dense contact maps
o Clean hicpro2juicebox.sh utils
BUG FIXES
o Update script allowing to check the python version
o Bug fix in R code in case of NA values
***********************************
CHANGES IN VERSION 2.7.7
NEW FEATURES
o New utility : sparseToDense.py - Convert a sparse symmetric matrix file in dense format for further analysis
o Pull request from J. Brayet for installation process
SIGNIFICANT USER-VISIBLE CHANGES
o Update of the hicpro2juicebox.sh utils. -g option now requires the chromosome size files from HiC-Pro. This version now works for any organism
o Update of the manual
o Check python module version during installation
o Pull request NV - udpate iced version. Note that this version of iced rescales automatically the normalized matrix such that the total number of counts is identical to the original matrix
o Add -c option in hicpro2juicebox.sh utils to write the "chr" prefix before the chromosome name. The use of "chr" depends on the reference genome of juicebox
***********************************
CHANGES IN VERSION 2.7.6
NEW FEATURES
o New utils : hicpro2juicebox. HiC-Pro results can now be converted in Juicebox input for visualization (A. Barrera, N. Servant)
o New utils : make_viewpoints.py. Allow to generate viewpoint from the list of valid interaction products. Initially developed for capture-C analysis
SIGNIFICANT USER-VISIBLE CHANGES
o validPairs output format was updated for Juicebox compatibility. New columns were added with restriction fragment names and mapping quality
BUG FIXES
o Fix bug in hic_inc.sh for numeric comparison
o Fix bug in bam/sam extension removal
***********************************
CHANGES IN VERSION 2.7.4
NEW FEATURES
o Support for LSF scheduler thanks to J. Phlipps-Cremins
SIGNIFICANT USER-VISIBLE CHANGES
o Update singleton reporting in pairing mode (pull request N. Varoquaux, F. Ay)
o Additional checks on input files to report more "user-friendly" errors
BUG FIXES
o Fix bugs in singleton and multiple hits reporting
o Fix bug when fastq.gz files are processed without parallel mode but using a cluster
***********************************
CHANGES IN VERSION 2.7.3b
BUG FIXES
o Bug fixed if space in the configuration file
o Bug fixed in plot_pairing_portion.R (pull request M. Blum)
o Update samtools sort for compatibility with version >1.1
SIGNIFICANT USER-VISIBLE CHANGES
o Update python access in utils (pull request D. Vanichkina)
o Update iced version (0.2.1 official release)
***********************************
CHANGES IN VERSION 2.7.2b
NEW FEATURES
o HiC-Pro is now compatible with the HiCPlotter viewer (see https://github.com/kcakdemir/HiCPlotter)
o Add support for the SLURM scheduler thanks to A. D'Ippolito !
o Add support for the SGE scheduler thanks to G. Li !
SIGNIFICANT USER-VISIBLE CHANGES
o be careful - configuration files for installing and running HiC-Pro have been updated to manage multiple schedulers !
***********************************
CHANGES IN VERSION 2.7.1
NEW FEATURES
o Add new stepwise option "merge_persample". Migth be useful in some case to merge the different samples and remove the duplicates if specified in the config files. This step is also run in the "build_contact_map" steps, before generating the contact maps.
o Can now generate raw contact maps at the restriction fragment resolution (build_matrix v1.2). To do so, specified a BIN_SIZE=-1. Note that in pratice ICE will be run on these matrices too although its assumption of the equal visibility across bins may require further exploration in the present case.
o New version of the iced package 0.2.1
SIGNIFICANT USER-VISIBLE CHANGES
o Change in the calculation of insert size distribution. The length of DNA fragment after ligation are now calculated using the read start. In the previous version, we used the middle of the read as starting point. The insert size was therefore shifted by a factor equal to the read length
o Change default behavior of split_reads.py utility (output = "./", nreads=20e6)
o ALLELE_SPECIFIC_SNP file will be detected using the full absolute path or in the annotation folder
o R1 and R2 reads are now ordered by genomic position in the validPairs files. This might have an impact on the duplicates level
o BIN_STEP variable from the configuration file is now deprecated
o Update some graphical labels
BUG FIXES
o Bug fix in C++ compilation. Add sys/stat.h inclusion
o Fix bug in ice normalization. FILTER_HIGH_COUNT_PERC were not considered in ice_norm.sh
***********************************
CHANGES IN VERSION 2.7.0
NEW FEATURES
o New parameters - FILTER_HIGH_COUNT_PERC / FILTER_LOW_COUNT_PERC to filter out extreme count before normalization. Note that the parameter SPARSE_FILTERING is now deprecated and replaced by FILTER_LOW_COUNT_PERC
o New utility ; digest_genome.py which take a fasta file and the name(s) or sequence(s) of the restriction enzyme(s) in order to generate the list of restriction fragments after genome digestion.
o HiC-Pro is now able to process data generate from protocols without restriction enzyme such as DNase Hi-C. See the manual for more information
SIGNIFICANT USER-VISIBLE CHANGES
o The mapping step2 is now optional and will be run only if the ligation site is specified
o New parameter MIN_CIS_DIST in order to discard all contacts below the specified distance. This is mainly used for DNase Hi-C to remove artefact which are likely to be self ligation product.
BUG FIXES
o Fix bug in plot scale
***********************************
CHANGES IN VERSION 2.6.0
NEW FEATURES
o Allele specific version of HiC-Pro
o New utility to build the VCF file for allele specific analysis
o New manual version
o Manual is now online at http://nservant.github.io/HiC-Pro/
SIGNIFICANT USER-VISIBLE CHANGES
o New configuration file with ALLELE_SPECIFIC_SNP field. If specified HiC-Pro is run in allele specific mode.
o Improve stepwise analysis. Check input files according to the specified step
o The split_reads.py utility can now handle fastq.gz files
BUG FIXES
o Fix bug in build_raw_maps.sh in --chrsize (reported by D. Robelin)
o Fix bugs in CIS short/long ranges interaction reporting
o Fix bugs in bowtie pairing - option -m and -s
o Fix bug in mapped_2hic_fragments when singleton are not removed during the pairing
***********************************
CHANGES IN VERSION 2.5.2
NEW FEATURES
o New quality control plots, i.e. duplicates level, short/long range interactions, fragment size
SIGNIFICANT USER-VISIBLE CHANGES
o HiC-Pro can now be run from BAM aligned files
o Stepwise analysis can now be run in addition to the parallel mode
o Additional checks when running the pipeline
o Quality control plots are now run after each step instead of once at the end of the pipeline
BUG FIXES
o Bug fixed when HiC-Pro is run with relative path
o Bug fixed in mergeSAM.py in case of duplicated reference in the SAM header
o Check PREFIX installation variable during installation process
***********************************
CHANGES IN VERSION 2.5.1
SIGNIFICANT USER-VISIBLE CHANGES
o samtools 0.1.19 or higher is required
o Any character after "\" in read names are now discarded so that myread\1 == myread\2
o Warning is printed during the restriction fragment assignment is a chromosome is not defined in the annotation file
BUG FIXES
o Add pyton path in ice_norm.sh
o Bug fixed in iced genomic intervals symlink
o Add missing ${PYTHON_PATH} variable in scripts
o Add pysam check during installation
o Bug in install process
o Bug in python PATH
***********************************
CHANGES IN VERSION 2.5.0
SIGNIFICANT USER-VISIBLE CHANGES
o MergeSAM.pl is now in python based on pysam library
o All SAM outputs are converted into BAM files
o Check dependencies version number during installation
***********************************
CHANGES IN VERSION 2.4.2
NEW FEATURES
o Add HiC-Pro utilities - split_reads
***********************************
CHANGES IN VERSION 2.4.1
SIGNIFICANT USER-VISIBLE CHANGES
o Add PREFIX variable in installation configuration to set the installation directory
o Improve logs reporting and organization
o Remove SAM files by default when running the complete workflow
BUG FIXES
o Bug fixed in rmdup
***********************************
CHANGES IN VERSION 2.4.0
BUG FIXES
o Bug fixed if input path is not absolute
NEW FEATURES
o Support both short and long options
SIGNIFICANT USER-VISIBLE CHANGES
o ORGANISM option in config file replaced by REFERENCE_GENOME
o CUT_SITE_5OVER option in config file replaced by LIGATION_SITE
o GENOME_SIZE and GENOME_FRAGMENT files are detected as they are specified or from the annotation folder
***********************************
CHANGES IN VERSION 2.3.0
NEW FEATURES
o Update error and log reporting
o Change scripts folder
o Automatic installation process, using make install
o HiC-Pro can now be run using the bin/HiC-Pro bash script
o Add licence in all scripts
o Bug fixed in build_map - new option matrix_format asis/lower/upper/complete
o New version of ICE normalization
***********************************
CHANGES IN VERSION 2.2.0
o Add ICE normalization
o Change mapping strategy. Local mapping is replaced by a global approach on trimmed reads
***********************************
CHANGES IN VERSION 2.1.0
NEW FEATURES
o Manage .fastq.gz files
o New pictures for mapping, pairing and valid pairs results
o Check cutSite for local mapped reads (mergeSAM.pl)
o New output option for overlapMapped2HiCFragment.py --samOutput
o Change read position used for fragment overlap (middle instead of 5' end to avoid cutting site sequencing)
o Remove file conversion sam to .aln
o Additional filters on MAPQ (mergeSAM.pl)
o Additional check to discard invalid reads pair (mergeSAM.pl)
***********************************
CHANGES IN VERSION 2.0
NEW FEATURES
o Major release by E. Viara
o New parallelized version of the pipeline. The workflow is parallelized by read pairs from the alignment, to the list of valid interactions
o New version of build_matrix tool written in C++
***********************************
CHANGES IN VERSION 1.0
NEW FEATURES
o First version of the pipeline