# GeneMark-ETP project
####################################################
# Technical details
# Check the text below if you are interested in:
# Which third-party tools are included in ETP?
# How are third-party tools compiled and configured?
####################################################
## THIRD PARTY TOOLS expected in the default path -----------
wget
"wget" is required only for some input files.
Users may provide a link to a web resource in the YAML species configuration file, for example, a link to a genome file in GenBank.
In this case, the genome is downloaded with the "wget" command.
"wget" is not required when all input files are located on the local computer system.
scp
"scp" is required when input files are located on a remote computer system and the user account is configured for remote file copy.
gunzip
"gunzip" is required when input files are compressed by gzip.
## GeneMark dependencies -----------
GeneMark-ETP includes two GeneMark packages:
- GeneMark-ES/ET/EP+
- GeneMarkS-T
with corresponding dependencies.
GeneMark-ES/ET/EP+ is located in
bin/gmes
The source code of the GeneMark.hmm eukaryotic 3 algorithm is available here:
https://github.com/gatech-genemark/GeneMarkHmmEukaryotic3
GeneMarkS-T is located in
bin/gmst
Some C/C++ code dependencies from GeneMark* packages:
- probuild
## THIRD PARTY TOOLS in "bin" folder -----------
Two programs from the AUGUSTUS and BRAKER projects are included in GeneMark-ETP:
- bam2hints
- filterIntronsFindStrand.pl
The code of bam2hints was compiled using the following commands:
```
cd tools
mkdir -p src
cd src
git clone https://github.com/pezmaster31/bamtools.git
cd bamtools/
mkdir -p build
cd build
cmake ..
make
make install
cd ../../
git clone https://github.com/Gaius-Augustus/Augustus.git
cd Augustus/auxprogs/bam2hints/
```
Add the -static option to LDFLAGS in the Makefile, then build and copy the binary:
```
make
cp bam2hints ../../../../../bin
```
## THIRD PARTY TOOLS in PATH -----------
Please check licensing conditions for third-party tools.
The list of required third-party bioinformatics tools is below:
1. required
name: bedtools
version: version 2.30
source: https://github.com/arq5x/bedtools2
programs: bedtools
Code in the folder "tools" was installed using the following commands:
```
cd tools
mkdir -p src
cd src
wget https://github.com/arq5x/bedtools2/releases/download/v2.30.0/bedtools.static.binary
chmod +x bedtools.static.binary
mv bedtools.static.binary ../bedtools
```
2. required when GeneMark-ETP automatically downloads RNA-Seq data files from the NCBI SRA database
name: sra-tools
version: version 3.0.1
source: https://github.com/ncbi/sra-tools
programs: fastq-dump, prefetch, vdb-config
Code in the folder "tools" was installed using the following commands:
```
cd tools
mkdir -p src
cd src
wget https://ftp-trace.ncbi.nlm.nih.gov/sra/sdk/3.0.1/sratoolkit.3.0.1-centos_linux64.tar.gz
tar xf sratoolkit.3.0.1-centos_linux64.tar.gz
rm sratoolkit.3.0.1-centos_linux64.tar.gz
mv sratoolkit.3.0.1-centos_linux64 sratoolkit
cp sratoolkit/bin/fasterq-dump-orig.3.0.1 ../fastq-dump
cp sratoolkit/bin/prefetch-orig.3.0.1 ../prefetch
cp sratoolkit/bin/vdb-config ../vdb-config
```
ATTENTION
NCBI sratoolkit must be configured before first use.
Run the configuration program: vdb-config --interactive
The default configuration was used in our tests.
3. required for RNA-Seq alignment
name: hisat2
version: version 2.2.1
source: https://github.com/DaehwanKimLab/hisat2
programs: hisat2, hisat2-build, hisat2-build-l, hisat2-build-s, hisat2-align-l, hisat2-align-s
Code in the folder "tools" was installed using the following commands:
```
cd tools
mkdir -p src
cd src
wget https://cloud.biohpc.swmed.edu/index.php/s/fE9QCsX3NH4QwBi/download
mv download hisat2-2.2.1-source.zip
unzip hisat2-2.2.1-source.zip
rm hisat2-2.2.1-source.zip
cd hisat2-2.2.1
```
Add the -static option to RELEASE_FLAGS in the Makefile, so that the line reads:
RELEASE_FLAGS = -O3 $(BITS_FLAG) $(SSE_FLAG) -funroll-loops -g3 -static
```
make
cp hisat2 ../../
cp hisat2-build ../../
cp hisat2-build-l ../../
cp hisat2-build-s ../../
cp hisat2-align-l ../../
cp hisat2-align-s ../../
cd ../../
```
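The manual edit of RELEASE_FLAGS described above could also be scripted. The following is a minimal sketch, not the method used to build the bundled binaries: the helper name "add_static_flag" is hypothetical, and it assumes the Makefile defines RELEASE_FLAGS on a single line beginning with "RELEASE_FLAGS =".
```shell
# Append -static to the RELEASE_FLAGS line of a Makefile, unless it is already there.
# "add_static_flag" is a hypothetical helper name, not part of hisat2 or GeneMark-ETP.
add_static_flag() {
    makefile="$1"
    # Skip the edit when the flag is already present, so repeated runs are harmless.
    if ! grep -q '^RELEASE_FLAGS.*-static' "$makefile"; then
        # -i.bak edits in place and keeps the original file as <makefile>.bak.
        sed -i.bak '/^RELEASE_FLAGS[[:space:]]*=/ s/[[:space:]]*$/ -static/' "$makefile"
    fi
}

# Usage, from inside hisat2-2.2.1 and before running make:
#   add_static_flag Makefile
```
After the call, the RELEASE_FLAGS line should end in -static, as in the line shown above; it is worth inspecting the Makefile before running make.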
4. required for RNA-Seq alignment BAM file preparation
name: samtools
version: version 1.16.1
source: https://github.com/samtools/samtools/releases/download/1.16.1/samtools-1.16.1.tar.bz2
programs: samtools
Code in the folder "tools" was installed using the following commands:
```
cd tools
mkdir -p src
cd src
wget https://github.com/samtools/samtools/releases/download/1.16.1/samtools-1.16.1.tar.bz2
tar xf samtools-1.16.1.tar.bz2
rm samtools-1.16.1.tar.bz2
cd samtools-1.16.1
./configure --without-libdeflate --disable-bz2 --disable-lzma --without-curses --disable-libcurl
make samtools LDFLAGS="-static"
cp samtools ../../
cd ../../
```
5. required
name: stringtie
version: version 2.2.1
source: https://github.com/gpertea/stringtie.git
programs: stringtie
Code in the folder "tools" was installed using the following commands:
```
cd tools
mkdir -p src
cd src
git clone https://github.com/gpertea/stringtie.git
cd stringtie
make clean release LDFLAGS=' -static '
cp stringtie ../../
cd ../../
```
6. required
name: gffread
version:
source: https://github.com/gpertea/gffread
programs: gffread
Code in the folder "tools" was installed using the following commands:
```
cd tools
mkdir -p src
cd src
git clone https://github.com/gpertea/gffread
cd gffread
make LDFLAGS=' -static '
cp gffread ../../
cd ../../
```
7. required
name: diamond
version:
source: https://github.com/bbuchfink/diamond.git
programs: diamond
Code in the folder "tools" was installed using the following commands:
```
cd tools
mkdir -p src
cd src
git clone https://github.com/bbuchfink/diamond.git
cd diamond
echo "" >> CMakeLists.txt
echo "SET_TARGET_PROPERTIES (diamond PROPERTIES LINK_SEARCH_START_STATIC 1)" >> CMakeLists.txt
echo "SET_TARGET_PROPERTIES (diamond PROPERTIES LINK_SEARCH_END_STATIC 1)" >> CMakeLists.txt
mkdir build
cd build
cmake .. -DBUILD_STATIC=ON -DSTATIC_LIBGCC=ON -DSTATIC_LIBSTDC++=ON
make
cp diamond ../../../
cd ../../
```
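Once the tools are installed, a quick check can confirm that every program named in this file can be found. The following is a minimal sketch under stated assumptions: "missing_tools" is a hypothetical helper name, the check is run from the directory that contains the "tools" folder, and that folder is added to PATH only for the duration of the check. Some programs are conditional (wget, scp, gunzip and sra-tools are needed only for the input types described above), so a reported name is not always an error.
```shell
# Print the name of every listed program that cannot be found in PATH.
# "missing_tools" is a hypothetical helper name, not part of GeneMark-ETP.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Assumption: run from the directory that contains "tools".
# The subshell keeps the PATH change local to this check.
(
    PATH="$PWD/tools:$PATH"
    missing_tools wget scp gunzip \
        bedtools fastq-dump prefetch vdb-config \
        hisat2 hisat2-build hisat2-build-l hisat2-build-s hisat2-align-l hisat2-align-s \
        samtools stringtie gffread diamond
)
```
No output means every program was found; otherwise each missing program is printed on its own line.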
## END -----------