<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>deelaaay learning</title>
<link href="/atom.xml" rel="self"/>
<link href="http://idcat.cn/"/>
<updated>2019-11-21T00:47:14.015Z</updated>
<id>http://idcat.cn/</id>
<author>
<name>deelaaay</name>
</author>
<generator uri="http://hexo.io/">Hexo</generator>
<entry>
<title>tahoe简单部署</title>
<link href="http://idcat.cn/tahoe%E7%AE%80%E5%8D%95%E9%83%A8%E7%BD%B2.html"/>
<id>http://idcat.cn/tahoe简单部署.html</id>
<published>2019-11-21T00:46:41.000Z</published>
<updated>2019-11-21T00:47:14.015Z</updated>
<content type="html"><![CDATA[<p>环境:</p><p>6个节点</p> <a id="more"></a><p>1、 每个节点部署安装tahoe,并把自己的ip地址和主机名写入hosts文件</p><p>2、 角色配置</p><p>切换到venv环境下:</p><p>#source venv/bin/activate</p><p>Introducer</p><p># tahoe create-introducer --hostname=kub2 . ##后面有个点号,代表当前目录</p><p>#tahoe start . 启动服务,将当前目录下private目录中的introducer.furl文件拷贝到client和storage节点上。</p><p>Client:</p><p>##tahoe create-client </p><p>生成.tahoe目录,拷贝introducer.furl到配置文件中。</p><p>启停服务</p><p>(venv) [root@kub1 .tahoe]# tahoe stop .</p><p>STOPPING '/root/.tahoe'</p><p>process 6349 is dead</p><p>(venv) [root@kub1 .tahoe]# tahoe start .</p><p>STARTING '/root/.tahoe'</p><p>daemonizing in '/root/.tahoe'</p><p>starting node in '/root/.tahoe'</p><p>Node has started successfully</p><p>(venv) [root@kub1 .tahoe]#</p><p>Storage:</p><p>#tahoe create-node --hostname=kub3 . #有个点号 </p><p>拷贝introducer.furl到配置文件tahoe.cfg中</p><p>#tahoe start . 启动服务 </p><p>浏览器访问192.168.5.12:3456</p>]]></content>
<summary type="html">
<p>环境:</p>
<p>6个节点</p>
</summary>
<category term="tahoe" scheme="http://idcat.cn/tags/tahoe/"/>
</entry>
<entry>
<title>hibench7.0编译安装</title>
<link href="http://idcat.cn/hibench7-0%E7%BC%96%E8%AF%91%E5%AE%89%E8%A3%85.html"/>
<id>http://idcat.cn/hibench7-0编译安装.html</id>
<published>2019-11-20T12:38:25.000Z</published>
<updated>2019-11-20T12:40:50.822Z</updated>
<content type="html"><![CDATA[<p>hibench7.0编译安装</p><p><strong>引言</strong></p><p>HiBench是一个大数据基准套件,可以帮助您评测不同大数据平台的性能、吞吐量和系统资源利用率。本文仅介绍如何对hadoop进行测试,其他大数据平台使用请参考官网<a href="https://github.com/intel-hadoop/HiBench" target="_blank" rel="noopener">https://github.com/intel-hadoop/HiBench</a></p><p><strong>软件依赖</strong></p><p>HiBench需要java环境,以及Maven管理。</p><a id="more"></a><p><strong>安装java运行环境</strong></p><p>新建目录/home/java 然后上传相应的jdk二进制文件到此目录并解压。(本例为aarch64为平台,x86平台一样的操作步骤,亲试)</p><p>root@kylin1:/home/java# ll</p><p>总用量 71608</p><p>drwxr-xr-x 3 root root 4096 4月 8 11:59 ./</p><p>drwxr-xr-x 4 root root 4096 4月 8 11:28 ../</p><p>drwxr-xr-x 7 root root 4096 4月 8 11:44 jdk1.8.0_201/</p><p>-rw-r–r– 1 root root 73312819 4月 8 11:43 jdk-8u201-linux-arm64-vfp-hflt.tar.gz</p><p>添加环境变量:/etc/profile 文件末尾增加如下行</p><p>export JAVA_HOME=/home/java/jdk1.8.0_201</p><p>export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar</p><p>export PATH=$PATH:$JAVA_HOME/bin</p><p>使其生效</p><p>source /etc/profile</p><p>查看</p><p>root@kylin1:/home/java# java -version</p><p>java version “1.8.0_201”</p><p>Java(TM) SE Runtime Environment (build 1.8.0_201-b09)</p><p>Java HotSpot(TM) 64-Bit Server VM (build 25.201-b09, mixed mode)</p><p><strong>安装Maven</strong></p><p>下载Maven包</p><p>wget <a href="http://apache.fayea.com/maven/maven-3/3.6.0/binaries/apache-maven-3.6.0-bin.zip" target="_blank" rel="noopener">http://apache.fayea.com/maven/maven-3/3.6.0/binaries/apache-maven-3.6.0-bin.zip</a></p><p>解压缩</p><p>unzip apache-maven-3.6.0-bin.zip -d /usr/local/</p><p>添加环境变量</p><p>cat /etc/profile,在文件末尾增加如下</p><p>export M3_HOME=/usr/local/apache-maven-3.6.0</p><p>export PATH=$M3_HOME/bin:$PATH</p><p>source /etc/profile</p><p><strong>测试Maven环境</strong></p><p>mvn -v</p><p>看到相应版本信息输出即表明配置正确:</p><p>root@kylin3:/usr/local/HiBench-HiBench-7.0# mvn -v</p><p>Apache Maven 3.6.0 (97c98ec64a1fdfee7767ce5ffb20918da4f719f3; 2018-10-25T02:41:47+08:00)</p><p>Maven home: /usr/local/apache-maven-3.6.0</p><p>Java version: 1.8.0_201, vendor: Oracle Corporation, runtime: /home/java/jdk1.8.0_201/jre</p><p>Default locale: zh_CN, platform encoding: UTF-8</p><p>OS name: “linux”, version: “4.4.58-20180615.kylin.server.yun+-generic”, arch: “aarch64”, family: “unix”</p><p><strong>下载HiBench</strong></p><p>git clone <a href="https://github.com/intel-hadoop/HiBench.git比较慢,建议直接网页下载zip包,然后解压到/usr/local目录下。" target="_blank" rel="noopener">https://github.com/intel-hadoop/HiBench.git比较慢,建议直接网页下载zip包,然后解压到/usr/local目录下。</a></p><p>root@kylin3:/usr/local/HiBench-HiBench-7.0# pwd</p><p>/usr/local/HiBench-HiBench-7.0</p><p><strong>安装Hibench</strong></p><p>切到HiBench下,执行对应的安装操作,可以选择自己想要安装的模块。以安装hadoop框架为例:</p><p>root@kylin3:/usr/local/HiBench-HiBench-7.0# mvn -Phadoopbench -Dspark=2.1 -Dscala=2.11 clean package</p><p>更多安装方法参考<a href="https://github.com/Intel-bigdata/HiBench/blob/master/docs/build-hibench.md" target="_blank" rel="noopener">https://github.com/Intel-bigdata/HiBench/blob/master/docs/build-hibench.md</a></p><p>因为网络原因,下次模块过程中经常出现暂停现象。当前解决办法暂停下载,ctrl+c 然后继续执行上面的命令。</p><p><strong>遇到的问题:</strong></p><p><strong>[WARNING] Could not get content</strong></p><p><strong>org.apache.maven.wagon.TransferFailedException: Failed to transfer file <a href="http://archive.apache.org/dist/hive/hive-0.14.0/apache-hive-0.14.0-bin.tar.gz" target="_blank" rel="noopener">http://archive.apache.org/dist/hive/hive-0.14.0/apache-hive-0.14.0-bin.tar.gz</a> with status code 503</strong></p><p>下载hive出错,解决:</p><p>root@kylin3:/usr/local/HiBench-HiBench-7.0# 
pwd</p><p>/usr/local/HiBench-HiBench-7.0</p><p>root@kylin3:/usr/local/HiBench-HiBench-7.0# vim hadoopbench/sql/pom.xml</p><p>修改该文件中,将http改为https即可</p> <properties><br><br> <strong><repo><a href="https://archive.apache.org" target="_blank" rel="noopener">https://archive.apache.org</a></repo></strong><br><br> <file>dist/hive/hive-0.14.0/apache-hive-0.14.0-bin.tar.gz</file><br><br> </properties><p><strong>org.apache.maven.wagon.TransferFailedException: Failed to transfer file <a href="http://archive.apache.org/dist/nutch/apache-nutch-1.2-bin.tar.gz" target="_blank" rel="noopener">http://archive.apache.org/dist/nutch/apache-nutch-1.2-bin.tar.gz</a> with status code 503</strong></p><p>root@kylin3:/usr/local/HiBench-HiBench-7.0/hadoopbench/nutchindexing# pwd</p><p>/usr/local/HiBench-HiBench-7.0/hadoopbench/nutchindexing</p><p>vim pod.xml</p><p>修改如下文件由http改成https</p><p> <configuration></configuration></p><p> <url><a href="https://archive.apache.org/dist/nutch/apache-nutch-1.2-bin.tar.gz" target="_blank" rel="noopener">https://archive.apache.org/dist/nutch/apache-nutch-1.2-bin.tar.gz</a></url></p><p> </p><p>最终结果</p><p>[INFO] Reactor Summary:</p><p>[INFO] </p><p>[INFO] hibench 7.0 …………………………………. SUCCESS [ 0.173 s]</p><p>[INFO] hibench-common 7.0 …………………………… SUCCESS [ 10.755 s]</p><p>[INFO] HiBench data generation tools 7.0 ……………… SUCCESS [ 12.696 s]</p><p>[INFO] hadoopbench 7.0 ……………………………… SUCCESS [ 0.004 s]</p><p>[INFO] hadoopbench-sql 7.0 ………………………….. SUCCESS [ 2.355 s]</p><p>[INFO] mahout 7.0 ………………………………….. SUCCESS [ 6.676 s]</p><p>[INFO] PEGASUS: A Peta-Scale Graph Mining System 2.0-SNAPSHOT SUCCESS [ 0.978 s]</p><p>[INFO] nutchindexing 7.0 ……………………………. SUCCESS [ 41.968 s]</p><p>[INFO] ————————————————————————</p><p>[INFO] BUILD SUCCESS</p><p>[INFO] ————————————————————————</p><p>[INFO] Total time: 01:15 min</p><p>[INFO] Finished at: 2019-04-09T11:58:27+08:00</p><p>[INFO] ————————————————————————</p>]]></content>
<summary type="html">
<p>hibench7.0编译安装</p>
<p><strong>引言</strong></p>
<p>HiBench是一个大数据基准套件,可以帮助您评测不同大数据平台的性能、吞吐量和系统资源利用率。本文仅介绍如何对hadoop进行测试,其他大数据平台使用请参考官网<a href="https://github.com/intel-hadoop/HiBench" target="_blank" rel="noopener">https://github.com/intel-hadoop/HiBench</a></p>
<p><strong>软件依赖</strong></p>
<p>HiBench需要java环境,以及Maven管理。</p>
</summary>
<category term="hibench7.0" scheme="http://idcat.cn/tags/hibench7-0/"/>
</entry>
<entry>
<title>docker基本操作</title>
<link href="http://idcat.cn/docker%E5%9F%BA%E6%9C%AC%E6%93%8D%E4%BD%9C.html"/>
<id>http://idcat.cn/docker基本操作.html</id>
<published>2018-12-05T00:02:21.000Z</published>
<updated>2019-11-21T00:55:02.239Z</updated>
<content type="html"><![CDATA[<p>Ubuntu14.04安装docker 摘自aliyun官网<br><a href="https://yq.aliyun.com/articles/110806?spm=5176.8351553.0.0.691e1991DpXzU5" target="_blank" rel="noopener">https://yq.aliyun.com/articles/110806?spm=5176.8351553.0.0.691e1991DpXzU5</a></p><a id="more"></a><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line"># step 1: 安装必要的一些系统工具</span><br><span class="line">sudo apt-get update</span><br><span class="line">sudo apt-get -y install apt-transport-https ca-certificates curl software-properties-common</span><br><span class="line"># step 2: 安装GPG证书</span><br><span class="line">curl -fsSL http://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add -</span><br><span class="line"># Step 3: 写入软件源信息</span><br><span class="line">sudo add-apt-repository "deb [arch=amd64] http://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable"</span><br><span class="line"># Step 4: 更新并安装 Docker-CE</span><br><span class="line">sudo apt-get -y update</span><br><span class="line">sudo apt-get -y install docker-ce</span><br><span class="line"></span><br><span class="line"># 安装指定版本的Docker-CE:</span><br><span class="line"># Step 1: 查找Docker-CE的版本:</span><br><span class="line"># apt-cache madison docker-ce</span><br><span class="line"># docker-ce | 17.03.1~ce-0~ubuntu-xenial | http://mirrors.aliyun.com/docker-ce/linux/ubuntu xenial/stable amd64 Packages</span><br><span class="line"># docker-ce | 17.03.0~ce-0~ubuntu-xenial | http://mirrors.aliyun.com/docker-ce/linux/ubuntu xenial/stable amd64 Packages</span><br><span class="line"># Step 2: 安装指定版本的Docker-CE: (VERSION 例如上面的 17.03.1~ce-0~ubuntu-xenial)</span><br><span class="line"># sudo apt-get -y install docker-ce=[VERSION]</span><br><span class="line"></span><br><span class="line">查看版本</span><br><span class="line">root@aaa:~# docker version</span><br><span class="line">Client:</span><br><span class="line"> Version:17.12.1-ce</span><br><span class="line"> API version:1.35</span><br><span class="line"> Go version:go1.9.4</span><br><span class="line"> Git commit:7390fc6</span><br><span class="line"> Built:Tue Feb 27 22:17:56 2018</span><br><span class="line"> OS/Arch:linux/amd64</span><br><span 
class="line"></span><br><span class="line">Server:</span><br><span class="line"> Engine:</span><br><span class="line"> Version:17.12.1-ce</span><br><span class="line"> API version:1.35 (minimum version 1.12)</span><br><span class="line"> Go version:go1.9.4</span><br><span class="line"> Git commit:7390fc6</span><br><span class="line"> Built:Tue Feb 27 22:16:28 2018</span><br><span class="line"> OS/Arch:linux/amd64</span><br><span class="line"> Experimental:false</span><br></pre></td></tr></table></figure><h5 id="docker基本操作"><a href="#docker基本操作" class="headerlink" title="docker基本操作"></a>docker基本操作</h5><p>登录阿里云容器镜像服务,创建自己仓库(配置加速器等参考<a href="https://cr.console.aliyun.com/cn-hangzhou/mirrors)" target="_blank" rel="noopener">https://cr.console.aliyun.com/cn-hangzhou/mirrors)</a></p><p>登入仓库</p><p>docker login 默认登入到hub.docker.com<br>docker login <a href="mailto:[email protected]" target="_blank" rel="noopener">[email protected]</a> registry.cn-hangzhou.aliyuncs.com #指定登入到阿里云容器</p><p>以上均输入密码即可。</p><p>root@aaa:~/.docker# cat config.json<br>{<br> “auths”: {<br> “registry.cn-hangzhou.aliyuncs.com”: {<br> “auth”: “MzkwOTqwdqdfmNvbTp6aGFuZ3RhbzE5ODg=”<br> }<br> },<br> “HttpHeaders”: {<br> “User-Agent”: “Docker-Client/17.12.1-ce (linux)”<br> }<br>}</p><p><strong>拉取镜像</strong></p><p>首先登入hub.docker找到合适镜像并pull</p><p>root@aaa:~# docker pull centos:7.2.1511<br>7.2.1511: Pulling from library/centos<br>f2d1d709a1da: Already exists<br>Digest: sha256:29083aecbc86ed398ee3464f69433e529039d6f640d50171b6b385bb0d28230d<br>Status: Downloaded newer image for centos:7.2.1511<br>root@aaa:~# docker images<br>REPOSITORY TAG IMAGE ID CREATED SIZE<br>centos 7.2.1511 4cbf48630b46 7 weeks ago 195MB</p><p><strong>上传镜像</strong></p><p>root@aaa:~# docker tag centos:7.2.1511 registry.cn-hangzhou.aliyuncs.com/momo/centos:7.2.1511</p><p>root@aaa:~# docker images<br>REPOSITORY TAG IMAGE ID CREATED SIZE<br>centos 7.2.1511 4cbf48630b46 7 weeks ago 195MB<br>registry.cn-hangzhou.aliyuncs.com/momo/centos 7.2.1511 4cbf48630b46 7 weeks ago 195MB</p><p>root@aaa:~# docker push centos:7.2.1511 registry.cn-hangzhou.aliyuncs.com/momo/centos:7.2.1511</p><p><strong>创建容器</strong></p><p>root@aaa:~# docker run -it centos:7.2.1511 /bin/bash<br>[root@fc7b209f4881 /]# </p><p><strong>退出容器</strong></p><p>exit 退出容器,并停止docker ps -a</p><p>ctrl+p+q 退出容器,但容器后台运行</p><p><strong>进入容器</strong></p><p>root@aaa:~# docker ps -a<br>CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES<br>b2699dcd1471 centos7.2:1511 “/bin/bash” About an hour ago Up About an hour volume<br>root@aaa:~# docker attach b2699dcd1471<br>[root@b2699dcd1471 ~]# ls<br>anaconda-ks.cfg</p><p>root@aaa:~# docker exec -it b2699dcd1471 /bin/bash<br>[root@b2699dcd1471 /]# read escape sequence (按ctrl+p+q后显示)</p><p><strong>容器启停</strong></p><p>docker start/stop/restart docker-id</p><p><strong>导入和导出容器</strong></p><p>root@aaa:~# docker run -it centos7.2:1511 /bin/bash<br>[root@281429989f81 /]# root@aaa:~#<br>root@aaa:~# docker ps -a<br>CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES<br>281429989f81 centos7.2:1511 “/bin/bash” 9 seconds ago Up 8 seconds brave<br>root@aaa:~# docker export 281429989f81 >centos7_run.tar<br>root@aaa:~#<br>root@aaa:~# cat centos7_run.tar | docker import - centos7:7.2.1511<br>sha256:a61afb9845ebf64ec9b2db88eaecd7eb916323d9f27535bdb93a3bfb6abf9423<br>root@aaa:~# docker images<br>REPOSITORY TAG IMAGE ID CREATED SIZE<br>centos7 7.2.1511 a61afb9845eb 5 seconds ago 195MB<br>centos7.2 1511 0a2bad7da9b5 12 months ago 195MB</p><p><strong>删除容器</strong></p><p>root@aaa:~# docker ps 
-a<br>CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES<br>fc7b209f4881 centos:7.2.1511 “/bin/bash” About a minute ago Exited (0) 5 seconds ago heuristic_goldberg<br>b2699dcd1471 centos7.2:1511 “/bin/bash” About an hour ago Up About an hour volume<br>root@aaa:~# docker rm fc7b<br>fc7b<br>root@aaa:~# docker ps -a<br>CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES<br>b2699dcd1471 centos7.2:1511 “/bin/bash” About an hour ago Up About an hour volume</p><p><strong>删除镜像</strong></p><p>root@aaa:~# docker images<br>REPOSITORY TAG IMAGE ID CREATED SIZE<br>registry.cn-hangzhou.aliyuncs.com/momo/centos 7.2.1511 4cbf48630b46 7 weeks ago 195MB<br>centos 7.2.1511 4cbf48630b46 7 weeks ago 195MB</p><p>root@aaa:~# docker rmi registry.cn-hangzhou.aliyuncs.com/momo/centos:7.2.1511<br>Untagged: registry.cn-hangzhou.aliyuncs.com/momo/centos:7.2.1511</p><p><strong>保存本地和载入镜像</strong></p><p>root@aaa:~# docker images<br>REPOSITORY TAG IMAGE ID CREATED SIZE<br>centos7.2 1511 0a2bad7da9b5 12 months ago 195MB</p><p>root@aaa:~# docker save -o centos7.2.1511.tar centos7.2:1511<br>root@aaa:~# ls<br> centos7.2.1511.tar<br>root@aaa:~# </p><p><strong>测试载入</strong></p><p>root@aaa:~# docker rmi centos7.2:1511<br>Untagged: centos7.2:1511</p><p>root@aaa:~# docker images<br>REPOSITORY TAG IMAGE ID CREATED SIZE</p><p>root@aaa:~# docker load -i centos7.2.1511.tar<br>Loaded image: centos7.2:1511<br>root@aaa:~# docker images<br>REPOSITORY TAG IMAGE ID CREATED SIZE<br>centos7.2 1511 0a2bad7da9b5 12 months ago 195MB</p><p><strong>数据卷</strong></p><p>挂载主机目录作为数据卷</p><p>root@aaa:~# mkdir /mnt/ceshi</p><p>root@aaa:~# docker run -it –name ceshi -v /mnt/ceshi:/opt/ceshi centos7.2:1511</p><p>–name 指定容器名字 /mnt/ceshi:/opt/ceshi 前面为宿主机目录,后面为容器内挂载点</p><p>[root@2d018b52fd75 /]# df -Th<br>Filesystem Type Size Used Avail Use% Mounted on<br>none aufs 886G 3.6G 837G 1% /<br>tmpfs tmpfs 64M 0 64M 0% /dev<br>tmpfs tmpfs 16G 0 16G 0% /sys/fs/cgroup<br>/dev/mapper/aaa–vg-root ext4 886G 3.6G 837G 1% /opt/ceshi<br>shm tmpfs 64M 0 64M 0% /dev/shm<br>tmpfs tmpfs 16G 0 16G 0% /proc/scsi<br>tmpfs tmpfs 16G 0 16G 0% /sys/firmware<br>[root@2d018b52fd75 /]# echo 123 > /opt/ceshi/aaa<br>[root@2d018b52fd75 /]# exit<br>exit<br>root@aaa:~# cat /mnt/ceshi/aaa<br>123</p><p><strong>数据卷容器</strong></p><p>创建数据卷容器</p><p>root@aaa:~# docker run -it -v /dbdata –name dbdata centos7.2:1511<br>[root@feabca5f7c07 /]# exit</p><p>创建新的容器挂载容器卷(这里的容器卷不一定非要run运行状态,这里为exit状态)</p><p>root@aaa:~# docker run -it –volumes-from dbdata –name db1 centos7.2:1511<br>[root@505e202a7617 /]# df -Th<br>Filesystem Type Size Used Avail Use% Mounted on<br>none aufs 886G 3.6G 837G 1% /<br>tmpfs tmpfs 64M 0 64M 0% /dev<br>tmpfs tmpfs 16G 0 16G 0% /sys/fs/cgroup<br>/dev/mapper/aaa–vg-root ext4 886G 3.6G 837G 1% /dbdata<br>shm tmpfs 64M 0 64M 0% /dev/shm<br>tmpfs tmpfs 16G 0 16G 0% /proc/scsi<br>tmpfs tmpfs 16G 0 16G 0% /sys/firmware<br>[root@505e202a7617 /]# echo db1 > /dbdata/aaa<br>[root@505e202a7617 /]# exit<br>exit<br>root@aaa:~# docker start feabca5f7c07<br>feabca5f7c07<br>root@aaa:~# docker attach feabca5f7c07<br>[root@feabca5f7c07 /]# cat /dbdata/aaa<br>db1<br>[root@feabca5f7c07 /]# exit</p><p><strong>–link使用</strong></p><p>docker的link机制可以通过一个name来和另一个容器通信,link机制方便了容器去发现其它的容器并且可以安全的传递一些连接信息给其它的容器 .正常情况下,容器之间是不会互通的。加上–link之后,容器之间便能互相通信。–link机制即将被淘汰,有新的网络处理机制替换。</p><p>docker run -it –name haproxy –link app1:app1 –link app2:app2 -p 6301:6301 -v ~/project/haproxy:/tmp haproxy /bin/bash </p><p>如上命令创建容器后,通过–link指定容器 然后进入到容器中查看hosts文件就可以看到ip 主机名 容器名之间的映射记录。</p><p>root@b5ce94fc1c1f:/usr/local/sbin# 
cat /etc/hosts<br>127.0.0.1 localhost<br>::1 localhost ip6-localhost ip6-loopback<br>fe00::0 ip6-localnet<br>ff00::0 ip6-mcastprefix<br>ff02::1 ip6-allnodes<br>ff02::2 ip6-allrouters<br>172.17.0.5 app1 44f4eb740fe6<br>172.17.0.6 app2 3eb6cd729eff<br>172.17.0.7 b5ce94fc1c1f</p><p><strong>端口指定</strong></p><p>通过主机某个端口与容器内指定端口互通后,外界可以通过主机端口访问容器内应用。</p><p>-P 主机端口随机指定</p><p>pull测试镜像</p><p>root@aaa:~# docker pull training/webapp</p><p>root@aaa:~# docker run -d -P training/webapp python app.py<br>8ac3787f2ea9bb42b9561d71124deb34de2f727058d718ba4851b9dd555f7c65<br>root@aaa:~# docker ps -a<br>CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES<br>8ac3787f2ea9 training/webapp “python app.py” 5 seconds ago Up 4 seconds 0.0.0.0:32770->5000/tcp confident_ptolemy</p><p>可以看到主机端口为32770 对应容器端口5000</p><p>通过主机访问</p><p>root@aaa:~# curl 127.0.0.1:32770<br>Hello world!root</p><p> -p 指定主机端口</p><p>root@aaa:~# docker run -d -p 5555:5000 training/webapp python app.py<br>c8c795a5c7137d64652ba4c7765c28c89b8c3f18ca333eac32b60e5fd0d23547<br>root@aaa:~# curl 127.0.0.1:5555<br>Hello world!</p>]]></content>
<summary type="html">
<p>Ubuntu14.04安装docker 摘自aliyun官网<br><a href="https://yq.aliyun.com/articles/110806?spm=5176.8351553.0.0.691e1991DpXzU5" target="_blank" rel="noopener">https://yq.aliyun.com/articles/110806?spm=5176.8351553.0.0.691e1991DpXzU5</a></p>
</summary>
<category term="docker" scheme="http://idcat.cn/tags/docker/"/>
</entry>
<entry>
<title>petasan部署以及简单使用</title>
<link href="http://idcat.cn/petasan%E9%83%A8%E7%BD%B2%E4%BB%A5%E5%8F%8A%E7%AE%80%E5%8D%95%E4%BD%BF%E7%94%A8.html"/>
<id>http://idcat.cn/petasan部署以及简单使用.html</id>
<published>2018-10-25T10:53:12.000Z</published>
<updated>2018-12-03T12:56:18.188Z</updated>
<content type="html"><![CDATA[<p>无意间发现有个开源的san产品(petasan),底层基于ceph做的,当前最新版本2.1.0,底层ceph已经支持到12.2.7了。支持VMware vshere,hyper_v等虚拟化以及database。其san访问控制是基于LIO做的,上层提供完整的wen界面进行ceph集群监控等支持手动配置crush等,支持在线配置iscsi lun等。提供主机节点的各个硬件的监控以及ceph集群状态监控,邮件告警,多路iscsi访问。</p><p>详情自己查看官网介绍<a href="http://www.petasan.org/" target="_blank" rel="noopener">http://www.petasan.org/</a></p><p>自己在虚拟机下做了下简单部署测试,挺简单的,下面记录下配置部署过程。</p><a id="more"></a><h3 id="环境介绍"><a href="#环境介绍" class="headerlink" title="环境介绍"></a><strong>环境介绍</strong></h3><p><img src="file:///C:\Users\zhang\AppData\Local\Temp\ksohtml\wps90E9.tmp.jpg" alt="img"> </p><p>3台虚拟机,共2块网卡,分别为eth0 eth1。其中managerment iscsi1 backend1共用eth0网卡,iscsi2,backend2共用网卡eth1.</p><h3 id="安装系统"><a href="#安装系统" class="headerlink" title="安装系统"></a><strong>安装系统</strong></h3><p><img src="file:///C:\Users\zhang\AppData\Local\Temp\ksohtml\wps90F9.tmp.jpg" alt="img"> </p><p><img src="file:///C:\Users\zhang\AppData\Local\Temp\ksohtml\wps910A.tmp.jpg" alt="img"> </p><p><img src="file:///C:\Users\zhang\AppData\Local\Temp\ksohtml\wps910B.tmp.jpg" alt="img"> </p><p>最终完成系统安装后,点击reboot。</p><h3 id="配置petasan"><a href="#配置petasan" class="headerlink" title="配置petasan"></a><strong>配置petasan</strong></h3><p>登入第一台虚拟机ip:5001,这里为192.168.0.181:5001</p><p>创建集群</p><p><img src="file:///C:\Users\zhang\AppData\Local\Temp\ksohtml\wps911C.tmp.jpg" alt="img"> </p><p><img src="file:///C:\Users\zhang\AppData\Local\Temp\ksohtml\wps912C.tmp.jpg" alt="img"> </p><p><img src="file:///C:\Users\zhang\AppData\Local\Temp\ksohtml\wps913D.tmp.jpg" alt="img"> </p><p>按照设定完成第一台部署。</p><p>然后剩余2台主机,同样登入ip:5001 选择加入集群 按照提示一步步部署即可,部署成功后,主机显示如下</p><p><img src="file:///C:\Users\zhang\AppData\Local\Temp\ksohtml\wps913E.tmp.jpg" alt="img"> </p><p>最后一台主机部署的时候,需要一段时间,因为后台在部署ceph。最终成功后,可以查看集群状态。</p><p><img src="file:///C:\Users\zhang\AppData\Local\Temp\ksohtml\wps914E.tmp.jpg" alt="img"> </p><h3 id="使用petasan"><a href="#使用petasan" class="headerlink" title="使用petasan"></a><strong>使用petasan</strong></h3><p>通过ip:5000 登入,初始用户名密码为admin/password 可自行修改密码。</p><p><img src="file:///C:\Users\zhang\AppData\Local\Temp\ksohtml\wps915F.tmp.jpg" alt="img"> </p><p>下图显示详细的各个功能,包括集群监控等等,创建iscsi 等。</p><p><img src="file:///C:\Users\zhang\AppData\Local\Temp\ksohtml\wps9160.tmp.jpg" alt="img"> </p><p>在配置界面设置iscsi详细信息,包括虚拟ip段,根据之前规划的进行相关填写即可。</p><p><img src="file:///C:\Users\zhang\AppData\Local\Temp\ksohtml\wps9171.tmp.jpg" alt="img"> </p><p>创建共享使用的disk,选择多个活动的path,防止单点故障。</p><p><img src="file:///C:\Users\zhang\AppData\Local\Temp\ksohtml\wps9181.tmp.jpg" alt="img"> </p><p>查看active path为6的信息,该disk共享提供6个path分别2个分布在不同节点的不同网卡上。</p><p><img src="file:///C:\Users\zhang\AppData\Local\Temp\ksohtml\wps9182.tmp.jpg" alt="img"> </p><p>其他就不做介绍了,web端提供的很详细的step介绍。</p><p>详情查看官网资料。</p>]]></content>
<summary type="html">
<p>无意间发现有个开源的san产品(petasan),底层基于ceph做的,当前最新版本2.1.0,底层ceph已经支持到12.2.7了。支持VMware vSphere、Hyper-V等虚拟化以及database。其san访问控制是基于LIO做的,上层提供完整的web界面进行ceph集群监控,支持手动配置crush,支持在线配置iscsi lun等。提供主机节点的各个硬件的监控以及ceph集群状态监控,邮件告警,多路iscsi访问。</p>
<p>详情自己查看官网介绍<a href="http://www.petasan.org/" target="_blank" rel="noopener">http://www.petasan.org/</a></p>
<p>自己在虚拟机下做了下简单部署测试,挺简单的,下面记录下配置部署过程。</p>
</summary>
<category term="petasan targetcli ceph" scheme="http://idcat.cn/tags/petasan-targetcli-ceph/"/>
</entry>
<entry>
<title>centos7系统lio简单使用</title>
<link href="http://idcat.cn/centos7%E7%B3%BB%E7%BB%9Flio%E7%AE%80%E5%8D%95%E4%BD%BF%E7%94%A8.html"/>
<id>http://idcat.cn/centos7系统lio简单使用.html</id>
<published>2018-10-24T12:06:16.000Z</published>
<updated>2018-10-24T12:16:18.650Z</updated>
<content type="html"><![CDATA[<h1 id="centos7安装target"><a href="#centos7安装target" class="headerlink" title="centos7安装target"></a>centos7安装target</h1><p>targetcli是一个iSCSI配置管理工具,该工具简单易用,可以直接替换scsi-target-utils,Linux-IO target 在linux内核2.6.38以后,用软件实现的scsi target,支持的SAN技术中所有流行的存储协议包括Fibre Channel、FCoe\iSCSI、iser等,同时还能为本机生成模拟的scsi设备,以及为虚拟机提供基于virtio的scsi设备</p><a id="more"></a><p>[root@node1 ~]# yum install target</p><p>启动服务</p><p>[root@node1 ~]# systemctl status target<br>● target.service - Restore LIO kernel target configuration<br> Loaded: loaded (/usr/lib/systemd/system/target.service; disabled; vendor preset: disabled)<br> Active: inactive (dead)<br>[root@node1 ~]# systemctl start target<br>[root@node1 ~]# systemctl status target<br>● target.service - Restore LIO kernel target configuration<br> Loaded: loaded (/usr/lib/systemd/system/target.service; disabled; vendor preset: disabled)<br> Active: active (exited) since Wed 2018-10-24 10:55:09 CST; 1s ago<br> Process: 4752 ExecStart=/usr/bin/targetctl restore (code=exited, status=0/SUCCESS)<br> Main PID: 4752 (code=exited, status=0/SUCCESS)</p><p>Oct 24 10:55:09 node1 systemd[1]: Starting Restore LIO kernel target configuration…<br>Oct 24 10:55:09 node1 target[4752]: No saved config file at /etc/target/saveconfig.json, ok, exiting<br>Oct 24 10:55:09 node1 systemd[1]: Started Restore LIO kernel target configuration.</p><h1 id="简单挂载测试"><a href="#简单挂载测试" class="headerlink" title="简单挂载测试"></a>简单挂载测试</h1><p>下面进行最简单的rbd块设备挂载测试。事先准备了rbd块设备挂载到/dev/rbd0</p><p>[root@node1 ~]# targetcli<br>targetcli shell version 2.1.fb46<br>Copyright 2011-2013 by Datera, Inc and others.<br>For help on commands, type ‘help’.</p><p>/> ls<br>o- / …………………………………………………………………………………… […]<br> o- backstores …………………………………………………………………………. […]<br> | o- block ………………………………………………………………. [Storage Objects: 0]<br> | o- fileio ……………………………………………………………… [Storage Objects: 0]<br> | o- pscsi ………………………………………………………………. [Storage Objects: 0]<br> | o- ramdisk …………………………………………………………….. [Storage Objects: 0]<br> o- iscsi ……………………………………………………………………….. [Targets: 0]<br> o- loopback …………………………………………………………………….. [Targets: 0]</p><h2 id="创建iscsi块设备"><a href="#创建iscsi块设备" class="headerlink" title="创建iscsi块设备"></a>创建iscsi块设备</h2><p>/> /backstores/block create rbd-lun /dev/rbd0<br>Created block storage object rbd-lun using /dev/rbd0.<br>/> ls<br>o- / …………………………………………………………………………………… […]<br> o- backstores …………………………………………………………………………. […]<br> | o- block ………………………………………………………………. [Storage Objects: 1]<br> | | o- rbd-lun ………………………………………. [/dev/rbd0 (1.0GiB) write-thru deactivated]<br> | | o- alua ……………………………………………………………….. [ALUA Groups: 1]<br> | | o- default_tg_pt_gp ………………………………………. [ALUA state: Active/optimized]<br> | o- fileio ……………………………………………………………… [Storage Objects: 0]<br> | o- pscsi ………………………………………………………………. [Storage Objects: 0]<br> | o- ramdisk …………………………………………………………….. [Storage Objects: 0]<br> o- iscsi ……………………………………………………………………….. [Targets: 0]<br> o- loopback …………………………………………………………………….. [Targets: 0]</p><h2 id="创建LIO-iscsi目标"><a href="#创建LIO-iscsi目标" class="headerlink" title="创建LIO iscsi目标"></a>创建LIO iscsi目标</h2><p>/> /iscsi create<br>Created target iqn.2003-01.org.linux-iscsi.node1.x8664:sn.9c6086f5562d.<br>Created TPG 1.<br>Global pref auto_add_default_portal=true<br>Created default portal listening on all IPs (0.0.0.0), port 3260.<br>/> ls<br>o- / …………………………………………………………………………………… […]<br> o- backstores …………………………………………………………………………. […]<br> | o- block ………………………………………………………………. 
[Storage Objects: 1]<br> | | o- rbd-lun ………………………………………. [/dev/rbd0 (1.0GiB) write-thru deactivated]<br> | | o- alua ……………………………………………………………….. [ALUA Groups: 1]<br> | | o- default_tg_pt_gp ………………………………………. [ALUA state: Active/optimized]<br> | o- fileio ……………………………………………………………… [Storage Objects: 0]<br> | o- pscsi ………………………………………………………………. [Storage Objects: 0]<br> | o- ramdisk …………………………………………………………….. [Storage Objects: 0]<br> o- iscsi ……………………………………………………………………….. [Targets: 1]<br> | o- iqn.2003-01.org.linux-iscsi.node1.x8664:sn.9c6086f5562d ……………………………. [TPGs: 1]<br> | o- tpg1 ……………………………………………………………. [no-gen-acls, no-auth]<br> | o- acls ……………………………………………………………………… [ACLs: 0]<br> | o- luns ……………………………………………………………………… [LUNs: 0]<br> | o- portals ………………………………………………………………… [Portals: 1]<br> | o- 0.0.0.0:3260 …………………………………………………………………. [OK]<br> o- loopback …………………………………………………………………….. [Targets: 0]</p><h2 id="使用默认配置创建portal"><a href="#使用默认配置创建portal" class="headerlink" title="使用默认配置创建portal"></a>使用默认配置创建portal</h2><p>/> cd /iscsi/iqn.2003-01.org.linux-iscsi.node1.x8664:sn.9c6086f5562d/tpg1/<br>/iscsi/iqn.2003-01.org.linux-iscsi.node1.x8664:sn.9c6086f5562d/tpg1/acls/<br>/iscsi/iqn.2003-01.org.linux-iscsi.node1.x8664:sn.9c6086f5562d/tpg1/luns/<br>/iscsi/iqn.2003-01.org.linux-iscsi.node1.x8664:sn.9c6086f5562d/tpg1/portals/ /iscsi/iqn.20…86f5562d/tpg1> portals/ create</p><p>Using default IP port 3260<br>Binding to INADDR_ANY (0.0.0.0)<br>This NetworkPortal already exists in configFS</p><h2 id="创建逻辑单元-LUN"><a href="#创建逻辑单元-LUN" class="headerlink" title="创建逻辑单元(LUN)"></a>创建逻辑单元(LUN)</h2><p>/> cd /iscsi/iqn.2003-01.org.linux-iscsi.node1.x8664:sn.9c6086f5562d/tpg1/<br>/iscsi/iqn.20…86f5562d/tpg1> luns/ create /backstores/block/rbd-lun<br>Created LUN 0.</p><h2 id="定义客户端访问控制权限(无限制)"><a href="#定义客户端访问控制权限(无限制)" class="headerlink" title="定义客户端访问控制权限(无限制)"></a>定义客户端访问控制权限(无限制)</h2><p>/> cd /iscsi/iqn.2003-01.org.linux-iscsi.node1.x8664:sn.9c6086f5562d/tpg1/</p><p>/iscsi/iqn.20…86f5562d/tpg1> ls<br>o- tpg1 …………………………………………………………………. [no-gen-acls, no-auth]<br> o- acls …………………………………………………………………………… [ACLs: 0]<br> o- luns …………………………………………………………………………… [LUNs: 1]<br> | o- lun0 ………………………………………… [block/rbd-lun (/dev/rbd0) (default_tg_pt_gp)]<br> o- portals ……………………………………………………………………… [Portals: 1]<br>o- 0.0.0.0:3260 ………………………………………………………………………. [OK]<br>/iscsi/iqn.20…86f5562d/tpg1> set attribute authentication=0 demo_mode_write_protect=0 generate_node_acls=1 cache_dynamic_acls=1<br>Parameter demo_mode_write_protect is now ‘0’.<br>Parameter authentication is now ‘0’.<br>Parameter generate_node_acls is now ‘1’.<br>Parameter cache_dynamic_acls is now ‘1’.<br>/iscsi/iqn.20…86f5562d/tpg1> ls<br>o- tpg1 ……………………………………………………………………. [gen-acls, no-auth]<br> o- acls …………………………………………………………………………… [ACLs: 0]<br> o- luns …………………………………………………………………………… [LUNs: 1]<br> | o- lun0 ………………………………………… [block/rbd-lun (/dev/rbd0) (default_tg_pt_gp)]<br> o- portals ……………………………………………………………………… [Portals: 1]<br>o- 0.0.0.0:3260 ………………………………………………………………………. 
[OK]</p><h2 id="保存配置"><a href="#保存配置" class="headerlink" title="保存配置"></a>保存配置</h2><p>/> saveconfig<br>Configuration saved to /etc/target/saveconfig.json<br>/> exit<br>Global pref auto_save_on_exit=true<br>Last 10 configs saved in /etc/target/backup/.<br>Configuration saved to /etc/target/saveconfig.json</p><h1 id="客户端访问"><a href="#客户端访问" class="headerlink" title="客户端访问"></a>客户端访问</h1><p>[root@node2 ~]# iscsiadm -m discovery -t st -p 192.168.3.250<br>192.168.3.250:3260,1 iqn.2003-01.org.linux-iscsi.node1.x8664:sn.9c6086f5562d<br>[root@node2 ~]# iscsiadm -m node -T iqn.2003-01.org.linux-iscsi.node1.x8664:sn.9c6086f5562d -p 192.168.3.250 -l<br>Logging in to [iface: default, target: iqn.2003-01.org.linux-iscsi.node1.x8664:sn.9c6086f5562d, portal: 192.168.3.250,3260] (multiple)<br>Login to [iface: default, target: iqn.2003-01.org.linux-iscsi.node1.x8664:sn.9c6086f5562d, portal: 192.168.3.250,3260] successful.</p><p>[root@node2 ~]# lsblk<br>NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT<br>sda 8:0 0 30G 0 disk<br>├─sda1 8:1 0 1G 0 part /boot<br>└─sda2 8:2 0 29G 0 part<br> ├─centos-root 253:0 0 27G 0 lvm /<br> └─centos-swap 253:1 0 2G 0 lvm [SWAP]<br>sdd 8:48 0 1G 0 disk<br>sr0 11:0 1 4.2G 0 rom </p><p>sdd磁盘即是挂载的磁盘。</p>]]></content>
<summary type="html">
<h1 id="centos7安装target"><a href="#centos7安装target" class="headerlink" title="centos7安装target"></a>centos7安装target</h1><p>targetcli是一个iSCSI配置管理工具,该工具简单易用,可以直接替换scsi-target-utils,Linux-IO target 在linux内核2.6.38以后,用软件实现的scsi target,支持的SAN技术中所有流行的存储协议包括Fibre Channel、FCoe\iSCSI、iser等,同时还能为本机生成模拟的scsi设备,以及为虚拟机提供基于virtio的scsi设备</p>
</summary>
<category term="lio target" scheme="http://idcat.cn/tags/lio-target/"/>
</entry>
<entry>
<title>ceph-mon基础运维</title>
<link href="http://idcat.cn/ceph-mon%E5%9F%BA%E7%A1%80%E8%BF%90%E7%BB%B4.html"/>
<id>http://idcat.cn/ceph-mon基础运维.html</id>
<published>2018-06-27T12:07:00.000Z</published>
<updated>2019-11-21T01:12:23.794Z</updated>
<content type="html"><![CDATA[<p>全文参考<a href="https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/troubleshooting_guide/troubleshooting-monitors" target="_blank" rel="noopener">https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/troubleshooting_guide/troubleshooting-monitors</a></p><p>进行翻译,可以直接查看原文。<br><a id="more"></a></p><h1 id="一、The-ceph-mon-Daemon-Cannot-Start"><a href="#一、The-ceph-mon-Daemon-Cannot-Start" class="headerlink" title="一、The ceph-mon Daemon Cannot Start"></a>一、The ceph-mon Daemon Cannot Start</h1><pre><code>systemctl status ceph-mon@<host-name>systemctl start ceph-mon@<host-name></code></pre><p>执行star启动命令后,mon daemon 守护进程就是无法启动。检查相应主机mon的日志,默认在路劲/var/log/ceph/ceph-mon.<host-name>.log</host-name></p><p>一、 如果日志包含以下类似信息,说明mon store有数据损坏</p><p>Corruption: error in middle of record</p><p>Corruption: 1 missing files; e.g.: /var/lib/ceph/mon/mon.0/store.db/1234567.ldb</p><p><strong>修复monmap</strong></p><p>如果一个 monitor拥有过时的数据或者错误的monmap表,这样它就不会加入到选举中去。因为他可能使用错误的IP地址去连接其他正常的mon ip。最安全的修复此问题的方法便是从其他正常运行mon的节点上获取最新monmap表,然后注入到问题mon节点。</p><p>此方法最基础的条件便是剩余的集群拥有正常的quorum状态,或者说至少有一台主机拥有正确且最新的monmap表。<br>a、剩下的主机拥有正常的quorum状态,使用ceph mon getmap获取monmap表</p><pre><code>ceph mon getmap -o /tmp/monmap</code></pre><p>b、如果剩余的主机没有产生quorum状态,且至少有一台主机节点拥有正确的monmap表,则通过如下方式获取monmap表</p><p>停止mon服务: systemctl stop ceph-mon@host1</p><p>复制monmap: ceph-mon -i mon.host1 –extract-monmap /tmp/monmap</p><p>c、注入新的monmap到损坏的节点上(mon 进程必须停止)</p><pre><code>ceph-mon -i mon.hosta --inject-monmap /tmp/monmap</code></pre><p>然后启动所有mon进程</p><p>然后,在某些情况下,所有的monitor节点可能同时出现故障,例如所有的节点配置了错误的磁盘或文件系统设置,异常的断电等等都会导致monitor节点故障。如果所有mon节点上的store均故障,我们可以从其他osd节点上恢复它。使用工具ceph-monstore-tool 和ceph-objectstore-tool</p><p>具体步骤如下:首先确保安装rsync 和ceph-test包</p><p>运行如下命令,在所有mon数据故障节点上</p><p>1、从所有osd节点上搜集集群map</p><pre><code>$ ms=/tmp/monstore/$ mkdir $ms$ for host in $host_list; dorsync -avz "$ms" root@$host:"$ms"; rm -rf "$ms"ssh root@$host <<EOFfor osd in /var/lib/ceph/osd/ceph-*; do ceph-objectstore-tool --data-path \$osd --op update-mon-db --mon-store-path $msdoneEOFrsync -avz root@$host:$ms $ms; done</code></pre><p>2、设置合适的权限</p><pre><code>$ ceph-authtool /etc/ceph/ceph.client.admin.keyring -n mon. 
--cap mon 'allow *'$ ceph-authtool /etc/ceph/ceph.client.admin.keyring -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *'</code></pre><p>3、从搜集到的map信息中重建monitor store</p><pre><code>$ ceph-monstore-tool /tmp/mon-store rebuild -- --keyring /etc/ceph/ceph.client.admin.keyring</code></pre><p>如果cephx处于none状态,则去掉后面的–keyring</p><pre><code>$ ceph-monstore-tool /tmp/mon-store rebuild</code></pre><p>4、备份坏了的store</p><pre><code>mv /var/lib/ceph/mon/mon.0/store.db /var/lib/ceph/mon/mon.0/store.db.corrupted</code></pre><p>5、替换正确的store</p><pre><code>mv /tmp/mon-store/store.db /var/lib/ceph/mon/mon.0/store.db</code></pre><p>6、修改owner</p><pre><code>chown -R ceph:ceph /var/lib/ceph/mon/mon.0/store.db</code></pre><h1 id="二、The-ceph-mon-Daemon-Is-Running-but-Still-Marked-as-down"><a href="#二、The-ceph-mon-Daemon-Is-Running-but-Still-Marked-as-down" class="headerlink" title="二、The ceph-mon Daemon Is Running, but Still Marked as down"></a>二、The ceph-mon Daemon Is Running, but Still Marked as down</h1><p>从那台标记自己down的mon主机上执行查看状态命令 (mon.a mon.主机名)</p><pre><code>ceph daemon mon.a mon_status </code></pre><p>查看返回结果:</p><p>1、 如果结果显示为probing ,确认本地输出结果中其他monitor节点信息。如果主机IP地址显示不正确,则该机monmap表有点问题。如果主机IP地址显示正确,确认下mon节点之间monitor clock是否同步。或者是其他网络问题,请自行排查。</p><p>2、 如果结果显示为electing,请确认下mon节点之间mon clock是否同步。</p><p>时钟不同步 执行ceph health detail 显示如下</p><pre><code>mon.a (rank 0) addr 127.0.0.1:6789/0 is down (out of quorum)mon.a addr 127.0.0.1:6789/0 clock skew 0.08235s > max 0.05s (latency 0.0045s)</code></pre><p>同时,日志信息中也包含如下报错信息</p><pre><code>2015-06-04 07:28:32.035795 7f806062e700 0 log [WRN] : mon.a 127.0.0.1:6789/0 clock skew 0.14s > max 0.05s2015-06-04 04:31:25.773235 7f4997663700 0 log [WRN] : message from mon.1 was stamped 0.186257s in the future, clocks not synchronized</code></pre><p>上面显示时钟偏移了0.14s大于默认接受偏移值0.05,因此报此错误。参数mon_clock_drift_allowed决定了集群可容忍的时钟偏移量。请不要随意修改该值,这样可能会导致集群不稳定。时钟同步对于ceph来说是非常重要的,否则会带来很多意想不到的问题。其他网络问题以及NTP等时钟同步问题这里不做介绍。</p><h1 id="三、The-Monitor-Store-is-Getting-Too-Big"><a href="#三、The-Monitor-Store-is-Getting-Too-Big" class="headerlink" title="三、The Monitor Store is Getting Too Big"></a>三、The Monitor Store is Getting Too Big</h1><p>ceph health 命令返回如下错误信息:</p><pre><code>mon.ceph1 store is getting too big! 48031 MB >= 15360 MB -- 62% avail</code></pre><p>ceph monitor store实际上是一个levelDB 数据库,里面存储了很多key-values 键值对条目。该数据库存放了整个集群的map表,默认存储地址在/var/lig/ceph/mon/<cluster-name>-hostname/store.db。查询大型的mon存储可能需要时间,因此,在相应客户端查询时,mon可能会被延迟。</cluster-name></p><p>修复此问题,检查数据库大小:</p><p>如 </p><pre><code>du -sch /var/lib/ceph/mon/ceph-host1/store.db47G /var/lib/ceph/mon/ceph-ceph1/store.db/47G total</code></pre><p><strong>释放部分空间</strong></p><p>方法如下</p><p>1、ceph-mon进程在运行中</p><pre><code>ceph tell mon.host1 compact</code></pre><p>2、在启动ceph-mon进程过程中,写入配置文件中</p><p>[mon]</p><pre><code>mon_compact_on_start = true</code></pre><p>然后重启mon进程systemctl restart ceph-mon@host1</p><p>确保mon处于正常的选举状态</p><pre><code>ceph mon stat</code></pre><p>3、使用ceph-monstore-tool工具(ceph-test安装包必须先安装)</p><p>首先确保ceph-daemon没有运行</p><pre><code>systemctl stop ceph-mon@host1ceph-monstore-tool /var/lib/ceph/mon/mon.node1 compactsystemctl start ceph-mon@host1</code></pre>]]></content>
<summary type="html">
<p>全文参考<a href="https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/troubleshooting_guide/troubleshooting-monitors" target="_blank" rel="noopener">https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/troubleshooting_guide/troubleshooting-monitors</a></p>
<p>进行翻译,可以直接查看原文。<br>
</summary>
<category term="ceph" scheme="http://idcat.cn/tags/ceph/"/>
</entry>
<entry>
<title>saltstack一键部署haproxy+keepalived+nginx负载均衡高可用环境</title>
<link href="http://idcat.cn/saltstack%E4%B8%80%E9%94%AE%E9%83%A8%E7%BD%B2haproxy-keepalived-nginx%E8%B4%9F%E8%BD%BD%E5%9D%87%E8%A1%A1%E9%AB%98%E5%8F%AF%E7%94%A8%E7%8E%AF%E5%A2%83.html"/>
<id>http://idcat.cn/saltstack一键部署haproxy-keepalived-nginx负载均衡高可用环境.html</id>
<published>2018-06-09T12:28:21.000Z</published>
<updated>2018-06-09T12:32:40.421Z</updated>
<content type="html"><![CDATA[<p>本文档仅作为自己学习记录使用。<br>部分内容参考<a href="https://www.kancloud.cn/louis1986/saltstack/521459" target="_blank" rel="noopener">https://www.kancloud.cn/louis1986/saltstack/521459</a></p><p>本例环境架构: centos7.2-1511 测试环境,均关闭防火墙以及selinux</p><table><thead><tr><th>主机名</th><th style="text-align:center">角色</th><th style="text-align:right">ip</th></tr></thead><tbody><tr><td>master</td><td style="text-align:center">master</td><td style="text-align:right">192.168.4.10</td></tr><tr><td>minion1</td><td style="text-align:center">haproxy keepalived</td><td style="text-align:right">192.168.4.11</td></tr><tr><td>minion2</td><td style="text-align:center">haproxy keepalived</td><td style="text-align:right">192.168.4.12</td></tr><tr><td>minion3</td><td style="text-align:center">nginx</td><td style="text-align:right">192.168.4.13</td></tr><tr><td>minion3</td><td style="text-align:center">nginx</td><td style="text-align:right">192.168.4.14</td></tr><tr><td></td><td style="text-align:center">VIP</td><td style="text-align:right">192.168.4.16</td></tr></tbody></table><p>该配置环境主要是配置haproxy + keepalived负载均衡的高可用,其中haproxy通过轮训的方式连接到后端实际的2台nginx服务器。<br><a id="more"></a><br><strong>salt安装</strong></p><p>master节点: yum install epel-release -y yum install salt-master</p><p>minion节点: yum install epel-release -y yum install salt-minion</p><p><strong>salt基础配置</strong></p><p>本例不作详细介绍。</p><p>master节点上 vi /etc/salt/master 修改如下:其他均默认。</p><pre><code>file_roots: base: - /srv/salt/base prod: - /srv/salt/prodinterface: 192.168.4.10</code></pre><p>minion节点上 vi /etc/salt/minion 修改如下: 其他均默认。</p><pre><code>master: master</code></pre><p>所有master minion节点启动服务后(systemctl start salt-minion/salt-master)<br>在master执行 salt-key -A 接受所有minion节点的key。相关情况不做详细介绍。</p><p>所有节点/etc/hosts 均一致</p><pre><code>[root@master ~]# cat /etc/hosts127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4::1 localhost localhost.localdomain localhost6 localhost6.localdomain6192.168.4.10 master192.168.4.11 minion1192.168.4.12 minion2192.168.4.13 minion3192.168.4.14 minion4</code></pre><p>下面配置state等不作详细介绍,直接复制粘贴。</p><p><strong>所有配置均在master上,首先查看tree目录</strong></p><pre><code>[root@master ~]# cd /srv/salt/[root@master salt]# tree.├── base│ ├── init│ │ ├── audit.sls│ │ ├── cron.sls│ │ ├── dns.sls│ │ ├── env_init.sls│ │ ├── epel.sls│ │ ├── files│ │ │ ├── resolv.conf│ │ │ └── sysctl.conf│ │ ├── history.sls│ │ ├── sysctl.sls│ │ └── yum.sls│ └── top.sls└── prod├── cluster│ ├── files│ │ ├── haproxy-outside.cfg│ │ └── haproxy-outside-keepalived.cfg│ ├── haproxy-outside-keepalived.sls│ └── haproxy-outside.sls├── haproxy│ ├── files│ │ └── haproxy-1.8.9.tar.gz│ └── install_haproxy.sls├── keepalived│ ├── files│ │ ├── keepalived│ │ ├── keepalived-1.4.2.tar.gz│ │ ├── keepalived.conf│ │ └── keepalived.sysconfig│ └── install_keepalived.sls├── nginx│ ├── files│ │ ├── nginx-1.12.2.tar.gz│ │ ├── nginx.conf│ │ ├── nginx.init│ │ ├── pcre-8.41.tar.gz│ │ └── zlib-1.2.11.tar.gz│ ├── nginx-install.sls│ ├── nginx-service.sls│ ├── nginx-user.sls│ ├── pcre-install.sls│ └── zlib-install.sls└── pkg └── pkg-init.sls13 directories, 33 files</code></pre><p><strong>首先介绍base/init目录下的文件</strong></p><p>[root@master init]# tree</p><pre><code>.├── audit.sls├── cron.sls├── dns.sls├── env_init.sls├── epel.sls├── files│ ├── resolv.conf│ └── sysctl.conf├── history.sls├── sysctl.sls└── yum.sls1 directory, 10 files</code></pre><p>该目录文件为所有节点配置初始化的一些配置,比方说统一dns,统一安装epel源 统一sysctl参数等等。其中env_init.sls 是统一调配入口,这样只需要运行env_init就可以自动运行其他所有配置文件。可自行增加编辑。</p><pre><code>[root@master 
init]# cat env_init.sls include: - init.audit - init.cron - init.dns - init.epel - init.history - init.sysctl - init.yum[root@master init]# cat audit.sls /etc/bashrc: file.append: - text: - export PROMPT_COMMAND='{ msg=$(history 1 | { read x y; echo $y; });logger "[euid=$(whoami)]":$(who am i):[`pwd']"$msg; }'[root@master init]# cat cron.sls ntpdate-install: pkg.installed: - name: ntpdateset-crontab: cron.present: - name: /usr/sbin/ntpdate time1.aliyun.com >> /dev/null 2>&1 - user: root - minute: "*2" - require: - pkg: ntpdate-install[root@master init]# cat dns.sls /etc/resolv.conf: file.managed: - source: salt://init/files/resolv.conf - user: root - group: root - mode: 644[root@master init]# cat epel.sls yum_epel: pkg.installed: - name: epel-release - unless: rpm -qa |grep epel-release[root@master init]# cat history.sls /etc/profile: file.append: - text: - export HISTTIMEFORMAT="%F %T `whoami`"[root@master init]# cat sysctl.sls /etc/sysctl.conf: file.managed: - source: salt://init/files/sysctl.conf - user: root - group: root - mode: 644[root@master init]# cat yum.sls yum_base: pkg.installed: - names: - gcc - gcc-c++ - make - autoconf - net-tools - lrzsz - sysstat - vim-enhanced - openssh-clients - lsof - tree - wget - cmake</code></pre><p>该目录下file目录</p><pre><code>[root@master init]# tree files/files/├── resolv.conf└── sysctl.conf0 directories, 2 files[root@master init]# cd files/[root@master files]# lltotal 8-rw-r--r-- 1 root root 53 Jun 6 11:37 resolv.conf-rw-r--r-- 1 root root 449 Jun 6 11:57 sysctl.conf[root@master files]# cat resolv.conf # Generated by NetworkManager #根据实际情况填写nameserver 192.168.0.1[root@master files]# cat sysctl.conf # sysctl settings are defined through files in# /usr/lib/sysctl.d/, /run/sysctl.d/, and /etc/sysctl.d/.## Vendors settings live in /usr/lib/sysctl.d/.# To override a whole file, create a new file with the same in# /etc/sysctl.d/ and put new settings there. 
To override# only specific settings, add a file with a lexically later# name in /etc/sysctl.d/ and put new settings there.## For more information, see sysctl.conf(5) and sysctl.d(5).#本例为空,测试环境不想调试内核参数,若实际应用中,请自行输入需要调整的内核参数</code></pre><p><strong>介绍prod目录</strong></p><p>该目录为实际的安装包以及配置等目录。首先查看tree</p><p>每个目录均为一个需要安装的软件包以及其配置文件。cluster目录是后期在生成环境下结合不同环境配置haproxy和keepalived的配置文档,最后介绍。其他的目录比如nginx haproxy等都是安装配置。</p><pre><code>[root@master prod]# tree.├── cluster│ ├── files│ │ ├── haproxy-outside.cfg│ │ └── haproxy-outside-keepalived.cfg│ ├── haproxy-outside-keepalived.sls│ └── haproxy-outside.sls├── haproxy│ ├── files│ │ └── haproxy-1.8.9.tar.gz│ └── install_haproxy.sls├── keepalived│ ├── files│ │ ├── keepalived│ │ ├── keepalived-1.4.2.tar.gz│ │ ├── keepalived.conf│ │ └── keepalived.sysconfig│ └── install_keepalived.sls├── nginx│ ├── files│ │ ├── nginx-1.12.2.tar.gz│ │ ├── nginx.conf│ │ ├── nginx.init│ │ ├── pcre-8.41.tar.gz│ │ └── zlib-1.2.11.tar.gz│ ├── nginx-install.sls│ ├── nginx-service.sls│ ├── nginx-user.sls│ ├── pcre-install.sls│ └── zlib-install.sls└── pkg └── pkg-init.sls9 directories, 22 files</code></pre><p><strong>首先看pkg目录</strong></p><p>这个目录是所有节点部署nginx haproxy keepalived等软件需要的依赖包</p><pre><code>[root@master prod]# cd pkg/[root@master pkg]# lltotal 4-rw-r--r-- 1 root root 167 Jun 6 14:07 pkg-init.sls[root@master pkg]# cat pkg-init.sls pkg-init: pkg.installed: - names: - gcc - gcc-c++ - glibc - make - autoconf - openssl - openssl-devel - automake</code></pre><p><strong>其次haproxy目录</strong></p><pre><code>[root@master haproxy]# pwd/srv/salt/prod/haproxy[root@master haproxy]# tree.├── files│ └── haproxy-1.8.9.tar.gz└── install_haproxy.sls1 directory, 2 files</code></pre><p>file目录下为haproxy安装源码包</p><pre><code>[root@master haproxy]# cd files/[root@master files]# lltotal 2012-rw-r--r-- 1 root root 2057051 Jun 6 14:15 haproxy-1.8.9.tar.gz安装配置文件[root@master haproxy]# cat install_haproxy.sls include: - pkg.pkg-inithaproxy-install: file.managed: - name: /usr/local/src/haproxy-1.8.9.tar.gz - source: salt://haproxy/files/haproxy-1.8.9.tar.gz - user: root - group: root - mode: 755 cmd.run: - name: cd /usr/local/src && tar xf haproxy-1.8.9.tar.gz && cd haproxy-1.8.9 && make TARGET=linux2628 PREFIX=/usr/local/haproxy && make install PREFIX=/usr/local/haproxy && sed -i 's?BIN=/usr/sbin/$BASENAME?BIN=/usr/local/haproxy/sbin/$BASENAME?' 
/usr/local/src/haproxy-1.8.9/examples/haproxy.init && sed -i '/NETWORKING/c [[ $NETWORKING = "no" ]] && exit 0' /usr/local/src/haproxy-1.8.9/examples/haproxy.init && cp /usr/local/src/haproxy-1.8.9/examples/haproxy.init /etc/init.d/haproxy && chmod +x /etc/init.d/haproxy - unless: test -d /usr/local/haproxy - require: - pkg: pkg-init - file: haproxy-installhaproxy_chkconfig: cmd.run: - name: chkconfig --add haproxy && chkconfig --level 2345 haproxy on - unless: chkconfig --list |grep haproxy - require: - file: haproxy-installhaproxy-config-dir: file.directory: - name: /etc/haproxy - user: root - group: root - mode: 755net.ipv4.ip_nonlocal_bind: cmd.run: - name: echo "net.ipv4.ip_nonlocal_bind=1" >> /etc/sysctl.conf && sysctl -p - unless: cat /etc/sysctl.conf | grep net.ipv4.ip_nonlocal_bind - require: - file: haproxy-install</code></pre><p><strong>keepalived目录</strong></p><pre><code>[root@master prod]# cd keepalived/[root@master keepalived]# lltotal 4drwxr-xr-x 2 root root 102 Jun 7 11:03 files-rw-r--r-- 1 root root 1452 Jun 7 11:18 install_keepalived.sls</code></pre><p>首先查看files目录</p><pre><code>[root@master files]# lltotal 736-rwxr-xr-x 1 root root 1335 Jun 7 11:01 keepalived-rw-r--r-- 1 root root 738096 Feb 26 00:48 keepalived-1.4.2.tar.gz-rw-r--r-- 1 root root 3550 Jun 7 11:02 keepalived.conf-rw-r--r-- 1 root root 667 Jun 7 11:02 keepalived.sysconfig</code></pre><p>keepalived文件为keepalived的service启动服务文件,在/etc/init.d/目录下,keepalived.conf 为其基础配置文件,keepalived.sysconfig为启动文件需要的配置文件。</p><pre><code>[root@master files]# cat keepalived#!/bin/sh## Startup script for the Keepalived daemon## processname: keepalived# pidfile: /var/run/keepalived.pid# config: /etc/keepalived/keepalived.conf# chkconfig: - 21 79# description: Start and stop Keepalived# Source function library. /etc/rc.d/init.d/functions# Source configuration file (we set KEEPALIVED_OPTIONS there). /etc/sysconfig/keepalivedRETVAL=0prog="keepalived"start() { echo -n $"Starting $prog: " daemon /usr/local/keepalived/sbin/keepalived ${KEEPALIVED_OPTIONS}##上面参数是修改之后的,默认的为/sbin/keepalived ${KEEPALIVED_OPTIONS} RETVAL=$? echo [ $RETVAL -eq 0 ] && touch /var/lock/subsys/$prog}stop() { echo -n $"Stopping $prog: " killproc keepalived RETVAL=$? echo [ $RETVAL -eq 0 ] && rm -f /var/lock/subsys/$prog}reload() { echo -n $"Reloading $prog: " killproc keepalived -1 RETVAL=$? echo}# See how we were called.case "$1" in start) start ;; stop) stop ;; reload) reload ;; restart) stop start ;; condrestart) if [ -f /var/lock/subsys/$prog ]; then stop start fi ;; status) status keepalived RETVAL=$? ;; *) echo "Usage: $0 {start|stop|reload|restart|condrestart|status}" RETVAL=1esacexit $RETVAL[root@master files]# cat keepalived.conf ##该文件为默认文件,放这里是为了启动过程中有个初始默认文件,后期结合实际生产环境会被修改的,在cluster目录中介绍。! 
Configuration File for keepalivedglobal_defs { notification_email { [email protected] [email protected] [email protected] } notification_email_from [email protected] smtp_server 192.168.200.1 smtp_connect_timeout 30 router_id LVS_DEVEL vrrp_skip_check_adv_addr vrrp_strict vrrp_garp_interval 0 vrrp_gna_interval 0}vrrp_instance VI_1 { state MASTER interface eth0 virtual_router_id 51 priority 100 advert_int 1 authentication { auth_type PASS auth_pass 1111 } virtual_ipaddress { 192.168.200.16 192.168.200.17 192.168.200.18 }}virtual_server 192.168.200.100 443 { delay_loop 6 lb_algo rr lb_kind NAT persistence_timeout 50 protocol TCP real_server 192.168.201.100 443 { weight 1 SSL_GET { url { path / digest ff20ad2481f97b1754ef3e12ecd3a9cc } url { path /mrtg/ digest 9b3a0c85a887a256d6939da88aabd8cd } connect_timeout 3 retry 3 delay_before_retry 3 } }}virtual_server 10.10.10.2 1358 { delay_loop 6 lb_algo rr lb_kind NAT persistence_timeout 50 protocol TCP sorry_server 192.168.200.200 1358 real_server 192.168.200.2 1358 { weight 1 HTTP_GET { url { path /testurl/test.jsp digest 640205b7b0fc66c1ea91c463fac6334d } url { path /testurl2/test.jsp digest 640205b7b0fc66c1ea91c463fac6334d } url { path /testurl3/test.jsp digest 640205b7b0fc66c1ea91c463fac6334d } connect_timeout 3 retry 3 delay_before_retry 3 } } real_server 192.168.200.3 1358 { weight 1 HTTP_GET { url { path /testurl/test.jsp digest 640205b7b0fc66c1ea91c463fac6334c } url { path /testurl2/test.jsp digest 640205b7b0fc66c1ea91c463fac6334c } connect_timeout 3 retry 3 delay_before_retry 3 } }}virtual_server 10.10.10.3 1358 { delay_loop 3 lb_algo rr lb_kind NAT persistence_timeout 50 protocol TCP real_server 192.168.200.4 1358 { weight 1 HTTP_GET { url { path /testurl/test.jsp digest 640205b7b0fc66c1ea91c463fac6334d } url { path /testurl2/test.jsp digest 640205b7b0fc66c1ea91c463fac6334d } url { path /testurl3/test.jsp digest 640205b7b0fc66c1ea91c463fac6334d } connect_timeout 3 retry 3 delay_before_retry 3 } } real_server 192.168.200.5 1358 { weight 1 HTTP_GET { url { path /testurl/test.jsp digest 640205b7b0fc66c1ea91c463fac6334d } url { path /testurl2/test.jsp digest 640205b7b0fc66c1ea91c463fac6334d } url { path /testurl3/test.jsp digest 640205b7b0fc66c1ea91c463fac6334d } connect_timeout 3 retry 3 delay_before_retry 3 } }}[root@master files]# cat keepalived.sysconfig #默认文件,在解压之后的安装包里面# Options for keepalived. See `keepalived --help' output and keepalived(8) and# keepalived.conf(5) man pages for a list of all options. 
Here are the most# common ones :## --vrrp -P Only run with VRRP subsystem.# --check -C Only run with Health-checker subsystem.# --dont-release-vrrp -V Dont remove VRRP VIPs & VROUTEs on daemon stop.# --dont-release-ipvs -I Dont remove IPVS topology on daemon stop.# --dump-conf -d Dump the configuration data.# --log-detail -D Detailed log messages.# --log-facility -S 0-7 Set local syslog facility (default=LOG_DAEMON)#KEEPALIVED_OPTIONS="-D"</code></pre><p>查看keepalived的salt配置文档</p><pre><code>[root@master keepalived]# lltotal 4drwxr-xr-x 2 root root 102 Jun 7 11:03 files-rw-r--r-- 1 root root 1452 Jun 7 11:18 install_keepalived.sls[root@master keepalived]# cat install_keepalived.sls include: - pkg.pkg-initdependency_package_install: pkg.installed: - names: - libnl3-devel - libnfnetlink-develkeepalived-install: file.managed: - name: /usr/local/src/keepalived-1.4.2.tar.gz - source: salt://keepalived/files/keepalived-1.4.2.tar.gz - user: root - group: root - mode: 755 cmd.run: - name: cd /usr/local/src && tar -xf keepalived-1.4.2.tar.gz && cd keepalived-1.4.2 && ./configure --prefix=/usr/local/keepalived && make && make install - unless: test -d /usr/local/keepalived - require: - pkg: pkg-init - pkg: dependency_package_install - file: keepalived-installkeepalived-init: file.managed: - name: /etc/init.d/keepalived - source: salt://keepalived/files/keepalived - user: root - group: root - mode: 755 cmd.run: - name: chkconfig --add keepalived && chkconfig --level 2345 keepalived on - unless: chkconfig --list | grep keepalived - require: - file: keepalived-init/etc/sysconfig/keepalived: file.managed: - source: salt://keepalived/files/keepalived.sysconfig - user: root - group: root - mode: 644/etc/keepalived: file.directory: - user: root - group: root - mode: 755 /etc/keepalived/keepalived.conf: file.managed: - source: salt://keepalived/files/keepalived.conf - user: root - group: root - mode: 644 - require: - file: /etc/keepalived</code></pre><p><strong>nginx目录</strong></p><pre><code>[root@master nginx]# tree.├── files│ ├── nginx-1.12.2.tar.gz│ ├── nginx.conf│ ├── nginx.init│ ├── pcre-8.41.tar.gz│ └── zlib-1.2.11.tar.gz├── nginx-install.sls├── nginx-service.sls├── nginx-user.sls├── pcre-install.sls└── zlib-install.sls1 directory, 10 files</code></pre><p>file目录中为nginx的源码包以及需要的依赖包pcre和zlib的源码包。nginx.conf为nginx的配置文件,nginx.init为启动脚本 既/etc/init.d目录下的service控制服务脚本。</p><p> nginx-install.sls 为nginx的安装脚本 nginx-service.sls启动nginx服务脚本 nginx-user.sls 为创建nginx用户脚本 pcre-install.sls zlib-install.sls 分别为安装pcr和zlib的脚本。</p><p>files目录下:</p><pre><code>[root@master files]# cat nginx.conf user nginx;worker_processes auto;error_log logs/error.log error;worker_rlimit_nofile 30000;pid /var/run/nginx.pid;events { use epoll; worker_connections 65535;}http { include mime.types; default_type application/octet-stream; sendfile on; tcp_nopush on; underscores_in_headers on; keepalive_timeout 10; send_timeout 60; gzip on; include /usr/local/nginx/conf/vhost/*.conf; server { listen 80; root /usr/local/nginx/html; index index.html; server_name 127.0.0.1; location /nginx_status { stub_status on; access_log off; allow 127.0.0.1; deny all; } }}[root@master files]# cat nginx.init #!/bin/sh## nginx - this script starts and stops the nginx daemon## chkconfig: - 85 15 # description: Nginx is an HTTP(S) server, HTTP(S) reverse \# proxy and IMAP/POP3 proxy server# processname: nginx# config: /etc/nginx/nginx.conf# config: /etc/sysconfig/nginxpidfile: /var/run/nginx.pid# Source function library.. 
/etc/rc.d/init.d/functions# Source networking configuration.. /etc/sysconfig/network# Check that networking is up.[ "$NETWORKING" = "no" ] && exit 0nginx="/usr/local/nginx/sbin/nginx"prog=$(basename $nginx)##指定nginx的配置文件目录 NGINX_CONF_FILE="/usr/local/nginx/conf/nginx.conf"[ -f /etc/sysconfig/nginx ] && . /etc/sysconfig/nginxlockfile=/var/lock/subsys/nginxmake_dirs() { # make required directories user=`$nginx -V 2>&1 | grep "configure arguments:" | sed 's/[^*]*--user=\([^ ]*\).*/\1/g' -` if [ -z "`grep $user /etc/passwd`" ]; then useradd -M -s /bin/nologin $user fi options=`$nginx -V 2>&1 | grep 'configure arguments:'` for opt in $options; do if [ `echo $opt | grep '.*-temp-path'` ]; then value=`echo $opt | cut -d "=" -f 2` if [ ! -d "$value" ]; then # echo "creating" $value mkdir -p $value && chown -R $user $value fi fi done}start() { [ -x $nginx ] || exit 5 [ -f $NGINX_CONF_FILE ] || exit 6 make_dirs echo -n $"Starting $prog: " daemon $nginx -c $NGINX_CONF_FILE retval=$? echo [ $retval -eq 0 ] && touch $lockfile return $retval}stop() { echo -n $"Stopping $prog: " killproc $prog -QUIT retval=$? echo [ $retval -eq 0 ] && rm -f $lockfile return $retval}restart() { configtest || return $? stop sleep 1 start}reload() { configtest || return $? echo -n $"Reloading $prog: " $nginx -s reload RETVAL=$? echo}force_reload() { restart}configtest() { $nginx -t -c $NGINX_CONF_FILE}rh_status() { status $prog}rh_status_q() { rh_status >/dev/null 2>&1}case "$1" in start) rh_status_q && exit 0 $1 ;; stop) rh_status_q || exit 0 $1 ;; restart|configtest) $1 ;; reload) rh_status_q || exit 7 $1 ;; force-reload) force_reload ;; status) rh_status ;; condrestart|try-restart) rh_status_q || exit 0 ;; *) echo $"Usage: $0 {start|stop|status|restart|condrestart|try-restart|reload|force-reload|configtest}" exit 2esac</code></pre><p>查看其他配置文件nginx目录下,其他安装包安装配置文件</p><pre><code> nginx安装配置[root@master nginx]# cat nginx-install.sls include: - pkg.pkg-init - nginx.nginx-user - nginx.pcre-install - nginx.zlib-install/var/cache/nginx: file.directory: - user: nginx - group: nginx - mode: 755 - makedirs: Truenginx_dependence: pkg.installed: - names: - gd - gd-develnginx-source-install: file.managed: - name: /usr/local/src/nginx-1.12.2.tar.gz - source: salt://nginx/files/nginx-1.12.2.tar.gz - user: root - group: root - mode: 755 cmd.run: - name: cd /usr/local/src && tar xf nginx-1.12.2.tar.gz && cd nginx-1.12.2 && ./configure --prefix=/usr/local/nginx --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --user=nginx --group=nginx --with-file-aio --with-threads --with-http_addition_module --with-http_auth_request_module --with-http_flv_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_mp4_module --with-http_realip_module --with-http_secure_link_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-stream --with-stream_ssl_module --with-http_image_filter_module --with-pcre=/usr/local/src/pcre-8.41 --with-zlib=/usr/local/src/zlib-1.2.11 && make && make install - unless: test -d /usr/local/nginx - require: - file: nginx-source-install - pkg: pkg-init - cmd: pcre-source-install - cmd: zlib-source-install - user: nginx-user-group启动服务配置[root@master nginx]# cat nginx-service.sls include: - nginx.nginx-installnginx-init: file.managed: - name: /etc/init.d/nginx - source: salt://nginx/files/nginx.init - user: root - 
group: root - mode: 755 cmd.run: - name: chkconfig --add nginx && chkconfig --level 2345 nginx on - unless: chkconfig --list | grep nginx - require: - file: nginx-init/usr/local/nginx/conf/nginx.conf: file.managed: - source: salt://nginx/files/nginx.conf - user: nginx - group: nginx - mode: 644nginx-vhost: file.directory: - name: /usr/local/nginx/conf/vhost - require: - cmd: nginx-source-install service.running: - name: nginx - enable: True - reload: True - require: - cmd: nginx-init - watch: - file: /usr/local/nginx/conf/nginx.conf创建nginx user配置[root@master nginx]# cat nginx-user.sls nginx-user-group: group.present: - name: nginx - gid: 1010 user.present: - name: nginx - fullname: nginx - shell: /sbin/nologin - uid: 1010 - gid: 1010pcre源码包安装配置[root@master nginx]# cat pcre-install.sls include: - pkg.pkg-initpcre-source-install: file.managed: - name: /usr/local/src/pcre-8.41.tar.gz - source: salt://nginx/files/pcre-8.41.tar.gz - user: root - group: root - mode: 755 cmd.run: - name: cd /usr/local/src && tar xf pcre-8.41.tar.gz && cd pcre-8.41 && ./configure --prefix=/usr/local/pcre && make && make install - unless: test -d /usr/local/pcre - require: - file: pcre-source-installzlib安装包安装配置[root@master nginx]# cat zlib-install.sls include: - pkg.pkg-initzlib-source-install: file.managed: - name: /usr/local/src/zlib-1.2.11.tar.gz - source: salt://nginx/files/zlib-1.2.11.tar.gz - user: root - group: root - mode: 755 cmd.run: - name: cd /usr/local/src && tar xf zlib-1.2.11.tar.gz && cd zlib-1.2.11 && ./configure --prefix=/usr/local/zlib && make && make install - unless: test -d /usr/local/zlib - require: - file: zlib-source-install</code></pre><p>以上所有配置结合top.sls文件后都能安装配置成功。下面结合测试环境增加并修改haproxy keepalived配置 实现nginx服务的负载均衡以及高可用。</p><p><strong>cluster目录</strong></p><pre><code>[root@master cluster]# tree .├── files│ ├── haproxy-outside.cfg│ └── haproxy-outside-keepalived.cfg├── haproxy-outside-keepalived.sls└── haproxy-outside.sls1 directory, 4 files</code></pre><p>首先介绍2个sls文件 为salt的配置文件,haproxy-outside.sls为配置haproxy ,haproxy-outside-keepalived.sls为配置haproxy的keepalived的配置。 files目录里面分别为haproxy keepalived的配置文件。可结合实际生产环境进行修改调整。</p><pre><code>#修改haproxy配置文件并启动服务 [root@master cluster]# cat haproxy-outside.sls include: - haproxy.install_haproxyhaproxy-service: file.managed: - name: /etc/haproxy/haproxy.cfg - source: salt://cluster/files/haproxy-outside.cfg - user: root - group: root - mode: 644 service.running: - name: haproxy - enable: True - reload: True - require: - cmd: haproxy-install - watch: - file: haproxy-service #修改keepalived配置文件并启动服务。注意这里用到了jinja模块,对多后端通过变量进行设置参数。这里因为2个keepalived配置文件需要的master backup priority等值不一样。通过变量指定。[root@master cluster]# cat haproxy-outside-keepalived.sls include: - keepalived.install_keepalivedkeepalived-service: file.managed: - name: /etc/keepalived/keepalived.conf - source: salt://cluster/files/haproxy-outside-keepalived.cfg - user: root - group: root - mode: 644 - template: jinja {% set STATEID = ["MASTER","BACKUP"] %} {% set PRIORITYID = [120,100] %} {% if grains['fqdn'] == 'minion1' %} - ROUTEID: minion1 - STATEID: {{ STATEID[0] }} - PRIORITYID: {{ PRIORITYID[0] }} {% elif grains['fqdn'] == 'minion2' %} - ROUTEID: minion2 - STATEID: {{ STATEID[1] }} - PRIORITYID: {{ PRIORITYID[1] }} {% endif %} service.running: - name: keepalived - enable: True - watch: - file: keepalived-service####haproxy的配置文件[root@master files]# pwd/srv/salt/prod/cluster/files[root@master files]# lltotal 8-rw-r--r-- 1 root root 1296 Jun 7 16:47 haproxy-outside.cfg-rw-r--r-- 1 root root 375 Jun 8 
12:22 haproxy-outside-keepalived.cfg[root@master files]# cat haproxy-outside.cfg global log 127.0.0.1 local2 chroot /usr/local/haproxy pidfile /usr/local/haproxy/haproxy.pid maxconn 10000 daemon nbproc 1defaults option http-keep-alive maxconn 10000 mode http log global option httplog timeout http-request 10s timeout queue 1m timeout connect 10s timeout client 1m timeout server 1m timeout http-keep-alive 10s timeout check 10s#################通过haproxy节点8888端口/haproxy-status 查看haproxy状态listen status mode http bind *:8888 stats enable stats hide-version stats uri /haproxy-status stats auth haproxy:saltstack stats admin if TRUE stats realm Haproxy\ Statistics#################前端绑定VIP指向后端default_backend nginxfrontend web bind 192.168.4.16:80 mode http option httplog log global default_backend nginx################定义nginx后端的2台实际nginx物理机节点backend nginx option forwardfor header X-REAL-IP option httpchk HEAD / HTTP/1.0 balance roundrobin server minion3 192.168.4.13:80 check inter 2000 rise 30 fall 15 server minion4 192.168.4.14:80 check inter 2000 rise 30 fall 15###keepalived的配置文件,引用了之前文件haproxy-outside-keepalived.sls变量[root@master files]# cat haproxy-outside-keepalived.cfg global_defs { router_id {{ROUTEID}}}vrrp_instance haproxy_ha { state {{STATEID}} interface eno16777736 virtual_router_id 36 priority {{PRIORITYID}} advert_int 1 authentication { auth_type PASS auth_pass 1111 } virtual_ipaddress { 192.168.4.16 }}</code></pre><p>这里所有配置均已介绍完毕。下面开始统一部署测试。回到base目录下,编写top.sls文件</p><pre><code>[root@master base]# cat top.sls #base:定义*既所有的主机执行init目录下的env_init.sls文件即节点初始化的配置base: '*': - init.env_init##prod 定义了不同的minion节点需要执行的步骤,此例中minion1 minion2需要安装haproxy keepalived 以及配置高可用以及负载均衡。 minion3 minon4节点只需要安装nginx而已。prod: 'minion1': - haproxy.install_haproxy - keepalived.install_keepalived - cluster.haproxy-outside - cluster.haproxy-outside-keepalived 'minion2': - haproxy.install_haproxy - keepalived.install_keepalived - cluster.haproxy-outside - cluster.haproxy-outside-keepalived 'minion3': - nginx.nginx-service 'minion4': - nginx.nginx-service</code></pre><p>运行脚本,部署该例环境</p><pre><code>[root@master base]# salt '*' state.highstate</code></pre><p>此例必须返回所有成功。本例测试环境中均已调试OK 执行OK 。</p><p>下面查看运行完成之后的效果,这里修改后端nginx minion3 minion4的首页配置文件</p><pre><code>[root@minion3 html]# cat /usr/local/nginx/html/index.html minion3[root@minion4 ~]# cat /usr/local/nginx/html/index.html minion4</code></pre><p>浏览器上登入192.168.4.11:8888/haproxy-status 192.168.4.12:8888/haproxy-status 以及VIP查看haproxy状态,用户名密码为之前配置文件中定义的haproxy/saltstack</p><p>minion1登入haproxy查看</p><p><img src="https://i.imgur.com/z6tC1Ms.jpg" alt=""></p><p>minion2登入haproxy查看</p><p><img src="https://i.imgur.com/1g3n3nT.jpg" alt=""></p><p>VIP登入haproxy查看</p><p><img src="https://i.imgur.com/DITW0Ny.jpg" alt=""></p><p>minion3节点登入nginx</p><p><img src="https://i.imgur.com/fyh10cx.jpg" alt=""></p><p>minion4节点登入nginx</p><p><img src="https://i.imgur.com/3BlE632.jpg" alt=""></p><p>VIP登入nginx 并刷新浏览器</p><p><img src="https://i.imgur.com/VsJiaZl.jpg" alt=""></p><p><img src="https://i.imgur.com/b8RTcHH.jpg" alt=""></p><p>可以看到测试效果已经实现了。</p>]]></content>
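<p>补充一个简单的验证思路(非原文步骤,仅供参考,命令基于上文的 VIP 192.168.4.16、haproxy 状态页 8888 端口以及 keepalived 绑定的 eno16777736 网卡):多次访问 VIP 确认轮询效果,再停掉 MASTER 上的 keepalived 观察 VIP 漂移。</p><pre><code># 多次访问 VIP,输出应在 minion3 / minion4 之间轮询
for i in 1 2 3 4; do
    curl -s http://192.168.4.16/
done

# 查看 haproxy 状态页(用户名/密码为配置中定义的 haproxy/saltstack)
curl -s -u haproxy:saltstack http://192.168.4.16:8888/haproxy-status | head

# 在 MASTER(minion1)上停止 keepalived,VIP 应漂移到 BACKUP(minion2)
systemctl stop keepalived
ip addr show eno16777736 | grep 192.168.4.16    # minion1 上应已看不到 VIP
</code></pre>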
<summary type="html">
<p>本文档仅作为自己学习记录使用。<br>部分内容参考<a href="https://www.kancloud.cn/louis1986/saltstack/521459" target="_blank" rel="noopener">https://www.kancloud.cn/louis1986/saltstack/521459</a></p>
<p>本例环境架构: centos7.2-1511 测试环境,均关闭防火墙以及selinux</p>
<table>
<thead>
<tr>
<th>主机名</th>
<th style="text-align:center">角色</th>
<th style="text-align:right">ip</th>
</tr>
</thead>
<tbody>
<tr>
<td>master</td>
<td style="text-align:center">master</td>
<td style="text-align:right">192.168.4.10</td>
</tr>
<tr>
<td>minion1</td>
<td style="text-align:center">haproxy keepalived</td>
<td style="text-align:right">192.168.4.11</td>
</tr>
<tr>
<td>minion2</td>
<td style="text-align:center">haproxy keepalived</td>
<td style="text-align:right">192.168.4.12</td>
</tr>
<tr>
<td>minion3</td>
<td style="text-align:center">nginx</td>
<td style="text-align:right">192.168.4.13</td>
</tr>
<tr>
<td>minion4</td>
<td style="text-align:center">nginx</td>
<td style="text-align:right">192.168.4.14</td>
</tr>
<tr>
<td></td>
<td style="text-align:center">VIP</td>
<td style="text-align:right">192.168.4.16</td>
</tr>
</tbody>
</table>
<p>该环境主要通过haproxy + keepalived实现负载均衡与高可用,其中haproxy通过轮询的方式将请求分发到后端实际的2台nginx服务器。<br>
</summary>
<category term="saltstack haproxy keepalived nginx" scheme="http://idcat.cn/tags/saltstack-haproxy-keepalived-nginx/"/>
</entry>
<entry>
<title>systemtap-centos7上安装调试</title>
<link href="http://idcat.cn/systemtap-centos7%E4%B8%8A%E5%AE%89%E8%A3%85%E8%B0%83%E8%AF%95.html"/>
<id>http://idcat.cn/systemtap-centos7上安装调试.html</id>
<published>2018-05-11T02:04:36.000Z</published>
<updated>2018-05-11T02:07:10.100Z</updated>
<content type="html"><![CDATA[<p><strong>systemtap简介</strong></p><p>SystemTap 是一款诊断Linux系统性能的工具,可以跟踪内核以及用户态程序中的任意函数、syscall、语句甚至指令,可以用来动态地收集调试和性能信息的工具,不需要我们重新编译、重启内核。</p><p>详细介绍查看官网地址。其官网地址 <a href="https://sourceware.org/systemtap/" title="https://sourceware.org/systemtap/" target="_blank" rel="noopener">https://sourceware.org/systemtap/</a>。<br><a id="more"></a><br><strong>systemtap安装</strong></p><p>本例环境 centos7 7.2.1511 内核版本3.10.0-693.21.1.el7.x86_64</p><p>systemtap包安装</p><pre><code>yum install systemtap systemtap-runtime</code></pre><p>systemtap内核信息安装包安装</p><p>• kernel-debuginfo</p><p>• kernel-debuginfo-common</p><p>• kernel-devel</p><p>通过root用户执行如下命令进行以上3个包的自动安装</p><pre><code>stap-prep</code></pre><p><strong>确认安装成功</strong></p><p>执行如下脚本,返回complete即可</p><pre><code>[root@node1 ~]# stap -v -e 'probe vfs.read {printf("read performed\n"); exit()}'Pass 1: parsed user script and 466 library scripts using 226636virt/39688res/3308shr/36524data kb, in 190usr/130sys/323real ms.Pass 2: analyzed script: 1 probe, 1 function, 7 embeds, 0 globals using 371188virt/179300res/4640shr/181076data kb, in 1270usr/690sys/1964real ms.Pass 3: using cached /root/.systemtap/cache/df/stap_df47b68c3f1c63f931107c651e9afa76_2692.cPass 4: using cached /root/.systemtap/cache/df/stap_df47b68c3f1c63f931107c651e9afa76_2692.koPass 5: starting run.read performedPass 5: run completed in 0usr/50sys/448real ms.[root@node1 ~]# </code></pre><p><strong>其他机器上执行</strong></p><p>当用户允许一个systemtap 脚本的时候,它会将脚本build为一个内核模块,然后加载该模块,当有事件触发该模块的时候,则执行模块。但是systemtap 脚本只能在部署了debuginfo等内核信息包的主机运行,那么在其他没有安装这些debuginfo的主机如何运行systemtap脚本呢?</p><p>这里systemtap,命令stap支持指定参数-m 将脚本编译为一个内核模块,然后将该模块复制到其他没有安装debuginfo的节点。在其他节点上通过staprun module.ko执行脚本。</p><p>举例:</p><ul><li><p>node1 作为host system是安装了systemtap systemtap-runtime kernel-debuginfo等所有包的节点。</p></li><li><p>node3 作为target system 仅仅安装了systemtap-runtime包。</p></li><li><p>node1 和node3 内核尽量保持一致。不一致也没有关系,通过-r参数指定target system内核版本即可,但是前提是node1即host system节点必须也安装了target system节点的内核版本。</p></li></ul><p>查看内核信息以及将脚本编译输出到模块teststap</p><pre><code>[root@node1 ~]# uname -r3.10.0-693.21.1.el7.x86_64[root@node3 ~]# uname -r3.10.0-693.21.1.el7.x86_64[root@node1 ~]# stap -v -r 3.10.0-693.21.1.el7.x86_64 -e 'probe vfs.read {printf("read performed\n"); exit()}' -m teststap 1: parsed user script and 466 library scripts using 226648virt/39732res/3336shr/36536data kb, in 290usr/30sys/329real ms.Pass 2: analyzed script: 1 probe, 1 function, 7 embeds, 0 globals using 371200virt/179340res/4660shr/181088data kb, in 1160usr/650sys/1810real ms.Pass 3: translated to C into "/tmp/stap2XZAPY/teststap_src.c" using 371200virt/179588res/4908shr/181088data kb, in 20usr/70sys/81real ms.Pass 4: compiled C into "teststap.ko" in 790usr/1380sys/2079real ms.Pass 5: starting run.read performedPass 5: run completed in 0usr/70sys/393real ms.[root@node1 ~]# ll teststap.ko -rw-r--r-- 1 root root 96912 May 10 16:29 teststap.ko</code></pre><p>上面命令-r 指定内核版本,若是2者之间内核一致,则可以不需要-r参数。<br>将该模块文件复制到node3节点,然后执行如下确认:</p><pre><code>[root@node1 ~]# scp teststap.ko root@node3:/rootteststap.ko 100% 95KB 33.9MB/s 00:00 [root@node1 ~]# ssh node3Last login: Thu May 10 14:36:32 2018 from 192.168.0.99 [root@node3 ~]# staprun teststap.ko read performed[root@node3 ~]# </code></pre><p>成功执行模块指令</p><p><strong>运行systemtap 脚本</strong></p><p>常用命令stap staprun 相关使用使用man查看</p><p>运行stap命令用户最好拥有root权限,若是普通用户则需要将其加入到组stapdev 或者stapusr ,详情查看官网介绍。</p><p>脚本运行方式:</p><p>1、通过脚本文件</p><p>stap script_file_name</p><p>2、通过- 从标准输出中获取脚本信息</p><p>echo “probe timer.s(1) {exit()}” | stap -v 
-</p><p><strong>stap基本参数</strong></p><p>-v 详细列出输出信息-vvv列出更多信息</p><p>-o filename 将标准输出到指定文件名</p><p>-S size,count 限制output文件的数量和单个文件大小。</p><p>-x process_id 设置systemtap句柄函数target()获取指定进程ID的信息</p><p>-c command 设置systemtap句柄函数target()通过指定的command来运行</p><p>-e ‘scrip’ 使用脚本作为systemtap的输入</p><p>-F 让systemtap在后台已进程方式运行,默认后台模式有2种,一种是in-memory ,一种是file 模式</p><ul><li>in-memory 模式</li></ul><p>执行</p><pre><code>stap -F iotime.stp </code></pre><p>执行改命令后,stap打印简单的改命令执行信息,方便你重新连接改脚本:</p><pre><code>Disconnecting from systemtap module.To reconnect, type "staprun -A stap_5dd0073edcb1f13f7565d8c343063e68_19556"</code></pre><p>获取结果</p><pre><code>staprun -A stap_5dd0073edcb1f13f7565d8c343063e68_19556</code></pre><ul><li>file 模式</li></ul><p>执行</p><pre><code>stap -F -o /tmp/pfaults.log -S 1,2 pfaults.stp</code></pre><p>生成2个文件,每个文件1M大小,文件格式如/tmp/iotime.log.[0-9]+<br>始终保持最新的数据到这2个文件中,旧数据会被remove。该命令执行后会输出进程id号,则通过kill 命令停止脚本运行,举例id号为7590</p><pre><code>kill -s SIGTERM 7590</code></pre>]]></content>
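<p>补充一个稍复杂一点的脚本示例(非原文内容,仅作演示):在上文 vfs.read 探针的基础上,统计 5 秒内各进程触发 vfs.read 的次数,并按次数降序输出前 10 名。</p><pre><code>stap -v -e '
global reads
probe vfs.read { reads[execname()]++ }
probe timer.s(5) {
    foreach (name in reads- limit 10)
        printf("%-20s %d\n", name, reads[name])
    exit()
}'
</code></pre>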
<summary type="html">
<p><strong>systemtap简介</strong></p>
<p>SystemTap 是一款诊断Linux系统性能的工具,可以跟踪内核以及用户态程序中的任意函数、syscall、语句甚至指令,可以用来动态地收集调试和性能信息的工具,不需要我们重新编译、重启内核。</p>
<p>详细介绍请查看其官网 <a href="https://sourceware.org/systemtap/" title="https://sourceware.org/systemtap/" target="_blank" rel="noopener">https://sourceware.org/systemtap/</a>。<br>
</summary>
<category term="systemtap" scheme="http://idcat.cn/tags/systemtap/"/>
</entry>
<entry>
<title>记录一次因对象权限不对导致osd无法启动的排查过程</title>
<link href="http://idcat.cn/%E8%AE%B0%E5%BD%95%E4%B8%80%E6%AC%A1%E5%9B%A0%E5%AF%B9%E8%B1%A1%E6%9D%83%E9%99%90%E4%B8%8D%E5%AF%B9%E5%AF%BC%E8%87%B4osd%E6%97%A0%E6%B3%95%E5%90%AF%E5%8A%A8%E7%9A%84%E6%8E%92%E6%9F%A5%E8%BF%87%E7%A8%8B.html"/>
<id>http://idcat.cn/记录一次因对象权限不对导致osd无法启动的排查过程.html</id>
<published>2018-05-11T02:03:48.000Z</published>
<updated>2018-05-11T02:06:39.057Z</updated>
<content type="html"><![CDATA[<p>记录一次osd启动失败的例子,仅个人记录使用。<br><a id="more"></a><br>集群状态:</p><pre><code>[root@node1 omap]# ceph -scluster 911c57dc-a930-4da8-ab0e-69f6b6586e3d health HEALTH_WARN 328 pgs degraded 328 pgs stuck degraded 328 pgs stuck unclean 328 pgs stuck undersized 328 pgs undersized recovery 319/957 objects degraded (33.333%) too many PGs per OSD (328 > max 300) monmap e1: 3 mons at {node1=192.168.1.141:6789/0,node2=192.168.1.142:6789/0,node3=192.168.1.143:6789/0} election epoch 108, quorum 0,1,2 node1,node2,node3 fsmap e705662: 1/1/1 up {0=node1=up:active}, 2 up:standby osdmap e705733: 3 osds: 2 up, 2 in; 328 remapped pgs flags sortbitwise,require_jewel_osds pgmap v1296660: 328 pgs, 12 pools, 457 MB data, 319 objects 1071 MB used, 29626 MB / 30697 MB avail 319/957 objects degraded (33.333%) 328 active+undersized+degraded</code></pre><p>查看是node1上的osd0 没有启动,于是启动osd0</p><pre><code>2018-05-10 12:08:08.756988 7fa896131800 0 filestore(/var/lib/ceph/osd/ceph-0) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled2018-05-10 12:08:08.760327 7fa896131800 1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 18: 5368709120 bytes, block size 4096 bytes, directio = 1, aio = 12018-05-10 12:08:08.762157 7fa896131800 1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 18: 5368709120 bytes, block size 4096 bytes, directio = 1, aio = 12018-05-10 12:08:08.763165 7fa896131800 1 filestore(/var/lib/ceph/osd/ceph-0) upgrade2018-05-10 12:08:08.763591 7fa896131800 0 <cls> cls/cephfs/cls_cephfs.cc:202: loading cephfs_size_scan2018-05-10 12:08:08.763774 7fa896131800 0 <cls> cls/hello/cls_hello.cc:305: loading cls_hello2018-05-10 12:08:08.770974 7fa896131800 -1 osd/OSD.h: In function 'OSDMapRef OSDService::get_map(epoch_t)' thread 7fa896131800 time 2018-05-10 12:08:08.769611osd/OSD.h: 894: FAILED assert(ret) ceph version 10.2.10 (5dc1e4c05cb68dbf62ae6fce3f0700e4654fdbbe) 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x560b85e5c9e5] 2: (OSDService::get_map(unsigned int)+0x3d) [0x560b8582f14d] 3: (OSD::init()+0x1fe2) [0x560b857e2b42] 4: (main()+0x2c01) [0x560b85746461] 5: (__libc_start_main()+0xf5) [0x7fa892e2fc05] 6: (()+0x35d917) [0x560b85790917] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.</code></pre><p>只能看到get_map(epoch_t) FAILED assert(ret)。不知道什么原因导致的。于是开启调试日志启动该osd查看更多详细信息。</p><pre><code>[root@node1 ceph-0]# ceph-osd -f --cluster ceph -i 0 --setuser ceph --setgroup ceph --debug-osd=10 --debug-filestore=10 --log-to-stderr=1</code></pre><p>查看输出中显示如下:</p><pre><code>-2> 2018-05-10 12:57:33.725952 7f66199a7800 10 filestore(/var/lib/ceph/osd/ceph-0) error opening file /var/lib/ceph/osd/ceph-0/current/meta/DIR_4/DIR_A/DIR_0/osdmap.705730__0_C8EB90A4__none with flags=2: (13) Permission denied-1> 2018-05-10 12:57:33.725960 7f66199a7800 10 filestore(/var/lib/ceph/osd/ceph-0) FileStore::read(meta/#-1:2509d713:::osdmap.705730:0#) open error: (13) Permission denied 0> 2018-05-10 12:57:33.727508 7f66199a7800 -1 osd/OSD.h: In function 'OSDMapRef OSDService::get_map(epoch_t)' thread 7f66199a7800 time 2018-05-10 12:57:33.725969osd/OSD.h: 894: FAILED assert(ret)</code></pre><p>有对象文件权限拒绝。</p><p>进入目录查看</p><pre><code>[root@node1 DIR_0]# lltotal 24-rw-r--r-- 1 ceph ceph 5171 Apr 25 09:32 osdmap.705305__0_C8F530A4__none-rw-r--r-- 1 ceph ceph 5171 Apr 25 09:39 osdmap.705518__0_C8F460A4__none-rw-r--r-- 1 root root 5659 May 10 11:48 osdmap.705730__0_C8EB90A4__none</code></pre><p>均是root用户。于是修改权限</p><pre><code>[root@node1 DIR_0]# 
chown ceph:ceph osdmap.705730__0_C8EB90A4__none </code></pre><p>启动该osd</p><pre><code>[root@node1 DIR_0]# systemctl start ceph-osd@0</code></pre><p>查看集群状态</p><pre><code>[root@node1 DIR_0]# ceph -scluster 911c57dc-a930-4da8-ab0e-69f6b6586e3d health HEALTH_WARN too many PGs per OSD (328 > max 300) monmap e1: 3 mons at {node1=192.168.1.141:6789/0,node2=192.168.1.142:6789/0,node3=192.168.1.143:6789/0} election epoch 108, quorum 0,1,2 node1,node2,node3 fsmap e705662: 1/1/1 up {0=node1=up:active}, 2 up:standby osdmap e705736: 3 osds: 3 up, 3 in flags sortbitwise,require_jewel_osds pgmap v1296809: 328 pgs, 12 pools, 457 MB data, 319 objects 1593 MB used, 44453 MB / 46046 MB avail 328 active+clean</code></pre><p>至此,集群恢复OK。</p>]]></content>
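<p>补充一个批量排查的小技巧(非原文步骤,仅供参考):当怀疑是属主问题导致 osd 起不来时,可以先用 find 把 osd 数据目录下属主不是 ceph 的文件全部列出来,确认后再统一修正。</p><pre><code># 列出 osd 数据目录下属主不是 ceph 的文件
find /var/lib/ceph/osd/ceph-0 ! -user ceph -ls

# 确认无误后统一修正属主属组,再启动 osd
find /var/lib/ceph/osd/ceph-0 ! -user ceph -exec chown ceph:ceph {} +
systemctl start ceph-osd@0
</code></pre>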
<summary type="html">
<p>记录一次osd启动失败的例子,仅个人记录使用。<br>
</summary>
<category term="ceph" scheme="http://idcat.cn/tags/ceph/"/>
</entry>
<entry>
<title>blktrace+fio对比测试分析</title>
<link href="http://idcat.cn/blktrace-fio%E5%AF%B9%E6%AF%94%E6%B5%8B%E8%AF%95%E5%88%86%E6%9E%90.html"/>
<id>http://idcat.cn/blktrace-fio对比测试分析.html</id>
<published>2018-05-08T08:13:20.000Z</published>
<updated>2018-05-08T08:25:44.517Z</updated>
<content type="html"><![CDATA[<p>利用BLKTRACE分析磁盘IO </p><p>在Linux系统上,查看磁盘的负载情况,咱们一般使用iostat监控工具,iostat的详细介绍查看另外的培训资料。其中很重要的参数就是await,await表示单个I/O所需的平均时间,但它同时包含了I/O Scheduler所消耗的时间和硬件所消耗的时间,所以不能作为硬件性能的指标。那如何才能分辨一个io从下发到返回整个时间上,是硬件层耗时多还是在io调度上耗时多呢?如何查看io在各个时间段所消耗的时间呢?那么,blktrace在这种场合就能派上用场,因为它能记录I/O所经历的各个步骤,从中可以分析是IO Scheduler慢还是硬件响应慢,以及各个时间段所用时间。<br><a id="more"></a><br>blktrace的原理<br>一个I/O请求进入block layer之后,可能会经历下面的过程:</p><ul><li>Remap: 可能被DM(Device Mapper)或MD(Multiple Device, Software RAID) remap到其它设备</li><li>Split: 可能会因为I/O请求与扇区边界未对齐、或者size太大而被分拆(split)成多个物理I/O</li><li>Merge: 可能会因为与其它I/O请求的物理位置相邻而合并(merge)成一个I/O</li></ul><ul><li>被IO Scheduler依照调度策略发送给driver</li></ul><ul><li>被driver提交给硬件,经过HBA、电缆(光纤、网线等)、交换机(SAN或网络)、最后到达存储设备,设备完成IO请求之后再把结果发回。</li></ul><p>blktrace能记录I/O所经历的各个步骤,来看一下它记录的数据,包含9个字段,下图标示了其中8个字段的含义,大致的意思是“哪个进程在访问哪个硬盘的哪个扇区,进行什么操作,进行到哪个步骤,时间戳是多少”:<br><img src="https://i.imgur.com/tyws5Pz.png" alt=""></p><ul><li>第一个字段:8,0 这个字段是设备号 major device ID和minor device ID。-第二个字段:3 表示CPU</li><li>第三个字段:11 序列号</li><li>第四个字段:0.009507758 Time Stamp是时间偏移</li><li>第五个字段:PID 本次IO对应的进程ID</li><li>第六个字段:Event,这个字段非常重要,反映了IO进行到了那一步</li><li>第七个字段:R表示 Read, W是Write,D表示block,B表示BarrierOperation</li><li>第八个字段:223490+56,表示的是起始block number 和 number of blocks,即我们常说的Offset 和 Size(扇区??)</li><li>第九个字段: 进程名</li></ul><p>其中第六个字段非常有用:每一个字母都代表了IO请求所经历的某个阶段。</p><ul><li>A 映射值对应设备 IO was remapped to a different device</li><li>B IO反弹,由于32位地址长度限制,所以需要copy数据到低位内存,这会有性能损耗。IO bounced</li><li>C IO完成 IO completion</li><li>D 将IO发送给驱动 IO issued to driver</li><li>F IO请求,前合并 IO front merged with request on queue</li><li>G 获取 请求 Get request</li><li>I IO插入请求队列 IO inserted onto request queue</li><li>M IO请求,后合并 IO back merged with request on queue</li><li>P 插上块设备队列(队列插入机制) Plug request</li><li>Q io被请求队列处理代码接管。 IO handled by request queue code</li><li>S 等待发送请求。 Sleep request</li><li>T 由于超时而拔出设备队列 Unplug due to timeout</li><li>U 拔出设备队列 Unplug request</li><li>X 开始新的扇区 Split</li></ul><p>其中最重要的几个阶段如下:</p><pre><code>Q – 即将生成IO请求|G – IO请求生成|I – IO请求进入IO Scheduler队列|D – IO请求进入driver|C – IO请求执行完毕</code></pre><p>根据以上步骤对应的时间戳就可以计算出I/O请求在每个阶段所消耗的时间:</p><pre><code>Q2G – 生成IO请求所消耗的时间,包括remap和split的时间;G2I – IO请求进入IO Scheduler所消耗的时间,包括merge的时间;I2D – IO请求在IO Scheduler中等待的时间;D2C – IO请求在driver和硬件上所消耗的时间;Q2C – 整个IO请求所消耗的时间(Q2I + I2D + D2C = Q2C),相当于iostat的await。</code></pre><p>btt工具,blkparse等参考<a href="http://www.idcat.cn/2018/04/21/blktrace-%E5%B7%A5%E5%85%B7%E7%AE%80%E4%BB%8B/" title="http://www.idcat.cn/2018/04/21/blktrace-%E5%B7%A5%E5%85%B7%E7%AE%80%E4%BB%8B/" target="_blank" rel="noopener">http://www.idcat.cn/2018/04/21/blktrace-%E5%B7%A5%E5%85%B7%E7%AE%80%E4%BB%8B/</a></p><h1 id="测试"><a href="#测试" class="headerlink" title="测试"></a>测试</h1><p><strong>1、4k direct=1</strong> </p><p>通过fio对本地磁盘sdc挂载点进行4K小文件顺序写测试,期间监控磁盘信息。完成fio测试后,手动终止监控程序,然后进行数据分析。并与iostat的数据进行对比分析。</p><p>fio 开启测试</p><pre><code>[root@node1 samba-test]# fio -filename=./4k_file -direct=1 -iodepth=1 -thread -rw=write -ioengine=libaio -bs=4k -size=3G -numjobs=4 -times=300 -group_reporting -name=mytest2mytest2: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=1...fio-2.2.5Starting 4 threads</code></pre><p>注意:</p><p>1、direct=1即o_direct直写io。不会经过系统缓存,即没有合并</p><p>2、numjobs=4 bs=4k 对应iostat即队列深度为4.</p><p>3、bs=4k 对应iostat扇区数为8 </p><p>开启监控磁盘sdc</p><pre><code>[root@node1 ~]# blktrace -d /dev/sdc </code></pre><p>iostat监控</p><p><img src="https://i.imgur.com/pbfZKtL.jpg" alt=""></p><p>FIO运行结束后,终止blktrace进程</p><p>收集数据</p><pre><code>[root@node1 fioblk]# blktrace -d /dev/sdc^C=== sdc === CPU 0: 
5504242 events, 258012 KiB data CPU 1: 5693967 events, 266905 KiB data Total: 11198209 events (dropped 0), 524917 KiB data[root@node1 fioblk]# lltotal 524920-rw-r--r-- 1 root root 264203656 Apr 27 14:08 sdc.blktrace.0-rw-r--r-- 1 root root 273310432 Apr 27 14:08 sdc.blktrace.1[root@node1 fioblk]#</code></pre><p>会看到生成文件数以当前cpu个数命名。</p><p>生成数据</p><p>blkparse -i sdc -d sdc.blktrace.bin</p><p>会将所有数据汇总到文件sdc.blktrace.bin里面。期间我截图了如下,代表3个io的过程。对比上面介绍。<br> <img src="https://i.imgur.com/nGTPClj.jpg" alt=""><br>WS表示同步写</p><p>分析数据</p><p>生成所有数据,后面介绍针对性的数据。</p><pre><code>[root@node1 fioblk]# btt -i sdc.blktrace.bin -A |less </code></pre><p><img src="https://i.imgur.com/QbV8sv4.jpg" alt=""></p><ul><li>Q2G – 生成IO请求所消耗的时间,包括remap和split的时间;本例0.009ms</li><li>G2I – IO请求进入IO Scheduler所消耗的时间,包括merge的时间;0.001ms</li><li>I2D – IO请求在IO Scheduler中等待的时间;0.0009ms</li><li>D2C – IO请求在driver和硬件上所消耗的时间;0.7ms</li></ul><p>因此:</p><p>Q2C – 整个IO请求所消耗的时间(Q2I + I2D + D2C = Q2C),相当于iostat的await。 几乎0.78ms与iostat wait值一致。在io scheduler上几乎没有消耗任何时间。</p><p>生成iops 和bw每秒值</p><pre><code>btt -i sdc.blktrace.bin -q sdc.q2c_latencytotal 1079400-rw-r--r-- 1 root root 2759 Apr 27 14:45 8,32_iops_fp.dat-rw-r--r-- 1 root root 4340 Apr 27 14:45 8,32_mbps_fp.dat-rw-r--r-- 1 root root 264203656 Apr 27 14:08 sdc.blktrace.0-rw-r--r-- 1 root root 273310432 Apr 27 14:08 sdc.blktrace.1-rw-r--r-- 1 root root 537514088 Apr 27 14:16 sdc.blktrace.bin-rw-r--r-- 1 root root 30244692 Apr 27 14:45 sdc.q2c_latency_8,32_q2c.dat-rw-r--r-- 1 root root 2759 Apr 27 14:45 sys_iops_fp.dat-rw-r--r-- 1 root root 4340 Apr 27 14:45 sys_mbps_fp.dat</code></pre><p>生成io大小分布</p><pre><code>btt -i sdc.blktrace.bin -B sdc.offset[root@node1 fioblk]# lltotal 1183168-rw-r--r-- 1 root root 2759 Apr 27 14:57 8,32_iops_fp.dat-rw-r--r-- 1 root root 4340 Apr 27 14:57 8,32_mbps_fp.dat-rw-r--r-- 1 root root 264203656 Apr 27 14:08 sdc.blktrace.0-rw-r--r-- 1 root root 273310432 Apr 27 14:08 sdc.blktrace.1-rw-r--r-- 1 root root 537514088 Apr 27 14:16 sdc.blktrace.bin-rw-r--r-- 1 root root 53125272 Apr 27 14:57 sdc.offset_8,32_c.dat-rw-r--r-- 1 root root 53125272 Apr 27 14:57 sdc.offset_8,32_w.dat</code></pre><p>本例只有write,所有生成了_w _c 写是w 读是r c是w+c 本例c和w一样大。查看文件内容</p><pre><code>0.000002716 52608272 526082800.000129844 52608000 526080080.000324497 52607688 526076960.000927928 52606144 526061520.001015187 52608280 526082880.001449302 52608008 52608016</code></pre><p>第一行为时间,第二个为其实扇区大小,第3行为每个io结束扇区大小。可以算出每个io为8个扇区大小,既4k 与测试实际相符合。</p><p><strong>2、1M direct=1</strong></p><p>fio运行</p><pre><code>[root@node1 samba-test]# fio -filename=./1m_file -direct=1 -iodepth=1 -thread -rw=write -ioengine=libaio -bs=1M -size=3G -numjobs=4 -times=300 -group_reporting -name=mytest2mytest2: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=1...fio-2.2.5Starting 4 threadsmytest2: Laying out IO file(s) (1 file(s) / 3072MB)</code></pre><p>注意:</p><ul><li>1、bs=1M ,sdc块设备最大单次io大小为512,因此会分为2个io写入,numjobs=4,因此队列深度为2*4 = 8 </li><li>[root@node1 1m]# cat /sys/block/sdc/queue/max_sectors_kb<br>512</li><li>[root@node1 ~]# cat /sys/block/sdc/queue/nr_requests<br>128 磁盘最大队列深度。</li><li>2、bs=1M 最大io数为512kb,除以512byte扇区大小为操作扇区数1024<br>与iostat一致。</li></ul><p>blktrace取值</p><pre><code>[root@node1 1m]# blktrace -d /dev/sdc</code></pre><p>iostat取值<br><img src="https://i.imgur.com/O8hjNoU.jpg" alt=""></p><p>fio运行完成后,生成数据,并分析数据</p><p>生成数据</p><pre><code>[root@node1 1m]# blkparse -i sdc -d sdc.blktrace.bin</code></pre><p><img src="https://i.imgur.com/5YcmCt4.jpg" 
alt=""></p><p>可以看到Q-C一个完整的io路径。右边起始扇区+1024个扇区数。与实际相符合。</p><p>分析数据</p><pre><code>root@node1 1m]# btt -i sdc.blktrace.bin -A | less</code></pre><p><img src="https://i.imgur.com/FiYWzt6.jpg" alt=""></p><ul><li>Q2G – 生成IO请求所消耗的时间,包括remap和split的时间;<br>本例0.7ms 因为有块分为2部分。1m分为2个512kb io,单个512kb io 到扇区后得与扇区对齐等等</li><li>G2I – IO请求进入IO Scheduler所消耗的时间,包括merge的时间;<br>0.009ms 没有进行合并 绕过缓存了。</li><li>I2D – IO请求在IO Scheduler中等待的时间;<br>2.8ms</li><li>D2C – IO请求在driver和硬件上所消耗的时间;<br>34.5 ms</li></ul><p>因此:</p><p>Q2C – 整个IO请求所消耗的时间(Q2I + I2D + D2C = Q2C),相当于iostat的await。 为38.1ms与iostat wait值几乎一致。在io scheduler上几乎没有消耗任何时间。时间花费在driver硬件层。</p><p>生成iops 和bw每秒值</p><pre><code>btt -i sdc.blktrace.bin -q sdc.q2c_latency[root@node1 1m]# lltotal 14508-rw-r--r-- 1 root root 913 Apr 27 15:40 8,32_iops_fp.dat-rw-r--r-- 1 root root 1731 Apr 27 15:40 8,32_mbps_fp.dat-rw-r--r-- 1 root root 3295528 Apr 27 15:20 sdc.blktrace.0-rw-r--r-- 1 root root 3894704 Apr 27 15:20 sdc.blktrace.1-rw-r--r-- 1 root root 7190232 Apr 27 15:28 sdc.blktrace.bin-rw-r--r-- 1 root root 452027 Apr 27 15:40 sdc.q2c_latency_8,32_q2c.dat-rw-r--r-- 1 root root 913 Apr 27 15:40 sys_iops_fp.dat-rw-r--r-- 1 root root 1731 Apr 27 15:40 sys_mbps_fp.dat</code></pre><p>生成io大小分布</p><pre><code>btt -i sdc.blktrace.bin -B sdc.offset[root@node1 1m]# lltotal 16148-rw-r--r-- 1 root root 913 Apr 27 15:44 8,32_iops_fp.dat-rw-r--r-- 1 root root 1731 Apr 27 15:44 8,32_mbps_fp.dat-rw-r--r-- 1 root root 3295528 Apr 27 15:20 sdc.blktrace.0-rw-r--r-- 1 root root 3894704 Apr 27 15:20 sdc.blktrace.1-rw-r--r-- 1 root root 7190232 Apr 27 15:28 sdc.blktrace.bin-rw-r--r-- 1 root root 835954 Apr 27 15:44 sdc.offset_8,32_c.dat-rw-r--r-- 1 root root 835954 Apr 27 15:44 sdc.offset_8,32_w.dat-rw-r--r-- 1 root root 452027 Apr 27 15:40 sdc.q2c_latency_8,32_q2c.dat-rw-r--r-- 1 root root 913 Apr 27 15:44 sys_iops_fp.dat-rw-r--r-- 1 root root 1731 Apr 27 15:44 sys_mbps_fp.dat</code></pre><p>只有写,即_w 与_c一致。查看_w文件sdc.offset_8,32_w.dat</p><pre><code>127.845942908 65054784 65055808127.846185883 65055808 65056832127.952831800 65056832 65057856127.953065986 65057856 65058880127.955207647 65058880 65059904</code></pre><p>第一行为时间,第二个为其实扇区大小,第3行为每个io结束扇区大小。可以算出每个io为1024个扇区大小,既1M 与测试实际相符合。</p><p><strong>3、4k direct=0</strong></p><pre><code>[root@node1 samba-test]# fio -filename=./4k_file_direct0 -direct=0 -iodepth=1 -thread -rw=write -ioengine=libaio -bs=4k -size=3G -numjobs=4 -times=300 -group_reporting -name=mytest2mytest2: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=1...fio-2.2.5Starting 4 threadsmytest2: Laying out IO file(s) (1 file(s) / 3072MB)</code></pre><ul><li>direct=0后 启动io buffer </li></ul><p>iostat监控<br><img src="https://i.imgur.com/ap1qitG.jpg" alt=""></p><p>blktrace监控数据</p><p>生成数据</p><pre><code>[root@node1 1m]# blkparse -i sdc -d sdc.blktrace.bin</code></pre><p><img src="https://i.imgur.com/wm94VIu.jpg" alt=""></p><p>4k对象大小,在io路径变成1024 既最大512kb字节,很明显进行IO合并,大部分io动作都是Q G I来回循环,即生成io请求到io请求队列,就返回继续申请新的io到队列,整体由内核kworker进行io任务的调度,在内存swapper中完成io写入。W表示异步写。</p><p>分析数据</p><pre><code>[root@node1 1m]# btt -i sdc.blktrace.bin -A | less</code></pre><p><img src="https://i.imgur.com/sVzF5Lq.jpg" alt=""></p><ul><li>Q2G – 生成IO请求所消耗的时间,包括remap和split的时间;<br>本例5.6ms</li><li>G2I – IO请求进入IO Scheduler所消耗的时间,包括merge的时间;<br>0.009ms 没有进行合并。</li><li>I2D – IO请求在IO Scheduler中等待的时间;<br>635 ms</li><li>D2C – IO请求在driver和硬件上所消耗的时间;<br>188 ms</li><li>Q2C – 整个IO请求所消耗的时间(Q2I + I2D + D2C = Q2C),相当于iostat的await。 为828.6ms与iostat wait值几乎一致。在io scheduler上消耗太多时间。在driver硬件层耗时相对少一些,但是也很大了达到188ms。<br>其中M2D 
io合并时间平均4.5s.</li></ul><p>生成io大小分布</p><pre><code>btt -i sdc.blktrace.bin -B sdc.offset93.463618386 74670904 74671928 93.463870465 74671928 74672952 93.464140190 74672952 74673976 93.464445965 74673976 74675000 93.464740898 74675000 74676024 93.465027976 74676024 74677048 93.465313376 74677048 74678072</code></pre><p>同样的是以1024个扇区为最小io。</p>]]></content>
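<p>补充把上文步骤串起来的一个常用采集分析流程(非原文内容,仅供参考,假设使用 blktrace 的 -w 参数指定采集时长):</p><pre><code># 对 /dev/sdc 采集 60 秒后自动停止,当前目录生成 sdc.blktrace.[cpu] 文件
blktrace -d /dev/sdc -w 60

# 合并为二进制文件供 btt 使用(解析出的文本这里直接丢弃)
blkparse -i sdc -d sdc.blktrace.bin > /dev/null

# 查看 Q2G/G2I/I2D/D2C/Q2C 等各阶段平均耗时
btt -i sdc.blktrace.bin | head -30
</code></pre>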
<summary type="html">
<p>利用BLKTRACE分析磁盘IO </p>
<p>在Linux系统上,查看磁盘的负载情况,咱们一般使用iostat监控工具,iostat的详细介绍查看另外的培训资料。其中很重要的参数就是await,await表示单个I/O所需的平均时间,但它同时包含了I/O Scheduler所消耗的时间和硬件所消耗的时间,所以不能作为硬件性能的指标。那如何才能分辨一个io从下发到返回整个时间上,是硬件层耗时多还是在io调度上耗时多呢?如何查看io在各个时间段所消耗的时间呢?那么,blktrace在这种场合就能派上用场,因为它能记录I/O所经历的各个步骤,从中可以分析是IO Scheduler慢还是硬件响应慢,以及各个时间段所用时间。<br>
</summary>
<category term="blktrace fio" scheme="http://idcat.cn/tags/blktrace-fio/"/>
</entry>
<entry>
<title>hadoop2.7.3通过s3对接ceph10.2 radosgw测试</title>
<link href="http://idcat.cn/hadoop2-7-3%E9%80%9A%E8%BF%87s3%E5%AF%B9%E6%8E%A5ceph10-2-radosgw%E6%B5%8B%E8%AF%95.html"/>
<id>http://idcat.cn/hadoop2-7-3通过s3对接ceph10-2-radosgw测试.html</id>
<published>2018-04-25T10:02:18.000Z</published>
<updated>2018-04-25T10:15:36.977Z</updated>
<content type="html"><![CDATA[<p>公司提出测试需求,将Hadoop2.7与ceph10.2 S3对象存储进行集成测试,hadoop官网介绍:<a href="http://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html" target="_blank" rel="noopener">官网介绍</a><br>后查阅相关资料完成对接测试,现将环境部署,对接测试完整过程,整理如下:<br><a id="more"></a><br><strong>hadoop环境</strong></p><p>2台主机 主机名分别为master slave . master作为hadoop namenode,slave作为datanode.<br>hadoop集群部署过程参考: <a href="http://www.178pt.com/156.html" target="_blank" rel="noopener">hadoop集群部署</a></p><p>ceph10.2 radosgw配置过程参考:<a href="http://www.178pt.com/250.html" target="_blank" rel="noopener">radosgw配置</a></p><p><strong>hadoop集成s3</strong></p><p>在master(namenode)节点上修改core-site.xml,增加如下配置(endpoint key根据实际填写):</p><pre><code><!-- Put site-specific property overrides in this file. --><configuration><property> <name>fs.defaultFS</name> <value>hdfs://master:9000</value></property><property> <name>io.file.buffer.size</name> <value>131072</value></property><property> <name>hadoop.tmp.dir</name> <value>file:/usr/hadoop/tmp</value> <description>Abase for other temporary directories.</description></property>##增加如下内容<property> <name>fs.s3a.access.key</name> <value>YZ8H5J5B4BS4HGJ6U8YC</value> <description>AWS access key ID. Omit for Role-based authentication.</description></property><property> <name>fs.s3a.secret.key</name> <value>KzPrV6ytwoZoQCMHzbnXXMQKrjH5MLnD3Wsb0AjJ</value> <description>AWS secret key</description></property><property> <name>fs.s3a.endpoint</name> <value>192.168.1.31:7480</value> <description>AWS S3 endpoint to connect to. An up-to-date list is provided in the AWS Documentation: regions and endpoints. Without this property, the standard region (s3.amazonaws.com) is assumed. </description></property><property> <name>fs.s3a.connection.ssl.enabled</name> <value>false</value> <description>Enables or disables SSL connections to S3.</description></property>##增加结束</configuration></code></pre><p>在master slave 2个hadoop节点上拷贝s3相关的jar包,否则会报错。</p><pre><code>[root@master etc]# pwd/usr/hadoop/hadoop-2.7.3/etc[root@master etc]# cp hadoop/share/hadoop/tools/lib/hadoop-aws-2.7.3.jar hadoop/share/hadoop/common/lib/[root@master etc]# cp hadoop/share/hadoop/tools/lib/aws-java-sdk-1.7.4.jar hadoop/share/hadoop/common/lib/[root@master etc]# cp hadoop/share/hadoop/tools/lib/joda-time-2.9.4.jar hadoop/share/hadoop/common/lib/[root@master etc]# cp hadoop/share/hadoop/tools/lib/jackson-*.jar hadoop/share/hadoop/common/lib/</code></pre><p>重启hadoop</p><pre><code>[root@master etc]# stop-all.sh[root@master etc]# start-all.sh</code></pre><p><strong>hadoop集成s3测试</strong></p><p>ceph 节点上创建桶hadoop,并上传文件</p><pre><code>[root@radosgw1 ~]# s3cmd mb s3://hadoopBucket 's3://hadoop/' created[root@radosgw1 ~]# s3cmd put abc s3://hadoopupload: 'abc' -> 's3://hadoop/abc' [1 of 1] 1109 of 1109 100% in 1s 1096.74 B/s done[root@radosgw1 ~]# s3cmd ls s3://hadoop2018-04-25 08:47 1109 s3://hadoop/abc</code></pre><p>hadoop master节点上查看</p><pre><code>[root@master ~]# hadoop fs -ls s3a://hadoop/Found 1 items-rw-rw-rw- 1 1109 2018-04-25 16:47 s3a://hadoop/abc</code></pre><p>1、 从hadoop client本机上传文件到对象存储</p><pre><code>[root@master ~]# ls ceshi.txt ceshi.txt[root@master ~]# hadoop fs -put ceshi.txt s3a://hadoop/[root@master ~]# hadoop fs -ls s3a://hadoop/Found 2 items-rw-rw-rw- 1 1109 2018-04-25 16:47 s3a://hadoop/abc-rw-rw-rw- 1 1083 2018-04-25 16:52 s3a://hadoop/ceshi.txt[root@master ~]#</code></pre><p>集群端查看</p><pre><code>[root@radosgw1 ~]# s3cmd ls s3://hadoop2018-04-25 08:47 1109 s3://hadoop/abc2018-04-25 08:52 1083 
s3://hadoop/ceshi.txt</code></pre><p>2、 将文件从对象存储下载到本地</p><pre><code> [root@master ~]# rm -f ceshi.txt [root@master ~]# ls ceshi.txtls: cannot access ceshi.txt: No such file or directory[root@master ~]# hadoop fs -get s3a://hadoop/ceshi.txt [root@master ~]# ls ceshi.txtceshi.txt[root@master ~]#</code></pre><p>3、 将文件从对象拷贝到hdfs文件系统</p><pre><code>[root@master ~]# hdfs dfs -ls /Found 4 itemsdrwxr-xr-x - root supergroup 0 2018-04-25 15:21 /hahadrwxr-xr-x - root supergroup 0 2018-04-25 12:10 /inputdrwxr-xr-x - root supergroup 0 2018-04-25 12:11 /outputdrwx------ - root supergroup 0 2018-04-25 12:11 /tmp[root@master ~]# hdfs dfs -ls /ceshi.txtls: `/ceshi.txt': No such file or directory[root@master ~]# hadoop distcp s3a://hadoop/ceshi.txt /ceshi.txt18/04/25 17:00:10 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[s3a://hadoop/ceshi.txt], targetPath=/ceshi.txt, targetPathExists=false, preserveRawXattrs=false}18/04/25 17:00:10 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.20:803218/04/25 17:00:30 INFO Configuration.deprecation: io.sort.mb is deprecated. Instead, use mapreduce.task.io.sort.mb18/04/25 17:00:30 INFO Configuration.deprecation: io.sort.factor is deprecated. Instead, use mapreduce.task.io.sort.factor18/04/25 17:00:31 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.20:803218/04/25 17:00:31 INFO mapreduce.JobSubmitter: number of splits:118/04/25 17:00:32 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1524633996089_000918/04/25 17:00:32 INFO impl.YarnClientImpl: Submitted application application_1524633996089_000918/04/25 17:00:32 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1524633996089_0009/18/04/25 17:00:32 INFO tools.DistCp: DistCp job-id: job_1524633996089_000918/04/25 17:00:32 INFO mapreduce.Job: Running job: job_1524633996089_000918/04/25 17:00:40 INFO mapreduce.Job: Job job_1524633996089_0009 running in uber mode : false18/04/25 17:00:40 INFO mapreduce.Job: map 0% reduce 0%18/04/25 17:00:52 INFO mapreduce.Job: map 100% reduce 0%18/04/25 17:01:05 INFO mapreduce.Job: Job job_1524633996089_0009 completed successfully18/04/25 17:01:05 INFO mapreduce.Job: Counters: 38File System Counters FILE: Number of bytes read=0 FILE: Number of bytes written=121596 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 HDFS: Number of bytes read=330 HDFS: Number of bytes written=1083 HDFS: Number of read operations=14 HDFS: Number of large read operations=0 HDFS: Number of write operations=4 S3A: Number of bytes read=1083 S3A: Number of bytes written=0 S3A: Number of read operations=3 S3A: Number of large read operations=0 S3A: Number of write operations=0Job Counters Launched map tasks=1 Other local map tasks=1 Total time spent by all maps in occupied slots (ms)=20780 Total time spent by all reduces in occupied slots (ms)=0 Total time spent by all map tasks (ms)=20780 Total vcore-milliseconds taken by all map tasks=20780 Total megabyte-milliseconds taken by all map tasks=21278720Map-Reduce Framework Map input records=1 Map output records=0 Input split bytes=135 Spilled Records=0 Failed Shuffles=0 Merged Map outputs=0 GC time elapsed (ms)=154 CPU time spent (ms)=1350 Physical memory (bytes) snapshot=113676288 Virtual memory (bytes) snapshot=862224384 Total committed 
heap usage (bytes)=29032448File Input Format Counters Bytes Read=195File Output Format Counters Bytes Written=0org.apache.hadoop.tools.mapred.CopyMapper$Counter BYTESCOPIED=1083 BYTESEXPECTED=1083 COPY=1[root@master ~]# hdfs dfs -ls /ceshi.txt-rw-r--r-- 1 root supergroup 1083 2018-04-25 17:00 /ceshi.txt[root@master ~]# </code></pre><p>4、 将文件从HDFS文件系统拷贝到s3对象存储中</p><pre><code>s3对象列出所有文件[root@radosgw1 ~]# s3cmd ls s3://hadoop2018-04-25 08:47 1109 s3://hadoop/abc2018-04-25 08:52 1083 s3://hadoop/ceshi.txt[root@radosgw1 ~]# 将hdfs文件系统下的/haha目录中anaconda-ks.cfg文件传到s3对象存储里面[root@master ~]# hdfs dfs -ls /hahaFound 1 items-rw-r--r-- 1 root supergroup 1083 2018-04-25 15:21 /haha/anaconda-ks.cfg[root@master ~]# hadoop distcp /haha/anaconda-ks.cfg s3a://hadoop/18/04/25 17:06:18 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[/haha/anaconda-ks.cfg], targetPath=s3a://hadoop/, targetPathExists=true, preserveRawXattrs=false}18/04/25 17:06:18 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.20:803218/04/25 17:06:24 INFO Configuration.deprecation: io.sort.mb is deprecated. Instead, use mapreduce.task.io.sort.mb18/04/25 17:06:24 INFO Configuration.deprecation: io.sort.factor is deprecated. Instead, use mapreduce.task.io.sort.factor18/04/25 17:06:25 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.20:803218/04/25 17:06:26 INFO mapreduce.JobSubmitter: number of splits:118/04/25 17:06:26 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1524633996089_001018/04/25 17:06:26 INFO impl.YarnClientImpl: Submitted application application_1524633996089_001018/04/25 17:06:26 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1524633996089_0010/18/04/25 17:06:26 INFO tools.DistCp: DistCp job-id: job_1524633996089_001018/04/25 17:06:26 INFO mapreduce.Job: Running job: job_1524633996089_001018/04/25 17:06:35 INFO mapreduce.Job: Job job_1524633996089_0010 running in uber mode : false18/04/25 17:06:35 INFO mapreduce.Job: map 0% reduce 0%18/04/25 17:06:57 INFO mapreduce.Job: map 100% reduce 0%18/04/25 17:08:14 INFO mapreduce.Job: Job job_1524633996089_0010 completed successfully18/04/25 17:08:14 INFO mapreduce.Job: Counters: 38File System Counters FILE: Number of bytes read=0 FILE: Number of bytes written=121562 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 HDFS: Number of bytes read=1459 HDFS: Number of bytes written=0 HDFS: Number of read operations=10 HDFS: Number of large read operations=0 HDFS: Number of write operations=2 S3A: Number of bytes read=0 S3A: Number of bytes written=1083 S3A: Number of read operations=11 S3A: Number of large read operations=0 S3A: Number of write operations=3Job Counters Launched map tasks=1 Other local map tasks=1 Total time spent by all maps in occupied slots (ms)=86489 Total time spent by all reduces in occupied slots (ms)=0 Total time spent by all map tasks (ms)=86489 Total vcore-milliseconds taken by all map tasks=86489 Total megabyte-milliseconds taken by all map tasks=88564736Map-Reduce Framework Map input records=1 Map output records=0 Input split bytes=134 Spilled Records=0 Failed Shuffles=0 Merged Map outputs=0 GC time elapsed (ms)=151 CPU time spent (ms)=1760 Physical memory (bytes) snapshot=116514816 Virtual memory (bytes) snapshot=863125504 Total 
committed heap usage (bytes)=29032448File Input Format Counters Bytes Read=242File Output Format Counters Bytes Written=0org.apache.hadoop.tools.mapred.CopyMapper$Counter BYTESCOPIED=1083 BYTESEXPECTED=1083 COPY=1[root@master ~]# s3集群端验证[root@radosgw1 ~]# s3cmd ls s3://hadoop2018-04-25 08:47 1109 s3://hadoop/abc2018-04-25 09:08 1083 s3://hadoop/anaconda-ks.cfg2018-04-25 08:52 1083 s3://hadoop/ceshi.txt[root@radosgw1 ~]# </code></pre><p>5、 将对象存储中的文件作为mapreduce的输入,进行计算之后将结果输出到hdfs文件系统中。</p><pre><code>将对象存储中的/hadoop/abc文件作为mapreduce的文件输入,计算结果输出到hdfs的/result目录 [root@master ~]# hadoop fs -ls s3a://hadoop/Found 3 items-rw-rw-rw- 1 1109 2018-04-25 16:47 s3a://hadoop/abc-rw-rw-rw- 1 1083 2018-04-25 17:08 s3a://hadoop/anaconda-ks.cfg-rw-rw-rw- 1 1083 2018-04-25 16:52 s3a://hadoop/ceshi.txt[root@master ~]# hdfs dfs -ls /resultls: `/result': No such file or directory[root@master ~]#</code></pre><p>当前hdfs是没有/result目录的,下面进行计算操作</p><pre><code>[root@master ~]# hadoop jar /usr/hadoop/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount s3a://hadoop/abc /result18/04/25 17:19:53 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.20:803218/04/25 17:19:55 INFO input.FileInputFormat: Total input paths to process : 118/04/25 17:19:56 INFO mapreduce.JobSubmitter: number of splits:118/04/25 17:19:56 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1524633996089_001118/04/25 17:19:57 INFO impl.YarnClientImpl: Submitted application application_1524633996089_001118/04/25 17:19:57 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1524633996089_0011/18/04/25 17:19:57 INFO mapreduce.Job: Running job: job_1524633996089_001118/04/25 17:20:06 INFO mapreduce.Job: Job job_1524633996089_0011 running in uber mode : false18/04/25 17:20:06 INFO mapreduce.Job: map 0% reduce 0%18/04/25 17:20:23 INFO mapreduce.Job: map 100% reduce 0%18/04/25 17:20:31 INFO mapreduce.Job: map 100% reduce 100%18/04/25 17:20:32 INFO mapreduce.Job: Job job_1524633996089_0011 completed successfully18/04/25 17:20:32 INFO mapreduce.Job: Counters: 54File System Counters FILE: Number of bytes read=1442 FILE: Number of bytes written=240937 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 HDFS: Number of bytes read=81 HDFS: Number of bytes written=1121 HDFS: Number of read operations=5 HDFS: Number of large read operations=0 HDFS: Number of write operations=2 S3A: Number of bytes read=1109 S3A: Number of bytes written=0 S3A: Number of read operations=1 S3A: Number of large read operations=0 S3A: Number of write operations=0Job Counters Launched map tasks=1 Launched reduce tasks=1 Rack-local map tasks=1 Total time spent by all maps in occupied slots (ms)=14541 Total time spent by all reduces in occupied slots (ms)=5450 Total time spent by all map tasks (ms)=14541 Total time spent by all reduce tasks (ms)=5450 Total vcore-milliseconds taken by all map tasks=14541 Total vcore-milliseconds taken by all reduce tasks=5450 Total megabyte-milliseconds taken by all map tasks=14889984 Total megabyte-milliseconds taken by all reduce tasks=5580800Map-Reduce Framework Map input records=43 Map output records=104 Map output bytes=1517 Map output materialized bytes=1442 Input split bytes=81 Combine input records=104 Combine output records=79 Reduce input groups=79 Reduce shuffle bytes=1442 Reduce input records=79 Reduce output records=79 Spilled Records=158 Shuffled Maps =1 Failed Shuffles=0 Merged Map outputs=1 GC time 
elapsed (ms)=230 CPU time spent (ms)=2230 Physical memory (bytes) snapshot=324866048 Virtual memory (bytes) snapshot=1723260928 Total committed heap usage (bytes)=162926592Shuffle Errors BAD_ID=0 CONNECTION=0 IO_ERROR=0 WRONG_LENGTH=0 WRONG_MAP=0 WRONG_REDUCE=0File Input Format Counters Bytes Read=1109File Output Format Counters Bytes Written=1121[root@master ~]#</code></pre><p>计算成功后,查看hdfs目录,下面可以看到目录存在,且计算结果文件也存在,且返回SUCCESS</p><pre><code>[root@master ~]# hdfs dfs -ls /resultFound 2 items-rw-r--r-- 1 root supergroup 0 2018-04-25 17:20 /result/_SUCCESS-rw-r--r-- 1 root supergroup 1121 2018-04-25 17:20 /result/part-r-00000[root@master ~]# </code></pre><p>6、 将对象存储中的文件作为mapreduce的输入,进行计算之后将结果输出到对象存储桶中。</p><pre><code>首先查看对象存储桶中hadoop下result目录是否存在。待会输出结果会传到这里。[root@master ~]# hadoop fs -ls s3a://hadoop/resultls: `s3a://hadoop/result': No such file or directory[root@master ~]# hadoop jar /usr/hadoop/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount s3a://hadoop/abc s3a://hadoop/result18/04/25 17:25:27 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.20:803218/04/25 17:25:39 INFO input.FileInputFormat: Total input paths to process : 118/04/25 17:25:40 INFO mapreduce.JobSubmitter: number of splits:118/04/25 17:25:41 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1524633996089_001218/04/25 17:25:41 INFO impl.YarnClientImpl: Submitted application application_1524633996089_001218/04/25 17:25:41 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1524633996089_0012/18/04/25 17:25:41 INFO mapreduce.Job: Running job: job_1524633996089_001218/04/25 17:25:53 INFO mapreduce.Job: Job job_1524633996089_0012 running in uber mode : false18/04/25 17:25:53 INFO mapreduce.Job: map 0% reduce 0%18/04/25 17:26:57 INFO mapreduce.Job: map 100% reduce 0%18/04/25 17:27:18 INFO mapreduce.Job: map 100% reduce 67%18/04/25 17:27:27 INFO mapreduce.Job: map 100% reduce 100%18/04/25 17:32:44 INFO mapreduce.Job: Job job_1524633996089_0012 completed successfully18/04/25 17:32:44 INFO mapreduce.Job: Counters: 54File System Counters FILE: Number of bytes read=1442 FILE: Number of bytes written=240925 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 HDFS: Number of bytes read=81 HDFS: Number of bytes written=0 HDFS: Number of read operations=1 HDFS: Number of large read operations=0 HDFS: Number of write operations=0 S3A: Number of bytes read=1109 S3A: Number of bytes written=1121 S3A: Number of read operations=19 S3A: Number of large read operations=0 S3A: Number of write operations=5Job Counters Launched map tasks=1 Launched reduce tasks=1 Rack-local map tasks=1 Total time spent by all maps in occupied slots (ms)=22928 Total time spent by all reduces in occupied slots (ms)=198775 Total time spent by all map tasks (ms)=22928 Total time spent by all reduce tasks (ms)=198775 Total vcore-milliseconds taken by all map tasks=22928 Total vcore-milliseconds taken by all reduce tasks=198775 Total megabyte-milliseconds taken by all map tasks=23478272 Total megabyte-milliseconds taken by all reduce tasks=203545600Map-Reduce Framework Map input records=43 Map output records=104 Map output bytes=1517 Map output materialized bytes=1442 Input split bytes=81 Combine input records=104 Combine output records=79 Reduce input groups=79 Reduce shuffle bytes=1442 Reduce input records=79 Reduce output records=79 Spilled Records=158 Shuffled Maps =1 Failed Shuffles=0 Merged Map outputs=1 GC time 
elapsed (ms)=256 CPU time spent (ms)=1550 Physical memory (bytes) snapshot=336670720 Virtual memory (bytes) snapshot=1724592128 Total committed heap usage (bytes)=162926592Shuffle Errors BAD_ID=0 CONNECTION=0 IO_ERROR=0 WRONG_LENGTH=0 WRONG_MAP=0 WRONG_REDUCE=0File Input Format Counters Bytes Read=1109File Output Format Counters Bytes Written=1121[root@master ~]# </code></pre><p>成功后,验证查看</p><pre><code>hadoop节点验证[root@master ~]# hadoop fs -ls s3a://hadoop/resultFound 2 items-rw-rw-rw- 1 0 2018-04-25 17:33 s3a://hadoop/result/_SUCCESS-rw-rw-rw- 1 1121 2018-04-25 17:32 s3a://hadoop/result/part-r-00000[root@master ~]# ceph集群节点验证[root@radosgw1 ~]# s3cmd ls s3://hadoop/result/2018-04-25 09:33 0 s3://hadoop/result/_SUCCESS2018-04-25 09:32 1121 s3://hadoop/result/part-r-00000</code></pre><p>7、 将 HDFS 中的文件作为 MapReduce 的输入,计算结果输出到对象存储的存储空间中</p><pre><code>下面将hdfs中ceshi.txt作为计算输入,将结果输出对象存储中hadoop/output目录中。 前期查看[root@master ~]# hdfs dfs -ls /Found 6 items-rw-r--r-- 1 root supergroup 1083 2018-04-25 17:00 /ceshi.txt[root@master ~]# hadoop fs -ls s3a://hadoop/outputls: `s3a://hadoop/output': No such file or directory开始计算并输出[root@master ~]# hadoop jar /usr/hadoop/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /ceshi.txt s3a://hadoop/output18/04/25 17:39:55 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.20:803218/04/25 17:40:04 INFO input.FileInputFormat: Total input paths to process : 118/04/25 17:40:05 INFO mapreduce.JobSubmitter: number of splits:118/04/25 17:40:05 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1524633996089_001318/04/25 17:40:06 INFO impl.YarnClientImpl: Submitted application application_1524633996089_001318/04/25 17:40:06 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1524633996089_0013/18/04/25 17:40:06 INFO mapreduce.Job: Running job: job_1524633996089_001318/04/25 17:40:19 INFO mapreduce.Job: Job job_1524633996089_0013 running in uber mode : false18/04/25 17:40:19 INFO mapreduce.Job: map 0% reduce 0%18/04/25 17:41:16 INFO mapreduce.Job: map 100% reduce 0%18/04/25 17:41:36 INFO mapreduce.Job: map 100% reduce 67%18/04/25 17:41:45 INFO mapreduce.Job: map 100% reduce 100%18/04/25 17:46:38 INFO mapreduce.Job: Job job_1524633996089_0013 completed successfully18/04/25 17:46:38 INFO mapreduce.Job: Counters: 54File System Counters FILE: Number of bytes read=1404 FILE: Number of bytes written=240873 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 HDFS: Number of bytes read=1176 HDFS: Number of bytes written=0 HDFS: Number of read operations=2 HDFS: Number of large read operations=0 HDFS: Number of write operations=0 S3A: Number of bytes read=0 S3A: Number of bytes written=1091 S3A: Number of read operations=18 S3A: Number of large read operations=0 S3A: Number of write operations=5Job Counters Launched map tasks=1 Launched reduce tasks=1 Data-local map tasks=1 Total time spent by all maps in occupied slots (ms)=17302 Total time spent by all reduces in occupied slots (ms)=173487 Total time spent by all map tasks (ms)=17302 Total time spent by all reduce tasks (ms)=173487 Total vcore-milliseconds taken by all map tasks=17302 Total vcore-milliseconds taken by all reduce tasks=173487 Total megabyte-milliseconds taken by all map tasks=17717248 Total megabyte-milliseconds taken by all reduce tasks=177650688Map-Reduce Framework Map input records=41 Map output records=102 Map output bytes=1483 Map output materialized bytes=1404 
Input split bytes=93 Combine input records=102 Combine output records=77 Reduce input groups=77 Reduce shuffle bytes=1404 Reduce input records=77 Reduce output records=77 Spilled Records=154 Shuffled Maps =1 Failed Shuffles=0 Merged Map outputs=1 GC time elapsed (ms)=261 CPU time spent (ms)=1570 Physical memory (bytes) snapshot=325062656 Virtual memory (bytes) snapshot=1724448768 Total committed heap usage (bytes)=162926592Shuffle Errors BAD_ID=0 CONNECTION=0 IO_ERROR=0 WRONG_LENGTH=0 WRONG_MAP=0 WRONG_REDUCE=0File Input Format Counters Bytes Read=1083File Output Format Counters Bytes Written=1091</code></pre><p>在hadoop节点和集群节点验证</p><pre><code>[root@master ~]# hadoop fs -ls s3a://hadoop/outputFound 2 items-rw-rw-rw- 1 0 2018-04-25 17:47 s3a://hadoop/output/_SUCCESS-rw-rw-rw- 1 1091 2018-04-25 17:46 s3a://hadoop/output/part-r-00000[root@master ~]# [root@radosgw1 ~]# s3cmd ls s3://hadoop/output/2018-04-25 09:47 0 s3://hadoop/output/_SUCCESS2018-04-25 09:46 1091 s3://hadoop/output/part-r-00000[root@radosgw1 ~]# </code></pre><p>可以看到集群端和hadoop节点端都能看到。</p><p>至此,配置测试结束。</p>]]></content>
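<p>补充一个小技巧(非原文步骤,仅供参考):如果只是想临时验证 s3a 连通性,也可以不改 core-site.xml,直接在命令行通过 -D 传入上文配置的 endpoint 和 key(生产环境仍建议写入 core-site.xml,避免密钥出现在命令历史中):</p><pre><code>hadoop fs \
  -D fs.s3a.endpoint=192.168.1.31:7480 \
  -D fs.s3a.connection.ssl.enabled=false \
  -D fs.s3a.access.key=YZ8H5J5B4BS4HGJ6U8YC \
  -D fs.s3a.secret.key=KzPrV6ytwoZoQCMHzbnXXMQKrjH5MLnD3Wsb0AjJ \
  -ls s3a://hadoop/
</code></pre>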
<summary type="html">
<p>公司提出测试需求,将Hadoop2.7与ceph10.2 S3对象存储进行集成测试,hadoop官网介绍:<a href="http://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html" target="_blank" rel="noopener">官网介绍</a><br>后查阅相关资料完成对接测试,现将环境部署,对接测试完整过程,整理如下:<br>
</summary>
<category term="ceph10.2 radosgw hadoop" scheme="http://idcat.cn/tags/ceph10-2-radosgw-hadoop/"/>
</entry>
<entry>
<title>nginx反向代理和负载均衡最基础配置实现</title>
<link href="http://idcat.cn/nginx%E5%8F%8D%E5%90%91%E4%BB%A3%E7%90%86%E5%92%8C%E8%B4%9F%E8%BD%BD%E5%9D%87%E8%A1%A1%E6%9C%80%E5%9F%BA%E7%A1%80%E9%85%8D%E7%BD%AE%E5%AE%9E%E7%8E%B0.html"/>
<id>http://idcat.cn/nginx反向代理和负载均衡最基础配置实现.html</id>
<published>2018-04-24T09:39:44.000Z</published>
<updated>2018-04-24T09:40:56.474Z</updated>
<content type="html"><![CDATA[<p>本文仅在最基础的nginx环境下做反向代理和负载均衡的大概配置介绍,具体调优以及其他功能暂不考虑。仅仅展现这2个功能的基本实现配置,本文记录仅供参考。</p><p><strong>环境:</strong></p><p>单台centos7的虚拟机,nginx安装使用yum进行安装。默认nginx配置文件在/etc/nginx/ 下 nginx.conf文件。<br><a id="more"></a></p><p><strong>默认配置</strong></p><p>备份默认配置文件</p><pre><code>[root@node1 ~]# cp /etc/nginx/nginx.conf{,_bak}</code></pre><p>清空注释内容,使配置文件更加清晰</p><pre><code>[root@node1 ~]# sed -i '/^#/d' /etc/nginx/nginx.conf</code></pre><p>查看默认配置</p><pre><code>[root@node1 ~]# cat /etc/nginx/nginx.conf_bak |grep -v ^# | grep -v ^$user nginx;worker_processes auto;error_log /var/log/nginx/error.log;pid /run/nginx.pid;include /usr/share/nginx/modules/*.conf;events { worker_connections 1024;}http { log_format main '$remote_addr - $remote_user [$time_local] "$request" ' '$status $body_bytes_sent "$http_referer" ' '"$http_user_agent" "$http_x_forwarded_for"'; access_log /var/log/nginx/access.log main; sendfile on; tcp_nopush on; tcp_nodelay on; keepalive_timeout 65; types_hash_max_size 2048; include /etc/nginx/mime.types; default_type application/octet-stream; include /etc/nginx/conf.d/*.conf; server { listen 80 default_server; listen [::]:80 default_server; server_name _; root /usr/share/nginx/html; include /etc/nginx/default.d/*.conf; location / { } error_page 404 /404.html; location = /40x.html { } error_page 500 502 503 504 /50x.html; location = /50x.html { } }}</code></pre><p>简单介绍:</p><p>http模块下面包括server 等</p><ul><li><p>listen server监听端口</p></li><li><p>server_name 通过什么域名进行访问。本例为空即本机IP</p></li><li><p>root 定义服务器网站根目录默认位置</p></li><li><p>location 通用匹配,任何未匹配到其它location的请求都会匹配到这里</p></li></ul><p>修改默认配置下主页内容</p><pre><code>[root@node1 nginx]# cat /usr/share/nginx/html/index.html this is example</code></pre><p>访问测试</p><pre><code>[root@node1 ~]# curl node1this is example</code></pre><p><strong>反向代理</strong></p><p>例如客户端A访问服务端B ,然而B实际上访问了服务端C,然后返回结果给B,在返回结果给A。至始至终,客户端A是不知道C的存在的,只会认为访问B并得到返回。这样可以很好的保护了实际WEB服务端C。</p><p>修改默认配置文件nginx.conf </p><pre><code>server { listen 80 default_server; listen [::]:80 default_server; server_name -; root /usr/share/nginx/html; include /etc/nginx/default.d/*.conf; location /{ proxy_pass http://node1:8081; }} server { listen 8081 default_server; root /usr/share/nginx/html_1; server_name -; index index.html; location / { } }}</code></pre><p>在原有server下面增加了一个server,即新的虚拟机,后台默认端口为8081,默认root为html_ 1,该目录是从默认html 拷贝过来的,修改默认index.html内容为8081。然后在默认80端口的虚拟机上增加proxy_pass <a href="http://node1:8081" target="_blank" rel="noopener">http://node1:8081</a>; 表示客户端访问80端口的时候,其实指向了另外的虚拟机的8081端口上。修改后重载nginx。</p><pre><code>[root@node1 ~]# systemctl reload nginx[root@node1 ~]# cat /usr/share/nginx/html_1/index.html 8081</code></pre><p>访问测试</p><pre><code>[root@node1 html_1]# curl node18081</code></pre><p>已经配置成功了。本来访问80端口的下index.html应该是this is example ,现在返回的是8081虚拟机的index.html内容。</p><p><strong>负载均衡</strong></p><p>单台web服务器受限于其资源限制,提供对外访问连接毕竟有限。增加多台nginx后端web服务器同时提供访问,可增加并发性能,以及提高可拓展性。nginx负载均衡支持客户端访问请求根据策略分摊到后端实际的某台web服务器上。</p><p>修改默认配置文件nginx.conf </p><pre><code>##在http下面增加upstream配置 upstream my_http { #默认轮训策略,还有ip_hash ,weight,最小连接数等算法。 server 192.168.1.141:8081; server 192.168.1.141:8082; server 192.168.1.141:8083; }server { listen 80 default_server; listen [::]:80 default_server; server_name -; root /usr/share/nginx/html; include /etc/nginx/default.d/*.conf; location /{ proxy_pass http://my_http; }} ##本机起3个虚拟机,分别使用不同的端口号,模拟多台server服务端。root定义网站路径,index指定默认网页 server { listen 8081 ; root /usr/share/nginx/html_1; server_name -; index index.html; location / { } } server { listen 8082; root 
/usr/share/nginx/html_2; server_name -; index index.html; location / { } } server { listen 8083; root /usr/share/nginx/html_3; server_name -; index index.html; location / { } }</code></pre><p>在原有server下面增加了一个3个server,即3台新的虚拟机,后台默认端口分别为8081、8082、8083,默认root为html_ 1,html_ 2,html_ 3。这些目录是从默认html 拷贝过来的,分别修改默认index.html内容为8081,8082,8083 然后在默认80端口的虚拟机上增加proxy_ pass <a href="http://my_http" target="_blank" rel="noopener">http://my_http</a>; 即http:// + upstream 名称 ; 表示客户端每次访问80端口的时候,其实指向了另外3台虚拟机。修改后重载nginx。</p><pre><code>[root@node1 ~]# cat /usr/share/nginx/html_1/index.html 8081[root@node1 ~]# cat /usr/share/nginx/html_2/index.html 8082[root@node1 ~]# cat /usr/share/nginx/html_3/index.html 8083[root@node1 ~]# systemctl reload nginx</code></pre><p>访问测试</p><pre><code>[root@node1 ~]# curl node18081[root@node1 ~]# curl node18082[root@node1 ~]# curl node18083</code></pre><p>已经配置成功了。本来访问80端口的下index.html应该是this is example ,现在客户端每次访问80端口,依次返回的是后端3台虚拟机的index.html内容。</p><p>记录到此。</p>]]></content>
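<p>To confirm from the shell that the round-robin policy really spreads requests across the three backends, a short loop like the sketch below can be used. This is only an illustrative snippet, not part of the original notes: it assumes the upstream configuration above is loaded and that each backend's index.html contains its own port number, as set up earlier.</p><pre><code>#!/usr/bin/env bash
# Fire 30 requests at the front-end on node1 and count which backend
# answered each one (the backends return "8081", "8082" or "8083").
for i in $(seq 1 30); do
    printf '%s\n' "$(curl -s http://node1/)"
done | sort | uniq -c
# With the default round-robin policy each backend should show up
# roughly 10 times; switching the upstream to ip_hash would instead
# pin this client to a single backend.</code></pre>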
<summary type="html">
<p>本文仅在最基础的nginx环境下做反向代理和负载均衡的大概配置介绍,具体调优以及其他功能暂不考虑。仅仅展现这2个功能的基本实现配置,本文记录仅供参考。</p>
<p><strong>环境:</strong></p>
<p>单台centos7的虚拟机,nginx安装使用yum进行安装。默认nginx配置文件在/etc/nginx/ 下 nginx.conf文件。<br>
</summary>
<category term="nginx" scheme="http://idcat.cn/tags/nginx/"/>
</entry>
<entry>
<title>tdb文件简介</title>
<link href="http://idcat.cn/tdb%E6%96%87%E4%BB%B6%E7%AE%80%E4%BB%8B.html"/>
<id>http://idcat.cn/tdb文件简介.html</id>
<published>2018-04-23T09:50:57.000Z</published>
<updated>2018-04-23T09:53:02.997Z</updated>
<content type="html"><![CDATA[<p><strong>TDB文件介绍</strong></p><p>samba在运行时,Samba 存储许多信息,从本地密码到希望从中收到信息的一系列客户端。这类数据其中一些是暂时的,在 Samba 重启时可能会被丢弃,但是另一些却是永久的,不会被丢弃。这类数据可能是很大的,也可能是不经常访问只是在内存中保留,或者在重启时保持存在。要满足这些要求,Samba 团队创建了 Trivial Database。它实际上是一个键值存储,这意味着数据通过惟一键的方式存储和检索,且没有像在关系数据库中那样的表联接。键值存储 — 尤其是 TDB — 被设计成将数据存储到磁盘并将其取回的一种快速方式。<br><a id="more"></a></p><p>查看samba下tdb文件,只列出/var/lib/samba下面的,还有很多其他目录存在samba tdb文件。</p><pre><code>[root@node1 samba]# cd /var/lib/samba/[root@node1 samba]# lltotal 2236-rw------- 1 root root 421888 Apr 23 15:10 account_policy.tdbdrwxr-xr-x 1 root root 0 Nov 28 00:21 drivers-rw-r--r-- 1 root root 425984 Apr 23 15:10 gencache.tdb-rw------- 1 root root 696 Apr 23 15:10 group_mapping.tdbdrwxr-xr-x 1 root root 456 Apr 23 15:10 lockdrwxr-xr-x 1 root root 0 Apr 23 15:10 printingdrwx------ 1 root root 86 Apr 23 17:21 private-rw------- 1 root root 528384 Apr 23 15:10 registry.tdb-rw------- 1 root root 421888 Apr 23 15:10 share_info.tdb-rw-r--r-- 1 root root 483328 Apr 23 17:43 smbprofile.tdbdrwxr-x--- 1 root wbpriv 0 Nov 28 00:21 winbindd_privileged</code></pre><p>至于这些tdb文件如何查看数据,以及修改备份,下面介绍samba自带的几个tdb工具。</p><p><strong>tdbtool工具介绍</strong></p><p>tdbtool工具可以在命令行上接受命令,也可以打开交互式控制台类似shell一样。要在命令行上完成任务,请运行 tdbtool example.tdb command options,其中 example.tdb 是文件名,command 是命令,针对命令的选项位于最后。要使用 tdb shell,只需单独运行 tdbtool 或在命令行上传递文件的名称。个人建议使用交互式控制台方式。以下是tdbtool参数介绍</p><pre><code>tdbtool: create dbname : create a database open dbname : open an existing database transaction_start : start a transaction transaction_commit : commit a transaction transaction_cancel : cancel a transaction erase : erase the database dump : dump the database as strings keys : dump the database keys as strings hexkeys : dump the database keys as hex values info : print summary info about the database insert key data : insert a record move key file : move a record to a destination tdb storehex key data : store a record (replace), key/value in hex format store key data : store a record (replace) show key : show a record by key delete key : delete a record by key list : print the database hash table and freelist free : print the database freelist freelist_size : print the number of records in the freelist check : check the integrity of an opened database repack : repack the database speed : perform speed tests on the database ! 
command : execute system command 1 | first : print the first record n | next : print the next record q | quit : terminate \n : repeat 'next' command</code></pre><p>下面分别介绍:</p><p>1、创建数据库</p><pre><code>[root@node1 tdbtest]# tdbtool tdb> create hello[root@node1 tdbtest]# lltotal 4-rw------- 1 root root 696 Apr 23 15:53 hello</code></pre><p>2、打开数据库</p><pre><code>tdb> open hello</code></pre><p>3、插入数据</p><pre><code>tdb> insert name zhangsan</code></pre><p>4、查询数据</p><pre><code>tdb> show namekey 4 bytesnamedata 8 bytes[000] 7A 68 61 6E 67 73 61 6E zhangsan </code></pre><p>5、查看所有数据</p><pre><code>tdb> dumpkey 5 bytesname1data 4 bytes[000] 6C 69 73 69 lisi key 4 bytesnamedata 8 bytes[000] 7A 68 61 6E 67 73 61 6E zhangsan </code></pre><p>总共2条KEY/VALUES键值对,既2条数据信息。</p><p>6、列出key值</p><pre><code>tdb> keyskey 5 bytes: name1key 4 bytes: name</code></pre><p>7、修改values值</p><pre><code>tdb> store name zhangStoring key:key 4 bytesnamedata 5 bytes[000] 7A 68 61 6E 67 zhang </code></pre><p>将name值由zhangsan 修改为zhang,查看修改结果</p><pre><code>tdb> dump key 5 bytesname1data 4 bytes[000] 77 61 6E 67 77 75 lisi key 4 bytesnamedata 5 bytes[000] 7A 68 61 6E 67 zhang </code></pre><p>8、删除某个key值</p><pre><code>tdb> delete nametdb> dumpkey 5 bytesname1data 4 bytes[000] 6C 69 73 69 lisi </code></pre><p>将key值为name的删掉后,查看只剩下name1记录。</p><p>9、检查数据完整性</p><pre><code>tdb> checkDatabase integrity is OK and has 2 records.</code></pre><p>10、复制数据到另外的数据库(后者数据库必须存在)</p><pre><code>tdb> move name2 hello1key 5 bytesname2 data 6 bytes[000] 77 61 6E 67 77 75 wangwu record moved</code></pre><p>查看hello1记录</p><pre><code>tdb> open hello1tdb> dumpkey 5 bytesname2data 6 bytes[000] 77 61 6E 67 77 75 wangwu </code></pre><p>11、执行系统命令</p><pre><code>tdb> ! pwd/root/tdbtesttdb> ! dateMon Apr 23 16:36:18 CST 2018</code></pre><p>12、支持事务处理</p><p>开启事务</p><pre><code>tdb> transaction_starttdb> insert name3 testtdb> show name3key 5 bytesname3data 4 bytes[000] 74 65 73 74 test </code></pre><p>取消事务</p><pre><code>tdb> transaction_canceltdb> show name3fetch failed</code></pre><p>提交事务</p><pre><code>tdb> transaction_starttdb> insert name3 testtdb> transaction_committdb> show name3key 5 bytesname3data 4 bytes[000] 74 65 73 74 test</code></pre><p><strong>tdbdump 工具介绍</strong></p><p>tdbdump是用来查看tdb文件中的所有键值对数据的工具</p><p>已hello为例, 查看所有数据</p><pre><code>[root@node1 tdbtest]# tdbdump hello{key(5) = "name1"data(4) = "lisi"}{key(5) = "name2"data(6) = "wangwu"}{key(5) = "name3"data(4) = "test"}</code></pre><p>每个键值对数据key data 数字为字节数</p><p><strong>tdbbackup 工具介绍</strong></p><p>tdbbackup工具为tdb数据库文件的备份工具。</p><ul><li><p>备份hello数据库</p><p> [root@node1 tdbtest]# tdbbackup hello<br> [root@node1 tdbtest]# ll<br> total 828<br> -rw——- 1 root root 831488 Apr 23 16:42 hello<br> -rw——- 1 root root 8192 Apr 23 16:38 hello1<br> -rw——- 1 root root 8192 Apr 23 17:25 hello.bak<br>hello.bak就是备份文件。这里发现两者文件大小不一样,通过md5对比。因为是不同的文件,文件MD5值肯定是不一样的,但是文件内容是完全一样的。</p></li></ul><p>查看文件md5</p><pre><code>[root@node1 tdbtest]# md5sum hello8c55e7dabbeab30e3cd96e96b59fb052 hello[root@node1 tdbtest]# md5sum hello.bak c20b4f9b01f5715bbec8f950cf394f51 hello.bak</code></pre><p>查看文件内容md5</p><pre><code>[root@node1 tdbtest]# tdbdump hello | md5sum 88be32a888d3cd63132e09a0de8d69de -[root@node1 tdbtest]# tdbdump hello.bak | md5sum 88be32a888d3cd63132e09a0de8d69de -</code></pre><ul><li>恢复hello数据</li></ul><p>模拟删除数据</p><pre><code>[root@node1 tdbtest]# lltotal 828-rw------- 1 root root 831488 Apr 23 16:42 hello-rw------- 1 root root 8192 Apr 23 16:38 hello1-rw------- 1 root root 8192 Apr 23 17:25 hello.bak[root@node1 tdbtest]# 
>hello[root@node1 tdbtest]# lltotal 16-rw------- 1 root root 0 Apr 23 17:33 hello-rw------- 1 root root 8192 Apr 23 16:38 hello1-rw------- 1 root root 8192 Apr 23 17:25 hello.bak[root@node1 tdbtest]# tdbbackup -v hellorestoring hello[root@node1 tdbtest]# lltotal 24-rw------- 1 root root 8192 Apr 23 17:33 hello-rw------- 1 root root 8192 Apr 23 16:38 hello1-rw------- 1 root root 8192 Apr 23 17:25 hello.bak</code></pre><p>看到文件大小一致了,现在对比md5值</p><pre><code>[root@node1 tdbtest]# md5sum helloc20b4f9b01f5715bbec8f950cf394f51 hello[root@node1 tdbtest]# md5sum hello.bak c20b4f9b01f5715bbec8f950cf394f51 hello.bak[root@node1 tdbtest]# tdbdump hello |md5sum 88be32a888d3cd63132e09a0de8d69de -[root@node1 tdbtest]# tdbdump hello.bak |md5sum 88be32a888d3cd63132e09a0de8d69de -</code></pre><p>看到MD5值与之前备份之前一致了。查看数据</p><pre><code>[root@node1 tdbtest]# tdbdump hello{key(5) = "name1"data(4) = "lisi"}{key(5) = "name2"data(6) = "wangwu"}{key(5) = "name3"data(4) = "test"}</code></pre>]]></content>
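<p>Since tdbtool also accepts commands non-interactively (as noted at the start of the tool introduction), the whole session above can be scripted. The following is a minimal sketch under that assumption, using a scratch directory and example key names; it is not part of the original walkthrough.</p><pre><code>#!/usr/bin/env bash
# Scripted version of the interactive tdbtool session, followed by the
# tdbdump / tdbbackup checks shown above.
set -e
mkdir -p /root/tdbtest
cd /root/tdbtest

# Feed the same commands used interactively to tdbtool on stdin.
printf '%s\n' \
    'create demo.tdb' \
    'insert name zhangsan' \
    'insert name1 lisi' \
    'dump' \
    'check' \
    'quit' | tdbtool

tdbdump demo.tdb                 # print all key/value pairs
tdbbackup demo.tdb               # writes demo.tdb.bak
tdbdump demo.tdb | md5sum        # content checksum should match ...
tdbdump demo.tdb.bak | md5sum    # ... the checksum of the backup</code></pre>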
<summary type="html">
<p><strong>TDB文件介绍</strong></p>
<p>samba在运行时,Samba 存储许多信息,从本地密码到希望从中收到信息的一系列客户端。这类数据其中一些是暂时的,在 Samba 重启时可能会被丢弃,但是另一些却是永久的,不会被丢弃。这类数据可能是很大的,也可能是不经常访问只是在内存中保留,或者在重启时保持存在。要满足这些要求,Samba 团队创建了 Trivial Database。它实际上是一个键值存储,这意味着数据通过惟一键的方式存储和检索,且没有像在关系数据库中那样的表联接。键值存储 — 尤其是 TDB — 被设计成将数据存储到磁盘并将其取回的一种快速方式。<br>
</summary>
<category term="tdbdump tdbbakcup tdbtool" scheme="http://idcat.cn/tags/tdbdump-tdbbakcup-tdbtool/"/>
</entry>
<entry>
<title>blktrace 工具简介</title>
<link href="http://idcat.cn/blktrace-%E5%B7%A5%E5%85%B7%E7%AE%80%E4%BB%8B.html"/>
<id>http://idcat.cn/blktrace-工具简介.html</id>
<published>2018-04-21T14:00:18.000Z</published>
<updated>2018-04-21T14:07:43.764Z</updated>
<content type="html"><![CDATA[<p>利用BLKTRACE分析IO性能</p><p>在Linux系统上,如果I/O发生性能问题,有没有办法进一步定位故障位置呢?iostat等最常用的工具肯定是指望不上的,blktrace在这种场合就能派上用场,因为它能记录I/O所经历的各个步骤,从中可以分析是IO Scheduler慢还是硬件响应慢。简化版io路径图:<br><a id="more"></a><br><img src="https://i.imgur.com/tnNhcA4.png" alt=""></p><p>一个I/O请求进入block layer之后,可能会经历下面的过程:</p><ul><li>Remap: 可能被DM(Device Mapper)或MD(Multiple Device, Software RAID) remap到其它设备</li><li>Split: 可能会因为I/O请求与扇区边界未对齐、或者size太大而被分拆(split)成多个物理I/O</li><li>Merge: 可能会因为与其它I/O请求的物理位置相邻而合并(merge)成一个I/O</li><li>被IO Scheduler依照调度策略发送给driver</li><li>被driver提交给硬件,经过HBA、电缆(光纤、网线等)、交换机(SAN或网络)、最后到达存储设备,设备完成IO请求之后再把结果发回。</li></ul><p>blktrace能记录I/O所经历的各个步骤,来看一下它记录的数据,包含9个字段,下图标示了其中8个字段的含义,大致的意思是“哪个进程在访问哪个硬盘的哪个扇区,进行什么操作,进行到哪个步骤,时间戳是多少”:<br><img src="https://i.imgur.com/NZL9qcm.png" alt=""></p><p>-第一个字段:8,0 这个字段是设备号 major device ID和minor device ID。</p><p>-第二个字段:3 表示CPU</p><p>-第三个字段:11 序列号</p><p>-第四个字段:0.009507758 Time Stamp是时间偏移</p><p>-第五个字段:PID 本次IO对应的进程ID</p><p>-第六个字段:Event,这个字段非常重要,反映了IO进行到了那一步</p><p>-第七个字段:R表示 Read, W是Write,D表示block,B表示Barrier Operation</p><p>-第八个字段:223490+56,表示的是起始block number 和 number of blocks,即我们常说的Offset 和 Size</p><p>-第九个字段: 进程名</p><p>其中第六个字段非常有用:每一个字母都代表了IO请求所经历的某个阶段。</p><pre><code>A 映射值对应设备 IO was remapped to a different deviceB IO反弹,由于32位地址长度限制,所以需要copy数据到低位内存,这会有性能损耗。IO bouncedC IO完成 IO completionD 将IO发送给驱动 IO issued to driverF IO请求,前合并 IO front merged with request on queueG 获取 请求 Get requestI IO插入请求队列 IO inserted onto request queueM IO请求,后合并 IO back merged with request on queueP 插上块设备队列(队列插入机制) Plug requestQ io被请求队列处理代码接管。 IO handled by request queue codeS 等待发送请求。 Sleep requestT 由于超时而拔出设备队列 Unplug due to timeoutU 拔出设备队列 Unplug requestX 开始新的扇区 Split</code></pre><p>以下需要记清楚的:</p><pre><code>Q – 即将生成IO请求G – IO请求生成I – IO请求进入IO Scheduler队列D – IO请求进入driverC – IO请求执行完毕</code></pre><p>注意,整个IO路径,分成很多段,每一段开始的时候,都会有一个时间戳,根据上一段开始的时间和下一段开始的时间,就可以得到IO 路径各段花费的时间。</p><p>注意,我们心心念念的service time,也就是反应块设备处理能力的指标,就是从D到C所花费的时间,简称D2C。</p><p>而iostat输出中的await,即整个IO从生成请求到IO请求执行完毕,即从Q到C所花费的时间,我们简称Q2C。</p><p>我们知道Linux 有I/O scheduler,调度器的效率如何,I2D是重要的指标。</p><p>注意,这只是blktrace输出的一个部分,很明显,我们还能拿到offset和size,根据offset,我们能拿到某一段时间里,应用程序都访问了整个块设备的那些block,从而绘制出块设备访问轨迹图。</p><p><strong>blktrace centos7安装</strong></p><p>yum install blktrace -y </p><p>会自动生成blktrace blkparse btt 3个工具,其中,blktrace收集数据,blkparce分析数据,btt汇总数据。</p><p>blktrace的用法</p><p>使用blktrace需要挂载debugfs:</p><p>$ mount -t debugfs debugfs /sys/kernel/debug</p><p>利用blktrace查看实时数据的方法,比如要看的硬盘是sdb:</p><p>$ blktrace -d /dev/sdb -o – | blkparse -i –</p><p>需要停止的时候,按Ctrl-C。</p><p>个人常用方法: </p><p>blktrace -d /dev/sdc</p><p>生成数据: 应用结束后,手动终止监控,会生成cpu数量的文件</p><p>blkparse -i sdc -d sdc.blktrace.bin</p><p>分析数据: btt</p><p>btt -i sdc.blktrace.bin -A |less</p><p>汇总后,部分截图</p><p><img src="https://i.imgur.com/uLH3VD0.png" alt=""></p><p>根据以上步骤对应的时间戳就可以计算出I/O请求在每个阶段所消耗的时间:</p><pre><code>Q2G – 生成IO请求所消耗的时间,包括remap和split的时间;G2I – IO请求进入IO Scheduler所消耗的时间,包括merge的时间;I2D – IO请求在IO Scheduler中等待的时间;D2C – IO请求在driver和硬件上所消耗的时间;Q2C – 整个IO请求所消耗的时间(Q2I + I2D + D2C = Q2C),相当于iostat的await。</code></pre><p>如果I/O性能慢的话,以上指标有助于进一步定位缓慢发生的地方:</p><pre><code>D2C可以作为硬件性能的指标;I2D可以作为IO Scheduler性能的指标。</code></pre><p>更多分析请参考下面链接。</p><p><a href="http://bean-li.github.io/blktrace-to-report/" target="_blank" rel="noopener">本文参考 http://bean-li.github.io/blktrace-to-report/</a></p>]]></content>
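<p>The capture, parse and summarize sequence described above fits naturally into a small wrapper script. The sketch below is only an illustration: the device name (/dev/sdc) and the 60-second capture window are placeholders to adapt to the system being analysed.</p><pre><code>#!/usr/bin/env bash
# blktrace, then blkparse, then btt, as described in this post.
set -e
mount -t debugfs debugfs /sys/kernel/debug 2>/dev/null || true   # usually already mounted

# Collect 60 seconds of events for the device; this writes one
# sdc.blktrace.N file per CPU into the current directory.
blktrace -d /dev/sdc -w 60

# Merge the per-CPU files into a single binary stream for btt.
blkparse -i sdc -d sdc.blktrace.bin > /dev/null

# Summarize the per-stage latencies (Q2G, G2I, I2D, D2C, Q2C).
btt -i sdc.blktrace.bin -A | less</code></pre>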
<summary type="html">
<p>利用BLKTRACE分析IO性能</p>
<p>在Linux系统上,如果I/O发生性能问题,有没有办法进一步定位故障位置呢?iostat等最常用的工具肯定是指望不上的,blktrace在这种场合就能派上用场,因为它能记录I/O所经历的各个步骤,从中可以分析是IO Scheduler慢还是硬件响应慢。简化版io路径图:<br>
</summary>
<category term="blktrace btt blkparse" scheme="http://idcat.cn/tags/blktrace-btt-blkparse/"/>
</entry>
<entry>
<title>cosbench例子-参考</title>
<link href="http://idcat.cn/cosbench%E4%BE%8B%E5%AD%90-%E5%8F%82%E8%80%83.html"/>
<id>http://idcat.cn/cosbench例子-参考.html</id>
<published>2018-04-15T11:30:01.000Z</published>
<updated>2018-04-15T11:34:50.074Z</updated>
<content type="html"><![CDATA[<p>仅此记录,自己参考。<br><a id="more"></a></p><pre><code><?xml version="1.0" encoding="UTF-8" ?><workload name="128k-36" description="sample benchmark for s3"> <storage type="s3" /> <workflow><workstage name="init"> <storage type="s3" config="accesskey=zhangadmin;secretkey=zhangadmin;endpoint=http://192.168.0.191:7480" /> <work type="init" workers="3" config="cprefix=128zhang;containers=r(1,30)" /></workstage><workstage name="prepare"> <work type="prepare" workers="36" config="cprefix=128zhang;containers=r(1,30);objects=r(1,1000);sizes=c(128)KB" /></workstage><workstage name="main"> <work name="write" workers="36" runtime="300"> <storage type="s3" config="accesskey=zhangadmin;secretkey=zhangadmin;endpoint=http://192.168.0.191:7480" /> <operation type="write" ratio="100" config="cprefix=128zhang;containers=u(1,10);objects=u(1,1000);sizes=c(128)KB" /> </work> <work name="write" workers="36" runtime="300"> <storage type="s3" config="accesskey=zhangadmin;secretkey=zhangadmin;endpoint=http://192.168.0.192:7480" /> <operation type="write" ratio="100" config="cprefix=128zhang;containers=u(11,20);objects=u(1,1000);sizes=c(128)KB" /> </work> <work name="write" workers="36" runtime="300"> <storage type="s3" config="accesskey=zhangadmin;secretkey=zhangadmin;endpoint=http://192.168.0.193:7480" /> <operation type="write" ratio="100" config="cprefix=128zhang;containers=u(21,30);objects=u(1,1000);sizes=c(128)KB" /> </work></workstage><workstage name="cleanup"> <work type="cleanup" workers="36" config="cprefix=128zhang;containers=r(1,30);objects=r(1,1000)" /></workstage><workstage name="dispose"> <work type="dispose" workers="36" config="cprefix=128zhang;containers=r(1,30)" /></workstage> </workflow></workload></code></pre>]]></content>
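<p>For completeness, one way to run a workload file like this is through the COSBench command-line helper. The snippet below is only a rough sketch and not part of the original note: the install path, the XML file name and the cli.sh submit sub-command are assumptions based on the stock COSBench distribution, so adjust them to the actual setup.</p><pre><code>#!/usr/bin/env bash
# Assumed layout: COSBench unpacked under /opt/cosbench with the
# controller/driver already running; the workload saved as 128k-36.xml.
COSBENCH_HOME=/opt/cosbench
cp 128k-36.xml "$COSBENCH_HOME/conf/"
cd "$COSBENCH_HOME"
sh cli.sh submit conf/128k-36.xml   # progress can then be watched on the controller web console</code></pre>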
<summary type="html">
<p>仅此记录,自己参考。<br>
</summary>
<category term="cosbench" scheme="http://idcat.cn/tags/cosbench/"/>
</entry>
<entry>
<title>集群所有mon store.db丢失恢复</title>
<link href="http://idcat.cn/%E9%9B%86%E7%BE%A4%E6%89%80%E6%9C%89mon-store-db%E4%B8%A2%E5%A4%B1%E6%81%A2%E5%A4%8D.html"/>
<id>http://idcat.cn/集群所有mon-store-db丢失恢复.html</id>
<published>2018-04-02T11:42:38.000Z</published>
<updated>2019-11-21T01:13:44.907Z</updated>
<content type="html"><![CDATA[<p>1、模拟测试环境</p><p>三台虚拟机,每台虚拟机一个osd,均是mon节点,mds节点</p><pre><code>node1 192.168.1.141 node2 192.168.1.142 node3 192.168.1.143 </code></pre><a id="more"></a><p>2、模拟所有节点的mon 数据丢失 </p><p>在3个集群节点上停止mon服务,拷贝数据到其他路径</p><pre><code>[root@node1 ~]# systemctl stop ceph-mon@node1[root@node1 ~]# mv /var/lib/ceph/mon/ceph-node1/store.db /tmp/[root@node2 ~]# systemctl stop ceph-mon@node2[root@node2 ~]# mv /var/lib/ceph/mon/ceph-node2/store.db/ /tmp/[root@node3 ~]# systemctl stop ceph-mon@node3[root@node3 ~]# mv /var/lib/ceph/mon/ceph-node3/store.db/ /tmp</code></pre><p>3、从osd上获取monmap信息</p><p>先停止所有的osd服务,然后每个节点创建临时目录</p><pre><code>[root@node1 ~]# systemctl stop ceph-osd@0[root@node2 ~]# systemctl stop ceph-osd@1[root@node3 ~]# systemctl stop ceph-osd@2[root@node1 ~]# mkdir /tmp/monstore[root@node2 ~]# mkdir /tmp/monstore[root@node3 ~]# mkdir /tmp/monstore</code></pre><p>首先在node1上操作,该节点只有一个osd。从这个osd上收集mon相关的数据,存放到/tmp/monstore 目录</p><pre><code>[root@node1 ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0/ --op update-mon-db --mon-store-path /tmp/monstore/osd.0 : 0 osdmaps trimmed, 31 osdmaps added. 256 pgs added.</code></pre><p>因为有多台osd机器多个osd的话,就得在每台服务器每个osd上分别执行上面命令。这个目录的数据一定要保持递增的。弄完一台就传递数据到下一台接着弄。</p><p>因为本例中有3个节点,每个节点只有一个osd,所以node1上只需要执行一次即可。</p><p>传递数据到node2上</p><pre><code>[root@node1 store.db]# rsync -avz /tmp/monstore/ node2:/tmp/monstore/sending incremental file list./store.db/store.db/000003.logstore.db/CURRENTstore.db/LOCKstore.db/MANIFEST-000002sent 27853 bytes received 95 bytes 18632.00 bytes/sectotal size is 329590 speedup is 11.79</code></pre><p>在node2节点上从osd上获取mon相关数据</p><pre><code>[root@node2 ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-1/ --op update-mon-db --mon-store-path /tmp/monstore/osd.1 : 0 osdmaps trimmed, 0 osdmaps added. 96 pgs added.</code></pre><p>传递数据到node3</p><pre><code>[root@node2 ~]# rsync -avz /tmp/monstore/ node3:/tmp/monstore/sending incremental file list./store.db/store.db/000005.sststore.db/000006.logstore.db/CURRENTstore.db/LOCKstore.db/MANIFEST-000004sent 37210 bytes received 114 bytes 74648.00 bytes/sectotal size is 132524 speedup is 3.55</code></pre><p>在node3上获取mon数据</p><pre><code>[root@node3 store.db]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-2/ --op update-mon-db --mon-store-path /tmp/monstore/osd.2 : 0 osdmaps trimmed, 0 osdmaps added. 
82 pgs added.</code></pre><p>之后,将这个 目录的数据传递到所有需要恢复mon节点上,本例为node1 node2 node3(已有最新数据)</p><pre><code>[root@node3 store.db]# rsync -avz /tmp/monstore/ node1:/tmp/monstore/root@node1's password: sending incremental file liststore.db/store.db/000005.sststore.db/000008.sststore.db/000009.logstore.db/CURRENTstore.db/MANIFEST-000007sent 41661 bytes received 117 bytes 16711.20 bytes/sectotal size is 134943 speedup is 3.23[root@node3 store.db]# rsync -avz /tmp/monstore/ node2:/tmp/monstore/sending incremental file liststore.db/store.db/000008.sststore.db/000009.logstore.db/CURRENTstore.db/MANIFEST-000007sent 10113 bytes received 98 bytes 20422.00 bytes/sectotal size is 134943 speedup is 13.22</code></pre><p>4、恢复mon数据</p><p>删除之前默认路径存在的数据,或者新建目录</p><pre><code>[root@node1 mon]# mkdir /var/lib/ceph/mon/ceph-node1/[root@node1 ceph-node1]# ceph-monstore-tool /tmp/monstore/ rebuild[root@node1 ceph-node1]# cp -ra /tmp/monstore/* /var/lib/ceph/mon/ceph-node1/[root@node1 ceph-node1]# touch /var/lib/ceph/mon/ceph-node1/done[root@node1 ceph-node1]# touch /var/lib/ceph/mon/ceph-node1/systemd[root@node1 ceph-node1]# chown ceph:ceph -R /var/lib/ceph/mon/ceph-node1/</code></pre><p>在node2 node3节点上分别执行上面操作</p><p>5、重启mon</p><p>所有节点重启mon服务</p><pre><code>[root@node1 ceph-node1]# systemctl restart ceph-mon@node1Job for [email protected] failed because start of the service was attempted too often. See "systemctl status [email protected]" and "journalctl -xe" for details.To force a start use "systemctl reset-failed [email protected]" followed by "systemctl start [email protected]" again.[root@node1 ceph-node1]# systemctl reset-failed [email protected][root@node1 ceph-node1]# systemctl start [email protected]</code></pre><p>最终查看mon均没有启动,查看日志显示如下信息:</p><pre><code>2018-04-02 17:55:29.419474 7f71222da700 1 leveldb: Compacting 5@0 + 0@1 files2018-04-02 17:55:29.419975 7f71284cc600 0 mon.node1 does not exist in monmap, will attempt to join an existing cluster2018-04-02 17:55:29.420158 7f71284cc600 -1 no public_addr or public_network specified, and mon.node1 not present in monmap or ceph.conf</code></pre><p>提示节点不在monmap表中。</p><p>参考之前写的文件 <a href="http://www.idcat.cn/2018/03/27/ceph%E9%9B%86%E7%BE%A4%E6%9B%BF%E6%8D%A2mon%E8%8A%82%E7%82%B9ip%E5%9C%B0%E5%9D%80/" title="更换mon ip地址" target="_blank" rel="noopener">http://www.idcat.cn/2018/03/27/ceph%E9%9B%86%E7%BE%A4%E6%9B%BF%E6%8D%A2mon%E8%8A%82%E7%82%B9ip%E5%9C%B0%E5%9D%80/</a></p><p>6、加入monmap</p><pre><code>[root@node1 tmp]# monmaptool --create --generate -c /etc/ceph/ceph.conf /tmp/monaaamonmaptool: monmap file /tmp/monaaamonmaptool: set fsid to 911c57dc-a930-4da8-ab0e-69f6b6586e3dmonmaptool: writing epoch 0 to /tmp/monaaa (3 monitors)[root@node1 tmp]# monmaptool --print /tmp/monaaa monmaptool: monmap file /tmp/monaaaepoch 0fsid 911c57dc-a930-4da8-ab0e-69f6b6586e3dlast_changed 2018-04-02 18:25:56.526811created 2018-04-02 18:25:56.5268110: 192.168.1.141:6789/0 mon.noname-a1: 192.168.1.142:6789/0 mon.noname-b2: 192.168.1.143:6789/0 mon.noname-c</code></pre><p>可以看到主机名显示为noname-a 等。。</p><p>删除主机节点</p><pre><code>[root@node1 tmp]# monmaptool --rm noname-a /tmp/monaaa monmaptool: monmap file /tmp/monaaamonmaptool: removing noname-amonmaptool: writing epoch 0 to /tmp/monaaa (2 monitors)[root@node1 tmp]# monmaptool --rm noname-b /tmp/monaaa monmaptool: monmap file /tmp/monaaamonmaptool: removing noname-bmonmaptool: writing epoch 0 to /tmp/monaaa (1 monitors)[root@node1 tmp]# monmaptool --rm noname-c /tmp/monaaa monmaptool: monmap file /tmp/monaaamonmaptool: removing noname-cmonmaptool: 
writing epoch 0 to /tmp/monaaa (0 monitors)</code></pre><p>添加新的主机节点</p><pre><code>[root@node1 tmp]# monmaptool --add node1 192.168.1.141:6789 /tmp/monaaa monmaptool: monmap file /tmp/monaaamonmaptool: writing epoch 0 to /tmp/monaaa (1 monitors)[root@node1 tmp]# monmaptool --add node2 192.168.1.142:6789 /tmp/monaaa monmaptool: monmap file /tmp/monaaamonmaptool: writing epoch 0 to /tmp/monaaa (2 monitors)[root@node1 tmp]# monmaptool --add node3 192.168.1.143:6789 /tmp/monaaa monmaptool: monmap file /tmp/monaaamonmaptool: writing epoch 0 to /tmp/monaaa (3 monitors)</code></pre><p>将该monaaa文件传到node2 node3 需要恢复mon信息的节点上</p><pre><code>[root@node1 tmp]# scp /tmp/monaaa node2:/tmp/monaaa 100% 481 43.3KB/s 00:00 [root@node1 tmp]# scp /tmp/monaaa node3:/tmp/monaaa 100% 481 570.6KB/s 00:00 </code></pre><p>在各个节点上注入新的monmap表信息</p><pre><code>[root@node1 ~]# ceph-mon -i node1 --inject-monmap /tmp/monaaa [root@node2 ~]# ceph-mon -i node2 --inject-monmap /tmp/monaaa[root@node3 ~]# ceph-mon -i node3 --inject-monmap /tmp/monaaa </code></pre><p>启动mon</p><pre><code>[root@node1 ~]# systemctl start ceph-mon@node1[root@node2 ~]# systemctl start ceph-mon@node2[root@node3 ~]# systemctl start ceph-mon@node3</code></pre><p>查看集群状态</p><p>[root@node2 ~]# ceph -s</p><pre><code>cluster 911c57dc-a930-4da8-ab0e-69f6b6586e3d health HEALTH_OK monmap e1: 3 mons at {node1=192.168.1.141:6789/0,node2=192.168.1.142:6789/0,node3=192.168.1.143:6789/0} election epoch 2, quorum 0,1,2 node1,node2,node3 osdmap e31: 3 osds: 3 up, 3 in flags sortbitwise,require_jewel_osds pgmap v1: 256 pgs, 3 pools, 2068 bytes data, 20 objects 0 kB used, 0 kB / 0 kB avail 225 active+clean 28 active+clean+scrubbing 3 active+clean+scrubbing+deep</code></pre><p>至此 集群恢复OK</p><p>参考<a href="http://www.zphj1987.com/2017/04/19/why-rm-object-can-get/" target="_blank" rel="noopener">http://www.zphj1987.com/2017/04/19/why-rm-object-can-get/</a></p>]]></content>
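<p>On hosts that carry more than one OSD, the per-OSD collection step from section 3 can be looped instead of typed out by hand. The sketch below is illustrative only: it assumes the default /var/lib/ceph/osd/ceph-* data directories and the same /tmp/monstore staging directory used above, and the directory still has to be rsynced from host to host so the store is built up incrementally.</p><pre><code>#!/usr/bin/env bash
# Collect mon data from every OSD on this host into /tmp/monstore.
set -e
mkdir -p /tmp/monstore
for osd_dir in /var/lib/ceph/osd/ceph-*; do
    id="${osd_dir##*-}"                       # e.g. ceph-0 gives id 0
    systemctl stop ceph-osd@"$id"
    ceph-objectstore-tool --data-path "$osd_dir" \
        --op update-mon-db --mon-store-path /tmp/monstore
done
# Next: rsync /tmp/monstore/ to the next OSD host and repeat there.</code></pre>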
<summary type="html">
<p>1、模拟测试环境</p>
<p>三台虚拟机,每台虚拟机一个osd,均是mon节点,mds节点</p>
<pre><code>node1 192.168.1.141
node2 192.168.1.142
node3 192.168.1.143
</code></pre>
</summary>
<category term="ceph" scheme="http://idcat.cn/tags/ceph/"/>
</entry>
<entry>
<title>ceph 多区域radosgw网关配置</title>
<link href="http://idcat.cn/ceph-%E5%A4%9A%E5%8C%BA%E5%9F%9Fradosgw%E7%BD%91%E5%85%B3%E9%85%8D%E7%BD%AE.html"/>
<id>http://idcat.cn/ceph-多区域radosgw网关配置.html</id>
<published>2018-03-28T14:06:36.000Z</published>
<updated>2018-03-28T14:07:43.015Z</updated>
<content type="html"><![CDATA[<p>一、本文环境</p><p>2台虚拟机部署2套ceph集群。2个节点分别作为不同集群的radosgw网关。</p><p>Cluster1 节点名: ceph01 ip : 192.168.0.39</p><p>Cluster2 节点名: ceph02 ip : 192.168.1.169<br><a id="more"></a><br>二,概念: </p><p> zone:包含多个RGW实例的一个逻辑概念。zone不能跨集群。同一个zone的数据保存在同一组pool中。 本例中每个zone只有一个RGW实例。</p><p> zonegroup:一个zonegroup如果包含1个或多个zone。如果一个zonegroup包含多个zone,必须指定 一个zone作为master zone,用来处理bucket和用户的创建。一个集群可以创建多个zonegroup,一个zonegroup也可以跨多个集群。 本例中只有一个zonegroup,跨2个集群,每个集群只有一个zone,分别作为master和从zone.</p><p> realm:一个realm包含1个或多个zonegroup。如果realm包含多个zonegroup,必须指定一个zonegroup为master zonegroup, 用来处理系统操作。一个系统中可以包含多个realm,多个realm之间资源完全隔离。 本例 只有一个realm,包含一个zonegroup. </p><p> RGW多活方式是在同一zonegroup的多个zone之间进行,即同一zonegroup中多个zone之间的数据是完全一致的,用户可以通过任意zone读写同一份数据。 但是,对元数据的操作,比如创建桶、创建用户,仍然只能在master zone进行。对数据的操作,比如创建桶中的对象,访问对象等,可以在任意zone中 处理。</p><p>三、在Cluster1集群上配置master zone</p><p>创建realm</p><pre><code>radosgw-admin realm create --rgw-realm=aaa --default[root@ceph01 ~]# radosgw-admin realm list{"default_info": "eff9b039-8c3c-4991-87f9-e9331b2c7824","realms": [ "aaa" ]}</code></pre><p>创建master zonegroup,先删除默认的zonegroup</p><pre><code>radosgw-admin zonegroup delete --rgw-zonegroup=default</code></pre><p>创建一个为azonggroup的zonegroup</p><pre><code>radosgw-admin zonegroup create --rgw-zonegroup=azonegroup --endpoints=192.168.0.39:7480 --master --default</code></pre><p>创建master zone,先删除默认的zone</p><pre><code>adosgw-admin zone delete --rgw-zone=default</code></pre><p>创建一个为azone的zone</p><pre><code>radosgw-admin zone create --rgw-zonegroup=azonegroup --rgw-zone=azone --endpoints=192.168.0.39:7480 --default --master</code></pre><p>创建一个auser账户用于和bzone zone同步</p><pre><code>radosgw-admin user create --uid="auser" --display-name="auser" --system</code></pre><p>用创建auser账户产生的access 和secret更新zone配置</p><pre><code>radosgw-admin zone modify --rgw-zone=azone --access-key=BKG10IM15N8EB0I7ZE7U --secret=Cvh60vBX5ciujqRaLw3bm6wMIGmLdlJ9FB4ukOG</code></pre><p>更新period</p><pre><code>radosgw-admin period update --commit</code></pre><p>修改配置ceph.conf</p><pre><code>[client.rgw.ceph-1]host = ceph-1rgw frontends = "civetweb port=7480"rgw_zone=azone</code></pre><p>重启radosgw服务</p><pre><code>systemctl restart [email protected]</code></pre><p>四、在Cluster2集群上配置slave zone</p><p>从master zone拉取realm</p><pre><code>[root@ceph02 ceph]#radosgw-admin realm pull --url=192.168.0.39:7480 --access-key=BKG10IM15N8EB0I7ZE7U --secret=Cvh60vBX5ciujqRaLw3bm6wMIGmLdlJ9FB4ukOGC</code></pre><p>注意:这里的access key 和secret是master zone上auser 账户的access key和secret</p><p>拉取period</p><pre><code>[root@ceph02 ceph]# radosgw-admin period pull --url=192.168.0.39:7480 --access-key=BKG10IM15N8EB0I7ZE7U --secret=Cvh60vBX5ciujqRaLw3bm6wMIGmLdlJ9FB4ukOGC</code></pre><p>注意:这里的access key 和secret是master zone上auser 账户的access key和secret</p><p>创建slave zone,名称为bzone</p><pre><code>[root@ceph02 ceph]# radosgw-admin zone create --rgw-zonegroup=azonegroup --rgw-zone=bzone --access-key=BKG10IM15N8EB0I7ZE7U --secret=Cvh60vBX5ciujqRaLw3bm6wMIGmLdlJ9FB4ukOGC --endpoints=192.168.1.169:7480[root@ceph02 ceph]# radosgw-admin zone list{"default_info": "118aa9b6-458f-4b4e-9aa8-1c3577bf8dd6","zones": [ "bzone", "default"]}</code></pre><p> 注意:这里的access key 和secret是master zone上auser 账户的access key和secret</p><p>更新period</p><pre><code>[root@ceph02 ceph]#radosgw-admin period update --commit</code></pre><p>有如下警告,只需要更新libcurl既可</p><pre><code>2018-03-28 18:07:46.941333 7f27d9f499c0 0 WARNING: detected a version of libcurl which contains a bug in curl_multi_wait(). 
enabling a workaround that may degrade performance slightly.[root@ceph02 ceph]# yum update libcurl -y</code></pre><p>修改配置ceph.conf</p><pre><code>[client.rgw.ceph-2] host = ceph-2 rgw frontends = "civetweb port=7480" rgw_zone=bzone</code></pre><p>重启radosgw服务</p><pre><code>systemctl restart [email protected]</code></pre><p>五、验证zone之间数据同步</p><p>在ceph02 从节点执行</p><pre><code>[root@ceph02 ~]# radosgw-admin sync status realm eff9b039-8c3c-4991-87f9-e9331b2c7824 (aaa) zonegroup 7e6ff2ed-f8aa-44fc-b2f3-5ca81464dd9b (azonegroup) zone 118aa9b6-458f-4b4e-9aa8-1c3577bf8dd6 (bzone)metadata sync syncing full sync: 0/64 shards incremental sync: 64/64 shards metadata is caught up with master data sync source: 08fcacca-5e53-4499-9413-5212d2477576 (azone) syncing full sync: 0/128 shards incremental sync: 128/128 shards data is caught up with source</code></pre><p>在master zone 节点ceph01上创建用户</p><pre><code>[root@ceph01 ~]# radosgw-admin user create --uid="zhang" --display-name="zhang"</code></pre><p>安装s3客户端 创建桶tong1,并put 对象</p><pre><code>[root@ceph01 ~]# s3cmd mb s3://tong1[root@ceph01 ~]# s3cmd put release.asc s3://tong1upload: 'release.asc' -> 's3://tong1/release.asc' [1 of 1]1645 of 1645 100% in 0s 33.23 kB/s done[root@ceph01 ~]# s3cmd ls s3://tong12018-03-28 12:26 1082 s3://tong1/anaconda-ks.cfg2018-03-28 13:05 1645 s3://tong1/release.asc</code></pre><p>在slave zone 节点ceph02 查看(将ceph01 /root/.s3cfg 拷贝到ceph02节点同样的目录)</p><pre><code>[root@ceph02 ~]# s3cmd ls s3://tong12018-03-28 12:26 1082 s3://tong1/anaconda-ks.cfg2018-03-28 13:05 1645 s3://tong1/release.asc</code></pre><p>至此,2个zone之间可做到高可用,保证一定的数据安全。详细测试以后进行。</p>]]></content>
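<p>Beyond radosgw-admin sync status, a quick end-to-end check is to write through the master zone and read the same object back through the secondary zone. The sketch below is only an example: it assumes the .s3cfg created on ceph01 has already been copied to ceph02 (as done above) and uses placeholder bucket and file names.</p><pre><code>#!/usr/bin/env bash
# Write via the master zone (ceph01), then read back via the
# secondary zone (ceph02) once data sync has had a moment to run.
set -e
echo "sync-test-$(date +%s)" > /tmp/sync-test.txt

s3cmd put /tmp/sync-test.txt s3://tong1/          # runs against the master zone endpoint

sleep 30                                          # allow incremental data sync to catch up
ssh ceph02 's3cmd ls s3://tong1/; radosgw-admin sync status'</code></pre>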
<summary type="html">
<p>一、本文环境</p>
<p>2台虚拟机部署2套ceph集群。2个节点分别作为不同集群的radosgw网关。</p>
<p>Cluster1 节点名: ceph01 ip : 192.168.0.39</p>
<p>Cluster2 节点名: ceph02 ip : 192.168.1.169<br>
</summary>
<category term="ceph" scheme="http://idcat.cn/tags/ceph/"/>
</entry>
<entry>
<title>ceph-dencoder基本使用</title>
<link href="http://idcat.cn/ceph-dencoder%E5%9F%BA%E6%9C%AC%E4%BD%BF%E7%94%A8.html"/>
<id>http://idcat.cn/ceph-dencoder基本使用.html</id>
<published>2018-03-27T14:06:41.000Z</published>
<updated>2018-03-27T14:07:21.265Z</updated>
<content type="html"><![CDATA[<p>本文摘自<a href="https://blog.csdn.net/scaleqiao/article/details/51987426" target="_blank" rel="noopener">https://blog.csdn.net/scaleqiao/article/details/51987426</a></p><p>0 简介</p><p>贯穿Ceph OSD端数据处理的一个核心结构就是ObjectStore::Transaction,OSD处理的所有操作以及其关联的数据都会封装进入Transaction中的bufferlist结构里,这里的封装也就是序列化(encode),它将各种数据结构无论简单或者复杂都作为字节流,存入bufferlist中。最终Transaction会由具体的ObjectStore后端实现来处理,当然,处理时会对bufferlist中的数据进行反序列化(decode)。而本文介绍的ceph-dencoder工具就是Ceph提供的可以进行encode、decode以及dump ceph相关数据结构的工具,同时它也可以用来调试以及测试Ceph不同版本之间的兼容性。今天这里主要介绍它的decode功能,其他功能大家可以自行研究。<br><a id="more"></a><br><strong>1 安装</strong></p><p>ceph-dencoder工具是默认安装的。</p><p><strong>2 使用</strong></p><p>可以通过它的manpage或者help文档来了解它的使用</p><pre><code>[root@ceph03 ~]# man ceph-dencoder [root@ceph03 ~]# ceph-dencoder -h usage: ceph-dencoder [commands ...] version print version string (to stdout) import <encfile> read encoded data from encfile export <outfile> write encoded data to outfile set_features <num> set feature bits used for encoding get_features print feature bits (int) to stdout list_types list supported types type <classname> select in-memory type skip <num> skip <num> leading bytes before decoding decode decode into in-memory object encode encode in-memory object dump_json dump in-memory object as json (to stdout) copy copy object (via operator=) copy_ctor copy object (via copy ctor) count_tests print number of generated test objects (to stdout) select_test <n> select generated test object as in-memory object is_deterministic exit w/ success if type encodes deterministically </code></pre><p>它的用法比较简单,即ceph-dencoder跟上相应的子命令即可。<br>在具体使用时,可以通过以下命令查看ceph-dencoder当前支持哪些结构:</p><pre><code>ceph-dencoder list_types </code></pre><p>要确定需要通过哪种结构来进行数据解析,只能通过阅读源码,找到encode数据时,数据对应的数据结构。</p><p><strong>3 使用事例</strong></p><p>下面以查看object的object_info信息为例,介绍一下这个工具的使用。</p><p>在使用XFS作为后端存储时,一个object就对应一个文件,obejct的object_info信息通常是作为文件的扩展属性存在的。</p><p>首先先找到一个object对应的文件,并查看object文件的扩展属性,这里会用到XFS一个工具attr,主要用于操作XFS文件的扩展属性的:</p><pre><code>[root@ceph02 0.10_head]# rados ls -p rbdrbd_header.d3652ae8944arbd_directoryrbd_id.test</code></pre><p>查看ceph osd map rbd rbd_header.d3652ae8944a 找到对应的osd</p><pre><code>[root@ceph02 0.10_head]# [root@ceph02 0.10_head]# attr -l rbd\\uheader.d3652ae8944a__head_9EB01B90__0 Attribute "cephos.spill_out" has a 2 byte value for rbd\uheader.d3652ae8944a__head_9EB01B90__0Attribute "ceph._lock.rbd_lock" has a 23 byte value for rbd\uheader.d3652ae8944a__head_9EB01B90__0Attribute "ceph._" has a 250 byte value for rbd\uheader.d3652ae8944a__head_9EB01B90__0Attribute "ceph._@1" has a 250 byte value for rbd\uheader.d3652ae8944a__head_9EB01B90__0Attribute "ceph._@2" has a 92 byte value for rbd\uheader.d3652ae8944a__head_9EB01B90__0Attribute "ceph.snapset" has a 31 byte value for rbd\uheader.d3652ae8944a__head_9EB01B90__0</code></pre><p>从上面的输出可以看到扩展属性中总共有3部分,其中”ceph.<em>“和”ceph.</em>@1”,”ceph.<em>@2”应该算是一部分,因为ceph.</em>超出了XFS扩展属性的长度限制,所以拆成了3个。而它里面就存放了我们要找的object_info_t的数据。”cephos.spill_out”是一个两个字节的字符数组,用于记录文件的扩展属性是否溢出到了Omap里,一般它的值是“0”或者“1”,“0”就是源码里的宏定义XATTR_NO_SPILL_OUT,表示没有溢出,而“1”是XATTR_SPILL_OUT,表示有溢出。“ceph.snapset”用来记录和Object相关的Snapshot信息,它的数据以SnapSet这个结构的形式存在。</p><p>另外还有一个通用的命令getfattr也可以干类似的事情。</p><pre><code>[root@ceph02 0.10_head]# getfattr -d rbd\\uheader.d3652ae8944a__head_9EB01B90__0 -m 'user\.ceph\._'# file: 
rbd\134uheader.d3652ae8944a__head_9EB01B90__0user.ceph._=0sDwhKAgAABAM4AAAAAAAAABcAAAByYmRfaGVhZGVyLmQzNjUyYWU4OTQ0Yf7/////////kBuwngAAAAAAAAAAAAAAAAAGAxwAAAAAAAAAAAAAAP////8AAAAAAAAAAP//////////AAAAAAoAAAAAAAAAOAAAAAkAAAAAAAAAOAAAAAICFQAAAAhr0wAAAAAAAAIAAAAAAAAAAQAAAAAAAAAAAAAAIki6WizsNiECAhUAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAEAAAAIa9MAAAAAAAAEA5QAAAABAAAAAAAAAB4AAAAAAA==user.ceph._@1=0sAADTQ2beAAIAAMCoAakAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAIAAAAAAAAAAAAAAAAAQAAAAEAAAAAAAAACGvTAAAAAAAABAOUAAAAAQAAAAAAAAAeAAAAAAAAANNDZt4AAgAAwKgBqQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA==user.ceph._@2=0sAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAHAAAACJIulr/91Ah//////////8=user.ceph._lock.rbd_lock=0sAQERAAAAAAAAAAEIAAAAaW50ZXJuYWw=</code></pre><p>接着,我们将扩展属性中的数据dump到一个文件当中,注意应该将”ceph.<em>“和”ceph.</em>@1”,”ceph._@1”的数据拼起来。</p><pre><code>[root@ceph02 0.10_head]# attr -q -g "ceph._" rbd\\uheader.d3652ae8944a__head_9EB01B90__0 > 1.txt[root@ceph02 0.10_head]# attr -q -g "ceph._@1" rbd\\uheader.d3652ae8944a__head_9EB01B90__0 >> 1.txt[root@ceph02 0.10_head]# attr -q -g "ceph._@2" rbd\\uheader.d3652ae8944a__head_9EB01B90__0 >> 1.txt </code></pre><p>最后使用ceph-dencoder工具将其内容解析出来。</p><pre><code>[root@ceph02 0.10_head]# ceph-dencoder import 1.txt type object_info_t decode dump_json{"oid": { "oid": "rbd_header.d3652ae8944a", "key": "", "snapid": -2, "hash": 2662341520, "max": 0, "pool": 0, "namespace": ""},"version": "56'10","prior_version": "56'9","last_reqid": "client.54123.1:2","user_version": 8,"size": 0,"mtime": "2018-03-27 21:33:22.557247","local_mtime": "2018-03-27 21:33:22.558954","lost": 0,"flags": 28,"snaps": [],"truncate_seq": 0,"truncate_size": 0,"data_digest": 4294967295,"omap_digest": 4294967295,"watchers": { "client.54123": { "cookie": 1, "timeout_seconds": 30, "addr": { "nonce": 3731243987, "addr": "192.168.1.169:0" } }}}</code></pre><p>后期遇到详细学习。仅做记录。</p>]]></content>
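<p>The xattr extraction and decode steps above can be chained into one small helper. The following is a sketch under the same assumptions as the example (an XFS/filestore OSD, run from inside the PG directory); the object file name and output path are placeholders.</p><pre><code>#!/usr/bin/env bash
# Concatenate the ceph._ xattr pieces of an object file and decode
# them as object_info_t with ceph-dencoder, as shown step by step above.
set -e
OBJ='rbd\uheader.d3652ae8944a__head_9EB01B90__0'   # placeholder object file
OUT=/tmp/object_info.bin

attr -q -g "ceph._" "$OBJ" > "$OUT"
# Append the overflow parts if present (xattr size limit on XFS).
attr -q -g "ceph._@1" "$OBJ" >> "$OUT" 2>/dev/null || true
attr -q -g "ceph._@2" "$OBJ" >> "$OUT" 2>/dev/null || true

ceph-dencoder import "$OUT" type object_info_t decode dump_json</code></pre>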
<summary type="html">
<p>本文摘自<a href="https://blog.csdn.net/scaleqiao/article/details/51987426" target="_blank" rel="noopener">https://blog.csdn.net/scaleqiao/article/details/51987426</a></p>
<p>0 简介</p>
<p>贯穿Ceph OSD端数据处理的一个核心结构就是ObjectStore::Transaction,OSD处理的所有操作以及其关联的数据都会封装进入Transaction中的bufferlist结构里,这里的封装也就是序列化(encode),它将各种数据结构无论简单或者复杂都作为字节流,存入bufferlist中。最终Transaction会由具体的ObjectStore后端实现来处理,当然,处理时会对bufferlist中的数据进行反序列化(decode)。而本文介绍的ceph-dencoder工具就是Ceph提供的可以进行encode、decode以及dump ceph相关数据结构的工具,同时它也可以用来调试以及测试Ceph不同版本之间的兼容性。今天这里主要介绍它的decode功能,其他功能大家可以自行研究。<br>
</summary>
<category term="ceph" scheme="http://idcat.cn/tags/ceph/"/>
</entry>
<entry>
<title>ceph-kvstore-tool工具简单介绍</title>
<link href="http://idcat.cn/ceph-kvstore-tool%E5%B7%A5%E5%85%B7%E7%AE%80%E5%8D%95%E4%BB%8B%E7%BB%8D.html"/>
<id>http://idcat.cn/ceph-kvstore-tool工具简单介绍.html</id>
<published>2018-03-27T12:07:00.000Z</published>
<updated>2018-03-27T12:09:04.629Z</updated>
<content type="html"><![CDATA[<p>参考<a href="https://blog.csdn.net/scaleqiao/article/details/51946042" target="_blank" rel="noopener">https://blog.csdn.net/scaleqiao/article/details/51946042</a></p><p>大家都知道Ceph的很多数据比如PG log、Monitor的数据都存在kvstore里(leveldb或者RocksDB中),Ceph也提供了查看kvstore里数据的工具,它就是ceph-kvstore-tool。<br><a id="more"></a><br>1 安装ceph-kvstore-tool工具</p><p>如果你是从官网释放的rpm包安装的Ceph,那么ceph-kvstore-tool默认是没有安装的,它包含在ceph-test这个rpm中,你可以通过以下方法安装。</p><pre><code>yum install ceph-test </code></pre><p>2 ceph-kvstore-tool命令使用介绍</p><p>以下介绍基于Ceph 10.2.3版本,</p><pre><code>[root@ceph02 ~]# ceph --versionceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b) </code></pre><p>查看帮助</p><pre><code>[root@ceph02 ~]# ceph-kvstore-tool -hUsage: ceph-kvstore-tool <leveldb|rocksdb|...> <store path> command [args...]Commands: list [prefix] list-crc [prefix] exists <prefix> [key] get <prefix> <key> [out <file>] crc <prefix> <key> get-size [<prefix> <key>] set <prefix> <key> [ver <N>|in <file>] store-copy <path> [num-keys-per-tx] store-crc <path></code></pre><p>描述的比较简单,但是基本上告诉你了这个命令的用法。当前使用的是leveldb数据库,store path指定的是leveldb数据库路径,例如mon 目录的store.db, osd目录下的current/omap 目录。</p><p>我们知道leveldb是一个kvstore也就是kv的数据库,prefix就是数据库的表名。key就是表里面的key值,而value值这里通过get 后指定输出到out file中。<br>如下所示,list输出格式为 表名:key</p><p>查看mon leveldb数据则停止mon服务,osd也是如此</p><pre><code>[root@ceph02 ~]# systemctl stop ceph-mon@ceph02[root@ceph02 ~]# ceph-kvstore-tool leveldb /var/lib/ceph/mon/ceph-ceph02/store.db/ list |head -n 52018-03-27 17:18:52.735099 7fede398b040 1 leveldb: Recovering log #2652018-03-27 17:18:52.735886 7fede398b040 1 leveldb: Level-0 table #267: started2018-03-27 17:18:52.737658 7fede398b040 1 leveldb: Level-0 table #267: 54804 bytes OKauth:1auth:10auth:100auth:101auth:102 </code></pre><p>查看所有表名</p><pre><code>[root@ceph02 ~]# ceph-kvstore-tool leveldb /var/lib/ceph/mon/ceph-ceph02/store.db/ list |awk -F ':' '{print $1}'|uniq2018-03-27 17:24:59.608065 7f8580087040 1 leveldb: Recovering log #2682018-03-27 17:24:59.656403 7f8580087040 1 leveldb: Delete type=3 #2662018-03-27 17:24:59.656449 7f8580087040 1 leveldb: Delete type=0 #268authlogmmds_healthmds_metadatamdsmapmonitormonitor_storemonmaposd_metadataosdmappaxospgmappgmap_metapgmap_osdpgmap_pg </code></pre><p>从中我们发现了比较熟悉的各种map,mdsmap、monmap、osdmap、pgmap、auth等。而如上所讲,其中每一张表都有很多表项组成,接下来我们使用ceph-kvstore-tool来查看一下monmap这个表中某个表项中的数据。</p><pre><code>[root@ceph02 ~]# ceph-kvstore-tool leveldb /var/lib/ceph/mon/ceph-ceph02/store.db/ list| grep monmap2018-03-27 17:26:52.435840 7f6601faf040 1 leveldb: Recovering log #2702018-03-27 17:26:52.440832 7f6601faf040 1 leveldb: Delete type=3 #2692018-03-27 17:26:52.440874 7f6601faf040 1 leveldb: Delete type=0 #270monmap:1monmap:2monmap:first_committedmonmap:last_committedmonmap:latest</code></pre><p>从中,可以看出monmap目前有三个表项,看看monmap:1这项包含什么数据。其中指定leveldb数据库,get获取表monmap,中key为1 的值,将值out指定到文件monmap.1.txt中</p><pre><code>[root@ceph02 ~]# ceph-kvstore-tool leveldb /var/lib/ceph/mon/ceph-ceph02/store.db/ get monmap 1 out monmap.1.txt2018-03-27 17:30:25.096235 7f8181d4b040 1 leveldb: Recovering log #272(monmap, 1)2018-03-27 17:30:25.120278 7f8181d4b040 1 leveldb: Delete type=3 #2712018-03-27 17:30:25.120318 7f8181d4b040 1 leveldb: Delete type=0 #272[root@ceph02 ~]# file monmap.1.txt monmap.1.txt: DBase 3 data file (301334528 records)</code></pre><p>导出来的这个文件是一个DBase文件,是编译过的,需要使用工具来解析它。我们知道Ceph中很多数据都是经过序列化(encode)之后持久化的(这里有篇文章有相关的介绍<a href="https://www.ustack.com/blog/cephxuliehua/),所以要解析这些数据需要将它们反序列化(decode),Ceph提供了一个反序列化的工具ceph-dencode,关于ceph-dencoder的使用会在以后介绍。" 
target="_blank" rel="noopener">https://www.ustack.com/blog/cephxuliehua/),所以要解析这些数据需要将它们反序列化(decode),Ceph提供了一个反序列化的工具ceph-dencode,关于ceph-dencoder的使用会在以后介绍。</a></p><pre><code>[root@ceph02 ~]# ceph-dencoder import monmap.1.txt type MonMap decode dump_json{ "epoch": 1,"fsid": "f6110559-f0f1-4e14-9213-f29471329ee9","modified": "2017-02-09 15:34:21.701901","created": "2017-02-09 15:34:21.701901","mons": [ { "rank": 0, "name": "ceph02", "addr": "192.168.0.40:6789\/0" }]</code></pre><p>}<br>可以看到内容。如下查看monmap表中key为1 的size</p><pre><code>[root@ceph02 ~]# ceph-kvstore-tool leveldb /var/lib/ceph/mon/ceph-ceph02/store.db/ get-size monmap 12018-03-27 17:39:57.819864 7f2800190040 1 leveldb: Recovering log #276log - 0misc - 65908sst - 12156146total - 12222054total: 12222054estimated store size: 12222054(monmap,1) size 1922018-03-27 17:39:57.821447 7f2800190040 1 leveldb: Delete type=0 #2762018-03-27 17:39:57.821473 7f2800190040 1 leveldb: Delete type=3 #275</code></pre><p>如下查看osdmap信息</p><pre><code>[root@ceph02 ~]# ceph-kvstore-tool leveldb /var/lib/ceph/mon/ceph-ceph02/store.db/ list osdmap2018-03-27 17:43:46.711599 7fcf02a6a040 1 leveldb: Recovering log #280osdmap:1osdmap:10osdmap:11。。。osdmap:19。。。。osdmap:full_45osdmap:full_46osdmap:full_47osdmap:full_48</code></pre><p>查看full_48内容</p><pre><code>[root@ceph02 ~]# ceph-kvstore-tool leveldb /var/lib/ceph/mon/ceph-ceph02/store.db/ get osdmap full_48 out osdmap-full48.txt2018-03-27 17:44:34.623250 7f8bd795f040 1 leveldb: Recovering log #282(osdmap, full_48)2018-03-27 17:44:34.624751 7f8bd795f040 1 leveldb: Delete type=3 #2812018-03-27 17:44:34.624785 7f8bd795f040 1 leveldb: Delete type=0 #282 [root@ceph02 ~]# ceph-dencoder import osdmap-full48.txt type OSDMap decode dump_json{"epoch": 48,"fsid": "f6110559-f0f1-4e14-9213-f29471329ee9","created": "2017-02-09 15:34:22.284668","modified": "2018-03-27 16:55:39.569689","flags": "sortbitwise","cluster_snapshot": "","pool_max": 2,"max_osd": 5,"pools": [ { "pool": 0, "pool_name": "rbd", "flags": 1, "flags_names": "hashpspool", "type": 1, "size": 1, "min_size": 1, "crush_ruleset": 0, "object_hash": 2, "pg_num": 64, "pg_placement_num": 64, "crash_replay_interval": 0, "last_change": "22", "last_force_op_resend": "0", "auid": 0, "snap_mode": "selfmanaged", "snap_seq": 0, "snap_epoch": 0, "pool_snaps": [], "removed_snaps": "[]", "quota_max_bytes": 0, "quota_max_objects": 0, "tiers": [], "tier_of": -1, "read_tier": -1, "write_tier": -1, "cache_mode": "none", "target_max_bytes": 0, "target_max_objects": 0, "cache_target_dirty_ratio_micro": 0, "cache_target_dirty_high_ratio_micro": 0, "cache_target_full_ratio_micro": 0, "cache_min_flush_age": 0, "cache_min_evict_age": 0, "erasure_code_profile": "", "hit_set_params": { "type": "none" }, "hit_set_period": 0, "hit_set_count": 0, "use_gmt_hitset": true, "min_read_recency_for_promote": 0, "min_write_recency_for_promote": 0, "hit_set_grade_decay_rate": 0, "hit_set_search_last_n": 0, "grade_table": [], "stripe_width": 0, "expected_num_objects": 0, "fast_read": false, "options": {} }, { "pool": 1, "pool_name": "data", "flags": 1, "flags_names": "hashpspool", "type": 1, "size": 1, "min_size": 1, "crush_ruleset": 0, "object_hash": 2, "pg_num": 64, "pg_placement_num": 64, "crash_replay_interval": 45, "last_change": "29", "last_force_op_resend": "0", "auid": 0, "snap_mode": "selfmanaged", "snap_seq": 0, "snap_epoch": 0, "pool_snaps": [], "removed_snaps": "[]", "quota_max_bytes": 0, "quota_max_objects": 0, "tiers": [], "tier_of": -1, "read_tier": -1, "write_tier": -1, "cache_mode": 
"none", "target_max_bytes": 0, "target_max_objects": 0, "cache_target_dirty_ratio_micro": 400000, "cache_target_dirty_high_ratio_micro": 600000, "cache_target_full_ratio_micro": 800000, "cache_min_flush_age": 0, "cache_min_evict_age": 0, "erasure_code_profile": "", "hit_set_params": { "type": "none" }, "hit_set_period": 0, "hit_set_count": 0, "use_gmt_hitset": true, "min_read_recency_for_promote": 0, "min_write_recency_for_promote": 0, "hit_set_grade_decay_rate": 0, "hit_set_search_last_n": 0, "grade_table": [], "stripe_width": 0, "expected_num_objects": 0, "fast_read": false, "options": {} }, { "pool": 2, "pool_name": "metadata", "flags": 1, "flags_names": "hashpspool", "type": 1, "size": 1, "min_size": 1, "crush_ruleset": 0, "object_hash": 2, "pg_num": 64, "pg_placement_num": 64, "crash_replay_interval": 0, "last_change": "32", "last_force_op_resend": "0", "auid": 0, "snap_mode": "selfmanaged", "snap_seq": 0, "snap_epoch": 0, "pool_snaps": [], "removed_snaps": "[]", "quota_max_bytes": 0, "quota_max_objects": 0, "tiers": [], "tier_of": -1, "read_tier": -1, "write_tier": -1, "cache_mode": "none", "target_max_bytes": 0, "target_max_objects": 0, "cache_target_dirty_ratio_micro": 400000, "cache_target_dirty_high_ratio_micro": 600000, "cache_target_full_ratio_micro": 800000, "cache_min_flush_age": 0, "cache_min_evict_age": 0, "erasure_code_profile": "", "hit_set_params": { "type": "none" }, "hit_set_period": 0, "hit_set_count": 0, "use_gmt_hitset": true, "min_read_recency_for_promote": 0, "min_write_recency_for_promote": 0, "hit_set_grade_decay_rate": 0, "hit_set_search_last_n": 0, "grade_table": [], "stripe_width": 0, "expected_num_objects": 0, "fast_read": false, "options": {} }],"osds": [ { "osd": 3, "uuid": "74267031-f0a0-4bbc-9952-11726bdda80b", "up": 0, "in": 0, "weight": 0.000000, "primary_affinity": 1.000000, "last_clean_begin": 38, "last_clean_end": 40, "up_from": 43, "up_thru": 43, "down_at": 46, "lost_at": 0, "public_addr": "192.168.1.169:6805\/17244", "cluster_addr": "192.168.1.169:6806\/17244", "heartbeat_back_addr": "192.168.1.169:6807\/17244", "heartbeat_front_addr": "192.168.1.169:6808\/17244", "state": [ "autoout", "exists" ] }, { "osd": 4, "uuid": "9ca8254e-3036-49b7-a340-66831590b37b", "up": 1, "in": 1, "weight": 1.000000, "primary_affinity": 1.000000, "last_clean_begin": 36, "last_clean_end": 40, "up_from": 43, "up_thru": 43, "down_at": 42, "lost_at": 0, "public_addr": "192.168.1.169:6801\/17240", "cluster_addr": "192.168.1.169:6802\/17240", "heartbeat_back_addr": "192.168.1.169:6803\/17240", "heartbeat_front_addr": "192.168.1.169:6804\/17240", "state": [ "exists", "up" ] }],"osd_xinfo": [ { "osd": 3, "down_stamp": "2018-03-27 16:26:00.221545", "laggy_probability": 0.000000, "laggy_interval": 0, "features": 576460752032874495, "old_weight": 65536 }, { "osd": 4, "down_stamp": "2018-03-27 15:46:44.704581", "laggy_probability": 0.000000, "laggy_interval": 0, "features": 576460752032874495, "old_weight": 0 }],"pg_temp": [],"primary_temp": [],"blacklist": {},"erasure_code_profiles": { "default": { "k": "2", "m": "1", "plugin": "jerasure", "technique": "reed_sol_van" }}} </code></pre><p>后期具体使用过程结合实际情况进行。</p>]]></content>
<summary type="html">
<p>参考<a href="https://blog.csdn.net/scaleqiao/article/details/51946042" target="_blank" rel="noopener">https://blog.csdn.net/scaleqiao/article/details/51946042</a></p>
<p>大家都知道Ceph的很多数据比如PG log、Monitor的数据都存在kvstore里(leveldb或者RocksDB中),Ceph也提供了查看kvstore里数据的工具,它就是ceph-kvstore-tool。<br>
</summary>
<category term="ceph" scheme="http://idcat.cn/tags/ceph/"/>
</entry>
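<p>ceph-kvstore-tool and ceph-dencoder from the previous two posts combine nicely: the first pulls a value out of the monitor's leveldb store, the second makes it readable. The sketch below dumps the newest full osdmap; the mon id (ceph02) and paths are placeholders, and the monitor has to be stopped while its store is opened.</p><pre><code>#!/usr/bin/env bash
# Dump and decode the newest osdmap:full_N key from a monitor store.
set -e
MON=ceph02
MON_STORE=/var/lib/ceph/mon/ceph-$MON/store.db

systemctl stop ceph-mon@"$MON"            # the leveldb store must not be in use

# Pick the highest full_N key from the osdmap table.
LAST=$(ceph-kvstore-tool leveldb "$MON_STORE" list osdmap | grep full_ | sort -t_ -k2 -n | tail -1 | cut -d: -f2)

ceph-kvstore-tool leveldb "$MON_STORE" get osdmap "$LAST" out /tmp/"$LAST".bin
ceph-dencoder import /tmp/"$LAST".bin type OSDMap decode dump_json

systemctl start ceph-mon@"$MON"</code></pre>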
<entry>
<title>ceph集群替换mon节点ip地址</title>
<link href="http://idcat.cn/ceph%E9%9B%86%E7%BE%A4%E6%9B%BF%E6%8D%A2mon%E8%8A%82%E7%82%B9ip%E5%9C%B0%E5%9D%80.html"/>
<id>http://idcat.cn/ceph集群替换mon节点ip地址.html</id>
<published>2018-03-27T12:06:32.000Z</published>
<updated>2018-03-27T12:11:28.596Z</updated>
<content type="html"><![CDATA[<p>是由:原来单节点虚拟机集群,搬迁服务器,从A服务器复制虚拟机到B服务器上。然后启动后发现改节点IP地址已经变换,且无法恢复到原有IP地址,则必须使用新的ip地址作为mon 地址。<br><a id="more"></a><br>mon节点日志</p><pre><code>2018-03-27 14:37:13.691047 7f6dee6354c0 0 starting mon.ceph02 rank 0 at 192.168.0.40:6789/0 mon_data /var/lib/ceph/mon/ceph-ceph02 fsid f6110559-f0f1-4e14-9213-f29471329ee92018-03-27 14:37:13.691080 7f6dee6354c0 -1 accepter.accepter.bind unable to bind to 192.168.0.40:6789: (99) Cannot assign requested address2018-03-27 14:37:13.691088 7f6dee6354c0 -1 accepter.accepter.bind was unable to bind. Trying again in 5 seconds 2018-03-27 14:37:18.691584 7f6dee6354c0 -1 accepter.accepter.bind unable to bind to 192.168.0.40:6789: (99) Cannot assign requested address2018-03-27 14:37:18.691618 7f6dee6354c0 -1 accepter.accepter.bind was unable to bind. Trying again in 5 seconds 2018-03-27 14:37:23.692664 7f6dee6354c0 -1 accepter.accepter.bind unable to bind to 192.168.0.40:6789: (99) Cannot assign requested address2018-03-27 14:37:23.692699 7f6dee6354c0 -1 accepter.accepter.bind was unable to bind after 3 attempts: (99) Cannot assign requested address2018-03-27 14:37:23.692712 7f6dee6354c0 -1 unable to bind monitor to 192.168.0.40:6789/0</code></pre><p>1、获取当前集群monmap</p><pre><code>[root@ceph02 ~]# ceph mon getmap -o /tmp/momnmap2018-03-27 14:54:38.217464 7fa5f46ae700 0 -- :/2440828314 >> 192.168.0.40:6789/0 pipe(0x7fa5e8000c80 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa5e8001f90).fault2018-03-27 14:54:41.236171 7fa5f47af700 0 -- :/2440828314 >> 192.168.0.40:6789/0 pipe(0x7fa5e80052b0 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa5e8006570).fault</code></pre><p>单节点IP已经更改,也无法从其他mon节点获取monmap信息。下面换到从ceph.conf配置文件中获取monmap信息</p><pre><code>[root@ceph02 tmp]# monmaptool --create --generate -c /etc/ceph/ceph.conf /tmp/monmapmonmaptool: monmap file /tmp/monmapmonmaptool: set fsid to f6110559-f0f1-4e14-9213-f29471329ee9monmaptool: writing epoch 0 to /tmp/monmap (1 monitors)</code></pre><p>2、查看信息</p><pre><code>[root@ceph02 tmp]# monmaptool --print monmap monmaptool: monmap file monmapepoch 0fsid f6110559-f0f1-4e14-9213-f29471329ee9last_changed 2018-03-27 15:09:11.680108created 2018-03-27 15:09:11.6801080: 192.168.0.40:6789/0 mon.noname-a</code></pre><p>可以看到原有ip。</p><p>3、删除原有ip,若有多个mon ip 则执行多次需要替换的mon 节点</p><pre><code>[root@ceph02 tmp]# monmaptool --rm noname-a /tmp/monmap monmaptool: monmap file /tmp/monmapmonmaptool: removing noname-amonmaptool: writing epoch 0 to /tmp/monmap (0 monitors)</code></pre><p>4、原有的monitor信息删除后,添加新的monitor节点,多个节点则执行多次,如下:</p><pre><code>[root@ceph02 tmp]# monmaptool --add ceph02 192.168.1.169:6789 /tmp/monmap monmaptool: monmap file /tmp/monmapmonmaptool: writing epoch 0 to /tmp/monmap (1 monitors)[root@ceph02 tmp]# monmaptool --print monmap monmaptool: monmap file monmapepoch 0fsid f6110559-f0f1-4e14-9213-f29471329ee9last_changed 2018-03-27 15:09:11.680108created 2018-03-27 15:09:11.6801080: 192.168.1.169:6789/0 mon.ceph02</code></pre><p>将新的manmap文件拷贝到所有运行ceph-mon服务的机器上(若多个mon 节点)</p><p>5、停止mon服务以及修改ceph.conf文件,并同步到所有节点</p><pre><code>[root@ceph02 tmp]# systemctl stop ceph-mon@ceph02[global]fsid = f6110559-f0f1-4e14-9213-f29471329ee9mon_initial_members = ceph02mon_host = 192.168.1.169auth_cluster_required = cephxauth_service_required = cephxauth_client_required = cephx[mon.ceph02]host = ceph02mon addr = 192.168.1.169</code></pre><p>6、注入新的monmap 多个节点则在多个节点执行 -i 后面输入实际节点主机名</p><pre><code>[root@ceph02 tmp]# ceph-mon -i ceph02 --inject-monmap /tmp/monmap </code></pre><p>7、启动mon。</p><pre><code>[root@ceph02 tmp]# systemctl start 
ceph-mon@ceph02[root@ceph02 tmp]# systemctl status ceph-mon@ceph02● [email protected] - Ceph cluster monitor daemon Loaded: loaded (/usr/lib/systemd/system/[email protected]; enabled; vendor preset: disabled) Active: active (running) since Tue 2018-03-27 15:36:30 CST; 4s ago Main PID: 16664 (ceph-mon) CGroup: /system.slice/system-ceph\x2dmon.slice/[email protected] └─16664 /usr/bin/ceph-mon -f --cluster ceph --id ceph02 --setuser ceph --setgroup cephMar 27 15:36:30 ceph02 systemd[1]: Started Ceph cluster monitor daemon.Mar 27 15:36:30 ceph02 systemd[1]: Starting Ceph cluster monitor daemon...Mar 27 15:36:30 ceph02 ceph-mon[16664]: starting mon.ceph02 rank 0 at 192.168.1.169:6789/0 mon_data /var/lib/ceph/m...329ee9Mar 27 15:36:30 ceph02 ceph-mon[16664]: 2018-03-27 15:36:30.386583 7f71023634c0 -1 WARNING: 'mon addr' config optio...p fileMar 27 15:36:30 ceph02 ceph-mon[16664]: continuing with monmap configurationHint: Some lines were ellipsized, use -l to show in full.[root@ceph02 tmp]# ceph -scluster f6110559-f0f1-4e14-9213-f29471329ee9 health HEALTH_OK monmap e2: 1 mons at {ceph02=192.168.1.169:6789/0} election epoch 6, quorum 0 ceph02 fsmap e11: 1/1/1 up {0=ceph02=up:active} osdmap e40: 2 osds: 2 up, 2 in flags sortbitwise pgmap v1334: 192 pgs, 3 pools, 2068 bytes data, 20 objects 70932 kB used, 10148 MB / 10217 MB avail 192 active+clean</code></pre><p>最好重启下所有osd,否则可能会遇到mds degraded </p><p>mds日志</p><pre><code>2018-03-27 15:43:29.540226 7f7c96308180 0 set uid:gid to 167:167 (ceph:ceph)2018-03-27 15:43:29.540240 7f7c96308180 0 ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b), process ceph-mds, pid 169102018-03-27 15:43:29.540395 7f7c96308180 0 pidfile_write: ignore empty --pid-file2018-03-27 15:43:29.719622 7f7c902e3700 1 mds.ceph02 handle_mds_map standby2018-03-27 15:43:29.720884 7f7c902e3700 1 mds.0.14 handle_mds_map i am now mds.0.142018-03-27 15:43:29.720888 7f7c902e3700 1 mds.0.14 handle_mds_map state change up:boot --> up:replay2018-03-27 15:43:29.720895 7f7c902e3700 1 mds.0.14 replay_start2018-03-27 15:43:29.720899 7f7c902e3700 1 mds.0.14 recovery set is2018-03-27 15:43:29.720902 7f7c902e3700 1 mds.0.14 waiting for osdmap 41 (which blacklists prior instance)2018-03-27 15:43:32.771498 7f7c8b1d8700 0 -- 192.168.1.169:6800/16910 >> 192.168.0.40:6805/2661 pipe(0x7f7ca2014800 sd=17 :0 s=1 pgs=0 cs=0 l=1 c=0x7f7ca1fd6a80).fault2018-03-27 15:43:32.771585 7f7c8a0d5700 0 -- 192.168.1.169:6800/16910 >> 192.168.0.40:6801/2448 pipe(0x7f7ca20a4000 sd=18 :0 s=1 pgs=0 cs=0 l=1 c=0x7f7ca1fd6c00).fault</code></pre><p>以上,自己做个记录。</p>]]></content>
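<p>The monmap rewrite above condenses into a handful of commands. The sketch below gathers them for a single-mon cluster; the mon name and new address are the ones used in this post and should be substituted, and ceph.conf (mon_host / mon addr) must already point at the new IP as described in step 5.</p><pre><code>#!/usr/bin/env bash
# Rebuild and inject a monmap after the monitor's IP address changed.
set -e
MON=ceph02
NEW_IP=192.168.1.169

systemctl stop ceph-mon@"$MON"

# The cluster is unreachable, so build a monmap from ceph.conf
# instead of fetching it with 'ceph mon getmap'.
monmaptool --create --generate -c /etc/ceph/ceph.conf /tmp/monmap

monmaptool --rm noname-a /tmp/monmap                # drop the auto-generated noname-a entry
monmaptool --add "$MON" "$NEW_IP":6789 /tmp/monmap  # add the monitor back with its new address

ceph-mon -i "$MON" --inject-monmap /tmp/monmap
systemctl start ceph-mon@"$MON"</code></pre>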
<summary type="html">
<p>是由:原来单节点虚拟机集群,搬迁服务器,从A服务器复制虚拟机到B服务器上。然后启动后发现改节点IP地址已经变换,且无法恢复到原有IP地址,则必须使用新的ip地址作为mon 地址。<br>
</summary>
<category term="ceph" scheme="http://idcat.cn/tags/ceph/"/>
</entry>
</feed>