<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>kirago杂谈</title>
<link href="http://kiragoo.github.com/atom.xml" rel="self"/>
<link href="http://kiragoo.github.com/"/>
<updated>2023-04-24T09:43:06.768Z</updated>
<id>http://kiragoo.github.com/</id>
<author>
<name>kirago</name>
</author>
<generator uri="https://hexo.io/">Hexo</generator>
<entry>
<title>Understanding the OCI Image Format Spec</title>
<link href="http://kiragoo.github.com/archives/16cb3562.html"/>
<id>http://kiragoo.github.com/archives/16cb3562.html</id>
<published>2023-04-24T02:03:14.000Z</published>
<updated>2023-04-24T09:43:06.768Z</updated>
<content type="html"><![CDATA[<p>In the <code>OCI</code> standard, the <code>OCI Image</code> specification describes an image as an <code>image manifest</code>, an <code>image index</code> (optional), a set of <code>filesystem layers</code>, and a <code>configuration</code>. Together these make it possible to build, transport, and prepare an <code>image</code> to run in a uniform way across heterogeneous environments.</p><p>At a high level, the <code>image manifest</code> references content-addressable <code>filesystem layers</code> that are unpacked to form the runnable filesystem, while the <code>image configuration</code> holds information such as application arguments and environment variables.<br>The <code>image index</code> points to a set of <code>manifests</code> and <code>descriptors</code>, typically one image per architecture or other distinguishing attribute.</p><p><img src="https://github.com/opencontainers/image-spec/raw/main/img/build-diagram.png" alt="OCI-Overview"></p><h2 id="Specification-解读"><a href="#Specification-解读" class="headerlink" title="Specification 解读"></a>The <code>Specification</code> explained</h2><h3 id="OCI-Image-Media-Types"><a href="#OCI-Image-Media-Types" class="headerlink" title="OCI Image Media Types"></a><code>OCI Image Media Types</code></h3><p><code>OCI Image Media Types</code> defines the following formats:</p><ul><li><code>application/vnd.oci.descriptor.v1+json</code>: <code>Content Descriptor</code></li><li><code>application/vnd.oci.layout.header.v1+json</code>: <code>OCI Layout</code></li><li><code>application/vnd.oci.image.index.v1+json</code>: <code>Image Index</code></li><li><code>application/vnd.oci.image.manifest.v1+json</code>: <code>Image manifest</code></li><li><code>application/vnd.oci.image.config.v1+json</code>: <code>Image config</code></li><li><code>application/vnd.oci.image.layer.v1.tar</code>: <code>"Layer"</code>, as an uncompressed <code>tar</code> archive</li><li><code>application/vnd.oci.image.layer.v1.tar+gzip</code>: <code>"Layer"</code>, as a <code>tar</code> archive compressed with <code>gzip</code></li><li><code>application/vnd.oci.image.layer.v1.tar+zstd</code>: <code>"Layer"</code>, as a <code>tar</code> archive compressed with <code>zstd</code></li><li><code>application/vnd.oci.scratch.v1+json</code>: <code>Scratch blob</code></li><li><code>application/vnd.oci.artifact.manifest.v1+json</code>: <code>Artifact manifest</code></li></ul><p>The following <code>media types</code>
are deprecated and should not be used in future versions:</p><ul><li><code>application/vnd.oci.image.layer.nondistributable.v1.tar</code></li><li><code>application/vnd.oci.image.layer.nondistributable.v1.tar+gzip</code></li><li><code>application/vnd.oci.image.layer.nondistributable.v1.tar+zstd</code></li></ul><p>For the detailed compatibility between <code>media types</code>, see the <a href="https://github.com/opencontainers/image-spec/blob/main/media-types.md#compatibility-matrix"><code>Compatibility Matrix</code></a>.</p><h4 id="各-Media-Type-关联关系"><a href="#各-Media-Type-关联关系" class="headerlink" title="各 Media Type 关联关系"></a>How the <code>Media Types</code> relate to each other</h4><p><img src="https://kblogs.oss-cn-beijing.aliyuncs.com/blogimgs/media-types.png" alt="media type reference"></p><p>The relationships are expressed through <code>Descriptors</code>. The <code>image-index</code> can be thought of as a <strong><code>fat-manifest</code></strong>: it is the entry point that references the <code>image manifests</code> for each target platform. Each <code>image manifest</code> is in turn the entry point that references a specific <code>image configuration</code> and its <code>layers</code>.</p><h2 id="spec-中上层组件描述"><a href="#spec-中上层组件描述" class="headerlink" title="spec 中上层组件描述"></a>High-level components in the <code>spec</code></h2><h3 id="Image-Manifest"><a href="#Image-Manifest" class="headerlink" title="Image Manifest"></a><code>Image Manifest</code></h3><p>The image manifest is a document that describes the components making up a container image. It serves three goals:</p><ol><li>Content-addressable images: the image model lets the image configuration and its components be located by their hash</li><li>Multi-architecture images, maintained through a <strong><code>fat-manifest</code></strong> that references the image for each platform</li><li>Translation to the <code>OCI</code> Runtime Specification</li></ol><h4 id="属性介绍"><a href="#属性介绍" class="headerlink" title="属性介绍"></a>Properties</h4><p>Whereas the <code>image index</code> mainly describes a set of images across platforms, the <code>image manifest</code> holds the configuration and the set of <code>layers</code> for a single image on a specific architecture and <code>operating system</code>.</p><ul><li><code>schemaVersion</code> | <em>int</em> | REQUIRED</li><li><code>mediaType</code> | <em>string</em> | REQUIRED</li><li><code>artifactType</code> | <em>string</em> | OPTIONAL</li><li><code>config</code> | <em><code>descriptor</code></em> | REQUIRED<ul><li><code>mediaType</code> | <em>string</em></li></ul></li><li><code>layers</code> | <em>array of 
objects</em><br>Each element must be a <code>descriptor</code> and, for portability, the array should contain at least one entry.</li><li><code>subject</code> | <em><code>descriptor</code></em> | OPTIONAL</li><li><code>annotations</code> | <em>string-string-map</em> | OPTIONAL</li></ul><h4 id="Image-Manifest-示例"><a href="#Image-Manifest-示例" class="headerlink" title="Image Manifest 示例"></a><code>Image Manifest</code> example</h4><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br></pre></td><td class="code"><pre><span class="line">{</span><br><span class="line"> <span class="attr">"schemaVersion"</span>: <span class="number">2</span>,</span><br><span class="line"> <span class="attr">"mediaType"</span>: <span class="string">"application/vnd.oci.image.manifest.v1+json"</span>,</span><br><span class="line"> <span class="attr">"config"</span>: {</span><br><span class="line"> <span class="attr">"mediaType"</span>: <span 
class="string">"application/vnd.oci.image.config.v1+json"</span>,</span><br><span class="line"> <span class="attr">"size"</span>: <span class="number">7023</span>,</span><br><span class="line"> <span class="attr">"digest"</span>: <span class="string">"sha256:b5b2b2c507a0944348e0303114d8d93aaaa081732b86451d9bce1f432a537bc7"</span></span><br><span class="line"> },</span><br><span class="line"> <span class="attr">"layers"</span>: [</span><br><span class="line"> {</span><br><span class="line"> <span class="attr">"mediaType"</span>: <span class="string">"application/vnd.oci.image.layer.v1.tar+gzip"</span>,</span><br><span class="line"> <span class="attr">"size"</span>: <span class="number">32654</span>,</span><br><span class="line"> <span class="attr">"digest"</span>: <span class="string">"sha256:9834876dcfb05cb167a5c24953eba58c4ac89b1adf57f28f2f9d09af107ee8f0"</span></span><br><span class="line"> },</span><br><span class="line"> {</span><br><span class="line"> <span class="attr">"mediaType"</span>: <span class="string">"application/vnd.oci.image.layer.v1.tar+gzip"</span>,</span><br><span class="line"> <span class="attr">"size"</span>: <span class="number">16724</span>,</span><br><span class="line"> <span class="attr">"digest"</span>: <span class="string">"sha256:3c3a4604a545cdc127456d94e421cd355bca5b528f4a9c1905b15da2eb4a4c6b"</span></span><br><span class="line"> },</span><br><span class="line"> {</span><br><span class="line"> <span class="attr">"mediaType"</span>: <span class="string">"application/vnd.oci.image.layer.v1.tar+gzip"</span>,</span><br><span class="line"> <span class="attr">"size"</span>: <span class="number">73109</span>,</span><br><span class="line"> <span class="attr">"digest"</span>: <span class="string">"sha256:ec4b8955958665577945c89419d1af06b5f7636b4ac3da7f12184802ad867736"</span></span><br><span class="line"> }</span><br><span class="line"> ],</span><br><span class="line"> <span class="attr">"subject"</span>: {</span><br><span class="line"> <span 
class="attr">"mediaType"</span>: <span class="string">"application/vnd.oci.image.manifest.v1+json"</span>,</span><br><span class="line"> <span class="attr">"size"</span>: <span class="number">7682</span>,</span><br><span class="line"> <span class="attr">"digest"</span>: <span class="string">"sha256:5b0bcabd1ed22e9fb1310cf6c2dec7cdef19f0ad69efa1f392e94a4333501270"</span></span><br><span class="line"> },</span><br><span class="line"> <span class="attr">"annotations"</span>: {</span><br><span class="line"> <span class="attr">"com.example.key1"</span>: <span class="string">"value1"</span>,</span><br><span class="line"> <span class="attr">"com.example.key2"</span>: <span class="string">"value2"</span></span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p><strong>Note: the <code>mediaType</code> must correspond to the content identified by the <code>digest</code>. For example, when the <code>digest</code> is the <code>ScratchDigestSHA256</code> value, the media type MUST be <code>application/vnd.oci.scratch.v1+json</code>.</strong></p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">{</span><br><span class="line"> <span class="attr">"mediaType"</span>: <span class="string">"application/vnd.oci.scratch.v1+json"</span>,</span><br><span class="line"> <span class="attr">"size"</span>: <span class="number">2</span>,</span><br><span class="line"> <span class="attr">"digest"</span>: <span class="string">"sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a"</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure><h3 id="OCI-Image-Index-解读"><a href="#OCI-Image-Index-解读" class="headerlink" title="OCI Image Index 解读"></a><code>OCI Image Index</code> explained</h3><p>The image index is a higher-level manifest that points to specific image manifests.</p><h4 id="具体属性解读"><a href="#具体属性解读" 
class="headerlink" title="具体属性解读"></a>Properties in detail</h4><ul><li><code>schemaVersion</code> | <em>int</em> | REQUIRED</li><li><code>mediaType</code> | <em>string</em></li><li><code>manifests</code> | <em>array of objects</em> | REQUIRED<br>Each object in <code>manifests</code> carries the <code>descriptor properties</code>, together with the following properties:<ul><li><code>mediaType</code> | <em>string</em></li><li><code>platform</code> | <em>object</em> | OPTIONAL<br>When <code>platform</code> is provided, it declares the following runtime requirements:<ul><li><code>architecture</code> | <em>string</em> | REQUIRED</li><li><code>os</code> | <em>string</em> | REQUIRED</li><li><code>os.version</code> | <em>string</em> | OPTIONAL</li><li><code>os.features</code> | <em>array of strings</em> | OPTIONAL</li><li><code>variant</code> | <em>string</em> | OPTIONAL</li><li><code>features</code> | <em>array of strings</em></li></ul></li></ul></li><li><code>annotations</code> | <em>string-string-map</em> | OPTIONAL</li></ul><h4 id="Platform-Variants"><a href="#Platform-Variants" class="headerlink" title="Platform Variants"></a><code>Platform Variants</code></h4><table><thead><tr><th>ISA/ABI</th><th>architecture</th><th>variant</th></tr></thead><tbody><tr><td>ARM 32-bit, v6</td><td>arm</td><td>v6</td></tr><tr><td>ARM 32-bit, v7</td><td>arm</td><td>v7</td></tr><tr><td>ARM 32-bit, v8</td><td>arm</td><td>v8</td></tr><tr><td>ARM 64-bit, v8</td><td>arm64</td><td>v8</td></tr></tbody></table><h4 id="OCI-Image-Index-示例"><a href="#OCI-Image-Index-示例" class="headerlink" title="OCI Image Index 示例"></a><code>OCI Image Index</code> examples</h4><ul><li>simple image index with two platforms</li></ul><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span 
class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line">{</span><br><span class="line"> <span class="attr">"schemaVersion"</span>: <span class="number">2</span>,</span><br><span class="line"> <span class="attr">"mediaType"</span>: <span class="string">"application/vnd.oci.image.index.v1+json"</span>,</span><br><span class="line"> <span class="attr">"manifests"</span>: [</span><br><span class="line"> {</span><br><span class="line"> <span class="attr">"mediaType"</span>: <span class="string">"application/vnd.oci.image.manifest.v1+json"</span>,</span><br><span class="line"> <span class="attr">"size"</span>: <span class="number">7143</span>,</span><br><span class="line"> <span class="attr">"digest"</span>: <span class="string">"sha256:e692418e4cbaf90ca69d05a66403747baa33ee08806650b51fab815ad7fc331f"</span>,</span><br><span class="line"> <span class="attr">"platform"</span>: {</span><br><span class="line"> <span class="attr">"architecture"</span>: <span class="string">"ppc64le"</span>,</span><br><span class="line"> <span class="attr">"os"</span>: <span class="string">"linux"</span></span><br><span class="line"> }</span><br><span class="line"> },</span><br><span class="line"> {</span><br><span class="line"> <span class="attr">"mediaType"</span>: <span class="string">"application/vnd.oci.image.manifest.v1+json"</span>,</span><br><span class="line"> <span class="attr">"size"</span>: <span class="number">7682</span>,</span><br><span class="line"> 
<span class="attr">"digest"</span>: <span class="string">"sha256:5b0bcabd1ed22e9fb1310cf6c2dec7cdef19f0ad69efa1f392e94a4333501270"</span>,</span><br><span class="line"> <span class="attr">"platform"</span>: {</span><br><span class="line"> <span class="attr">"architecture"</span>: <span class="string">"amd64"</span>,</span><br><span class="line"> <span class="attr">"os"</span>: <span class="string">"linux"</span></span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> ],</span><br><span class="line"> <span class="attr">"annotations"</span>: {</span><br><span class="line"> <span class="attr">"com.example.key1"</span>: <span class="string">"value1"</span>,</span><br><span class="line"> <span class="attr">"com.example.key2"</span>: <span class="string">"value2"</span></span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><ul><li>image index with multiple media types</li></ul><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line">{</span><br><span class="line"> <span class="attr">"schemaVersion"</span>: <span class="number">2</span>,</span><br><span class="line"> <span 
class="attr">"mediaType"</span>: <span class="string">"application/vnd.oci.image.index.v1+json"</span>,</span><br><span class="line"> <span class="attr">"manifests"</span>: [</span><br><span class="line"> {</span><br><span class="line"> <span class="attr">"mediaType"</span>: <span class="string">"application/vnd.oci.image.manifest.v1+json"</span>,</span><br><span class="line"> <span class="attr">"size"</span>: <span class="number">7143</span>,</span><br><span class="line"> <span class="attr">"digest"</span>: <span class="string">"sha256:e692418e4cbaf90ca69d05a66403747baa33ee08806650b51fab815ad7fc331f"</span>,</span><br><span class="line"> <span class="attr">"platform"</span>: {</span><br><span class="line"> <span class="attr">"architecture"</span>: <span class="string">"ppc64le"</span>,</span><br><span class="line"> <span class="attr">"os"</span>: <span class="string">"linux"</span></span><br><span class="line"> }</span><br><span class="line"> },</span><br><span class="line"> {</span><br><span class="line"> <span class="attr">"mediaType"</span>: <span class="string">"application/vnd.oci.image.index.v1+json"</span>,</span><br><span class="line"> <span class="attr">"size"</span>: <span class="number">7682</span>,</span><br><span class="line"> <span class="attr">"digest"</span>: <span class="string">"sha256:601570aaff1b68a61eb9c85b8beca1644e698003e0cdb5bce960f193d265a8b7"</span></span><br><span class="line"> }</span><br><span class="line"> ],</span><br><span class="line"> <span class="attr">"annotations"</span>: {</span><br><span class="line"> <span class="attr">"com.example.key1"</span>: <span class="string">"value1"</span>,</span><br><span class="line"> <span class="attr">"com.example.key2"</span>: <span class="string">"value2"</span></span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h3 id="OCI-Image-Layout-Specification-解读"><a href="#OCI-Image-Layout-Specification-解读" class="headerlink" title="OCI Image 
Layout Specification 解读"></a>OCI Image Layout Specification explained</h3><p>Given an image layout and a ref, a tool can create an OCI Runtime Specification bundle by:</p><ul><li>Looking up the manifest through the image index</li><li>Applying the filesystem layers listed in the manifest</li><li>Converting the image configuration into an OCI Runtime Specification <code>config.json</code></li></ul><h4 id="内容解读"><a href="#内容解读" class="headerlink" title="内容解读"></a>Contents</h4><p>An image layout contains the following:</p><ul><li><code>blobs</code> directory:<ul><li>contains content-addressable blobs</li><li>a blob has no schema</li><li>the directory must exist, but may be empty</li></ul></li><li><code>oci-layout</code> file<ul><li>must exist</li><li>must be a JSON object</li><li>must contain an <code>imageLayoutVersion</code> field</li><li>may contain additional fields</li></ul></li><li><code>index.json</code> file<ul><li>must exist</li><li>must be an image index object</li></ul></li></ul><h4 id="示例"><a href="#示例" class="headerlink" title="示例"></a>Example</h4><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">$ <span class="built_in">cd</span> example.com/app/</span><br><span class="line">$ find . 
-<span class="built_in">type</span> f</span><br><span class="line">./index.json</span><br><span class="line">./oci-layout</span><br><span class="line">./blobs/sha256/3588d02542238316759cbf24502f4344ffcc8a60c803870022f335d1390c13b4</span><br><span class="line">./blobs/sha256/4b0bc1c4050b03c95ef2a8e36e25feac42fd31283e8c30b3ee5df6b043155d3c</span><br><span class="line">./blobs/sha256/7968321274dc6b6171697c33df7815310468e694ac5be0ec03ff053bb135e768</span><br></pre></td></tr></table></figure><h4 id="Blogs-解读"><a href="#Blogs-解读" class="headerlink" title="Blogs 解读"></a>Blobs explained</h4><ul><li>The <code>blobs</code> directory is organized into one subdirectory per hash algorithm, and the entries in each subdirectory hold the actual content.</li><li><code>blobs/&lt;alg&gt;/&lt;encoded&gt;</code> must match the digest <code>&lt;alg&gt;:&lt;encoded&gt;</code>. For example, the content <code>blobs/sha256/da39a3ee5e6b4b0d3255bfef95601890afd80709</code> must match the digest <code>sha256:da39a3ee5e6b4b0d3255bfef95601890afd80709</code>.</li></ul><h5 id="示例-1"><a href="#示例-1" class="headerlink" title="示例"></a>Example</h5><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line">$ cat ./blobs/sha256/9b97579de92b1c195b85bb42a11011378ee549b02d7fe9c17bf2a6b35d5cb079 | jq</span><br><span class="line">{</span><br><span class="line"> <span class="attr">"schemaVersion"</span>: <span class="number">2</span>,</span><br><span class="line"> <span class="attr">"manifests"</span>: [</span><br><span class="line"> {</span><br><span class="line"> <span class="attr">"mediaType"</span>: <span 
class="string">"application/vnd.oci.image.manifest.v1+json"</span>,</span><br><span class="line"> <span class="attr">"size"</span>: <span class="number">7143</span>,</span><br><span class="line"> <span class="attr">"digest"</span>: <span class="string">"sha256:afff3924849e458c5ef237db5f89539274d5e609db5db935ed3959c90f1f2d51"</span>,</span><br><span class="line"> <span class="attr">"platform"</span>: {</span><br><span class="line"> <span class="attr">"architecture"</span>: <span class="string">"ppc64le"</span>,</span><br><span class="line"> <span class="attr">"os"</span>: <span class="string">"linux"</span></span><br><span class="line"> }</span><br><span class="line"> },</span><br><span class="line">...</span><br><span class="line">}</span><br></pre></td></tr></table></figure><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line">$ cat ./blobs/sha256/afff3924849e458c5ef237db5f89539274d5e609db5db935ed3959c90f1f2d51 | jq</span><br><span class="line">{</span><br><span class="line"> <span class="attr">"schemaVersion"</span>: <span class="number">2</span>,</span><br><span class="line"> <span class="attr">"config"</span>: {</span><br><span class="line"> <span class="attr">"mediaType"</span>: <span class="string">"application/vnd.oci.image.config.v1+json"</span>,</span><br><span class="line"> <span class="attr">"size"</span>: <span class="number">7023</span>,</span><br><span class="line"> <span 
class="attr">"digest"</span>: <span class="string">"sha256:5b0bcabd1ed22e9fb1310cf6c2dec7cdef19f0ad69efa1f392e94a4333501270"</span></span><br><span class="line"> },</span><br><span class="line"> <span class="attr">"layers"</span>: [</span><br><span class="line"> {</span><br><span class="line"> <span class="attr">"mediaType"</span>: <span class="string">"application/vnd.oci.image.layer.v1.tar+gzip"</span>,</span><br><span class="line"> <span class="attr">"size"</span>: <span class="number">32654</span>,</span><br><span class="line"> <span class="attr">"digest"</span>: <span class="string">"sha256:9834876dcfb05cb167a5c24953eba58c4ac89b1adf57f28f2f9d09af107ee8f0"</span></span><br><span class="line"> },</span><br><span class="line">...</span><br><span class="line">}</span><br></pre></td></tr></table></figure><figure class="highlight"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line">$ cat ./blobs/sha256/5b0bcabd1ed22e9fb1310cf6c2dec7cdef19f0ad69efa1f392e94a4333501270 | jq</span><br><span class="line">{</span><br><span class="line"> <span class="attr">"architecture"</span>: <span class="string">"amd64"</span>,</span><br><span class="line"> <span class="attr">"author"</span>: <span class="string">"Alyssa P. 
Hacker <[email protected]>"</span>,</span><br><span class="line"> <span class="attr">"config"</span>: {</span><br><span class="line"> <span class="attr">"Hostname"</span>: <span class="string">"8dfe43d80430"</span>,</span><br><span class="line"> <span class="attr">"Domainname"</span>: <span class="string">""</span>,</span><br><span class="line"> <span class="attr">"User"</span>: <span class="string">""</span>,</span><br><span class="line"> <span class="attr">"AttachStdin"</span>: <span class="literal">false</span>,</span><br><span class="line"> <span class="attr">"AttachStdout"</span>: <span class="literal">false</span>,</span><br><span class="line"> <span class="attr">"AttachStderr"</span>: <span class="literal">false</span>,</span><br><span class="line"> <span class="attr">"Tty"</span>: <span class="literal">false</span>,</span><br><span class="line"> <span class="attr">"OpenStdin"</span>: <span class="literal">false</span>,</span><br><span class="line"> <span class="attr">"StdinOnce"</span>: <span class="literal">false</span>,</span><br><span class="line"> <span class="attr">"Env"</span>: [</span><br><span class="line"> <span class="string">"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"</span></span><br><span class="line"> ],</span><br><span class="line"> <span class="attr">"Cmd"</span>: <span class="literal">null</span>,</span><br><span class="line"> <span class="attr">"Image"</span>: <span class="string">"sha256:6986ae504bbf843512d680cc959484452034965db15f75ee8bdd1b107f61500b"</span>,</span><br><span class="line">...</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h4 id="oci-layout-file"><a href="#oci-layout-file" class="headerlink" title="oci-layout file"></a>oci-layout file</h4><h5 id="示例-2"><a href="#示例-2" class="headerlink" title="示例"></a>Example</h5><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span 
class="line">3</span><br></pre></td><td class="code"><pre><span class="line">{</span><br><span class="line"> <span class="attr">"imageLayoutVersion"</span>: <span class="string">"1.0.0"</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure><h4 id="index-json-file"><a href="#index-json-file" class="headerlink" title="index.json file"></a>index.json file</h4><h5 id="示例-3"><a href="#示例-3" class="headerlink" title="示例"></a>Example</h5><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br></pre></td><td class="code"><pre><span class="line">{</span><br><span class="line"> <span class="attr">"schemaVersion"</span>: <span class="number">2</span>,</span><br><span class="line"> <span class="attr">"mediaType"</span>: <span 
class="string">"application/vnd.oci.image.index.v1+json"</span>,</span><br><span class="line"> <span class="attr">"manifests"</span>: [</span><br><span class="line"> {</span><br><span class="line"> <span class="attr">"mediaType"</span>: <span class="string">"application/vnd.oci.image.index.v1+json"</span>,</span><br><span class="line"> <span class="attr">"size"</span>: <span class="number">7143</span>,</span><br><span class="line"> <span class="attr">"digest"</span>: <span class="string">"sha256:0228f90e926ba6b96e4f39cf294b2586d38fbb5a1e385c05cd1ee40ea54fe7fd"</span>,</span><br><span class="line"> <span class="attr">"annotations"</span>: {</span><br><span class="line"> <span class="attr">"org.opencontainers.image.ref.name"</span>: <span class="string">"stable-release"</span></span><br><span class="line"> }</span><br><span class="line"> },</span><br><span class="line"> {</span><br><span class="line"> <span class="attr">"mediaType"</span>: <span class="string">"application/vnd.oci.image.manifest.v1+json"</span>,</span><br><span class="line"> <span class="attr">"size"</span>: <span class="number">7143</span>,</span><br><span class="line"> <span class="attr">"digest"</span>: <span class="string">"sha256:e692418e4cbaf90ca69d05a66403747baa33ee08806650b51fab815ad7fc331f"</span>,</span><br><span class="line"> <span class="attr">"platform"</span>: {</span><br><span class="line"> <span class="attr">"architecture"</span>: <span class="string">"ppc64le"</span>,</span><br><span class="line"> <span class="attr">"os"</span>: <span class="string">"linux"</span></span><br><span class="line"> },</span><br><span class="line"> <span class="attr">"annotations"</span>: {</span><br><span class="line"> <span class="attr">"org.opencontainers.image.ref.name"</span>: <span class="string">"v1.0"</span></span><br><span class="line"> }</span><br><span class="line"> },</span><br><span class="line"> {</span><br><span class="line"> <span class="attr">"mediaType"</span>: <span 
class="string">"application/xml"</span>,</span><br><span class="line"> <span class="attr">"size"</span>: <span class="number">7143</span>,</span><br><span class="line"> <span class="attr">"digest"</span>: <span class="string">"sha256:b3d63d132d21c3ff4c35a061adf23cf43da8ae054247e32faa95494d904a007e"</span>,</span><br><span class="line"> <span class="attr">"annotations"</span>: {</span><br><span class="line"> <span class="attr">"org.freedesktop.specifications.metainfo.version"</span>: <span class="string">"1.0"</span>,</span><br><span class="line"> <span class="attr">"org.freedesktop.specifications.metainfo.type"</span>: <span class="string">"AppStream"</span></span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> ],</span><br><span class="line"> <span class="attr">"annotations"</span>: {</span><br><span class="line"> <span class="attr">"com.example.index.revision"</span>: <span class="string">"r124356"</span></span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h3 id="Image-Layer-Filesystem-Changeset"><a href="#Image-Layer-Filesystem-Changeset" class="headerlink" title="Image Layer Filesystem Changeset"></a>Image Layer Filesystem Changeset</h3><h4 id="gzip-Media-Types"><a href="#gzip-Media-Types" class="headerlink" title="+gzip Media Types"></a>+gzip Media Types</h4><ul><li><code>application/vnd.oci.image.layer.v1.tar+gzip</code></li><li><code>application/vnd.oci.image.layer.nondistributable.v1.tar+gzip</code></li></ul><h4 id="zstd-Media-Types"><a href="#zstd-Media-Types" class="headerlink" title="+zstd Media Types"></a>+zstd Media Types</h4><ul><li><code>application/vnd.oci.image.layer.v1.tar+zstd</code></li><li><code>application/vnd.oci.image.layer.nondistributable.v1.tar+zstd</code></li></ul><h4 id="Change-Types"><a href="#Change-Types" class="headerlink" title="Change Types"></a>Change Types</h4><p>Change 
types:</p><ul><li>Additions</li><li>Modifications</li><li>Removals</li></ul><p>Additions and Modifications are represented the same way in the changeset, while Removals are represented by “whiteout” entries.</p><h4 id="File-Types"><a href="#File-Types" class="headerlink" title="File Types"></a>File Types</h4><ul><li>regular files</li><li>directories</li><li>sockets</li><li>symbolic links</li><li>block devices</li><li>character devices</li><li>FIFOs</li></ul><h4 id="File-Attributes"><a href="#File-Attributes" class="headerlink" title="File Attributes"></a>File Attributes</h4><p>Additions and Modifications must include the following file attributes:</p><ul><li>Modification Time (<code>mtime</code>)</li><li>User ID (<code>uid</code>)<ul><li>User Name (<code>uname</code>)</li></ul></li><li>Group ID (<code>gid</code>)<ul><li>Group Name (<code>gname</code>)</li></ul></li><li>Mode (<code>mode</code>)</li><li>Extended Attributes (<code>xattrs</code>)</li><li>Symlink reference (<code>linkname</code> + symbolic link type)</li><li>Hardlink reference (<code>linkname</code>)</li></ul><h4 id="Creating"><a href="#Creating" class="headerlink" title="Creating"></a>Creating</h4><h4 id="Initial-Root-Filesystem"><a href="#Initial-Root-Filesystem" class="headerlink" title="Initial Root Filesystem"></a>Initial Root Filesystem</h4><p>The initial root filesystem is the base, or parent, layer. 
The example below is for demonstration only: its root filesystem starts out as an empty directory.<br><code>rootfs-c9d-v1/</code></p><h4 id="Populate-Initial-Filesystem"><a href="#Populate-Initial-Filesystem" class="headerlink" title="Populate Initial Filesystem"></a>Populate Initial Filesystem</h4><p>Files and directories are then created:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">rootfs-c9d-v1/</span><br><span class="line"> etc/</span><br><span class="line"> my-app-config</span><br><span class="line"> bin/</span><br><span class="line"> 
my-app-binary</span><br><span class="line"> my-app-tools</span><br></pre></td></tr></table></figure><p>The <code>rootfs-c9d-v1/</code> directory is then archived as a tar file containing the following entries:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">./</span><br><span class="line">./etc/</span><br><span class="line">./etc/my-app-config</span><br><span class="line">./bin/</span><br><span class="line">./bin/my-app-binary</span><br><span class="line">./bin/my-app-tools</span><br></pre></td></tr></table></figure><h4 id="Populate-a-Comparison-Filesystem"><a href="#Populate-a-Comparison-Filesystem" class="headerlink" title="Populate a Comparison Filesystem"></a>Populate a Comparison Filesystem</h4><p>Create a new directory and initialize it with the contents of <code>rootfs-c9d-v1/</code>. Example commands that preserve file attributes while making this copy:</p><ul><li>cp(1): <code>cp -a rootfs-c9d-v1/ rootfs-c9d-v1.s1/</code></li><li>rsync(1): <code>rsync -aHAX rootfs-c9d-v1/ rootfs-c9d-v1.s1/</code></li><li>tar(1): <code>mkdir rootfs-c9d-v1.s1 && tar --acls --xattrs -C rootfs-c9d-v1/ -c . 
| tar -C rootfs-c9d-v1.s1/ --acls --xattrs -x</code> (including <code>--selinux</code> where supported)</li></ul><p>Any change made to the snapshot must not change or affect the directory it was copied from.</p><p><code>rootfs-c9d-v1.s1</code> is now an identical snapshot of <code>rootfs-c9d-v1</code>, ready for updates and alterations.</p><p><strong>Note:</strong> <em>a copy-on-write or union filesystem can make directory snapshots efficiently</em></p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">rootfs-c9d-v1.s1/</span><br><span class="line"> etc/</span><br><span class="line"> my-app-config</span><br><span class="line"> bin/</span><br><span class="line"> my-app-binary</span><br><span class="line"> my-app-tools</span><br></pre></td></tr></table></figure><p>In this example, a default configuration file is added under <code>/etc/my-app.d</code> and the existing configuration file is removed. The <code>./bin/my-app-tools</code> binary is also changed (in its attributes or its content) to illustrate a modification. 
After these changes the layout is:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">rootfs-c9d-v1.s1/</span><br><span class="line"> etc/</span><br><span class="line"> my-app.d/</span><br><span class="line"> default.cfg</span><br><span class="line"> bin/</span><br><span class="line"> my-app-binary</span><br><span class="line"> my-app-tools</span><br></pre></td></tr></table></figure><h4 id="Determining-Changes"><a href="#Determining-Changes" class="headerlink" title="Determining Changes"></a>Determining Changes</h4><p>When the two directories are compared, the relative root is the top-level directory. The comparison looks for everything that was added, modified, or removed.</p><p>For the example above, the comparison yields:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span 
class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">Added: /etc/my-app.d/</span><br><span class="line">Added: /etc/my-app.d/default.cfg</span><br><span class="line">Modified: /bin/my-app-tools</span><br><span class="line">Deleted: /etc/my-app-config</span><br></pre></td></tr></table></figure><h4 id="Representing-Changes"><a href="#Representing-Changes" class="headerlink" title="Representing Changes"></a>Representing Changes</h4><p>A tar archive is then created that contains only this changeset.</p><p>For <code>rootfs-c9d-v1.s1</code> it has the following entries:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">./etc/my-app.d/</span><br><span class="line">./etc/my-app.d/default.cfg</span><br><span class="line">./bin/my-app-tools</span><br><span class="line">./etc/.wh.my-app-config</span><br></pre></td></tr></table></figure><p>Note the <code>.wh.</code> prefix on the last entry: it is a whiteout, signalling that <code>./etc/my-app-config</code> must be removed when the changeset is applied.</p><h3 id="OCI-Image-Configuration"><a href="#OCI-Image-Configuration" class="headerlink" title="OCI Image Configuration"></a>OCI Image Configuration</h3><p>An OCI <em>image</em> is an ordered collection of root filesystem changes together with the corresponding execution parameters, for use by a container runtime.</p><h4 id="关键术语"><a href="#关键术语" class="headerlink" title="关键术语"></a>Key terms</h4><h5 id="Layer"><a href="#Layer" class="headerlink" title="Layer"></a>Layer</h5><ul><li>Image filesystems are composed of layers</li><li>Each layer represents a set of filesystem changes in a tar-based format, recording files that are added, modified, or deleted</li><li>Layers carry no configuration metadata such as environment variables or default arguments; these are properties of the image as a whole rather than of any particular layer</li><li>Using a layer-based or union filesystem such as AUFS, or by computing the diff between filesystem snapshots, the filesystem changesets can be used to present a series of image layers as if they were one complete filesystem</li></ul><h5 id="Image-JSON"><a href="#Image-JSON" class="headerlink" title="Image JSON"></a>Image JSON</h5><ul><li>Each image has an associated JSON structure that describes basic information about the image, such as its creation date and author, as well as runtime configuration such as its entrypoint, default arguments, networking, and volumes</li><li>The JSON structure references the cryptographic hash of each layer and maintains the history information for the image</li><li>JSON 
should be treated as immutable</li><li>Changing it means creating a new derived image rather than modifying the existing image</li></ul><h5 id="Layer-DiffID"><a href="#Layer-DiffID" class="headerlink" title="Layer DiffID"></a>Layer DiffID</h5><p>A layer's DiffID is the digest of the layer's uncompressed tar archive, serialized in the descriptor digest format.</p><h5 id="Layer-ChainID"><a href="#Layer-ChainID" class="headerlink" title="Layer ChainID"></a>Layer ChainID</h5><p>For convenience, it is sometimes useful to refer to a stack of layers with a single identifier.</p><h5 id="ImageID"><a href="#ImageID" class="headerlink" title="ImageID"></a>ImageID</h5><p>Each image's ID is given by the SHA256 hash of its configuration JSON.</p><h4 id="属性"><a href="#属性" class="headerlink" title="属性"></a>Properties</h4><ul><li>created | <em>string</em> | OPTIONAL</li><li>author | <em>string</em> | OPTIONAL</li><li>architecture | <em>string</em> | REQUIRED</li><li>os | <em>string</em> | REQUIRED</li><li>os.version | <em>string</em> | OPTIONAL</li><li>os.features | <em>array of strings</em> | OPTIONAL</li><li>variant | <em>string</em> | OPTIONAL</li><li>config | <em>object</em> | OPTIONAL<ul><li>User | <em>string</em> | OPTIONAL</li><li>ExposedPorts | <em>object</em> | OPTIONAL</li><li>Env | <em>array of strings</em> | OPTIONAL</li><li>Entrypoint | <em>array of strings</em> | OPTIONAL</li><li>Cmd | <em>array of strings</em> | OPTIONAL</li><li>Volumes | <em>object</em> | OPTIONAL</li><li>WorkingDir | <em>string</em> | OPTIONAL</li><li>Labels | <em>object</em> | OPTIONAL</li><li>StopSignal | <em>string</em> | OPTIONAL</li><li>Memory | <em>integer</em> | OPTIONAL</li><li>CpuShares | <em>integer</em> | OPTIONAL</li><li>Healthcheck | <em>object</em> | OPTIONAL</li></ul></li><li>rootfs | <em>object</em> | REQUIRED<ul><li>type | <em>string</em> | REQUIRED</li><li>diff_ids | <em>array of strings</em> | REQUIRED</li></ul></li><li>history | <em>array of objects</em> | OPTIONAL<ul><li>created | <em>string</em> | OPTIONAL</li><li>author | <em>string</em> | OPTIONAL</li><li>created_by | <em>string</em> | OPTIONAL</li><li>comment | <em>string</em> | OPTIONAL</li><li>empty_layer | <em>boolean</em> | OPTIONAL</li></ul></li></ul><h4 id="示例-4"><a href="#示例-4" class="headerlink" title="示例"></a>Example</h4><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br></pre></td><td class="code"><pre><span class="line">{</span><br><span class="line"> <span class="attr">"created"</span>: <span class="string">"2015-10-31T22:22:56.015925234Z"</span>,</span><br><span class="line"> <span class="attr">"author"</span>: 
<span class="string">"Alyssa P. Hacker <[email protected]>"</span>,</span><br><span class="line"> <span class="attr">"architecture"</span>: <span class="string">"amd64"</span>,</span><br><span class="line"> <span class="attr">"os"</span>: <span class="string">"linux"</span>,</span><br><span class="line"> <span class="attr">"config"</span>: {</span><br><span class="line"> <span class="attr">"User"</span>: <span class="string">"alice"</span>,</span><br><span class="line"> <span class="attr">"ExposedPorts"</span>: {</span><br><span class="line"> <span class="attr">"8080/tcp"</span>: {}</span><br><span class="line"> },</span><br><span class="line"> <span class="attr">"Env"</span>: [</span><br><span class="line"> <span class="string">"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"</span>,</span><br><span class="line"> <span class="string">"FOO=oci_is_a"</span>,</span><br><span class="line"> <span class="string">"BAR=well_written_spec"</span></span><br><span class="line"> ],</span><br><span class="line"> <span class="attr">"Entrypoint"</span>: [</span><br><span class="line"> <span class="string">"/bin/my-app-binary"</span></span><br><span class="line"> ],</span><br><span class="line"> <span class="attr">"Cmd"</span>: [</span><br><span class="line"> <span class="string">"--foreground"</span>,</span><br><span class="line"> <span class="string">"--config"</span>,</span><br><span class="line"> <span class="string">"/etc/my-app.d/default.cfg"</span></span><br><span class="line"> ],</span><br><span class="line"> <span class="attr">"Volumes"</span>: {</span><br><span class="line"> <span class="attr">"/var/job-result-data"</span>: {},</span><br><span class="line"> <span class="attr">"/var/log/my-app-logs"</span>: {}</span><br><span class="line"> },</span><br><span class="line"> <span class="attr">"WorkingDir"</span>: <span class="string">"/home/alice"</span>,</span><br><span class="line"> <span class="attr">"Labels"</span>: {</span><br><span class="line"> 
<span class="attr">"com.example.project.git.url"</span>: <span class="string">"https://example.com/project.git"</span>,</span><br><span class="line"> <span class="attr">"com.example.project.git.commit"</span>: <span class="string">"45a939b2999782a3f005621a8d0f29aa387e1d6b"</span></span><br><span class="line"> }</span><br><span class="line"> },</span><br><span class="line"> <span class="attr">"rootfs"</span>: {</span><br><span class="line"> <span class="attr">"diff_ids"</span>: [</span><br><span class="line"> <span class="string">"sha256:c6f988f4874bb0add23a778f753c65efe992244e148a1d2ec2a8b664fb66bbd1"</span>,</span><br><span class="line"> <span class="string">"sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef"</span></span><br><span class="line"> ],</span><br><span class="line"> <span class="attr">"type"</span>: <span class="string">"layers"</span></span><br><span class="line"> },</span><br><span class="line"> <span class="attr">"history"</span>: [</span><br><span class="line"> {</span><br><span class="line"> <span class="attr">"created"</span>: <span class="string">"2015-10-31T22:22:54.690851953Z"</span>,</span><br><span class="line"> <span class="attr">"created_by"</span>: <span class="string">"/bin/sh -c #(nop) ADD file:a3bc1e842b69636f9df5256c49c5374fb4eef1e281fe3f282c65fb853ee171c5 in /"</span></span><br><span class="line"> },</span><br><span class="line"> {</span><br><span class="line"> <span class="attr">"created"</span>: <span class="string">"2015-10-31T22:22:55.613815829Z"</span>,</span><br><span class="line"> <span class="attr">"created_by"</span>: <span class="string">"/bin/sh -c #(nop) CMD [\"sh\"]"</span>,</span><br><span class="line"> <span class="attr">"empty_layer"</span>: <span class="literal">true</span></span><br><span class="line"> },</span><br><span class="line"> {</span><br><span class="line"> <span class="attr">"created"</span>: <span class="string">"2015-10-31T22:22:56.329850019Z"</span>,</span><br><span class="line"> 
<span class="attr">"created_by"</span>: <span class="string">"/bin/sh -c apk add curl"</span></span><br><span class="line"> }</span><br><span class="line"> ]</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h3 id="Conversion-to-OCI-Runtime-Configuration"><a href="#Conversion-to-OCI-Runtime-Configuration" class="headerlink" title="Conversion to OCI Runtime Configuration"></a>Conversion to OCI Runtime Configuration</h3><ul><li>Extract the root filesystem from the filesystem layers</li><li>Convert the image configuration blob into an OCI Runtime configuration blob</li></ul><h4 id="属性比对表"><a href="#属性比对表" class="headerlink" title="属性比对表"></a>Field mapping tables</h4><table><thead><tr><th>Image Field</th><th>Runtime Field</th><th>Notes</th></tr></thead><tbody><tr><td>Config.WorkingDir</td><td>process.cwd</td><td></td></tr><tr><td>Config.Env</td><td>process.env</td><td>1</td></tr><tr><td>Config.Entrypoint</td><td>process.args</td><td>2</td></tr><tr><td>Config.Cmd</td><td>process.args</td><td>2</td></tr></tbody></table><h5 id="Annotations-Fields"><a href="#Annotations-Fields" class="headerlink" title="Annotations Fields"></a>Annotations Fields</h5><table><thead><tr><th>Image Field</th><th>Runtime Field</th><th>Notes</th></tr></thead><tbody><tr><td>os</td><td>annotations</td><td>1,2</td></tr><tr><td>architecture</td><td>annotations</td><td>1,3</td></tr><tr><td>variant</td><td>annotations</td><td>1,4</td></tr><tr><td>os.version</td><td>annotations</td><td>1,5</td></tr><tr><td>os.features</td><td>annotations</td><td>1,6</td></tr><tr><td>author</td><td>annotations</td><td>1,7</td></tr><tr><td>created</td><td>annotations</td><td>1,8</td></tr><tr><td>Config.Labels</td><td>annotations</td><td></td></tr><tr><td>Config.StopSignal</td><td>annotations</td><td>1,9</td></tr></tbody></table><h4 id="Parsed-Fields"><a href="#Parsed-Fields" class="headerlink" title="Parsed Fields"></a>Parsed Fields</h4><table><thead><tr><th>Image Field</th><th>Runtime 
Field</th></tr></thead><tbody><tr><td>Config.User</td><td>process.user.*</td></tr></tbody></table><h4 id="Optional-Fields"><a href="#Optional-Fields" class="headerlink" title="Optional Fields"></a>Optional Fields</h4><table><thead><tr><th>Image Field</th><th>Runtime Field</th><th>Notes</th></tr></thead><tbody><tr><td>Config.ExposedPorts</td><td>annotations</td><td>1</td></tr><tr><td>Config.Volumes</td><td>mounts</td><td>2</td></tr></tbody></table><h4 id="Annotations"><a href="#Annotations" class="headerlink" title="Annotations"></a>Annotations</h4><p>An image can be annotated in three places:</p><ul><li><code>Config.Labels</code> -> configuration</li><li><code>annotations</code> -> manifest</li><li><code>annotations</code> -> image index</li></ul><h3 id="Descriptor"><a href="#Descriptor" class="headerlink" title="Descriptor"></a>Descriptor</h3><h4 id="属性-1"><a href="#属性-1" class="headerlink" title="属性"></a>Properties</h4><ul><li>mediaType | <em>string</em> | REQUIRED</li><li>digest | <em>string</em> | REQUIRED</li><li>size | <em>int64</em> | REQUIRED</li><li>urls | <em>array of strings</em> | OPTIONAL</li><li>annotations | <em>string-string-map</em> | OPTIONAL</li><li>data | <em>string</em> | OPTIONAL</li><li>artifactType | <em>string</em> | OPTIONAL</li></ul><h4 id="Digests-摘要"><a href="#Digests-摘要" class="headerlink" title="Digests 摘要"></a>Digests</h4><p>A digest string follows this grammar:</p><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">digest ::= algorithm ":" encoded</span><br><span class="line">algorithm ::= algorithm-component (algorithm-separator algorithm-component)*</span><br><span class="line">algorithm-component ::= [a-z0-9]+</span><br><span class="line">algorithm-separator ::= [+._-]</span><br><span class="line">encoded ::= [a-zA-Z0-9=_-]+</span><br></pre></td></tr></table></figure><h5 
id="Digest-计算规则"><a href="#Digest-计算规则" class="headerlink" title="Digest 计算规则"></a>Digest calculation rules</h5><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">let ID(C) = Descriptor.digest</span><br><span class="line">let C = <bytes></span><br><span class="line">let D = '<alg>:' + Encode(H(C))</span><br><span class="line">let verified = ID(C) == D</span><br></pre></td></tr></table></figure><p>Here H is the hash function of the chosen algorithm.</p><h5 id="已注册的算法"><a href="#已注册的算法" class="headerlink" title="已注册的算法"></a>Registered algorithms</h5><table><thead><tr><th>algorithm identifier</th><th>algorithm</th></tr></thead><tbody><tr><td>sha256</td><td>SHA-256</td></tr><tr><td>sha512</td><td>SHA-512</td></tr></tbody></table><h4 id="示例-5"><a href="#示例-5" class="headerlink" title="示例"></a>Examples</h4><ul><li>A manifest descriptor with only the basic fields</li></ul><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">{</span><br><span class="line"> <span class="attr">"mediaType"</span>: <span class="string">"application/vnd.oci.image.manifest.v1+json"</span>,</span><br><span class="line"> <span class="attr">"size"</span>: <span class="number">7682</span>,</span><br><span class="line"> <span class="attr">"digest"</span>: <span class="string">"sha256:5b0bcabd1ed22e9fb1310cf6c2dec7cdef19f0ad69efa1f392e94a4333501270"</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure><ul><li>A manifest descriptor that specifies a URL</li></ul><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span 
class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">{</span><br><span class="line"> <span class="attr">"mediaType"</span>: <span class="string">"application/vnd.oci.image.manifest.v1+json"</span>,</span><br><span class="line"> <span class="attr">"size"</span>: <span class="number">7682</span>,</span><br><span class="line"> <span class="attr">"digest"</span>: <span class="string">"sha256:5b0bcabd1ed22e9fb1310cf6c2dec7cdef19f0ad69efa1f392e94a4333501270"</span>,</span><br><span class="line"> <span class="attr">"urls"</span>: [</span><br><span class="line"> <span class="string">"https://example.com/example-manifest"</span></span><br><span class="line"> ]</span><br><span class="line">}</span><br></pre></td></tr></table></figure><ul><li>A manifest descriptor that carries an artifact type</li></ul><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">{</span><br><span class="line"> <span class="attr">"mediaType"</span>: <span class="string">"application/vnd.oci.image.manifest.v1+json"</span>,</span><br><span class="line"> <span class="attr">"size"</span>: <span class="number">123</span>,</span><br><span class="line"> <span class="attr">"digest"</span>: <span class="string">"sha256:87923725d74f4bfb94c9e86d64170f7521aad8221a5de834851470ca142da630"</span>,</span><br><span class="line"> <span class="attr">"artifactType"</span>: <span class="string">"application/vnd.example.sbom.v1"</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure><h3 id="最后的话"><a href="#最后的话" class="headerlink" title="最后的话"></a>Closing words</h3><p>I think of this post as a side note to my containerd source-code analysis, but understanding the OCI Image specification is still essential: beyond the OCI interface definitions that containerd exposes, its lower-level operations are essentially a wrapper around this standard.</p>]]></content>
<summary type="html"><p>In the <code>OCI</code> standard, the <code>OCI Image</code> specification consists of an <code>image manifest</code>, an <code>image index</code> (optional), <code>filesystem la</summary>
<category term="Container" scheme="http://kiragoo.github.com/categories/Container/"/>
<category term="OCI" scheme="http://kiragoo.github.com/tags/OCI/"/>
</entry>
<entry>
<title>A Closer Look at containerd Storage Drivers</title>
<link href="http://kiragoo.github.com/archives/32158f45.html"/>
<id>http://kiragoo.github.com/archives/32158f45.html</id>
<published>2023-04-23T03:16:44.000Z</published>
<updated>2023-04-23T03:23:51.727Z</updated>
<content type="html"><![CDATA[<p>This is a side note that briefly covers storage drivers for container images. To keep the demonstration simple, it uses the overlay storage driver as the example.</p><p>The <code>Linux kernel</code> documentation on the <a href="https://www.kernel.org/doc/html/latest/filesystems/overlayfs.html"><code>Overlay Filesystem</code></a> has the full details. What follows is mainly a hands-on demonstration to deepen understanding, and it is required background for the <code>Snapshotter Service</code> and <code>DiffApplier Service</code> in <code>containerd</code>, which are covered later.</p><p>Simulation<br>[1-1] Simulate an existing <code>snapshot</code>: one layer containing a single file, <code>file_a</code>.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">mkdir -p /tmp/a/1/fs</span><br><span class="line">touch /tmp/a/1/fs/file_a</span><br></pre></td></tr></table></figure><p>[1-2] Suppose a request now arrives to create another <code>layer</code>. Its directories are prepared as follows:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">mkdir -p /tmp/a/2/fs</span><br><span class="line">mkdir -p /tmp/a/2/workdir</span><br></pre></td></tr></table></figure><p>[1-3] The <code>overlay</code> storage driver then handles the request and returns the final serialized <code>mount</code>, which conceptually looks like this:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">Type: Overlay</span><br><span class="line">Source: Overlay</span><br><span class="line">Options:</span><br><span class="line"> lowerdir=/tmp/a/1/fs <-- list of all parents</span><br><span class="line"> upperdir=/tmp/a/2/fs <-- fs dir we just created</span><br><span class="line"> workdir=/tmp/a/2/workdir <-- work dir we just created</span><br></pre></td></tr></table></figure><p>In <code>containerd</code>, the logic of steps [1-1…1-3] is handled by the <code>Snapshotter Service</code>.</p><p>[2-1] Simulate the <code>mount</code></p><figure 
class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">mkdir /tmp/mount</span><br><span class="line">mount -t overlay -o lowerdir=/tmp/a/1/fs,upperdir=/tmp/a/2/fs,workdir=/tmp/a/2/workdir overlay /tmp/mount</span><br></pre></td></tr></table></figure><p>Inspect the mount point <code>mount</code>:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">[root@k8s-slave tmp]<span class="comment"># tree mount/</span></span><br><span class="line">mount/</span><br><span class="line">└── file_a</span><br><span class="line"></span><br><span class="line">0 directories, 1 file</span><br></pre></td></tr></table></figure><p>[2-2] Add a test file, <code>file_b</code></p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">touch /tmp/mount/file_b</span><br></pre></td></tr></table></figure><p>Because <code>upperdir</code> is set to <code>/tmp/a/2/fs</code>, check where <code>file_b</code> actually landed:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">[root@k8s-slave tmp]<span class="comment"># ls -al a/2/fs/</span></span><br><span class="line">总用量 8</span><br><span class="line">drwxr-xr-x 2 root root 4096 4月 23 11:00 .</span><br><span class="line">drwxr-xr-x 4 root root 4096 4月 23 10:37 ..</span><br><span class="line">c--------- 1 root root 0, 0 4月 23 11:00 file_a</span><br><span class="line">-rw-r--r-- 1 root root 0 4月 23 11:00 
file_b</span><br></pre></td></tr></table></figure><p>Also note how <code>file_a</code> is listed: in overlayfs, a character device with device number 0, 0 in the upper directory is a whiteout, a marker recorded against the entry of the same name in the layer below.</p><p>In <code>containerd</code>, the logic of steps [2-1…2-2] is handled by the <code>DiffApplier Service</code>.</p><p>[3-1] Unmount the <code>mount</code> mount point and see where <code>file_b</code> was really added.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">umount /tmp/mount</span><br><span class="line">[root@k8s-slave fs]<span class="comment"># pwd</span></span><br><span class="line">/tmp/a/2/fs</span><br><span class="line">[root@k8s-slave fs]<span class="comment"># tree</span></span><br><span class="line">.</span><br><span class="line">├── file_a</span><br><span class="line">└── file_b</span><br><span class="line"></span><br><span class="line">0 directories, 2 files</span><br></pre></td></tr></table></figure><p>How would one more layer be handled on top of this overlay mount? If you are curious, simulate it yourself and see the result.</p>]]></content>
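<summary type="html"><p>This is a side note that briefly covers storage drivers for container images. To keep the demonstration simple, it uses the overlay storage driver as the example.</p>
<p>See the <code>Linux kernel</code> documentation: <a href="https://www.kernel.org/doc/html/latest/filesy</summary>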
<category term="containerd" scheme="http://kiragoo.github.com/categories/containerd/"/>
<category term="containerd" scheme="http://kiragoo.github.com/tags/containerd/"/>
<category term="snapshot" scheme="http://kiragoo.github.com/tags/snapshot/"/>
</entry>
<entry>
<title>NRI: A Next-Generation Solution for Fine-Grained Node Resource Control</title>
<link href="http://kiragoo.github.com/archives/9ed9f81c.html"/>
<id>http://kiragoo.github.com/archives/9ed9f81c.html</id>
<published>2023-04-18T13:00:44.000Z</published>
<updated>2023-04-18T13:33:43.557Z</updated>
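<content type="html"><![CDATA[<blockquote><p>Reposted from <a href="https://juejin.cn/post/7221357811288293432">NRI: A Next-Generation Solution for Fine-Grained Node Resource Control</a></p><p><a href="https://github.com/containerd/containerd/pull/4411">Related PR</a></p><p><a href="https://static.sched.com/hosted_files/kccncna2022/cc/KubeCon-NA-2022-NRI-presentation.pdf">KubeCon-NA-2022-NRI-presentation.pdf</a></p></blockquote><h1 id="背景"><a href="#背景" class="headerlink" title="背景"></a>Background</h1><p>Different workloads have different needs. This is especially true when online and offline tasks are colocated: resource utilization should go up, yet latency-sensitive services must still get the resources they need. That calls for finer-grained resource management in Kubernetes, with stronger container isolation and less interference between containers, for example CPU orchestration, memory tiering, cache management, and I/O management. Many solutions exist today, but each has its limitations.</p><p>So far Kubernetes has not offered a complete resource management solution, so many open-source projects around Kubernetes modify the Pod deployment and management flow in their own ways to achieve fine-grained resource allocation. Examples include <a href="https://link.juejin.cn/?target=https://github.com/intel/cri-resource-manager">CRI-RM</a>, <a href="https://link.juejin.cn/?target=https://koordinator.sh">Koordinator</a>, and <a href="https://link.juejin.cn/?target=https://gocrane.io">Crane</a>.</p><p>The ways these projects hook into the Kubernetes Pod creation and update flow fall broadly into two modes: Proxy mode and Standalone mode.</p><p><img src="https://kblogs.oss-cn-beijing.aliyuncs.com/blogimgs/oci-runtime.png" alt="pod 生命周期流程"></p><p>In the current Kubernetes architecture (figure a), the Kubelet creates and manages Pods by calling a CRI-compatible container runtime, and the CRI runtime in turn creates containers by calling an OCI-compatible low-level runtime.</p><h2 id="Proxy-模式"><a href="#Proxy-模式" class="headerlink" title="Proxy 模式"></a>Proxy mode</h2><p>Proxy mode (figure b) adds a CRI proxy between the client (Kubelet) and the CRI runtime (containerd, CRI-O, and so on) to relay requests and responses. The proxy intercepts Pod and container create, update, and delete events, modifies or completes the Pod spec, and applies hardware-aware resource allocation policies to the containers.</p><h2 id="Standalone-模式"><a href="#Standalone-模式" class="headerlink" title="Standalone 模式"></a>Standalone mode</h2><p>Standalone mode (figure c) runs an agent on every worker node. When the agent observes a Pod create or update event on its node, it translates the annotations in the Pod spec into a fine-grained resource configuration and then calls the CRI runtime to update the Pod.<br>Both modes meet specific business needs, but both have drawbacks. Each depends on an extra component to capture Pod lifecycle events. Proxy mode lengthens the Pod creation and management path and adds deployment and maintenance cost. Standalone mode updates a Pod only after it has observed the create or update event, so there is some delay.</p><h1 id="NRI-简介"><a href="#NRI-简介" class="headerlink" title="NRI 简介"></a>NRI 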
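Overview</h1><p><a href="https://link.juejin.cn/?target=https://github.com/containerd/nri">NRI</a> (Node Resource Interface) emerged to address the problems of the existing approaches and to give developers a more uniform way to implement them, so that resource management plugins can be reused as widely as possible. The NRI concept was first presented at <a href="https://link.juejin.cn/?target=https://kccnceu2021.sched.com/event/iE1Y/maximizing-workloads-performance-with-smarter-runtimes-krisztian-litkey-alexander-kanevskiy-intel">KubeCon Europe 2021</a> and has since gone through <a href="https://link.juejin.cn/?target=https://kccncna2022.sched.com/event/182JT">two versions</a>.</p><p><img src="https://kblogs.oss-cn-beijing.aliyuncs.com/blogimgs/CRI-request-with-nri.png" alt="CRI-request-processing-with-nri"></p><p>The figure above shows where NRI and NRI plugins sit in the Kubernetes Pod creation flow. An NRI plugin communicates with the NRI adaptation over a Unix domain socket.<br>NRI has now evolved to version 2.0, which was refactored relative to 1.0 and adds a richer set of hook functions.<br>NRI is a subproject of containerd. It allows custom logic to be plugged into CRI-compatible runtimes such as containerd and CRI-O. During a container's lifecycle, this logic can modify the container's spec or perform operations outside the scope of OCI at well-defined hook points. NRI can be used to improve the allocation and management of devices and other resources. NRI itself is agnostic to the internal implementation details of any container runtime. It provides CRI runtimes with an adaptation library used to integrate NRI and interact with extension plugins.<br>NRI provides interface definitions and base components for implementing pluggable CRI runtime extensions, which are called NRI plugins. These plugins are runtime-agnostic: the same plugin can be used with containerd or with CRI-O. In principle, any NRI plugin should work with any NRI-enabled CRI runtime.<br>An NRI plugin is a daemon-like instance. A single plugin instance handles all NRI events and requests and exchanges data over a Unix domain socket. NRI defines a protobuf-based protocol, the NRI plugin protocol, and implements it with ttRPC. This lowers the per-message overhead, which improves communication efficiency, and makes it possible to implement stateful NRI plugins.</p><h1 id="NRI组件以及工作原理"><a href="#NRI组件以及工作原理" class="headerlink" title="NRI组件以及工作原理"></a>NRI Components and How They Work</h1><p>The NRI implementation consists of several components, each of which is essential to end-to-end NRI support in a runtime. The two most important are the NRI API and the NRI runtime adaptation. Together they establish the model of how a runtime interacts with NRI and how NRI plugins interact with the runtime's containers. They also define the conditions under which plugins may modify containers and the extent of the changes they may make.</p><h2 id="NRI-API"><a href="#NRI-API" class="headerlink" title="NRI API"></a>NRI API</h2><p>At its core, NRI is a low-level plugin API defined in protobuf. This API defines two services: the runtime service and the plugin service.<br>The runtime service (NRI Runtime Service) is the public interface that the CRI runtime exposes to NRI plugins. All requests on this interface are initiated by plugins. It provides the following functions:</p><ul><li>Initiating plugin registration</li><li>Requesting unsolicited updates to containers</li></ul><p>The plugin service (NRI Plugin 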
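Service) is the public interface through which the runtime interacts with NRI plugins. All requests on this interface are initiated by NRI or the runtime. It provides the following functions:</p><ul><li>Configuring the plugin</li><li>Getting the initial list of existing Pods and containers</li><li>Hooking the plugin into Pod and container lifecycle events</li><li>Shutting down the plugin</li></ul><p>Before an NRI plugin can start receiving and processing container events, it has to register itself with NRI. During registration the plugin and NRI perform a handshake that consists of the following steps:</p><ol><li>The plugin identifies itself to the runtime</li><li>NRI provides plugin-specific configuration to the plugin</li><li>The plugin subscribes to the Pod and container lifecycle events it is interested in</li><li>NRI sends the plugin the list of existing Pods and containers</li><li>The plugin requests updates to existing containers</li></ol><p>A plugin identifies itself to NRI by a plugin index and a plugin name. NRI uses the index to determine the order in which plugins are invoked.</p><p>NRI uses the plugin name to select the matching configuration file from the default plugin configuration path <code>/etc/nri/conf.d</code> and send it to the plugin. The configuration file is read only when the plugin is launched by NRI itself. A plugin that is started externally can obtain its configuration by other means. A plugin can subscribe to the Pod and container lifecycle events it needs and return modified configuration. When a plugin runs in pre-registered mode, its executable must be named according to the pattern <code>xx-plugin_name</code>, for example <code>01-logger</code>. Here <code>xx</code> must be two digits; it serves as the plugin index and determines the order in which plugins run.</p><p>In the last step of registration and handshake, NRI sends information about all Pods and containers known to the CRI runtime. At this point the plugin can update any existing Pod or container.<br>Once the handshake is complete and the plugin has registered successfully with NRI, it starts receiving Pod and container lifecycle events according to its subscription.</p><h2 id="运行时适配器"><a href="#运行时适配器" class="headerlink" title="运行时适配器"></a>Runtime Adaptation</h2><p>The NRI runtime adaptation is the interface through which a CRI runtime integrates with and interacts with NRI. It implements plugin discovery, startup, and configuration. It also provides the functions needed to hook NRI plugins into the Pod and container lifecycle events of the CRI runtime.<br>The runtime adaptation handles the case where multiple NRI plugins process the same Pod or container lifecycle event. It invokes the plugins in index order and returns their combined modifications. When merging the plugins' changes to the OCI spec, it returns an error if it detects conflicting changes by multiple plugins to the same container.</p><h2 id="其他组件"><a href="#其他组件" class="headerlink" title="其他组件"></a>Other Components</h2><p>NRI also includes a plugin stub library (NRI Plugin Stub Library), which gives plugin authors a compact, easy-to-use framework. The stub library hides the low-level details of a plugin implementation: it takes care of connection establishment, plugin registration, configuration, and event subscription.<br>NRI also ships some sample plugins. They were written around real use cases, and some are well suited to debugging. All of the sample plugins currently provided are built on the stub library, so their implementations double as tutorials for using it.<br>In addition, NRI includes a wrapped OCI spec generator, which is mainly used to apply the adjustments and updates requested by NRI plugins to the OCI spec before they reach the container.</p><h1 id="NRI-订阅的的Pod-x2F-Container的元信息和事件"><a href="#NRI-订阅的的Pod-x2F-Container的元信息和事件" class="headerlink" title="NRI 订阅的的Pod/Container的元信息和事件"></a>Pod and Container Metadata and Events Available to NRI Plugins</h1><h2 id="Pod元信息和可用的生命周期事件"><a href="#Pod元信息和可用的生命周期事件" class="headerlink" 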
title="Pod元信息和可用的生命周期事件"></a>Pod Metadata and Available Lifecycle Events</h2><p>NRI plugins can currently subscribe to three Pod lifecycle events: <code>RunPodSandbox</code>, <code>StopPodSandbox</code>, and <code>RemovePodSandbox</code>.</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">type</span> handlers <span class="keyword">struct</span> {</span><br><span class="line"> RunPodSandbox <span class="function"><span class="keyword">func</span><span class="params">(*api.PodSandbox)</span></span> <span class="type">error</span></span><br><span class="line"> StopPodSandbox <span class="function"><span class="keyword">func</span><span class="params">(*api.PodSandbox)</span></span> <span class="type">error</span></span><br><span class="line"> RemovePodSandbox <span class="function"><span class="keyword">func</span><span class="params">(*api.PodSandbox)</span></span> <span class="type">error</span></span><br><span class="line"> ...</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>The following code shows the Pod metadata that is available in these events.</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/containerd/nri/blob/v0.3.0/pkg/api/api.pb.go#L1015</span></span><br><span 
class="line"><span class="keyword">type</span> PodSandbox <span class="keyword">struct</span> {</span><br><span class="line">Id <span class="type">string</span> </span><br><span class="line">Name <span class="type">string</span> </span><br><span class="line">Uid <span class="type">string</span> </span><br><span class="line">Namespace <span class="type">string</span> </span><br><span class="line">Labels <span class="keyword">map</span>[<span class="type">string</span>]<span class="type">string</span> </span><br><span class="line">Annotations <span class="keyword">map</span>[<span class="type">string</span>]<span class="type">string</span> </span><br><span class="line">RuntimeHandler <span class="type">string</span> </span><br><span class="line">Linux *LinuxPodSandbox </span><br><span class="line">Pid <span class="type">uint32</span> </span><br><span class="line">}</span><br><span class="line"><span class="keyword">type</span> LinuxPodSandbox <span class="keyword">struct</span> {</span><br><span class="line">CgroupParent <span class="type">string</span> </span><br><span class="line">CgroupsPath <span class="type">string</span> </span><br><span class="line">}</span><br></pre></td></tr></table></figure><h2 id="Container-元信息和可用生命周期事件"><a href="#Container-元信息和可用生命周期事件" class="headerlink" title="Container 元信息和可用生命周期事件"></a>Container Metadata and Available Lifecycle Events</h2><p>NRI plugins can currently subscribe to eight container lifecycle events:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">type</span> handlers <span class="keyword">struct</span> {</span><br><span 
class="line"> ...</span><br><span class="line">CreateContainer <span class="function"><span class="keyword">func</span><span class="params">(*api.PodSandbox, *api.Container)</span></span> (*api.ContainerAdjustment, []*api.ContainerUpdate, <span class="type">error</span>)</span><br><span class="line">StartContainer <span class="function"><span class="keyword">func</span><span class="params">(*api.PodSandbox, *api.Container)</span></span> <span class="type">error</span></span><br><span class="line">UpdateContainer <span class="function"><span class="keyword">func</span><span class="params">(*api.PodSandbox, *api.Container)</span></span> ([]*api.ContainerUpdate, <span class="type">error</span>)</span><br><span class="line">StopContainer <span class="function"><span class="keyword">func</span><span class="params">(*api.PodSandbox, *api.Container)</span></span> ([]*api.ContainerUpdate, <span class="type">error</span>)</span><br><span class="line">RemoveContainer <span class="function"><span class="keyword">func</span><span class="params">(*api.PodSandbox, *api.Container)</span></span> <span class="type">error</span></span><br><span class="line">PostCreateContainer <span class="function"><span class="keyword">func</span><span class="params">(*api.PodSandbox, *api.Container)</span></span> <span class="type">error</span></span><br><span class="line">PostStartContainer <span class="function"><span class="keyword">func</span><span class="params">(*api.PodSandbox, *api.Container)</span></span> <span class="type">error</span></span><br><span class="line">PostUpdateContainer <span class="function"><span class="keyword">func</span><span class="params">(*api.PodSandbox, *api.Container)</span></span> <span class="type">error</span></span><br><span class="line"> ...</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>The following code shows the container metadata that is available in these events.</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/containerd/nri/blob/v0.3.0/pkg/api/api.pb.go#L1215</span></span><br><span class="line"><span class="comment">// Container metadata that is considered relevant for a plugin.</span></span><br><span class="line"><span class="keyword">type</span> Container <span class="keyword">struct</span> {</span><br><span class="line">Id <span class="type">string</span> </span><br><span class="line">PodSandboxId <span class="type">string</span> </span><br><span class="line">Name <span class="type">string</span> </span><br><span class="line">State ContainerState </span><br><span class="line">Labels <span class="keyword">map</span>[<span class="type">string</span>]<span class="type">string</span> </span><br><span class="line">Annotations <span class="keyword">map</span>[<span class="type">string</span>]<span class="type">string</span> </span><br><span class="line">Args []<span class="type">string</span> </span><br><span 
class="line">Env []<span class="type">string</span> </span><br><span class="line">Mounts []*Mount </span><br><span class="line">Hooks *Hooks </span><br><span class="line">Linux *LinuxContainer </span><br><span class="line">Pid <span class="type">uint32</span> </span><br><span class="line">}</span><br><span class="line"><span class="keyword">type</span> LinuxContainer <span class="keyword">struct</span> { </span><br><span class="line">Namespaces []*LinuxNamespace </span><br><span class="line">Devices []*LinuxDevice </span><br><span class="line">Resources *LinuxResources </span><br><span class="line">OomScoreAdj *OptionalInt </span><br><span class="line">CgroupsPath <span class="type">string</span> </span><br><span class="line">}</span><br><span class="line"><span class="keyword">type</span> LinuxResources <span class="keyword">struct</span> {</span><br><span class="line">Memory *LinuxMemory </span><br><span class="line">Cpu *LinuxCPU </span><br><span class="line">HugepageLimits []*HugepageLimit </span><br><span class="line">BlockioClass *OptionalString </span><br><span class="line">RdtClass *OptionalString </span><br><span class="line">Devices []*LinuxDeviceCgroup </span><br><span class="line">}</span><br></pre></td></tr></table></figure><h2 id="Container-调整和更新"><a href="#Container-调整和更新" class="headerlink" title="Container 调整和更新"></a>Container Adjustment and Update</h2><p>A container's parameters can be adjusted while it is being created, and after creation they can be updated in response to any lifecycle event. The scope of adjustment and update differs, however: more parameters can be set at creation time, and only a subset can be changed once the container exists.<br>The parameters that can be adjusted during creation are:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span 
class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/containerd/nri/blob/v0.3.0/pkg/api/api.pb.go#L2246</span></span><br><span class="line"><span class="comment">// Requested adjustments to a container being created.</span></span><br><span class="line"><span class="keyword">type</span> ContainerAdjustment <span class="keyword">struct</span> {</span><br><span class="line">Annotations <span class="keyword">map</span>[<span class="type">string</span>]<span class="type">string</span> </span><br><span class="line">Mounts []*Mount </span><br><span class="line">Env []*KeyValue </span><br><span class="line">Hooks *Hooks </span><br><span class="line">Linux *LinuxContainerAdjustment </span><br><span class="line">}</span><br><span class="line"><span class="keyword">type</span> LinuxContainerAdjustment <span class="keyword">struct</span> {</span><br><span class="line">Devices []*LinuxDevice </span><br><span class="line">Resources *LinuxResources </span><br><span class="line">CgroupsPath <span class="type">string</span> </span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>Once a container has been created, NRI plugins can update it. An update can be requested in response to the creation, update, or stop event of any other container, or a plugin can request it unsolicited. Fewer parameters can be changed in an update than at creation time: <code>annotation</code>, <code>mounts</code>, <code>env</code>, <code>oci hooks</code>, and <code>devices</code> are excluded.</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// 
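https://github.com/containerd/nri/blob/v0.3.0/pkg/api/api.pb.go#L231</span></span><br><span class="line"><span class="comment">// Requested update to an already created container.</span></span><br><span class="line"><span class="keyword">type</span> ContainerUpdate <span class="keyword">struct</span> {</span><br><span class="line">ContainerId <span class="type">string</span> </span><br><span class="line">Linux *LinuxContainerUpdate </span><br><span class="line">IgnoreFailure <span class="type">bool</span> </span><br><span class="line">}</span><br><span class="line"><span class="keyword">type</span> LinuxContainerUpdate <span class="keyword">struct</span> {</span><br><span class="line"> Resources *LinuxResources </span><br><span class="line">}</span><br></pre></td></tr></table></figure><h1 id="Containerd-集成NRI"><a href="#Containerd-集成NRI" class="headerlink" title="Containerd 集成NRI"></a>Integrating NRI into containerd</h1><p>The official <a href="https://link.juejin.cn/?target=https://github.com/containerd/containerd/blob/main/docs/NRI.md">containerd 1.7.0</a> release already includes basic NRI functionality.<br>NRI support in containerd is split into two parts. One is a common NRI plugin (Common Plugin, pkg/nri/*), also called the containerd NRI plugin, which integrates with NRI. The other is a CRI-specific NRI implementation (/pkg/cri/server/nri.go), which converts data between the runtime-agnostic NRI representation and the internal representation used by the CRI plugin.<br>The common NRI plugin is built into the containerd project as a built-in plugin.</p><p>The containerd NRI plugin implements the core logic for integrating and interacting with NRI. However, it has no knowledge of how Pods and containers are represented inside containerd. Instead, it defines an additional interface, Domain, which is used whenever the internal representation of a Pod or container has to be translated into the runtime-agnostic NRI representation, or whenever a configuration change requested by an external NRI plugin has to be applied to a container managed by containerd.<br>A Domain-Namespace (Domain for short) implements the functions of this interface, and the Pods and containers of a given containerd namespace are handled through its Domain. Containerd namespaces isolate state between containerd clients: for example, "k8s.io" serves Kubernetes CRI clients, "moby" serves Docker clients, and "containerd" is the default Domain, serving containerd and ctr.<br>The containerd CRI plugin registers itself as the NRI Domain for the "k8s.io" namespace, which allows external NRI plugins to customize container configuration. At present the Domain interface is implemented only for the native CRI (pkg/cri/server). More experimental work is still under development.<br>The main reason for separating functionality through Domains is to allow NRI plugins to be used with other kinds of sandboxes and other container clients, not only with CRI containers in the "k8s.io" namespace. Docker compatibility, for example, may follow in the future.</p><h2 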
id="在Containerd中启用对NRI的支持"><a href="#在Containerd中启用对NRI的支持" class="headerlink" title="在Containerd中启用对NRI的支持"></a>Enabling NRI Support in containerd</h2><p>NRI support in containerd is turned on or off by enabling or disabling the common containerd NRI plugin. By default, NRI support is disabled. To change this, edit the <code>[plugins."io.containerd.nri.v1.nri"]</code> section of the containerd configuration file, which by default is <code>/etc/containerd/config.toml</code>, and set <code>disable=true</code> or <code>disable=false</code>.<br>The NRI configuration options are described below:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line">[plugins."io.containerd.nri.v1.nri"]</span><br><span class="line"> # Whether to disable NRI</span><br><span class="line"> disable = false</span><br><span class="line"> # Whether to disable connections from externally launched NRI plugins</span><br><span class="line"> disable_connections = false</span><br><span class="line"> # Directory of plugin configuration files</span><br><span class="line"> plugin_config_path = "/etc/nri/conf.d"</span><br><span class="line"> # Directory scanned for pre-registered NRI plugins at startup</span><br><span class="line"> plugin_path = "/opt/nri/plugins"</span><br><span class="line"> # Timeout for a plugin to register after connecting</span><br><span class="line"> plugin_registration_timeout = "5s"</span><br><span class="line"> # Timeout for a plugin to handle an event or request</span><br><span class="line"> plugin_request_timeout = "2s"</span><br><span class="line"> # Path of the NRI socket</span><br><span class="line"> socket_path = "/var/run/nri/nri.sock"</span><br></pre></td></tr></table></figure><p>An NRI plugin can be started in two ways.</p><p>The first is pre-registration. In this mode the plugin is started automatically when the NRI adaptation is instantiated. Pre-registering a plugin means placing its executable in the NRI 
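plugin directory, which defaults to <code>/opt/nri/plugins</code>. With containerd and the default configuration, this means that containerd loads and runs the plugins registered under <code>/opt/nri/plugins</code> when it starts.</p><p>The second is to run the plugin externally. In this mode the plugin process can be started by systemd or run inside a Pod. The only requirement is that the plugin can reach containerd through the NRI socket, whose default path is <code>/var/run/nri/nri.sock</code>.</p><p>A pre-registered plugin is started with a socket that is already connected to NRI, whereas an externally run plugin registers itself with the NRI adaptation through the NRI socket. The only difference between the two modes is how the plugin is started and how it connects to NRI. Once the connection is established, all NRI plugins behave identically.</p><p>NRI can disable connections from externally run plugins, in which case the NRI socket is not created. The containerd configuration shown above leaves external connections enabled. This is convenient for testing, because plugins can connect, disconnect, and reconnect at any time.</p><blockquote><p>Note: you cannot run two NRI-enabled runtimes on the same node while both use the default NRI socket path. In that case, either disable NRI in one of the runtimes or change the socket path of one of them so that the two paths differ.</p></blockquote><h1 id="NRI-成为下一代节点资源细粒度管理方案"><a href="#NRI-成为下一代节点资源细粒度管理方案" class="headerlink" title="NRI 成为下一代节点资源细粒度管理方案"></a>NRI as the Next-Generation Solution for Fine-Grained Node Resource Management</h1><p>With NRI, the Kubelet's Resource Manager can be pushed down into the CRI runtime layer. The Kubelet is currently not well suited to being extended for many different requirements, and adding fine-grained resource allocation at the Kubelet layer would blur the boundary between the Kubelet and CRI.<br>NRI, by contrast, is invoked during the CRI lifecycle, which makes it a better fit for resource binding and node topology awareness. Because plugins are defined and iterated inside the CRI runtime, the Kubernetes layer above can adapt to changes at minimal cost.<br>A growing number of fine-grained node resource management solutions are now exploring whether they can be implemented on top of NRI. Once NRI becomes the common mechanism for fine-grained node resource allocation, resource management solutions can be further standardized and the related components become more reusable.<br>We have already open-sourced some <a href="https://link.juejin.cn/?target=https://github.com/containers/nri-plugins">NRI plugins</a> for node resource topology awareness and fine-grained resource binding. Much work remains, and you are welcome to <a href="https://link.juejin.cn/?target=https://github.com/airren">contact us</a> to help advance NRI and build its surrounding ecosystem.</p>]]></content>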
<summary type="html"><blockquote>
<p>Reposted from <a href="https://juejin.cn/post/7221357811288293432">NRI: A Next-Generation Solution for Fine-Grained Node Resource Control</a></p>
<p><a href="https://github.com/container</summary>
<category term="containerd" scheme="http://kiragoo.github.com/categories/containerd/"/>
<category term="containerd" scheme="http://kiragoo.github.com/tags/containerd/"/>
</entry>
<entry>
<title>Building a Web Test Framework from Scratch</title>
<link href="http://kiragoo.github.com/archives/b783defe.html"/>
<id>http://kiragoo.github.com/archives/b783defe.html</id>
<published>2023-04-14T09:04:45.000Z</published>
<updated>2023-04-14T09:52:47.361Z</updated>
<content type="html"><![CDATA[<p>While designing a recent project, I had to account for some practical gaps between the real project and testing in the traditional sense. Our CI does not currently provision a fresh environment for each test run, and the automated tests are mixed with some manual QA operations, so the automated test code itself has to handle this. <strong>Note that the best design for e2e tests is still a clean, dedicated environment where resources can be brought up and torn down at any time, which avoids unnecessary dirty data and keeps the tests in step with feature iterations</strong>.</p><p>Given this context, I designed a simple, extensible test framework that is easy to integrate into the project.</p><h2 id="功能拆分"><a href="#功能拆分" class="headerlink" title="功能拆分"></a>Functional Breakdown</h2><p><code>[package]core</code> design<br>Key elements:</p><ul><li>group: the feature group is the top-level unit for registering test cases</li></ul><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">type</span> BaseGroup <span class="keyword">struct</span> {</span><br><span class="line"> <span class="comment">// group name</span></span><br><span class="line"> Name <span class="keyword">string</span> </span><br><span class="line"> <span class="comment">// group description</span></span><br><span class="line">Desc <span class="keyword">string</span></span><br><span class="line"> <span class="comment">// Reqs list of features interfaces</span></span><br><span class="line">Reqs []req.Req</span><br><span class="line"> <span class="comment">// rest api client</span></span><br><span class="line">Client *restclient.Client</span><br><span class="line">}</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>Design of the abstract <code>group</code> interface:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span 
class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">type</span> Interface <span class="keyword">interface</span> {</span><br><span class="line"> <span class="comment">// client init</span></span><br><span class="line">InitC()</span><br><span class="line"> <span class="comment">// group init</span></span><br><span class="line">Init()</span><br><span class="line"> <span class="comment">// start group test</span></span><br><span class="line">Start() error</span><br><span class="line"> <span class="comment">// clean up resources</span></span><br><span class="line">Cleanup() error</span><br><span class="line">}</span><br></pre></td></tr></table></figure><ul><li>req: the actual feature under test</li></ul><p>One more point: when designing a <code>struct</code>, aim for reuse wherever possible. In this project the <code>module</code> in the private repository cannot be imported, mainly because for historical reasons the large project was never split into well-separated, independent parts, so I had to settle for a workaround. I considered using <code>sync</code> to split part of the project code out into a standalone repository, but that advanced feature is only available in the paid edition of <code>gitlab</code>, so the design below is the compromise.</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">type</span> BaseReq <span class="keyword">struct</span> {</span><br><span class="line"> <span class="comment">// req name</span></span><br><span class="line">Name <span class="keyword">string</span></span><br><span class="line"> <span class="comment">// req uri</span></span><br><span class="line">Uri <span class="keyword">string</span></span><br><span class="line"> <span class="comment">// req 
method</span></span><br><span class="line">Method <span class="keyword">string</span></span><br><span class="line"> <span class="comment">// req body</span></span><br><span class="line">Body <span class="keyword">interface</span>{}</span><br><span class="line"> <span class="comment">// req query</span></span><br><span class="line">Query <span class="keyword">map</span>[<span class="keyword">string</span>]<span class="keyword">string</span></span><br><span class="line">Runtime runtime.GroupRuntime</span><br><span class="line">}</span><br></pre></td></tr></table></figure><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Req for test request</span></span><br><span class="line"><span class="keyword">type</span> Req <span class="keyword">interface</span> {</span><br><span class="line">Init()</span><br><span class="line">Do(c *restclient.Client) error</span><br><span class="line">Prepare() error</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// PreReq for prepare request</span></span><br><span class="line"><span class="keyword">type</span> PreReq <span class="keyword">interface</span> {</span><br><span class="line">Init()</span><br><span class="line">PreCheck() <span class="keyword">interface</span>{}</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>Two <code>req</code> abstractions are defined here mainly to keep resource-preparation requests distinct from test requests. For resource preparation, the status code returned by a request should not be described in the struct itself; keeping it out makes the semantics clearer.</p><ul><li>restclient: wraps the <code>http</code> request library. With instances of the test elements above in place, an 
<code>http</code> client is still needed to execute the actual requests. This design uses the <code>client</code> library <a href="https://github.com/go-resty/resty">resty</a>. Although it has not seen much recent maintenance, its <code>star</code> count and source code suggest it is manageable. <strong>When choosing a library, decide based on your actual situation: if the community is active and well supported, a more complex library is fine; if the community is only moderately active but the library proves manageable after evaluation, it is also reasonable to try. It all depends on the actual requirements.</strong></li></ul><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span 
class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">type</span> Client <span class="keyword">struct</span> {</span><br><span class="line">Rclient *resty.Client</span><br><span class="line">config.Config</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">NewClient</span><span class="params">()</span> *<span class="title">Client</span></span> {</span><br><span class="line">conf := config.NewConfig()</span><br><span class="line">c := &Client{}</span><br><span class="line">c.Rclient = resty.New()</span><br><span class="line">c.Server = conf.Server</span><br><span class="line">c.Port = conf.Port</span><br><span class="line">c.User = conf.User</span><br><span class="line">c.Passwd = conf.Passwd</span><br><span class="line">c.Restries = conf.Restries</span><br><span 
class="line"><span class="keyword">return</span> c</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(c *Client)</span> <span class="title">Request</span><span class="params">(method, uri <span class="keyword">string</span>, body <span class="keyword">interface</span>{}, query <span class="keyword">map</span>[<span class="keyword">string</span>]<span class="keyword">string</span>)</span> <span class="params">(<span class="keyword">interface</span>{}, error)</span></span> {</span><br><span class="line">r := c.Rclient.R()</span><br><span class="line">r.SetHeader(<span class="string">"Content-Type"</span>, <span class="string">"application/json"</span>)</span><br><span class="line"></span><br><span class="line">token, err := c.getToken(r)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, err</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">t, err := util.GetValueFromJson(token, <span class="string">"token"</span>)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, err</span><br><span class="line">}</span><br><span class="line">r.SetAuthToken(t)</span><br><span class="line"></span><br><span class="line">url := fmt.Sprintf(<span class="string">"http://%s:%s%s"</span>, c.Server, c.Port, uri)</span><br><span class="line"></span><br><span class="line"><span class="keyword">switch</span> method {</span><br><span class="line"><span class="keyword">case</span> <span class="string">"GET"</span>:</span><br><span class="line">resp, err := get(r, url, query)</span><br><span class="line"><span class="keyword">if</span> err != 
<span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, err</span><br><span class="line">}</span><br><span class="line"><span class="keyword">return</span> resp, <span class="literal">nil</span></span><br><span class="line"><span class="keyword">case</span> <span class="string">"POST"</span>:</span><br><span class="line"><span class="comment">// TODO implement client</span></span><br><span class="line"><span class="built_in">panic</span>(<span class="string">"implement me"</span>)</span><br><span class="line"><span class="keyword">case</span> <span class="string">"PUT"</span>:</span><br><span class="line"><span class="comment">// TODO implement client</span></span><br><span class="line"><span class="built_in">panic</span>(<span class="string">"implement me"</span>)</span><br><span class="line"><span class="keyword">case</span> <span class="string">"DELETE"</span>:</span><br><span class="line"><span class="comment">// TODO implement client</span></span><br><span class="line"><span class="built_in">panic</span>(<span class="string">"implement me"</span>)</span><br><span class="line"><span class="keyword">default</span>:</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, fmt.Errorf(<span class="string">"unsupported method: %s"</span>, method)</span><br><span class="line">}</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(c *Client)</span> <span class="title">getToken</span><span class="params">(r *resty.Request)</span> <span class="params">(<span class="keyword">string</span>, error)</span></span> {</span><br><span class="line"><span class="keyword">var</span> token <span class="keyword">string</span></span><br><span class="line">path := fmt.Sprintf(<span class="string">"http://%s:%s/login"</span>, c.Server, 
c.Port)</span><br><span class="line"></span><br><span class="line">resp, err := r.SetBody(<span class="keyword">map</span>[<span class="keyword">string</span>]<span class="keyword">string</span>{<span class="string">"userid"</span>: <span class="string">"admin"</span>, <span class="string">"password"</span>: <span class="string">"Password"</span>}).</span><br><span class="line">Post(path)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> <span class="string">""</span>, fmt.Errorf(<span class="string">"get token failed with gui web : %v"</span>, err)</span><br><span class="line">}</span><br><span class="line">token = resp.String()</span><br><span class="line"><span class="keyword">return</span> token, <span class="literal">nil</span></span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">get</span><span class="params">(r *resty.Request, url <span class="keyword">string</span>, query <span class="keyword">map</span>[<span class="keyword">string</span>]<span class="keyword">string</span>)</span> <span class="params">(<span class="keyword">interface</span>{}, error)</span></span> {</span><br><span class="line"><span class="keyword">var</span> resp *resty.Response</span><br><span class="line"><span class="keyword">var</span> err error</span><br><span class="line"><span class="keyword">if</span> query != <span class="literal">nil</span> {</span><br><span class="line">resp, err = r.SetQueryParams(query).</span><br><span class="line">Get(url)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, fmt.Errorf(<span class="string">"get request failed with gui web : %v"</span>, err)</span><br><span 
class="line">}</span><br><span class="line">} <span class="keyword">else</span> {</span><br><span class="line">resp, err = r.Get(url)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, fmt.Errorf(<span class="string">"get request failed with gui web : %v"</span>, err)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">}</span><br><span class="line"><span class="keyword">return</span> resp, <span class="literal">nil</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure><ul><li>config: definition of the <code>web server</code> configuration</li></ul><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">var</span> c *Config</span><br><span class="line"></span><br><span class="line"><span class="keyword">type</span> Config <span class="keyword">struct</span> {</span><br><span class="line">Server <span class="keyword">string</span> <span class="string">`yaml:"server"`</span></span><br><span 
class="line">Port <span class="keyword">string</span> <span class="string">`yaml:"port"`</span></span><br><span class="line">User <span class="keyword">string</span> <span class="string">`yaml:"user"`</span></span><br><span class="line">Passwd <span class="keyword">string</span> <span class="string">`yaml:"password"`</span></span><br><span class="line">Restries <span class="keyword">int</span> <span class="string">`yaml:"restries"`</span></span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">NewConfig</span><span class="params">()</span> *<span class="title">Config</span></span> {</span><br><span class="line">dir, _ := os.Getwd()</span><br><span class="line">filePath := path.Join(dir, <span class="string">"config.yaml"</span>)</span><br><span class="line">config, err := ioutil.ReadFile(filePath)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="built_in">panic</span>(err)</span><br><span class="line">}</span><br><span class="line">err = yaml.Unmarshal(config, &c)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="built_in">panic</span>(err)</span><br><span class="line">}</span><br><span class="line"><span class="keyword">if</span> c.Restries <= <span class="number">0</span> {</span><br><span class="line">c.Restries = <span class="number">3</span></span><br><span class="line">}</span><br><span class="line"><span class="keyword">return</span> c</span><br><span class="line">}</span><br></pre></td></tr></table></figure><ul><li>Extension design and support: because I have been busy designing solutions for other projects at work, displaying intermediate test logs and summarizing the final results have not been implemented yet. This still needs attention, but the design is skipped for now.</li></ul><h2 id="流程实现"><a href="#流程实现" class="headerlink" title="流程实现"></a>Workflow Implementation</h2><p>Now that the individual functional elements have been broken out, the next step is to design an entry point that registers and invokes the concrete test cases.</p><figure class="highlight 
golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">type</span> GuiApiTest <span class="keyword">struct</span> {</span><br><span class="line">Groups []group.Interface</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">NewGuiApiTest</span><span class="params">()</span> *<span class="title">GuiApiTest</span></span> {</span><br><span class="line"><span class="keyword">return</span> &GuiApiTest{}</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Register the groups that need to be tested</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(g *GuiApiTest)</span> <span class="title">Init</span><span 
class="params">()</span></span> {</span><br><span class="line">g.Groups = []group.Interface{</span><br><span class="line">groups.NewFilesystemGroup(),</span><br><span class="line">}</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Run the actual test elements of each group</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(g *GuiApiTest)</span> <span class="title">Start</span><span class="params">()</span> <span class="title">error</span></span> {</span><br><span class="line"><span class="keyword">for</span> i := <span class="keyword">range</span> g.Groups {</span><br><span class="line">g.Groups[i].Init()</span><br><span class="line"><span class="keyword">if</span> err := g.Groups[i].Start(); err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">}</span><br><span class="line">}</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Clean up resources at the group level</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(g *GuiApiTest)</span> <span class="title">Cleanup</span><span class="params">()</span> <span class="title">error</span></span> {</span><br><span class="line"><span class="keyword">for</span> i := <span class="keyword">range</span> g.Groups {</span><br><span class="line"><span class="keyword">if</span> err := g.Groups[i].Cleanup(); err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">}</span><br><span class="line">}</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span></span><br><span 
class="line">}</span><br></pre></td></tr></table></figure><p>Reading on, let's see what <code>FilesystemGroup</code> contains.</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">type</span> GetFileSystemList <span class="keyword">struct</span> {</span><br><span class="line">req.BaseReq</span><br><span class="line"> <span class="comment">// For now this is a redundant design</span></span><br><span class="line">prepares []req.PreReq</span><br><span class="line"> <span class="comment">// This is not a resource-preparation req but the actual functional API under test, so the expected result for the assertion has to be described</span></span><br><span class="line">StatusCode <span class="keyword">int</span></span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(g *GetFileSystemList)</span> <span 
class="title">Init</span><span class="params">()</span></span> {</span><br><span class="line">g.Name = <span class="string">"Get_FileSystemList_Success"</span></span><br><span class="line">g.Url = <span class="string">"/v1/storage/filesystem/N9000"</span></span><br><span class="line">g.Method = <span class="string">"GET"</span></span><br><span class="line"><span class="comment">// g.ContentType = "application/json"</span></span><br><span class="line"><span class="comment">// g.Authenticated = true</span></span><br><span class="line">g.StatusCode = <span class="number">200</span></span><br><span class="line">g.prepares = <span class="literal">nil</span></span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(g *GetFileSystemList)</span> <span class="title">Prepare</span><span class="params">()</span> <span class="title">error</span></span> {</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(g *GetFileSystemList)</span> <span class="title">Do</span><span class="params">(c *restclient.Client)</span> <span class="title">error</span></span> {</span><br><span class="line"> <span class="comment">// Issue the actual HTTP request to test the business API. To tolerate network jitter, the request could be retried up to the retry count from the config; resty supports this natively, but it is not implemented here yet.</span></span><br><span class="line">resp, err := c.Request(g.Method, g.Url, <span class="literal">nil</span>, <span class="literal">nil</span>)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> fmt.Errorf(<span class="string">"[%s] test not passed: %w"</span>, g.Name, err)</span><br><span class="line">}</span><br><span class="line">statusCode := 
resp.(*resty.Response).StatusCode()</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> statusCode != g.StatusCode {</span><br><span class="line"><span class="keyword">return</span> fmt.Errorf(<span class="string">"[%s] test not passed. want status code: %d, got status code: %d"</span>, g.Name, g.StatusCode, statusCode)</span><br><span class="line">}</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>For the logic behind <code>c.Request(g.Method, g.Url, nil, nil)</code> above, see the design of <code>Client</code>.</p><p>All that remains is to invoke the entry point and start everything.</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">main</span><span class="params">()</span></span> {</span><br><span class="line"></span><br><span class="line">tests := pkg.NewGuiApiTest()</span><br><span class="line">tests.Init()</span><br><span class="line">err := tests.Start()</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line">os.Exit(<span class="number">1</span>)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">err = tests.Cleanup()</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> 
{</span><br><span class="line">os.Exit(<span class="number">1</span>)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">os.Exit(<span class="number">0</span>)</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>With that, a prototype design is in place, and it can be extended to meet the actual project requirements. ^_^</p>]]></content>
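<summary type="html"><p>While working on a recent project design, I found a practical gap between the real project and testing in the traditional sense. At present the CI stage does not prepare a fresh environment for every test run, and the actual automated tests mix in some manual QA operations, so this has to be handled at the level of the automated test code. <strong>Note that the best design for e2e testing is still to be able to</summary>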
<category term="golang" scheme="http://kiragoo.github.com/categories/golang/"/>
<category term="测试" scheme="http://kiragoo.github.com/categories/golang/%E6%B5%8B%E8%AF%95/"/>
<category term="golang" scheme="http://kiragoo.github.com/tags/golang/"/>
<category term="e2e 测试" scheme="http://kiragoo.github.com/tags/e2e-%E6%B5%8B%E8%AF%95/"/>
</entry>
<entry>
<title>containerd Source Code Analysis - [2] The CRI Plugin</title>
<link href="http://kiragoo.github.com/archives/436e2913.html"/>
<id>http://kiragoo.github.com/archives/436e2913.html</id>
<published>2023-04-03T13:04:54.000Z</published>
<updated>2023-04-19T04:38:51.054Z</updated>
<content type="html"><![CDATA[<blockquote><p>containerd-v1.7.0<br>This post begins the analysis of how plugins are enabled.</p></blockquote><h1 id="源码分析"><a href="#源码分析" class="headerlink" title="源码分析"></a>Source Code Analysis</h1><h2 id="初始化入口"><a href="#初始化入口" class="headerlink" title="初始化入口"></a>Initialization Entry Point</h2><p><code>pkg/cri/cri.go:42</code></p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Register CRI service plugin</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">init</span><span class="params">()</span></span> {</span><br><span class="line"> <span class="comment">// Default config</span></span><br><span class="line">config := criconfig.DefaultConfig()</span><br><span class="line"> <span class="comment">// Register the required information</span></span><br><span class="line">plugin.Register(&plugin.Registration{</span><br><span class="line"> <span class="comment">// GRPC Plugin</span></span><br><span class="line">Type: plugin.GRPCPlugin, </span><br><span class="line">ID: <span class="string">"cri"</span>,</span><br><span class="line">Config: &config,</span><br><span class="line"> <span class="comment">// Requires: the plugins this one depends on; see the top-level `app.Run()`</span></span><br><span class="line">Requires: []plugin.Type{</span><br><span class="line">plugin.EventPlugin,</span><br><span 
class="line">plugin.ServicePlugin,</span><br><span class="line">plugin.NRIApiPlugin,</span><br><span class="line">},</span><br><span class="line"> <span class="comment">// Initialization function</span></span><br><span class="line">InitFn: initCRIService,</span><br><span class="line">})</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h2 id="CRIService-初始化流程"><a href="#CRIService-初始化流程" class="headerlink" title="CRIService 初始化流程"></a>CRIService Initialization Flow</h2><p><code>pkg/cri/cri.go:57</code></p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span 
class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">initCRIService</span><span class="params">(ic *plugin.InitContext)</span> <span class="params">(<span class="keyword">interface</span>{}, error)</span></span> {</span><br><span class="line">...</span><br><span class="line"><span class="comment">// Pass the context along</span></span><br><span class="line">ctx := ic.Context</span><br><span class="line"><span class="comment">// Plugin config</span></span><br><span class="line">pluginConfig := ic.Config.(*criconfig.PluginConfig)</span><br><span class="line"><span class="comment">// Validate the plugin config</span></span><br><span class="line"><span class="keyword">if</span> err := criconfig.ValidatePluginConfig(ctx, pluginConfig); err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, fmt.Errorf(<span class="string">"invalid plugin config: %w"</span>, err)</span><br><span class="line">}</span><br><span class="line"><span class="comment">// Initialize criconfig</span></span><br><span class="line">c := criconfig.Config{</span><br><span class="line">PluginConfig: *pluginConfig,</span><br><span class="line">ContainerdRootDir: filepath.Dir(ic.Root),</span><br><span class="line">ContainerdEndpoint: ic.Address,</span><br><span class="line">RootDir: ic.Root,</span><br><span class="line">StateDir: ic.State,</span><br><span class="line">}</span><br><span class="line">...</span><br><span class="line"><span class="comment">// Construct the containerd client</span></span><br><span class="line">client, err := containerd.New(</span><br><span class="line"><span class="string">""</span>,</span><br><span class="line">containerd.WithDefaultNamespace(constants.K8sContainerdNamespace),</span><br><span class="line">containerd.WithDefaultPlatform(platforms.Default()),</span><br><span 
class="line"><span class="comment">// WithInMemoryServices is for cases where the containerd client is used from another (in-memory) containerd plugin, such as CRI.</span></span><br><span class="line">containerd.WithInMemoryServices(ic),</span><br><span class="line">)</span><br><span class="line">...</span><br><span class="line"><span class="comment">// Construct the CRIService depending on the ENABLE_CRI_SANDBOXES environment variable</span></span><br><span class="line"><span class="keyword">var</span> s server.CRIService</span><br><span class="line"><span class="keyword">if</span> os.Getenv(<span class="string">"ENABLE_CRI_SANDBOXES"</span>) != <span class="string">""</span> {</span><br><span class="line">log.G(ctx).Info(<span class="string">"using experimental CRI Sandbox server - unset ENABLE_CRI_SANDBOXES to disable"</span>)</span><br><span class="line">s, err = sbserver.NewCRIService(c, client, getNRIAPI(ic))</span><br><span class="line">} <span class="keyword">else</span> {</span><br><span class="line">log.G(ctx).Info(<span class="string">"using legacy CRI server"</span>)</span><br><span class="line">s, err = server.NewCRIService(c, client, getNRIAPI(ic))</span><br><span class="line">}</span><br><span class="line">...</span><br><span class="line"><span class="comment">// Start a goroutine to run the CRIService</span></span><br><span class="line"><span class="keyword">go</span> <span class="function"><span class="keyword">func</span><span class="params">()</span></span> {</span><br><span class="line"><span class="keyword">if</span> err := s.Run(); err != <span class="literal">nil</span> {</span><br><span class="line">log.G(ctx).WithError(err).Fatal(<span class="string">"Failed to run CRI service"</span>)</span><br><span class="line">}</span><br><span class="line"><span class="comment">// TODO(random-liu): Whether and how we can stop containerd.</span></span><br><span class="line">}()</span><br><span class="line">...</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>For an introduction to <code>NRI</code>, see <a 
href="https://kiragoo.github.io/archives/9ed9f81c.html">NRI: A Next-Generation Solution for Fine-Grained Node Resource Control</a></p><h2 id="构造-CRIService-服务"><a href="#构造-CRIService-服务" class="headerlink" title="构造 CRIService 服务"></a>Constructing the <code>CRIService</code> Service</h2><h3 id="criService-结构体定义"><a href="#criService-结构体定义" class="headerlink" title="criService 结构体定义"></a><code>criService</code> Struct Definition</h3><p><code>pkg/cri/server/service.go:71</code></p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">type</span> criService <span class="keyword">struct</span> {</span><br><span class="line"><span class="comment">// cri 
config</span></span><br><span class="line">config criconfig.Config</span><br><span class="line"><span class="comment">// path of the image filesystem</span></span><br><span class="line">imageFSPath <span class="keyword">string</span></span><br><span class="line"><span class="comment">// abstracts OS-level operations</span></span><br><span class="line">os osinterface.OS</span><br><span class="line"><span class="comment">// stores sandbox-related resources</span></span><br><span class="line">sandboxStore *sandboxstore.Store</span><br><span class="line"><span class="comment">// stores all sandbox names to guarantee their uniqueness</span></span><br><span class="line">sandboxNameIndex *registrar.Registrar</span><br><span class="line"><span class="comment">// stores container-related resources</span></span><br><span class="line">containerStore *containerstore.Store</span><br><span class="line"><span class="comment">// stores all container names to guarantee their uniqueness</span></span><br><span class="line">containerNameIndex *registrar.Registrar</span><br><span class="line"><span class="comment">// stores image-related resources</span></span><br><span class="line">imageStore *imagestore.Store</span><br><span class="line"><span class="comment">// stores information about all snapshots</span></span><br><span class="line">snapshotStore *snapshotstore.Store</span><br><span class="line"><span class="comment">// netPlugin sets up / tears down networking when a pod sandbox is run / stopped</span></span><br><span class="line">netPlugin <span class="keyword">map</span>[<span class="keyword">string</span>]cni.CNI</span><br><span class="line"><span class="comment">// client is the containerd client instance</span></span><br><span class="line">client *containerd.Client</span><br><span class="line"><span class="comment">// streamServer is the server that handles container streaming requests</span></span><br><span class="line">streamServer streaming.Server</span><br><span class="line"><span class="comment">// eventMonitor is the monitor that watches containerd events</span></span><br><span class="line">eventMonitor *eventMonitor</span><br><span class="line"><span class="comment">// initialized indicates whether the server has been initialized; until then, all GRPC services must return 
error</span></span><br><span class="line">initialized atomic.Bool</span><br><span class="line"><span class="comment">// cniNetConfMonitor reloads the cni network config whenever the config files in the network conf dir undergo a valid change</span></span><br><span class="line">cniNetConfMonitor <span class="keyword">map</span>[<span class="keyword">string</span>]*cniNetConfSyncer</span><br><span class="line"><span class="comment">// baseOCISpecs contains the OCI specs cached via Runtime.BaseRuntimeSpec</span></span><br><span class="line">baseOCISpecs <span class="keyword">map</span>[<span class="keyword">string</span>]*oci.Spec</span><br><span class="line"><span class="comment">// allCaps is the list of capabilities; when empty, it is parsed from /proc/self/status</span></span><br><span class="line">allCaps []<span class="keyword">string</span></span><br><span class="line"><span class="comment">// unpackDuplicationSuppressor ensures that only a single fetch request or unpack handler does the processing</span></span><br><span class="line">unpackDuplicationSuppressor kmutex.KeyedLocker</span><br><span class="line"><span class="comment">// nri is used to call back into NRI while handling CRI requests</span></span><br><span class="line">nri *nri.API</span><br><span class="line"><span class="comment">// containerEventsChan captures container events and sends them to callers of GetContainerEvents</span></span><br><span class="line">containerEventsChan <span class="keyword">chan</span> runtime.ContainerEventResponse</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h3 id="NewCRIService-构造"><a href="#NewCRIService-构造" class="headerlink" title="NewCRIService 构造"></a>The <code>NewCRIService</code> Constructor</h3><p><code>pkg/cri/server/service.go:123</code></p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span 
class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span 
class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">NewCRIService</span><span class="params">(config criconfig.Config, client *containerd.Client, nri *nri.API)</span> <span class="params">(CRIService, error)</span></span> {</span><br><span class="line"><span class="keyword">var</span> err error</span><br><span class="line">labels := label.NewStore()</span><br><span class="line">c := &criService{</span><br><span class="line">config: config,</span><br><span class="line">client: client,</span><br><span class="line">os: osinterface.RealOS{},</span><br><span class="line">sandboxStore: sandboxstore.NewStore(labels),</span><br><span class="line">containerStore: containerstore.NewStore(labels),</span><br><span class="line">imageStore: imagestore.NewStore(client),</span><br><span class="line">snapshotStore: snapshotstore.NewStore(),</span><br><span class="line">sandboxNameIndex: registrar.NewRegistrar(),</span><br><span class="line">containerNameIndex: registrar.NewRegistrar(),</span><br><span class="line">initialized: atomic.NewBool(<span class="literal">false</span>),</span><br><span class="line">netPlugin: <span class="built_in">make</span>(<span class="keyword">map</span>[<span class="keyword">string</span>]cni.CNI),</span><br><span class="line">unpackDuplicationSuppressor: kmutex.New(),</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// <span class="doctag">TODO:</span> figure out a proper channel size.</span></span><br><span class="line">c.containerEventsChan = <span class="built_in">make</span>(<span class="keyword">chan</span> runtime.ContainerEventResponse, <span class="number">1000</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment">// 
Check that the configured snapshotter exists</span></span><br><span class="line"><span class="keyword">if</span> client.SnapshotService(c.config.ContainerdConfig.Snapshotter) == <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, fmt.Errorf(<span class="string">"failed to find snapshotter %q"</span>, c.config.ContainerdConfig.Snapshotter)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Build the image filesystem path</span></span><br><span class="line">c.imageFSPath = imageFSPath(config.ContainerdRootDir, config.ContainerdConfig.Snapshotter)</span><br><span class="line">logrus.Infof(<span class="string">"Get image filesystem path %q"</span>, c.imageFSPath)</span><br><span class="line"></span><br><span class="line"><span class="comment">// Platform-specific initialization (separate implementations for Linux, Windows, and other systems)</span></span><br><span class="line"><span class="keyword">if</span> err := c.initPlatform(); err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, fmt.Errorf(<span class="string">"initialize platform: %w"</span>, err)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Initialize the stream server</span></span><br><span class="line">c.streamServer, err = newStreamServer(c, config.StreamServerAddress, config.StreamServerPort, config.StreamIdleTimeout)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, fmt.Errorf(<span class="string">"failed to create stream server: %w"</span>, err)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Initialize the event monitor</span></span><br><span class="line">c.eventMonitor = newEventMonitor(c)</span><br><span 
class="line"></span><br><span class="line"><span class="comment">// Initialize the cni net conf monitors</span></span><br><span class="line">c.cniNetConfMonitor = <span class="built_in">make</span>(<span class="keyword">map</span>[<span class="keyword">string</span>]*cniNetConfSyncer)</span><br><span class="line"><span class="keyword">for</span> name, i := <span class="keyword">range</span> c.netPlugin {</span><br><span class="line">path := c.config.NetworkPluginConfDir</span><br><span class="line"><span class="keyword">if</span> name != defaultNetworkPlugin {</span><br><span class="line"><span class="keyword">if</span> rc, ok := c.config.Runtimes[name]; ok {</span><br><span class="line">path = rc.NetworkPluginConfDir</span><br><span class="line">}</span><br><span class="line">}</span><br><span class="line"><span class="keyword">if</span> path != <span class="string">""</span> {</span><br><span class="line">m, err := newCNINetConfSyncer(path, i, c.cniLoadOptions())</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, fmt.Errorf(<span class="string">"failed to create cni conf monitor for %s: %w"</span>, name, err)</span><br><span class="line">}</span><br><span class="line">c.cniNetConfMonitor[name] = m</span><br><span class="line">}</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Preload the base OCI specs</span></span><br><span class="line">c.baseOCISpecs, err = loadBaseOCISpecs(&config)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, err</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Load the sandbox controllers (pod sandbox controller and remote shim 
controller)</span></span><br><span class="line">c.sandboxControllers[criconfig.ModePodSandbox] = podsandbox.New(config, client, c.sandboxStore, c.os, c, c.baseOCISpecs)</span><br><span class="line">c.sandboxControllers[criconfig.ModeShim] = client.SandboxController()</span><br><span class="line"></span><br><span class="line">c.nri = nri</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> c, <span class="literal">nil</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure><blockquote><p>See also: <a href="https://wiki.gentoo.org/wiki/SELinux/Labels#:~:text=The%20label%20of%20a%20process%20is%20decided%20by,will%20not%20be%20allowed%20by%20the%20SELinux%20policy.">SELinux labels explained</a></p></blockquote><h2 id="启动-CRI-Service"><a href="#启动-CRI-Service" class="headerlink" title="Starting the CRI Service"></a>Starting the CRI Service</h2><p><code>pkg/cri/server/service.go:207</code></p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span 
class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span 
class="line">94</span><br><span class="line">95</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(c *criService)</span> <span class="title">Run</span><span class="params">()</span> <span class="title">error</span></span> {</span><br><span class="line">logrus.Info(<span class="string">"Start subscribing containerd event"</span>)</span><br><span class="line"><span class="comment">// Subscribe to containerd events</span></span><br><span class="line">c.eventMonitor.subscribe(c.client)</span><br><span class="line"></span><br><span class="line">logrus.Infof(<span class="string">"Start recovering state"</span>)</span><br><span class="line"><span class="comment">// Recover the system state from containerd and the status checkpoint</span></span><br><span class="line"><span class="keyword">if</span> err := c.<span class="built_in">recover</span>(ctrdutil.NamespacedContext()); err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> fmt.Errorf(<span class="string">"failed to recover state: %w"</span>, err)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Start event handler.</span></span><br><span class="line">logrus.Info(<span class="string">"Start event monitor"</span>)</span><br><span class="line"><span class="comment">// Start the eventMonitor</span></span><br><span class="line">eventMonitorErrCh := c.eventMonitor.start()</span><br><span class="line"></span><br><span class="line"><span class="comment">// Start snapshot stats syncer, it doesn't need to be stopped.</span></span><br><span class="line">logrus.Info(<span class="string">"Start snapshots syncer"</span>)</span><br><span class="line"><span class="comment">// Build the snapshotsSyncer</span></span><br><span class="line">snapshotsSyncer := newSnapshotsSyncer(</span><br><span class="line">c.snapshotStore,</span><br><span 
class="line">c.client.SnapshotService(c.config.ContainerdConfig.Snapshotter),</span><br><span class="line">time.Duration(c.config.StatsCollectPeriod)*time.Second,</span><br><span class="line">)</span><br><span class="line"><span class="comment">// Start the snapshotsSyncer</span></span><br><span class="line">snapshotsSyncer.start()</span><br><span class="line"></span><br><span class="line"><span class="comment">// Start the CNI network conf syncers</span></span><br><span class="line">cniNetConfMonitorErrCh := <span class="built_in">make</span>(<span class="keyword">chan</span> error, <span class="built_in">len</span>(c.cniNetConfMonitor))</span><br><span class="line"><span class="keyword">var</span> netSyncGroup sync.WaitGroup</span><br><span class="line"><span class="keyword">for</span> name, h := <span class="keyword">range</span> c.cniNetConfMonitor {</span><br><span class="line">netSyncGroup.Add(<span class="number">1</span>)</span><br><span class="line">logrus.Infof(<span class="string">"Start cni network conf syncer for %s"</span>, name)</span><br><span class="line"><span class="keyword">go</span> <span class="function"><span class="keyword">func</span><span class="params">(h *cniNetConfSyncer)</span></span> {</span><br><span class="line">cniNetConfMonitorErrCh <- h.syncLoop()</span><br><span class="line">netSyncGroup.Done()</span><br><span class="line">}(h)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> <span class="built_in">len</span>(c.cniNetConfMonitor) > <span class="number">0</span> {</span><br><span class="line"><span class="keyword">go</span> <span class="function"><span class="keyword">func</span><span class="params">()</span></span> {</span><br><span class="line">netSyncGroup.Wait()</span><br><span class="line"><span class="built_in">close</span>(cniNetConfMonitorErrCh)</span><br><span class="line">}()</span><br><span class="line">}</span><br><span class="line"></span><br><span 
class="line"><span class="comment">// Start the streaming server.</span></span><br><span class="line">logrus.Info(<span class="string">"Start streaming server"</span>)</span><br><span class="line">streamServerErrCh := <span class="built_in">make</span>(<span class="keyword">chan</span> error)</span><br><span class="line"><span class="keyword">go</span> <span class="function"><span class="keyword">func</span><span class="params">()</span></span> {</span><br><span class="line"><span class="keyword">defer</span> <span class="built_in">close</span>(streamServerErrCh)</span><br><span class="line"><span class="keyword">if</span> err := c.streamServer.Start(<span class="literal">true</span>); err != <span class="literal">nil</span> && err != http.ErrServerClosed {</span><br><span class="line">logrus.WithError(err).Error(<span class="string">"Failed to start streaming server"</span>)</span><br><span class="line">streamServerErrCh <- err</span><br><span class="line">}</span><br><span class="line">}()</span><br><span class="line"></span><br><span class="line"><span class="comment">// Register the CRI domain with NRI</span></span><br><span class="line"><span class="keyword">if</span> err := c.nri.Register(&criImplementation{c}); err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> fmt.Errorf(<span class="string">"failed to set up NRI for CRI service: %w"</span>, err)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Mark the server as initialized. 
From here on the GRPC services are live.</span></span><br><span class="line">c.initialized.Set()</span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> eventMonitorErr, streamServerErr, cniNetConfMonitorErr error</span><br><span class="line"><span class="keyword">select</span> {</span><br><span class="line"><span class="keyword">case</span> eventMonitorErr = <-eventMonitorErrCh:</span><br><span class="line"><span class="keyword">case</span> streamServerErr = <-streamServerErrCh:</span><br><span class="line"><span class="keyword">case</span> cniNetConfMonitorErr = <-cniNetConfMonitorErrCh:</span><br><span class="line">}</span><br><span class="line"><span class="keyword">if</span> err := c.Close(); err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> fmt.Errorf(<span class="string">"failed to stop cri service: %w"</span>, err)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> err := <-eventMonitorErrCh; err != <span class="literal">nil</span> {</span><br><span class="line">eventMonitorErr = err</span><br><span class="line">}</span><br><span class="line">logrus.Info(<span class="string">"Event monitor stopped"</span>)</span><br><span class="line"><span class="keyword">if</span> err := <-streamServerErrCh; err != <span class="literal">nil</span> {</span><br><span class="line">streamServerErr = err</span><br><span class="line">}</span><br><span class="line">logrus.Info(<span class="string">"Stream server stopped"</span>)</span><br><span class="line"><span class="keyword">if</span> eventMonitorErr != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> fmt.Errorf(<span class="string">"event monitor error: %w"</span>, eventMonitorErr)</span><br><span class="line">}</span><br><span class="line"><span class="keyword">if</span> streamServerErr != <span class="literal">nil</span> 
{</span><br><span class="line"><span class="keyword">return</span> fmt.Errorf(<span class="string">"stream server error: %w"</span>, streamServerErr)</span><br><span class="line">}</span><br><span class="line"><span class="keyword">if</span> cniNetConfMonitorErr != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> fmt.Errorf(<span class="string">"cni network conf monitor error: %w"</span>, cniNetConfMonitorErr)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure><h2 id="流程架构图整理"><a href="#流程架构图整理" class="headerlink" title="Workflow diagram"></a>Workflow diagram</h2><p><img src="https://kblogs.oss-cn-beijing.aliyuncs.com/blogimgs/Cri-work-flow.png" alt="CRI-work-flow"></p>]]></content>
<summary type="html"><blockquote>
<p>containerd-v1.7.0<br>This post begins the analysis of how plugins are enabled.</p>
</blockquote>
<h1 id="源码分析"><a href="#源码分析" class="headerlink" title="源码分析"></a>源码分</summary>
<category term="containerd" scheme="http://kiragoo.github.com/categories/contaierd/"/>
<category term="容器" scheme="http://kiragoo.github.com/tags/%E5%AE%B9%E5%99%A8/"/>
<category term="containerd" scheme="http://kiragoo.github.com/tags/contaierd/"/>
</entry>
<entry>
<title>containerd Source Code Analysis [1]: Startup Flow</title>
<link href="http://kiragoo.github.com/archives/a46d2134.html"/>
<id>http://kiragoo.github.com/archives/a46d2134.html</id>
<published>2023-04-03T12:27:21.000Z</published>
<updated>2023-04-03T12:56:15.486Z</updated>
<content type="html"><![CDATA[<p>According to the overview on the official site, <a href="https://containerd.io/">containerd.io</a>, <code>containerd</code> manages the container lifecycle. Its role within the overall container ecosystem is shown below:<br><img src="https://kblogs.oss-cn-beijing.aliyuncs.com/blogimgs/architecture.png" alt="architecture"></p><h1 id="如下分析-containerd-的启动流程"><a href="#如下分析-containerd-的启动流程" class="headerlink" title="Analyzing the containerd startup flow"></a>Analyzing the <code>containerd</code> startup flow</h1><blockquote><p>Based on containerd-v1.7.0</p></blockquote><p>The entry point is fairly simple:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// cmd/containerd/main.go</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">main</span><span class="params">()</span></span> {</span><br><span class="line">app := command.App()</span><br><span class="line"><span class="keyword">if</span> err := app.Run(os.Args); err != <span class="literal">nil</span> {</span><br><span class="line">fmt.Fprintf(os.Stderr, <span class="string">"containerd: %s\n"</span>, err)</span><br><span class="line">os.Exit(<span class="number">1</span>)</span><br><span class="line">}</span><br><span class="line">}</span><br></pre></td></tr></table></figure><ul><li>The key steps in command.App are:<ul><li>Build the flags</li><li>Register the plugins; the important part is the initialization performed by plugin.Init</li><li>Start the TCPServer, GRPCServer, and TTRPCServer</li></ul></li></ul><p>An outline of the flow:<br><img src="https://kblogs.oss-cn-beijing.aliyuncs.com/blogimgs/containerd-run.png" alt="containerd-run"></p><blockquote><p>Later posts will work through specific plugins in detail.</p></blockquote>]]></content>
<summary type="html">containerd source code analysis</summary>
<category term="containerd" scheme="http://kiragoo.github.com/categories/containerd/"/>
<category term="容器" scheme="http://kiragoo.github.com/tags/%E5%AE%B9%E5%99%A8/"/>
<category term="containerd" scheme="http://kiragoo.github.com/tags/containerd/"/>
</entry>
<entry>
<title>Notes on an SR-IOV VF NIC Naming Problem</title>
<link href="http://kiragoo.github.com/archives/c882714e.html"/>
<id>http://kiragoo.github.com/archives/c882714e.html</id>
<published>2023-04-03T09:05:37.000Z</published>
<updated>2023-04-03T09:31:31.111Z</updated>
<content type="html"><![CDATA[<h1 id="SR-IOV-VF-网卡命名问题简单记录下"><a href="#SR-IOV-VF-网卡命名问题简单记录下" class="headerlink" title="A quick note on an SR-IOV VF NIC naming problem"></a>A quick note on an SR-IOV VF NIC naming problem</h1><p>In a recent work project we built virtual gateway nodes with <code>KVM</code> + <code>SR-IOV VF</code>. When planning and deploying the resources, the NICs on each network plane had to be named uniformly; for example,<br>the NICs for <code>Domain Node Name</code> and <code>Admin Node Name</code> each follow a single naming convention.</p><h2 id="通过业务脚本实现对应网卡及-Slot-配置文件"><a href="#通过业务脚本实现对应网卡及-Slot-配置文件" class="headerlink" title="Generating the NIC and Slot configuration files with a script"></a>Generating the NIC and <code>Slot</code> configuration files with a script</h2><p>Attach the NIC with <code>virsh</code>, e.g.:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">virsh attach-device <span class="variable">${NODE_NAME}</span> <span class="variable">${XML_CONFIG}</span> --config</span><br></pre></td></tr></table></figure><p>To make sure the NIC is actually picked up, it is best to bring it up with <code>virsh shutdown ${NODE_NAME}</code> followed by <code>virsh start ${NODE_NAME}</code>.<br>The original environment is no longer available, so there are no screenshots here.</p><p>At this point <code>ip a</code> shows that the NIC has been detected, but following <a href="https://www.thegeekdiary.com/how-to-set-a-custom-interface-name-with-networkmanager-in-centos-rhel-7/">How to Set a custom Interface Name With NetworkManager in Centos/RHEL 7</a> did not work.</p><p>From my own reading, what that article describes is putting naming rules under <code>/etc/udev/rules.d</code> so that the <code>kernel</code> renames the NIC during the boot stage.</p><p>I ended up working around it another way:</p><ul><li>First, move the existing <code>/etc/default/grub</code> out of the way as a backup: <code>mv /etc/default/grub /etc/default/grub.bak</code></li><li>Write the following to <code>/etc/default/grub</code>:</li></ul><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span 
class="line">GRUB_TIMEOUT=5</span><br><span class="line">GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"</span><br><span class="line">GRUB_DEFAULT=saved</span><br><span class="line">GRUB_DISABLE_SUBMENU=true</span><br><span class="line">GRUB_TERMINAL_OUTPUT="ofconsole"</span><br><span class="line">GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=cl/root rd.lvm.lv=cl/swap net.ifnames=0 rhgb quiet" # net.ifnames=0 added so that NICs fall back to the legacy ethN naming</span><br><span class="line">GRUB_DISABLE_RECOVERY="true"</span><br><span class="line">GRUB_ENABLE_BLSCFG=true</span><br><span class="line">GRUB_TERMINFO="terminfo -g 80x24 console"</span><br><span class="line">GRUB_DISABLE_OS_PROBER=true</span><br></pre></td></tr></table></figure><ul><li>Regenerate the boot configuration: <code>grub2-mkconfig -o /boot/grub2/grub.cfg</code></li><li>Reboot the VM: <code>reboot</code></li></ul><p>Use <code>ip a</code> to check that the NIC names in the VM took effect:<br><img src="https://kblogs.oss-cn-beijing.aliyuncs.com/blogimgs/vf-interface.png" alt="VF-Interface"></p><h2 id="对网卡进行配置"><a href="#对网卡进行配置" class="headerlink" title="Configuring the NICs"></a>Configuring the NICs</h2><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">nmcli c add con-name ${C_NAME} type ethernet ifname ${IF_NAME} ipv4.address 172.18.1.67/26</span><br><span class="line">nmcli c mod ${C_NAME} ipv4.method static</span><br><span class="line">nmcli c mod ${C_NAME} connection.autoconnect yes</span><br><span class="line">nmcli c mod ${C_NAME} 802-3-ethernet.mtu 9000</span><br><span class="line">nmcli c mod ${C_NAME} ipv6.addr-gen-mode eui64</span><br><span class="line">nmcli c up ${C_NAME}</span><br></pre></td></tr></table></figure><p>I used to host the blog images on <code>Github</code>, which turned out to be too unpredictable, so today I set up <code>oss</code> instead. It also makes later migrations easier; better to leave the job to a product built for it.</p>]]></content>
<summary type="html"><h1 id="SR-IOV-VF-网卡命名问题简单记录下"><a href="#SR-IOV-VF-网卡命名问题简单记录下" class="headerlink" title="A quick note on an SR-IOV VF NIC naming problem"></a>A quick note on an SR-IOV VF NIC naming problem</</summary>
<category term="Linux" scheme="http://kiragoo.github.com/categories/Linux/"/>
<category term="Linux" scheme="http://kiragoo.github.com/tags/Linux/"/>
<category term="Network" scheme="http://kiragoo.github.com/tags/Network/"/>
</entry>
<entry>
<title>kubebuilder Without kube-rbac-proxy: Notes from Experience</title>
<link href="http://kiragoo.github.com/archives/4df09d70.html"/>
<id>http://kiragoo.github.com/archives/4df09d70.html</id>
<published>2022-04-26T10:12:36.000Z</published>
<updated>2023-04-03T08:26:55.906Z</updated>
<content type="html"><![CDATA[<p>A quick plug: you are welcome to try the <a href="https://github.com/emqx/emqx-operator"><code>EMQX Kubernetes Operator</code></a>, which I currently maintain.</p><p>A user in the community recently reported <a href="https://github.com/emqx/emqx-operator/issues/168">Release manifests has broken metrics</a>.<br>The report reads:<br><img src="https://kblogs.oss-cn-beijing.aliyuncs.com/blogimgs/metrics-issue.png" alt="issue-info"><br>The report is against <code>release-1.1.5</code>, where the <code>Operator</code>'s <code>metrics</code> are protected by <code>kube-rbac-proxy</code>. The <a href="https://book.kubebuilder.io/reference/metrics.html"><code>kubebuilder</code> documentation</a> covers this mechanism in detail.</p><p>From the <code>Issue</code> it is quickly clear that the <code>Service</code> has no matching <code>Container Port</code>. The manifests in <code>release-1.1.5</code> look like this:<br><img src="https://kblogs.oss-cn-beijing.aliyuncs.com/blogimgs/release-1.1.5-manifests.png" alt="release-1.1.5-manifests"><br><code>emqx-operator-controller-manager-metrics-service</code> is defined as follows:</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">apiVersion:</span> <span class="string">v1</span></span><br><span class="line"><span class="attr">kind:</span> <span class="string">Service</span></span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line">  <span class="attr">labels:</span></span><br><span class="line">    <span class="attr">control-plane:</span> <span class="string">controller-manager</span></span><br><span class="line">  <span class="attr">name:</span> 
<span class="string">emqx-operator-controller-manager-metrics-service</span></span><br><span class="line">  <span class="attr">namespace:</span> <span class="string">emqx-operator-system</span></span><br><span class="line"><span class="attr">spec:</span></span><br><span class="line">  <span class="attr">ports:</span></span><br><span class="line">  <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">https</span> <span class="comment"># the port is named https</span></span><br><span class="line">    <span class="attr">port:</span> <span class="number">8443</span> <span class="comment"># and listens on 8443</span></span><br><span class="line">    <span class="attr">targetPort:</span> <span class="string">https</span></span><br><span class="line">  <span class="attr">selector:</span></span><br><span class="line">    <span class="attr">control-plane:</span> <span class="string">controller-manager</span></span><br></pre></td></tr></table></figure><p>Meanwhile, <code>.spec.containers.ports</code> in the <code>Operator</code>'s <code>Deployment</code> contains only:</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">ports:</span></span><br><span class="line">  <span class="bullet">-</span> <span class="attr">containerPort:</span> <span class="number">9443</span> </span><br><span class="line">    <span class="attr">name:</span> <span class="string">webhook-server</span> <span class="comment"># only the webhook-server port is declared</span></span><br><span class="line">    <span class="attr">protocol:</span> <span class="string">TCP</span></span><br></pre></td></tr></table></figure><p>Sure enough, the port the <code>Service</code> targets is missing. However, given the protection already in place in private and public deliveries, plus the cost of maintaining an extra image, we decided not to ship the <code>kube-rbac-proxy</code> based in-<code>Pod</code> authorization check for now. That means we have to provide a default configuration for the <code>Metrics</code> <code>EndPoint</code>, so that users can scrape <code>metrics</code> from the <code>/metrics</code> 
<code>EndPoint</code>.</p><h1 id="基于-kustomize-的-config-维护"><a href="#基于-kustomize-的-config-维护" class="headerlink" title="Maintaining the config with kustomize"></a>Maintaining the <code>config</code> with <code>kustomize</code></h1><p>I will not go over the basics of <code>kustomize</code> here; the file to focus on is <code>config/default/kustomization.yaml</code>:</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">bases:</span></span><br><span class="line"><span class="bullet">-</span> <span class="string">../crd</span></span><br><span class="line"><span class="bullet">-</span> <span class="string">../rbac</span></span><br><span class="line"><span class="bullet">-</span> <span class="string">../manager</span></span><br><span class="line"><span class="comment"># [WEBHOOK] To enable webhook, uncomment all the sections with [WEBHOOK] prefix including the one in</span></span><br><span class="line"><span class="comment"># crd/kustomization.yaml</span></span><br><span class="line"><span class="bullet">-</span> <span class="string">../webhook</span></span><br><span class="line"><span class="comment"># [CERTMANAGER] To enable cert-manager, 
uncomment all sections with 'CERTMANAGER'. 'WEBHOOK' components are required.</span></span><br><span class="line"><span class="bullet">-</span> <span class="string">../certmanager</span></span><br><span class="line"><span class="comment"># [PROMETHEUS] To enable prometheus monitor, uncomment all sections with 'PROMETHEUS'.</span></span><br><span class="line"><span class="comment">#- ../prometheus  # requires the matching ServiceMonitor to be deployed; not enabled for now, users who need it can generate the YAML themselves</span></span><br><span class="line"></span><br><span class="line"><span class="attr">patchesStrategicMerge:</span></span><br><span class="line"><span class="comment"># Protect the /metrics endpoint by putting it behind auth.</span></span><br><span class="line"><span class="comment"># If you want your controller-manager to expose the /metrics</span></span><br><span class="line"><span class="comment"># endpoint w/o any authn/z, please comment the following line.</span></span><br><span class="line"><span class="comment"># - manager_auth_proxy_patch.yaml # this patch.yaml is the key point</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Mount the controller config file for loading manager configurations</span></span><br><span class="line"><span class="comment"># through a ComponentConfig type</span></span><br><span class="line"><span class="comment">#- manager_config_patch.yaml</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># [WEBHOOK] To enable webhook, uncomment all the sections with [WEBHOOK] prefix including the one in</span></span><br><span class="line"><span class="comment"># crd/kustomization.yaml</span></span><br><span class="line"><span class="bullet">-</span> <span class="string">manager_webhook_patch.yaml</span></span><br></pre></td></tr></table></figure><p>Now look at the corresponding <code>patch.yaml</code> file:</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span 
class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># This patch inject a sidecar container which is a HTTP proxy for the</span></span><br><span class="line"><span class="comment"># controller manager, it performs RBAC authorization against the Kubernetes API using SubjectAccessReviews.</span></span><br><span class="line"><span class="attr">apiVersion:</span> <span class="string">apps/v1</span></span><br><span class="line"><span class="attr">kind:</span> <span class="string">Deployment</span></span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line">  <span class="attr">name:</span> <span class="string">controller-manager</span></span><br><span class="line">  <span class="attr">namespace:</span> <span class="string">system</span></span><br><span class="line"><span class="attr">spec:</span></span><br><span class="line">  <span class="attr">template:</span></span><br><span class="line">    <span class="attr">spec:</span></span><br><span class="line">      <span class="attr">containers:</span></span><br><span class="line">      <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">kube-rbac-proxy</span></span><br><span class="line">        <span 
class="attr">image:</span> <span class="string">gcr.io/kubebuilder/kube-rbac-proxy:v0.8.0</span></span><br><span class="line">        <span class="attr">args:</span></span><br><span class="line">        <span class="bullet">-</span> <span class="string">"--secure-listen-address=0.0.0.0:8443"</span></span><br><span class="line">        <span class="bullet">-</span> <span class="string">"--upstream=http://127.0.0.1:8080/"</span></span><br><span class="line">        <span class="bullet">-</span> <span class="string">"--logtostderr=true"</span></span><br><span class="line">        <span class="bullet">-</span> <span class="string">"--v=10"</span></span><br><span class="line">        <span class="attr">ports:</span> <span class="comment"># key point: the patch file is commented out, so this section is never generated, which causes the problem reported in the issue</span></span><br><span class="line">        <span class="bullet">-</span> <span class="attr">containerPort:</span> <span class="number">8443</span> </span><br><span class="line">          <span class="attr">name:</span> <span class="string">https</span></span><br><span class="line">      <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">manager</span></span><br><span class="line">        <span class="attr">args:</span></span><br><span class="line">        <span class="bullet">-</span> <span class="string">"--health-probe-bind-address=:8081"</span></span><br><span class="line">        <span class="bullet">-</span> <span class="string">"--metrics-bind-address=127.0.0.1:8080"</span></span><br><span class="line">        <span class="bullet">-</span> <span class="string">"--leader-elect"</span></span><br></pre></td></tr></table></figure><p>Given our delivery scenarios, the <code>manager_auth_proxy</code> configuration stays disabled in the project for now. To still let users view <code>metrics</code>, we need to ship a default configuration while keeping changes to the project configuration minimal. The final solution provides a default configuration based on <code>http:8080</code> and adds a separate <code>patch</code> file. Users who are able to maintain their own setup can uncomment <code># - manager_auth_proxy_patch.yaml</code>, which in effect triggers a <code>$patch: delete</code> action.</p><p>See <a 
href="https://github.com/emqx/emqx-operator/releases/tag/1.1.6">release-1.1.6</a> for the detailed changes.</p><p>Finally, happy release! The <code>RoadMap</code> for <code>1.2.x</code> is already under way and will improve <code>.spec</code>, event logging, and status collection.</p>]]></content>
<summary type="html"><p>A quick plug first: you are welcome to try the <a href="https://github.com/emqx/emqx-operator"><code>EMQX Kubernetes Operator</code></a>, which I currently maintain.</p>
<p>A user in the community recently reported<</summary>
<category term="kubernetes" scheme="http://kiragoo.github.com/categories/kubernetes/"/>
<category term="kubernetes" scheme="http://kiragoo.github.com/tags/kubernetes/"/>
<category term="kubebuilder" scheme="http://kiragoo.github.com/tags/kubebuilder/"/>
</entry>
<entry>
<title>Running Kubernetes Pod Security Policy with OPA</title>
<link href="http://kiragoo.github.com/archives/a459e2be.html"/>
<id>http://kiragoo.github.com/archives/a459e2be.html</id>
<published>2022-04-26T04:15:05.000Z</published>
<updated>2023-04-03T08:26:55.906Z</updated>
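<content type="html"><![CDATA[<blockquote><p>Translated from <a href="https://www.infracloud.io/blogs/kubernetes-pod-security-policies-opa/?utm_sq=gggb8083m5">Kubernetes Pod Security Policies with Open Policy Agent</a></p></blockquote><p>Kubernetes is the most popular container orchestration platform in today's cloud native ecosystem. Kubernetes security is therefore an area of growing interest and concern.</p><p>In this post I will first discuss the Pod Security Policy admission controller, and then show how Open Policy Agent can implement Pod Security Policies. In fact, during the Kubernetes SIG Auth session at <a href="https://www.infracloud.io/blogs/kubecon-2019-us-day-1-recap/">KubeCon + CloudNativeCon North America 2019</a>, Open Policy Agent / Gatekeeper was mentioned as a potential replacement for Pod Security Policy.</p><p>First, a brief look at containers, security, and admission controllers.</p><h1 id="容器和安全概述"><a href="#容器和安全概述" class="headerlink" title="容器和安全概述"></a>Containers and security overview</h1><h2 id="Kubernetes-中的容器是什么?"><a href="#Kubernetes-中的容器是什么?" class="headerlink" title="Kubernetes 中的容器是什么?"></a>What is a container in <code>Kubernetes</code>?</h2><p>Containers are lightweight, portable, and easy to manage. Containers running on the same host do not have separate physical or virtual machines; in other words, they share the resources, hardware, and OS kernel of the host they run on. It therefore becomes very important to have proper security around which processes may run inside a container, what privileges those processes have, whether a container allows privilege escalation, which images are used, and so on.</p><h2 id="kubernetes-中的-Pod-是什么?"><a href="#kubernetes-中的-Pod-是什么?" 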
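class="headerlink" title="kubernetes 中的 Pod 是什么?"></a>What is a <code>Pod</code> in <code>Kubernetes</code>?</h2><p>A Pod is the basic execution unit of a Kubernetes application: the smallest and simplest unit in the Kubernetes object model that you create or deploy. It is a group of one or more containers with shared storage and network, together with a specification of how to run those containers.</p><p>So, when enforcing security policies on containers, we inspect and apply them to the Pod specification. How are these policies enforced? With admission controllers.</p><h1 id="什么是-Admission-Controller"><a href="#什么是-Admission-Controller" class="headerlink" title="什么是 Admission Controller ?"></a>What is an <code>Admission Controller</code>?</h1><p>Admission controllers are part of kube-apiserver. They intercept requests to the Kubernetes API server before the configuration is persisted to the cluster store (etcd). An admission controller can be validating (it validates the incoming request), mutating (it modifies the incoming request), or both. See the <a href="https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#what-does-each-admission-controller-do">Kubernetes documentation</a> for a quick overview of the various admission controllers.</p><h1 id="使用-Open-Policy-Agent-作为-Admission-Controller"><a href="#使用-Open-Policy-Agent-作为-Admission-Controller" class="headerlink" title="使用 Open Policy Agent 作为 Admission Controller"></a>Using <code>Open Policy Agent</code> as an <code>Admission Controller</code></h1><p>Open Policy Agent (OPA) is an open source, general-purpose policy engine that makes it possible to write policy as code. OPA provides a high-level declarative language, Rego, for this purpose. With OPA we can enforce policies across microservices, CI/CD pipelines, API gateways, and more. One of OPA's most important use cases is Kubernetes policy enforcement as an admission controller.</p><p>The policies are written in Rego and loaded into OPA, which runs as an admission controller for the Kubernetes cluster. OPA evaluates every resource create/update/delete request sent to the Kubernetes API server against the Rego policies. If the request satisfies all policies it is allowed; if even a single policy fails, the request is denied.</p><p><img src="https://kblogs.oss-cn-beijing.aliyuncs.com/blogimgs/admission-control.png" alt="Admission Control"></p><p>For more details on <code>OPA</code> and <code>Rego</code>, see the <a href="https://www.openpolicyagent.org/docs/latest/"><code>Rego Docs</code></a>.</p><p>Now let us look at Pod Security Policy in detail.</p><h1 id="Pod-Security-Policy-是什么?"><a href="#Pod-Security-Policy-是什么?" 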
class="headerlink" title="Pod Security Policy 是什么?"></a>What is <code>Pod Security Policy</code>?</h1><p>Pod Security Policy (PSP) is a cluster-level resource implemented as an admission controller. PSP lets users translate security requirements into concrete policies that govern Pod specifications. When a PodSecurityPolicy resource is first created, it does nothing. To use it, the requesting user or the target Pod's service account must be authorized to use the policy by being granted the "use" verb. See Enabling Pod Security Policies in the <a href="https://kubernetes.io/docs/concepts/policy/pod-security-policy/#enabling-pod-security-policies">Kubernetes documentation</a>.</p><p>Note that the PSP admission controller is both a validating and a mutating admission controller. For some parameters it uses default values to mutate the incoming request. The order is always mutate first, then validate.</p><h2 id="我们可以使用-PSP-控制哪些参数?"><a href="#我们可以使用-PSP-控制哪些参数?" class="headerlink" title="我们可以使用 PSP 控制哪些参数?"></a>Which parameters can we control with <code>PSP</code>?</h2><p>The table below briefly summarizes the parameters and fields used in PSP. See the <a href="https://kubernetes.io/docs/concepts/policy/pod-security-policy/#policy-reference">Kubernetes documentation</a> for detailed explanations.</p><table><thead><tr><th>Field</th><th>Kubernetes API Reference (Kind, Version, Group)</th><th>Control Aspect</th></tr></thead><tbody><tr><td>privileged</td><td><a href="https://v1-18.docs.kubernetes.io/docs/reference/generated/kubernetes-api/v1.18/#securitycontext-v1-core">SecurityContext v1 core</a></td><td>Running containers in privileged mode</td></tr><tr><td>hostPID, hostIPC</td><td><a href="https://v1-18.docs.kubernetes.io/docs/reference/generated/kubernetes-api/v1.18/#podspec-v1-core">PodSpec v1 core</a></td><td>Usage of host namespaces</td></tr><tr><td>…</td><td>…</td><td>…</td></tr></tbody></table><h1 id="如何使用-OPA-来-实现-PSP"><a href="#如何使用-OPA-来-实现-PSP" class="headerlink" title="如何使用 OPA 来 实现 PSP ?"></a>How to implement <code>PSP</code> with <code>OPA</code>?</h1><p>As mentioned earlier, the Rego language lets us write any custom policy as code. This means we can write the Pod Security Policies above in Rego and enforce them with OPA acting as an admission controller.</p><p>Let us take a quick look at a <code>Rego</code> implementation of the <code>privileged pod policy</code>.</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">package kubernetes.admission</span><br><span class="line"></span><br><span class="line">deny[message] {</span><br><span class="line">    #applies for Pod resources</span><br><span class="line">    input.request.kind.kind == "Pod"</span><br><span class="line">    #loops through all containers in the request</span><br><span class="line">    container := input.request.object.spec.containers[_]</span><br><span class="line">    #for each container, check privileged field</span><br><span class="line">    container.securityContext.privileged</span><br><span class="line">    #if all above statements are true, return message</span><br><span class="line">    message := sprintf("Container %v runs in privileged mode.", [container.name])</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>So what does this policy do? If any container in the input request runs as a privileged container, it returns a message.</p><h1 id="PSP-实践"><a href="#PSP-实践" class="headerlink" title="PSP 实践"></a><code>PSP</code> in practice</h1><p>Let us see this policy in action with a basic minikube-based walkthrough. First, follow the tutorial in the <a href="https://www.openpolicyagent.org/docs/latest/kubernetes-tutorial/">OPA documentation</a> to set up OPA as an admission controller. That tutorial loads an Ingress validation policy; we will load the privileged policy shown above instead.</p><p>Once OPA is set up as an admission controller on minikube, create a file named privileged.rego containing the policy above. Then create a <code>configmap</code> from it in the "opa" namespace.</p><p><code>kubectl create configmap privileged-policy --from-file=privileged.rego -n opa</code></p><p>Now let us create a Pod with a privileged container using the following manifest.</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span 
class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">apiVersion:</span> <span class="string">v1</span></span><br><span class="line"><span class="attr">kind:</span> <span class="string">Pod</span></span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line">  <span class="attr">name:</span> <span class="string">nginx</span></span><br><span class="line">  <span class="attr">namespace:</span> <span class="string">default</span></span><br><span class="line">  <span class="attr">labels:</span></span><br><span class="line">    <span class="attr">app:</span> <span class="string">nginx</span></span><br><span class="line"><span class="attr">spec:</span></span><br><span class="line">  <span class="attr">containers:</span></span><br><span class="line">  <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">nginx</span></span><br><span class="line">    <span class="attr">image:</span> <span class="string">nginx:latest</span></span><br><span class="line">    <span class="attr">securityContext:</span></span><br><span class="line">      <span class="attr">privileged:</span> <span class="literal">true</span></span><br></pre></td></tr></table></figure><p>When you try to create this <code>pod</code>, you will see that <code>OPA</code> rejects the request.</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">Error from server (Container nginx runs in privileged mode.): error when creating "privileged-deploy.yaml": admission webhook "validating-webhook.openpolicyagent.org" denied the request: Container nginx runs in privileged mode.</span><br></pre></td></tr></table></figure><p>Similarly, we can write the other Pod security policies and enforce them with <code>OPA</code>.</p><p>In this tutorial we loaded the policy through a <code>configmap</code> for simplicity, but that is not the best approach for production deployments. In production, you can configure <code>OPA</code> to periodically download policies from an external 
<code>bundle</code> server. All of your policies can be maintained on the <code>bundle</code> server, and <code>OPA</code> stays up to date by downloading them periodically. See the <a href="https://www.openpolicyagent.org/docs/latest/external-data/#option-3-bundle-api"><code>Bundle API</code></a> for more information.</p><p>In short, with <code>OPA</code> we can enforce <code>Pod</code> security policies. Beyond that, we can use the same setup to enforce any other custom security or standards-based policies.</p><h1 id="在PSP中应用OPA的几点关键益处"><a href="#在PSP中应用OPA的几点关键益处" class="headerlink" title="在PSP中应用OPA的几点关键益处"></a>Key benefits of using <code>OPA</code> for <code>PSP</code></h1><ul><li>All policies can be managed in a single admission controller instead of being scattered across several.</li><li>It complements <code>Policy-as-code</code> in <code>CICD</code> pipelines.</li><li><code>OPA</code> policies can be maintained in a version control tool such as <code>Git</code>, and <code>OPA</code> also provides <code>APIs</code> to manage and load policies dynamically.</li><li>Denial messages can be customized to fit your own implementation.</li></ul><p>In addition, <code>OPA</code> can be deployed as a <code>mutating admission controller</code>, so you can also reproduce the <code>mutating</code> behavior of the <code>PSP Admission Controller</code>.</p>]]></content>
<summary type="html"><blockquote>
<p>Translated from <a href="https://www.infracloud.io/blogs/kubernetes-pod-security-policies-opa/?utm_sq=gggb8083m5">Kubernetes Pod Security</summary>
<category term="kubernetes" scheme="http://kiragoo.github.com/categories/kubernetes/"/>
<category term="kubernetes" scheme="http://kiragoo.github.com/tags/kubernetes/"/>
<category term="PSP" scheme="http://kiragoo.github.com/tags/PSP/"/>
<category term="OPA" scheme="http://kiragoo.github.com/tags/OPA/"/>
</entry>
<entry>
<title>USING KUBE-RBAC-PROXY TO SECURE KUBERNETES WORKLOADS</title>
<link href="http://kiragoo.github.com/archives/217459e6.html"/>
<id>http://kiragoo.github.com/archives/217459e6.html</id>
<published>2022-04-25T13:21:12.000Z</published>
<updated>2023-04-03T08:26:55.907Z</updated>
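<content type="html"><![CDATA[<blockquote><p>Translated from <a href="https://www.brancz.com/2018/02/27/using-kube-rbac-proxy-to-secure-kubernetes-workloads/">USING KUBE-RBAC-PROXY TO SECURE KUBERNETES WORKLOADS</a></p></blockquote><p>While monitoring Kubernetes clusters with Prometheus, I noticed a recurring problem: the metrics Prometheus retrieves may contain sensitive information (for example, the Prometheus node exporter exposes the host's kernel version), which a potential intruder could exploit within the respective Kubernetes cluster. So I asked myself: how can requests from Prometheus be authenticated and authorized so that Prometheus, and only Prometheus, can retrieve metrics from an application running in a Pod?</p><p>The default answer for authenticating and authorizing metrics endpoints with Prometheus is TLS client certificates. However, because issuing, verifying, and rotating client certificates can become rather complex, Prometheus requests are in most cases not authenticated or authorized at all.</p><p>I built kube-rbac-proxy, a small HTTP proxy for a single upstream that can perform RBAC authorization against the Kubernetes API using SubjectAccessReviews. In this post I want to explain how it uses Kubernetes RBAC to do so.</p><h1 id="RBAC是如何在幕后工作的?"><a href="#RBAC是如何在幕后工作的?" class="headerlink" title="RBAC是如何在幕后工作的?"></a>How does RBAC work behind the scenes?</h1><p>Kubernetes role-based access control (RBAC) by itself solves only half of the problem. As the name suggests, it covers only access control, meaning authorization, not authentication. Before a request can be authorized, it must be authenticated. Put simply, we need to find out who is making the request. In Kubernetes, the mechanism by which services authenticate themselves is the ServiceAccount token.</p><p>The Kubernetes API exposes the ability to verify ServiceAccount tokens through a so-called TokenReview. The response of a TokenReview simply states whether the ServiceAccount token was successfully verified and which user the token belongs to. kube-rbac-proxy expects the ServiceAccount token in the Authorization HTTP header and then verifies it with a TokenReview.</p><p>At this point the request is authenticated but not yet authorized. Parallel to TokenReview, Kubernetes has SubjectAccessReview, which is part of the authorization API. A SubjectAccessReview specifies an intended action and the user who wants to perform it. In the concrete case of Prometheus requesting metrics, the /metrics HTTP endpoint is requested. Unfortunately, this is not a fully specified resource in Kubernetes; however, SubjectAccessReview can also authorize so-called "non-resource requests". The possibilities are endless, though: for example, to authorize a proxied request, a SubjectAccessReview could also check that the user has the services/proxy RBAC role.</p><p>When Kubernetes is monitored with Prometheus, the Prometheus server probably already has permission to access the /metrics non-resource URL, because retrieving metrics from the Kubernetes apiserver requires the same RBAC role.</p><h1 id="kube-rbac-proxy"><a href="#kube-rbac-proxy" class="headerlink" title="kube-rbac-proxy"></a>kube-rbac-proxy</h1><p>Now that all the necessary pieces have been explained, let us look at how exactly kube-rbac-proxy authenticates and authorizes a request in the case described at the start of this post: Prometheus requesting metrics from the node exporter.</p><p>When Prometheus sends a request to the node-exporter, kube-rbac-proxy sits in front of it. kube-rbac-proxy performs a TokenReview with the supplied ServiceAccount token, and if the TokenReview succeeds, it goes on to use a SubjectAccessReview to verify that the ServiceAccount is authorized to access the /metrics 
HTTP endpoint.</p><p>Visualized, the whole flow of authenticating and authorizing a request from Prometheus looks like this:<br><img src="https://kblogs.oss-cn-beijing.aliyuncs.com/blogimgs/kube-rbac-proxy.png" alt="kube-rbac-proxy"></p><h1 id="总结"><a href="#总结" class="headerlink" title="总结"></a>Summary</h1><p>If all you want is to authorize certain requests with a static authorization configuration file, you may want to take a look at kube-rbac-proxy.</p><p>Performing authorization inside your own workload may still be useful, especially when finer-grained or data-specific access is required, but it is good to know that Kubernetes gives you tools that take you a long way before you have to spend time on that.</p>]]></content>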
<summary type="html"><blockquote>
<p>Translated from <a href="https://www.brancz.com/2018/02/27/using-kube-rbac-proxy-to-secure-kubernetes-workloads/">USING KUBE-RBAC-PROXY T</summary>
<category term="Kubernetes" scheme="http://kiragoo.github.com/categories/Kubernetes/"/>
<category term="Kubernetes" scheme="http://kiragoo.github.com/tags/Kubernetes/"/>
</entry>
<entry>
<title>Debugging GitHub Actions locally with nektos/act</title>
<link href="http://kiragoo.github.com/archives/7c305169.html"/>
<id>http://kiragoo.github.com/archives/7c305169.html</id>
<published>2022-04-06T14:35:37.000Z</published>
<updated>2023-04-03T08:26:55.906Z</updated>
<content type="html"><![CDATA[<p>We all know that <code>Github Action</code> can be used to build a <code>CI</code> pipeline, with <code>jobs</code> defined to achieve the result we want. Most of the time, though, the outcome of a run is only visible on the remote <code>repo</code>. Is there a way to run the workflow locally and <code>DEBUG</code> it until it behaves as expected? This post introduces <code>nektos/act</code>, a handy tool for running <code>Github Action</code> locally.</p><h1 id="概览"><a href="#概览" class="headerlink" title="概览"></a>Overview</h1><p>The overview in the official repository gives the main reasons for running builds locally:</p><ul><li>Fast feedback. With local builds we do not need a <code>commit/push</code> event to trigger the <code>action</code> on the remote <code>repo</code>. We can simulate what <code>Github Actions</code> does locally and get feedback right away.</li><li>The workflow pipelines can fully replace a <code>Makefile</code>.</li></ul><h1 id="实际操作"><a href="#实际操作" class="headerlink" title="实际操作"></a>Hands-on</h1><p>This assumes you already know how to configure and use <code>Github Action</code> at a basic level; this section focuses on configuring and using <code>act</code>.<br>Once installed, run <code>act -h</code> for a quick check.<br><img src="https://kblogs.oss-cn-beijing.aliyuncs.com/blogimgs/act-help.png" alt="act-help"></p><h2 id="安装"><a href="#安装" class="headerlink" title="安装"></a>Installation</h2><p>I use <code>macos</code>, where it can be installed with <code>brew install act</code>; for other <code>os</code> distributions, follow the installation instructions in the official repository.</p><h2 id="配置"><a href="#配置" class="headerlink" title="配置"></a>Configuration</h2><h3 id="镜像配置"><a href="#镜像配置" class="headerlink" title="镜像配置"></a>Image configuration</h3><p>When a <code>Github Action</code> runs remotely, the build is given a corresponding <code>vm</code>. Locally, <code>act</code> instead uses a Docker image to simulate that <code>os</code> at the container level. I use the default <code>platform/docker-image</code> mapping, but you can also define your own with the following command:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">act -P <platform>=<docker-image></span><br></pre></td></tr></table></figure><p>These are my local images:<br><img src="https://kblogs.oss-cn-beijing.aliyuncs.com/blogimgs/act-image.png" alt="act-image"></p><h3 id="环境变量及密钥配置"><a href="#环境变量及密钥配置" class="headerlink" title="环境变量及密钥配置"></a>Environment variables and secrets</h3><p>Inside a company we mostly work with private repositories, so containerizing a project often pulls dependencies from private repositories. On <code>github</code> we configure the corresponding 
<code>Action->secrets</code>, which a remote <code>github action</code> run can read. But how do we pass them in locally?</p><p><code>act</code> supports <code>secrets</code>: they can be passed directly as command-line arguments or loaded from a local configuration file.<br><img src="https://kblogs.oss-cn-beijing.aliyuncs.com/blogimgs/act-secrets.png" alt="act-secrets"></p>]]></content>
<summary type="html">Running GitHub Actions locally with nektos/act</summary>
<category term="devops" scheme="http://kiragoo.github.com/categories/devops/"/>
<category term="github action" scheme="http://kiragoo.github.com/tags/github-action/"/>
<category term="act" scheme="http://kiragoo.github.com/tags/act/"/>
</entry>
<entry>
<title>client-go hands-on: how to write a CRD client</title>
<link href="http://kiragoo.github.com/archives/ba22a6bc.html"/>
<id>http://kiragoo.github.com/archives/ba22a6bc.html</id>
<published>2021-12-24T12:46:08.000Z</published>
<updated>2022-04-21T12:46:07.734Z</updated>
<content type="html"><![CDATA[<h2 id="需求背景分析"><a href="#需求背景分析" class="headerlink" title="需求背景分析"></a>Background and requirements</h2><p>When building on top of <code>k8s</code>, there are scenarios where we develop our own <code>CRD</code> + <code>Controller</code>, that is, an <code>Operator</code>, to provide cloud native deployment and automated operations on <code>k8s</code>. Let us call this the <strong>underlying platform capability</strong>.</p><p>If we want to wrap that capability in a console for upper-layer business code to call, we need a way to drive these APIs programmatically. Those familiar with <code>Client-go</code> may think of the <code>dynamic client</code>, but as designers and developers we should be well aware of how painful it is to push piles of <code>map</code>s through serialization and deserialization (unless the business developer happens to be the person who designed the <code>Operator</code> >..<).</p><p>Can we use our custom resources as conveniently as native <code>k8s</code> resources such as <code>Deployments</code> and <code>Service</code>?</p><h2 id="必备概念与技能"><a href="#必备概念与技能" class="headerlink" title="必备概念与技能"></a>Required concepts and skills</h2><p>Before diving in, I recommend reading up on <a href="https://kubernetes.io/zh/docs/reference/using-api/api-concepts/"><code>Kubernetes API</code> concepts</a> and being able to look things up in the <a href="https://v1-21.docs.kubernetes.io/docs/reference/generated/kubernetes-api/v1.21/"><code>Kubernetes API</code></a> reference.</p><p>For the examples, I registered the <code>CRD</code>s from <a href="https://github.com/emqx/emqx-operator"><code>emqx-operator</code></a> in my local <code>k8s</code> cluster and treat them as equal to native <code>k8s</code> resources.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">emqxbrokers.apps.emqx.io 2021-12-09T03:59:28Z</span><br><span class="line">emqxenterprises.apps.emqx.io 2021-12-09T03:59:28Z</span><br></pre></td></tr></table></figure><p><code>kubectl api-versions</code> shows the related <code>API</code> information:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="variable">$kubectl</span> api-versions | grep emqx</span><br><span class="line">apps.emqx.io/v1beta1</span><br></pre></td></tr></table></figure><p><code>kubectl api-resources</code> shows the related 
<code>Group</code>, <code>Version</code>, and <code>Kind</code> information.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="variable">$kubectl</span> api-resources </span><br><span class="line">NAME SHORTNAMES APIVERSION NAMESPACED KIND</span><br><span class="line">emqxbrokers emqx apps.emqx.io/v1beta1 <span class="literal">true</span> EmqxBroker</span><br><span class="line">emqxenterprises emqx-ee apps.emqx.io/v1beta1 <span class="literal">true</span> EmqxEnterprise</span><br></pre></td></tr></table></figure><p>I trust you now have the concepts and basic skills above. ^.^</p><h2 id="设计实现"><a href="#设计实现" class="headerlink" title="设计实现"></a>Design and implementation</h2><p>The official documentation for <code>client-go</code> is rather sparse, so as designers and developers we need to be able to read the source. Let us dive into the official <code>client-go</code> <code>Repo</code>. The <code>create-update-delete-deployment</code> example under <code>examples</code> shows how to use the <code>client-go</code> library to issue <code>rest</code> requests.</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span 
class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">main</span><span class="params">()</span></span> {</span><br><span class="line"><span class="keyword">var</span> 
kubeconfig *<span class="keyword">string</span></span><br><span class="line"><span class="keyword">if</span> home := homedir.HomeDir(); home != <span class="string">""</span> {</span><br><span class="line">kubeconfig = flag.String(<span class="string">"kubeconfig"</span>, filepath.Join(home, <span class="string">".kube"</span>, <span class="string">"config"</span>), <span class="string">"(optional) absolute path to the kubeconfig file"</span>)</span><br><span class="line">} <span class="keyword">else</span> {</span><br><span class="line">kubeconfig = flag.String(<span class="string">"kubeconfig"</span>, <span class="string">""</span>, <span class="string">"absolute path to the kubeconfig file"</span>)</span><br><span class="line">}</span><br><span class="line">flag.Parse()</span><br><span class="line"></span><br><span class="line"> <span class="comment">// Initialize the Config</span></span><br><span class="line">config, err := clientcmd.BuildConfigFromFlags(<span class="string">""</span>, *kubeconfig)</span><br><span class="line">...</span><br><span class="line"> <span class="comment">// Build the clientset</span></span><br><span class="line">clientset, err := kubernetes.NewForConfig(config)</span><br><span class="line"></span><br><span class="line"> <span class="comment">// Use the resource client to operate on the resource</span></span><br><span class="line">deploymentsClient := clientset.AppsV1().Deployments(apiv1.NamespaceDefault)</span><br><span class="line"></span><br><span class="line"><span class="comment">// Create Deployment</span></span><br><span class="line">fmt.Println(<span class="string">"Creating deployment..."</span>)</span><br><span class="line">result, err := deploymentsClient.Create(context.TODO(), deployment, metav1.CreateOptions{})</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="built_in">panic</span>(err)</span><br><span class="line">}</span><br><span 
class="line">fmt.Printf(<span class="string">"Created deployment %q.\n"</span>, result.GetObjectMeta().GetName())</span><br><span class="line"></span><br><span class="line"><span class="comment">// Update Deployment</span></span><br><span class="line">prompt()</span><br><span class="line">fmt.Println(<span class="string">"Updating deployment..."</span>)</span><br><span class="line"><span class="comment">// You have two options to Update() this Deployment:</span></span><br><span class="line"><span class="comment">//</span></span><br><span class="line"><span class="comment">// 1. Modify the "deployment" variable and call: Update(deployment).</span></span><br><span class="line"><span class="comment">// This works like the "kubectl replace" command and it overwrites/loses changes</span></span><br><span class="line"><span class="comment">// made by other clients between you Create() and Update() the object.</span></span><br><span class="line"><span class="comment">// 2. Modify the "result" returned by Get() and retry Update(result) until</span></span><br><span class="line"><span class="comment">// you no longer get a conflict error. This way, you can preserve changes made</span></span><br><span class="line"><span class="comment">// by other clients between Create() and Update(). This is implemented below</span></span><br><span class="line"><span class="comment">// using the retry utility package included with client-go. 
(RECOMMENDED)</span></span><br><span class="line"><span class="comment">//</span></span><br><span class="line"><span class="comment">// More Info:</span></span><br><span class="line"><span class="comment">// https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#concurrency-control-and-consistency</span></span><br><span class="line"></span><br><span class="line">retryErr := retry.RetryOnConflict(retry.DefaultRetry, <span class="function"><span class="keyword">func</span><span class="params">()</span> <span class="title">error</span></span> {</span><br><span class="line"><span class="comment">// Retrieve the latest version of Deployment before attempting update</span></span><br><span class="line"><span class="comment">// RetryOnConflict uses exponential backoff to avoid exhausting the apiserver</span></span><br><span class="line">result, getErr := deploymentsClient.Get(context.TODO(), <span class="string">"demo-deployment"</span>, metav1.GetOptions{})</span><br><span class="line"><span class="keyword">if</span> getErr != <span class="literal">nil</span> {</span><br><span class="line"><span class="built_in">panic</span>(fmt.Errorf(<span class="string">"Failed to get latest version of Deployment: %v"</span>, getErr))</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">result.Spec.Replicas = int32Ptr(<span class="number">1</span>) <span class="comment">// reduce replica count</span></span><br><span class="line">result.Spec.Template.Spec.Containers[<span class="number">0</span>].Image = <span class="string">"nginx:1.13"</span> <span class="comment">// change nginx version</span></span><br><span class="line">_, updateErr := deploymentsClient.Update(context.TODO(), result, metav1.UpdateOptions{})</span><br><span class="line"><span class="keyword">return</span> updateErr</span><br><span class="line">})</span><br><span class="line"><span class="keyword">if</span> retryErr != <span class="literal">nil</span> 
{</span><br><span class="line"><span class="built_in">panic</span>(fmt.Errorf(<span class="string">"Update failed: %v"</span>, retryErr))</span><br><span class="line">}</span><br><span class="line">fmt.Println(<span class="string">"Updated deployment..."</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment">// List Deployments</span></span><br><span class="line"> ...</span><br><span class="line"> fmt.Printf(<span class="string">"Listing deployments in namespace %q:\n"</span>, apiv1.NamespaceDefault)</span><br><span class="line">list, err := deploymentsClient.List(context.TODO(), metav1.ListOptions{})</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="built_in">panic</span>(err)</span><br><span class="line">}</span><br><span class="line"><span class="keyword">for</span> _, d := <span class="keyword">range</span> list.Items {</span><br><span class="line">fmt.Printf(<span class="string">" * %s (%d replicas)\n"</span>, d.Name, *d.Spec.Replicas)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Delete Deployment</span></span><br><span class="line"> ...</span><br><span class="line">fmt.Println(<span class="string">"Deleting deployment..."</span>)</span><br><span class="line">deletePolicy := metav1.DeletePropagationForeground</span><br><span class="line"><span class="keyword">if</span> err := deploymentsClient.Delete(context.TODO(), <span class="string">"demo-deployment"</span>, metav1.DeleteOptions{</span><br><span class="line">PropagationPolicy: &deletePolicy,</span><br><span class="line">}); err != <span class="literal">nil</span> {</span><br><span class="line"><span class="built_in">panic</span>(err)</span><br><span class="line">}</span><br><span class="line">fmt.Println(<span class="string">"Deleted deployment."</span>)</span><br><span 
class="line">}</span><br></pre></td></tr></table></figure><p>The key question is how to design and implement a <code>clientset</code> for the <code>resource</code> that corresponds to our <code>CRD</code>.</p><h3 id="clientset-的设计"><a href="#clientset-的设计" class="headerlink" title="clientset 的设计"></a><code>clientset</code> design</h3><p>Reading the source shows that the <code>Config</code> is simply used to construct a <code>rest http</code> client.</p><h4 id="NewForConfig"><a href="#NewForConfig" class="headerlink" title="NewForConfig"></a><code>NewForConfig</code></h4><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">NewForConfig</span><span class="params">(c *rest.Config)</span> <span class="params">(*Clientset, error)</span></span> {</span><br><span class="line">configShallowCopy := *c</span><br><span class="line"></span><br><span class="line"><span class="comment">// share the transport between all clients</span></span><br><span class="line">httpClient, err := rest.HTTPClientFor(&configShallowCopy)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, err</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> NewForConfigAndClient(&configShallowCopy, httpClient)</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h4 id="Clientset"><a href="#Clientset" class="headerlink" title="Clientset"></a><code>Clientset</code></h4><figure class="highlight 
golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">type</span> Interface <span class="keyword">interface</span> {</span><br><span class="line">Discovery() discovery.DiscoveryInterface</span><br><span class="line"> ...</span><br><span class="line"> AppsV1() appsv1.AppsV1Interface <span class="comment">// entry point for Deployment resource calls</span></span><br><span class="line"> ...</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="keyword">type</span> Clientset <span class="keyword">struct</span> {</span><br><span class="line"> *discovery.DiscoveryClient</span><br><span class="line"> ...</span><br><span class="line"> appsV1 *appsv1.AppsV1Client</span><br><span class="line"> ...</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>At this point we can see that <code>Clientset</code> is an abstraction over the concrete resources under <code>AppsV1</code>, such as <code>Deployment</code>, and exposes them to callers. An abstraction layer like this is ultimately what we need to provide for our own resource as well.</p><p><code>AppsV1</code> is itself another layer of abstraction, so let us keep reading:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span 
class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">type</span> AppsV1Interface <span class="keyword">interface</span> {</span><br><span class="line">RESTClient() rest.Interface</span><br><span class="line"> ...</span><br><span class="line">DeploymentsGetter</span><br><span class="line"> ...</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// AppsV1Client is used to interact with features provided by the apps group.</span></span><br><span class="line"><span class="keyword">type</span> AppsV1Client <span class="keyword">struct</span> {</span><br><span class="line">restClient 
rest.Interface</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">...</span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(c *AppsV1Client)</span> <span class="title">Deployments</span><span class="params">(namespace <span class="keyword">string</span>)</span> <span class="title">DeploymentInterface</span></span> {</span><br><span class="line"><span class="keyword">return</span> newDeployments(c, namespace)</span><br><span class="line">}</span><br><span class="line">...</span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">NewForConfig</span><span class="params">(c *rest.Config)</span> <span class="params">(*AppsV1Client, error)</span></span> {</span><br><span class="line">config := *c</span><br><span class="line"> <span class="comment">// pay close attention to setConfigDefaults(&config)</span></span><br><span class="line"><span class="keyword">if</span> err := setConfigDefaults(&config); err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, err</span><br><span class="line">}</span><br><span class="line">httpClient, err := rest.HTTPClientFor(&config)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, err</span><br><span class="line">}</span><br><span class="line"><span class="keyword">return</span> NewForConfigAndClient(&config, httpClient)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">NewForConfigAndClient</span><span class="params">(c *rest.Config, h *http.Client)</span> <span class="params">(*AppsV1Client, error)</span></span> {</span><br><span class="line"> ...</span><br><span 
class="line"><span class="keyword">return</span> &AppsV1Client{client}, <span class="literal">nil</span></span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">...</span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">setConfigDefaults</span><span class="params">(config *rest.Config)</span> <span class="title">error</span></span> {</span><br><span class="line"> <span class="comment">// pay close attention to gv</span></span><br><span class="line">gv := v1.SchemeGroupVersion</span><br><span class="line">config.GroupVersion = &gv</span><br><span class="line">config.APIPath = <span class="string">"/apis"</span></span><br><span class="line">config.NegotiatedSerializer = scheme.Codecs.WithoutConversion()</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> config.UserAgent == <span class="string">""</span> {</span><br><span class="line">config.UserAgent = rest.DefaultKubernetesUserAgent()</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(c *AppsV1Client)</span> <span class="title">RESTClient</span><span class="params">()</span> <span class="title">rest</span>.<span class="title">Interface</span></span> {</span><br><span class="line"><span class="keyword">if</span> c == <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">}</span><br><span class="line"><span class="keyword">return</span> c.restClient</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>By now the picture should be clear: this code constructs the <code>rest client</code> for a specific <code>Resource</code>.</p><p><code>Resource</code> 
<code>client</code> construction is now largely covered, but how does the client know which <code>URL</code> to request?</p><p>This is where a solid grasp of the concepts pays off: a key variable such as <code>SchemeGroupVersion</code> should catch our eye, so let us look at what <code>SchemeGroupVersion</code> actually is.</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// GroupName is the group name use in this package</span></span><br><span class="line"><span class="keyword">const</span> GroupName = <span class="string">"apps"</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// SchemeGroupVersion is group version used to register these objects</span></span><br><span class="line"><span class="keyword">var</span> SchemeGroupVersion = schema.GroupVersion{Group: GroupName, Version: <span class="string">"v1"</span>}</span><br><span 
class="line"></span><br><span class="line"><span class="comment">// Resource takes an unqualified resource and returns a Group qualified GroupResource</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">Resource</span><span class="params">(resource <span class="keyword">string</span>)</span> <span class="title">schema</span>.<span class="title">GroupResource</span></span> {</span><br><span class="line"><span class="keyword">return</span> SchemeGroupVersion.WithResource(resource).GroupResource()</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> (</span><br><span class="line"><span class="comment">// <span class="doctag">TODO:</span> move SchemeBuilder with zz_generated.deepcopy.go to k8s.io/api.</span></span><br><span class="line"><span class="comment">// localSchemeBuilder and AddToScheme will stay in k8s.io/kubernetes.</span></span><br><span class="line">SchemeBuilder = runtime.NewSchemeBuilder(addKnownTypes)</span><br><span class="line">localSchemeBuilder = &SchemeBuilder</span><br><span class="line">AddToScheme = localSchemeBuilder.AddToScheme</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="comment">// Adds the list of known types to the given scheme.</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">addKnownTypes</span><span class="params">(scheme *runtime.Scheme)</span> <span class="title">error</span></span> {</span><br><span class="line">scheme.AddKnownTypes(SchemeGroupVersion,</span><br><span class="line">&Deployment{},</span><br><span class="line">&DeploymentList{},</span><br><span class="line">&StatefulSet{},</span><br><span class="line">&StatefulSetList{},</span><br><span class="line">&DaemonSet{},</span><br><span class="line">&DaemonSetList{},</span><br><span class="line">&ReplicaSet{},</span><br><span 
class="line">&ControllerRevision{},</span><br><span class="line">&ControllerRevisionList{},</span><br><span class="line">)</span><br><span class="line">metav1.AddToGroupVersion(scheme, SchemeGroupVersion)</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>Now everything falls into place. <code>addKnownTypes</code> registers these types in the client-side <code>Scheme</code> so that the client can encode and decode them, and the group and version in <code>SchemeGroupVersion</code>, set on the <code>Config</code> by <code>setConfigDefaults</code>, determine the API path the client requests. With both in place, operations on a <code>Resource</code> behave as expected.</p><h3 id="CRD-关键实现"><a href="#CRD-关键实现" class="headerlink" title="CRD 关键实现"></a>Key <code>CRD</code> implementation</h3><p>A client for a custom <code>CRD</code> therefore needs its own version of the <code>setConfigDefaults</code> function analyzed above. The following is the implementation for <code>emqx-operator</code>:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">var</span> (</span><br><span class="line">emqxbrokerGVR = schema.GroupVersion{Group: <span class="string">"apps.emqx.io"</span>, Version: <span class="string">"v1beta1"</span>}</span><br><span class="line">)</span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">setConfigDefaults</span><span class="params">(config *rest.Config)</span> <span class="title">error</span></span> {</span><br><span class="line">gv := emqxbrokerGVR</span><br><span class="line">config.GroupVersion = &gv</span><br><span class="line">config.APIPath 
= <span class="string">"/apis"</span></span><br><span class="line">config.NegotiatedSerializer = scheme.Codecs.WithoutConversion()</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> config.UserAgent == <span class="string">""</span> {</span><br><span class="line">config.UserAgent = rest.DefaultKubernetesUserAgent()</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>That completes the construction of the <code>Client</code>. As a bonus, it also settles the <code>URL</code> used for <code>API</code> requests.</p><h3 id="CRD-资源操作具体实现"><a href="#CRD-资源操作具体实现" class="headerlink" title="CRD 资源操作具体实现"></a>Implementing <code>CRD</code> resource operations</h3><p>So far we have covered how to construct the <code>client</code>. How, then, are the concrete operations on the <code>CRD</code> implemented?</p><p>Let us turn our attention back to <code>AppsV1Interface</code>:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">type</span> AppsV1Interface <span class="keyword">interface</span> {</span><br><span class="line">RESTClient() rest.Interface</span><br><span class="line"> ...</span><br><span class="line">DeploymentsGetter</span><br><span class="line"> ...</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>Let us see what <code>DeploymentsGetter</code> actually is.</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span 
class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line">...</span><br><span class="line"><span class="keyword">type</span> DeploymentsGetter <span class="keyword">interface</span> {</span><br><span class="line">Deployments(namespace <span class="keyword">string</span>) DeploymentInterface</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// DeploymentInterface has methods to work with Deployment resources.</span></span><br><span class="line"><span class="keyword">type</span> DeploymentInterface <span class="keyword">interface</span> {</span><br><span class="line">Create(ctx context.Context, deployment *v1.Deployment, opts metav1.CreateOptions) (*v1.Deployment, error)</span><br><span class="line">Update(ctx context.Context, deployment *v1.Deployment, opts metav1.UpdateOptions) (*v1.Deployment, error)</span><br><span class="line">UpdateStatus(ctx context.Context, deployment *v1.Deployment, opts metav1.UpdateOptions) (*v1.Deployment, error)</span><br><span class="line">Delete(ctx context.Context, name <span class="keyword">string</span>, opts metav1.DeleteOptions) error</span><br><span class="line">DeleteCollection(ctx context.Context, opts metav1.DeleteOptions, listOpts metav1.ListOptions) error</span><br><span class="line">Get(ctx context.Context, name <span class="keyword">string</span>, opts metav1.GetOptions) (*v1.Deployment, error)</span><br><span class="line">List(ctx context.Context, opts metav1.ListOptions) (*v1.DeploymentList, 
error)</span><br><span class="line">Watch(ctx context.Context, opts metav1.ListOptions) (watch.Interface, error)</span><br><span class="line">Patch(ctx context.Context, name <span class="keyword">string</span>, pt types.PatchType, data []<span class="keyword">byte</span>, opts metav1.PatchOptions, subresources ...<span class="keyword">string</span>) (result *v1.Deployment, err error)</span><br><span class="line">Apply(ctx context.Context, deployment *appsv1.DeploymentApplyConfiguration, opts metav1.ApplyOptions) (result *v1.Deployment, err error)</span><br><span class="line">ApplyStatus(ctx context.Context, deployment *appsv1.DeploymentApplyConfiguration, opts metav1.ApplyOptions) (result *v1.Deployment, err error)</span><br><span class="line">GetScale(ctx context.Context, deploymentName <span class="keyword">string</span>, options metav1.GetOptions) (*autoscalingv1.Scale, error)</span><br><span class="line">UpdateScale(ctx context.Context, deploymentName <span class="keyword">string</span>, scale *autoscalingv1.Scale, opts metav1.UpdateOptions) (*autoscalingv1.Scale, error)</span><br><span class="line">ApplyScale(ctx context.Context, deploymentName <span class="keyword">string</span>, scale *applyconfigurationsautoscalingv1.ScaleApplyConfiguration, opts metav1.ApplyOptions) (*autoscalingv1.Scale, error)</span><br><span class="line"></span><br><span class="line">DeploymentExpansion</span><br><span class="line">}</span><br><span class="line">...</span><br></pre></td></tr></table></figure><p>At this point everything is clear: the operations on <code>Deployment</code> are abstracted behind an <code>Interface</code>. We will not analyze the concrete implementation here.</p><h2 id="验证"><a href="#验证" class="headerlink" title="验证"></a>Verification</h2><h3 id="环境准备"><a href="#环境准备" class="headerlink" title="环境准备"></a>Environment setup</h3><p>The test environment is a local <code>k8s</code> cluster started with <code>minikube</code>, with the <code>CRD</code> registered in the cluster:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span 
class="line">3</span><br></pre></td><td class="code"><pre><span class="line">$ kubectl get crd | grep emqx </span><br><span class="line">emqxbrokers.apps.emqx.io 2021-12-09T03:59:28Z</span><br><span class="line">emqxenterprises.apps.emqx.io 2021-12-09T03:59:28Z</span><br></pre></td></tr></table></figure><h3 id="验证CRD-Client"><a href="#验证CRD-Client" class="headerlink" title="验证CRD Client"></a>Verifying the <code>CRD Client</code></h3><p>Now let us check how the <code>Client</code> behaves in practice by running <code>Create</code>, <code>Get</code>, <code>List</code>, and <code>Delete</code> against an instance of the custom <code>CRD</code>.</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span 
class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Demo</span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">DemoForEmqxBroker</span><span class="params">(config *rest.Config, ns <span class="keyword">string</span>)</span></span> {</span><br><span class="line"><span class="comment">// Create emqxbroker restclient</span></span><br><span class="line">clientset, err := pkg.NewForConfig(config)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="built_in">panic</span>(err)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">emqxbrokerClient := clientset.EmqxBrokersV1Beta1().EmqxBrokers(ns)</span><br><span class="line"></span><br><span class="line"><span class="comment">// Create emqxbroker instance</span></span><br><span class="line">Prompt()</span><br><span class="line">fmt.Println(<span class="string">"[> create emqxbroker"</span>)</span><br><span class="line">emqxbroker, err := emqxbrokerClient.Create(context.TODO(), resource.GenerateEmqxbroker(ns), metav1.CreateOptions{})</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="built_in">panic</span>(err)</span><br><span class="line">}</span><br><span class="line">fmt.Printf(<span class="string">"create emqxbroker: %+v\n"</span>, emqxbroker)</span><br><span class="line"></span><br><span class="line"><span class="comment">// Get emqxbroker instance</span></span><br><span class="line">Prompt()</span><br><span class="line"> fmt.Println(<span class="string">"[> get emqxbroker"</span>)</span><br><span 
class="line">eb, err := emqxbrokerClient.Get(context.TODO(), <span class="string">"emqx"</span>, metav1.GetOptions{})</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="built_in">panic</span>(err)</span><br><span class="line">}</span><br><span class="line">fmt.Printf(<span class="string">"emqxbroker found: %+v\n"</span>, eb)</span><br><span class="line"></span><br><span class="line"><span class="comment">// Get emqxbroker list</span></span><br><span class="line">Prompt()</span><br><span class="line"> fmt.Println(<span class="string">"[> list emqxbroker"</span>)</span><br><span class="line">eblist, err := emqxbrokerClient.List(context.TODO(), metav1.ListOptions{})</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="built_in">panic</span>(err)</span><br><span class="line">}</span><br><span class="line">fmt.Printf(<span class="string">"emqxbroker list: %+v\n"</span>, eblist)</span><br><span class="line"></span><br><span class="line"><span class="comment">// Delete emqxbroker instance</span></span><br><span class="line">Prompt()</span><br><span class="line"> fmt.Println(<span class="string">"[> delete emqxbroker"</span>)</span><br><span class="line">err = emqxbrokerClient.Delete(context.TODO(), <span class="string">"emqx"</span>, metav1.DeleteOptions{})</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="built_in">panic</span>(err)</span><br><span class="line">}</span><br><span class="line">fmt.Printf(<span class="string">"Delete emqxbroker successfully"</span>)</span><br><span class="line">}</span><br></pre></td></tr></table></figure><ul><li><code>Create</code></li></ul><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br></pre></td><td class="code"><pre><span class="line">[> create emqxbroker</span><br><span class="line">create emqxbroker: EmqxBroker instance [emqx],Image [emqx/emqx:4.3.10]</span><br></pre></td></tr></table></figure><p>Check the state of the instance in the <code>K8s</code> cluster:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="variable">$kubectl</span> get emqx emqx </span><br><span class="line">NAME AGE</span><br><span class="line">emqx 2m26s</span><br><span class="line"><span class="variable">$kubectl</span> get pods </span><br><span class="line">NAME READY STATUS RESTARTS AGE</span><br><span class="line">emqx-0 1/1 Running 0 2m30s</span><br><span class="line">emqx-1 1/1 Running 0 2m30s</span><br><span class="line">emqx-2 1/1 Running 0 2m30s</span><br></pre></td></tr></table></figure><ul><li><code>Get</code></li></ul><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">[> get emqxbroker</span><br><span class="line">emqxbroker found: EmqxBroker instance [emqx],Image [emqx/emqx:4.3.10]</span><br></pre></td></tr></table></figure><ul><li><code>List</code></li></ul><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">[> list emqxbroker</span><br><span class="line">emqxbroker list: &{TypeMeta:{Kind: APIVersion:} ListMeta:{SelfLink: ResourceVersion:157139 Continue: RemainingItemCount:<nil>} Items:[{TypeMeta:{Kind:EmqxBroker APIVersion:apps.emqx.io/v1beta1} ObjectMeta:{Name:emqx GenerateName: 
Namespace:default SelfLink: UID:74896493-0134-460a-a04d-d3bdaed21902 ResourceVersion:157121 Generation:1 CreationTimestamp:2021-12-26 20:23:15 +0800 CST DeletionTimestamp:<nil> DeletionGracePeriodSeconds:<nil> Labels:map[] Annotations:map[] OwnerReferences:[] Finalizers:[] ClusterName: ManagedFields:[{Manager:Go-http-client Operation:Update APIVersion:apps.emqx.io/v1beta1 Time:2021-12-26 20:23:15 +0800 CST FieldsType:FieldsV1 FieldsV1:{<span class="string">"f:spec"</span>:{<span class="string">"."</span>:{},<span class="string">"f:image"</span>:{},<span class="string">"f:labels"</span>:{<span class="string">"."</span>:{},<span class="string">"f:cluster"</span>:{}},<span class="string">"f:listener"</span>:{<span class="string">"."</span>:{},<span class="string">"f:nodePorts"</span>:{},<span class="string">"f:ports"</span>:{}},<span class="string">"f:replicas"</span>:{},<span class="string">"f:resources"</span>:{},<span class="string">"f:serviceAccountName"</span>:{}}} Subresource:} {Manager:__debug_bin4215542756 Operation:Update APIVersion:apps.emqx.io/v1beta1 Time:2021-12-26 20:23:15 +0800 CST FieldsType:FieldsV1 FieldsV1:{<span class="string">"f:status"</span>:{<span class="string">"."</span>:{},<span class="string">"f:conditions"</span>:{}}} Subresource:}]} Spec:{Replicas:0xc000492ec8 Image:emqx/emqx:4.3.10 ServiceAccountName:emqx Resources:{Limits:map[] Requests:map[]} Storage:<nil> Labels:map[cluster:emqx] Listener:{Type: LoadBalancerIP: LoadBalancerSourceRanges:[] ExternalIPs:[] Ports:{MQTT:0 MQTTS:0 WS:0 WSS:0 Dashboard:0 API:0} NodePorts:{MQTT:0 MQTTS:0 WS:0 WSS:0 Dashboard:0 API:0}} Affinity:nil ToleRations:[] NodeSelector:map[] ImagePullPolicy: ExtraVolumes:[] ExtraVolumeMounts:[] Env:[] ACL:[] Plugins:[] Modules:[]} Status:{Conditions:[{Type:Healthy Status:True LastUpdateTime:2021-12-26T20:27:26+08:00 LastUpdateAt:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2021-12-26T20:23:16+08:00 Reason:Cluster available Message:Cluster ok} {Type:Creating 
Status:True LastUpdateTime:2021-12-26T20:23:15+08:00 LastUpdateAt:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2021-12-26T20:23:15+08:00 Reason:Creating Message:Bootstrap emqx cluster}]}}]}</span><br></pre></td></tr></table></figure><ul><li><code>Delete</code></li></ul><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">[> delete emqxbroker</span><br><span class="line">Delete emqxbroker successfully</span><br></pre></td></tr></table></figure>]]></content>
<summary type="html"><h2 id="需求背景分析"><a href="#需求背景分析" class="headerlink" title="需求背景分析"></a>Requirements and background</h2><p>When doing secondary development on top of <code>k8s</code>, some scenarios call for developing our own customized <code>C</summary>
<category term="kubernetes" scheme="http://kiragoo.github.com/categories/kubernetes/"/>
<category term="CRD" scheme="http://kiragoo.github.com/categories/kubernetes/CRD/"/>
<category term="kubernetes" scheme="http://kiragoo.github.com/tags/kubernetes/"/>
<category term="client-go" scheme="http://kiragoo.github.com/tags/client-go/"/>
<category term="设计" scheme="http://kiragoo.github.com/tags/%E8%AE%BE%E8%AE%A1/"/>
</entry>
<entry>
<title>Dissecting kube-proxy iptables rules by hand: a worked example</title>
<link href="http://kiragoo.github.com/archives/3562ba1.html"/>
<id>http://kiragoo.github.com/archives/3562ba1.html</id>
<published>2021-11-04T03:58:55.000Z</published>
<updated>2022-04-21T12:46:07.735Z</updated>
<content type="html"><![CDATA[<p>Following up on <a href="https://kiragoo.github.io/archives/26a027f0.html"><code>iptables</code>: getting started and avoiding the pitfalls</a>, the best way to deepen understanding is hands-on practice, so this post works through a concrete example.</p><h2 id="环境准备"><a href="#环境准备" class="headerlink" title="环境准备"></a>Environment setup</h2><p>I work on an <code>mbp</code> (MacBook Pro). Because <code>Docker Desktop For Mac</code> is too much of a black box, I use <code>minikube</code> to deploy the <code>k8s</code> lab environment.</p><p>Installing the basic binaries this needs (<code>minikube, kubectl</code>) is not covered here.<br>The following command brings up the <code>k8s</code> cluster:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line">minikube start --image-mirror-country=<span class="string">'cn'</span> --image-repository=<span class="string">'registry.cn-hangzhou.aliyuncs.com/google_containers'</span> --kubernetes-version=v1.21.0</span><br><span class="line">😄 Darwin 11.6 上的 minikube v1.21.0</span><br><span class="line">❗ Kubernetes 1.21.0 has a known performance issue on cluster startup. 
It might take 2 to 3 minutes <span class="keyword">for</span> a cluster to start.</span><br><span class="line">❗ For more information, see: https://github.com/kubernetes/kubeadm/issues/2395</span><br><span class="line">🎉 minikube 1.23.2 is available! Download it: https://github.com/kubernetes/minikube/releases/tag/v1.23.2</span><br><span class="line">💡 To <span class="built_in">disable</span> this notice, run: <span class="string">'minikube config set WantUpdateNotification false'</span></span><br><span class="line"></span><br><span class="line">✨ 根据现有的配置文件使用 docker 驱动程序</span><br><span class="line">👍 Starting control plane node minikube <span class="keyword">in</span> cluster minikube</span><br><span class="line">🚜 Pulling base image ...</span><br><span class="line">💾 Downloading Kubernetes v1.21.0 preload ...</span><br><span class="line"> > preloaded-images-k8s-v11-v1...: 498.90 MiB / 498.90 MiB 100.00% 17.39 Mi</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"> > index.docker.io/kicbase/sta...: 359.09 MiB / 359.09 MiB 100.00% 4.53 MiB</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">❗ minikube was unable to download gcr.io/k8s-minikube/kicbase:v0.0.23, but successfully downloaded kicbase/stable:v0.0.23 as a fallback image</span><br><span class="line">🔥 Creating docker container (CPUs=2, Memory=4000MB) ...</span><br><span class="line">❗ This container is having trouble accessing https://k8s.gcr.io</span><br><span class="line">💡 To pull new external images, you may need to configure a proxy: https://minikube.sigs.k8s.io/docs/reference/networking/proxy/</span><br><span class="line">🐳 正在 Docker 20.10.7 中准备 Kubernetes v1.21.0…</span><br><span class="line"> ▪ Generating certificates and keys ...</span><br><span class="line"> ▪ Booting up control plane ...</span><br><span class="line"> ▪ Configuring RBAC rules ...</span><br><span class="line">🔎 Verifying Kubernetes 
components...</span><br><span class="line"> ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5</span><br><span class="line">🌟 Enabled addons: storage-provisioner, default-storageclass</span><br><span class="line">🏄 Done! kubectl is now configured to use <span class="string">"minikube"</span> cluster and <span class="string">"default"</span> namespace by default</span><br></pre></td></tr></table></figure><p>Next, build the experiment <code>manifests</code>. The <code>yaml</code> file is as follows:<br><strong>The <code>service type</code> in this example is <code>ClusterIP</code>.</strong></p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span 
class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">apiVersion:</span> <span class="string">apps/v1</span></span><br><span class="line"><span class="attr">kind:</span> <span class="string">Deployment</span></span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line"> <span class="attr">name:</span> <span class="string">nginx-deployment</span></span><br><span class="line"> <span class="attr">labels:</span></span><br><span class="line"> <span class="attr">app:</span> <span class="string">nginx</span></span><br><span class="line"><span class="attr">spec:</span></span><br><span class="line"> <span class="attr">replicas:</span> <span class="number">3</span></span><br><span class="line"> <span class="attr">selector:</span></span><br><span class="line"> <span class="attr">matchLabels:</span></span><br><span class="line"> <span class="attr">app:</span> <span class="string">nginx</span></span><br><span class="line"> <span class="attr">template:</span></span><br><span class="line"> <span class="attr">metadata:</span></span><br><span class="line"> <span class="attr">labels:</span></span><br><span class="line"> <span class="attr">app:</span> <span class="string">nginx</span></span><br><span class="line"> <span class="attr">spec:</span></span><br><span class="line"> <span class="attr">containers:</span></span><br><span class="line"> <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">nginx</span></span><br><span class="line"> <span class="attr">image:</span> <span class="string">nginx</span></span><br><span class="line"> <span class="attr">securityContext:</span></span><br><span class="line"> <span class="attr">privileged:</span> <span class="literal">true</span></span><br><span 
class="line"> <span class="attr">ports:</span></span><br><span class="line"> <span class="bullet">-</span> <span class="attr">containerPort:</span> <span class="number">80</span></span><br><span class="line"><span class="meta">---</span></span><br><span class="line"><span class="attr">apiVersion:</span> <span class="string">v1</span></span><br><span class="line"><span class="attr">kind:</span> <span class="string">Service</span></span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line"> <span class="attr">name:</span> <span class="string">nginx-service</span></span><br><span class="line"><span class="attr">spec:</span></span><br><span class="line"> <span class="attr">selector:</span></span><br><span class="line"> <span class="attr">app:</span> <span class="string">nginx</span></span><br><span class="line"> <span class="attr">ports:</span></span><br><span class="line"> <span class="bullet">-</span> <span class="attr">protocol:</span> <span class="string">TCP</span></span><br><span class="line"> <span class="attr">port:</span> <span class="number">80</span></span><br><span class="line"> <span class="attr">targetPort:</span> <span class="number">80</span></span><br><span class="line"><span class="meta">---</span></span><br><span class="line"><span class="attr">apiVersion:</span> <span class="string">v1</span></span><br><span class="line"><span class="attr">kind:</span> <span class="string">Pod</span></span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line"> <span class="attr">name:</span> <span class="string">web-server</span></span><br><span class="line"> <span class="attr">labels:</span></span><br><span class="line"> <span class="attr">app:</span> <span class="string">web-server</span></span><br><span class="line"><span class="attr">spec:</span></span><br><span class="line"> <span class="attr">containers:</span></span><br><span class="line"> <span class="bullet">-</span> <span 
class="attr">name:</span> <span class="string">nginx</span></span><br><span class="line"> <span class="attr">image:</span> <span class="string">nginx</span></span><br><span class="line"> <span class="attr">securityContext:</span></span><br><span class="line"> <span class="attr">privileged:</span> <span class="literal">true</span></span><br><span class="line"> <span class="attr">ports:</span></span><br><span class="line"> <span class="bullet">-</span> <span class="attr">containerPort:</span> <span class="number">80</span></span><br></pre></td></tr></table></figure><p>Once all resources are ready, inspect them, paying particular attention to the <code>IP</code> addresses of each <code>Pod</code> and <code>Service</code>.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line">kubectl get all -o wide</span><br><span class="line">NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES</span><br><span class="line">pod/nginx-deployment-74bc56fb4b-9xqkt 1/1 Running 0 4m23s 172.17.0.5 minikube <none> <none></span><br><span class="line">pod/nginx-deployment-74bc56fb4b-c9zp4 1/1 Running 0 4m23s 172.17.0.4 minikube <none> <none></span><br><span class="line">pod/nginx-deployment-74bc56fb4b-h7rhz 1/1 Running 0 4m23s 172.17.0.6 minikube <none> <none></span><br><span class="line">pod/web-server 1/1 Running 0 4m23s 172.17.0.3 minikube <none> <none></span><br><span class="line"></span><br><span class="line">NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR</span><br><span 
class="line">service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 3h22m <none></span><br><span class="line">service/nginx-service ClusterIP 10.97.79.231 <none> 80/TCP 4m23s app=nginx</span><br><span class="line"></span><br><span class="line">NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR</span><br><span class="line">deployment.apps/nginx-deployment 3/3 3 3 4m23s nginx nginx app=nginx</span><br><span class="line"></span><br><span class="line">NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR</span><br><span class="line">replicaset.apps/nginx-deployment-74bc56fb4b 3 3 3 4m23s nginx nginx app=nginx,pod-template-hash=74bc56fb4b</span><br></pre></td></tr></table></figure><p>The lab environment is now ready.</p><h2 id="流量转发分析"><a href="#流量转发分析" class="headerlink" title="流量转发分析"></a>Traffic forwarding analysis</h2><p>First, log in to the <code>minikube</code> node:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">minikube ssh</span><br><span class="line">docker@minikube:~$ iptables</span><br><span class="line">iptables v1.8.4 (legacy): no <span class="built_in">command</span> specified</span><br><span class="line">Try `iptables -h<span class="string">' or '</span>iptables --<span class="built_in">help</span><span class="string">' for more information.</span></span><br><span class="line"><span class="string">docker@minikube:~$sudo su # switch to the root user</span></span><br></pre></td></tr></table></figure><h3 id="kube-proxy-iptable-模式分析之ClusterIP"><a href="#kube-proxy-iptable-模式分析之ClusterIP" class="headerlink" title="kube-proxy iptable 模式分析之ClusterIP"></a><code>kube-proxy iptables</code> mode analysis: <code>ClusterIP</code></h3><h4 id="流量的源与目的大概如下:"><a href="#流量的源与目的大概如下:" class="headerlink" title="流量的源与目的大概如下:"></a>The source and destination of the traffic are roughly as follows:</h4><pre class="mermaid">graph LR 
web[web-server:172.17.0.3]-->nginx[nginx-service:10.97.79.231]</pre><blockquote><p>Why is this treated as traffic leaving the local host?<br>Because <code>kube-proxy</code> is deployed on every node as a <code>daemonSet</code>, every node carries the same <code>iptables</code> rules. When a <code>pod</code> on any node accesses a <code>service</code>, the matching <code>service</code> rule can be found in the <code>iptables</code> of the node hosting that <code>pod</code>, and from it the <code>pod</code>s that the <code>service</code> proxies. From the <code>node</code>'s point of view, traffic sent by a <code>pod</code> it hosts is simply traffic leaving from a local process.</p></blockquote><p>From the <code>iptables</code> primer we know that traffic leaving the local host traverses the following chains:</p><pre class="mermaid">graph LR output[OUTPUT chain]-->postrouting[POSTROUTING chain]</pre><h4 id="OUTPUT-链分析"><a href="#OUTPUT-链分析" class="headerlink" title="OUTPUT 链分析"></a><code>OUTPUT</code> chain analysis</h4><p>The <code>OUTPUT</code> chain involves four tables: <code>raw</code>, <code>mangle</code>, <code>nat</code>, and <code>filter</code>. The analysis below ignores any <code>rule</code> unrelated to this example. <strong>Focus on the <code>nat</code> and <code>filter</code> tables; the <code>raw</code> and <code>mangle</code> tables are empty.</strong></p><h5 id="NAT-表"><a href="#NAT-表" class="headerlink" title="NAT 表"></a><code>NAT</code> table</h5><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">iptables -t nat -nvL</span><br><span class="line">...</span><br><span class="line">Chain OUTPUT (policy ACCEPT 819 packets, 49140 bytes)</span><br><span class="line"> pkts bytes target prot opt <span class="keyword">in</span> out <span class="built_in">source</span> destination</span><br><span class="line">15746 946K KUBE-SERVICES all -- * * 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */</span><br><span class="line"> 72 4789 DOCKER_OUTPUT all -- * * 0.0.0.0/0 192.168.65.2</span><br><span class="line">10852 651K DOCKER all -- * * 0.0.0.0/0 !127.0.0.0/8 ADDRTYPE match dst-type 
LOCAL</span><br><span class="line">...</span><br></pre></td></tr></table></figure><p>As shown, all OUTPUT traffic is directed to a custom chain named <code>KUBE-SERVICES</code>. Let's see what it does.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">Chain KUBE-SERVICES (2 references)</span><br><span class="line"> pkts bytes target prot opt <span class="keyword">in</span> out <span class="built_in">source</span> destination</span><br><span class="line"> ...</span><br><span class="line"> 0 0 KUBE-MARK-MASQ tcp -- * * !10.244.0.0/16 10.97.79.231 /* default/nginx-service cluster IP */ tcp dpt:80</span><br><span class="line"> 0 0 KUBE-SVC-V2OKYYMBY3REGZOG tcp -- * * 0.0.0.0/0 10.97.79.231 /* default/nginx-service cluster IP */ tcp dpt:80</span><br><span class="line"> ... 
</span><br><span class="line"> 0 0 KUBE-SVC-TCOU7JCQXEZGVUNU udp -- * * 0.0.0.0/0 10.96.0.10 /* kube-system/kube-dns:dns cluster IP */ udp dpt:53</span><br><span class="line"> 719 43140 KUBE-NODEPORTS all -- * * 0.0.0.0/0 0.0.0.0/0 /* kubernetes service nodeports; NOTE: this must be the last rule <span class="keyword">in</span> this chain */ ADDRTYPE match dst-type LOCAL</span><br></pre></td></tr></table></figure><p>This chain defines the <code>service</code> rules for every <code>namespace</code>, including the rule for the <code>nginx-service</code> we created (the other <code>service</code> entries relate to <code>kube-dns</code> and the <code>apiServer</code>; analyze them yourself if you are interested). The rule matches packets whose destination address is <code>10.97.79.231</code> and whose destination port is <code>80</code>, which is exactly what packets sent to <code>nginx-service</code> look like. Its <code>target</code> is a custom chain named <code>KUBE-SVC-V2OKYYMBY3REGZOG</code>, so let's dig into that chain next:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">Chain KUBE-SVC-V2OKYYMBY3REGZOG (1 references)</span><br><span class="line"> pkts bytes target prot opt <span class="keyword">in</span> out <span class="built_in">source</span> destination</span><br><span class="line"> 0 0 KUBE-SEP-3VDHYO53IOQ2XWUD all -- * * 0.0.0.0/0 0.0.0.0/0 /* default/nginx-service */ statistic mode random probability 0.33333333349</span><br><span class="line"> 0 0 KUBE-SEP-C54WIGIB4NQVIFB3 all -- * * 0.0.0.0/0 0.0.0.0/0 /* default/nginx-service */ statistic mode random probability 0.50000000000</span><br><span class="line"> 0 0 KUBE-SEP-KN3IA7DQGTHQJWSD all -- * * 0.0.0.0/0 0.0.0.0/0 /* default/nginx-service */</span><br></pre></td></tr></table></figure><p>The <code>KUBE-SVC-V2OKYYMBY3REGZOG</code> chain defines three rules. The first matches with probability <code>0.33333333349</code>, that is, <code>1/3</code>. If it misses, the second matches with probability <code>1/2</code>, for an overall probability of <code>2/3 * 1/2 = 1/3</code>. If the second also misses, the packet falls through to the third rule, which therefore receives the remaining <code>1/3</code>. This is clearly load balancing, so we can guess that the <code>target</code>s of these three rules hold the rules for the three <code>pod</code>s that this <code>service</code> proxies.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line">Chain KUBE-SEP-3VDHYO53IOQ2XWUD (1 references)</span><br><span class="line"> pkts bytes target prot opt <span class="keyword">in</span> out <span class="built_in">source</span> destination</span><br><span class="line"> 0 0 KUBE-MARK-MASQ all -- * * 172.17.0.4 0.0.0.0/0 /* default/nginx-service */</span><br><span class="line"> 0 0 DNAT tcp -- * * 0.0.0.0/0 0.0.0.0/0 /* default/nginx-service */ tcp to:172.17.0.4:80</span><br><span class="line"></span><br><span class="line">Chain KUBE-SEP-C54WIGIB4NQVIFB3 (1 references)</span><br><span class="line"> pkts bytes target prot opt <span class="keyword">in</span> out <span class="built_in">source</span> destination</span><br><span class="line"> 0 0 KUBE-MARK-MASQ all -- * * 172.17.0.5 0.0.0.0/0 /* default/nginx-service */</span><br><span class="line"> 0 0 DNAT tcp -- * * 0.0.0.0/0 0.0.0.0/0 /* default/nginx-service */ tcp to:172.17.0.5:80</span><br><span class="line"></span><br><span class="line">Chain KUBE-SEP-KN3IA7DQGTHQJWSD (1 references)</span><br><span class="line"> pkts bytes target prot opt <span class="keyword">in</span> out <span class="built_in">source</span> destination</span><br><span class="line"> 0 0 KUBE-MARK-MASQ all -- * * 172.17.0.6 0.0.0.0/0 /* default/nginx-service */</span><br><span class="line"> 0 0 DNAT tcp -- * * 0.0.0.0/0 
0.0.0.0/0 /* default/nginx-service */ tcp to:172.17.0.6:80</span><br></pre></td></tr></table></figure><p>The rules in these three custom chains are very similar. The first rule matches the case where a <code>Pod</code> accesses itself through the service (the source is the endpoint's own address) and jumps to the <code>KUBE-MARK-MASQ</code> <code>target</code>; all other cases go to the second rule, <code>DNAT</code>. In our scenario a different <code>pod</code> is accessing <code>nginx-service</code>, so the first rule is not hit and the packet hits the second rule, <code>DNAT</code>.</p><p>Suppose load balancing in the <code>KUBE-SVC-V2OKYYMBY3REGZOG</code> chain sends our packet to the first <code>target</code>, <code>KUBE-SEP-3VDHYO53IOQ2XWUD</code>. After <code>DNAT</code>, the packet's <code>destination</code> changes from <code>10.97.79.231:80</code> to <code>172.17.0.4:80</code>, that is:</p><pre class="mermaid">graph LR original[original:172.17.0.3:xxx]-->dst1[10.97.79.231:80] afterdnat[after dnat:172.17.0.3:xxx]-->dst2[172.17.0.4:80]</pre><p>That completes the analysis of the <code>OUTPUT</code> chain in the <code>nat</code> table. Next comes the <code>filter</code> table.</p><h5 id="filter-表"><a href="#filter-表" class="headerlink" title="filter 表"></a><code>filter</code> table</h5><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">iptables -t filter -nvL</span><br><span class="line">...</span><br><span class="line">Chain OUTPUT (policy ACCEPT 193K packets, 30M bytes)</span><br><span class="line"> pkts bytes target prot opt <span class="keyword">in</span> out <span class="built_in">source</span> destination</span><br><span class="line">17326 1040K KUBE-SERVICES all -- * * 0.0.0.0/0 0.0.0.0/0 ctstate NEW /* kubernetes service portals */</span><br><span class="line">1379K 225M KUBE-FIREWALL all -- * * 0.0.0.0/0 0.0.0.0/0</span><br><span class="line">...</span><br><span class="line">Chain KUBE-SERVICES (2 references)</span><br><span class="line"> pkts bytes target prot opt <span class="keyword">in</span> 
out <span class="built_in">source</span> destination</span><br></pre></td></tr></table></figure><p>Every new connection (ctstate NEW) matches the first rule, <code>KUBE-SERVICES</code>. However, <code>KUBE-SERVICES</code> turns out to be an empty chain here, so we focus on the second rule, <code>KUBE-FIREWALL</code>.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">Chain KUBE-FIREWALL (2 references)</span><br><span class="line"> pkts bytes target prot opt <span class="keyword">in</span> out <span class="built_in">source</span> destination</span><br><span class="line"> 0 0 DROP all -- * * 0.0.0.0/0 0.0.0.0/0 /* kubernetes firewall <span class="keyword">for</span> dropping marked packets */ mark match 0x8000/0x8000</span><br><span class="line"> 0 0 DROP all -- * * !127.0.0.0/8 127.0.0.0/8 /* block incoming localnet connections */ ! ctstate RELATED,ESTABLISHED,DNAT</span><br></pre></td></tr></table></figure><p>Every packet marked <code>0x8000/0x8000</code> is <code>DROP</code>ped outright. Our packet has not been marked along the way, so it is not <code>DROP</code>ped. That finishes the <code>OUTPUT</code> rules of the <code>filter</code> table, and we finally reach the next stage: the <code>POSTROUTING</code> chain.</p><h5 id="POSTROUTING-链"><a href="#POSTROUTING-链" class="headerlink" title="POSTROUTING 链"></a><code>POSTROUTING</code> chain</h5><p>The <code>POSTROUTING</code> chain mainly involves two tables: <code>mangle</code> and <code>nat</code>. Because the <code>mangle</code> table is empty, only the <code>POSTROUTING</code> rules of the <code>nat</code> table matter.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">iptables -t nat -nvL</span><br><span class="line">...</span><br><span class="line">Chain POSTROUTING (policy ACCEPT 
488 packets, 29280 bytes)</span><br><span class="line"> pkts bytes target prot opt <span class="keyword">in</span> out <span class="built_in">source</span> destination</span><br><span class="line"> 8165 491K KUBE-POSTROUTING all -- * * 0.0.0.0/0 0.0.0.0/0 /* kubernetes postrouting rules */</span><br><span class="line"> 13 780 MASQUERADE all -- * !docker0 172.17.0.0/16 0.0.0.0/0</span><br><span class="line"> 0 0 DOCKER_POSTROUTING all -- * * 0.0.0.0/0 192.168.65.2</span><br><span class="line">...</span><br></pre></td></tr></table></figure><p>The packet first enters the first <code>target</code>, <code>KUBE-POSTROUTING</code>:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">Chain KUBE-POSTROUTING (1 references)</span><br><span class="line"> pkts bytes target prot opt <span class="keyword">in</span> out <span class="built_in">source</span> destination</span><br><span class="line"> 486 29160 RETURN all -- * * 0.0.0.0/0 0.0.0.0/0 mark match ! 
0x4000/0x4000</span><br><span class="line"> 0 0 MARK all -- * * 0.0.0.0/0 0.0.0.0/0 MARK xor 0x4000</span><br><span class="line"> 0 0 MASQUERADE all -- * * 0.0.0.0/0 0.0.0.0/0 /* kubernetes service traffic requiring SNAT */ random-fully</span><br></pre></td></tr></table></figure><p>Because the packet does not carry the <code>0x4000</code> mark, it matches the first <code>RETURN</code> rule and never reaches <code>MASQUERADE</code>. In the end the packet leaves through the <code>docker0</code> interface without any <code>SNAT</code>: the <code>source ip</code> is still <code>172.17.0.3</code>, but the <code>DST IP</code> is now <code>172.17.0.4</code> rather than the <code>service ip: 10.97.79.231</code>.</p><h4 id="总结"><a href="#总结" class="headerlink" title="总结"></a>Summary</h4><p>A packet sent to a <code>ClusterIP</code> service traverses the following chains:</p><pre class="mermaid">graph TB s((packet))-->1[nat:OUTPUT] 1-->2[nat:KUBE-SERVICES] 2-->3[nat:KUBE-SVC-V2OKYYMBY3REGZOG] 3-->4[nat:KUBE-SEP-*:one of three] 4-->5[filter:OUTPUT] 5-->6[filter:KUBE-FIREWALL] 6-->7[nat:POSTROUTING] 7-->8[nat:KUBE-POSTROUTING] 8-->9((not SNATed))</pre>]]></content>
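<summary type="html"><p>Following <a href="https://kiragoo.github.io/archives/26a027f0.html"><code>iptables</code>: Getting Started and Avoiding Pitfalls</a>, the best way to deepen understanding is hands-on practice, so this post works through a concrete example.</p>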
</summary>
<category term="kubernetes" scheme="http://kiragoo.github.com/categories/kubernetes/"/>
<category term="网络" scheme="http://kiragoo.github.com/categories/kubernetes/%E7%BD%91%E7%BB%9C/"/>
<category term="iptables" scheme="http://kiragoo.github.com/tags/iptables/"/>
<category term="kube-proxy" scheme="http://kiragoo.github.com/tags/kube-proxy/"/>
</entry>
<entry>
<title>iptables: Getting Started and Avoiding Pitfalls</title>
<link href="http://kiragoo.github.com/archives/26a027f0.html"/>
<id>http://kiragoo.github.com/archives/26a027f0.html</id>
<published>2021-11-02T02:42:35.000Z</published>
<updated>2023-04-03T08:26:55.907Z</updated>
<content type="html"><![CDATA[<blockquote><p>This post is adapted from <a href="https://mp.weixin.qq.com/s/Dgv5BU9YU0tuSMxtzcuiVw"><code>iptables</code>长文详解,值得收藏细读</a><br>I put it together because many of the internal <code>service</code> routing rules in <code>k8s</code> are still implemented with <code>iptables</code>. Without understanding them we stay at the level of merely using the tool; with that understanding we gain one more way to troubleshoot a service that hangs.</p></blockquote><p>The network control module of <code>Linux</code> lives in the kernel and is called <code>netfilter</code>. <code>iptables</code> is a command-line tool in user space that works mainly at layers 3 and 4 of the seven-layer OSI model [physical, data link, <strong>network</strong>, <strong>transport</strong>, session, presentation, application]. It talks to the kernel's <code>netfilter</code> and configures it to control the network and forward traffic.</p><p>Main functions:</p><ul><li>Traffic forwarding: <code>DNAT</code> maps <code>IP</code> addresses and ports</li><li>Load balancing: the <code>statistic</code> module assigns a weight to each backend</li><li>Session affinity: the <code>recent</code> module sets how long a session sticks</li></ul><h2 id="基本两要素"><a href="#基本两要素" class="headerlink" title="基本两要素"></a>The Two Basic Elements</h2><p>Tables and chains: five tables and five chains.</p><p>The five tables are: <code>raw</code>, <code>filter</code>, <code>nat</code>, <code>mangle</code>, <code>security</code><br>The five chains are: <code>prerouting</code>, <code>input</code>, <code>forward</code>, <code>output</code>, <code>postrouting</code></p><p>Use <code>iptables -t ${table} -nL</code> to inspect a table:</p><p><img src="https://kblogs.oss-cn-beijing.aliyuncs.com/blogimgs/iptables-nl.png" alt="`filter` 表查看"></p><pre class="mermaid">graph LR
R[iptables]-->FT(filter table)
FT-->FTFIC(input chain)
FTFIC-->FTFIC_D["Filters traffic on the way in"]
FT-->FTFC(forward chain)
FTFC-->FTFC_D["Forwards traffic"]
FT-->FTOC(output chain)
FTOC-->FTOC_D["Filters traffic on the way out"]
FT-->FT_D["Purpose: decides whether data reaching a chain is accepted, dropped, or rejected"]
R-->NT(nat table)
NT-->NTPREC(prerouting chain)
NTPREC-->NTPREC_D["DNAT: rewrites the destination address"]
NT-->NTIC(input chain)
NT-->NTOC(output chain)
NT-->NTPOSTC(postrouting chain)
NTPOSTC-->NTPOSTC_D["SNAT: rewrites the source address"]
NT-->NT_D["Purpose: rewrites a packet's source or destination address"]
R-->MT(mangle table)
MT-->MTPREC(prerouting chain)
MT-->MTIC(input chain)
MT-->MTFC(forward chain)
MT-->MTOC(output chain)
MT-->MTPOSTC(postrouting chain)
MT-->MT_D["Purpose: modifies IP packet header fields"]
R-->RT(raw table)
RT-->RTPREC(prerouting chain)
RT-->RTOC(output chain)
RT-->RT_D["Runs before connection tracking and can exempt packets from it; tracked states include new, established"]
R-->ST(security table)
ST-->ST_D["A newer table, used to apply SELinux rules to packets"]</pre><h2 id="流量走向分析"><a href="#流量走向分析" class="headerlink" title="流量走向分析"></a>Traffic Flow Analysis</h2><p>Two scenarios are worth considering:</p><ul><li>Which <code>iptables</code> nodes does traffic originating on the local host pass through, and where does it end up?</li><li>How does external traffic from the Internet pass through <code>iptables</code>, and where does it end up?</li></ul><h3 id="iptables-处理流程"><a href="#iptables-处理流程" class="headerlink" title="iptables 处理流程"></a><code>iptables</code> Processing Flow</h3><p><img src="https://kblogs.oss-cn-beijing.aliyuncs.com/blogimgs/Netfilter-packet-flow.svg.png" alt="`netfilter-packet-filter`"></p><h3 id="iptables-在-kube-proxy-的处理流程"><a href="#iptables-在-kube-proxy-的处理流程" class="headerlink" title="iptables 在 kube-proxy 的处理流程"></a><code>iptables</code> Processing Flow in <code>kube-proxy</code></h3><p><img src="https://kblogs.oss-cn-beijing.aliyuncs.com/blogimgs/kube-proxy-its.png" alt="`kube-proxy-iptables`"></p><blockquote><p>For how <code>kube-proxy</code> handles its rules, see the post <a href="https://zjj2wry.github.io/network/iptables/">理解<code>kube-proxy</code>中<code>iptable</code>规则</a></p></blockquote><p>Notes on the diagrams above:<br>1. Red, blue, green, and purple represent the four <code>iptables</code> tables; with <code>SELinux</code> enabled there is an additional <code>security</code> table.</p><p>2. Top-left of the diagram: <code>incoming packet</code> is traffic arriving from external network devices. It passes through the <code>prerouting</code> stage of each table, and then the <code>routing decision</code> determines whether the local host handles it or it is sent on via <code>forward</code>.</p><p>3. Top-left of the diagram: before the <code>routing decision</code>, an <code>incoming packet</code> passes through the <code>nat prerouting</code> stage, where <code>dnat</code> can be applied. For example, a packet whose original <code>dst ip</code> belongs to Baidu would normally enter the <code>forward</code> stage after the <code>routing decision</code>; rewriting its destination to the local host at this stage sends it down the <code>input</code> path instead, so the local host can intercept it.</p><p>4. Top-right of the diagram: <code>locally generated packet</code> is traffic generated by the local host itself. It passes through the <code>output</code> chain of each table and then reaches the <code>output interface</code> (the NIC). Note that before traffic is packaged as an <code>outgoing packet</code> there is a <code>localhost dest</code> check: if the traffic is not destined for the local host, it goes through the <code>postrouting</code> stage of the <code>nat</code> table, which is where SNAT (source address rewriting) is usually done.</p><p>With this flow in mind, traffic can be controlled flexibly. Typical controls are:</p><p>1. Drop traffic from XXX (<code>filter</code> table, <code>INPUT</code> chain)</p><p>2. Drop traffic to XXX (<code>filter</code> table, <code>OUTPUT</code> chain)</p><p>3. Accept only traffic from XXX (<code>filter</code> table, <code>INPUT</code> chain)</p><p>4. Rewrite the destination address as traffic arrives (<code>nat</code> table, <code>prerouting</code> chain)</p><p>5. Rewrite the source address just before traffic leaves (<code>nat</code> table, <code>postrouting</code> chain)</p><p>6. Forward packets destined for <code>A</code> to <code>B</code> (<code>filter</code> table, <code>forward</code> chain)</p><p><img src="https://kblogs.oss-cn-beijing.aliyuncs.com/blogimgs/iptables-flow.png" alt="iptables-flow-zsy(出自https://www.zsythink.net/archives/1199)"></p><p><img src="https://kblogs.oss-cn-beijing.aliyuncs.com/blogimgs/iptable-flow.drawio.png" alt="iptable-flow"></p><h2 id="iptables-commands"><a href="#iptables-commands" class="headerlink" title="iptables commands"></a><code>iptables commands</code></h2><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">iptables -t <span class="variable">${table}</span> <span class="variable">${command}</span> <span class="variable">${chain}</span> <span class="variable">${rule}</span> <span class="variable">${match}</span> <span class="variable">${target}</span></span><br></pre></td></tr></table></figure><table> <tr> <th>Command-line field</th> <th>Values</th> <th>Description</th> </tr> <tr> <td rowspan="5">Table</td> <td> raw </td> <td>iptables is stateful and tracks connections for packets (the records are visible in /proc/net/nf_conntrack); the raw table is processed before tracking and can exempt packets from it</td> </tr> <tr> <td> filter </td> <td> Controls whether packets reaching a chain are accepted, dropped, or rejected</td> </tr> <tr> <td> mangle</td> <td> Modifies a packet's IP header fields</td> </tr> <tr> <td> nat </td> <td>Network address translation: rewrites a packet's source and destination addresses </td> </tr> <tr> <td> security </td> <td> Rarely used; applies to SELinux</td> </tr> <tr> <td rowspan="5"> Chain </td> <td> PREROUTING </td> <td> As a packet arrives, before routing; DNAT can be done here</td> </tr> <tr> <td> POSTROUTING </td> <td> Just before a packet is sent to the NIC; SNAT can be done here</td> </tr> <tr> <td> INPUT </td> <td> Packets destined for the local host, usually handled by local processes</td> </tr> <tr> <td> OUTPUT </td> <td>Packets sent out with the local host as source, usually generated by local processes </td> </tr> <tr> <td> FORWARD </td> <td> Packets routed through this host to another destination</td> </tr> <tr> <td rowspan="9">Command</td> <td>-A</td> <td>Append a rule</td> </tr> <tr> 
<td>-C</td> <td>Check whether a rule exists</td> </tr> <tr> <td>-D</td> <td>Delete a rule</td> </tr> <tr> <td>-I</td> <td>Insert a rule (at the head by default)</td> </tr> <tr> <td>-R</td> <td>Replace a rule</td> </tr> <tr> <td>-L</td> <td>List all rules</td> </tr> <tr> <td>-F</td> <td>Flush (delete all rules)</td> </tr> <tr> <td>-N</td> <td>Create a new chain</td> </tr> <tr> <td>-P</td> <td>Set a chain's default policy (ACCEPT by default)</td> </tr> <tr> <td rowspan="4">Match</td> <td>-p</td> <td>Protocol (-4 / -6 select IPv4 / IPv6)</td> </tr> <tr> <td>-s</td> <td>Source address</td> </tr> <tr> <td>-d</td> <td>Destination address</td> </tr> <tr> <td>-i</td> <td>Inbound network interface name</td> </tr> <tr> <td rowspan="7">Target</td> <td>-j REJECT</td> <td>Reject access</td> </tr> <tr> <td>-j ACCEPT</td> <td>Allow through</td> </tr> <tr> <td>-j DROP</td> <td>Drop silently</td> </tr> <tr> <td>-j LOG</td> <td>Log the packet</td> </tr> <tr> <td>-j SNAT</td> <td>Source address translation</td> </tr> <tr> <td>-j DNAT</td> <td>Destination address translation</td> </tr> <tr> <td>RETURN,QUEUE</td> <td></td> </tr></table><h2 id="iptables-的匹配规则"><a href="#iptables-的匹配规则" class="headerlink" title="iptables 的匹配规则"></a><code>iptables</code> Match Rules</h2><p>Common match conditions:<br>Source address: <code>-s 192.168.1.0/24</code><br>Destination address: <code>-d 192.168.1.11</code><br>Protocol: <code>-p tcp|udp|icmp</code><br>Inbound interface: <code>-i eth0|lo</code><br>Outbound interface: <code>-o eth0|lo</code><br>Destination port (a protocol must be specified): <code>-p tcp|udp --dport 8080</code><br>Source port (a protocol must be specified): <code>-p tcp|udp --sport 8080</code></p><p>Rules in an <code>iptables</code> chain are evaluated in order from top to bottom, until a packet matches a rule whose target ends traversal of that chain, such as <code>ACCEPT</code>, <code>DROP</code>, <code>REJECT</code>, or <code>RETURN</code>.</p><p>In addition, if the action is a <code>JUMP</code>, processing <code>jump</code>s to the rules of the specified chain:</p><pre class="mermaid">graph TB
chain1((chain1)) --> rule1-1((rule1-1))
chain2((chain2)) --> rule2-1((rule2-1))
rule1-1 --> rule1-2((rule1-2))
rule1-2 --> rule1-3((rule1-3))
rule2-1 --> rule2-2((rule2-2))
rule2-2 --> rule2-3((rule2-3))
rule1-2 --JUMP--> rule2-1
rule2-3 --JUMP--> rule1-3</pre><h2 id="iptables-中的模块"><a href="#iptables-中的模块" class="headerlink" title="iptables 中的模块"></a><code>iptables</code> Modules</h2><ul><li>Multiple ports</li></ul><p>A port range can be given as follows:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 20:30 means every port from 20 to 30; --dport requires a protocol (-p)</span></span><br><span class="line">iptables -t <span class="variable">${table}</span> <span class="variable">${commands}</span> <span class="variable">${chain}</span> <span class="variable">${rule_number}</span> -p <span class="variable">${protocol}</span> --dport 20:30 -j <span class="variable">${target}</span></span><br></pre></td></tr></table></figure><p>To match several non-contiguous ports, use the <code>multiport</code> module of <code>iptables</code>.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Show the module's help text</span></span><br><span class="line">~]<span class="comment"># iptables -m multiport --help</span></span><br><span class="line">...</span><br><span class="line">multiport match options:</span><br><span class="line">[!] --source-ports port[,port:port,port...]</span><br><span class="line"> --sports ...</span><br><span class="line"> match <span class="built_in">source</span> port(s)</span><br><span class="line">[!] --destination-ports port[,port:port,port...]</span><br><span class="line"> --dports ...</span><br><span class="line"> match destination port(s)</span><br><span class="line">[!] 
--ports port[,port:port,port]</span><br><span class="line"> match both <span class="built_in">source</span> and destination port(s)</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="comment"># Example command</span></span><br><span class="line">iptables -t <span class="variable">${table}</span> <span class="variable">${commands}</span> <span class="variable">${chain}</span> <span class="variable">${rule_number}</span> \</span><br><span class="line"> -p <span class="variable">${protocol}</span> -m multiport --dports 20,30 -j <span class="variable">${target}</span></span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="comment"># Equivalent to the following two commands</span></span><br><span class="line">iptables -t <span class="variable">${table}</span> <span class="variable">${commands}</span> <span class="variable">${chain}</span> <span class="variable">${rule_number}</span> -p <span class="variable">${protocol}</span> --dport 20 -j <span class="variable">${target}</span></span><br><span class="line">iptables -t <span class="variable">${table}</span> <span class="variable">${commands}</span> <span class="variable">${chain}</span> <span class="variable">${rule_number}</span> -p <span class="variable">${protocol}</span> --dport 30 -j <span class="variable">${target}</span></span><br></pre></td></tr></table></figure><ul><li><code>ip</code> ranges</li></ul><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">~]<span class="comment"># iptables -m iprange --help</span></span><br><span class="line">iprange match options:</span><br><span class="line">[!] --src-range ip[-ip] Match <span class="built_in">source</span> IP <span class="keyword">in</span> the specified range</span><br><span class="line">[!] 
--dst-range ip[-ip] Match destination IP <span class="keyword">in</span> the specified range</span><br></pre></td></tr></table></figure><ul><li>Connection state</li></ul><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">~]<span class="comment"># iptables -m state --help</span></span><br><span class="line">state match options:</span><br><span class="line"> [!] --state [INVALID|ESTABLISHED|NEW|RELATED|UNTRACKED][,...]</span><br></pre></td></tr></table></figure><blockquote><p>References</p></blockquote><ul><li><a href="https://mp.weixin.qq.com/s/Dgv5BU9YU0tuSMxtzcuiVw"><code>iptables</code>长文详解,值得收藏细读</a></li><li><a href="http://www.tianfeiyu.com/?p=2894"><code>kube-proxy iptables</code> 模式源码分析</a></li><li><a href="https://www.zsythink.net/archives/1199"><code>iptables详解(1)</code></a></li></ul>]]></content>
<summary type="html"><blockquote>
<p>This post is adapted from <a href="https://mp.weixin.qq.com/s/Dgv5BU9YU0tuSMxtzcuiVw"><code>iptables</code>长文详解,值得收藏细读</a><br>I put it together because many </summary>
<category term="Linux" scheme="http://kiragoo.github.com/categories/Linux/"/>
<category term="iptables" scheme="http://kiragoo.github.com/tags/iptables/"/>
<category term="网络" scheme="http://kiragoo.github.com/tags/%E7%BD%91%E7%BB%9C/"/>
</entry>
<entry>
<title>Kubernetes Source Code Analysis Series: StatefulSet Controller</title>
<link href="http://kiragoo.github.com/archives/f7f100a2.html"/>
<id>http://kiragoo.github.com/archives/f7f100a2.html</id>
<published>2021-10-21T03:01:43.000Z</published>
<updated>2023-04-03T08:26:56.205Z</updated>
<content type="html"><![CDATA[<blockquote><p>This analysis of <code>Controller</code> source code uses the <code>StatefulSet Controller</code> as its example; the other controllers follow the same pattern, so it can serve as a reference for them.</p></blockquote><h1 id="StatefulSet-简介"><a href="#StatefulSet-简介" class="headerlink" title="StatefulSet 简介"></a>Introduction to <code>StatefulSet</code></h1><p>This post assumes you are already comfortable using <code>Statefulset</code>, so the usual introduction and usage <code>Demo</code> are omitted; see <a href="https://kubernetes.io/zh/docs/tutorials/stateful-application/basic-stateful-set/">StatefulSet Basics</a> for details.</p><h1 id="StatefulSet-Controller-启动分析"><a href="#StatefulSet-Controller-启动分析" class="headerlink" title="StatefulSet Controller 启动分析"></a><code>StatefulSet Controller</code> Startup Analysis</h1><h2 id="kube-manager-controller-入口调用链分析"><a href="#kube-manager-controller-入口调用链分析" class="headerlink" title="kube-manager-controller 入口调用链分析"></a><code>kube-controller-manager</code> Entry Call Chain</h2><p><em>The layout of the <code>Kubernetes</code> source tree is not covered here; some familiarity with it is assumed.</em></p><p>To see how <code>k8s</code> starts <code>kube-controller-manager</code>, look at the <a href="https://kubernetes.io/zh/docs/reference/command-line-tools-reference/kube-controller-manager/"><code>kube-controller-manager </code> documentation</a>, which contains the following:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">--controllers strings Default: [*]</span><br><span class="line">A list of controllers to enable. \* enables all on-by-default controllers; foo enables the controller named foo; -foo disables the controller named foo.</span><br><span 
class="line">All controllers: attachdetach, bootstrapsigner, cloud-node-lifecycle, clusterrole-aggregation, cronjob, csrapproving, csrcleaner, csrsigning, daemonset, deployment, disruption, endpoint, endpointslice, endpointslicemirroring, ephemeral-volume, garbagecollector, horizontalpodautoscaling, job, namespace, nodeipam, nodelifecycle, persistentvolume-binder, persistentvolume-expander, podgc, pv-protection, pvc-protection, replicaset, replicationcontroller, resourcequota, root-ca-cert-publisher, route, service, serviceaccount, serviceaccount-token, statefulset, tokencleaner, ttl, ttl-after-finished</span><br><span class="line">Disabled-by-default controllers: bootstrapsigner, tokencleaner.</span><br></pre></td></tr></table></figure><p>So <code>statefulset</code> is already among the controllers started by default. Where does that show up in the code? Read on.</p><p>The <code>main</code> function under <code>cmd/kube-controller-manager</code> does just two things:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">main</span><span class="params">()</span></span> {</span><br><span class="line">command := app.NewControllerManagerCommand() <span class="comment">// build the command (initialization)</span></span><br><span class="line">code := cli.Run(command) <span class="comment">// actually run it</span></span><br><span class="line">os.Exit(code)</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>Following the call chain leads to <code>cmd/kube-controller-manager/app/controllermanager.go</code>, where we can see what <code>Run</code> actually does.</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span 
class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line">...</span><br><span class="line">Run: <span class="function"><span class="keyword">func</span><span class="params">(cmd *cobra.Command, args []<span class="keyword">string</span>)</span></span> {</span><br><span class="line">verflag.PrintAndExitIfRequested()</span><br><span class="line">cliflag.PrintFlags(cmd.Flags())</span><br><span class="line"></span><br><span class="line">err := checkNonZeroInsecurePort(cmd.Flags())</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line">fmt.Fprintf(os.Stderr, <span class="string">"%v\n"</span>, err)</span><br><span class="line">os.Exit(<span class="number">1</span>)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// The point to note: `KnownControllers()` supplies the names of all known `Controllers` that may be initialized</span></span><br><span class="line">c, err := s.Config(KnownControllers(), ControllersDisabledByDefault.List()) </span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line">fmt.Fprintf(os.Stderr, <span class="string">"%v\n"</span>, err)</span><br><span class="line">os.Exit(<span class="number">1</span>)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> err := Run(c.Complete(), wait.NeverStop); err != <span 
class="literal">nil</span> {</span><br><span class="line">fmt.Fprintf(os.Stderr, <span class="string">"%v\n"</span>, err)</span><br><span class="line">os.Exit(<span class="number">1</span>)</span><br><span class="line">}</span><br><span class="line">},</span><br><span class="line">...</span><br></pre></td></tr></table></figure><p>A closer look at the <code>KnownControllers()</code> function:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// KnownControllers returns all known controllers's name</span></span><br><span class="line"><span class="comment">// This is in fact the parameter the `daemon` process needed at startup earlier: a slice of `controller` names.</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">KnownControllers</span><span class="params">()</span> []<span class="title">string</span></span> {</span><br><span class="line">ret := sets.StringKeySet(NewControllerInitializers(IncludeCloudLoops))</span><br><span class="line"></span><br><span class="line"><span class="comment">// add "special" controllers that aren't initialized normally. These controllers cannot be initialized</span></span><br><span class="line"><span class="comment">// using a normal function. The only known special case is the SA token controller which *must* be started</span></span><br><span class="line"><span class="comment">// first to ensure that the SA tokens for future controllers will exist. 
Think very carefully before adding</span></span><br><span class="line"><span class="comment">// to this list.</span></span><br><span class="line">ret.Insert(</span><br><span class="line">saTokenControllerName,</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> ret.List()</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p><code>NewControllerInitializers</code> turns out to be the function that actually registers the controllers' init functions for <code>controller-manager</code>.</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// NewControllerInitializers is a public map of named controller groups (you can start more than one in an init func)</span></span><br><span class="line"><span class="comment">// paired to their InitFunc. 
This allows for structured downstream composition and subdivision.</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">NewControllerInitializers</span><span class="params">(loopMode ControllerLoopMode)</span> <span class="title">map</span>[<span class="title">string</span>]<span class="title">InitFunc</span></span> {</span><br><span class="line">controllers := <span class="keyword">map</span>[<span class="keyword">string</span>]InitFunc{}</span><br><span class="line"></span><br><span class="line"><span class="comment">// This is where the init function is actually registered in the `map`</span></span><br><span class="line">controllers[<span class="string">"statefulset"</span>] = startStatefulSetController </span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> loopMode == IncludeCloudLoops {</span><br><span class="line">controllers[<span class="string">"service"</span>] = startServiceController</span><br><span class="line">controllers[<span class="string">"route"</span>] = startRouteController</span><br><span class="line">controllers[<span class="string">"cloud-node-lifecycle"</span>] = startCloudNodeLifecycleController</span><br><span class="line"><span class="comment">// <span class="doctag">TODO:</span> volume controller into the IncludeCloudLoops only set.</span></span><br><span class="line">}</span><br><span class="line">controllers[<span class="string">"persistentvolume-binder"</span>] = startPersistentVolumeBinderController</span><br><span class="line">controllers[<span class="string">"attachdetach"</span>] = startAttachDetachController</span><br><span class="line">controllers[<span class="string">"persistentvolume-expander"</span>] = startVolumeExpandController</span><br><span class="line">controllers[<span class="string">"clusterrole-aggregation"</span>] = startClusterRoleAggregrationController</span><br><span class="line">controllers[<span class="string">"pvc-protection"</span>] = 
startPVCProtectionController</span><br><span class="line">controllers[<span class="string">"pv-protection"</span>] = startPVProtectionController</span><br><span class="line">controllers[<span class="string">"ttl-after-finished"</span>] = startTTLAfterFinishedController</span><br><span class="line">controllers[<span class="string">"root-ca-cert-publisher"</span>] = startRootCACertPublisher</span><br><span class="line">controllers[<span class="string">"ephemeral-volume"</span>] = startEphemeralVolumeController</span><br><span class="line"><span class="keyword">if</span> utilfeature.DefaultFeatureGate.Enabled(genericfeatures.APIServerIdentity) &&</span><br><span class="line">utilfeature.DefaultFeatureGate.Enabled(genericfeatures.StorageVersionAPI) {</span><br><span class="line">controllers[<span class="string">"storage-version-gc"</span>] = startStorageVersionGCController</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> controllers</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>This shows how <code>StatefulSet</code> gets registered with <code>kube-controller-manager</code>.</p><h2 id="Statefulset-Controller-启动过程"><a href="#Statefulset-Controller-启动过程" class="headerlink" title="Statefulset Controller 启动过程"></a><code>Statefulset Controller</code> Startup Process</h2><p>Consider the <code>Run</code> function in <code>cmd/kube-controller-manager/app/controllermanager.go</code>:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span 
class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line">...</span><br><span class="line">Run: <span class="function"><span class="keyword">func</span><span class="params">(cmd *cobra.Command, args []<span class="keyword">string</span>)</span></span> {</span><br><span class="line">verflag.PrintAndExitIfRequested()</span><br><span class="line">cliflag.PrintFlags(cmd.Flags())</span><br><span class="line"></span><br><span class="line">err := checkNonZeroInsecurePort(cmd.Flags())</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line">fmt.Fprintf(os.Stderr, <span class="string">"%v\n"</span>, err)</span><br><span class="line">os.Exit(<span class="number">1</span>)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">c, err := s.Config(KnownControllers(), ControllersDisabledByDefault.List()) </span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line">fmt.Fprintf(os.Stderr, <span class="string">"%v\n"</span>, err)</span><br><span class="line">os.Exit(<span class="number">1</span>)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Complete the `Controller manager` `config`, then `Run` the `controller`s it manages</span></span><br><span class="line"><span class="keyword">if</span> err := Run(c.Complete(), wait.NeverStop); err != <span class="literal">nil</span> { </span><br><span class="line">fmt.Fprintf(os.Stderr, <span class="string">"%v\n"</span>, err)</span><br><span class="line">os.Exit(<span 
class="number">1</span>)</span><br><span class="line">}</span><br><span class="line">},</span><br><span class="line">...</span><br></pre></td></tr></table></figure><p>Now step into the <code>Run</code> function:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Run runs the KubeControllerManagerOptions. 
This should never exit.</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">Run</span><span class="params">(c *config.CompletedConfig, stopCh <-<span class="keyword">chan</span> <span class="keyword">struct</span>{})</span> <span class="title">error</span></span> {</span><br><span class="line"><span class="comment">// To help debugging, immediately log version</span></span><br><span class="line">klog.Infof(<span class="string">"Version: %+v"</span>, version.Get())</span><br><span class="line">...</span><br><span class="line">clientBuilder, rootClientBuilder := createClientBuilders(c)</span><br><span class="line"></span><br><span class="line">saTokenControllerInitFunc := serviceAccountTokenControllerStarter{rootClientBuilder: rootClientBuilder}.startServiceAccountTokenController</span><br><span class="line"></span><br><span class="line">run := <span class="function"><span class="keyword">func</span><span class="params">(ctx context.Context, startSATokenController InitFunc, initializersFunc ControllerInitializersFunc)</span></span> { <span class="comment">// This is the `Controller` initialization to focus on.</span></span><br><span class="line"></span><br><span class="line">controllerContext, err := CreateControllerContext(c, rootClientBuilder, clientBuilder, ctx.Done())</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line">klog.Fatalf(<span class="string">"error building controller context: %v"</span>, err)</span><br><span class="line">}</span><br><span class="line">controllerInitializers := initializersFunc(controllerContext.LoopMode)</span><br><span class="line"> <span class="comment">// This is where the managed controllers are actually started</span></span><br><span class="line"><span class="keyword">if</span> err := StartControllers(ctx, controllerContext, startSATokenController, controllerInitializers, unsecuredMux, healthzHandler); err != <span class="literal">nil</span> 
{</span><br><span class="line">klog.Fatalf(<span class="string">"error starting controllers: %v"</span>, err)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">controllerContext.InformerFactory.Start(stopCh)</span><br><span class="line">controllerContext.ObjectOrMetadataInformerFactory.Start(stopCh)</span><br><span class="line"><span class="built_in">close</span>(controllerContext.InformersStarted)</span><br><span class="line"></span><br><span class="line"><span class="keyword">select</span> {}</span><br><span class="line">}</span><br><span class="line">...</span><br><span class="line"><span class="keyword">select</span> {}</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>Next, look at the <code>StartControllers</code> function:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// StartControllers starts a set of controllers with a specified 
ControllerContext</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">StartControllers</span><span class="params">(ctx context.Context, controllerCtx ControllerContext, startSATokenController InitFunc, controllers <span class="keyword">map</span>[<span class="keyword">string</span>]InitFunc,</span></span></span><br><span class="line"><span class="function"><span class="params">unsecuredMux *mux.PathRecorderMux, healthzHandler *controllerhealthz.MutableHealthzHandler)</span> <span class="title">error</span></span> {</span><br><span class="line"> ...</span><br><span class="line"> <span class="comment">// 这里会 `for` 循环遍历初始化过的 `controllers` 进行处理,需要关注下 `initFn` 究竟做了啥.</span></span><br><span class="line"><span class="keyword">for</span> controllerName, initFn := <span class="keyword">range</span> controllers {</span><br><span class="line"><span class="keyword">if</span> !controllerCtx.IsControllerEnabled(controllerName) {</span><br><span class="line">klog.Warningf(<span class="string">"%q is disabled"</span>, controllerName)</span><br><span class="line"><span class="keyword">continue</span></span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">time.Sleep(wait.Jitter(controllerCtx.ComponentConfig.Generic.ControllerStartInterval.Duration, ControllerStartJitter))</span><br><span class="line"></span><br><span class="line">klog.V(<span class="number">1</span>).Infof(<span class="string">"Starting %q"</span>, controllerName)</span><br><span class="line">ctrl, started, err := initFn(ctx, controllerCtx)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line">klog.Errorf(<span class="string">"Error starting %q"</span>, controllerName)</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">}</span><br><span class="line"><span class="keyword">if</span> !started 
{</span><br><span class="line">klog.Warningf(<span class="string">"Skipping %q"</span>, controllerName)</span><br><span class="line"><span class="keyword">continue</span></span><br><span class="line">}</span><br><span class="line"> ...</span><br><span class="line">klog.Infof(<span class="string">"Started %q"</span>, controllerName)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">healthzHandler.AddHealthChecker(controllerChecks...)</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// InitFunc launches one particular controller. The returned `controller` may implement optional interfaces so that the `controller manager` can support additional features.</span></span><br><span class="line"><span class="keyword">type</span> InitFunc <span class="function"><span class="keyword">func</span><span class="params">(ctx context.Context, controllerCtx ControllerContext)</span> <span class="params">(controller controller.Interface, enabled <span class="keyword">bool</span>, err error)</span></span></span><br></pre></td></tr></table></figure><p>At this point we still have not seen where the <code>StatefulSet Controller</code> is actually executed, so recall the contents of <code>NewControllerInitializers</code> covered earlier:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">NewControllerInitializers</span><span class="params">(loopMode ControllerLoopMode)</span> <span class="title">map</span>[<span class="title">string</span>]<span class="title">InitFunc</span></span> 
{</span><br><span class="line"> ...</span><br><span class="line"> controllers[<span class="string">"statefulset"</span>] = startStatefulSetController</span><br><span class="line"> ...</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>This is the lead we were looking for. Next, look at the implementation of <code>startStatefulSetController</code>.</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">startStatefulSetController</span><span class="params">(ctx context.Context, controllerContext ControllerContext)</span> <span class="params">(controller.Interface, <span class="keyword">bool</span>, error)</span></span> {</span><br><span class="line"><span class="keyword">go</span> statefulset.NewStatefulSetController(</span><br><span class="line"><span class="comment">// Informers for the resource types directly related to `Sts`</span></span><br><span class="line">controllerContext.InformerFactory.Core().V1().Pods(),</span><br><span class="line">controllerContext.InformerFactory.Apps().V1().StatefulSets(),</span><br><span class="line">controllerContext.InformerFactory.Core().V1().PersistentVolumeClaims(),</span><br><span class="line">controllerContext.InformerFactory.Apps().V1().ControllerRevisions(),</span><br><span class="line">controllerContext.ClientBuilder.ClientOrDie(<span class="string">"statefulset-controller"</span>),</span><br><span class="line">).Run(<span class="keyword">int</span>(controllerContext.ComponentConfig.StatefulSetController.ConcurrentStatefulSetSyncs), ctx.Done())</span><br><span class="line"><span class="keyword">return</span> 
<span class="literal">nil</span>, <span class="literal">true</span>, <span class="literal">nil</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>We now know how the <code>StatefulSet Controller</code> is actually started.</p><h1 id="StatefulSet-Controller-明细分析"><a href="#StatefulSet-Controller-明细分析" class="headerlink" title="StatefulSet Controller in detail"></a><code>StatefulSet Controller</code> in detail</h1><p>With the startup path covered, we can turn to the <code>StatefulSet Controller</code> itself.</p><h2 id="StatefulSetController"><a href="#StatefulSetController" class="headerlink" title="StatefulSetController"></a><code>StatefulSetController</code></h2><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">type</span> StatefulSetController <span class="keyword">struct</span> {</span><br><span class="line"><span class="comment">// client interface</span></span><br><span class="line">kubeClient clientset.Interface</span><br><span class="line"><span class="comment">// control returns an interface capable of syncing a stateful set.</span></span><br><span class="line"><span class="comment">// Abstracted out for testing.</span></span><br><span class="line">control 
StatefulSetControlInterface</span><br><span class="line"><span class="comment">// podControl is used for patching pods.</span></span><br><span class="line">podControl controller.PodControlInterface</span><br><span class="line"><span class="comment">// podLister is able to list/get pods from a shared informer's store</span></span><br><span class="line">podLister corelisters.PodLister</span><br><span class="line"><span class="comment">// podListerSynced returns true if the pod shared informer has synced at least once</span></span><br><span class="line">podListerSynced cache.InformerSynced</span><br><span class="line"><span class="comment">// setLister is able to list/get stateful sets from a shared informer's store</span></span><br><span class="line">setLister appslisters.StatefulSetLister</span><br><span class="line"><span class="comment">// setListerSynced returns true if the stateful set shared informer has synced at least once</span></span><br><span class="line">setListerSynced cache.InformerSynced</span><br><span class="line"><span class="comment">// pvcListerSynced returns true if the pvc shared informer has synced at least once</span></span><br><span class="line">pvcListerSynced cache.InformerSynced</span><br><span class="line"><span class="comment">// revListerSynced returns true if the rev shared informer has synced at least once</span></span><br><span class="line">revListerSynced cache.InformerSynced</span><br><span class="line"><span class="comment">// StatefulSets that need to be synced.</span></span><br><span class="line">queue workqueue.RateLimitingInterface</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>通过对 <code>StatefulSetController</code> 结构体的大纲,了解下大概的结构:<br><img src="https://kblogs.oss-cn-beijing.aliyuncs.com/blogimgs/ssc.png" alt="ssc"></p><h2 id="NewStatefulSetController"><a href="#NewStatefulSetController" class="headerlink" title="NewStatefulSetController"></a><code>NewStatefulSetController</code></h2><p>对于 
<code>ssc</code> 的构造函数分析:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span 
class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">NewStatefulSetController</span><span class="params">(</span></span></span><br><span class="line"><span class="function"><span class="params">// 可以观察到,这边都是 `ssc` 关心的 `resource` 对象,Pod/Sts/Pvc/Revision</span></span></span><br><span class="line"><span class="function"><span class="params">podInformer coreinformers.PodInformer,</span></span></span><br><span class="line"><span class="function"><span class="params">setInformer appsinformers.StatefulSetInformer,</span></span></span><br><span class="line"><span class="function"><span class="params">pvcInformer coreinformers.PersistentVolumeClaimInformer,</span></span></span><br><span class="line"><span class="function"><span class="params">revInformer appsinformers.ControllerRevisionInformer,</span></span></span><br><span class="line"><span class="function"><span class="params">kubeClient clientset.Interface,</span></span></span><br><span class="line"><span class="function"><span class="params">)</span> *<span class="title">StatefulSetController</span></span> {</span><br><span class="line">eventBroadcaster := record.NewBroadcaster()</span><br><span class="line">eventBroadcaster.StartStructuredLogging(<span class="number">0</span>)</span><br><span class="line">eventBroadcaster.StartRecordingToSink(&v1core.EventSinkImpl{Interface: kubeClient.CoreV1().Events(<span class="string">""</span>)})</span><br><span class="line">recorder := eventBroadcaster.NewRecorder(scheme.Scheme, v1.EventSource{Component: <span class="string">"statefulset-controller"</span>})</span><br><span class="line">ssc := &StatefulSetController{</span><br><span class="line">kubeClient: kubeClient,</span><br><span class="line">control: 
NewDefaultStatefulSetControl(</span><br><span class="line">NewRealStatefulPodControl(</span><br><span class="line">kubeClient,</span><br><span class="line">setInformer.Lister(),</span><br><span class="line">podInformer.Lister(),</span><br><span class="line">pvcInformer.Lister(),</span><br><span class="line">recorder),</span><br><span class="line">NewRealStatefulSetStatusUpdater(kubeClient, setInformer.Lister()),</span><br><span class="line">history.NewHistory(kubeClient, revInformer.Lister()),</span><br><span class="line">recorder,</span><br><span class="line">),</span><br><span class="line">pvcListerSynced: pvcInformer.Informer().HasSynced,</span><br><span class="line">queue: workqueue.NewNamedRateLimitingQueue(workqueue.DefaultControllerRateLimiter(), <span class="string">"statefulset"</span>),</span><br><span class="line">podControl: controller.RealPodControl{KubeClient: kubeClient, Recorder: recorder},</span><br><span class="line"></span><br><span class="line">revListerSynced: revInformer.Informer().HasSynced,</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// `Sts` 管理的 `Pod crud` 时对应的处理方法</span></span><br><span class="line">podInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{</span><br><span class="line"><span class="comment">// lookup the statefulset and enqueue</span></span><br><span class="line">AddFunc: ssc.addPod,</span><br><span class="line"><span class="comment">// lookup current and old statefulset if labels changed</span></span><br><span class="line">UpdateFunc: ssc.updatePod,</span><br><span class="line"><span class="comment">// lookup statefulset accounting for deletion tombstones</span></span><br><span class="line">DeleteFunc: ssc.deletePod,</span><br><span class="line">})</span><br><span class="line">ssc.podLister = podInformer.Lister()</span><br><span class="line">ssc.podListerSynced = podInformer.Informer().HasSynced</span><br><span class="line"></span><br><span 
class="line"><span class="comment">// `Sts crud` 时对应的方法 </span></span><br><span class="line">setInformer.Informer().AddEventHandler(</span><br><span class="line">cache.ResourceEventHandlerFuncs{</span><br><span class="line">AddFunc: ssc.enqueueStatefulSet,</span><br><span class="line">UpdateFunc: <span class="function"><span class="keyword">func</span><span class="params">(old, cur <span class="keyword">interface</span>{})</span></span> {</span><br><span class="line">oldPS := old.(*apps.StatefulSet)</span><br><span class="line">curPS := cur.(*apps.StatefulSet)</span><br><span class="line"><span class="keyword">if</span> oldPS.Status.Replicas != curPS.Status.Replicas {</span><br><span class="line">klog.V(<span class="number">4</span>).Infof(<span class="string">"Observed updated replica count for StatefulSet: %v, %d->%d"</span>, curPS.Name, oldPS.Status.Replicas, curPS.Status.Replicas)</span><br><span class="line">}</span><br><span class="line">ssc.enqueueStatefulSet(cur)</span><br><span class="line">},</span><br><span class="line">DeleteFunc: ssc.enqueueStatefulSet,</span><br><span class="line">},</span><br><span class="line">)</span><br><span class="line">ssc.setLister = setInformer.Lister()</span><br><span class="line">ssc.setListerSynced = setInformer.Informer().HasSynced</span><br><span class="line"></span><br><span class="line"><span class="comment">// <span class="doctag">TODO:</span> Watch volumes</span></span><br><span class="line"><span class="keyword">return</span> ssc</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h3 id="ControllerRevision"><a href="#ControllerRevision" class="headerlink" title="ControllerRevision"></a><code>ControllerRevision</code></h3><p>这里了解下 <code>ControllerRevision</code> 究竟是啥,为啥需要关注。</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// ControllerRevision implements an immutable snapshot of state data. Clients</span></span><br><span class="line"><span class="comment">// are responsible for serializing and deserializing the objects that contain</span></span><br><span class="line"><span class="comment">// their internal state.</span></span><br><span class="line"><span class="comment">// Once a ControllerRevision has been successfully created, it can not be updated.</span></span><br><span class="line"><span class="comment">// The API Server will fail validation of all requests that attempt to mutate</span></span><br><span class="line"><span class="comment">// the Data field. ControllerRevisions may, however, be deleted. Note that, due to its use by both</span></span><br><span class="line"><span class="comment">// the DaemonSet and StatefulSet controllers for update and rollback, this object is beta. However,</span></span><br><span class="line"><span class="comment">// it may be subject to name and representation changes in future releases, and clients should not</span></span><br><span class="line"><span class="comment">// depend on its stability. 
It is primarily for internal use by controllers.</span></span><br><span class="line"><span class="keyword">type</span> ControllerRevision <span class="keyword">struct</span> {</span><br><span class="line">metav1.TypeMeta <span class="string">`json:",inline"`</span></span><br><span class="line"><span class="comment">// Standard object's metadata.</span></span><br><span class="line"><span class="comment">// More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata</span></span><br><span class="line"><span class="comment">// +optional</span></span><br><span class="line">metav1.ObjectMeta <span class="string">`json:"metadata,omitempty" protobuf:"bytes,1,opt,name=metadata"`</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// Data is the serialized representation of the state.</span></span><br><span class="line">Data runtime.RawExtension <span class="string">`json:"data,omitempty" protobuf:"bytes,2,opt,name=data"`</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// Revision indicates the revision of the state represented by Data.</span></span><br><span class="line">Revision <span class="keyword">int64</span> <span class="string">`json:"revision" protobuf:"varint,3,opt,name=revision"`</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>From the comments we can see that:<br><code>ControllerRevision</code> is used by the <code>DaemonSet</code> and <code>StatefulSet</code> controllers for update and rollback. A <code>ControllerRevision</code> stores a snapshot of state data and is immutable once created (it may still be deleted); the client is responsible for serializing the data on write and deserializing it on read. The <code>Revision(int64)</code> field acts as the version number of the <code>ControllerRevision</code>, and the <code>Data</code> field holds the serialized state.<br>An <code>Sts</code> update or rollback is therefore driven by comparing the old and new <code>ControllerRevision</code> objects.</p><h3 id="NewDefaultStatefulSetControl"><a href="#NewDefaultStatefulSetControl" class="headerlink" 
title="NewDefaultStatefulSetControl"></a><code>NewDefaultStatefulSetControl</code></h3><p>Take a closer look at the definition of <code>NewDefaultStatefulSetControl</code>:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// NewDefaultStatefulSetControl returns a new instance of the default implementation StatefulSetControlInterface that</span></span><br><span class="line"><span class="comment">// implements the documented semantics for StatefulSets. podControl is the PodControlInterface used to create, update,</span></span><br><span class="line"><span class="comment">// and delete Pods and to create PersistentVolumeClaims. statusUpdater is the StatefulSetStatusUpdaterInterface used</span></span><br><span class="line"><span class="comment">// to update the status of StatefulSets. 
You should use an instance returned from NewRealStatefulPodControl() for any</span></span><br><span class="line"><span class="comment">// scenario other than testing.</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">NewDefaultStatefulSetControl</span><span class="params">(</span></span></span><br><span class="line"><span class="function"><span class="params">// Interface for managing the Pods that belong to an Sts</span></span></span><br><span class="line"><span class="function"><span class="params">podControl StatefulPodControlInterface,</span></span></span><br><span class="line"><span class="function"><span class="params">// Interface for updating the Status of an Sts</span></span></span><br><span class="line"><span class="function"><span class="params">statusUpdater StatefulSetStatusUpdaterInterface,</span></span></span><br><span class="line"><span class="function"><span class="params">// Interface for managing ControllerRevisions</span></span></span><br><span class="line"><span class="function"><span class="params">controllerHistory history.Interface,</span></span></span><br><span class="line"><span class="function"><span class="params">// Event recorder interface</span></span></span><br><span class="line"><span class="function"><span class="params">recorder record.EventRecorder)</span> <span class="title">StatefulSetControlInterface</span></span> {</span><br><span class="line"><span class="keyword">return</span> &defaultStatefulSetControl{podControl, statusUpdater, controllerHistory, recorder}</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h3 id="Run-函数执行过程"><a href="#Run-函数执行过程" class="headerlink" title="Run function execution flow"></a><code>Run</code> function execution flow</h3><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span 
class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Run runs the statefulset controller.</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(ssc *StatefulSetController)</span> <span class="title">Run</span><span class="params">(workers <span class="keyword">int</span>, stopCh <-<span class="keyword">chan</span> <span class="keyword">struct</span>{})</span></span> {</span><br><span class="line"><span class="keyword">defer</span> utilruntime.HandleCrash()</span><br><span class="line"><span class="keyword">defer</span> ssc.queue.ShutDown()</span><br><span class="line"></span><br><span class="line">klog.Infof(<span class="string">"Starting stateful set controller"</span>)</span><br><span class="line"><span class="keyword">defer</span> klog.Infof(<span class="string">"Shutting down statefulset controller"</span>)</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> !cache.WaitForNamedCacheSync(<span class="string">"stateful set"</span>, stopCh, ssc.podListerSynced, ssc.setListerSynced, ssc.pvcListerSynced, ssc.revListerSynced) {</span><br><span class="line"><span class="keyword">return</span></span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="keyword">for</span> i := <span class="number">0</span>; i < workers; i++ {</span><br><span class="line"><span class="keyword">go</span> wait.Until(ssc.worker, time.Second, stopCh)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><-stopCh</span><br><span 
class="line">}</span><br></pre></td></tr></table></figure><p>Take note of the <code>wait.Until</code> helper used here:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Until loops until stop channel is closed, running f every period.</span></span><br><span class="line"><span class="comment">//</span></span><br><span class="line"><span class="comment">// Until is syntactic sugar on top of JitterUntil with zero jitter factor and</span></span><br><span class="line"><span class="comment">// with sliding = true (which means the timer for period starts after the f</span></span><br><span class="line"><span class="comment">// completes).</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">Until</span><span class="params">(f <span class="keyword">func</span>()</span>, <span class="title">period</span> <span class="title">time</span>.<span class="title">Duration</span>, <span class="title">stopCh</span> <-<span class="title">chan</span> <span class="title">struct</span></span>{}) {</span><br><span class="line">JitterUntil(f, period, <span class="number">0.0</span>, <span class="literal">true</span>, stopCh)</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>From the comment we can see that <code>Until</code> runs the function <code>f</code> once every period until the stop <code>channel</code> is closed; because <code>sliding</code> is <code>true</code>, the timer for the next period starts only after <code>f</code> returns.<br>This kind of periodic retry is useful when an operation has to wait for other resources. For example, if <code>A</code> depends on <code>B</code>, releasing <code>A</code> also means polling until <code>B</code> has been released. Such situations are common when operating on <code>k8s</code> resources.</p><p>Next, look at the function wrapped by <code>wait.Until</code>, <code>ssc.worker</code>.</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// worker runs a worker goroutine that invokes processNextWorkItem until the controller's queue is closed</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(ssc *StatefulSetController)</span> <span class="title">worker</span><span class="params">()</span></span> {</span><br><span class="line"><span class="keyword">for</span> ssc.processNextWorkItem() {</span><br><span class="line">}</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p><code>worker</code> runs in its own <code>goroutine</code> and keeps calling <code>processNextWorkItem</code> until the <code>queue</code> of the <code>controller</code> is shut down.</p><p>So the next function to analyze is <code>processNextWorkItem()</code>:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// processNextWorkItem dequeues items, processes them, and marks them done. 
It enforces that the syncHandler is never</span></span><br><span class="line"><span class="comment">// invoked concurrently with the same key.</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(ssc *StatefulSetController)</span> <span class="title">processNextWorkItem</span><span class="params">()</span> <span class="title">bool</span></span> {</span><br><span class="line">key, quit := ssc.queue.Get()</span><br><span class="line"><span class="keyword">if</span> quit {</span><br><span class="line"><span class="keyword">return</span> <span class="literal">false</span></span><br><span class="line">}</span><br><span class="line"><span class="keyword">defer</span> ssc.queue.Done(key)</span><br><span class="line"><span class="comment">// The rest is straightforward; the part to focus on is ssc.sync()</span></span><br><span class="line"><span class="keyword">if</span> err := ssc.sync(key.(<span class="keyword">string</span>)); err != <span class="literal">nil</span> {</span><br><span class="line">utilruntime.HandleError(fmt.Errorf(<span class="string">"error syncing StatefulSet %v, requeuing: %v"</span>, key.(<span class="keyword">string</span>), err))</span><br><span class="line">ssc.queue.AddRateLimited(key)</span><br><span class="line">} <span class="keyword">else</span> {</span><br><span class="line">ssc.queue.Forget(key)</span><br><span class="line">}</span><br><span class="line"><span class="keyword">return</span> <span class="literal">true</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure><p><code>processNextWorkItem()</code> dequeues an item from the <code>queue</code>, processes it, and marks it as done.</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span 
class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// sync syncs the given statefulset.</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(ssc *StatefulSetController)</span> <span class="title">sync</span><span class="params">(key <span class="keyword">string</span>)</span> <span class="title">error</span></span> {</span><br><span class="line">startTime := time.Now()</span><br><span class="line"><span class="keyword">defer</span> <span class="function"><span class="keyword">func</span><span class="params">()</span></span> {</span><br><span class="line">klog.V(<span class="number">4</span>).Infof(<span class="string">"Finished syncing statefulset %q (%v)"</span>, key, 
time.Since(startTime))</span><br><span class="line">}()</span><br><span class="line"></span><br><span class="line"><span class="comment">// Split the cached key into namespace and name</span></span><br><span class="line">namespace, name, err := cache.SplitMetaNamespaceKey(key)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Get the StatefulSet from the cache by namespace and name</span></span><br><span class="line">set, err := ssc.setLister.StatefulSets(namespace).Get(name)</span><br><span class="line"><span class="keyword">if</span> errors.IsNotFound(err) {</span><br><span class="line">klog.Infof(<span class="string">"StatefulSet has been deleted %v"</span>, key)</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">}</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line">utilruntime.HandleError(fmt.Errorf(<span class="string">"unable to retrieve StatefulSet %v from store: %v"</span>, key, err))</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Get the selector of the sts</span></span><br><span class="line">selector, err := metav1.LabelSelectorAsSelector(set.Spec.Selector)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line">utilruntime.HandleError(fmt.Errorf(<span class="string">"error converting StatefulSet %v selector: %v"</span>, key, err))</span><br><span class="line"><span class="comment">// This is a non-transient error, so don't retry.</span></span><br><span class="line"><span class="keyword">return</span> <span 
class="literal">nil</span></span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Call ssc.adoptOrphanRevisions to check for orphaned controllerrevisions objects: those that match the selector are adopted by adding ownerReferences, and those already owned whose labels no longer match are released.</span></span><br><span class="line"><span class="keyword">if</span> err := ssc.adoptOrphanRevisions(set); err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Call ssc.getPodsForStatefulSet to fetch the pods associated with the sts via the selector: orphaned pods whose labels match the sts are adopted, and owned pods whose labels have changed are released from the sts.</span></span><br><span class="line">pods, err := ssc.getPodsForStatefulSet(set, selector)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Perform the actual sync</span></span><br><span class="line"><span class="keyword">return</span> ssc.syncStatefulSet(set, pods)</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p><code>syncStatefulSet</code>:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span 
class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// syncStatefulSet syncs a tuple of (statefulset, []*v1.Pod).</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(ssc *StatefulSetController)</span> <span class="title">syncStatefulSet</span><span class="params">(set *apps.StatefulSet, pods []*v1.Pod)</span> <span class="title">error</span></span> {</span><br><span class="line">klog.V(<span class="number">4</span>).Infof(<span class="string">"Syncing StatefulSet %v/%v with %d pods"</span>, set.Namespace, set.Name, <span class="built_in">len</span>(pods))</span><br><span class="line"><span class="keyword">var</span> status *apps.StatefulSetStatus</span><br><span class="line"><span class="keyword">var</span> err error</span><br><span class="line"><span class="comment">// <span class="doctag">TODO:</span> investigate where we mutate the set during the update as it is not obvious.</span></span><br><span class="line"><span class="comment">// This simply delegates to ssc.control.UpdateStatefulSet.</span></span><br><span class="line">status, err = ssc.control.UpdateStatefulSet(set.DeepCopy(), pods)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">}</span><br><span class="line">klog.V(<span class="number">4</span>).Infof(<span class="string">"Successfully synced StatefulSet %s/%s successful"</span>, set.Namespace, set.Name)</span><br><span class="line"><span class="comment">// One more sync to handle the clock skew. 
This is also helping in requeuing right after status update</span></span><br><span class="line"><span class="keyword">if</span> utilfeature.DefaultFeatureGate.Enabled(features.StatefulSetMinReadySeconds) && set.Spec.MinReadySeconds > <span class="number">0</span> && status != <span class="literal">nil</span> && status.AvailableReplicas != *set.Spec.Replicas {</span><br><span class="line">ssc.enqueueSSAfter(set, time.Duration(set.Spec.MinReadySeconds)*time.Second)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure><p><code>UpdateStatefulSet</code>:</p><ol><li>Fetch the historical <code>revisions</code>;</li><li>Compute <code>currentRevision</code> and <code>updateRevision</code>; if the <code>sts</code> is in the middle of an update, the two values differ;</li><li>Call <code>ssc.performUpdate</code> to perform the actual <code>sync</code>;</li><li>Call <code>ssc.updateStatefulSetStatus</code> to update the <code>status subResource</code>;</li><li>Clean up expired <code>controllerrevision</code> objects according to the <code>spec.revisionHistoryLimit</code> field of the <code>sts</code>.</li></ol><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span 
class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(ssc *defaultStatefulSetControl)</span> <span class="title">UpdateStatefulSet</span><span class="params">(set *apps.StatefulSet, pods []*v1.Pod)</span> <span class="params">(*apps.StatefulSetStatus, error)</span></span> {</span><br><span class="line"><span class="comment">// List the revisions and sort them</span></span><br><span class="line">revisions, err := ssc.ListRevisions(set)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, err</span><br><span class="line">}</span><br><span class="line">history.SortControllerRevisions(revisions)</span><br><span class="line"></span><br><span class="line"><span class="comment">// Compute the revisions</span></span><br><span class="line">currentRevision, updateRevision, status, err := ssc.performUpdate(set, 
pods, revisions)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, utilerrors.NewAggregate([]error{err, ssc.truncateHistory(set, pods, revisions, currentRevision, updateRevision)})</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Truncate the expired revision history</span></span><br><span class="line"><span class="keyword">return</span> status, ssc.truncateHistory(set, pods, revisions, currentRevision, updateRevision)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(ssc *defaultStatefulSetControl)</span> <span class="title">performUpdate</span><span class="params">(</span></span></span><br><span class="line"><span class="function"><span class="params">set *apps.StatefulSet, pods []*v1.Pod, revisions []*apps.ControllerRevision)</span> <span class="params">(*apps.ControllerRevision, *apps.ControllerRevision, *apps.StatefulSetStatus, error)</span></span> {</span><br><span class="line"><span class="keyword">var</span> currentStatus *apps.StatefulSetStatus</span><br><span class="line"><span class="comment">// get the current, and update revisions</span></span><br><span class="line">currentRevision, updateRevision, collisionCount, err := ssc.getStatefulSetRevisions(set, revisions)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> currentRevision, updateRevision, currentStatus, err</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Perform the actual update</span></span><br><span class="line">currentStatus, err = ssc.updateStatefulSet(set, currentRevision, updateRevision, collisionCount, 
pods)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> currentRevision, updateRevision, currentStatus, err</span><br><span class="line">}</span><br><span class="line"><span class="comment">// update status</span></span><br><span class="line">err = ssc.updateStatefulSetStatus(set, currentStatus)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"><span class="keyword">return</span> currentRevision, updateRevision, currentStatus, err</span><br><span class="line">}</span><br><span class="line">klog.V(<span class="number">4</span>).Infof(<span class="string">"StatefulSet %s/%s pod status replicas=%d ready=%d current=%d updated=%d"</span>,</span><br><span class="line">set.Namespace,</span><br><span class="line">set.Name,</span><br><span class="line">currentStatus.Replicas,</span><br><span class="line">currentStatus.ReadyReplicas,</span><br><span class="line">currentStatus.CurrentReplicas,</span><br><span class="line">currentStatus.UpdatedReplicas)</span><br><span class="line"></span><br><span class="line">klog.V(<span class="number">4</span>).Infof(<span class="string">"StatefulSet %s/%s revisions current=%s update=%s"</span>,</span><br><span class="line">set.Namespace,</span><br><span class="line">set.Name,</span><br><span class="line">currentStatus.CurrentRevision,</span><br><span class="line">currentStatus.UpdateRevision)</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> currentRevision, updateRevision, currentStatus, <span class="literal">nil</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure><p><code>updateStatefulSet</code>:</p><pre><code class="golang">// This is the core method behind UpdateStatefulSet; it is retried until the
// StatefulSet reaches its desired state. The update strategies fall into three kinds:
// 1. RollingUpdateStatefulSetStrategyType
// 2. OnDeleteStatefulSetStrategyType
// 3. PartitionStatefulSetStrategyType
func (ssc *defaultStatefulSetControl) updateStatefulSet(
    set *apps.StatefulSet,
    currentRevision *apps.ControllerRevision,
    updateRevision *apps.ControllerRevision,
    collisionCount int32,
    pods []*v1.Pod) (*apps.StatefulSetStatus, error) {
    ...
    // Build the sts object for the current revision
    currentSet, err := ApplyRevision(set, currentRevision)
    ...
    // Build the sts object for the update revision
    updateSet, err := ApplyRevision(set, updateRevision)
    ...
    // Populate the status
    status := apps.StatefulSetStatus{}
    status.ObservedGeneration = set.Generation
    status.CurrentRevision = currentRevision.Name
    status.UpdateRevision = updateRevision.Name
    status.CollisionCount = new(int32)
    *status.CollisionCount = collisionCount

    // replicas holds Pods such that 0 <= getOrdinal(pod) < set.Spec.Replicas
    replicas := make([]*v1.Pod, replicaCount)
    // condemned holds Pods such that set.Spec.Replicas <= getOrdinal(pod)
    condemned := make([]*v1.Pod, 0, len(pods))
    ...
    // Sort the pods into the replicas and condemned slices
    for i := range pods {
        status.Replicas++

        // Count the replicas that are running and ready
        if isRunningAndReady(pods[i]) {
            status.ReadyReplicas++
            // Check whether the feature gate is enabled
            if utilfeature.DefaultFeatureGate.Enabled(features.StatefulSetMinReadySeconds) {
                if isRunningAndAvailable(pods[i], set.Spec.MinReadySeconds) {
                    status.AvailableReplicas++
                }
            } else {
                // If the feature gate is disabled, every ready replica counts as available
                status.AvailableReplicas = status.ReadyReplicas
            }
        }

        // Count the current and updated replicas
        if isCreated(pods[i]) && !isTerminating(pods[i]) {
            if getPodRevision(pods[i]) == currentRevision.Name {
                status.CurrentReplicas++
            }
            if getPodRevision(pods[i]) == updateRevision.Name {
                status.UpdatedReplicas++
            }
        }

        if ord := getOrdinal(pods[i]); 0 <= ord && ord < replicaCount {
            // Fill in replicas
            replicas[ord] = pods[i]
        } else if ord >= replicaCount {
            // Append to condemned
            condemned = append(condemned, pods[i])
        }
    }

    // Check the replicas slice for missing pods at indices [0, set.Spec.Replicas)
    // and create a pod object for each gap.
    // newVersionedStatefulSetPod decides whether to build it from currentSet or updateSet.
    for ord := 0; ord < replicaCount; ord++ {
        if replicas[ord] == nil {
            replicas[ord] = newVersionedStatefulSetPod(
                currentSet, updateSet,
                currentRevision.Name, updateRevision.Name, ord)
        }
    }

    // Sort the condemned slice by ordinal
    sort.Sort(ascendingOrdinal(condemned))

    // Find the first unhealthy Pod, by ordinal, across replicas and condemned
    for i := range replicas {
        if !isHealthy(replicas[i]) {
            unhealthy++
            if firstUnhealthyPod == nil {
                firstUnhealthyPod = replicas[i]
            }
        }
    }
    for i := range condemned {
        if !isHealthy(condemned[i]) {
            unhealthy++
            if firstUnhealthyPod == nil {
                firstUnhealthyPod = condemned[i]
            }
        }
    }

    if unhealthy > 0 {
        klog.V(4).Infof("StatefulSet %s/%s has %d unhealthy Pods starting with %s",
            set.Namespace, set.Name, unhealthy, firstUnhealthyPod.Name)
    }

    // Return early if the set is being deleted
    if set.DeletionTimestamp != nil {
        return &status, nil
    }

    // monotonic is true unless the set allows burst (parallel) pod management
    monotonic := !allowsBurst(set)

    // Make sure every pod in the replicas slice is running
    for i := range replicas {
        // Delete and recreate failed pods
        if isFailed(replicas[i]) {
            ssc.recorder.Eventf(set, v1.EventTypeWarning, "RecreatingFailedPod",
                "StatefulSet %s/%s is recreating failed Pod %s",
                set.Namespace, set.Name, replicas[i].Name)
            if err := ssc.podControl.DeleteStatefulPod(set, replicas[i]); err != nil {
                return &status, err
            }
            if getPodRevision(replicas[i]) == currentRevision.Name {
                status.CurrentReplicas--
            }
            if getPodRevision(replicas[i]) == updateRevision.Name {
                status.UpdatedReplicas--
            }
            status.Replicas--
            replicas[i] = newVersionedStatefulSetPod(
                currentSet, updateSet,
                currentRevision.Name, updateRevision.Name, i)
        }
        // Create the pod if it has not been created yet
        if !isCreated(replicas[i]) {
            if err := ssc.podControl.CreateStatefulPod(set, replicas[i]); err != nil {
                return &status, err
            }
            status.Replicas++
            if getPodRevision(replicas[i]) == currentRevision.Name {
                status.CurrentReplicas++
            }
            if getPodRevision(replicas[i]) == updateRevision.Name {
                status.UpdatedReplicas++
            }
            // if the set does not allow bursting, return immediately
            if monotonic {
                return &status, nil
            }
            // pod created, no more work possible for this round
            continue
        }
        // If the pod is terminating and parallel mode is not allowed, wait for the deletion to finish
        if isTerminating(replicas[i]) && monotonic {
            klog.V(4).Infof(
                "StatefulSet %s/%s is waiting for Pod %s to Terminate",
                set.Namespace, set.Name, replicas[i].Name)
            return &status, nil
        }
        // If the pod has been created but is not yet running and ready, and parallel mode is not allowed, wait
        if !isRunningAndReady(replicas[i]) && monotonic {
            klog.V(4).Infof(
                "StatefulSet %s/%s is waiting for Pod %s to be Running and Ready",
                set.Namespace, set.Name, replicas[i].Name)
            return &status, nil
        }
        // If the pod was created successfully but is not yet available, wait
        if utilfeature.DefaultFeatureGate.Enabled(features.StatefulSetMinReadySeconds) && !isRunningAndAvailable(replicas[i], set.Spec.MinReadySeconds) && monotonic {
            klog.V(4).Infof(
                "StatefulSet %s/%s is waiting for Pod %s to be Available",
                set.Namespace, set.Name, replicas[i].Name)
            return &status, nil
        }
        // Check that the pod's identity and its storage match the sts
        if identityMatches(set, replicas[i]) && storageMatches(set, replicas[i]) {
            continue
        }
        ...
    }
}</code></pre><p>That concludes the walkthrough.</p>]]></content>
<summary type="html"><blockquote>
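<p>This analysis of <code>Controller</code> source code uses the <code>StatefulSet Controller</code> as its example; the other controllers follow the same pattern, so it can serve as a reference for them.</p>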
</blockquote>
<h1 id="State</summary>
<category term="kubernetes" scheme="http://kiragoo.github.com/categories/kubernetes/"/>
<category term="源码分析" scheme="http://kiragoo.github.com/categories/kubernetes/%E6%BA%90%E7%A0%81%E5%88%86%E6%9E%90/"/>
<category term="kubernetes" scheme="http://kiragoo.github.com/tags/kubernetes/"/>
<category term="Controller" scheme="http://kiragoo.github.com/tags/Controller/"/>
<category term="源码分析" scheme="http://kiragoo.github.com/tags/%E6%BA%90%E7%A0%81%E5%88%86%E6%9E%90/"/>
<category term="StatefulSet" scheme="http://kiragoo.github.com/tags/StatefulSet/"/>
</entry>
<entry>
<title>Kubernetes: Writing a Controller</title>
<link href="http://kiragoo.github.com/archives/a66802dc.html"/>
<id>http://kiragoo.github.com/archives/a66802dc.html</id>
<published>2021-10-14T03:50:53.000Z</published>
<updated>2022-04-21T12:46:07.732Z</updated>
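<content type="html"><![CDATA[<blockquote><p>Translated from <a href="https://github.com/kubernetes/community/blob/8cafef897a22026d42f5e5bb3f104febe7e29830/contributors/devel/controllers.md">Writing Controllers</a></p></blockquote><h1 id="Writing-Controller"><a href="#Writing-Controller" class="headerlink" title="Writing Controller"></a><code>Writing Controller</code></h1><p>A <code>Kubernetes Controller</code> is an active reconciliation process. It watches the desired state of an object, and it also watches the object's actual state. It then sends instructions to try to bring the actual state closer to the desired state.</p><p>The simplest form of this is a <code>loop</code>:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">for</span> {</span><br><span class="line"> desired := getDesiredState()</span><br><span class="line"> current := getCurrentState()</span><br><span class="line"> makeChanges(desired, current)</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h1 id="Guidelines"><a href="#Guidelines" class="headerlink" title="Guidelines"></a><code>Guidelines</code></h1><p>When writing a <code>Controller</code>, the following guidelines help you get the results and performance you are looking for.</p><ol><li><strong>Operate on one item at a time</strong>. If you use <code>workqueue.Interface</code>, you can queue changes for a particular <code>Resource</code> and later <code>pop</code> them in multiple <code>worker gofuncs</code>, with the guarantee that no two <code>gofuncs</code> work on the same item at the same time.</li></ol><p>Many <code>Controllers</code> must react to relationships between multiple <code>Resources</code> (for example, “if Y changes, I need to check X”), but nearly all <code>Controllers</code> can collapse those into a queue of “check this X” based on the <code>relationships</code>. For instance, the <code>ReplicaSet Controller</code> needs to react to a <code>pod</code> being deleted, but it does so by finding the related <code>ReplicaSets</code> and queuing those.</p><ol start="2"><li><strong>Random ordering between <code>Resources</code></strong>. When a controller <code>queues off</code> multiple types of <code>resources</code>, there is no guarantee of ordering among those <code>resources</code>.</li></ol><p>Distinct watches are updated independently. Even with an objective ordering of “create resource A/X” followed by “create resource B/Y”, the <code>Controller</code> may observe <code>create resource B/Y</code> and then <code>create resource A/X</code>.</p><ol start="3"><li><strong>Level driven, not edge driven</strong>. Just like a <code>shell</code> script that is not running all the time, your <code>controller</code> may be off for an indeterminate amount of time before it runs again.</li></ol><p>If an <code>API</code> object appears with a <code>marker</code> value of <code>true</code>, you cannot count on having seen it turn from <code>false</code> to <code>true</code>; you only know that it is <code>true</code> now. Even an <code>API</code> watch suffers from this problem, so do not count on seeing a change unless your <code>controller</code> also records, in the object's <code>status</code>, the information it last based its decision on.</p><ol start="4"><li><strong>Use <code>SharedInformers</code></strong>. <code>SharedInformers</code> provide hooks to receive notifications of adds, updates, and deletes for a particular <code>resource</code>. They also provide convenience functions for accessing shared caches.</li></ol><p>Use the factory methods in <a href="https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/informers/factory.go">https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/informers/factory.go</a> to make sure you share the same cache instance as everyone else.</p><p>This saves connections to the <code>API server</code>, duplicate serialization costs on the server side, and duplicate deserialization and caching costs on the controller side.</p><p>You may see other mechanisms such as <code>reflectors</code> and <code>deltafifos</code> driving controllers. Those are older mechanisms on which <code>SharedInformers</code> were later built. Avoid using them in new controllers.</p><ol start="5"><li><strong>Never mutate original objects</strong>. Caches are shared across controllers, which means that if you mutate your <em>“copy”</em> of an object, you will mess up other controllers as well.</li></ol><p>The most common point of failure is making a <em>“shallow copy”</em> and then mutating a <code>map</code>, such as <code>Annotations</code>.</p><ol start="6"><li><strong>Wait for your secondary caches</strong>. Many controllers have primary and secondary resources. Primary resources are the ones whose <code>Status</code> you will be updating. Secondary resources are the ones you will be managing.</li></ol><p>Use <code>framework.WaitForCacheSync</code> to wait for your secondary caches before starting your primary sync functions.</p><ol start="7"><li><strong>There are other actors in the system</strong>. Just because you have not changed an object does not mean that nobody else has.</li></ol><p>Do not forget that the current state may change at any moment; watching only the desired state is not enough. If you use the absence of objects in the desired state to indicate that things in the current state should be deleted, make sure your observation code has no bugs (for example, acting before your cache has filled).</p><ol start="8"><li><strong>Percolate errors to the top level for consistent requeuing</strong>. We have <code>workqueue.RateLimitingInterface</code> to allow simple requeuing with reasonable backoff.</li></ol><p>Your main controller function should return an <code>error</code> when requeuing is necessary. When it is not, it should call <code>utilruntime.HandleError</code> and return <code>nil</code> instead. This makes it easy for reviewers to inspect the error handling and be confident that the controller does not accidentally lose work it should retry.</p><ol start="9"><li><strong><code>Watches</code> and <code>Informers</code> will “sync”</strong>. Periodically, they deliver every matching object in the cluster to your <code>Update</code> method. This is useful when you may need to take additional action on an object, but sometimes you know there is no more work to do.</li></ol><p>When you are certain that nothing needs requeuing unless there are new changes, you can compare the resource versions of the old and new objects; if they are the same, skip requeuing the work. Be careful when doing this: if you ever skip requeuing an item on failure, it could fail, not be requeued, and then never be retried.</p><ol start="10"><li>If the primary resource your controller reconciles supports <code>ObservedGeneration</code> in its <code>Status</code>, make sure you set it to <code>metadata.Generation</code> whenever the two fields do not match.</li></ol><p>This lets clients know that the controller has processed the resource. Make sure your controller is the main controller responsible for that resource; otherwise, if you need to communicate observation through your own controller, you will have to create a different kind of <code>ObservedGeneration</code> in the resource's <code>Status</code>.</p><ol start="11"><li>Consider using owner references for resources that result in the creation of other resources (for example, a <code>ReplicaSet</code> results in creating <code>Pods</code>). This ensures that child resources are garbage-collected once a resource managed by your controller is deleted. For more details on owner references, see <a href="https://github.com/kubernetes/community/blob/master/contributors/design-proposals/controller-ref.md">here</a>.</li></ol><p>Pay special attention to how you do adoption: you should not adopt children for a resource when either the parent or the children are marked for deletion. If you use a cache for your resources, you will likely need to bypass it with a direct <code>API</code> read when you observe that an owner reference has been updated on one of the children. This ensures that your controller does not race with the garbage collector.</p><p>See <a href="https://github.com/kubernetes/kubernetes/pull/42938">k8s.io/kubernetes/pull/42938</a> for more details.</p><h1 id="Rough-Structure"><a href="#Rough-Structure" class="headerlink" title="Rough Structure"></a><code>Rough Structure</code></h1><p>Overall, a <code>Controller</code> looks roughly like this:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span 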
class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span 
class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">type</span> Controller <span class="keyword">struct</span> {</span><br><span class="line"> <span class="comment">// podLister is secondary cache of pods which is used for object lookups</span></span><br><span class="line"> podLister cache.StoreToPodLister</span><br><span class="line"></span><br><span class="line"> <span class="comment">// queue is where incoming work is placed to de-dup and to allow "easy"</span></span><br><span class="line"> <span class="comment">// rate limited requeues on errors</span></span><br><span class="line"> queue workqueue.RateLimitingInterface</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(c *Controller)</span> <span class="title">Run</span><span class="params">(threadiness <span class="keyword">int</span>, stopCh <span class="keyword">chan</span> <span class="keyword">struct</span>{})</span></span> {</span><br><span class="line"> <span class="comment">// don't let panics crash the process</span></span><br><span class="line"> <span class="keyword">defer</span> utilruntime.HandleCrash()</span><br><span class="line"></span><br><span class="line"> <span class="comment">// make sure the work queue is shutdown which will trigger workers to end</span></span><br><span class="line"> <span class="keyword">defer</span> c.queue.ShutDown()</span><br><span class="line"></span><br><span class="line"> glog.Infof(<span class="string">"Starting <NAME> controller"</span>)</span><br><span class="line"></span><br><span class="line"> <span class="comment">// wait for your secondary caches to fill before starting your work</span></span><br><span class="line"> <span class="keyword">if</span> !framework.WaitForCacheSync(stopCh, c.PodStoreSynced) {</span><br><span class="line"> 
<span class="keyword">return</span></span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="comment">// start up your worker threads based on threadiness. Some controllers</span></span><br><span class="line"> <span class="comment">// have multiple kinds of workers</span></span><br><span class="line"> <span class="keyword">for</span> i := <span class="number">0</span>; i < threadiness; i++ {</span><br><span class="line"> <span class="comment">// runWorker will loop until "something bad" happens. The .Until will</span></span><br><span class="line"> <span class="comment">// then rekick the worker after one second</span></span><br><span class="line"> <span class="keyword">go</span> wait.Until(c.runWorker, time.Second, stopCh)</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="comment">// wait until we're told to stop</span></span><br><span class="line"> <-stopCh</span><br><span class="line"> glog.Infof(<span class="string">"Shutting down <NAME> controller"</span>)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(c *Controller)</span> <span class="title">runWorker</span><span class="params">()</span></span> {</span><br><span class="line"> <span class="comment">// hot loop until we're told to stop. processNextWorkItem will</span></span><br><span class="line"> <span class="comment">// automatically wait until there's work available, so we don't worry </span></span><br><span class="line"> <span class="comment">// about secondary waits</span></span><br><span class="line"> <span class="keyword">for</span> c.processNextWorkItem() {</span><br><span class="line"></span><br><span class="line"> }</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// processNextWorkItem deals with one key off the queue. 
It returns false</span></span><br><span class="line"><span class="comment">// when it's time to quit.</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(c *Controller)</span> <span class="title">processNextWorkItem</span><span class="params">()</span> <span class="title">bool</span></span> {</span><br><span class="line"> <span class="comment">// pull the next work item from queue. It should be a key we use to lookup</span></span><br><span class="line"> <span class="comment">// something in a cache</span></span><br><span class="line"> key, quit := c.queue.Get()</span><br><span class="line"> <span class="keyword">if</span> quit {</span><br><span class="line"> <span class="keyword">return</span> <span class="literal">false</span></span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="comment">// you always have to indicate to the queue that you've completed a piece of </span></span><br><span class="line"> <span class="comment">// work</span></span><br><span class="line"> <span class="keyword">defer</span> c.queue.Done(key)</span><br><span class="line"></span><br><span class="line"> <span class="comment">// do your work on the key. This method will contain your "do stuff" logic</span></span><br><span class="line"> err := c.syncHandler(key.(<span class="keyword">string</span>))</span><br><span class="line"></span><br><span class="line"> <span class="keyword">if</span> err == <span class="literal">nil</span> {</span><br><span class="line"> <span class="comment">// if you had no error, tell the queue to stop tracking history for your</span></span><br><span class="line"> <span class="comment">// key. 
This will reset things like failure counts for per-item rate</span></span><br><span class="line"> <span class="comment">// limiting</span></span><br><span class="line"> c.queue.Forget(key)</span><br><span class="line"> <span class="keyword">return</span> <span class="literal">true</span></span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="comment">// there was a failure so be sure to report it. This method allows for </span></span><br><span class="line"> <span class="comment">// pluggable error handling which can be used for things like</span></span><br><span class="line"> <span class="comment">// cluster-monitoring</span></span><br><span class="line"> utilruntime.HandleError(fmt.Errorf(<span class="string">"%v failed with : %v"</span>, key, err))</span><br><span class="line"></span><br><span class="line"> <span class="comment">// since we failed, we should requeue the item to work on later. This</span></span><br><span class="line"> <span class="comment">// method will add a backoff to avoid hotlooping on particular items</span></span><br><span class="line"> <span class="comment">// (they're probably still not going to work right away) and overall</span></span><br><span class="line"> <span class="comment">// controller protection (everything I've done is broken, this controller</span></span><br><span class="line"> <span class="comment">// needs to calm down or it can starve other useful work) cases.</span></span><br><span class="line"> c.queue.AddRateLimited(key)</span><br><span class="line"></span><br><span class="line"> <span class="keyword">return</span> <span class="literal">true</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure>]]></content>
<summary type="html"><blockquote>
<p>Translated from <a href="https://github.com/kubernetes/community/blob/8cafef897a22026d42f5e5bb3f104febe7e29830/contributors/devel/control</summary>
<category term="Kubernetes" scheme="http://kiragoo.github.com/categories/Kubernetes/"/>
<category term="Controller" scheme="http://kiragoo.github.com/categories/Kubernetes/Controller/"/>
<category term="Kubernetes" scheme="http://kiragoo.github.com/tags/Kubernetes/"/>
<category term="Controller" scheme="http://kiragoo.github.com/tags/Controller/"/>
</entry>
<entry>
<title>Single-Pod Access Mode for PersistentVolumes</title>
<link href="http://kiragoo.github.com/archives/96e8d00f.html"/>
<id>http://kiragoo.github.com/archives/96e8d00f.html</id>
<published>2021-09-28T03:08:05.000Z</published>
<updated>2022-04-21T12:46:07.732Z</updated>
<content type="html"><![CDATA[<blockquote><p>Translated from <a href="https://kubernetes.io/blog/2021/09/13/read-write-once-pod-access-mode-alpha/">Introducing Single Pod Access Mode for PersistentVolumes</a></p></blockquote><h1 id="访问模式及重要意义"><a href="#访问模式及重要意义" class="headerlink" title="访问模式及重要意义"></a>Access modes and why they matter</h1><p>When you use persistent storage, there are several ways to model how that storage is consumed.</p><p>For example, a network file share in a storage system can have many users reading and writing data at the same time. In other cases, every user may be allowed to read data but not write it. For highly sensitive data, perhaps only a single user is allowed to read and write, and nobody else.</p><p>In the <code>Kubernetes</code> world, an <a href="https://kubernetes.io/docs/concepts/storage/persistent-volumes/#access-modes"><code>access mode</code></a> is how you define the way persistent storage is consumed. Access modes are part of the <code>spec</code> of <code>PVs</code> and <code>PVCs</code>.</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">kind:</span> <span class="string">PersistentVolumeClaim</span></span><br><span class="line"><span class="attr">apiVersion:</span> <span class="string">v1</span></span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line"> <span class="attr">name:</span> <span class="string">shared-cache</span></span><br><span class="line"><span class="attr">spec:</span></span><br><span class="line"> <span class="attr">accessModes:</span></span><br><span class="line"> <span class="bullet">-</span> <span class="string">ReadWriteMany</span> <span class="comment"># Allow many pods to access shared-cache simultaneously.</span></span><br><span class="line"> <span class="attr">resources:</span></span><br><span class="line"> <span class="attr">requests:</span></span><br><span class="line"> <span class="attr">storage:</span> <span 
class="string">1Gi</span></span><br></pre></td></tr></table></figure><p>Before <code>v1.22</code>, <code>Kubernetes</code> offered three access modes for <code>PVs</code> and <code>PVCs</code>:</p><ul><li><code>ReadWriteOnce</code> - the <code>volume</code> can be mounted read-write by a single <strong><code>node</code></strong></li><li><code>ReadOnlyMany</code> - the <code>volume</code> can be mounted read-only by many <strong><code>nodes</code></strong></li><li><code>ReadWriteMany</code> - the <code>volume</code> can be mounted read-write by many <strong><code>nodes</code></strong></li></ul><p>These access modes are enforced by <code>Kubernetes</code> components such as <code>kube-controller-manager</code> and <code>kubelet</code>, which ensure that only the appropriate <code>Pods</code> can access a given <code>PersistentVolume</code>.</p><h1 id="新的访问模式及运行原理"><a href="#新的访问模式及运行原理" class="headerlink" title="新的访问模式及运行原理"></a>The new access mode and how it works</h1><p><code>Kubernetes v1.22</code> introduced a fourth access mode for <code>PVs</code> and <code>PVCs</code>:</p><ul><li><code>ReadWriteOncePod</code> - the <code>volume</code> can be mounted read-write by a single <strong><code>Pod</code></strong></li></ul><p>If you create a <code>Pod</code> with a <code>PVC</code> that uses the <code>ReadWriteOncePod</code> access mode, <code>Kubernetes</code> ensures that this <code>Pod</code> is the only <code>Pod</code> in the whole cluster that can read from or write to that <code>PVC</code>.</p><p>If you create another <code>Pod</code> that references the same <code>PVC</code> with this access mode, that <code>Pod</code> will fail to start because the <code>PVC</code> is already in use by another <code>Pod</code>. For example:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">Events:</span><br><span class="line"> Type Reason Age From Message</span><br><span class="line"> ---- ------ ---- ---- -------</span><br><span class="line"> Warning FailedScheduling 1s default-scheduler 0/1 nodes are available: 1 node has pod using PersistentVolumeClaim with the same name and ReadWriteOncePod access mode.</span><br></pre></td></tr></table></figure><h2 id="此访问模式与-ReadWriteOnce-的区别?"><a href="#此访问模式与-ReadWriteOnce-的区别?" 
class="headerlink" title="此访问模式与 ReadWriteOnce 的区别?"></a>How is this different from <code>ReadWriteOnce</code>?</h2><p>The <code>ReadWriteOnce</code> access mode restricts <code>volume</code> access to a single <code>node</code>, which means multiple <code>Pods</code> on the same node can read from and write to the same <code>volume</code>. This can be a major problem for some applications, especially those that require at most one writer for data safety.</p><p>With the <code>ReadWriteOncePod</code> access mode set on a <code>PVC</code>, <code>Kubernetes</code> guarantees that only a single <code>Pod</code> can access it.</p><h1 id="如何使用?"><a href="#如何使用?" class="headerlink" title="如何使用?"></a>How do I use it?</h1><p>The <code>ReadWriteOncePod</code> access mode is <code>alpha</code> in <code>v1.22</code> and is only supported for <code>CSI volumes</code>. First you need to enable the <code>ReadWriteOncePod</code> <a href="https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates">feature gate</a> for <code>kube-apiserver</code>, <code>kube-scheduler</code>, and <code>kubelet</code>. You can do so with the following command line argument:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">--feature-gates=<span class="string">"...,ReadWriteOncePod=true"</span></span><br></pre></td></tr></table></figure><p>You also need to update the following <code>CSI sidecars</code> to these versions or later:</p><ul><li><a href="https://github.com/kubernetes-csi/external-provisioner/releases/tag/v3.0.0">csi-provisioner:v3.0.0+</a></li><li><a href="https://github.com/kubernetes-csi/external-attacher/releases/tag/v3.3.0">csi-attacher:v3.3.0+</a></li><li><a href="https://github.com/kubernetes-csi/external-resizer/releases/tag/v1.3.0">csi-resizer:v1.3.0+</a></li></ul><h2 id="创建-PersistentVolumeClaim"><a href="#创建-PersistentVolumeClaim" class="headerlink" title="创建 PersistentVolumeClaim"></a>Creating a <code>PersistentVolumeClaim</code></h2><p>To use the <code>ReadWriteOncePod</code> access mode for your <code>PVs</code> and <code>PVCs</code>, create a new <code>PVC</code> with the access mode in its configuration:</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">kind:</span> <span class="string">PersistentVolumeClaim</span></span><br><span class="line"><span class="attr">apiVersion:</span> <span class="string">v1</span></span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line"> <span class="attr">name:</span> <span class="string">single-writer-only</span></span><br><span class="line"><span class="attr">spec:</span></span><br><span class="line"> <span class="attr">accessModes:</span></span><br><span class="line"> <span class="bullet">-</span> <span class="string">ReadWriteOncePod</span> <span class="comment"># Allow only a single pod to access single-writer-only.</span></span><br><span class="line"> <span class="attr">resources:</span></span><br><span class="line"> <span class="attr">requests:</span></span><br><span class="line"> <span class="attr">storage:</span> <span class="string">1Gi</span></span><br></pre></td></tr></table></figure><p>If your storage plugin supports <a href="https://kubernetes.io/docs/concepts/storage/dynamic-provisioning/"><code>dynamic provisioning</code></a>, new <code>PersistentVolumes</code> will be created with the <code>ReadWriteOncePod</code> access mode applied.</p><h2 id="迁移已有的-PersistentVolumes"><a href="#迁移已有的-PersistentVolumes" class="headerlink" title="迁移已有的 PersistentVolumes"></a>Migrating existing <code>PersistentVolumes</code></h2><p>If you have existing <code>PersistentVolumes</code>, they can also be migrated to use the <code>ReadWriteOncePod</code> access mode.</p>
<p>In this example, we already have a <code>cat-pictures-pvc</code> <code>PersistentVolumeClaim</code> bound to the <code>cat-pictures-pv</code> <code>PersistentVolume</code>, and a <code>cat-pictures-writer</code> <code>Deployment</code> that uses this <code>PersistentVolumeClaim</code>.</p><p>As a first step, edit <code>spec.persistentVolumeReclaimPolicy</code> on your <code>PersistentVolume</code> and set it to <strong><code>Retain</code></strong>. This ensures the <code>PersistentVolume</code> is not deleted when the corresponding <code>PersistentVolumeClaim</code> is deleted.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">kubectl patch pv cat-pictures-pv -p <span class="string">'{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'</span></span><br></pre></td></tr></table></figure><p>Next, stop any workloads that use the <code>PersistentVolumeClaim</code> bound to the <code>PersistentVolume</code> you want to migrate, and then delete that <code>PersistentVolumeClaim</code>.</p><p>Once that is done, clear the <code>spec.claimRef.uid</code> field of your <code>PersistentVolume</code> so that <code>PersistentVolumeClaims</code> can bind to it again when they are recreated.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">kubectl scale --replicas=0 deployment cat-pictures-writer</span><br><span class="line">kubectl delete pvc cat-pictures-pvc</span><br><span class="line">kubectl patch pv cat-pictures-pv -p <span class="string">'{"spec":{"claimRef":{"uid":""}}}'</span></span><br></pre></td></tr></table></figure><p>After that, replace the access modes of the <code>PersistentVolume</code> with <code>ReadWriteOncePod</code>:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">kubectl patch pv cat-pictures-pv -p <span class="string">'{"spec":{"accessModes":["ReadWriteOncePod"]}}'</span></span><br></pre></td></tr></table></figure><blockquote><p><strong>Note</strong>: The <code>ReadWriteOncePod</code> access mode cannot be combined with other access modes. When updating the <code>PersistentVolume</code>, make sure
<code>ReadWriteOncePod</code> is the only access mode; otherwise the request will fail.</p></blockquote><p>Next, modify your <code>PersistentVolumeClaim</code> so that <code>ReadWriteOncePod</code> is its only access mode. You should also set <code>spec.volumeName</code> in the <code>PersistentVolumeClaim</code> to the name of your <code>PersistentVolume</code>.</p><p>Once all of the above is done, you can recreate your <code>PersistentVolumeClaim</code> and start your workloads again:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># IMPORTANT: Make sure to edit your PVC in cat-pictures-pvc.yaml before applying. You need to:</span></span><br><span class="line"><span class="comment"># - Set ReadWriteOncePod as the only access mode</span></span><br><span class="line"><span class="comment"># - Set spec.volumeName to "cat-pictures-pv"</span></span><br><span class="line"></span><br><span class="line">kubectl apply -f cat-pictures-pvc.yaml</span><br><span class="line">kubectl apply -f cat-pictures-writer-deployment.yaml</span><br></pre></td></tr></table></figure><p>Finally, if you changed it earlier, you may want to edit <code>spec.persistentVolumeReclaimPolicy</code> on your <code>PersistentVolume</code> and set it back to <code>Delete</code>:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">kubectl patch pv cat-pictures-pv -p <span class="string">'{"spec":{"persistentVolumeReclaimPolicy":"Delete"}}'</span></span><br></pre></td></tr></table></figure><p>For more details, see <a href="https://kubernetes.io/docs/tasks/configure-pod-container/configure-persistent-volume-storage/">Configure a Pod to Use a PersistentVolume for Storage</a>.</p>]]></content>
<summary type="html">Introduction and migration example</summary>
<category term="Kubernetes" scheme="http://kiragoo.github.com/categories/Kubernetes/"/>
<category term="PersistentVolumes" scheme="http://kiragoo.github.com/categories/Kubernetes/PersistentVolumes/"/>
<category term="Kubernetes" scheme="http://kiragoo.github.com/tags/Kubernetes/"/>
<category term="PersistentVolumes" scheme="http://kiragoo.github.com/tags/PersistentVolumes/"/>
</entry>
<entry>
<title>Understanding k8s Networking in Depth: How Services Work</title>
<link href="http://kiragoo.github.com/archives/1d9c7e0d.html"/>
<id>http://kiragoo.github.com/archives/1d9c7e0d.html</id>
<published>2021-09-13T02:05:16.000Z</published>
<updated>2023-04-03T08:26:56.200Z</updated>
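<content type="html"><![CDATA[<blockquote><p>Collected and reposted from <a href="https://zhuanlan.zhihu.com/p/404837363">深入理解kubernetes(k8s)网络原理之二-service原理</a></p></blockquote><p>The previous post, <a href="https://kiragoo.github.io/archives/1d9a37c9.html">Understanding k8s Networking in Depth: Connecting a POD to the Host</a>, covered how a <code>POD</code> talks to its host and how a <code>POD</code> reaches external networks.</p><h2 id="Linux网络基础知识"><a href="#Linux网络基础知识" class="headerlink" title="Linux网络基础知识"></a><code>Linux</code> networking basics</h2><h3 id="netfilter"><a href="#netfilter" class="headerlink" title="netfilter"></a><code>netfilter</code></h3><p>The <code>netfilter</code> subsystem has five key hook points:</p><ul><li><code>PREROUTING</code>: packets pass this point as soon as they arrive; it is typically used for DNAT.</li><li><code>INPUT</code>: packets pass this point before entering the local transport layer; it is typically used for inbound firewall checks.</li><li><code>FORWARD</code>: packets pass this point when they are forwarded through this host; it is typically used for firewall forward filtering.</li><li><code>OUTPUT</code>: packets originating from this host pass this point on their way out; it is typically used for DNAT and outbound firewall checks.</li><li><code>POSTROUTING</code>: packets pass this point just before leaving the host; it is typically used for SNAT.</li></ul><p><img src="https://kblogs.oss-cn-beijing.aliyuncs.com/blogimgs/netfilter.drawio.png" alt="netfilter"><br>1. Points traversed when an application on the host receives a packet from outside: PREROUTING -> INPUT</p><p>2. Points traversed when an application on the host sends a packet to the outside: OUTPUT -> POSTROUTING</p><p>3. Points traversed when a <code>POD</code> on the host sends a packet to the outside or to another <code>POD</code> on the same host: PREROUTING -> FORWARD -> POSTROUTING</p><blockquote><p>A <code>POD</code> running on the host is just another process on that host, but its outgoing packets take a different path from those of other host processes. Because the <code>POD</code> lives in its own <code>NS</code>, its packets follow the same path as traffic that the host receives from another machine and then forwards.</p></blockquote><p>4. <strong>Note the <code>IPIsLocal</code>? decision in the diagram: if the packet's destination <code>IP</code> is a local <code>IP</code>, the packet goes to <code>INPUT</code>; otherwise the kernel checks <code>net.ipv4.ip_forward</code>, sending the packet to <code>FORWARD</code> if it is 1 and dropping it if it is 0.</strong></p><h3 id="iptables-基础知识"><a href="#iptables-基础知识" class="headerlink" title="iptables 基础知识"></a><code>iptables</code> basics</h3><p>A first look at <code>iptables</code></p><p><code>iptables -A INPUT -t filter -s 192.168.1.10 -j DROP</code><br>This rule blocks packets whose source <code>ip</code> is <code>192.168.1.10</code> from reaching services on this host.</p><p>The command in detail:</p><ul><li>-A appends a rule to the end of the chain; the other operations are:<ul><li>-I inserts a rule at the front, giving it higher priority</li><li>-D deletes a rule</li><li>-N creates a new chain</li><li>-F flushes all rules in a chain, or in all chains</li><li>-X deletes a user-defined chain</li><li>-P changes the default policy of a chain</li><li>-L lists the rules in the given chain</li></ul></li>
<li>Firewall rules generally work only at the three hook points <code>INPUT/OUTPUT/FORWARD</code></li><li>-t specifies the table the command operates on; the main ones are:<ul><li><code>filter</code> table: used to block or allow packets without modifying them; it is the default when no table is given</li><li><code>nat</code> table: used to rewrite the source/destination address of <code>ip</code> packets</li><li><code>mangle</code> table: used to mark packets</li><li><code>raw</code> table, #TODO</li><li><code>security</code> table, #TODO</li></ul></li><li>-s is a match condition. A rule can have one or more conditions, and multiple conditions are ANDed together. Here -s matches the source address; other conditions include:<ul><li>-d matches the destination address</li><li>--sport matches the source port</li><li>--dport matches the destination port</li><li>-p tcp matches the protocol type</li></ul></li><li>-j <code>DROP</code> is the action to take; here it jumps (<code>jump</code>) to the <code>DROP</code> target. <code>iptables</code> has several predefined targets:<ul><li><code>DROP</code> drops the packet</li><li><code>ACCEPT</code> accepts the packet</li><li><code>RETURN</code> returns to the calling chain</li><li><code>SNAT</code> source address translation; the translated source address must be specified</li><li><code>DNAT</code> destination address translation; the translated destination address must be specified</li><li><code>MASQUERADE</code> performs source address translation like <code>SNAT</code>, but without specifying the translated address; it automatically uses the address of the outgoing interface as the source. This target is what is usually used for <code>SNAT</code>.</li></ul></li></ul><p>Advanced examples:</p><ul><li>Rewrite the destination of packets that local applications send to 10.96.0.100 so that they go to 10.244.3.10. Note that to affect local applications, the rule has to be attached to the <code>OUTPUT</code> chain:</li></ul><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">iptables -A OUTPUT -t nat -d 10.96.0.100 -j DNAT --to-destination 10.244.3.10</span><br></pre></td></tr></table></figure><ul><li>The rule above only affects traffic sent by applications on the host; it has no effect on traffic sent by <code>PODs</code> on the host. To affect local <code>PODs</code> as well, add one more rule that is identical except that it is attached to the <code>PREROUTING</code> chain.</li></ul><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">iptables -A PREROUTING -t nat -d 10.96.0.100 -j DNAT --to-destination 10.244.3.10</span><br></pre></td></tr></table></figure><ul><li>Masquerade the source address of outgoing packets whose source <code>ip</code> is 172.20.1.10. Note that the source address can only be rewritten at one hook point, <code>POSTROUTING</code>. The rule below is the one used to give a <code>POD</code> access to external networks:</li></ul><figure class="highlight shell"><table><tr><td class="gutter"><pre><span 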
class="line">1</span><br></pre></td><td class="code"><pre><span class="line">iptables -A POSTROUTING -t nat -s 172.20.1.10 -j MASQUERADE</span><br></pre></td></tr></table></figure><ul><li>Allow source <code>IP</code> 192.168.8.166 to access <code>TCP</code> port 80 on this host:</li></ul><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">iptables -A INPUT -t filter -s 192.168.8.166 -p tcp --dport 80 -j ACCEPT</span><br></pre></td></tr></table></figure><blockquote><p>There are some restrictions on where <code>iptables</code> rules can be created. For example, a <code>DNAT</code> rule cannot be created on the <code>POSTROUTING</code> chain. Before <code>POSTROUTING</code>, the packet goes through the routing decision, in which the kernel picks the most suitable egress for the current destination. Rules on the <code>POSTROUTING</code> chain run after that decision, so changing the destination there would leave the packet undeliverable.</p></blockquote><h3 id="K8S-的-Service-设计"><a href="#K8S-的-Service-设计" class="headerlink" title="K8S 的 Service 设计"></a>The design of the <code>K8S</code> <code>Service</code></h3><p>It exists mainly for two reasons:</p><ul><li>A <code>pod</code> is created and destroyed quickly, so the <code>ip</code> of a <code>pod</code> is not stable. Callers need something stable to depend on, so a <code>VIP</code> is needed to represent the service.</li><li>For high availability an application usually runs as multiple <code>PODs</code>, and a <code>VIP</code> is needed to balance traffic across those <code>pods</code>.</li></ul><h4 id="service-几种类型的使用场景"><a href="#service-几种类型的使用场景" class="headerlink" title="service 几种类型的使用场景"></a>Use cases for the <code>service</code> types</h4><ul><li><code>clusterIP</code>: reachable only from cluster nodes and <code>pods</code>; it solves access between applications inside the cluster.</li><li><code>nodeport</code>: exposes <code>pods</code> outside the cluster through a node's address and port, so that applications outside the cluster can reach applications inside it. A <code>nodeport</code> service adds a port forward on the nodes on top of <code>clusterIP</code>, so a <code>nodeport</code> service also has a <code>clusterIP</code>.</li><li><code>loadBalancer</code>: with <code>nodeport</code>, the caller has to hard-code the <code>IP</code> of one cluster node, which is not highly available. A third-party load balancer can instead forward to the <code>nodeport</code> on multiple nodes. <code>loadBalancer</code> creates an <code>lb</code> on top of <code>nodeport</code>, so it also allocates a <code>clusterIP</code> first and then sets up the port forward on the nodes.</li><li><code>headless</code>: for replicas of an application that need to reach each other directly, for example a <code>mysql</code> primary/replica deployment in which a replica needs to find the primary.</li><li><code>externalName</code>: equivalent to a <code>cname</code> record in <code>coredns</code></li></ul><blockquote><p>In <code>iptables</code> mode a <code>clusterIP</code> cannot be reached with <code>ping</code>. <code>kube-proxy</code> only forwards traffic that exactly matches <code>ip</code> + port + protocol, which is why a <code>clusterIP</code> does not answer <code>ping</code>.</p></blockquote><p><strong><code>hairpin flow</code> scenario: a <code>pod</code> accesses itself through its <code>clusterIP</code></strong></p><blockquote><p>Recommended reading, a very long and detailed article: <a href="https://mp.weixin.qq.com/s/Dgv5BU9YU0tuSMxtzcuiVw">an in-depth explanation of <code>iptables</code></a></p></blockquote>]]></content>
<summary type="html"><blockquote>
<p>Collected and reposted from <a href="https://zhuanlan.zhihu.com/p/404837363">深入理解kubernetes(k8s)网络原理之二-service原理</a></p>
</blockquote>
<p>在<a hr</summary>
<category term="kubernetes" scheme="http://kiragoo.github.com/categories/kubernetes/"/>
<category term="network" scheme="http://kiragoo.github.com/categories/kubernetes/network/"/>
<category term="kubernetes" scheme="http://kiragoo.github.com/tags/kubernetes/"/>
<category term="iptables" scheme="http://kiragoo.github.com/tags/iptables/"/>
<category term="network" scheme="http://kiragoo.github.com/tags/network/"/>
<category term="service" scheme="http://kiragoo.github.com/tags/service/"/>
</entry>
<entry>
<title>Understanding k8s Networking in Depth: Connecting a POD to the Host</title>
<link href="http://kiragoo.github.com/archives/1d9a37c9.html"/>
<id>http://kiragoo.github.com/archives/1d9a37c9.html</id>
<published>2021-09-03T07:58:47.000Z</published>
<updated>2022-04-21T12:46:07.735Z</updated>
<content type="html"><![CDATA[<blockquote><p>Reposted from <a href="https://zhuanlan.zhihu.com/p/403856388">深入理解kubernetes(k8s)网络原理之一-pod连接主机</a></p></blockquote><h1 id="关于Linux网络的知识"><a href="#关于Linux网络的知识" class="headerlink" title="关于Linux网络的知识"></a>Background on <code>Linux</code> networking</h1><h2 id="向外发送一个数据包,执行步骤:"><a href="#向外发送一个数据包,执行步骤:" class="headerlink" title="向外发送一个数据包,执行步骤:"></a>Steps when sending a packet out</h2><p>1. Look up the route for the packet's destination. If the route is directly connected, look up the destination's <code>MAC</code> address in the neighbor table.<br>2. If the route is not directly connected, look up the next hop's <code>MAC</code> address in the neighbor table.<br>3. If no matching route is found, report <code>"network is unreachable"</code>.<br>4. If the neighbor table has no entry for the required <code>MAC</code> address, send an <code>ARP</code> request to ask for it.<br>5. In the outgoing frame, the source <code>MAC</code> is the <code>MAC</code> address of the sending NIC and the destination <code>MAC</code> is that of the next hop. As long as no <code>NAT</code> is involved, the source and destination <code>IP</code> stay the same end to end, while the <code>MAC</code> addresses change at every hop.</p><h2 id="收到数据帧,执行步骤"><a href="#收到数据帧,执行步骤" class="headerlink" title="收到数据帧,执行步骤"></a>Steps when receiving a frame</h2><p>1. If the frame's destination <code>MAC</code> address is neither the <code>MAC</code> of the receiving NIC nor the <code>ARP</code> broadcast address, and the NIC is not in promiscuous mode, the frame is rejected.<br>2. If the frame's destination <code>MAC</code> is <code>ff:ff:ff:ff:ff:ff</code>, the frame enters <code>ARP</code> request handling.<br>3. If the frame's destination <code>MAC</code> address is the <code>MAC</code> of the receiving NIC and the frame carries an <code>IP</code> packet, then:<br> 1. If the destination <code>IP</code> address is local, the packet is passed up to the upper-layer protocol for further processing.<br> 2. If the destination <code>IP</code> address is not local, check whether <code>net.ipv4.ip_forward</code> is 1; if it is, look up the route for the destination <code>IP</code> and forward the packet.<br> 3. If the destination <code>IP</code> is not local and <code>net.ipv4.ip_forward</code> is 0, the packet is dropped.</p><h2 id="常见命令"><a href="#常见命令" class="headerlink" title="常见命令"></a>Common commands</h2><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#</span><span class="bash"> show network interfaces</span></span><br><span class="line">ip link </span><br><span class="line"><span class="meta">#</span><span class="bash"> show interface ip addresses</span></span><br><span class="line">ip addr</span><br><span class="line"><span class="meta">#</span><span class="bash"> show the neighbor table</span></span><br><span class="line">ip neigh</span><br><span class="line"><span class="meta">#</span><span class="bash"> dump all iptables rules</span></span><br><span class="line">iptables-save</span><br></pre></td></tr></table></figure><blockquote><p>Container technology emerged so that many processes can run efficiently without interfering with each other, and <code>Docker</code> is the most popular example:<br>1. Resource isolation: <code>linux control group</code> handles the allocation of <code>CPU</code>, <code>Memory</code> and <code>io</code> among processes<br>2. Network isolation: <code>linux network namespace</code> lets each process run in its own network namespace<br>3. Filesystem isolation: <code>union fs</code> lets each process run in its own root filesystem</p></blockquote>
<p><strong>A <code>POD</code> is a group of containers that share the same <code>ns</code></strong></p><h1 id="示例"><a href="#示例" class="headerlink" title="示例"></a>Example</h1><p>Whenever <code>docker</code> runs a container it creates a new <code>ns</code> for that container, so containers can only reach each other through their <code>ip</code> addresses.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">docker run -itd --name=pause busybox</span><br><span class="line">docker run --name=nginx -d nginx </span><br></pre></td></tr></table></figure><p>To reach <code>nginx</code> from <code>pause</code>, first look up the <code>ip</code> address of the <code>nginx</code> container:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">docker inspect nginx | grep IPAddress</span><br><span class="line"> "SecondaryIPAddresses": null,</span><br><span class="line"> "IPAddress": "172.17.0.3",</span><br><span class="line"> "IPAddress": "172.17.0.3",</span><br></pre></td></tr></table></figure><p>Then access it from the <code>pause</code> container using the <code>ip</code> address just found:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">docker exec -it pause curl 172.17.0.3</span><br><span class="line"></span><br><span class="line"><!DOCTYPE html></span><br><span class="line"><html></span><br><span class="line"><head></span><br><span class="line"><title>Welcome to nginx!</title></span><br><span class="line">....</span><br><span class="line"></body></span><br><span class="line"></html></span><br></pre></td></tr></table></figure><p>Alternatively, the <code>nginx</code> container can join the <code>ns</code> of the <code>pause</code> container. The following commands simulate this:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">docker run -itd --name=pause busybox</span><br><span class="line">docker run --name=nginx --network=container:pause -d nginx</span><br></pre></td></tr></table></figure><p>The <code>pause</code> and <code>nginx</code> containers are now in the same <code>ns</code> and can reach each other via <code>localhost</code>, which you can verify with the following command:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">docker exec -it pause curl localhost</span><br></pre></td></tr></table></figure><p><em>The <code>pause</code> and <code>nginx</code> containers are two containers sharing one <code>ns</code>, so together they amount to a <code>k8s</code> <code>pod</code></em></p><blockquote><p>If you run <code>docker ps</code> on a node of a <code>k8s</code> cluster, you will always find a bunch of containers named <code>pause</code>. <code>pause</code> exists to provide the shared <code>ns</code> for the application containers.</p></blockquote><p>1. Enter the <code>ns</code> of the <code>pause</code> container created by <code>docker</code>. First get the <code>pid</code> of the <code>pause</code> container:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br></pre></td><td class="code"><pre><span class="line">docker inspect pause | grep Pid</span><br></pre></td></tr></table></figure><p>2. Enter the <code>ns</code> of that <code>pid</code> (replace <code>3083138</code> with the <code>pid</code> returned by the previous command):</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">nsenter --net=/proc/3083138/ns/net</span><br></pre></td></tr></table></figure><p>3. The shell is now inside the <code>ns</code> of the <code>pause</code> container, where the interfaces, routing table, neighbor table and so on of that <code>ns</code> can be inspected:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">ip addr show</span><br><span class="line">ip route</span><br></pre></td></tr></table></figure><h1 id="认识ns"><a href="#认识ns" class="headerlink" title="认识ns"></a>Understanding <code>ns</code></h1><p>The main pieces of configuration that affect networking are:</p><ul><li>Network interfaces: initialized at boot; virtual devices can be added later</li><li>Ports: 1 to 65535, shared by all processes</li><li><code>iptables</code> rules: the firewall policy and <code>NAT</code> rules for traffic entering and leaving the host</li><li>Routing table: routes to destination addresses</li><li>Neighbor table: the mapping between the <code>MAC</code> and <code>IP</code> addresses of the other hosts on the same layer 2 network as this host</li></ul><h2 id="示例-1"><a href="#示例-1" class="headerlink" title="示例"></a>Example</h2><p>1. Create a new <code>ns</code></p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">ip netns add ns1</span><br></pre></td></tr></table></figure><p>Commands can then be prefixed with <code>ip netns exec ns1</code>, so that their output reflects the network configuration of <code>ns1</code>.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">ip netns exec ns1 ip link show</span><br></pre></td></tr></table></figure><p>2. Let the host and the <code>pod</code> reach each other<br>First give <code>ns1</code> an interface that is connected to the host. This uses the <code>linux</code> virtual network device <code>veth</code>, a pair of interfaces; a <code>veth</code> pair can be thought of as two network cards joined by a cable:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#</span><span class="bash"> Create a veth pair named ns1-eth0 and veth-ns1</span></span><br><span class="line">ip link add ns1-eth0 type veth peer name veth-ns1</span><br><span class="line"><span class="meta">#</span><span class="bash"> Move one end into the ns1 created above and leave the other end on the host, which connects the host and the ns</span></span><br><span class="line">ip link set ns1-eth0 netns ns1</span><br><span class="line"><span class="meta">#</span><span class="bash"> Bring up the host-side interface veth-ns1</span></span><br><span class="line">ip link set veth-ns1 up</span><br><span class="line"><span class="meta">#</span><span class="bash"> Assign an ip address to the interface inside ns1</span></span><br><span class="line">ip netns exec ns1 ip addr add 172.20.1.10/24 dev ns1-eth0</span><br><span class="line"><span class="meta">#</span><span class="bash"> Bring up the ns1-side interface ns1-eth0</span></span><br><span class="line">ip netns exec ns1 ip link set ns1-eth0 up</span><br></pre></td></tr></table></figure><p>3. Test whether the host <code>ip</code> responds to <code>ping</code> (replace <code>xxx.xxx.xxx.xxx</code> with the host's address)</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">ip netns exec ns1 ping xxx.xxx.xxx.xxx</span><br></pre></td></tr></table></figure><p>The <code>ping</code> to the host fails because there is no route to the destination, so add a default route to <code>ns1</code>:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">ip netns exec ns1 ip route add default via 172.20.1.1
dev ns1-eth0</span><br></pre></td></tr></table></figure><p>Check the routing table with:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">ip netns exec ns1 ip route</span><br></pre></td></tr></table></figure><p>A <code>ping</code> at this point still fails. The reason: <em>for a route that is not directly connected, the kernel first has to resolve the <code>mac</code> address of the next hop. The next hop is <code>172.20.1.1</code>; can its <code>MAC</code> address be resolved?</em><br>Check the neighbor table with the following command:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">ip netns exec ns1 ip neigh</span><br></pre></td></tr></table></figure><p>It cannot be resolved, because the gateway <code>ip</code> address is not configured on any interface. The gateway <code>IP</code> never appears in the packets the <code>pod</code> sends; what is really needed is the gateway's <code>mac</code> address. The goal is therefore to obtain the <code>mac</code> address of <code>veth-ns1</code> on the host side, and there are two ways to do that:<br>Option 1: enable <code>arp</code> proxying on the peer interface. The peer of <code>ns1-eth0</code> is the <code>veth-ns1</code> interface on the host.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#</span><span class="bash"> Enable proxy ARP on veth-ns1: it now answers ARP requests with its own MAC address for any target IP the host can route through another interface</span></span><br><span class="line">echo 1 > /proc/sys/net/ipv4/conf/veth-ns1/proxy_arp </span><br></pre></td></tr></table></figure><p>Option 2: configure the gateway address on the peer interface.<br>4. The gateway's <code>mac</code> address can now be resolved, but <code>ping</code> still fails, because the host has no directly connected route to the <code>pod</code>:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">ip route add 172.20.1.10 dev veth-ns1</span><br></pre></td></tr></table></figure><p>This only makes the host and the <code>pod</code> reachable from each other; the <code>pod</code> still cannot reach the external network. That requires source address translation, so a source NAT rule for the newly created <code>pod</code> also has to be configured on the host.</p><h2 id="pod访问外网"><a href="#pod访问外网" class="headerlink" title="pod访问外网"></a><code>pod</code> access to the external network</h2><ul><li>First, enable <code>ip</code> forwarding on the host<figure class="highlight 
shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">echo 1 > /proc/sys/net/ipv4/ip_forward</span><br></pre></td></tr></table></figure></li><li>Add the <code>snat</code> rule<figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">iptables -A POSTROUTING -t nat -s 172.20.1.10 -j MASQUERADE</span><br></pre></td></tr></table></figure>The <code>pod</code> can now <code>ping</code> Baidu.</li></ul>]]></content>
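  <summary type="html">Reviewing the old to learn the new: I have recently been working on the design and development of an `Operator` for our company's `MQ` product, and it has been a long time since I last updated this blog.</summary>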
<category term="kubernetes" scheme="http://kiragoo.github.com/categories/kubernetes/"/>
<category term="network" scheme="http://kiragoo.github.com/categories/kubernetes/network/"/>
<category term="kubernetes" scheme="http://kiragoo.github.com/tags/kubernetes/"/>
<category term="network" scheme="http://kiragoo.github.com/tags/network/"/>
</entry>
<entry>
  <title>A First Look at Istio: Follow-up Notes</title>
<link href="http://kiragoo.github.com/archives/ffa85151.html"/>
<id>http://kiragoo.github.com/archives/ffa85151.html</id>
<published>2021-05-12T10:33:02.000Z</published>
<updated>2023-04-03T08:26:55.906Z</updated>
  <content type="html"><![CDATA[<p>After deploying the basic in-cluster components in <a href="https://kiragoo.github.io/archives/b47cf59d.html">A First Look at Istio on docker desktop</a>, we have a rough picture of how things fit together. That post did not go into much detail on how to expose services externally, so this one covers <em>how to open an application service to the outside</em>.</p><h1 id="对外开放应用程序"><a href="#对外开放应用程序" class="headerlink" title="对外开放应用程序"></a>Opening the application to the outside</h1><p>First check the current state:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">$ kubectl get pods</span><br><span class="line">NAME                              READY   STATUS    RESTARTS   AGE</span><br><span class="line">details-v1-79c697d759-smfhp       2/2     Running   0          5m56s</span><br><span class="line">productpage-v1-65576bb7bf-k6797   2/2     Running   0          5m54s</span><br><span class="line">ratings-v1-7d99676f7f-xjr74       2/2     Running   0          5m55s</span><br><span class="line">reviews-v1-987d495c-2hqcg         2/2     Running   0          5m55s</span><br><span class="line">reviews-v2-6c5bf657cf-s4db5       2/2     Running   0          5m56s</span><br><span class="line">reviews-v3-5f7b9f4f77-7rt5t       2/2     Running   0          5m55s</span><br><span class="line">$ kubectl <span class="built_in">exec</span> <span class="string">"<span class="subst">$(kubectl get pod -l app=ratings -o jsonpath='{.items[0].metadata.name}')</span>"</span> -c ratings -- curl -s productpage:9080/productpage | grep -o <span class="string">"<title>.*</title>"</span></span><br><span class="line"></span><br><span class="line"><title>Simple Bookstore App</title></span><br></pre></td></tr></table></figure><p>Make sure every service is in a healthy state.</p><p>At this point the <code>BookInfo</code> application is deployed but cannot yet be reached from outside. To open it up, you need to create an <a 
href="https://kiragoo.github.io/archives/c3a53ddf.html"><code>Istio</code> ingress gateway (<code>Ingress Gateway</code>)</a>, which maps a path to a route at the edge of the mesh.</p><blockquote><p>If the basic concepts are still unclear, see <a href="https://kiragoo.github.io/archives/6a996a50.html">Istio-Docs-Concepts</a></p></blockquote><ol><li>Associate the application with the <code>Istio</code> gateway:<br>First look at the contents of <code>bookinfo-gateway.yaml</code><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br></pre></td><td class="code"><pre><span class="line">apiVersion: networking.istio.io/v1alpha3</span><br><span class="line">kind: Gateway</span><br><span class="line">metadata:</span><br><span class="line">  name: bookinfo-gateway</span><br><span class="line">spec:</span><br><span 
class="line">  selector:</span><br><span class="line">    istio: ingressgateway # use istio default controller</span><br><span class="line">  servers:</span><br><span class="line">  - port:</span><br><span class="line">      number: 80</span><br><span class="line">      name: http</span><br><span class="line">      protocol: HTTP</span><br><span class="line">    hosts:</span><br><span class="line">    - "*"</span><br><span class="line">---</span><br><span class="line">apiVersion: networking.istio.io/v1alpha3</span><br><span class="line">kind: VirtualService</span><br><span class="line">metadata:</span><br><span class="line">  name: bookinfo</span><br><span class="line">spec:</span><br><span class="line">  hosts:</span><br><span class="line">  - "*"</span><br><span class="line">  gateways:</span><br><span class="line">  - bookinfo-gateway</span><br><span class="line">  http:</span><br><span class="line">  - match:</span><br><span class="line">    - uri:</span><br><span class="line">        exact: /productpage</span><br><span class="line">    - uri:</span><br><span class="line">        prefix: /static</span><br><span class="line">    - uri:</span><br><span class="line">        exact: /login</span><br><span class="line">    - uri:</span><br><span class="line">        exact: /logout</span><br><span class="line">    - uri:</span><br><span class="line">        prefix: /api/v1/products</span><br><span class="line">    route:</span><br><span class="line">    - destination:</span><br><span class="line">        host: productpage</span><br><span class="line">        port:</span><br><span class="line">          number: 9080</span><br></pre></td></tr></table></figure><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">$ kubectl apply -f samples/bookinfo/networking/bookinfo-gateway.yaml</span><br><span class="line">gateway.networking.istio.io/bookinfo-gateway created</span><br><span 
class="line">virtualservice.networking.istio.io/bookinfo created</span><br></pre></td></tr></table></figure></li><li>Make sure the configuration has no problems:<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ istioctl analyze</span><br><span class="line">✔ No validation issues found when analyzing namespace: default.</span><br></pre></td></tr></table></figure><h2 id="确定入站-IP-和端口"><a href="#确定入站-IP-和端口" class="headerlink" title="确定入站 IP 和端口"></a>Determining the ingress <code>IP</code> and port</h2>Set two variables for accessing the gateway: <code>INGRESS_HOST</code> and <code>INGRESS_PORT</code>. <em>My environment is a <code>Mac</code> with <code>k8s</code> deployed through <code>docker desktop</code></em>, so the steps below are the ones that apply to that environment.</li></ol><p>Run the following command to find out whether your <code>Kubernetes</code> cluster supports an external load balancer:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">$ kubectl get svc istio-ingressgateway -n istio-system</span><br><span class="line"></span><br><span class="line">NAME                   TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                                                                      AGE</span><br><span class="line">istio-ingressgateway   LoadBalancer   10.106.247.46   localhost     15021:30307/TCP,80:30353/TCP,443:30695/TCP,31400:30328/TCP,15443:32035/TCP   28m</span><br></pre></td></tr></table></figure><p>If the <code>EXTERNAL-IP</code> value is set, your environment has an external load balancer that can be used as the ingress gateway. If the <code>EXTERNAL-IP</code> value is <<code>none</code>> (or stays at <<code>pending</code>>), your environment does not provide an external load balancer for the ingress gateway. In that case you can still reach the gateway through the node port of the service (<code>Service</code>).</p><ul><li><p><strong>My environment does have an external load balancer, so I continue along this path.</strong><br>Set the ingress <code>IP</code> address and ports</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span 
class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">$ <span class="built_in">export</span> INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath=<span class="string">'{.status.loadBalancer.ingress[0].ip}'</span>)</span><br><span class="line"></span><br><span class="line">$ <span class="built_in">export</span> INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath=<span class="string">'{.spec.ports[?(@.name=="http2")].port}'</span>)</span><br><span class="line"></span><br><span class="line">$ <span class="built_in">export</span> SECURE_INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath=<span class="string">'{.spec.ports[?(@.name=="https")].port}'</span>)</span><br></pre></td></tr></table></figure><blockquote><p>In some environments the load balancer is exposed through a host name instead of an <code>IP</code> address. In that case the <code>EXTERNAL-IP</code> value of the ingress gateway is a host name rather than an <code>IP</code> address, and the command above leaves <code>INGRESS_HOST</code> empty. Use the following command to correct the value of <code>INGRESS_HOST</code>. <strong>This is what happens in my environment.</strong></p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ <span class="built_in">export</span> INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath=<span class="string">'{.status.loadBalancer.ingress[0].hostname}'</span>)</span><br></pre></td></tr></table></figure></blockquote></li><li><p><strong>If there is no load balancer, use a node port instead</strong><br>Set the ingress ports:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ <span class="built_in">export</span> INGRESS_PORT=$(kubectl -n istio-system get 
service istio-ingressgateway -o jsonpath=<span class="string">'{.spec.ports[?(@.name=="http2")].nodePort}'</span>)</span><br><span class="line">$ <span class="built_in">export</span> SECURE_INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath=<span class="string">'{.spec.ports[?(@.name=="https")].nodePort}'</span>)</span><br></pre></td></tr></table></figure></li></ul><p><code>GKE</code>:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ <span class="built_in">export</span> INGRESS_HOST=workerNodeAddress</span><br></pre></td></tr></table></figure><p>You need to create a firewall rule that allows <code>TCP</code> traffic to the <code>ingressgateway</code>. Run the commands below to allow traffic to the <code>HTTP</code> port, the <code>HTTPS</code> port, or both.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ gcloud compute firewall-rules create allow-gateway-http --allow <span class="string">"tcp:<span class="variable">$INGRESS_PORT</span>"</span></span><br><span class="line">$ gcloud compute firewall-rules create allow-gateway-https --allow <span class="string">"tcp:<span class="variable">$SECURE_INGRESS_PORT</span>"</span></span><br></pre></td></tr></table></figure><p><code>IBM Cloud Kubernetes Service</code>:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ ibmcloud ks workers --cluster cluster-name-or-id</span><br><span class="line">$ <span class="built_in">export</span> INGRESS_HOST=public-IP-of-one-of-the-worker-nodes</span><br></pre></td></tr></table></figure><p><code>Docker For Desktop</code>:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span 
class="line">$ <span class="built_in">export</span> INGRESS_HOST=127.0.0.1</span><br></pre></td></tr></table></figure><p><code>Other environments</code>:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ <span class="built_in">export</span> INGRESS_HOST=$(kubectl get po -l istio=ingressgateway -n istio-system -o jsonpath=<span class="string">'{.items[0].status.hostIP}'</span>)</span><br></pre></td></tr></table></figure><ol><li>Set the environment variable <code>GATEWAY_URL</code>:<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ <span class="built_in">export</span> GATEWAY_URL=<span class="variable">$INGRESS_HOST</span>:<span class="variable">$INGRESS_PORT</span></span><br></pre></td></tr></table></figure></li><li>Make sure the <code>IP</code> address and port were both assigned to the environment variable:<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ <span class="built_in">echo</span> <span class="variable">$GATEWAY_URL</span></span><br><span class="line">localhost:80</span><br></pre></td></tr></table></figure><h2 id="验证外部访问"><a href="#验证外部访问" class="headerlink" title="验证外部访问"></a>Verifying external access</h2>Open the product page of the <code>Bookinfo</code> application in a browser to verify that <code>Bookinfo</code> is reachable from outside.</li><li>Run the following command to get the external address of the <code>Bookinfo</code> application.<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ <span class="built_in">echo</span> <span class="string">"http://<span class="variable">$GATEWAY_URL</span>/productpage"</span></span><br><span 
class="line">http://localhost:80/productpage</span><br></pre></td></tr></table></figure></li><li>Copy the address printed by the command above into a browser and confirm that the product page of the <code>Bookinfo</code> application opens.<br><img src="https://kblogs.oss-cn-beijing.aliyuncs.com/blogimgs/bookinfo.png" alt="bookinfo"></li></ol><p>That is all for this post; more <code>Istio</code> articles will follow.</p><blockquote><p>Reference: <a href="https://istio.io/latest/zh/docs/setup/getting-started/">Getting Started</a></p></blockquote>]]></content>
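  <summary type="html"><p>After deploying the basic in-cluster components in <a href="https://kiragoo.github.io/archives/b47cf59d.html">A First Look at Istio on docker desktop</a>, we have a rough picture of how things fit together. That post did not go into much detail on how to expose services externally, so this</summary>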
<category term="kubernetes" scheme="http://kiragoo.github.com/categories/kubernetes/"/>
<category term="istio" scheme="http://kiragoo.github.com/categories/kubernetes/istio/"/>
<category term="kubernetes - istio" scheme="http://kiragoo.github.com/tags/kubernetes-istio/"/>
</entry>
</feed>