HanyangTechAI · Thxios · May 18, 2022
diff --git a/week5/5주차_이지명.md b/week5/5주차_이지명.md
@@ -0,0 +1,89 @@
+22 이지명 HAI assignment
+
+-----
+
+<br>
+
+# Week 5 paper
+
+Bochkovskiy, A., Wang, C. Y., & Liao, H. Y. M. (2020). YOLOv4: Optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934.
+
+<br>
+
+---
+
+<br>
+
+# Presentation contents (YOLO v1 ~ v4)
+
+## YOLO v1
+
+- A 1-stage model proposed for real-time object detection
+
+- Splits the image into a grid of cells; for each grid cell, outputs a vector that predicts the bounding boxes centered in that cell and the class they belong to
+  - Each grid cell predicts one class and 2 bounding boxes
+  - Each cell's vector consists of class predictions $c_1, c_2, \dots, c_{20}$ and 2 sets of bounding box predictions $p, x, y, w, h$
+
+
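+- Non-max suppression (NMS) is applied among predicted bounding boxes that share the same class
+  - Of the 2 boxes predicted in a cell, the train step only trains the box with the higher IOU with the ground truth, while the test step selects the box with the higher $p$ (objectness, = confidence)
+  - Among boxes predicted as the same class, the box with the highest $p$ is selected, and boxes whose IOU with the selected box is at or above a threshold are suppressed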
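The suppression rule in the notes above can be sketched in a few lines. This is a minimal illustration, not the reference implementation: it assumes boxes are given as `(x1, y1, x2, y2)` corners rather than the `x, y, w, h` form used in the cell vector, and the names `iou`, `nms`, and `iou_threshold` are made up for the example.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) corner coordinates."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold=0.5):
    """Greedy per-class NMS.

    detections: list of (box, p, class_id) tuples; returns the kept ones.
    """
    kept = []
    # Visit boxes from highest to lowest confidence p.
    for det in sorted(detections, key=lambda d: d[1], reverse=True):
        box, _, cls = det
        # Keep the box unless an already kept box of the same class
        # overlaps it with IoU at or above the threshold.
        if all(k[2] != cls or iou(box, k[0]) < iou_threshold for k in kept):
            kept.append(det)
    return kept


detections = [
    ((0, 0, 10, 10), 0.9, 0),
    ((1, 1, 11, 11), 0.8, 0),    # same class, heavy overlap: suppressed
    ((1, 1, 11, 11), 0.7, 1),    # different class: kept
    ((20, 20, 30, 30), 0.6, 0),  # same class, no overlap: kept
]
kept = nms(detections)
```

Because suppression only compares boxes of the same class, the third detection survives even though it coincides with the second one.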
+
+
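+- A pretrained feature extractor (backbone) produces the feature map, which a fully-connected layer then converts into an $S \times S \times 30$ tensor (S is the number of grid cells per row and column; 30 = 20 class predictions + 2 boxes $\times$ 5 values)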
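To make the per-cell vector concrete: with 20 class predictions and 2 boxes of 5 values each, every cell carries 20 + 2 × 5 = 30 numbers. The sketch below splits one such vector. The ordering of values inside the vector (classes first, then the boxes) and the names `Box` and `split_cell_vector` are assumptions for illustration only.

```python
from typing import NamedTuple

NUM_CLASSES = 20  # c_1 ... c_20 in the notes
NUM_BOXES = 2     # boxes predicted per cell


class Box(NamedTuple):
    p: float  # objectness / confidence
    x: float
    y: float
    w: float
    h: float


def split_cell_vector(vec):
    """Split one cell's flat vector into class scores and boxes.

    Assumed layout: [c_1..c_20, p, x, y, w, h, p, x, y, w, h].
    """
    expected = NUM_CLASSES + 5 * NUM_BOXES
    if len(vec) != expected:
        raise ValueError(f"expected {expected} values, got {len(vec)}")
    class_scores = list(vec[:NUM_CLASSES])
    boxes = [
        Box(*vec[NUM_CLASSES + 5 * i: NUM_CLASSES + 5 * (i + 1)])
        for i in range(NUM_BOXES)
    ]
    return class_scores, boxes


cell = [0.0] * 20 + [0.9, 0.5, 0.5, 0.2, 0.3] + [0.1, 0.4, 0.6, 0.7, 0.8]
cell[7] = 1.0
scores, boxes = split_cell_vector(cell)
best_box = max(boxes, key=lambda b: b.p)  # test-time choice: higher p wins
best_class = scores.index(max(scores))
```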
+
+<br>
+
+---
+
+## YOLO v2
+
+- Uses Batch Normalization, raises the input resolution, strengthens the backbone, etc.
+
+- Replaces the fully connected layer that converts the feature map into per-cell vectors with convolution layers
+
+- Adds a pass-through layer
+  - In addition to the last feature map, the higher-resolution feature map from one stage earlier is also concatenated and used (pass-through)
+  - Preserves information about small objects as well as large ones
+
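+- Uses anchor boxes
+  - Instead of predicting the bounding box size directly in the cell vector, predicts the change relative to a preset anchor box
+  - Anchor boxes are set by K-means clustering the dataset's ground-truth boxes ($K=5$ chosen experimentally as a trade-off between accuracy and complexity)
+  - Applies a sigmoid to the predicted position of the box relative to its cell, constraining it to the range $(0,1)$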
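The anchor-based prediction can be sketched as a small decoding function. It follows the parameterization in the YOLOv2 paper as far as I understand it: the sigmoid keeps the box center inside its cell, and the anchor size is scaled by an exponential of the predicted change. The function and argument names are illustrative, and all values are in grid-cell units.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def decode_box(t_x, t_y, t_w, t_h, cell_x, cell_y, anchor_w, anchor_h):
    """Turn raw network outputs into a box, in grid-cell units."""
    b_x = cell_x + sigmoid(t_x)     # offset in (0, 1): center stays in the cell
    b_y = cell_y + sigmoid(t_y)
    b_w = anchor_w * math.exp(t_w)  # anchor size scaled by the predicted change
    b_h = anchor_h * math.exp(t_h)
    return b_x, b_y, b_w, b_h


# All-zero outputs give the cell center and the unchanged anchor size.
box = decode_box(0.0, 0.0, 0.0, 0.0, cell_x=3, cell_y=4, anchor_w=1.5, anchor_h=2.0)
```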
+
+<br>
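The anchor selection step could look roughly like the following. It is a toy sketch that assumes the distance $d = 1 - \text{IOU}$ from the YOLOv2 paper, compares boxes by width and height only (as if they shared a center), and uses a plain mean to update centroids. The names `wh_iou` and `kmeans_anchors` and the sample sizes are hypothetical, and the example uses `k=2` instead of 5 to stay small.

```python
import random


def wh_iou(a, b):
    """IoU of two (w, h) boxes that share the same center."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeans_anchors(sizes, k, iters=50, seed=0):
    """Cluster ground-truth (w, h) pairs with distance d = 1 - IoU."""
    rng = random.Random(seed)
    centroids = rng.sample(sizes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for s in sizes:
            # The smallest 1 - IoU is the largest IoU.
            best = max(range(k), key=lambda i: wh_iou(s, centroids[i]))
            clusters[best].append(s)
        new_centroids = []
        for members, old in zip(clusters, centroids):
            if members:
                new_centroids.append((
                    sum(w for w, _ in members) / len(members),
                    sum(h for _, h in members) / len(members),
                ))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return sorted(centroids)


sizes = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (4.0, 8.0), (4.2, 7.8), (3.8, 8.2)]
anchors = kmeans_anchors(sizes, k=2)
```

Using IoU instead of Euclidean distance keeps large boxes from dominating the clustering just because their absolute errors are larger.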
+
+---
+
+## YOLO v3
+
+- Uses a Feature Pyramid Network-style design (prediction across scales)
+  - Applies skip connections, upsampling, and concatenation to the feature maps to extract feature maps at several resolutions (similar to the U-Net structure)
+  - Predicts bounding boxes separately from the low-, mid-, and high-resolution feature maps, so objects of different sizes can be predicted
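+- Improved objectness prediction
+  - Applies a sigmoid to the objectness score, turning it into a logistic regression
+  - During training, the target is 1 for the best prediction and 0 for boxes with IOU < 0.5; the rest are ignored
+- Improved class prediction
+  - Applies a sigmoid instead of softmax to the class logits, turning it into a binary classification per class
+  - Handles classification with highly correlated classes, such as multilabel settings, more flexibly
+- Improved backbone, etc.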
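The objectness target rule from the notes (1 for the best prediction, 0 below the IOU threshold, ignore the rest) can be written out as a small helper. This is a simplified sketch for a single ground-truth box. The name `objectness_targets` and the use of `None` to mark ignored boxes are choices made for this example.

```python
def objectness_targets(ious, ignore_threshold=0.5):
    """Assign objectness targets for boxes competing for one ground truth.

    ious: IoU of each predicted box with the ground-truth box.
    Returns 1.0 (positive), 0.0 (negative), or None (ignored, no loss).
    """
    best = max(range(len(ious)), key=lambda i: ious[i])
    targets = []
    for i, value in enumerate(ious):
        if i == best:
            targets.append(1.0)    # best prediction: objectness target 1
        elif value < ignore_threshold:
            targets.append(0.0)    # poor overlap: objectness target 0
        else:
            targets.append(None)   # good but not best: excluded from the loss
    return targets


targets = objectness_targets([0.82, 0.61, 0.30, 0.05])
```

Here the second box overlaps the ground truth well but is not the best match, so it contributes no objectness loss either way.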
+
+<br>
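A small numeric comparison shows why independent sigmoids suit overlapping labels better than softmax. The logits and the three label names in the comment are hypothetical.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid_all(logits):
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


# Hypothetical logits for the overlapping labels "person" and "woman", plus "car".
logits = [4.0, 4.0, -3.0]
soft = softmax(logits)     # scores compete: the two strong labels split the mass
sig = sigmoid_all(logits)  # each label is scored independently
labels = [s > 0.5 for s in sig]
```

With softmax, neither of the two equally strong labels can exceed 0.5 because the scores must sum to 1. With per-class sigmoids, both pass the 0.5 cut.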
+
+---
+
+## YOLO v4
+
+Reads a bit like an experiment report
+- Evaluates how various object detection techniques perform when applied to YOLOv3 (SOTA architectures, Bag of Freebies, Bag of Specials)
+- Architecture selection: trade-off between accuracy and cost -> uses CSPDarknet53
+- BoF, BoS selection: Mish activation, Batch Normalization, DropBlock, etc.
+- New methods: Mosaic data augmentation, Self-Adversarial Training (SAT)
+- Sets suitable hyperparameters with a genetic algorithm
+- Applying these techniques -> a faster (high fps), more accurate SOTA detector
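As one concrete item from the list above, the Mish activation is usually defined as $\text{mish}(x) = x \cdot \tanh(\ln(1 + e^x))$. A scalar sketch in plain Python follows. The stable softplus helper is an implementation choice for this example, not something taken from the notes.

```python
import math


def softplus(x):
    # Numerically stable ln(1 + e^x).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    return x * math.tanh(softplus(x))


values = [mish(x) for x in (-1.0, 0.0, 1.0, 20.0)]
```

Unlike ReLU, the function is smooth and lets small negative values through, while behaving almost like the identity for large positive inputs.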
+
+<br>
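Mosaic data augmentation mixes four training images into one. The sketch below is heavily simplified: it assumes four equally sized images stored as 2D lists, fills each quadrant around a random split point from one image, and leaves out the rescaling and bounding-box remapping that a real pipeline needs. The function name `mosaic` and its arguments are hypothetical.

```python
import random


def mosaic(images, size, seed=None):
    """Combine four size x size images (2D lists) into one mosaic.

    A random split point (cx, cy) divides the canvas into four regions;
    each region is copied from the matching region of one image.
    """
    if len(images) != 4:
        raise ValueError("mosaic needs exactly four images")
    rng = random.Random(seed)
    # Keep the split away from the border so every image contributes pixels.
    cx = rng.randint(size // 4, 3 * size // 4)
    cy = rng.randint(size // 4, 3 * size // 4)
    canvas = [[0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            # 0: top-left, 1: top-right, 2: bottom-left, 3: bottom-right
            idx = (0 if y < cy else 2) + (0 if x < cx else 1)
            canvas[y][x] = images[idx][y][x]
    return canvas, (cx, cy)


size = 8
# Four constant images with pixel values 0, 1, 2, 3 so the source is visible.
images = [[[k] * size for _ in range(size)] for k in range(4)]
out, (cx, cy) = mosaic(images, size, seed=0)
```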
+
+------
+
+