[AdaIN Paper Review] - Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization

*This post is a review of the AdaIN paper! If you have any questions, please leave them in the comments!

AdaIN paper: [1703.06868] Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization (arxiv.org)


AdaIN github: GitHub - xunhuang1995/AdaIN-style: Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization


1. Simple Introduction

2. Method

- Loss function

3. Result

Simple Introduction

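Changing the style of an image is a deep learning field that keeps getting more active.

While studying NeRF recently, I have been reading a number of papers to come up with a way to generate images from the novel views that NeRF produces. Along the way I noticed that AdaIN is widely used in the style transfer field, which is why I decided to review this paper.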

Method

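Before explaining the architecture, let's look at what AdaIN actually is.

The idea behind AdaIN comes from normalization layers such as Batch Normalization (BN) and Instance Normalization (IN), but

AdaIN takes the form of a layer that has no learnable parameters!!

The formula is also very simple: x is the content input, y is the style input, and the output is built from the mean and variance of each input!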
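To make the formula concrete: in the paper it reads AdaIN(x, y) = σ(y) * (x - μ(x)) / σ(x) + μ(y), where μ and σ are the per-channel mean and standard deviation over the spatial positions. Below is a minimal NumPy sketch of that operation, not the official implementation. The (C, H, W) layout, the function name `adain`, and the small `eps` added for numerical stability are my own assumptions.

```python
import numpy as np

def adain(content_feat, style_feat, eps=1e-5):
    """AdaIN(x, y) = sigma(y) * (x - mu(x)) / sigma(x) + mu(y), per channel.

    Both inputs are feature maps of shape (C, H, W). Statistics are taken
    over the spatial axes (H, W) separately for every channel.
    eps is an assumed stabilizer to avoid division by zero.
    """
    # Channel-wise statistics of the content features x.
    mu_x = content_feat.mean(axis=(1, 2), keepdims=True)
    sigma_x = np.sqrt(content_feat.var(axis=(1, 2), keepdims=True) + eps)
    # Channel-wise statistics of the style features y.
    mu_y = style_feat.mean(axis=(1, 2), keepdims=True)
    sigma_y = np.sqrt(style_feat.var(axis=(1, 2), keepdims=True) + eps)
    # Normalize the content, then rescale and shift with the style statistics.
    normalized = (content_feat - mu_x) / sigma_x
    return sigma_y * normalized + mu_y

# Random stand-ins for encoder features (not real VGG-19 outputs).
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8, 8))                        # content features
y = rng.normal(loc=3.0, scale=2.0, size=(4, 16, 16))  # style features
out = adain(x, y)
```

The output keeps the spatial layout of the content features, while its per-channel mean and standard deviation match those of the style features. Nothing in the function is trained, which is what "no learnable parameters" means here.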

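The architecture in the paper is very simple!

1. Feed the style image and the original (content) image into the VGG-19 encoder to produce feature embeddings.

2. Pass the content embedding and the style embedding into the AdaIN layer.

3. Use the decoder to get back to the original resolution.

- This is where the stylized image comes out.

+) The VGG-19 encoder is a pre-trained model, whereas the decoder's weights have to be optimized through training.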
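The three steps above can be sketched as a single function. This is only an illustration of the data flow. `toy_encoder` and `toy_decoder` are hypothetical stand-ins I made up (average pooling and nearest-neighbour upsampling), not VGG-19 and not the trained decoder from the paper.

```python
import numpy as np

def adain(x, y, eps=1e-5):
    # Align the per-channel mean/std of x (content) with those of y (style).
    mu_x = x.mean(axis=(1, 2), keepdims=True)
    sigma_x = np.sqrt(x.var(axis=(1, 2), keepdims=True) + eps)
    mu_y = y.mean(axis=(1, 2), keepdims=True)
    sigma_y = np.sqrt(y.var(axis=(1, 2), keepdims=True) + eps)
    return sigma_y * (x - mu_x) / sigma_x + mu_y

def stylize(content_img, style_img, encoder, decoder):
    f_c = encoder(content_img)   # step 1: embed the content image
    f_s = encoder(style_img)     # step 1: embed the style image
    t = adain(f_c, f_s)          # step 2: AdaIN on the two embeddings
    return decoder(t), t         # step 3: decode back to image resolution

def toy_encoder(img):
    # Hypothetical stand-in: 2x2 average pooling, (C, H, W) -> (C, H/2, W/2).
    c, h, w = img.shape
    return img.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def toy_decoder(feat):
    # Hypothetical stand-in: nearest-neighbour upsampling by a factor of 2.
    return feat.repeat(2, axis=1).repeat(2, axis=2)

rng = np.random.default_rng(0)
content = rng.random((3, 32, 32))
style = rng.random((3, 32, 32))
stylized, t = stylize(content, style, toy_encoder, toy_decoder)
```

In the real model, only the decoder passed to a function like `stylize` would carry trainable weights, since the encoder is pre-trained and AdaIN has no parameters.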

Loss function

First, there is the content loss.

It is computed as the L2 loss between the embedding that comes out of AdaIN and the embedding obtained by feeding the image generated by the decoder back into the VGG-19 encoder.
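In the paper's notation, with t the AdaIN output, g the decoder and f the encoder, the content loss is L_c = ||f(g(t)) - t||_2. A minimal sketch follows, assuming the L2 loss is taken as a mean squared error (my choice here, a scaled version of the squared distance). The function name and the random stand-in features are hypothetical.

```python
import numpy as np

def content_loss(reencoded_feat, adain_target):
    """Mean squared error between f(g(t)) and the AdaIN output t."""
    return float(np.mean((reencoded_feat - adain_target) ** 2))

# Random stand-ins: t plays the AdaIN output, f_gt the re-encoded result.
rng = np.random.default_rng(0)
t = rng.normal(size=(4, 8, 8))
f_gt = t + 0.1 * rng.normal(size=t.shape)

loss_same = content_loss(t, t)      # identical features give zero loss
loss_noisy = content_loss(f_gt, t)  # grows as the features drift apart
```

Note that the target is the AdaIN output t rather than the raw content embedding, which matches the description above.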

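Second, there is the style loss.

The style loss measures the difference between the embeddings of the style image and of the stylized image produced by the decoder, using an L2 loss.

One thing to note is that the first term of the formula uses the difference between the means of the embeddings, and the second term uses the difference between their standard deviations.

The φ function in the formula denotes the feature values taken from the relu1_1, relu2_1, relu3_1, and relu4_1 layers of VGG-19.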
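Written out, the style loss in the paper is L_s = Σ_i ||μ(φ_i(g(t))) - μ(φ_i(s))||_2 + Σ_i ||σ(φ_i(g(t))) - σ(φ_i(s))||_2, summed over the four layers i. Here is a rough sketch of that sum, again assuming a mean squared error for each L2 term. The feature lists are random stand-ins for the four φ_i outputs, and the function names are hypothetical.

```python
import numpy as np

def channel_stats(feat, eps=1e-5):
    # Per-channel mean and standard deviation over the spatial axes.
    mu = feat.mean(axis=(1, 2))
    sigma = np.sqrt(feat.var(axis=(1, 2)) + eps)
    return mu, sigma

def style_loss(output_feats, style_feats):
    """Sum over layers of a mean-difference term and a std-difference term."""
    total = 0.0
    for out_f, sty_f in zip(output_feats, style_feats):
        mu_o, sigma_o = channel_stats(out_f)
        mu_s, sigma_s = channel_stats(sty_f)
        total += np.mean((mu_o - mu_s) ** 2)        # first term: means
        total += np.mean((sigma_o - sigma_s) ** 2)  # second term: stds
    return float(total)

# Random stand-ins for features at relu1_1, relu2_1, relu3_1, relu4_1.
rng = np.random.default_rng(0)
shapes = [(64, 16, 16), (128, 8, 8), (256, 4, 4), (512, 2, 2)]
out_feats = [rng.normal(size=s) for s in shapes]
sty_feats = [rng.normal(loc=1.0, size=s) for s in shapes]
loss = style_loss(out_feats, sty_feats)
```

Because only channel-wise means and standard deviations are compared, the loss ignores where things are located in the style image, which is a reasonable property for something meant to capture style rather than layout.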

Result

- Compared with the results of other papers, the output quality is good in terms of both content and style.

- Style interpolation also gives well-rendered results.
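For reference, the paper interpolates between styles by feeding the decoder a convex combination of the AdaIN outputs for each style, with weights that sum to 1. The sketch below shows only that feature-mixing step, under the same (C, H, W) assumption as before. `interpolate_styles` is a hypothetical name and the features are random stand-ins.

```python
import numpy as np

def adain(x, y, eps=1e-5):
    # Align the per-channel mean/std of x (content) with those of y (style).
    mu_x = x.mean(axis=(1, 2), keepdims=True)
    sigma_x = np.sqrt(x.var(axis=(1, 2), keepdims=True) + eps)
    mu_y = y.mean(axis=(1, 2), keepdims=True)
    sigma_y = np.sqrt(y.var(axis=(1, 2), keepdims=True) + eps)
    return sigma_y * (x - mu_x) / sigma_x + mu_y

def interpolate_styles(content_feat, style_feats, weights):
    """Weighted mix of AdaIN outputs; weights are normalized to sum to 1."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(w * adain(content_feat, s) for w, s in zip(weights, style_feats))

# Random stand-ins for the content features and two style features.
rng = np.random.default_rng(0)
f_c = rng.normal(size=(4, 8, 8))
f_s1 = rng.normal(loc=2.0, size=(4, 8, 8))
f_s2 = rng.normal(loc=-2.0, size=(4, 8, 8))

mixed = interpolate_styles(f_c, [f_s1, f_s2], [0.5, 0.5])
only_first = interpolate_styles(f_c, [f_s1, f_s2], [1.0, 0.0])
```

The mixed features would then go through the decoder to produce the interpolated image. Putting all the weight on one style reduces to plain AdaIN with that style.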

- Written by Kyujinpy, 2023.02.08.
