[FlowEdit Paper Review] - Inversion-Free Text-Based Editing Using Pre-Trained Flow Models

*This is a paper review post for FlowEdit! If you have any questions, please leave a comment!

FlowEdit paper: https://arxiv.org/abs/2412.08629


FlowEdit project page: https://matankleiner.github.io/flowedit/


1. Simple Introduction

2. Background Knowledge: Rectified Flow

3. Method

4. Result

Simple Introduction

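Recently, a large number of training-free image editing methods have been appearing.

In training-free editing with DDIM-based or rectified-flow-based models, the starting point of the noise matters a great deal.

There are broadly three ways to choose that starting noise. (Pros and cons are listed below.)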

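Random noise start
- Pro: Sampling starts from a random draw from a Gaussian distribution, so the source image needs no extra preprocessing.
- Con: The noise carries no features of the source image, so denoising can follow the wrong path and quality drops.
Noise Inversion ((a) in the image above)
- Pro: The noise carries features of the source image, which makes editing somewhat easier.
- Con: The noise has to be produced from the source image through an inversion pass, so the runtime roughly doubles. (The noise is built by adding noise to the source image from t=0 to T.)
Inversion-Free ((c) in the image above)
- Pro: No inversion pass is needed and the source image itself is the starting point, so it can get both speed and quality.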
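
To make the cost of noise inversion concrete, here is a minimal sketch of an Euler inversion round trip. The velocity field is an arbitrary toy function, not a real pre-trained model, and `CountingVelocity`, `euler_integrate`, and the step count of 50 are hypothetical names and values chosen only for illustration. It counts model calls for "invert, then sample back" and measures how far the reconstruction drifts from the source.

```python
import numpy as np

class CountingVelocity:
    """Toy stand-in for a pre-trained velocity network; counts forward calls."""

    def __init__(self, center):
        self.center = center
        self.calls = 0

    def __call__(self, z, t):
        self.calls += 1
        return z - self.center  # arbitrary smooth field, for illustration only

def euler_integrate(velocity, z, t_start, t_end, num_steps):
    """Integrate dz/dt = velocity(z, t) from t_start to t_end with Euler steps."""
    ts = np.linspace(t_start, t_end, num_steps + 1)
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        z = z + (t_next - t_cur) * velocity(z, t_cur)
    return z

velocity = CountingVelocity(center=np.zeros(4))
x_src = np.array([1.0, -2.0, 0.5, 3.0])

# Inversion: image (t=0) -> noise (t=1). Sampling: noise (t=1) -> image (t=0).
z_noise = euler_integrate(velocity, x_src, 0.0, 1.0, num_steps=50)
x_rec = euler_integrate(velocity, z_noise, 1.0, 0.0, num_steps=50)

total_calls = velocity.calls                  # inversion pass + sampling pass
recon_error = np.max(np.abs(x_rec - x_src))   # Euler inversion is only approximate
```

In this toy setting the round trip needs twice as many model calls as sampling alone, and the reconstruction does not return exactly to the source. The sketch does not model the cost of FlowEdit itself.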

FlowEdit is a paper that brings an inversion-free approach to image editing with rectified-flow models!

Let's look below at what the inversion-free approach is and how it is used!

Background Knowledge: Rectified Flow

A short explanation of Rectified Flow: https://kyujinpy.tistory.com/176


*ChatGPT explanation of rectified flow velocity prediction: Rectified flow target explanation
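
As a quick refresher, the sketch below shows the straight-line interpolation and the velocity target that rectified-flow models are trained to predict. It assumes the convention used in the FlowEdit paper (t=0 is the clean image, t=1 is pure noise); the function names `interpolate` and `velocity_target` are hypothetical and the vectors are random stand-ins for image latents.

```python
import numpy as np

def interpolate(x0, noise, t):
    """Noisy latent at time t: t=0 is the clean image, t=1 is pure noise."""
    return (1.0 - t) * x0 + t * noise

def velocity_target(x0, noise):
    """Constant velocity along the straight path from x0 to noise."""
    return noise - x0

rng = np.random.default_rng(0)
x0 = rng.normal(size=(4,))      # stand-in for a clean image latent
noise = rng.normal(size=(4,))   # Gaussian noise sample

# Stepping from time t to time s along the true velocity stays on the straight path.
t, s = 0.8, 0.3
z_t = interpolate(x0, noise, t)
z_s = z_t + (s - t) * velocity_target(x0, noise)
```

Because the path is a straight line, a single step with the exact velocity moves between any two timesteps without error. A trained network only approximates this velocity, which is why sampling uses several steps.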

Method

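Once you understand the algorithm above (the FlowEdit algorithm from the paper), you understand FlowEdit!

When sampling with rectified flow begins, X0 (the source image) is passed in as the initial input!

(Below is an explanation of the equations inside the for loop!)

*Z_FE is the latent vector that gradually changes from the source image into the target image! (See the detailed explanation below.)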

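1. Sample Gaussian noise.

2. Source latent at time t: Z_src = (1-t) * X0 + t * noise

=> Z_src always keeps a (1-t) * X0 term, so it stays tied to the source image and gives the edit a good guide. Since t decreases toward 0 during sampling, the weight of X0 grows as the steps proceed.

3. Target latent: Z_tar = Z_FE + Z_src - X0

=> At the first step Z_FE = X0, so Z_tar is simply Z_src.

4. Evaluate the velocity at the target latent (with the target prompt) and at the source latent (with the source prompt), then take their difference: V_delta = V_tar(Z_tar, t) - V_src(Z_src, t). (This is the change needed to move from source to target.)

5. Accumulate the velocity difference into Z_FE: Z_FE <- Z_FE + (t_next - t) * V_delta, where t_next is the next, smaller timestep.

=> At the first step Z_FE = X0, but as the steps go on, the velocity differences between source and target keep accumulating, and Z_FE gradually turns into the target image.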
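
The five steps can be sketched as a short loop. This is a minimal sketch under stated assumptions, not the authors' implementation: it uses the paper's interpolation Z_src = (1 - t) * X0 + t * noise, and the two velocity networks are replaced by a hypothetical toy field `(z - mean) / t` built by `make_toy_velocity`, standing in for a model conditioned on the source or target prompt. The function name `flowedit`, the 2-D vectors, and the 28 steps are also illustrative choices.

```python
import numpy as np

def make_toy_velocity(mean):
    """Toy stand-in for a prompt-conditioned velocity network (hypothetical).

    Returns the field (z - mean) / t, which pulls samples toward `mean`
    as t decreases to 0.
    """
    def velocity(z, t):
        return (z - mean) / t
    return velocity

def flowedit(x_src, v_src, v_tar, timesteps, rng):
    """FlowEdit-style loop; `timesteps` must decrease and end at 0."""
    z_fe = x_src.copy()                                      # init: Z_FE = X0
    for t_cur, t_next in zip(timesteps[:-1], timesteps[1:]):
        noise = rng.normal(size=x_src.shape)                 # 1. sample Gaussian noise
        z_src = (1.0 - t_cur) * x_src + t_cur * noise        # 2. noisy source latent
        z_tar = z_fe + z_src - x_src                         # 3. noisy target latent
        v_delta = v_tar(z_tar, t_cur) - v_src(z_src, t_cur)  # 4. velocity difference
        z_fe = z_fe + (t_next - t_cur) * v_delta             # 5. accumulate into Z_FE
    return z_fe

rng = np.random.default_rng(0)
mean_src = np.array([0.0, 0.0])    # toy "source prompt" mode
mean_tar = np.array([2.0, -1.0])   # toy "target prompt" mode
x_src = np.array([0.3, 0.4])       # stand-in for the source image latent

timesteps = np.linspace(1.0, 0.0, 29)   # 28 Euler steps from t=1 down to t=0
x_tar = flowedit(x_src, make_toy_velocity(mean_src),
                 make_toy_velocity(mean_tar), timesteps, rng)
```

In this toy setup the noise cancels inside the velocity difference, and the result is the source shifted by the gap between the two modes, `x_src + (mean_tar - mean_src)`. A real model is nonlinear, so the noise does not cancel exactly there.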

In short, FlowEdit can be summarized in one sentence as follows!

-> Editing is performed by adding velocity to the source image latent, so that it changes gradually from the source image into the target image.
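
Unrolling the update over all steps makes this summary explicit. The notation is an assumption for illustration: timesteps decrease from t_n down to t_0 = 0, Z_FE starts at X0 at time t_n, and V^Δ denotes the velocity difference between the target and source branches.

```latex
\begin{aligned}
Z^{\mathrm{FE}}_{t_{i-1}} &= Z^{\mathrm{FE}}_{t_i} + (t_{i-1} - t_i)\, V^{\Delta}_{t_i},
\qquad
V^{\Delta}_{t_i} = V^{\mathrm{tar}}\!\left(Z^{\mathrm{tar}}_{t_i}, t_i\right)
                 - V^{\mathrm{src}}\!\left(Z^{\mathrm{src}}_{t_i}, t_i\right), \\
X^{\mathrm{tar}} = Z^{\mathrm{FE}}_{t_0} &= X_0 + \sum_{i=1}^{n} (t_{i-1} - t_i)\, V^{\Delta}_{t_i}.
\end{aligned}
```

The edited image is therefore the source image plus a weighted sum of velocity differences, with no inverted noise map appearing anywhere.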

Result

- FlowEdit image generation results (the text-driven edits work well.)

- Measured by metrics as well, it performs well compared with other training-free methods.

- Visual comparison with other models

- Written by Kyujinpy, 2025.08.15.
