본문 바로가기

AI/Paper - Theory

(54)
[LGM 논문 리뷰] Large Multi-View Gaussian Model for High-Resolution 3D Content Creation *LGM를 위한 논문 리뷰 글입니다! 궁금하신 점은 댓글로 남겨주세요! LGM github: LGM (kiui.moe)  LGMLGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation Arxiv 2024 Jiaxiang Tang1, Zhaoxi Chen2, Xiaokang Chen1, Tengfei Wang3, Gang Zeng1, Ziwei Liu2 1 Peking University   2 S-Lab, Nanyang Technological University   3 Shanghai AI Lame.kiui.moeContents1. Simple Introduction2. Background Knowledge: Gaussia..
[3D Gaussian Splatting 간단한 논문 리뷰] *Gaussian Splatting에 대한 간단한 논문 리뷰 입니다!*이해를 돕기 위해 수식은 거의 제외했습니다. GS 논문: repo-sam.inria.fr/fungraph/3d-gaussian-splatting/3d_gaussian_splatting_high.pdf GS github: 3D Gaussian Splatting for Real-Time Radiance Field Rendering (inria.fr) 3D Gaussian Splatting for Real-Time Radiance Field Rendering[Müller 2022] Müller, T., Evans, A., Schied, C. and Keller, A., 2022. Instant neural graphics primitives..
[LRM 논문 리뷰] - LARGE RECONSTRUCTION MODEL FOR SINGLE IMAGE TO 3D *LRM를 위한 논문 리뷰 글입니다! 궁금하신 점은 댓글로 남겨주세요! LRM paper: https://arxiv.org/abs/2311.04400 LRM: Large Reconstruction Model for Single Image to 3DWe propose the first Large Reconstruction Model (LRM) that predicts the 3D model of an object from a single input image within just 5 seconds. In contrast to many previous methods that are trained on small-scale datasets such as ShapeNet in a category-specarxi..
[FNO 논문 리뷰 & 코드 리뷰] - FOURIER NEURAL OPERATOR FOR PARAMETRIC PARTIAL DIFFERENTIAL EQUATIONS *FNO를 위한 논문 리뷰 글입니다! 궁금하신 점은 댓글로 남겨주세요! FNO paper: [2010.08895] Fourier Neural Operator for Parametric Partial Differential Equations (arxiv.org) Fourier Neural Operator for Parametric Partial Differential EquationsThe classical development of neural networks has primarily focused on learning mappings between finite-dimensional Euclidean spaces. Recently, this has been generalized to neural oper..
[Diffusion Transformer 논문 리뷰3] - Scalable Diffusion Models with Transformers *DiT를 한번에 이해할 수 있는(?) A~Z 논문리뷰입니다! *총 3편으로 구성되었고, 마지막 3편은 제 온 힘을 다하여서.. 논문리뷰를 했습니다..ㅎㅎ *궁금하신 점은 댓글로 남겨주세요! DiT paper: https://arxiv.org/abs/2212.09748 Scalable Diffusion Models with Transformers We explore a new class of diffusion models based on the transformer architecture. We train latent diffusion models of images, replacing the commonly-used U-Net backbone with a transformer that operates o..
[Diffusion Transformer 논문 리뷰2] - High-Resolution Image Synthesis with Latent Diffusion Models *DiT를 한번에 이해할 수 있는(?) A~Z 논문리뷰입니다! *총 3편으로 구성되었고, 2편은 DiT를 이해하기 위하여 LDM를 논문리뷰를 진행합니다! *궁금하신 점은 댓글로 남겨주세요! DiT paper: https://arxiv.org/abs/2212.09748 Scalable Diffusion Models with Transformers We explore a new class of diffusion models based on the transformer architecture. We train latent diffusion models of images, replacing the commonly-used U-Net backbone with a transformer that operates on..
[Diffusion Transformer 논문 리뷰1] - DDPM, Classifier guidance and Classifier-Free guidance *DiT를 한번에 이해할 수 있는(?) A~Z 논문리뷰입니다! *총 3편으로 구성되었고, 1편은 DiT를 이해하기 위한 지식들을 Preview하는 시간입니다! *궁금하신 점은 댓글로 남겨주세요! DiT paper: https://arxiv.org/abs/2212.09748 Scalable Diffusion Models with Transformers We explore a new class of diffusion models based on the transformer architecture. We train latent diffusion models of images, replacing the commonly-used U-Net backbone with a transformer that operates..
[SORA 설명] - OpenAI의 Video Generation AI (기술부분 번역 + 설명 이미지 추가) Technical Report: Video generation models as world simulators (openai.com) Video generation models as world simulators We explore large-scale training of generative models on video data. Specifically, we train text-conditional diffusion models jointly on videos and images of variable durations, resolutions and aspect ratios. We leverage a transformer architecture that oper openai.com SORA: https..
[ControlNet 논문 리뷰] - Adding Conditional Control to Text-to-Image Diffusion Models *ControlNet를 위한 논문 리뷰 글입니다! 궁금하신 점은 댓글로 남겨주세요! ControlNet paper: [2302.05543] Adding Conditional Control to Text-to-Image Diffusion Models (arxiv.org) Adding Conditional Control to Text-to-Image Diffusion Models We present ControlNet, a neural network architecture to add spatial conditioning controls to large, pretrained text-to-image diffusion models. ControlNet locks the production-ready large..
[MoE 논문 리뷰] - Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity *MoE를 위한 논문 리뷰 글입니다! 궁금하신 점은 댓글로 남겨주세요! MoE paper: https://arxiv.org/abs/2101.03961 Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity In deep learning, models typically reuse the same parameters for all inputs. Mixture of Experts (MoE) defies this and instead selects different parameters for each incoming example. The result is a sparsely-activated mode..
[NeRF-CAM 논문리뷰] - COORDINATE-AWARE MODULATION FOR NEURAL FIELDS 💰새해복 많이 받으세요!!💰 *NeRF-CAM를 위한 논문 리뷰 글입니다! 궁금하신 점은 댓글로 남겨주세요! NeRF-CAM paper: arxiv.org/pdf/2311.14993.pdf NeRF-CAM github: Coordinate-Aware Modulation for Neural Fields (maincold2.github.io) Coordinate-Aware Modulation for Neural Fields Neural fields, mapping low-dimensional input coordinates to corresponding signals, have shown promising results in representing various signals. Numerous methodo..
[Tune-A-Video 논문 리뷰] One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation *Tune-A-Video를 위한 논문 리뷰 글입니다! 궁금하신 점은 댓글로 남겨주세요! Tune-A-Video paper: [2212.11565] Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation (arxiv.org) Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation To replicate the success of text-to-image (T2I) generation, recent works employ large-scale video datasets to train a text-to-video (T..

반응형