2023.05.06
1. Segment Anything: https://ai.facebook.com/research/publications/segment-anything/
Segment Anything | Meta AI Research
Abstract We introduce the Segment Anything (SA) project: a new task, model, and dataset for image segmentation. Using our efficient model in a data collection loop, we built the largest segmentation dataset to date (by far), with over 1 billion masks on 11
2. InstructGPT: https://arxiv.org/abs/2203.02155
Training language models to follow instructions with human feedback
Making language models bigger does not inherently make them better at following a user's intent. For example, large language models can generate outputs that are untruthful, toxic, or simply not helpful to the user. In other words, these models are not ali
3. DDPM(Denoising Diffusion Probabilistic Models): https://arxiv.org/abs/2006.11239
Denoising Diffusion Probabilistic Models
We present high quality image synthesis results using diffusion probabilistic models, a class of latent variable models inspired by considerations from nonequilibrium thermodynamics. Our best results are obtained by training on a weighted variational bound
4. DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection: https://paperswithcode.com/paper/dino-detr-with-improved-denoising-anchor-1
🏆 SOTA for Object Detection on COCO 2017 val (box AP metric)
5. LLaMA: Open and Efficient Foundation Language Models: https://paperswithcode.com/paper/llama-open-and-efficient-foundation-language-1
🏆 SOTA for Question Answering on PIQA (Accuracy metric)
2023.05.16
6. Hypernetworks: https://arxiv.org/abs/1609.09106
HyperNetworks
This work explores hypernetworks: an approach of using a one network, also known as a hypernetwork, to generate the weights for another network. Hypernetworks provide an abstraction that is similar to what is found in nature: the relationship between a gen
7. PET-NeuS: https://paperswithcode.com/paper/pet-neus-positional-encoding-tri-planes-for
Papers with Code - PET-NeuS: Positional Encoding Tri-Planes for Neural Surfaces
2023.05.17
8. LoRA: https://arxiv.org/abs/2106.09685
LoRA: Low-Rank Adaptation of Large Language Models
An important paradigm of natural language processing consists of large-scale pre-training on general domain data and adaptation to particular tasks or domains. As we pre-train larger models, full fine-tuning, which retrains all model parameters, becomes le
9. SAM high quality: https://paperswithcode.com/paper/segment-anything-in-high-quality
Papers with Code - Segment Anything in High Quality
10. QLoRA: https://arxiv.org/abs/2305.14314
QLoRA: Efficient Finetuning of Quantized LLMs
We present QLoRA, an efficient finetuning approach that reduces memory usage enough to finetune a 65B parameter model on a single 48GB GPU while preserving full 16-bit finetuning task performance. QLoRA backpropagates gradients through a frozen, 4-bit quan
11. LightGlue: https://paperswithcode.com/paper/lightglue-local-feature-matching-at-light
Papers with Code - LightGlue: Local Feature Matching at Light Speed
12. DragGAN: https://github.com/XingangPan/DragGAN
GitHub - XingangPan/DragGAN: Official Code for DragGAN (SIGGRAPH 2023)
2023.08.05
13. SDXL: https://github.com/stability-ai/generative-models
GitHub - Stability-AI/generative-models: Generative Models by Stability AI
14. TAV: https://github.com/showlab/Tune-A-Video
GitHub - showlab/Tune-A-Video: [ICCV 2023] Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
15. CoDeF: https://paperswithcode.com/paper/codef-content-deformation-fields-for
Papers with Code - CoDeF: Content Deformation Fields for Temporally Consistent Video Processing
16. TF-ICON: https://paperswithcode.com/paper/tf-icon-diffusion-based-training-free-cross
Papers with Code - TF-ICON: Diffusion-Based Training-Free Cross-Domain Image Composition
17. Point-Bind & Point-LLM: https://arxiv.org/pdf/2309.00615.pdf
19. DEVA: https://paperswithcode.com/paper/tracking-anything-with-decoupled-video
Papers with Code - Tracking Anything with Decoupled Video Segmentation
🏆 SOTA for Unsupervised Video Object Segmentation on DAVIS 2016 val (G metric)
20. Vote2Cap: https://github.com/ch3cook-fdu/vote2cap-detr
GitHub - ch3cook-fdu/Vote2Cap-DETR: Code release for ''End-to-End 3D Dense Captioning with Vote2Cap-DETR'' (CVPR2023)
21. InstaFlow: https://github.com/gnobitab/instaflow
GitHub - gnobitab/InstaFlow: ⚡ InstaFlow! One-Step Stable Diffusion with Rectified Flow
2023.10.02
22. DreamGaussian: https://paperswithcode.com/paper/dreamgaussian-generative-gaussian-splatting
Papers with Code - DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation
23. ProPainter: https://github.com/sczhou/propainter
GitHub - sczhou/ProPainter: [ICCV 2023] ProPainter: Improving Propagation and Transformer for Video Inpainting
24. 3D gaussian splatting: https://arxiv.org/pdf/2308.04079.pdf
25. MetaClip: https://paperswithcode.com/paper/demystifying-clip-data
Papers with Code - Demystifying CLIP Data
26. From CLIP to DINO: https://paperswithcode.com/paper/from-clip-to-dino-visual-encoders-shout-in
Papers with Code - From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models
27. NEFTune: https://github.com/neelsjain/neftune
GitHub - neelsjain/NEFTune: Official repository of NEFTune: Noisy Embeddings Improves Instruction Finetuning
28. Cutie: https://github.com/hkchengrex/Cutie
GitHub - hkchengrex/Cutie: [arXiv 2023] Putting the Object Back Into Video Object Segmentation
29. PALI-3: https://github.com/kyegomez/PALI3
GitHub - kyegomez/PALI3: Implementation of PALI3 from the paper "PALI-3 VISION LANGUAGE MODELS: SMALLER, FASTER, STRONGER"
30. Lion: https://github.com/lucidrains/lion-pytorch
GitHub - lucidrains/lion-pytorch: 🦁 Lion, new optimizer discovered by Google Brain using genetic algorithms that is purportedly better than Adam(w), in Pytorch
31. Mustango: https://github.com/amaai-lab/mustango
GitHub - AMAAI-Lab/mustango: Mustango: Toward Controllable Text-to-Music Generation
32. OneLLM: https://paperswithcode.com/paper/onellm-one-framework-to-align-all-modalities
Papers with Code - OneLLM: One Framework to Align All Modalities with Language
33. Alpha-CLIP: https://github.com/sunzey/alphaclip
GitHub - SunzeY/AlphaCLIP: Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
34. Ferret(LMM): https://github.com/apple/ml-ferret
GitHub - apple/ml-ferret
35. DreamGaussian4D: https://github.com/jiawei-ren/dreamgaussian4d
GitHub - jiawei-ren/dreamgaussian4d: [arXiv 2023] DreamGaussian4D: Generative 4D Gaussian Splatting
36. Auffusion: https://github.com/happylittlecat2333/Auffusion
GitHub - happylittlecat2333/Auffusion: Official codes and models of the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation"
37. InstantID: https://paperswithcode.com/paper/instantid-zero-shot-identity-preserving
Papers with Code - InstantID: Zero-shot Identity-Preserving Generation in Seconds
38. https://paperswithcode.com/paper/scalable-diffusion-models-with-state-space
Papers with Code - Scalable Diffusion Models with State Space Backbone
39. https://paperswithcode.com/paper/the-boundary-of-neural-network-trainability
Papers with Code - The boundary of neural network trainability is fractal
40. RoSA: https://github.com/ist-daslab/rosa?tab=readme-ov-file
GitHub - IST-DASLab/RoSA
41. VMamba: https://paperswithcode.com/paper/vmamba-visual-state-space-model
42. Latte: https://github.com/Vchitect/Latte
GitHub - Vchitect/Latte: Latte: Latent Diffusion Transformer for Video Generation.
43. SuGaR: https://github.com/Anttwo/SuGaR
GitHub - Anttwo/SuGaR: Official PyTorch implementation of SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering (CVPR 2024)
44. TripoSR: https://paperswithcode.com/paper/triposr-fast-3d-object-reconstruction-from-a
Papers with Code - TripoSR: Fast 3D Object Reconstruction from a Single Image
45. ViewDiff: https://github.com/facebookresearch/viewdiff?tab=readme-ov-file
GitHub - facebookresearch/ViewDiff: ViewDiff generates high-quality, multi-view consistent images of a real-world 3D object in authentic surroundings. (CVPR2024)
46. InstantStyle: https://github.com/instantstyle/instantstyle
GitHub - InstantStyle/InstantStyle: InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation 🔥
47. XCube: https://arxiv.org/pdf/2312.03806
48. https://github.com/hp-l33/aim (AiM)
49. https://github.com/huage001/linfusion?tab=readme-ov-file
GitHub - Huage001/LinFusion: Official PyTorch and Diffusers Implementation of "LinFusion: 1 GPU, 1 Minute, 16K Image"
50. KSID: https://github.com/axning/ksid
51. MambaST: https://github.com/FilippoBotti/MambaST
52. Direct3D: https://arxiv.org/abs/2405.14832
Direct3D: Scalable Image-to-3D Generation via 3D Latent Diffusion Transformer
Generating high-quality 3D assets from text and images has long been challenging, primarily due to the absence of scalable 3D representations capable of capturing intricate geometry distributions. In this work, we introduce Direct3D, a native 3D generative
53. Intrinsic Image Decomposition: https://github.com/compphoto/Intrinsic
GitHub - compphoto/Intrinsic: Repo for the paper "Intrinsic Image Decomposition via Ordinal Shading" (TOG 2023)
54. https://github.com/pixtella/anagram-mtl
GitHub - Pixtella/Anagram-MTL: [WACV 2025] Official implementation for the paper "Diffusion-based Visual Anagram as Multi-task Learning"
55. Divot: https://paperswithcode.com/paper/divot-diffusion-powers-video-tokenizer-for
Papers with Code - Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation