text

(4)
[OpenFlaminKO] - Taking on Korean multimodal modeling with Polyglot-KO! GitHub: https://github.com/Marker-Inc-Korea/OpenFlaminKO OpenFlamingo: https://github.com/mlfoundations/open_flamingo (an open-source framework for training large multimodal models) Op..
[Tune-A-Video paper review] One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation *This post is a review of the Tune-A-Video paper. If you have any questions, please leave a comment! Tune-A-Video paper: [2212.11565] Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation (arxiv.org) To replicate the success of text-to-image (T2I) generation, recent works employ large-scale video datasets to train a text-to-video (T..
[CLIP-NeRF paper review] - CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields *This post is a review of the CLIP-NeRF paper. If you have any questions, please leave a comment! CLIP-NeRF paper: [2112.05139] CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields (arxiv.org) We present CLIP-NeRF, a multi-modal 3D object manipulation method for neural radiance fields (NeRF). By leveraging the joint language-image embedding space of t..
[CLIP paper review] - Learning Transferable Visual Models From Natural Language Supervision *This post is a review of the CLIP paper. If you have any questions, please leave a comment! CLIP paper: [2103.00020] Learning Transferable Visual Models From Natural Language Supervision (arxiv.org) State-of-the-art computer vision systems are trained to predict a fixed set of predetermined object categories. This restricted form of supervision limits their generality and..
