본문 바로가기

video

(7)

[CogVideoX 논문 리뷰] - Text-to-Video Diffusion Models with An Expert Transformer *CogVideoX를 위한 논문 리뷰 글입니다! 궁금하신 점은 댓글로 남겨주세요! CogVideoX paper: https://arxiv.org/abs/2408.06072 CogVideoX: Text-to-Video Diffusion Models with An Expert TransformerWe present CogVideoX, a large-scale text-to-video generation model based on diffusion transformer, which can generate 10-second continuous videos aligned with text prompt, with a frame rate of 16 fps and resolution of 768 * 1360 pixel..

[LLaVA-Video 논문 리뷰] - VIDEO INSTRUCTION TUNING WITH SYNTHETIC DATA *LLaVA-Video를 위한 논문 리뷰 글입니다! 궁금하신 점은 댓글로 남겨주세요! LLaVA-Video paper: https://arxiv.org/abs/2410.02713 Video Instruction Tuning With Synthetic DataThe development of video large multimodal models (LMMs) has been hindered by the difficulty of curating large amounts of high-quality raw data from the web. To address this, we propose an alternative approach by creating a high-quality synthetic dataset ..

[LLaVA-OneVision 논문 리뷰] - LLaVA-OneVision: Easy Visual Task Transfer *LLaVA-OneVision를 위한 논문 리뷰 글입니다! 궁금하신 점은 댓글로 남겨주세요! LLaVA-OneVision paper: https://arxiv.org/abs/2408.03326 LLaVA-OneVision: Easy Visual Task TransferWe present LLaVA-OneVision, a family of open large multimodal models (LMMs) developed by consolidating our insights into data, models, and visual representations in the LLaVA-NeXT blog series. Our experimental results demonstrate that LLaVA-OneVisi..

[SORA 설명] - OpenAI의 Video Generation AI (기술부분 번역 + 설명 이미지 추가) Technical Report: Video generation models as world simulators (openai.com) Video generation models as world simulators We explore large-scale training of generative models on video data. Specifically, we train text-conditional diffusion models jointly on videos and images of variable durations, resolutions and aspect ratios. We leverage a transformer architecture that oper openai.com SORA: https..

[Tune-A-VideKO] - 한국어 기반 One-shot Tuning of diffusion for Text-to-Video 모델 Github: https://github.com/KyujinHan/Tune-A-VideKO/tree/master GitHub - KyujinHan/Tune-A-VideKO: 한국어 기반 One-shot video tuning with Stable Diffusion 한국어 기반 One-shot video tuning with Stable Diffusion - GitHub - KyujinHan/Tune-A-VideKO: 한국어 기반 One-shot video tuning with Stable Diffusion github.com Tune-A-VideKO-v1-5🏄: https://huggingface.co/kyujinpy/Tune-A-VideKO-v1-5 kyujinpy/Tune-A-VideKO-v1-5 ·..

[Tune-A-Video 논문 리뷰] One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation *Tune-A-Video를 위한 논문 리뷰 글입니다! 궁금하신 점은 댓글로 남겨주세요! Tune-A-Video paper: [2212.11565] Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation (arxiv.org) Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation To replicate the success of text-to-image (T2I) generation, recent works employ large-scale video datasets to train a text-to-video (T..

[MCCNet 논문 리뷰] - Arbitrary Video Style Transfer via Multi-Channel Correlation *MCCNet를 위한 논문 리뷰 글입니다! 궁금하신 점은 댓글로 남겨주세요! MCCNet paper: [2009.08003] Arbitrary Video Style Transfer via Multi-Channel Correlation (arxiv.org) Arbitrary Video Style Transfer via Multi-Channel Correlation Video style transfer is getting more attention in AI community for its numerous applications such as augmented reality and animation productions. Compared with traditional image style transfer, ..

이전 1 다음

티스토리툴바