본문 바로가기

Kyujinpy

(152)
[Window] bitsandbytes download - PEFT using LoRA(QLoRA) 드디어... LLM을 PEFT 방식을 이용해서 훈련시킬려고 하는데, bitsandbytes가 linux만 지원한다는 얘기만 잔뜩 있고 도움이 되는 글들을 못 찾아서 디버깅하는데만 5시간 쏟은 것 같다... 요즘 LLM fine-tuning하면 LoRA와 QLoRA가 대세이고, 또 이것을 양자화해서 memory의 효율성을 가져가는게 트렌드인데 이 양자화를 해서 memory를 아낄려면 8-bit, 4-bit로 weight를 설정하고 float과 상호성을 띄도록 만들어줘야 하는데 여기 필요한 모듈 중 하나가 바로 bitsandbytes이다... 방법1) bitsandbytes-windows-webui github https://github.com/jllllll/bitsandbytes-windows-webui ..
[Instant-NGP 논문 리뷰] - Instant Neural Graphics Primitives with a Multiresolution Hash Encoding *이 글의 목표: Hash-encoding 완전 이해하기!!! (부셔버려!!) *Instant-NGP를 위한 논문 리뷰 글입니다! 궁금하신 점은 댓글로 남겨주세요! Instant-NGP paper: nvlabs.github.io/instant-ngp/assets/mueller2022instant.pdf Instant-NGP github: GitHub - NVlabs/instant-ngp: Instant neural graphics primitives: lightning fast NeRF and more GitHub - NVlabs/instant-ngp: Instant neural graphics primitives: lightning fast NeRF and more Instant neural graph..
[Instant-stylization-NeRF 논문 리뷰] - Instant Neural Radiance Fields Stylization *Instant Neural Radiance Fields Stylization를 위한 논문 리뷰 글입니다! 궁금하신 점은 댓글로 남겨주세요! Instant Neural Radiance Fields Stylization paper: [2303.16884] Instant Neural Radiance Fields Stylization (arxiv.org) Instant Neural Radiance Fields Stylization We present Instant Neural Radiance Fields Stylization, a novel approach for multi-view image stylization for the 3D scene. Our approach models a neural radian..
[LoRA 논문 리뷰] - LOW-RANK ADAPTATION OF LARGE LANGUAGE MODELS *LoRA를 위한 논문 리뷰 글입니다! 궁금하신 점은 댓글로 남겨주세요! LoRA paper: https://arxiv.org/abs/2106.09685 LoRA: Low-Rank Adaptation of Large Language Models An important paradigm of natural language processing consists of large-scale pre-training on general domain data and adaptation to particular tasks or domains. As we pre-train larger models, full fine-tuning, which retrains all model parameters, becomes le arxi..
ipykernel_launcher.py: error: unrecognized arguments: -f parser = config_parser() args = parser.parse_args() parser를 불러오는 과정에서 위와 같은 에러를 만났다. 도대체 무엇이 문제인가!? 해결방법: parse_args()에 '' 추가하기 parser = config_parser() args = parser.parse_args('') 위에 처럼 코드에 ''를 추가했을 뿐인데 에러가 없어졌다!? 신기하지만(?) 일단 에러가 없어졌으니 해결완료! 2023.06.02 Kyujinpy 작성.
[LINC3.0 사업단 보행데이터 활용 헬스케어 AI 해커톤 경진대회] - 대상 보호되어 있는 글입니다.
[제2회 ETRI 휴먼이해 인공지능 논문경진대회] - 논문 aceepted 보호되어 있는 글입니다.
[ChatGPT 리뷰] - GPT와 Reinforcement Learning Human Feedback *ChatGPT에 대해서 설명하는 글입니다! 궁금하신 점은 댓글로 남겨주세요! InstructGPT: https://openai.com/research/instruction-following#guide Aligning language models to follow instructions We’ve trained language models that are much better at following user intentions than GPT-3 while also making them more truthful and less toxic, using techniques developed through our alignment research. These InstructGPT models, which ar..
[KoChatGPT 코드 리뷰] - KoChatGPT: ChatGPT fine tuning with korean dataset References: GitHub - airobotlab/KoChatGPT: ChatGPT의 RLHF를 학습을 위한 3가지 step별 한국어 데이터셋 GitHub - airobotlab/KoChatGPT: ChatGPT의 RLHF를 학습을 위한 3가지 step별 한국어 데이터셋 ChatGPT의 RLHF를 학습을 위한 3가지 step별 한국어 데이터셋. Contribute to airobotlab/KoChatGPT development by creating an account on GitHub. github.com My code colab: https://colab.research.google.com/drive/1p6SVWfqgLDYTrQYkfFAxMUbDKtGuhyMl?usp=sharing ' kocha..
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1024, 1024]], which is output 0 of AsStridedBackward0, is at version 2; expected version 1 instead. 에러코드 전체 ''' RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1024, 1024]], which is output 0 of AsStridedBackward0, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True) ''' 구현하고자 했던 git..
[추후 논문 리뷰 paper 정리] - 계속 업데이트 2023.05.061. Segment Anything: https://ai.facebook.com/research/publications/segment-anything/ Segment Anything | Meta AI ResearchAbstract We introduce the Segment Anything (SA) project: a new task, model, and dataset for image segmentation. Using our efficient model in a data collection loop, we built the largest segmentation dataset to date (by far), with over 1 billion masks on 11ai.facebook...
[DAE-Former 논문 리뷰] - DAE-Former: Dual Attention-guided Efficient Transformer for Medical Image Segmentation *DAE-Former를 위한 논문 리뷰 글입니다! 궁금하신 점은 댓글로 남겨주세요! DAE-Former paper: [2212.13504] DAE-Former: Dual Attention-guided Efficient Transformer for Medical Image Segmentation (arxiv.org) DAE-Former: Dual Attention-guided Efficient Transformer for Medical Image Segmentation Transformers have recently gained attention in the computer vision domain due to their ability to model long-range dependencies. Howev..

반응형