huggingface / transformers Public

Issues: huggingface/transformers

996 Open 15,144 Closed

Issues list

possible llama rope implementation issue bug

#34741 opened Nov 15, 2024 by ilml

2 of 4 tasks

potential rope implementation issue in llama model

#34740 opened Nov 15, 2024 by ilml

BUG : Modeling nemotron file does not cache key values even though bug

#34739 opened Nov 15, 2024 by jeongin601

2 of 4 tasks

Translating attention.md to Chinese WIP

Label your PR/Issue with WIP for some long outstanding Issues/PRs that are work in progress

#34738 opened Nov 15, 2024 by wwwbai

RFC: Reducing Download Traffic and Latency with ZipNN Lossless Compression for AI Models Feature request

Request for a new feature

#34737 opened Nov 14, 2024 by moshik1

Incompatibility between transformers 4.45.0 and torch 1.9.1 bug

#34736 opened Nov 14, 2024 by realjoshqsun

2 of 4 tasks

[i18n-<languageCode>] Translating docs to <languageName> WIP

#34734 opened Nov 14, 2024 by zashab

10 tasks

Better error message when loading adapter models with peft dependency missing Feature request

#34733 opened Nov 14, 2024 by maxjeblick

Translation model M2M100 uses 2 models in cache (from version 4.46.0) bug

#34731 opened Nov 14, 2024 by harelfar2

3 of 4 tasks

RuntimeError in _group_tensors_by_device_and_dtype (torch/optim/optimizer.py) when training with FSDP on N>1 GPUs. bug

#34730 opened Nov 14, 2024 by julien-piet

2 of 4 tasks

[Idefics3] processing_idefics3 - IndexError: list index out of range for multiple image input bug

#34727 opened Nov 14, 2024 by Glider95

2 of 4 tasks

CI fails on few test_training_gradient_checkpointing tests for LLAMA

#34722 opened Nov 14, 2024 by dvrogozh

ValueError: Invalid cache_implementation (offloaded). bug

#34718 opened Nov 13, 2024 by leigao97

1 of 4 tasks

TypeError: Accelerator.__init__() got an unexpected keyword argument 'dispatch_batches' bug

#34714 opened Nov 13, 2024 by SiyuWu528

2 of 4 tasks

BUG: AutoModel.from_pretrained(modelName) bug

#34709 opened Nov 12, 2024 by NarwhalChen

2 of 4 tasks

Gemma2: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:7 and cuda:0! (when checking argument for argument index in method wrapper_CUDA__index_select) bug

#34706 opened Nov 12, 2024 by Terrencezzj

2 of 4 tasks

AttributeError when accessing .logits from BLIP-2 model output during conversion bug

#34704 opened Nov 12, 2024 by thisisiron

1 of 4 tasks

FSDP with SFTrainer: expected dtype float for end but got dtype c10::BFloat16 bug

#34702 opened Nov 12, 2024 by asc-raynor

1 of 4 tasks

WanDB callback fails on training end when eval dataset is provided bug

#34701 opened Nov 12, 2024 by eyalmazuz

2 of 4 tasks

TypeError: Accelerator.__init__() got an unexpected keyword argument 'dispatch_batches' bug

#34699 opened Nov 12, 2024 by bardenthenry

2 of 4 tasks

Video-Llava model's generation error due to causal mask shape mismatch bug

#34696 opened Nov 12, 2024 by jiqing-feng

2 of 4 tasks

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1! (when checking argument for argument mat2 in method wrapper_CUDA_bmm)

#34695 opened Nov 12, 2024 by ra-MANUJ-an

Discrepancy in Training Loss Behavior with Gradient Accumulation using DeepSpeed bug

#34694 opened Nov 12, 2024 by kmchiti

2 of 4 tasks

top-p sampling gives different results even after fixing all random seeds bug

#34693 opened Nov 12, 2024 by jasonppy

2 of 4 tasks

CPU processing is extremely slow for models loaded with torch_dtype = torch.float16 bug

#34692 opened Nov 11, 2024 by blincoln-bf

4 tasks
