
Add Flex Attention Monkey Patch for LLAMA #540

Open

austin362667 wants to merge 9 commits into main from austin362667/llama_flex_attn
Conversation

austin362667 (Collaborator) commented on Jan 25, 2025

Summary

We need flex attention to support custom attention variants/masks and to achieve better performance (for example, shared-prefix masking). A background sketch follows.
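For context, PyTorch's flex attention lets a mask be expressed as a small callable rather than a dense tensor, which is what makes patterns like a shared prefix cheap. Below is a minimal sketch, assuming torch >= 2.5 (where `torch.nn.attention.flex_attention` ships) and a CUDA device; `PREFIX_LEN` and the mask function are illustrative, not this PR's code.

```python
import torch
from torch.nn.attention.flex_attention import create_block_mask, flex_attention

PREFIX_LEN = 32  # hypothetical shared-prefix length, for illustration only

def prefix_causal(b, h, q_idx, kv_idx):
    # Every query may attend to the shared prefix; beyond it, standard causal.
    return (kv_idx < PREFIX_LEN) | (q_idx >= kv_idx)

B, H, S, D = 1, 8, 128, 64
device = "cuda"  # create_block_mask defaults to CUDA; assumes a GPU is present
block_mask = create_block_mask(prefix_causal, B=None, H=None, Q_LEN=S, KV_LEN=S, device=device)

q, k, v = (torch.randn(B, H, S, D, device=device) for _ in range(3))
out = flex_attention(q, k, v, block_mask=block_mask)  # (B, H, S, D)
```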

There are two ways to enable flex attention in Liger:

  1. Set the attn_implementation of the model config (for instance, LlamaConfig) from PyTorch sdpa/eager to flex_attention. By doing so, config._attn_implementation switches to the flex attention implementation (see the first sketch after this list).
  2. (This PR) Patch all attention impl entries in HuggingFace's attention dict to use flex attention, so that we can still use the original default attention key, say sdpa (which now resolves to flex_attention instead). See the second sketch after this list.
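A minimal sketch of option 1, assuming a transformers release that registers "flex_attention" as a valid attn_implementation (recent releases do); the checkpoint name is only illustrative:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B",  # illustrative checkpoint
    attn_implementation="flex_attention",  # sets config._attn_implementation
)
```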
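And a hedged sketch of option 2, the approach this PR takes. It assumes the registry being patched is the ALL_ATTENTION_FUNCTIONS dict in transformers.modeling_utils (present in recent releases) and that it already holds a "flex_attention" entry; the PR's actual patch may differ in detail:

```python
from transformers.modeling_utils import ALL_ATTENTION_FUNCTIONS

def apply_flex_attention_monkey_patch():
    # Remap every registered implementation (e.g. "sdpa") to the flex
    # attention forward, so existing configs keep their original key but
    # transparently run flex attention.
    flex_forward = ALL_ATTENTION_FUNCTIONS["flex_attention"]
    for key in list(ALL_ATTENTION_FUNCTIONS.keys()):
        ALL_ATTENTION_FUNCTIONS[key] = flex_forward
```

Calling apply_flex_attention_monkey_patch() before model construction would then make a config that requests sdpa instantiate flex attention instead.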

Testing Done

  • Hardware Type:
  • run make test to ensure correctness
  • run make checkstyle to ensure code style
  • run make test-convergence to ensure convergence

Signed-off-by: Austin Liu <[email protected]>

wip

Signed-off-by: Austin Liu <[email protected]>

wip

Signed-off-by: Austin Liu <[email protected]>
@austin362667 force-pushed the austin362667/llama_flex_attn branch from 73c3f2b to 585c765 on January 28, 2025 06:50
Signed-off-by: Austin Liu <[email protected]>

fix logits tests

Signed-off-by: Austin Liu <[email protected]>
@austin362667 force-pushed the austin362667/llama_flex_attn branch from 585c765 to 7d342a6 on January 28, 2025 09:56
@austin362667 marked this pull request as ready for review on January 29, 2025 17:35