
Migrate FP6-LLM implementation from module to tensor subclass #402

Closed
gau-nernst opened this issue Jun 19, 2024 · 0 comments · Fixed by #399
@gau-nernst (Collaborator) commented:
With the tensor subclass guide published in #391, the FP6-LLM implementation, which is currently a Linear module replacement, should migrate to a tensor subclass as well.
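For context, a minimal sketch of the tensor-subclass pattern the issue refers to, assuming the standard PyTorch `__torch_function__` mechanism. All names below are illustrative, not the actual torchao FP6-LLM API: a weight subclass intercepts `F.linear` itself, so no `nn.Linear` module swapping is needed.

```python
import torch
import torch.nn.functional as F


class FakeFp6Weight(torch.Tensor):
    """Illustrative weight subclass (NOT the real torchao FP6-LLM class).

    A real implementation would store packed FP6 data and dispatch
    F.linear to a fused dequant-matmul kernel; this sketch just falls
    back to the plain fp32 path to show the dispatch mechanics.
    """

    @staticmethod
    def __new__(cls, data):
        # View the plain tensor's storage as an instance of this subclass.
        return torch.Tensor._make_subclass(cls, data, require_grad=False)

    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        if func is F.linear:
            x, w = args[0], args[1]
            bias = args[2] if len(args) > 2 else kwargs.get("bias")
            # A custom FP6 kernel would be called here; this sketch
            # falls back to the dense path instead.
            with torch._C.DisableTorchFunctionSubclass():
                return F.linear(x, w, bias)
        # Everything else behaves like a plain tensor.
        with torch._C.DisableTorchFunctionSubclass():
            return func(*args, **kwargs)
```

Calling `F.linear(x, FakeFp6Weight(w))` then routes through the custom handler, which is the advantage over a module replacement: the quantized weight works anywhere `F.linear` is called, without touching the surrounding module structure.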

@gau-nernst gau-nernst self-assigned this Jun 19, 2024
yanbing-j pushed a commit to yanbing-j/ao that referenced this issue Dec 9, 2024
* ET or AOTI backend logic

* use args, not builder_args

* typo

* typo

* typo
yanbing-j pushed a commit to yanbing-j/ao that referenced this issue Dec 9, 2024
* Revert "Revert "Embedding quantization per backend (pytorch#402)" (pytorch#411)"

This reverts commit 8b35acdff4fded779799ab8a419e55f885dd8918.

* 4b and 8b embedding table quantization

* minor changes

* remove extra et workflow
yanbing-j pushed a commit to yanbing-j/ao that referenced this issue Dec 9, 2024
* Revert "Revert "Embedding quantization per backend (pytorch#402)" (pytorch#411)"

This reverts commit 8b35acdff4fded779799ab8a419e55f885dd8918.

* merge GGUF tests into pull.yml