
Migrate FP6-LLM implementation from module to tensor subclass #402

Closed
gau-nernst opened this issue Jun 19, 2024 · 0 comments · Fixed by #399
@gau-nernst (Collaborator) commented:
With the tensor subclass guide published in #391, the FP6-LLM implementation, which is currently a Linear module replacement, should migrate to a tensor subclass as well.
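For context, a minimal sketch of the tensor-subclass pattern the issue refers to, assuming the standard PyTorch `__torch_function__` mechanism. All names below are illustrative, not the actual torchao FP6-LLM API: a weight subclass intercepts `F.linear` itself, so no `nn.Linear` module swapping is needed.

```python
import torch
import torch.nn.functional as F


class FakeFp6Weight(torch.Tensor):
    """Illustrative weight subclass (NOT the real torchao FP6-LLM class).

    A real implementation would store packed FP6 data and dispatch
    F.linear to a fused dequant-matmul kernel; this sketch just falls
    back to the plain fp32 path to show the dispatch mechanics.
    """

    @staticmethod
    def __new__(cls, data):
        # View the plain tensor's storage as an instance of this subclass.
        return torch.Tensor._make_subclass(cls, data, require_grad=False)

    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        if func is F.linear:
            x, w = args[0], args[1]
            bias = args[2] if len(args) > 2 else kwargs.get("bias")
            # A custom FP6 kernel would be called here; this sketch
            # falls back to the dense path instead.
            with torch._C.DisableTorchFunctionSubclass():
                return F.linear(x, w, bias)
        # Everything else behaves like a plain tensor.
        with torch._C.DisableTorchFunctionSubclass():
            return func(*args, **kwargs)
```

Calling `F.linear(x, FakeFp6Weight(w))` then routes through the custom handler, which is the advantage over a module replacement: the quantized weight works anywhere `F.linear` is called, without touching the surrounding module structure.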

@gau-nernst gau-nernst self-assigned this Jun 19, 2024
yanbing-j pushed a commit to yanbing-j/ao that referenced this issue Dec 9, 2024
* ET or AOTI backend logic

* use args, not builder_args

* typo

* typo

* typo
yanbing-j pushed a commit to yanbing-j/ao that referenced this issue Dec 9, 2024
* Revert "Revert "Embedding quantization per backend (pytorch#402)" (pytorch#411)"

This reverts commit 8b35acdff4fded779799ab8a419e55f885dd8918.

* 4b and 8b embedding table quantization

* minor changes

* remove extra et workflow
yanbing-j pushed a commit to yanbing-j/ao that referenced this issue Dec 9, 2024
* Revert "Revert "Embedding quantization per backend (pytorch#402)" (pytorch#411)"

This reverts commit 8b35acdff4fded779799ab8a419e55f885dd8918.

* merge GGUF tests into pull.yml