- Titans: Learning to Memorize at Test Time, arXiv, 2501.00663, arxiv, pdf, cication: -1, Ali Behrouz, Peilin Zhong, Vahab Mirrokni · (𝕏)
- Long Context vs. RAG for LLMs: An Evaluation and Revisits, arXiv, 2501.01880, arxiv, pdf, cication: -1, Xinze Li, Yixin Cao, Yubo Ma, ..., Aixin Sun
- SCOPE: Optimizing Key-Value Cache Compression in Long-context Generation, arXiv, 2412.13649, arxiv, pdf, cication: -1, Jialong Wu, Zhenglin Wang, Linhai Zhang, ..., Yulan He, Deyu Zhou · (SCOPE - Linking-ai)
- Revisiting In-Context Learning with Long Context Language Models, arXiv, 2412.16926, arxiv, pdf, cication: -1, Jinheon Baek, Sun Jae Lee, Prakhar Gupta, ..., Siddharth Dalmia, Prateek Kolhar
- Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization, arXiv, 2412.17739, arxiv, pdf, cication: -1, Ermo Hua, Che Jiang, Xingtai Lv, ..., Xue Kai Zhu, Bowen Zhou
- SCBench: A KV Cache-Centric Analysis of Long-Context Methods, arXiv, 2412.10319, arxiv, pdf, cication: -1, Yucheng Li, Huiqiang Jiang, Qianhui Wu, ..., Yuqing Yang, Lili Qiu · (aka)
- Star Attention: Efficient LLM Inference over Long Sequences, arXiv, 2411.17116, arxiv, pdf, cication: -1, Shantanu Acharya, Fei Jia, Boris Ginsburg · (Star-Attention - NVIDIA)
- When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training, arXiv, 2411.13476, arxiv, pdf, cication: -1, Haonan Wang, Qian Liu, Chao Du, ..., Kenji Kawaguchi, Tianyu Pang · (AnchorContext - haonan3)
- Long Term Memory: The Foundation of AI Self-Evolution, arXiv, 2410.15665, arxiv, pdf, cication: -1, Xun Jiang, Feng Li, Han Zhao, ..., Mengdi Wang, Tianqiao Chen · (𝕏)
- Large Language Models Can Self-Improve in Long-context Reasoning, arXiv, 2411.08147, arxiv, pdf, cication: -1, Siheng Li, Cheng Yang, Zesen Cheng, ..., Yujiu Yang, Wai Lam
- KV-Compress: Paged KV-Cache Compression with Variable Compression Rates per Attention Head, arXiv, 2410.00161, arxiv, pdf, cication: -1, Isaac Rehg · (vllm-kvcompress - IsaacRe)
- A Simple and Effective $L_2$ Norm-Based Strategy for KV Cache Compression, arXiv, 2406.11430, arxiv, pdf, cication: 5, Alessio Devoto, Yu Zhao, Simone Scardapane, ..., Pasquale Minervini
- Language Models can Self-Lengthen to Generate Long Texts, arXiv, 2410.23933, arxiv, pdf, cication: -1, Shanghaoran Quan, Tianyi Tang, Bowen Yu, ..., Jingren Zhou, Junyang Lin · (Self-Lengthen - QwenLM)
- ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference, arXiv, 2410.21465, arxiv, pdf, cication: -1, Hanshi Sun, Li-Wen Chang, Wenlei Bao, ..., Yuejie Chi, Beidi Chen · (ShadowKV - bytedance)
- LongReward: Improving Long-context Large Language Models with AI Feedback, arXiv, 2410.21252, arxiv, pdf, cication: -1, Jiajie Zhang, Zhongni Hou, Xin Lv, ..., Ling Feng, Juanzi Li · (LongReward - THUDM) · (huggingface)
- In Defense of RAG in the Era of Long-Context Language Models, arXiv, 2409.01666, arxiv, pdf, cication: 3, Tan Yu, Anbang Xu, Rama Akkiraju · (zyphra)
- LOGO -- Long cOntext aliGnment via efficient preference Optimization, arXiv, 2410.18533, arxiv, pdf, cication: -1, Zecheng Tang, Zechen Sun, Juntao Li, ..., Qiaoming Zhu, Min Zhang
- How to Train Long-Context Language Models (Effectively), arXiv, 2410.02660, arxiv, pdf, cication: 1, Tianyu Gao, Alexander Wettig, Howard Yen, ..., Danqi Chen · (prolong - princeton-nlp)
- Why Does the Effective Context Length of LLMs Fall Short?, arXiv, 2410.18745, arxiv, pdf, cication: -1, Chenxin An, Jun Zhang, Ming Zhong, ..., Jingjing Xu, Lingpeng Kong · (STRING - HKUNLP)
- LLM$\times$MapReduce: Simplified Long-Sequence Processing using Large Language Models, arXiv, 2410.09342, arxiv, pdf, cication: -1, Zihan Zhou, Chong Li, Xinyi Chen, ..., Zhiyuan Liu, Maosong Sun
- 🌟 MMDocIR: Benchmarking Multi-Modal Retrieval for Long Documents, arXiv, 2501.08828, arxiv, pdf, cication: -1, Kuicai Dong, Yujing Chang, Xin Deik Goh, ..., Ruiming Tang, Yong Liu
- 360-LLaMA-Factory - Qihoo360
- LLMTest_NeedleInAHaystack - gkamradt
- KVCache-Factory - Zefan-Cai
- Awesome-LLM-Long-Context-Modeling - Xnhyacinth