LLM Reasoning

Survey

  • 🌟 Towards Large Reasoning Models: A Survey on Scaling LLM Reasoning Capabilities, arXiv, 2501.09686, arxiv, pdf, citation: -1

    Fengli Xu, Qianyue Hao, Zefang Zong, ..., Chen Gao, Yong Li

  • 🌟 Test-time Computing: from System-1 Thinking to System-2 Thinking, arXiv, 2501.02497, arxiv, pdf, citation: -1

    Yixin Ji, Juntao Li, Hai Ye, ..., Linjian Mo, Min Zhang · (Awesome_Test_Time_LLMs - Dereck0602) Star

  • A Survey on LLM Inference-Time Self-Improvement, arXiv, 2412.14352, arxiv, pdf, citation: -1

    Xiangjue Dong, Maria Teleki, James Caverlee

Reasoning

  • PokerBench: Training Large Language Models to become Professional Poker Players, arXiv, 2501.08328, arxiv, pdf, citation: -1

    Richard Zhuang, Akshat Gupta, Richard Yang, ..., Zhengyu Li, Gopala Anumanchipalli · (pokerbench - pokerllm) Star

  • 🌟 OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking, arXiv, 2501.09751, arxiv, pdf, citation: -1

    Zekun Xi, Wenbiao Yin, Jizhan Fang, ..., Fei Huang, Huajun Chen · (zjunlp.github)

  • 🌟 Evolving Deeper LLM Thinking, arXiv, 2501.09891, arxiv, pdf, citation: -1

    Kuang-Huei Lee, Ian Fischer, Yueh-Hua Wu, ..., Dale Schuurmans, Xinyun Chen

  • Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident Even When They Are Wrong, arXiv, 2501.09775, arxiv, pdf, citation: -1

    Tairan Fu, Javier Conde, Gonzalo Martínez, ..., María Grandury, Pedro Reviriego

  • Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains, arXiv, 2501.05707, arxiv, pdf, citation: -1

    Vighnesh Subramaniam, Yilun Du, Joshua B. Tenenbaum, ..., Shuang Li, Igor Mordatch · (llm-multiagent-ft.github)

  • Towards AI Superhuman Reasoning for Math and beyond

    · (youtu)

  • Aligning with Logic: Measuring, Evaluating and Improving Logical Consistency in Large Language Models, arXiv, 2410.02205, arxiv, pdf, citation: -1

    Yinhong Liu, Zhijiang Guo, Tianya Liang, ..., Ivan Vulić, Nigel Collier

  • 🌟 Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking, arXiv, 2403.09629, arxiv, pdf, citation: -1

    Eric Zelikman, Georges Harik, Yijia Shao, ..., Nick Haber, Noah D. Goodman

  • 🌟 Token-Budget-Aware LLM Reasoning, arXiv, 2412.18547, arxiv, pdf, citation: -1

    Tingxu Han, Chunrong Fang, Shiyu Zhao, ..., Zhenyu Chen, Zhenting Wang · (TALE - GeniusHTX) Star · (𝕏)

  • 🌟 B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners, arXiv, 2412.17256, arxiv, pdf, citation: -1

    Weihao Zeng, Yuzhen Huang, Lulu Zhao, ..., Zifei Shan, Junxian He

  • Deliberation in Latent Space via Differentiable Cache Augmentation, arXiv, 2412.17747, arxiv, pdf, citation: -1

    Luyang Liu, Jonas Pfeiffer, Jiaxing Wu, ..., Jun Xie, Arthur Szlam

  • Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning, arXiv, 2412.15797, arxiv, pdf, citation: -1

    Sungjin Park, Xiao Liu, Yeyun Gong, ..., Edward Choi

  • 🌟 Chain-of-Thought Reasoning Without Prompting, arXiv, 2402.10200, arxiv, pdf, citation: -1

    Xuezhi Wang, Denny Zhou · (𝕏)

  • SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models, arXiv, 2412.11605, arxiv, pdf, citation: -1

    Jiale Cheng, Xiao Liu, Cunxiang Wang, ..., Hongning Wang, Minlie Huang · (SPaR - thu-coai) Star

  • 🌟 Are Your LLMs Capable of Stable Reasoning?, arXiv, 2412.13147, arxiv, pdf, citation: -1

    Junnan Liu, Hongwei Liu, Linchen Xiao, ..., Songyang Zhang, Kai Chen · (GPassK. - open-compass) Star

  • Compressed Chain of Thought: Efficient Reasoning Through Dense Representations, arXiv, 2412.13171, arxiv, pdf, citation: -1

    Jeffrey Cheng, Benjamin Van Durme

  • LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks, arXiv, 2412.15204, arxiv, pdf, citation: -1

    Yushi Bai, Shangqing Tu, Jiajie Zhang, ..., Jie Tang, Juanzi Li · (longbench2.github)

  • RARE: Retrieval-Augmented Reasoning Enhancement for Large Language Models, arXiv, 2412.02830, arxiv, pdf, citation: -1

    Hieu Tran, Zonghai Yao, Junda Wang, ..., Zhichao Yang, Hong Yu

  • Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models, arXiv, 2412.02674, arxiv, pdf, citation: -1

    Yuda Song, Hanlin Zhang, Carson Eisenach, ..., Dean Foster, Udaya Ghai · (𝕏)

  • Frontier Models are Capable of In-context Scheming, arXiv, 2412.04984, arxiv, pdf, citation: -1

    Alexander Meinke, Bronson Schoen, Jérémy Scheurer, ..., Rusheb Shah, Marius Hobbhahn

  • 🌟 Training Large Language Models to Reason in a Continuous Latent Space, arXiv, 2412.06769, arxiv, pdf, citation: -1

    Shibo Hao, Sainbayar Sukhbaatar, DiJia Su, ..., Jason Weston, Yuandong Tian · (𝕏)

  • 🌟 Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability

  • MALT: Improving Reasoning with Multi-Agent LLM Training, arXiv, 2412.01928, arxiv, pdf, citation: -1

    Sumeet Ramesh Motwani, Chandler Smith, Rocktim Jyoti Das, ..., Ronald Clark, Christian Schroeder de Witt

  • Reverse Thinking Makes LLMs Stronger Reasoners, arXiv, 2411.19865, arxiv, pdf, citation: -1

    Justin Chih-Yao Chen, Zifeng Wang, Hamid Palangi, ..., Chen-Yu Lee, Tomas Pfister

  • 🌟 Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS, arXiv, 2411.18478, arxiv, pdf, citation: -1

    Jinyang Wu, Mingkuan Feng, Shuai Zhang, ..., Zengqi Wen, Jianhua Tao · (arxiv) · (jinyangwu.github)

  • DeepSeek-R1-Lite-Preview is now live: unleashing supercharged reasoning power 𝕏

  • BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games, arXiv, 2411.13543, arxiv, pdf, citation: -1

    Davide Paglieri, Bartłomiej Cupiał, Samuel Coward, ..., Jack Parker-Holder, Tim Rocktäschel

  • 🌟 Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding, arXiv, 2411.04282, arxiv, pdf, citation: -1

    Haolin Chen, Yihao Feng, Zuxin Liu, ..., Caiming Xiong, Huan Wang · (LaTRO - SalesforceAIResearch) Star

  • Large Language Models Can Self-Improve in Long-context Reasoning, arXiv, 2411.08147, arxiv, pdf, citation: -1

    Siheng Li, Cheng Yang, Zesen Cheng, ..., Yujiu Yang, Wai Lam

  • 🌟 Combining Induction and Transduction for Abstract Reasoning, arXiv, 2411.02272, arxiv, pdf, citation: -1

    Wen-Ding Li, Keya Hu, Carter Larsen, ..., Yewen Pu, Kevin Ellis · (𝕏)

  • 🌟 The Surprising Effectiveness of Test-Time Training for Abstract Reasoning

    · (𝕏) · (marc - ekinakyurek) Star

  • Can Language Models Learn to Skip Steps?, arXiv, 2411.01855, arxiv, pdf, citation: -1

    Tengxiao Liu, Qipeng Guo, Xiangkun Hu, ..., Xipeng Qiu, Zheng Zhang

  • SocialGPT: Prompting LLMs for Social Relation Reasoning via Greedy Segment Optimization, arXiv, 2410.21411, arxiv, pdf, citation: -1

    Wanhua Li, Zibin Meng, Jiawei Zhou, ..., Chuang Gan, Hanspeter Pfister · (SocialGPT - Mengzibin) Star

  • A Pointer Network-based Approach for Joint Extraction and Detection of Multi-Label Multi-Class Intents, arXiv, 2410.22476, arxiv, pdf, citation: -1

    Ankan Mullick, Sombit Bose, Abhilash Nandy, ..., Gajula Sai Chaitanya, Pawan Goyal

  • Improve Vision Language Model Chain-of-thought Reasoning, arXiv, 2410.16198, arxiv, pdf, citation: -1

    Ruohong Zhang, Bowen Zhang, Yanghao Li, ..., Ruoming Pang, Yiming Yang

    · (LLaVA-Reasoner-DPO - RifleZhang) Star

Math Reasoning

  • 🌟 The Lessons of Developing Process Reward Models in Mathematical Reasoning, arXiv, 2501.07301, arxiv, pdf, citation: -1

    Zhenru Zhang, Chujie Zheng, Yangzhen Wu, ..., Jingren Zhou, Junyang Lin

  • 🌟 BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning, arXiv, 2501.03226, arxiv, pdf, citation: -1

    Beichen Zhang, Yuhong Liu, Xiaoyi Dong, ..., Dahua Lin, Jiaqi Wang · (BoostStep - beichenzbc) Star

  • URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics, arXiv, 2501.04686, arxiv, pdf, citation: -1

    Ruilin Luo, Zhuofan Zheng, Yifan Wang, ..., Jin Zeng, Yujiu Yang · (ursa-math.github)

  • 🌟 DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models, arXiv, 2402.03300, arxiv, pdf, citation: 155

    Zhihong Shao, Peiyi Wang, Qihao Zhu, ..., Y. Wu, Daya Guo · (𝕏)

  • Continual pre-training of Llama-3.2-3B on a mix of 📐 FineMath (our new high-quality math dataset) and FineWeb-Edu. 🤗

  • Slow Perception: Let's Perceive Geometric Figures Step-by-step, arXiv, 2412.20631, arxiv, pdf, citation: -1

    Haoran Wei, Youyang Yin, Yumeng Li, ..., Zheng Ge, Xiangyu Zhang · (Slow-Perception - Ucas-HaoranWei) Star

  • HUNYUANPROVER: A Scalable Data Synthesis Framework and Guided Tree Search for Automated Theorem Proving, arXiv, 2412.20735, arxiv, pdf, citation: -1

    Yang Li, Dong Du, Linfeng Song, ..., Tao Yang, Haitao Mi

  • AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling, arXiv, 2412.15084, arxiv, pdf, citation: -1

    Zihan Liu, Yang Chen, Mohammad Shoeybi, ..., Bryan Catanzaro, Wei Ping · (research.nvidia)

  • Formal Mathematical Reasoning: A New Frontier in AI, arXiv, 2412.16075, arxiv, pdf, citation: -1

    Kaiyu Yang, Gabriel Poesia, Jingxuan He, ..., Swarat Chaudhuri, Dawn Song

  • FineMath consists of 34B tokens (FineMath-3+) and 54B tokens (FineMath-3+ with InfiMM-WebMath-3+) of mathematical educational content filtered from CommonCrawl. 🤗

    · (𝕏)

  • U-MATH: A University-Level Benchmark for Evaluating Mathematical Skills in LLMs, arXiv, 2412.03205, arxiv, pdf, citation: -1

    Konstantin Chernyshev, Vitaliy Polshkov, Ekaterina Artemova, ..., Alexei Miasnikov, Sergei Tilga · (u-math - Toloka) Star

  • ProcessBench: Identifying Process Errors in Mathematical Reasoning, arXiv, 2412.06559, arxiv, pdf, citation: -1

    Chujie Zheng, Zhenru Zhang, Beichen Zhang, ..., Jingren Zhou, Junyang Lin

  • FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI, arXiv, 2411.04872, arxiv, pdf, citation: -1

    Elliot Glazer, Ege Erdil, Tamay Besiroglu, ..., Tetiana Grechuk, Shreepranav Varma Enugandla · (epochai) · (𝕏)

  • Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning, arXiv, 2410.22304, arxiv, pdf, citation: -1

    Yihe Deng, Paul Mineiro

  • Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics, arXiv, 2410.21272, arxiv, pdf, citation: -1

    Yaniv Nikankin, Anja Reusch, Aaron Mueller, ..., Yonatan Belinkov · (𝕏)

  • Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models, arXiv, 2410.07985, arxiv, pdf, citation: -1

    Bofei Gao, Feifan Song, Zhe Yang, ..., Tianyu Liu, Baobao Chang · (Omni-MATH - KbsdJames) Star · (omni-math.github) · (huggingface) · (huggingface)

O1 Reasoning

Disentanglement

  • Disentangling Memory and Reasoning Ability in Large Language Models, arXiv, 2411.13504, arxiv, pdf, citation: -1

    Mingyu Jin, Weidi Luo, Sitao Cheng, ..., William Yang Wang, Yongfeng Zhang · (Disentangling-Memory-and-Reasoning - MingyuJ666) Star

Self Correction

  • ProgCo: Program Helps Self-Correction of Large Language Models, arXiv, 2501.01264, arxiv, pdf, citation: -1

    Xiaoshuai Song, Yanan Wu, Weixun Wang, ..., Wenbo Su, Bo Zheng

Knowledge

Context Learning

  • 🌟 Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization, arXiv, 2412.18525, arxiv, pdf, citation: -1

    Yang Shen, Xiu-Shen Wei, Yifan Sun, ..., Yazhou Yao, Errui Ding

  • The broader spectrum of in-context learning, arXiv, 2412.03782, arxiv, pdf, citation: -1

    Andrew Kyle Lampinen, Stephanie C. Y. Chan, Aaditya K. Singh, ..., Murray Shanahan · (𝕏)

Chain Of Thought

  • Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step, arXiv, 2501.13926, arxiv, pdf, citation: -1

    Ziyu Guo, Renrui Zhang, Chengzhuo Tong, ..., Hongsheng Li, Pheng-Ann Heng · (Image-Generation-CoT - ZiyuGuo99) Star

  • Audio-CoT: Exploring Chain-of-Thought Reasoning in Large Audio Language Model, arXiv, 2501.07246, arxiv, pdf, citation: -1

    Ziyang Ma, Zhuo Chen, Yuping Wang, ..., Eng Siong Chng, Xie Chen

  • To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning, arXiv, 2409.12183, arxiv, pdf, citation: 24

    Zayne Sprague, Fangcong Yin, Juan Diego Rodriguez, ..., Kyle Mahowald, Greg Durrett · (To-CoT-or-not-to-CoT - Zayne-sprague) Star · (𝕏)

  • Internalize_CoT_Step_by_Step - da03 Star

    · (huggingface)

  • LLMs Do Not Think Step-by-step In Implicit Reasoning, arXiv, 2411.15862, arxiv, pdf, citation: -1

    Yijiong Yu

    · (𝕏)

  • A Theoretical Understanding of Chain-of-Thought: Coherent Reasoning and Error-Aware Demonstration, arXiv, 2410.16540, arxiv, pdf, citation: -1

    Yingqian Cui, Pengfei He, Xianfeng Tang, ..., Jiliang Tang, Yue Xing · (𝕏)

  • Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse, arXiv, 2410.21333, arxiv, pdf, citation: -1

    Ryan Liu, Jiayi Geng, Addison J. Wu, ..., Tania Lombrozo, Thomas L. Griffiths

Prompt

Projects

Planning

  • Revealing the Barriers of Language Agents in Planning, arXiv, 2410.12409, arxiv, pdf, citation: 1

    Jian Xie, Kexun Zhang, Jiangjie Chen, ..., Lei Li, Yanghua Xiao

Misc