-
HunyuanVideo speed-up with the TeaCache method has been added to my ComfyUI toolset 𝕏
-
TransPixar: Advancing Text-to-Video Generation with Transparency,
arXiv, 2501.03006
, arxiv, pdf, citation: -1 · Luozhou Wang, Yijun Li, Zhifei Chen, ..., Zhe Lin, Yingcong Chen · (wileewang.github) · (huggingface)
-
Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control,
arXiv, 2501.03847
, arxiv, pdf, citation: -1 · Zekai Gu, Rui Yan, Jiahao Lu, ..., Wenping Wang, Yuan Liu · (DiffusionAsShader - IGL-HKUST)
-
Magic Mirror: ID-Preserved Video Generation in Video Diffusion Transformers,
arXiv, 2501.03931
, arxiv, pdf, citation: -1 · Yuechen Zhang, Yaoyang Liu, Bin Xia, ..., Eric Lo, Jiaya Jia · (julianjuaner.github) · (MagicMirror - dvlab-research)
-
LTX-Video: Realtime Video Latent Diffusion,
arXiv, 2501.00103
, arxiv, pdf, citation: -1 · Yoav HaCohen, Nisan Chiprut, Benny Brazowski, ..., Zeev Melumian, Ofir Bibi · (LTX-Video - Lightricks)
-
Open-Sora: Democratizing Efficient Video Production for All,
arXiv, 2412.20404
, arxiv, pdf, citation: -1 · Zangwei Zheng, Xiangyu Peng, Tianji Yang, ..., Tianyi Li, Yang You · (Open-Sora - hpcaitech)
-
musubi-tuner - kohya-ss
-
Autoregressive Video Generation without Vector Quantization,
arXiv, 2412.14169
, arxiv, pdf, citation: -1 · Haoge Deng, Ting Pan, Haiwen Diao, ..., Yonggang Qi, Xinlong Wang · (NOVA - baaivision)
-
DirectorLLM for Human-Centric Video Generation,
arXiv, 2412.14484
, arxiv, pdf, citation: -1 · Kunpeng Song, Tingbo Hou, Zecheng He, ..., Ahmed Elgammal, Felix Juefei-Xu
-
DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation,
arXiv, 2412.18597
, arxiv, pdf, citation: -1 · Minghong Cai, Xiaodong Cun, Xiaoyu Li, ..., Ying Shan, Xiangyu Yue · (onevfall.github) · (DiTCtrl - TencentARC)
-
LinGen: Towards High-Resolution Minute-Length Text-to-Video Generation with Linear Computational Complexity,
arXiv, 2412.09856
, arxiv, pdf, citation: -1 · Hongjie Wang, Chih-Yao Ma, Yen-Cheng Liu, ..., Niraj K. Jha, Xiaoliang Dai · (lineargen.github)
-
🌟 FastVideo - hao-ai-lab
· (huggingface) · (huggingface) · (𝕏)
-
🌟 STIV: Scalable Text and Image Conditioned Video Generation,
arXiv, 2412.07730
, arxiv, pdf, citation: -1 · Zongyu Lin, Wei Liu, Chen Chen, ..., Kai-Wei Chang, Yinfei Yang
-
Mobile Video Diffusion,
arXiv, 2412.07583
, arxiv, pdf, citation: -1 · Haitam Ben Yahia, Denis Korzhenkov, Ioannis Lelekas, ..., Amir Ghodrati, Amirhossein Habibian · (qualcomm-ai-research.github)
-
🌟 SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints,
arXiv, 2412.07760
, arxiv, pdf, citation: -1 · Jianhong Bai, Menghan Xia, Xintao Wang, ..., Pengfei Wan, Di Zhang · (jianhongbai.github)
-
GenMAC: Compositional Text-to-Video Generation with Multi-Agent Collaboration,
arXiv, 2412.04440
, arxiv, pdf, citation: -1 · Kaiyi Huang, Yukun Huang, Xuefei Ning, ..., Yu Wang, Xihui Liu · (karine-h.github) · (arxiv)
-
Genie 2: A large-scale foundation world model
· (𝕏)
-
Mimir: Improving Video Diffusion Models for Precise Text Understanding,
arXiv, 2412.03085
, arxiv, pdf, citation: -1 · Shuai Tan, Biao Gong, Yutong Feng, ..., Jingdong Chen, Ming Yang · (lucaria-academy.github)
-
Open-Sora Plan: Open-Source Large Video Generation Model,
arXiv, 2412.00131
, arxiv, pdf, citation: -1 · Bin Lin, Yunyang Ge, Xinhua Cheng, ..., Yonghong Tian, Li Yuan · (Open-Sora-Plan - PKU-YuanGroup)
-
🌟 VideoGen-of-Thought: A Collaborative Framework for Multi-Shot Video Generation,
arXiv, 2412.02259
, arxiv, pdf, citation: -1 · Mingzhe Zheng, Yongqi Xu, Haojian Huang, ..., Harry Yang, Ser-Nam Lim · (cheliosoops.github)
-
HunyuanVideo - Tencent
A Systematic Framework For Large Video Generation Model Training · (HunyuanVideo - Tencent)
-
Spatiotemporal Skip Guidance for Enhanced Video Diffusion Sampling,
arXiv, 2411.18664
, arxiv, pdf, citation: -1 · Junha Hyung, Kinam Kim, Susung Hong, ..., Min-Jung Kim, Jaegul Choo · (junhahyung.github) · (STGuidance - junhahyung)
-
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model,
arXiv, 2411.19108
, arxiv, pdf, citation: -1 · Feng Liu, Shiwei Zhang, Xiaofeng Wang, ..., Qixiang Ye, Fang Wan · (liewfeng.github) · (TeaCache - LiewFeng)
-
VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement,
arXiv, 2411.15115
, arxiv, pdf, citation: -1 · Daeun Lee, Jaehong Yoon, Jaemin Cho, ..., Mohit Bansal · (video-repair.github)
-
DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation,
arXiv, 2411.16657
, arxiv, pdf, citation: -1 · Zun Wang, Jialu Li, Han Lin, ..., Jaehong Yoon, Mohit Bansal · (dreamrunner-story2video.github)
-
AnchorCrafter: Animate CyberAnchors Saling Your Products via Human-Object Interacting Video Generation,
arXiv, 2411.17383
, arxiv, pdf, citation: -1 · Ziyi Xu, Ziyao Huang, Juan Cao, ..., Jintao Li, Fan Tang · (cangcz.github)
-
🌟 Identity-Preserving Text-to-Video Generation by Frequency Decomposition,
arXiv, 2411.17440
, arxiv, pdf, citation: -1 · Shenghai Yuan, Jinfa Huang, Xianyi He, ..., Jiebo Luo, Li Yuan · (arxiv) · (pku-yuangroup.github) · (ConsisID - PKU-YuanGroup)
-
LTX-Video - Lightricks
· (huggingface) · (𝕏)
-
Pyramid Flow, a training-efficient Autoregressive Video Generation method based on Flow Matching. 🤗
· (pyramid-flow.github) · (Pyramid-Flow - jy0205)
-
🌟 Generative World Explorer,
arXiv, 2411.11844
, arxiv, pdf, citation: -1 · Taiming Lu, Tianmin Shu, Alan Yuille, ..., Daniel Khashabi, Jieneng Chen · (generative-world-explorer.github) · (𝕏) · (youtube)
-
The Matrix: Infinite-Horizon World Generation with Real-Time Interaction
· (mp.weixin.qq)
-
lucid-v1 - SonicCodes
-
Motion Control for Enhanced Complex Action Video Generation,
arXiv, 2411.08328
, arxiv, pdf, citation: -1 · Qiang Zhou, Shaofeng Zhang, Nianzu Yang, ..., Ye Qian, Hao Li · (mvideo-v1.github)
-
GameGen-X: Interactive Open-world Game Video Generation,
arXiv, 2411.00769
, arxiv, pdf, citation: -1 · Haoxuan Che, Xuanhua He, Quande Liu, ..., Cheng Jin, Hao Chen · (GameGen-X - GameGen-X) · (mp.weixin.qq)
-
Adaptive Caching for Faster Video Generation with Diffusion Transformers,
arXiv, 2411.02397
, arxiv, pdf, citation: -1 · Kumara Kahatapitiya, Haozhe Liu, Sen He, ..., Michael S. Ryoo, Tian Xie · (adacache-dit.github) · (AdaCache - AdaCache-DiT)
-
CogVideoX is an open-source video generation model originating from Qingying. 🤗
· (CogVideo - THUDM)
-
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion,
arXiv, 2411.04928
, arxiv, pdf, citation: -1 · Wenqiang Sun, Shuo Chen, Fangfu Liu, ..., Jun Zhang, Yikai Wang · (chenshuo20.github) · (DimensionX - wenqsun)
-
Enhancing Motion in Text-to-Video Generation with Decomposed Encoding and Conditioning,
arXiv, 2410.24219
, arxiv, pdf, citation: -1 · Penghui Ruan, Pichao Wang, Divya Saxena, ..., Jiannong Cao, Yuhui Shi · (pr-ryan.github) · (DEMO - PR-Ryan)
-
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality,
arXiv, 2410.19355
, arxiv, pdf, citation: -1 · Zhengyao Lv, Chenyang Si, Junhao Song, ..., Ziwei Liu, Kwan-Yee K. Wong · (FasterCache - Vchitect)
-
MarDini: Masked Autoregressive Diffusion for Video Generation at Scale,
arXiv, 2410.20280
, arxiv, pdf, citation: -1 · Haozhe Liu, Shikun Liu, Zijian Zhou, ..., Jürgen Schmidhuber, Juan-Manuel Pérez-Rúa · (mardini-vidgen.github- Multi-Style Video Generation with Enhanced Effects)
-
ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation,
arXiv, 2410.20502
, arxiv, pdf, citation: -1 · Zongyi Li, Shujie Hu, Shujie Liu, ..., Hefei Ling, Furu Wei · (aka)
-
WorldSimBench: Towards Video Generation Models as World Simulators,
arXiv, 2410.18072
, arxiv, pdf, citation: -1 · Yiran Qin, Zhelun Shi, Jiwen Yu, ..., Wanli Ouyang, Ruimao Zhang
· (iranqin.github)
-
Allegro - rhymes-ai
-
Introducing Mochi 1 preview. A new SOTA in open-source video generation. Apache 2.0. 𝕏
-
Movie Gen: A Cast of Media Foundation Models,
arXiv, 2410.13720
, arxiv, pdf, citation: -1 · Adam Polyak, Amit Zohar, Andrew Brown, ..., Vladan Petrovic, Yuming Du · (ai.meta)
-
VidPanos: Generative Panoramic Videos from Casual Panning Videos,
arXiv, 2410.13832
, arxiv, pdf, citation: -1 · Jingwei Ma, Erika Lu, Roni Paiss, ..., Michael Rubinstein, Forrester Cole · (vidpanos.github)
-
DisPose: Disentangling Pose Guidance for Controllable Human Image Animation,
arXiv, 2412.09349
, arxiv, pdf, citation: -1 · Hongxiang Li, Yaowei Li, Yuhang Yang, ..., Xuxin Cheng, Long Chen · (lihxxx.github) · (DisPose - lihxxx)
-
An image-to-video model by CreateAI. 🤗
· (Ruyi-Models - IamCreateAI)
-
DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses,
arXiv, 2412.00397
, arxiv, pdf, citation: -1 · Yatian Pang, Bin Zhu, Bin Lin, ..., Harry Yang, Li Yuan
-
🌟 StableAnimator: High-Quality Identity-Preserving Human Image Animation,
arXiv, 2411.17697
, arxiv, pdf, citation: -1 · Shuyuan Tu, Zhen Xing, Xintong Han, ..., Chong Luo, Zuxuan Wu · (StableAnimator - Francis-Rings)
-
Trajectory Attention for Fine-grained Video Motion Control,
arXiv, 2411.19324
, arxiv, pdf, citation: -1 · Zeqi Xiao, Wenqi Ouyang, Yifan Zhou, ..., Jianlou Si, Xingang Pan
-
AnimateAnything: Consistent and Controllable Animation for Video Generation,
arXiv, 2411.10836
, arxiv, pdf, citation: -1 · Guojun Lei, Chi Wang, Hong Li, ..., Yikai Wang, Weiwei Xu · (yu-shaonian.github)
-
FlipSketch: Flipping Static Drawings to Text-Guided Sketch Animations,
arXiv, 2411.10818
, arxiv, pdf, citation: -1 · Hmrishav Bandyopadhyay, Yi-Zhe Song · (hmrishavbandy.github)
-
EasyAnimate - aigc-apps
-
SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation,
arXiv, 2411.04989
, arxiv, pdf, citation: -1 · Koichi Namekata, Sherwin Bahmani, Ziyi Wu, ..., Igor Gilitschenski, David B. Lindell · (kmcode1.github)
-
HelloMeme: Integrating Spatial Knitting Attentions to Embed High-Level and Fidelity-Rich Conditions in Diffusion Models,
arXiv, 2410.22901
, arxiv, pdf, citation: -1 · Shengkai Zhang, Nianhong Jiao, Tian Li, ..., Boya Niu, Jun Gao · (HelloMeme - HelloVision) · (songkey.github)
-
CamI2V: Camera-Controlled Image-to-Video Diffusion Model,
arXiv, 2410.15957
, arxiv, pdf, citation: -1 · Guangcong Zheng, Teng Li, Rui Jiang, ..., Tao Wu, Xi Li · (zgctroy.github) · (CamI2V - ZGCTroy)
-
FrameBridge: Improving Image-to-Video Generation with Bridge Models
-
Animate-X: Universal Character Image Animation with Enhanced Motion Representation,
arXiv, 2410.10306
, arxiv, pdf, citation: -1 · Shuai Tan, Biao Gong, Xiang Wang, ..., Jingdong Chen, Ming Yang · (lucaria-academy.github)
-
DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control,
arXiv, 2410.13830
, arxiv, pdf, citation: -1 · Yujie Wei, Shiwei Zhang, Hangjie Yuan, ..., Yingya Zhang, Hongming Shan · (dreamvideo2.github)
-
🌟 Evaluation Agent: Efficient and Promptable Evaluation Framework for Visual Generative Models,
arXiv, 2412.09645
, arxiv, pdf, citation: -1 · Fan Zhang, Shulin Tian, Ziqi Huang, ..., Yu Qiao, Ziwei Liu
-
VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models,
arXiv, 2411.13503
, arxiv, pdf, citation: -1 · Ziqi Huang, Fan Zhang, Xiaojie Xu, ..., Yu Qiao, Ziwei Liu · (huggingface)
-
How Far is Video Generation from World Model: A Physical Law Perspective,
arXiv, 2411.02385
, arxiv, pdf, citation: -1 · Bingyi Kang, Yang Yue, Rui Lu, ..., Gao Huang, Jiashi Feng · (phyworld.github)
-
Video Seal: Open and Efficient Video Watermarking,
arXiv, 2412.09492
, arxiv, pdf, citation: -1 · Pierre Fernandez, Hady Elsahar, I. Zeki Yalniz, ..., Alexandre Mourachko · (videoseal - facebookresearch)
-
VideoDPO: Omni-Preference Alignment for Video Diffusion Generation,
arXiv, 2412.14167
, arxiv, pdf, citation: -1 · Runtao Liu, Haoyu Wu, Zheng Ziqiang, ..., Renjie Pi, Qifeng Chen · (videodpo.github)
-
LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment,
arXiv, 2412.04814
, arxiv, pdf, citation: -1 · Yibin Wang, Zhiyu Tan, Junyan Wang, ..., Cheng Jin, Hao Li · (codegoat24.github) · (LiFT - CodeGoat24)
-
From Slow Bidirectional to Fast Causal Video Generators,
arXiv, 2412.07772
, arxiv, pdf, citation: -1 · Tianwei Yin, Qiang Zhang, Richard Zhang, ..., Eli Shechtman, Xun Huang · (causvid.github)
-
Progressive Autoregressive Video Diffusion Models,
arXiv, 2410.08151
, arxiv, pdf, citation: -1 · Desai Xie, Zhan Xu, Yicong Hong, ..., Arie Kaufman, Yang Zhou
· (desaixie.github)
-
🌟 STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution,
arXiv, 2501.02976
, arxiv, pdf, citation: -1 · Rui Xie, Yinhong Liu, Penghao Zhou, ..., Zhenheng Yang, Ying Tai · (STAR - NJU-PCALab) · (arxiv) · (nju-pcalab.github)
-
SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration,
arXiv, 2501.01320
, arxiv, pdf, citation: -1 · Jianyi Wang, Zhijie Lin, Meng Wei, ..., Chen Change Loy, Lu Jiang · (iceclear.github)
-
Generative Video Propagation,
arXiv, 2412.19761
, arxiv, pdf, citation: -1 · Shaoteng Liu, Tianyu Wang, Jui-Hsien Wang, ..., Soo Ye Kim, Jiaya Jia
-
MoViE: Mobile Diffusion for Video Editing,
arXiv, 2412.06578
, arxiv, pdf, citation: -1 · Adil Karjauv, Noor Fathima, Ioannis Lelekas, ..., Amir Ghodrati, Amirhossein Habibian
-
DIVE: Taming DINO for Subject-Driven Video Editing,
arXiv, 2412.03347
, arxiv, pdf, citation: -1 · Yi Huang, Wei Xiong, He Zhang, ..., Mingfu Yan, Shifeng Chen · (dino-video-editing.github)
-
MyTimeMachine: Personalized Facial Age Transformation,
arXiv, 2411.14521
, arxiv, pdf, citation: -1 · Luchao Qi, Jiaye Wu, Bang Gong, ..., David W. Jacobs, Roni Sengupta · (mytimemachine.github)
-
🌟 Generative Omnimatte: Learning to Decompose Video into Layers
· (𝕏)
-
StableV2V: Stablizing Shape Consistency in Video-to-Video Editing,
arXiv, 2411.11045
, arxiv, pdf, citation: -1 · Chang Liu, Rui Li, Kaidong Zhang, ..., Yunwei Lan, Dong Liu
-
AutoVFX: Physically Realistic Video Editing from Natural Language Instructions,
arXiv, 2411.02394
, arxiv, pdf, citation: -1 · Hao-Yu Hsu, Zhi-Hao Lin, Albert Zhai, ..., Hongchi Xia, Shenlong Wang · (haoyuhsu.github) · (autovfx - haoyuhsu)
-
🌟 ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning,
arXiv, 2411.05003
, arxiv, pdf, citation: -1 · David Junhao Zhang, Roni Paiss, Shiran Zada, ..., Neal Wadhwa, Nataniel Ruiz · (generative-video-camera-controls.github)
-
Fashion-VDM: Video Diffusion Model for Virtual Try-On,
arXiv, 2411.00225
, arxiv, pdf, citation: -1 · Johanna Karras, Yingwei Li, Nan Liu, ..., Chris Lee, Ira Kemelmacher-Shlizerman · (johannakarras.github) · (arxiv)
-
InvokeAI - invoke-ai
-
ComfyUI-MochiEdit - logtd
-
Framer: Interactive Frame Interpolation,
arXiv, 2410.18978
, arxiv, pdf, citation: -1 · Wen Wang, Qiuyu Wang, Kecheng Zheng, ..., Yujun Shen, Chunhua Shen
-
InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption,
arXiv, 2412.09283
, arxiv, pdf, citation: -1 · Tiehan Fan, Kepan Nan, Rui Xie, ..., Jian Yang, Ying Tai · (InstanceCap - NJU-PCALab) · (arxiv)
-
EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation,
arXiv, 2411.08380
, arxiv, pdf, citation: -1 · Xiaofeng Wang, Kang Zhao, Feng Liu, ..., Yingya Zhang, Xingang Wang · (egovid.github)
-
TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation,
arXiv, 2411.04709
, arxiv, pdf, citation: -1 · Wenhao Wang, Yi Yang · (tip-i2v.github.io)
- cogvideox-factory - a-r-r-o-w