Official PyTorch implementation of MotionLLaMA based on MMEngine
MotionLLaMA: A Unified Framework for Motion Synthesis and Comprehension
Zeyu Ling*, Bo Han*, Shiyang Li, Hongdeng Shen, Jikang Cheng, Changqing Zou
Zhejiang University · Zhejiang Lab

💻 Project Page

📖 Introduction

This project introduces:

  • MMotion: A public motion-related common library based on MMEngine, which includes PyTorch implementations of MotionLLaMA and various motion models.

  • MotionHub: Currently the largest open-source multimodal, multi-task motion dataset.
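As a rough illustration of how an MMEngine-based library like MMotion organizes its models, here is a minimal, stdlib-only sketch of the registry pattern MMEngine popularized: models register themselves under a name, and configs instantiate them by that name. The class names and registry here are illustrative placeholders, not MMotion's actual API.

```python
# Minimal sketch of an MMEngine-style registry (illustrative, not MMotion's API).
class Registry:
    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register_module(self, cls=None):
        # Usable as a decorator: @MODELS.register_module()
        def _register(c):
            self._modules[c.__name__] = c
            return c
        return _register(cls) if cls is not None else _register

    def build(self, cfg):
        # cfg mirrors MMEngine's dict-style config: {'type': <class name>, **kwargs}
        cfg = dict(cfg)
        cls = self._modules[cfg.pop('type')]
        return cls(**cfg)

MODELS = Registry('model')

@MODELS.register_module()
class ToyMotionModel:  # hypothetical stand-in for a registered motion model
    def __init__(self, hidden_dim=256):
        self.hidden_dim = hidden_dim

# A config dict is all that is needed to build a registered model by name.
model = MODELS.build({'type': 'ToyMotionModel', 'hidden_dim': 512})
```

This indirection is what lets MMEngine-based repos swap models and modules purely through config files.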

📜 What's New

  • 2024-12-27: Released MotionHub V2, which includes the following updates compared to the original version:
      1. Manually corrected the captions in the Fit3D, HumanSC3D, and Hi4D subsets.
      2. Manually filtered and corrected the InterHuman dataset; low-quality motion clips were removed.
      3. Removed the Chi3D dataset due to its poor motion quality.
      4. Used PoseScript to generate frame-level captions for the AIST++ and BEATV2 datasets, then used ChatGPT-4o-mini to process the frame-level captions into sentence-level captions.
      5. Used ChatGPT-4o-mini to correct the captions in the MotionX dataset against the frame-level captions; some of the original captions were incorrect.
      6. Defined the granularity of every caption as Macro, Meso, or Micro, where Macro is the coarsest granularity and Micro is the finest.
      7. Segmented the BEATV2 dataset into clips shorter than 12 seconds and used Whisper to generate the spoken text of each clip. Each clip contains complete sentences; a single sentence is never split across multiple clips.
      8. Removed the prelude dance clips in the FineDance dataset, in which the dancer holds a static pose rather than dancing, then segmented the remaining clips into clips shorter than 12 seconds. We hope this version is more useful for the community.
  • Released the MMotion library.
  • Released the MotionHub dataset.
  • Released the demo video.
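The sentence-preserving segmentation described in the changelog can be sketched as a greedy grouping pass: timestamped sentences (e.g. Whisper segments) are accumulated into a clip until adding the next one would push the clip past the 12-second cap, at which point a new clip starts. The function name and tuple layout below are illustrative assumptions, not the repository's actual code.

```python
# Hypothetical sketch: group (start, end, text) sentences into clips of at
# most `max_seconds`, never splitting one sentence across two clips.
MAX_CLIP_SECONDS = 12.0

def segment_into_clips(sentences, max_seconds=MAX_CLIP_SECONDS):
    """sentences: list of (start, end, text) tuples sorted by start time.
    Returns a list of clips, each a list of sentence tuples."""
    clips, current = [], []
    for start, end, text in sentences:
        clip_start = current[0][0] if current else start
        # Close the current clip if this sentence would exceed the duration cap.
        if current and end - clip_start > max_seconds:
            clips.append(current)
            current = []
        current.append((start, end, text))
    if current:
        clips.append(current)
    return clips
```

Note that a single sentence longer than the cap still becomes its own clip, since sentences are never split.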

📥 Dataset Download

| Dataset | Clip Number | Caption Number | Google Drive | Baidu Disk |
|---|---|---|---|---|
| MotionHub V1 | 131512 | 269873 | Coming Soon | https://pan.baidu.com/s/1vuewGrtVF9PjhEIiv153pw?pwd=AIXM |
| MotionHub V2 | 142350 | 259998 | Coming Soon | https://pan.baidu.com/s/1KNc31GrwBhuqTzopqu_U7Q?pwd=AIXM |
