[PROPOSAL]: refactor the core API for better usability #3046
FrankLeeeee started this conversation in Development | Core
Replies: 3 comments 4 replies
-
This project will be fully managed in the GitHub Project Kanban (public to all). Meanwhile, the development practice will follow the Developer Guideline.
0 replies
-
@FrankLeeeee may I know if you have a branch for implementing this proposal? It looks like a big change. Do you know when we can expect it to be done?
1 reply
-
Proposal
Note: this discussion is migrated from issue #2975 with some modifications. Thanks to @ver217, @YuliangLiu0306, @kurisusnowdeng and @1SAA for their brainstorming.
Motivation
- [...] if-else, which is hard to read and modify.
- Engine is hard to use. Its usage is very different from native torch, and users may need some effort to learn it before starting their first application.
- Engine is not flexible. It relies on a configuration file or dict and a global context. If we want to run two models with different parallelism methods, that is hard to implement now. It also supports only single-model training, so it cannot support some well-known RL algorithms such as PPO.
- Gemini and auto-parallelism both have their own entry points instead of Engine.
Design
We design several components for API refactoring.
Engine has 6 main components:
Booster features include:
- [...] (no_sync())
Booster is not a singleton, though in most cases a single engine is enough.
Possible sample code (pseudo-code)
Single-model supervised learning train loop without pipeline
Single-model supervised learning train loop with pipeline
Multi-model RL train loop without pipeline
Possible class definition (pseudo-code)
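As a rough illustration of the two properties stated above (Booster is not a singleton, and it exposes no_sync()), here is a minimal pure-Python sketch. It is not the proposal's actual class definition and not the ColossalAI API: the `plugin` argument, the `boost()` method, the `sync_enabled` flag, and the `boosted` list are all hypothetical names chosen for this example, and no real training or communication happens.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Any, Iterator, List, Optional


@dataclass
class Booster:
    """Hypothetical stand-in for the proposed Booster (illustration only)."""

    plugin: Optional[str] = None   # assumed: name of a parallelism strategy
    sync_enabled: bool = True      # assumed: whether gradient sync would run
    boosted: List[Any] = field(default_factory=list)

    def boost(self, *components: Any) -> tuple:
        """Record and return the components; a real version would wrap them."""
        self.boosted.extend(components)
        return components

    @contextmanager
    def no_sync(self) -> Iterator[None]:
        """Temporarily turn off the (simulated) gradient synchronization."""
        previous = self.sync_enabled
        self.sync_enabled = False
        try:
            yield
        finally:
            # Restore the previous state even if the body raises.
            self.sync_enabled = previous


# Two independent instances, e.g. one per model in a multi-model setup.
actor_booster = Booster(plugin="plugin_a")
critic_booster = Booster(plugin="plugin_b")
(actor_model,) = actor_booster.boost("actor_model")

with actor_booster.no_sync():
    # Only the actor's booster is affected; the critic's keeps syncing.
    assert not actor_booster.sync_enabled
    assert critic_booster.sync_enabled
```

Because each instance carries its own state instead of relying on a global context, two models with different parallelism settings can coexist in one process, which is the flexibility the Motivation section asks for.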
Further work
Huggingface/accelerate and Lightning/fabric may have a similar design.
We may provide a colossalai plugin / strategy for these libs.